Medusa Slockbower 3d4ea4398a - Setup Contexts to pull more info from the GPU

- Started outlining OpenGL implementation

2025-07-28 21:06:52 -04:00

Planning Documentation for fennec

Introduction
TODO
1. Security Ramblings
2. Platform Support
C++ Language
Math Library
Memory Library
Containers Library
Format Processing
Core
1. Tick
2. Frame
Platform Support Layer
Scene
2D Graphics
3D Graphics
1. Structures
2. Stages
3D Physics
Artificial Intelligence

Introduction

This file serves as a general planning document for engine structure, systems, pipelines, and implementation.

Core engine systems should strive for O(1) behaviour, both in runtime and in memory. Requiring the entire engine to be O(1) is obviously not realistic, so the goal is more specific: hot paths should be O(1). I deliberately use 'strive' and 'goal' as different concepts. Designs should strive to accommodate O(1) function implementations; the specifics of an implementation may not always achieve that, so the end goal is O(1) on hot paths.

Functions should be highly verbose, and any bug-prone or erroneous behaviour should trigger an assertion. DO NOT USE EXCEPTIONS.
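As a minimal sketch of what assertion-based checking without exceptions could look like, the macro below prints the failed condition and aborts. The names `FENNEC_ASSERT`, `fennec_assert_fail`, and `safe_divide` are hypothetical and only serve the illustration; they are not taken from the fennec sources.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical failure handler: prints a verbose message and aborts.
// Nothing is thrown, so this also builds with exceptions disabled.
[[noreturn]] inline void fennec_assert_fail(const char* expr, const char* msg,
                                            const char* file, int line)
{
    std::fprintf(stderr, "%s:%d: assertion `%s` failed: %s\n", file, line, expr, msg);
    std::abort();
}

// Hypothetical macro: checks the condition and reports where it failed.
#define FENNEC_ASSERT(cond, msg) \
    ((cond) ? (void)0 : fennec_assert_fail(#cond, (msg), __FILE__, __LINE__))

// Example function that states its precondition with an assertion.
inline float safe_divide(float numerator, float denominator)
{
    FENNEC_ASSERT(denominator != 0.0f, "division by zero");
    return numerator / denominator;
}
```

Calling `safe_divide(1.0f, 0.0f)` would print the file, line, and condition before aborting, which keeps failures loud without any unwinding.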

System implementations should be independent of architecture and platform, i.e. the code of the graphics system should not care whether OpenGL or Vulkan is used and should not make any direct calls to OpenGL or Vulkan.

The engine should not care about the types of objects loaded from a so/dll. In fact, most of the code should be type independent. Any information shared among a collection of objects should be held either implicitly or explicitly in the super-class. It will be the responsibility of the linked code to initialize and clean up the objects related to it. This principle should extend to the submodules of the engine.

It is also best to avoid objects having behaviour that is not defined by the system they are in. Extensions and mods are an exception, and should be given configurability and programmability within those systems and their stages. This can be achieved with on-demand events at the different stages of those systems.

TODO

2D Graphics (gfx2d)
2D Physics (physics2d)
2D & 3D Audio (audio)

File Security Ramblings:

Windows is starting to piss me off, so I am considering dropping official support for MSVC. MinGW and Cygwin will still work for compiling on Windows if this ends up being the case. The reason for this is that there are a lot of platform-dependent security issues. MinGW and Cygwin provide POSIX-style headers on Windows, which would push the security onus onto the compiler and end-user.

The biggest blocker at the moment is the filesystem. If we want a filesystem layer that is safe across platforms, we cannot lean on stdc++ and ISO libc alone, since they make no guarantees about the safety of their functions.

The crux of this issue falls at the following specific behaviour:

User selects an existing file to write to
Application interface confirms overwrite action
Application writes to the file after confirmation

A threat actor can swap in a malicious file or symlink at that path between the check and the use of the file.
This is called TOCTOU (time of check, time of use).

This issue can be solved using fopen("<file>", "a+") and ftell, however this specific behaviour is not intuitive to
those first learning how to work with file systems. We can attempt to abstract this away with another wrapper, or simply
write the file structure to handle this behaviour properly. The downside to this method overall is that it will break
common conventions of how humans interpret filesystems and the related control flow logic. What we can do is force the
'+' flag to always be present for write operations, and raise an error when desired, if the file is not empty. This
unfortunately would have the downside of being unable to open a file as write only.

Using "wx" in this instance would not be sufficient since it would require a second call to fopen, which would
create the conditions for the TOCTOU error described above.
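A minimal sketch of the "a+" and ftell idea, assuming only the standard C stdio calls: the file is opened once, and the emptiness check happens on that same handle. The names `OpenResult` and `open_for_write_checked` are hypothetical. Note that in "a+" mode every write lands at the end of the file, so this alone does not cover overwriting in place.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical result of opening a file for writing through a single handle.
struct OpenResult
{
    std::FILE* handle;   // nullptr if the open failed
    bool       had_data; // true if the file already contained bytes
};

// Opens `path` with "a+" so the file is created if missing and never truncated,
// then checks for existing content on the same handle. The check and any later
// write share one handle, so no second fopen happens between them.
inline OpenResult open_for_write_checked(const char* path)
{
    assert(path != nullptr);
    std::FILE* f = std::fopen(path, "a+");
    if (f == nullptr)
        return { nullptr, false };

    // The initial read position in "a+" mode is implementation-defined,
    // so seek to the end explicitly before asking for the offset.
    if (std::fseek(f, 0, SEEK_END) != 0) {
        std::fclose(f);
        return { nullptr, false };
    }
    const long size = std::ftell(f);
    if (size < 0) {
        std::fclose(f);
        return { nullptr, false };
    }
    return { f, size > 0 };
}
```

A caller could show the overwrite confirmation only when `had_data` is true and then keep using the returned handle, rather than reopening the path.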

Another issue arises when we are parsing a directory tree. The best we can do is take ownership of the directory that
is opened as the root. However, this requires dirent.h which is not implemented in MSVC. A custom implementation of
dirent.h may be written for MSVC, however this is one of the few things I am not willing to outsource to another
library. Developing our own implementation would take a not insignificant amount of time, between writing the library,
debugging it, and testing for vulnerabilities. As stated above, this implementation is native to MinGW and Cygwin,
so we would not have to entirely drop support for Windows. However, MSVC is the most widely used compiler for Windows
applications and is native to Visual Studio and VSCode.

What is probably the best solution is to wrap everything in a file interface that does not allow the direct setting of
these flags. Then we set our own usage type for the file that informs which flags should be used.

We need to be able to handle the following types of files:

Assets, such as scenes, audio, textures, metadata, meshes, etc.
Save files, setting files, etc.

One of the nice things about the assets is that they are guaranteed to be read-only once an application is installed
on the computer of the end-user. Therefore, this issue only arises with save files and custom file formats.

When the editor is run, all these files should be opened in read/write mode.

Naming conventions should exist for the types of files and how they are read. For example, in release mode,
most assets should be opened once, and then closed immediately. However, this does not make sense for formats
that are continuous and too large to be kept around in memory, such as video formats.

Perhaps the following conventions:

Static Asset
Stream Asset
Resource

We can turn this into an object-oriented approach by having different formats inherit these base types. We may still
have a base file type that wraps C functionality, but discourage developers from using the interface.
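One way the object-oriented approach could look, as a rough sketch: a base type owns the C stream and hides the mode string, and each usage type picks its own mode. The class names mirror the conventions listed above, but the layout, the `is_open` helper, and the chosen mode strings are assumptions, not the interface documented in fennec/fproc/io/file.h.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical base type that owns a C stream and hides the mode flags.
class File
{
public:
    File(const File&) = delete;
    File& operator=(const File&) = delete;
    virtual ~File() { if (handle_ != nullptr) std::fclose(handle_); }

    bool is_open() const { return handle_ != nullptr; }

protected:
    // Only derived usage types get to choose the mode string.
    File(const char* path, const char* mode) : handle_(nullptr)
    {
        assert(path != nullptr && mode != nullptr);
        handle_ = std::fopen(path, mode);
    }

    std::FILE* handle_;
};

// Read-only asset that is loaded once and then closed.
class StaticAsset : public File
{
public:
    explicit StaticAsset(const char* path) : File(path, "rb") {}
};

// Read-only asset that stays open while it is streamed, e.g. video.
class StreamAsset : public File
{
public:
    explicit StreamAsset(const char* path) : File(path, "rb") {}
};

// Save files and settings: read/write, created if missing, never truncated.
class Resource : public File
{
public:
    explicit Resource(const char* path) : File(path, "a+b") {}
};
```

Custom formats would derive from one of the three usage types, so developers can still add file types without ever touching raw flags.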

We could also declare the file interface extern so that only internal files know the implementation. However, I would
not be satisfied by doing this since it would prevent developers from implementing custom file type implementations.

Conserving memory is not really an issue here as long as we are smart about our implementation. Files should only be
open when necessary and be closed when it is no longer necessary to have them open. Data should be streamed unless all
the data in the file is required.

When built in release mode, we also need to pack static assets into some sort of archive that is mountable to reduce
disk space consumption of a program.

I was considering encryption for archives, however it does not make much sense. Assuming someone intends to pirate the
game, there is not much stopping them from running the files. I will add Steam support at some point which would allow
you to use Steam's DRM to prevent the executable from being run. Otherwise, there is no point in attempting to encrypt
game files. Even Unreal PAK files can be cracked in seconds, and even if I managed to write something that cannot be
trivially cracked locally, you can scrape most assets from the GPU and Audio Card.

I have managed to solve the specific case provided at the top of this section, which was done by wrapping C I/O calls
into a file wrapper. This wrapper can handle a few different mode flags and has specific conditions for the flags. See
the documentation for fennec/fproc/io/file.h for more info.

One question remains unanswered on this front: should a read/write file open as r+ or a+? rewind is slightly faster
than fseek(SEEK_END); however, for the case of save files and editor assets, r+ makes more sense from a usage perspective.

Directories remain an issue, with dirent.h being the only sensible option at time of writing. The issue with using
dirent.h comes back to the security issues on Windows, where the only option is to write a custom implementation
for MSVC.

Platform & API Support

I have decided to forgo SDL so that the engine can provide specific support for specific platforms. SDL also implements a lot of things that will need to be implemented specifically for the engine, so only its window management would end up being used.

Platform support will be implemented in the following order:

Linux/BSD
- Wayland
  - OpenGL (EGL) ✔
  - XKB
  - PulseAudio
  - Vulkan
- X11
  - ALSA
  - Vulkan
Microsoft Windows
- XInput
- OpenGL (WGL)
- WASAPI
- Vulkan
Android
- OpenGL ES
- AAudio
- OpenSL ES
macOS/iOS
- Cocoa
  - OpenGL
  - Core Audio
  - Vulkan
  - Metal

Linux Wayland will be implemented first. Once it is set up, the core engine will be implemented and tested on top of Wayland. Once the engine is in a stable state, work on the other platforms will resume.

Most consoles will never get official platform support due to NDAs which conflict with the principles of this engine. fennec will avoid using proprietary libraries except when strictly necessary, such as for supporting Windows and macOS. fennec will interact with any drivers required by the operating systems listed above, even if proprietary.

C++ Language Library (`lang`)

Implement header files for standard functions relating to the C++ Language.

So far this is implemented on an as-needed basis. A full implementation should be worked on continuously.

Math Library (`math`)

Implement math functions according to the OpenGL Shading Language 4.60 Specification.

"Extensions" has a different meaning here. Extensions for the math library are any functions that are not defined within the Specification.

Additional extensions should be implemented to provide standard definitions for functions predominantly related to Linear Algebra, Mathematical Analysis, and more specifically Discrete Analysis. Additional extensions will be implemented on an as-needed basis.
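To illustrate what following the Shading Language definitions means in practice, here is a small scalar sketch of three built-ins using the formulas given in the GLSL specification. The namespace `glsl_sketch` is hypothetical; the real library would presumably provide vector overloads as well.

```cpp
#include <cassert>

namespace glsl_sketch {

// clamp(x, lo, hi) per GLSL: min(max(x, lo), hi).
constexpr float clamp(float x, float lo, float hi)
{
    return x < lo ? lo : (x > hi ? hi : x);
}

// mix(a, b, t) per GLSL: a * (1 - t) + b * t.
constexpr float mix(float a, float b, float t)
{
    return a * (1.0f - t) + b * t;
}

// smoothstep(edge0, edge1, x) per GLSL: Hermite interpolation between the edges.
// The specification leaves the result undefined when edge0 >= edge1, so the
// precondition is asserted here.
inline float smoothstep(float edge0, float edge1, float x)
{
    assert(edge0 < edge1);
    const float t = clamp((x - edge0) / (edge1 - edge0), 0.0f, 1.0f);
    return t * t * (3.0f - 2.0f * t);
}

} // namespace glsl_sketch
```

Anything beyond functions like these, such as linear algebra helpers not present in the specification, would fall under the "extensions" described above.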

Memory Library (`memory`)

Implement headers related to memory allocation in C++.

Smart Pointers
- Unique Pointer
- Shared Pointer

Memory Allocation
- Allocation

Containers Library (`containers`)

All containers of the C++ Standard Library should be implemented.

Here are essential data-structures not specified in the C++ stdlib:

Graph → AI graph
- Necessary for 2D and 3D navigation.
Rooted Directed Tree → Scene rd_tree
- Defines the scene structure.

Format Processing (`fproc`)

This library handles any data that is formatted. This includes basic string formats, file formats, and eventually programming languages.

fennec should be able to use Doxygen and LaTeX externally. Consider including binaries with releases.

Notes

String Analysis (fproc/strings)
- Search
- Manipulation
- Delimiting
- Regex

File Formats (fproc/formats)
- Serialization
  - JSON
  - HTML
  - XML
  - YAML
- Configuration
  - INI
  - TOML
- Documents
  - ODF
  - Markdown
  - PDF
- Spreadsheets & Tables
  - ODS
  - CSV
- Audio Formats
  - MP3
  - WAV
  - AAC
- Graphics Formats
  - Textures
    - BMP
    - DDS
    - JPG
    - PNG
    - TIFF
  - Vectors
    - OTF
    - SVG
    - TTF
  - Models
    - FBX
    - Wavefront OBJ
- Video Formats
  - MP4
  - AVI
  - MPG
  - MOV

TODO LATER

Compilation (fproc/code)
- Lexical Analysis
- Syntax Analysis
- Semantic Analysis
- Intermediate Code Generation
- Optimization
- Target Code Generation

Core (`core`)

This will be the core of the engine.

Event System
- Most events will fire at the start of the next tick, especially those related to physics and input.
- Events for graphics or audio should propagate immediately.
- Events for stages should also propagate immediately to support extensions and mods.
Core Engine Loop
- System Manager
- Ticks vs. Frames
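The split between deferred and immediate events could be sketched as follows. This is only an illustration under simple assumptions: events carry a single `int` payload, and the names `EventSystem`, `emit_immediate`, `emit_deferred`, and `begin_tick` are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical event system; names and layout are illustrative only.
class EventSystem
{
public:
    using Handler = std::function<void(int)>;

    void subscribe(Handler handler) { handlers_.push_back(std::move(handler)); }

    // Graphics, audio, and stage events: delivered to handlers right away.
    void emit_immediate(int payload) { dispatch(payload); }

    // Physics and input events: stored until the next tick begins.
    void emit_deferred(int payload) { pending_.push_back(payload); }

    // Called at the start of each tick to deliver everything queued so far.
    void begin_tick()
    {
        // Swap first so handlers may queue new events for the following tick.
        std::vector<int> batch;
        batch.swap(pending_);
        for (int payload : batch)
            dispatch(payload);
    }

private:
    void dispatch(int payload)
    {
        for (const Handler& handler : handlers_)
            handler(payload);
    }

    std::vector<Handler> handlers_;
    std::vector<int>     pending_;
};
```

Swapping the pending list before dispatch means an event queued by a handler during `begin_tick` is delivered one tick later, which matches the "queue events for next tick" step in the physics stage.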

The following systems are not essential to the core engine, but are instead major systems that should be defined in their operation order:

Tick

Update
- Events
- Scripts
- AI
Physics
- Newtonian Commit
  - Apply Forces (Updates Accelerations)
  - Acceleration (Updates Velocities)
  - Apply Velocities (Updates Position and Rotation)
- Constraint Resolution
- Collision Detection
- Collision Resolution
- Collision Response
  - Calculate Forces & Velocities
  - Queue events for next tick

Frame

Physics
- Physics Interpolation
Graphics
- 2D Graphics
- Generate 3D Mask
- 3D Graphics
Audio
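One common way to relate ticks and frames is a fixed-timestep accumulator, sketched below. This is an assumption about how the loop might be built rather than a description of the actual engine loop; `LoopState`, `advance_frame`, and `interpolate` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical loop state; names are illustrative only.
struct LoopState
{
    double accumulator = 0.0; // unsimulated time carried between frames
    int    ticks_run   = 0;   // total ticks executed so far
};

// Advances the loop by one frame of `frame_time` seconds using a fixed
// `tick_dt`, and returns the interpolation factor in [0, 1) for rendering.
inline double advance_frame(LoopState& state, double frame_time, double tick_dt)
{
    assert(frame_time >= 0.0 && tick_dt > 0.0);
    state.accumulator += frame_time;
    while (state.accumulator >= tick_dt) {
        state.accumulator -= tick_dt; // a real loop would run Update and Physics here
        ++state.ticks_run;
    }
    return state.accumulator / tick_dt;
}

// Physics Interpolation: blend the previous and current tick state.
inline double interpolate(double previous, double current, double alpha)
{
    return previous + (current - previous) * alpha;
}
```

With a tick of 0.25 s and a frame that took 0.625 s, two ticks run and the frame renders halfway between the last two physics states.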

Platform Support Layer (`platform`)

This is the core part of platform support for fennec. All necessary drivers
and OS specific functionality will be wrapped up nicely into these interfaces.

See the implementation order under Platform & API Support.

Scene (`scene`)

In-Array Directed Tree
- Elegant method for providing O(1) insertions and O(log(n)) deletions.
Bounding Volume Hierarchy
- Octree

2D Graphics (`gfx2d`)

Links:

Object Structure. The mesh is implicit data.

Structures (`gfx2d`)

For the 2D rendering framework, Materials need to be rendered independently because we have
no size constraints for images. This prevents us from using a meta-shader like in
the 3D rendering framework.

struct Object
{
    vec2 location, scale; // A 3x3 matrix would be 36 bytes, this is instead 20 bytes
    float rotation;
};

BVH
- Quadtree
  - Leaf Size and Tree Depth should be calculated by the scene, constraints are as follows:
    - Min Object Size
    - Max Object Size
    - Scene Center
    - Scene Edge
  - Insertions and Updates are done on the CPU
  - Nodes
    - Start Index 32-bits
    - Object Count 32-bits
  - Objects
    - Buffer of Object IDs grouped by Quadtree Node
- Culling
  - Starting at each Quadtree Leaf, traverse upwards.
  - Insert Visible Leaf IDs
    - Track using atomic buffer
- Generate the Command Buffer for Culled Meshes from the Visible Leaf Buffer
  - Count Materials
  - Count Meshes per Material
- Generate the Culled Object Buffer by copying objects from the Object Buffer
  - Adjust Buffer Size using the counts
  - Insert using another atomic buffer

Translucent objects will be sorted. We can cheat by using a z-index instead of a z-coordinate.
This will allow us to sort objects as they are created. We can still bulk render each z-index,
with meshes and objects being grouped by material.
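A rough sketch of the z-index ordering: draws are ordered by z-index first and material second, so every z-index can still be rendered in bulk per material. The `Draw2D` record is hypothetical, lower z-index values are assumed to be further back, and a full sort is used here for brevity where the text above suggests inserting objects in order as they are created.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

// Hypothetical draw record; fields are illustrative only.
struct Draw2D
{
    int z_index;   // integer layer used in place of a z-coordinate
    int material;
    int object_id;
};

// Orders draws by z-index, then by material within each z-index.
// stable_sort keeps creation order among draws that share both keys.
inline void sort_translucent(std::vector<Draw2D>& draws)
{
    std::stable_sort(draws.begin(), draws.end(),
        [](const Draw2D& a, const Draw2D& b) {
            return std::tie(a.z_index, a.material) < std::tie(b.z_index, b.material);
        });
}
```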

3D Graphics (`gfx3d`)

Links:

DirectX will never have official support. If you would like to make a fork, have at it, but know that I will hold a deep disdain for you.

The graphics pipeline will have a buffer with a list of objects and their rendering data.
This will be referred to as the Object Buffer. There will be two of them, one each for the Deferred and Forward Passes.

The buffers will be optimized by scene prediction.
This involves tracking the meshes and textures directly and indirectly used by a scene.
A callback function in the graphics system for scene loading can do this.

Materials and Lighting models will be run via a shader metaprogram to make the pipeline independent of this aspect. This allows the GPU to draw every single deferred rendered mesh in a single draw call for each stage of the renderer.

Specifications for debugging views via early breaks are included in the stages.

There will be three profiles for OpenGL implementation:

modern
fallback
legacy

All profiles will have the same feature set, however their implementations will differ. The modern context will use up-to-date features to get as much performance out of the pipeline as possible.

Structures (`gfx3d`)

Object Structure. The mesh is implicit data.

struct Object
{
    vec3 location, scale; // A 4x4 matrix would be 64 bytes; location, scale, and rotation take 40 bytes
    quat rotation;
    int material;
};

Textures for 3D rendering are stored in various buffers with power-of-two sizes. Ratios of 1:1 and 2:1 are allowed. The 2:1 ratio is specifically for spherical and cylindrical projection. UVs may be transformed to use a 2:1 texture as if it were 1:2. Cubemaps may only be 1:1; I would be concerned if you are using any other ratio.

8-Bit R Texture 4096, 2048, 1024, 512 (8)
8-Bit RG Texture 4096, 2048, 1024, 512 (8)
8-Bit RGB Texture 4096, 2048, 1024, 512 (8)
8-Bit RGBA Texture 4096, 2048, 1024, 512 (8)
8-Bit RGB Cubemap 1024, 512, 256, 128 (4)

16-Bit HDR RGB Texture 4096, 2048, 1024, 512 (8)
16-Bit HDR RGBA Texture 4096, 2048, 1024, 512 (8)
16-Bit HDR RGB Cubemap 1024, 512, 256, 128 (4)

16-Bit Shadow Texture 4096, 2048, 1024, 512 (8)
16-Bit Shadow Cubemap 2048, 1024, 512, 256 (4)

Documentation should provide guidelines on categories of Art Assets and the resolution of textures to use.

Textures are identified by an 8-bit integer and 16-bit integer.

int8 → the texture buffer
int16 → the layer in the buffer
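As an illustration of the two-part identifier, the pair could be held in a small struct and packed into one 32-bit value, for example for storage inside a material struct. The packing layout (buffer in bits 16 to 23, layer in bits 0 to 15) and the names `TextureId`, `pack`, and `unpack` are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical handle matching the split described above.
struct TextureId
{
    std::uint8_t  buffer; // which texture buffer (format and size class)
    std::uint16_t layer;  // which layer inside that buffer
};

// Packs the pair into one 32-bit value (assumed layout: buffer << 16 | layer).
constexpr std::uint32_t pack(TextureId id)
{
    return (static_cast<std::uint32_t>(id.buffer) << 16) | id.layer;
}

// Reverses pack(); the top 8 bits of the packed value are unused.
constexpr TextureId unpack(std::uint32_t packed)
{
    return { static_cast<std::uint8_t>((packed >> 16) & 0xFFu),
             static_cast<std::uint16_t>(packed & 0xFFFFu) };
}
```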

Artists should be informed of the texture structure of the engine and its limitations. That said, these principles are worth following in other game engines too, as they are guided by what is most efficient for typical GPU hardware.

Materials are, for the most part, user-defined. Documentation should make the user aware of this. Material buffers will be a sequence of the Material Struct instances. They will at the very least contain the id of their shader.

Stages (`gfx3d`)

This is the set of stages for the graphics pipeline that runs every frame. Unless otherwise specified, each stage will be run on the GPU.

BVH
- Octree (8 Bpn, 64 bpn) [6-Layers ≈ 2.1MB]
  - Leaf Size and Tree Depth should be calculated by the scene, constraints are as follows:
    - Min Object Size
    - Max Object Size
    - Scene Center
    - Scene Edge
  - Buffer has implicit locations due to the tree having 8 children.
  - Insertions and Updates are done on the CPU
  - Nodes
    - Start Index int32
    - Object Count int32
  - Objects
    - Buffer of Object IDs grouped by Octree Node
- Leaf Culling
  - Starting at each Octree Leaf, traverse upwards.
  - Insert Visible Leaf IDs
    - Track using atomic buffer
- Generate the Command Buffer for Culled Mesh LODs from the Visible Leaf Buffer
  - Track counts using atomic buffers
  - To avoid double counting due to the construction of the Octree output, we have some options
    - Ignore Leaf Instances based on occurrences of the mesh in the surrounding 8 Octree Leaves. This would require a bias towards a specific corner of the filter.
    - Perform a preprocessing step on the CPU to erase duplicate elements and fix the buffer continuity.
    - Let the duplicates be rendered.
  - Generate the Culled Object Buffer with the respective object IDs
    - Adjust Buffer Size using the counts
    - Insert by reusing the count buffer, clipped to only contain used meshes
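The implicit node locations and the memory estimate can be worked through in a few lines. This sketch assumes a breadth-first array layout with the root at index 0, which is one common way to get implicit child positions; the helper names are hypothetical. The 2.1MB figure is consistent with counting the 8^6 nodes of the sixth level at 8 bytes each, while a full tree that also stores the shallower levels would come to roughly 2.4MB.

```cpp
#include <cassert>
#include <cstdint>

// Node layout from the list above: two 32-bit integers, 8 bytes per node.
struct OctreeNode
{
    std::int32_t start_index;  // first entry in the object ID buffer
    std::int32_t object_count; // number of object IDs owned by this node
};
static_assert(sizeof(OctreeNode) == 8, "8 Bpn, 64 bpn");

// Assumed breadth-first layout: the children of node i are 8i + 1 .. 8i + 8.
constexpr std::uint32_t child_index(std::uint32_t node, std::uint32_t child)
{
    return 8u * node + 1u + child;
}

// Inverse of child_index; the root (index 0) has no parent.
inline std::uint32_t parent_index(std::uint32_t node)
{
    assert(node > 0u);
    return (node - 1u) / 8u;
}

// Number of nodes on a single level; level 0 is the root.
constexpr std::uint64_t nodes_at_level(unsigned level)
{
    return std::uint64_t{1} << (3u * level);
}

// Number of nodes in a full tree with levels 0..depth: (8^(depth + 1) - 1) / 7.
constexpr std::uint64_t nodes_in_tree(unsigned depth)
{
    return ((std::uint64_t{1} << (3u * (depth + 1u))) - 1u) / 7u;
}

static_assert(nodes_at_level(6) * sizeof(OctreeNode) == 2097152, "about 2.1MB");
static_assert(nodes_in_tree(6) * sizeof(OctreeNode) == 2396744, "about 2.4MB");
```

Since parents and children are found by arithmetic alone, the upward traversal in Leaf Culling needs no stored pointers.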

Debug View: Object ID, Mesh ID, LOD

Visibility
- Buffer (15 Bpp, 120 bpp) [1920x1080] ≈ 31.1MB
  - Depth Buffer → D24
  - Visibility Info → RGB32I
    - R = Object ID
    - G = Mesh ID
    - B = Material ID
- Regenerate the Command Buffer for Visible Mesh LODs
- Regenerate the Culled Object Buffer

Debug View: Visibility Buffer

G-Buffer Pass (17 Bpp, 136 bpp) [1920x1080] ≈ 35.3MB
- Depth - Stencil → D24_S8
  - S → used to represent the lighting model.
- Diffuse → RGBA8
  - A → Ambient Occlusion
- Emission → RGB8
- Normal → RGB8
- Specular → RGB8
  - R → Roughness
  - G → Specularity (sometimes called the Metallicness)
  - B → Index of Refraction (IOR)

Debug View: Depth, Stencil, Diffuse, Emission, Normal, Specularity
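The G-Buffer size above can be reproduced from the attachment formats. The sketch assumes tightly packed formats with no padding or alignment added by the driver, which real hardware may not honour for three-channel formats; the constant and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Bytes per pixel of each G-Buffer attachment listed above.
constexpr std::uint64_t depth_stencil_bytes = 4; // D24_S8
constexpr std::uint64_t diffuse_bytes       = 4; // RGBA8
constexpr std::uint64_t emission_bytes      = 3; // RGB8
constexpr std::uint64_t normal_bytes        = 3; // RGB8
constexpr std::uint64_t specular_bytes      = 3; // RGB8

constexpr std::uint64_t gbuffer_bpp =
    depth_stencil_bytes + diffuse_bytes + emission_bytes + normal_bytes + specular_bytes;
static_assert(gbuffer_bpp == 17, "17 Bpp, 136 bpp");

// Total bytes for a render target of the given size, ignoring padding
// and alignment (an assumption of this sketch).
constexpr std::uint64_t target_bytes(std::uint64_t width, std::uint64_t height,
                                     std::uint64_t bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

static_assert(target_bytes(1920, 1080, gbuffer_bpp) == 35251200, "about 35.3MB");
```

The same helper can be pointed at other resolutions to budget memory before committing to a buffer layout.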

Deferred Lighting Pass (10 Bpp, 80 bpp) [1920x1080] ≈ 2 x 16.3MB + 8.3MB ≈ 24.6MB
- Depth Buffer → D24
- Lighting Buffer → RGB16 (w/ Mipmapping when Bloom or DoF are enabled)
- Stencil Buffer → S8
- Generate Dynamic Shadows
- Generate Dynamic Reflections (Optional)
- SSAO (Optional)
- Apply Lighting Model

Debug View: Shadows, Reflections, SSAO, Deferred Lighting

Forward Pass
- BVH, Same as Above
- LOD Selection, Same as Above
- Translucent Materials
  - Dual Depth Peeling

Debug View: Forward Mask

Post Processing
- Depth of Field (Optional)
  - When enabled, the Visibility Buffer, G-Buffer, and Deferred Lighting Pass will be double layered.
  - At this point the Lighting Buffers will be Flattened
- Bloom (Optional) → Mipmap Blurring (6Bpp, 48bpp) [1920x1080] ≈ 16.3MB
- Tonemapping (Optional)
- HDR Correction

3D Physics (`physics3d`)

Links:

Systems

Rigid Body Physics
- Newtonian Physics and Collision Resolution
- Articulated Skeletal Systems
  - Inverse Kinematics
  - Stiff Rods

Particle Physics

Soft Body Physics
- Elastics → Finite Element Simulation
- Cloth → Position-Based Dynamics
- Water
  - Oceans → iWave
    - Reasoning: iWave provides interactive lightweight fluid dynamics suitable for flat planes of water.
  - 3D Fluid Dynamics → Smoothed-Particle Hydrodynamics
    - Reasoning: This is the simplest method for simulating 3D bodies of water. This should exclusively be used for small scale simulations where self-interacting fluids are necessary, e.g. pouring water into a glass.
  - 2D Fluid Dynamics → Force-Based Dynamics
    - Reasoning: This model, like iWave, provides lightweight interactive fluid dynamics, but is more easily adapted to flowing surfaces such as streams and rivers.

Artificial Intelligence (`ai`)

This artificial intelligence method only differs in static generation between 2D and 3D. The solvers are dimension independent since they work on a graph.

The general process is:

Static:

generate a static navigation graph (sometimes called a NavMesh)

Update:

resolve dynamic blockers
update paths using Dijkstra's algorithm
apply rigid-body forces with constraints
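The path update step can be sketched with a textbook Dijkstra over an adjacency list. The `NavGraph` alias and `shortest_costs` function are hypothetical names, and the sketch only computes costs; a real solver would also record predecessors to rebuild the path and would account for the dynamic blockers resolved in the previous step.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical navigation graph: graph[u] lists (neighbour, cost) pairs.
using NavGraph = std::vector<std::vector<std::pair<std::size_t, float>>>;

// Returns the cheapest cost from `source` to every node using Dijkstra's
// algorithm. Unreachable nodes keep a cost of infinity. Edge costs must be >= 0.
inline std::vector<float> shortest_costs(const NavGraph& graph, std::size_t source)
{
    assert(source < graph.size());
    const float inf = std::numeric_limits<float>::infinity();
    std::vector<float> cost(graph.size(), inf);

    using Entry = std::pair<float, std::size_t>; // (cost so far, node)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> open;

    cost[source] = 0.0f;
    open.push({0.0f, source});
    while (!open.empty()) {
        const Entry top = open.top();
        open.pop();
        const float       c = top.first;
        const std::size_t u = top.second;
        if (c > cost[u])
            continue; // stale entry; a cheaper path was already found
        for (const auto& edge : graph[u]) {
            const float next = c + edge.second;
            if (next < cost[edge.first]) {
                cost[edge.first] = next;
                open.push({next, edge.first});
            }
        }
    }
    return cost;
}
```

Because the solver only sees nodes and edge costs, the same code serves both the 2D and 3D navigation graphs.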

The update loop for artificial intelligence should only run every n ticks, where n <= k, with k being the tick rate of the physics engine.
