Planning Documentation for fennec
Table of Contents
- Introduction
- TODO
- C++ Language
- Math Library
- Memory Library
- Containers Library
- Format Processing
- Core
- Platform Support Layer
- Scene
- 2D Graphics
- 3D Graphics
- 3D Physics
- Artificial Intelligence
Introduction
This file serves as a general planning document for engine structure, systems, pipelines, and implementation.
Core engine systems should strive for O(1) runtime and memory cost.
This is obviously not realistic for the entire engine, so the goal is narrower: hot paths should be O(1).
I deliberately use 'strive' and 'goal' as different concepts. Designs should strive to leave room for O(1) function implementations,
but the specifics of an implementation might not always be able to achieve that, so the end goal is that hot paths should be O(1).
Functions should be highly verbose and any bugprone or erroneous behaviour should throw assertions. DO NOT USE EXCEPTIONS.
System implementations should be independent of architecture or platforms, i.e. the code of the graphics system should not care whether OpenGL or Vulkan is used and should not make any direct calls to either.
The engine should not care about the types of objects loaded from a so/dll. In fact, most of the code should be type independent. Any shared information among a collection of objects should be held either implicitly or explicitly in the super-class. It will be the responsibility of the linked code to initialize and cleanup the objects related to it. This principle should extend to the submodules of the engine.
It is also best to avoid objects having behaviour that is not defined by the system they are in. Extensions and mods are the exception, and they should be given configurability and programmability within those systems and their stages. This can, however, be achieved using on-demand events at different stages of those systems.
TODO
- 2D Graphics (gfx2d)
- 2D Physics (physics2d)
- 2D & 3D Audio (audio)
File Security Ramblings:
Windows is starting to piss me off, so I am considering dropping official support for MSVC. MinGW and Cygwin will still work for compiling on Windows if this ends up being the case. The reason for this is that there are a lot of platform dependent security issues. MinGW and Cygwin provide POSIX-style headers on Windows, which would push the security onus onto the compiler and end-user.
The biggest blocker at the moment in terms of this is the filesystem. If we want to implement a filesystem that is safe across platforms, stdc++ and iso libc have no guarantees about the safety of their functions.
The crux of this issue falls at the following specific behaviour:
- User selects an existing file to write to
- Application interface confirms overwrite action
- Application writes to the file after confirmation
A threat actor can swap in a malicious file or symlink at the target path between the check and the
use of the file. This is called TOCTOU (time of check, time of use).
This issue can be solved using fopen("<file>", "a+") and ftell, however this specific behaviour is not intuitive to
those first learning how to work with file systems. We can attempt to abstract this away with another wrapper, or simply
write the file structure to handle this behaviour properly. The downside to this method overall is that it will break
common conventions of how humans interpret filesystems and the related control flow logic. What we can do is force the
'+' flag to always be present for write operations, and raise an error when desired, if the file is not empty. This
unfortunately would have the downside of being unable to open a file as write only.
Using "wx" in this instance would not be sufficient since it would require a second call to fopen, which would
create the conditions for the TOCTOU error described above.
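A minimal sketch of the single-handle idea described above. The helper `open_checked` and the `checked_file` type are hypothetical names, not part of fennec. The file is opened once with "a+", and the emptiness check runs on that same handle, so there is no second fopen between the check and the write. It does not validate what the path pointed to at the moment of opening.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical result type; the names here are illustrative only.
struct checked_file
{
    std::FILE* handle;   // nullptr if the open failed
    bool       had_data; // true if the file already contained bytes
};

// Open `path` exactly once in append/update mode and inspect that same handle.
// The check and any later write share one handle, so the path is only
// resolved a single time.
inline checked_file open_checked(const char* path)
{
    assert(path != nullptr && "open_checked requires a path");

    checked_file result{std::fopen(path, "a+b"), false};
    if (result.handle == nullptr)
        return result;

    // Seek to the end explicitly; the initial read position in "a+" mode
    // is implementation-defined.
    const int rc = std::fseek(result.handle, 0, SEEK_END);
    assert(rc == 0 && "fseek failed on a freshly opened file");
    (void)rc;

    result.had_data = std::ftell(result.handle) > 0;
    return result;
}
```

A caller would ask for overwrite confirmation only when `had_data` is true, then keep writing through `handle` rather than reopening the path.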
Another issue arises when we are parsing a directory tree. The best we can do is take ownership of the directory that
is opened as the root. However, this requires dirent.h which is not implemented in MSVC. A custom implementation of
dirent.h may be written for MSVC, however this is one of the few things I am not willing to outsource to another
library. Developing our own implementation would take a not insignificant amount of time, between writing the library,
debugging it, and testing for vulnerabilities. As stated above, this implementation is native to MinGW and Cygwin,
so we would not have to entirely drop support for Windows. However, MSVC is the most widely used compiler for Windows
applications and is native to Visual Studio and VSCode.
What is probably the best solution is to wrap everything in a file interface that does not allow the direct setting of
these flags. Then we set our own usage type for the file that informs which flags should be used.
We need to be able to handle the following types of files:
- Assets, such as scenes, audio, textures, metadata, meshes, etc.
- Save files, setting files, etc.
One of the nice things about the assets is that they are guaranteed to be read-only once an application is installed
on the computer of the end-user. Therefore, this issue only arises with save files and custom file formats.
When the editor is run, all these files should be opened in read/write mode.
Naming conventions should exist for the types of files and how they are read. For example, in release mode,
most assets should be opened once, and then closed immediately. However, this does not make sense for formats
that are continuous and too large to be kept around in memory, such as video formats.
Perhaps the following conventions:
- Static Asset
- Stream Asset
- Resource
We can turn this into an object-oriented approach by having different formats inherit these base types. We may still
have a base file type that wraps C functionality, but discourage developers from using the interface.
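One way the usage types could drive the stdio flags, as a sketch only. The enum `file_usage`, the functions `mode_for` and `open_as`, and the specific mode chosen for each usage are assumptions for illustration; the actual wrapper is documented in fennec/fproc/io/file.h and may differ.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical usage categories following the conventions listed above.
enum class file_usage
{
    static_asset, // read once, then closed
    stream_asset, // read incrementally, e.g. video
    resource,     // save files and settings: read/write
    editor_asset  // any asset while the editor is running: read/write
};

// Map a usage to a C stdio mode string so callers never pass raw flags.
// The modes chosen here are assumptions, not the engine's final policy.
constexpr const char* mode_for(file_usage usage)
{
    switch (usage)
    {
        case file_usage::static_asset: return "rb";
        case file_usage::stream_asset: return "rb";
        case file_usage::resource:     return "r+b";
        case file_usage::editor_asset: return "r+b";
    }
    return nullptr; // not reached for valid enum values
}

// Open a file through its usage type instead of a raw mode string.
inline std::FILE* open_as(const char* path, file_usage usage)
{
    assert(path != nullptr && "open_as requires a path");
    return std::fopen(path, mode_for(usage));
}
```

Note that "r+b" fails when the file does not exist, so a real resource type would still need a creation path.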
We could also declare the file interface extern so that only internal files know the implementation. However, I would
not be satisfied by doing this since it would prevent developers from implementing custom file type implementations.
Conserving memory is not really an issue here as long as we are smart about our implementation. Files should only be
open when necessary and be closed when it is no longer necessary to have them open. Data should be streamed unless
all the data in the file is required.
When built in release mode, we also need to pack static assets into some sort of archive that is mountable to reduce
disk space consumption of a program.
I was considering encryption for archives, however it does not make much sense. Assuming someone intends to pirate the
game, there is not much stopping them from running the files. I will add Steam support at some point which would allow
you to use Steam's DRM to prevent the executable from being run. Otherwise, there is no point in attempting to encrypt
game files. Even Unreal PAK files can be cracked in seconds, and even if I managed to write something that cannot be
trivially cracked locally, you can scrape most assets from the GPU and Audio Card.
I have managed to solve the specific case provided at the top of this section, which was done by wrapping C I/O calls
into a file wrapper. This wrapper can handle a few different mode flags and has specific conditions for the flags. See
the documentation for fennec/fproc/io/file.h for more info.
One question remains unanswered on this front: should a read/write file open as r+ or a+? rewind is slightly faster
than fseek(SEEK_END); however, for save files and editor assets, r+ makes more sense from a usage perspective.
Directories remain an issue, with dirent.h being the only sensible option at time of writing. The issue with using
dirent.h boils down to security issues on Windows. However, the only option is to write a custom implementation
for MSVC.
Platform & API Support
I have decided to forgo SDL so that the engine can provide specific support for specific platforms. SDL also implements a lot of things that will need to be implemented specifically for the engine, so only its window management would be used.
Platform support will be implemented in the following order:
- Linux/BSD
- Wayland
- OpenGL (EGL) ✔
- XKB
- PulseAudio
- Vulkan
- X11
- ALSA
- Vulkan
- Microsoft Windows
- XInput
- OpenGL (WGL)
- WASAPI
- Vulkan
- Android
- OpenGL ES
- AAudio
- OpenSL ES
- macOS/iOS
- Cocoa
- OpenGL
- Core Audio
- Vulkan
- Metal
Linux Wayland will be implemented first. Once setup, the core engine will be implemented and tested on top of Wayland. Once the engine is in a stable state, then support for other platforms will be resumed.
Most consoles will never get official platform support due to NDAs which conflict with the principles of this engine. fennec will avoid using proprietary libraries except when strictly necessary, such as for Windows and macOS support. fennec will interact with any drivers required for the operating systems listed above, even if proprietary.
C++ Language Library (lang)
Implement header files for standard functions relating to the C++ Language.
So far this is implemented on an as-needed basis. A full implementation should be worked on continuously.
Math Library (math)
Implement math functions according to the OpenGL 4.6 Shading Language Specification.
"Extensions" has a different meaning here. Extensions for the math library are any functions that are not defined within the Specification.
Additional extensions should be implemented to provide standard definitions for functions predominantly related to Linear Algebra, Mathematical Analysis, and more specifically Discrete Analysis. Additional extensions will be implemented on an as-needed basis.
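As an illustration of what following the Specification means in practice, here is a scalar sketch of three of its built-ins: `clamp`, `mix`, and `smoothstep`. The `sketch` namespace and the scalar-only signatures are assumptions for brevity; the engine's versions would presumably also cover vector types.

```cpp
#include <cassert>

namespace sketch
{
// clamp(x, lo, hi): equivalent to min(max(x, lo), hi) when lo <= hi.
constexpr float clamp(float x, float lo, float hi)
{
    return x < lo ? lo : (x > hi ? hi : x);
}

// mix(x, y, a): linear blend x * (1 - a) + y * a.
constexpr float mix(float x, float y, float a)
{
    return x * (1.0f - a) + y * a;
}

// smoothstep(edge0, edge1, x): Hermite interpolation between 0 and 1.
// The result is undefined in GLSL when edge0 >= edge1, so assert on it.
inline float smoothstep(float edge0, float edge1, float x)
{
    assert(edge0 < edge1 && "smoothstep requires edge0 < edge1");
    const float t = clamp((x - edge0) / (edge1 - edge0), 0.0f, 1.0f);
    return t * t * (3.0f - 2.0f * t);
}
} // namespace sketch
```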
Memory Library (memory)
Implement headers related to memory allocation in C++.
- Smart Pointers
- Unique Pointer
- Shared Pointer
- Memory Allocation
- Allocation
Containers Library (containers)
All containers of the C++ Standard Library should be implemented.
Here are essential data-structures not specified in the C++ stdlib:
- Graph → AI (graph)
- Necessary for 2D and 3D navigation.
- Rooted Directed Tree → Scene (rd_tree)
- Defines the scene structure.
Format Processing (fproc)
This library contains information for any data that is formatted. This includes basic string formats, file formats, and eventually programming languages.
fennec should be able to use Doxygen and LaTeX externally. Consider including binaries with releases.
Notes
- String Analysis (fproc/strings)
- Search
- Manipulation
- Delimiting
- Regex
- File Formats (fproc/formats)
- Serialization
- JSON
- HTML
- XML
- YAML
- Configuration
- INI
- TOML
- Documents
- ODF
- Markdown
- Spreadsheets & Tables
- ODS
- CSV
- Audio Formats
- MP3
- WAV
- AAC
- Graphics Formats
- Textures
- BMP
- DDS
- JPG
- PNG
- TIFF
- Vectors
- OTF
- SVG
- TTF
- Models
- FBX
- Wavefront OBJ
- Video Formats
- MP4
- AVI
- MPG
- MOV
TODO LATER
- Compilation (fproc/code)
- Lexical Analysis
- Syntax Analysis
- Semantic Analysis
- Intermediate Code Generation
- Optimization
- Target Code Generation
Core (core)
This will be the core of the engine.
- Event System
- Most events will fire at the start of the next tick, especially those related to physics and input.
- Events for graphics or audio should propagate immediately.
- Events for stages should also propagate immediately to support extensions and mods.
- Core Engine Loop
- System Manager
- Ticks vs. Frames
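The split between deferred and immediate events could look roughly like this. The `event` and `event_bus` types and their member names are hypothetical; a real implementation would carry typed payloads and per-event subscriptions.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical event payload; only an id for the sake of the sketch.
struct event
{
    int id;
};

class event_bus
{
public:
    using handler = std::function<void(const event&)>;

    void subscribe(handler h)
    {
        assert(h && "handler must be callable");
        handlers_.push_back(std::move(h));
    }

    // Graphics, audio, and stage events: deliver right away.
    void emit_immediate(const event& e) const
    {
        for (const handler& h : handlers_)
            h(e);
    }

    // Physics and input events: hold until the next tick begins.
    void post_deferred(const event& e) { pending_.push_back(e); }

    // Called at the start of each tick to flush everything queued so far.
    void begin_tick()
    {
        std::vector<event> batch;
        batch.swap(pending_); // events posted by handlers wait for the next tick
        for (const event& e : batch)
            emit_immediate(e);
    }

private:
    std::vector<handler> handlers_;
    std::vector<event>   pending_;
};
```

Swapping the queue out before dispatch means a handler that posts a new deferred event cannot extend the current tick's batch.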
The following systems are not essential to the core engine, but are instead major systems that should be defined in their operation order:
Tick
- Update
- Events
- Scripts
- AI
- Physics
- Newtonian Commit
- Apply Forces (Updates Accelerations)
- Acceleration (Updates Velocities)
- Apply Velocities (Updates Position and Rotation)
- Constraint Resolution
- Collision Detection
- Collision Resolution
- Collision Response
- Calculate Forces & Velocities
- Queue events for next tick
Frame
- Physics
- Physics Interpolation
- Graphics
- 2D Graphics
- Generate 3D Mask
- 3D Graphics
- Audio
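A one-dimensional sketch of the two physics steps listed above: the Newtonian commit that runs per tick, and the interpolation that runs per frame. The `body` struct and the function names are hypothetical, and the update order shown (force, then velocity, then position) is what is commonly called semi-implicit Euler, which is an assumption about how the commit would be implemented.

```cpp
#include <cassert>

// One-dimensional body for brevity; the names are illustrative.
struct body
{
    float position     = 0.0f;
    float velocity     = 0.0f;
    float force        = 0.0f;
    float inverse_mass = 1.0f;
};

// Tick: forces update acceleration, acceleration updates velocity,
// velocity updates position.
inline void commit(body& b, float dt)
{
    assert(dt > 0.0f && "tick length must be positive");
    const float acceleration = b.force * b.inverse_mass;
    b.velocity += acceleration * dt;
    b.position += b.velocity * dt;
}

// Frame: blend the previous and current tick states.
// alpha is the fraction of a tick elapsed since the last tick, in [0, 1].
inline float interpolate(float previous, float current, float alpha)
{
    assert(alpha >= 0.0f && alpha <= 1.0f);
    return previous + (current - previous) * alpha;
}
```

The renderer would call `interpolate` with the positions from the last two ticks, so frames can run faster than ticks without visible stepping.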
Platform Support Layer (platform)
This is the core part of platform support for fennec. All necessary drivers
and OS specific functionality will be wrapped up nicely into these interfaces.
See the implementation order under Platform & API Support.
Scene (scene)
- In-Array Directed Tree
- Elegant method for providing O(1) insertions and O(log(n)) deletions.
- Bounding Volume Hierarchy
- Octree
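A minimal sketch of an in-array directed tree, assuming a first-child/next-sibling layout stored in one contiguous vector. The names `array_tree`, `tree_node`, and `no_node` are hypothetical. Only insertion is shown; deletion is left out because reaching the O(log(n)) bound depends on layout decisions this plan does not spell out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int no_node = -1; // sentinel for "no parent / child / sibling"

struct tree_node
{
    int parent;
    int first_child;
    int next_sibling;
};

class array_tree
{
public:
    // Index 0 is always the root.
    array_tree() { nodes_.push_back(tree_node{no_node, no_node, no_node}); }

    // Amortized O(1): append a node and link it as the parent's first child.
    int insert(int parent)
    {
        assert(is_valid(parent) && "parent must be an existing node");
        const int index     = static_cast<int>(nodes_.size());
        const int old_first = nodes_[parent].first_child;
        nodes_.push_back(tree_node{parent, no_node, old_first});
        nodes_[parent].first_child = index;
        return index;
    }

    int parent_of(int index) const
    {
        assert(is_valid(index));
        return nodes_[index].parent;
    }

    int first_child_of(int index) const
    {
        assert(is_valid(index));
        return nodes_[index].first_child;
    }

    // Number of edges between `index` and the root.
    int depth(int index) const
    {
        assert(is_valid(index));
        int d = 0;
        while (nodes_[index].parent != no_node)
        {
            index = nodes_[index].parent;
            ++d;
        }
        return d;
    }

    std::size_t size() const { return nodes_.size(); }

private:
    bool is_valid(int index) const
    {
        return index >= 0 && static_cast<std::size_t>(index) < nodes_.size();
    }

    std::vector<tree_node> nodes_;
};
```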
2D Graphics (gfx2d)
Links:
- https://en.wikipedia.org/wiki/Quadtree
- https://developer.nvidia.com/gpugems/gpugems3/part-iv-image-effects/chapter-25-rendering-vector-art-gpu
Object Structure. The mesh is implicit data.
Structures (gfx2d)
For the 2d rendering framework, Materials need to be rendered independently because we have
no size constraints for images. This disallows us from using a meta-shader like in
the 3d rendering framework.
struct Object
{
    vec2 location, scale; // A 3x3 matrix would be 36 bytes, this is instead 20 bytes
    float rotation;
};
- BVH
- Quadtree
- Leaf Size and Tree Depth should be calculated by the scene, constraints are as follows:
- Min Object Size
- Max Object Size
- Scene Center
- Scene Edge
- Insertions and Updates are done on the CPU
- Nodes
- Start Index 32-bits
- Object Count 32-bits
- Objects
- Buffer of Object IDs grouped by Quadtree Node
- Culling
- Starting at each Quadtree Leaf, traverse upwards.
- Insert Visible Leaf IDs
- Track using atomic buffer
- Generate the Command Buffer for Culled Meshes from the Visible Leaf Buffer
- Count Materials
- Count Meshes per Material
- Generate the Culled Object Buffer by copying objects from the Object Buffer
- Adjust Buffer Size using the counts
- Insert using another atomic buffer
- Translucent objects will be sorted. We can cheat by using a z-index instead of a z-coordinate.
This will allow us to sort objects as they are created. We can still bulk render each z-index,
with meshes and objects being grouped by material.
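A sketch of the ordering this implies: z-index first, then material, so each z-index can be drawn in bulk with its objects grouped by material. The `draw_item` struct and function names are hypothetical, and a batch sort stands in here for inserting objects in order as they are created.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

// Hypothetical per-object draw record.
struct draw_item
{
    int z_index;
    int material;
    int object_id;
};

// Lower z-index draws first; within a z-index, group by material.
inline bool draws_before(const draw_item& a, const draw_item& b)
{
    return std::tie(a.z_index, a.material) < std::tie(b.z_index, b.material);
}

// Stable, so objects sharing a z-index and material keep creation order.
inline void sort_for_rendering(std::vector<draw_item>& items)
{
    std::stable_sort(items.begin(), items.end(), draws_before);
    assert(std::is_sorted(items.begin(), items.end(), draws_before));
}
```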
3D Graphics (gfx3d)
Links:
- https://en.wikipedia.org/wiki/Octree
- https://www.adriancourreges.com/blog/2015/11/02/gta-v-graphics-study/
- https://learnopengl.com/PBR/Lighting
- https://learnopengl.com/PBR/IBL/Diffuse-irradiance
- https://en.wikipedia.org/wiki/Schlick%27s_approximation
- https://pixelandpoly.com/ior.html
- https://developer.download.nvidia.com/SDK/10/opengl/screenshots/samples/dual_depth_peeling.html
DirectX will never have official support. If you would like to make a fork, have at it, but know that I will hold a deep disdain for you.
The graphics pipeline will have a buffer with a list of objects and their rendering data.
This will be referred to as the Object Buffer. There will be two, for both the Deferred and Forward Passes.
The buffers will be optimized by scene prediction.
This involves tracking the meshes and textures directly and indirectly used by a scene.
A callback function in the graphics system for scene loading can do this.
Materials and Lighting models will be run via a shader metaprogram to make the pipeline independent of this aspect. This allows the GPU to draw every single deferred rendered mesh in a single draw call for each stage of the renderer.
Specifications for debugging views via early breaks are included in the stages.
There will be three profiles for OpenGL implementation:
- modern
- fallback
- legacy
All profiles will have the same feature set, however their implementations will differ. The modern context will use up-to-date features to get as much performance out of the pipeline as possible.
Structures (gfx3d)
Object Structure. The mesh is implicit data.
struct Object
{
    vec3 location, scale; // A 4x4 matrix would be 64 bytes; location, scale, and rotation total 40 bytes
    quat rotation;
    int material;
};
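The byte counts can be checked at compile time. This sketch assumes 4-byte `float` and `int` and uses stand-in definitions of `vec3`, `quat`, and `mat4`, since the engine's own math types are not shown here. Under those assumptions the transform part (location, scale, rotation) is 40 bytes and the whole struct is 44 bytes with the material index.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in math types; the engine's own definitions may differ.
struct vec3 { float x, y, z; };
struct quat { float x, y, z, w; };
struct mat4 { float m[16]; };

struct Object
{
    vec3 location, scale;
    quat rotation;
    int  material;
};

static_assert(sizeof(mat4) == 64, "a 4x4 float matrix is 64 bytes");
static_assert(sizeof(vec3) * 2 + sizeof(quat) == 40, "location + scale + rotation");
static_assert(sizeof(Object) == 44, "transform plus a 4-byte material index");

// Runtime counterpart to the static checks, handy in a debug build.
inline void check_object_layout()
{
    assert(offsetof(Object, rotation) == 24);
    assert(offsetof(Object, material) == 40);
}
```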
Textures for 3D rendering are stored in various buffers with sizes of powers of 2.
Ratios of 1:1 and 2:1 are allowed. The 2:1 ratio is specifically for spherical and cylindrical projection.
UVs may be transformed to use a 2:1 as if it were 1:2.
Cubemaps may only be 1:1, I would be concerned if you are using any other ratio.
- 8-Bit R Texture: 4096, 2048, 1024, 512 (8)
- 8-Bit RG Texture: 4096, 2048, 1024, 512 (8)
- 8-Bit RGB Texture: 4096, 2048, 1024, 512 (8)
- 8-Bit RGBA Texture: 4096, 2048, 1024, 512 (8)
- 8-Bit RGB Cubemap: 1024, 512, 256, 128 (4)
- 16-Bit HDR RGB Texture: 4096, 2048, 1024, 512 (8)
- 16-Bit HDR RGBA Texture: 4096, 2048, 1024, 512 (8)
- 16-Bit HDR RGB Cubemap: 1024, 512, 256, 128 (4)
- 16-Bit Shadow Texture: 4096, 2048, 1024, 512 (8)
- 16-Bit Shadow Cubemap: 2048, 1024, 512, 256 (4)
Documentation should provide guidelines on categories of Art Assets and the resolution of textures to use.
Textures are identified by an 8-bit integer and a 16-bit integer.
- int8 → the texture buffer
- int16 → the layer in the buffer
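A sketch of how such an identifier might be represented and packed into one 32-bit word, for example inside a material struct. The `texture_id` type, the unsigned field types, and the bit layout are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical identifier: unsigned fields are an assumption.
struct texture_id
{
    std::uint8_t  buffer; // which texture buffer
    std::uint16_t layer;  // which layer inside that buffer
};

// Pack into the low 24 bits: buffer in bits 16..23, layer in bits 0..15.
constexpr std::uint32_t pack(texture_id id)
{
    return (static_cast<std::uint32_t>(id.buffer) << 16) | id.layer;
}

// Inverse of pack(); the top 8 bits must be unused.
inline texture_id unpack(std::uint32_t packed)
{
    assert((packed >> 24) == 0 && "upper 8 bits are not part of the id");
    return texture_id{static_cast<std::uint8_t>((packed >> 16) & 0xFFu),
                      static_cast<std::uint16_t>(packed & 0xFFFFu)};
}
```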
Artists should be informed on the texture structure of the engine and its limitations. However, these principles should be followed in other game engines as these are guided by what is most efficient for typical GPU hardware.
Materials are, for the most part, user-defined. Documentation should make the user aware of this. Material buffers will be a sequence of the Material Struct instances. They will at the very least contain the id of their shader.
Stages (gfx3d)
This is the set of stages for the graphics pipeline that runs every frame. Unless otherwise specified, each stage will be run on the GPU.
- BVH
- Octree (8 Bpn, 64 bpn) [6-Layers ≈ 2.1MB]
- Leaf Size and Tree Depth should be calculated by the scene, constraints are as follows:
- Min Object Size
- Max Object Size
- Scene Center
- Scene Edge
- Buffer has implicit locations due to the tree having 8 children.
- Insertions and Updates are done on the CPU
- Nodes
- Start Index int32
- Object Count int32
- Objects
- Buffer of Object IDs grouped by Octree Node
- Leaf Culling
- Starting at each Octree Leaf, traverse upwards.
- Insert Visible Leaf IDs
- Track using atomic buffer
- Generate the Command Buffer for Culled Mesh LODs from the Visible Leaf Buffer
- Track counts using atomic buffers
- To avoid double counting due to the construction of the Octree output, we have some options:
- Ignore Leaf Instances based on occurrences of the mesh in the surrounding 8 Octree Leaves. This would require a bias towards a specific corner of the filter.
- Perform a preprocessing step on the CPU to erase duplicate elements and fix the buffer continuity.
- Let the duplicates be rendered.
- Generate the Culled Object Buffer with the respective object IDs
- Adjust Buffer Size using the counts
- Insert by reusing the count buffer, clipped to only contain used meshes
Debug View: Object ID, Mesh ID, LOD
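A sketch of the implicit node addressing, assuming the octree is stored breadth-first in a flat buffer so that child and parent positions follow from the index alone. The `octree_node` struct mirrors the Start Index / Object Count pair above; the function names and the breadth-first layout are assumptions.

```cpp
#include <cassert>
#include <cstdint>

// 8 bytes per node, matching the two int32 fields listed above.
struct octree_node
{
    std::int32_t start_index;  // first entry in the object ID buffer
    std::int32_t object_count; // number of entries belonging to this node
};
static_assert(sizeof(octree_node) == 8, "expected 8 bytes per node");

// Breadth-first layout: node i has children 8i + 1 through 8i + 8.
inline std::int64_t child_index(std::int64_t node, int octant)
{
    assert(node >= 0 && octant >= 0 && octant < 8);
    return 8 * node + 1 + octant;
}

inline std::int64_t parent_index(std::int64_t node)
{
    assert(node > 0 && "the root has no parent");
    return (node - 1) / 8;
}

// Number of nodes on one level: 8^depth, with the root at depth 0.
constexpr std::int64_t nodes_on_level(int depth)
{
    return std::int64_t{1} << (3 * depth);
}
```

At depth 6 a single level holds 8^6 nodes, which at 8 bytes each comes to 2,097,152 bytes. That is in line with the ≈ 2.1MB figure if it counts only the deepest level.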
- Visibility
- Buffer (15 Bpp, 120 bpp) [1920x1080] ≈ 31.1MB
- Depth Buffer → D24
- Visibility Info → RGB32I
- R = Object ID
- G = Mesh ID
- B = Material ID
- Regenerate the Command Buffer for Visible Mesh LODs
- Regenerate the Culled Object Buffer
Debug View: Visibility Buffer
- G-Buffer Pass (17 Bpp, 136 bpp) [1920x1080] ≈ 35.3MB
- Depth - Stencil → D24_S8
- S → used to represent the lighting model.
- Diffuse → RGBA8
- A → Ambient Occlusion
- Emission → RGB8
- Normal → RGB8
- Specular → RGB8
- R → Roughness
- G → Specularity (sometimes called Metalness)
- B → Index of Refraction (IOR)
Debug View: Depth, Stencil, Diffuse, Emission, Normal, Specularity
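The memory figures attached to each stage follow from width × height × bytes per pixel. A small sketch of that arithmetic, using the G-Buffer as the worked case; the function name `buffer_bytes` is hypothetical, and MB is taken as 10^6 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Total size of a full-screen buffer in bytes.
inline std::uint64_t buffer_bytes(std::uint32_t width,
                                  std::uint32_t height,
                                  std::uint32_t bytes_per_pixel)
{
    assert(width > 0 && height > 0 && bytes_per_pixel > 0);
    return static_cast<std::uint64_t>(width) * height * bytes_per_pixel;
}

// G-Buffer: D24_S8 (4) + RGBA8 (4) + three RGB8 targets (3 each) = 17 Bpp.
constexpr std::uint32_t gbuffer_bytes_per_pixel = 4 + 4 + 3 * 3;
static_assert(gbuffer_bytes_per_pixel == 17, "matches the 17 Bpp figure");
```

At 1920x1080 this gives 2,073,600 pixels × 17 bytes = 35,251,200 bytes, or about 35.3MB.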
- Deferred Lighting Pass (10 Bpp, 80 bpp) [1920x1080] ≈ 2 x 16.3MB + 8.3MB ≈ 24.6MB
- Depth Buffer → D24
- Lighting Buffer → RGB16 (w/ Mipmapping when Bloom or DoF are enabled)
- Stencil Buffer → S8
- Generate Dynamic Shadows
- Generate Dynamic Reflections (Optional)
- SSAO (Optional)
- Apply Lighting Model
Debug View: Shadows, Reflections, SSAO, Deferred Lighting
- Forward Pass
- BVH, Same as Above
- LOD Selection, Same as Above
- Translucent Materials
- Dual Depth Peeling
Debug View: Forward Mask
- Post Processing
- Depth of Field (Optional)
- When enabled, the Visibility Buffer, G-Buffer, and Deferred Lighting Pass will be double layered.
- At this point the Lighting Buffers will be Flattened
- Bloom (Optional) → Mipmap Blurring (6Bpp, 48bpp) [1920x1080] ≈ 16.3MB
- Tonemapping (Optional)
- HDR Correction
3D Physics (physics3d)
Links:
- https://www.researchgate.net/publication/264839743_Simulating_Ocean_Water
- https://arxiv.org/pdf/2109.00104
- https://www.youtube.com/watch?v=rSKMYc1CQHE
- https://tflsguoyu.github.io/webpage/pdf/2013ICIA.pdf
- https://animation.rwth-aachen.de/publication/0557/
- https://github.com/InteractiveComputerGraphics/PositionBasedDynamics?tab=readme-ov-file
- https://www.cs.umd.edu/class/fall2019/cmsc828X/LEC/PBD.pdf
Systems
- Rigid Body Physics
- Newtonian Physics and Collision Resolution
- Articulated Skeletal Systems
- Inverse Kinematics
- Stiff Rods
- Particle Physics
- Soft Body Physics
- Elastics → Finite Element Simulation
- Cloth → Position-Based Dynamics
- Water
- Oceans → iWave
- Reasoning: iWave provides interactive lightweight fluid dynamics suitable for flat planes of water.
- 3D Fluid Dynamics → Smoothed-Particle Hydrodynamics
- Reasoning: This is the simplest method for simulating 3D bodies of water. This should exclusively be used for small scale simulations where self-interactive fluids are necessary, e.g. pouring water into a glass.
- 2D Fluid Dynamics → Force-Based Dynamics
- Reasoning: This model, like iWave, provides lightweight interactive fluid dynamics, but is more easily adapted to flowing surfaces such as streams and rivers.
Artificial Intelligence (ai)
This artificial intelligence method only differs in static generation between 2D and 3D. The solvers are dimension independent since they work on a graph.
The general process is:
Static:
- generate a static navigation graph (sometimes called a NavMesh)
Update:
- resolve dynamic blockers
- update paths using Dijkstra's algorithm
- apply rigid-body forces with constraints
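For the path update step, a compact sketch of Dijkstra's algorithm over an adjacency-list graph. The `edge` and `nav_graph` types and the function name `shortest_costs` are hypothetical; how dynamic blockers are represented (for example by removing or re-weighting edges before the solve) is left open.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

struct edge
{
    std::size_t to;
    float       cost; // must be non-negative for Dijkstra's algorithm
};

using nav_graph = std::vector<std::vector<edge>>; // adjacency list

// Returns the cheapest cost from `source` to every node;
// unreachable nodes keep a cost of infinity.
inline std::vector<float> shortest_costs(const nav_graph& graph, std::size_t source)
{
    assert(source < graph.size());

    const float inf = std::numeric_limits<float>::infinity();
    std::vector<float> cost(graph.size(), inf);

    using entry = std::pair<float, std::size_t>; // (cost so far, node)
    std::priority_queue<entry, std::vector<entry>, std::greater<entry>> open;

    cost[source] = 0.0f;
    open.push({0.0f, source});

    while (!open.empty())
    {
        const entry top = open.top();
        open.pop();
        if (top.first > cost[top.second])
            continue; // stale entry; a cheaper route was already found

        for (const edge& e : graph[top.second])
        {
            assert(e.cost >= 0.0f && e.to < graph.size());
            const float candidate = top.first + e.cost;
            if (candidate < cost[e.to])
            {
                cost[e.to] = candidate;
                open.push({candidate, e.to});
            }
        }
    }
    return cost;
}
```

Because the solver only sees nodes and edge costs, the same code serves both 2D and 3D navigation graphs.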
The update loop for artificial intelligence should only run every n ticks, where n <= k, with k being the
tick rate of the physics engine.