Planning Documentation for fennec
Table of Contents
- Introduction
- TODO
- C++ Language
- Math Library
- Memory Library
- Containers Library
- Format Processing
- Core
- Application Layer
- Scene
- 2D Graphics
- 3D Graphics
- 3D Physics
- Artificial Intelligence
Introduction
This file serves as a general planning document for engine structure, systems, pipelines, and implementation.
Core engine systems should strive for O(1) implementations, both in terms of runtime and memory performance. This is obviously not a realistic goal for the entire engine, so rather than requiring everything to be O(1), we should more specifically aim for O(1) performance on hot paths.
Functions should be highly verbose, and in debug mode any bug-prone or erroneous behaviour should trigger assertions. DO NOT USE EXCEPTIONS.
System implementations should be independent of architecture and platform, i.e. the code of the graphics system should not care whether OpenGL or Vulkan is used and should not make any direct calls to OpenGL or Vulkan.
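As a concrete illustration of the assertion-over-exception rule, here is a rough sketch of a debug-only assertion macro. The name FENNEC_ASSERT and its exact behaviour are assumptions, not an existing part of the engine.

// Hypothetical debug assertion macro; names are placeholders, not an existing fennec API.
#include <cstdio>
#include <cstdlib>

#if defined(NDEBUG)
    #define FENNEC_ASSERT(expr, msg) ((void)0) // compiled out in release builds
#else
    #define FENNEC_ASSERT(expr, msg)                                            \
        do {                                                                     \
            if (!(expr)) {                                                       \
                std::fprintf(stderr, "assertion failed: %s\n  %s:%d: %s\n",      \
                             #expr, __FILE__, __LINE__, (msg));                  \
                std::abort();                                                    \
            }                                                                    \
        } while (0)
#endif

// Usage: FENNEC_ASSERT(index < size, "index out of range"); nothing is ever thrown.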
TODO
- 2D Graphics (gfx2d)
- 2D Physics (physics2d)
- 2D & 3D Audio (audio)
C++ Language Library (lang)
Implement header files for standard functions relating to the C++ Language. So far this is implemented on an as-needed basis. A full implementation should be worked on continuously.
Math Library (math)
Implement math functions according to the OpenGL 4.6 Shading Language Specification.
Additional extensions should provide standard definitions for functions predominantly related to Linear Algebra, Mathematical Analysis, and Discrete Analysis; they will be implemented on an as-needed basis.
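As an example of the GLSL-aligned surface, here is a rough sketch of clamp and mix following their definitions in the GLSL 4.6 specification. The namespace and file layout are assumptions, not the final math interface.

// Sketch of GLSL-style scalar functions; namespace and layout are assumptions.
namespace fennec::math
{
    // clamp(x, minVal, maxVal) = min(max(x, minVal), maxVal), as defined by GLSL.
    constexpr float clamp(float x, float min_val, float max_val)
    {
        return x < min_val ? min_val : (x > max_val ? max_val : x);
    }

    // mix(x, y, a) = x * (1 - a) + y * a, the GLSL linear blend.
    constexpr float mix(float x, float y, float a)
    {
        return x * (1.0f - a) + y * a;
    }
}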
Memory Library (memory)
Implement headers related to memory allocation in C++.
- Smart Pointers (a minimal sketch follows this list)
  - Unique Pointer
  - Shared Pointer
- Memory Allocation
  - Allocation
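A rough sketch of what the unique-pointer piece could look like; the type name and namespace are assumptions, and the real header would also need array support, custom deleters, and so on.

// Hedged sketch of a minimal unique pointer; not the actual fennec interface.
namespace fennec::memory
{
    template <typename T>
    class unique_ptr
    {
    public:
        unique_ptr() = default;
        explicit unique_ptr(T* ptr) : m_ptr(ptr) {}
        ~unique_ptr() { delete m_ptr; }

        unique_ptr(const unique_ptr&) = delete;            // sole ownership: no copies
        unique_ptr& operator=(const unique_ptr&) = delete;

        unique_ptr(unique_ptr&& other) noexcept : m_ptr(other.m_ptr) { other.m_ptr = nullptr; }
        unique_ptr& operator=(unique_ptr&& other) noexcept
        {
            if (this != &other) { delete m_ptr; m_ptr = other.m_ptr; other.m_ptr = nullptr; }
            return *this;
        }

        T* get() const { return m_ptr; }
        T& operator*() const { return *m_ptr; }
        T* operator->() const { return m_ptr; }

    private:
        T* m_ptr = nullptr;
    };
}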
Containers Library (containers)
All containers of the C++ Standard Library should be implemented.
Here are essential data structures not specified in the C++ stdlib:
- Graph (graph) → AI (adjacency-list sketch below)
  - Necessary for 2D and 3D navigation.
- Rooted Directed Tree (rd_tree) → Scene
  - Defines the scene structure.
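A rough sketch of the adjacency-list shape the graph container could take for navigation. std::vector is used only for brevity here; the real version would use the in-house containers, and all names are placeholders.

// Hypothetical adjacency-list graph; weights are edge traversal costs for the path solver.
#include <cstdint>
#include <vector>

namespace fennec::containers
{
    struct edge
    {
        std::uint32_t target; // index of the destination node
        float         weight; // traversal cost
    };

    struct graph
    {
        // adjacency[i] holds the outgoing edges of node i
        std::vector<std::vector<edge>> adjacency;

        std::uint32_t add_node()
        {
            adjacency.emplace_back();
            return static_cast<std::uint32_t>(adjacency.size() - 1);
        }

        void add_edge(std::uint32_t from, std::uint32_t to, float weight)
        {
            adjacency[from].push_back({to, weight});
        }
    };
}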
Format Processing (fproc)
No, this won't include Machine Learning; it will mostly include tools for processing human-readable files. fennec should be able to use Doxygen and LaTeX externally. Consider including binaries with releases.
- String Analysis (fproc/strings)
  - Search
  - Manipulation
  - Delimiting
  - Regex
- File Formats (fproc/formats)
  - Serialization
    - JSON
    - HTML
    - XML
    - YAML
  - Configuration
    - INI
    - TOML
  - Documents
    - ODF
    - Markdown
  - Spreadsheets & Tables
    - ODS
    - CSV
  - Graphics Formats
    - Textures
      - BMP
      - DDS
      - JPG
      - PNG
      - TIFF
    - Vectors
      - OTF
      - SVG
      - TTF
    - Models
      - FBX
      - Wavefront OBJ
MAYBE
- Compilation (fproc/code)
  - Lexical Analysis
  - Syntax Analysis
  - Semantic Analysis
  - Intermediate Code Generation
  - Optimization
  - Target Code Generation
Core (core)
This will be the core of the engine.
- Event System
  - Most events will fire at the start of the next tick, especially those related to physics and input.
  - Events for graphics or audio should propagate immediately.
- Core Engine Loop
- System Manager
- Ticks vs. Frames (see the loop sketch below)
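A rough sketch of how the core loop could separate fixed-rate ticks from free-running frames, with the leftover time used for physics interpolation. The 60 Hz rate and all function names are assumptions.

// Hedged sketch of a fixed-tick loop with free-running frames; names are placeholders.
#include <chrono>

void run()
{
    using clock = std::chrono::steady_clock;
    constexpr std::chrono::duration<double> tick_dt{1.0 / 60.0}; // assumed fixed tick rate

    auto previous = clock::now();
    std::chrono::duration<double> accumulator{0.0};

    bool running = true; // would be cleared by a quit event
    while (running)
    {
        auto now = clock::now();
        accumulator += now - previous;
        previous = now;

        // Run as many fixed ticks as the elapsed time allows: events, scripts, AI, physics.
        while (accumulator >= tick_dt)
        {
            // tick(tick_dt.count());
            accumulator -= tick_dt;
        }

        // Frames render as often as possible; alpha drives physics interpolation.
        double alpha = accumulator / tick_dt;
        // render_frame(alpha);
        (void)alpha;
    }
}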
The following systems are not essential to the core engine, but are instead major systems that should be defined in their operation order:
Tick
- Update
  - Events
  - Scripts
  - AI
- Physics
  - Newtonian Commit (see the integration sketch after this list)
    - Apply Forces (Updates Acceleration and Torque)
    - Apply Torque & Acceleration (Updates Velocities)
    - Apply Velocities (Updates Position and Rotation)
  - Constraint Resolution
  - Collision Detection
  - Collision Resolution
    - Collision Response
    - Calculate Forces & Velocities
    - Queue events for next tick
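A rough sketch of the Newtonian Commit order above (forces → velocities → positions), i.e. semi-implicit Euler. The body struct, the placeholder vec3, and all names are assumptions; the engine's own math types would be used.

// Hedged sketch of a Newtonian Commit step (semi-implicit Euler); names are placeholders.
struct vec3 { float x, y, z; };

inline vec3 operator*(vec3 v, float s) { return {v.x * s, v.y * s, v.z * s}; }
inline vec3 operator+(vec3 a, vec3 b)  { return {a.x + b.x, a.y + b.y, a.z + b.z}; }

struct body
{
    vec3  position{}, velocity{}, force{};
    float mass = 1.0f;
};

// Apply Forces -> Acceleration, Acceleration -> Velocity, Velocity -> Position.
// Rotation (torque -> angular velocity -> orientation) would follow the same pattern.
inline void newtonian_commit(body& b, float dt)
{
    vec3 acceleration = b.force * (1.0f / b.mass); // apply forces (updates acceleration)
    b.velocity = b.velocity + acceleration * dt;   // apply acceleration (updates velocity)
    b.position = b.position + b.velocity * dt;     // apply velocity (updates position)
    b.force = {};                                  // forces are re-accumulated next tick
}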
Frame
- Physics
  - Physics Interpolation
- Graphics
  - 2D Graphics
  - Generate 3D Mask
  - 3D Graphics
- Audio
Application Layer (app)
This is the core windowing system of fennec. The implementation will initially be an SDL3 wrapper. A custom implementation may come further down the roadmap; however, this is extremely complicated and there are better implementations than I could write.
Scene (scene)
- In-Array Directed Tree (see the sketch after this list)
  - Elegant method for providing O(1) insertions and O(log(n)) deletions.
- Bounding Volume Hierarchy
  - Octree
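A rough sketch of how the in-array rooted directed tree could be laid out to get cheap insertions: indices replace pointers so nodes stay contiguous. The node layout and names are assumptions, not the final rd_tree design, and deletion is left out here.

// Hypothetical in-array rooted directed tree; indices replace pointers.
#include <cstdint>
#include <vector>

struct rd_node
{
    std::uint32_t parent       = 0xFFFFFFFFu; // invalid index marks the root
    std::uint32_t first_child  = 0xFFFFFFFFu;
    std::uint32_t next_sibling = 0xFFFFFFFFu;
};

struct rd_tree
{
    std::vector<rd_node> nodes;

    rd_tree() { nodes.emplace_back(); } // node 0 is the root

    // O(1) amortized insertion: append the node and link it at the front of its parent's child list.
    std::uint32_t insert(std::uint32_t parent)
    {
        const std::uint32_t id = static_cast<std::uint32_t>(nodes.size());
        rd_node node;
        node.parent       = parent;
        node.next_sibling = nodes[parent].first_child;
        nodes.push_back(node);
        nodes[parent].first_child = id;
        return id;
    }
};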
2D Graphics (gfx2d)
Links:
- https://en.wikipedia.org/wiki/Quadtree
- https://developer.nvidia.com/gpugems/gpugems3/part-iv-image-effects/chapter-25-rendering-vector-art-gpu
Structures (gfx2d)
Object Structure. The mesh is implicit data.
For the 2D rendering framework, Materials need to be rendered independently because we have no size constraints for images. This prevents us from using a meta-shader like in the 3D rendering framework.
struct Object
{
    vec2 location, scale; // A matrix would be 36 bytes, this is instead 20 bytes
    float rotation;
};
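Since the mesh is implicit, the compact object can be expanded into a full transform only where it is needed. A rough sketch with placeholder vec2/mat3 types (the engine's math types would be used), building translate * rotate * scale in column-major order:

// Hedged sketch: expanding the compact 2D object into a 3x3 model matrix.
#include <cmath>

struct vec2 { float x, y; };
struct mat3 { float m[9]; }; // column-major

inline mat3 object_transform(vec2 location, vec2 scale, float rotation)
{
    float c = std::cos(rotation), s = std::sin(rotation);
    // columns: rotated + scaled X axis, rotated + scaled Y axis, translation
    return {{
         c * scale.x,  s * scale.x, 0.0f,
        -s * scale.y,  c * scale.y, 0.0f,
         location.x,   location.y,  1.0f
    }};
}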
- BVH
  - Quadtree
    - Leaf Size and Tree Depth should be calculated by the scene; constraints are as follows:
      - Min Object Size
      - Max Object Size
      - Scene Center
      - Scene Edge
    - Insertions and Updates are done on the CPU
    - Nodes
      - Start Index 32-bits
      - Object Count 32-bits
    - Objects
      - Buffer of Object IDs grouped by Quadtree Node
  - Culling
    - Starting at each Quadtree Leaf, traverse upwards.
      - Insert Visible Leaf IDs
      - Track using atomic buffer
    - Generate the Command Buffer for Culled Meshes from the Visible Leaf Buffer
      - Count Materials
      - Count Meshes per Material
    - Generate the Culled Object Buffer by copying objects from the Object Buffer
      - Adjust Buffer Size using the counts
      - Insert using another atomic buffer
- Translucent objects will be sorted. We can cheat by using a z-index instead of a z-coordinate. This will allow us to sort objects as they are created. We can still bulk render each z-index, with meshes and objects being grouped by material.
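A rough sketch of how the z-index cheat could be folded into a single sort key so translucent objects stay ordered by z-index and grouped by material within each index. The bit split and names are assumptions.

// Hypothetical sort key: high bits = z-index, low bits = material. Sorting by the key
// orders by z-index first and groups identical materials together within each index,
// which is what allows bulk rendering per z-index.
#include <cstdint>

inline std::uint32_t translucent_sort_key(std::uint16_t z_index, std::uint16_t material)
{
    return (static_cast<std::uint32_t>(z_index) << 16) | material;
}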
3D Graphics (gfx3d)
Links:
- https://en.wikipedia.org/wiki/Octree
- https://www.adriancourreges.com/blog/2015/11/02/gta-v-graphics-study/
- https://learnopengl.com/PBR/Lighting
- https://learnopengl.com/PBR/IBL/Diffuse-irradiance
- https://en.wikipedia.org/wiki/Schlick%27s_approximation
- https://pixelandpoly.com/ior.html
- https://developer.download.nvidia.com/SDK/10/opengl/screenshots/samples/dual_depth_peeling.html
DirectX will never have official support. If you would like to make a fork, have at it, but know that I will hold a deep disdain for you.
The graphics pipeline will have a buffer with a list of objects and their rendering data. This will be referred to as the Object Buffer. There will be two, for both the Deferred and Forward Passes.
The buffers will be optimized by scene prediction. This involves tracking the meshes and textures directly and indirectly used by a scene. A callback function in the graphics system for scene loading can do this.
Materials and Lighting models will be run via a shader metaprogram to make the pipeline independent of this aspect. This allows the GPU to draw every single deferred rendered mesh in a single draw call for each stage of the renderer.
Specifications for debugging views via early breaks are included in the stages.
Structures (gfx3d)
Object Structure. The mesh is implicit data.
struct Object
{
    vec3 location, scale; // A matrix would be 64 bytes, this is instead 40 bytes
    quat rotation;
    int material;
};
Textures for 3D rendering are stored in various buffers with sizes of powers of 2. Ratios of 1:1 and 2:1 are allowed. The 2:1 ratio is specifically for spherical and cylindrical projection. UVs may be transformed to use a 2:1 texture as if it were 1:2. Cubemaps may only be 1:1; I would be concerned if you are using any other ratio.
- 8-Bit R Texture: 4096, 2048, 1024, 512 (8)
- 8-Bit RG Texture: 4096, 2048, 1024, 512 (8)
- 8-Bit RGB Texture: 4096, 2048, 1024, 512 (8)
- 8-Bit RGBA Texture: 4096, 2048, 1024, 512 (8)
- 8-Bit RGB Cubemap: 1024, 512, 256, 128 (4)
- 16-Bit HDR RGB Texture: 4096, 2048, 1024, 512 (8)
- 16-Bit HDR RGBA Texture: 4096, 2048, 1024, 512 (8)
- 16-Bit HDR RGB Cubemap: 1024, 512, 256, 128 (4)
- 16-Bit Shadow Texture: 4096, 2048, 1024, 512 (8)
- 16-Bit Shadow Cubemap: 2048, 1024, 512, 256 (4)
Documentation should provide guidelines on categories of Art Assets and the resolution of textures to use.
Textures are identified by an 8-bit integer and a 16-bit integer.
- int8 → the texture buffer
- int16 → the layer in the buffer
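A rough sketch of that 8-bit + 16-bit identifier packed into one handle; the struct name and packing are assumptions.

// Hypothetical packed texture handle: which buffer the texture lives in, and its layer.
#include <cstdint>

struct texture_handle
{
    std::uint8_t  buffer; // int8  -> the texture buffer (e.g. 8-bit RGBA, 16-bit HDR RGB, ...)
    std::uint16_t layer;  // int16 -> the layer inside that buffer
};

// Packed into 32 bits when stored in GPU-visible data (one byte left spare).
inline std::uint32_t pack(texture_handle h)
{
    return (static_cast<std::uint32_t>(h.buffer) << 16) | h.layer;
}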
Artists should be informed on the texture structure of the engine and its limitations. However, these principles should be followed in other game engines as these are guided by what is most efficient for typical GPU hardware.
Materials are, for the most part, user-defined. Documentation should make the user aware of this. Material buffers will be a sequence of the Material Struct instances. They will at the very least contain the id of their shader.
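A rough sketch of what a user-defined Material entry might contain. Only the shader id is implied by the text above; the texture fields are illustrative assumptions.

// Hypothetical material entry; the shader id is the only field implied by the design above.
#include <cstdint>

struct material
{
    std::uint32_t shader;   // id of the metaprogrammed shader / lighting model
    std::uint32_t diffuse;  // packed texture handles (illustrative; user-defined in practice)
    std::uint32_t normal;
    std::uint32_t specular;
};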
Stages (gfx3d)
This is the set of stages for the graphics pipeline that runs every frame. Unless otherwise specified, each stage runs on the GPU.
- BVH
  - Octree (8 Bpn, 64 bpn) [6 Layers ≈ 2.1MB]
    - Leaf Size and Tree Depth should be calculated by the scene; constraints are as follows:
      - Min Object Size
      - Max Object Size
      - Scene Center
      - Scene Edge
    - Buffer has implicit locations due to the tree having 8 children.
    - Insertions and Updates are done on the CPU
    - Nodes
      - Start Index int32
      - Object Count int32
    - Objects
      - Buffer of Object IDs grouped by Octree Node
  - Leaf Culling
    - Starting at each Octree Leaf, traverse upwards.
      - Insert Visible Leaf IDs
      - Track using atomic buffer
    - Generate the Command Buffer for Culled Mesh LODs from the Visible Leaf Buffer
      - Track counts using atomic buffers
      - To avoid double counting due to the construction of the Octree output, we have some options:
        - Ignore Leaf Instances based on occurrences of the mesh in the surrounding 8 Quadtree Leaves. This would require a bias towards a specific corner of the filter.
        - Perform a preprocessing step on the CPU to erase duplicate elements and fix the buffer continuity.
        - Let the duplicates be rendered.
    - Generate the Culled Object Buffer with the respective object IDs
      - Adjust Buffer Size using the counts
      - Insert by reusing the count buffer, clipped to only contain used meshes
Debug View: Object ID, Mesh ID, LOD
- Visibility
  - Buffer (15 Bpp, 120 bpp) [1920x1080] ≈ 39.4MB
    - Depth Buffer → D24
    - Visibility Info → RGB32I
      - R = Object ID
      - G = Mesh ID
      - B = Material ID
  - Regenerate the Command Buffer for Visible Mesh LODs
  - Regenerate the Culled Object Buffer
Debug View: Visibility Buffer
- G-Buffer Pass (17 Bpp, 136 bpp) [1920x1080] ≈ 35.3MB
  - Depth-Stencil → D24_S8
    - S → used to represent the lighting model.
  - Diffuse → RGBA8
    - A → Ambient Occlusion
  - Emission → RGB8
  - Normal → RGB8
  - Specular → RGB8
    - R → Roughness
    - G → Specularity (sometimes called the Metalness)
    - B → Index of Refraction (IOR)
Debug View: Depth, Stencil, Diffuse, Emission, Normal, Specularity
- Deferred Lighting Pass (10 Bpp, 80 bpp) [1920x1080] ≈ 2 x 16.3MB + 8.3MB ≈ 24.6MB
  - Depth Buffer → D24
  - Lighting Buffer → RGB16 (w/ Mipmapping when Bloom or DoF are enabled)
  - Stencil Buffer → S8
  - Generate Dynamic Shadows
  - Generate Dynamic Reflections (Optional)
  - SSAO (Optional)
  - Apply Lighting Model
Debug View: Shadows, Reflections, SSAO, Deferred Lighting
- Forward Pass
  - BVH, Same as Above
  - LOD Selection, Same as Above
  - Translucent Materials
    - Dual Depth Peeling
Debug View: Forward Mask
- Post Processing
  - Depth of Field (Optional)
    - When enabled, the Visibility Buffer, G-Buffer, and Deferred Lighting Pass will be double layered.
    - At this point the Lighting Buffers will be Flattened
  - Bloom (Optional) → Mipmap Blurring (6 Bpp, 48 bpp) [1920x1080] ≈ 16.3MB
  - Tonemapping (Optional)
  - HDR Correction
3D Physics (physics3d)
Links:
- https://www.researchgate.net/publication/264839743_Simulating_Ocean_Water
- https://arxiv.org/pdf/2109.00104
- https://www.youtube.com/watch?v=rSKMYc1CQHE
- https://tflsguoyu.github.io/webpage/pdf/2013ICIA.pdf
- https://animation.rwth-aachen.de/publication/0557/
- https://github.com/InteractiveComputerGraphics/PositionBasedDynamics?tab=readme-ov-file
- https://www.cs.umd.edu/class/fall2019/cmsc828X/LEC/PBD.pdf
Systems
- Rigid Body Physics
  - Newtonian Physics and Collision Resolution
- Articulated Skeletal Systems
  - Inverse Kinematics
  - Stiff Rods
- Particle Physics
- Soft Body Physics
  - Elastics → Finite Element Simulation
  - Cloth → Position-Based Dynamics
- Water
  - Oceans → iWave
    - Reasoning: iWave provides interactive lightweight fluid dynamics suitable for flat planes of water.
  - 3D Fluid Dynamics → Smoothed-Particle Hydrodynamics
    - Reasoning: This is the simplest method for simulating 3D bodies of water. This should exclusively be used for small-scale simulations where self-interactive fluids are necessary, e.g. pouring water into a glass.
  - 2D Fluid Dynamics → Force-Based Dynamics
    - Reasoning: This model, like iWave, provides lightweight interactive fluid dynamics, but is more easily adapted to flowing surfaces such as streams and rivers.
Artificial Intelligence (ai)
This artificial intelligence method only differs in static generation between 2D and 3D. The solvers are dimension independent since they work on a graph.
The general process is:
Static:
- generate a static navigation graph (sometimes called a NavMesh)
Update:
- resolve dynamic blockers
- update paths using Dijkstra's algorithm
- apply rigid-body forces with constraints
The update loop for artificial intelligence should only run every n ticks, where n <= k, with k being the tick rate of the physics engine.
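Since the solvers are dimension-independent and run on a graph, the per-update path solve can be a plain Dijkstra over the navigation graph. A rough sketch; the adjacency-list layout and all names are assumptions rather than the final ai interface.

// Hedged sketch of Dijkstra's algorithm over an adjacency-list navigation graph.
#include <cstdint>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

struct nav_edge { std::uint32_t target; float cost; };

// Returns the cheapest known cost from `start` to every node; dynamic blockers would be
// resolved beforehand by skipping blocked edges or giving them infinite cost.
std::vector<float> solve_paths(const std::vector<std::vector<nav_edge>>& graph, std::uint32_t start)
{
    constexpr float inf = std::numeric_limits<float>::infinity();
    std::vector<float> dist(graph.size(), inf);
    dist[start] = 0.0f;

    using entry = std::pair<float, std::uint32_t>; // (cost so far, node)
    std::priority_queue<entry, std::vector<entry>, std::greater<entry>> open;
    open.push({0.0f, start});

    while (!open.empty())
    {
        auto [cost, node] = open.top();
        open.pop();
        if (cost > dist[node]) continue; // stale entry

        for (const nav_edge& e : graph[node])
        {
            float next = cost + e.cost;
            if (next < dist[e.target])
            {
                dist[e.target] = next;
                open.push({next, e.target});
            }
        }
    }
    return dist;
}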