# Language Processing Library (`langproc`)
## Table of Contents
<!-- TOC -->
* [Home](./CONTENTS.md#planning-documentation-for-fennec)
* [Language Processing Library (`langproc`)](#language-processing-library-langproc)
  * [Table of Contents](#table-of-contents)
  * [Introduction](#introduction)
  * [String Analysis (`langproc/strings`)](#string-analysis-langprocstrings)
    * [Implementation](#implementation)
  * [File System (`filesystem`)](#file-system-filesystem)
    * [Implementation](#implementation-1)
<!-- TOC -->
## Introduction
&ensp; This library contains headers and classes for processing languages. This includes
ASCII/UTF-8/UTF-16 string processing, file formats, machine language, and programming
languages.

&ensp; fennec should be able to process documentation in files. The main ways it will
support this are Doxygen and LaTeX. Consider including binaries with releases.
## String Analysis (`langproc/strings`)
&ensp; fennec reimplements the C++ Strings Library as a submodule of this library because
C++ `std::string` carries a lot of overhead. As an analogy, `std::string` is a Jeep, while
`fennec::string` is an F2 car: `std::string` covers many use cases but is slower, while
`fennec::string` is barebones and highly performant on the right surface.
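
&ensp; As a rough illustration of the "barebones" idea, the sketch below shows a string type that stores only a length and a heap buffer, with no allocator parameter and no extra features. The name `tiny_string` and its interface are hypothetical and chosen for this example only; this is not fennec's actual `string` class, and no performance claim is implied.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical sketch only; not fennec's actual string implementation.
// A move-only string that keeps just a length and a null-terminated buffer.
class tiny_string {
public:
    tiny_string() = default;

    explicit tiny_string(const char* s)
        : size_(std::strlen(s)), data_(new char[size_ + 1]) {
        assert(s != nullptr);
        std::memcpy(data_.get(), s, size_ + 1); // copy including the terminator
    }

    std::size_t size() const { return size_; }

    // Always returns a valid null-terminated pointer, even when empty.
    const char* c_str() const { return data_ ? data_.get() : ""; }

    bool operator==(const tiny_string& other) const {
        return size_ == other.size_ &&
               std::memcmp(c_str(), other.c_str(), size_) == 0;
    }

private:
    std::size_t size_ = 0;          // declared first so data_ can use it
    std::unique_ptr<char[]> data_;  // owns the character buffer
};
```

Holding the buffer in a `std::unique_ptr` makes the type move-only by default, which is one way to keep copies explicit in a minimal design.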
### Implementation
| Symbol | Implemented | Passed |
|:---------|:-----------:|:------:|
| cstring | ✅ | ✅ |
| string | ✅ | ✅ |
| wcstring | ✅ | ✅ |
| wstring | ✅ | ✅ |
## File System (`filesystem`)
&ensp; fennec *does not* reimplement the C++ I/O Library. Instead, it provides
C++ classes that handle file streams, directory streams, and file paths.
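
&ensp; A class that handles a file stream without reimplementing the I/O library could look roughly like the sketch below, which wraps the standard `std::fopen`/`std::fclose` calls in an RAII type. The class name `file_handle` and its methods are assumptions made for illustration and are not taken from fennec's `file` class.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <utility>

// Hypothetical sketch; the name and interface are not fennec's API.
// Owns a std::FILE* and closes it automatically on destruction.
class file_handle {
public:
    file_handle(const std::string& path, const char* mode)
        : fp_(std::fopen(path.c_str(), mode)) {}

    ~file_handle() { close(); }

    // Non-copyable: two owners of one FILE* would double-close it.
    file_handle(const file_handle&) = delete;
    file_handle& operator=(const file_handle&) = delete;

    file_handle(file_handle&& other) noexcept
        : fp_(std::exchange(other.fp_, nullptr)) {}

    file_handle& operator=(file_handle&& other) noexcept {
        if (this != &other) {
            close();
            fp_ = std::exchange(other.fp_, nullptr);
        }
        return *this;
    }

    bool is_open() const { return fp_ != nullptr; }

    // Writes the whole string; returns false on a short write or closed file.
    bool write(const std::string& text) {
        return fp_ && std::fwrite(text.data(), 1, text.size(), fp_) == text.size();
    }

    // Reads from the current position to end of file.
    std::string read_all() {
        std::string out;
        if (!fp_) return out;
        char buffer[4096];
        std::size_t n = 0;
        while ((n = std::fread(buffer, 1, sizeof buffer, fp_)) > 0) {
            out.append(buffer, n);
        }
        return out;
    }

    void close() {
        if (fp_) {
            std::fclose(fp_);
            fp_ = nullptr;
        }
    }

private:
    std::FILE* fp_ = nullptr;
};
```

Opening a missing file leaves the handle closed rather than throwing, so callers are expected to check `is_open()` before reading or writing.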
### Implementation
| Symbol | Implemented | Passed |
|:----------|:-----------:|:------:|
| path | ✅ | ✅ |
| file | ✅ | ✅ |
| directory | ⛔ | ⛔ |
## Interpreter (`langproc/parse`)
&ensp; This submodule will contain classes for parsing data. To support files that
adhere to a given specification, the following concepts will need to be implemented
as classes:
### Reading
- Tokenization
  - Useful for text-based formats
- Data Parser
  - Useful for binary-based formats
- Lexical Analysis
  - Necessary for Syntax Coloring
- Syntax Analysis
  - Necessary for Syntax Coloring
- Semantic Analysis
  - Necessary for Code Completion
- Intermediate Code Generation
  - Necessary for any custom programming language in fennec
- Target Code Generation / Optimization?
  - Necessary for any custom programming language that needs to compile to binary
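
&ensp; The first stages in the list above, tokenization and lexical classification, can be sketched as a single pass over the input. The `token_kind`, `token`, and `tokenize` names below are hypothetical, and the three token categories are an assumption chosen to keep the example short; a real lexer for a given format would follow that format's specification.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token categories for illustration only.
enum class token_kind { identifier, number, symbol };

struct token {
    token_kind kind;
    std::string text;
};

// Splits ASCII text into identifiers, unsigned integers, and
// single-character symbols. Whitespace separates tokens and is dropped.
std::vector<token> tokenize(const std::string& src) {
    std::vector<token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;
            continue;
        }
        const std::size_t start = i;
        if (std::isalpha(c) || c == '_') {
            // Identifier: a letter or underscore, then letters, digits, underscores.
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) {
                ++i;
            }
            out.push_back({token_kind::identifier, src.substr(start, i - start)});
        } else if (std::isdigit(c)) {
            // Number: a run of decimal digits.
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) {
                ++i;
            }
            out.push_back({token_kind::number, src.substr(start, i - start)});
        } else {
            // Anything else is treated as a one-character symbol.
            ++i;
            out.push_back({token_kind::symbol, src.substr(start, 1)});
        }
    }
    return out;
}
```

For example, `tokenize("x1 = 42;")` yields four tokens: the identifier `x1`, the symbol `=`, the number `42`, and the symbol `;`. Syntax coloring could then map each `token_kind` to a color, while later stages such as syntax analysis would consume the token list.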
### Writing
&ensp; The writers will be responsible for writing data in a specific format, i.e. converting
data values (e.g. floats, ints) to a readable text encoding (e.g. ASCII/UTF-8/UTF-16).
- Writer
- Binary Writer
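
&ensp; The difference between the two writers can be sketched as follows: a text writer turns a value into readable characters, while a binary writer emits its raw bytes. The type and method names are hypothetical, and the little-endian byte order in the binary writer is an assumption for this example, not a decision recorded in this document.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical sketch: appends human-readable text to a buffer.
struct text_writer {
    std::string out;

    void write_int(std::int32_t value) { out += std::to_string(value); }
    void write_text(const std::string& text) { out += text; }
};

// Hypothetical sketch: appends fixed-width values as raw bytes.
// Byte order is little-endian here (an assumption for illustration).
struct binary_writer {
    std::vector<std::uint8_t> out;

    void write_u32(std::uint32_t value) {
        for (int shift = 0; shift < 32; shift += 8) {
            out.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFFu));
        }
    }
};
```

Writing `-42` through `text_writer::write_int` appends the three characters `-42`, whereas `binary_writer::write_u32(0x01020304u)` appends the four bytes `04 03 02 01`.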
## Formats (`langproc/formats`)
&ensp; This submodule will contain classes for processing a variety of file formats.
### Serialization
| Symbol | Implemented | Passed |
|:-------|:-----------:|:------:|
| JSON | ⛔ | ⛔ |
| HTML | ⛔ | ⛔ |
| XML | ⛔ | ⛔ |
| YAML | ⛔ | ⛔ |
### Configuration
| Symbol | Implemented | Passed |
|:-------|:-----------:|:------:|
| INI | ⛔ | ⛔ |
| TOML | ⛔ | ⛔ |
### Documents
| Symbol | Implemented | Passed |
|:---------|:-----------:|:------:|
| ODF | ⛔ | ⛔ |
| Markdown | ⛔ | ⛔ |
| PDF | ⛔ | ⛔ |
### Spreadsheets & Tables
| Symbol | Implemented | Passed |
|:---------|:-----------:|:------:|
| ODS | ⛔ | ⛔ |
| CSV | ⛔ | ⛔ |
### Audio Formats
| Symbol | Implemented | Passed |
|:---------|:-----------:|:------:|
| MP3 | ⛔ | ⛔ |
| WAV | ⛔ | ⛔ |
| AAC | ⛔ | ⛔ |
### Graphics Formats
#### Raster Textures
| Symbol | Implemented | Passed |
|:-------|:-----------:|:------:|
| BMP | ⛔ | ⛔ |
| DDS | ⛔ | ⛔ |
| JPG | ⛔ | ⛔ |
| PNG | ⛔ | ⛔ |
| TIFF | ⛔ | ⛔ |
#### Vector Graphics
| Symbol | Implemented | Passed |
|:-------|:-----------:|:------:|
| OTF | ⛔ | ⛔ |
| SVG | ⛔ | ⛔ |
| TTF | ⛔ | ⛔ |
#### 3D Model Formats
&ensp; Unfortunately, most formats are esoteric due to copyright, trademark, and similar
restrictions. I will be using assimp for the time being. Below is a list of formats
supported by assimp.
| Symbol | Implemented | Passed |
|:----------------|:-----------:|:------:|
| 3D | ⛔ | ⛔ |
| 3DS | ⛔ | ⛔ |
| 3MF | ⛔ | ⛔ |
| AC | ⛔ | ❌ |
| AC3D | ❌ | ❌ |
| ACC | ❌ | ❌ |
| AMJ | ❌ | ❌ |
| ASE | ❌ | ❌ |
| ASK | ❌ | ❌ |
| B3D | ❌ | ❌ |
| BVH | ❌ | ❌ |
| CSM | ❌ | ❌ |
| COB | ❌ | ❌ |
| DAE/Collada | ❌ | ❌ |
| DXF | ❌ | ❌ |
| ENFF | ❌ | ❌ |
| FBX | ❌ | ❌ |
| glTF 1.0 + GLB | ❌ | ❌ |
| glTF 2.0 | ❌ | ❌ |
| HMB | ❌ | ❌ |
| IFC-STEP | ❌ | ❌ |
| IQM | ❌ | ❌ |
| IRR / IRRMESH | ❌ | ❌ |
| LWO | ❌ | ❌ |
| LWS | ❌ | ❌ |
| LXO | ❌ | ❌ |
| M3D | ❌ | ❌ |
| MD2 | ❌ | ❌ |
| MD3 | ❌ | ❌ |
| MD5 | ❌ | ❌ |
| MDC | ❌ | ❌ |
| MDL | ❌ | ❌ |
| MESH / MESH.XML | ❌ | ❌ |
| MOT | ❌ | ❌ |
| MS3D | ❌ | ❌ |
| NDO | ❌ | ❌ |
| NFF | ❌ | ❌ |
| OBJ | ❌ | ❌ |
| OFF | ❌ | ❌ |
| OGEX | ❌ | ❌ |
| PLY | ❌ | ❌ |
| PMX | ❌ | ❌ |
| PRJ | ❌ | ❌ |
| Q3O | ❌ | ❌ |
| Q3S | ❌ | ❌ |
| RAW | ❌ | ❌ |
| SCN | ❌ | ❌ |
| SIB | ❌ | ❌ |
| SMD | ❌ | ❌ |
| STP | ❌ | ❌ |
| STL | ❌ | ❌ |
| TER | ❌ | ❌ |
| UC | ❌ | ❌ |
| USD | ❌ | ❌ |
| VTA | ❌ | ❌ |
| X | ❌ | ❌ |
| X3D | ❌ | ❌ |
| XGL | ❌ | ❌ |
| ZGL | ❌ | ❌ |
#### Video Formats
| Symbol | Implemented | Passed |
|:-------|:-----------:|:------:|
| MP4 | ❌ | ❌ |
| AVI | ❌ | ❌ |
| MPG | ❌ | ❌ |
| MOV | ❌ | ❌ |