I will add details about my coding projects here, whether just demoscene related or professional research.
A long time ago I wrote a 3D engine for the Game Boy Advance, mostly in C with optimised ARM assembler for the graphics routines.
The Game Boy Advance has an ARM7TDMI(-S) processor running at 15 MIPS @ 16.8 MHz. The good things about this processor are its 3-stage pipeline and its Thumb mode. Thumb mode lets coders use a reduced instruction set whose instructions are only 16 bits wide. ARM instructions are 32 bits wide and run faster than Thumb if used correctly, although Thumb code needs less instruction bandwidth, so systems with a memory bus narrower than 32 bits can benefit from it. You can change modes on the fly in code and therefore get the best of both size coding and speed coding.
The intention was to use this engine to produce some demos, but alas, just as it was getting to the point where I could use it for that, I got my first job in the games industry and have not had the time to look back at it since. From time to time I do go back to the code and get all excited about the prospect of writing ARM assembler again, using its many instructions to optimise graphics routines. Some of the instructions are great: a single instruction can do several operations on its data, which saves instructions in tight loops. For example, an add can shift or rotate its second operand and execute only when a condition is met, all in one instruction!
Some design specs taken from the ARM Wikipedia article:
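To make the "add, shift and condition in one instruction" idea concrete, here is a small C model of what an instruction such as `ADDNE rd, rn, rm, LSL #shift` does. It is only a sketch: the `Flags` struct and the function name `add_ne_lsl` are made up for illustration, and only the Z flag is modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the status flags; only Z is needed for NE. */
typedef struct {
    bool z; /* set when the last flag-setting result was zero */
} Flags;

/* Models: ADDNE rd, rn, rm, LSL #shift
 * The barrel shifter shifts the second operand (rm) before the ALU adds it
 * to rn. If the NE condition fails (Z is set) the instruction is skipped
 * and rd keeps its previous value. The shift must be in the range 0..31. */
uint32_t add_ne_lsl(Flags flags, uint32_t rd, uint32_t rn,
                    uint32_t rm, unsigned shift)
{
    if (flags.z) {
        return rd; /* condition failed: no write to rd */
    }
    return rn + (rm << shift);
}
```

A typical use is address arithmetic: with `rn` holding a table base and `rm` an index, a shift of 2 gives the address of a 32-bit entry without a separate multiply or branch.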
The ARM architecture includes the following RISC features:
- Load/store architecture
- No support for misaligned memory accesses (now supported in ARMv6 cores, with some exceptions related to load/store multiple word instructions)
- Uniform 16 × 32-bit register file
- Fixed instruction width of 32 bits to ease decoding and pipelining, at the cost of decreased code density. (Later, “Thumb mode” increased code density.)
- Mostly single-cycle execution
To compensate for the simpler design, compared with contemporary processors like the Intel 80286 and Motorola 68020, some additional design features were used:
- Conditional execution of most instructions, reducing branch overhead and compensating for the lack of a branch predictor
- Arithmetic instructions alter condition codes only when desired
- 32-bit barrel shifter which can be used without performance penalty with most arithmetic instructions and address calculations
- Powerful indexed addressing modes
- A link register for fast leaf function calls.
- Simple, but fast, 2-priority-level interrupt subsystem with switched register banks
The conditional execution feature is implemented with a 4-bit condition code selector on every instruction; one of the four-bit codes is reserved as an “escape code” to specify certain unconditional instructions, but nearly all common instructions are conditional. Most CPU architectures only have condition codes on branch instructions.
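As a rough illustration of the 4-bit condition selector, the sketch below takes a 32-bit ARM instruction word and a set of status flags and decides whether the instruction would execute. The function name `condition_passes` is my own. Treating the reserved code `0xF` as "does not execute" is a simplifying assumption, not how real cores handle it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits in their usual positions in the ARM status register. */
#define FLAG_N (1u << 31) /* negative */
#define FLAG_Z (1u << 30) /* zero */
#define FLAG_C (1u << 29) /* carry */
#define FLAG_V (1u << 28) /* overflow */

/* Returns true if the instruction's condition field (bits 31..28)
 * is satisfied by the given flags. */
bool condition_passes(uint32_t instruction, uint32_t cpsr)
{
    bool n = (cpsr & FLAG_N) != 0;
    bool z = (cpsr & FLAG_Z) != 0;
    bool c = (cpsr & FLAG_C) != 0;
    bool v = (cpsr & FLAG_V) != 0;

    switch (instruction >> 28) {
    case 0x0: return z;             /* EQ: equal */
    case 0x1: return !z;            /* NE: not equal */
    case 0x2: return c;             /* CS/HS: carry set */
    case 0x3: return !c;            /* CC/LO: carry clear */
    case 0x4: return n;             /* MI: negative */
    case 0x5: return !n;            /* PL: positive or zero */
    case 0x6: return v;             /* VS: overflow */
    case 0x7: return !v;            /* VC: no overflow */
    case 0x8: return c && !z;       /* HI: unsigned higher */
    case 0x9: return !c || z;       /* LS: unsigned lower or same */
    case 0xA: return n == v;        /* GE: signed greater or equal */
    case 0xB: return n != v;        /* LT: signed less than */
    case 0xC: return !z && n == v;  /* GT: signed greater than */
    case 0xD: return z || n != v;   /* LE: signed less or equal */
    case 0xE: return true;          /* AL: always */
    default:  return false;         /* 0xF: reserved, not modelled here */
    }
}
```

Because the check is part of every instruction, short if/else bodies can be written as straight-line code with no branches at all.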
My engine runs in MODE4, which is a double-buffered 240×160 pixel screen mode. Each pixel is an 8-bit index into a 256-entry palette, and each palette entry is a 16-bit colour. I used MilkShape 3D to knock up test models for the engine, exporting the object data and packing it into a custom format. I would recommend it to anyone developing a basic low-tech engine: it is easy to use, even if it can be a little buggy at times, and the file format is easy to parse.
Below are a few screenshots of effects I knocked up (all captured running the code within the MAPPYVM GBA emulator):
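For anyone unfamiliar with this kind of paletted mode, here is a minimal sketch of the two basic operations: packing a palette colour and plotting one pixel. It is not taken from my engine. The names `rgb15` and `mode4_plot` are illustrative, and the page is passed in as a plain pointer so the code also runs on a desktop machine. The sketch assumes two 8-bit pixels share each 16-bit halfword and that video memory should be written a halfword at a time, so plotting one pixel is a read-modify-write.

```c
#include <stdint.h>

#define SCREEN_W 240
#define SCREEN_H 160

/* Pack 5-bit red, green and blue components into one 16-bit palette
 * entry, with red in the lowest bits. Components are masked to 0..31. */
uint16_t rgb15(unsigned r, unsigned g, unsigned b)
{
    return (uint16_t)((r & 31u) | ((g & 31u) << 5) | ((b & 31u) << 10));
}

/* Plot one palette index into a page of SCREEN_W * SCREEN_H / 2 halfwords.
 * Even x goes in the low byte, odd x in the high byte; the neighbouring
 * pixel in the same halfword is preserved. The caller must keep x and y
 * inside the screen. */
void mode4_plot(volatile uint16_t *page, unsigned x, unsigned y, uint8_t index)
{
    volatile uint16_t *cell = &page[(y * SCREEN_W + x) / 2];

    if (x & 1u) {
        *cell = (uint16_t)((*cell & 0x00FFu) | ((unsigned)index << 8));
    } else {
        *cell = (uint16_t)((*cell & 0xFF00u) | index);
    }
}
```

With double buffering, the same routine is simply pointed at whichever page is not currently on screen, and the pages are swapped once the frame is finished.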
Envmapped chamfered cube (204 verts running @ 34fps)
Simple Particle System
For about the last 6 months I have been working on a PC Demo Engine and DemoTool, on and off, when work and other creative adventures allow. I have the basic engine set up with an effects framework and material system. Models are exported using a plugin in 3ds Max 2008, and I have written my shaders so they can be used within Max as standalone materials on meshes. This is only a basic setup to get something running first. In the future I will write my own exporters for a whole host of different attributes from 3ds Max or another package (e.g. Maya or LightWave), but for now it lets me test my system perfectly well.
I will post progress on this project on my site. A small preview of the demotool is shown below. It's not very functional yet, but it is written with expansion in mind. The demo effects are at present just compiled into DLLs and are picked up automatically by the tool. Because the effects share a common interface, these DLLs can be quickly added to any project in the demotool editor app. The demotool is written in C# using WinForms, while the backend is C++ and DirectX 9.
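To give a flavour of what a common effect interface can look like, here is a small sketch in plain C. It is not the demotool's actual interface, which is C++ and not shown in this post. Every name in it (`Effect`, `EffectRegistry`, `registry_add`, `registry_find`, `plasma_effect`) is hypothetical, and loading the DLLs themselves is left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical interface that every effect exposes to the tool. */
typedef struct {
    const char *name;
    int  (*init)(void);             /* returns 0 on success */
    void (*render)(double seconds); /* draw one frame at this demo time */
} Effect;

#define MAX_EFFECTS 32

typedef struct {
    const Effect *items[MAX_EFFECTS];
    size_t count;
} EffectRegistry;

/* Register an effect. Returns 0 on success, or -1 if the registry is
 * full or the effect is missing any part of the interface. */
int registry_add(EffectRegistry *reg, const Effect *fx)
{
    if (reg->count >= MAX_EFFECTS || fx == NULL || fx->name == NULL ||
        fx->init == NULL || fx->render == NULL) {
        return -1;
    }
    reg->items[reg->count++] = fx;
    return 0;
}

/* Look an effect up by name; returns NULL if it is not registered. */
const Effect *registry_find(const EffectRegistry *reg, const char *name)
{
    for (size_t i = 0; i < reg->count; ++i) {
        if (strcmp(reg->items[i]->name, name) == 0) {
            return reg->items[i];
        }
    }
    return NULL;
}

/* A stand-in effect that only counts the frames it was asked to draw. */
static int frames_drawn = 0;
static int plasma_init(void) { frames_drawn = 0; return 0; }
static void plasma_render(double seconds) { (void)seconds; ++frames_drawn; }
const Effect plasma_effect = { "plasma", plasma_init, plasma_render };
```

The point of the design is that the editor only ever talks to the `Effect` table, so a new effect can be dropped in without the tool knowing anything about what it draws.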