A passage from Mortimer J. Adler and Charles Van Doren's "How to Read a Book" discusses reading for understanding:
"If the book is completely intelligible to you from start to finish, then the author and you are as two minds in the same mold. The symbols on the page merely express common understanding you had before you met.
Let us take our second alternative. You do not understand the book perfectly. Let us even assume--what unhappily is not always true--that you understand enough to know that you do not understand it all. You know the book has more to say than you understand and hence it contains something that can increase your understanding...
Without external help of any sort, you go to work on the book. With nothing but the power of your own mind, you operate on the symbols before you in such a way that you gradually lift yourself from a state of understanding less to one of understanding more. Such elevation, accomplished by the mind working on a book, is highly skilled reading, the kind of reading that a book which challenges your understanding deserves."
The goal was to publish the source code for a GPU that is register-compatible with the late-1990s Number Nine "Ticket To Ride IV" GPU. Although the project didn't meet its funding goal, the person behind it later published the code on GitHub.
Although this is an older design, it has a lot that is worth studying. It's instructive to compare it to the VideoCore GPU that I walked through in a previous post. While there are some fundamental differences, there are a surprising number of similarities, which shows how modern GPUs evolved from earlier ones.
A few years ago, Broadcom released full specifications for their VideoCore IV GPU, which is in the system-on-chip on the popular Raspberry Pi dev board. Before this, most details of commercial GPUs were secret. Although GPU manufacturers released white papers and some academic publications, these were often greatly simplified and lacked important details.
I added support for lightmaps to the Quake renderer I discussed in the last post. Lightmaps were a big innovation when they first appeared in Quake. By precalculating light and shadow, they allowed much more realistic scenes than would have been possible to compute in realtime. Lightmaps are still used in many games.
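The core idea behind lightmaps is simple: multiply each surface texel by a precomputed light value sampled from a low-resolution light grid. Here is a minimal sketch of that modulation step in Python; the function names, the 0.0-1.0 light encoding, and the bilinear filter are illustrative assumptions, not the actual Quake renderer code.

```python
# Hypothetical sketch of lightmap shading: a precomputed light value
# (sampled with bilinear filtering from a coarse grid) scales the
# surface texel's color. Not taken from the actual renderer.

def bilinear_sample(lightmap, u, v):
    """Sample a 2D grid of light values (0.0-1.0) at normalized (u, v)."""
    h, w = len(lightmap), len(lightmap[0])
    x = u * (w - 1)
    y = v * (h - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = lightmap[y0][x0] * (1 - fx) + lightmap[y0][x1] * fx
    bot = lightmap[y1][x0] * (1 - fx) + lightmap[y1][x1] * fx
    return top * (1 - fy) + bot * fy

def shade(texel_rgb, lightmap, u, v):
    """Modulate a texel by the precomputed light at (u, v)."""
    light = bilinear_sample(lightmap, u, v)
    return tuple(int(c * light) for c in texel_rgb)
```

Because the light grid is much coarser than the texture, the bilinear filter is what keeps shadow edges soft instead of blocky; the expensive part (computing the light values) happens entirely offline.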
I wrote a custom engine to render Quake levels with my GPGPU. After fixing many subtle hardware lockups and compiler backend gremlins, I'm right chuffed to see this running reliably on an FPGA. It stresses a lot of functionality: hardware multithreading, heavy floating point math, and complex data structure accesses.
I won't be challenging anyone to a deathmatch any time soon
But... it's running at around 1 frame per second. While this is only a single core running at 50 MHz, the original Quake with a software renderer ran fine on a 75 MHz Pentium, so there's a lot of room for improvement. I'll dig more into the performance in a bit, but first, some background on how this works.
I reworked the 3D renderer recently, improving performance and adding features to make it more general purpose. This included improved bin assignment to reduce overhead, proper tracking of state transitions, and new features like clipping.
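Bin assignment in a tile-based renderer means deciding, before rasterization, which screen tiles each triangle overlaps so that every tile can later be shaded independently. A common way to do this is with the triangle's screen-space bounding box; the sketch below illustrates that approach in Python. The tile size and data layout are assumptions for illustration, not details from this renderer.

```python
# Illustrative tile binning: a triangle is added to the bin of every
# screen tile that its axis-aligned bounding box overlaps. The tile
# size and the dict-of-lists bin structure are assumed, not from the post.

TILE_SIZE = 64  # pixels per tile side (assumed)

def bin_triangle(bins, tri, screen_w, screen_h):
    """Append a triangle (three (x, y) vertices) to each overlapping bin.

    bins maps (tile_x, tile_y) -> list of triangles.
    """
    xs = [v[0] for v in tri]
    ys = [v[1] for v in tri]
    # Clamp the bounding box to the screen before converting to tiles.
    min_tx = max(int(min(xs)) // TILE_SIZE, 0)
    max_tx = min(int(max(xs)) // TILE_SIZE, (screen_w - 1) // TILE_SIZE)
    min_ty = max(int(min(ys)) // TILE_SIZE, 0)
    max_ty = min(int(max(ys)) // TILE_SIZE, (screen_h - 1) // TILE_SIZE)
    for ty in range(min_ty, max_ty + 1):
        for tx in range(min_tx, max_tx + 1):
            bins.setdefault((tx, ty), []).append(tri)
```

A bounding-box test is conservative: a thin diagonal triangle lands in some tiles it never actually covers, which is exactly the kind of per-tile overhead that a smarter bin-assignment pass can reduce.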