Ideas for 1st person Minecraft-like
Posted: Mon Aug 05, 2024 6:34 pm
The video about 8-bit blocks made me think ...
I like the idea, but you know, the isometric view only really works as long as you are on the surface.
As soon as you go underground, you have absolutely no idea what is happening anymore.
So I started thinking about whether it is possible to create a 1st person Minecraft-like experience on the Commander X16.
I think it's possible.
The main compromises are that you will only be able to
Here are my two core ideas:
1. Pre-calculated Projections
This idea is best explained with a simple example.
Imagine you are standing on a perfectly flat world in Minecraft, and you are looking directly north.
You can see a bunch of squares on the ground in front of you, which are the upper surfaces of the blocks you are walking on. Let's focus on the block which is 5 units in front of you, at the surface.
Of this block, you can see its top surface with its four corners. These corners have some coordinates in the bottom half of the visible image.
Let's say, for example:
far-left (140, 170)
far-right (180, 170)
near-left (132, 190)
near-right (188, 190)
Now imagine you move forward by exactly one block. Now the coordinates of that block, which is now 4 units in front of you, might read
far-left (132, 190)
far-right (188, 190)
near-left (96, 220)
near-right (224, 220)
What happened? Basically, the block's former "near" vertices have become its new "far" vertices.
Conversely, the new "near" vertices are the same ones as the old "near" vertices of the block directly in front of it.
The same principle applies to movement in any direction. Vertices cover up perfectly whenever you move exactly one block or turn the camera by exactly 90 degrees.
In other words, it is possible to navigate the entire blocky world without having to re-project 3d vertices to 2d screen coordinates -- ever.
We just need to map world data (blocks) to pre-calculated 2d coordinates.
With a single lookup table like this, we could cover the four viewing angles (north, south, east, west), and also directly up and directly down.
Of course, such a lookup table is memory-intensive. We need to store the 2d coordinates of ALL block vertices that could theoretically appear on the screen. This scales with the render distance cubed.
Let's do an example for a render distance of 16 blocks, and a horizontal field of view of 90° (45° left, 45° right):
The "ground plane" at the far end of the "visible pyramid" is 32 by 24 blocks (smaller field of view in the vertical direction).
That means, there are (32+1)*(24+1) = 825 vertices in that plane. The +1 is rounding up for the vertices near the edge of the field of view.
The number of vertices needed roughly equals the colume of the "visible pyramid", which is
V = 1/3 * h * Ag
where the pyramid's height h is the render distance 16, and the surface area of the gound plane, which we've estimated to 825.
This yields 4400 vertices.
For each vertex, we would have to store 3 bytes (assuming 320x240 resolution, which means 2 bytes horizontal position, 1 byte vertical), which amounts for 13200 bytes.
Now there are symmetries.
Mirroring vertically and horizontally allows for reduction to one quarter of the size (3300 bytes). You could theoretically exploit the symmetry along one of the diagonals, too, further reducing LUT size. But that might not be worth the extra CPU spent on "unfolding" the symmetry.
The bottom line is, one or two sets of pre-calculated vertices fit in a RAM bank.
Another one of these LUTs could be used for a diagonal viewing angle of 45°.
2. Rendering Procedure
In 3d rendering, you typically want to do as little work as possible, ideally only spend rendering effort on things that are actually visible on-screen, and not covered up by other surfaces which are closer to the camera. The same, of course, is true here.
We initialize the whole screen with zero.
After that, we start drawing the surfaces, starting with the closest blocks first. Close surfaces have a good chance of quickly covering large portions of the screen already.
Before we draw a surface, we check whether all of its four corners are already covered.
This can be easily done by two lookups per corner:
Granted, it is likely possible to construct a situation where this assumption breaks down, but I am pretty sure it is good enough to be practical, and visual glitches because of it should be very rare.
Note that we don't have to do four lookups per surface.
There are a few more nuances to this, and I am sure, they are worth several headaches. I have thought about some more details already, but these are the core concepts.
Back-of-the-Envelope CPU budget calculation
Let's for now target 5 frames per second. This isn't very much, but if you consider you can only move in whole block increments anyway, it won't be that bad.
At 8 MHz that means we have 1600k CPU cycles for each frame.
Now let's say we want to fill 320 by 240 pixels, which is just under 80k pixels, we have a little over 20 cycles per pixel.
The inner rendering loop I outlined above might look something like this:
The weird layout with the entrypoint in the middle is to not waste any cycles on unnecessary (unconditional) jump instructions. This loop needs 15 cycles per pixel, so that would leave some headroom for other things. And other things there are plenty of! So I would conclude that 320x240 resolution might be a bit too high to get to a usable frame rate. Any reduction in the number of pixels will get us closer to it.
What I haven't addressed so far is the data structure of the world and how to efficiently read from it at the correct locations during rendering (how to quickly traverse it in different directions and between chunks). I think 16x16 chunks as in the original Minecraft is too large, 8x8 may be more suitable.
Just wanted to share. I currently don't feel like trying this out myself, so if anyone else is interested to have a go, I am totally fine with that. I might in the future, I might not, who knows
I like the idea, but you know, the isometric view only really works as long as you are on the surface.
As soon as you go underground, you have absolutely no idea what is happening anymore.
So I started thinking about whether it is possible to create a 1st person Minecraft-like experience on the Commander X16.
I think it's possible.
The main compromises are that you will only be able to
- move in whole block increments (like a pawn in chess) and
- look in a limited number of discrete directions (e.g. at 90 and 45 degree angles).
Here are my two core ideas:
- Pre-calculate 3d->2d projections and store them in a number of look-up tables for use during rendering.
- Render block surfaces from closest to farthest. Only render surfaces when one or more of its corners (vertices) aren't covered yet.
1. Pre-calculated Projections
This idea is best explained with a simple example.
Imagine you are standing on a perfectly flat world in Minecraft, and you are looking directly north.
You can see a bunch of squares on the ground in front of you, which are the upper surfaces of the blocks you are walking on. Let's focus on the block which is 5 units in front of you, at the surface.
Of this block, you can see its top surface with its four corners. These corners have some coordinates in the bottom half of the visible image.
Let's say, for example:
far-left (140, 170)
far-right (180, 170)
near-left (132, 190)
near-right (188, 190)
Now imagine you move forward by exactly one block. Now the coordinates of that block, which is now 4 units in front of you, might read
far-left (132, 190)
far-right (188, 190)
near-left (96, 220)
near-right (224, 220)
What happened? Basically, the block's former "near" vertices have become its new "far" vertices.
Conversely, the new "near" vertices are the same ones as the old "near" vertices of the block directly in front of it.
The same principle applies to movement in any direction. Vertices cover up perfectly whenever you move exactly one block or turn the camera by exactly 90 degrees.
In other words, it is possible to navigate the entire blocky world without having to re-project 3d vertices to 2d screen coordinates -- ever.
We just need to map world data (blocks) to pre-calculated 2d coordinates.
With a single lookup table like this, we could cover the four viewing angles (north, south, east, west), and also directly up and directly down.
Of course, such a lookup table is memory-intensive. We need to store the 2d coordinates of ALL block vertices that could theoretically appear on the screen. This scales with the render distance cubed.
Let's do an example for a render distance of 16 blocks, and a horizontal field of view of 90° (45° left, 45° right):
The "ground plane" at the far end of the "visible pyramid" is 32 by 24 blocks (smaller field of view in the vertical direction).
That means, there are (32+1)*(24+1) = 825 vertices in that plane. The +1 is rounding up for the vertices near the edge of the field of view.
The number of vertices needed roughly equals the colume of the "visible pyramid", which is
V = 1/3 * h * Ag
where the pyramid's height h is the render distance 16, and the surface area of the gound plane, which we've estimated to 825.
This yields 4400 vertices.
For each vertex, we would have to store 3 bytes (assuming 320x240 resolution, which means 2 bytes horizontal position, 1 byte vertical), which amounts for 13200 bytes.
Now there are symmetries.
Mirroring vertically and horizontally allows for reduction to one quarter of the size (3300 bytes). You could theoretically exploit the symmetry along one of the diagonals, too, further reducing LUT size. But that might not be worth the extra CPU spent on "unfolding" the symmetry.
The bottom line is, one or two sets of pre-calculated vertices fit in a RAM bank.
Another one of these LUTs could be used for a diagonal viewing angle of 45°.
2. Rendering Procedure
In 3d rendering, you typically want to do as little work as possible, ideally only spend rendering effort on things that are actually visible on-screen, and not covered up by other surfaces which are closer to the camera. The same, of course, is true here.
We initialize the whole screen with zero.
After that, we start drawing the surfaces, starting with the closest blocks first. Close surfaces have a good chance of quickly covering large portions of the screen already.
Before we draw a surface, we check whether all of its four corners are already covered.
This can be easily done by two lookups per corner:
- look up the screen coordinates in the table of projected vertices
- then look into the screen memory if the pixel has already been set (i.e. non-zero)
Granted, it is likely possible to construct a situation where this assumption breaks down, but I am pretty sure it is good enough to be practical, and visual glitches because of it should be very rare.
Note that we don't have to do four lookups per surface.
- each vertex is part of up to twelve potential surfaces, some lookups could be re-used
- many surfaces can be excluded for different reasons (facing away from the camera, not interface between air and solid)
There are a few more nuances to this, and I am sure, they are worth several headaches. I have thought about some more details already, but these are the core concepts.
Back-of-the-Envelope CPU budget calculation
Let's for now target 5 frames per second. This isn't very much, but if you consider you can only move in whole block increments anyway, it won't be that bad.
At 8 MHz that means we have 1600k CPU cycles for each frame.
Now let's say we want to fill 320 by 240 pixels, which is just under 80k pixels, we have a little over 20 cycles per pixel.
The inner rendering loop I outlined above might look something like this:
Code: Select all
; render line of pixels
ldx color
ldy num_pixels
bra @2 ; jump into the middle
@1:
sta DATA1 ; write pixel data
dey
beq @end
@2:
lda DATA0
beq @1 ; if pixel is zero, write pixel data
stx DATA1 ; pixel was non-zero: read and discard, advance DATA1 pointer
dey
bne @2
@end:
What I haven't addressed so far is the data structure of the world and how to efficiently read from it at the correct locations during rendering (how to quickly traverse it in different directions and between chunks). I think 16x16 chunks as in the original Minecraft is too large, 8x8 may be more suitable.
Just wanted to share. I currently don't feel like trying this out myself, so if anyone else is interested to have a go, I am totally fine with that. I might in the future, I might not, who knows