Some things I've been working on

cosmicr
Posts: 36
Joined: Tue Nov 14, 2023 4:29 am

Some things I've been working on

Post by cosmicr »

Hello everyone. I've been a fan of the 8-bit guy since he was the iBook guy and have followed the updates on the CX16 with keen interest. I'm too poor to be able to buy one at this point in time, but late last year I discovered the emulator. I'm not a professional programmer or anything like that, but I love retro computers. I have some hobbyist experience in Python, C++ and a couple of other languages, but had never really touched anything C64 or 6502 related (apart from writing BASIC as a kid).

This is a somewhat technical post, so look at the pictures if you're only interested in the good stuff!

Scorched Earth Clone

Anyway, I decided to have a bit of a play, and learn C89, along with 65c02 assembly, and how the VERA works. I thought a GORILLA.BAS/Scorched Earth clone would be a fun challenge:

All the recordings in this post are *real-time*

Image

It was a great way to learn. I was forced to write some parts in assembly, because C code is just way too slow for any graphics. It's really just a proof of concept. I have bigger plans :)
Features include:
  • 4-bit 320x240 mode Layer 0 Bitmap, Layer 1 Text, and Sprites
  • Custom Font (I ripped from Sierra SCI Engine)
  • Animated explosions
  • 10 Procedurally generated levels
One of the challenges was working out how to do fractional numbers without a float type. It was a good introduction to fixed-point maths. Obviously it's incomplete. My idea was that there would be alien bases you need to destroy, and you'd have an inventory of different ammo etc. It could be extended to a Worms-type game and all kinds of possibilities.
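For anyone who hasn't met fixed point before, here is a minimal sketch of the idea in C. It is not the code from the game: the 10.6 format, the `Shell` struct, `shell_step` and the gravity constant are all assumptions picked for illustration. The point is that a fraction is just an integer scaled by a power of two, so a projectile can move in sub-pixel steps with nothing but 16-bit adds.

```c
#include <assert.h>
#include <stdint.h>

/* 10.6 fixed point in a 16-bit int: 10 bits of whole number (enough for a
   320-pixel-wide screen) and 6 bits of fraction (steps of 1/64 pixel). */
typedef int16_t fixed;

#define FIX_SHIFT 6
#define FIX_ONE   (1 << FIX_SHIFT)
#define INT_TO_FIX(n) ((fixed)((n) * FIX_ONE))
/* Divide instead of shifting so negative values behave portably. */
#define FIX_TO_INT(f) ((int)((f) / FIX_ONE))

/* Hypothetical gravity: 1/8 pixel per frame, per frame. */
#define GRAVITY ((fixed)(FIX_ONE / 8))

typedef struct {
    fixed x, y;    /* position in pixels */
    fixed vx, vy;  /* velocity in pixels per frame */
} Shell;

/* Advance the projectile by one frame using only integer adds. */
void shell_step(Shell *s)
{
    assert(s != 0);
    s->vy = (fixed)(s->vy + GRAVITY);
    s->x  = (fixed)(s->x + s->vx);
    s->y  = (fixed)(s->y + s->vy);
}
```

A shell starting at (10, 100) with vx = 96 (1.5 pixels per frame) and vy = -128 (2 pixels per frame upwards) ends its first frame at x = 11.5 and y = 98.125; `FIX_TO_INT` drops the fraction when it is time to plot.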

Sierra AGI Interpreter

Whilst that was fun, I had my sights set on making a Sierra AGI Interpreter for the system. I've been a part of the "Sierra modding" community since 1997, before we had even completely reverse engineered the format of AGI. Check out my profile on the sciprogramming forum: https://sciprogramming.com/community/index.php

To my surprise Manannan had already begun to make one as well. Theirs appears to be based on Lance Ewing's interpreter (MEKA). However, I have different goals in mind, so I continued to develop mine. I again learned a LOT more about the system, including the memory limitations. My code is 32kb, pretty much maxing out the available code space, and I'm not even finished yet! I've done all kinds of tricks to minimise memory usage:
  • reduce stack size
  • utilise memory at $400 for lookup tables
  • inline a lot of one time code
  • convert "All the things!" to assembly!
It's still not enough, but I'm getting there. One of my goals is to implement the engine without using banked ram for code. Features I have implemented include:
  • Full resource loader (any resource from version 2 games) - resources are loaded into a simulated "Heap" space in banked ram.
  • Relatively fast pic (background) drawing routines - probably close to the Tandy or 8088 version speed (faster than the Apple II)
  • Text display, menus, and other text functions
  • Sound routines (all 3 voices and noise channel on the PSG!) It sounds great - I'll try to upload a sample, just gotta work out how to record it and find a place to upload.
  • about 50% of Logic opcodes implemented before I ran out of space.
Image

Currently I'm optimising the hell out of it (I know you shouldn't optimise early) to try to squeeze out more space in RAM.

Still to do are:
  • Views (sprites) - I will use the sprite layer. I'm thinking each view will have two sprites: 1 for the visible part, and 1 for a "Z-buffer". As the sprite is moved, it will update the buffer and draw as needed, drawing in front or behind based on the priority screen.
  • Input controls
  • Remaining opcodes. Some are system specific so can be left out, but there's quite a few that I haven't yet implemented.
  • Version 3 games (later games, Gold Rush, Manhunter 2, etc)
The reason the pic drawing is so fast is that it's written in assembly. I had to learn about the Bresenham algorithm and span-filling flood fill. One of the bonus things about the 4-bit mode is that one byte represents 2 pixels, which is the same as the EGA mode on MS-DOS. So you can read and write pixels pretty fast. My process is generally to get it "working" in C first, then convert that C to assembly. You need to have a good handle on binary and hexadecimal, and a good understanding of which 65c02 registers are affected by each instruction. I've tried to use some of the 65c02-exclusive instructions too, like STZ and BRA.
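As a rough picture of the "working in C first" stage, here is a sketch of a 4-bit pixel plot plus a textbook integer Bresenham line. It draws into a plain RAM array rather than VRAM, and the names (`frame`, `put_pixel`, `get_pixel`, `draw_line`) are made up for the example. It also assumes the left-hand pixel of each pair sits in the high nibble, so check that against the VERA documentation before relying on it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SCREEN_W 320
#define SCREEN_H 240
#define BYTES_PER_ROW (SCREEN_W / 2)   /* two 4-bit pixels per byte */

/* Stand-in for the bitmap; on real hardware this would live in VRAM. */
static uint8_t frame[BYTES_PER_ROW * SCREEN_H];

/* Assumption: even x goes in the high nibble, odd x in the low nibble. */
void put_pixel(int x, int y, uint8_t colour)
{
    uint8_t *p;
    assert(x >= 0 && x < SCREEN_W && y >= 0 && y < SCREEN_H);
    p = &frame[y * BYTES_PER_ROW + (x >> 1)];
    if (x & 1)
        *p = (uint8_t)((*p & 0xF0) | (colour & 0x0F));
    else
        *p = (uint8_t)((*p & 0x0F) | ((colour & 0x0F) << 4));
}

uint8_t get_pixel(int x, int y)
{
    uint8_t b;
    assert(x >= 0 && x < SCREEN_W && y >= 0 && y < SCREEN_H);
    b = frame[y * BYTES_PER_ROW + (x >> 1)];
    return (uint8_t)((x & 1) ? (b & 0x0F) : (b >> 4));
}

/* Integer-only Bresenham line, all octants, both endpoints included. */
void draw_line(int x0, int y0, int x1, int y1, uint8_t colour)
{
    int dx = abs(x1 - x0), sx = (x0 < x1) ? 1 : -1;
    int dy = -abs(y1 - y0), sy = (y0 < y1) ? 1 : -1;
    int err = dx + dy, e2;

    for (;;) {
        put_pixel(x0, y0, colour);
        if (x0 == x1 && y0 == y1)
            break;
        e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }
        if (e2 <= dx) { err += dx; y0 += sy; }
    }
}
```

Calling `draw_line(0, 0, 7, 3, 5)` on a cleared buffer leaves the first byte as $55, since the first two pixels of the line share it. That shared byte is where an assembly version can save time by writing both nibbles at once.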

I've uploaded my work in progress here: https://github.com/cosmicr/astral_body

Another World

My AGI interpreter is going ok, but it will continue to be a challenge to get it all in less than 32k. I'm not sure I'll be able to complete it. Plus Manannan has made much better progress on their version than me anyway.

I'm a big fan of the Amiga 500 game "Another World". I knew that it had been reverse engineered and documented quite well. So that's my current project. Actually, there isn't much documentation on the formats of AW - more on that below. The game code is a lot simpler than AGI, with fewer instructions and simpler opcodes. I already have 99% of them implemented (mostly).

But where the opcodes are simple, the graphics system is quite the opposite. It seems that one of the goals of the original version was to cram as much data into the files as possible. The compression format is an archaic type called "bytekiller". A500 aficionados would probably scoff at me, but I had never heard of it. I managed to reverse engineer it based on some code and have now documented it. The decompression on the CX16 is quite slow (it's written in C) so I've actually preprocessed the datafiles and decompressed everything. However, if you want an authentic experience it does work. The next thing is that the graphics use a LOT of bit manipulation to handle how the polygons are drawn. The graphics in Another World are all polygon based (except for text). Here is a good explanation: https://fabiensanglard.net/another_worl ... index.html. There are two things that stand out to me:
  1. The actual file formats of the graphics aren't discussed - that's because they are COMPLICATED! The polygons are set up as immediate polygons to be drawn, or polygon groups. These are set by bit flags in the files. Sounds simple enough. Except the logic behind which bit flags are used makes no sense at all. It seems like the original author (Eric Chahi) might have had plans to expand the engine. As an example, if bits 6 and 7 are set then a color is defined on an immediate polygon by the lower 6 bits, but if only bit 7 is set then the color is inherited from its parent, but if bit 6 or 7 isn't set and bit 2 is, then the next polygon is a group??? Very weird and confusing. This sort of stuff is all through the data formats.
  2. The game uses 4 pages of screen buffers. This is a HUGE problem for the Commander X16. I have been able to absolutely max out the VERA RAM with 4x 32000 bytes. Seems simple enough, right? It's only 128000 bytes and the VERA has 131072 bytes (128k), right? Except that some of that memory is reserved for VERA registers ($640 bytes, i.e. 1600), and then you need some memory for text mode and the text character set, and all of a sudden there's not enough. Also, memory space for layers in VERA RAM is only given in 2kb blocks, which 128000 does not fit into neatly. I ended up sacrificing text mode and layer 1.
The entire game at this point is in layer 0. The original game used fast DMA writes to VRAM; we don't have that luxury. I have been playing with the VERA FX for faster writing, but it's only 4x faster than regular writes (when using the 32 bit cache).
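The VRAM squeeze from point 2 is easier to see as arithmetic. The sketch below assumes the documented VERA layout in which the top of VRAM from $1F9C0 upwards holds the PSG registers, palette and sprite attributes; the macro and function names are made up for the example, and it says nothing about how the buffers are actually laid out in the project.

```c
#include <assert.h>

#define VRAM_RESERVED_START 0x1F9C0UL /* assumption: PSG, palette, sprite attributes from here up */
#define PAGE_BYTES          32000UL   /* 320x200 at 4 bits per pixel */
#define PAGE_COUNT          4UL
#define LAYER_ALIGN         2048UL    /* layer bitmaps start on 2 KB boundaries */

/* Round n up to the next multiple of a. */
unsigned long align_up(unsigned long n, unsigned long a)
{
    assert(a != 0);
    return (n + a - 1) / a * a;
}

/* Four pages packed back to back with no gaps. */
unsigned long packed_total(void)
{
    return PAGE_BYTES * PAGE_COUNT;
}

/* Four pages where every page starts on a 2 KB boundary. */
unsigned long aligned_total(void)
{
    return align_up(PAGE_BYTES, LAYER_ALIGN) * PAGE_COUNT;
}

/* Bytes left below the reserved area; negative means it does not fit. */
long spare_after(unsigned long used)
{
    return (long)VRAM_RESERVED_START - (long)used;
}
```

Packed tightly, the four pages leave 1472 bytes spare. If every page has to start on a 2 KB boundary, the total comes to exactly 131072 bytes and overshoots the usable area by 1600 bytes, which is why something else has to give.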

The polygon filling routine is a scanline routine which I copied from other implementations of the game, which was a pain because they are all 32-bit or higher. It uses fixed point maths to calculate the stepping along polygon edges. At the moment it's very slow because it's in C. I'm going to see if it might be faster to use the VERA FX, but I feel like the setup code time reduces the advantage gained by the extensions. There is heaps of room for optimisation.
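To show what "fixed point stepping along polygon edges" means, here is a small C sketch that fills one trapezoid-shaped slice between a left and a right edge. It is a simplified illustration, not the routine from the game or from the other implementations: it uses one byte per pixel in a RAM array, and `fill_trapezoid` and `edge_step` are invented names. Each edge's x position is held in 16.16 fixed point and advanced by a constant step per scanline.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SCREEN_W 320
#define SCREEN_H 200

/* One byte per pixel keeps the sketch short; the real target is 4bpp. */
static uint8_t screen[SCREEN_W * SCREEN_H];

#define FIX_ONE  65536L   /* 1.0 in 16.16 */
#define FIX_HALF 32768L   /* 0.5, used for rounding */

/* 16.16 change in x per scanline for an edge moving dx pixels over dy rows. */
static int32_t edge_step(int dx, int dy)
{
    return (int32_t)(((int32_t)dx * FIX_ONE) / dy);
}

/* Fill rows y0..y1; the left edge runs xl0->xl1 and the right edge xr0->xr1. */
void fill_trapezoid(int y0, int y1, int xl0, int xl1,
                    int xr0, int xr1, uint8_t colour)
{
    int32_t xl, xr, step_l, step_r;
    int y;

    assert(y0 >= 0 && y1 > y0 && y1 < SCREEN_H);
    assert(xl0 >= 0 && xl1 >= 0 && xr0 < SCREEN_W && xr1 < SCREEN_W);
    assert(xl0 <= xr0 && xl1 <= xr1);

    xl = (int32_t)xl0 * FIX_ONE + FIX_HALF;
    xr = (int32_t)xr0 * FIX_ONE + FIX_HALF;
    step_l = edge_step(xl1 - xl0, y1 - y0);
    step_r = edge_step(xr1 - xr0, y1 - y0);

    for (y = y0; y <= y1; ++y) {
        int left  = (int)(xl >> 16);   /* values are non-negative here */
        int right = (int)(xr >> 16);
        if (right >= left)
            memset(&screen[y * SCREEN_W + left], colour,
                   (size_t)(right - left + 1));
        xl += step_l;
        xr += step_r;
    }
}
```

The 32-bit adds in the inner loop are the expensive part on a 65c02, since each one becomes four 8-bit adds with carry, and that is the main reason this style of fill is painful to port as-is.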

Image

At the moment you can see there are several issues. But it's working. It's about 0.5 frames per second. The game uses "dirty buffering" combined with double buffering. The double buffering is slow because no DMA. The dirty buffering will be quite fast once I have optimised the polygon filling. Dirty buffering is when it copies only the part replaced from behind the polygon, rather than the whole screen. With a lot of optimisation I reckon I could get it to about 10fps (at most). If I have to I may end up reducing the resolution. Currently it's at the original 320x200 res but with double pixels.
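The "dirty buffering" idea can be sketched in a few lines of C. This is only an illustration of the principle, with made-up names (`DirtyRect`, `restore_rect`, `rect_bytes`) and plain RAM arrays standing in for the VRAM pages: instead of copying a whole 32000-byte page, only the rectangle that a polygon touched is copied back from the clean background.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ROW_BYTES 160   /* 320 pixels at 4 bits per pixel */
#define ROWS      200

/* A dirty region; the horizontal position and width are in bytes
   (pixel pairs), which keeps every copy byte-aligned. */
typedef struct {
    int x_byte, y;
    int w_bytes, h;
} DirtyRect;

/* Number of bytes a restore of this rectangle has to move. */
size_t rect_bytes(const DirtyRect *r)
{
    return (size_t)r->w_bytes * (size_t)r->h;
}

/* Copy only the dirty rectangle from the clean background into dst. */
void restore_rect(uint8_t *dst, const uint8_t *background, const DirtyRect *r)
{
    int row;

    assert(r->x_byte >= 0 && r->w_bytes > 0 && r->x_byte + r->w_bytes <= ROW_BYTES);
    assert(r->y >= 0 && r->h > 0 && r->y + r->h <= ROWS);

    for (row = 0; row < r->h; ++row) {
        size_t off = (size_t)(r->y + row) * ROW_BYTES + (size_t)r->x_byte;
        memcpy(dst + off, background + off, (size_t)r->w_bytes);
    }
}
```

A 32x24 pixel character, for example, is a 16-byte by 24-row rectangle, so restoring what was behind it moves 384 bytes instead of 32000.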

Features implemented so far:
  • Every opcode, bytecode loading
  • Game data loading and unpacking (decompression).
  • Palette loading
  • Polygon loading and rendering (about 80% implemented - need to do single pixel plotting)
  • Bitmap loading and drawing (not sure where it's implemented in the bytecode, but the data is there)
Not implemented:
  • Transparency (fairly easy to do, just haven't got around to it yet)
  • "Dirty buffering" - I don't know what the actual term is. Also fairly easy.
  • Player controls.
  • Sound and Music
The A500 had awesome sound capabilities and the CX16 could replicate it, but there isn't enough RAM to load the sounds into memory, even using banked ram. So it's unlikely I'll ever implement sound, unless I used the 2mb extension.

Sierra SCI Interpreter

The SCI Interpreter ran on the same specs as the AGI Interpreter, but had enhanced graphics and sound. I don't see any reason why it couldn't be done. SCI games are the higher-resolution EGA-era games, such as Space Quest 3 and Police Quest 2. I will set my sights on it soon. I have already had a play with the picture resource format, and it's very similar to the AGI one. The main differences are in the bytecode - whilst AGI is interpreted, the SCI system is a full virtual machine. It uses an object-oriented system similar to Smalltalk from the 70's/80's. Views (sprites) are similar to AGI, and music is similar to MIDI, which could be translated to use the YM2151 or at worst the PSG. The resources are larger, so memory management would be even trickier, especially since the VM is expecting things to be in specific locations.

Future Plans

I've been really enjoying my experiments with the CX16, and trying to port my favourite engines. I like to think I've gotten pretty good at understanding the system in the last 4 months. I'd also be interested in doing a full original project too. But I'm not very creative, and suck at drawing too lol. I'll probably look at other old game engines from the DOS era and see what I can do.

If anyone wants to know any more about what I've shown here, feel free to ask. I reckon the more we share and get people learning the more good content we're likely to see.

I'll post an update again after I've made more progress!
desertfish
Posts: 1098
Joined: Tue Aug 25, 2020 8:27 pm
Location: Netherlands

Re: Some things I've been working on

Post by desertfish »

Impressive stuff as results of learning projects, I must say!

I've looked at Fabien Sanglard's code review and analysis of Another World too. Are you aware that he made a C reimplementation of the full engine as well? https://github.com/fabiensanglard/Anoth ... nterpreter
cosmicr
Posts: 36
Joined: Tue Nov 14, 2023 4:29 am

Re: Some things I've been working on

Post by cosmicr »

Yeah, I studied it to understand a few parts of the code. It's actually in CPP - I believe it's a refactoring of an older version made by Gregory Montoir. Fabien is very talented - he has some great write-ups on classic games. I took my scanline rasteriser from there. Originally I used a line-table approach where the side of each horizontal line of a polygon is added to a table and then the table is drawn, but the scanline version seemed (somewhat) simpler, albeit the fixed point maths killed me.
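For comparison, the line-table approach mentioned above might look something like this in C. It is a guess at the general shape, not the original code: every polygon edge is walked once with an integer Bresenham loop, each visited pixel widens that row's minimum and maximum x, and afterwards one horizontal span per row is drawn from the table. The names `table_reset`, `table_add_edge` and `table_span` are invented for the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SCREEN_W 320
#define SCREEN_H 200

/* Leftmost and rightmost x touched on each row by any polygon edge. */
static int16_t row_min[SCREEN_H];
static int16_t row_max[SCREEN_H];

/* Mark every row as empty before tracing a new polygon. */
void table_reset(void)
{
    int y;
    for (y = 0; y < SCREEN_H; ++y) {
        row_min[y] = SCREEN_W;
        row_max[y] = -1;
    }
}

static void table_add(int x, int y)
{
    assert(x >= 0 && x < SCREEN_W && y >= 0 && y < SCREEN_H);
    if (x < row_min[y]) row_min[y] = (int16_t)x;
    if (x > row_max[y]) row_max[y] = (int16_t)x;
}

/* Walk one edge with integer Bresenham, recording it in the table. */
void table_add_edge(int x0, int y0, int x1, int y1)
{
    int dx = abs(x1 - x0), sx = (x0 < x1) ? 1 : -1;
    int dy = -abs(y1 - y0), sy = (y0 < y1) ? 1 : -1;
    int err = dx + dy, e2;

    for (;;) {
        table_add(x0, y0);
        if (x0 == x1 && y0 == y1)
            break;
        e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }
        if (e2 <= dx) { err += dx; y0 += sy; }
    }
}

/* Returns 1 and the span to draw if row y is covered, 0 if it is empty. */
int table_span(int y, int *left, int *right)
{
    assert(y >= 0 && y < SCREEN_H);
    if (row_max[y] < row_min[y])
        return 0;
    *left = row_min[y];
    *right = row_max[y];
    return 1;
}
```

This only works for convex polygons, because each row can hold a single span, but it needs no fixed point at all, only integer adds and compares.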
hstubbs3
Posts: 72
Joined: Thu Oct 26, 2023 12:14 pm

Re: Some things I've been working on

Post by hstubbs3 »

The reason the pic drawing is so fast is because it's using assembly. I had to learn about the bresenham algorithm and span filling flood fill. One of the bonus things about the 4-bit mode is that one byte represents 2 pixels, which is the same as the EGA mode on MS-DOS. So you can read and write pixels pretty fast.
Wait until you go back over this using the VERA polygon helper and line draw helper... should pick up a bit of speed that way.
Even if you just go back over this and use the VERA's 32-bit CACHE WRITE to do the flood fill...

if the 8 pixels are cache-aligned, that's 1 instruction to write 8 pixels... doesn't have to be single-color, could be any 4byte pattern...
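To make the alignment point concrete, here is a small C sketch that splits a horizontal span into the pixels before the first 8-pixel boundary, the whole 8-pixel groups that could each go out as one 4-byte cache write, and the leftover pixels at the end. It is just the bookkeeping, not VERA code: `SpanPlan` and `plan_span` are invented names and no FX registers are touched.

```c
#include <assert.h>

/* How a span of 4-bit pixels breaks down around 8-pixel (4-byte) boundaries. */
typedef struct {
    int head_px;       /* pixels before the first aligned group */
    int cache_writes;  /* whole aligned groups of 8 pixels */
    int tail_px;       /* pixels left after the last whole group */
} SpanPlan;

/* Plan the span x0..x1 inclusive. */
SpanPlan plan_span(int x0, int x1)
{
    SpanPlan p = {0, 0, 0};
    int len, first_aligned;

    assert(x0 >= 0 && x1 >= x0);

    len = x1 - x0 + 1;
    first_aligned = (x0 + 7) & ~7;   /* next multiple of 8 at or after x0 */

    p.head_px = first_aligned - x0;
    if (p.head_px > len)
        p.head_px = len;             /* short span that never reaches a boundary */
    len -= p.head_px;

    p.cache_writes = len / 8;
    p.tail_px = len % 8;
    return p;
}
```

A span from x = 3 to x = 60 is 58 pixels: 5 pixels of head, 6 aligned groups covering 48 pixels, and 5 pixels of tail. Long spans are dominated by the aligned groups, while short ones see little benefit.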

btw, the VERA also has signed 16bit multiplier accumulator ... I haven't used it yet, mostly seems would be best used if you were processing a bunch of numbers in a row, like if you were doing 3D maths or something? is just a bit of overhead configuring the VERA into the needed FX mode for the multiplier, assuming you are using it to calculate graphics...
hstubbs3
Posts: 72
Joined: Thu Oct 26, 2023 12:14 pm

Re: Some things I've been working on

Post by hstubbs3 »

At the moment you can see there are several issues. But it's working. It's about 0.5 frames per second. The game uses "dirty buffering" combined with double buffering. The double buffering is slow because no DMA. The dirty buffering will be quite fast once I have optimised the polygon filling. Dirty buffering is when it copies only the part replaced from behind the polygon, rather than the whole screen. With a lot of optimisation I reckon I could get it to about 10fps (at most). If I have to I may end up reducing the resolution. Currently it's at the original 320x200 res but with double pixels.
the VERA has 126K usable VRAM for graphics..
320x240x4bit is 37.5K ... there's enough space on the VERA to have 1 buffer display while you construct the next screen.

actually... 320x240x4bit * 3 = 112.5K ... you could use layer0 set to BITMAP layer for the background ... the part you're replacing from behind the polygons.. and then Layer1 can flip between 2 buffers for the polygon stuff..

this leaves 13.5K ... if you specify text using 8x8 4bit sprites, 1 per letter, you could use the hardware sprites to make up to 128 characters onto the screen..

a 128 character font of such sprites would only require 4k... so actually, you could carve out the remaining 9.5K for use as larger sprites and copy text or other data there.. so if you wanted a screen full of text, maybe now its 1 sprite per _word_ ...

still without affecting either background layer or the polygon buffers...
paulscottrobson
Posts: 305
Joined: Tue Sep 22, 2020 6:43 pm

Re: Some things I've been working on

Post by paulscottrobson »

If you can do your filling using horizontal lines it should be *much* quicker, because you can just blat a series of bytes at Vera.

Incidentally, I wrote an algorithm for Stefany Allaire's machine which is a Bresenham Approximation, it's a cheat variant of Bresenham where basically everything is scaled by half. Bresenham would be a better way of working out your intermediate points.

What this means is if your coordinates are 320x240 which I'm presuming they are, as it's a bitmap, then all the maths is done in 8 bit not 16 bit, which makes it noticeably faster.

Experimentally (not mathematically proven) this seems to produce an error of about +/- 1 pixel in the resulting line, which isn't actually noticeable.
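Without the original source to hand, the following C sketch is only one possible reading of the idea described above, and every name in it (`Point`, `approx_line`) is invented. The deltas are halved so that the slope and error arithmetic fit in unsigned 8-bit values, while the loop still steps through full-resolution pixels. The screen coordinates themselves stay wider than 8 bits here.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { int16_t x, y; } Point;

/* Writes the line's pixels to out[] and returns how many were written.
   Assumption: only the error/slope maths is 8-bit; coordinates are not. */
int approx_line(int x0, int y0, int x1, int y1, Point *out, int max_points)
{
    int dx = (x1 > x0) ? x1 - x0 : x0 - x1;
    int dy = (y1 > y0) ? y1 - y0 : y0 - y1;
    int sx = (x1 >= x0) ? 1 : -1;
    int sy = (y1 >= y0) ? 1 : -1;
    int x_major = (dx >= dy);
    int steps = x_major ? dx : dy;
    int x = x0, y = y0, n = 0;
    uint8_t major, minor, err;

    assert(dx < 512 && dy < 512);    /* halved deltas must fit in 8 bits */
    assert(out != 0 && max_points > steps);

    major = (uint8_t)((x_major ? dx : dy) >> 1);
    minor = (uint8_t)((x_major ? dy : dx) >> 1);
    err = (uint8_t)(major >> 1);

    out[n].x = (int16_t)x; out[n].y = (int16_t)y; ++n;
    while (steps-- > 0) {
        if (x_major) x += sx; else y += sy;       /* always step the long axis */
        if (err < minor) {
            err = (uint8_t)(err + major - minor); /* wrap: step the short axis */
            if (x_major) y += sy; else x += sx;
        } else {
            err = (uint8_t)(err - minor);
        }
        out[n].x = (int16_t)x; out[n].y = (int16_t)y; ++n;
    }
    return n;
}
```

Because the halving throws away the lowest bit of each delta, the line can finish one pixel short of or past the requested endpoint on the short axis, which fits the +/- 1 pixel error mentioned above.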

Ask if you want me to dig the source out.
voidstar
Posts: 498
Joined: Thu Apr 15, 2021 8:05 am

Re: Some things I've been working on

Post by voidstar »

I remember Scorched Earth on the PC, great stuff. I did the Velocity demo on the SD card (in BASLOAD), so something like Scorched Earth is a welcome upgrade over that.

I was playing Another World recently on the 3DO - it's such a great game, except that there's no time to relax and just enjoy the scenery since you have to stay moving as everything can kill you :) I think the 3DO was a 32-bit system, and to me its best title was Return Fire (sort of like the Genesis game Jungle Strike). Star Control is another good one (although not just the 3DO version).

Great to see your projects, thanks for sharing.
hstubbs3
Posts: 72
Joined: Thu Oct 26, 2023 12:14 pm

Re: Some things I've been working on

Post by hstubbs3 »

Is kinda funny was thinking about like 'Another world' sorta stuff as I was getting to work this morning and..

if you are already doubling pixels.. one could just scale the screen to be say 160x200 ... that would give enough leeway to use _sprites_ instead of bitmap layers to cover the screen - you could stack about 500 pixels worth of sprites onto each scan line and be OK (for 16 color mode... )

going to sprites means you don't 'waste' half the memory in the bitmap and you can scroll as needed.. also could set the X for sprite to better align for cache writes ( possibly )...
then use tile layers for text and UI stuff?

basically it takes that 'dirty buffering' idea and puts it on its head - you just draw 'dirty buffers' on top of each other, possibly blitting them together (like photoshop/gimp 'flatten visible') if you need to free up sprite slots ...

but halving the horizontal scaling means both horizontal and vertical _pixel_ amounts are within 8bit ...

also, assuming you reserve say 6K for text / text layers? giving 120KB free VRAM ... with the halved horizontal resolution, 1 screen worth is 160x200 x 4bits = 16,000B ... only ~16K ... now one could stash up to 7.5x the amount of info as can be displayed onscreen... potentially allowing for caching the last room's background and simplifying drawing of things that only move onscreen but don't need to be redrawn per se - they're the same shape just somewhere else..

or they're different colors, because now there's access to 16 different palette offsets ....

just thoughts...the VERA is a pretty crazy beast and the helpers don't care if you're writing a bitmap layer or sprite data. Just sprites being limited to 64x64 means you will likely have to slice background polygons accordingly...
cosmicr
Posts: 36
Joined: Tue Nov 14, 2023 4:29 am

Re: Some things I've been working on

Post by cosmicr »

Thanks for all the ideas and suggestions everyone.

One of the main problems with the graphics is the architecture of the engine relies on there being 4 buffers (1 screen and 3 back buffers). The bytecode for the game assumes this so to change it to just one or two and utilise layer 1 or the sprite layer etc would be very difficult because you have to re-write the way the code for changing pages is interpreted. Also the overhead for managing those would add to the slowness too.

Small update: I replaced the polygon filling algo from the original game with my own new one that doesn't use fixed point maths, but instead calculates slope using bresenham, and buffers the edge coordinates for drawing (it's not perfect, but it avoids 32-bit fixed point maths). I'm also using the 32-bit cache for copies and clears, I will eventually use it for my line function too, which should speed it up a lot. I'm still having quite a few buffer issues, but I'll work them out. I'm still considering reducing the resolution to 256x160 to only use 8-bit calcs, but the game uses signed numbers which complicates it a bit (for me anyway).
another_20240506_speed.gif
Here's an updated gif, sped up by about 5x. At this frame rate I'd say it's roughly 9 or 10 frames per second, so at normal speed it's about 2-3 frames per second. I reckon once I iron out a few more bugs and convert the C to assembly, I should be able to get the actual speed to 10+ frames per second. The intro is pixel heavy, so actual gameplay should be faster. The CX16 is definitely capable of playing this game, it's just a matter of getting the coding good. I think eventually it should play better than the MegaDrive or SNES versions.
ahenry3068
Posts: 1147
Joined: Tue Apr 04, 2023 9:57 pm

Re: Some things I've been working on

Post by ahenry3068 »

Are you running in 320x240 8-bit mode? If so, I would advise a deep look into the ROM routine graph_draw_image. It's basically a bitblit routine. If the graphics bitmap is prebuffered in RAM it's capable of about 15-17 fps blitting on a 160x120 rectangle. Much faster for smaller rectangles. The routine is not aware of banked RAM prior to ROM r47, so "blitting" from banked RAM is limited to 8k buffers (but any width x height that will fit in that buffer). As of r47 the routine is "bank aware", so an entire full-screen image can be stored in banked RAM and blitted. I found a full-screen blit took about 14 jiffies, so on the order of 4 fps full screen. (Of course there isn't room for many frames.) I'm not suggesting this as a cure-all, but just as a useful tool.
Last edited by ahenry3068 on Tue May 07, 2024 12:28 am, edited 1 time in total.