desertfish wrote: ↑Mon Dec 23, 2024 8:11 am
You won't be changing ALL sprite attributes every frame I would think?
First off, thank you for the test program!
I am very used to programming on the Sega Genesis and Nintendo DS, where you keep a local copy of all the sprites and the palette in RAM, and then just copy them to OAM or VRAM with DMA during vblank. If one were to allocate VRAM for every sprite and every palette entry, and then only update a few things every now and then, it would be slower than keeping a local copy in RAM, doing all the work there, and copying it all over to VRAM in one fast operation at vblank time.
On the Commander X16, I am forced to use only 32 sprites (a limit set by game frame time, not vblank time), plus 32 duplicated sprites for effects. All of them are constructed on a heap during active display, to be uploaded to VRAM as fast as the CPU can manage during vblank. I also need vblank to upload the palette, the tilemap, and hopefully 1 or 2 8x8 4 bpp tiles. That leaves no room for dynamic animations, whereas the Sega Genesis can easily upload 40 tiles with DMA and keep the framerate at 60 Hz. We are talking about a standard game here. It feels weird to leave so many hardware sprites unused, but I will be fine with what I currently have.
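To put rough numbers on that budget, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration rather than a measurement: an 8 MHz CPU, 60 frames per second, standard VGA timing with 525 total lines of which 480 are visible, 8 bytes per sprite attribute entry, 2 bytes per palette and tilemap entry, and a cost of either 8 cycles per byte (unrolled LDA/STA pairs) or 13 cycles per byte (an indexed copy loop). The tilemap size of 64 entries is purely hypothetical, and interrupt and loop setup overhead are ignored.

```python
# Rough vblank upload budget. All constants are assumptions, not measurements.
CPU_HZ = 8_000_000     # assumed CPU clock
FRAME_HZ = 60          # assumed frame rate
TOTAL_LINES = 525      # assumed VGA 640x480 timing: total scanlines per frame
VISIBLE_LINES = 480    # assumed visible scanlines

BYTES_PER_SPRITE = 8          # assumed size of one sprite attribute entry
BYTES_PER_PALETTE_ENTRY = 2   # 256 entries -> the 512 bytes mentioned in the thread
BYTES_PER_TILEMAP_ENTRY = 2   # assumed: tile index byte + attribute byte
BYTES_PER_TILE = 32           # 8x8 pixels at 4 bpp = 64 * 4 / 8 bytes


def vblank_cycles():
    """CPU cycles available during the blanked scanlines of one frame."""
    blank_lines = TOTAL_LINES - VISIBLE_LINES
    return CPU_HZ * blank_lines // (FRAME_HZ * TOTAL_LINES)


def upload_bytes(sprites, palette_entries, tilemap_entries, tiles):
    """Total bytes that must be pushed through the data port."""
    return (sprites * BYTES_PER_SPRITE
            + palette_entries * BYTES_PER_PALETTE_ENTRY
            + tilemap_entries * BYTES_PER_TILEMAP_ENTRY
            + tiles * BYTES_PER_TILE)


def upload_cycles(n_bytes, cycles_per_byte):
    """Cycles for a CPU copy at a fixed per-byte cost."""
    return n_bytes * cycles_per_byte


budget = vblank_cycles()
# 32 sprites + 32 duplicates, full palette, hypothetical 64 tilemap entries, 2 tiles
payload = upload_bytes(sprites=64, palette_entries=256, tilemap_entries=64, tiles=2)
for label, cost in (("unrolled", 8), ("indexed loop", 13)):
    needed = upload_cycles(payload, cost)
    verdict = "fits" if needed <= budget else "does not fit"
    print(f"{label}: {needed} of {budget} cycles, {verdict}")
```

Under these assumptions the 64-sprite payload only fits in vblank with the unrolled copy, and raising `sprites` to 128 pushes even the unrolled version over the budget.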
ahenry3068 wrote: ↑Mon Dec 23, 2024 12:22 pm
All those things can easily be accomplished inside vblank with the existing architecture.
(and there is a screen off register) My video playing code actually flips visibility on 70 sprites and copies all 512 bytes of the palette comfortably during Vblank.
I also have a VERA decrement function that works to do a fade.
At work right now but I would be happy to provide some code for you later. My 2024 XMAS Demo does have the sprite flipping and palette copying code and it's not even fully optimized.
I was looking at that Second Reality demo and kept an eye on the palette during fade ins and fade outs, and there were no places where the entire palette was updated per frame. I took that as a sign that it would be impossible. I mean, why wouldn't you have 3D polygons flying around and also fade in at the same time using every single palette entry?
desertfish wrote: ↑Mon Dec 23, 2024 3:03 pm
Vera's auto increment/decrement mode and possibly using both data ports at the same time
Oh, I have already tried that. That was my first setup. It became very complex in the end to have two data ports, having to increment the source addresses twice per write and keep track of their carries. The CPU was just faster at moving data through one port. I do use it for vertical tilemap updates where I need 2 neighbor bytes to be written, 128 bytes apart.
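The vertical tilemap case can be sketched with a toy model of two auto-incrementing data ports. This is not hardware code: `VeraPortModel`, `set_port`, `write`, and `write_tile_column` are hypothetical names, and the layout (a two-byte tilemap entry with rows 128 bytes apart) is an assumption based on the description above. One port walks the first byte of each entry and the other walks its neighbor, both stepping by the row stride, so the CPU never has to recompute an address inside the loop.

```python
class VeraPortModel:
    """Toy model of two auto-incrementing VRAM data ports (illustration only)."""

    def __init__(self, vram_size=128 * 1024):
        self.vram = bytearray(vram_size)
        self.addr = [0, 0]   # current address of port 0 and port 1
        self.step = [0, 0]   # auto-increment applied after each write

    def set_port(self, port, addr, step):
        """Point a port at a VRAM address and choose its increment."""
        self.addr[port] = addr
        self.step[port] = step

    def write(self, port, value):
        """Store one byte through a port, then advance that port's address."""
        self.vram[self.addr[port]] = value & 0xFF
        self.addr[port] = (self.addr[port] + self.step[port]) % len(self.vram)


def write_tile_column(vera, base, row_stride, entries):
    """Write (first_byte, second_byte) pairs down one tilemap column."""
    vera.set_port(0, base, row_stride)       # first byte of each entry
    vera.set_port(1, base + 1, row_stride)   # its neighbor, one byte later
    for first_byte, second_byte in entries:
        vera.write(0, first_byte)
        vera.write(1, second_byte)


vera = VeraPortModel()
write_tile_column(vera, base=0x1000, row_stride=128,
                  entries=[(1, 0x10), (2, 0x20), (3, 0x30)])
print(vera.vram[0x1000], vera.vram[0x1001], vera.vram[0x1080], vera.vram[0x1081])
```

The same model also hints at why a plain linear copy gains nothing from the second port: with a stride of 1, a single port already lands on the next byte after every write.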
Guybrush wrote: ↑Mon Dec 23, 2024 5:04 pm
DMA is not possible with VERA simply because there aren't enough address lines connecting VERA to the address bus. There are only 5 lines which gives you access to 32 registers and that's also why VERA FX requires you to use the DCSEL bits to access all of its functionality.
DMA functionality which would allow VERA to access RAM (even if it's not a full DMA implementation) would require all 16 address lines to be connected to VERA, plus some extra lines connected directly to the CPU (BE, RDY...). More address lines would require a larger FPGA which would obviously be more expensive.
Yep, as much as I would like to blame the price of FPGAs, I still think the potential IS there. I was surprised that support for a 16-bit CPU was being considered when the bottleneck between the CPU and VERA is the actual problem.
I am currently living with the reality that the Commander X16 can handle about 32 dynamically updated sprites per frame (sprites that change all their attributes in a frame). I am happy with 32 sprites. It just sucks that VERA has 128 for some reason (most likely it was cheaper to have more sprites and fewer address lines than the opposite).
Wavicle wrote: ↑Mon Dec 23, 2024 6:48 pm
As I recall the DMA on NES lived in the 2A03 CPU.
It would be possible to have hardware external to VERA do DMA writes. The current VERA firmware could not keep up with writes every cycle so a delay would be necessary.
The NES is the prime example of why DMA is needed for the CX16. Imagine making a game for the NES that uses only the number of sprites you would be able to copy using the CPU alone (no DMA, though you can't write sprites without DMA, so my example lacks merit). Sure, you could get all the sprites to be visible at some point, but not all of them at the same time on each new frame. Nintendo could have saved money by forcing NES games to have only a few active sprites moving, and by designing all the games to work with that. Games like Super Mario Bros. could have used those extra sprites for the status bar, since they don't need to move or update or anything. But for some reason, Nintendo decided to make all the sprites available through DMA. The question remains: why was Nintendo so stupid as to do so when they should have gone the Commander X16 way and saved the money instead?