10 hours ago, xanthrou said:
Even with some trickery in sound, there's still memory banking, in addition to 39 kilobytes of RAM. Plus, we know that VERA has its own 128kB of VRAM. So there's still space for other things.
Memory isn't the main issue and is an easy solve in the tracker: each pattern is one 8k block "uncompressed". A sparse format for playback and storage would make this much, MUCH smaller in the majority of cases, so embedding a song within a game shouldn't be a problem. As an aside, I'm curious how the C256 trackers might solve this, because that thing has sooo many channels (too many really, I think).
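To show what I mean by sparse, here's a rough Python sketch. This is not the tracker's actual format: the 64 rows x 16 channels x 8 bytes layout (which happens to add up to 8192) and the names `encode_sparse` / `decode_sparse` are made up for illustration. It just stores the non-empty cells as (row, channel, cell data) records.

```python
# Assumed (hypothetical) pattern layout: 64 rows * 16 channels * 8 bytes = 8192 bytes.
ROWS, CHANNELS, CELL_BYTES = 64, 16, 8
EMPTY = bytes(CELL_BYTES)  # an all-zero cell counts as "nothing here"


def encode_sparse(pattern: bytes) -> bytes:
    """Keep only non-empty cells, each stored as row, channel, then the cell bytes."""
    if len(pattern) != ROWS * CHANNELS * CELL_BYTES:
        raise ValueError("pattern must be exactly one full block")
    out = bytearray()
    for row in range(ROWS):
        for ch in range(CHANNELS):
            start = (row * CHANNELS + ch) * CELL_BYTES
            cell = pattern[start:start + CELL_BYTES]
            if cell != EMPTY:
                out += bytes((row, ch)) + cell
    return bytes(out)


def decode_sparse(data: bytes) -> bytes:
    """Rebuild the full block from the sparse records."""
    pattern = bytearray(ROWS * CHANNELS * CELL_BYTES)
    record = 2 + CELL_BYTES
    for i in range(0, len(data), record):
        row, ch = data[i], data[i + 1]
        start = (row * CHANNELS + ch) * CELL_BYTES
        pattern[start:start + CELL_BYTES] = data[i + 2:i + record]
    return bytes(pattern)
```

With this layout every used cell costs 10 bytes, so a pattern with 40 filled cells would take 400 bytes instead of 8192. A real format could do better (run-length rows, per-channel streams), but even this naive version shows the idea.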
The main challenge is, as I was trying to articulate and perhaps didn't do a good job of, the processing power required to manage 16 channels' worth of envelopes and effects for the PSG. The lack of hardware envelopes means we have to use CPU power to process that. Apart from frequency, the only controls available for a PSG voice are waveform, 2 bits of panning (L, R, or both), 6 bits of volume, and 6 bits of PWM. The means we have to control it is via the data ports (so in this case the VRAM isn't particularly relevant). It's absolutely possible to produce some nice sounds with it (see Concerto), but it comes at the cost of CPU. That may be just fine, but my guess is it also means most songs won't be using all 16 channels most of the time if we start to become CPU starved. Concerto uses a VERY clever solution here (it doesn't sync using vsync or line sync but syncs via the DPCM buffer), which allows for controlling envelope precision (and thus CPU usage) as well as approaching things in a multi-timbral way.
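For anyone who wants to see how little there is to work with per voice, here's a small Python sketch of packing those fields into the four bytes a PSG voice takes. The byte and bit layout is my understanding of the VERA documentation (frequency low, frequency high, pan + volume, waveform + pulse width), and the left/right bit order in particular is an assumption worth checking against the official reference. `pack_psg_voice` is a made-up helper name.

```python
# Waveform numbering as I understand the VERA docs; treat as an assumption.
WAVEFORMS = {"pulse": 0, "sawtooth": 1, "triangle": 2, "noise": 3}


def pack_psg_voice(freq_word, volume, left, right, waveform, pulse_width):
    """Return the 4 register bytes for one PSG voice (assumed layout)."""
    if not 0 <= freq_word <= 0xFFFF:
        raise ValueError("freq_word must fit in 16 bits")
    if not 0 <= volume <= 63:
        raise ValueError("volume is 6 bits (0..63)")
    if not 0 <= pulse_width <= 63:
        raise ValueError("pulse_width is 6 bits (0..63)")
    # Assumed bit order: right enable in the high bit, left enable below it.
    pan = (int(bool(right)) << 1) | int(bool(left))
    return bytes((
        freq_word & 0xFF,                          # frequency, low byte
        freq_word >> 8,                            # frequency, high byte
        (pan << 6) | volume,                       # 2 bits pan + 6 bits volume
        (WAVEFORMS[waveform] << 6) | pulse_width,  # 2 bits waveform + 6 bits PWM
    ))
```

For example, `pack_psg_voice(0x1234, 63, True, True, "pulse", 32)` gives the bytes `34 12 FF 20`. Anything resembling an envelope means rewriting that third byte from software, tick after tick.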
Even so, modulating 16 channels means up to 64 accesses to the VERA per "tick" (16 voices x 4 registers each), at whatever granularity is defined. For smooth sounds, it has to be pretty fast (that's why Concerto uses the DPCM buffer for timing). All told, producing even simple NES sounds will require some envelopes and automation, since some of these had hardware support that we do not have with the VERA. We do have a nice range of PWM though, which is quite fun! (Noting, though, that this will require software automation, so the precision of modulating it will depend on how much CPU is available and how many voices someone wants to use.)
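To put rough numbers on the 64 accesses per tick, here's a back-of-the-envelope calculator in Python. The function names are made up, and `cycles_per_write` is a placeholder you'd have to fill in from measuring a real write loop; I'm not claiming any particular value for it. It also only counts the port writes, not the envelope math that produces the values, so treat the result as a lower bound.

```python
def psg_writes_per_second(voices, tick_hz, registers_per_voice=4):
    """Worst case: every register of every voice is rewritten on every tick."""
    return voices * registers_per_voice * tick_hz


def cpu_share(voices, tick_hz, cycles_per_write, cpu_hz=8_000_000):
    """Fraction of an 8 MHz CPU spent on the writes alone.

    cycles_per_write is a hypothetical figure supplied by the caller.
    """
    return psg_writes_per_second(voices, tick_hz) * cycles_per_write / cpu_hz
```

At one tick per 60 Hz frame, `psg_writes_per_second(16, 60)` is 3840 writes a second, which sounds harmless. The cost scales linearly with tick rate though, so chasing smoother envelopes multiplies it directly.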
Again, while I would definitely like hardware envelopes (and buzz noise, à la the NES and GB's special noise flag), if we don't have it, that'll be just fine! Just pointing out there will probably be trade-offs, to the point that 16 channels of PSG is probably more than most music will use. I might rather use 8 or even 4 PSG voices (in tandem with FM of course) if it means I can get higher resolution envelopes.
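Here's a quick Python sketch of why tick rate matters for envelope resolution. It's a deliberately simple linear attack/decay, not how Concerto or any real player does it, and `envelope_levels` is a made-up name. It returns the 6-bit volume value you'd write on each tick.

```python
def envelope_levels(attack_s, decay_s, tick_hz, peak=63):
    """Per-tick 6-bit volumes for a linear attack followed by a linear decay to 0."""
    attack_ticks = max(1, round(attack_s * tick_hz))
    decay_ticks = max(1, round(decay_s * tick_hz))
    # Ramp up so the last attack tick lands exactly on the peak.
    up = [round(peak * (i + 1) / attack_ticks) for i in range(attack_ticks)]
    # Ramp down so the last decay tick lands exactly on silence.
    down = [round(peak * (decay_ticks - 1 - i) / decay_ticks) for i in range(decay_ticks)]
    return up + down
```

A 50 ms attack plus 50 ms decay at 60 ticks per second comes out as `[21, 42, 63, 42, 21, 0]`, so the volume jumps by a third of full scale each tick. At 240 ticks per second the same shape gets 24 steps. And 4 voices at 240 Hz need the same number of updates per second as 16 voices at 60 Hz, which is exactly the trade I'd be tempted to make.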
I didn't even mention DPCM here. This can be expensive as well, buuut I am curious by how much if using 'chiptune style' samples (very short samples which can be retriggered to make sounds à la the GameBoy's WAV channel or the TG16). I know many folks are thinking about digital audio here, but "drawable chips" is the main draw of the DPCM for me. As far as I'm aware VERA doesn't have an auto-loop feature for DPCM, so it may still require a lot of CPU, but I'll have to play with that. Using chiptune waveforms results in tiny amounts of data compared to actual DPCM audio.
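For a sense of scale on the data side, here's a tiny Python sketch. The helper names are made up, and the 16 kHz rate in the note below is just an illustrative number, not a VERA setting I'm vouching for.

```python
def wavetable_bytes(samples_per_cycle, bits_per_sample):
    """Storage for one single-cycle waveform, rounded up to whole bytes."""
    return (samples_per_cycle * bits_per_sample + 7) // 8


def stream_bytes_per_second(sample_rate_hz, bits_per_sample=8, channels=1):
    """Bytes the CPU must feed per second if nothing loops the sample for it."""
    return sample_rate_hz * channels * bits_per_sample // 8
```

A GameBoy-style wave (32 samples of 4 bits) is `wavetable_bytes(32, 4)`, i.e. 16 bytes, while a single second of 8-bit mono audio at 16 kHz is 16000 bytes. The catch is the second function: without auto-loop, the feed rate depends on the playback rate and not on how small the waveform is, so storage gets tiny but the CPU cost of refilling the buffer may not.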