
[Emulator] MACPTR to VERA causes poor performance

Posted: Sat Apr 08, 2023 7:12 am
by DragWx
See here for related demo and discussion

In r42, when using MACPTR to transfer a large block of data to the VERA, every single byte written to the VERA's data port causes the current scanline to be rerendered in full, as part of a catch-up partial update for supporting raster effects.

The problem comes from MACPTR being "instant" in the emulator; if you transfer 10000+ bytes to the VERA in the middle of a scanline, you immediately render 10000+ extra scanlines and use zero pixels from them. This causes the emulator to lag very badly when running the referenced demo.

A quick fix for this specific scenario is to exit render_line early if it's about to do a partial update of 0 pixels.
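To illustrate the early-exit idea, here is a minimal sketch in C. It is not the emulator's actual code: only the name render_line comes from the post, and SCAN_WIDTH, rendered_until, render_line_full and catch_up_render are hypothetical names invented for the example. It assumes the renderer tracks how far along the current scanline it has already drawn.

```c
#include <stdint.h>

#define SCAN_WIDTH 800   /* hypothetical pixels per scanline, for illustration only */

uint32_t render_calls;   /* counts how often the expensive path actually runs */
uint16_t rendered_until; /* first pixel column on this line not yet drawn */

/* Hypothetical stand-in for the expensive full-line render. */
void render_line_full(uint16_t y)
{
    (void)y;
    render_calls++;
}

/* Called at the start of each scanline. */
void start_new_line(void)
{
    rendered_until = 0;
}

/* Catch the renderer up to pixel column x_now on line y.
 * If no new pixels would be produced, return before doing any work. */
void catch_up_render(uint16_t y, uint16_t x_now)
{
    if (x_now > SCAN_WIDTH) {
        x_now = SCAN_WIDTH;
    }
    if (x_now <= rendered_until) {
        return; /* partial update of 0 pixels: nothing to do */
    }
    render_line_full(y);
    rendered_until = x_now;
}
```

With this guard, 10000 data-port writes at the same beam position cost one render instead of 10000, because every write after the first sees zero new pixels.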

Re: [Emulator] MACPTR to VERA causes poor performance

Posted: Mon Apr 10, 2023 7:00 am
by MooingLemur
I think we've partially fixed the problem after R42. The emulator thought hostfs calls did not take any 6502 CPU clocks, but because they took wall clock time, it assumed the emulator was lagging and it warped forward, distorting audio. This has been fixed in the current master and for R43.

Exiting render update should be a good additional optimization though.

Re: [Emulator] MACPTR to VERA causes poor performance

Posted: Mon Apr 10, 2023 7:21 am
by MooingLemur
Interestingly, forgoing the midline raster update altogether does not improve things in the official emu for this demo on hostfs. However, it works quite well from an SD card image and on real hardware.

This probably means the timing workaround I mentioned is actually telling the system too many clocks went by for the hostfs ops and the emu is slowing down too much. I'll work with this.

Re: [Emulator] MACPTR to VERA causes poor performance

Posted: Tue Apr 11, 2023 2:16 am
by DragWx
I got a build environment set up and was able to play with the source code for a bit today. When commenting out the warp-mitigation code in MACPTR, I found this:

When MACPTR lags, the default -abufs setting (8) isn't large enough to hold the extra samples necessary to catch up from it, at least on my test machine. Bumping that up to 16 improved the audio stuttering.

The audio buffer can actually overflow (wridx can overtake rdidx), which the emulator interprets as the audio buffer suddenly becoming empty, which can cause stuttering, especially during warps. (I can actually prepare a PR for this if it'd be helpful)
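For anyone following along, here is a rough sketch of why the overflow looks like "empty", and one common way to avoid it. This is not the emulator's code and not necessarily what the PR does: rdidx and wridx are the names from the post, while audio_ring, count, ring_push and ring_pop are made up for illustration. In a ring buffer that only tracks two indices, wridx == rdidx usually means "empty", so a writer that laps the reader lands back on that same state. Tracking the fill count separately removes the ambiguity.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_BUFS 8 /* mirrors the default -abufs value mentioned above */

typedef struct {
    size_t rdidx; /* next buffer the audio callback will play */
    size_t wridx; /* next buffer the emulator will fill */
    size_t count; /* filled buffers; tells "full" apart from "empty" */
} audio_ring;

/* Producer side: returns false (and drops the buffer) when the ring is
 * full, instead of letting wridx overtake rdidx. */
bool ring_push(audio_ring *r)
{
    if (r->count == NUM_BUFS) {
        return false;
    }
    r->wridx = (r->wridx + 1) % NUM_BUFS;
    r->count++;
    return true;
}

/* Consumer side: returns false when there is nothing to play. */
bool ring_pop(audio_ring *r)
{
    if (r->count == 0) {
        return false;
    }
    r->rdidx = (r->rdidx + 1) % NUM_BUFS;
    r->count--;
    return true;
}
```

Dropping on a full ring (rather than blocking the producer) is just one possible policy; it has the property that the emulation loop never has to wait on the audio side.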

Re: [Emulator] MACPTR to VERA causes poor performance

Posted: Tue Apr 11, 2023 5:52 am
by MooingLemur
I think I understand what you're saying. There seem to be two issues you're bringing up.

1) The system spends too much time in the handle_ieee_intercept() code so that when it does get around to returning to the main loop to step the audio (and everything else) based on the elapsed clocks, it's so far behind that the buffers are insufficient to allow it to catch up. Perhaps we can't afford to stay inside handle_ieee_intercept() (particularly MACPTR) for long durations.

Interestingly, when I artificially reduce the maximum bytes returned from MACPTR, the INDY.PRG demo breaks. The demo must be depending on some specific amount coming back with each call, even though the .A register input to MACPTR is technically a maximum and the system can legally return fewer bytes as long as it returns at least 1. Reducing the maximum also didn't help with Calliope's sound quality when it was playing back audio while doing a background load.

2) If the system is allowed to warp, the audio buffer can be filled faster than it drains so that the buffer wraps before it can be emptied. If you implement a fix for this, please make sure that it doesn't prevent warping (toggle with Ctrl+=)

A PR would definitely be welcome.

Re: [Emulator] MACPTR to VERA causes poor performance

Posted: Tue Apr 11, 2023 6:50 pm
by DragWx
Hey, while I was working on the PR, I found another issue with audio; the resampling for the VERA and YM2151 doesn't work quite right (it fills the samp[] buffer with the same pair of bytes over and over, instead of the intended 8 bytes), which badly degrades the output, most noticeable on the YM. Would you like me to fix this and roll it into the same PR?

Re: [Emulator] MACPTR to VERA causes poor performance

Posted: Wed Apr 12, 2023 6:20 am
by MooingLemur
I found the other issue. Guess what it was ;)

I was overthinking the solution and then suddenly spotted this.

Screenshot from 2023-04-12 06-15-12.png (31.25 KiB)

hostfs can "consume" more than 255 clocks, and almost certainly less than 65535, but on the rare chance it gets exceeded, I made it a uint32_t. PR submitted, and merged. INDY.PRG now works flawlessly on hostfs.
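The screenshot isn't reproduced in text here, so the following is only a guess at the shape of the bug, based on the description above: it assumes the clock count was being stored in an 8-bit variable. The function names are hypothetical. An 8-bit counter silently keeps only the low byte, so the emulated clock advances by far fewer ticks than the hostfs call was meant to account for.

```c
#include <stdint.h>

/* Assumed buggy shape: the clocks spent are squeezed through a uint8_t,
 * which wraps modulo 256 before being added to the clock counter. */
uint32_t advance_narrow(uint32_t clockticks, uint32_t spent)
{
    uint8_t truncated = (uint8_t)spent;
    return clockticks + truncated;
}

/* Widened shape: a uint32_t holds the full count. */
uint32_t advance_wide(uint32_t clockticks, uint32_t spent)
{
    return clockticks + spent;
}
```

For example, a call that should account for 1000 clocks would only add 232 (1000 mod 256) in the narrow version.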

Re: [Emulator] MACPTR to VERA causes poor performance

Posted: Wed Apr 12, 2023 6:22 am
by MooingLemur
DragWx wrote: Tue Apr 11, 2023 6:50 pm Hey, while I was working on the PR, I found another issue with audio; the resampling for the VERA and YM2151 doesn't work quite right (it fills the samp[] buffer with the same pair of bytes over and over, instead of the intended 8 bytes), which badly degrades the output, most noticeable on the YM. Would you like me to fix this and roll it into the same PR?
It looks like you got that in with the PR. Thank you!

Re: [Emulator] MACPTR to VERA causes poor performance

Posted: Wed Apr 12, 2023 5:34 pm
by DragWx
Uh oh, with the newest commit, INDY.PRG is giving me garbled visuals and sound. I think there's still something weird going on.
x16_indy_glitch.png (70.31 KiB)

Re: [Emulator] MACPTR to VERA causes poor performance

Posted: Wed Apr 12, 2023 7:50 pm
by DragWx
This might actually be a different issue. The demo might be getting confused when the PCM FIFO runs empty, because that's when the glitches happen. If I remove the code which skips clockticks6502 forward after an intercepted IEEE call, the demo no longer glitches.

I think this is a case of MACPTR just being slower on my machine, combined with the fact that the 6502 can't service IRQs during that time. That might be causing a situation the demo normally doesn't have to deal with, like the PCM FIFO having a chance to become empty. :P