Can the VERA's FPGA be reprogrammed?

StephenHorn · Post by **StephenHorn** » Tue Jul 28, 2020 3:14 pm

5 hours ago, Kilian Hekhuis said:

I can imagine having to almost duplicate the sprite rendering engine (if you could call it that) for the sprite collision detection, but if that allows processing in a separate thread, I'm all for it :).

Hrm, I'm not sure if that would help. Something I'm not sure I conveyed well is that the sprite collision on the VERA is determined by whether or not it tried to draw two or more sprites over the same pixel. So the work units *really* matter, all the way through.
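To make the point about work units concrete, here is a minimal sketch of collision detection as a side effect of drawing one scanline. It is an illustration only, not the emulator's code: `render_sprite_line`, the `(x, pixels)` sprite tuples, the cost of one work unit per pixel, and the rule that the earlier sprite keeps a contested pixel are all assumptions made for the example.

```python
def render_sprite_line(sprites, width, budget):
    """Draw one scanline and report collisions found while drawing.

    sprites: list of (x, pixels), where pixels holds 1 for opaque, 0 for transparent.
    budget:  hypothetical work units for this line, one unit per sprite pixel examined.
    Returns (owner, collisions): owner[px] is the index of the sprite drawn at px,
    collisions is a set of (first_sprite, second_sprite) index pairs.
    """
    owner = [None] * width
    collisions = set()
    for index, (x, pixels) in enumerate(sprites):
        for offset, opaque in enumerate(pixels):
            if budget == 0:
                # Out of work units: nothing further is drawn on this line,
                # so no further collisions can be detected either.
                return owner, collisions
            budget -= 1
            px = x + offset
            if not opaque or not (0 <= px < width):
                continue
            if owner[px] is not None:
                # A second sprite tried to draw over an occupied pixel.
                collisions.add((owner[px], index))
            else:
                owner[px] = index
    return owner, collisions


overlapping = [(0, [1, 1, 1, 1]), (2, [1, 1, 1, 1])]
_, full = render_sprite_line(overlapping, width=8, budget=100)
_, starved = render_sprite_line(overlapping, width=8, budget=4)
```

With a generous budget the two sprites collide; with a budget of 4 the second sprite is never drawn, so the collision goes unreported. That is why the collision check cannot simply be split off from the rendering work.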

I think there's something, though, to be mined from the approach of precalculating the work units per line for each sprite, so that we have less to calculate on-the-fly later on and can at least zip through the 128 sprites a little more quickly with faster ops when we're confident that there are enough work units to get through the entire sprite.

Another thought is that it may be useful to cache precalculated data about sprites, so that we don't have to pay the cost of re-calculating sprite info for looping sprite animations. In my most recent branch for working on performance improvements, I'm doing a lot of backbuffering of layer data, and then caching the backbuffers and layer settings based on a signature for the layer, so this is a similar idea to that. My video_layer_properties caching is pretty naive, though... it's just a linked list of up to 16 elements, but it gets the job done and removes the performance hit of switching between layer settings after the cache has been warmed up. I think a similar approach would be useful for sprites, but I anticipate needing something more like a hash table for storing hundreds of video_sprite_properties structs. There's also an enormous caveat because it's very easy to have to invalidate most, if not all, of the sprite cache when performing a write to VRAM that isn't directly to sprite properties, because otherwise we have to loop through every entry in the sprite cache and individually poke assets. There's gotta be a better way, and I'm wondering if there's a "split the difference" approach that might be possible by doing some precalculation over the entirety of VRAM, but for very few unique combinations of sprite properties.
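As a rough sketch of the caching idea, the snippet below keeps precalculated data in a small table keyed by a signature and records which VRAM range each entry was built from. The `SpriteCache` class, its method names and the half-open `[vram_start, vram_end)` range are hypothetical and chosen for illustration; they are not the emulator's `video_sprite_properties` code.

```python
from collections import OrderedDict


class SpriteCache:
    """Hypothetical signature-keyed cache with least-recently-used eviction."""

    def __init__(self, capacity=16):
        self.capacity = capacity
        self.entries = OrderedDict()  # signature -> (vram_start, vram_end, data)

    def get(self, signature):
        entry = self.entries.get(signature)
        if entry is None:
            return None
        self.entries.move_to_end(signature)  # mark as recently used
        return entry[2]

    def put(self, signature, vram_start, vram_end, data):
        self.entries[signature] = (vram_start, vram_end, data)
        self.entries.move_to_end(signature)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry

    def invalidate_write(self, address):
        """Drop every entry whose source range covers the written VRAM address."""
        stale = [sig for sig, (start, end, _) in self.entries.items()
                 if start <= address < end]
        for sig in stale:
            del self.entries[sig]
        return len(stale)


cache = SpriteCache(capacity=2)
cache.put(("sprite", 0x1000, 16, 16), 0x1000, 0x1100, "rows-a")
cache.put(("sprite", 0x2000, 8, 8), 0x2000, 0x2040, "rows-b")
dropped = cache.invalidate_write(0x1080)
```

Note that `invalidate_write` walks every entry on each VRAM write. That is exactly the cost described above, and it grows with the number of cached sprites, so a real implementation would likely want an index from VRAM regions to entries.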

This is currently where my thought processes are at.

BruceMcF · Post by **BruceMcF** » Sun Aug 02, 2020 4:50 am

On 7/27/2020 at 2:12 PM, TomXP411 said:

I'm really hoping the source code gets released, because I really want to see Commander on the MiSTer FPGA platform. All of the other components in Commander already exist as open source FPGA code; it's just VERA that is unique, at this point.

Since the MiSTer system is the kind of thing where many of its users will be heavy users of bootleg ROMs, it would not be surprising if Cloanto was hesitant to see the CX16 on MiSTer.

Yuki · Post by **Yuki** » Sun Aug 02, 2020 6:36 am

On 7/27/2020 at 1:22 PM, SlithyMatt said:

I'm not sure of the legality of the emulator's BSD license, as it is emulating a proprietary system.

An emulator's license can't apply to the system it emulates (and vice versa), as long as the code that replicates said system is your own. It's kinda like ReactOS, a reverse-engineered reimplementation of Windows: it's perfectly legal (or at least assumed to be) as long as you're not stealing any code, ideas, patents or trademarks from Microsoft. Well, sure, it's their IP, but there's also a certain concept of interoperability in the law, so emulators are legal, or at least the companies who own said systems won't mind.

TomXP411 · Post by **TomXP411** » Mon Aug 03, 2020 6:24 am

On 8/1/2020 at 9:50 PM, BruceMcF said:

Since the MiSTer system is the kind of thing where many of its users will be heavy users of bootleg ROMs, it would not be surprising if Cloanto was hesitant to see the CX16 on MiSTer.

Cloanto is already in that business, since they sell two different emulators of their own (which are mostly just a wrapper for open source emulation products.) And there is a perfectly legal way forward, since buying Amiga Forever or 64 Forever includes a license to use the ROM on physical hardware.

BruceMcF · Post by **BruceMcF** » Mon Aug 03, 2020 10:43 am

4 hours ago, TomXP411 said:

Cloanto is already in that business, since they sell two different emulators of their own (which are mostly just a wrapper for open source emulation products.) And there is a perfectly legal way forward, since buying Amiga Forever or 64 Forever includes a license to use the ROM on physical hardware.

If they were the distributor, they would obviously be much less hesitant.

Sohl · Post by **Sohl** » Thu Jan 14, 2021 3:29 am

New guy here! I posted an intro in the Intro section, so I will jump right into my question.

@Frank van den Hoef if you have a moment, or perhaps others can answer too. A question came up about adding math coprocessor functions to VERA in a YouTube comment to a Matt Heffernan assembly programming tutorial using the X-16 as a target platform. Anyway, I've only done a small amount of VHDL for FPGA on Xilinx Zynq but I know many FPGAs have several specialized multiplier/DSP cells that can be used in designs. Is the VERA design close enough to complete now that you know which FPGA will be used for VERA and how many unallocated cells it will have that might support later added functions? Post-launch, of course!

Hope this is not too prickly of a topic to raise again. I've skimmed this thread and several others and I think I understand the motivation and intent of the X16 project to keep to a fairly pure 8-bit 80's experience. I hope to support it myself with some programming work at least. Best wishes to all!

Terrel Shumway · Post by **Terrel Shumway** » Tue Feb 16, 2021 12:12 am

On 1/13/2021 at 8:29 PM, Sohl said:

New guy here! I posted an intro in the Intro section, so I will jump right into my question.

@Frank van den Hoef if you have a moment, or perhaps others can answer too. A question came up about adding math coprocessor functions to VERA in a YouTube comment to a Matt Heffernan assembly programming tutorial using the X-16 as a target platform.

Back in 1981, Digital Acoustics DBA DTACK Grounded created a 68000 co-processor board for Apple II and PET to give faster floating point to the host computers. This seems very similar to what Acorn was doing with the Tube. I would think that just plunking an ARM core on an expansion card would be a really cool way to do the same thing today, and much easier (and cheaper) than creating your own FPU in the VERA FPGA.

Sean · Post by **Sean** » Tue Feb 16, 2021 2:29 am

1 hour ago, Terrel Shumway said:

Back in 1981, Digital Acoustics DBA DTACK Grounded created a 68000 co-processor board for Apple II and PET to give faster floating point to the host computers. This seems very similar to what Acorn was doing with the Tube. I would think that just plunking an ARM core on an expansion card would be a really cool way to do the same thing today, and much easier (and cheaper) than creating your own FPU in the VERA FPGA.

I wonder if you could just interface with an actual FPU on an expansion board. It would take work, but the Motorola 68882 math coprocessor was designed to work as a direct coprocessor for compatible Motorola MPUs and as a peripheral processor for other Motorola and non-Motorola MPUs. That is, you can place it on an 8-, 16-, or 32-bit bus and treat it as a peripheral chip to be interfaced with like a 65C22 VIA or an ACIA. The coprocessor and MPU do not have to have the same clock speed. You would need glue logic. You would need code on the assembly side to convert the data to and from the correct floating point format. There are small quantities of used and NOS Motorola 68882 math coprocessors available from eBay and the like. There are tens of thousands available from Rochester Electronics NOS, though Rochester tends to want about $250 for a minimum order of any product except if they've got less than $250 of a product remaining.
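To illustrate the format conversion step, here is a minimal sketch that assumes the host side stores numbers in the Commodore BASIC 5-byte packed format (one exponent byte biased by 128, then a 32-bit mantissa whose top bit is replaced by the sign) and that the coprocessor side wants IEEE 754 values, which is what Python floats are. The function names are made up for the example, and real code on the X16 would of course be 65C02 assembly rather than Python.

```python
import math


def cbm_to_float(packed):
    """Convert a 5-byte Commodore-style packed float to a Python (IEEE 754) float."""
    exponent = packed[0]
    if exponent == 0:
        return 0.0  # an exponent byte of zero means the value is zero
    sign = -1.0 if packed[1] & 0x80 else 1.0
    # Restore the implied leading 1 bit that the sign bit replaced.
    mantissa = ((packed[1] | 0x80) << 24) | (packed[2] << 16) | (packed[3] << 8) | packed[4]
    return sign * mantissa / 2**32 * 2.0 ** (exponent - 128)


def float_to_cbm(value):
    """Convert a Python float to the 5-byte packed format, rounding the mantissa."""
    if value == 0.0:
        return bytes(5)
    fraction, exponent = math.frexp(abs(value))  # 0.5 <= fraction < 1
    mantissa = round(fraction * 2**32)
    if mantissa == 2**32:  # rounding carried out of the 32-bit mantissa
        mantissa >>= 1
        exponent += 1
    biased = exponent + 128
    if not 1 <= biased <= 255:
        raise OverflowError("value is outside the range of the 5-byte format")
    high = (mantissa >> 24) & 0x7F  # drop the implied leading 1 bit
    if value < 0:
        high |= 0x80  # store the sign in its place
    return bytes([biased, high, (mantissa >> 16) & 0xFF, (mantissa >> 8) & 0xFF, mantissa & 0xFF])


ten = float_to_cbm(10.0)
minus_one = cbm_to_float(bytes([0x81, 0x80, 0x00, 0x00, 0x00]))
```

The 32-bit mantissa sits between IEEE single precision (24 bits) and double precision (53 bits), so going through single precision would lose accuracy while double precision round-trips cleanly.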

I was looking into the 68882 for a possible 65C816 homebrew computer project. Certainly not for mass production like the crew here is doing with the X16, or some of the other projects out there like the Mega65 or the C256. The more I think about it, though, the more I'm considering making it into a second processor expansion for the Commander X16, akin to what could be done with the BBC Micro or the TRS-80 Model 16's Z-80 + 68000 default configuration. I'll have to think more on this after I get past the breadboard stage of tinkering.

Another FPU option would be a separate FPGA as math coprocessor - there are several options up on OpenCores.org. Again, some glue logic would be needed to integrate it with whatever expansion bus the Commander X16 ultimately offers.

A third thought would be interfacing with a Raspberry Pi Zero. Alas, the Pi won't function as an SPI slave, which would be the obvious interface, but a bare metal Pi could be interfaced without the latency issues of Linux and still offer most of the hardware capabilities if you needed them: FPU, HDMI, USB, etc. The circle project offers a C++ starting point for bare metal Pi programming, with plenty of examples. You could make it work with a Pi running Linux, but there would likely be random latency problems because the OS has preempted your code.

Just throwing some more ideas out there.

Terrel Shumway · Post by **Terrel Shumway** » Tue Feb 16, 2021 4:07 am

1 hour ago, Sean said:

I wonder if you could just interface with an actual FPU on an expansion board. It would take work, but the Motorola 68882 math coprocessor was designed to work as a direct coprocessor for compatible Motorola MPUs and as a peripheral processor for other Motorola and non-Motorola MPUs. That is, you can place it on an 8-, 16-, or 32-bit bus and treat it as a peripheral chip to be interfaced with like a 65C22 VIA or an ACIA. The coprocessor and MPU do not have to have the same clock speed. You would need glue logic. You would need code on the assembly side to convert the data to and from the correct floating point format. There are small quantities of used and NOS Motorola 68882 math coprocessors available from eBay and the like. There are tens of thousands available from Rochester Electronics NOS, though Rochester tends to want about $250 for a minimum order of any product except if they've got less than $250 of a product remaining.

https://www.pjrc.com/store/teensy41.html

Teensy 4.1 has an FPU (and USB and ....) and costs less than $30... faster and more capable than any retro math coprocessor. DTACK Grounded did eventually support the 68882 and the National Semiconductor FPU as peripherals when the chips became available, but even the software FP stack running on a 68000 was way faster than the 6502.

Interfacing a modern ARM SOC to the X16 may seem like cheating and out of scope for David's vision, but DTACK Grounded was also out of scope for Apple's vision of the II series. It was an amazing tool for those who cared about crunching numbers.

By creating a "stuffer" card like the one D.G. created for the Apple II, the Kernal only needs one low-level driver to support whatever SCSI-like packet interface people want to throw on an expansion card (including Ethernet and USB). I think this was also the idea behind Acorn's "Tube": you can add whatever co-processor you want and you only need one software interface.

I don't think there is any need for an ACIA or VIA. A much cleaner interface can be created with any microcontroller that has enough (16-40?) GPIO pins.

1 hour ago, Sean said:

I was looking into the 68882 for a possible 65C816 homebrew computer project. Certainly not for mass production like the crew here is doing with the X16, or some of the other projects out there like the Mega65 or the C256. The more I think about it, though, the more I'm considering making it into a second processor expansion for the Commander X16, akin to what could be done with the BBC Micro or the TRS-80 Model 16's Z-80 + 68000 default configuration. I'll have to think more on this after I get past the breadboard stage of tinkering.

Great idea. This is exactly what I am thinking.

1 hour ago, Sean said:

A third thought would be interfacing with a Raspberry Pi Zero. Alas, the Pi won't function as an SPI slave, which would be the obvious interface, but a bare metal Pi could be interfaced without the latency issues of Linux and still offer most of the hardware capabilities if you needed them: FPU, HDMI, USB, etc. The circle project offers a C++ starting point for bare metal Pi programming, with plenty of examples. You could make it work with a Pi running Linux, but there would likely be random latency problems because the OS has preempted your code.

Does the X16 do SPI? That would be perfect.

Sean · Post by **Sean** » Tue Feb 16, 2021 4:42 am

24 minutes ago, Terrel Shumway said:

Does the X16 do SPI? That would be perfect.

The Commander X16 team have been rather sparse in detailing the hardware, but I haven't seen anything about SPI support in the posts that I have seen about it. On the other hand, there are plenty of "generic" 6502 solutions for that, which I suspect will be compatible with what they have said about the expansion capability. Bit-banging using a VIA is a common approach. I think it can also be done with shift registers. But there are also two common CPLD-based approaches I've seen for 6502 (and 65816) use.

65SPI using an Atmel ATF1504 CPLD: https://sbc.rictor.org/65spi2.html

65SPI/B or SPI65/B using a Xilinx 9572: http://www.6502.org/users/andre/spi65b/index.html, http://www.6502.org/users/andre/spi65b/65SPI-B Datasheet V1.1.pdf

And if you're interested in a bus approach to connect multiple SPI devices, see http://forum.6502.org/viewtopic.php?p=10957
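For the VIA bit-banging approach mentioned above, the logic boils down to something like the sketch below. It models SPI mode 0 (clock idles low, data sampled on the rising edge, most significant bit first) against a stand-in pin object; `LoopbackPins` and `spi_transfer_byte` are hypothetical names, and on real hardware the `write` and `read_miso` calls would become reads and writes of a VIA port register in 6502 assembly.

```python
class LoopbackPins:
    """Hypothetical stand-in for the port pins: MISO simply mirrors MOSI."""

    def __init__(self):
        self.mosi = 0
        self.sck = 0
        self.clock_edges = 0

    def write(self, mosi, sck):
        if sck != self.sck:
            self.clock_edges += 1  # count clock transitions for inspection
        self.mosi, self.sck = mosi, sck

    def read_miso(self):
        return self.mosi


def spi_transfer_byte(pins, value):
    """Shift one byte out and one byte in, SPI mode 0, MSB first."""
    received = 0
    for bit in range(7, -1, -1):
        mosi = (value >> bit) & 1
        pins.write(mosi, 0)  # present the data bit while the clock is low
        pins.write(mosi, 1)  # rising edge: both sides sample their input
        received = (received << 1) | pins.read_miso()
        pins.write(mosi, 0)  # falling edge: return the clock to idle
    return received


pins = LoopbackPins()
echoed = spi_transfer_byte(pins, 0xA5)
```

Each byte costs eight passes through the loop with several port accesses per pass, which is the main reason the CPLD designs linked above exist: they do the shifting in hardware and leave the CPU a single register read or write per byte.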

Most of the details I've seen so far regarding expansion are in this post: