FPGA as graphics card?

Lasagna
Posts: 8
Joined: Thu Aug 26, 2021 8:24 pm


Post by Lasagna »


I will point out that VERA has recently been cloned (re-implemented) for the rosco_m68k, a 68000-based retro computer.

https://www.tindie.com/products/rosco/xosera-fpga-video-r1/

It doesn't look like it re-uses any of Frank's work; rather, the same features have been implemented on an UPduino and modified. Same 128K of video memory, etc.

I really like the 16:9 aspect ratio option that was added for those that do not have an old monitor hanging around.

No mention of sprites here, which makes me wonder if they are implemented. 

They did add an Amiga style co-processor (copper).

It is nice that it is already open sourced.

Wavicle
Posts: 284
Joined: Sun Feb 21, 2021 2:40 am


Post by Wavicle »



In reply to Lasagna's post of 1/6/2022 at 1:43 PM:

I like that they based it around the UPduino and iCEBreaker, which are relatively easy to get.

That said, this looks more like a VERA-inspired project than a clone. It has a few features that VERA does not, but if I'm reading the RTL correctly, there is no space available for sprites. The VRAM data path is 16 bits wide, clocked at the dot clock frequency, and playfields have priority. Since the effective memory bandwidth is 2 bytes per dot clock, there are no spare cycles for anything else (e.g. sprites, host VRAM access) with two playfields active. While the display is active, reads and writes to memory can only take place during blanking periods. It looks like software needs to check a VRAM read/write address register to make sure that a VRAM access goes to the intended location. While only allowing access during HBLANK or VBLANK was a common limitation for several early computers and consoles, it was not one for the C64 and is not one for VERA.

I know widening the VRAM data path is listed as a TODO item, but the Upduino+VGA version's FPGA has used 100% of the BRAM and 83% of the LCs. Sprites are a pretty big deal on retro hardware. I think the design would be vastly improved by completely dropping the second playfield in favor of sprites.
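
The bandwidth argument above can be checked with a little arithmetic. This is a minimal sketch, assuming a 16-bit data path that delivers one word per dot clock and playfields running at full horizontal resolution; the function name `spare_bytes_per_dot` is made up for illustration and is not taken from the Xosera RTL.

```python
from fractions import Fraction

# 16-bit data path, one access per dot clock (as described in the post above)
VRAM_BYTES_PER_DOT = 2


def spare_bytes_per_dot(*playfield_bpp):
    """Bytes per dot clock left over after the active playfields fetch pixels.

    Assumption: every playfield runs at full horizontal resolution, so each
    one consumes bpp/8 bytes on every dot clock of the visible line.
    """
    demand = sum(Fraction(bpp, 8) for bpp in playfield_bpp)
    return VRAM_BYTES_PER_DOT - demand


print(spare_bytes_per_dot(8, 8))  # two 8-bpp playfields
print(spare_bytes_per_dot(8))     # one 8-bpp playfield
print(spare_bytes_per_dot(4, 4))  # two 4-bpp playfields
```

With two 8-bpp playfields the result is zero, which is the "no spare cycles" case; dropping a playfield or halving the colour depth frees one byte per dot clock under these assumptions.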

Kalvan
Posts: 115
Joined: Mon Feb 01, 2021 10:05 pm


Post by Kalvan »



In reply to Wavicle's post of 1/7/2022 at 2:06 PM:

Well, maybe for the Xosera II, they could use a bigger FPGA to implement hardware sprites.  Since this version doesn't seem to include the sound channels and FIFO, presumably that should leave some room in the FPGA for more features.

Roscopeco
Posts: 1
Joined: Sun Jul 10, 2022 11:07 pm


Post by Roscopeco »



In reply to Wavicle's post of 1/7/2022 at 7:06 PM:

Sorry to resurrect an old thread here, but I just found this and thought I’d correct a few assumptions.

First off, you’re totally correct that Xosera is not a VERA clone, but it is inspired by VERA. Given that our primary target is a 16-bit rather than an 8-bit system, we changed the things that it made sense to change, and what we wound up making is much closer to the Amiga hardware than to the C64 - not because the C64 hardware isn’t great, but because the Amiga fits our interests more.

You’re right that we don’t have hardware sprites at the moment, but your read on memory bandwidth is not quite correct - in fact we have multiple subsystems with memory access on a per-pixel basis. Obviously the display hardware has to fetch pixels on every dot clock, but because of the way we’ve segregated BRAM we also have a pixel-synchronised COPPER fetching instructions and writing registers, along with four channels of audio hardware doing its thing and the BLITTER running in tandem to do things like async block copies and BOB rendering (all controllable by other systems). We don’t have direct host bus access on the UPduino due to a lack of pins, but I do have a prototype Xosera which acts as a full 68k bus participant (including vectored interrupts and bus mastering for host memory access) for when we, in the fullness of time, have a bigger FPGA (i.e. more I/O pins).

FWIW your call on disabling playfield B may have been prescient: we’ve currently done that to make things fit (for audio, not so much sprites), but we’re relatively confident we can get back to a place in which all the main options fit. That said, we’re not overly worried about sprites - we believe we have enough computational and memory capacity to live without them beyond the one or two we’re currently targeting (possibly for bigger FPGAs).

Caveat: I’m a mere collaborator on Xosera, but the designer of the rosco_m68k.

Wavicle
Posts: 284
Joined: Sun Feb 21, 2021 2:40 am


Post by Wavicle »



In reply to Roscopeco's post of 7/10/2022 at 4:06 PM:

Just looking at vram.sv, it looks to me like a 16x64K memory clocked at the pixel frequency. 16x64K is the entirety of the single-ported RAM on the iCE40UP5K. The effective bandwidth of the single-ported RAM is therefore 2 bytes per clock cycle. If any subsystem other than the video generator wants to access memory during active display, the VRAM arbiter (vram_arb.sv) will stall it until no video generator accesses are pending. At the time I wrote the original message, there were two 8-bit playfields, necessitating a minimum aggregate bandwidth of 2 bytes per pixel. There was no bandwidth to fetch sprites from VRAM, so they would have to live in BRAM. At the time, 100% of BRAM was committed (see Xosera/xosera_upd_vga_640x480_stats.txt at commit 6c5ae5c71fc36852eee47b0d666d781bd49cccf5 in the XarkLabs/Xosera repository on GitHub). What about my read was not correct?
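
A toy model of the stall behaviour described above may make the point concrete. It is only a sketch of fixed-priority arbitration on a single-ported memory, not a transcription of vram_arb.sv; the function name `simulate_line` and the line timing (640 visible dots plus 160 blanking dots, as in standard 640x480 VGA) are assumptions chosen for illustration.

```python
from fractions import Fraction


def simulate_line(visible_dots, blank_dots, video_words_per_dot, host_requests):
    """Fixed-priority arbitration over one scanline of single-ported VRAM.

    One 16-bit word can be transferred per dot clock. The video generator
    always wins; a host access is granted only on a cycle where no video
    fetch is pending. Returns the cycle numbers at which host requests
    were served.
    """
    served = []
    video_debt = Fraction(0)  # words the video generator still needs
    for cycle in range(visible_dots + blank_dots):
        if cycle < visible_dots:
            video_debt += video_words_per_dot
        if video_debt >= 1:
            video_debt -= 1            # video takes this cycle
        elif len(served) < host_requests:
            served.append(cycle)       # host gets the otherwise idle cycle
    return served


# Two full-resolution 8-bpp playfields: one word per dot, host waits for blanking.
print(simulate_line(640, 160, Fraction(1), 4))     # [640, 641, 642, 643]
# Half the video demand: every other visible cycle is free for the host.
print(simulate_line(640, 160, Fraction(1, 2), 4))  # [0, 2, 4, 6]
```

In the first case every host access is pushed out to the blanking interval, which is the behaviour being argued about; in the second the host is served during active display.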

Xark
Posts: 5
Joined: Thu Jul 21, 2022 2:43 am


Post by Xark »


Hello,

I just stumbled on this thread. I am the designer of the Xosera retro video project (along with Roscopeco, who designed the Xosera hardware board and the copper co-processor, as well as the main computer I am using with it).

For me, a good retro computer has to have "fun" retro graphics, and an FPGA is the closest thing to making a custom ASIC "video chip" for that right now. VERA seems a very nice design, but originally it wasn't clear that it would be open, which is why I started on Xosera (besides wanting an excuse to really learn Verilog). Xosera was started before I was even sure what FPGA the X16/VERA was going to use. Xosera is trying to extend the rosco_m68k system to be "kind of similar" to Amiga or Atari ST era computers for graphics (with some SNES and other tiled systems mixed in due to VRAM limits).

I have been a game developer for a fair number of decades (for 6502, 68K and many other systems), and I basically tried to cram in as many "video chip" features as possible that would make sense to use to "make a game" (or demoscene demo etc.) with 128KB of VRAM and a retro CPU. The 128KB of "VRAM" (SPRAM) seemed just enough for a reasonable video design (with some clever retro tricks - not for brute force true-color bitmaps [boring]).

The design has pretty much exceeded my expectations, but I am currently running up against some resource limits of the FPGA when I enable every option. The 4-channel Amiga-like audio was a bit more "expensive" than I was hoping. Currently it seems I can fit 2 channels of audio with most of the other important features (dual playfields, blitter, copper), but I really want four channels (for good MOD audio sources).

I had certainly intended to include sprites (or at least one for an "easy" mouse cursor), but this resource limit is making me rethink things a bit. I will mention that the included blitter can draw a large number of blitter objects, so it is not totally clear to me that sprites are required (in my initial testing I can erase and re-draw ~10 32x16 16-bpp objects in the VBLANK interval - so quite a large number should be possible with double buffering). A few sprites would be very handy though...

As far as DMA bandwidth goes, Wavicle is correct that *if* you use full horizontal resolution (640, or 848 for 16:9) on both playfields with both showing 8-bpp, there is indeed no more memory bandwidth during the pixel scan-out time. However, this isn't generally realistic because of VRAM limits, so as a practical matter you are probably going to be either horizontally pixel doubling or using 4-bpp - or both (I will mention that Xosera also has 10KB of "tile" memory that can be accessed in parallel with VRAM, so sprites could live there).

Given all that, sprite bandwidth is still "not really a big problem" (vs the FPGA being "full") because there are quite a lot of off-screen cycles where the sprite DMA could occur (~275 of them on each line before the first pixel; this is also where the audio DMA currently occurs). So using this DMA time, the sprite "line buffers" could be filled before scan-out (and then overlaid on top of the playfield at the appropriate horizontal pixel).
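
To put rough numbers on that per-line budget, here is a small sketch. It assumes one 16-bit word per cycle, uses the ~275 off-screen cycles quoted above, and invents the helper names and the `reserved_slots` parameter (a stand-in for cycles already used by audio DMA, whose real count is not given here) purely for illustration.

```python
OFFSCREEN_SLOTS = 275  # approximate off-screen cycles per line quoted above
WORD_BITS = 16         # one 16-bit VRAM word per cycle (assumption)


def playfield_words_per_line(width, bpp, h_repeat=1):
    """16-bit words one playfield fetches for a scanline.

    width is the output width in pixels; h_repeat is the horizontal
    pixel-repeat factor (2 means pixel doubling).
    """
    source_pixels = width // h_repeat
    return source_pixels * bpp // WORD_BITS


def sprite_pixels_per_line(sprite_bpp, reserved_slots=0):
    """Sprite pixels that fit in line buffers filled during off-screen cycles."""
    usable_slots = OFFSCREEN_SLOTS - reserved_slots
    return usable_slots * WORD_BITS // sprite_bpp


full = 2 * playfield_words_per_line(640, 8)        # both playfields, 8-bpp, full width
reduced = 2 * playfield_words_per_line(640, 4, 2)  # both playfields, 4-bpp, pixel doubled
print(640 - full, 640 - reduced)                     # 0 480
print(sprite_pixels_per_line(4))                     # 1100
print(sprite_pixels_per_line(4, reserved_slots=16))  # 1036
```

Under these assumptions the full-resolution 8-bpp case leaves no visible-line cycles free, the reduced case leaves 480, and the off-screen cycles alone could fill on the order of a thousand 4-bpp sprite pixels per line.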

I still hope to optimize and trim to get pretty much everything fitting, and I am making slow progress (it is close - but the routing gets problematic). For sure I plan to make some sprites available as an option in the design (even if you can't enable *all* the options at once on this small FPGA). Xosera has already been ported to other, larger FPGAs (ECP5) as well as hooked up to a 6502 system https://zeromips.org/posts/2022-03-20-xosera/ (with a nice demo ported over too, linked below). Even though Xosera was designed for use with 68K, since it uses an 8-bit parallel bus interface it is pretty easy to use with any 8-bit CPU (the main issue is level conversion or running at 3.3V - like I did with AVR). So lots of "retro" possibilities.





I am still working on the SystemVerilog design, but I look forward to writing some 68K (and 6502) code to do some fun things on Xosera eventually.  I love all the older systems and programming them (and FPGAs to make new modern-retro designs).



Take it easy,

Xark

https://hackaday.io/Xark

Kalvan
Posts: 115
Joined: Mon Feb 01, 2021 10:05 pm


Post by Kalvan »


I had a wild, wacky idea that you might want to follow up on:

 

How about a tile mode consisting of tiles 36 horizontal x 28 vertical pixels? In a 400x225 16:9 aspect ratio display, with the leftover columns and rows used for line buffering, you get 11x8, or 88 tiles per screen. Each tile would consist of four stacks of 7 pixels, 36 stacks wide. You can map the bits of each stack so that 7 bits are used to map colors from a given CLUT, with an eighth set of seven bits used to select the CLUT. This means that you can have a hypothetical maximum of 16,384 colors (128 CLUTs of 128 colors each) onscreen, and with Video RAM at 128K mapped half and half between the tile data and the scrolling field tile map, there's room for 74 unique tiles and a screen map of 256x256 tiles in a single field, or 23x32 screens (736 total) at that 16:9 aspect ratio.

If you reduce the pixel bitwidth, you can have even more tiles and/or an even larger scrolling field.

Of course, one drawback is the memory requirement for all those CLUTs, which may cut into either the tile data or the screen tile map.  And this still doesn't account for necessary  Video RAM for BOBs...
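
The figures in this idea can be worked through in a few lines. This is only a sketch: it assumes one byte per tile-map entry, reads a "stack" as 7 vertical pixels at 7 bits each, and computes the tile count both without and with a 7-bit CLUT selector stored per stack. All names are invented for the example.

```python
SCREEN_W, SCREEN_H = 400, 225
TILE_W, TILE_H = 36, 28
STACK_H = 7                      # pixels per vertical stack
PIXEL_BITS = 7                   # colour index within a CLUT
HALF_VRAM_BITS = 64 * 1024 * 8   # half of 128K, in bits

tiles_across = SCREEN_W // TILE_W            # whole tiles per row
tiles_down = SCREEN_H // TILE_H              # whole tiles per column
tiles_per_screen = tiles_across * tiles_down

max_colors = 128 * 128                       # 128 CLUTs x 128 colours

# Tile storage: pixels only, then pixels plus one 7-bit CLUT selector per stack.
stacks_per_tile = TILE_W * (TILE_H // STACK_H)
bits_plain = TILE_W * TILE_H * PIXEL_BITS
bits_with_selector = stacks_per_tile * (STACK_H + 1) * PIXEL_BITS
unique_tiles_plain = HALF_VRAM_BITS // bits_plain
unique_tiles_with_selector = HALF_VRAM_BITS // bits_with_selector

# A 256x256 map at one byte per entry fills the other 64K exactly.
map_bytes = 256 * 256
screens = (256 // tiles_across) * (256 // tiles_down)

print(tiles_across, tiles_down, tiles_per_screen)      # 11 8 88
print(max_colors)                                      # 16384
print(unique_tiles_plain, unique_tiles_with_selector)  # 74 65
print(map_bytes, screens)                              # 65536 736
```

Under these assumptions the 74-tile figure corresponds to storing only the 7-bit pixels; if the CLUT selector for each stack also lives in tile data, each tile takes 1008 bytes and 65 tiles fit in 64K.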

ExtraOrdinary
Posts: 1
Joined: Wed Jul 27, 2022 5:21 pm


Post by ExtraOrdinary »



In reply to Kalvan's post of 7/25/2022 at 1:33 AM:

There are a lot of ways to segment your display grid. But first, you need to settle some basic points of your design: aspect ratio, bitmapped or tiled, max resolution, and the number of tile layers and/or buffer surfaces. To have maximum flexibility, you must provide capable hardware, which would no doubt be an FPGA, coupled with a framebuffer large enough to support any given video configuration. I see that VERA, for example, relies on 128 KB of on-chip pSRAM, which outlines the grid for a tile-oriented approach. 128 KB could also allow graphics modes up to 640x400x8 or 320x200x16, which is another possibility.
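
Whether a given bitmap mode fits in 128 KB is quick to check. The sketch below treats the third number in a mode like 320x200x16 as bits per pixel, which is an assumption about the notation; the helper names are invented for the example.

```python
VRAM_BYTES = 128 * 1024  # 131072


def framebuffer_bytes(width, height, bpp):
    """Bytes needed for one packed bitmap frame at the given depth."""
    return width * height * bpp // 8


def fits_in_vram(width, height, bpp):
    return framebuffer_bytes(width, height, bpp) <= VRAM_BYTES


for mode in [(640, 400, 8), (640, 400, 4), (320, 200, 16), (320, 200, 8)]:
    print(mode, framebuffer_bytes(*mode), fits_in_vram(*mode))
```

Read this way, 320x200 at 16 bpp and 640x400 at 4 bpp each need 128,000 bytes and fit, while 640x400 at 8 bpp needs 256,000 bytes and does not.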

I'd definitely go for a more configurable design, to allow more room to explore what wasn't accessible back in the early Commodore days.
