VDPX: The Quest for High-Performance Retro Graphics

Page 1/21
| 2 | 3 | 4 | 5 | 6

By MagicBox

Master (198)

02-10-2008, 20:19

Introduction

Having read a lot about MSX graphics and the lingering desire for better (and supported!) video graphics, my interest was sparked. In the past, I have worked with CPLDs and FPGAs, which are programmable logic chips. Based on a love for nostalgia and the once glorious MSX platform, I made a simple calculation: 1 + 1 = 2. Having made hardware for MSX in the past, I am familiar with its architecture, with its VDPs and their limitations. Today, in 2008, FPGAs are quite affordable and have become extremely powerful. So the quest for a high-performance retro graphics extension for the MSX platform began. And so, VDPX was born.

What is VDPX?

VDPX is the project name for the design of a new graphics processor for the MSX, built from the ground up using a powerful FPGA chip. VDPX is developed by one person: me. The design is mostly based on my vision of what this graphics processor should be. However, all of its features are up for discussion to allow for feedback from the MSX scene. If the graphics processor is to be supported by the community, it should offer functionality that the community actually wants; of course, within the design limits. The resulting graphics processor design should ultimately fulfill your retro graphics desires for the MSX platform.

It is very important to realize that this project is not an attempt to lift MSX computing to the level of modern PCs. The goal is not to create stunning 3D graphics on an MSX machine. For that, use your PC and stuff a good NVIDIA or ATI card in there. No; VDPX instead builds on the vision of the 80's video chips. VDPX will in no way compare to any of the current PC graphics cards. VDPX is designed as a retro expansion for the MSX platform. However, compared to 80's video chips, VDPX will be blazingly fast and will offer capabilities that none of the V99xx video chips are able to match. VDPX is a project that will be built in the same retro spirit as the 80's chips.

VDPX is developed in conjunction with the MSX scene through forum feedback. This way, VDPX will hopefully become a widely supported hardware extension within the scene. Financially, I am trying to keep the cost of the cartridge as low as possible.

How will VDPX work on an MSX?

VDPX is designed as a cartridge. The original, real MSX will not need to be modified in any way to make VDPX work. Instead, the video output of the MSX is routed into VDPX using a simple cable. VDPX will have two outputs: RGB and composite. If feasible, maybe even an RF unit to hook up to a TV. Software can choose the video output: either the standard MSX video or the VDPX output. VDPX will be compatible with any MSX, from MSX1 to MSX TurboR, and will support both 50Hz and 60Hz refresh rates.

High-performance Retro Graphics.. gimme specs!

You probably skipped right down to this section, didn't you? Well. As explained, VDPX is in the same retro spirit as MSX itself was back in the days. This means, VDPX is a 2D graphics card supporting sprites. A lot of them. In addition to sprites, VDPX will also support layers whose viewports can be set for maximum flexibility. Now, a run-down of the specs:

- 256x216 resolution
- 512KB VRAM 
- 1KB Palette RAM 
- 1KB Sprite Attribute RAM
- 128 16x16 Sprites, no scan-line limitation (!)
- 6144 256-Color Map Patterns (8x8 tiles)
- 1024 256-Color Sprite Patterns (Shared with Map Patterns)
- 8 Logical Maps (Name Tables), dimensions and positions can be set freely.
- 16.7M Color Palette with individual and global 8-bit color fade registers.
- Advanced Sprite Collision Detect
- Memory Paged VRAM Access or through I/O ports with read/write pointers.
- 200MByte/Sec raw VRAM read throughput
- 150MByte/Sec raw VRAM write throughput
- Zero Wait-States, CPU will not have to wait at all between writes (OTIR!)
- Powerful blitter, colordepth conversion
- And much more!
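
As a rough sanity check on how these figures fit together, here is a small back-of-the-envelope sketch. It assumes one byte per pixel for 256-color patterns and a 256-entry palette; neither the memory layout nor the variable names come from the actual design.

```python
# Back-of-the-envelope memory budget for the spec list (sizes in bytes).
# Assumption: a 256-color pattern stores one byte per pixel.
KB = 1024
VRAM_BYTES = 512 * KB

TILE_BYTES = 8 * 8        # one 8x8 map pattern
SPRITE_BYTES = 16 * 16    # one 16x16 sprite pattern

map_pattern_bytes = 6144 * TILE_BYTES        # all map patterns
sprite_pattern_bytes = 1024 * SPRITE_BYTES   # shared with the map patterns
tiles_per_sprite = SPRITE_BYTES // TILE_BYTES
left_for_name_tables = VRAM_BYTES - map_pattern_bytes

# Assumption: 128 sprites share the 1KB attribute RAM evenly,
# and the 1KB palette RAM holds 256 entries.
attr_bytes_per_sprite = 1 * KB // 128
palette_bytes_per_color = 1 * KB // 256

print("map patterns:", map_pattern_bytes // KB, "KB")
print("sprite patterns:", sprite_pattern_bytes // KB, "KB")
print("tiles per sprite pattern:", tiles_per_sprite)
print("left for name tables etc.:", left_for_name_tables // KB, "KB")
print("attribute bytes per sprite:", attr_bytes_per_sprite)
print("palette bytes per color:", palette_bytes_per_color)
```

Under these assumptions the map patterns take 384KB, the 1024 sprite patterns overlap 256KB of that area, and 128KB remains for name tables and other data.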

As you can see, VDPX is geared foremost towards creating games, but it will also support a bitmapped mode so that other types of applications can make use of high-performance retro graphics. VDPX allows for smooth scrolling in all directions, each layer independently of the others. Sprites have a priority setting to determine their Z-axis position with respect to the layers. VDPX is intended to be a dream come true for creating high-performance retro graphics on our beloved MSX platform.

Defining a Standard

Ultimately, VDPX should become something that offers the features most of you want. This topic is meant for just that: providing feedback on features that are in line with the design. This means no "I want 3D graphics" types of requests. Stay in the realm of retro graphics. Furthermore, I will update this thread as the design progresses. There will be plenty of room for discussion, for example about blitter features. Who knows, VDPX may just become the graphics standard for today's MSX machines! All it needs is your support. And talking about support: as I know all the details of VDPX, I can support any developer with all the information and how-tos needed to create working software. It is ultimately a public-domain project, built by and for the community. When the day comes that the hardware works and is ready to be produced, it should come with manuals, programming examples, editor software and whatever else helps in developing software for VDPX, be it games or applications.

Closing

It's quite a project to build, and a very challenging one. However, it is a fun project for me to work on, and if it becomes supported well enough I may put even more energy into it. There will probably be people who will loathe this initiative, people wanting MSX to stay just that: the V99x8 with its limitations. For others, this may be the answer for creating the MSX software you could never create before due to the original VDPs' limitations...

All in all, I am enjoying the project so far and I hope you will too!

By Edwin

Paragon (1182)

02-10-2008, 21:56

I haven't been following the discussion at all for time management reasons. But I'm wondering if you're already implementing it, and if so, what you are using for it. Naturally it would be very helpful to stick it in the 1chipMSX.

By PingPong

Prophet (3460)

02-10-2008, 22:02

@MagicBox: do you think the card can fit on a std msx cartridge?

By MagicBox

Master (198)

02-10-2008, 22:10

@Edwin: I'm already designing it. The target platform is an Altera Cyclone III, and VDPX will use quite a lot of FPGA resources because of its sheer speed. (Memory blocks are used to cache sprite patterns in order to be able to process 128 sprites well within a pixel clock.) I doubt it will fit in the OCM unless that has been designed with a large-capacity, high-speed FPGA to begin with.

@PingPong: I'm aiming for the size of an FMPAC, most likely a bit taller.

By Leo

Paragon (1236)

02-10-2008, 23:00

Good. I always felt limited by the very sluggish copy speed of the V9990 and its poor sprite performance.

By Edwin

Paragon (1182)

03-10-2008, 00:11

A cyclone 3 for an MSX vdp seems like somewhat beyond retro to me. There's probably more power in there than every msx hardware extension combined.

I'm also wondering about using the memory bits for sprite patterns. Are you planning on retrieving all sprite patterns at some point during vblank?

By flyguille

Prophet (3029)

03-10-2008, 02:11

I would like some extras in the sprite attribute table that define the look of the sprite, if that is possible...

OK, there are 128 sprites of 16x16... I imagine they are MULTICOLORED, pixel by pixel???? That is not mentioned.

But......

(first request) Would it be too hard to store, along with the color value of each sprite pixel in the pattern, a transparency/solid value with at least 16 levels?

(second request)

Setting 2 bits in the SPRITE MODE register... can you do this???

For example:

BIT A B
0 0 = normal 128 sprites of 16x16
0 1 = 64 sprites BUT 32x16
1 0 = 64 sprites BUT 16x32
1 1 = 32 sprites BUT 32x32

I mean that the VDP does the following automatically....

When the mode is 32x16:
The VDP will use the XY coordinates of sprite 0 to position sprites 0 AND 1, but sprite 1 with an offset of +16px in X,
so if you move sprite 0, sprite 1 automatically moves with it.

This way it is THE SAME SPRITE ENGINE, just handling the attribute table differently.

So, if you JOIN 2 sprites horizontally, or 2 sprites vertically, or 4 sprites in a block of 32x32... you get bigger sprites while handling just ONE XY position. LESS work for the CPU....

(third request) The sprite collision logic could provide a list of collisions, like this.... Set up a collision buffer in hardware that covers one frame. While rendering the sprites, the engine detects the collisions and adds a new entry to the list for each one, then sets a flag to signal that there is something in the buffer. If a new frame starts before the CPU has inspected the buffer, the sprite collision system is disabled until the CPU reads and empties the buffer....

Of course the buffer will not contain collision information for each pixel of each sprite collision. I want the table to hold something like:

SPRITE X COLLIDES with sprite Y
Sprite Z collides with sprite E
.....

and registers to control the buffer:
<ThereIsColisionFlag>
<CpuReadedIt>
<BufferLen>

(Or maybe use simple buffer pointers in VRAM, like a STACK.... but a STOP flag that prevents the engine from filling it again is really recommended.)

If one pair of sprites collides in a lot of pixels, only the first detection should count. That is easy: the buffer just doesn't accept repeated data, comparing each new entry with the last one stacked.
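
The buffer behaviour requested above can be sketched as a small software model. This is only an illustration of the proposal, not anything VDPX implements; the class name, method names and the default capacity are hypothetical.

```python
class CollisionBuffer:
    """Software model of the proposed hardware collision list (hypothetical)."""

    def __init__(self, capacity=16):
        self.capacity = capacity     # assumed size; the proposal gives none
        self.entries = []            # len(entries) plays the role of <BufferLen>
        self.has_collision = False   # plays the role of <ThereIsColisionFlag>
        self.stopped = False         # STOP flag: engine may not refill the buffer

    def report(self, a, b):
        """Called by the sprite engine for every overlapping pixel pair."""
        if self.stopped or len(self.entries) >= self.capacity:
            return
        pair = (min(a, b), max(a, b))
        # Reject repeats by comparing with the last stacked entry only.
        if self.entries and self.entries[-1] == pair:
            return
        self.entries.append(pair)
        self.has_collision = True

    def start_frame(self):
        """A new frame begins; freeze the buffer if the CPU has not read it."""
        if self.entries:
            self.stopped = True

    def read_all(self):
        """CPU reads and empties the buffer, which re-enables detection."""
        pairs, self.entries = self.entries, []
        self.has_collision = False
        self.stopped = False
        return pairs


buf = CollisionBuffer()
for _ in range(3):
    buf.report(7, 3)      # same pair hit on several pixels: stored once
buf.report(9, 2)
buf.start_frame()         # unread buffer, so detection is frozen
buf.report(1, 2)          # ignored while frozen
print(buf.read_all())
```

Comparing only against the last entry keeps the hardware check cheap, at the cost of letting a pair appear twice if another collision is reported in between.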

By MagicBox

Master (198)

03-10-2008, 08:04

A cyclone 3 for an MSX vdp seems like somewhat beyond retro to me. There's probably more power in there than every msx hardware extension combined.

I'm also wondering about using the memory bits for sprite patterns. Are you planning on retrieving all sprite patterns at some point during vblank?

Well, the Cyclone 3's come in different capacity grades. I'm using the 3C16, the 3rd grade. The main reason is that it offers enough M9K blocks to do what I want, in addition to having enough pins to connect the video DAC, RAM and CPU bus. For the selected package (240 QFP) only the slowest speed grade, 8, is available. But it's fast enough for VDPX. The internal system frequency will be 200MHz, generated by the PLL from an external 50MHz oscillator. The Fmax when using the M9K blocks is 238MHz.

To maintain this speed, which is required to fully process 128 sprites, the sprite engine is 'multi-core'. There will be 4 processing cores that can each process a whole sprite in one clock tick (that is, determine if and which pixel to render for a given screen X/Y, processing sprite x, y, pattern and flip bits). To do this, the sprite engine is an 8-stage pipeline.
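
What one core decides per sprite can be sketched in software roughly as follows. This is a plain model of the decision only, not of the pipeline; the function name, argument names and the row-major, one-byte-per-pixel pattern layout are assumptions. Treating color 0 as transparent follows what is stated elsewhere in this thread.

```python
def sprite_pixel(sx, sy, spr_x, spr_y, pattern, flip_x=False, flip_y=False, size=16):
    """Return the color index this sprite contributes at screen (sx, sy), or None.

    pattern is assumed to be a row-major sequence of size*size color indices.
    """
    dx = sx - spr_x
    dy = sy - spr_y
    # The sprite does not cover this screen position at all.
    if not (0 <= dx < size and 0 <= dy < size):
        return None
    # Flip bits mirror the lookup inside the pattern, not the sprite position.
    if flip_x:
        dx = size - 1 - dx
    if flip_y:
        dy = size - 1 - dy
    color = pattern[dy * size + dx]
    # Color 0 is the transparent color, so it renders nothing.
    return color if color != 0 else None


pattern = [0] * 256
pattern[1] = 7                                   # pixel at pattern x=1, y=0
print(sprite_pixel(11, 20, 10, 20, pattern))     # inside the sprite, opaque pixel
print(sprite_pixel(10, 20, 10, 20, pattern))     # inside the sprite, transparent
print(sprite_pixel(9, 20, 10, 20, pattern))      # outside the sprite
```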

There will be plenty of pure logic capacity left in the FPGA, which I could use for integrating things like an SCC at no additional cost. Of course, the FPGA isn't retro; it's a modern high-performance part. But that's the beauty of all this :)

As for the sprite patterns: yes, they are retrieved from VRAM once per frame, 3 border scan-lines before the first content scan-line (Y=0). 128 256-color 16x16 sprites is 32KB worth of RAM. Since the VRAM has a 16-bit data bus, only 16K read operations need to be done. It's done as a burst read. The entire cache is filled in about one full scan-line. Whenever the CPU accesses VRAM, the CPU access is 'inserted'. However, since the CPU is so slow compared to VDPX, the additional CPU access cycles hardly extend the loading time.
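
The cache-fill arithmetic can be written out as a short calculation. The cache size and read count follow directly from the figures above; the time estimate additionally assumes that every 16-bit read costs two 200MHz cycles with no overhead, which is only a guess at how the burst behaves, so the real figure may well differ.

```python
# Sprite cache fill, worked out from the quoted figures.
SPRITES = 128
BYTES_PER_SPRITE = 16 * 16      # 256-color, so one byte per pixel
BUS_BYTES = 2                   # 16-bit VRAM data bus
CORE_HZ = 200_000_000           # internal system frequency
CYCLES_PER_READ = 2             # assumption: one 100MHz SRAM read = 2 core cycles

cache_bytes = SPRITES * BYTES_PER_SPRITE
reads = cache_bytes // BUS_BYTES
fill_us = reads * CYCLES_PER_READ / CORE_HZ * 1e6   # hypothetical estimate

print("cache size:", cache_bytes // 1024, "KB")
print("read operations:", reads)
print("estimated fill time: %.2f us" % fill_us)
```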

Because all sprite patterns are cached, during normal screen rendering VDPX will only need to access VRAM to read the 8 layer name tables and the 8 corresponding pattern pixels. Concurrently, the sprite engine will process the sprites from the cache while the layer engine processes the layers. When both are done, the final merge is done (the sprite pixel wins if it is valid and on top of the resulting layer pixel, if there is one). About 60% of the time between pixel clocks is used for rendering. The remaining 40% is left free to allow CPU access without delays, as well as blitter time. CPU access is synchronized with the VRAM arbiter; I'm using 100MHz SRAM, meaning a read only takes two 200MHz cycles. There will be about 60 cycles per pixel clock. 36 cycles are needed for rendering. Synchronized CPU access only takes 2 cycles every 4 pixel clocks. The CPU will never be able to choke VDPX in its operation :)
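
Put as a quick calculation, the per-pixel cycle budget looks like this. The input numbers are the ones quoted above; how the free cycles are actually divided between CPU and blitter is an assumption made only for the sake of the arithmetic.

```python
# Per-pixel cycle budget from the quoted figures.
CYCLES_PER_PIXEL = 60      # 200MHz cycles per pixel clock (approximate)
RENDER_CYCLES = 36         # needed for layer and sprite rendering
CPU_CYCLES = 2             # one synchronized CPU access
CPU_EVERY_N_PIXELS = 4     # ...at most once every 4 pixel clocks

render_share = RENDER_CYCLES / CYCLES_PER_PIXEL
free_cycles = CYCLES_PER_PIXEL - RENDER_CYCLES
cpu_cycles_per_pixel = CPU_CYCLES / CPU_EVERY_N_PIXELS
cpu_share = cpu_cycles_per_pixel / CYCLES_PER_PIXEL
# Assumption: whatever the CPU does not use is available to the blitter.
blitter_cycles_per_pixel = free_cycles - cpu_cycles_per_pixel

print("rendering share: %.0f%%" % (render_share * 100))
print("free cycles per pixel:", free_cycles)
print("CPU share: %.2f%%" % (cpu_share * 100))
print("cycles left per pixel for the blitter:", blitter_cycles_per_pixel)
```

Even with the CPU accessing VRAM at its maximum rate, it would take well under 1% of the available cycles, which is why it cannot stall rendering.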

By MagicBox

Master (198)

03-10-2008, 08:14

@flyguille:

Each sprite pixel can be set to one of the 256 available colors. Color 0 is always the transparent color. As for collision detection, that has already been worked out using a category system (see the thread in general discussion). A cross-collision table is too complex to make and would negatively impact performance, as it requires multiple passes over the attributes. The CPU would still have to examine all the collision entries. Using the category detection system, no additional passes are needed and the CPU knows instantly whether any collision is worth checking out.

I have been thinking about sprite modes though, for sprites up to 32x32. I'll see if and how I can implement this in the current engine. It certainly is useful.

I won't be supporting alpha blending at this time; maybe in a later design revision, once the initial design is finished. The FPGA has plenty of multiplier blocks to do blending calculations. If alpha blending is implemented, it certainly will not be on a per-pixel basis.

By MicroTech

Champion (386)

03-10-2008, 10:40

A suggestion, message posted on May 06 2008, 15:13

By MagicBox

Master (198)

03-10-2008, 11:18

Sprite Mode has been implemented in the Sprite Cores at no performance penalty ^^. As described, the following modes are now supported:

00: 128 16x16
01:  64 16x32
10:  64 32x16
11:  32 32x32

Sprite X/Y flip continues to work on all sprite modes.
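
The mode table can be captured as a small lookup, which also shows that every mode covers the same amount of pattern data. The dictionary and helper names are made up for illustration, and the sketch deliberately does not assume which of the two numbers in "16x32" is the width.

```python
# Sprite modes as listed above: mode bits -> (sprite count, size string).
SPRITE_MODES = {
    0b00: (128, "16x16"),
    0b01: (64, "16x32"),
    0b10: (64, "32x16"),
    0b11: (32, "32x32"),
}


def blocks_per_sprite(mode):
    """How many 16x16 base sprites are joined into one sprite in this mode."""
    count, _ = SPRITE_MODES[mode]
    return 128 // count


def total_pixels(mode):
    """Total sprite pixels in this mode (independent of which side is width)."""
    count, size = SPRITE_MODES[mode]
    a, b = (int(n) for n in size.split("x"))
    return count * a * b


for mode in sorted(SPRITE_MODES):
    print(format(mode, "02b"), blocks_per_sprite(mode), total_pixels(mode))
```

Since the total is 32768 pixels in every mode, the same 32KB sprite cache serves all four modes.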

@MicroTech: Yes, it's easy to upgrade VRAM sizes. I've sort of been thinking to upgrade to 1MB, or even 2MB if the memory chips won't be exponentially more expensive.
