Everything is real-MSX-EXT. z280 7?
Grauw wrote:
For such a platform, I think an FPGA which just focuses on the MSX-ENGINE / S1990 / R800 and all the rest is implemented with original chips may be even better. Then development effort can focus on perfecting the emulation of the CPU timing and glue logic, timers, etc., and the sound and video chips will be bug-free. AY-3-8910, YM2149, YM2413, V9958, all still available. OPL4 and V9990 cartridges exist so those are really optional for me to have built in, although an integrated V9990 superimposing on the V9958 could probably reduce the number of cables so that might be nice.
Yeah... I have toyed with that idea for a while. But I do not think it is very practical. Those chips cost a lot more than the FPGA, are bulkier, require 5V to 3.3V conversion for dozens of pins and are a bit of a supply chain problem. It is ok if you want to build a one-of-a-kind thing with an FPGA and some of the original chips. But ultimately it will cost 3x what an FPGA-only solution would cost. It is ok to build a handful of boards; more than that and you will have to put a lot of "love" into the project. And buying >10 V9958s nowadays is somewhat risky, there is no warranty any of them will actually work. It is ok for me to buy a handful on Utsource/Ebay/etc. But if the goal is to make a turbo-R capable kit it is too risky to count on those.
I have not used an OCM yet. Do you believe the behavior of the current FPGA implementation of the chips departs too much from the original silicon behavior? And how good is the OpenMSX emulation of those chips? The FPGA design can be simulated in its entirety and compared to a reference (OpenMSX). It is not easy but it is certainly doable. And maybe worth it in the case of the V9958, for example.
Here is where the FPGA/discrete chips Frankenstein may be really useful. As a vehicle to improve the FPGA implementation of the original chips. We could build a few systems with FPGA and original chips running simultaneously and have the FPGA recording traces of the behavior that is different. Then run existing MSX software on the thing, record the differences and improve the FPGA implementation based on that. Even OpenMSX could benefit from it.
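A rough sketch of the trace comparison idea above, assuming both the original chip and the FPGA core can be logged as a list of (cycle, signal, value) events. The record format, the signal names and the function name `first_divergences` are made up for illustration and do not come from any existing tool.

```python
from itertools import zip_longest
from typing import NamedTuple


class Event(NamedTuple):
    cycle: int    # clock cycle at which the event was sampled
    signal: str   # hypothetical signal name, e.g. "VDP_DATA"
    value: int    # sampled value


def first_divergences(reference, candidate, limit=10):
    """Return up to `limit` (index, ref_event, cand_event) mismatches.

    A missing event on either side shows up as None.
    """
    diffs = []
    for i, (ref, cand) in enumerate(zip_longest(reference, candidate)):
        if ref != cand:
            diffs.append((i, ref, cand))
            if len(diffs) >= limit:
                break
    return diffs


# Made-up traces: the FPGA answers the second access two cycles early.
chip = [Event(0, "VDP_STATUS", 0x9F), Event(12, "VDP_DATA", 0x41), Event(20, "VDP_DATA", 0x42)]
fpga = [Event(0, "VDP_STATUS", 0x9F), Event(10, "VDP_DATA", 0x41), Event(20, "VDP_DATA", 0x42)]

for index, ref, cand in first_divergences(chip, fpga):
    print(f"event {index}: chip={ref} fpga={cand}")
```

In a real setup the traces would be far too long to hold in lists, so the comparison would more likely run on the FPGA itself and only store the mismatching windows.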
Basically you’d be developing a new MSX-ENGINE with integrated CPU similar to the T9769. Could be useful for a lot of homebrew projects, or even turbo upgrades. Omitting the VDP and FM reduces the scope of the project by a lot, so you could really perfect the basics of turboR emulation. Its CPU / bus timing is fairly complex and will already require a lot of testing to get it to match the real hardware.
Perfecting a good VDP emulation will take a lot of additional time so if you eliminate that from the equation you can get to a high quality functional product sooner. The current 1chipMSX VDP implementation is ok, but not great, there are several differences in the details which you run into if you push it beyond basic operation. It doesn’t implement accurate VDP access slot timing, and I believe VDP command execution speed is simply throttled to somewhat match the real one. To me it seems more of a high level emulation rather than modeling how the V9958 functions internally. The openMSX emulation is much better and even that isn’t without issues.
Aren’t there 5V FPGAs? If it’s just for the CPU and glue logic, the capacity maybe doesn’t need to be as massive.
If the goal is to enable people to make a turboR clone that is nearly indistinguishable from the real thing, focusing on perfect emulation of the components which aren’t generally available may be the fastest route.
Grauw asked:
Aren’t there 5V FPGAs? If it’s just for the CPU and glue logic, the capacity maybe doesn’t need to be as massive.
No, they are long gone. FPGA companies could not get rid of them fast enough. Expensive, hard to find. Design tools that support them have long been abandoned. 5V CPLDs are also long gone. You find them as you find V9958s, let's say.
Even 3.3V FPGAs are long gone, to be honest. Today's FPGAs use a <1.25V core (often 0.9V) with circuitry for 3.3V operation only at the IO pins.
I sometimes see FPGAs connected directly to the MSX slot in DIY projects. If level-shifting is not done properly, any FPGA currently in production will have a short life if used like that. For the older ones, hundreds of hours, maybe, for the newer ones they can blow in the first use.
Datasheets for older FPGAs recommend a series resistor with a diode clamp to 3.3V in those cases. Sometimes the diode could be internal. This works for FPGAs that supported the PCI bus. But even in those cases only a limited number of pins can be used that way and a minimum resistor value has to be respected.
If we want to build something that would last as long as an old MSX machine would, we need to stay clear of those things.
Grauw wrote:
If the goal is to enable people to make a turboR clone that is nearly indistinguishable from the real thing, focusing on perfect emulation of the components which aren’t generally available may be the fastest route.
That is a good point. But I would make a careful distinction... I'll phrase it in two different ways:
A turbo-R "clone" that is indistinguishable to software running on the machine is one thing.
A turbo-R clone that enforces exact turbo-R behavior is another thing.
A software developer may care about the second, but a user does not. If you are writing too fast to the VDP and this causes issues, the programmer wants to detect and fix that (as you certainly know). But for a user, it is fine if the FPGA implementation can take data at the maximum rate a turbo-R can write to an IO port. Good software will not write that fast, but that does not mean software that does has to fail.
From a design perspective, for example, it makes little sense to actually have the Z80 and R800 as separate things if implementing it in an FPGA. If you are designing a CPU it can do both by having modes. It certainly makes sense to have Z80 cycle-accurate behavior in one mode. But for the R800, I'm not sure it is worth the effort. In an R800 you will not get the same timing if you run code twice in the same machine. It depends on bus access alignments and refresh cycle alignments and potentially on other things we do not know about the bus controller. Making it cycle accurate is possible, but a little questionable.
ROM/RAM mode is possibly more important as it has tangible side effects in the amount of RAM available and memory contents. But I'm not sure if a ROM mode that performs as fast as RAM mode would cause real trouble. Also, in part, because little software relies on exact R800 speed, which is often the case with Z80. (Well, thinking about it... the turbo-R DAC may be the exception...)
In the end I have the impression that a machine that has a MSX 2+ accurate mode, with a close enough (a few %) turbo-R timing mode and a "all bets are off/run as fast as you can/remove all limitations" mode would make the most sense.
I do not really see this exactly as a product, to be honest. Turbo-R prices will only rise. Machines will die. Membranes will malfunction. Original OCMs are super expensive. And recent reproductions are still on the expensive side, with the FPGA used only getting older (software support is EOL) and full (:-)). MSX hardware is getting more on the collectible side than really on the usable side. Something that smells like a turbo-R and costs less than US$100 could change things a bit.
From a design perspective, for example, it makes little sense to actually have the Z80 and R800 as separate things if implementing it in an FPGA. If you are designing a CPU it can do both by having modes. It certainly makes sense to have Z80 cycle-accurate behavior in one mode. But for the R800, I'm not sure it is worth the effort. In an R800 you will not get the same timing if you run code twice in the same machine. It depends on bus access alignments and refresh cycle alignments and potentially on other things we do not know about the bus controller. Making it cycle accurate is possible, but a little questionable.
In the end I have the impression that a machine that has a MSX 2+ accurate mode, with a close enough (a few %) turbo-R timing mode and a "all bets are off/run as fast as you can/remove all limitations" mode would make the most sense.
I agree, the turbo CPU speed should be ballpark the same but the timing doesn’t need to be identical. Instruction set should be the R800’s though.
I think the part that’s important to copy is to respect the standard MSX bus timing by inserting waits during MEMRQ and IORQ to align them to the 3.58 MHz clock and keep them active for 2 / 3 cycles. With the notable exception of the internal memory, which should be accessed at full speed.
I think that part of the turboR design is very good and keeps full MSX bus compatibility at only a small performance cost, because 95% of the time the CPU will be accessing internal memory.
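To put a rough number on "small performance cost", here is a back-of-the-envelope sketch. Only the 95% figure and the 3-cycle bus access come from the posts above. The cycle counts (2 CPU cycles per internal access, a 2:1 CPU-to-bus clock ratio, 1 CPU cycle average alignment wait) are illustrative assumptions, not measured R800 values.

```python
def external_access_cpu_cycles(bus_cycles, clock_ratio, align_penalty):
    """CPU cycles spent on one slot access that is aligned to the bus clock."""
    return bus_cycles * clock_ratio + align_penalty


def average_access_cycles(internal_fraction, internal_cycles, external_cycles):
    """Weighted average cost of a memory access, in CPU cycles."""
    return (internal_fraction * internal_cycles
            + (1.0 - internal_fraction) * external_cycles)


# Assumed values, for illustration only.
INTERNAL_FRACTION = 0.95   # from the post: 95% of accesses hit internal memory
INTERNAL_CYCLES = 2        # assumption: cost of an internal access
external = external_access_cpu_cycles(bus_cycles=3, clock_ratio=2, align_penalty=1)

average = average_access_cycles(INTERNAL_FRACTION, INTERNAL_CYCLES, external)
overhead = average / INTERNAL_CYCLES - 1.0
print(f"external access: {external} CPU cycles")
print(f"average access:  {average:.2f} CPU cycles")
print(f"overhead versus all-internal: {overhead * 100:.1f}%")
```

With these made-up numbers the overhead lands somewhere around 12%. The point is the shape of the formula: the slow, bus-compliant path is weighted by the small share of accesses that actually leave the chip.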
(This is also my biggest complaint about the 1chipMSX turbo modes - it works just like 7 MHz mod circuits and outputs the wrong clock and accesses hardware too fast, and as a result of not decoupling this properly, the higher the clock speed the more hardware that you plug in will stop working.)
Grauw, (or anybody really...)
do you know if there is any work on R800 undocumented instructions? Besides the ones listed on your site (that are R800 official), do you know if there is any undocumented R800 behavior that is being actively exploited?
Thank you!
I wouldn’t count on it, because 1. there is not much R800-specific software to begin with, 2. they generally aren’t used often anyway, and 3. not much is known about them afaik.
Although of course if there’s some useful ones I could imagine it’s useful if a successor would implement them as well, like perhaps TST which I’ve seen it mentioned although without description, but I believe it’s an AND without modifying A (just the flags).
More thorough research into undocumented R800 instructions might be a good idea anyway, perhaps starting out with the Z800 / Z280 instruction set. E.g. I wonder about the MMU functions, the R800 pinout seems to hint at similar capabilities.
Z80 undocumented instructions, I use only the ones that exist on R800 (ld ixl,r... etc.) and since most of the others don’t work on R800 it seems unwise to use them.
Grauw wrote:
I think the part that’s important to copy is to respect the standard MSX bus timing by inserting waits during MEMRQ and IORQ to align them to the 3.58 MHz clock and keep them active for 2 / 3 cycles. With the notable exception of the internal memory, which should be accessed at full speed.
Yes, MSX bus timing and behavior compliance is critical.
But I need to understand better what appears on the MSX bus while R800 is executing from the internal memories. And how refresh behaves in the MSX bus when R800 is executing from internal memory for too long.
The MSX itself "over-refreshes" the dynamic memory. A cycle every 15 µs would suffice. turbo-Rs are more efficient on that front.
But I agree with you fully. I think there is a need for an "MSX bus controller" that translates the internal traffic to the slot when needed and makes sure the slots are rock solid at 3.579MHz regardless of what is happening inside the FPGA. In modern designs a shared bus like the old ones (MSX, ISA...) is relatively cumbersome. Point-to-point communication and crossbars (routers) are usually preferred (the reason being fast design). This bus controller could even be aware of some special addresses, like MSX-Audio and other FM chips, for example, to make sure sequential accesses to the address and data registers are not done too fast.
The difficult part is to define what traffic could be filtered out. RAM and BIOS access is probably fine. Peripherals implemented internally in the FPGA would be nice to exclude, but not all of them. Slot control is tricky. There is nothing that prevents a cartridge from monitoring PPI port A and doing its own internal slot decoding. This allows memory access in a different slot than the one whose SLTSL the cartridge is connected to. I only saw 1-2 DIY cartridges exploring that. But if PPI-A access is filtered, this would no longer work. Not critically important, but a departure from normal MSX bus behavior. MIDI-pack-like cartridges monitor several IO ports that could be internal peripherals.
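A quick calculation of the over-refresh, as a sketch. It assumes the Z80 issues one refresh cycle per opcode fetch (M1) at 3.579545 MHz, and takes 5 and 25 T-states as illustrative short and long gaps between opcode fetches, including the extra M1 wait state an MSX adds. The 15 µs requirement is the figure from the post.

```python
Z80_CLOCK_HZ = 3_579_545        # standard MSX CPU clock
REQUIRED_INTERVAL_US = 15.0     # figure quoted in the post


def refresh_interval_us(t_states_between_m1, clock_hz=Z80_CLOCK_HZ):
    """Time between two refresh cycles, in microseconds."""
    return t_states_between_m1 / clock_hz * 1e6


# Illustrative gaps between opcode fetches (assumptions, see lead-in).
for label, t_states in (("short instruction", 5), ("long instruction", 25)):
    interval = refresh_interval_us(t_states)
    margin = REQUIRED_INTERVAL_US / interval
    print(f"{label}: refresh every {interval:.2f} us, "
          f"{margin:.1f}x more often than required")
```

Even the long case stays well under 15 µs, so a Z80 running ordinary code refreshes several times more often than the DRAM needs.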
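A toy model of the "aware of special addresses" part, just to make the idea concrete. The class name, the port table and the recovery times are placeholders (ports 7Ch/7Dh with 12 and 84 bus cycles are used as an example); real values would have to come from each chip's datasheet.

```python
class SlotBusController:
    """Toy model: holds back I/O writes until the target chip has recovered."""

    def __init__(self, guarded_ports):
        # guarded_ports maps port -> (chip name, bus cycles the chip needs
        # after a write to that port before it accepts the next write)
        self.guarded_ports = dict(guarded_ports)
        self.busy_until = {}   # chip name -> first cycle it accepts a write

    def write(self, now, port):
        """Return the bus cycle at which a write requested at `now` is issued."""
        if port not in self.guarded_ports:
            return now                      # unguarded ports pass straight through
        chip, recovery = self.guarded_ports[port]
        start = max(now, self.busy_until.get(chip, 0))
        self.busy_until[chip] = start + recovery
        return start


# Placeholder table: address register, then data register, of one FM chip.
GUARDED_PORTS = {0x7C: ("fm", 12), 0x7D: ("fm", 84)}

bus = SlotBusController(GUARDED_PORTS)
print(bus.write(100, 0x7C))   # address write goes out immediately
print(bus.write(102, 0x7D))   # data write is held until the chip is ready
print(bus.write(103, 0x98))   # a port that is not in the table is not delayed
```

In hardware this would be a small counter per guarded chip that stretches WAIT, so software written for a slower CPU keeps working without its own delay loops.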
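To illustrate the slot-snooping case: a cartridge that watches writes to PPI port A (I/O port A8h) can work out which primary slot is selected for each 16 KB page, since the register holds two bits per page. The class below is a hypothetical model of such a cartridge, not code from any actual one.

```python
PPI_PORT_A = 0xA8   # primary slot select register


def primary_slot_for(address, port_a_value):
    """Primary slot (0-3) selected for the 16 KB page containing `address`."""
    page = (address >> 14) & 0x3            # page 0..3
    return (port_a_value >> (page * 2)) & 0x3


class SnoopingCartridge:
    """Hypothetical cartridge that mirrors port A8h by watching the bus."""

    def __init__(self):
        self.port_a = 0

    def on_io_write(self, port, value):
        if (port & 0xFF) == PPI_PORT_A:
            self.port_a = value & 0xFF

    def selected_slot(self, address):
        return primary_slot_for(address, self.port_a)


cart = SnoopingCartridge()
cart.on_io_write(0xA8, 0b11_10_01_00)   # page 3 -> slot 3 ... page 0 -> slot 0
for addr in (0x0000, 0x4000, 0x8000, 0xC000):
    print(f"{addr:04X}h -> slot {cart.selected_slot(addr)}")
```

If the bus controller keeps A8h writes inside the FPGA, `on_io_write` is never called and the cartridge's copy goes stale, which is exactly the breakage described above.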
I think the part that’s important to copy is to respect the standard MSX bus timing by inserting waits during MEMRQ and IORQ to align them to the 3.58 MHz clock and keep them active for 2 / 3 cycles. With the notable exception of the internal memory, which should be accessed at full speed.
Btw, reading the Z280 technical manual, it seems to have functions for this (page 22-23).