I would prefer to avoid unrolling.
How about converting the data beforehand instead of in real time?
PingPong, there is not much you can really do about it. This is about as good as it gets. There are some other ways to write it, but the performance will be about the same. If you unroll the inner loop, you'll gain some T-states without losing much space.
If you unroll it, you'll need to add a NOP before each OUT, starting with the second one, if you want your routine to work on any MSX (including MSX1 and the weird SONYs). When playing with colors, the SONYs require up to 4 NOPs between outputs (where a single one would do the trick on a normal MSX1). This is why SimCity's colors do not work correctly on those SONYs. A beta version of SimCity has a new video routine for those slow computers, but I have not found a way to detect them.
Currently my code has three routines: the slowest for SONYs, a slow one for MSX1 and some MSX2 (detected by a VDP speed test) and a faster one for MSX2+ and turboR (also selected according to the results of the VDP speed test). Unfortunately, my VDP speed test is not able to find a clear difference between a normally slow MSX1 VDP and the very, very slow SONY VDP.
About converting the data to native mode: in fact, it will not make anything better. I did the test and some math, and on faster computers (7MHz or more) it makes a little difference (faster), but it makes no difference at all on slower computers. In fact, depending on the way you program it, it can even be slower on slower computers (due to the way one adds delays between OUTs). The only thing that would make things MUCH faster would be to convert the game to use the MSX PATTERN mode, updating the name table only. But this is not always possible, since many Spectrum games really treat the screen as a bitmap.
The usual solution is *not* displaying the entire screen. On GnG, for example, the "frameskip" option makes the first two lines get printed in one interrupt and the other lines get printed in the next one. Since most of the game is processed in one out of every two interrupts, I designed the routines to update the score in the same IRQ in which the game is processed, while the other lines are updated in the "idle" IRQ. Also, the game puts a "black mask" on the 2 rightmost patterns, by painting those patterns as always "black on black". This was nice, since it allowed me to NOT print them. This means I had to re-set the VRAM write address at the end of each line... but this is done with just two OUTs per line, and I avoided 16 OUTs per line by not printing the last two patterns. In fact, since I did not have to print their colors either, I spend four OUTs per line but avoid 32 OUTs.
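To make the OUT counting above easier to check, here is a small Python sketch of the arithmetic. The function and constant names are made up for illustration, and it assumes each skipped 8x8 pattern costs 8 bytes per table (pattern and color), which is what the 16 and 32 figures imply.

```python
# Rough OUT accounting for skipping the masked rightmost patterns.
# Names are hypothetical; the numbers come from the post above.

BYTES_PER_PATTERN = 8    # assumed: one 8x8 pattern is 8 bytes in each table
ADDRESS_SETUP_OUTS = 2   # two OUTs to re-set the VRAM write address


def row_cost(skipped_patterns, tables):
    """Return (OUTs avoided, OUTs spent) for one pattern line.

    tables is 1 when only the pattern table is written,
    2 when the color table is written as well.
    """
    avoided = skipped_patterns * BYTES_PER_PATTERN * tables
    spent = ADDRESS_SETUP_OUTS * tables
    return avoided, spent


pattern_only = row_cost(2, 1)   # patterns only
with_colors = row_cost(2, 2)    # patterns and colors

print("patterns only: avoided %d, spent %d" % pattern_only)
print("with colors:   avoided %d, spent %d" % with_colors)
```

So the address re-set pays for itself several times over on every line, as long as the skipped patterns never need to change.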
In SimCity the trick is different. Since the game is VERY CPU-intensive, if I tried to update the entire screen all the time the game would slow down a lot (in fact, the early betas were very, very slow because of this). Since the screen did not need to be updated very fast, the solution was to plot one third of the pattern table in one IRQ and then one third of the color table in the next IRQ... and so on, until the entire screen was updated in 6 IRQs.
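The 6-IRQ rotation can be pictured with a short Python sketch. This is only an illustration of the scheduling idea, not the actual SimCity code: the table addresses are the usual Screen 2 defaults (pattern table at 0x0000, color table at 0x2000, 0x0800 bytes per third), which is an assumption here, and all names are hypothetical.

```python
# Sketch of spreading a full Screen 2 update over 6 interrupts.
# Addresses are assumed Screen 2 defaults; names are hypothetical.

PATTERN_TABLE = 0x0000   # assumed pattern table base in VRAM
COLOR_TABLE = 0x2000     # assumed color table base in VRAM
THIRD_SIZE = 0x0800      # 2048 bytes per screen third, per table


def irq_schedule():
    """Return the six transfers: pattern third, then color third, and so on."""
    steps = []
    for third in range(3):
        offset = third * THIRD_SIZE
        steps.append(("pattern", PATTERN_TABLE + offset, THIRD_SIZE))
        steps.append(("color", COLOR_TABLE + offset, THIRD_SIZE))
    return steps


def transfer_for_irq(irq_count):
    """Pick the transfer for a given interrupt number, wrapping after 6."""
    schedule = irq_schedule()
    return schedule[irq_count % len(schedule)]


for n in range(6):
    table, address, size = transfer_for_irq(n)
    print("IRQ %d: %-7s VRAM 0x%04X, %d bytes" % (n, table, address, size))
```

Each interrupt then moves 2048 bytes instead of 12288, at the price of the screen taking 6 interrupts to fully refresh.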
Most Speccy-to-MSX conversions update only a limited piece of the screen. As an example, Astro Marine Corps, a conversion I keep as a reference for "good porting techniques", was only possible because the game is not full screen. If it were, the result would be as crappy as hell...
BTW, I am curious to know what you are working on.
PingPong, there is not much you can really do about it. This is about as good as it gets. There are some other ways to write it, but the performance will be about the same. If you unroll the inner loop, you'll gain some T-states without losing much space.
Sgrunt!
BTW, I am curious to know what you are working on.
Honestly, I had hoped to find a better solution, but if you and others tell me there is not much more to gain...
At this point, nothing. The idea was to convert some Speccy games in a monochrome way (e.g. Green Beret?), but with this limit I'll forget about it. I will probably consider an MSX2 version in Screen 5, obviously without scrolling, but this requires re-engineering the entire game, not a quick conversion...
Anyway, thx to you and others for the time and support.
Unfortunately, ZX-to-MSX conversions are easier in theory than in practice... While making the game simply run is a good challenge, the real challenge is tightly related to making the port playable and at least as fun as the original game.
Anyway, I have never played through all the Green Beret stages, but is it really necessary to update the *entire* screen? AFAIR, there are 4 or 5 pattern lines of score and the game uses only the bottom of the playfield... am I wrong? Looking at WoS, I noticed the game also uses only 28 columns (instead of 32). Even if you want to print the entire playfield every time, this reduces your screen plotting to 73% (original: 768, reduced: 560). This could make a little difference to the final result.
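For anyone who wants to check the 73% figure, here is the arithmetic as a tiny Python sketch. The 560 matches 28 columns by 20 pattern rows, i.e. the full 24 rows minus 4 score rows; that row count is an inference from the numbers, not something stated outright, and the function name is hypothetical.

```python
# Pattern cells to plot: full screen versus the reduced playfield.
# The 20-row figure is inferred from 560 / 28, not stated in the post.


def cells(columns, rows):
    """Number of 8x8 pattern cells in a columns x rows area."""
    return columns * rows


full_screen = cells(32, 24)   # whole screen
playfield = cells(28, 20)     # assumed: 28 columns, 4 score rows left out
ratio = playfield / full_screen

print("full screen:", full_screen)
print("playfield:  ", playfield)
print("plotting needed: %.0f%%" % (ratio * 100))
```

Dropping a fifth score row as well would bring it down to 28 x 19 cells, so the saving depends on how many rows the score area really takes.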
A Screen 5 version would also be nice.