Timing of BIOS routines, such as BIOS_WRTVRM

Page 2/3

By Grauw

Ascended (9270)

11-07-2020, 16:37

albs_br wrote:

Also, how do I know if I'm on vblank, to make this 128 bytes copy?

You can do the transfer at any time. The only advanced detail is that during vblank the VDP does not need its access slots, so writes do not have to be spaced 8 µs apart and you can do the transfer a few cycles faster (OUTI takes about 5 µs). You know you’re in vblank when the interrupt occurs.

However, if you need to ask this question, I think this is too much unnecessary optimisation at this point :). If you do not do the transfer with exactly the right timing, e.g. if it takes a little too long and the VDP has started displaying again, you will get memory corruption. This can also depend on the VDP frequency (50 Hz has a longer blanking period), so there is a risk of compatibility issues.

I would just stay on the safe side, transfer at an 8 µs rate (29 cycles), and just transfer whenever is convenient. It will already be several times faster than using the BIOS.

By albs_br

Expert (115)

11-07-2020, 18:45

Thanks. I will follow this path. One last question: how do I space the writes 8 µs apart? Would a NOP between each OUTI be sufficient?

By albs_br

Expert (115)

11-07-2020, 19:08

I think each T-state should be the inverse of the clock frequency. Am I right?

One T-state would then be 1/3.58 MHz, or about 0.28 µs.
A single NOP wouldn't be sufficient. I need 10 T-states.

By albs_br

Expert (115)

11-07-2020, 19:59

Found information here: http://map.grauw.nl/articles/vdp_tut.php
In fact, 8 µs is 29 cycles, as you said.
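
For reference, the arithmetic behind those numbers can be written out as below. This is a sketch that assumes the standard MSX Z80 clock of 3.579545 MHz (rounded to 3.58 MHz in this thread) and the 18-cycle OUTI timing quoted further down in the thread.

```latex
\begin{align*}
t_{\text{T-state}} &= \frac{1}{3.579545\ \text{MHz}} \approx 0.279\ \mu\text{s} \\
N_{\text{min}} &= \left\lceil \frac{8\ \mu\text{s}}{0.279\ \mu\text{s}} \right\rceil
               = \lceil 28.6 \rceil = 29\ \text{cycles} \\
N_{\text{padding}} &= 29 - 18 = 11\ \text{cycles after each OUTI}
\end{align*}
```

A single NOP adds only 5 cycles on MSX (4 T-states plus one wait state), which is why one NOP is not enough.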

By Grauw

Ascended (9270)

11-07-2020, 20:48

albs_br wrote:

I think each T-state should be the inverse of the clock frequency. Am I right?

One T-state would then be 1/3.58 MHz, or about 0.28 µs.
A single NOP wouldn't be sufficient. I need 10 T-states.

Correct! You need 11 additional cycles (aka t-states). A common 31-cycle loop is:

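    ; Assumes HL = source address, C = 0x98 (VDP data port),
    ; and the VRAM write address has already been set.
    ld b,128
Loop:
    outi                ; 18 cycles; decrements B and sets Z when B reaches 0
    jr nz,Loop          ; 13 cycles when taken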

It’s two more cycles than strictly necessary but it is more convenient than unrolling.

p.s. So far I’ve been assuming you’re developing for the MSX1 TMS9918 VDP. On the MSX2 V9938 VDP there are more access slots, so there you can use otir or unrolled outi.
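
A possible variation on the MSX1 loop above, offered as a sketch and not as something posted in this thread: replacing jr with jp brings the loop down to 29 cycles without unrolling. This assumes the usual MSX timings including the M1 wait state (OUTI 18 cycles, JP cc 11 cycles whether taken or not). The label name is hypothetical, and the setup assumes the VRAM write address has already been set.

```z80
; Sketch only. Assumes HL = source address in RAM and that the VRAM
; write address was already set through port 0x99.
    ld c,0x98           ; VDP data port
    ld b,128            ; number of bytes to send
OutiLoop:               ; hypothetical label
    outi                ; 18 cycles: out (c),(hl); inc hl; dec b
    jp nz,OutiLoop      ; 11 cycles -> 29 cycles per byte
```

The cost is one extra byte of code compared with jr, and the loop is no longer relocatable because jp uses an absolute address.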

By albs_br

Expert (115)

11-07-2020, 22:09

Yeah. MSX 1, which is the machine I had when I was a kid and is the only one acceptable in MSX DEV 2020.

By albs_br

Expert (115)

11-07-2020, 22:13

It seems that the site I linked earlier is your own site!

By pgimeno

Master (230)

12-07-2020, 02:27

In a TMS-based MSX 1, you have about 27,000 cycles after the interrupt during which you can write at any speed. Beware that the default interrupt routine consumes some of these before jumping to the hook. Past these 27,000 cycles, you need to write with a 29-cycle separation.

If you want your program to be compatible with V9938-based MSX machines (SVI-738, MSX2, ...), you have to write with a minimum 17-cycle separation. Note that that includes the time of the OUT instruction itself. Since OUTI takes 18 cycles, you can use a series of OUTI during the first 27,000 cycles without fear on any system. If you use OUT (n),A or OUT (C),r, however, you need some instruction in the middle (like a NOP) to ensure there's at least 17 cycles, because these take 12 and 14 cycles respectively.

That applies to SCREENs 1-3.
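
To illustrate the 17-cycle rule for plain OUT instructions, here is a minimal sketch using the cycle counts given above plus the 5 cycles a NOP takes on MSX. It assumes the VRAM write address has already been set and that A holds the byte to write; port 0x98 is the VDP data port.

```z80
; Sketch: back-to-back VRAM writes that meet the 17-cycle minimum.
; Outside the blanking window a TMS9918 still needs 29 cycles.
    out (0x98),a        ; 12 cycles
    nop                 ;  5 cycles -> 17 cycles between writes
    out (0x98),a        ; same value goes to the next VRAM address

; Same idea with OUT (C),r, assuming C = 0x98:
    out (c),a           ; 14 cycles
    nop                 ;  5 cycles -> 19 cycles between writes
    out (c),a
```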

By albs_br

Expert (115)

12-07-2020, 06:48

I managed to make a subroutine to mimic VPOKE behaviour, and it seems to work:

; HL: address, A: value
Vpoke:
            push af

            ld a, l
            di
            out (0x99), a
            nop
            nop
            set 6, h                    ; Set write flag
            ld a, h
            out (0x99), a
            
            pop af
            nop
            nop
            nop
            ei
            out (0x98), a

            res 6, h                    ; Reset write flag

            ret

But for the routine that copies 128 bytes from RAM to the sprite attribute table using OUTI in a loop, I'm unsure where to disable and re-enable interrupts.
Should it be before and after the loop? Isn't that too long to keep interrupts disabled?

Thanks.
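
One possible shape for such a routine, offered as a sketch and not as the thread's answer: keep interrupts disabled only while the two address bytes are written to port 0x99, and leave them enabled during the data loop. This assumes the interrupt handler (including anything attached to the hooks) does not access VRAM itself; if it does, the whole transfer needs protecting. CopySprites, SpriteBuffer and the 0x1B00 table address (the SCREEN 2 default) are assumptions for illustration.

```z80
; Sketch only. Assumptions: MSX1, VDP ports 0x98/0x99, sprite attribute
; table at VRAM 0x1B00, SpriteBuffer is a 128-byte buffer in RAM.
CopySprites:
    ld hl,0x1B00        ; VRAM address of the sprite attribute table
    ld a,l
    di                  ; the two address writes must not be interrupted
    out (0x99),a        ; low byte of the VRAM address
    ld a,h
    or 0x40             ; set the write flag (bit 6 of the high byte)
    ei                  ; takes effect after the next instruction
    out (0x99),a        ; high byte + write flag
    ld hl,SpriteBuffer
    ld c,0x98           ; VDP data port
    ld b,128
CopySprites_Loop:
    outi                    ; 18 cycles
    jr nz,CopySprites_Loop  ; 13 cycles -> 31 cycles per byte
    ret

SpriteBuffer:
    ds 128              ; hypothetical buffer; must be located in RAM
```

An interrupt that arrives during the loop only delays the next write, which keeps the spacing above the 29-cycle minimum.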

By theNestruo

Master (178)

12-07-2020, 11:15

I don't want to sound disrespectful, but I agree with Grauw that the questions you ask and the kind of optimizations you are trying to achieve don't mix well...

Have you tried actually using BIOS' LDIRVM to transfer the SPRATR RAM buffer? How does it perform? Is there a bottleneck there?
BIOS is usually reviled as "non-performant" by experienced programmers (or programmers with the not-invented-here syndrome), but unless you are coding a VDP-intensive game, with lots of sprites, scrolling, pattern redefinition, etc., it will probably perform well enough.
I'm using LDIRVM to transfer the sprites every frame, plus some WRTVRM to change the NAMTBL, plus some more LDIRVM every 8 frames to animate some tiles... and I still have time for all the game logic with the game running at 50/60fps.

I honestly think that you should try the "easy" way (i.e.: use the BIOS) and then, if you face a performance problem, profile where the actual bottleneck is and improve that part. This will save you a lot of time fiddling with problems that maybe don't exist.
