# Vertical scrolling routine (MSX1)

Page 8/12
1 | 2 | 3 | 4 | 5 | 6 | 7 | | 9 | 10 | 11 | 12

Thank you. I posted the results here just to have some information about cycles. But I have to rework the code, because every 8th frame the scroll shows a bit of "indecision". I think that's because on the 8th frame it does the LDDR on the RAM name table and takes a row from the map stored in RAM. So I'm wondering whether it's worth reserving a second name table in RAM, LDDR 1/8 of it every frame, and then swap it with the current name table...
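
A minimal sketch of that idea, spreading the copy over eight frames instead of doing it all on the 8th. Everything here is an assumption for illustration rather than the poster's code: the labels `BUF_FRONT`, `BUF_BACK`, `SRC_PTR`, `DST_PTR`, `StartPass` and `CopySlice` are hypothetical, the name table is taken to be 32x24 = 768 bytes, the picture is assumed to scroll down by one row, and the `equ`/`dw`/`ds` directives follow common Z80 assembler syntax. Because the two buffers are separate, the copy direction does not matter, so the sketch uses LDIR; LDDR would work the same way with end pointers.

```asm
CHUNK:      equ 92              ; 23 rows * 32 bytes / 8 frames

; Call once at the start of each 8-frame cycle.
StartPass:
    ld hl,BUF_FRONT             ; source: row 0 of the visible copy
    ld (SRC_PTR),hl
    ld hl,BUF_BACK+32           ; destination: row 1, i.e. one row down
    ld (DST_PTR),hl
    ret

; Call once per frame, 8 times per cycle: copies 92 bytes.
CopySlice:
    ld hl,(SRC_PTR)
    ld de,(DST_PTR)
    ld bc,CHUNK
    ldir                        ; HL and DE end up just past the copied bytes
    ld (SRC_PTR),hl             ; remember where to resume next frame
    ld (DST_PTR),de
    ret

; Variables and buffers (hypothetical layout, must live in RAM)
SRC_PTR:    dw 0
DST_PTR:    dw 0
BUF_FRONT:  ds 768
BUF_BACK:   ds 768
```

After eight calls to `CopySlice`, rows 1 to 23 of `BUF_BACK` hold rows 0 to 22 of `BUF_FRONT`. The new map row still has to be written to row 0 of `BUF_BACK` before the two buffers swap roles; that part is left out here.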

I also need to know:
JR NZ has timing 13/8,
but when is it 13 and when is it 8? Always 8, and 13 when Z? Or always 13, and 8 only when zero is reached?

Branch taken / not taken: 13 / 8.
When the jump is not taken, you save 5 cycles.

So when the jump closes a loop, it is taken most of the time, and then a normal JP is faster than the relative jump JR.

Ty, never jr in a loop, then!
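
To make the difference concrete, here is a small sketch of the same counted loop closed both ways. The labels `LoopJr` and `LoopJp` and the count of 8 are made up for illustration; the cycle counts are MSX figures (including the M1 wait state), matching the 13/8 quoted above.

```asm
; Variant 1: loop closed with JR (2-byte instruction)
    ld b,8
LoopJr:
    ; ... loop body ...
    dec b                       ; 5
    jr nz,LoopJr                ; 13 when taken, 8 on the final pass

; Variant 2: same loop closed with JP (3-byte instruction)
    ld b,8
LoopJp:
    ; ... loop body ...
    dec b                       ; 5
    jp nz,LoopJp                ; 11 whether taken or not
```

Over 8 passes the jumps alone cost 7*13 + 8 = 99 cycles with JR and 8*11 = 88 cycles with JP, at the price of one extra byte for JP.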

yep! And since we are talking about optimizing code (which is one of my obsessions lately, hahaha), two links that I found very useful when optimizing the code for Tales of Popolon:

http://wikiti.brandonw.net/index.php?title=Z80_Optimization

Ty. I already knew Z80 Heaven. The other site seems interesting too. I've discovered an optimization by myself. Instead of:
LD A,(HL) ;8
ADD A,128 ;8
LD (HL),A ;8

I use
SET 7,(HL) ;17
To subtract 128 I use RES 7,(HL)

I use this for double buffering, because I have 31 tiles in every bank, numbered 0 to 30, and the double-buffer tiles from 128 to 158.

I don't know much about assembly anymore, but it seems to me these two routines don't do the same thing. If A>127 and you add 128, it will become smaller than 128 (128+128 = 256 does not fit in one byte and becomes 0). SET 7,(HL) will always set bit 7, even if the byte is already larger than 127, so 128 will stay 128, while 1 will become 129.

Don't know if this would work better (and if it's faster)

LD A,128
XOR (HL)

If I'm saying something stupid, don't hesitate to correct me.

Xor stores the result in A so it’s not faster, you still need to ld (hl),a after.

Optimisations rarely do the same thing, but as long as what it does doesn’t matter for the end result that’s ok. In this case, if the range of input values is 0-127 and you alternate `set 7` and `res 7`, you just use bit 7 as a buffer index.
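
The three variants discussed above can be put side by side as a rough sketch. It assumes HL points at a tile number in RAM and uses MSX cycle counts; the sample values 5 and 133 are only there to show the behaviour.

```asm
; Variant A: add 128 through the accumulator (24 cycles, changes A and flags)
    ld a,(hl)                   ; 8
    add a,128                   ; 8   5 -> 133, but 133 -> 5 (wraps past 255)
    ld (hl),a                   ; 8

; Variant B: set bit 7 in place (17 cycles, A and flags untouched)
    set 7,(hl)                  ; 17  5 -> 133, and 133 stays 133

; Variant C: toggle bit 7 with XOR (24 cycles, changes A and flags)
    ld a,128                    ; 8
    xor (hl)                    ; 8   5 -> 133, and 133 -> 5
    ld (hl),a                   ; 8
```

Variants A and C store the same byte for every input, since adding 128 modulo 256 only flips bit 7. Variant B differs only when bit 7 is already set, which is why it has to be paired with `res 7,(hl)` to go back.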

The two sites mentioned above contain a nice overview of what kind of optimisations you can make and along which lines you need to think. However what I don’t like so much is that they say things like "never do", or this comment here:

```
  sla l
  rl h         ; I've actually seen this!
; >
  add hl,hl
; -> save 3 bytes and 5 T-states
```

As if the programmer is incredibly stupid for not making that optimisation...

I think these kinds of recommendations should be taken with a big grain of salt and certainly don’t always need to be applied. Many of them obfuscate the code, so if it’s not in a hot code path and you’re not pressed for memory space, there’s often something to be said for using the thing that most clearly expresses what you want to do. E.g. for `ld a,0` vs. `xor a`, the former can be preferable. (Though the latter is such a commonly known optimisation that I guess it’s not a big deal either.)

Also, as an example of "optimisations rarely do the same thing": `add hl,hl` does not affect the zero flag, whereas the `sla l` / `rl h` pair does, and `xor a` changes all the flags, whereas `ld a,0` changes none. Always be wary that Z80 instructions are not very consistent in terms of what flags they set.
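
A short sketch of how those flag differences can bite. The labels `WasBelow` and `HighIsZero` and the constant 10 are made up for illustration; the four fragments are independent cases, not one routine.

```asm
; ld a,0 leaves the flags from CP intact
    cp 10                       ; carry set if A < 10
    ld a,0                      ; no flags affected
    jr c,WasBelow               ; still tests the result of CP

; xor a overwrites them
    cp 10
    xor a                       ; carry cleared, zero set
    jr c,WasBelow               ; never taken

; sla l / rl h leaves Z describing the new H
    sla l
    rl h
    jr z,HighIsZero             ; taken when H became 0

; add hl,hl leaves Z as it was before
    add hl,hl                   ; of the documented flags, only C, H and N change
    jr z,HighIsZero             ; tests whatever set Z earlier

WasBelow:
HighIsZero:
    ret
```

So the shorter form is a safe replacement only when nothing after it depends on the flags the longer form would have left behind.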

@Poltergeist, as I said, I have a range from 0 to 30 (and this range only), and I alternately add or subtract 128, so this optimization is right for my case. I know it is useless in most other cases.
And I always fill my listings with comments.
