Code optimization

Page 1/5
| 2 | 3 | 4 | 5

By ARTRAG

Enlighted (6977)

07-02-2008, 12:06

It has been a long time since anyone mentioned ASM coding on MRC.

Time to fix that.
Can this code be optimized?

The two functions save and restore the background under a frame. The first copies the tiles
from a "room" of size map_w x map_h (the height does not matter) into a buffer; the second copies
them back from the buffer into the room. The shape of the frame is taken from external data.

The position in the room and the address of the buffer that stores the background are passed
as parameters in registers DE (source) and BC (destination).

The frame number is passed on the stack.

; de          source_addr
; bc          dest_addr
; ix+4, ix+5  nframe
    global _npctgrab
_npctgrab:
    push    ix
    ld  ix,0
    add ix,sp

    push    de
    ld  e,(ix+4)
    ld  d,(ix+5)
    ld  hl,_frames
    add hl,de
    add hl,de
    ld  e,(hl)
    inc hl
    ld  d,(hl)  ; de points to the current frame
    push    de
    pop     ix  ; ix now points to the current frame

    pop hl      ; hl points to the source in the room

    ld d,b      ; bc pointed to the destination in the frame buffer
    ld e,c      ; de now points to the destination in the frame buffer

1:  ld  a,(ix+0)      ; 127 == end of frame

    cp 127
    jp z,3f

    ld  c,a
    ld  b,0
    push hl
    add hl,bc       ; source

    ld  c,(ix+1)    ; len
    inc ix
    inc ix
    add ix,bc

    ldir

    pop hl

    ld  bc,(_map_w)
    add hl,bc

    jp  1b

3:  pop ix
    pop hl
    pop af
    jp  (hl)



; de          source_addr
; bc          dest_addr
; ix+4, ix+5  nframe
    global _npctrest
_npctrest:
    push    ix
    ld  ix,0
    add ix,sp

    push    de
    ld  e,(ix+4)
    ld  d,(ix+5)
    ld  hl,_frames
    add hl,de
    add hl,de
    ld  e,(hl)
    inc hl
    ld  d,(hl)  ; de points to the current frame
    push    de
    pop     ix  ; ix now points to the current frame

    pop hl      ; hl points to the source in the frame buffer

    ld d,b      ; bc pointed to the destination in the room
    ld e,c      ; de now points to the destination in the room

1:  ld  a,(ix+0)      ; 127 == end of frame

    cp 127
    jp z,3f

    ld  c,a
    ld  b,0
    push de
    ex de,hl
    add hl,bc       ; destination (room row + X offset)
    ex de,hl

    ld  c,(ix+1)    ; len
    inc ix
    inc ix
    add ix,bc

    ldir
    pop de

    ld  bc,(_map_w)
    ex de,hl      
    add hl,bc
    ex de,hl  
    jp  1b

3:  pop ix
    pop hl
    pop af
    jp  (hl)

The frame data is structured like this:


framex1:
db 0,2,18,19   ; X offset of line 0, length of line 0, data, data, etc.
db 0,2,20,21   ; X offset of line 1, length of line 1, data, data, etc.
db 127         ; end of the frame

frame0:
db 5,1,147
db 4,1,147
db 3,1,147
db 2,1,147
db 1,1,147
db 127

; etc.

_frames:
    dw  framex1,frame0,frame1,frame2,frame3,frame4,frame5 ; etc.
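
For readers who find the listings hard to follow, here is a minimal C sketch of what the two routines appear to do with this frame format. It is a reading of the assembly above, not the original source: the names npct_grab, npct_restore and FRAME_END are hypothetical, and the frame is passed as a pointer instead of being looked up by number in _frames.

```c
#include <stddef.h>
#include <string.h>

#define FRAME_END 127  /* terminator byte of a frame (hypothetical name) */

/* Save the room tiles covered by `frame` into `buf`.
   `room` points at the frame's top-left position inside the room.
   Returns the number of tiles saved. */
size_t npct_grab(const unsigned char *room, size_t map_w,
                 const unsigned char *frame, unsigned char *buf)
{
    unsigned char *out = buf;

    while (*frame != FRAME_END) {
        size_t xoff = frame[0];         /* X offset of this line */
        size_t len  = frame[1];         /* number of tiles in this line */

        memcpy(out, room + xoff, len);  /* the LDIR step */
        out   += len;
        frame += 2 + len;               /* skip the header and the tile data */
        room  += map_w;                 /* next row of the room */
    }
    return (size_t)(out - buf);
}

/* Copy the saved tiles from `buf` back into the room.
   Returns the number of tiles restored. */
size_t npct_restore(unsigned char *room, size_t map_w,
                    const unsigned char *frame, const unsigned char *buf)
{
    const unsigned char *in = buf;

    while (*frame != FRAME_END) {
        size_t xoff = frame[0];
        size_t len  = frame[1];

        memcpy(room + xoff, in, len);
        in    += len;
        frame += 2 + len;
        room  += map_w;
    }
    return (size_t)(in - buf);
}
```

One difference from the assembly: memcpy with a zero length copies nothing, whereas LDIR with BC = 0 copies 65536 bytes, so the assembly version relies on every line having a nonzero length.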

By ARTRAG

Enlighted (6977)

07-02-2008, 13:35

(I mean optimized for speed, naturally.)

By ro

Scribe (5061)

07-02-2008, 15:48

Well, using the index registers (IX and IY) is never a clever trick where speed is concerned. They're slow. Using HL and doing some INCs and DECs will already speed it up. Build intelligent tables so you don't have to INC/DEC too many times.

Comparing the accumulator with 127 using CP, as you do, can be done faster by using AND #7F, JP NZ,xxxx

just some thoughts...

By Metalion

Paragon (1628)

07-02-2008, 15:49

Try not to use the IX register; avoiding it will increase speed :P

EDIT: posted at the same time as the previous message...

By Huey

Prophet (2696)

07-02-2008, 15:57

Metalion wrote: "try not to use the IX register, it will increase speed"

AFAIK the ASM code is called from Hitech-C, which puts parameters in the IX register...

By ARTRAG

Enlighted (6977)

07-02-2008, 16:05

@Huey
Not really. Hitech-C puts parameters on the stack before calling the function
and requires the called function to preserve the value of IX.

@ro and Metalion
I'd like to avoid IX and IY in the loop, but I do not know how; this is why I am asking for help.

By MicroTech

Champion (389)

07-02-2008, 16:28

ARTRAG wrote: "I'd like to avoid IX and IY in the loop, but I do not know how; this is why I am asking for help."
Do you have an equivalent C source?
Maybe it can be re-compiled with ASCII-C (which does not use index registers) and we can take inspiration from the resulting asm code.

By ARTRAG

Enlighted (6977)

07-02-2008, 16:42

@MicroTech
No, this code is handwritten and designed to be called from code compiled with Hitech-C.
That only affects how the input parameters are passed and means IX must be restored on exit.

By ARTRAG

Enlighted (6977)

07-02-2008, 17:00

Needless to say, I've tried to avoid using IX, but I do not see any real solution.

By jltursan

Prophet (2619)

07-02-2008, 17:10

Avoiding the index registers is a good idea, but there is not much iteration over these instructions. Most of the time is spent in LDIR, so I think the best idea is to unroll the LDIR: repeat (height) times, (width) LDIs... at a cost in size, of course :(

I've just remembered a "Fast LDIR routine" posted somewhere in the forum. It was based on LDI, of course, but with a variable length, not a custom one as is the case here...
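
The variable-length idea can be sketched in C as a jump into a chain of single-byte copies, where each copy stands for one LDI. This is only an illustration of the general technique, not the routine from the forum: the name copy_unrolled and the chain length of 4 are assumptions. On a plain Z80 (before MSX wait states) LDIR costs 21 T-states per byte, 16 for the last one, while a single LDI costs 16.

```c
#include <stddef.h>

/* Copy `len` bytes with a chain of single-byte copies.
   Whole groups of 4 run through the full chain; the remainder
   enters the chain at the matching point, like a computed jump
   into a row of LDI instructions. */
void copy_unrolled(unsigned char *dst, const unsigned char *src, size_t len)
{
    while (len > 4) {
        *dst++ = *src++;
        *dst++ = *src++;
        *dst++ = *src++;
        *dst++ = *src++;
        len -= 4;
    }
    switch (len) {
    case 4: *dst++ = *src++;  /* fall through */
    case 3: *dst++ = *src++;  /* fall through */
    case 2: *dst++ = *src++;  /* fall through */
    case 1: *dst++ = *src++;  /* fall through */
    default: break;           /* len == 0: nothing to copy */
    }
}
```

In assembly the entry point would have to be computed from the length (each LDI is 2 bytes long), so whether this dispatch pays off for very short runs would need to be measured.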

By ARTRAG

Enlighted (6977)

07-02-2008, 17:13

Sad to say, most of the time LDIR moves only 1 or 2 bytes at a time...
but this depends on the shape of the frame, so I cannot unroll it, as I do not know the length of line X in advance.
