MSX Programming Languages Showdown

By Marq

Champion (387)

15-09-2013, 12:50

Preliminary test results from pitting different MSX languages, compilers and cross-compilers against each other. Could be totally bogus in some cases, but there is probably some food for thought and a few surprises in there, too :)

http://www.kameli.net/marq/?p=2697

By PingPong

Prophet (3889)

15-09-2013, 16:07

The Hitech-C cross compiler will probably perform better than SDCC.

By Manuel

Ascended (18786)

15-09-2013, 16:44

Interesting, but some more test cases would be useful.

About the shortened loops: you could use accurate emulators and just emulate at full speed to get the actual numbers :) (If you don't trust the emulator, first do a comparison run, of course.)

How about a comparison with R800 mode as well? (Although I guess that none of the compilers can generate R800 optimized code, if applicable at all for the test case.)

By geijoenr

Champion (311)

15-09-2013, 18:08

A nested loop is, I guess, a trivial example and not enough to characterize how good the code generation is. But it is a worst-case scenario in the sense that if this is bad, then everything else will be worse...

Anyway, any C compiler will be slower than assembler unless you can use register variables. In SDCC, for instance, the keyword is accepted but it seems to have no effect. Even global variables are accessed using index registers, so it is no surprise that it takes 50% more time than trivial assembler.
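For reference, a minimal sketch of what the benchmark loop looks like with the register hint applied. The function name is made up for illustration, and uint16_t is an assumption standing in for the 16-bit unsigned int of a Z80 compiler. The keyword is only a hint, so whether a given compiler actually keeps the counters in registers is up to the compiler.

```c
#include <stdint.h>

/* Hypothetical helper: the thread's nested loop with a register hint.
   uint16_t stands in for the 16-bit unsigned int of a Z80 compiler. */
uint16_t count_with_register_hint(void)
{
    register uint16_t i, j;
    register uint16_t s = 0;

    for (i = 0; i < 10000; i++)
        for (j = 0; j < 100; j++)
            s++;    /* 1,000,000 increments, wrapping modulo 65536 */

    return s;
}
```

With a 16-bit counter the sum wraps around, so the function returns 1,000,000 mod 65536 = 16960 rather than 1,000,000.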

By ARTRAG

Enlighted (6844)

15-09-2013, 19:57

My Pascal is a bit rusty, but if I'm correct the test ported to C should be this:

#include <stdio.h>
#include <stdlib.h>

void main (void) {
	unsigned int i,j,s;
	s = 0;
	for (i=0;i<10000;i++)
	  for(j=0;j<100;j++)
	    s++;
	printf("%d",s);
}

The COM compiled with the Hitech C cross compiler (v 7.8p2) has an execution time of 6 secs on a plain MSX1. This is the corresponding ASM:

	global	small_model
;stdlib.h: 122: extern int atexit(void (*)(void));
;stdlib.h: 126: extern void qsort(void *, size_t, size_t, int (*)(const void *, const void *));
;stdlib.h: 127: extern void * bsearch(const void *, void *, size_t, size_t, int(*)(const void *, const void *));
	global	_main
	signat	_main,24
	psect	text,class=CODE
	global	_printf
	signat	_printf,26
	file	"C:\HT-Z80\BIN\TEST.C"
	line	4
_main:
	push	ix
	ld	ix,0
	add	ix,sp
	push	bc
	push	iy
;TEST.C: 5: unsigned int i,j,s;
; _s allocated to bc
	line	6
	ld	bc,0
;TEST.C: 7: for (i=0;i<10000;i++)
	line	7
	ld	(ix+-2),c
	ld	(ix+-1),c
	line	8
l8:
;TEST.C: 8: for(j=0;j<100;j++)
; _j allocated to iy
	ld	iy,0
	line	9
l11:
;TEST.C: 9: s++;
	inc	bc
	line	8
	inc	iy
	ld	de,064h
	push	iy
	pop	hl
	or	a
	sbc	hl,de
	jp	c,l11
	line	7
	inc	(ix+-2)
	jp	nz,u11
	inc	(ix+-1)
u11:
	ld	de,02710h
	ld	l,(ix+-2)
	ld	h,(ix+-1)
	or	a
	sbc	hl,de
	jp	c,l8
;TEST.C: 10: printf("%d",s);
	line	10
	push	bc
	ld	hl,u19
	push	hl
	call	_printf
	pop	bc
	pop	bc
;TEST.C: 11: }
	line	11
	pop	iy
	ld	sp,ix
	pop	ix
	ret	
	psect	strings,class=CODE
u19:
	defb	37,100,0
	psect		text
	end

By geijoenr

Champion (311)

15-09-2013, 20:53

It is possible to avoid accessing variables through index registers by using pointers to global variables.
You then get more opcodes, but faster ones. Not sure if it is faster overall.

unsigned int i,j,s;
unsigned int *pi = &i;
unsigned int *pj = &j;
unsigned int *ps = &s;

void main (void) {
    *ps = 0;
    for ( *pi=0; *pi < 10000; *pi++)
      for( *pj=0; *pj < 100; *pj++)
        (*ps)++;
}

Then you get (with SDCC version 3.1.0):

_main:
;test_loop.c:8: *ps = 0;
	ld	hl,(_ps)
	xor	a, a
	ld	(hl), a
	inc	hl
	ld	(hl), a
;test_loop.c:9: for ( *pi=0; *pi < 10000; *pi++)
	ld	hl,(_pi)
	xor	a, a
	ld	(hl), a
	inc	hl
	ld	(hl), a
00105$:
	ld	hl,(_pi)
	ld	d,(hl)
	inc	hl
	ld	h,(hl)
	ld	a,d
	sub	a, #0x10
	ld	a,h
	sbc	a, #0x27
	ret	NC
;test_loop.c:10: for( *pj=0; *pj < 100; *pj++)
	ld	hl,(_pj)
	xor	a, a
	ld	(hl), a
	inc	hl
	ld	(hl), a
00101$:
	ld	hl,(_pj)
	ld	d,(hl)
	inc	hl
	ld	h,(hl)
	ld	a,d
	sub	a, #0x64
	ld	a,h
	sbc	a, #0x00
	jr	NC,00107$
;test_loop.c:11: (*ps)++;
	ld	de,(_ps)
	ld	l,e
	ld	h,d
	ld	c,(hl)
	inc	hl
	ld	b,(hl)
	inc	bc
	ex	de,hl
	ld	(hl),c
	inc	hl
	ld	(hl),b
;test_loop.c:10: for( *pj=0; *pj < 100; *pj++)
	ld	hl,#_pj
	ld	a,(hl)
	add	a, #0x02
	ld	(hl),a
	inc	hl
	ld	a,(hl)
	adc	a, #0x00
	ld	(hl),a
	jr	00101$
00107$:
;test_loop.c:9: for ( *pi=0; *pi < 10000; *pi++)
	ld	hl,#_pi
	ld	a,(hl)
	add	a, #0x02
	ld	(hl),a
	inc	hl
	ld	a,(hl)
	adc	a, #0x00
	ld	(hl),a
	jr	00105$
	ret
_main_end::
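
One thing to watch in the listing above: in C, *pi++ parses as *(pi++), so the loop step advances the pointer itself rather than the counter it points to. That is why the generated code adds 2 to _pi and _pj instead of incrementing i and j. Below is a sketch of the same loop with the step parenthesized so that the pointed-to value is incremented. The function name is hypothetical and uint16_t is an assumption standing in for the 16-bit unsigned int of SDCC on the Z80. This sketch has not been run through SDCC, so it says nothing about what the resulting assembler looks like.

```c
#include <stdint.h>

/* Globals accessed only through pointers, as in the post above. */
uint16_t i, j, s;
uint16_t *pi = &i;
uint16_t *pj = &j;
uint16_t *ps = &s;

/* Hypothetical helper: runs the nested loop and returns the counter. */
uint16_t run_pointer_loops(void)
{
    *ps = 0;
    /* (*pi)++ increments the value; *pi++ would move the pointer instead. */
    for (*pi = 0; *pi < 10000; (*pi)++)
        for (*pj = 0; *pj < 100; (*pj)++)
            (*ps)++;
    return *ps;
}
```

After the call the pointers still point at i, j and s, and the 16-bit counter has wrapped around to 16960.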

By ARTRAG

Enlighted (6844)

15-09-2013, 21:23

Oops! I made a mistake in computing the times.
The COM from the Hitech cross compiler needs 32 secs on a plain Z80, not 6 secs.
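
As a rough cross-check on the order of magnitude, the inner loop of the Hitech-C listing earlier in this thread (label l11) can be costed with the standard Z80 T-state figures. This is only a sketch under stated assumptions: a 3.579545 MHz clock, no extra wait states, and the outer loop and printf ignored. The struct and function names are made up for illustration.

```c
#include <stddef.h>

/* One instruction of the inner loop and its standard Z80 T-state cost. */
struct insn {
    const char *text;
    unsigned tstates;
};

/* Inner loop at label l11, copied from the Hitech-C listing. */
static const struct insn inner_loop[] = {
    { "inc bc",      6 },
    { "inc iy",     10 },
    { "ld de,064h", 10 },
    { "push iy",    15 },
    { "pop hl",     10 },
    { "or a",        4 },
    { "sbc hl,de",  15 },
    { "jp c,l11",   10 },
};

/* Sum of T-states for one pass through the inner loop. */
unsigned inner_loop_tstates(void)
{
    unsigned total = 0;
    size_t k;

    for (k = 0; k < sizeof inner_loop / sizeof inner_loop[0]; k++)
        total += inner_loop[k].tstates;
    return total;
}

/* Estimated run time in seconds; clock_hz is an assumed CPU clock. */
double estimated_seconds(unsigned long iterations, double clock_hz)
{
    return (double)inner_loop_tstates() * (double)iterations / clock_hz;
}
```

With 1,000,000 inner iterations at 3.579545 MHz this comes to 80 T-states per pass and a bit over 22 seconds. Wait states on real MSX hardware and the outer loop would push the real figure higher, so treat it as a lower bound rather than a measurement.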

By PingPong

Prophet (3889)

15-09-2013, 21:27

ARTRAG wrote:

Oops! I made a mistake in computing the times.
The COM from the Hitech cross compiler needs 32 secs on a plain Z80, not 6 secs.

Are you sure? I cannot believe that Hitech-C generates slower code than SDCC...

By Marq

Champion (387)

15-09-2013, 22:22

I'm happy to see the topic sparked some discussion :) And the test case is, indeed, a trivial one, but at least it tells a thing or two. Loops are so fundamental that if they are slow, probably everything else is, too. I need to try the Hisoft cross-compiler when I get the chance. SDCC isn't that bad after all: if you look at the code it seems to do silly things at times, but the overhead isn't that heavy compared to many other compilers tested here. There's a lot of good top-level optimization going on even if the Z80 code generator is quite lame.

By ARTRAG

Enlighted (6844)

15-09-2013, 23:13

I wouldn't conclude that one is better than the other from a single test.
You should at least define a suite of cases, as was done for code size here:
http://sdcc.sourceforge.net/mediawiki/index.php/Z80_code_size

By Marq

Champion (387)

16-09-2013, 06:44

Well, I didn't mean this experiment as any conclusive test, just a fun one-afternoon test run of different compilers. At least it made me quite happy about my choice of SDCC+inline :) I was considering z88dk at some point, but according to this it's a no go. In real applications the differences would, of course, be smaller, especially if they're I/O bound.
