Object Oriented Z80 Assembly programming

By Micha

Resident (36)

22-08-2021, 15:43

In my games (written directly in assembly language) I use the index registers (IX and IY) very often, e.g. for controlling multiple enemies. It’s a very convenient way to do things and it produces nice, readable code. But there have to be faster ways. One other way to access lists of variables is aligned indexing with HL: H selects which enemy you want to address and L points to the variable (e.g. the x-coordinate). It is faster (approximately 25 to 35% in my pieces of code), but it has some serious drawbacks:
1) There has to be a 256-byte gap between the records
2) You can no longer easily use HL in the code (or you have to push/pop)
3) Code is less readable
Both methods use a form of indirect addressing. If there were only one enemy, direct addressing could be used, which is much faster.
Now my idea is to copy all routines that are used with our enemies into RAM multiple times (each routine has its own copy for each enemy), and use direct addressing within each piece of code. This gives nice clean code, HL can be used normally, and things don’t have to be aligned anymore. It is also possible to spawn enemy objects, allocate the RAM for their variables and code, and remove them later on, just like in object-oriented programming. One obvious drawback is of course that it uses RAM, so this works best when there are not too many objects and not too many large routines to control the objects.
My estimate is that this way of programming is at least 1.5 to 2 times faster than aligned indexing with HL.
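
To make the comparison concrete, here is a minimal sketch of the same update (x = x + dx) written with IX indexing and with aligned HL indexing. The field names enemy_x and enemy_dx and their layout are assumptions for illustration, not taken from an actual game. Cycle counts are MSX timings (including the M1 wait state), the same convention as the counts used elsewhere in this thread.

```asm
enemy_x   equ 0                 ; hypothetical field offsets within a record
enemy_dx  equ 1                 ; assumed to sit right after enemy_x

; IX indexing: IX points to the enemy record
        ld a,(ix+enemy_x)       ; 21
        add a,(ix+enemy_dx)     ; 21
        ld (ix+enemy_x),a       ; 21
                                ; (63)

; Aligned HL indexing: H = enemy (256-byte page), L = field
        ld l,enemy_dx           ; 8
        ld a,(hl)               ; 8
        dec l                   ; 5   L now points to enemy_x
        add a,(hl)              ; 8
        ld (hl),a               ; 8
                                ; (37)
```

The HL version benefits here from the two fields being adjacent; with random access, each field needs its own ld l,n first.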

Would this be a good way to go forward? Does it maybe already exist and are there any examples of this? Did I miss some serious drawbacks of this method? What are your thoughts?

Regards, Micha

By thegeps

Paladin (862)

22-08-2021, 17:03

Well, it will surely be fast, but you'll waste a lot of RAM. It's OK if your games don't need compressed data.

By ARTRAG

Enlighted (6539)

22-08-2021, 20:37

Not a good idea: you will not be able to dynamically allocate RAM for your enemies, and unless you have very few enemies, RAM will soon become a problem.
All tests involving all active enemies are forced to scan all possible items, even if only one or two are active.
A list of only the active items would be much easier to manage.
Moreover, your code size will explode as well.

By Sandy Brand

Master (243)

22-08-2021, 21:03

I think what you are proposing is a sort of 'macro expansion', but done at run-time by copying a piece of 'template code' and then patching it up relative to the start address of the allocated memory?

I agree with Artrag: you will probably run out of memory really soon if you want to have enemies that do something remotely interesting, which will require quite a bit of code.

Also, each enemy may have a wildly differing memory footprint (its total size is now code memory + variables memory). Thus, you may need to resort to actual memory allocation schemes in order to make the best use of what little memory you have available. This has all the usual problems: it is slower, it adds more code bloat, and things such as memory fragmentation easily become a real issue with the small amount of memory an MSX has.

Finally, the creation of a single enemy will now be significantly more expensive: finding some memory, copying the 'template' and then patching it up. It depends on the game, of course, but most enemies are rather short lived: you must ask yourself the question how long an enemy must live before it starts to offset the additional costs of instantiating it if you go with your proposed idea.
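
As a rough sketch of what that instantiation involves, the fragment below copies a small template routine into a RAM slot with LDIR and patches its two address operands. All labels (template, fix1, fix2, slot, slot_x, spawn) and both addresses are hypothetical, and for simplicity the slot is at a fixed address known at assembly time, which sidesteps the 'finding some memory' step entirely.

```asm
slot    equ 0C000h              ; hypothetical free RAM for this enemy's code
slot_x  equ 0C100h              ; hypothetical location of this enemy's x

; Template routine: increments one enemy's x. The 0000 operands are placeholders.
template:
fix1:   ld a,(0000)             ; operand patched to slot_x at spawn time
        inc a
fix2:   ld (0000),a             ; operand patched to slot_x at spawn time
        ret
template_end:

; Instantiate the template in the slot and patch it
spawn:
        ld hl,template
        ld de,slot
        ld bc,template_end-template
        ldir                    ; copy the code (about 23 cycles per byte on MSX)
        ld hl,slot_x            ; address of this enemy's x variable
        ld (slot+fix1-template+1),hl    ; patch operand of the first ld
        ld (slot+fix2-template+1),hl    ; patch operand of the second ld
        ret
```

Even for this 8-byte routine the copy alone takes roughly 180 cycles, before any allocation logic is added.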

By MsxKun

Paragon (1031)

22-08-2021, 21:25

Keep it simple.
But the best way to do things on such a limited machine always depends on the specific case.

By Timmy

Master (163)

23-08-2021, 03:05

If your game is going to be 1.5 to 2 times better with this addition, then you should definitely add it to your game.

By Metalion

Paragon (1444)

23-08-2021, 07:38

I've also come to use IX and IY extensively, so I can program in assembly using some OOP techniques (but not all of them). Mainly, it's to be able to attach variables and methods to an enemy. For example, moving all enemies comes down to calling the 'move' method of each enemy, whatever its family and characteristics. The same goes for collision.

I find that the gain in readability and portability outweighs the other problems, and that the loss of speed is minimal. If you have to go through the list of enemies and call their private methods (move, collision, ...), the difference between using normal registers and IX,IY is minimal.

	ld	h,(iy+enemy.move+1)	; 21
	ld	l,(iy+enemy.move)	; 21
	call	jphl			; 18+5
					; (65)
	
	ld	a,(hl)			; 8
	inc	hl			; 7
	ld	h,(hl)			; 8
	ld	l,a			; 5
	call	jphl			; 18+5
					; (51)

And that's without counting some setup of 'hl', needed before accessing the method's address.

Of course, if you don't go through the list, and call enemies directly 'by name', you gain some cycles. But, as Artrag said, you always have a variable number of enemies, and at some point you need to go through some list to enumerate them.
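
Going through the list could look like the sketch below. It assumes a fixed table of ENEMY_COUNT records of enemy.size bytes starting at enemies; those names and the layout are hypothetical, and only enemy.move and jphl come from the fragment above. Each move method is assumed to preserve IY.

```asm
update_enemies:
        ld iy,enemies           ; hypothetical table of enemy records
        ld b,ENEMY_COUNT        ; hypothetical number of records
.loop:
        push bc                 ; the method may clobber BC
        ld l,(iy+enemy.move)
        ld h,(iy+enemy.move+1)
        call jphl               ; call this enemy's move method
        pop bc
        ld de,enemy.size        ; hypothetical record size
        add iy,de               ; advance to the next record
        djnz .loop
        ret

jphl:   jp (hl)
```

This version visits every record, including inactive slots; a list of only the active enemies, as Artrag suggests, would avoid that.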

By Micha

Resident (36)

23-08-2021, 08:59

Metalion wrote:

I find that the gain of readability and portability exceed the other problems, and that the loss of speed is minimal. If you have to go through the list of enemies, and call their private methods (move, collision, ...), the difference between using normal registers and IX,IY is minimal.

I totally agree, I also prefer using IX and IY over aligned HL indexing. But to stick with your example, direct addressing has a big advantage; you can do:

ld hl,(enemy.move)     ; 17
call jphl   ; 18+5
               ; (40)

...still readable and a lot faster.
But after looking into it a bit more, this is almost the best-case scenario. I've tested it on pieces of my code and the gains were more in the 20 - 40% range compared to using index registers, and only slightly faster than aligned HL indexing. Also, things like inc (hl) or inc (ix+o) have become impossible.
So I think I'll stick with using index registers and look for other kinds of optimizations. Or maybe I'll use HL indexing together with IX and IY...
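
The point about inc can be illustrated as follows: the Z80 has no inc (nn) form, so with direct addressing a counter has to go through A or through HL. The label enemy_timer and the offset timer are hypothetical; cycle counts are MSX timings as above.

```asm
; Index register: one instruction
        inc (ix+timer)          ; 25

; Direct addressing via A
        ld a,(enemy_timer)      ; 14
        inc a                   ; 5
        ld (enemy_timer),a      ; 14
                                ; (33)

; Direct addressing via HL (costs the use of HL)
        ld hl,enemy_timer       ; 11
        inc (hl)                ; 12
                                ; (23)
```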

Thanks for all your inputs!

By Metalion

Paragon (1444)

23-08-2021, 10:38

Micha wrote:

But after looking a bit more into it, this is almost best case scenario

Of course. Because the (enemy.move) address will not fall out of the sky. You will first need to determine which enemy you are dealing with, and then find the correct 'move' method. So, some setup will be needed.

Micha wrote:

I've tested it on pieces of my code and gains were more like in the 20 - 40% range comparing to using index registers and only slightly faster than aligned HL indexing.

Yes, but that gain is only on the calling of the method. I'm quite sure that the method itself takes more cycles, and therefore your gain is diluted. And unless you need very fast enemies on screen, or have problems with frame rate, then the gain will not be relevant.

By GhostwriterP

Hero (619)

23-08-2021, 11:04

Personally I am a fan of HL indexing with 256-byte aligned variables. IX and IY instructions are not only slower but also produce longer code. If you minimise the need for random access by cleverly arranging the variables, HL indexing will beat IX and IY indexing on both fronts. At least in my experience. But maybe it is only subjective and I am a little biased... those 20+ cycle instructions just really scare me.
I also had this idea of RAM code routines with in-line variables a while back. It could work really well for simple, small routines that you need many copies of, like bullets in a shooter. The idea is of course that the routines are already there, and only the variables (which are at known, constant and quickly accessible places) and some specific parameters are adjusted when a bullet is fired. I intended this method for the turboR, which has a lot of RAM by default, so that is not an issue. Furthermore, the speed gain on the turboR is even bigger because memory access gives you extra penalties. Now you can do something like this:

.x = $ + 1
  ld hl,0000     ;read x
.xinc = $ + 1
  ld de,0000     ;read x increment
  add hl,de      ;add
  ld (.x),hl     ;store updated x

Player coordinates for collision can be kept in a free register pair etc. etc. You get the idea ;)

Anyway, if IX and IY indexing works and is fast enough, you might just as well stick with that. It makes coding life a bit easier as it frees up HL. So in short, just choose the method that fits your needs and that you are comfortable with.

By Grauw

Ascended (10062)

23-08-2021, 12:57

Maybe rather than focusing on IX / IY, there is an algorithmic improvement possible in the routine that gives greater gains? E.g. run it less frequently, use a more optimal lookup structure for active enemies, precalculate some data, optimise the collision logic so that fewer values need to be compared, or restrict the flexibility. Along those lines. Since I’m not familiar with the code I can’t give more concrete suggestions, but maybe this will spark a thought.

I’m a big fan of IX / IY because they make code much easier to read and modify, and they can be more performant than one would think because they can do several operations at once (e.g. ld (ix+o),n). Having two very flexible additional registers helps to reduce register pressure as well, where you would otherwise need to push / pop or load via an intermediate register. Although there are certainly cases where I have optimised them away down the line, usually I attempt to find other algorithmic types of optimisations first.