Grauw’s RPG in development

Page 24/25
17 | 18 | 19 | 20 | 21 | 22 | 23 | | 25

By Grauw

Ascended (9072)

Grauw's picture

26-04-2020, 20:54

Update time!

I reorganised the ROM so that after the first 32K it has one giant block of data which I address using 24-bit pointers (bank, address). Some things span many banks (e.g. images and tile bitmaps), other things I align so that they stay within a single 2k bank (e.g. tile map). It’s a bit like a file store this way, but still with fast direct access where I need it.

The tile animation has been temporarily removed. It was never a fully functional system, more a quick proof of concept that didn’t scale. Removing it means I no longer need to store the tile set in VRAM. Later I’ll introduce a smaller VRAM cache containing only the tiles that animate.

The sprites are now generated from Aseprite spritesheets by the build script. Until now they were converted from tile data in assembly. This gives more flexibility to optimize sprites, and they also no longer rely on a particular tile pixel data format.

I’ve made the screen mode configurable in the code and the build script. It’s nice to see the engine running in screens 7, 8 and 11 just by changing a value. It doesn’t come cheap though, I barely have any spare time left on the CPU! I need to come up with some optimisations if I want to proceed with that…

Finally I implemented PSG + OPLL audio playback. It converts from VGM to a VGM-like format. Playback takes under 6000 cycles (~5% CPU) in the worst case where all registers update every two frames. I previously budgeted in the music player with a dummy routine of 7000 cycles (6% CPU) so I’m glad that was somewhat accurate.

I’m now working on compressing the audio data in a CPU-friendly way. I started out by grouping the register write commands per interrupt and counting the duplicates. On one test song of 3383 commands total, there were 881 such groups, of which only 72 were unique. Some compression potential there.

However I need to go further than that, because this would still mean I need two bytes per frame which gives me only just over a minute in 2K of memory. Ideally I would want song data to stay under 2K. Huffman comes to mind, although that’s not the speediest thing to do I would only have to do 1 iteration / frame. If it doesn’t work out though I could decide be wasteful with ROM space.

I’ve implemented a decompressor before, but never a compressor, so I’m kinda winging it. Since I want to play it straight from ROM I’m thinking about either some kind of hierarchical data structure, or a bytecode VM with a stack. If someone has some tips or knows some nice articles about optimisation structures for this kind of data and how to compress somewhat efficiently, suggestions are welcome :).

By Daemos

Paragon (1718)

Daemos's picture

27-04-2020, 09:33

Awesome projext. Had seen some snippets during one of the fairs and I loved the atmosfere that the game exhibited. I find it mostly impressive to see the complete different approach to coding very orderly with even budgetting on the cpu compared to the we wil see what happens mentality that I exhibit myself. Make sure to tease us from time to time with screenies. Smile

By pgimeno

Master (176)

pgimeno's picture

27-04-2020, 19:12

If I went for a compression algorithm to store compressed data in my programs, I would probably use LZEXE as a compressor. Back in the day I traced through the decompressor and understood the code; it's not hard to understand. This is all from my memory:

My recollection is that the compressed stream consists of bit data for decompressor control, interspersed with byte data for decompressor data. You first load a byte from the stream and start decoding its bits. They use a pre-established Huffman tree, which is very simple to decode. I don't remember the codes and their meaning, though.

The general idea is Lempel-Ziv encoding, that is, when something repeats, it indicates how far to go back in the stream and how much to copy from there. Now, say if the next control bit is 0, the next byte in the stream indicates how many bytes must be copied verbatim, so you copy as many bytes as indicated; if it is 1, the next one can be a 0 indicating a three-bit position and a three-bit length, or a 1 indicating more stuff. 110 would then mean e.g. take a byte for how much to go back, and another byte for how much to copy. 111 could mean end of file. I've made up the precise Huffman codes, but the idea is there. You'll have to disassemble an executable produced by it if you want to know the exact codes.

The beauty of the compressor is that it places the control bytes at exactly the necessary places. When you run out of bits in the register, you know that the next byte in the stream will be another byte of the control bit stream, so you just grab it and keep shifting bits.

I haven't studied UPX, but if you feel like that, maybe you can analyse it to see if it is as simple as I've described. I believe that compressing stuff with it should be simpler due to the fact that there are tools available for most modern OSes (so you don't need DOSBOX or similar), and the sources of the decompressors are readily available at the link.

By norakomi

Paragon (1084)

norakomi's picture

29-05-2020, 10:44

Any updates on this awesome project ?

I saw this game yesterday:
https://www.youtube.com/watch?v=Q53kd-wATgM

And I immediately had to think of your scrollengine.
It seems to me a game like aurail should be portable using your engine.

I wanted to fool around a little in the code. The omnidirectional scroll is just superb.
So I went to https://hg.sr.ht/~grauw/tiletile/browse to download your source,
but I can only download file by file... Is there an option where I can download the whole package ?

By Grauw

Ascended (9072)

Grauw's picture

29-05-2020, 16:12

Hey norakomi, here’s a small status update;

The last month I’ve been investigating the conversion of YJK mode (screen 10-12) images, but it’s not simple to get a “perfect” conversion, so a lot of math and tweaking involved. I’m writing an article about it for the MAP as I go along. I’m not certain yet I will use YJK mode for tiles, but at the least I will use it somewhere (even if just a title screen), since this is an MSX2+ game I think it is only proper to.

I’ve also helped someone with a VGM-like converter and player for OPM and OPN* chips, which was pretty interesting. Maybe I can bring some of the learnings of how I set up that file format back into this project. It’s without compression, but I did e.g. make the most frequent event (key on/off) a single byte. I will post the code for that project at some later time.

As for the source code, I’ve just put some development information in the README. In summary: it’s easiest to clone it with Mercurial using the command hg clone https://hg.sr.ht/~grauw/tiletile. Then to build you need to have Java (for the assembler) and Node.js + NPM (for the build scripts) installed. The Makefile contains the build steps, in the node_modules and all targets, on Windows you need to enter them manually (or make a batch file). You should ideally use the latest development build of openMSX, or version 0.16.0 once it’s out.

By Grauw

Ascended (9072)

Grauw's picture

29-05-2020, 15:01

By the way, note that I’m making things a bit more difficult for myself than they need to be. My horizontal resolution is 248 pixels out of 256 because I use only 1-page horizontal scroll (to have the option to use screen modes >= 7), and my vertical resolution is 243 out of 256 because I am using overscan. This means that I have only 8 (h) and 13 (v) pixels of border area available, so I draw in strips of just 4 pixels wide to keep the background drawing invisible.

If you go for just normal screen 5 without overscan, things become a lot more simple and efficient. With the 2-page horizontal scroll and the regular 212 lines, you will have 264 (h) and 44 (v) pixels of border area available which is plenty. This means that you can just draw a couple of 16x16 tiles directly each frame, e.g. only 4-5 tiles each frame for a 2 px / frame unidirectional scroll. This will surely be faster and simpler than drawing those strips I’m using.

An example screen 5 page layout for an MSX2+ scroll engine would be:

Page 0-1: Visible screen with 2-page scroll mode (512x256 scroll area with 248x212 window).
Page 2: Tile map with 256 tiles of 16x16 pixels.
Page 3: HUD / dialogs shown with screen split, and sprite data.

You can copy about 20 tiles per frame on screen 5, so 4-5 tiles account for about 25% of the frame time, and the rest of the time you can spend on sprites, game logic and animating tiles. I also use a VDP copy to animate the player sprite patterns while the CPU is updating the sprite positions and colours.

By FiXato

Scribe (1600)

FiXato's picture

29-05-2020, 15:35

(And if you don't want to install mercurial, I think you can get an archive of the most recent HEAD via https://hg.sr.ht/~grauw/tiletile/archive/@.tar.gz)

By Grauw

Ascended (9072)

Grauw's picture

29-05-2020, 16:07

FiXato wrote:

(And if you don't want to install mercurial, I think you can get an archive of the most recent HEAD via https://hg.sr.ht/~grauw/tiletile/archive/@.tar.gz)

You can, but then you also need to get the appropriate version of the neonlib subrepository and place it in lib/neonlib. Currently that is https://hg.sr.ht/~grauw/neonlib/archive/22b6366a.tar.gz. Mercurial pulls in the correct neonlib version into the correct place automatically, so it’s a bit easier.

By FiXato

Scribe (1600)

FiXato's picture

29-05-2020, 16:23

good to know!

By Grauw

Ascended (9072)

Grauw's picture

06-06-2020, 18:43

After watching Reidrac’s recent video, I’m thinking about the way I represent the tile data. His is a screen-flipping game and mine isn’t, however I think I could learn something from it.

Currently I have tile maps of variable dimensions. The tile data consists of references to tile objects. Each tile is an object which has a reference to the tile bitmap and collision flags for that specific tile.

There are a few issues with this. The tile map data can’t exceed 8K (64×64), so the maximum size is limited. When you transition to another map there needs to be a screen transition (fade). The variable map size and the indirection to get the tile bitmap data complicate the calculations. Lastly if I do entity collision tests across the entire map I’d have to test way more than are visible on screen. A smaller broad-phase partitioning would be nice.

One possible way to address these is to permit the tile map to be bigger than 8K and span multiple ROM banks. This seems ideal however the downside is that a bank switch check would be needed for every tile we access. Also for collision we would still need some kind of partitioning.

A new way I’ve been thinking about it is to subdivide the map into 16x16 partitions. Like a room-based game, but with smooth scrolling between the rooms rather than page flipping.

With these partitions the tile data bank check only has to be done when you cross to a different room. The map renderer already has special code for this threshold because a screen page is 256×256, so it’s quite a natural fit. The collision only needs to concern itself with what’s inside the same partition, and the math could be using 8-bit coordinates. Although it complicates when you cross between partitions.

The partitions could be laid out into a grid, or they could link to other partitions left / right / top / bottom. The former would map well to world x/y coordinates and map editor tools like Tiled, the latter would truly remove any constraint on map dimensions which also has some appeal.

What do you think? Good idea? Bad idea?

Page 24/25
17 | 18 | 19 | 20 | 21 | 22 | 23 | | 25