We tested decompression speed with a file of roughly 16 KB. The results:
BB (new): 1264038 clock ticks (0,35 s)
BB: 1435048 clock ticks (0,40 s)
TCF (new): 1498766 clock ticks (0,42 s)
apack: 1590929 clock ticks (0,45 s)
TCF: 1860128 clock ticks (0,52 s)
Also, compressed file sizes for that particular file (originally 16391 bytes):
apack: 5187
BB: 5321
TCF: 5345
As you can see, if you compare these results with the earlier compression size tests, they differ quite a bit depending on the input.
Anyway, which one is the fastest is clear ^_^. However, remember that this is only a single test case (though the gaps are pretty large to overcome...). And it's been a nice evening discussing (de)compression on IRC, so surely things will still change for the better! ^_^
~Grauw
Could you test this with some YUV or 16-bit RGB pictures too?
The TCF decompressor used in this test was size-optimized. I've just spent 15 minutes optimizing the decompressor for speed. As I do not have the test case, I can't post results right now, but I'm sure they'll be quite a bit better for TCF.
Still... All this talk about compression engines is nice, but we're actually talking G9B here.
That means there should be an MSX compressor, which isn't available for any of the 3 formats (for apack, not even the source code of the compressor is available).
Or isn't that necessary?
Since we're talking graphics here, I think maybe a difference compression (like T&E Soft's CMP) would be a better idea anyway. Difference compression is easy to program on MSX, unlike any of the LZ packers we've tested until now.
I'm thinking: XOR each line with the previous one, do an MTF (Move To Front) on the output, and compress with static Huffman.
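To make the idea concrete, here is a rough sketch of the first two stages (XOR against the previous line, then Move To Front) in Python. It is only an illustration under some assumptions: all lines have the same length, the first line is XORed with an all-zero line, and the MTF table starts as 0..255. The function names are made up for this example, and the static Huffman stage is left out.

```python
def xor_lines(lines):
    """XOR each line with the previous one (first line vs. all zeros)."""
    prev = bytes(len(lines[0])) if lines else b""
    out = []
    for line in lines:
        out.append(bytes(a ^ b for a, b in zip(line, prev)))
        prev = line
    return out


def unxor_lines(deltas):
    """Inverse of xor_lines: rebuild each line from the previous one."""
    prev = bytes(len(deltas[0])) if deltas else b""
    out = []
    for delta in deltas:
        line = bytes(a ^ b for a, b in zip(delta, prev))
        out.append(line)
        prev = line
    return out


def mtf_encode(data):
    """Move To Front: emit each byte's index in a recency-ordered table."""
    table = list(range(256))  # assumed initial order 0..255
    out = bytearray()
    for byte in data:
        idx = table.index(byte)
        out.append(idx)
        table.pop(idx)
        table.insert(0, byte)
    return bytes(out)


def mtf_decode(indices):
    """Inverse of mtf_encode."""
    table = list(range(256))
    out = bytearray()
    for idx in indices:
        byte = table.pop(idx)
        out.append(byte)
        table.insert(0, byte)
    return bytes(out)


# Three tiny "scanlines"; identical lines turn into runs of zeros.
lines = [b"\x01\x02\x03", b"\x01\x02\x03", b"\x01\x02\x04"]
deltas = xor_lines(lines)
encoded = mtf_encode(b"".join(deltas))
```

Repeated lines become zeros after the XOR step, and MTF maps recently seen values to small indices, which is what should give a static Huffman table something skewed to work with.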
New TNI Compression Format benchmark (same test, optimized engine):
1498766 clock ticks (0,42 s)
I edited this result into the earlier post (compression ratio is the same), and also recalculated all times in s, as I had made a small miscalculation (I multiplied the Z80's 3.57 MHz by 1024 twice instead of by 1000 -_-;).
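For reference, the conversion from clock ticks to seconds can be sketched like this in Python. It assumes a 3.57 MHz clock as quoted above; the function name is just for illustration.

```python
Z80_CLOCK_HZ = 3_570_000  # 3.57 MHz, i.e. 3.57 * 1000 * 1000

# The miscalculation described above: multiplying by 1024 twice.
WRONG_CLOCK_HZ = 3.57 * 1024 * 1024


def ticks_to_seconds(ticks, clock_hz=Z80_CLOCK_HZ):
    """Convert a clock tick count to seconds at the given clock rate."""
    return ticks / clock_hz


results = {
    "BB (new)": 1264038,
    "BB": 1435048,
    "TCF (new)": 1498766,
    "apack": 1590929,
    "TCF": 1860128,
}

for name, ticks in results.items():
    print(f"{name}: {ticks} clock ticks ({ticks_to_seconds(ticks):.2f} s)")
```

Using the 1024-based constant makes every time come out a few percent too short, which is why the figures had to be recalculated.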
~Grauw
Btw, these tests were conducted in openMSX using the debugdevice. The given emutime values translate to clock cycles when divided by 24. Also, all three tests were run with interrupts disabled.
~Grauw
I just found out that my timing for bitbuster wasn't correct, due to a difference between having DOS2 inserted and not having it inserted (you could also blame it on my bad coding though).
The number of cycles for that test should be 1435048, making it just a little faster than TCF. However, I got some tips from sjoerd which make it easier and faster to decode the match length values. Using this optimization, the decompression takes 1264038 cycles. I also tested in R800 mode, in which it only takes 135887 cycles. That is more than 9 times as fast as in Z80 mode!
You tested R800 on openMSX? Because I don't think openMSX has accurate timings for R800 yet. There are all kinds of stalls and penalties in R800 which are poorly documented.
Uhm yeah, that was tested on openMSX since I don't have a TR.
I updated the test results in the message above.
~Grauw
I have just uploaded a new version of the gf9k library (version 0.004).
New in this version:
- Font routines now work okay on turbo R
- Added multiple font support. Up to ~64KB of data can be used for fonts. This makes it possible to load about 16 fonts of 8*16 pixels or 32 fonts of 8*8 pixels.
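The font counts above are consistent with a simple size estimate, sketched here in Python. This assumes 256 characters per font at 1 bit per pixel, which is a guess on my part and not something stated in the release notes; the function names are hypothetical too.

```python
FONT_BUDGET_BYTES = 64 * 1024  # the "~64KB" from the release notes


def font_bytes(width, height, glyphs=256, bits_per_pixel=1):
    """Size of one font in bytes (assumed: 256 glyphs, 1 bit per pixel)."""
    return glyphs * width * height * bits_per_pixel // 8


def fonts_that_fit(width, height, budget=FONT_BUDGET_BYTES):
    """How many fonts of the given glyph size fit in the budget."""
    return budget // font_bytes(width, height)


print(fonts_that_fit(8, 16))  # 8*16 pixel fonts
print(fonts_that_fit(8, 8))   # 8*8 pixel fonts
```

Under those assumptions an 8*16 font takes 4 KB and an 8*8 font takes 2 KB, which matches the 16 and 32 fonts mentioned.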