If something other than TZX would lead to a decrease in the probability of preservation, I think that's a big point in TZX's favour.
The only benefit of using TZX is that Laser Squad can be archived.
Why is that? Can't it be archived in WAV? Or in UEF, or CSW?
It's archived as a .WAV.
It could be archived as a .CSW.
Not sure about the UEF because of the Spectrum data in it.
It certainly can't be archived as a .CAS.
If something other than TZX would lead to a decrease in the probability of preservation, I think that's a big point in TZX's favour.
It wouldn't. TZXDuino doesn't have a large following in the MSX community anyway. CASDuino is far better suited: there is already a large number of .CAS files, and the higher baud rates are more reliable than they are on TZXDuino.
Not sure about the UEF because of the Spectrum data in it.
I certainly cannot think of an efficient encoding: one that wouldn't be some big multiple of the original data size and of the TZX. At the most absurd usage, you could keep posting a single cycle of carrier tone and changing the baud rate, which would probably be comparable in size to an equivalent WAV but semantically much more painful. I think you could do a bit better than that, but not hugely.
You're unambiguously right to discount it as a smart choice.
(In case it's interesting trivia: the Acorn machines have relatively high-level tape hardware that operates much more like an FDC; there's an interrupt for high tone detection and then interrupts for bytes received, and no route to directly inspect the current tape level. So that's the context there.)
VGM files are also usually gzip-compressed as .vgz. In the first versions of VGMPlay I only supported the decompressed files (later I implemented gunzip support). If CASDuino reads from an SD card, I think there should be nothing stopping you from putting uncompressed files on the card; with the GBs of space available nowadays I don't think it should be a big issue.
Alternatively, since the zero-crossing times will be very similar, a format based on deltas and simple Huffman compression should already keep the file size down by a lot, and would be simple to implement, stream incrementally, emulate in real-time, and process digitally with threshold values.
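To make the delta-plus-Huffman idea a bit more concrete, here is a minimal sketch. It assumes the tape is already available as a list of integer times between zero crossings; the function names are made up for illustration, and it only computes Huffman code lengths (enough to estimate the size), not an actual bitstream.

```python
import heapq
from collections import Counter

def delta_encode(intervals):
    """Replace each interval by its difference from the previous one."""
    prev, out = 0, []
    for t in intervals:
        out.append(t - prev)
        prev = t
    return out

def delta_decode(deltas):
    """Inverse of delta_encode: running sum of the deltas."""
    prev, out = 0, []
    for d in deltas:
        prev += d
        out.append(prev)
    return out

def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code over symbols."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (weight, unique tiebreak, symbols in this subtree).
    heap = [(n, i, [s]) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(counts, 0)
    tiebreak = len(heap)
    while len(heap) > 1:
        n1, _, s1 = heapq.heappop(heap)
        n2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:          # every merge adds one bit to these codes
            lengths[s] += 1
        heapq.heappush(heap, (n1 + n2, tiebreak, s1 + s2))
        tiebreak += 1
    return lengths

def huffman_bits(symbols):
    """Total payload size in bits, ignoring the cost of storing the table."""
    lengths = huffman_code_lengths(symbols)
    return sum(lengths[s] for s in symbols)

# Idealised input: a run of short pulses, one long cycle, more short pulses.
intervals = [208] * 100 + [417, 417] + [208] * 4
deltas = delta_encode(intervals)
```

On real captures the intervals jitter by a few units, so the deltas cluster around zero rather than being exactly zero; that is the case the Huffman stage is meant to soak up.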
There are also LH variants which need neither a large buffer nor much code to implement. Finally, gzip supports smaller buffer sizes as well, although that has to be enforced by the encoder, so a restriction could be stated explicitly in the file format specification (but might be hard to enforce).
If you all want to play and have fun, you can try these conversion tools:
https://github.com/nataliapc/MSX_devs/tree/master/TSXphpclass
You can even convert Atom/BBC UEF files to TSX format and play them on a TZXDuino.
(not all UEF blocks are translated but the most common/useful for this case)
I also want to present a site where we can see several TSX files. All of them have been tested and validated for reliability.
Verification is hard work, so only 28 files are available at this time.
http://tsx.eslamejor.com/
I agree that the TSX format is not the definitive tape format, and there are a lot of opinions about what a 'better' format would look like.
The TSX format is not an angel, but also not a devil, and I think the people who worked to produce hundreds of TSX files to preserve tape data deserve recognition.
I've also talked before about the pros of TSX in several forums (semantic data, reliability, reuse of TZX tools, reduced size ...) and I don't want to reopen that discussion (that's enough).
Instead, I would like to start a new conversation about a possible new format: which features all of us would like to see in it, while at the same time shedding the heavy baggage of past formats.
Some features this new format could include:
- Simple/serialized format for small player devices (xxxDuino), and emulators.
- Reliability
- Semantic data
- Easily scalable for new features/platform formats
- Optional metadata (easily skipped when played)
- Multiplexed data? Interleaving tape pulses (CSW) and raw byte data inside each data block?
- ...
This is meant to be a kind of brainstorm... please feel free to discuss and add new features and constructive ideas.
Off the top of my head, for the sake of throwing things out there: do it by microcode.
Each block would say, either:
- this is a piece of microcode to run n times; or
- here is the microcode for outputting a 0; here is the microcode for outputting a 1; here is the microcode for each byte; here is a block of bytes.
Microcode instructions are:
- REVERSE — reverses output polarity (i.e. in MSX terms, toggles the output level between a 0 and a 1 or vice versa);
- WAIT n — does nothing for n units of time;
- OUTPUT 0 — calls the output a 0 routine;
- OUTPUT 1 — calls the output a 1 routine;
- OUTPUT bit n — reads bit n from the current byte and calls either the output a 0 routine or the output a 1 routine;
- CLEAR PARITY — resets the current value of parity;
- SET PARITY — sets the current value of parity;
- OUTPUT PARITY — outputs the current value of parity (implicitly: it has been toggled by all OUTPUT calls in the interim).
So e.g. to reproduce the data within a CAS you might have:
New block, repeat a few thousand times:
WAIT 208
REVERSE
New block, routine to output a 0 is:
WAIT 417
REVERSE
WAIT 417
REVERSE
Routine to output a 1 is:
WAIT 208
REVERSE
WAIT 208
REVERSE
WAIT 208
REVERSE
WAIT 208
REVERSE
Routine to output a byte is:
OUTPUT 0
OUTPUT b0
OUTPUT b1
OUTPUT b2
OUTPUT b3
OUTPUT b4
OUTPUT b5
OUTPUT b6
OUTPUT b7
OUTPUT 1
OUTPUT 1
And block of bytes is: [whatever] .
New block, repeat once:
WAIT 5000000
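For anyone who wants to poke at it, the CAS example above can be run through a tiny interpreter. This is only a sketch under my own assumptions: instructions are represented as Python tuples, the `Player` class and its method names are invented, the parity instructions are left out, and time simply accumulates across WAITs until the next REVERSE.

```python
class Player:
    """Hypothetical interpreter: collects the time between polarity reversals."""

    def __init__(self):
        self.pending = 0     # time waited since the last REVERSE
        self.intervals = []  # gaps between successive reversals

    def run(self, code, zero=(), one=(), byte=0):
        for op, arg in code:
            if op == "WAIT":
                self.pending += arg
            elif op == "REVERSE":
                self.intervals.append(self.pending)
                self.pending = 0
            elif op == "OUTPUT":        # arg is the literal bit, 0 or 1
                self.run(one if arg else zero)
            elif op == "OUTPUT_BIT":    # arg is the bit index within the byte
                self.run(one if (byte >> arg) & 1 else zero)
            else:
                raise ValueError(f"unknown instruction {op!r}")

    def repeat_block(self, code, times):
        """'This is a piece of microcode to run n times.'"""
        for _ in range(times):
            self.run(code)

    def data_block(self, zero, one, per_byte, data):
        """Apply the per-byte routine to every byte of the block."""
        for byte in data:
            self.run(per_byte, zero, one, byte)

CARRIER = [("WAIT", 208), ("REVERSE", None)]
ZERO = [("WAIT", 417), ("REVERSE", None)] * 2
ONE = [("WAIT", 208), ("REVERSE", None)] * 4
BYTE = ([("OUTPUT", 0)]
        + [("OUTPUT_BIT", n) for n in range(8)]
        + [("OUTPUT", 1)] * 2)

player = Player()
player.repeat_block(CARRIER, 3)              # a real header repeats far more often
player.data_block(ZERO, ONE, BYTE, b"\x01")  # one data byte, value 0x01
player.repeat_block([("WAIT", 5_000_000)], 1)
```

After this runs, `player.intervals` holds the reversal-to-reversal times (effectively what a CSW would store), and `player.pending` holds the trailing silence that has not yet been closed by a REVERSE.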
Some sort of tiny-window gzip-esque compression — whatever is appropriate for a CASDUINO-type device — could be layered on top.
I think the instructions would be a variable length encoding, in a bit stream. One idea would simply be: 3 bits for the opcode. If it's OUTPUT bit n then an additional 3 to specify a bit. If it's WAIT then, well, it depends on what time precision we want. Whatever fits at least five seconds, I'd say. Possibly also make it a multiple of 3 bits since that's the pattern so far. I used microseconds above, so 24 bits, which is also nice for being a multiple of 8.
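As a rough sketch of that bit-stream encoding: 3 bits of opcode, plus 3 more for OUTPUT bit n, plus 24 for WAIT. The opcode numbering, the most-significant-bit-first packing and the zero padding of the last byte are all assumptions of mine, not part of the proposal.

```python
# Hypothetical opcode assignment; only the 3-bit width comes from the proposal.
OPCODES = {
    "REVERSE": 0, "WAIT": 1, "OUTPUT_0": 2, "OUTPUT_1": 3,
    "OUTPUT_BIT": 4, "CLEAR_PARITY": 5, "SET_PARITY": 6, "OUTPUT_PARITY": 7,
}

def encode(program):
    """Pack instructions MSB-first; return (packed bytes, number of bits used)."""
    bits = []

    def put(value, width):
        bits.extend((value >> shift) & 1 for shift in range(width - 1, -1, -1))

    for op, *args in program:
        put(OPCODES[op], 3)
        if op == "WAIT":
            if not 0 <= args[0] < (1 << 24):
                raise ValueError("WAIT operand must fit in 24 bits")
            put(args[0], 24)
        elif op == "OUTPUT_BIT":
            if not 0 <= args[0] < 8:
                raise ValueError("bit index must be 0..7")
            put(args[0], 3)

    nbits = len(bits)
    bits.extend([0] * (-nbits % 8))  # pad the final byte with zero bits
    packed = bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[pos:pos + 8]))
        for pos in range(0, len(bits), 8)
    )
    return packed, nbits

carrier, carrier_bits = encode([("WAIT", 208), ("REVERSE",)])
byte_routine = ([("OUTPUT_0",)]
                + [("OUTPUT_BIT", n) for n in range(8)]
                + [("OUTPUT_1",)] * 2)
_, byte_routine_bits = encode(byte_routine)
```

Because the padding bits would decode as more instructions, a real container would need to store the bit count (or an instruction count) alongside each routine; here it is just returned next to the bytes.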
That would cover all storable MSX tapes, with negligible overhead over a CAS for CAS-style data. CSW-style data would be massive though, so I'd at least consider a third block for "output reverses at the following intervals" which would be like CSW (but without additional compression) and/or like Commodore .TAP.
If you don't like microcoding, then I'd still advocate the same approach of 'either do these directly-specified things, or else here's a template and apply it to the following bytes' but formulated as data. E.g. the equivalent of the above would be more like:
OUTPUT BLOCK: a few thousand instances of 'wait 208, reverse'
DATA BLOCK: a 1 is this output block, a 0 is this output block, the template is 0 b0 b1 b2 b3 b4 b5 b6 b7 1 1, the bytes are
OUTPUT BLOCK: one instance of 'wait' with a long number
You just end up getting into slightly complex hoops with machines that store parity (two extra potential tokens needed in the template), or which leave gaps between bytes (an extra header field? an extension of the template?). But just formulating as a generic output block and being able to specify one of those as the form of a 0 and another as the form of a 1 is already a huge step above what's already out there.
EDIT: consider calling it Micro Cas because it's microcoded, because it's oriented towards cassette formats used by micros, and because the compression window will be selected with an eye as to microcontrollers.
Also, if it wasn't explicit, the two or three kinds of block will be explicitly and uniformly demarcated. Take that, TZX!
As we're brainstorming, here's an entirely different suggestion for those who didn't like the previous: adopt something more like PNG's approach.
PNG files are a collection of lines. Each line is an initial value and a predictor. The predictor answers the question: given the values to the left, immediately above, and diagonally to the top left, what value would you expect to be here? The line itself is then encoded as a list of error values: at each pixel, the difference between what the predictor said should be there and what's actually there. That's then subject to rote DEFLATE compression, the same algorithm gzip uses.
The idea is that if you pick a good predictor, the error values tend to be small and repetitive. Which makes them gzip really well. You use a priori knowledge to try to reduce entropy and let the compressor deal with reducing the byte size.
The PNG predictors aren't particularly complicated. They're just things like the difference, the average, next in the numeric sequence, etc.
For microcomputer audio tapes the thing you're sampling would be the amount of time between zero crossings. The predictors could likely be things like "same as length before", "same as length one before last", "equal to most common length in the last 16 events", "equal to the average of the last 16 events", etc. You'd probably have to approach it experimentally.
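Here's a rough sketch of how those predictors and the error values could look, to experiment with. The function names, the fallback prediction of 0 when there is no history yet, and the integer rounding of the average are my own assumptions; the only requirement is that encoder and decoder make exactly the same prediction from the same history.

```python
from collections import Counter

def same_as_last(history):
    return history[-1] if history else 0

def same_as_one_before_last(history):
    return history[-2] if len(history) >= 2 else 0

def most_common_of_last_16(history):
    return Counter(history[-16:]).most_common(1)[0][0] if history else 0

def mean_of_last_16(history):
    window = history[-16:]
    return sum(window) // len(window) if window else 0

PREDICTORS = [same_as_last, same_as_one_before_last,
              most_common_of_last_16, mean_of_last_16]

def encode(lengths, predictor):
    """Turn pulse lengths into error values relative to the predictor."""
    history, residuals = [], []
    for value in lengths:
        residuals.append(value - predictor(history))
        history.append(value)
    return residuals

def decode(residuals, predictor):
    """Rebuild the pulse lengths by replaying the same predictions."""
    history = []
    for error in residuals:
        history.append(predictor(history) + error)
    return history

# Idealised MSX-style pulses: eight short, one long cycle, four short.
lengths = [208] * 8 + [417] * 2 + [208] * 4
residuals = encode(lengths, same_as_last)
```

The residual list is what you would hand to the back-end compressor; the more zeros (or near-zeros) a predictor produces on real captures, the better it should compress.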
So what you're basically defining is an improved CSW: much smarter than just gzipping a list of wave times, just like PNG is much better than just gzipping raw image data, and if we're factoring in Arduino targets then we can pick a suitably constrained back-end compressor.
A quick back-of-an-envelope calculation suggests to me that for MSX ROM data, a predictor of "length is same as last" would be correct around 81.4% of the time on average.
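That kind of estimate is easy to redo in code. The sketch below is not the calculation behind the figure above; it assumes every byte value is equally likely, uses the idealised 417/208 pulse lengths from the CAS example, frames each byte as one start bit, eight data bits and two stop bits, and counts a prediction as correct when a pulse has exactly the same length as the one before it.

```python
def pulses_for_byte(value):
    """Pulse lengths for one framed byte: start 0, eight data bits, two stop 1s."""
    bits = [0] + [(value >> n) & 1 for n in range(8)] + [1, 1]
    pulses = []
    for bit in bits:
        pulses += [208] * 4 if bit else [417] * 2
    return pulses

def same_as_last_hit_rate():
    """Fraction of pulses equal to their predecessor, over all 256 byte values."""
    hits = total = 0
    for value in range(256):
        previous = 208  # the pulse before a start bit is the tail of a stop bit
        for pulse in pulses_for_byte(value):
            hits += pulse == previous
            total += 1
            previous = pulse
    return hits / total

rate = same_as_last_hit_rate()
```

Under those assumptions it comes out at roughly 84%. Real ROM contents aren't uniformly distributed, so the figure for actual data will be somewhat different.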
Sorry, I don't mean to dominate the conversation but I thought of an improvement to my instruction set above; withdraw the three parity operations and replace with two:
- COMPLEMENT PARITY — inverts parity;
- OUTPUT AND CLEAR PARITY — outputs the current parity and resets it.
... and define parity as initially reset upon entry into any data block. Reduces the instruction set from 8 instructions to 7.
I'm then tending towards potentially introducing a SHORT WAIT with a 12-bit operand to complement the LONG WAIT with a 24-bit operand. 12 bits is enough for the in-data sections of all tape data encodings I'm aware of, but when you want to put something like a 5-second gap between files you don't want to have to specify well over a thousand short waits (5,000,000 µs at no more than 4,095 µs each).