Z80Babel: C++, D, Rust, Zig and Fortran


By salutte

Master (161)

08-02-2022, 00:10

We all know exactly what the one thing is that our MSX needs to keep up with the times...

--- exactly! A Rust compiler!

Well, allow me to present the crazy project from this last weekend: Z80 Babel. This is a series of pipelines to compile C, C++, D, Rust, Zig and Fortran for our MSX:


As of now, there is only the code and some benchmark results. Once I work out in detail how I managed to get everything running, I'll write it up. The key takeaway is that any LLVM front-end that accepts the AVR target can produce LLVM intermediate files that are suitable for the Z80 (e.g., pointers are 16 bits). This LLVM intermediate file is translated to C using the llvm-cbe backend, and this C is compiled for the Z80 using our beloved SDCC.

My first guess was that such a pipeline would break at some point, but amazingly, a little bit of duct tape (a.k.a. sed) manages to fix most of the issues. Support for memory management and the standard library is, of course, limited.

On the performance side, the intermediate C files generated by the llvm-cbe backend are very verbose, and SDCC has some trouble optimizing them, but if we crank up --max-allocs-per-node, the result can be even faster than the C baseline! On the other hand, the binaries are considerably larger than the C baseline.
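To give a feel for why machine-generated C is harder to optimize, here is a hand-made sketch in C. The second function only imitates the general shape of generated code (one temporary per operation, explicit labels and gotos); it is not actual llvm-cbe output, and every name in it is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hand-written baseline: sum of the first n bytes of a buffer. */
uint16_t sum_handwritten(const uint8_t *buf, uint16_t n) {
    uint16_t total = 0;
    for (uint16_t i = 0; i < n; i++) {
        total += buf[i];
    }
    return total;
}

/* Same computation written in a machine-generated style (illustrative only):
 * every intermediate value gets its own temporary and control flow uses gotos. */
uint16_t sum_generated_style(const uint8_t *buf, uint16_t n) {
    uint16_t phi_i = 0;     /* loop counter */
    uint16_t phi_total = 0; /* running sum */
    uint8_t tmp_load;
    uint16_t tmp_zext;
    uint16_t tmp_add;
    uint16_t tmp_inc;

loop_header:
    if (!(phi_i < n)) goto loop_exit;
    tmp_load = *(buf + phi_i);
    tmp_zext = (uint16_t)tmp_load;
    tmp_add = (uint16_t)(phi_total + tmp_zext);
    tmp_inc = (uint16_t)(phi_i + 1u);
    phi_total = tmp_add;
    phi_i = tmp_inc;
    goto loop_header;

loop_exit:
    return phi_total;
}

/* Both versions must agree; 1 + 2 + 3 + 250 = 256. */
void check_sums_agree(void) {
    const uint8_t data[4] = {1, 2, 3, 250};
    assert(sum_handwritten(data, 4) == 256);
    assert(sum_generated_style(data, 4) == sum_handwritten(data, 4));
}
```

Both functions compute the same value, but in the second one the compiler has many more temporaries to assign to registers, which is plausibly where a larger allocation budget helps.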

With --max-allocs-per-node 1000:

With --max-allocs-per-node 50000:


By geijoenr

Champion (328)

08-02-2022, 02:39

coool :D

By santiontanon

Paragon (1662)

08-02-2022, 05:05

This is amazing, great job!! I think this is the first ever MSX program written in Rust! :D

Some questions:
- Would it be possible to add in-line assembler? or would it be hell with this pipeline?
- And about the intermediate llvm to C step. Would it be possible (with a lot of work, of course) to create an llvm to z80 assembler directly? or is that intermediate language far too removed from assembler? I am not sure how the llvm pipeline works in x86/arm world, do they also go to C? or do they generate assembler directly from llvm intermediate language?

Anyway, this is very cool, I really hope people pick this up and use it!

By _ThEcRoW

Master (145)

08-02-2022, 09:28

Rust for MSX. Interesting, I will check it out when I have time. Cool! :)

By Ped7g

Resident (61)

08-02-2022, 12:23

I'm not an expert on the topic, but I think I can answer some of that accurately enough as entry-level knowledge:
- LLVM IR is a lot like machine code for a fictional virtual CPU and machine
- language front-ends usually compile to the LLVM intermediate form, and a "back-end" compiler then builds target machine code directly from it; a C intermediate form is (usually) not used at all, and neither is assembly

Not sure about in-line assembler... Checking the clang docs, the inline assembly extension is largely compatible with gcc, so it probably has some means to parse it and store it in LLVM form, and the LLVM tools are probably capable of working around such a block and preserving it, but I'm not sure what compiling it back into C would do. Answering this would require actually knowing something about LLVM, i.e. not me. :)

LLVM to Z80 directly is theoretically possible, and you can even find several abandoned attempts on GitHub, but the Z80 is really not well suited to the way C is usually compiled for modern CPUs, so when people try to implement it, they usually figure out quickly that it's a lot of pain and the result is mediocre. With enough effort, maybe somebody talented could produce something practical. The z80_babel idea of using AVR along the way may alleviate some of the pain of such an effort. I think this is a somewhat novel approach? At least I haven't heard about it before.

By Grauw

Ascended (10623)

08-02-2022, 12:52

Rust and C++! That’s very cool! And a very interesting and creative approach, too.

santiontanon wrote:

- And about the intermediate llvm to C step. Would it be possible (with a lot of work, of course) to create an llvm to z80 assembler directly? or is that intermediate language far too removed from assembler? I am not sure how the llvm pipeline works in x86/arm world, do they also go to C? or do they generate assembler directly from llvm intermediate language?

I gave it a go a while ago because opening up those languages as salutte achieved now seemed very interesting, and I figured it would be educational as well.

But LLVM is very complex and constantly evolving, so once I realised the scope of the work and that getting it upstreamed and maintained would require continued support for many years, I abandoned it… It would be really cool if someone made that investment though.

By geijoenr

Champion (328)

08-02-2022, 13:21


Grauw wrote:

But LLVM is very complex and constantly evolving, so once I realised the scope of the work and that getting it upstreamed and maintained would require continued support for many years, I abandoned it… It would be really cool if someone made that investment though.

As a matter of fact, some people have been working on a Z80 backend for llvm for a while, see:


The project seems to be alive and moving forward; I'm not sure how stable and usable it is at the moment, though.

By salutte

Master (161)

08-02-2022, 17:38


One of the "complexities" is compiling the same source file with different optimization settings and linking all these compilations together. This is somewhat complex because all these object files "announce" the same symbols, so I have to mangle the object code to rename the symbols. This makes the Makefile more complex than it needs to be for purely functional purposes.
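For comparison only, a source-level alternative to mangling object files is to paste a per-build suffix onto each symbol name with the preprocessor. This is not what the project does; the VARIANT macro, the SYM helper and the checksum function below are hypothetical names chosen for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Two-level paste so that VARIANT is expanded before ## is applied. */
#define PASTE2(a, b) a##_##b
#define PASTE(a, b) PASTE2(a, b)

/* In a real build the suffix would come from the command line, e.g.
 * -DVARIANT=fast for one compilation and -DVARIANT=small for another
 * (hypothetical names). */
#ifndef VARIANT
#define VARIANT demo
#endif

/* Expands to checksum_demo here, or checksum_fast with -DVARIANT=fast. */
#define SYM(name) PASTE(name, VARIANT)

/* The same source yields a differently named symbol for each variant,
 * so the resulting object files can be linked together without clashes. */
uint16_t SYM(checksum)(const uint8_t *p, uint16_t n) {
    assert(p != 0 || n == 0);
    uint16_t acc = 0;
    while (n--) {
        acc = (uint16_t)(acc + *p++);
    }
    return acc;
}
```

The drawback is that the source has to be written with the macro in mind, which does not fit code produced by other front-ends; renaming symbols in the object files works on any input.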

On the other hand, in my short-term todo list is to combine it with Santi's MDL! I bet it will crunch the files quite a bit!

Adding in-line assembly might be possible in C++, and maybe D (I haven't tested it), but I don't think it would work for the other languages. Also, SDCC-specific extensions that are really useful for MSX development (like I/O ports, __banked, etc.) are not supported. There are cases, mostly related to banking and symbol allocation, where mixing ASM and C++ can be useful, but currently I would go for linking .asm and .cpp files together and interfacing them through function calls.

The LLVM intermediate language has the problem that it does not have a stable specification and changes with every major release. So any back-end must be maintained and kept up to date. There have been some Z80 backends in the past that were abandoned, and even the C backend became obsolete a few years ago and was only recently updated to work with current LLVM IR.

If the new Z80 support that @geijoenr mentioned comes to fruition, it will be amazing, and will give us yet another option!

By santiontanon

Paragon (1662)

08-02-2022, 18:45

salutte wrote:

On the other hand, in my short-term todo list is to combine it with Santi's MDL! I bet it will crunch the files quite a bit!

Cool! Happy to add any new optimization patterns that could be useful here :)

By salutte

Master (161)

11-02-2022, 13:11

I am evaluating the clang Z80 backend from jacobly0. It is a reasonable alternative to SDCC, with some advantages and some disadvantages. My first impressions:

On the good side:
* It seems robust enough. It is being used by the community that programs the TI84, which uses the ez80 CPU.
* It supports both C and C++.
* It allows passing structs as function parameters, which is HUGE, and enables interfacing with lots of existing libraries!
* It compiles faster than SDCC, and my impression is that the code "looks nicer".
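As a small illustration of the struct-passing point, here is a sketch in C of the same operation written with structs passed by value and with the pointer-based workaround that is needed when a compiler cannot do that. The Point type and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 2D point type, of the kind many existing libraries define. */
typedef struct {
    int16_t x;
    int16_t y;
} Point;

/* Takes both structs by value and returns one by value. */
Point point_add(Point a, Point b) {
    Point r;
    r.x = (int16_t)(a.x + b.x);
    r.y = (int16_t)(a.y + b.y);
    return r;
}

/* Pointer-based workaround for compilers without by-value struct passing:
 * the caller has to provide storage for the result. */
void point_add_ptr(const Point *a, const Point *b, Point *out) {
    assert(a != 0 && b != 0 && out != 0);
    out->x = (int16_t)(a->x + b->x);
    out->y = (int16_t)(a->y + b->y);
}
```

A library whose public API looks like point_add can only be used unchanged if the compiler supports the first form.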

On the "could be improved" category, but no deal-breaker:
* The assembly generated is not even close to the format used by SDCC, so it is hard for me to currently mix it in my benchmark.
* It lacks documentation (e.g., reading the code, I saw that several calling conventions are supported, but I could not find a description of them).
* It appears to me that the backend is optimized for the ez80 CPU, and not for the bare z80.

My core pain point is that it makes extensive use of "builtin functions", so you need a library named "RTLIB", which is currently only implemented for ez80 CPUs, as far as I could see. SDCC also uses builtin functions, e.g., to multiply two integers, but clang-z80 uses them even when it would be easier to inline the operation.

Example code:

uint16_t j, i;
j = i + i + i;


; hl = 3 * de
	ld	l, e
	ld	h, d
	add	hl, hl
	add	hl, de


; hl = 3 * de
	ex	de, hl
	ld	bc, 3
	call	__smulu ; RTLIB::MUL_I16
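The difference between the two listings can be modelled in C. The first function mirrors the inlined sequence (copy, double, add); the second is a generic 16-bit shift-and-add multiply, shown only as an example of the work a runtime helper has to do when it does not know the constant. It is an assumption for illustration, not the actual implementation of the multiply routine, and both function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the inlined sequence: hl = de; hl += hl; hl += de. */
uint16_t times3_inline(uint16_t de) {
    uint16_t hl = de;
    hl = (uint16_t)(hl + hl);
    hl = (uint16_t)(hl + de);
    return hl;
}

/* Generic 16-bit shift-and-add multiply (illustrative model of a helper).
 * Loops once per significant bit of b; results wrap modulo 65536. */
uint16_t mul16_generic(uint16_t a, uint16_t b) {
    uint16_t result = 0;
    while (b != 0) {
        if (b & 1u) {
            result = (uint16_t)(result + a);
        }
        a = (uint16_t)(a << 1);
        b >>= 1;
    }
    return result;
}

/* The two must agree for a multiplier of 3, including on overflow. */
void check_times3(void) {
    assert(times3_inline(7) == 21);
    assert(mul16_generic(7, 3) == 21);
    assert(times3_inline(30000) == mul16_generic(30000, 3));
}
```

The inlined form is a fixed handful of instructions, while the generic routine adds a call plus a data-dependent loop, which is the cost the post is pointing at.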

The list of functions clang-z80 requires can be found in:

  setLibcall(RTLIB::ZEXT_I16_I24,     "_stoiu",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SEXT_I16_I24,     "_stoi",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SEXT_I24_I32,     "_itol",     CallingConv::Z80_LibCall   );

  setLibcall(RTLIB::NOT_I16,          "_snot",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::NOT_I24,          "_inot",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::NOT_I32,          "_lnot",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::NOT_I64,          "_llnot",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::AND_I16,          "_sand",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::AND_I24,          "_iand",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::AND_I32,          "_land",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::AND_I64,          "_lland",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::OR_I16,           "_sor",      CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::OR_I24,           "_ior",      CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::OR_I32,           "_lor",      CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::OR_I64,           "_llor",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::XOR_I16,          "_sxor",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::XOR_I24,          "_ixor",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::XOR_I32,          "_lxor",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::XOR_I64,          "_llxor",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SHL_I8,           "_bshl",     CallingConv::Z80_LibCall_AB);
  setLibcall(RTLIB::SHL_I16,          "_sshl",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SHL_I16_I8,       "_sshl_b",   CallingConv::Z80_LibCall_AC);
  setLibcall(RTLIB::SHL_I24,          "_ishl",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SHL_I24_I8,       "_ishl_b",   CallingConv::Z80_LibCall_AC);
  setLibcall(RTLIB::SHL_I32,          "_lshl",     CallingConv::Z80_LibCall_L );
  setLibcall(RTLIB::SHL_I64,          "_llshl",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SRA_I8,           "_bshrs",    CallingConv::Z80_LibCall_AB);
  setLibcall(RTLIB::SRA_I16,          "_sshrs",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SRA_I16_I8,       "_sshrs_b",  CallingConv::Z80_LibCall_AC);
  setLibcall(RTLIB::SRA_I24,          "_ishrs",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SRA_I24_I8,       "_ishrs_b",  CallingConv::Z80_LibCall_AC);
  setLibcall(RTLIB::SRA_I32,          "_lshrs",    CallingConv::Z80_LibCall_L );
  setLibcall(RTLIB::SRA_I64,          "_llshrs",   CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SRL_I8,           "_bshru",    CallingConv::Z80_LibCall_AB);
  setLibcall(RTLIB::SRL_I16,          "_sshru",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SRL_I16_I8,       "_sshru_b",  CallingConv::Z80_LibCall_AC);
  setLibcall(RTLIB::SRL_I24,          "_ishru",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SRL_I24_I8,       "_ishru_b",  CallingConv::Z80_LibCall_AC);
  setLibcall(RTLIB::SRL_I32,          "_lshru",    CallingConv::Z80_LibCall_L );
  setLibcall(RTLIB::SRL_I64,          "_llshru",   CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::CMP_I32,          "_lcmpu",    CallingConv::Z80_LibCall_F );
  setLibcall(RTLIB::CMP_I64,          "_llcmpu",   CallingConv::Z80_LibCall_F );
  setLibcall(RTLIB::CMP_I16_0,        "_scmpzero", CallingConv::Z80_LibCall_F );
  setLibcall(RTLIB::CMP_I24_0,        "_icmpzero", CallingConv::Z80_LibCall_F );
  setLibcall(RTLIB::CMP_I32_0,        "_lcmpzero", CallingConv::Z80_LibCall_F );
  setLibcall(RTLIB::CMP_I64_0,        "_llcmpzero",CallingConv::Z80_LibCall_F );
  setLibcall(RTLIB::SCMP,             "_setflag",  CallingConv::Z80_LibCall_F );
  setLibcall(RTLIB::NEG_I16,          "_sneg",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::NEG_I24,          "_ineg",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::NEG_I32,          "_lneg",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::NEG_I64,          "_llneg",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::ADD_I32,          "_ladd",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::ADD_I32_I8,       "_ladd_b",   CallingConv::Z80_LibCall_AC);
  setLibcall(RTLIB::ADD_I64,          "_lladd",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SUB_I32,          "_lsub",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SUB_I64,          "_llsub",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::MUL_I8,           "_bmulu",    CallingConv::Z80_LibCall_BC);
  setLibcall(RTLIB::MUL_I16,          "_smulu",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::MUL_I24,          "_imulu",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::MUL_I24_I8,       "_imul_b",   CallingConv::Z80_LibCall_AC);
  setLibcall(RTLIB::MUL_I32,          "_lmulu",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::MUL_I64,          "_llmulu",   CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SDIV_I8,          "_bdivs",    CallingConv::Z80_LibCall_BC);
  setLibcall(RTLIB::SDIV_I16,         "_sdivs",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SDIV_I24,         "_idivs",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SDIV_I32,         "_ldivs",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SDIV_I64,         "_lldivs",   CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::UDIV_I8,          "_bdivu",    CallingConv::Z80_LibCall_BC);
  setLibcall(RTLIB::UDIV_I16,         "_sdivu",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::UDIV_I24,         "_idivu",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::UDIV_I32,         "_ldivu",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::UDIV_I64,         "_lldivu",   CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SREM_I8,          "_brems",    CallingConv::Z80_LibCall_AC);
  setLibcall(RTLIB::SREM_I16,         "_srems",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SREM_I24,         "_irems",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SREM_I32,         "_lrems",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SREM_I64,         "_llrems",   CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::UREM_I8,          "_bremu",    CallingConv::Z80_LibCall_AC);
  setLibcall(RTLIB::UREM_I16,         "_sremu",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::UREM_I24,         "_iremu",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::UREM_I32,         "_lremu",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::UREM_I64,         "_llremu",   CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::UDIVREM_I24,      "_idvrmu",   CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::UDIVREM_I32,      "_ldvrmu",   CallingConv::Z80_LibCall   );

  setLibcall(RTLIB::POPCNT_I8,        "_bpopcnt",  CallingConv::Z80_LibCall_AC);
  setLibcall(RTLIB::POPCNT_I16,       "_spopcnt",  CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::POPCNT_I24,       "_ipopcnt",  CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::POPCNT_I32,       "_lpopcnt",  CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::POPCNT_I64,       "_llpopcnt", CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::BITREV_I8,        "_bbitrev",  CallingConv::Z80_LibCall_AC);
  setLibcall(RTLIB::BITREV_I16,       "_sbitrev",  CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::BITREV_I24,       "_ibitrev",  CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::BITREV_I32,       "_lbitrev",  CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::BITREV_I64,       "_llbitrev", CallingConv::Z80_LibCall   );

  setLibcall(RTLIB::ADD_F32,          "_fadd",     CallingConv::Z80_LibCall_L );
  setLibcall(RTLIB::SUB_F32,          "_fsub",     CallingConv::Z80_LibCall_L );
  setLibcall(RTLIB::MUL_F32,          "_fmul",     CallingConv::Z80_LibCall_L );
  setLibcall(RTLIB::DIV_F32,          "_fdiv",     CallingConv::Z80_LibCall_L );
  setLibcall(RTLIB::REM_F32,          "_frem",     CallingConv::Z80_LibCall_L );
  setLibcall(RTLIB::NEG_F32,          "_fneg",     CallingConv::Z80_LibCall_L );
  setLibcall(RTLIB::CMP_F32,          "_fcmp",     CallingConv::Z80_LibCall_F );
  setLibcall(RTLIB::FPTOSINT_F32_I32, "_ftol",     CallingConv::Z80_LibCall_L );
  setLibcall(RTLIB::FPTOUINT_F32_I32, "_ftol",     CallingConv::Z80_LibCall_L );
  setLibcall(RTLIB::SINTTOFP_I32_F32, "_ltof",     CallingConv::Z80_LibCall_L );
  setLibcall(RTLIB::UINTTOFP_I32_F32, "_ultof",    CallingConv::Z80_LibCall_L );

  setLibcall(RTLIB::ADD_F64,          "_dadd",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SUB_F64,          "_dsub",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::MUL_F64,          "_dmul",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::DIV_F64,          "_ddiv",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::REM_F64,          "_drem",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::NEG_F64,          "_dneg",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::CMP_F64,          "_dcmp",     CallingConv::Z80_LibCall_F );
  setLibcall(RTLIB::FPEXT_F32_F64,    "_ftod",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::FPROUND_F64_F32,  "_dtof",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::FPTOSINT_F64_I32, "_dtol",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::FPTOUINT_F64_I32, "_dtoul",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SINTTOFP_I32_F64, "_ltod",     CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::UINTTOFP_I32_F64, "_ultod",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::FPTOSINT_F64_I64, "_dtoll",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::FPTOUINT_F64_I64, "_dtoll",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::SINTTOFP_I64_F64, "_lltod",    CallingConv::Z80_LibCall   );
  setLibcall(RTLIB::UINTTOFP_I64_F64, "_ulltod",   CallingConv::Z80_LibCall   );
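When reimplementing such helpers in assembly, it can help to have plain C reference models to test against. Below is a minimal sketch for three 16-bit entries from the list. The arithmetic meaning is inferred from the RTLIB enum names, which is an assumption; the register-level calling conventions are not modelled at all, and the model_ function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Model for POPCNT_I16 ("_spopcnt"): number of set bits. */
uint8_t model_spopcnt(uint16_t v) {
    uint8_t count = 0;
    while (v != 0) {
        v &= (uint16_t)(v - 1u); /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Model for BITREV_I16 ("_sbitrev"): reverse the order of the 16 bits. */
uint16_t model_sbitrev(uint16_t v) {
    uint16_t r = 0;
    for (uint8_t i = 0; i < 16; i++) {
        r = (uint16_t)((r << 1) | (v & 1u));
        v >>= 1;
    }
    return r;
}

/* Model for UDIV_I16 and UREM_I16 ("_sdivu", "_sremu"): bit-by-bit
 * restoring division. Returns the quotient; stores the remainder in *rem
 * when rem is not null. */
uint16_t model_sdivu(uint16_t n, uint16_t d, uint16_t *rem) {
    assert(d != 0);
    uint16_t q = 0;
    uint32_t r = 0;
    for (int8_t bit = 15; bit >= 0; bit--) {
        r = (r << 1) | ((uint32_t)(n >> bit) & 1u);
        q = (uint16_t)(q << 1);
        if (r >= d) {
            r -= d;
            q |= 1u;
        }
    }
    if (rem != 0) {
        *rem = (uint16_t)r;
    }
    return q;
}
```

For example, model_sdivu(1000, 7, &rem) returns 142 and leaves 6 in rem.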

By geijoenr

Champion (328)

11-02-2022, 13:35

If the biggest problem is those libcalls, it doesn't seem that bad at all. We can rewrite RTLIB for Z80 (or R800 ;) ).

I will give it a go as well, I wasn't expecting it to just work to be honest.
