I’ve been very busy for the last few months, with many different things. This post will therefore be a bit of a mess, unfortunately. I’m at a place where I’ve been before a few times: not knowing the most interesting road to explore, since there are so many, and they are all fascinating and rewarding in their own way.
First up though, drawing a line – albeit a dotted line – under MAXI030.
I should have done this months ago, but I’ve finally put the KiCAD files (schematics and PCB design) on github. This repo should, in theory, include everything someone needs to build their own MAXI030, including the Quartus VHDL project files for the FPGA. It also includes the schematics and PCB design for the three expansion cards I’ve built (TestSRAM, Printer+Joysticks and the Cyclone II based video card). The only thing missing on that front is the VHDL for my graphics card design, but that is only because the code is currently in a terrible state.
The other two MAXI030-related repos are:
- 68K monitor – My machine-code monitor, including the RTL8019AS (PDF) code for downloading the Linux kernel.
- Linux – My Linux port, up to date with upstream HEAD as of a few weeks ago. The branch lm/maxi030 has the changes needed by MAXI030 on it.
If anyone is serious about building their own MAXI030, and ideally thinks they would be able to assist me with extending the programmable logic functionality and the Linux port, I’d be willing to supply a MAXI030 PCB and an EPF10K130E FPGA, and possibly some other parts, at zero cost, provided the shipping is paid for. I have around four of each spare at present. Let me know in the comment section below.
I’ve also added a project page on hackaday.io, just for completeness. The main source of information on MAXI030 will continue to be this blog.
I’ve also recorded a brief video that goes over the usage of the MAXI030 console:
Lastly on the MAXI030 front, I have obtained from mouser.com some alternative clock oscillators for the board. In a previous post I discussed the problems I’d had running the board at 40MHz, and was reduced to running it at 20MHz. I’ve since swapped the oscillator for a 32MHz part and am happy to report the board is running well at this speed, including running Linux. The video above was recorded prior to this oscillator change.
Next, MAXI030’s Linux port has inspired me to examine how to set up Linux to run on a generic ARM board. Because I had it to hand, I thought it would be fun to fire up Debian Linux on my Terasic DE10 Nano board. I originally bought this board for running MiSTer, the FPGA-assisted emulation platform. All I’ve really done is follow a tutorial, but with a couple of twists.
Firstly, the userland rootfs is essentially the same as MAXI030’s, except that it is up to date. I used a modified version of the script I wrote for building MAXI030’s Debian root filesystem; the main difference is that the software pulled in comes from an up-to-date Debian.
Secondly I have tweaked the kernel configuration slightly. I thought it would be fun to play with Linux’s USB “gadget” support. Since the DE10 Nano’s USB port is OTG (On The Go) enabled, it is possible to use the port as a device connection, as well as a host. For now I’ve just played with the Linux kernel presenting the port as a USB storage device. Other possibilities exist.
The main goal of running my own Linux kernel on the DE10 Nano is to investigate linking up the Linux kernel to programmable logic implemented in the FPGA side of the SoC on the board. This should be where things get extremely interesting. There are many possibilities, from compute acceleration to a graphical controller attached to the board’s HDMI port. To start with it will be something extremely simple, like a hardware register, and a driver in the kernel, which turns an LED on and off.
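As a rough idea of what the “hardware register plus LED” starting point involves, here is a minimal sketch of the read-modify-write logic such a driver would need. The register layout (bit 0 drives the LED) and all of the names are assumptions made for illustration; nothing here reflects an actual design on the FPGA side. The register is reached through a plain volatile pointer so the same code can be exercised against an ordinary variable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: bit 0 of a 32-bit control register drives the LED. */
#define LED_BIT 0u

/* Set or clear a single bit with a read-modify-write, leaving other bits alone. */
void reg_write_bit(volatile uint32_t *reg, unsigned bit, bool on)
{
    assert(bit < 32);
    uint32_t value = *reg;
    if (on)
        value |= (UINT32_C(1) << bit);
    else
        value &= ~(UINT32_C(1) << bit);
    *reg = value;
}

/* Turn the (hypothetical) LED on or off. */
void led_set(volatile uint32_t *led_reg, bool on)
{
    reg_write_bit(led_reg, LED_BIT, on);
}

/* Read back the current LED state from the register. */
bool led_get(const volatile uint32_t *led_reg)
{
    return ((*led_reg >> LED_BIT) & 1u) != 0;
}
```

In a real kernel driver the pointer would come from mapping the device’s address range rather than from an ordinary variable, and the accesses would go through the kernel’s I/O accessors, but the bit manipulation would look much the same.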
So to LLVM, then.
Prior to the above activities, I spent a couple of months getting to grips with (parts of) the LLVM code base with the goal of implementing a backend for my 32 bit softcore processor. This was and is an almost vertical learning curve, but I’m pleased with the progress I have made all the same. This post is not, and cannot be, a tutorial on writing an LLVM backend, but rather a list of references to information I found useful when starting this project, and a summary of my progress. For completeness I should state that prior to starting this project I knew next to nothing about compilers, and even now I would call myself a rank-amateur at best.
An LLVM backend consists of many different components and involves changes to various areas of LLVM, but the work is concentrated in what LLVM terms the Target area.
Luckily there are various tutorials and examples available to learn from. I found the following to be extremely useful, but naturally no one tutorial describes it all.
First up, documentation from the LLVM project itself. This is a good starting point, but you must be fairly familiar with the wider LLVM code base to get a lot out of it; you cannot make use of this on its own, which is what I tried to do.
Next we have a tutorial describing creating a backend for the hypothetical Cpu0 architecture. This architecture is actually very similar to my 32 bit processor: 32 bit, with 16 general purpose registers. It’s extremely detailed, and probably 80% of the text is sample code. I found it to be full of detail, but not much in the way of a higher level view of what the code was doing.
I also found a set of articles on a backend for another hypothetical processor, the cunningly named RISCW, to be excellent. It does not delve deeply, but it covers exactly what you need to know to make a start on a backend perfectly. There is also a github repo with the full source for the, albeit incomplete, target.
I’m keen to learn more about LLVM, beyond what is needed to write a target, so I also bought an old fashioned dead-tree book, Learn LLVM 12: A beginner’s guide to learning LLVM compiler tools and core libraries with C++. It has a chapter on writing a backend, this time for the Motorola MC88K, though of course it’s only a chapter in one book so the amount of detail it can go into is limited. Overall the book seems to give a good overview of nearly every aspect of LLVM. The code for the backend has been posted on github so it can be read without purchasing the book, though I would encourage you to do so if you find the source useful.
The last major source for inspiration and information has been other targets in the LLVM repo. The main backend I’ve utilised for this has been the one for the MSP430. Somewhat surprisingly, this is a 16 bit MCU produced by Texas Instruments, and it would appear to be nothing like my processor in terms of its capabilities, yet I found the target code clearly laid out and easy to follow. At first glance it would seem that the RISC-V LLVM backend would have a wealth of code to read and learn from, but since it is an extremely complex and fully featured architecture its usefulness as a learning tool, at least for me, is limited. The same can be said for the M68k target, recently added. The M68k is a large complex instruction set architecture so, again, its usefulness is fairly limited.
So far my LLVM target is pretty limited. The assembler is relatively complete and, since I have made a start on a linker, it is possible to link multiple objects together to create a linked executable, within certain constraints.
The compiler is much less far along. I am able to compile very simple programs, but again there are constraints and problems. In particular the implemented calling convention limits function calls to four parameters, which are passed via registers. An illustration of a successful compile is that of a strcmp() implementation:
    int strcmp(const char *s1, const char *s2)
    {
        while (*s1 == *s2++)
            if (*s1++ == 0)
                return (0);
        return (*(unsigned char *)s1 - *(unsigned char *)--s2);
    }
This code is compiled into LLVM’s Intermediate Representation using the following clang command line:
bin/clang -O2 -S -emit-llvm --target=cpu32 strcmp.c
The resultant file, strcmp.ll, is then compiled into CPU32 assembly and the result is the following, which includes the encoded code stream in the comments:
        .text
        .file   "strcmp.c"
        .globl  strcmp                  ; -- Begin function strcmp
        .p2align 2
        .type   strcmp,@function
    strcmp:                             ; @strcmp
    ; %bb.0:                            ; %entry
        push    r14                     ; encoding: [0x60,0xef,0x00,0x00]
        copy    r14, r15                ; encoding: [0x70,0xef,0x00,0x00]
        push    r4                      ; encoding: [0x60,0x4f,0x00,0x00]
        copy    r2, r0                  ; encoding: [0x70,0x20,0x00,0x00]
        loadi.l r0, #0                  ; encoding: [0x10,0x00,0x00,0x00,0x00,0x00,0x00,0x00]
    .LBB0_1:                            ; %while.cond
                                        ; =>This Inner Loop Header: Depth=1
        load.bu r4, 0(r1)               ; encoding: [0x24,0x41,0x00,0x00,0x00,0x00,0x00,0x00]
        load.bu r3, 0(r2)               ; encoding: [0x24,0x32,0x00,0x00,0x00,0x00,0x00,0x00]
        compare r3, r4                  ; encoding: [0x40,0x03,0x84,0x00]
        branchne .LBB0_3                ; encoding: [0x31,0x0f,0x20,0x00,A,A,A,A]
                                        ;   fixup A - offset: 4, value: .LBB0_3, kind: FK_Data_4
        branch  .LBB0_2                 ; encoding: [0x31,0x0f,0x00,0x00,A,A,A,A]
                                        ;   fixup A - offset: 4, value: .LBB0_2, kind: FK_Data_4
    .LBB0_2:                            ; %while.body
                                        ;   in Loop: Header=BB0_1 Depth=1
        addq    r2, r2, #1              ; encoding: [0x50,0x22,0x00,0x01]
        addq    r1, r1, #1              ; encoding: [0x50,0x11,0x00,0x01]
        compare r3, #0                  ; encoding: [0x42,0x03,0x80,0x00,0x00,0x00,0x00,0x00]
        brancheq .LBB0_4                ; encoding: [0x31,0x0f,0x10,0x00,A,A,A,A]
                                        ;   fixup A - offset: 4, value: .LBB0_4, kind: FK_Data_4
        branch  .LBB0_1                 ; encoding: [0x31,0x0f,0x00,0x00,A,A,A,A]
                                        ;   fixup A - offset: 4, value: .LBB0_1, kind: FK_Data_4
    .LBB0_3:                            ; %while.end
        sub     r0, r3, r4              ; encoding: [0x40,0x03,0x24,0x00]
    .LBB0_4:                            ; %return
        pop     r4                      ; encoding: [0x61,0x4f,0x00,0x00]
        pop     r14                     ; encoding: [0x61,0xef,0x00,0x00]
        return                          ; encoding: [0x38,0x0f,0x00,0x00]
    .Lfunc_end0:
        .size   strcmp, .Lfunc_end0-strcmp
                                        ; -- End function
        .ident  "clang version 15.0.0 (https://github.com/llvm/llvm-project.git c1488e916dca64ab4c43bb7612fe3456c8dc6cfc)"
        .section ".note.GNU-stack","",@progbits
The command line to generate this is as follows:
bin/llc strcmp.ll --show-mc-encoding -o strcmp.s
Note that the “fixups” are for symbols which will be resolved at the linking stage. There are a number of things about the generated code which are suboptimal:
- It makes no use of “quick mode” instructions, i.e. all immediates are in the following longword instead of being embedded with the instruction.
- There are redundant branches, where the branch takes the program counter to the next instruction.
- R14 is used for the frame pointer, but no frame is used.
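The second point, a branch whose target is simply the next instruction, is the sort of thing a small peephole cleanup can remove. Below is a minimal sketch of that idea, working on a toy list of labels and instructions rather than on LLVM’s own data structures. The `item` representation and the function names are invented for this illustration, and only unconditional branches are considered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy program representation, invented for this sketch. */
enum item_kind { ITEM_LABEL, ITEM_BRANCH, ITEM_OTHER };

struct item {
    enum item_kind kind;
    const char *text; /* label name, branch target, or opaque instruction text */
};

/* Nonzero if `target` is among the labels that directly follow index `from`. */
int falls_through_to(const struct item *items, size_t count,
                     size_t from, const char *target)
{
    for (size_t i = from; i < count && items[i].kind == ITEM_LABEL; i++)
        if (strcmp(items[i].text, target) == 0)
            return 1;
    return 0;
}

/* Drop unconditional branches to the immediately following label.
 * Compacts the array in place and returns the new item count. */
size_t remove_fallthrough_branches(struct item *items, size_t count)
{
    assert(items != NULL || count == 0);
    size_t out = 0;
    for (size_t i = 0; i < count; i++) {
        if (items[i].kind == ITEM_BRANCH &&
            falls_through_to(items, count, i + 1, items[i].text))
            continue; /* execution would reach the target anyway */
        items[out++] = items[i];
    }
    return out;
}
```

Applied to the listing above, this would drop the `branch .LBB0_2` that sits directly before the `.LBB0_2` label while keeping the backward `branch .LBB0_1`. In LLVM itself this kind of cleanup is normally left to the generic branch folding machinery, which depends on target hooks such as `analyzeBranch`; whether missing hooks are the cause here is only a guess.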
I have not actually run (or simulated) the above code, though I have successfully simulated even simpler programs. Looking at the above code I suspect it would run, though not very efficiently for the reasons given above. But compiling code that is only slightly less trivial than this string compare function usually fails for reasons I have yet to investigate.
The goal for this compiler project has been reduced from a fully featured C compiler to one where I would be happy if it compiled a subset of C, for instance one where it only supported a limited number of arguments to functions, amongst other limitations.
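To give a flavour of what living within such a subset might look like, here is a small sketch of one common workaround for a limit on register-passed arguments: bundling the parameters into a struct and passing a single pointer. The struct, its fields and the function are all hypothetical, and the approach assumes the compiler can already handle structs held in memory, which I have not verified for my target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: six logical parameters bundled into one struct. */
struct rect_fill_args {
    int x, y, width, height;
    unsigned colour;
    unsigned flags;
};

/* Takes a single pointer, so the call needs only one argument register.
 * Returns the number of pixels the fill would cover (0 for empty rectangles). */
long rect_fill_area(const struct rect_fill_args *args)
{
    assert(args != NULL);
    if (args->width <= 0 || args->height <= 0)
        return 0;
    return (long)args->width * (long)args->height;
}
```

The caller fills in the struct and passes its address, so a function with six logical parameters still fits a four-register calling convention, at the cost of extra loads inside the callee.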
Working on this compiler has been very educational. LLVM is an amazingly approachable codebase, and it is fantastic that it includes functionality which the GNU project spreads across multiple projects. Today everything from a preprocessor to a linker is under the LLVM umbrella, which makes working on the full compiler toolchain for a brand new, otherwise unknown processor possible without having to learn multiple codebases. LLVM is massive, though!
My LLVM fork needs cleaning up before I push it onto github. The plan is to then try running some more compiled programs under my simulation framework. I can then look at running compiled code in hardware on my DE2-115 Cyclone IV development board. This would be quite the milestone for my projects…