LLVM comes with a great tutorial that builds a compiler for a simple language called Kaleidoscope. The compiler parses Kaleidoscope into an AST, from which LLVM code is then generated using the LLVM IR building APIs. Once we have LLVM IR, it can be JITed to generate machine code and run it. In other words, convert your language into LLVM IR and leave the rest to LLVM itself (including world-class optimizations).
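To make the AST stage of that pipeline concrete, here is a minimal plain-Python sketch. The node class names are borrowed from the LLVM tutorial's C++ code; the `dump` helper is hypothetical and exists only for this illustration. The code generation and JIT stages need llvmlite and are left out.

```python
class NumberExprAST:
    """A numeric literal such as 2.0."""
    def __init__(self, value):
        self.value = value

    def dump(self):
        return str(self.value)


class VariableExprAST:
    """A reference to a named variable."""
    def __init__(self, name):
        self.name = name

    def dump(self):
        return self.name


class BinaryExprAST:
    """A binary operation with an operator and two operand subtrees."""
    def __init__(self, op, lhs, rhs):
        self.op = op
        self.lhs = lhs
        self.rhs = rhs

    def dump(self):
        return '({} {} {})'.format(self.op, self.lhs.dump(), self.rhs.dump())


class CallExprAST:
    """A function call with a callee name and a list of argument subtrees."""
    def __init__(self, callee, args):
        self.callee = callee
        self.args = args

    def dump(self):
        return '({})'.format(
            ' '.join([self.callee] + [arg.dump() for arg in self.args]))


# Hand-built tree for the Kaleidoscope expression: a + b * 2
tree = BinaryExprAST('+',
                     VariableExprAST('a'),
                     BinaryExprAST('*',
                                   VariableExprAST('b'),
                                   NumberExprAST(2.0)))
print(tree.dump())  # prints (+ a (* b 2.0))
```

A code generator walks such a tree bottom-up, which is why each node keeps its children as attributes rather than as flat text.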
The tutorial is presented in several "chapters" that start with a simple lexer and build up the language step by step.
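As a rough idea of what that first step looks like, here is a minimal lexer sketch in plain Python. It is not the code from this repository: `TokenKind`, `Token` and `tokenize` are names assumed for the illustration, and the token categories follow the LLVM tutorial's first chapter (`def`, `extern`, identifiers, numbers, single-character operators, `#` comments).

```python
from collections import namedtuple
from enum import Enum


class TokenKind(Enum):
    EOF = 'eof'
    DEF = 'def'
    EXTERN = 'extern'
    IDENTIFIER = 'identifier'
    NUMBER = 'number'
    OPERATOR = 'operator'


Token = namedtuple('Token', 'kind value')

KEYWORDS = {'def': TokenKind.DEF, 'extern': TokenKind.EXTERN}


def tokenize(source):
    """Yield Tokens for a Kaleidoscope source string, ending with EOF."""
    pos = 0
    end = len(source)
    while pos < end:
        ch = source[pos]
        if ch.isspace():
            pos += 1
        elif ch == '#':
            # A comment runs to the end of the line.
            while pos < end and source[pos] != '\n':
                pos += 1
        elif ch.isalpha():
            # Identifier or keyword: a letter followed by letters/digits.
            start = pos
            while pos < end and source[pos].isalnum():
                pos += 1
            word = source[start:pos]
            yield Token(KEYWORDS.get(word, TokenKind.IDENTIFIER), word)
        elif ch.isdigit() or ch == '.':
            # Number: a run of digits and dots, kept as text here.
            start = pos
            while pos < end and (source[pos].isdigit() or source[pos] == '.'):
                pos += 1
            yield Token(TokenKind.NUMBER, source[start:pos])
        else:
            # Anything else is a single-character operator or punctuation.
            yield Token(TokenKind.OPERATOR, ch)
            pos += 1
    yield Token(TokenKind.EOF, '')


if __name__ == '__main__':
    for token in tokenize('def foo(x) x + 4.5  # add a constant'):
        print(token)
```

Writing the lexer as a generator lets a parser pull one token at a time without building the whole token list first.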
This repository contains a chapter-by-chapter translation of the LLVM tutorial into Python, using the llvmlite package that exposes LLVM to Python. For tips on setting up llvmlite with LLVM, see this blog post.
This repository is fairly complete - the whole Kaleidoscope language is implemented. The only thing missing is Chapter 8 - Debug Information, because llvmlite does not yet support convenient emission of debug info.
Go through the LLVM tutorial. The files in this repository are named after tutorial chapters and roughly correspond to the C++ code presented in the tutorial. In each source file, the __main__ section at the bottom is a small usage sample, and there are also unit tests that check a variety of cases.
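The layout described above can be sketched as follows. This is a hypothetical file, not one from the repository: `binop_precedence` and `TestPrecedence` are made-up names, and the precedence values are assumed to be those used in the LLVM tutorial's parser chapter.

```python
import unittest

# Binary operator precedences (assumed from the LLVM tutorial's parser);
# a higher number binds tighter.
BINOP_PRECEDENCE = {'<': 10, '+': 20, '-': 20, '*': 40}


def binop_precedence(op):
    """Return the precedence of a binary operator, or -1 if unknown."""
    return BINOP_PRECEDENCE.get(op, -1)


class TestPrecedence(unittest.TestCase):
    """Unit tests live in the same file as the code they check."""

    def test_mul_binds_tighter_than_add(self):
        self.assertGreater(binop_precedence('*'), binop_precedence('+'))

    def test_unknown_operator(self):
        self.assertEqual(binop_precedence('?'), -1)


if __name__ == '__main__':
    # Small usage sample, run only when the file is executed directly.
    for op in '<+-*':
        print(op, binop_precedence(op))
```

Running the file directly executes the sample, while `python -m unittest` picks up the test class without running the `__main__` section.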
I tested with Python 3.4, LLVM 3.5 and top-of-tree llvmlite.
Some of the files have unit test classes in them. To run all unit tests:
$ python3.4 -m unittest discover -p "*.py"
The most interesting program written in Kaleidoscope is the Mandelbrot set generator in chapter 6:
[Program output: ASCII-art rendering of the Mandelbrot set, drawn with *, +, . and space characters.]
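The escape-time algorithm behind this output can be sketched in plain Python. This is not the Kaleidoscope source itself; the function names are made up for the illustration, and the grid size (78 by 40), the starting point (-2.3, -1.3), the step sizes (0.05, 0.07) and the density thresholds are assumed to match the LLVM tutorial's example.

```python
def density_char(iterations):
    """Map an iteration count to a character; more iterations, lighter char."""
    if iterations > 8:
        return ' '
    if iterations > 4:
        return '.'
    if iterations > 2:
        return '+'
    return '*'


def mandel_converge(creal, cimag, max_iters=255):
    """Count iterations of z = z*z + c (starting at z = c) until |z|^2 > 4."""
    real, imag, iters = creal, cimag, 0
    while iters <= max_iters and real * real + imag * imag <= 4:
        real, imag = (real * real - imag * imag + creal,
                      2 * real * imag + cimag)
        iters += 1
    return iters


def mandel_rows(real_start, imag_start, real_step, imag_step,
                cols=78, rows=40):
    """Return the picture as a list of strings, one per row."""
    picture = []
    for j in range(rows):
        y = imag_start + j * imag_step
        picture.append(''.join(
            density_char(mandel_converge(real_start + i * real_step, y))
            for i in range(cols)))
    return picture


if __name__ == '__main__':
    print('\n'.join(mandel_rows(-2.3, -1.3, 0.05, 0.07)))
```

Points that escape quickly are drawn as `*`, while points that never escape within the iteration limit (the set itself) are drawn as spaces.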