While at the Recurse Center, I’ve finished up Thorsten Ball’s book, Writing a Compiler in Go. Thorsten’s book is an excellent resource for learning about Compilers and Virtual Machines thanks to its well-thought-out explanations and diagrams. I presented the final product during RC’s weekly technical talks, covering how the Compiler pieces fit together and some of the challenges faced when translating an Interpreter into a Compiler.
Context
Picking up where Thorsten’s first book, Writing an Interpreter in Go, left off, we start with the Interpreter implemented there. If you’d like to know more about the Interpreter, check out the blog post preceding this one; I’ll give a rough overview below. The Interpreter has 3 important parts, which chain together as sketched below:
- Lexer
- Parser
- Evaluator
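Here’s a rough sketch of how those three pieces chain together. The package and function names (lexer.New, parser.New, evaluator.Eval, object.NewEnvironment) follow the layout of the book’s Monkey codebase, but treat the import paths as placeholders for however the module is laid out locally.

```go
package main

import (
	"fmt"

	"monkey/evaluator"
	"monkey/lexer"
	"monkey/object"
	"monkey/parser"
)

func main() {
	input := `let x = 5; x + 10;`

	l := lexer.New(input) // source text -> tokens
	p := parser.New(l)    // tokens -> AST
	program := p.ParseProgram()

	env := object.NewEnvironment()
	result := evaluator.Eval(program, env) // walk the AST and execute it
	fmt.Println(result.Inspect())          // 15
}
```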
The Lexer and Parser are considered the language’s front end. The front end deals with:
- Reading the submitted program
- Breaking it up into its smallest pieces
- Recombining the pieces, analyzing them, validating types + syntax, and then outputting an Abstract Syntax Tree (AST) representing the input program, as sketched below
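To make that concrete, here’s a minimal, self-contained sketch (not the book’s exact types) of the two things the front end deals in: tokens out of the Lexer and AST nodes out of the Parser, built by hand for the Monkey expression 1 + 2.

```go
package main

import "fmt"

// Token is the Lexer's output: the smallest meaningful piece of source text.
type Token struct {
	Type    string // e.g. "INT", "PLUS", "IDENT"
	Literal string // the raw text, e.g. "1", "+", "x"
}

// Node is what the Parser builds: just enough of an AST for `1 + 2`.
type Node interface{ String() string }

type IntegerLiteral struct{ Value int64 }

type InfixExpression struct {
	Left     Node
	Operator string
	Right    Node
}

func (i *IntegerLiteral) String() string { return fmt.Sprintf("%d", i.Value) }
func (e *InfixExpression) String() string {
	return fmt.Sprintf("(%s %s %s)", e.Left.String(), e.Operator, e.Right.String())
}

func main() {
	// The Lexer would break `1 + 2` into tokens like these...
	tokens := []Token{{"INT", "1"}, {"PLUS", "+"}, {"INT", "2"}}

	// ...and the Parser would recombine them into this AST.
	ast := &InfixExpression{
		Left:     &IntegerLiteral{Value: 1},
		Operator: "+",
		Right:    &IntegerLiteral{Value: 2},
	}
	fmt.Println(tokens, ast) // [{INT 1} {PLUS +} {INT 2}] (1 + 2)
}
```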
The backend of the Interpreter deals with:
- Execution of the program via the evaluator
The Interpreter’s backend is straightforward; it requires only one piece. The evaluator is a tree-walking evaluator: it walks the AST from start to finish, executing the Monkey code it represents in native Go.
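Here’s a minimal sketch of what “tree-walking” means, trimmed down to integer arithmetic. The real Monkey evaluator handles many more node types (let bindings, functions, conditionals) and wraps values in object types, but the recursive shape is the same.

```go
package main

import "fmt"

type Node interface{}

type IntegerLiteral struct{ Value int64 }

type InfixExpression struct {
	Left, Right Node
	Operator    string
}

// Eval walks the tree recursively, evaluating children before parents.
func Eval(node Node) int64 {
	switch n := node.(type) {
	case *IntegerLiteral:
		return n.Value
	case *InfixExpression:
		left, right := Eval(n.Left), Eval(n.Right)
		switch n.Operator {
		case "+":
			return left + right
		case "-":
			return left - right
		}
	}
	return 0
}

func main() {
	// The AST for the Monkey expression `(1 + 2) - 4`.
	tree := &InfixExpression{
		Left:     &InfixExpression{Left: &IntegerLiteral{1}, Operator: "+", Right: &IntegerLiteral{2}},
		Operator: "-",
		Right:    &IntegerLiteral{4},
	}
	fmt.Println(Eval(tree)) // -1
}
```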
Performance challenges
- The evaluator does a lot of work to evaluate the AST at runtime.
- Maybe we can improve this by introducing some optimizations?
- But since the evaluator already spends its cycles executing the code at runtime, any optimization pass would have to run then as well, which makes optimizations a hard sell.
This brings us to the next iteration, which takes the Interpreter built for Monkey and replaces its backend with something more complex. Submitting a program to the Interpreter means it is immediately executed. What if we separated this into two phases, compilation and execution? That added complexity allows us to increase the language’s performance.
Speeding up Monkey
Speeding up Monkey requires replacing the backend evaluator with a Bytecode Compiler and Virtual Machine. This improves performance in two ways: separating compile time from run time enables optimizations to happen before the program is run, and since the Bytecode instructions are lower level, the Virtual Machine can execute them faster than walking the AST.
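As a rough illustration of what “lower level” buys us, here’s roughly what the expression 1 + 2 turns into: a constant pool plus a flat stream of instruction bytes. The opcode names mirror the book’s (OpConstant, OpAdd), but the one-byte operand encoding below is simplified for illustration; the book encodes OpConstant’s operand in two bytes.

```go
package main

import "fmt"

type Opcode byte

const (
	OpConstant Opcode = iota // push constants[operand] onto the stack
	OpAdd                    // pop the top two values, push their sum
)

func main() {
	// The compiler flattens `1 + 2` into data the VM can step through:
	// no tree is left to walk at run time.
	constants := []int64{1, 2}
	instructions := []byte{
		byte(OpConstant), 0, // push constants[0] (1)
		byte(OpConstant), 1, // push constants[1] (2)
		byte(OpAdd), // pop both, push 3
	}
	fmt.Println(constants, instructions) // [1 2] [0 0 0 1 1]
}
```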
The full stack of the Compiler includes these pieces:
- Lexer
- Parser
- Bytecode Compiler
- Virtual Machine
The Compiler compiles the AST into a linear set of instructions (not too dissimilar from assembly) which we can run in the Virtual Machine later on. The Virtual Machine then mimics a real computer: it takes in Bytecode instructions, decodes them, and executes them within a run loop. The VM maintains a data stack, a call stack, and an instruction pointer as it iterates through the instructions, executing them in Go.
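Here’s a minimal sketch of that run loop, continuing the simplified one-byte encoding from the earlier snippet: fetch the opcode at the instruction pointer, decode it, execute it against a data stack, and advance. The book’s VM adds a call stack, globals, frames, and many more opcodes, but the loop has the same shape.

```go
package main

import "fmt"

type Opcode byte

const (
	OpConstant Opcode = iota // push constants[operand] onto the stack
	OpAdd                    // pop the top two values, push their sum
)

// run is the VM's fetch-decode-execute loop over a flat instruction stream.
func run(instructions []byte, constants []int64) int64 {
	stack := make([]int64, 0, 16)
	ip := 0 // instruction pointer

	for ip < len(instructions) {
		op := Opcode(instructions[ip]) // fetch
		switch op {                    // decode + execute
		case OpConstant:
			operand := instructions[ip+1]
			stack = append(stack, constants[operand])
			ip += 2
		case OpAdd:
			left, right := stack[len(stack)-2], stack[len(stack)-1]
			stack = append(stack[:len(stack)-2], left+right)
			ip++
		}
	}
	return stack[len(stack)-1] // the value left on top is the result
}

func main() {
	// The instruction stream for `1 + 2` from the compiler sketch above.
	constants := []int64{1, 2}
	code := []byte{byte(OpConstant), 0, byte(OpConstant), 1, byte(OpAdd)}
	fmt.Println(run(code, constants)) // 3
}
```

Dispatching on the opcode in a switch like this is the same “gigantic case statement” pattern mentioned in the tidbits below.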
The End Result
The code is up on GitHub along with documentation. By implementing Monkey, I’ve built an intuition for how Programming Languages may work under the hood. This intuition has nice applications in both learning languages and debugging them. Learning about Virtual Machines for programming languages has been pretty insightful as well… I’m curious how different this is from the virtual machines that can be spun up on Amazon EC2.
Several developers I’ve talked to while working on this project commented that building Interpreters and Compilers seems “hardcore” and, as a result, not something within their reach. It doesn’t need to be this way; this topic can be approachable. Thorsten Ball’s two books are a great step forward in this regard. Additionally, better visualizations of Interpreter and Compiler execution would make the topic more approachable and friendly. I haven’t seen excellent visualizations yet, but if you have one to share, send it to me. In the meantime, I’m tossing around the idea of implementing such a visualization.
Interesting tidbits
- Python’s bytecode execution depends on a gigantic case statement, and so does Shopify’s Lua VM… this is how the Monkey Virtual Machine is implemented as well. I looked around for the JVM implementation to compare, but I didn’t see it on GitHub or GitLab.
- The Bytecode Compiler and Virtual Machine implemented in the book are somewhat similar to how the Java Compiler and JVM work, sans any optimizations. javac, the Java Compiler, compiles a program down to Java Bytecode instructions. At a later point in time, those instructions are submitted to the Java Virtual Machine, which executes the bytecode in native C++. The differences are that javac does some optimizations while compiling to bytecode (compile-time optimizations) and the JVM applies another set of optimizations while the code is running (run-time optimizations). For run-time optimizations, Java has a Just-In-Time (JIT) compiler that profiles bytecode as it runs and speculatively optimizes it, compiling the hot paths in bytecode down to machine code.