Make a Lisp in Python
2024-08-08
#build-log #language-design #python #lisp
I’ve been spending a lot of time learning how programming languages are created. At the moment, I am half-way through reading Writing a Compiler in Go (which, along with the first book in the series, is fantastic). I have lofty aspirations for making my own VM and self-compiled language, but I have a lot more to learn first.
The books covers building a fairly robust interpreter (and later compiler) for a C-like language. I’ve already written several thousand lines of code and probably have another thousand or two to go. It’s been rewarding, but also a lot to think about. I wanted to take a little brain break and build a dirt-simple lisp-like language.
It turned out to be pretty easy to build (and it was a lot of fun). I thought it would be a fun tutorial for folks, so join me as we make our very own programming language!
Lisp crash course
Lisp is short for “list processing”. If you’re use to C-like languages (e.g. JavaScript or Rust), the thing that probably stands out is that it has a lot of parenthesis. Everything in lisp is a list (elements surrounded by parenthesis). This makes it much easier to create because we simply need to parse and evaluate a bunch of nested lists.
Consider the following bit of JavaScript:
const ten = 4 + 6;
The equivalent would be something like this in a lisp-like language:
(def ten (+ 4 6))
Lisp uses prefix notation so the operator (e.g. +
) appears before the operands (e.g. 4
and 6
). Everything in lisp follows this structure (a property called homoiconicity), which makes the code very easy to reason about. Every expression follows the pattern:
(<function-name> <argument(s)>)
By the end of this post, you will have built yourself a language that looks like this:
(def factorial
(fn (x)
(if (= x 0)
1
(* x (factorial (- x 1))))))
Running (factorial 5)
will provide the factorial of 5 (120
). Sound good? Great, let’s get started!
Chicken and egg
So what is a programming language? It’s really just a bunch of text that a program (in our case, an “interpreter”) reads and then evaluates. So what is that program written in? At least initially, it needs to be written in a different language (a process called bootstrapping).
Consider how node works - when you run this command:
$ node myScript.js
Your computer will run the node
program which interprets the contents of myScript.js
. The node
program isn’t written in JavaScript, it’s actually largely written in C++. The same is true for other popular interpreted language like Python and Ruby.
We’re going to be building our language interpreter in Python. I chose Python because it’s simple and translates well to other languages (and I’m tired of writing JavaScript all the time). I’ll try to explain any Python-specific features as they arise.
To get started, let’s prepare our project:
$ mkdir my-neat-lang
$ cd my-neat-lang
$ touch main.py
Step 1: Lexing
The first step in building an interpreter is lexical analysis (or “lexing” for short). Don’t let the name intimidate you, it’s simply a matter of taking a string and breaking it into smaller strings called tokens. These tokens are used by the parser in the next step.
What constitutes a token? In the case of our simple lisp, we only need to care about three types of tokens: (
, )
, and words. All tokens are strings that represent one of those three things. Sadly, regex is probably the best tool for this use case. I promise, this is the scariest part of the whole post.
import re
def lex(s):
return re.findall(r"[()]|[^() \n]+", s)
I’m no regexpert, but here’s a basic rundown:
[()]
Match a single character that’s either(
or)
[^() \n]+
Match any group of characters that aren’t(
or)
Let’s test it out by running this in a main function and printing the results:
if __name__ == "__main__":
tokens = lex("(+ 1 2)")
print(tokens)
You can run this script with:
$ python main.py
If everything worked correctly, you should get the following printed to the terminal. This is our list of tokens!
['(', '+', '1', '2', ')']
Step 2: Parsing
Now that we have tokens, we need to perform syntax analysis, more commonly known as “parsing”. The end result will be nested nodes that form an ”abstract syntax tree“.
There are a lot of libraries out there that can help you generate “grammar” which define how tokens should be translated into meaning, but we’re going to keep things simple and parse everything by hand. Because we’re coding a lisp and everything is in a uniform shape, we don’t really need a lot of grammar rules.
Our goal is to have a nested structure that looks like this: ['def', 'three', ['+', 1, 2]]
for the expression (def three (+ 1 2))
.
Here’s the first pass at the parsing function:
def parse(tokens):
token = tokens.pop(0)
if token == "(":
result = []
while len(tokens):
if tokens[0] == ")":
tokens.pop(0)
return result
else:
result.append(parse(tokens))
else:
return token
parse
takes in a list of strings (tokens
)- We remove the first token in the list and save it to the variable
token
- NOTE: this alters the
tokens
list (it’s length is now 1 less) - In JavaScript, this would be
Array.prototype.shift()
- NOTE: this alters the
- If the token is a
(
character, we know we’ve hit a new lisp expression- We create a new python list to store the elements of the expression (
result
) - While we have
tokens
left- If the next token is a
)
- We’re at the end of the expression
- Remove the first element from the
tokens
list (the(
character) since we don’t need it - Return
result
- This is one of the “base cases” of the recursive loop
- Otherwise, we have more elements in this list
- Call
parse
again and add the results to the end ofresult
- In JavaScript, this would be
Array.prototype.push()
- In JavaScript, this would be
- This recursion ensures we parse every list structure, no matter how nested
- Call
- If the next token is a
- We create a new python list to store the elements of the expression (
- Lastly, if the token isn’t a
(
character, we’ve hit a “word” and we simply return it- This is the other “base cases” of the recursive loop
Okay, that was a lot. Let’s add some code to our main function and see how it does:
if __name__ == "__main__":
tokens = lex("(+ 1 2)")
print(tokens)
tree = parse(tokens)
print(tree)
Which returns the following results:
['(', '+', '1', '2', ')']
['+', '1', '2']
That certainly looks right, but there’s one more thing we need. See the “numbers”? They’re actually still strings. That’s going to make arithmetic really difficult. Let’s add one more case to our parse
function to handle numbers:
def parse(tokens):
token = tokens.pop(0)
if token == "(":
result = []
while len(tokens):
if tokens[0] == ")":
tokens.pop(0)
return result
else:
result.append(parse(tokens))
# NEW
elif type(token) is str and token.isdigit():
return int(token)
else:
return token
We confirm that the token
is a string and then use a Python function to see if the string can be cast to an integer with token.isdigit()
. If you’re following along in JavaScript, this could be achieved with !isNaN(token)
. This won’t work for floats or other number types, but adding additional logic for that is fairly trivial so I’ll leave that as an exercise for the reader.
Our output looks much better now:
['(', '+', '1', '2', ')']
['+', 1, 2]
Step 3: Evaluation
The last step is to turn our tree of nodes (nested lists) into values by evaluating each expression one at a time. This is very much “the rest of the owl”, but we can start small by continuing with our test input: (+ 1 2)
. Let’s get us a 3
!
Here’s the first pass at an eval
function:
def eval(node):
if type(node) is list:
[fn, *args] = node
match fn:
case "+":
return eval(args[0]) + eval(args[1])
case _:
raise ValueError(f"unsupported fn: {fn}")
elif type(node) is int:
return node
else:
raise ValueError(f"unsupported node: {node}")
eval
takes in a single “node” (starting with the root node)- If the node is a list (aka an expression):
- Destructure the list: the first element is the function
fn
and the remaining elements are the arguments*args
.- This makes it easier to work with, but you could just as easily do
node[0]
etc.
- This makes it easier to work with, but you could just as easily do
match
on the function value. This is python’s version of aswitch
statement. A series ofif
statements would have worked too.- If the function is a
+
(that’s right:+
is a function in lisp)- Run
eval
on each of the two arguments and add them together (recursion!) - At the moment, this only accepts two arguments (arity), but we’ll fix that soon
- Run
- If we run into any other function,
raise
an error- We’ll add more cases soon
- If the function is a
- Destructure the list: the first element is the function
- If the node is a number, just return the number
- Any other value should
raise
an error
Moment of truth, let’s add a few more lines to our main function:
if __name__ == "__main__":
tokens = lex("(+ 1 2)")
print(tokens)
tree = parse(tokens)
print(tree)
result = eval(tree)
print(result)
['(', '+', '1', '2', ')']
['+', 1, 2]
3
We got a 3
!
Utility break
Now that the major pieces are in place, we can start adding new features to our language. To make our lives a little easier, we should write a few helper functions to help us test.
The first takes a string input and pipes it through all our interpreter methods:
def run(s):
tokens = lex(s)
tree = parse(tokens)
return eval(tree)
The other function we’re going to write takes an input string (s
) and checks that the result of run(s)
is equal to a provided expected
value. It prints some helpful messages that will help us quickly test out new features:
def test(s, expected):
actual = run(s)
if actual == expected:
print(f"[PASS] {s}")
else:
print(f"[FAIL] {s} -> expected={expected}, got={actual}")
We can make use of these utilities in our main function:
if __name__ == "__main__":
test("(+ 1 2)", 3)
test("(+ 4 6)", 10)
test("(+ 4 6)", 11) # this should fail
[PASS] (+ 1 2)
[PASS] (+ 4 6)
[FAIL] (+ 4 6) -> expected=11, got=10
Feature: def
In this language, everything is created using the def
keyword: variables, functions, macros, etc. The structure is (def key <value>)
where <value>
could be anything that the language supports.
Where do we store these values? We’re going to introduce the concept of an “environment” which is just a key / value store:
global_env = dict()
Nothing fancy. Just a plain ol’ dictionary (a JavaScript object or Map
would suffice). Let’s update our eval
function to accept an environment parameter (defaulting to the “global” environment we just made):
def eval(node, env=global_env): # updated
if type(node) is list:
[fn, *args] = node
match fn:
case "+":
return eval(args[0], env) + eval(args[1], env) # updated
# [...]
Finally, we can add a new case
statement that handles def
:
def eval(node, env=global_env):
if type(node) is list:
[fn, *args] = node
match fn:
# [...]
case "def":
[key, value] = args
value = eval(value, env)
env[key] = value
return value
# [...]
- Destructure the
key
andvalue
off theargs
list - Evaluate the value by calling
eval
(more recursion!) - Set the value for the current environment
key
- Return the newly set value
Let’s check to make sure what we have is correct:
if __name__ == "__main__":
# [...]
test("(def a 1)", 1)
...
[PASS] (def a 1)
Nice. However, this is only half of the problem. We can set a variable, but we also need to read it too! We can add an elif
case to eval
for this:
def eval(node, env=global_env):
if type(node) is list:
# [...]
elif type(node) is int:
# [...]
elif node in env: # new
return env[node]
else:
# [...]
It only took 2 lines of code. We can confirm this works by adding another test:
if __name__ == "__main__":
# [...]
test("(def ten (+ 5 5))", 10)
test("ten", 10)
...
[PASS] (def ten (+ 5 5))
[PASS] ten
Not only does this work, it also demonstrates the recursive evaluation at work: this first turns (+ 5 5)
into 10
(via eval(value, env)
) and then assigns that value to the key ten
. Pretty impressive for roughly 60 lines of code!
Feature: Arithmetic and built-in functions
As impressive as this language already is, we can only add two numbers together. Let’s fix that by introducing subtraction, multiplication, and division. Along the way, we’ll add a new concept to our language: built-in functions! These are functions written in the host language (Python) that execute something we’re unable to do in our language (in this case, basic maths).
Let’s take a look at the addition operation we currently support:
(+ 1 2)
The +
symbol is actually a function that (currently) takes two arguments. Coming from other languages, it might feel a bit odd to think about function names with special characters, but this is totally normal in a lisp. This code is evaluated with the following case
in our eval
function:
case "+":
return eval(args[0], env) + eval(args[1], env)
Rather than add all the remaining arithmetic operators as new case
statements, we’re going to define them as built-in functions. While we’re at it, let’s make addition, multiplication, and subtraction take any number of arguments (variadic). This will let us evaluate something like (+ 1 2 3 4)
and get 10
.
To start, we’re going to remove the case
for addition. Instead we’ll update the logic in the default case (case _
):
def eval(node, env=global_env):
if type(node) is list:
[fn, *args] = node
match fn:
case "def":
# [...]
case _: # new
args = [eval(a, env) for a in args]
fn = eval(node[0], env)
return fn(*args)
# [...]
- The first line uses python’s list comprehension feature to run
eval
on each of the arguments in ourargs
list- A JavaScript equivalent would be something like
args = args.map((a) => eval(a, env))
- A JavaScript equivalent would be something like
- Next we
eval
the function name- The goal is for
eval
go find the function name in the environment vianode in env
(which we will add in a moment)
- The goal is for
- Finally, we run the function and pass in all the arguments (spread in with
*args
)
Next up, let’s write an addition function in Python:
def builtin_add(*args):
result = 0
for n in args:
result += n
return result
This hardly needs an explanation. The only things I’d like to point out is that this works on any number of arguments and it is working with actual integers, not AST nodes.
The last thing we need to do is to add this to our “global” environment:
global_env = dict()
global_env["+"] = builtin_add
If you run the tests again you’ll see that everything still works. Even better, we can add lots of number:
if __name__ == "__main__":
# [...]
test("(+ 1 2 3 4)", 10)
...
[PASS] (+ 1 2 3 4)
To add the remaining arithmetic is pretty trivial:
def builtin_subtract(*args):
[result, *nums] = args
for n in nums:
result -= n
return result
def builtin_multiply(*args):
[result, *nums] = args
for n in nums:
result *= n
return result
def builtin_divide(a, b):
return int(a / b)
And don’t forget to add each to the environment (otherwise you’ll get an error like ValueError: unsupported node: *
).
global_env = dict()
global_env["+"] = builtin_add
global_env["-"] = builtin_subtract
global_env["*"] = builtin_multiply
global_env["/"] = builtin_divide
Let’s kick the tires with some tests:
if __name__ == "__main__":
# [...]
test("(+ 1 2 3 4)", 10)
test("(- 10 5 2)", 3)
test("(* 5 2)", 10)
test("(/ 10 2)", 5)
...
[PASS] (+ 1 2 3 4)
[PASS] (- 10 5 2)
[PASS] (* 5 2)
[PASS] (/ 10 2)
I was blown away at just how easy it was to add all this functionality. Three lines of code in our default case
statement, a few simple functions, and we now have arithmetic support in our language!
You might be tempted to add a bunch more built-in functions, but I would caution about creating too many from the start. As you’ll soon see, we’ll be able to add new language features in the language itself soon. In fact, we’ll only add one or two more built-in functions in this post.
Feature: Keywords and if
At the moment, we only support one type of primitive: integers. In order to add equality checks and the ability to write if
statements, we need to add in another primitive that represents true
and false
. This could be a boolean, but I’m going to follow in the footsteps of elixir and make these “keywords” (elixir calls them “atoms”).
In our language, a keyword is a unique primitive that starts with a :
. So if we want to represent true, we would use the :true
keyword and false is the :false
keyword.
Let’s add support for keywords in our eval
function:
def eval(node, env=global_env):
if type(node) is list:
# [...]
elif type(node) is int:
# [...]
elif type(node) is str and node[0] == ":": #new
return node
elif node in env:
# [...]
else:
# [...]
Nothing fancy. All we’re really doing is returning the string representation of the keyword.
Now we can add the if
function. The format is as follows:
(if <condition> <consequence> <alternative>)
Here’s the update we need to make to eval
to get this working:
def eval(node, env=global_env):
if type(node) is list:
[fn, *args] = node
match fn:
case "def":
# [...]
case "if": # new
[cond, conseq, alter] = args
cond = eval(cond, env)
if cond == ":true":
return eval(conseq, env)
else:
return eval(alter, env)
case _:
# [...]
- First we destructure the parameters to
if
for easier access - Next, we evaluate the condition (which should return
:true
or some other value) - If the condition evaluated to the
:true
keyword, return the evaluated consequence - Otherwise, return the evaluated alternative
I’ve made a few choices here that you might not agree with. First off, the only truthy value is :true
so everything else is considered falsey. If that’s not to your liking, feel free to swap around the logic. This also doesn’t support a form of if
that only has a consequence (if
without the “else”). Adding that in requires a few more checks and making a decision about what to return if the if
returns a falsey value (maybe something like :nil
). That said, this is your language…feel free to make different choices here!
Let’s add a few tests to kick the tires:
if __name__ == "__main__":
# [...]
test("(if :true 11 22)", 11)
test("(if :false 11 22)", 22)
...
[PASS] (if :true 11 22)
[PASS] (if :false 11 22)
Let’s add one more built-in that let’s us check for equality:
def builtin_eq(a, b):
if a == b:
return ":true"
return ":false"
# [...]
global_env["="] = builtin_eq
And a few tests to verify it’s working correctly:
if __name__ == "__main__":
# [...]
test("(= 10 10)", ":true")
test("(= 10 20)", ":false")
...
[PASS] (= 10 10)
[PASS] (= 10 20)
At this point, we’re almost turing complete. We need one more helper function (do
) before we can tackle the big one: user-defined functions (fn
).
Feature: do
We need one more helper function to evaluate a series of expressions and return the results of the last one. We’ll call this function do
. This time, I’ll start with the test since it illustrates the usage better than I can describe:
if __name__ == "__main__":
# [...]
test("(do 1 2)", 2)
do
will evaluate each of the arguments and return the last one. Here we’re simply returning the last primitive value (2
). Of course, this doesn’t work because we don’t have support for this function yet:
...
ValueError: unsupported node: do
This is fairly easy to add to our eval
function:
def eval(node, env=global_env):
if type(node) is list:
[fn, *args] = node
match fn:
case "def":
# [...]
case "if":
# [...]
case "do": # new
result = ":nil"
for a in args:
result = eval(a, env)
return result
case _:
# [...]
- First, we create a
result
variable and set it to the:nil
keyword- If we were in JavaScript, I would probably have just done
let result;
and avoid:nil
altogether, but Python requires that this variable be bound to some value
- If we were in JavaScript, I would probably have just done
- Then we go through each argument, evaluate it, and overwrite
result
with it’s value - Lastly we return the
result
which should be the last expression’s value
Incredibly simple.
...
[PASS] (do 1 2)
Feature: fn
This is the last major feature we’ll add in this post and it’s a big one. Here’s an example of how fn
works in conjunction with def
:
(def add-one (fn (x) (+ x 1)))
Calling (add-one 2)
should evaluate to 3
.
We already know how def
works, so let’s focus on fn
. Here’s the general structure:
(fn <function-arguments> <function-body>)
A few important notes:
<function-arguments>
might be an empty list()
<function-body>
is either a list or a single value (e.g.(fn (x) x)
)
The first thing we need to add is the concept of scopes. Consider the function (fn (x) (+ x 1))
. We need to be able to reference the variable x
inside our function, but it shouldn’t be defined in the global scope. Furthermore, we should be able to access variables define outside our function’s scope too:
(do
(def ten 10)
(def ten-plus-x (fn (x) (+ x ten)))
(ten-plus-x 5)
)
The last statement should return 15
. This is because we add the local scoped variable x
to the “global” scoped variable ten
.
I’ve been throwing around the word “scope”…but we already have this concept in our language: environments! Let’s start expanding on that concept by making a helper function that creates a new environment that references a parent environment and some new variables we’d like to add.
def new_env(parent_env, params, args):
env = dict(zip(params, args))
env[":parent"] = parent_env
return env
This might look a little confusing, but it will make sense soon I promise.
When creating a function, the user can define parameters. When a user calls a function, they pass arguments for each of those parameters. In the case of our function ((fn (x) (+ x 1)) 2)
, the parameter x
is set to the value (argument) 2
.
Armed with that knowledge, let’s break down this function:
- The function accepts a “parent” environment, a list of parameters, and a list of arguments for those parameters
- We create a new dictionary where the keys are the parameters and the values are the corresponding arguments
- The same could have been achieved with a
for
loop over every element in theparams
list to build up the environment - In our example, the result of this line would be
{"x": 2}
- The same could have been achieved with a
- Lastly, we set the
:parent
key to be the parent environment
Great, so now our environments have the concept of a parent environment. Let’s make another helper function that, given a key, will return the value in the provided environment or the parent environment:
def get_var(key, env):
if key in env:
return env[key]
elif ":parent" in env:
return get_var(key, env[":parent"])
else:
raise KeyError(f"{key} not found in env: {env}")
The first few lines are pretty self-explanatory: if the key exists in the current environment, return the value associated with it. The next statement is where this function shines: if this environment has a parent environment, call the get_var
function again (recursion!) using that environment. Lastly, if we can’t find it and there’s no parent environment (global_env
), then this key doesn’t exist and we raise
an error.
An important note: this is not limited to any particular depth, which means we could nest environments as many times as we need. This means defining functions inside of functions is totally possible in our language.
Let’s update eval
to make use of this new function:
def eval(node, env=global_env):
if type(node) is list:
# [...]
elif type(node) is int:
# [...]
elif type(node) is str and node[0] == ":":
# [...]
# deleted: elif node in env:
else: # updated
return get_var(node, env)
This replaces our elif node in env
and our else
cases.
We’re finally have all the supporting elements needed to make a function. When eval
encounters a fn
function, we need to transform the list into something that we can later call. We’ve been using type(node)
to check what type of node we are looking at, so we’ll need a way to define a custom type for our functions. I’ll be using python’s @dataclass
feature, but a regular class (or a JavaScript object with a _type
property) would work just as well:
from dataclasses import dataclass
# [...]
@dataclass
class Function:
params: list
body: list
env: dict
This lets us create new functions and pass in the list of parameters, the body, and the environment they were called in. Let’s see it in action by updating our eval
function to handle the fn
function:
def eval(node, env=global_env):
if type(node) is list:
[fn, *args] = node
match fn:
case "def":
# [...]
case "if":
# [...]
case "do":
# [...]
case "fn": # new
[params, body] = args
return Function(params, body, env)
case _:
# [...]
We destructure the arguments to the fn
call into the two parts: params
and body
. Then we make a new Function
with that information and the current environment.
Alright, function created. The last thing we need to do is add the ability to call that function. To do that, we’ll need to update the else
case inside the type(node) is list
branch of the eval
function:
def eval(node, env=global_env):
if type(node) is list:
[fn, *args] = node
match fn:
case "def":
# [...]
case "if":
# [...]
case "do":
# [...]
case "fn":
# [...]
case _:
args = [eval(a, env) for a in args]
fn = eval(node[0], env)
# new
if type(fn) is Function:
fn_env = new_env(fn.env, fn.params, args)
return eval(fn.body, fn_env)
else:
return fn(*args)
# [...]
- We check to see if the type of
fn
isFunction
(if this is a user-defined function). - If so, we create a new environment with:
- The current environment as the “parent”
- The function’s parameters
- The arguments passed to the function call
- Then we return the evaluation of the function body, given the new environment
- If it’s not a
Function
(e.g. a built-in), we simply run the function like we were doing before
Alright, you ready?
if __name__ == "__main__":
# [...]
test("((fn (x) x) 2)", 2)
test("((fn (x) (+ x 1)) 2)", 3)
...
[PASS] ((fn (x) x) 2)
[PASS] ((fn (x) (+ x 1)) 2)
WE’VE GOT FUNCTIONS!
This is huge. Our language is now turing complete and with it we can (in theory) solve any type of computational problem. Not bad for less than 150 lines of code!
Victory lap
Let’s really kick the tires on this thing, now that it’s all powerful. To do so, I want to add one more built-in:
def builtin_print_value(*args):
result = ""
for arg in args:
result += str(arg) + " "
print(result)
# [...]
global_env["print-v"] = builtin_print_value
This simply prints a list of values (we don’t currently support strings, I’ll mention that in a sec).
Next, let’s update our main function to read from a file instead of making all those test
calls:
if __name__ == "__main__":
with open("./test.myradlisp", "r") as reader:
run(reader.read())
Now we can make a file called test.myradlisp
(feel free to use whatever extension you want) and start writing code in the language we just built! Let me demo some of the crazy cool stuff we can already do:
(do
(def assert
(fn (cond)
(if cond
(print-v :pass)
(print-v :fail))))
(print-v :_____identity)
(def id (fn (x) x))
(assert (= (id 1) 1))
(assert (= (id :foo) :foo))
(print-v :_____decrement)
(def dec (fn (x) (- x 1)))
(assert (= (dec 10) 9))
(assert (= (dec 1) 0))
(print-v :_____square)
(def sq (fn (x) (* x x)))
(assert (= (sq 2) 4))
(assert (= (sq 10) 100))
(print-v :_____compose)
(def comp (fn (f g) (fn (x) (f (g x)))))
(def double-dec (comp dec dec))
(assert (= (double-dec 3) 1))
(def square-dec (comp dec sq))
(assert (= (square-dec 5) 24))
(print-v :___factorial)
(def factorial
(fn (x)
(if (= x 0)
1
(* x (factorial (- x 1))))))
(assert (= (factorial 5) 120))
)
:_____identity
:pass
:pass
:_____decrement
:pass
:pass
:_____square
:pass
:pass
:_____compose
:pass
:pass
:___factorial
:pass
If you ask me, that’s pretty dang impressive. We’ve got functional composition, recursion, nested functions, and all sorts of awesome functionality. We even replace our need for a test
function in Python: we can now test our language in the language itself!
Wrap-up
This is my longest post by a long shot, but I had a blast writing it. If you want to see my version of the code base, I’ve published it here.
There’s still a lot of features we could add to the language. Some of the most obvious include:
- Strings! Our parser doesn’t currently handle turning multiple tokens into a single node (other than lists).
- Comments. Our lexer is letting us down here. We would need to move beyond a one-line regex to sort out handling this.
- Floats and non-negative numbers. This is actually a fairly easy add, but I omitted it to help keep the post length down.
- Macros. This would let us write code that writes code…letting us transform the language to own needs. It’s how most lisp-like languages add additional features beyond the simple ones we’ve created here.
I’ll probably make a follow-up post showing how to pull this off.I’ve created a follow-up post explaining how to add this. - I/O and system calls. This is just a matter of exposing a few more built-in functions and are fairly easy to add.
If any of those features sound interesting to you…you should build them! It’s your language, you can do whatever you want with it.
If you have any comments or questions about this, feel free to email me, reach out to me on mastodon, or just yell out the nearest window really loud.