Adding Macro Support to Our Lisp

2024-08-17

#build-log #language-design #python #lisp 

I really enjoyed writing my last post on creating a small lisp-like language using python. Since the post was already fairly long, I decided to cut one really cool feature: Metaprogramming with macros. This post adds that feature and is meant to be a continuation of the previous post.

What is Metaprogramming?

Metaprogramming is the ability for a language to write code that can modify itself while running. To put it another way: it’s a way to write code that writes code. This technique is often used to add new language features by writing code in the language itself. Another common use case is building up domain-specific languages (DSL).

If you’re curious how a few other languages do it, here are some relevant links:

The benefits of a lisp

This feature is very common in lisp-like languages because the code is, essentially, already data. Consider the following expression:

(+ 1 2)

This adds 1 and 2 together to produce the value 3.

The form we’re seeing is a list (an “array” in other languages). What if we could alter the contents of that list before we execute it? We could change the order, add new elements to the list, or remove elements. Since expressions are just data (lists), we can operate on them just like any other piece of data in our language.

“Data is code” and “code is data” is a big selling point of lips-like languages.

Infix operations

Let’s say we wanted to add support for “infix” operations in our language. This would let us put the operator in between the operands, like most people are probably use to. Instead of doing something like + 1 2, this would let us write 1 + 2.

Here’s how we’d like that to work in our language:

(infix 1 + 2)

Feature: macro

The way a user creates a macro looks very similar to a function. The only difference is a new macro keyword:

(def infix
 (macro (a f b) (f a b)))

To start, we’ll need to create a new type that represents a macro. That way, when we run into it in our recursive eval function, we’ll be able to tell it apart from regular functions.

Like we did with fn, we’re going to use a dataclass (but a regular JavaScript object with a type tag would do just as well):

@dataclass
class Macro
    params: list
    body: list

Next, whenever we run into the word macro in our evaluator, we can create a new instance of the Macro class:

def eval(node, env=global_env):
    if type(node) is list:
        [fn, *args] = node
        match fn:
            case "def":
                # [...]
            case "if":
                # [...]
            case "do":
                # [...]
            case "fn":
                # [...]
            case "macro": # new
                [params, body] = args
                return Macro(params, body)
            case _:
                # [...]

So far nothing groundbreaking here. We do almost the same thing when creating functions. The only difference is that macros don’t take their own environment (macro hygiene is way out of scope for this post).

Now we need to update our case _ to handle macros. Here’s that case in full:

def eval(node, env=global_env):
    if type(node) is list:
        [fn, *args] = node
        match fn:
            case "def":
                # [...]
            case "if":
                # [...]
            case "do":
                # [...]
            case "fn":
                # [...]
            case _:
                args = [eval(a, env) for a in args]
                fn = eval(node[0], env)
                if type(fn) is Function:
                    fn_env = new_env(fn.env, fn.params, args)
                    return eval(fn.body, fn_env)
                # new
                if type(fn) is Macro:
                    return expand_macro(node, env)
                else:
                    return fn(*args)

All we’re really doing here is calling a new function (expand_macro). Let’s take a look at that:

 def expand_macro(node, env):
     [key, *args] = node
     macro = env[key]
     params_to_args = dict(zip(macro.params, args))
     body = replace(macro.body, params_to_args)
     return eval(body, env)

The last piece of this puzzle is replace. This is what updates the list to be something new (code rewriting code):

def replace(node, mapping):
    if type(node) is str:
        if node in mapping:
            return mapping[node]
        else:
            return node
    elif type(node) is list:
        return [replace(e, mapping) for e in node]
    else:
        raise TypeError(f"invalid node type: {node}")

That’s it! This function is what rewrites code before it gets executed. Let’s break it down:

Kicking the tires

In the last post, we had a test file we wrote some code assertions in. Let’s add a new section to that:

(do
  ...

  (print-v :_____macros)
  (def infix (macro (a f b) (f a b)))
  
  (assert (= (infix 1 + 2) 3))
  (assert (= (infix 5 - 4) 1))
)

And if we run our interpreter with python main.py:

...
:_____macros
:pass
:pass

It works! We’ve added metaprogramming with only 25 more lines of code. Here’s my code for this blog post.

At this point, the sky is the limit. With the power to manipulate the language itself, we can add all sorts new language features without having to write any new python code.

At this point, I leave the decisions up to you: it’s your language. Make it however you would like. I hope you enjoyed this little trip into creating and extending your own lisp!