# The interpreter and the compiler

_2026 May_

Last December, I wrote the following in the context of how one might [bootstrap](250329-boot.html) Zisp even if it uses a self-hosting compiler:

* There will be a Zisp interpreter written in Zig, which is fairly simple and naive in its implementation and, for example, ignores static type declarations. It should support the full Zisp language including hygienic macros, but be as easy as possible to maintain.

* The Zisp compiler will be written in Zisp. The interpreter can run the compiler (since it can run any Zisp program) and will be used to compile the compiler.

After some pondering on a variety of topics, I've decided to stick with this, just with one significant added insight: The interpreter will not be some bootstrapping hack and then put in the dustbin until someone needs to bootstrap from scratch again. Rather, the interpreter will be a first-class citizen of the Zisp implementation.

This is because a simple interpreter without any compilation overhead is useful for an entire class of applications: Small to medium size scripts that you simply plop into `~/bin` with a shebang line at the top, or other similarly small programs that are simply distributed as monolithic source files, or at most a small collection of files.

The interpreter may be slow, but these would be the kinds of programs one might otherwise write in GNU Bash or the like (which is also quite slow) except GNU Bash doesn't even have proper data structures, so it becomes a terrible choice very quickly. The next consideration after Bash would typically be a language like Python, and although even the CPython interpreter might beat the naive Zisp interpreter (because the former at least uses bytecode and had a ton of engineering poured into it) this shouldn't really matter, since the kind of tiny application we're talking about typically wouldn't involve heavy computation.
(Besides, a Zisp script could choose to compile parts of itself; more on this later.)

Build scripts are another example. One of the first ideas I had when pondering Zisp's design is that [compilation](250210-compile.html) should automatically evaluate the top-level of a program, simply because this feels most natural to me. Furthermore, I've pondered how it should be possible to [serialize](250210-serialize.html) everything in the language, so compiling a program would be a matter of calling something like `(write main)` after the main function is defined.

Both of these fit naturally with the idea that a build script for a Zisp program would essentially just be a Zisp script which imports all the files in the codebase, compiles everything, and writes out the result.

Such a build script would be interpreted, with the compiler being a shared library it loads. The compiler itself would typically still be shipped in compiled form, as would the rest of the standard library, though it's conceivable that there might be benefits to having stdlib sources available; the compiler may be able to do better whole-program analysis, achieving better results than what you might get from LTO.

## The programmer is in control of compilation

Shipping an interpreter with a compiler as a library, able to compile things on-the-fly as instructed by the interpreted source itself, enables some novel strategies in development and deployment.

### Manual JIT

First, imagine you started developing a program as a fairly small script but at some point begin to realize that it does, after all, involve some heavy computations that could benefit from improved performance. Maybe it takes 10-20 minutes to run, with the majority of that time spent on one or two functions sifting through massive amounts of data and doing some heavy computation, involving some tight loops.
Well, your interpreter includes a compiler, so why not simply call the compiler on those functions right after defining them? Note that we're not talking about compiling *files* but simply some functions that are sitting in memory as AST and would otherwise be interpreted naively and slowly.

It's said that compiled native code can be something like 5-20x faster than a naive AST interpreter, so your script running in 20 minutes could be reduced to somewhere between 1 and 4 minutes; a little extra computation is added up-front to compile a function or two, then they run blazing fast.

### Native targeting, and user data/code specialization

The fact that you have a compiler in your runtime, and that it has a well-designed, easy-to-use API, opens the door to a somewhat unusual software deployment strategy: Despite the fact that your application is rather sophisticated and needs to run at peak performance, you distribute it as source code, with a "boot" process that compiles all the sources every time it's started up on the end user's machine. (Well, the compilation result could be cached into files on disk too, but that's a detail.)

This has two advantages. For one, the code is always compiled for the exact native architecture, not just an ISA family. This can improve performance a little, sometimes.

Secondly, and more interestingly, data *and even code* read from a configuration file can be compiled straight into the native code that's being generated.

If you know Nginx's configuration format, you may know that it has some limitations that appear a bit strange, typically because the directives need to be "compiled" into something efficient if they declare some logic that has to be executed on every single request. Since Nginx doesn't want to implement a sophisticated compiled DSL like Varnish, it ends up being somewhat limited.
Varnish does make that jump and implements a whole DSL for per-request decisions, which is transpiled to C, compiled into a dynamic lib and loaded.

Imagine Nginx was written in Zisp, and distributed in source format. You could have arbitrary code in your configuration, for per-request decisions, which would be compiled into native code and potentially inlined straight into Nginx's request handler.

Imagine Varnish was written in Zisp. It wouldn't need to invent a whole new language!

(I just realized Varnish has been renamed to Vinyl Cache, but I suspect most people still know it as Varnish, like me just now.)

Just as an aside, I think this "compile at startup and cache it" strategy is used by Elixir. Or maybe I just got that impression because I've installed Pleroma (an Elixir application) from Git. Either way, I doubt my idea is entirely new; this is definitely a strategy that can already be used by any application written in a language with a compiler built into the runtime, like many Lisp or Scheme implementations.

## Why not automatic JIT?

Although a more "proper" JIT has some advantages, like being able to specialize on arbitrary run-time data (not just config files or other such "boot-time" data), it typically produces significantly worse code than a "full AOT compiler in a JIT-shaped trench coat" because the AOT compiler simply spends a *lot* more time on analysis upfront. Don't quote me on this, but it appears to be the current consensus.

Traditional JIT, as opposed to what LLVM and GCC offer (i.e., AOT in a JIT-shaped trench coat), needs to be low latency, since it's done on the fly, transparently, and concurrently. Imagine your browser ran GCC or LLVM for every JS file it received. That would be ridiculous.

Note that JS is special in that it's basically the only programming language where arbitrary new code is loaded *all the time* during the normal course of operations. Other languages just don't need this.
It's just JS where high upfront latency is unacceptable.

Why do Java, Lua, and a bunch of other dynamic languages use JIT? Partly, it may be cultural: Native AOT compilation feels yucky, invoking associations such as long compile times multiplied by the number of target architectures, needing to ship binary blobs, and the primitive C ABI. Java can have its own rich ABI, and languages like Lua don't have an ABI at all because everything is source code. If programmers can simply ship source files, or at worst cross-platform byte code like for the JVM, and then the JIT magically makes things faster, there's less headache I guess. (There is AOT for Java, but it's a niche.)

Another reason, probably, is that many high-level languages are very dynamic and lack a serious static type system that would be needed to generate peak performance AOT compiled code.

Zisp is all about breaking norms, and giving the programmer maximum freedom. The interpreter might one day incorporate some lightweight JIT, but my aim is to ensure that a Zisp programmer always has the ability to generate peak-performance native compiled binaries, through a combination of features such as: An optional but serious static type system, the ability to completely take control over memory management rather than relying on GC, and integrating with a high-end AOT native compiler like GCC.

Tall claims, I know. Stop looking at me like that. Yes I know, all I have so far is a fucking s-expression parser, a NaN packing strategy for dynamic typing, and dreams. But if I keep dreaming and planning, I'm sure the implementation will spontaneously pop into existence any day now.

## Summary of planned implementation architecture

Just to recap, here's the plan so far:

1. A code base in a low level language (probably Zig but not married to it) implements the Zisp core, meaning interpreter, basic data types, and a slim standard library. Comparable to R7RS-small in complexity, give or take.
The interpreter accepts but ignores advanced code constructs intended to help the compiler, such as declarations and directives related to static typing and explicit object lifetime management. (Simple bindings to libgccjit are exposed; libgccjit.so is an optional run-time dependency.) This yields libzisp.so and the zisp executable, which are like liblua and the lua executable. You *can* use just this if you need a minimal Zisp interpreter with a barebones stdlib; OS package repositories could deploy these in a "zisp-core" package.

2. Richer standard library routines are written in Zisp, but the sources are meant to stay in the source code repo; wait for it.

3. An advanced compiler, which actually understands the constructs mentioned in point 1, is written in Zisp. The compiler infers static types where possible, and applies strategies to decrease GC pressure, such as escape analysis, even if compiled code offers no helpful declarations at all. But with full static typing and manual memory management, Zisp can practically be used as if it's yet another low-level language front-end for GCC; it's up to the programmer how much effort they want to put into improving the performance of their code. The compiler implementation may use parts of the richer standard library mentioned above, which is not yet compiled, mind you.

4. The interpreter runs the compiler to compile the compiler; this yields libzispcomp.so which Zisp can load dynamically so when deploying Zisp you don't need to compile the compiler on every end-user machine. (Zisp can load any .so dynamically really.) Standard library routines written in Zisp are imported directly from within the source code repo at this point, and are merely interpreted, since the compiler itself wasn't ready yet. (Actually, you could run the compiler with the interpreter to compile the stdlib first, then use the compiled stdlib while compiling the compiler. But this would probably be slower.)

5. The richer standard library routines are finally compiled, giving us libzisputil.so, which contains goodies that interpreted Zisp code can also load and use, so Zisp scripts aren't limited to the barebones stdlib anymore.

In OS package repositories, you'd have zisp-core which only contains libzisp.so and the zisp executable, and then you'd have the standard zisp package which also pulls in libzispcomp and libzisputil as two additional packages.

Actually, libzispcomp itself would probably depend on libzisputil anyway, but if you're an absolute nerd you *could* manually install only zisp-core and libzisputil, giving you an interpreter and rich standard library, without a compiler. This would allow you to omit libgccjit as well, which could be useful if you want to use the Zisp interpreter for simple scripts on some minimal systems.

## Closing up

Funny, I had totally forgotten about this note:

- [Using libgccjit?](250920-libgccjit.html)

Yes, I will most definitely be using libgccjit. If Zisp is to be a true [full-stack language](260102-full-stack.html) then it must be able to produce code rivaling C in efficiency, and that requires either GCC or LLVM.

Some of the other considerations in the above linked note, like the "ZispScript" idea, are obsolete. Unless I've totally goofed up and planned some illogical nonsense above, I'll be going with what I've written here, not in the previous note.
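As a postscript, here is a rough Python sketch of the "manual JIT" idea from earlier, since Zisp itself can't run anything yet. Everything in it is made up for illustration: the tuple-based mini-language, `interpret`, `compile_expr`, and `hot_ast` are hypothetical names and say nothing about Zisp's actual design. The "compiler" here only turns an AST into nested Python closures, which is a stand-in for real native compilation through something like libgccjit; the point is just the shape of the workflow, where the script itself picks one function sitting in memory as AST and compiles it once before running it many times.

```python
import operator

# Hypothetical mini-language, NOT Zisp: an expression is a number,
# a variable name (str), or a tuple (op, left, right).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def interpret(expr, env):
    """Naive tree-walking evaluation: re-inspects the AST on every call."""
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return env[expr]
    op, left, right = expr
    return OPS[op](interpret(left, env), interpret(right, env))


def compile_expr(expr):
    """Walk the AST once; return a closure with no further AST type dispatch.

    This is closure compilation, a stand-in for native code generation.
    """
    if isinstance(expr, (int, float)):
        return lambda env: expr
    if isinstance(expr, str):
        return lambda env: env[expr]
    op, left, right = expr
    fn, lhs, rhs = OPS[op], compile_expr(left), compile_expr(right)
    return lambda env: fn(lhs(env), rhs(env))


# The "hot" function, sitting in memory as an AST: x*x + 2*x - 1
hot_ast = ("-", ("+", ("*", "x", "x"), ("*", 2, "x")), 1)

# Manual JIT: the script decides to compile this one function up front.
hot_compiled = compile_expr(hot_ast)


def run_interpreted(values):
    """Evaluate the hot function for each value by walking the AST."""
    return [interpret(hot_ast, {"x": v}) for v in values]


def run_compiled(values):
    """Evaluate the hot function for each value via the compiled closure."""
    return [hot_compiled({"x": v}) for v in values]
```

Both paths compute the same results; the only difference is that the compiled version pays the cost of inspecting the AST once instead of on every call. How much that buys you depends entirely on the real compiler behind it, so no timing claims are made here.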