# Decoder

_2026 January_

I've mulled over this quite a bit now, and I believe I've figured out what kind of design I want for the "decoder" component.

To recap: Zisp has a "parser" that implements an extremely bare-bones s-expression format (though with some interesting syntax sugar baked in), with a lot of the features you would expect of a typical "reader" being offloaded into a second pass over the data. That second pass is done by the *decoder* and will handle, among other things:

- Number literals (the parser only knows about strings)
- Boolean literals (the parser only knows about "runes")
- Literals for various compound objects like vectors
- Datum labels/references, for cyclic data
- Emitting direct references to macros like quote, unquote, and those implementing some of the more exotic syntax features like `foo.bar` for field and method access, `foo:bar` for type declarations, etc.

(To be clear, `foo.bar` actually becomes `(#DOT foo & bar)` at the parse stage, `foo:bar` becomes `(#COLON foo & bar)`, and so on. The decoder then replaces `#DOT`, `#COLON`, and the like with references to macros that actually implement the feature.)

The decoder is also going to be extensible, to allow for something similar to reader macros in Common Lisp, though closer to regular macros because this extensibility will be based on runes: A list beginning with a rune can invoke a decoder procedure for that rune, and these can be user-defined.

I've previously agonized over whether this means that the decoder is essentially the same thing as a macro expander, or rather, whether it would make sense to merge the functionality of the two. But I've come to the conclusion that this would be wrong.

Key differences between the decoder and a macro expander include:

- The macro expander is fully aware of bindings and lexical scope; it's influenced by import statements, operates on syntax objects that carry scope context, and so on.
  The decoder is completely oblivious to identifier bindings and doesn't understand scoping. For example, there's nothing like `let-syntax` for the decoder.

- The macro expander only calls a macro when the head of a list is an identifier bound to a syntax transformer. The decoder walks through lists and checks for runes everywhere; otherwise, the following would not work as expected:

  ```scheme
  ;; Alist with vectors as the values.
  ((x & #(a b))
   (y & #(c d)))
  ```

  The parser will turn the entries into `(x #HASH a b)` and the like, since `#(a b)` is sugar for `(#HASH a b)`, and `(x & (#HASH a b))` is equivalent to `(x #HASH a b)`. So, to make this work, the decoder checks every single pair in a list and invokes a transformer if the `car` of that pair is a rune bound to a decoder rule.

These differences not only mean that the implementation will be quite different, but also that the decoder is conceptually a very different thing. No doubt there will be some similarity in their algorithms, but the conceptual simplicity of the decoder (no notion of scope or identifier bindings) means that you can reason much more easily about what it will do to source files.

Macros in Scheme have a completely different "feel" to them. They're really part of the program logic. The whole point of hygienic macros is that they fit in seamlessly with the rest of your program, rather than being a disjoint pre-processor operating outside program logic. That's valuable in a different way. (Zisp will also support hygienic macros, like Scheme.)

Although the decoder is not as smart as a macro expander, I still intend to make it fairly powerful, supporting:

- `(#IMPORT ...)` to import additional decoder rules dynamically, so you could have something akin to a library of decoder extensions. Yes, I know: It's ironic to list the decoder's lack of awareness of imports as a key difference from the expander, and then make it support its own import mechanism. But it's not the same.
  Regular imports will be allowed within lexical scopes; decoder imports are top-level only.

- `(#DEFINE ...)` to dynamically add a decoder rule on the spot. Again, not like a regular define: Top-level only, and unaware of surrounding bindings. Decoder procedures defined this way will run in a pristine standard environment, though they can use regular imports within their body to call external code.

- `(#STRING ...)` to embed the contents of a file as a string literal, similar to `@embedFile()` in Zig.

- `(#PARSE ...)` to parse a single expression from a file and put it into this position. (Error if the file contains more expressions.)

- `(#SPLICE ...)` to parse all expressions in a file and splice them into this position. (Essentially `#include` from C, but obviously not meant to be used like in C.)

These will be turned off by default, so a decoded file cannot run arbitrary code, or maliciously embed `/dev/random`! The standard "Zisp code decoder" configuration, used to read program and library files, will then enable these features.

Splicing could be used for the same effect as an import, but an import makes it explicit that no expressions are being inserted. Files with decoder rules could also be compiled into a binary, which the import mechanism could locate and use, instead of parsing the source file again every time.
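To make the pair-walking behavior concrete, here is a minimal executable model in Python. This is a sketch, not Zisp's actual implementation: nested Python lists stand in for pair chains, strings stand in for runes, and the `#HASH` rule producing a tuple is an illustrative assumption. It shows how a rule fires when a rune is the `car` of *any* pair, not just the head of a list:

```python
# Minimal model of the decoder's pair walk. RULES maps runes to
# decode-time procedures; "#HASH" is the only rule in this sketch.
RULES = {
    # #(a b) parses as (#HASH a b); this rule turns it into a vector
    # value, modeled here as a Python tuple.
    "#HASH": lambda args: tuple(args),
}

def decode(form):
    """Decode a parsed form; lists model chains of pairs."""
    if not isinstance(form, list):
        return form
    for i, head in enumerate(form):
        if isinstance(head, str) and head in RULES:
            result = RULES[head]([decode(x) for x in form[i + 1:]])
            if i == 0:
                return result  # rune at the head: whole list is rewritten
            # Rune in tail position: that tail pair is rewritten, modeling
            # (x & (#HASH a b)) -> (x & #(a b)).
            return [decode(x) for x in form[:i]] + [result]
    return [decode(x) for x in form]

# ((x & #(a b)) (y & #(c d))) parses into nested lists like this:
parsed = [["x", "#HASH", "a", "b"], ["y", "#HASH", "c", "d"]]
print(decode(parsed))  # [['x', ('a', 'b')], ['y', ('c', 'd')]]
```

A macro-expander-style walk would only inspect `form[0]`; the loop over every position is exactly what lets the sugar-generated `#HASH` in tail position still decode into a vector.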
Here are some imaginary Zisp source files demonstrating decoder use:

```scheme
;; a.zisp

(#IMPORT "ht.zisp") ;may load compiled code of ht.zisp from a cache

(define my-hash-table #ht((a 1) (b 2))) ;#ht imported from ht.zisp

(#DEFINE (#foo x y)
  (import (de tkammer my-helper-module))
  (let ((blah (frobnicate x))
        (blub (quiblify y)))
    `(foo bar ,(generate blah blub))))

(#foo x y z)   ;decoder error
(#foo x y)     ;proper use
(a b #foo x y) ;also works, but don't do it please

(#DEFINE #bar '(+ 1 2))

(import (zisp io)) ;imports the print function

(print #bar) ;will print 3

(#SPLICE "b.zisp")
```

```scheme
;; ht.zisp

(#DEFINE (#ht & entries)
  (define ht (make-hash-table))
  (loop ((key value) entries)
    (ht.set key value))
  ht)
```

```scheme
;; b.zisp

(define (foobar)
  (let ((data (#STRING "example.data")))
    (data.do-something)))

(define cycle #%0=(1 2 3 & #%0%))
```

If you find the use of uppercase to be ugly, consider that a feature, because messing with the decoder this much would be discouraged. The only example above that actually makes some sense is the one defining hash table syntax.

Actually, since I want to make it possible to serialize absolutely anything in Zisp, a regular macro could also be used to construct hash-table literals. See the [serialization](250210-serialize.html) note on that. However, that makes such custom object literals fail to stand out:

```scheme
(define (foobar)
  (let ((my-ht (ht (a 1) (b 2)))) ;doesn't look like a literal
    (use my-ht here)))

(define (foobar)
  (let ((my-ht #ht((a 1) (b 2)))) ;more obvious that it's a literal
    (use my-ht here)))
```

For this reason, it would be a convention that decoder rules are used to implement new object literal syntax, while macros are used when you want to output code, with hygienic bindings.
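As a rough executable model of the decoder-rule side, the `#ht` rule above amounts to binding a decode-time procedure to a rune and invoking it on the parsed entries. The sketch below is Python standing in for Zisp; `define_rule`, `DECODER_RULES`, and the dict result are all illustrative assumptions, not Zisp's actual API:

```python
# Registry of rune-bound decoder rules, modeling what (#DEFINE ...) adds to.
DECODER_RULES = {}

def define_rule(rune):
    """Model of (#DEFINE ...): bind a decode-time procedure to a rune."""
    def register(fn):
        DECODER_RULES[rune] = fn
        return fn
    return register

@define_rule("#ht")
def make_hash_table(entries):
    # Each entry is a (key value) list, as in #ht((a 1) (b 2)).
    return {key: value for key, value in entries}

def decode_list(form):
    """Decode a list whose head may be a rune bound to a rule."""
    if form and isinstance(form[0], str) and form[0] in DECODER_RULES:
        return DECODER_RULES[form[0]](form[1:])
    return form

# #ht((a 1) (b 2)) parses as (#ht (a 1) (b 2)):
table = decode_list(["#ht", ["a", 1], ["b", 2]])
print(table)  # {'a': 1, 'b': 2}
```

The point of the model: the rule runs with no knowledge of surrounding bindings, it just maps one parsed datum to a value, which is why such rules suit object literals rather than code generation.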
```scheme
;; Can't do this with decoder rules

(import (zisp base))
(import (de tkammer my-module))

(define-syntax (my-macro x y)
  (let ((x (call something from my-module))
        (y (also bind this one to something))
        (foo (this is a new local identifier)))
    (do-something-with foo)
    ;; this could also contain a `foo` without clashing:
    ))
```

Decoder rules would probably be equivalent in power to Common Lisp macros, but it will only be possible to bind them to runes, not to regular identifiers, so they will be demarcated very clearly. They aren't intended for the same purposes as Common Lisp macros, so the equal power is merely incidental. Use hygienic macros if you want "real" Lisp macros; decoder rules are only for superficial syntax enrichment, not meant to be intertwined with program logic.