New note.

author: Taylan Kammer <taylan.kammer@gmail.com> 2026-01-07 13:26:51 +0100
committer: Taylan Kammer <taylan.kammer@gmail.com> 2026-01-07 13:26:51 +0100
commit: cf2697d24c13cdc7ea5f93ce0ff5143f41a85a83 (patch)
tree: 7658ff8d6e758b1f63c6cbae342c87db1cfde045
parent: b49af311220090c126be917993ba547cbf48bbaa (diff)
2 files changed, 219 insertions, 0 deletions
diff --git a/notes/260107-decoder.md b/notes/260107-decoder.md
new file mode 100644
index 0000000..a1118b7
--- /dev/null
+++ b/notes/260107-decoder.md
@@ -0,0 +1,218 @@
+# Decoder
+
+_2026 January_
+
+I've mulled over this quite a bit now, and I believe I've figured out
+what kind of design I want for the "decoder" component.
+
+To recap: Zisp has a "parser" that implements an extremely bare-bones
+s-expression format (though with some interesting syntax sugar baked
+in), with a lot of the features you would expect of a typical "reader"
+being offloaded into a second pass over the data.
+
+That second pass is done by the *decoder* and will handle, among other
+things:
+
+- Number literals (the parser only knows about strings)
+
+- Boolean literals (the parser only knows about "runes")
+
+- Literals for various compound objects like vectors
+
+- Datum labels/references, for cyclic data
+
+- Emitting direct references to macros like quote, unquote, and those
+  implementing some of the more exotic syntax features like `foo.bar`
+  for field and method access, `foo:bar` for type declarations, etc.
+
+(To be clear, `foo.bar` actually becomes `(#DOT foo & bar)` at the
+parse stage, and `foo:bar` becomes `(#COLON foo & bar)` and so on.
+The decoder then replaces `#DOT`, `#COLON`, and the like with
+references to the macros that actually implement the feature.)
+
+The decoder is also going to be extensible, to allow for something
+similar to reader macros in Common Lisp, but closer to regular macros
+because this extensibility will be based on runes: A list beginning
+with a rune can invoke a decoder procedure for that rune, and these
+can be user-defined.
+
+I've previously agonized over whether this means that the decoder is
+essentially the same thing as a macro expander, or rather, whether it
+would make sense to merge the functionality of the two.  But I've come
+to the conclusion that this would be wrong.
+
+Key differences between the decoder and a macro expander include:
+
+- The macro expander is fully aware of bindings and lexical scope;
+  it's influenced by import statements, operates on syntax objects
+  that carry scope context, and so on.  The decoder is completely
+  oblivious to identifier bindings and doesn't understand scoping.
+  For example, there's nothing like `let-syntax` for the decoder.
+
+- The macro expander only calls a macro when the head of a list is an
+  identifier bound to a syntax transformer.  The decoder walks through
+  lists and checks for runes everywhere; otherwise the following would
+  not work as expected:
+
+  ```scheme
+  ;; Alist with vectors as the values.
+  ((x & #(a b))
+   (y & #(c d)))
+  ```
+
+  The parser will turn the entries into `(x #HASH a b)` and the like,
+  since `#(a b)` is sugar for `(#HASH a b)` and `(x & (#HASH a b))` is
+  equivalent to `(x #HASH a b)`.  So, to make this work, the decoder
+  checks every single pair in a list and invokes a transformer if the
+  `car` of that pair is a rune bound to a decoder rule.
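+
+As a rough illustration, the pair-by-pair walk can be modeled in a few
+lines of Python.  This is only a sketch under stated assumptions:
+`Rune`, `Pair`, `from_list`, `to_list` and `decode` are names made up
+for this example, a Python tuple stands in for a vector, and it
+assumes that a rule receives the already-decoded rest of the list.
+None of this is the actual Zisp implementation.
+
+```python
+from dataclasses import dataclass
+from typing import Any, Callable, Dict
+
+
+@dataclass(frozen=True)
+class Rune:
+    """Hypothetical stand-in for a rune such as #HASH."""
+    name: str
+
+
+@dataclass
+class Pair:
+    """A cons cell; None plays the role of the empty list."""
+    car: Any
+    cdr: Any
+
+
+def from_list(items):
+    """Build a proper list of pairs from a Python list."""
+    tail = None
+    for item in reversed(items):
+        tail = Pair(item, tail)
+    return tail
+
+
+def to_list(data):
+    """Collect the elements of a proper list into a Python list."""
+    out = []
+    while isinstance(data, Pair):
+        out.append(data.car)
+        data = data.cdr
+    return out
+
+
+def decode(datum, rules: Dict[Rune, Callable[[Any], Any]]):
+    """Check every pair, not just list heads, for a bound rune."""
+    if not isinstance(datum, Pair):
+        return datum
+    if isinstance(datum.car, Rune) and datum.car in rules:
+        # Assumption: the rule gets the decoded rest of the list.
+        return rules[datum.car](decode(datum.cdr, rules))
+    return Pair(decode(datum.car, rules), decode(datum.cdr, rules))
+
+
+HASH = Rune("HASH")
+rules = {HASH: lambda args: tuple(to_list(args))}
+
+# (x #HASH a b), which is what the parser makes of (x & #(a b))
+entry = from_list(["x", HASH, "a", "b"])
+result = decode(entry, rules)
+```
+
+Here `result` is a single pair whose `car` is `"x"` and whose `cdr` is
+the tuple standing in for the vector, even though `#HASH` was never at
+the head of a list.  A rune with no rule bound to it is left alone.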
+
+These differences not only mean that the implementation will be quite
+different, but also that the decoder is conceptually a very different
+thing.  No doubt there will be some similarity in their algorithms,
+but the conceptual simplicity of the decoder (no notion of scope or
+identifier bindings) means that you can reason about what it will do
+to source files much more easily.
+
+Macros in Scheme have a completely different "feel" to them.  They're
+really part of the program logic.  The whole point of hygienic macros
+is that they fit in seamlessly with the rest of your program, rather
+than being a disjoint pre-processor operating outside program logic.
+That's valuable in a different way.  (Zisp will also support hygienic
+macros, as Scheme does.)
+
+Although the decoder is not as smart as a macro expander, I still
+intend to make it fairly powerful, supporting:
+
+- `(#IMPORT ...)` to import additional decoder rules dynamically, so
+  you could have something akin to a library of decoder extensions.
+  Yes, I know: It's ironic to list the decoder's lack of awareness of
+  imports as a key difference from the expander, and then make it
+  support its own import mechanism.  But it's not the same.  Regular
+  imports will be allowed within lexical scopes; decoder imports are
+  top-level only.
+
+- `(#DEFINE ...)` to dynamically add a decoder rule on the spot.
+  Again, not like a regular define: Top-level only, and unaware of
+  surrounding bindings.  The decoder procedures defined in this way
+  will run in a pristine standard environment, though they can use
+  regular imports within their body to call external code.
+
+- `(#STRING ...)` to embed the contents of a file as a string literal,
+  similar to `@embedFile()` in Zig.
+
+- `(#PARSE ...)` to parse a single expression from a file and put it
+  into this position.  (Error if file contains more expressions.)
+
+- `(#SPLICE ...)` to parse all expressions in a file and splice them
+  into this position.  (Essentially, `#include` from C, but obviously
+  not meant to be used like in C.)
+
+These will be turned off by default, so a decoded file cannot run
+arbitrary code, or maliciously embed `/dev/random`!  The standard
+"Zisp code decoder" configuration used to read program and library
+files will then enable these features.
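+
+One way to picture this gating is a small configuration object that
+the decoder consults before running a privileged rule.  The sketch
+below is Python and purely hypothetical: `DecoderConfig`, `allows`,
+`DATA_DECODER` and `CODE_DECODER` are invented names, and the real
+configuration mechanism may look nothing like this.
+
+```python
+from dataclasses import dataclass
+from typing import FrozenSet
+
+# Runes whose rules can read files or run code (from the list above).
+PRIVILEGED = frozenset({"IMPORT", "DEFINE", "STRING", "PARSE", "SPLICE"})
+
+
+@dataclass(frozen=True)
+class DecoderConfig:
+    """Hypothetical decoder configuration: privileged runes are opt-in."""
+    enabled: FrozenSet[str] = frozenset()
+
+    def allows(self, rune_name: str) -> bool:
+        # Ordinary runes are always fine; privileged ones must be enabled.
+        return rune_name not in PRIVILEGED or rune_name in self.enabled
+
+
+# Default: safe for decoding untrusted data files.
+DATA_DECODER = DecoderConfig()
+
+# "Zisp code decoder": used for program and library files.
+CODE_DECODER = DecoderConfig(enabled=PRIVILEGED)
+```
+
+With this model, `DATA_DECODER.allows("SPLICE")` is false while
+`CODE_DECODER.allows("SPLICE")` is true, and a harmless rune such as
+`HASH` is allowed by both.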
+
+Splicing could be used for the same effect as an import, but import
+makes it explicit that no expressions are being inserted.  Files with
+decoder rules could also be compiled into a binary, which the import
+mechanism could locate and use, instead of parsing the source file
+again every time.
+
+Here are some imaginary Zisp source files demonstrating decoder use:
+
+```scheme
+;; a.zisp
+
+(#IMPORT "ht.zisp")  ;may load compiled code of ht.zisp from a cache
+
+(define my-hash-table #ht((a 1) (b 2)))  ;#ht imported from ht.zisp
+
+(#DEFINE (#foo x y)
+  (import (de tkammer my-helper-module))
+  (let ((blah (frobnicate x))
+        (blub (quiblify y)))
+    `(foo bar ,(generate blah blub))))
+
+(#foo x y z)   ;decoder error
+
+(#foo x y)     ;proper use
+
+(a b #foo x y) ;also works, but don't do it please
+
+(#DEFINE #bar '(+ 1 2))
+
+(import (zisp io))  ;imports the print function
+
+(print #bar)   ;will print 3
+
+(#SPLICE "b.zisp")
+```
+
+```scheme
+;; ht.zisp
+
+(#DEFINE (#ht & entries)
+  (define ht (make-hash-table))
+  (loop ((key value) entries)
+    (ht.set key value))
+  ht)
+```
+
+```scheme
+;; b.zisp
+
+(define (foobar)
+  (let ((data (#STRING "example.data")))
+    (data.do-something)))
+
+(define cycle #%0=(1 2 3 & #%0%))
+```
+
+If you find the use of uppercase to be ugly, consider that a feature,
+because messing with the decoder this much would be discouraged.  The
+only example above that actually makes some sense is the one defining
+hash table syntax.
+
+Actually, since I want to make it possible to serialize absolutely
+anything in Zisp, a regular macro could also be used to construct
+hash-table literals.  See the [serialization](250210-serialize.html)
+note on that.
+
+However, this causes such custom object literals to not stand out:
+
+```scheme
+(define (foobar)
+  (let ((my-ht (ht (a 1) (b 2))))  ;doesn't look like a literal
+    (use my-ht here)))
+
+(define (foobar)
+  (let ((my-ht #ht((a 1) (b 2))))  ;more obvious that it's a literal
+    (use my-ht here)))
+```
+
+For this reason, the convention would be that decoder rules are used
+to implement new object literal syntax, and macros are used when you
+want to output code with hygienic bindings.
+
+```scheme
+;; Can't do this with decoder rules
+
+(import (zisp base))
+(import (de tkammer my-module))
+
+(define-syntax (my-macro x y <body>)
+  (let ((x (call something from my-module))
+        (y (also bind this one to something))
+        (foo (this is a new local identifier)))
+    (do-something-with foo)
+    ;; this could also contain a `foo` without clashing:
+    <body>))
+```
+
+Decoder rules would probably be equivalent in power to Common Lisp
+macros, but it will only be possible to bind them to runes, not to
+regular identifiers, so they will be demarcated very clearly.  They
+aren't intended for the same purposes as Common Lisp macros, so the
+equal power is merely incidental.  Use hygienic macros if you want
+"real" Lisp macros; decoder rules are only for superficial syntax
+enrichment, not meant to be intertwined with program logic.
diff --git a/notes/index.md b/notes/index.md
index dd5a946..5d02a60 100644
--- a/notes/index.md
+++ b/notes/index.md
@@ -23,3 +23,4 @@
 * [Goals](250920-goals.html)
 * [A full-stack programming language](260102-full-stack.html)
 * [Simplifying S-Expression Grammar](260106-simpler-grammar.html)
+* [Decoder](260107-decoder.html)