2 files changed, 115 insertions, 6 deletions
diff --git a/notes/250219-reader.md b/notes/250219-reader.md
index de71b4e..503d402 100644
--- a/notes/250219-reader.md
+++ b/notes/250219-reader.md
@@ -7,6 +7,10 @@ article:*
 
 [Symbols are strings are symbols](250210-symbols.html)
 
+*This whole article is me rambling, and the actual implementation of
+the parser that I settled on is slightly different from all the ideas
+that are wildly explored here.  See late addition at the bottom.*
+
 OK but hear me out... What if there were different reader modes, for
 code and (pure) data?
 
@@ -463,10 +467,57 @@ from the apostrophe if needed.)
 Also, all those would work without a rune as well, to allow a file to
 change the meaning of some of the default syntax sugar if desired:
 
-    "foo"    -> (#string . foo)
+    "foo"        -> (#string . foo)
 
     [foo bar]    -> (#square foo bar)
 
     {foo bar}    -> (#braces foo bar)
 
 Or something like that.  I'm making this all up as I go.
+
+## Actual implementation
+
+_2026 January_
+
+To summarize what I actually ended up implementing:
+
+- There is only one parser, not separate data and code parsers.
+
+- It simply desugars `"foo bar"` into `(#QUOTE . |foo bar|)`, i.e.,
+  these expressions are equivalent, and indistinguishable once they
+  have been parsed into data.  (The syntax `|foo bar|` represents a
+  string literal in its purest form.)  Another equivalent expression
+  is `'|foo bar|`, which also parses into `(#QUOTE . |foo bar|)`.
+  All three parse into the exact same data in memory.
+
+- If you want to use Zisp expressions for something like config files
+  and want to type `"foo bar"` instead of `|foo bar|` but don't want
+  to deal with `(#QUOTE . |foo bar|)` then just run a decoder on the
+  data before using it.  You'll need to run a decoder on it anyway if
+  you want to support vectors, mappings, and other such data types in
+  your config file that don't have a *direct* data representation.
+
+- The decoder is not implemented yet, but it will be configurable and
+  may have default configurations for "code" and "data" where the data
+  configuration would presumably just strip `(#QUOTE . foo)` down to
+  `foo`, making `"foo"` and `|foo|` totally equivalent in data
+  contexts like config files.  In the code configuration, it would
+  decode `(#QUOTE . foo)` into a macro call expression object which,
+  when evaluated, results in `foo`.
+
+- If you want a config file with code snippets in it, and
+  don't want e.g. `(code (string-append "foo" x))` to be decoded into
+  `(code (string-append foo x))` thus changing the meaning of the
+  embedded code, you have two options:
+
+  1. Make your entire config file be Zisp code written in a DSL.
+
+  2. Wrap code snippets in one layer of quoting like `'(...)` which
+     will effectively protect nested uses of `#QUOTE` from the data
+     decoder, since decoding is a breadth-first operation.
+
+See here for full documentation of Zisp expressions as implemented:
+
+- [Informal docs](https://git.tkammer.de/zisp/tree/docs/parser.md)
+- [Formal spec](https://git.tkammer.de/zisp/tree/spec/syntax.md)
+- [ABNF](https://git.tkammer.de/zisp/tree/spec/syntax.abnf)
diff --git a/spec/syntax.md b/spec/syntax.md
index b85ed78..91e5495 100644
--- a/spec/syntax.md
+++ b/spec/syntax.md
@@ -6,7 +6,9 @@ We use a BNF notation with the following rules:
   followed by `bar`.
 
 * Expressions may be followed by `?`, `*`, `+`, `{N}`, or `{N,M}`,
-  which have the meanings they have in regular expressions.
+  which have meanings analogous to those in regular expressions.
+
+* The syntax `[foo]` is shorthand for `(foo)?`.
 
 * The syntax is defined in terms of bytes, not characters.  Terminals
   `'c'` and `"c"` refer to the ASCII value of the given character `c`.
@@ -18,10 +20,13 @@ We use a BNF notation with the following rules:
 
 * Ranges of terminal values are expressed as `x...y` (inclusive).
 
-* There is no ambiguity, backtracking, or look-ahead beyond the byte
-  currently being matched.  Rules match left to right, depth-first,
-  and greedy.  As soon as the input matches the first terminal of a
-  rule, it must match that rule to the end.
+* ABNF "core rules" like `ALPHA` and `HEXDIG` are supported, with the
+  addition of EOF to explicitly demarcate the end of the byte stream.
+
+* There is no ambiguity, backtracking, or look-ahead beyond one byte.
+  Rules match left to right, depth-first, and greedy.  As soon as the
+  input matches the first terminal of a rule, it must match that rule
+  to the end or it is considered a syntax error.
 
 The last rule means that the BNF is very simple to translate to code.
 
@@ -29,6 +34,59 @@ The parser consumes one `unit` from an input stream every time it's
 called; it returns the `datum` therein, or EOF.
 
 ```
+Unit          : Blank* ( Datum [Blank] | EOF )
+
+
+Blank         : 9...13 | Comment
+
+Datum         : OneDatum ( [JoinChar] OneDatum )*
+
+JoinChar      : '.' | ':'
+
+
+Comment       : ';' ( SkipUnit | SkipLine )
+
+SkipUnit      : '~' Unit
+
+SkipLine      : ( ~LF )* [LF]
+
+
+OneDatum      : BareString | CladDatum
+
+BareString    : ( '.' | '+' | '-' | DIGIT ) ( BareChar | '.' )*
+              | BareChar+
+
+CladDatum     : '|' PipeStrElt* '|'
+              | '"' QuotStrElt* '"'
+              | '#' HashExpr
+              | '(' List ')' | '[' List ']' | '{' List '}'
+              | "'" Datum | '`' Datum | ',' Datum
+
+
+BareChar      : ALPHA | DIGIT
+              | '!' | '$' | '%' | '&' | '*' | '+' | '-' | '/'
+              | '<' | '=' | '>' | '?' | '@' | '^' | '_' | '~'
+
+
+PipeStrElt    : ~( '|' | '\' ) | '\' StringEsc
+
+QuotStrElt    : ~( '"' | '\' ) | '\' StringEsc
+
+HashExpr      : Rune [ '\' BareString | CladDatum ]
+              | '\' BareString
+              | '%' Label ( '%' | '=' Datum )
+              | CladDatum
+
+List          : Unit* [ '.' Unit ] Blank*
+
+
+StringEsc     : '\' | '|' | '"' | ( HTAB | SP )* LF ( HTAB | SP )*
+              | 'a' | 'b' | 't' | 'n' | 'v' | 'f' | 'r' | 'e'
+              | 'x' ( HEXDIG{2} )+ ';'
+              | 'u' HEXDIG{1,6} ';'
+
 
+Rune          : ALPHA ( ALPHA | DIGIT ){0,5}
 
+Label         : HEXDIG{1,12}
 ```