# Zisp S-Expression Syntax

We use a BNF notation with the following rules:

* Concatenation of expressions is implicit: `foo bar` means `foo` followed by `bar`.
* Expressions may be followed by `?`, `*`, `+`, `{N}`, or `{N,M}`, which have meanings analogous to regular expressions.
* The syntax `[foo]` is shorthand for `(foo)?`.
* The syntax is defined in terms of bytes, not characters. Terminals `'c'` and `"c"` refer to the ASCII value of the given character `c`. Numbers are in decimal and refer to a byte with the given value.
* The `~` prefix means NOT. It only applies to rules that match one byte, and negates them. For example, `~( 'a' | 'b' )` matches any byte other than 97 and 98.
* Ranges of terminal values are expressed as `x...y` (inclusive).
* ABNF "core rules" like `ALPHA` and `HEXDIG` are supported, with the addition of EOF to explicitly demarcate the end of the byte stream.
* There is no ambiguity, backtracking, or look-ahead beyond one byte. Rules match left to right, depth-first, and greedy. As soon as the input matches the first terminal of a rule, it must match that rule to the end or it is considered a syntax error.

The last rule means that the BNF is very simple to translate to code.

The parser consumes one `unit` from an input stream every time it's called; it returns the `datum` therein, or EOF.

```
Unit        : Blank* ( Datum [Blank] | EOF )

Blank       : 9...13 | Comment

Datum       : OneDatum ( [JoinChar] OneDatum )*

JoinChar    : '.' | ':'

Comment     : ';' ( SkipUnit | SkipLine )

SkipUnit    : '~' Unit

SkipLine    : ( ~LF )* [LF]

OneDatum    : BareString | CladDatum

BareString  : ( '.' | '+' | '-' | DIGIT ) ( BareChar | '.' )*
            | BareChar+

CladDatum   : '|' PipeStrElt* '|'
            | '"' QuotStrElt* '"'
            | '#' HashExpr
            | '(' List ')' | '[' List ']' | '{' List '}'
            | "'" Datum | '`' Datum | ',' Datum

BareChar    : ALPHA | DIGIT
            | '!' | '$' | '%' | '&' | '*' | '+' | '-' | '/'
            | '<' | '=' | '>' | '?' | '@' | '^' | '_' | '~'

PipeStrElt  : ~( '|' | '\' ) | '\' StringEsc

QuotStrElt  : ~( '"' | '\' ) | '\' StringEsc

HashExpr    : Rune [ '\' BareString | CladDatum ]
            | '\' BareString
            | '%' Label ( '%' | '=' Datum )
            | CladDatum

List        : Unit* [ '.' Unit ] Blank*

StringEsc   : '\' | '|' | '"'
            | ( HTAB | SP )* LF ( HTAB | SP )*
            | 'a' | 'b' | 't' | 'n' | 'v' | 'f' | 'r' | 'e'
            | 'x' ( HEXDIG{2} )+ ';'
            | 'u' HEXDIG{1,6} ';'

Rune        : ALPHA ( ALPHA | DIGIT ){0,5}

Label       : HEXDIG{1,12}
```
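
To illustrate how directly the one-byte look-ahead rule maps to code, here is a minimal Python sketch of two of the rules above, `BareString` and `SkipLine`. It is not the Zisp parser; the function names (`is_bare_char`, `match_bare_string`, `skip_line`) and the convention of returning an end offset are assumptions made for this example. It also assumes the alternatives of `BareString` are tried in the order written, so a leading `+` or `-` selects the first alternative.

```python
from typing import Optional

# Punctuation bytes listed in the BareChar rule (ALPHA and DIGIT handled below).
BARE_PUNCT = b"!$%&*+-/<=>?@^_~"
LF = 10


def is_alpha(b: int) -> bool:
    """ABNF ALPHA: A-Z or a-z."""
    return 65 <= b <= 90 or 97 <= b <= 122


def is_digit(b: int) -> bool:
    """ABNF DIGIT: 0-9."""
    return 48 <= b <= 57


def is_bare_char(b: int) -> bool:
    """One-byte predicate for the BareChar rule."""
    return is_alpha(b) or is_digit(b) or b in BARE_PUNCT


def match_bare_string(data: bytes, pos: int = 0) -> Optional[int]:
    """Return the offset just past a BareString starting at pos,
    or None if the byte at pos cannot start one (hypothetical helper)."""
    if pos >= len(data):
        return None
    first = data[pos]
    if first in b".+-" or is_digit(first):
        # First alternative: the tail may contain BareChar or '.'.
        end = pos + 1
        while end < len(data) and (is_bare_char(data[end]) or data[end] == ord(".")):
            end += 1
        return end
    if is_bare_char(first):
        # Second alternative: BareChar+, so '.' ends the string.
        end = pos + 1
        while end < len(data) and is_bare_char(data[end]):
            end += 1
        return end
    return None


def skip_line(data: bytes, pos: int) -> int:
    """SkipLine: consume bytes up to and including the next LF, if any."""
    while pos < len(data) and data[pos] != LF:
        pos += 1
    if pos < len(data):
        pos += 1  # the optional trailing LF
    return pos
```

Each function decides what to do by inspecting a single byte and then consumes greedily, which is all the grammar's no-backtracking rule requires. Under this reading, `foo.bar` yields the bare string `foo` (the `.` is then available as a `JoinChar` in `Datum`), while `-1.5e3` is consumed as a single bare string.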