| author | Taylan Kammer <taylan.kammer@gmail.com> | 2025-03-28 21:14:12 +0100 |
|---|---|---|
| committer | Taylan Kammer <taylan.kammer@gmail.com> | 2025-03-28 21:14:12 +0100 |
| commit | 5025f9acf31cd880bbff62ff47ed03b69a0025ee (patch) | |
| tree | 866f9365ae87315b0d5e41a8fe27435b803ce706 /spec/syntax.md | |
| parent | 615e400ff150a3c355086664c7f9de512b5859dc (diff) | |
| parent | 2cbfacaedcc77e28e0a0473045cac689fb43a8ef (diff) | |
Merge branch 'new-parser'
Diffstat (limited to 'spec/syntax.md')
| -rw-r--r-- | spec/syntax.md | 34 |
1 files changed, 34 insertions, 0 deletions
diff --git a/spec/syntax.md b/spec/syntax.md
new file mode 100644
index 0000000..b85ed78
--- /dev/null
+++ b/spec/syntax.md
@@ -0,0 +1,34 @@
+# Zisp S-Expression Syntax
+
+We use a BNF notation with the following rules:
+
+* Concatenation of expressions is implicit: `foo bar` means `foo`
+  followed by `bar`.
+
+* Expressions may be followed by `?`, `*`, `+`, `{N}`, or `{N,M}`,
+  which have the meanings they have in regular expressions.
+
+* The syntax is defined in terms of bytes, not characters. Terminals
+  `'c'` and `"c"` refer to the ASCII value of the given character `c`.
+  Numbers are in decimal and refer to a byte with the given value.
+
+* The `~` prefix means NOT. It only applies to rules that match one
+  byte, and negates them. For example, `~( 'a' | 'b' )` matches any
+  byte other than 97 and 98.
+
+* Ranges of terminal values are expressed as `x...y` (inclusive).
+
+* There is no ambiguity, backtracking, or look-ahead beyond the byte
+  currently being matched. Rules match left to right, depth-first,
+  and greedy. As soon as the input matches the first terminal of a
+  rule, it must match that rule to the end.
+
+The last rule means that the BNF is very simple to translate to code.
+
+The parser consumes one `unit` from an input stream every time it's
+called; it returns the `datum` therein, or EOF.
+
+```
+
+
+```
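The notation rules added in this commit (byte terminals, inclusive ranges, the `~` prefix, and greedy matching with no look-ahead beyond the current byte) can be illustrated with a minimal Python sketch. This is not the Zisp parser: every helper name below (`byte`, `byte_range`, `either`, `negate`, `take_while`) is hypothetical, and the `48...57` digit range is a made-up example, since the grammar block itself is empty in this revision.

```python
# Illustrative sketch only: all helper names are hypothetical, not Zisp's.

def byte(c):
    """Terminal 'c': matches the ASCII value of the character c."""
    value = ord(c)
    return lambda b: b == value

def byte_range(lo, hi):
    """Range x...y: matches any byte value from lo to hi, inclusive."""
    return lambda b: lo <= b <= hi

def either(*preds):
    """Alternation between rules that each match a single byte."""
    return lambda b: any(p(b) for p in preds)

def negate(pred):
    """The ~ prefix: matches any single byte the wrapped rule rejects."""
    return lambda b: not pred(b)

def take_while(pred, data, pos):
    """Greedy repetition (the * suffix) of a single-byte rule.

    Inspects only the byte currently being matched and never backtracks.
    Returns the position of the first byte that does not match.
    """
    while pos < len(data) and pred(data[pos]):
        pos += 1
    return pos

# ~( 'a' | 'b' ): any byte other than 97 and 98.
not_a_or_b = negate(either(byte('a'), byte('b')))

# A hypothetical digit rule written with decimal byte values: 48...57.
digit = byte_range(48, 57)
```

Because each rule is decided by the single byte at the current position, a predicate per rule plus a loop for repetition is enough here; no input needs to be buffered or re-read.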
