Diffstat (limited to 'spec/syntax.md')
-rw-r--r-- spec/syntax.md 31
1 files changed, 18 insertions, 13 deletions
diff --git a/spec/syntax.md b/spec/syntax.md
index affa7a1..7f3561c 100644
--- a/spec/syntax.md
+++ b/spec/syntax.md
@@ -6,7 +6,7 @@ We use a BNF notation with the following rules:
followed by `bar`.
* Expressions may be followed by `?`, `*`, `+`, `{N}`, or `{N,M}`,
- which have meanings analogous to regular expressions.
+ which have the same meanings as in regular expressions.
* The syntax `[foo]` is shorthand for `(foo)?`.
@@ -20,21 +20,24 @@ We use a BNF notation with the following rules:
* Ranges of terminal values are expressed as `x...y` (inclusive).
-* ABNF "core rules" like `ALPHA` and `HEXDIG` are supported, with the
- addition of EOF to explicitly demarcate the end of the byte stream.
+* ABNF "core rules" like `ALPHA` and `HEXDIG` are supported.
-* There is no ambiguity, backtracking, or look-ahead beyond one byte.
+* There is no ambiguity, or look-ahead / backtracking beyond one byte.
Rules match left to right, depth-first, and greedy. As soon as the
input matches the first terminal of a rule, it must match that rule
to the end or it is considered a syntax error.
The last rule means that the BNF is very simple to translate to code.
+It also probably makes it equivalent to PEG.
-The parser consumes one `unit` from an input stream every time it's
-called; it returns the `datum` therein, or EOF.
+The parser consumes one `Unit` from an input stream every time it's
+called; it returns the `Datum` therein, or EOF. The final optional
+`Blank` represents the fact that the parser will consume one more
+blank at the end if it finds one; this is because `Datum` is not
+self-closing so the parser has to check if it goes on.
```
-Unit : Blank* ( Datum [Blank] | EOF )
+Unit : Blank* [ Datum [Blank] ]
Blank : 9...13 | Comment
@@ -44,16 +47,17 @@ Datum : OneDatum ( [JoinChar] OneDatum )*
JoinChar : '.' | ':'
-Comment : ';' ( SkipUnit | SkipLine [LF] )
+Comment : ';' ( SkipUnit | SkipLine )
SkipUnit : '~' Unit
-SkipLine : ( ~LF )*
+SkipLine : ( ~LF )* [LF]
OneDatum : BareString | CladDatum
-BareString : BareChar+
+BareString : ( '.' | '+' | '-' | DIGIT ) ( BareChar | '.' )*
+ | BareChar+
CladDatum : '|' ( PipeStrChar | '\' StringEsc )* '|'
| '"' ( QuotStrChar | '\' StringEsc )* '"'
@@ -63,8 +67,9 @@ CladDatum : '|' ( PipeStrChar | '\' StringEsc )* '|'
BareChar : ALPHA | DIGIT
- | '!' | '$' | '%' | '*' | '+' | '-' | '.' | '/'
- | '<' | '=' | '>' | '?' | '@' | '^' | '_' | '~'
+ | '!' | '$' | '%' | '*' | '+'
+ | '-' | '/' | '<' | '=' | '>'
+ | '?' | '@' | '^' | '_' | '~'
PipeStrChar : ~( '|' | '\' )
@@ -76,7 +81,7 @@ HashExpr : Rune [ '\' BareString | CladDatum ]
| '%' Label ( '%' | '=' Datum )
| CladDatum
-List : Unit* [ '&' Unit ] Blank*
+List : Unit* [ Blank* '&' Unit ] Blank*
StringEsc : '\' | '|' | '"' | ( HTAB | SP )* LF ( HTAB | SP )*
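
Note on the revised `BareString` rule: the change removes `'.'` from `BareChar` and allows it only inside strings that begin with `'.'`, `'+'`, `'-'`, or a digit. A minimal Python sketch of a recognizer for that rule follows. It is an illustration, not code from this repository: the function name `match_bare_string` and the two constant names are hypothetical, and it works on `str` rather than on the byte stream the spec describes.

```python
import string

# BareChar per the revised grammar; '.' is no longer a member.
# (Hypothetical names, for illustration only.)
BARE_CHARS = frozenset(string.ascii_letters + string.digits + "!$%*+-/<=>?@^_~")

# First characters that select the first alternative of BareString.
NUMERIC_LEAD = frozenset(".+-" + string.digits)


def match_bare_string(text: str, pos: int = 0) -> int:
    """Return the index just past the BareString starting at pos.

    Returns pos unchanged if no BareString starts there.
    Assumes str input; the spec itself is defined over bytes.
    """
    if pos >= len(text):
        return pos
    first = text[pos]
    if first in NUMERIC_LEAD:
        # First alternative: '.' is allowed in the tail.
        allowed = BARE_CHARS | {"."}
    elif first in BARE_CHARS:
        # Second alternative: BareChar+ only, so '.' ends the string.
        allowed = BARE_CHARS
    else:
        return pos
    end = pos + 1
    # Greedy, left to right, with no look-ahead beyond the current character.
    while end < len(text) and text[end] in allowed:
        end += 1
    return end
```

Under this sketch, `foo.bar` stops after `foo`, which leaves the `'.'` to be read as a `JoinChar`, while `-1.5e3` is consumed as a single bare string.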