Diffstat (limited to 'spec/syntax.md')
| -rw-r--r-- | spec/syntax.md | 31 |
1 files changed, 18 insertions, 13 deletions
diff --git a/spec/syntax.md b/spec/syntax.md
index affa7a1..7f3561c 100644
--- a/spec/syntax.md
+++ b/spec/syntax.md
@@ -6,7 +6,7 @@ We use a BNF notation with the following rules:
   followed by `bar`.

 * Expressions may be followed by `?`, `*`, `+`, `{N}`, or `{N,M}`,
-  which have meanings analogous to regular expressions.
+  which have the same meanings as in regular expressions.

 * The syntax `[foo]` is shorthand for `(foo)?`.

@@ -20,21 +20,24 @@ We use a BNF notation with the following rules:

 * Ranges of terminal values are expressed as `x...y` (inclusive).

-* ABNF "core rules" like `ALPHA` and `HEXDIG` are supported, with the
-  addition of EOF to explicitly demarcate the end of the byte stream.
+* ABNF "core rules" like `ALPHA` and `HEXDIG` are supported.

-* There is no ambiguity, backtracking, or look-ahead beyond one byte.
+* There is no ambiguity, or look-ahead / backtracking beyond one byte.
   Rules match left to right, depth-first, and greedy. As soon as the
   input matches the first terminal of a rule, it must match that rule
   to the end or it is considered a syntax error.

 The last rule means that the BNF is very simple to translate to code.
+It also probably makes it equivalent to PEG.

-The parser consumes one `unit` from an input stream every time it's
-called; it returns the `datum` therein, or EOF.
+The parser consumes one `Unit` from an input stream every time it's
+called; it returns the `Datum` therein, or EOF. The final optional
+`Blank` represents the fact that the parser will consume one more
+blank at the end if it finds one; this is because `Datum` is not
+self-closing so the parser has to check if it goes on.

 ```
-Unit        : Blank* ( Datum [Blank] | EOF )
+Unit        : Blank* [ Datum [Blank] ]

 Blank       : 9...13
             | Comment
@@ -44,16 +47,17 @@ Datum       : OneDatum ( [JoinChar] OneDatum )*
 JoinChar    : '.'
             | ':'

-Comment     : ';' ( SkipUnit | SkipLine [LF] )
+Comment     : ';' ( SkipUnit | SkipLine )

 SkipUnit    : '~' Unit

-SkipLine    : ( ~LF )*
+SkipLine    : ( ~LF )* [LF]

 OneDatum    : BareString
             | CladDatum

-BareString  : BareChar+
+BareString  : ( '.' | '+' | '-' | DIGIT ) ( BareChar | '.' )*
+            | BareChar+

 CladDatum   : '|' ( PipeStrChar | '\' StringEsc )* '|'
             | '"' ( QuotStrChar | '\' StringEsc )* '"'
@@ -63,8 +67,9 @@ CladDatum   : '|' ( PipeStrChar | '\' StringEsc )* '|'


 BareChar    : ALPHA | DIGIT
-            | '!' | '$' | '%' | '*' | '+' | '-' | '.' | '/'
-            | '<' | '=' | '>' | '?' | '@' | '^' | '_' | '~'
+            | '!' | '$' | '%' | '*' | '+'
+            | '-' | '/' | '<' | '=' | '>'
+            | '?' | '@' | '^' | '_' | '~'

 PipeStrChar : ~( '|' | '\' )

@@ -76,7 +81,7 @@ HashExpr    : Rune [ '\' BareString | CladDatum ]
             | '%' Label ( '%' | '=' Datum )
             | CladDatum

-List        : Unit* [ '&' Unit ] Blank*
+List        : Unit* [ Blank* '&' Unit ] Blank*

 StringEsc   : '\' | '|' | '"'
             | ( HTAB | SP )* LF ( HTAB | SP )*
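
Note on the `BareString` change: with `.` removed from `BareChar`, a bare string may contain dots only if it starts with `.`, `+`, `-`, or a digit. The following is a minimal Python sketch of how a recognizer for the new rule might look under the spec's "first terminal decides the rule" convention. The function name `match_bare_string` and the two constants are hypothetical and are not taken from the repository.

```python
import string

# BareChar as defined on the "+" side of the diff (note: no '.').
BARE_CHARS = frozenset(
    (string.ascii_letters + string.digits + "!$%*+-/<=>?@^_~").encode("ascii")
)
# First bytes that select the first BareString alternative.
NUMERIC_START = frozenset(b".+-" + string.digits.encode("ascii"))


def match_bare_string(data: bytes, pos: int = 0) -> int:
    """Return the index just past the BareString starting at pos.

    Returns pos unchanged if no BareString starts there. Hypothetical
    helper: it only recognizes the extent of the token, nothing more.
    """
    if pos >= len(data):
        return pos
    first = data[pos]
    if first in NUMERIC_START:
        # First alternative: dots are allowed in the rest of the token.
        allowed = BARE_CHARS | {ord(".")}
    elif first in BARE_CHARS:
        # Second alternative: BareChar+ only, so a dot ends the token.
        allowed = BARE_CHARS
    else:
        return pos
    end = pos + 1
    # Greedy match, looking at one byte at a time.
    while end < len(data) and data[end] in allowed:
        end += 1
    return end
```

Under these assumptions, `match_bare_string(b"1.5")` returns 3 (the whole input is one `BareString`), while `match_bare_string(b"a.b")` returns 1, leaving the `.` to be read as a `JoinChar` between two `OneDatum`s.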
