summaryrefslogtreecommitdiff
path: root/notes/sr.md
diff options
context:
space:
mode:
Diffstat (limited to 'notes/sr.md')
-rw-r--r--notes/sr.md368
1 files changed, 368 insertions, 0 deletions
diff --git a/notes/sr.md b/notes/sr.md
new file mode 100644
index 0000000..0fa9e06
--- /dev/null
+++ b/notes/sr.md
@@ -0,0 +1,368 @@
+# Better syntax-rules?
+
+Yesterday, someone on IRC asked for help in improving the following
+syntax-rules (s-r) macro:
+
+```scheme
+
+(define-syntax alist-let*
+ (syntax-rules ()
+
+ ;; uses subpattern to avoid fender
+ ;; alist-expr is evaluated only once
+ ((_ alist-expr ((key alias) ...) body body* ...)
+ (let ((alist alist-expr))
+ (let ((alias (assq-ref alist 'key)) ...)
+ body body* ...)))
+
+ ((_ alist-expr (key ...) body body* ...)
+ (let ((alist alist-expr))
+ (let ((key (assq-ref alist 'key)) ...)
+ body body* ...)))
+
+))
+
+;; Example uses:
+
+(define alist '((foo . 1) (bar . 2)))
+
+(alist-let alist (foo bar)
+ (+ foo bar)) ;=> 3
+
+(alist-let alist ((foo x) (bar y))
+ (+ x y)) ;=> 3
+
+;; Problem: Can't mix plain key with (key alias) forms:
+
+(alist-let alist ((foo x) bar)
+ (+ x bar)) ;ERROR
+
+```
+
+How do we make it accept a mix of plain keys and `(key alias)` pairs?
+Oh boy, it's more difficult than you may think if you're new to s-r
+macros. Basically, there's no "obvious" solution, and all we have is
+various hacks we can apply.
+
+Let's look at two fairly straightforward hacks, and their problems.
+
+## Option 1
+
+```scheme
+
+;; Solution 1: Internal helper patterns using a dummy constant.
+
+(define-syntax alist-let*
+ (syntax-rules ()
+
+ ((_ "1" alist ((key alias) rest ...) body body* ...)
+ (let ((alias (assq-ref alist 'key)))
+ (alist-let* "1" alist (rest ...) body body* ...)))
+
+ ((_ "1" alist (key rest ...) body body* ...)
+ (let ((key (assq-ref alist 'key)))
+ (alist-let* "1" alist (rest ...) body body* ...)))
+
+ ((_ "1" alist () body body* ...)
+ (begin body body* ...))
+
+ ;; dispatch, ensuring alist-expr only eval'd once
+ ((_ <alist> <bindings> <body> <body*> ...)
+ (let ((alist <alist>))
+ (alist-let* "1" alist <bindings> <body> <body*> ...)))
+
+))
+
+```
+
+(I've switched to my `<foo>` notation for pattern variables in the
+"dispatcher" part. Don't let it distract you. I strongly endorse
+that convention for s-r pattern variables, to make it clear that
+they're like "empty slots" where *any* expression can match, but
+that's a topic for another day.)
+
+What the solution above does, is "dispatch" actual uses of the macro,
+which obviously won't have the string literal `"1"` in first position,
+onto internal sub-macros, which can call each other recursively, so
+each layer only handles either a stand-alone `key` or a `(key alias)`
+couple.
+
+There's some nuances to this implementation. First, if you're not
+familiar with s-r macros, you may mistakenly worry that this solution
+could mask a programmer error: What if we accidentally call the macro
+with a variable bound to the string "1"? Would this lead to a very
+annoying bug that's hard to find? No; remember that syntax-rules
+patterns match *unevaluated* operands, so the internal sub-patterns
+are only triggered by the appearance of a literal string constant of
+`"1"` in the first position; a mistake that would be very apparent in
+code you're reading, and is extremely unlikely to occur by accident.
+
+As for a real pitfall of this implementation: The dispatcher pattern
+*must* be in the final position; otherwise it will actually catch our
+recursive calls starting with `"1"` and bind that string literal to
+the `alist` pattern variable! (Kind of the "reverse" of the fake
+problem described in the previous paragraph, in a sense?) If the
+dispatcher pattern is in the first position, it will keep calling
+itself with an increasing number of `"1"`s at the start, in an
+infinite loop, until you forcibly stop it or it crashes.
+
+As a side note, this brings me to a general s-r pitfall, that applies
+to the original implementation as well in this case: Since patterns
+are matched top to bottom, a simple `key` pattern variable *could*
+actually match the form `(key alias)`, so you have to make sure that
+the pattern for matching those key-alias couples comes before the one
+matching plain keys.
+
+Oh, and by the way, if you're questioning whether we even need those
+internal helper patterns at all: Yes, it's the only way to ensure the
+initial `<alist>` expression is only evaluated once, in an outermost
+`let` wrapping everything.
+
+Let's summarize the issues we've faced:
+
+1. It's easy to forget that pattern variables can match arbitrary
+ expressions, not just identifiers, and there's no way to say it
+ should only match identifiers.
+
+2. When an arbitrary expression is matched by the pattern variable,
+ using it means repeating that expression every time, unless you
+ explicitly use `let` to take care of that, which may require
+ dispatching to another pattern immediately if you wanted to use
+ recursive patterns.
+
+3. You may accidentally put a more generic pattern first, causing it
+ to match an input that was meant to be matched by a subsequent
+ pattern with more deeper destructuring.
+
+It may be interesting trying to solve 3 by specifying some way of
+measuring the "specificity" of a pattern, and saying that those with
+the highest specificity match first, but that may prove difficult.
+Besides, solving 1 would basically solve 3 anyway.
+
+Racket has syntax-parse, which solves the first problem through an
+incredibly sophisticated specification of "syntax patterns" that take
+the place of the humble generic pattern variable of syntax-rules.
+It's cool and all, but the charm of s-r is the simplicity. Can't we
+use some of the ideas of syntax-parse patterns and add them to s-r?
+
+In Racket, there's the concept of "syntax classes," and a pattern can
+be a variable with `:syntax-class-id` appended to its name, which is
+how you make it only match inputs of that syntax class, such as for
+example, only identifiers. Trying to find out what syntax class ids
+are supported may send you down a rabbit hole of how you can actually
+define your own syntax classes, but that just seems to be a weak spot
+of the Racket online documentation; looking a bit closer, you should
+find the list of built-in classes that are supported. They are just
+called "library" syntax classes for some reason:
+
+[Library Syntax Classes and Literal Sets -- Racket Documentation](https://docs.racket-lang.org/syntax/Library_Syntax_Classes_and_Literal_Sets.html)
+
+It would be great if there were classes for atoms (anything that's not
+a list) and lists, though; then we could do this:
+
+```scheme
+
+(define-syntax alist-let*
+ (syntax-rules ()
+
+ ((_ <alist>:list bindings body body* ...)
+ (let ((alist <alist>))
+ (alist-let* alist bindings body body* ...)))
+
+ ((_ alist (key:id ...) body body* ...)
+ (let ((key (assq-ref alist 'key)) ...)
+ body body* ...))
+
+ ((_ alist ((key:atom alias:id) ...) body body* ...)
+ (let ((alias (assq-ref alist 'key)) ...)
+ body body* ...))
+
+))
+
+```
+
+(The key could also be a non-symbol immediate value, like a fixnum,
+boolean, etc.; anything that `assq-ref` can compare via `eq?`. One
+could also just not quote the key, and instead let it be an arbitrary
+expression, which would probably make for a more useful macro, but
+that's a different topic.)
+
+Isn't that really neat? But let's go one step further. I believe
+this strategy of binding an expression via `let` to ensure it's only
+evaluated once is probably so common that it warrants a shortcut:
+
+```scheme
+
+(define-syntax alist-let*
+ (syntax-rules ()
+
+ ((_ alist:bind (key:id ...) body body* ...)
+ (let ((key (assq-ref alist 'key)) ...)
+ body body* ...))
+
+ ((_ alist:bind ((key:atom alias:id) ...) body body* ...)
+ (let ((alias (assq-ref alist 'key)) ...)
+ body body* ...))
+
+))
+
+```
+
+The idea here is: All pattern variables marked with `:bind` are first
+collected, and if there is at least one that is not an identifier,
+then the whole template (the part that produces the output of the s-r
+macro) is wrapped in a `let` which binds those expressions to the name
+of the pattern variable, and uses of that pattern variable within the
+template refer to that binding.
+
+I'm not entirely sure yet if this is an ingenious idea, or a hacky fix
+for just one arbitrary issue you can face while using syntax-rules,
+but I suspect it's a common enough pattern to make it desirable.
+
+## Option 2
+
+I said there were various hacks to solve the original problem; here's
+the second variant. It's actually almost the same thing, but we put
+the helper patterns into a separate macro.
+
+```scheme
+
+;; Solution 2: Separate helper macro
+
+(define-syntax alist-let*
+ (syntax-rules ()
+
+ ;; dispatch, ensuring alist-expr only eval'd once
+ ((_ <alist> <bindings> <body> <body*> ...)
+ (let ((alist <alist>))
+ (%alist-let-helper alist <bindings> <body> <body*> ...)))
+
+))
+
+(define-syntax %alist-let-helper
+ (syntax-rules ()
+
+ ;; basically do here what the internal helpers did in solution 1,
+ ;; but without the need for the "1" string literal hack
+
+))
+
+```
+
+That's cleaner in terms of the patterns we have to write, but we had
+to define a second top-level macro, which feels wrong. It should be
+properly encapsulated as part of the first.
+
+This is where another improvement to s-r could come in handy, and
+that's not making it evaluate to a syntax transformer (i.e., lambda)
+directly, but rather making it more like syntax-case in that regard.
+However, the additional lambda wrapping always really annoyed me, so
+the following syntax may be desirable.
+
+```scheme
+
+(define-syntax (alist-let* . s)
+
+ (define-syntax (helper . s)
+ (syntax-rules s ()
+ ((alist ((key alias) rest ...) body body* ...)
+ (let ((alias (assq-ref alist 'key)))
+ (alist-let* "1" alist (rest ...) body body* ...)))
+
+ ((alist (key rest ...) body body* ...)
+ (let ((key (assq-ref alist 'key)))
+ (alist-let* "1" alist (rest ...) body body* ...)))
+
+ ((alist () body body* ...)
+ (begin body body* ...))
+ ))
+
+ (syntax-rules s ()
+ ((<alist> <bindings> <body> <body*> ...)
+ (let ((alist <alist>))
+ (helper alist <bindings> <body> <body*> ...)))))
+
+```
+
+That looks a bit confusing at first sight, but we can actually do
+something a lot better now, since we already get one stand-alone
+pattern at the start, which fits our intention perfectly here:
+
+```scheme
+
+(define-syntax (alist-let* <alist> <bindings> <body> <body*> ...)
+
+ (define-syntax (helper . s)
+ (syntax-rules s ()
+ ((alist ((key alias) rest ...) body body* ...)
+ (let ((alias (assq-ref alist 'key)))
+ (alist-let* "1" alist (rest ...) body body* ...)))
+
+ ((alist (key rest ...) body body* ...)
+ (let ((key (assq-ref alist 'key)))
+ (alist-let* "1" alist (rest ...) body body* ...)))
+
+ ((alist () body body* ...)
+ (begin body body* ...))
+ ))
+
+ #'(let ((alist <alist>))
+ (helper alist <bindings> <body> <body*> ...)))
+
+```
+
+To be honest, I don't like this solution nearly as much as the first,
+and I now realize that there wouldn't be much point in keeping s-r if
+it's going to be so close to syntax-case. (The only difference, at
+this point, would be that s-r implicitly puts `#'` in front of the
+templates. That's literally all it would do, if I'm not mistaken.)
+
+## Or just implement syntax-parse?
+
+Racket can actually give you the implicit lambda when you want it, by
+offering `syntax-parser` as an alternative to `syntax-parse`:
+
+```scheme
+
+;; The following two are equivalent.
+
+(define-syntax foo
+ (lambda (s)
+ (syntax-parse s ...)))
+
+(define-syntax foo
+ (syntax-parser ...))
+
+```
+
+(At least, I'm pretty sure that's how it's supposed to work; the docs
+just bind the result of `syntax-parser` to an identifier via `define`
+and call it as a procedure to showcase it, for whatever reason.)
+
+Yes, syntax-parse is a lot more complex than syntax-rules, but to be
+honest it seems mainly the fault of the documentation that it doesn't
+showcase the simplest ways of using it, which look essentially the
+same as using syntax-rules, so it's not clear why s-r should stay if
+you have syntax-parse.
+
+Maybe I would just make one change, which is to allow the following
+syntax and thus make the additional `syntax-parser` unnecessary:
+
+```scheme
+
+(define-syntax (foo s)
+ (syntax-parse s ...))
+
+```
+
+Note that this is different from my previous idea of making the first
+operand to `define-syntax` a pattern. The only thing I don't like
+about this variant is that there will never be more than one argument,
+but maybe that's fine?
+
+In any case, I guess the only innovation I came up with here is the
+special `:bind` syntax class id, assuming there isn't already a
+similar thing in Racket or elsewhere.
+
+Oh and this made me realize I should add `foo:bar` as reader syntax to
+Zisp, turning it into `(#COLON foo . bar)` or such.