update

author: Taylan Kammer <taylan.kammer@gmail.com> 2025-02-28 14:38:57 +0100
committer: Taylan Kammer <taylan.kammer@gmail.com> 2025-02-28 14:38:57 +0100
commit: 472f3e89a61ec51218cefe65305ec6f0a0d95fbf (patch)
tree: a64ef16a6b23a822ab09e02b9d967f3b8bb3d17e /html
parent: 34de389fe744018e808f2c8b301648d504ab610d (diff)
3 files changed, 532 insertions, 6 deletions
diff --git a/html/index.md b/html/index.md
index 37565f1..e7c5ff2 100644
--- a/html/index.md
+++ b/html/index.md
@@ -6,9 +6,17 @@ been invented today, and had it been designed with pragmatic use as a
 primary concern in its design.
 
 This language doesn't actually exist yet.  You are merely reading the
-ramblings of a madman.
+ramblings of a madman.  A little bit of code is here already though:
 
-* [Compilation is execution](notes/compilation.html)
+[Zisp on GitHub](https://github.com/TaylanUB/zisp/)
+
+Some of the following articles are quite insightful.  Others are VERY
+rambly; you've been warned.
+
+Some are outdated with regard to the actual implementation of Zisp,
+because writing the code often gives you yet another perspective.
+
+* [Compilation is execution](notes/compile.html)
 * [Everything can be serialized](notes/serialize.html)
 * [Symbols are strings](notes/symbols.html)
 * [Stop the "cons" madness!](notes/cons.html)
@@ -22,7 +30,6 @@ ramblings of a madman.
 * [Object-oriented programming](notes/oop.html)
 * [Equality and equivalence semantics](notes/equal.html)
 * [NaN-packing](notes/nan.html)
-
-Temporary source repo before I set up my own git server:
-
-[Zisp on GitHub](https://github.com/TaylanUB/zisp/)
+* [Reader? Decoder? I barely know 'er!](notes/reader.html)
+* [Does the decoder implement macros?](notes/macros.html)
+* [Better syntax-rules?](notes/sr.html)
diff --git a/html/notes/macros.md b/html/notes/macros.md
new file mode 100644
index 0000000..3169c49
--- /dev/null
+++ b/html/notes/macros.md
@@ -0,0 +1,151 @@
+# Does the decoder implement macros?
+
+I've written about the [parser/decoder dualism](reader.html) in a
+previous article.  Long story short, the parser takes care of syntax
+sugar, like turning `#(...)` into `(#HASH ...)`, and the decoder takes
+care of turning that into a vector or whatever.
+
+Now, since the job of the decoder seems superficially quite similar to
+that of a macro expander, I've been agonizing for the past two days or
+so whether it *is* the macro expander.
+
+(Warning: This post is probably going to be very rambly, as I'm trying
+to gather my thoughts by writing it.)
+
+On one hand, sure:
+
+    (define-syntax #HASH
+      (syntax-rules ()
+        ((#HASH <element> ...)
+         (vector '<element> ...))))
+
+Or something like that.  You know what I mean?  I mean, in Scheme you
+can't return a vector from a macro, but in Zisp the idea is that you
+can very well do that if you want, because why not.
+
+It's very much possible that I will eventually realize that this is a
+bad idea in some way, but we'll see.  So far I really like the idea of
+a macro just returning objects, like a procedure, rather than having
+to return a syntax object that has a binding to that procedure.
+
+This may be similar to John Shutt's "vau calculus" from his language
+Kernel.  Maybe Zisp will even end up being an implementation of the
+vau calculus.  But I don't know; I've never fully grokked the vau
+calculus, so if I end up implementing it, it will be by accident.
+
+In any case, I want the user to be able to bind transformers to runes,
+and doing so feels like it's pretty much the same thing as defining a
+macro, so maybe the decoder should also be the macro expander.
+
+But then there's an issue with quoting.  Consider the following:
+
+    (define stuff '(foo #(0 1 2)))
+
+In Zisp, this would first of all be parsed into:
+
+    (define stuff (#QUOTE foo (#HASH 0 1 2)))
+
+Now, if #QUOTE didn't decode its operand, we'd end up seeing #HASH in
+the result, never creating the vector we meant to create.
+
+But if #QUOTE calls decode on its operand, and the decoder is also the
+macro expander, whoops:
+
+    (let-syntax ((foo (syntax-rules () ((_ x) (bar x)))))
+      '(foo #(0 1 2)))
+
+    ;; => (bar #(0 1 2))
+
+I mean... MAYBE that should happen, actually?!  Probably not, though.
+What Scheme does isn't gospel; Zisp isn't Scheme and it will do some
+things differently, but we *probably* don't want anything inside a
+quoted expression to be macro expanded.  Probably.
+
+The thought that I might actually want that to happen sent me down a
+whole rabbit hole, and made me question "runes" altogether.  If they
+just make the decoder invoke a predefined macro, well, why not ditch
+runes and have the parser emit macro calls?
+
+So instead of:
+
+    #(x y z)  ->  (#HASH x y z)
+
+(Which is then "decoded" into a vector...)  Why not just:
+
+    #(x y z)  ->  (VECTOR x y z)
+
+And then `VECTOR` is, I don't know, a macro in the standard library I
+guess.  If the decoder is the macro expander, then sure, it will know
+about the standard library; it will have a full-blown environment that
+it uses to macro expand, to look up macro names.
+
+But no, I think this conflates everything too much.  Even just on the
+level of comprehensibility of code containing literals, I think it's
+good for there to be something that you just know will turn into an
+object of some type, no matter what; that's what a literal is.
+
+(In Zisp, it's not the reader that immediately turns the literal into
+an object of the correct type, but the decoder still runs before the
+evaluator so it's almost the same.)
+
+Then again, maybe this intuition just comes from having worked with
+Scheme for such a long time, and maybe it's not good.  Perhaps it's
+more elegant if everything is a macro.  Don't pile feature on top of
+feature, remember?
+
+Booleans, by the way, would just be identifier syntax then.  Just
+`true` and `false` without the hash sign.  In Zisp, you can't shadow
+identifiers anyway, so now they're like keywords in other languages,
+also a bit like `t` and `nil` in CL and Elisp.
+
+IF we are fine with the quote issue described above, then I *think*
+everything being a macro would be the right thing to do.  Although
+I've said the decoder could be used for things other than code, like
+for configuration files containing user-defined data types, you could
+still do that by defining macros and calling the macro expander on the
+config file.
+
+It's just that you would either not be able to have stuff like vectors
+in a quoted list (you'd just get a list like `(VECTOR ...)` in it if
+you tried), or you'd have to be expanding any macros encountered
+within the quoted list.  Either both, or neither.
+
+Not getting a choice, you say...  That's not very expressive.  That
+seems like a limitation in the language.  Remember: remove the
+limitations that make additional features seem necessary.
+
+Next thing we will have two variants of quote: One which quotes for
+real, and one that expands macros.  Or maybe some mechanism to mark
+macros as being meant to be run inside a quote or not, but then we
+re-invented runes in a different way.
+
+Which brings me back to runes, and how `#QUOTE` could handle them,
+even if the decoder is the macro expander.
+
+Encountering `#QUOTE` could tell the decoder that while decoding the
+operand, it should only honor runes, not macros bound to identifiers.
+
+That would probably be a fine way to solve the quote problem, should
+the decoder also be the macro expander: Macros are bound to runes or
+identifiers, and the rune-bound macros are those that are expanded
+even inside a quote.
+
+I think that would be the same as having completely separate decode
+and macro-expand phases.
+
+(The reason we would want them merged, by the way, is that it would
+presumably prevent duplication of code, since what they do is so
+similar.)
+
+It's possible that I'm agonizing for no reason at all because maybe
+the decoder cannot be the macro expander anyway.
+
+We will see.
+
+For now, I think it's best to proceed by implementing the decoder, and
+once I've come to the macro expander I can see if it makes sense to
+merge the two or not.
+
+But I'll probably keep runes one way or another, since they're a nice
+way of marking things that should be processed "no matter what" such
+that they can function as object literals within code.
diff --git a/html/notes/sr.md b/html/notes/sr.md
new file mode 100644
index 0000000..0fa9e06
--- /dev/null
+++ b/html/notes/sr.md
@@ -0,0 +1,368 @@
+# Better syntax-rules?
+
+Yesterday, someone on IRC asked for help in improving the following
+syntax-rules (s-r) macro:
+
+```scheme
+
+(define-syntax alist-let*
+  (syntax-rules ()
+  
+    ;; uses subpattern to avoid fender
+    ;; alist-expr is evaluated only once
+    ((_ alist-expr ((key alias) ...) body body* ...)
+     (let ((alist alist-expr))
+       (let ((alias (assq-ref alist 'key)) ...)
+         body body* ...)))
+         
+    ((_ alist-expr (key ...) body body* ...)
+     (let ((alist alist-expr))
+       (let ((key (assq-ref alist 'key)) ...)
+         body body* ...)))
+
+))
+
+;; Example uses:
+
+(define alist '((foo . 1) (bar . 2)))
+
+(alist-let* alist (foo bar)
+  (+ foo bar))                     ;=> 3
+
+(alist-let* alist ((foo x) (bar y))
+  (+ x y))                         ;=> 3
+
+;; Problem: Can't mix plain key with (key alias) forms:
+
+(alist-let* alist ((foo x) bar)
+  (+ x bar))                       ;ERROR
+
+```
+
+How do we make it accept a mix of plain keys and `(key alias)` pairs?
+Oh boy, it's more difficult than you may think if you're new to s-r
+macros.  Basically, there's no "obvious" solution, and all we have is
+various hacks we can apply.
+
+Let's look at two fairly straightforward hacks, and their problems.
+
+## Option 1
+
+```scheme
+
+;; Solution 1: Internal helper patterns using a dummy constant.
+
+(define-syntax alist-let*
+  (syntax-rules ()
+
+    ((_ "1" alist ((key alias) rest ...) body body* ...)
+     (let ((alias (assq-ref alist 'key)))
+       (alist-let* "1" alist (rest ...) body body* ...)))
+
+    ((_ "1" alist (key rest ...) body body* ...)
+     (let ((key (assq-ref alist 'key)))
+       (alist-let* "1" alist (rest ...) body body* ...)))
+
+    ((_ "1" alist () body body* ...)
+     (begin body body* ...))
+
+    ;; dispatch, ensuring alist-expr only eval'd once
+    ((_ <alist> <bindings> <body> <body*> ...)
+     (let ((alist <alist>))
+       (alist-let* "1" alist <bindings> <body> <body*> ...)))
+
+))
+
+```
+
+(I've switched to my `<foo>` notation for pattern variables in the
+"dispatcher" part.  Don't let it distract you.  I strongly endorse
+that convention for s-r pattern variables, to make it clear that
+they're like "empty slots" where *any* expression can match, but
+that's a topic for another day.)
+
+What the solution above does, is "dispatch" actual uses of the macro,
+which obviously won't have the string literal `"1"` in first position,
+onto internal sub-macros, which can call each other recursively, so
+each layer only handles either a stand-alone `key` or a `(key alias)`
+couple.
+
+There are some nuances to this implementation.  First, if you're not
+familiar with s-r macros, you may mistakenly worry that this solution
+could mask a programmer error: What if we accidentally call the macro
+with a variable bound to the string "1"?  Would this lead to a very
+annoying bug that's hard to find?  No; remember that syntax-rules
+patterns match *unevaluated* operands, so the internal sub-patterns
+are only triggered by the appearance of a literal string constant of
+`"1"` in the first position; a mistake that would be very apparent in
+code you're reading, and is extremely unlikely to occur by accident.
+
+As for a real pitfall of this implementation: The dispatcher pattern
+*must* be in the final position; otherwise it will actually catch our
+recursive calls starting with `"1"` and bind that string literal to
+the `alist` pattern variable!  (Kind of the "reverse" of the fake
+problem described in the previous paragraph, in a sense?)  If the
+dispatcher pattern is in the first position, it will keep calling
+itself with an increasing number of `"1"`s at the start, in an
+infinite loop, until you forcibly stop it or it crashes.
+
+As a side note, this brings me to a general s-r pitfall, that applies
+to the original implementation as well in this case: Since patterns
+are matched top to bottom, a simple `key` pattern variable *could*
+actually match the form `(key alias)`, so you have to make sure that
+the pattern for matching those key-alias couples comes before the one
+matching plain keys.
+
+Oh, and by the way, if you're questioning whether we even need those
+internal helper patterns at all: Yes, it's the only way to ensure the
+initial `<alist>` expression is only evaluated once, in an outermost
+`let` wrapping everything.
+
+Let's summarize the issues we've faced:
+
+1. It's easy to forget that pattern variables can match arbitrary
+   expressions, not just identifiers, and there's no way to say it
+   should only match identifiers.
+
+2. When an arbitrary expression is matched by the pattern variable,
+   using it means repeating that expression every time, unless you
+   explicitly use `let` to take care of that, which may require
+   dispatching to another pattern immediately if you wanted to use
+   recursive patterns.
+
+3. You may accidentally put a more generic pattern first, causing it
+   to match an input that was meant to be matched by a subsequent
+   pattern with deeper destructuring.
+
+It may be interesting trying to solve 3 by specifying some way of
+measuring the "specificity" of a pattern, and saying that those with
+the highest specificity match first, but that may prove difficult.
+Besides, solving 1 would basically solve 3 anyway.
+
+Racket has syntax-parse, which solves the first problem through an
+incredibly sophisticated specification of "syntax patterns" that take
+the place of the humble generic pattern variable of syntax-rules.
+It's cool and all, but the charm of s-r is the simplicity.  Can't we
+use some of the ideas of syntax-parse patterns and add them to s-r?
+
+In Racket, there's the concept of "syntax classes," and a pattern can
+be a variable with `:syntax-class-id` appended to its name, which is
+how you make it only match inputs of that syntax class, such as for
+example, only identifiers.  Trying to find out what syntax class ids
+are supported may send you down a rabbit hole of how you can actually
+define your own syntax classes, but that just seems to be a weak spot
+of the Racket online documentation; looking a bit closer, you should
+find the list of built-in classes that are supported.  They are just
+called "library" syntax classes for some reason:
+
+[Library Syntax Classes and Literal Sets -- Racket Documentation](https://docs.racket-lang.org/syntax/Library_Syntax_Classes_and_Literal_Sets.html)
+
+It would be great if there were classes for atoms (anything that's not
+a list) and lists, though; then we could do this:
+
+```scheme
+
+(define-syntax alist-let*
+  (syntax-rules ()
+
+    ((_ <alist>:list bindings body body* ...)
+     (let ((alist <alist>))
+       (alist-let* alist bindings body body* ...)))
+
+    ((_ alist (key:id ...) body body* ...)
+     (let ((key (assq-ref alist 'key)) ...)
+       body body* ...))
+
+    ((_ alist ((key:atom alias:id) ...) body body* ...)
+     (let ((alias (assq-ref alist 'key)) ...)
+       body body* ...))
+
+))
+
+```
+
+(The key could also be a non-symbol immediate value, like a fixnum,
+boolean, etc.; anything that `assq-ref` can compare via `eq?`.  One
+could also just not quote the key, and instead let it be an arbitrary
+expression, which would probably make for a more useful macro, but
+that's a different topic.)
+
+Isn't that really neat?  But let's go one step further.  I believe
+this strategy of binding an expression via `let` to ensure it's only
+evaluated once is probably so common that it warrants a shortcut:
+
+```scheme
+
+(define-syntax alist-let*
+  (syntax-rules ()
+
+    ((_ alist:bind (key:id ...) body body* ...)
+     (let ((key (assq-ref alist 'key)) ...)
+       body body* ...))
+
+    ((_ alist:bind ((key:atom alias:id) ...) body body* ...)
+     (let ((alias (assq-ref alist 'key)) ...)
+       body body* ...))
+
+))
+
+```
+
+The idea here is: All pattern variables marked with `:bind` are first
+collected, and if there is at least one that is not an identifier,
+then the whole template (the part that produces the output of the s-r
+macro) is wrapped in a `let` which binds those expressions to the name
+of the pattern variable, and uses of that pattern variable within the
+template refer to that binding.
+
+I'm not entirely sure yet if this is an ingenious idea, or a hacky fix
+for just one arbitrary issue you can face while using syntax-rules,
+but I suspect it's a common enough pattern to make it desirable.
+
+## Option 2
+
+I said there were various hacks to solve the original problem; here's
+the second variant.  It's actually almost the same thing, but we put
+the helper patterns into a separate macro.
+
+```scheme
+
+;; Solution 2: Separate helper macro
+
+(define-syntax alist-let*
+  (syntax-rules ()
+
+    ;; dispatch, ensuring alist-expr only eval'd once
+    ((_ <alist> <bindings> <body> <body*> ...)
+     (let ((alist <alist>))
+       (%alist-let-helper alist <bindings> <body> <body*> ...)))
+
+))
+
+(define-syntax %alist-let-helper
+  (syntax-rules ()
+
+    ;; basically do here what the internal helpers did in solution 1,
+    ;; but without the need for the "1" string literal hack
+
+))
+
+```
+
+That's cleaner in terms of the patterns we have to write, but we had
+to define a second top-level macro, which feels wrong.  It should be
+properly encapsulated as part of the first.
+
+This is where another improvement to s-r could come in handy, and
+that's not making it evaluate to a syntax transformer (i.e., lambda)
+directly, but rather making it more like syntax-case in that regard.
+However, the additional lambda wrapping always really annoyed me, so
+the following syntax may be desirable.
+
+```scheme
+
+(define-syntax (alist-let* . s)
+
+  (define-syntax (helper . s)
+    (syntax-rules s ()
+      ((alist ((key alias) rest ...) body body* ...)
+       (let ((alias (assq-ref alist 'key)))
+         (helper alist (rest ...) body body* ...)))
+  
+      ((alist (key rest ...) body body* ...)
+       (let ((key (assq-ref alist 'key)))
+         (helper alist (rest ...) body body* ...)))
+  
+      ((alist () body body* ...)
+       (begin body body* ...))
+      ))
+
+  (syntax-rules s ()
+    ((<alist> <bindings> <body> <body*> ...)
+     (let ((alist <alist>))
+       (helper alist <bindings> <body> <body*> ...)))))
+
+```
+
+That looks a bit confusing at first sight, but we can actually do
+something a lot better now, since we already get one stand-alone
+pattern at the start, which fits our intention perfectly here:
+
+```scheme
+
+(define-syntax (alist-let* <alist> <bindings> <body> <body*> ...)
+
+  (define-syntax (helper . s)
+    (syntax-rules s ()
+      ((alist ((key alias) rest ...) body body* ...)
+       (let ((alias (assq-ref alist 'key)))
+         (helper alist (rest ...) body body* ...)))
+  
+      ((alist (key rest ...) body body* ...)
+       (let ((key (assq-ref alist 'key)))
+         (helper alist (rest ...) body body* ...)))
+  
+      ((alist () body body* ...)
+       (begin body body* ...))
+      ))
+
+  #'(let ((alist <alist>))
+      (helper alist <bindings> <body> <body*> ...)))
+
+```
+
+To be honest, I don't like this solution nearly as much as the first,
+and I now realize that there wouldn't be much point in keeping s-r if
+it's going to be so close to syntax-case.  (The only difference, at
+this point, would be that s-r implicitly puts `#'` in front of the
+templates.  That's literally all it would do, if I'm not mistaken.)
+
+## Or just implement syntax-parse?
+
+Racket can actually give you the implicit lambda when you want it, by
+offering `syntax-parser` as an alternative to `syntax-parse`:
+
+```scheme
+
+;; The following two are equivalent.
+
+(define-syntax foo
+  (lambda (s)
+    (syntax-parse s ...)))
+
+(define-syntax foo
+  (syntax-parser ...))
+
+```
+
+(At least, I'm pretty sure that's how it's supposed to work; the docs
+just bind the result of `syntax-parser` to an identifier via `define`
+and call it as a procedure to showcase it, for whatever reason.)
+
+Yes, syntax-parse is a lot more complex than syntax-rules, but to be
+honest it seems mainly the fault of the documentation that it doesn't
+showcase the simplest ways of using it, which look essentially the
+same as using syntax-rules, so it's not clear why s-r should stay if
+you have syntax-parse.
+
+Maybe I would just make one change, which is to allow the following
+syntax and thus make the additional `syntax-parser` unnecessary:
+
+```scheme
+
+(define-syntax (foo s)
+  (syntax-parse s ...))
+
+```
+
+Note that this is different from my previous idea of making the first
+operand to `define-syntax` a pattern.  The only thing I don't like
+about this variant is that there will never be more than one argument,
+but maybe that's fine?
+
+In any case, I guess the only innovation I came up with here is the
+special `:bind` syntax class id, assuming there isn't already a
+similar thing in Racket or elsewhere.
+
+Oh and this made me realize I should add `foo:bar` as reader syntax to
+Zisp, turning it into `(#COLON foo . bar)` or such.
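
A minimal runnable sketch of "Solution 1" from `html/notes/sr.md`, given here outside the patch itself. It assumes an R7RS-style `syntax-rules` that accepts string literals in patterns (Guile does). `lookup`, `sample`, `eval-count` and `get-sample` are hypothetical names introduced only for this illustration; `lookup` stands in for Guile's `assq-ref` so the sketch does not depend on one implementation.

```scheme
;; `lookup` is a hypothetical, portable stand-in for Guile's assq-ref.
(define (lookup alist key)
  (let ((pair (assq key alist)))
    (and pair (cdr pair))))

(define-syntax alist-let*
  (syntax-rules ()
    ;; The (key alias) clause must come before the plain-key clause,
    ;; because a bare pattern variable would also match (key alias).
    ((_ "1" alist ((key alias) rest ...) body body* ...)
     (let ((alias (lookup alist 'key)))
       (alist-let* "1" alist (rest ...) body body* ...)))
    ((_ "1" alist (key rest ...) body body* ...)
     (let ((key (lookup alist 'key)))
       (alist-let* "1" alist (rest ...) body body* ...)))
    ((_ "1" alist () body body* ...)
     (begin body body* ...))
    ;; Dispatcher: must stay last, or it would capture the "1" calls.
    ((_ alist-expr bindings body body* ...)
     (let ((alist alist-expr))
       (alist-let* "1" alist bindings body body* ...)))))

(define sample '((foo . 1) (bar . 2)))

;; Mixed plain key and (key alias) forms:
(alist-let* sample ((foo x) bar)
  (+ x bar))                       ; => 3

;; The alist expression is evaluated once, however many bindings follow:
(define eval-count 0)
(define (get-sample)
  (set! eval-count (+ eval-count 1))
  sample)

(alist-let* (get-sample) (foo (bar y))
  (+ foo y))                       ; => 3, and eval-count is now 1
```

Moving the dispatcher clause to the top reproduces the infinite expansion described in the note, which is a quick way to confirm that the clause order matters.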