Diffstat (limited to 'notes/250329-strings.md')
| -rw-r--r-- | notes/250329-strings.md | 30 |
1 files changed, 20 insertions, 10 deletions
diff --git a/notes/250329-strings.md b/notes/250329-strings.md
index 6f01944..43cb869 100644
--- a/notes/250329-strings.md
+++ b/notes/250329-strings.md
@@ -1,14 +1,16 @@
 # Symbols and strings, revisited
 
-My [original plan](symbols.html) was to make strings and symbols one
-and the same. Then I realized this introduced ambiguity between bare
-strings meant as identifiers, and quoted strings representing a string
-literal in code.
+_2025 March_
+
+My [original plan](250210-symbols.html) was to make strings and
+symbols one and the same. Then I realized this introduced ambiguity
+between bare strings meant as identifiers, and quoted strings
+representing a string literal in code.
 
 After a bunch of back-and-forth, I came up with the idea of the Zisp
-[decoder](reader.html) with which I'm very happy overall, but I still
-decided to ditch the idea of using an intermediate representation for
-quoted string literals like `(#STRING . "foo")` after all.
+[decoder](250219-reader.html) with which I'm very happy overall, but I
+still decided to ditch the idea of using a representation for quoted
+string literals like `(#STRING . "foo")` after all.
 
 The idea was that the reader would have a data mode and a code mode
 and that quoted strings would become `(#STRING . "foo")` or such in
@@ -25,9 +27,9 @@ So, ultimately I've decided to simply make quoted strings a proper
 sub-type of strings. (Or make symbols a sub-type of strings; which
 ever way you want to look at it.)
 
-Also, my [NaN-packing strategy](nan.html) has so much extra room that
-I've decided to put up-to-6-byte strings into NaNs as an optimization
-hack, and this applies to both quoted and bare strings.
+Also, my [NaN-packing strategy](250210-nan.html) has so much extra
+room that I've decided to put up-to-6-byte strings into NaNs as an
+optimization hack, and this applies to both quoted and bare strings.
 
 So we have two different string types, and two different in-memory
 representations for each. Let's summarize and give them names:
@@ -55,3 +57,11 @@ Here's how the parser uses these types:
 * Quoted string of more than 6 bytes? Uninterned string.
 
 *** WIP ***
+
+_2026 January_
+
+Currently, the Zisp parser does, after all, conflate strings and
+symbols, with string literals simply being quoted symbols. There
+aren't going to be separate data types because it's unnecessary after
+all. The syntax `"foo bar"` parses into `(#QUOTE . |foo bar|)` and
+I'll leave it at that.
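
For readers unfamiliar with the "up-to-6-byte strings into NaNs" idea mentioned in the note, here is a minimal Python sketch of how such packing could work. It is not taken from Zisp: the bit layout (length in mantissa bits 48 to 50, a quoted flag in the sign bit) and the names `pack_short_string`, `unpack_short_string` and `as_float` are assumptions made purely for illustration. The only facts relied on are that an IEEE 754 double with all exponent bits set and a non-zero mantissa is a NaN, and that 6 bytes fit in 48 of the 52 mantissa bits.

```python
import struct

# Hypothetical layout, NOT Zisp's actual one:
#   bit 63       quoted flag (sign bit)
#   bits 62..52  exponent, all ones
#   bit 51       quiet bit, always set so the value is a NaN
#   bits 50..48  string length in bytes (0..6)
#   bits 47..0   the string bytes, little-endian
QNAN = 0x7FF8_0000_0000_0000
SIGN = 1 << 63
MAX_LEN = 6


def pack_short_string(text, quoted=False):
    """Return the 64-bit pattern for a short string, or None if too long."""
    data = text.encode("utf-8")
    if len(data) > MAX_LEN:
        return None  # caller would fall back to a heap-allocated string
    bits = QNAN | (len(data) << 48) | int.from_bytes(data, "little")
    return bits | SIGN if quoted else bits


def unpack_short_string(bits):
    """Inverse of pack_short_string: return (text, quoted)."""
    length = (bits >> 48) & 0x7
    payload = bits & ((1 << 48) - 1)
    data = payload.to_bytes(MAX_LEN, "little")[:length]
    return data.decode("utf-8"), bool(bits & SIGN)


def as_float(bits):
    """Reinterpret the 64-bit pattern as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]
```

Under this assumed layout, both bare and quoted short strings live entirely inside one 64-bit word, and anything longer than 6 bytes is rejected so that a separate heap representation can take over.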
