# Symbols and strings, revisited

_2025 March_

My [original plan](250210-symbols.html) was to make strings and symbols one and the same. Then I realized this introduced ambiguity between bare strings meant as identifiers, and quoted strings representing a string literal in code.

After a bunch of back-and-forth, I came up with the idea of the Zisp [decoder](250219-reader.html) with which I'm very happy overall, but I still decided to ditch the idea of using a representation for quoted string literals like `(#STRING . "foo")` after all.

The idea was that the reader would have a data mode and a code mode and that quoted strings would become `(#STRING . "foo")` or such in code mode, but not in data mode. This way, reading a configuration file (in data mode) that uses quoted strings would not end up giving you this wonky thing with `#STRING`.

It was an exciting idea at first, but eventually I realized that the above was the *only* substantial reason to have separate modes for reading s-expressions. It also annoyed me a bit that every single quoted string in code would be wrapped in a cons cell...

So, ultimately I've decided to simply make quoted strings a proper sub-type of strings. (Or make symbols a sub-type of strings; whichever way you want to look at it.)

Also, my [NaN-packing strategy](250210-nan.html) has so much extra room that I've decided to put up-to-6-byte strings into NaNs as an optimization hack, and this applies to both quoted and bare strings.

So we have two different string types, and two different in-memory representations for each. Let's summarize and give them names:

* sstr: Short string (symbol, up to 6 bytes)
* qstr: Quoted short string (non-symbol, up to 6 bytes)
* istr: Interned string (symbol, greater than 6 bytes)
* ustr: Uninterned string (non-symbol, greater than 6 bytes)

Don't get hung up on the short four-letter names; they aren't fully descriptive.
The "qstr" isn't the only one representing a quoted string literal; a "ustr" may also represent one.

Here's how the parser uses these types:

* Encountered an unquoted string of up to 6 bytes? Make a sstr.
* Encountered a quoted string of up to 6 bytes? Make a qstr.
* Unquoted string of more than 6 bytes? Intern it to make an istr.
* Quoted string of more than 6 bytes? Uninterned string.

*** WIP ***

_2026 January_

Currently, the Zisp parser does, after all, conflate strings and symbols, with string literals simply being quoted symbols. There aren't going to be separate data types because it's unnecessary after all.

The syntax `"foo bar"` parses into `(#QUOTE . |foo bar|)` and I'll leave it at that.
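
To put the two designs side by side, here's a rough Python sketch of the 2025 four-way classification and of the current quoting scheme. It is only an illustration and not Zisp's actual code: the function names `classify_2025` and `read_2026` are made up, a Python tuple stands in for a cons cell, and counting length in UTF-8 bytes is an assumption (the notes above only say "bytes").

```python
SHORT_MAX = 6  # short strings fit in a NaN if they are up to 6 bytes


def classify_2025(text: str, quoted: bool) -> str:
    """Return the 2025-era type name for a string token."""
    # Assumption: the 6-byte limit is measured on the UTF-8 encoding.
    is_short = len(text.encode("utf-8")) <= SHORT_MAX
    if is_short:
        return "qstr" if quoted else "sstr"
    return "ustr" if quoted else "istr"


def read_2026(text: str, quoted: bool):
    """Current scheme: every string is a symbol; quoting wraps it."""
    # The tuple ("#QUOTE", sym) stands in for the pair (#QUOTE . sym).
    return ("#QUOTE", text) if quoted else text


print(classify_2025("foo", quoted=False))      # sstr
print(classify_2025("foo", quoted=True))       # qstr
print(classify_2025("foo bar", quoted=False))  # istr
print(classify_2025("foo bar", quoted=True))   # ustr
print(read_2026("foo bar", quoted=True))       # ('#QUOTE', 'foo bar')
```

Note that `"foo bar"` is 7 bytes, so under the old scheme it already falls on the interned/uninterned side of the 6-byte limit.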