summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorTaylan Kammer <taylan.kammer@gmail.com>2026-06-11 06:07:51 +0200
committerTaylan Kammer <taylan.kammer@gmail.com>2026-06-11 06:07:51 +0200
commitbfaa74b19fc81dbe071d55566a78a8e329237eff (patch)
tree020424baffcf63ec27127f1a0a43d9759d5369c3
parentb27809517b26089ad3359c1212caf9c3ffbf247d (diff)
New note.
-rw-r--r--notes/260811-fastcons4.md81
-rw-r--r--notes/index.md1
2 files changed, 82 insertions, 0 deletions
diff --git a/notes/260811-fastcons4.md b/notes/260811-fastcons4.md
new file mode 100644
index 0000000..7c7ccd5
--- /dev/null
+++ b/notes/260811-fastcons4.md
@@ -0,0 +1,81 @@
+# Cons cell optimization AGAIN
+
+_2026 June_
+
+Last episode: [Cons cell optimization again](260810-fastcons3.html)
+
+I am completely straying from my original idea of: "Just make a naive,
+simple interpreter, with bindings to libgccjit, and anyone who needs
+performance can just ask for specific functions in their code (or all
+of them) to be compiled to native code."
+
+(See: [The interpreter and the compiler](260522-interpreter.html))
+
+To be clear, I won't ditch that overall design idea. However, as I
+started asking myself how to at least make a *reasonably* efficient
+interpreter through the use of some smart tricks, I fell into a big
+rabbit hole, and found myself enthralled by the idea of an advanced
+direct AST interpreter that tries to approach the performance of a
+byte-code VM while sticking to a tree representation of code.
+
+One of the many causes of inefficiency in AST walking is the need to
+chase around pointers and, if using traditional s-expressions with
+cons cells, the terrible data density. (Citation needed? I think
+this is widely accepted.)
+
+A cons cell is 50% wasted space, using half of its memory to point to
+the next cell. This is mostly alleviated by the array-segmented list
+idea, but they still need a nil terminator for proper lists.
+
+That nil terminator is a whopping 8 bytes. If I also used 16-byte
+alignment, I'd be wasting an average of 12 bytes per list! That's
+what, almost 1/5 of a cache line? It's 18.75% to be exact. Even
+using 8-byte alignment, wasting only 8 bytes per list, it's 1/8 or
+12.5% of a cache line.
+
+Another way to look at it is total memory overhead. To calculate it,
+you'd need to know the exact average length of lists in code. Let's
+say it's 4 or something. (That's probably quite generous?) At that
+length, every 1 out of 5 elements is just the nil terminator; a data
+overhead of 20%!
+
+Reducing the memory footprint of code by ~10-20% or so is a highly
+attractive prospect. And all we need to do is ditch improper lists.
+
+The new strategy is this:
+
+* A 3-bit tag on a list pointer indicates the length; either 1 to 7,
+ or if the tag value is 0, greater than 7.
+
+* For lists longer than 7 elements, with the tag value 0, the array
+ continues until a sentinel bit pattern is reached.
+
+A list is always an array; there are no linked segment chains. There
+is no zero length list, since tag 0 means greater than 7, and other
+values mean that exact count of elements.
+
+The vast majority of functions have only a handful of arguments; we
+support up to 6 arguments without needing to rely on the special list
+array format that is sentinel-terminated.
+
+This means the data density is 100% for function call expressions up
+to 6 arguments (7 elements with the function itself). Perfect!
+
+## But... improper lists :(
+
+I honestly like improper lists. I think it's so elegant how Scheme
+uses them to denote rest arguments. I even had this fancy idea of
+using them for function calls where you pass on your own rest arg.
+
+(See: [Stop the "cons" madness!](250210-cons.html))
+
+But sometimes you have to make sacrifices! Besides, I think the only
+reason I like them is familiarity. They are kinda wonky if you think
+about it. Common Lisp and Emacs Lisp don't use improper lists in any
+part of their actual syntax, if I remember correctly. They use some
+special identifier to mark rest args, and I could do a similar thing.
+
+Damn it: I spent so much time refining the documentation of the Zisp
+parser, and now I'll have to make massive revisions again. Serves me
+well for obsessing on one component without yet having a clear bigger
+picture in mind.
diff --git a/notes/index.md b/notes/index.md
index e331101..b38c529 100644
--- a/notes/index.md
+++ b/notes/index.md
@@ -32,3 +32,4 @@
* [Static analysis to optimize GC](260524-gc-opt.html)
* [Cons cell optimization!](260608-fastcons2.html)
* [Cons cell optimization again](260610-fastcons3.html)
+* [Cons cell optimization AGAIN](260811-fastcons4.html)