summaryrefslogtreecommitdiff
path: root/drafts/posts/understanding-pratt-parsing.md
diff options
context:
space:
mode:
authorBrandon C. Irizarry <brandon.irizarry@gmail.com>2026-03-08 19:16:04 -0400
committerBrandon C. Irizarry <brandon.irizarry@gmail.com>2026-03-08 19:16:04 -0400
commitcdecb8191a791b8db66effe61de16cb3e3904238 (patch)
tree815c42e49e87ab89b103cf876437cc985f078675 /drafts/posts/understanding-pratt-parsing.md
parentc4af1098ffebffa02d3a055d7570193acd77985d (diff)
Reorder posts according to new publishing algorithm
Diffstat (limited to 'drafts/posts/understanding-pratt-parsing.md')
-rw-r--r--drafts/posts/understanding-pratt-parsing.md155
1 files changed, 0 insertions, 155 deletions
diff --git a/drafts/posts/understanding-pratt-parsing.md b/drafts/posts/understanding-pratt-parsing.md
deleted file mode 100644
index d88ba88..0000000
--- a/drafts/posts/understanding-pratt-parsing.md
+++ /dev/null
@@ -1,155 +0,0 @@
-+++
-title = "Understanding Pratt Parsing"
-tags = ["programming languages"]
-date = 2025-12-02
-
-summary = """
-
-A post I had written about Pratt parsing, in the context of a \
-programming language I was designing at the time.
-
-"""
-
-+++
-
-# Introduction
-
-I've forgotten how I came across Pratt parsing specifically. I had
-been working on an interpreter for a programming language based on the
-one loosely described in Greg Michaelson's *An Introduction to
-Functional Programming Through Lambda Calculus*. I had managed to
-implement simple arithmetic, and even extended the basic lambda
-calculus spec with assignment expressions (a feat which I was very
-proud of.)
-
-However, some parts of my implementation felt a bit hacky (for
-example, how I had implemented `letrec`), and my implementation of
-lazy evaluation, while mostly complete, ultimately turned out to be
-buggy.
-
-At first, I decided to rewrite the project from scratch. One major
-guiding factor was to narrow the scope of the project, borrowing some
-advice from [Zed Shaw](https://learncodethehardway.com/blog/32-very-deep-not-boring-beginner-projects/). I got around to writing a new tokenizer,
-and then I was on to writing the parser. My initial parser used the
-recursive descent technique, taking advantage of the simplified lambda
-calculus grammar used in Michaelson's text (for example, parentheses
-are always used for application terms there.)
-
-This time though, I wanted to try something different. And so,
-rummaging through the internets, I stumbled across Pratt parsing.
-
-# "It's like a burrito"
-
-Understanding Pratt parsing ended up being much harder than I
-expected. I ended up searching through a bunch of examples online:
-
-1. [Alex Kladov's](https://matklad.github.io/2020/04/13/simple-but-powerful-pratt-parsing.html) Rust-based tutorial.
-2. [Eli Bendersky's](https://eli.thegreenplace.net/2010/01/02/top-down-operator-precedence-parsing) Python-based tutorial.
-3. Bob Nystrom's [intro](https://journal.stuffwithstuff.com/2011/03/19/pratt-parsers-expression-parsing-made-easy/) to the subject, as well as the relevant
- chapter in his [Crafting Interpreters](https://craftinginterpreters.com/compiling-expressions.html).
-4. Vaughan Pratt's [original paper](https://tdop.github.io/). Shout-out to the legend who
- put this up as a GitHub Pages site!
-5. Douglas Crockford's celebrated [article](https://crockford.com/javascript/tdop/tdop.html) on the subject deserves
- honorable mention, though my JavaScript is currently rusty and so I
- didn't look into it in any depth.
-
-Alex Kladov in his post calls Pratt parsing the "monad tutorial of
-syntactic analysis". As I had recently become familiar with the
-concept of "monad" in a [philosophical](https://en.wikipedia.org/wiki/Monad_(philosophy)) sense, I aksed ChatGPT
-where the reference comes from: it turns out that it's an inside joke
-involving *Haskell* monads. I'm not an expert, but my readings have
-given me enough of an inkling to see the connection: Pratt parsing
-isn't a discrete "thing" with exactly one shape: it's more of a
-technique, if you will—a design pattern—which can assume various
-manifestations.
-
-A good example of this is iteration, because we all recognize it when
-we see it, even when we don't know the language—but no one syntactic
-construction sufficiently defines it. For-loops, while-loops, Python
-generator functions, and [optimized tail-recursive functions](https://mitp-content-server.mit.edu/books/content/sectbyfn/books_pres_0/6515/sicp.zip/full-text/book/book-Z-H-11.html#%_sec_1.2.1 ) all
-count as iteration.
-
-After struggling for some time, I finally managed to distill the
-essence of the algorithm, which I present here as pseudocode:
-
-```
-parse(level):
- t ← next(stream)
- acc ← nud_dispatch(t)
-
- while level < precedence(peek(stream)):
- t ← next(stream)
- acc ← led_dispatch(t, acc)
-
- return acc
-```
-
-Unwrapping what this does exactly is a surprisingly nuanced task,
-precisely because the algorithm is more about technique than
-structure. Because of this, I won't pretend to be up to the task
-here. Nevertheless, a brief synopsis is warranted.
-
-There is a global `stream` of tokens, such that `next(stream)`
-consumes and returns the next token, and `peek(stream)` returns the
-next token without consuming it. The function `nud_dispatch`
-interprets `t` as a *null denotation*, or "nud" for short, and
-initializes `acc`. The function `led_dispatch` interprets `t` as a
-*left denotation*, or "led" for short. It accumulates a value into
-`acc` using the existing value of `acc` (hence the choice of name.)
-Both dispatch functions call `parse` recursively in all but the most
-trivial cases.
-
-When `peek`ing the stream reveals a token with higher precedence than
-the current `level`, the while loop exits and `acc` is returned.
-
-The algorithm is initialized by calling `parse(0)`.
-
-# Down To Brass Tacks
-
-My approach was to take Eli Bendersky's full source code at the bottom
-of his post, and start chiseling away at it. What I ended up with was
-the same simple arithmetic calculator, only with a different
-architecture: I moved away from Eli's object-oriented approach towards
-something closer to the formulation given in the previous section.
-
-In the end, I was amazed at how simple and robust the actual
-implementation turned out to be! I feel that what I came up with (at
-this stage, anyway) is arguably simpler than even many of the examples
-I initially came across: for example, it isn't necessary to add space
-between precedence levels (10, 20, etc.), since you can use an enum to
-take care of any ordering needed. Also, using even and odd precedence
-levels (for handling right associativity) is unnecessary. For example,
-say you have precedence levels `MULTIPLICATION = 2` and
-`EXPONENTIATION = 3`. The algorithm cleverly avoids clashing
-`EXPONENTIATION-1` with `MULTIPLICATION` when enforcing right
-associativity for exponentiation. I found this to be one of the more
-remarkable aspects of the algorithm.
-
-# Wanting More
-
-To be fair, my calculator app technically doesn't parse arithmetic
-expressions: it evaluates them wholesale. This is OK: instead of
-accumulating an AST, I'm accumulating an arithmetic result.
-
-Because of how compelling the calculator app turned out to be, I
-decided to stop work on the lambda calculus project, and instead work
-on expanding the calculator into a full-blown programming language,
-albeit a simple one. I've already made progress in this direction: in
-addition to arithmetic (including trig functions!), the application
-currently supports variable assignment.
-
-Ideally, I'd like something with
-<br></br>
-
-+ Booleans
-+ Conditionals
-+ Loops
-+ Functions
-+ Proper lexical scoping, even for conditional and loop blocks
-<br></br>
-I'll see how many of these I manage.
-
-
-
-
-