Switch parser to Lalrpop #64

Draft
wackywendell wants to merge 6 commits into main from wendell/lalrpop

@wackywendell
Collaborator

@wackywendell wackywendell commented Feb 26, 2026

This PR switches the parser from Pest to LALRPOP, which cleans up the code substantially:

  • No more ParsePair, ScopedParsePair, RuleIter, unreachable!().
  • There's an AST constructed from the grammar, with a second pass for 'lowering' from the AST to the protobuf message. This makes for a much more straightforward implementation.
  • Unfortunately, the line_grammar.lalrpop is not quite as readable as the pest grammar, but it's not bad.
  • Also adds some docs to modules in a number of places.
  • It's a big PR, but it's -1700 lines overall.

See adrs/lalrpop.md for the reasoning, parser/line_grammar.lalrpop for the grammar, and parser/ast.rs for the AST.
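
For readers unfamiliar with the parse-then-lower split, here is a minimal sketch of the idea. It is not the code in this PR: `RelationAst`, `Rel`, and `lower` are hypothetical names, the hand-written `FromStr` impl stands in for the parser LALRPOP would generate, and the `Rel` enum stands in for the protobuf message.

```rust
use std::str::FromStr;

/// Hypothetical AST node: roughly what a grammar action might build for one line.
#[derive(Debug, Clone, PartialEq)]
struct RelationAst {
    name: String,
    args: Vec<String>,
}

/// Hypothetical lowered form, standing in for a protobuf message.
#[derive(Debug, PartialEq)]
enum Rel {
    Read { table: String },
    Filter { condition: String },
}

/// Pass 1: syntax only. A hand-written stand-in for the generated parser
/// that accepts `Name[arg, arg, ...]` and knows nothing about relation kinds.
impl FromStr for RelationAst {
    type Err = String;

    fn from_str(line: &str) -> Result<Self, Self::Err> {
        let line = line.trim();
        let open = line.find('[').ok_or("expected '['")?;
        let inner = line[open + 1..]
            .strip_suffix(']')
            .ok_or("expected closing ']'")?;
        let name = line[..open].trim();
        if name.is_empty() {
            return Err("missing relation name".to_string());
        }
        let args = inner
            .split(',')
            .map(str::trim)
            .filter(|a| !a.is_empty())
            .map(str::to_string)
            .collect();
        Ok(RelationAst { name: name.to_string(), args })
    }
}

/// Pass 2: lowering. Relation-specific validation lives here, not in the grammar.
fn lower(ast: &RelationAst) -> Result<Rel, String> {
    match (ast.name.as_str(), ast.args.as_slice()) {
        ("Read", [table]) => Ok(Rel::Read { table: table.clone() }),
        ("Filter", [condition]) => Ok(Rel::Filter { condition: condition.clone() }),
        ("Read" | "Filter", args) => Err(format!(
            "{} takes exactly 1 argument, got {}",
            ast.name,
            args.len()
        )),
        (other, _) => Err(format!("unknown relation '{other}'")),
    }
}
```

In this sketch, `"Read[orders]".parse::<RelationAst>()` succeeds on syntax alone, and only `lower` decides whether the argument count is valid for that relation. That separation is the property the AST-plus-lowering design is after.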

This is a draft PR because I'm still a bit undecided and haven't fully reviewed it. This was all done via @codex, partly in the cloud and partly locally.

@wackywendell
Collaborator Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 76afd04a49

Comment thread src/parser/lower/relations.rs Outdated
Comment thread src/parser/line_grammar.lalrpop
@wackywendell wackywendell mentioned this pull request Feb 26, 2026
…ering pipeline

Replace the Pest PEG parser with a LALRPOP LR(1) grammar, introducing a
two-phase parse-then-lower architecture. The grammar produces typed AST nodes
that are lowered to Substrait protobuf with relation-specific semantic
validation.

Includes extension declaration parsing via LALRPOP, extraction of the extension
lowering module, FromStr/From trait implementations, and in-code documentation
for all major modules.

Add sections for import organization, pattern matching depth, unused parameter
handling, parser pipeline pattern, extension handling guidelines, and rustdoc
markdown formatting.

Add ordered_float dependency and implement manual Eq/Hash for
LiteralValue (delegating to OrderedFloat for f64), then derive
Eq + Hash on the full AST type chain.
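
The reason a manual impl is needed: `f64` implements neither `Eq` nor `Hash`, so any type containing one cannot derive them. The sketch below illustrates the pattern using only the standard library. It does not use the ordered_float crate; the hand-rolled `float_key` helper is a swapped-in substitute for delegating to `OrderedFloat`, and the `LiteralValue` variants and the `Arg` struct are hypothetical, since the real definitions are not shown here.

```rust
use std::hash::{Hash, Hasher};

/// Hypothetical literal type; the actual variants in the PR may differ.
#[derive(Debug, Clone)]
enum LiteralValue {
    Integer(i64),
    Float(f64),
    Text(String),
}

/// Canonical bit pattern for a float: every NaN collapses to one value and
/// -0.0 maps to +0.0, so floats that compare equal below also hash equally.
fn float_key(x: f64) -> u64 {
    if x.is_nan() {
        f64::NAN.to_bits()
    } else if x == 0.0 {
        0.0_f64.to_bits()
    } else {
        x.to_bits()
    }
}

impl PartialEq for LiteralValue {
    fn eq(&self, other: &Self) -> bool {
        use LiteralValue::*;
        match (self, other) {
            (Integer(a), Integer(b)) => a == b,
            (Float(a), Float(b)) => float_key(*a) == float_key(*b),
            (Text(a), Text(b)) => a == b,
            _ => false,
        }
    }
}

// Sound because the equality above is reflexive, even for NaN.
impl Eq for LiteralValue {}

impl Hash for LiteralValue {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the variant tag first so Integer(1) and Float(..) do not collide by design.
        std::mem::discriminant(self).hash(state);
        match self {
            LiteralValue::Integer(i) => i.hash(state),
            LiteralValue::Float(x) => float_key(*x).hash(state),
            LiteralValue::Text(s) => s.hash(state),
        }
    }
}

/// Once the leaf type is `Eq + Hash`, enclosing AST types can simply derive both.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Arg {
    name: String,
    value: LiteralValue,
}
```

With this in place, AST nodes such as the hypothetical `Arg` can be used as `HashSet` elements or `HashMap` keys, which is presumably the motivation for deriving `Eq + Hash` across the type chain.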