Merged
5 changes: 5 additions & 0 deletions .github/workflows/rust.yml
@@ -48,6 +48,11 @@ jobs:
- name: Lint with Clippy (All features)
  run: cargo clippy --all-features -- -D warnings

- name: Check documentation
  run: cargo doc --no-deps --all-features
  env:
    RUSTDOCFLAGS: "-D warnings"

- name: Build
  run: cargo build --features protoc --verbose

12 changes: 12 additions & 0 deletions CONTRIBUTING.md
@@ -141,6 +141,18 @@ Keep parser internals split into two phases:

This separation keeps grammar concerns and semantic checks independent and easier to test.

For extension declaration lines, use parser-layer entrypoints in
`parser/lalrpop_line.rs`:

- Prefer explicit helpers (`parse_extension_urn_declaration`,
`parse_extension_declaration`) at parser call sites.
- `FromStr` is also available on parser AST declaration types
(`ExtensionUrnDeclaration`, `ExtensionDeclaration`) for ergonomic tests and
small adapters.

Keep extension handlers parser-independent: lowering converts parser AST
relation arguments into `ExtensionArgs` before invoking extension resolution.
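
The entry-point split described above could look roughly like the sketch below. It is a hand-written stand-in, not the LALRPOP-backed code in `parser/lalrpop_line.rs`: the `@<anchor>: <urn>` line shape and the `DeclarationParseError` type are assumptions made for illustration. The point is only that `FromStr` delegates to the explicit helper, so both paths share one implementation.

```rust
use std::str::FromStr;

/// Mirrors the shape of the AST type added in `src/parser/ast.rs`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ExtensionUrnDeclaration {
    pub anchor: u32,
    pub urn: String,
}

/// Hypothetical error type used only by this sketch.
#[derive(Debug, PartialEq, Eq)]
pub struct DeclarationParseError(pub String);

/// Explicit helper, the form preferred at parser call sites.
/// Assumes a line of the shape `@<anchor>: <urn>`.
pub fn parse_extension_urn_declaration(
    line: &str,
) -> Result<ExtensionUrnDeclaration, DeclarationParseError> {
    let rest = line
        .trim()
        .strip_prefix('@')
        .ok_or_else(|| DeclarationParseError(format!("expected '@' in {line:?}")))?;
    // Split at the first ':' only, so URNs may themselves contain colons.
    let (anchor, urn) = rest
        .split_once(':')
        .ok_or_else(|| DeclarationParseError(format!("expected ':' in {line:?}")))?;
    let anchor = anchor
        .trim()
        .parse::<u32>()
        .map_err(|e| DeclarationParseError(format!("bad anchor: {e}")))?;
    let urn = urn.trim();
    if urn.is_empty() {
        return Err(DeclarationParseError("empty URN".to_string()));
    }
    Ok(ExtensionUrnDeclaration {
        anchor,
        urn: urn.to_string(),
    })
}

/// Ergonomic adapter for tests: `"@1: ...".parse()` goes through the helper.
impl FromStr for ExtensionUrnDeclaration {
    type Err = DeclarationParseError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        parse_extension_urn_declaration(s)
    }
}
```

In a test, `"@1: some:urn".parse::<ExtensionUrnDeclaration>()` reads naturally, while production call sites keep the named helper so the parsing step stays easy to find.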

#### Documentation Formatting

##### Rustdoc Markdown Formatting
1 change: 1 addition & 0 deletions Justfile
@@ -23,6 +23,7 @@ test:
# Run clippy to check for linting errors.
check:
    cargo clippy --examples --tests --all-features
    RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --all-features

examples:
    cargo run --example basic_usage
4 changes: 4 additions & 0 deletions adrs/lalrpop.md
@@ -75,6 +75,10 @@ Relation-specific validation (e.g., "Filter expects an expression before `=>` an

**Implementation note (2026-02-26):** The current implementation parses argument entries as a single ordered list and records whether named/positional ordering is invalid, then reports that as a lowering validation error. This was chosen to keep the grammar conflict-free while preserving the same user-visible constraint.

**Implementation note (2026-02-26):** Extension declaration lines in the
`=== Extensions` section (`@...` URNs and `#...` declarations) are parsed via
the same LALRPOP -> AST pipeline and then validated in parser/lowering code.

**Benefits:**
- Smaller grammar with fewer LR conflict risks.
- Adding new relation types (Window, Set, HashJoin) requires only lowering code, no grammar recompilation.
11 changes: 9 additions & 2 deletions src/extensions/any.rs
@@ -1,3 +1,10 @@
//! Unified wrapper for the protobuf `Any` type.
//!
//! Substrait extension relations carry opaque payloads as protobuf `Any`
//! messages. The Rust ecosystem has two incompatible `Any` types
//! (`prost_types::Any` and `pbjson_types::Any`); this module provides [`Any`]
//! and [`AnyRef`] to convert between them transparently.
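
As a rough illustration of the borrowed-view idea in this module doc, the sketch below uses two local stand-in structs in place of `prost_types::Any` and `pbjson_types::Any` so that it compiles without external crates. The `value` field on `AnyRef` and the `message_name` helper are assumptions for this sketch; the diff only shows the `type_url` field.

```rust
/// Stand-in for `prost_types::Any` (local copy, for this sketch only).
pub struct ProstAny {
    pub type_url: String,
    pub value: Vec<u8>,
}

/// Stand-in for `pbjson_types::Any` (local copy, for this sketch only).
pub struct PbjsonAny {
    pub type_url: String,
    pub value: Vec<u8>,
}

/// Borrowed view that both owned types can be converted into.
#[derive(Debug, Copy, Clone, PartialEq)]
pub struct AnyRef<'a> {
    pub type_url: &'a str,
    /// Assumed field: the serialized payload bytes.
    pub value: &'a [u8],
}

impl<'a> From<&'a ProstAny> for AnyRef<'a> {
    fn from(any: &'a ProstAny) -> Self {
        AnyRef {
            type_url: &any.type_url,
            value: &any.value,
        }
    }
}

impl<'a> From<&'a PbjsonAny> for AnyRef<'a> {
    fn from(any: &'a PbjsonAny) -> Self {
        AnyRef {
            type_url: &any.type_url,
            value: &any.value,
        }
    }
}

/// Hypothetical helper: returns the part of the type URL after the last '/'.
pub fn message_name<'a>(any: AnyRef<'a>) -> &'a str {
    any.type_url.rsplit('/').next().unwrap_or(any.type_url)
}
```

Code that only needs to inspect a payload can take `AnyRef<'_>` and stay independent of which crate produced the message.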

use prost::{Message, Name};

use crate::extensions::registry::ExtensionError;
@@ -11,8 +18,8 @@ pub struct Any {
}

/// A reference to a protobuf `Any` type. Can be created from references to
-/// [`prost_types::Any`](prost_types::Any),
-/// [`pbjson_types::Any`](pbjson_types::Any), or our own [`Any`](Any) type.
/// [`prost_types::Any`],
/// [`pbjson_types::Any`], or our own [`Any`] type.
#[derive(Debug, Copy, Clone, PartialEq)]
pub struct AnyRef<'a> {
    pub type_url: &'a str,
4 changes: 2 additions & 2 deletions src/extensions/args.rs
@@ -309,8 +309,8 @@ impl ExtensionRelationType {
}
}

-// Note: create_rel is implemented in parser/extensions.rs to avoid
-// pulling in protobuf dependencies in the core args module
// Note: relation construction lives in parser/lower/extensions.rs so this
// core args module stays parser- and protobuf-agnostic.

impl ExtensionArgs {
    /// Create a new empty ExtensionArgs
7 changes: 7 additions & 0 deletions src/extensions/simple.rs
@@ -1,3 +1,10 @@
//! Anchor-to-name lookup table for Substrait simple extensions.
//!
//! [`SimpleExtensions`] maps extension anchors (`#10`, `@1`) to URNs and
//! function/type names. It is the shared data structure used by both the
//! parser (to resolve references while lowering) and the textifier (to
//! display human-readable names instead of numeric anchors).
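
A minimal sketch of such an anchor table is shown below, assuming a much smaller API than the real `SimpleExtensions`. The `AnchorTable` and `TableError` names and all method names are hypothetical. It uses the same `BTreeMap` and `Entry` imports that appear in this file to reject duplicate anchors.

```rust
use std::collections::btree_map::Entry;
use std::collections::BTreeMap;

/// Hypothetical, reduced stand-in for `SimpleExtensions`.
#[derive(Debug, Default)]
pub struct AnchorTable {
    /// URN anchor (`@1`) -> URN string.
    urns: BTreeMap<u32, String>,
    /// Function anchor (`#10`) -> (URN anchor, function name).
    functions: BTreeMap<u32, (u32, String)>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum TableError {
    DuplicateAnchor(u32),
}

impl AnchorTable {
    /// Registers a URN; an anchor may only be declared once.
    pub fn add_urn(&mut self, anchor: u32, urn: &str) -> Result<(), TableError> {
        match self.urns.entry(anchor) {
            Entry::Occupied(_) => Err(TableError::DuplicateAnchor(anchor)),
            Entry::Vacant(slot) => {
                slot.insert(urn.to_string());
                Ok(())
            }
        }
    }

    /// Registers a function name under `anchor`, pointing at `urn_anchor`.
    pub fn add_function(
        &mut self,
        anchor: u32,
        urn_anchor: u32,
        name: &str,
    ) -> Result<(), TableError> {
        match self.functions.entry(anchor) {
            Entry::Occupied(_) => Err(TableError::DuplicateAnchor(anchor)),
            Entry::Vacant(slot) => {
                slot.insert((urn_anchor, name.to_string()));
                Ok(())
            }
        }
    }

    /// What a textifier would print instead of the numeric anchor.
    pub fn function_name(&self, anchor: u32) -> Option<&str> {
        self.functions.get(&anchor).map(|(_, name)| name.as_str())
    }

    /// Follows function anchor -> URN anchor -> URN string.
    pub fn function_urn(&self, anchor: u32) -> Option<&str> {
        let (urn_anchor, _) = self.functions.get(&anchor)?;
        self.urns.get(urn_anchor).map(String::as_str)
    }
}
```

Both consumers named in the module doc would use the same read path: the parser resolves a name to an anchor while lowering, and the textifier goes the other way with a lookup like `function_name`.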

use std::collections::BTreeMap;
use std::collections::btree_map::Entry;
use std::fmt;
43 changes: 42 additions & 1 deletion src/parser/ast.rs
@@ -1,10 +1,51 @@
-//! AST used by the parser pipeline (`line_grammar.lalrpop` -> lowering).
//! AST for the text representation of `substrait-explain`, used by the
//! [`crate::parser`] module.
//!
//! Parsing happens in two steps:
//!
//! 1. Conversion from text to the AST in the [`crate::parser`] module via
//! LALRPOP, with syntactical validation.
//! 2. Conversion from the AST to a Protobuf [`substrait::proto::Plan`], via the
//! [`crate::parser::lower`] module, with semantic validation, e.g. checking
//! that function references, anchors, and column references resolve and that
//! argument types/shapes match.
//!
//! This AST is intentionally close to the text syntax and does not encode all
//! semantic constraints; semantic validation happens in lowering.

use std::fmt;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ExtensionUrnDeclaration {
    pub anchor: u32,
    pub urn: String,
}

impl ExtensionUrnDeclaration {
    pub fn new(anchor: u32, urn: impl Into<String>) -> Self {
        Self {
            anchor,
            urn: urn.into(),
        }
    }
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ExtensionDeclaration {
    pub anchor: u32,
    pub urn_anchor: u32,
    pub name: String,
}

impl ExtensionDeclaration {
    pub fn new(anchor: u32, urn_anchor: u32, name: impl Into<String>) -> Self {
        Self {
            anchor,
            urn_anchor,
            name: name.into(),
        }
    }
}
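
To make the second phase concrete, here is a small sketch of the kind of semantic check that lowering could run over these two AST types: the grammar accepts any `urn_anchor`, and a later pass verifies that it refers to a declared URN. The `LoweringError` enum and `check_urn_references` function are hypothetical names, not taken from `crate::parser::lower`.

```rust
use std::collections::BTreeSet;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ExtensionUrnDeclaration {
    pub anchor: u32,
    pub urn: String,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ExtensionDeclaration {
    pub anchor: u32,
    pub urn_anchor: u32,
    pub name: String,
}

/// Hypothetical semantic error raised after parsing has succeeded.
#[derive(Debug, PartialEq, Eq)]
pub enum LoweringError {
    UnknownUrnAnchor { declaration: u32, urn_anchor: u32 },
}

/// Checks that every declaration points at a URN anchor that was declared.
/// The AST cannot express this constraint, so it is enforced here.
pub fn check_urn_references(
    urns: &[ExtensionUrnDeclaration],
    declarations: &[ExtensionDeclaration],
) -> Result<(), LoweringError> {
    let known: BTreeSet<u32> = urns.iter().map(|u| u.anchor).collect();
    for decl in declarations {
        if !known.contains(&decl.urn_anchor) {
            return Err(LoweringError::UnknownUrnAnchor {
                declaration: decl.anchor,
                urn_anchor: decl.urn_anchor,
            });
        }
    }
    Ok(())
}
```

Keeping this check out of the grammar matches the stated design: the AST stays close to the text, and reference errors are reported by lowering.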

#[derive(Debug, Clone, PartialEq)]
pub struct Relation {
    pub name: RelationName,
11 changes: 11 additions & 0 deletions src/parser/errors.rs
@@ -1,3 +1,14 @@
//! Error types for the parser.
//!
//! Three layers of errors, matching the parsing phases:
//!
//! - [`SyntaxErrorDetail`] — LALRPOP parse failures with span and expected-token
//! info, for rendering caret diagnostics on a single line.
//! - [`MessageParseError`] — semantic errors during lowering (invalid values,
//! missing references), categorised by [`ErrorKind`].
//! - [`ParseError`] — top-level error that attaches [`ParseContext`] (line
//! number + source text) to any of the above.
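
The layering above can be pictured with the sketch below, which attaches line context to a span-based syntax error and renders a caret diagnostic. All type and field names here (`LineContext`, `SyntaxSpan`, `LocatedSyntaxError`) are hypothetical and only loosely modeled on the types listed in this module doc. It uses a manual `Display` impl rather than `thiserror` so it has no dependencies, and it assumes ASCII input so that byte offsets equal columns.

```rust
use std::fmt;

/// Hypothetical context: 1-based line number plus the source line.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LineContext {
    pub line_no: usize,
    pub line: String,
}

/// Hypothetical syntax-error detail: byte span within the line and the
/// tokens the parser would have accepted at that point.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SyntaxSpan {
    pub start: usize,
    pub end: usize,
    pub expected: Vec<String>,
}

/// Top-level error: a lower-level detail plus where it occurred.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LocatedSyntaxError {
    pub context: LineContext,
    pub error: SyntaxSpan,
}

impl fmt::Display for LocatedSyntaxError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Always draw at least one caret, even for a zero-width span.
        let width = self.error.end.saturating_sub(self.error.start).max(1);
        writeln!(f, "line {}: syntax error", self.context.line_no)?;
        writeln!(f, "{}", self.context.line)?;
        write!(f, "{}{}", " ".repeat(self.error.start), "^".repeat(width))?;
        if !self.error.expected.is_empty() {
            write!(f, " expected one of: {}", self.error.expected.join(", "))?;
        }
        Ok(())
    }
}

impl std::error::Error for LocatedSyntaxError {}
```

Separating the span detail from the line context means the lower layers never need to know which line they are on; the top-level wrapper adds that once.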

use std::fmt;

use thiserror::Error;
Expand Down