feat: support VirtualTable Read with inline format#106
Conversation
2d99fa9 to
5f27f0a
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5f27f0a6c1
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
5f27f0a to
62c2b3c
Compare
fix: return parse errors instead of panicking on invalid VirtualTable expressions
62c2b3c to
51ad742
Compare
benbellick
left a comment
There was a problem hiding this comment.
Few comments but otherwise LGTM
| let plan = r#" | ||
| === Plan | ||
| Root[id, name] | ||
| Read:Virtual[_ => id:i64, name:string]"#; |
There was a problem hiding this comment.
This is probably fine because I think this is the convention for other relations and it is better to be consistent, but I find the usage of _ here a bit unintuitive. To me, _ communicates "I don't care", which doesn't really make sense in this context, but that is where my mind goes.
I think a more explicit syntax would be to use [] in all of the places, e.g.:
Read:Virtual[[] => id:i64, name:string]"#;
or
Read:Virtual[[(1), (2), (3)] => id:i64]"#;
There was a problem hiding this comment.
I see your point about _; it's used to mean 'unspecified value' in several languages, whereas here it means 'empty', which is quite different.
That said - I'm not really sure what would be a better replacement:
[]brackets aroundargs: I'd rather not add extra brackets if we don't need to - the syntaxoperator[args... => columns]seems fairly unambiguous, and having to put extra brackets around args everywhere adds clutter.()for empty: Nice syntax, but ambiguous: it's not clear ifOperator[() => …]means "1 argument, an empty tuple" or "no arguments".emptykeyword: E.g.Operator[empty => …]. Right now, there are no keywords, and this would conflict with the possibility of having an identifier - e.g.Read[empty => …]currently means "a read from the table namedempty"- Nothing: Allow either
Operator[columns]likeProjectandRoot(feat: make '=>' optional when relation arguments are absent #104), orOperator[=> columns]/Operator[args =>]for empty columns / args. The first has precedent and should be OK; the second is unambiguous.
So for now, I think we keep it...
|
|
||
| `"Read:Virtual" "[" (virtual_row ("," virtual_row)*)? "=>" named_column_list "]"` | ||
|
|
||
| Where `virtual_row := "(" expression ("," expression)* ")"` — a parenthesized tuple of expressions forming one row. Use `_` in place of rows for an empty virtual table. |
There was a problem hiding this comment.
| Where `virtual_row := "(" expression ("," expression)* ")"` — a parenthesized tuple of expressions forming one row. Use `_` in place of rows for an empty virtual table. | |
| Where `virtual_row := "(" expression ("," expression)* ")"` — a parenthesized tuple of expressions forming one row. For an empty virtual table, write `_` in place of the entire row list, e.g. `Read:Virtual[_ => id:i64]`. The `_` replaces the whole list and cannot appear alongside rows. |
What do you think of something like this?
Description
Add support for
VirtualTableinReadRel(closes #56), enabling inline literal data tables in Substrait plans — similar to SQLVALUESclauses.The virtual read is written as a
Read:Virtualoperator, with rows inline as positional arguments:Use
_for an empty virtual table (no rows):Design decisions
Read:Virtualuses theType:Subtypecolon pattern (consistent withExtensionLeaf:Name)Addendumenum introduced in refactor: self-contained RelationParsePair with identity emit #107 for structural parser integrationGRAMMAR.mdupdated with syntax, examples, and doc-tested code blocksType of Change
Testing
Checklist
Related Issues
Closes #56