parser: cap AST nesting depth to prevent CPU DoS (#5bo)#528
Merged
Conversation
Borrowed from Zero (rocicorp/mono#6000): `ilo serv` and any other context that compiles untrusted source can be DoSed by a deeply nested expression like `((((...((1+1))))...))` 1000 levels deep, which recurses straight through the OS thread stack on the tree-walker parser. Adds a depth counter on Parser, guarded at the entry of parse_expr, parse_stmt, parse_decl, parse_atom, parse_pattern, and parse_type. When depth >= max_depth the next recursion returns the new ILO-P103 error instead of going deeper. Default cap is 256 (DEFAULT_MAX_AST_DEPTH) - far above anything hand-written, low enough to stay well inside the 8 MB main-thread stack on every supported platform. `parser::parse_with_max_depth` and `Parser::new_with_max_depth` are the public override surface; a process-wide AtomicUsize lets CLI entry points install the cap once instead of threading the value through 30+ `parser::parse` call sites.
Plumb the parser's new depth cap through the CLI surface. `--max-ast-depth N` is a global flag (works on `ilo`, `ilo run`, `ilo check`, `ilo build`, `ilo serv`, anywhere source is compiled). The flag is stripped in `fn main` before either the clap or the bare-positional dispatch sees it, then installed via `parser::set_max_ast_depth_override` so every subsequent `parser::parse` call picks it up without threading. Both forms accepted: `--max-ast-depth 1024` and `--max-ast-depth=1024`. Anything after a literal `--` separator is left alone so user programs that take a positional `--max-ast-depth` arg still see it.
Six regression tests pinning the depth-cap behaviour:
- 1000-deep expression rejected at the default 256 cap, diagnostic
names the cap and the override flag in its hint
- 100-deep expression parses clean under the default cap
- raising the cap via parse_with_max_depth lets a 140-deep expr through
- lowering the cap rejects shapes the default would accept
- the same parse pipeline `ilo serv` uses rejects the deep-nest payload
- a 300-deep nested statement chain (`wh true{wh true{...}}`) trips P103,
proving the cap covers both `parse_expr` and `parse_stmt` surfaces
All tests run on a 32 MB stack because debug parser frames are ~24 KB
each and even hitting the cap recurses 256 frames deep, blowing the
2 MB default test thread stack via SIGSEGV before the cap can fire
logically.
Also adds `examples/ast-depth-cap.ilo` so the engine-matrix harness
exercises a shallow nested-expr program and the file acts as in-context
documentation of the feature for agents.
SPEC.md gets a new gotcha-table row plus an explainer paragraph alongside the existing P101/P102/P021 traps, and the CLI invocation section lists the new global flag. ai.txt regenerates from SPEC.md via build.rs. skills/ilo/ilo-errors.md adds the one-liner row for P103; the agent skill gets a short "AST depth cap" section next to the serv-mode docs so agents hitting the cap know to flatten or pass --max-ast-depth before retrying. The site docs (cli.md, diagnostics.md) live in the separate ilo-lang/site repo and are committed there.
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Borrowed from Zero (rocicorp/mono#6000): `ilo serv` and any other context that compiles untrusted source is exposed to deeply nested expressions that can blow the parser stack or hit pathological verifier complexity. A 1000-deep `((((...((1+1))))...))` payload sent to `ilo serv` would recurse straight through the OS thread stack on the tree-walker parser.
This adds a parser-side AST depth check, default 256, with a process-wide `--max-ast-depth N` global flag to override. Hitting the cap surfaces as the new `ILO-P103` diagnostic that names both the cap and the override flag in its hint.
Manifesto framing: one shape, one diagnostic. Agents that legitimately need deeper nesting get a single-character flag bump; attackers get a stable `ILO-P103` they can't bypass without operator opt-in.
Repro before / after
Before this PR:
```
$ python3 -c "n=1000; print('main>n;' + '(' * n + '1' + ')' * n)" > /tmp/deep.ilo
$ ilo check /tmp/deep.ilo
thread 'main' has overflowed its stack
fatal runtime error: stack overflow, aborting
[1] 87753 abort ilo check /tmp/deep.ilo
```
After this PR:
```
$ ilo check /tmp/deep.ilo
{"code":"ILO-P103","message":"AST nesting depth exceeded 256",
"suggestion":"...raise the cap with --max-ast-depth N if a legitimate program needs more than 256 levels of nesting."}
$ ilo --max-ast-depth 1024 check /tmp/legitimately-nested.ilo
$ echo $?
0
```
What's in the diff
Site docs (`cli.md`, `diagnostics.md`) are committed in the separate `ilo-lang/site` repo: ilo-lang/ilo-site@b69b714.
Test plan
Follow-ups
None planned. The cap is a safety net, not a behaviour change; nothing else in the language touches it.