parser: cap AST nesting depth to prevent CPU DoS (#5bo) #528

Merged
danieljohnmorris merged 4 commits into main from fix/ast-depth-cap on May 21, 2026
@danieljohnmorris
Collaborator

Summary

Borrowed from Zero (rocicorp/mono#6000): `ilo serv` and any other context that compiles untrusted source are exposed to deeply nested expressions that can blow the parser stack or hit pathological verifier complexity. A 1000-deep `((((...((1+1))))...))` payload sent to `ilo serv` would recurse straight through the OS thread stack on the tree-walker parser.

This adds a parser-side AST depth check, default 256, with a process-wide `--max-ast-depth N` global flag to override. Hitting the cap surfaces as the new `ILO-P103` diagnostic that names both the cap and the override flag in its hint.

Manifesto framing: one shape, one diagnostic. Agents that legitimately need deeper nesting get a one-flag bump; attackers get a stable `ILO-P103` they can't bypass without operator opt-in.

Repro before / after

Before this PR:

```
$ python3 -c "n=1000; print('main>n;' + '(' * n + '1' + ')' * n)" > /tmp/deep.ilo
$ ilo check /tmp/deep.ilo
thread 'main' has overflowed its stack
fatal runtime error: stack overflow, aborting
[1] 87753 abort ilo check /tmp/deep.ilo
```

After this PR:

```
$ ilo check /tmp/deep.ilo
{"code":"ILO-P103","message":"AST nesting depth exceeded 256",
"suggestion":"...raise the cap with --max-ast-depth N if a legitimate program needs more than 256 levels of nesting."}
$ ilo --max-ast-depth 1024 check /tmp/legitimately-nested.ilo
$ echo $?
0
```

What's in the diff

  • f47cb7c parser: add AST nesting depth cap (ILO-P103) — depth counter on `Parser`, instrumented at the entry of every recursive parse helper (`parse_expr`, `parse_stmt`, `parse_decl`, `parse_atom`, `parse_pattern`, `parse_type`). New `ILO-P103` entry in the diagnostic registry. `parse_with_max_depth`/`Parser::new_with_max_depth` are the override surface; a process-wide `AtomicUsize` lets CLI entry points install the cap once instead of threading the value through 30+ `parser::parse` call sites.
  • 5134448 cli: add --max-ast-depth flag — global flag stripped in `fn main` before either clap or the bare-positional dispatch sees it. Both `--max-ast-depth 1024` and `--max-ast-depth=1024` forms accepted; `--` separator preserved so a user program's positional arg of the same name still passes through.
  • 2ba0f94 tests: cover ILO-P103 across expression, statement, and serv-style paths — 6 regression tests in `tests/parser_depth_cap.rs` (default cap rejects, under-cap accepts, raise + lower both work, serv-pipeline rejects, statement chain rejects). `examples/ast-depth-cap.ilo` exercises a shallow nest in the engine-matrix harness and acts as in-context docs for agents.
  • 4b82752 docs: document ILO-P103 and --max-ast-depth — `SPEC.md` gotchas + explainer + CLI invocation, `ai.txt` regenerates from `SPEC.md` via `build.rs`, `skills/ilo/ilo-errors.md` adds the P103 one-liner, `skills/ilo/ilo-agent.md` gets an "AST depth cap" section next to serv-mode.

Site docs (`cli.md`, `diagnostics.md`) are committed in the separate `ilo-lang/site` repo: ilo-lang/ilo-site@b69b714.

Test plan

  • `cargo test --release --features cranelift` clean across all 5000+ tests
  • new `tests/parser_depth_cap.rs` (6 tests) green
  • `cargo fmt --check` and `cargo clippy --features cranelift --tests` clean
  • manual smoke: deep-nest file rejected with P103, shallow file passes, `--max-ast-depth 1024` raises the cap as expected

Follow-ups

None planned. The cap is a safety net, not a behaviour change; nothing else in the language touches it.

parser: add AST nesting depth cap (ILO-P103)

Borrowed from Zero (rocicorp/mono#6000): `ilo serv` and any other context
that compiles untrusted source can be DoSed by a deeply nested expression
like `((((...((1+1))))...))` 1000 levels deep, which recurses straight
through the OS thread stack on the tree-walker parser.

Adds a depth counter on Parser, guarded at the entry of parse_expr,
parse_stmt, parse_decl, parse_atom, parse_pattern, and parse_type. When
depth >= max_depth the next recursion returns the new ILO-P103 error
instead of going deeper. Default cap is 256 (DEFAULT_MAX_AST_DEPTH) - far
above anything hand-written, low enough to stay well inside the 8 MB
main-thread stack on every supported platform.
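
The entry guard described above can be sketched on a toy grammar. This is a minimal illustration, not the actual ilo parser: the grammar (`expr := '(' expr ')' | digit`), the `ParseError` enum, and the `nested` helper are assumptions made for the example, and the toy counts one depth level per parenthesis, whereas the real parser instruments several helpers per nesting level.

```rust
/// Mirrors the default cap of 256 described above.
const DEFAULT_MAX_AST_DEPTH: usize = 256;

#[derive(Debug, PartialEq)]
enum ParseError {
    /// Hypothetical stand-in for the ILO-P103 diagnostic.
    DepthExceeded { max_depth: usize },
    /// Unexpected byte (or end of input) at this offset.
    Unexpected(usize),
}

struct Parser<'a> {
    src: &'a [u8],
    pos: usize,
    depth: usize,
    max_depth: usize,
}

impl<'a> Parser<'a> {
    fn new_with_max_depth(src: &'a str, max_depth: usize) -> Self {
        Parser { src: src.as_bytes(), pos: 0, depth: 0, max_depth }
    }

    /// Toy grammar: expr := '(' expr ')' | digit
    fn parse_expr(&mut self) -> Result<u32, ParseError> {
        // Guard at entry: once depth >= max_depth, return an error
        // instead of recursing any deeper.
        if self.depth >= self.max_depth {
            return Err(ParseError::DepthExceeded { max_depth: self.max_depth });
        }
        self.depth += 1;
        let result = self.parse_expr_inner();
        self.depth -= 1; // restored on both the success and the error path
        result
    }

    fn parse_expr_inner(&mut self) -> Result<u32, ParseError> {
        match self.src.get(self.pos).copied() {
            Some(b'(') => {
                self.pos += 1;
                let value = self.parse_expr()?;
                if self.src.get(self.pos) == Some(&b')') {
                    self.pos += 1;
                    Ok(value)
                } else {
                    Err(ParseError::Unexpected(self.pos))
                }
            }
            Some(c) if c.is_ascii_digit() => {
                self.pos += 1;
                Ok(u32::from(c - b'0'))
            }
            _ => Err(ParseError::Unexpected(self.pos)),
        }
    }
}

/// Toy counterpart of the override entry point; trailing input is not checked.
fn parse_with_max_depth(src: &str, max_depth: usize) -> Result<u32, ParseError> {
    Parser::new_with_max_depth(src, max_depth).parse_expr()
}

/// Build `n` levels of parentheses around the literal `1`.
fn nested(n: usize) -> String {
    format!("{}1{}", "(".repeat(n), ")".repeat(n))
}
```

In this sketch a 1000-deep payload stops after 256 guarded calls and returns `DepthExceeded`, so the stack cost is bounded by the cap rather than by the attacker's input.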

`parser::parse_with_max_depth` and `Parser::new_with_max_depth` are the
public override surface; a process-wide AtomicUsize lets CLI entry points
install the cap once instead of threading the value through 30+
`parser::parse` call sites.

cli: add --max-ast-depth flag

Plumb the parser's new depth cap through the CLI surface. `--max-ast-depth N`
is a global flag (works on `ilo`, `ilo run`, `ilo check`, `ilo build`,
`ilo serv`, anywhere source is compiled). The flag is stripped in `fn main`
before either the clap or the bare-positional dispatch sees it, then
installed via `parser::set_max_ast_depth_override` so every subsequent
`parser::parse` call picks it up without threading.
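
One way a process-wide override of this kind could look is sketched below. Only the name `set_max_ast_depth_override` comes from the commit message; the static's name, the use of `0` as a "no override" sentinel, and the `effective_max_ast_depth` reader are assumptions for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Default cap, as described in the parser commit.
const DEFAULT_MAX_AST_DEPTH: usize = 256;

/// Process-wide override. `0` means "no override installed"
/// (an assumed sentinel, not necessarily what ilo uses).
static MAX_AST_DEPTH_OVERRIDE: AtomicUsize = AtomicUsize::new(0);

/// Called once by a CLI entry point after parsing `--max-ast-depth N`.
pub fn set_max_ast_depth_override(depth: usize) {
    MAX_AST_DEPTH_OVERRIDE.store(depth, Ordering::Relaxed);
}

/// Hypothetical reader: what a plain `parse` call would consult
/// when it constructs its parser.
pub fn effective_max_ast_depth() -> usize {
    match MAX_AST_DEPTH_OVERRIDE.load(Ordering::Relaxed) {
        0 => DEFAULT_MAX_AST_DEPTH,
        n => n,
    }
}
```

Because the value lives in a static, existing call sites keep their signatures; the trade-off is that the cap becomes global state, which is why an explicit per-call override remains useful for tests.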

Both forms accepted: `--max-ast-depth 1024` and `--max-ast-depth=1024`.
Anything after a literal `--` separator is left alone so user programs
that take a positional `--max-ast-depth` arg still see it.
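
The flag-stripping step can be sketched as a small pre-pass over the argument list. The helper name `strip_max_ast_depth` and its error strings are hypothetical; the behaviour follows the description above (both spellings accepted, nothing touched after a literal `--`).

```rust
/// Remove a global `--max-ast-depth` flag from `args`, returning the
/// remaining arguments and the parsed value, if one was given.
/// Hypothetical helper; the real code in `fn main` may be shaped differently.
fn strip_max_ast_depth(args: Vec<String>) -> Result<(Vec<String>, Option<usize>), String> {
    const FLAG: &str = "--max-ast-depth";
    let mut rest = Vec::with_capacity(args.len());
    let mut depth = None;
    let mut iter = args.into_iter();

    while let Some(arg) = iter.next() {
        if arg == "--" {
            // Everything after the separator belongs to the user program.
            rest.push(arg);
            rest.extend(iter.by_ref());
            break;
        }
        let value = if arg == FLAG {
            // Space-separated form: the value is the next argument.
            Some(iter.next().ok_or_else(|| format!("{FLAG} requires a value"))?)
        } else {
            // `=`-joined form: `--max-ast-depth=1024`.
            arg.strip_prefix(FLAG)
                .and_then(|s| s.strip_prefix('='))
                .map(str::to_owned)
        };
        match value {
            Some(v) => {
                let parsed = v
                    .parse::<usize>()
                    .map_err(|e| format!("invalid {FLAG} value {v:?}: {e}"))?;
                depth = Some(parsed);
            }
            // Not our flag: pass it through untouched.
            None => rest.push(arg),
        }
    }
    Ok((rest, depth))
}
```

Running the pre-pass before any other argument handling means the downstream dispatchers never need to know the flag exists.
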

tests: cover ILO-P103 across expression, statement, and serv-style paths

Six regression tests pinning the depth-cap behaviour:

- 1000-deep expression rejected at the default 256 cap, diagnostic
  names the cap and the override flag in its hint
- 100-deep expression parses clean under the default cap
- raising the cap via parse_with_max_depth lets a 140-deep expr through
- lowering the cap rejects shapes the default would accept
- the same parse pipeline `ilo serv` uses rejects the deep-nest payload
- a 300-deep nested statement chain (`wh true{wh true{...}}`) trips P103,
  proving the cap covers both `parse_expr` and `parse_stmt` surfaces

All tests run on a 32 MB stack: debug-build parser frames are ~24 KB
each, so merely reaching the cap recurses 256 frames deep (roughly 6 MB),
which overflows the 2 MB default test-thread stack with a SIGSEGV before
the cap check can fire.
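
A minimal sketch of running a test body on a larger stack, using `std::thread::Builder::stack_size`. The helper names and the `burn` function, a stand-in that places roughly 24 KB on the stack per call, are assumptions for illustration and are not taken from `tests/parser_depth_cap.rs`.

```rust
use std::thread;

/// Stack size used for the depth-cap tests: 32 MB.
const TEST_STACK_BYTES: usize = 32 * 1024 * 1024;

/// Run `f` on a dedicated thread with a 32 MB stack and return its result.
fn run_on_big_stack<T, F>(f: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::Builder::new()
        .stack_size(TEST_STACK_BYTES)
        .spawn(f)
        .expect("failed to spawn test thread")
        .join()
        .expect("test thread panicked")
}

/// Rough stack estimate: recursion depth times bytes per frame.
fn estimated_stack_bytes(max_depth: usize, frame_bytes: usize) -> usize {
    max_depth * frame_bytes
}

/// Stand-in for a deep parse: each call keeps a ~24 KB buffer on the stack.
fn burn(depth: usize) -> usize {
    let frame = [0u8; 24 * 1024];
    std::hint::black_box(&frame); // keep the buffer from being optimised away
    if depth == 0 { 0 } else { 1 + burn(depth - 1) }
}
```

With these numbers, `estimated_stack_bytes(256, 24 * 1024)` is 6 MB: above the 2 MB default for spawned threads, comfortably below 32 MB.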

Also adds `examples/ast-depth-cap.ilo` so the engine-matrix harness
exercises a shallow nested-expr program and the file acts as in-context
documentation of the feature for agents.

docs: document ILO-P103 and --max-ast-depth

SPEC.md gets a new gotcha-table row plus an explainer paragraph alongside
the existing P101/P102/P021 traps, and the CLI invocation section lists
the new global flag. ai.txt regenerates from SPEC.md via build.rs.

skills/ilo/ilo-errors.md adds the one-liner row for P103; the agent skill
gets a short "AST depth cap" section next to the serv-mode docs so agents
hitting the cap know to flatten or pass --max-ast-depth before retrying.

The site docs (cli.md, diagnostics.md) live in the separate ilo-lang/site
repo and are committed there.

danieljohnmorris merged commit c0ddc70 into main on May 21, 2026
4 checks passed
danieljohnmorris deleted the fix/ast-depth-cap branch on May 21, 2026, 10:36

codecov Bot commented May 21, 2026

Codecov Report

❌ Patch coverage is 86.20690% with 20 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/main.rs | 73.43% | 17 Missing ⚠️ |
| src/parser/mod.rs | 96.10% | 3 Missing ⚠️ |
