parser: cap AST nesting depth to prevent CPU DoS (#5bo) by danieljohnmorris · Pull Request #528 · ilo-lang/ilo

danieljohnmorris · 2026-05-21T10:31:06Z

Summary

Borrowed from Zero (rocicorp/mono#6000): `ilo serv` and any other context that compiles untrusted source is exposed to deeply nested expressions that can blow the parser stack or hit pathological verifier complexity. A 1000-deep `((((...((1+1))))...))` payload sent to `ilo serv` would recurse straight through the OS thread stack on the tree-walker parser.

This adds a parser-side AST depth check, default 256, with a process-wide `--max-ast-depth N` global flag to override. Hitting the cap surfaces as the new `ILO-P103` diagnostic that names both the cap and the override flag in its hint.

Manifesto framing: one shape, one diagnostic. Agents that legitimately need deeper nesting get a single-flag bump; attackers get a stable `ILO-P103` they can't bypass without operator opt-in.

Repro before / after

Before this PR:

```
$ python3 -c "n=1000; print('main>n;' + '(' * n + '1' + ')' * n)" > /tmp/deep.ilo
$ ilo check /tmp/deep.ilo
thread 'main' has overflowed its stack
fatal runtime error: stack overflow, aborting
[1] 87753 abort ilo check /tmp/deep.ilo
```

After this PR:

```
$ ilo check /tmp/deep.ilo
{"code":"ILO-P103","message":"AST nesting depth exceeded 256",
"suggestion":"...raise the cap with --max-ast-depth N if a legitimate program needs more than 256 levels of nesting."}
$ ilo --max-ast-depth 1024 check /tmp/legitimately-nested.ilo
$ echo $?
0
```

What's in the diff

f47cb7c parser: add AST nesting depth cap (ILO-P103) — depth counter on `Parser`, instrumented at the entry of every recursive parse helper (`parse_expr`, `parse_stmt`, `parse_decl`, `parse_atom`, `parse_pattern`, `parse_type`). New `ILO-P103` entry in the diagnostic registry. `parse_with_max_depth`/`Parser::new_with_max_depth` are the override surface; a process-wide `AtomicUsize` lets CLI entry points install the cap once instead of threading the value through 30+ `parser::parse` call sites.
5134448 cli: add --max-ast-depth flag — global flag stripped in `fn main` before either clap or the bare-positional dispatch sees it. Both `--max-ast-depth 1024` and `--max-ast-depth=1024` forms accepted; `--` separator preserved so a user program's positional arg of the same name still passes through.
2ba0f94 tests: cover ILO-P103 across expression, statement, and serv-style paths — 6 regression tests in `tests/parser_depth_cap.rs` (default cap rejects, under-cap accepts, raise + lower both work, serv-pipeline rejects, statement chain rejects). `examples/ast-depth-cap.ilo` exercises a shallow nest in the engine-matrix harness and acts as in-context docs for agents.
4b82752 docs: document ILO-P103 and --max-ast-depth — `SPEC.md` gotchas + explainer + CLI invocation, `ai.txt` regenerates from `SPEC.md` via `build.rs`, `skills/ilo/ilo-errors.md` adds the P103 one-liner, `skills/ilo/ilo-agent.md` gets an "AST depth cap" section next to serv-mode.
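
The process-wide `AtomicUsize` described under f47cb7c could look roughly like the sketch below. This is not the code from the diff: the zero-means-unset sentinel and the `effective_max_ast_depth` helper are assumptions made for illustration; only `DEFAULT_MAX_AST_DEPTH` and `set_max_ast_depth_override` are names taken from this PR.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Default cap named in the PR description.
pub const DEFAULT_MAX_AST_DEPTH: usize = 256;

// Assumption: 0 is used as a "no override installed" sentinel.
static MAX_AST_DEPTH_OVERRIDE: AtomicUsize = AtomicUsize::new(0);

/// Install a process-wide cap, intended to be called once from a CLI entry point.
pub fn set_max_ast_depth_override(depth: usize) {
    MAX_AST_DEPTH_OVERRIDE.store(depth, Ordering::Relaxed);
}

/// Hypothetical helper: the cap a plain `parse` call would pick up.
pub fn effective_max_ast_depth() -> usize {
    match MAX_AST_DEPTH_OVERRIDE.load(Ordering::Relaxed) {
        0 => DEFAULT_MAX_AST_DEPTH,
        n => n,
    }
}
```

The point of the global is that existing `parser::parse` call sites stay untouched; `Relaxed` ordering is enough here because the value is a standalone setting and does not guard any other memory.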

Site docs (`cli.md`, `diagnostics.md`) are committed in the separate `ilo-lang/site` repo: ilo-lang/ilo-site@b69b714.

Test plan

`cargo test --release --features cranelift` clean across all 5000+ tests
new `tests/parser_depth_cap.rs` (6 tests) green
`cargo fmt --check` and `cargo clippy --features cranelift --tests` clean
manual smoke: deep-nest file rejected with P103, shallow file passes, `--max-ast-depth 1024` raises the cap as expected

Follow-ups

None planned. The cap is a safety net, not a behaviour change; nothing else in the language touches it.

Borrowed from Zero (rocicorp/mono#6000): `ilo serv` and any other context that compiles untrusted source can be DoSed by a deeply nested expression like `((((...((1+1))))...))` 1000 levels deep, which recurses straight through the OS thread stack on the tree-walker parser.

Adds a depth counter on `Parser`, guarded at the entry of `parse_expr`, `parse_stmt`, `parse_decl`, `parse_atom`, `parse_pattern`, and `parse_type`. When `depth >= max_depth` the next recursion returns the new `ILO-P103` error instead of going deeper.

Default cap is 256 (`DEFAULT_MAX_AST_DEPTH`): far above anything hand-written, low enough to stay well inside the 8 MB main-thread stack on every supported platform. `parser::parse_with_max_depth` and `Parser::new_with_max_depth` are the public override surface; a process-wide `AtomicUsize` lets CLI entry points install the cap once instead of threading the value through 30+ `parser::parse` call sites.
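
To make the guard concrete, here is a minimal sketch of a depth-capped recursive-descent parser. It is a toy that evaluates parenthesised integer sums, not ilo's parser: the `guarded` helper, the `ParseError` type, and the grammar are all hypothetical, and only the `depth >= max_depth` check at the entry of each recursive helper mirrors what this commit describes.

```rust
pub const DEFAULT_MAX_AST_DEPTH: usize = 256;

#[derive(Debug, PartialEq)]
pub enum ParseError {
    /// Stands in for the ILO-P103 diagnostic.
    DepthExceeded { max_depth: usize },
    /// Unexpected byte (or end of input) at this offset.
    Unexpected(usize),
}

pub struct Parser<'a> {
    src: &'a [u8],
    pos: usize,
    depth: usize,
    max_depth: usize,
}

impl<'a> Parser<'a> {
    pub fn new_with_max_depth(src: &'a str, max_depth: usize) -> Self {
        Parser { src: src.as_bytes(), pos: 0, depth: 0, max_depth }
    }

    // Hypothetical helper: every recursive entry point runs through this check.
    fn guarded<T>(
        &mut self,
        f: impl FnOnce(&mut Self) -> Result<T, ParseError>,
    ) -> Result<T, ParseError> {
        if self.depth >= self.max_depth {
            return Err(ParseError::DepthExceeded { max_depth: self.max_depth });
        }
        self.depth += 1;
        let out = f(self);
        self.depth -= 1; // restore on both success and error
        out
    }

    /// expr := atom ('+' atom)*   (toy grammar, no overflow handling)
    pub fn parse_expr(&mut self) -> Result<i64, ParseError> {
        self.guarded(|p| {
            let mut total = p.parse_atom()?;
            while p.peek() == Some(b'+') {
                p.pos += 1;
                total += p.parse_atom()?;
            }
            Ok(total)
        })
    }

    /// atom := digits | '(' expr ')'
    fn parse_atom(&mut self) -> Result<i64, ParseError> {
        self.guarded(|p| match p.peek() {
            Some(b'(') => {
                p.pos += 1;
                let value = p.parse_expr()?;
                if p.peek() != Some(b')') {
                    return Err(ParseError::Unexpected(p.pos));
                }
                p.pos += 1;
                Ok(value)
            }
            Some(c) if c.is_ascii_digit() => {
                let mut value = 0i64;
                while let Some(d) = p.peek().filter(|b| b.is_ascii_digit()) {
                    value = value * 10 + i64::from(d - b'0');
                    p.pos += 1;
                }
                Ok(value)
            }
            _ => Err(ParseError::Unexpected(p.pos)),
        })
    }

    fn peek(&self) -> Option<u8> {
        self.src.get(self.pos).copied()
    }
}
```

In this toy, each level of parentheses consumes two units of depth (one for `parse_expr`, one for `parse_atom`), so the number of source-level nesting levels that fit under a given cap depends on how many guarded helpers a construct passes through.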

Plumb the parser's new depth cap through the CLI surface. `--max-ast-depth N` is a global flag (works on `ilo`, `ilo run`, `ilo check`, `ilo build`, `ilo serv`, anywhere source is compiled). The flag is stripped in `fn main` before either the clap or the bare-positional dispatch sees it, then installed via `parser::set_max_ast_depth_override` so every subsequent `parser::parse` call picks it up without threading. Both forms accepted: `--max-ast-depth 1024` and `--max-ast-depth=1024`. Anything after a literal `--` separator is left alone so user programs that take a positional `--max-ast-depth` arg still see it.
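
The flag handling described above can be sketched as a pre-pass over the argument vector. The function below is an illustration under assumptions, not the code in `src/main.rs`: `strip_max_ast_depth`, its return type, and its error strings are hypothetical. It accepts both spellings and leaves everything after a literal `--` untouched.

```rust
/// Hypothetical pre-pass: remove `--max-ast-depth` from `args` before the
/// normal CLI dispatch sees them. Returns the remaining args plus the parsed
/// cap, if the flag was present.
pub fn strip_max_ast_depth(
    args: Vec<String>,
) -> Result<(Vec<String>, Option<usize>), String> {
    let mut rest = Vec::with_capacity(args.len());
    let mut cap = None;
    let mut iter = args.into_iter();

    while let Some(arg) = iter.next() {
        if arg == "--" {
            // Everything after the separator belongs to the user program.
            rest.push(arg);
            rest.extend(iter.by_ref());
            break;
        }

        // Pull out the raw value for either accepted spelling.
        let value = if arg == "--max-ast-depth" {
            let next = iter
                .next()
                .ok_or_else(|| "--max-ast-depth requires a value".to_string())?;
            Some(next)
        } else if let Some(v) = arg.strip_prefix("--max-ast-depth=") {
            Some(v.to_string())
        } else {
            None
        };

        match value {
            Some(v) => {
                let parsed = v
                    .parse::<usize>()
                    .map_err(|e| format!("invalid --max-ast-depth value {:?}: {}", v, e))?;
                cap = Some(parsed);
            }
            None => rest.push(arg),
        }
    }

    Ok((rest, cap))
}
```

A caller would then pass the returned cap, when present, to `parser::set_max_ast_depth_override` and hand the remaining arguments to the usual dispatch.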

Six regression tests pinning the depth-cap behaviour:

- 1000-deep expression rejected at the default 256 cap, diagnostic names the cap and the override flag in its hint
- 100-deep expression parses clean under the default cap
- raising the cap via `parse_with_max_depth` lets a 140-deep expr through
- lowering the cap rejects shapes the default would accept
- the same parse pipeline `ilo serv` uses rejects the deep-nest payload
- a 300-deep nested statement chain (`wh true{wh true{...}}`) trips P103, proving the cap covers both `parse_expr` and `parse_stmt` surfaces

All tests run on a 32 MB stack: debug parser frames are ~24 KB each, so even reaching the cap recurses 256 frames deep (roughly 6 MB), which overflows the 2 MB default test thread stack with a SIGSEGV before the cap can fire.

Also adds `examples/ast-depth-cap.ilo` so the engine-matrix harness exercises a shallow nested-expr program and the file acts as in-context documentation of the feature for agents.
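
Running a test body on a larger stack is typically done with `std::thread::Builder::stack_size`. The sketch below shows that pattern along with the frame arithmetic; `run_on_big_stack` and `stack_needed` are hypothetical names, and the actual helper in `tests/parser_depth_cap.rs` may be shaped differently.

```rust
use std::thread;

/// 32 MB, the stack size the commit message says the tests use.
const TEST_STACK_BYTES: usize = 32 * 1024 * 1024;

/// Hypothetical helper: run `f` on a thread with a 32 MB stack and
/// propagate its result (or its panic) to the caller.
pub fn run_on_big_stack<T, F>(f: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::Builder::new()
        .stack_size(TEST_STACK_BYTES)
        .spawn(f)
        .expect("failed to spawn test thread")
        .join()
        .expect("test thread panicked")
}

/// Back-of-envelope stack demand for `frames` nested calls of
/// `frame_bytes` each.
pub const fn stack_needed(frames: usize, frame_bytes: usize) -> usize {
    frames * frame_bytes
}
```

With the figures quoted above, `stack_needed(256, 24 * 1024)` is 6 MB: more than the 2 MB default test thread stack, comfortably under 32 MB.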

SPEC.md gets a new gotcha-table row plus an explainer paragraph alongside the existing P101/P102/P021 traps, and the CLI invocation section lists the new global flag. ai.txt regenerates from SPEC.md via build.rs. skills/ilo/ilo-errors.md adds the one-liner row for P103; the agent skill gets a short "AST depth cap" section next to the serv-mode docs so agents hitting the cap know to flatten or pass --max-ast-depth before retrying. The site docs (cli.md, diagnostics.md) live in the separate ilo-lang/site repo and are committed there.

codecov · 2026-05-21T10:36:49Z

Codecov Report

❌ Patch coverage is 86.20690% with 20 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
src/main.rs	73.43%	17 Missing ⚠️
src/parser/mod.rs	96.10%	3 Missing ⚠️


danieljohnmorris added 4 commits May 21, 2026 11:26

danieljohnmorris merged commit c0ddc70 into main May 21, 2026
4 checks passed

danieljohnmorris deleted the fix/ast-depth-cap branch May 21, 2026 10:36
