diff --git a/CHANGELOG.md b/CHANGELOG.md index ad2276a9..1f9089c0 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -10,6 +10,7 @@ For the release process and tag conventions, see [RELEASING.md](RELEASING.md). ### Added +- **`ilo httpd` lazy streaming response body (ILO-482).** A handler's response `body` field may now be a lazy line iterator (`get-stream` / `pst-stream`, `for-line stdin`) in addition to a plain string or an eager `L t` list. When the body is a lazy iterator, `ilo httpd` writes and flushes each yielded line as its own chunked-transfer block as soon as the handler produces it, rather than materialising the whole body first. This lets a handler hold the connection open and emit chunks incrementally (true SSE / long-poll / tailing a growing source). If the client disconnects mid-stream the connection thread drops the iterator and exits cleanly with no panic. The existing string and `L t` body shapes are unchanged. Unblocks `ilo-lang/crew`'s `crew-server` `GET /events/stream`, intended as a held-open tail of `data/feed/.jsonl` but stuck on a one-shot snapshot while the body buffered. Follow-ups: a `tail-file` source (lazy `tail -f`) for the file-tail case, and `get-stream`'s 16 KiB read-buffer granularity for sub-buffer payloads. See `docs/streaming.md` and `examples/httpd-stream.ilo`. - **`ilo httpd` resolves `use` imports (ILO-481).** Handler files loaded by `ilo httpd` now have their `use` imports resolved at startup, relative to the handler's own directory, matching the existing `ilo run` / `ilo check` semantics. Previously `httpd` lexed, parsed, and verified only the single handler file and silently skipped import resolution, so a handler could not `use` a sibling module - `ilo-lang/crew`'s `crew-server` had to inline ~140 lines of store logic to work around it. A missing module now surfaces a real import diagnostic and the server refuses to start instead of failing later with a generic verifier error. See `docs/streaming.md`. 
- **`spawn` builtin (ILO-477).** `spawn fn args... > _` runs `fn args...` on a background OS thread, fire-and-forget. Returns nil immediately; errors and panics inside the thread go to stderr and the thread dies, while the parent is unaffected. Caps are inherited from the parent via `Arc::clone(&env.caps)`, so a worker started under `--allow-net` / `--allow-write` keeps the same policy. Unblocks daemon-style programs that need multiple concurrent loops in one process - canonical case is `ilo-lang/crew`'s per-machine agent (MCP HTTP server foreground + SSE consumer background + write-behind queue drainer background). Out of scope for v1 (separate tickets): join handles, channels, supervision, cancellation tokens, async runtime, native VM / Cranelift codegen. Tree-walker only at runtime; VM and Cranelift inherit through the existing tree bridge. See `examples/daemon-loops.ilo`. - **Client-side HTTP streaming (ILO-448).** Four new builtins that return a lazy `L t` line iterator over a chunked / SSE response body: `get-stream url`, `get-stream-h url headers`, `pst-stream url body`, `pst-stream-h url body headers`. Consume via `@line (get-stream url){...}` - one chunk-line per iteration, body never fully buffered. Cap-checked via `--allow-net` before opening the connection; mid-stream I/O errors surface as `ILO-R009 http-stream read error: ...`. WASM returns `Err`. Symmetric counterpart to the server-side `ilo httpd` + chunked transfer encoding shipped in ILO-46 / ILO-379; unblocks `ilo-lang/crew`'s per-machine agent daemon needing an SSE consumer. Tree + VM only in this release; Cranelift JIT follow-up. See `docs/streaming.md` and `examples/sse-client.ilo`. diff --git a/SPEC.md b/SPEC.md index cb6e5215..b7cb8c41 100644 --- a/SPEC.md +++ b/SPEC.md @@ -2566,7 +2566,9 @@ Handler signature: -- -- Response fields read by ilo httpd: -- status:n HTTP status code (200, 404, 500, ...) 
--- body:t response body +-- body response body: t (buffered), L t (eager chunked), +-- or a lazy iterator (get-stream / for-line stdin) for +-- true incremental streaming -- headers:M t t optional response headers type rsp{status:n;body:t} @@ -2581,6 +2583,12 @@ Use `req:_` (wildcard) for the request param type — the `Request` record is cr The handler file's `use` imports are resolved at startup, relative to the handler's own directory, matching `ilo run` / `ilo check` (ILO-481). A handler can split logic across sibling modules (`use "store.ilo"`) rather than inlining everything. A missing import surfaces a real diagnostic and the server refuses to start. +The response `body` field may take three shapes (ILO-482): + +* `t` — a plain string, sent with `Content-Length` (the default). +* `L t` — a list of strings, sent eagerly with `Transfer-Encoding: chunked`: each element becomes one chunk. The list is materialised before the first byte is written. +* a lazy line iterator (`get-stream`/`pst-stream`, `for-line stdin`) — sent with `Transfer-Encoding: chunked` **lazily**: each line the iterator yields is written and flushed as its own chunk, so the handler can hold the connection open and emit chunks as they are produced (SSE, long-poll, tailing a growing source) without buffering the whole body first. If the client disconnects mid-stream the connection thread exits cleanly. A zero-arg `body` function (`FnRef`/closure) is called first and may itself return any of the three shapes. + **`ilo check --strict`.** Treats every warning-severity diagnostic (ILO-T032 bare `fmt`, ILO-T033 bare `mset` / `+=` / `mdel`, ILO-W002 `@x (jpar! …){…}` steering to `jpar-list!`, future warning codes) as a hard exit-code failure. The diagnostic stream itself is unchanged: warnings still emit with `severity: "warning"` in the JSON output, so editor integrations that route by severity stay correct. Only the exit code is elevated. 
CI harnesses that gate merges on `ilo check` should use `--strict` so warnings can't slip through silently; for interactive use, the default (warnings-are-advisory) is the right behaviour. **Default-run.** Inline programs (`ilo 'code'`) and single-function files run their entry function with the remaining CLI args; no explicit function name needed. Multi-function files auto-pick a function called `main` when no positional func arg is supplied. The same heuristic applies to the explicit engine flags - `--vm` and `--jit` both auto-pick `main` on multi-fn files, matching the default-engine behaviour. With no `main` declared, supply a function-name argument. diff --git a/ai.txt b/ai.txt index 94e727a4..29d1e3f1 100644 --- a/ai.txt +++ b/ai.txt @@ -17,7 +17,7 @@ IMPORTS: Split programs across files with `use`: use "path/to/file.ilo" -- flat PACKAGE REGISTRY: ilo has a lightweight GitHub-based package registry. There is no central server — GitHub is the substrate. [Installing packages] ilo add / -- fetch latest default branch ilo add /@ -- fetch a specific branch, tag, or SHA prefix ilo update -- re-fetch all installed packages ilo update / -- re-fetch one package `ilo add` performs a shallow `git clone` into `~/.ilo/pkgs///` and writes a line to `ilo.lock` in the current directory. [Using installed packages] After `ilo add myorg/helpers`, import the package's `index.ilo` with: use "myorg/helpers" -- imports ~/.ilo/pkgs/myorg/helpers/index.ilo use "myorg/helpers" [foo bar] -- selective import use "myorg/helpers/utils.ilo" -- import a specific file from the package A `use` path whose first component contains no `.` is treated as a package reference, not a local file path. To import a local file in a sibling directory, use an explicit leading `./`: use "./sibling.ilo" -- always local use "myorg/helpers" -- always a package [Lockfile (`ilo.lock`)] `ilo add` writes/updates `ilo.lock` in the current working directory. Commit this file to source control. 
myorg/helpers https://github.com/myorg/helpers Format: tab-separated columns `slug`, `sha`, `url`. Lines starting with `#` are comments. [Non-goals (v1)] Centralised registry hosting (GitHub is the substrate) Semantic versioning enforcement Private registry / auth Transitive dependency resolution ERROR HANDLING: `R ok err` return type. Call then match: get-user uid;?{^e:^+"Lookup failed: "e;~d:use d} Compensate/rollback inline: charge pid amt;?{^e:release rid;^+"Payment failed: "e;~cid:continue} [Auto-Unwrap `!`] `func! args` calls `func` and auto-unwraps the Result: if `~v` (Ok), returns `v`; if `^e` (Err), immediately returns `^e` from the enclosing function. inner x:n>R n t;~x outer x:n>R n t;d=inner! x;~d Equivalent to `r=inner x;?r{~v:v;^e:^e}` but in 1 token instead of 12. Rules: The called function must return `R` or `O` (else verifier error ILO-T025) The enclosing function must return `R` (or `O` for Optional callees) (else verifier error ILO-T026) `!` goes after the function name, before args: `get! url` not `get url!` Zero-arg: `fetch!()` [Panic-Unwrap `!!`] `func!! args` is symmetric in shape with `!`, but on the failure path it aborts the program with a runtime diagnostic and exit code 1 instead of propagating. There is no enclosing-return-type constraint, so persona code can use it from `main>t`, `main>n`, or any non-Result / non-Optional context. main>t;rdl!! "input.txt" -- read file, abort with diagnostic if missing main>n;v=num!! "42";v -- parse number, abort on parse error main>n;m=mset mmap "k" 7;mget!! m "k" -- get value or abort if key missing On `^e` (Err) the program writes `panic-unwrap: ` to stderr and exits 1. On `O nil` the program writes `panic-unwrap: expected value, got nil`. On `~v` (Ok) or non-nil Optional, the inner value is extracted, identical to `!`. 
Rules: The called function must return `R` or `O` (else verifier error ILO-T025) **No constraint on the enclosing function's return type** - this is the difference from `!` `!!` goes after the function name, before args: `rdl!! path` not `rdl path!!` Zero-arg: `fetch!!()` Use `!` when the caller wants to react to the Err (compensate, retry, log). Use `!!` when the failure is a programming or environmental error the caller has no way to recover from - typical in short scripts, glue code, and main entry points. PATTERNS (FOR LLM GENERATORS): [Bind-first pattern] Always bind complex expressions to variables before using them in operators. Operators only accept atoms and nested operators as operands - not function calls. -- DON'T: *n fac -n 1 (fac is an operand of *, not a call) -- DO: r=fac -n 1;*n r (bind call result, then use in operator) [Recursion template] >;;...;;combine 1. **Guard**: base case returns early - `<=n 1 1` (or `<=n 1{1}`) 2. **Bind**: bind recursive call results - `r=fac -n 1` 3. **Combine**: use bound results in final expression - `*n r` [Factorial] fac n:n>n;<=n 1 1;r=fac -n 1;*n r `<=n 1 1` - braceless guard: if n <= 1, return 1 `r=fac -n 1` - recursive call with prefix subtract as argument `*n r` - multiply n by result [Fibonacci] fib n:n>n;<=n 1 n;a=fib -n 1;b=fib -n 2;+a b `<=n 1 n` - braceless guard: return n for 0 and 1 `a=fib -n 1;b=fib -n 2` - two recursive calls, each with prefix arg `+a b` - add results [Tail-call optimisation] ilo guarantees that **tail calls do not consume host-stack frames**. A function that recurses only in tail position can run to arbitrary depth — the runtime trampolines the call by rebinding parameters in place rather than pushing a frame. The manifesto's "Constrained" rule (every feature must pay for itself in tokens) vetoed adding a `loop` keyword. 
Instead, tail-recursive accumulator patterns are the canonical idiom for iteration beyond what `@` foreach covers, and the TCO guarantee makes them safe at any depth. A call is in **tail position** when its return value is the function's return value: the last statement of the body, the expression of a `ret` statement, an arm of a tail-position `?` match, or the body of a braceless guard. Calls inside `@` foreach, `@` range, `wh` loops, or as operands of further computation are NOT in tail position. > **Recursive self-call discarded at non-tail position fires `ILO-T043`.** When a function calls itself before another statement runs, the recursive return is silently dropped — every call falls through to the later statements. The verifier emits `ILO-T043` with a hint pointing at the tail-position fix (move the recursive call to the body's last statement, wrap it in `ret`, or restructure via `?h cond then else`). The warning is narrowly scoped to self-calls (caller name == callee name); bare non-recursive user-fn calls at non-tail position may be side-effecting and do not warn. Surfaced 2026-05-21 by the interp1d persona: see `examples/recursive-tail-position.ilo` for the canonical fix shape. -- Tail-recursive countdown — runs to arbitrary depth. count-down n:n>n;=n 0 0;count-down -n 1 -- Tail-recursive accumulator — sums a list without growing the host stack. sum-acc xs:L n acc:n>n;empty=len xs;=empty 0 acc;sum-acc tl xs +acc hd xs Constraints on the tail-call peephole: The callee must be a direct user-defined function name (not a FnRef in scope, not a closure, not a builtin, not a tool). The call must have no auto-unwrap (`!` / `!!`) — those forms inspect the result before deciding whether to propagate. These constraints leave the common shapes (recursive accumulators, state machines, mutual recursion via direct names) covered. Other shapes still recurse the host stack as before; for deep recursion through non-tail-eligible shapes, restructure into an accumulator. 
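The effect of the trampoline can be illustrated outside ilo. The Python sketch below is not ilo's implementation; it only mirrors the idea described above, under the assumption that a tail call is executed by rebinding the parameters and looping instead of pushing a frame. `count_down` and `sum_acc` are illustrative ports of the two ilo examples, and the index `i` standing in for `tl xs` is a simplification made for this sketch.

```python
def count_down(n):
    # Tail call `count-down -n 1` becomes: rebind n, then loop again.
    while True:
        if n == 0:
            return 0
        n = n - 1


def sum_acc(xs, acc=0):
    # Tail call `sum-acc tl xs +acc hd xs` becomes: rebind both parameters.
    # An index replaces `tl xs` so the list is not copied on every step
    # (an assumption of this sketch, not a claim about ilo's runtime).
    i = 0
    while True:
        if i == len(xs):
            return acc
        # The right-hand side is evaluated with the old i before rebinding.
        i, acc = i + 1, acc + xs[i]
```

Because no frame is pushed per step, `count_down(1_000_000)` completes normally, whereas a directly recursive Python version would exceed the interpreter's default recursion limit.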
Tree interpreter and bytecode VM (`--vm`) support shipped in 0.12.x; the VM emits `OP_TAILCALL` for tail-position user-fn calls and reuses the current call frame instead of pushing a new one, so depth is bounded only by available heap. Cranelift (`--jit`, AOT) gains matching `return_call` lowering in a subsequent PR; until then, deep tail-recursion under the JIT/AOT path recurses the host stack and is bounded by it. [Multi-statement bodies] Semicolons separate statements. Last expression is the return value. f x:n>n;a=*x 2;b=+a 1;*b b -- (x*2 + 1)^2 Bodies may also be written across multiple newline-separated lines, indented under the signature. The parser stays inside the same function body while it sees an open bracket (`[`, `(`, `{`) or a pipe operator continuation. This makes long literals and multi-line conditional pipelines readable without semicolons: f x:n>n a=*x 2 b=+a 1 *b b g>L n [10, 20, 30, 40, 50, 60, 70, 80] Statement separation reverts to standard rules once brackets close. A blank line ends the current declaration. Windows CRLF (`\r\n`) is normalised to `\n` before lexing, so files edited on Windows parse identically to Unix-line-ending files. [Multi-function files] Functions in a file are separated by **newlines**. The parser strips all newlines, so the token stream is flat. After parsing each function body, the parser uses the next newline-delimited boundary to start the next declaration. A non-last function body's **final expression must not be a bare variable reference (`Ref`) or a function call**, because the parser greedily reads following tokens as additional call arguments. 
Safe endings prevent this: Binary operator=`+n 0`, `*x 1`=✓=fixed arity - no greedy loop Index access=`xs.0`, `rec.field`=✓=returns `Expr::Index`, not `Ref` Match block=`?v{…}`=✓=ends with `}` ForEach block=`@x xs{…}`=✓=ends with `}` Parenthesised expr=`(x>>f>>g)`=✓=ends with `)` Record constructor=`point x:1 y:2`=✓=parses as `Expr::Record`, not `Ref` Text/number literal=`"ok"`, `42`=✓=literal, not `Ref` Bare variable (`Ref`)=`n`, `result`=✗=greedy loop fires Bare function call=`len xs`, `f a`=✗=greedy loop fires The **last function in a file** can end with anything - greedy parsing stops at EOF. -- Non-last functions: end with a binary expression digs n:n>n;t=str n;l=len t;+l 0 -- +l 0 = l (binary, safe) clmp n:n lo:n hi:n>n;<n lo lo;>n hi hi;+n 0 -- +n 0 = n (binary, safe; `clamp` is a builtin) -- Last function: bare call is fine sz xs:L n>n;len xs -- EOF - greedy loop stops naturally To use a pipe chain in a non-last function, wrap it in parentheses: dbl-inc x:n>n;(x>>dbl>>inc) -- parens prevent >> from consuming next function's name inc-sq x:n>n;x>>inc>>sq -- last function - no parens needed [DO / DON'T] -- DON'T: fac n:n>n;<=n 1 1;*n fac -n 1 -- ↑ *n sees fac as an atom operand, not a call -- DO: fac n:n>n;<=n 1 1;r=fac -n 1;*n r -- ↑ bind-first: call result goes into r, then *n r works -- DON'T: +fac -n 1 fac -n 2 -- ↑ + takes two operands; fac is just an atom ref -- DO: a=fac -n 1;b=fac -n 2;+a b -- ↑ bind both calls, then combine -ERROR DIAGNOSTICS: ilo verifies programs before execution and reports errors with stable codes, source context, and suggestions. [Error codes] Every error has a stable `ILO-` code. The letter is the namespace - the phase that raised the diagnostic - so agents and tools can route on prefix without parsing the message. Numeric ranges are reserved per namespace with generous gaps, so future codes slot in cleanly and the contract is forward-compatible.
`ILO-L000-099`=L=Lexer / tokenisation=active `ILO-P100-199`=P=Parser / syntax=active `ILO-N200-299`=N=Names / resolution=reserved `ILO-I300-399`=I=Imports=reserved `ILO-T400-499`=T=Types=active `ILO-V500-599`=V=Verifier (post-type checks)=reserved `ILO-R600-699`=R=Runtime=active `ILO-D700-799`=D=Deprecation warnings=reserved `ILO-E800-899`=E=Engine-specific limitations=reserved `ILO-S900-999`=S=Skill / spec system=reserved **Historical codes.** ilo shipped with flat numbering inside each namespace - `ILO-L001`, `ILO-P001`, `ILO-T001`, `ILO-R001`, `ILO-W001`, all starting at 001. Those codes remain valid forever. The hundreds-block allocation above applies to new codes from now on, and a cross-engine regression test asserts every emitted code lives in a documented range. **Reserved namespaces.** `N`, `I`, `V`, `D`, `E`, `S` carry no codes today. They are forward declarations so the first code in each category slots into its own range without conflicting with the active namespaces. `D` is earmarked for deprecation warnings: when a feature is scheduled for removal it emits an `ILO-D7xx` warning at compile time without failing the build. Use `--explain` to see a detailed explanation: ilo --explain ILO-T004 [Source context] Errors point at the relevant source location with a caret: error[ILO-T005]: undefined function 'foo' (called with 1 args) --> 1:9 1 | f x:n>n;foo x = note: in function 'f' = suggestion: did you mean 'f'? Parser, verifier, and runtime errors all show source spans. The verifier uses the enclosing statement span as the best available location for expression-level errors. 
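Routing on the namespace letter, as described under [Error codes], can be sketched as follows. This is a hypothetical helper for an agent or tool, not part of ilo: the function name `route` and its return shape are assumptions, and only the letters and hundreds-blocks listed in the table are encoded. Historical flat codes (for example `ILO-T005`) still route by letter; the third field only reports whether the number sits inside the namespace's documented block.

```python
import re

# Namespace letter -> (phase, start of its hundreds-block), copied from
# the [Error codes] table. Each block spans start .. start + 99.
NAMESPACES = {
    "L": ("Lexer / tokenisation", 0),
    "P": ("Parser / syntax", 100),
    "N": ("Names / resolution", 200),
    "I": ("Imports", 300),
    "T": ("Types", 400),
    "V": ("Verifier (post-type checks)", 500),
    "R": ("Runtime", 600),
    "D": ("Deprecation warnings", 700),
    "E": ("Engine-specific limitations", 800),
    "S": ("Skill / spec system", 900),
}

# One letter followed by exactly three digits; other shapes are not handled.
CODE_RE = re.compile(r"ILO-([A-Z])(\d{3})")


def route(code):
    """Return (letter, phase, in_block), or None if the shape is unknown."""
    m = CODE_RE.fullmatch(code)
    if m is None:
        return None
    letter, number = m.group(1), int(m.group(2))
    entry = NAMESPACES.get(letter)
    if entry is None:
        # Letter exists historically (e.g. W) but is not in the table.
        return (letter, None, False)
    phase, start = entry
    return (letter, phase, start <= number <= start + 99)
```

A caller would typically branch on the first field alone, for example sending every `T` diagnostic to a type-fix prompt, and treat a `None` result as "unknown code shape" rather than an error.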
[Suggestions] The verifier provides context-aware hints: **Did you mean?** - Levenshtein-based suggestions for undefined variables, functions, fields, and types **Type conversion** - suggests `str` for n→t, `num` for t→n **Missing arms** - lists uncovered match patterns with types **Arity** - shows expected parameter signature [Error output formats] --ansi / -a ANSI colour (default for TTY) --text / -t Plain text (no colour) --json / -j JSON (default for piped output) --no-hints / -nh Suppress idiomatic hints --silent / -s Suppress program stdout (mainly for --bench; see below) NO_COLOR=1 Disable colour (same as --text) **`--silent` / `-s`.** Suppresses the program's own stdout (`prnt`, `prnv`, `jprn`, etc.) for the duration of execution. Designed for `ilo --bench`: combined with `--json` it lets agent harnesses (e.g. persona cost rollup) consume the bench JSON envelope on stdout without it being drowned in the benchmarked function's own output. Stderr is never silenced, so genuine errors still surface. Diagnostic output (including the bench JSON envelope and the human-readable bench summary block) is always emitted on stdout regardless of `--silent` — the flag only redirects program-level prints. Unix only (no-op on Windows for the program-stdout half; bench output still reaches stdout there). JSON error output follows a structured schema with `severity`, `code`, `message`, `labels` (with spans), `notes`, and `suggestion` fields. Runtime errors raised from the Cranelift JIT (opt-in via `--jit`) populate `labels` with the source span of the failing operation, matching tree and VM behaviour. Span coverage threads through every JIT runtime helper (unwrap, panic-unwrap, list-get, slice, index, jpth, mget, record-field strict access, builtin dispatch, dynamic call); AOT-compiled binaries inherit the same coverage. Pre-v0.11.6 builds surfaced `{"labels":[]}` for these shapes - if you see an empty labels array on a runtime error, the binary is out of date. 
AOT binaries also install an async-signal-safe handler in `ilo_aot_init` that catches fatal signals (SIGSEGV, SIGBUS, SIGFPE, SIGILL, SIGABRT) and writes a single JSON line on stderr identifying the signal before the process terminates with the conventional 128+signo exit code. The diagnostic uses `ILO-R015` (AOT runtime fault). Without the handler, a hard fault inside compiled native code would leave the process with raw signal exit (e.g. 139 for SIGSEGV) and no diagnostic — agents driving ilo couldn't distinguish a clean non-zero exit from a hard fault. A SIGSEGV from an AOT binary is always a bug in ilo (codegen or runtime helper); file an issue with the source program and the JSON line. [Top-level program output] For a program whose entry function returns a Result, the `~`/`^` wrapper is split across streams and exit codes so shell callers do not have to strip a prefix: `~v` (Ok)=`v` (bare)=-=0 `^e` (Err)=-=`^e`=1 any non-Result=`v`=-=0 In `--json` mode the value is always wrapped (`{"schemaVersion": 1, "ok": v}` / `{"schemaVersion": 1, "error": {...}}`) and emitted to stdout; exit codes match the plain-mode table.
The `schemaVersion` field was added in 0.12.1 to every CLI `--json` envelope (`run`, `graph`, `--ast`, `serv`, `tools --json`, `spec --json`) so agents can route on a single field across every command. See `JSON_OUTPUT.md` for the full audit table. **`-j` short alias (ILO-442).** Every subcommand that accepts `--json` also accepts the `-j` short form with identical behaviour: `ilo check -j file.ilo`, `ilo run -j 'code'`, `ilo spec -j ai`, `ilo skill -j list`, `ilo tools -j --mcp m.json`, `ilo version -j`, `ilo explain -j ILO-T001`, `ilo build -j prog.ilo`, etc. Manifesto P6 (every subcommand has `--json`) plus terser invocations for agent prompts. `Display` on `Value::Ok` / `Value::Err` still renders `~v` / `^e` in every other context (nested values, `prnt`, REPL prompts, error messages, debug output) - only the top-level program-return print path is split. The contract applies uniformly to in-process runners (`ilo prog.ilo`, `--vm`, `--jit`) and to AOT-compiled standalone binaries from `ilo compile`. Both strip the top-level `~`/`^` wrapper on stdout, route `^e` to stderr, and use the same exit codes - output is byte-for-byte identical across every backend. **Auto-echo suppression for `prnt` + status sentinel.** When the entry function has at least one *unconditional top-level* `prnt` call AND the tail expression is a bare wrapped string literal (`~"text"` or `^"text"`), the top-level auto-echo is suppressed. The wrapped literal is treated as a status sentinel rather than a value the caller wants captured. Without this rule, a function shaped like `m>R t t;prnt "report";~"ok"` emits `report\nok\n` on stdout and shell callers piping the output have to strip the trailing `ok`. 
The rule does NOT fire when (a) there is no `prnt` in the body — `m>R t t;~"ok"` still prints `ok` because the wrapped literal IS the program's output (the `cli-tasks-save-ok.ilo` pattern); (b) the `prnt` is nested inside a guard, loop, or match arm — those are conditional and the `prnt` may never run; (c) the tail is `~v` where `v` is a binding or call — that's a real return value. `^"text"` errors still go to stderr with exit 1; the suppression rule never silently swallows an Err. Pinned by `tests/regression_tilde_str_noecho.rs` and `examples/tilde-str-noecho.ilo`. [Idiomatic hints] After successful execution, ilo scans the source for non-canonical forms and emits hints to stderr: hint: `==` → `=` saves 1 char (both mean equality in ilo) hint: `length` → `len` (canonical short form) Builtin alias hints appear at most once per program (the first long-form name found). In JSON mode, hints appear as `{"hints":["..."]}` on stderr. Suppress with `--no-hints` / `-nh`. [CLI invocation] ilo 'code' [args...] 
-- inline program; default-runs the entry function ilo program.ilo [func] [args] -- if `func` is omitted and the file declares exactly one function, that function runs automatically ilo run program.ilo [func] [a] -- verb form; same dispatch as the bare positional ilo check program.ilo [--json|-j] [--strict] -- run the verifier without executing (exit 0 = clean; --strict treats warnings as exit-code errors) ilo test [path] [--engine vm|jit|all] -- run `-- run:` / `-- out:` / `-- err:` assertions in .ilo files (exit 0 on all-pass, 1 on any failure) ilo build program.ilo -o out -- AOT compile to a standalone binary (alias for `compile`) ilo run program.ilo --emit js -- transpile to JavaScript and print to stdout (PR #713, ILO-73) ilo run program.ilo --emit python -- transpile to Python and print to stdout ilo program.ilo --ast -- print parsed AST as JSON and exit ilo --explain ILO-T004 -- print error explanation and exit ilo help ai -- compact AI spec to stdout (= contents of ai.txt) ilo serv -- long-lived JSON request/response loop ilo httpd handler.ilo [--port N] -- HTTP server: calls handler fn per request (default port 8080) ilo --max-ast-depth N -- cap parser nesting at N (default 256; protects `ilo serv` and other untrusted-source paths from DoS payloads, raises ILO-P103) ilo --max-runtime SECS -- cap wall-clock runtime at SECS (default 60; 0 disables; raises ILO-R016) ilo --max-output-bytes BYTES -- cap stdout output at BYTES (default ~100 MB; 0 disables; raises ILO-R017) ilo run --allow-net[=HOSTS] -- restrict outbound net to comma-separated hosts (* = all, empty = none) ilo run --allow-read[=PATHS] -- restrict file reads to comma-separated path prefixes ilo run --allow-write[=PATHS] -- restrict file writes to comma-separated path prefixes ilo run --allow-run[=CMDS] -- restrict subprocess spawning to comma-separated command names **Capability flags (`ILO-CAP-001`).** `ilo run --allow-net=HOSTS --allow-read=PATHS --allow-write=PATHS --allow-run=CMDS` gates IO 
builtins at the process level. Any `--allow-*` flag present switches the runtime from **permissive** (default — no restrictions, full backwards compatibility) to **restricted** (only listed targets are permitted). Denial returns a normal `R` Err value with code `ILO-CAP-001`; programs can pattern-match it. Capability matrix: `get`/`post`/`put`/`patch`/`del`/`fetch` → `--allow-net`; `rd`/`rd-lines`/`ls`/`lsr` → `--allow-read`; `wr`/`wr-lines`/`wr-app` → `--allow-write`; `run`/`run2` → `--allow-run`. Value syntax: omit = unrestricted; `*` = all permitted; empty (`--allow-net=`) = all blocked; comma list = only those targets. Matching: net = hostname extracted from URL, exact or `*.domain` wildcard; read/write = path-prefix with separator boundary; run = basename or full-path match. See `SANDBOX.md` for the operator guide and `examples/capability-sandbox.ilo` for a runnable demo. **Production-safety guards (`ILO-R016`, `ILO-R017`).** `ilo run` caps wall-clock runtime at 60 s and stdout output at ~100 MB by default. A runaway loop (missing increment, recursion with no base case) aborts with `ILO-R016` once the time budget hits, instead of burning CPU forever; a `prnt` loop without termination aborts with `ILO-R017` once the byte budget hits, instead of filling the agent transcript with megabytes of garbage. Both guards write a structured diagnostic to stderr and exit 1. Defaults are well above any legitimate program (real agent tasks finish under 10 s and produce kilobytes); raise with `--max-runtime SECS` / `--max-output-bytes BYTES`, set either to `0` to disable. The guards were installed by the mandelbrot persona report (2026-05-20) which spun in an infinite loop and wrote 165 MB of stdout before the harness intervened. **Verb-noun aliases.** `ilo run ` is an exact alias for the bare positional `ilo ` - same dispatch, same engine selection, same arg handling. `ilo build -o ` is an alias for `ilo compile -o `. 
Both exist to match the toolchain conventions used by `cargo`, `go`, and `zero` so agents and humans can guess the command name without consulting the help text. The bare positional forms remain fully supported for backwards compatibility; nothing has been removed. **`ilo check`.** Standalone verifier invocation: lex, parse, resolve imports, and run the type verifier without proceeding to bytecode compilation or execution. Exit code 0 means the program is well-typed and verifier-clean; exit code 1 means at least one diagnostic was emitted on stderr. The output mode follows the global flags (`--json` for NDJSON diagnostics, `--text` for plain text, `--ansi` for coloured output; auto-detected when omitted - JSON when stderr is not a TTY, ANSI otherwise). `ilo check` works on both files and inline code; on a syntactically-broken input it still reports the parse error rather than crashing, which is important for editor and agent loops that may feed in half-written programs. **`ilo test`.** Runs the `-- run: ` / `-- out: ` (or `-- err: `) annotations embedded in `.ilo` source files - the same format the in-tree `tests/examples_engines.rs` integration harness already uses. A file path tests that one file; a directory walks `*.ilo` recursively. Each case runs as a subprocess (`ilo --vm `), output is asserted against the expected payload, and the result prints as `PASS path::fn (line N)` / `FAIL path::fn (line N) (got: X, want: Y)`. The final line reports `N passed, M failed`. Exit 0 if everything passed, 1 if any case failed or no annotations were found. The default engine is `--vm`; pass `--engine jit` or `--engine all` to widen the matrix. Per-file `-- engine-skip: vm jit` annotations skip the listed engines, matching the integration harness. Because every example under `examples/` uses this annotation format already, `ilo test examples/` doubles as a smoke test for the language itself and as a worked reference an agent can read when writing tests for its own programs. 
**`ilo httpd`.** Starts an HTTP/1.1 server that calls a user-defined ilo handler function for every incoming request. The handler receives a `Request` record and must return a `Response` record (or a bare record with at least `status` and `body` fields). One OS thread is spawned per accepted connection. The handler is loaded once at startup; re-reads require a restart. ilo httpd handler.ilo -- serve on :8080 (default) ilo httpd --port 3000 handler.ilo -- serve on :3000 ilo httpd handler.ilo myhandler -- call function `myhandler` instead of `handler` Handler signature: -- Request fields injected by ilo httpd at runtime: -- method:t HTTP verb (GET, POST, ...) -- path:t request path including query string -- headers:M t t request headers (keys lowercased) -- body:t request body (empty string when absent) -- -- Response fields read by ilo httpd: -- status:n HTTP status code (200, 404, 500, ...) -- body:t response body -- headers:M t t optional response headers type rsp{status:n;body:t} handler req:_>rsp p=req.path msg=+"Hello! You requested: " p rsp status:200 body:msg Use `req:_` (wildcard) for the request param type — the `Request` record is created by the ilo httpd runtime and its field types cannot be declared in the handler source without a `type` alias that re-exports them. The dot-access `req.path`, `req.method`, `req.body`, `req.headers` work because ilo resolves record field access by name at runtime. `Content-Type` defaults to `text/plain; charset=utf-8` when not set in the response headers map. Distinct from `ilo serv` (which speaks the agent-protocol JSON-RPC loop); `httpd` is for user-facing HTTP traffic. The handler file's `use` imports are resolved at startup, relative to the handler's own directory, matching `ilo run` / `ilo check` (ILO-481). A handler can split logic across sibling modules (`use "store.ilo"`) rather than inlining everything. A missing import surfaces a real diagnostic and the server refuses to start. 
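The ILO-482 changelog entry describes a lazy body being written as one chunked-transfer block per yielded line. The wire framing that implies can be sketched in Python as below. This is an illustration of standard HTTP/1.1 chunked encoding, not ilo's source: the function name `chunked` is hypothetical, and appending `\n` to each line before framing is an assumption the text does not confirm.

```python
def chunked(lines):
    """Lazily frame each line from an iterator as one HTTP/1.1 chunk."""
    for line in lines:
        # Assumption: each line is sent with a trailing newline.
        data = (line + "\n").encode("utf-8")
        # Chunk = size in hex, CRLF, payload, CRLF.
        yield b"%x\r\n" % len(data) + data + b"\r\n"
    # A zero-length chunk terminates the body.
    yield b"0\r\n\r\n"
```

Because `chunked` is a generator, a frame is produced only when the next line is pulled from the source, which is the property that lets a server write and flush each chunk before the rest of the body exists. An eager `L t` body would instead be fully built before the first frame is emitted.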
**`ilo check --strict`.** Treats every warning-severity diagnostic (ILO-T032 bare `fmt`, ILO-T033 bare `mset` / `+=` / `mdel`, ILO-W002 `@x (jpar! …){…}` steering to `jpar-list!`, future warning codes) as a hard exit-code failure. The diagnostic stream itself is unchanged: warnings still emit with `severity: "warning"` in the JSON output, so editor integrations that route by severity stay correct. Only the exit code is elevated. CI harnesses that gate merges on `ilo check` should use `--strict` so warnings can't slip through silently; for interactive use, the default (warnings-are-advisory) is the right behaviour. **Default-run.** Inline programs (`ilo 'code'`) and single-function files run their entry function with the remaining CLI args; no explicit function name needed. Multi-function files auto-pick a function called `main` when no positional func arg is supplied. The same heuristic applies to the explicit engine flags - `--vm` and `--jit` both auto-pick `main` on multi-fn files, matching the default-engine behaviour. With no `main` declared, supply a function-name argument. **AOT entry-pick.** `ilo compile file.ilo -o out` (alias `ilo build`) follows the same entry-pick rules as the in-process engines: a single user-defined function is used directly; on multi-function files the entry is `main` if defined, otherwise the explicit positional `func` arg (`ilo compile file.ilo -o out run`); otherwise the compile fails with `ILO-E801` and exits 1 without writing a binary. AOT does not fall back to "first declared function" - that historical default produced binaries that called the wrong entry symbol and SIGSEGV'd at runtime. **Default engine.** The bytecode register VM is the default execution path. It supports every opcode (closures with Phase 2 capture, listview windows, fused len-of-filter, every modern shape), and avoids the JIT compile-and-bail cost paid by the pre-v0.11.9 Cranelift-first default whenever a program touched an opcode the JIT couldn't handle. 
Cranelift JIT is opt-in via `--jit`; on opt-in, the JIT runs hot numeric loops and falls back to the VM on bailout. Phase 2 captures run natively on every public backend - VM, JIT, and AOT (`ilo compile`); AOT embeds the postcard `CompiledProgram` blob into the binary's `.rodata` so dispatch helpers can re-enter the VM on user-fn callbacks the same way the in-process runners do. For long-running workloads where the JIT pays for itself, opt in explicitly; for most agent workloads the VM is the right default. **Tree-walker is internal-only.** The tree-walking interpreter is no longer user-selectable: `--run-tree` and its `--run` alias were removed from the public CLI in 0.12.1 (they now error with the unknown-flag guard). The interpreter stays in-tree as the dispatch target for HOF / regex / fmt-variadic / IO / sleep / ct / rsrt / closure-bind-ctx shapes the VM and Cranelift haven't lifted natively yet - the VM bails to it transparently for the ops listed by `is_tree_bridge_eligible` (`rgx`, `rgxall`, `rgxall1`, `rgxall-multi`, `rgxsub`, `fmt`, `fmt2`, `rd`, `rdb`, `rdjl`, `rdin`, `rdinl`, `for-line`, `sleep`, `lsd`, `walk`, `glob`, `dirname`, `basename`, `pathjoin`, `fsize`, `mtime`, `isfile`, `isdir`, `run`, `env-all`, `jkeys`, `tz-offset`, `ct` 2-arg and 3-arg, `rsrt` 2-arg and 3-arg, `dur-parse`, `dur-fmt`, and the closure-bind ctx variants of `map`/`flt`/`fld`/`srt`). Cross-engine parity for those shapes is pinned by `tests/regression_builtin_bridge.rs` and `tests/regression_tree_bridge_invariants.rs`. 0.13.0+ is on track for a hard drop once the bridge consumers are lifted natively and the shared runtime types (`Value`, `MapKey`, `RuntimeError`, math helpers) are extracted from `src/interpreter/` to a non-engine module. **Subcommand dispatch.** The first positional argument is interpreted as a function name when it has the shape of an ilo identifier - `[a-z][a-z0-9]*(-[a-z0-9]+)*` - so `ilo file.ilo list-orders` routes to the `list-orders` function. 
Args that don't match the ident shape (file paths like `/tmp/data.json`, numbers, sigils, bracketed lists, anything with a `.` or `/`) route to `main` (or the entry function) as a positional CLI arg instead. Trailing dashes (`foo-`), doubled dashes (`foo--bar`), and negative numbers (`-1`) are not idents and pass through as data. **Unknown `--flag` guard.** Any token in the positional tail matching the clean long-flag shape `--word` or `--word-with-dashes` that isn't a recognised flag is rejected upfront with `error: unrecognised flag '--'. Use 'ilo --help' for valid flags. To pass it as a literal arg, separate with '--' first.` and exit 1. This prevents `ilo main.ilo --engine tree` from silently consuming `--engine` as a positional arg (which used to surface as misleading `ILO-R012 no functions defined` or `ILO-R004 main: expected N args, got N+1`). To pass a hyphen-prefixed token through as literal data, place the `--` separator first: `ilo main.ilo -- --foo`. Anything after the first `--` is data. Tokens with `=` (`--key=val`), trailing or doubled dashes (`--foo-`, `--foo--bar`), and negative numbers (`-1`) are not clean flag shapes and pass through unchanged. **Text-typed params.** When the entry function declares a parameter of type `t`, the CLI passes the raw arg through without numeric coercion. `ilo 'f x:t>t;x' 42` returns the string `"42"`, not the number 42. **Exit codes.** A program returning `Value::Err` (or `^reason` from the entry function) exits with code 1 and prints the err payload on stderr. `~v` (Ok) and any non-Result return value exit 0. Verifier and parser errors exit 2. **List args from the CLI.** Comma-separated args become `L n` or `L t` automatically: `ilo 'f xs:L n>n;sum xs' 1,2,3`. +ERROR DIAGNOSTICS: ilo verifies programs before execution and reports errors with stable codes, source context, and suggestions. [Error codes] Every error has a stable `ILO-` code. 
The letter is the namespace - the phase that raised the diagnostic - so agents and tools can route on prefix without parsing the message. Numeric ranges are reserved per namespace with generous gaps, so future codes slot in cleanly and the contract is forward-compatible. `ILO-L000-099`=L=Lexer / tokenisation=active `ILO-P100-199`=P=Parser / syntax=active `ILO-N200-299`=N=Names / resolution=reserved `ILO-I300-399`=I=Imports=reserved `ILO-T400-499`=T=Types=active `ILO-V500-599`=V=Verifier (post-type checks)=reserved `ILO-R600-699`=R=Runtime=active `ILO-D700-799`=D=Deprecation warnings=reserved `ILO-E800-899`=E=Engine-specific limitations=reserved `ILO-S900-999`=S=Skill / spec system=reserved **Historical codes.** ilo shipped with flat numbering inside each namespace - `ILO-L001`, `ILO-P001`, `ILO-T001`, `ILO-R001`, `ILO-W001`, all starting at 001. Those codes remain valid forever. The hundreds-block allocation above applies to new codes from now on, and a cross-engine regression test asserts every emitted code lives in a documented range. **Reserved namespaces.** `N`, `I`, `V`, `D`, `E`, `S` carry no codes today. They are forward declarations so the first code in each category slots into its own range without conflicting with the active namespaces. `D` is earmarked for deprecation warnings: when a feature is scheduled for removal it emits an `ILO-D7xx` warning at compile time without failing the build. Use `--explain` to see a detailed explanation: ilo --explain ILO-T004 [Source context] Errors point at the relevant source location with a caret: error[ILO-T005]: undefined function 'foo' (called with 1 args) --> 1:9 1 | f x:n>n;foo x = note: in function 'f' = suggestion: did you mean 'f'? Parser, verifier, and runtime errors all show source spans. The verifier uses the enclosing statement span as the best available location for expression-level errors. 
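The prefix routing described under [Error codes] can be sketched in a few lines. This is a minimal illustration, not part of ilo: the helper names `split_code` and `phase` are hypothetical, and the sketch assumes the simple `ILO-<letter><digits>` shape only (multi-segment codes such as `ILO-CAP-001` are deliberately reported as unrecognised rather than guessed at).

```rust
/// Split a diagnostic code such as "ILO-T004" into (namespace letter, number).
/// Hypothetical helper for illustration; returns None for any other shape.
fn split_code(code: &str) -> Option<(char, u32)> {
    let rest = code.strip_prefix("ILO-")?;
    let mut chars = rest.chars();
    // The first character after the prefix is the namespace letter.
    let ns = chars.next().filter(|c| c.is_ascii_uppercase())?;
    // Everything after the letter must be ASCII digits.
    let digits = chars.as_str();
    if digits.is_empty() || !digits.chars().all(|c| c.is_ascii_digit()) {
        return None;
    }
    Some((ns, digits.parse().ok()?))
}

/// Map a namespace letter to the phase named in the table above.
fn phase(ns: char) -> &'static str {
    match ns {
        'L' => "lexer",
        'P' => "parser",
        'N' => "names",
        'I' => "imports",
        'T' => "types",
        'V' => "verifier",
        'R' => "runtime",
        'D' => "deprecation",
        'E' => "engine",
        'S' => "skill/spec",
        _ => "unknown",
    }
}
```

Because historical codes use flat numbering, a router of this kind would key on the letter alone and treat the number as opaque.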
[Suggestions] The verifier provides context-aware hints: **Did you mean?** - Levenshtein-based suggestions for undefined variables, functions, fields, and types **Type conversion** - suggests `str` for n→t, `num` for t→n **Missing arms** - lists uncovered match patterns with types **Arity** - shows expected parameter signature [Error output formats] --ansi / -a ANSI colour (default for TTY) --text / -t Plain text (no colour) --json / -j JSON (default for piped output) --no-hints / -nh Suppress idiomatic hints --silent / -s Suppress program stdout (mainly for --bench; see below) NO_COLOR=1 Disable colour (same as --text) **`--silent` / `-s`.** Suppresses the program's own stdout (`prnt`, `prnv`, `jprn`, etc.) for the duration of execution. Designed for `ilo --bench`: combined with `--json` it lets agent harnesses (e.g. persona cost rollup) consume the bench JSON envelope on stdout without it being drowned in the benchmarked function's own output. Stderr is never silenced, so genuine errors still surface. Diagnostic output (including the bench JSON envelope and the human-readable bench summary block) is always emitted on stdout regardless of `--silent` — the flag only redirects program-level prints. Unix only (no-op on Windows for the program-stdout half; bench output still reaches stdout there). JSON error output follows a structured schema with `severity`, `code`, `message`, `labels` (with spans), `notes`, and `suggestion` fields. Runtime errors raised from the Cranelift JIT (opt-in via `--jit`) populate `labels` with the source span of the failing operation, matching tree and VM behaviour. Span coverage threads through every JIT runtime helper (unwrap, panic-unwrap, list-get, slice, index, jpth, mget, record-field strict access, builtin dispatch, dynamic call); AOT-compiled binaries inherit the same coverage. Pre-v0.11.6 builds surfaced `{"labels":[]}` for these shapes - if you see an empty labels array on a runtime error, the binary is out of date. 
AOT binaries also install an async-signal-safe handler in `ilo_aot_init` that catches fatal signals (SIGSEGV, SIGBUS, SIGFPE, SIGILL, SIGABRT) and writes a single JSON line on stderr identifying the signal before the process terminates with the conventional 128+signo exit code. The diagnostic uses `ILO-R015` (AOT runtime fault). Without the handler, a hard fault inside compiled native code would leave the process with a raw signal exit (e.g. 139 for SIGSEGV) and no diagnostic - agents driving ilo couldn't distinguish a clean non-zero exit from a hard fault. A SIGSEGV from an AOT binary is always a bug in ilo (codegen or runtime helper); file an issue with the source program and the JSON line. [Top-level program output] For a program whose entry function returns a Result, the `~`/`^` wrapper is split across streams and exit codes so shell callers do not have to strip a prefix: `~v` (Ok)=`v` (bare)=-=0 `^e` (Err)=-=`^e`=1 any non-Result=`v`=-=0 In `--json` mode the value is always wrapped (`{"schemaVersion": 1, "ok": v}` / `{"schemaVersion": 1, "error": {...}}`) and emitted to stdout; exit codes match the plain-mode table.
The `schemaVersion` field was added in 0.12.1 to every CLI `--json` envelope (`run`, `graph`, `--ast`, `serv`, `tools --json`, `spec --json`) so agents can route on a single field across every command. See `JSON_OUTPUT.md` for the full audit table. **`-j` short alias (ILO-442).** Every subcommand that accepts `--json` also accepts the `-j` short form with identical behaviour: `ilo check -j file.ilo`, `ilo run -j 'code'`, `ilo spec -j ai`, `ilo skill -j list`, `ilo tools -j --mcp m.json`, `ilo version -j`, `ilo explain -j ILO-T001`, `ilo build -j prog.ilo`, etc. Manifesto P6 (every subcommand has `--json`) plus terser invocations for agent prompts. `Display` on `Value::Ok` / `Value::Err` still renders `~v` / `^e` in every other context (nested values, `prnt`, REPL prompts, error messages, debug output) - only the top-level program-return print path is split. The contract applies uniformly to in-process runners (`ilo prog.ilo`, `--vm`, `--jit`) and to AOT-compiled standalone binaries from `ilo compile`. Both strip the top-level `~`/`^` wrapper on stdout, route `^e` to stderr, and use the same exit codes - output is byte-for-byte identical across every backend. **Auto-echo suppression for `prnt` + status sentinel.** When the entry function has at least one *unconditional top-level* `prnt` call AND the tail expression is a bare wrapped string literal (`~"text"` or `^"text"`), the top-level auto-echo is suppressed. The wrapped literal is treated as a status sentinel rather than a value the caller wants captured. Without this rule, a function shaped like `m>R t t;prnt "report";~"ok"` emits `report\nok\n` on stdout and shell callers piping the output have to strip the trailing `ok`. 
The rule does NOT fire when (a) there is no `prnt` in the body — `m>R t t;~"ok"` still prints `ok` because the wrapped literal IS the program's output (the `cli-tasks-save-ok.ilo` pattern); (b) the `prnt` is nested inside a guard, loop, or match arm — those are conditional and the `prnt` may never run; (c) the tail is `~v` where `v` is a binding or call — that's a real return value. `^"text"` errors still go to stderr with exit 1; the suppression rule never silently swallows an Err. Pinned by `tests/regression_tilde_str_noecho.rs` and `examples/tilde-str-noecho.ilo`. [Idiomatic hints] After successful execution, ilo scans the source for non-canonical forms and emits hints to stderr: hint: `==` → `=` saves 1 char (both mean equality in ilo) hint: `length` → `len` (canonical short form) Builtin alias hints appear at most once per program (the first long-form name found). In JSON mode, hints appear as `{"hints":["..."]}` on stderr. Suppress with `--no-hints` / `-nh`. [CLI invocation] ilo 'code' [args...] 
-- inline program; default-runs the entry function ilo program.ilo [func] [args] -- if `func` is omitted and the file declares exactly one function, that function runs automatically ilo run program.ilo [func] [a] -- verb form; same dispatch as the bare positional ilo check program.ilo [--json|-j] [--strict] -- run the verifier without executing (exit 0 = clean; --strict treats warnings as exit-code errors) ilo test [path] [--engine vm|jit|all] -- run `-- run:` / `-- out:` / `-- err:` assertions in .ilo files (exit 0 on all-pass, 1 on any failure) ilo build program.ilo -o out -- AOT compile to a standalone binary (alias for `compile`) ilo run program.ilo --emit js -- transpile to JavaScript and print to stdout (PR #713, ILO-73) ilo run program.ilo --emit python -- transpile to Python and print to stdout ilo program.ilo --ast -- print parsed AST as JSON and exit ilo --explain ILO-T004 -- print error explanation and exit ilo help ai -- compact AI spec to stdout (= contents of ai.txt) ilo serv -- long-lived JSON request/response loop ilo httpd handler.ilo [--port N] -- HTTP server: calls handler fn per request (default port 8080) ilo --max-ast-depth N -- cap parser nesting at N (default 256; protects `ilo serv` and other untrusted-source paths from DoS payloads, raises ILO-P103) ilo --max-runtime SECS -- cap wall-clock runtime at SECS (default 60; 0 disables; raises ILO-R016) ilo --max-output-bytes BYTES -- cap stdout output at BYTES (default ~100 MB; 0 disables; raises ILO-R017) ilo run --allow-net[=HOSTS] -- restrict outbound net to comma-separated hosts (* = all, empty = none) ilo run --allow-read[=PATHS] -- restrict file reads to comma-separated path prefixes ilo run --allow-write[=PATHS] -- restrict file writes to comma-separated path prefixes ilo run --allow-run[=CMDS] -- restrict subprocess spawning to comma-separated command names **Capability flags (`ILO-CAP-001`).** `ilo run --allow-net=HOSTS --allow-read=PATHS --allow-write=PATHS --allow-run=CMDS` gates IO 
builtins at the process level. Any `--allow-*` flag present switches the runtime from **permissive** (default — no restrictions, full backwards compatibility) to **restricted** (only listed targets are permitted). Denial returns a normal `R` Err value with code `ILO-CAP-001`; programs can pattern-match it. Capability matrix: `get`/`post`/`put`/`patch`/`del`/`fetch` → `--allow-net`; `rd`/`rd-lines`/`ls`/`lsr` → `--allow-read`; `wr`/`wr-lines`/`wr-app` → `--allow-write`; `run`/`run2` → `--allow-run`. Value syntax: omit = unrestricted; `*` = all permitted; empty (`--allow-net=`) = all blocked; comma list = only those targets. Matching: net = hostname extracted from URL, exact or `*.domain` wildcard; read/write = path-prefix with separator boundary; run = basename or full-path match. See `SANDBOX.md` for the operator guide and `examples/capability-sandbox.ilo` for a runnable demo. **Production-safety guards (`ILO-R016`, `ILO-R017`).** `ilo run` caps wall-clock runtime at 60 s and stdout output at ~100 MB by default. A runaway loop (missing increment, recursion with no base case) aborts with `ILO-R016` once the time budget hits, instead of burning CPU forever; a `prnt` loop without termination aborts with `ILO-R017` once the byte budget hits, instead of filling the agent transcript with megabytes of garbage. Both guards write a structured diagnostic to stderr and exit 1. Defaults are well above any legitimate program (real agent tasks finish under 10 s and produce kilobytes); raise with `--max-runtime SECS` / `--max-output-bytes BYTES`, set either to `0` to disable. The guards were installed by the mandelbrot persona report (2026-05-20) which spun in an infinite loop and wrote 165 MB of stdout before the harness intervened. **Verb-noun aliases.** `ilo run ` is an exact alias for the bare positional `ilo ` - same dispatch, same engine selection, same arg handling. `ilo build -o ` is an alias for `ilo compile -o `. 
Both exist to match the toolchain conventions used by `cargo`, `go`, and `zero` so agents and humans can guess the command name without consulting the help text. The bare positional forms remain fully supported for backwards compatibility; nothing has been removed. **`ilo check`.** Standalone verifier invocation: lex, parse, resolve imports, and run the type verifier without proceeding to bytecode compilation or execution. Exit code 0 means the program is well-typed and verifier-clean; exit code 1 means at least one diagnostic was emitted on stderr. The output mode follows the global flags (`--json` for NDJSON diagnostics, `--text` for plain text, `--ansi` for coloured output; auto-detected when omitted - JSON when stderr is not a TTY, ANSI otherwise). `ilo check` works on both files and inline code; on a syntactically-broken input it still reports the parse error rather than crashing, which is important for editor and agent loops that may feed in half-written programs. **`ilo test`.** Runs the `-- run: ` / `-- out: ` (or `-- err: `) annotations embedded in `.ilo` source files - the same format the in-tree `tests/examples_engines.rs` integration harness already uses. A file path tests that one file; a directory walks `*.ilo` recursively. Each case runs as a subprocess (`ilo --vm `), output is asserted against the expected payload, and the result prints as `PASS path::fn (line N)` / `FAIL path::fn (line N) (got: X, want: Y)`. The final line reports `N passed, M failed`. Exit 0 if everything passed, 1 if any case failed or no annotations were found. The default engine is `--vm`; pass `--engine jit` or `--engine all` to widen the matrix. Per-file `-- engine-skip: vm jit` annotations skip the listed engines, matching the integration harness. Because every example under `examples/` uses this annotation format already, `ilo test examples/` doubles as a smoke test for the language itself and as a worked reference an agent can read when writing tests for its own programs. 
**`ilo httpd`.** Starts an HTTP/1.1 server that calls a user-defined ilo handler function for every incoming request. The handler receives a `Request` record and must return a `Response` record (or a bare record with at least `status` and `body` fields). One OS thread is spawned per accepted connection. The handler is loaded once at startup; re-reads require a restart. ilo httpd handler.ilo -- serve on :8080 (default) ilo httpd --port 3000 handler.ilo -- serve on :3000 ilo httpd handler.ilo myhandler -- call function `myhandler` instead of `handler` Handler signature: -- Request fields injected by ilo httpd at runtime: -- method:t HTTP verb (GET, POST, ...) -- path:t request path including query string -- headers:M t t request headers (keys lowercased) -- body:t request body (empty string when absent) -- -- Response fields read by ilo httpd: -- status:n HTTP status code (200, 404, 500, ...) -- body response body: t (buffered), L t (eager chunked), -- or a lazy iterator (get-stream / for-line stdin) for -- true incremental streaming -- headers:M t t optional response headers type rsp{status:n;body:t} handler req:_>rsp p=req.path msg=+"Hello! You requested: " p rsp status:200 body:msg Use `req:_` (wildcard) for the request param type — the `Request` record is created by the ilo httpd runtime and its field types cannot be declared in the handler source without a `type` alias that re-exports them. The dot-access `req.path`, `req.method`, `req.body`, `req.headers` work because ilo resolves record field access by name at runtime. `Content-Type` defaults to `text/plain; charset=utf-8` when not set in the response headers map. Distinct from `ilo serv` (which speaks the agent-protocol JSON-RPC loop); `httpd` is for user-facing HTTP traffic. The handler file's `use` imports are resolved at startup, relative to the handler's own directory, matching `ilo run` / `ilo check` (ILO-481). 
A handler can split logic across sibling modules (`use "store.ilo"`) rather than inlining everything. A missing import surfaces a real diagnostic and the server refuses to start. The response `body` field may take three shapes (ILO-482): * `t` — a plain string, sent with `Content-Length` (the default). * `L t` — a list of strings, sent eagerly with `Transfer-Encoding: chunked`: each element becomes one chunk. The list is materialised before the first byte is written. * a lazy line iterator (`get-stream`/`pst-stream`, `for-line stdin`) — sent with `Transfer-Encoding: chunked` **lazily**: each line the iterator yields is written and flushed as its own chunk, so the handler can hold the connection open and emit chunks as they are produced (SSE, long-poll, tailing a growing source) without buffering the whole body first. If the client disconnects mid-stream the connection thread exits cleanly. A zero-arg `body` function (`FnRef`/closure) is called first and may itself return any of the three shapes. **`ilo check --strict`.** Treats every warning-severity diagnostic (ILO-T032 bare `fmt`, ILO-T033 bare `mset` / `+=` / `mdel`, ILO-W002 `@x (jpar! …){…}` steering to `jpar-list!`, future warning codes) as a hard exit-code failure. The diagnostic stream itself is unchanged: warnings still emit with `severity: "warning"` in the JSON output, so editor integrations that route by severity stay correct. Only the exit code is elevated. CI harnesses that gate merges on `ilo check` should use `--strict` so warnings can't slip through silently; for interactive use, the default (warnings-are-advisory) is the right behaviour. **Default-run.** Inline programs (`ilo 'code'`) and single-function files run their entry function with the remaining CLI args; no explicit function name needed. Multi-function files auto-pick a function called `main` when no positional func arg is supplied. 
The same heuristic applies to the explicit engine flags - `--vm` and `--jit` both auto-pick `main` on multi-fn files, matching the default-engine behaviour. With no `main` declared, supply a function-name argument. **AOT entry-pick.** `ilo compile file.ilo -o out` (alias `ilo build`) follows the same entry-pick rules as the in-process engines: a single user-defined function is used directly; on multi-function files the entry is `main` if defined, otherwise the explicit positional `func` arg (`ilo compile file.ilo -o out run`); otherwise the compile fails with `ILO-E801` and exits 1 without writing a binary. AOT does not fall back to "first declared function" - that historical default produced binaries that called the wrong entry symbol and SIGSEGV'd at runtime. **Default engine.** The bytecode register VM is the default execution path. It supports every opcode (closures with Phase 2 capture, listview windows, fused len-of-filter, every modern shape), and avoids the JIT compile-and-bail cost paid by the pre-v0.11.9 Cranelift-first default whenever a program touched an opcode the JIT couldn't handle. Cranelift JIT is opt-in via `--jit`; on opt-in, the JIT runs hot numeric loops and falls back to the VM on bailout. Phase 2 captures run natively on every public backend - VM, JIT, and AOT (`ilo compile`); AOT embeds the postcard `CompiledProgram` blob into the binary's `.rodata` so dispatch helpers can re-enter the VM on user-fn callbacks the same way the in-process runners do. For long-running workloads where the JIT pays for itself, opt in explicitly; for most agent workloads the VM is the right default. **Tree-walker is internal-only.** The tree-walking interpreter is no longer user-selectable: `--run-tree` and its `--run` alias were removed from the public CLI in 0.12.1 (they now error with the unknown-flag guard). 
The interpreter stays in-tree as the dispatch target for HOF / regex / fmt-variadic / IO / sleep / ct / rsrt / closure-bind-ctx shapes the VM and Cranelift haven't lifted natively yet - the VM bails to it transparently for the ops listed by `is_tree_bridge_eligible` (`rgx`, `rgxall`, `rgxall1`, `rgxall-multi`, `rgxsub`, `fmt`, `fmt2`, `rd`, `rdb`, `rdjl`, `rdin`, `rdinl`, `for-line`, `sleep`, `lsd`, `walk`, `glob`, `dirname`, `basename`, `pathjoin`, `fsize`, `mtime`, `isfile`, `isdir`, `run`, `env-all`, `jkeys`, `tz-offset`, `ct` 2-arg and 3-arg, `rsrt` 2-arg and 3-arg, `dur-parse`, `dur-fmt`, and the closure-bind ctx variants of `map`/`flt`/`fld`/`srt`). Cross-engine parity for those shapes is pinned by `tests/regression_builtin_bridge.rs` and `tests/regression_tree_bridge_invariants.rs`. 0.13.0+ is on track for a hard drop once the bridge consumers are lifted natively and the shared runtime types (`Value`, `MapKey`, `RuntimeError`, math helpers) are extracted from `src/interpreter/` to a non-engine module. **Subcommand dispatch.** The first positional argument is interpreted as a function name when it has the shape of an ilo identifier - `[a-z][a-z0-9]*(-[a-z0-9]+)*` - so `ilo file.ilo list-orders` routes to the `list-orders` function. Args that don't match the ident shape (file paths like `/tmp/data.json`, numbers, sigils, bracketed lists, anything with a `.` or `/`) route to `main` (or the entry function) as a positional CLI arg instead. Trailing dashes (`foo-`), doubled dashes (`foo--bar`), and negative numbers (`-1`) are not idents and pass through as data. **Unknown `--flag` guard.** Any token in the positional tail matching the clean long-flag shape `--word` or `--word-with-dashes` that isn't a recognised flag is rejected upfront with `error: unrecognised flag '--'. Use 'ilo --help' for valid flags. To pass it as a literal arg, separate with '--' first.` and exit 1. 
This prevents `ilo main.ilo --engine tree` from silently consuming `--engine` as a positional arg (which used to surface as misleading `ILO-R012 no functions defined` or `ILO-R004 main: expected N args, got N+1`). To pass a hyphen-prefixed token through as literal data, place the `--` separator first: `ilo main.ilo -- --foo`. Anything after the first `--` is data. Tokens with `=` (`--key=val`), trailing or doubled dashes (`--foo-`, `--foo--bar`), and negative numbers (`-1`) are not clean flag shapes and pass through unchanged. **Text-typed params.** When the entry function declares a parameter of type `t`, the CLI passes the raw arg through without numeric coercion. `ilo 'f x:t>t;x' 42` returns the string `"42"`, not the number 42. **Exit codes.** A program returning `Value::Err` (or `^reason` from the entry function) exits with code 1 and prints the err payload on stderr. `~v` (Ok) and any non-Result return value exit 0. Verifier and parser errors exit 2. **List args from the CLI.** Comma-separated args become `L n` or `L t` automatically: `ilo 'f xs:L n>n;sum xs' 1,2,3`. FORMATTER: Dense output is the default - newlines are for humans, not agents. No flag needed for dense format: ilo 'code' Dense wire format (default) ilo 'code' --dense / -d Same, explicit ilo 'code' --expanded / -e Expanded human format (for code review) [Dense format] Single line per declaration, minimal whitespace. Operators glue to first operand: cls sp:n>t;>=sp 1000{"gold"};>=sp 500{"silver"};"bronze" [Expanded format] Multi-line with 2-space indentation. Operators spaced from operands: cls sp:n > t >= sp 1000 { "gold" } >= sp 500 { "silver" } "bronze" Dense format is canonical - `dense(parse(dense(parse(src)))) == dense(parse(src))`. 
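The identifier shape that subcommand dispatch matches, `[a-z][a-z0-9]*(-[a-z0-9]+)*`, can be checked without a regex engine. The sketch below is an illustration under that stated pattern only; the function name `is_ident_shape` is hypothetical and this is not ilo's actual dispatch code.

```rust
/// Return true when `s` matches `[a-z][a-z0-9]*(-[a-z0-9]+)*`.
/// Hypothetical helper for illustration.
fn is_ident_shape(s: &str) -> bool {
    let is_tail = |c: char| c.is_ascii_lowercase() || c.is_ascii_digit();
    let mut segments = s.split('-');

    // First segment: one lowercase letter, then any number of [a-z0-9].
    let first = match segments.next() {
        Some(seg) => seg,
        None => return false,
    };
    let mut chars = first.chars();
    match chars.next() {
        Some(c) if c.is_ascii_lowercase() => {}
        _ => return false,
    }
    if !chars.all(is_tail) {
        return false;
    }

    // Every later segment must be non-empty and made of [a-z0-9] only,
    // which rules out trailing dashes and doubled dashes.
    segments.all(|seg| !seg.is_empty() && seg.chars().all(is_tail))
}
```

Under this check `list-orders` is an identifier, while `foo-`, `foo--bar`, `-1`, and `/tmp/data.json` are not and would pass through as data.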
COMPLETE EXAMPLE: tool get-user"Retrieve user by ID" uid:t>R profile t timeout:5,retry:2 tool send-email"Send an email" to:t subject:t body:t>R _ t timeout:10,retry:1 type profile{id:t;name:t;email:t;verified:b} ntf uid:t msg:t>R _ t;get-user uid;?{^e:^+"Lookup failed: "e;~d:!d.verified{^"Email not verified"};send-email d.email "Notification" msg;?{^e:^+"Send failed: "e;~_:~_}} [Recursive Example] Factorial and Fibonacci as standalone functions: fac n:n>n;<=n 1 1;r=fac -n 1;*n r fib n:n>n;<=n 1 n;a=fib -n 1;b=fib -n 2;+a b STABILITY: See STABILITY.md at repo root for the per-surface stability matrix. Three tiers: stable (schemaVersion:1 envelope, ILO-error-codes, serv-protocol-phases, file-version-pragma, manifesto-principles, reserved-name-policy), provisional (builtin-signatures, cli-flag-names, error-message-prose, examples-corpus, ilo-test-surface), experimental (0.13-in-flight-features, aot-artifact-format, cranelift-jit-internals, extensions-dir, cargo-feature-flags). Stable surfaces are safe to pin across releases. Provisional surfaces carry a deprecation-window guarantee. Experimental surfaces may disappear without notice. `ilo spec --json ai` surfaces this matrix in the `stability` field of the JSON envelope, and per-item stability annotations on every builtin in the `builtins` array. diff --git a/docs/streaming.md b/docs/streaming.md index 5aef888c..73f929d8 100644 --- a/docs/streaming.md +++ b/docs/streaming.md @@ -54,6 +54,46 @@ relative to the handler's own directory, exactly like `ilo run` and (`use "store.ilo"`) instead of inlining everything in one file. A missing import surfaces a real diagnostic and the server refuses to start. 
+### Response body shapes + +The response `body` field takes one of three shapes: + +| Body value | Wire format | Buffering | +|---|---|---| +| `t` (string) | `Content-Length` | n/a (already a string) | +| `L t` (list) | `Transfer-Encoding: chunked` | list materialised before first byte | +| lazy iterator | `Transfer-Encoding: chunked` | **none** — streamed line by line | + +A lazy iterator body (ILO-482) is the SSE / long-poll / file-tail path: +each line the iterator yields is written and flushed as its own chunk as +soon as the handler produces it, so the connection can be held open +indefinitely and the body is never fully buffered. Any value that drains +lazily works as the body: + +```ilo +type rsp{status:n;body:_} + +-- Proxy an upstream SSE / chunked source straight through, streaming. +handler req:_>rsp + rsp status:200 body:(get-stream "http://localhost:7778/events/stream") +``` + +`for-line stdin` works the same way (tail stdin line by line). If the +client disconnects mid-stream the connection thread drops the iterator +(closing any upstream connection / open file) and exits cleanly with no +panic. A zero-arg `body` function is called first and may itself return +any of the three shapes. + +> **Note (`get-stream` granularity).** `get-stream`'s underlying reader +> fills a 16 KiB backing buffer before yielding a line, so when proxying +> an upstream whose total body is smaller than that, lines can arrive in +> one batch rather than one at a time. The httpd plumbing itself streams +> per line; the batching is a `get-stream` buffer-size artifact tracked +> as a follow-up. A `tail-file` source (a lazy `tail -f` over a growing +> file, which crew's `/events/stream` needs) is the other follow-up. + +Reference: `examples/httpd-stream.ilo`. + ## Buffered HTTP (unchanged) For request/response patterns where the entire body is small and easy to @@ -83,3 +123,4 @@ For stdin streaming, see `for-line stdin` (ILO-70). 
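The per-line chunking described under "Response body shapes" can be sketched as follows. This is a minimal illustration of HTTP/1.1 chunk framing with a flush after every line, not the code in `src/main.rs`: the helper name `write_chunked_lines`, its signature, and the newline appended to each line are all assumptions.

```rust
use std::io::{self, Write};

/// Write every line yielded by `lines` as its own HTTP/1.1 chunk, flushing
/// after each one so the client sees it as soon as it is produced.
/// Hypothetical helper: name, signature, and the appended "\n" are
/// assumptions for illustration.
fn write_chunked_lines<W, I>(out: &mut W, lines: I) -> io::Result<()>
where
    W: Write,
    I: Iterator<Item = io::Result<String>>,
{
    for line in lines {
        // A read error from the source aborts the stream.
        let mut payload = line?;
        payload.push('\n');
        // Chunk framing: hex byte length, CRLF, payload, CRLF.
        write!(out, "{:x}\r\n", payload.len())?;
        out.write_all(payload.as_bytes())?;
        out.write_all(b"\r\n")?;
        // A failed write or flush (for example, the client went away)
        // returns early, which drops the iterator.
        out.flush()?;
    }
    // A zero-length chunk terminates the body.
    out.write_all(b"0\r\n\r\n")?;
    out.flush()
}
```

Because the iterator is pulled inside the loop, nothing is buffered beyond the current line, and an early return on a write error is what lets the connection thread exit without draining the rest of the source.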
* ILO-379 — chunked transfer-encoding for `ilo httpd` * ILO-448 — client-side HTTP streaming builtins (`get-stream`, `pst-stream`, …) * ILO-481 — `ilo httpd` resolves `use` imports in handler files +* ILO-482 — `ilo httpd` lazy streaming response body (handler-driven SSE) diff --git a/examples/httpd-stream.ilo b/examples/httpd-stream.ilo new file mode 100644 index 00000000..107e01ae --- /dev/null +++ b/examples/httpd-stream.ilo @@ -0,0 +1,25 @@ +-- ilo httpd lazy streaming response body (ILO-482). +-- +-- A handler whose `body` field is a lazy line iterator streams each line to +-- the client as it is produced, instead of buffering the whole body first. +-- This lets a handler hold the connection open for SSE / long-poll / a tail +-- of a growing source. The eager `L t` body (a list of chunks) still works +-- unchanged; the lazy body is an additive response shape. +-- +-- Any value that drains lazily can be the body: +-- * `get-stream url` proxy an upstream chunked / SSE response (shown here) +-- * `for-line stdin` tail stdin line by line +-- +-- Run it: +-- ilo httpd examples/httpd-stream.ilo --port 7777 +-- curl -N http://localhost:7777/ # -N: no client-side buffering +-- +-- Each upstream line is re-emitted as its own chunk and flushed immediately, +-- so `curl -N` prints lines as the upstream produces them rather than all at +-- once at the end. There is no `-- run:` assertion because it needs a live +-- server; see tests/httpd_streaming.rs for runnable coverage. + +type rsp{status:n;body:_} + +handler req:_>rsp + rsp status:200 body:(get-stream "http://localhost:7778/events/stream") diff --git a/src/interpreter/mod.rs b/src/interpreter/mod.rs index f3708e79..3cf14f32 100644 --- a/src/interpreter/mod.rs +++ b/src/interpreter/mod.rs @@ -372,6 +372,27 @@ impl HttpLinesHandle { } } +/// A backend-agnostic lazy line source, pulled one line at a time. 
+/// +/// Wraps the two existing pull-based iterators ([`StdinLinesHandle`], +/// [`HttpLinesHandle`]) behind a single `next_line()` so callers outside the +/// interpreter (e.g. `ilo httpd`'s streaming response body, ILO-482) can drain +/// a handler-returned lazy body without caring which source produced it. +pub enum LazyLines { + Stdin(StdinLinesHandle), + Http(HttpLinesHandle), +} + +impl LazyLines { + /// Pull the next line from the underlying iterator, or `None` at end. + pub fn next_line(&self) -> Option> { + match self { + LazyLines::Stdin(h) => h.next_line(), + LazyLines::Http(h) => h.next_line(), + } + } +} + #[derive(Debug, Clone, PartialEq)] pub enum Value { Number(f64), diff --git a/src/main.rs b/src/main.rs index 387c3d38..9d0921d9 100644 --- a/src/main.rs +++ b/src/main.rs @@ -1179,10 +1179,18 @@ fn handle_http_connection( other => other, }; - // Body shape: either a plain string or a list of chunks for chunked transfer. + // Body shape: either a plain string, an eagerly-collected list of chunks, + // or a lazy line iterator whose chunks are pulled and written one at a + // time (true streaming / SSE, ILO-482). enum BodyShape { Plain(String), Chunked(Vec), + /// A pull-based line iterator. Each `next_line()` is written as its + /// own chunk and flushed immediately, so a handler can hold the + /// connection open and emit chunks as they are produced (e.g. tailing + /// a growing file, proxying an upstream SSE source) without buffering + /// the whole body first. 
+ Lazy(interpreter::LazyLines), } let (status, resp_headers, body_shape) = match &resp { @@ -1197,6 +1205,16 @@ fn handle_http_connection( // FnRef/Closure → call it (no args) expecting a List, then chunk let body_shape = match fields.get("body") { Some(Value::Text(s)) => BodyShape::Plain((**s).clone()), + // Lazy line iterators (ILO-482): a handler returning + // `get-stream`/`for-line stdin` (or, once it lands, a + // file-tail iterator) gets each line written as its own chunk + // as the iterator yields, with no full-body buffering. + Some(Value::LazyHttpLines(h)) => { + BodyShape::Lazy(interpreter::LazyLines::Http(h.clone())) + } + Some(Value::LazyStdinLines(h)) => { + BodyShape::Lazy(interpreter::LazyLines::Stdin(h.clone())) + } Some(Value::List(items)) => { let chunks = items.iter().map(|v| v.to_string()).collect(); BodyShape::Chunked(chunks) @@ -1207,6 +1225,12 @@ fn handle_http_connection( let chunks = items.iter().map(|v| v.to_string()).collect(); BodyShape::Chunked(chunks) } + Ok(Value::LazyHttpLines(h)) => { + BodyShape::Lazy(interpreter::LazyLines::Http(h)) + } + Ok(Value::LazyStdinLines(h)) => { + BodyShape::Lazy(interpreter::LazyLines::Stdin(h)) + } Ok(other) => BodyShape::Plain(other.to_string()), Err(e) => BodyShape::Plain(format!("chunk-fn error: {}", e)), } @@ -1217,6 +1241,12 @@ fn handle_http_connection( let chunks = items.iter().map(|v| v.to_string()).collect(); BodyShape::Chunked(chunks) } + Ok(Value::LazyHttpLines(h)) => { + BodyShape::Lazy(interpreter::LazyLines::Http(h)) + } + Ok(Value::LazyStdinLines(h)) => { + BodyShape::Lazy(interpreter::LazyLines::Stdin(h)) + } Ok(other) => BodyShape::Plain(other.to_string()), Err(e) => BodyShape::Plain(format!("chunk-fn error: {}", e)), } @@ -1303,6 +1333,66 @@ fn handle_http_connection( // Terminating chunk writer.write_all(b"0\r\n\r\n")?; } + BodyShape::Lazy(lines) => { + // Lazy streaming body (ILO-482): write the chunked header block, + // then pull each line from the iterator and flush it as its 
own + // chunk so the client sees data as soon as the handler produces + // it. The body is never fully buffered, so the connection can be + // held open indefinitely (SSE, long-poll, file tail). + let mut header_block = format!("HTTP/1.1 {} {}\r\n", status, status_text); + if !has_content_type { + header_block.push_str("Content-Type: text/plain; charset=utf-8\r\n"); + } + for (k, v) in &resp_headers { + header_block.push_str(&format!("{}: {}\r\n", k, v)); + } + header_block.push_str("Transfer-Encoding: chunked\r\n"); + header_block.push_str("Connection: close\r\n"); + header_block.push_str("\r\n"); + writer.write_all(header_block.as_bytes())?; + writer.flush()?; + + loop { + match lines.next_line() { + Some(Ok(line)) => { + // Re-attach the newline the line iterator strips, so a + // client doing line-oriented reads (SSE) sees a record + // boundary per chunk. + let mut data = line.into_bytes(); + data.push(b'\n'); + // A failed write means the client hung up mid-stream. + // Drop the iterator (closing any upstream connection / + // file) and exit the thread cleanly rather than + // panicking. + if writer + .write_all(format!("{:x}\r\n", data.len()).as_bytes()) + .and_then(|_| writer.write_all(&data)) + .and_then(|_| writer.write_all(b"\r\n")) + .and_then(|_| writer.flush()) + .is_err() + { + eprintln!( + "{} {} {} -> {} (client disconnected)", + peer, method, path, status + ); + return Ok(()); + } + } + Some(Err(e)) => { + // Mid-stream read error from the source. Best effort: + // close the chunked stream and stop. + eprintln!("stream read error: {}", e); + let _ = writer.write_all(b"0\r\n\r\n"); + let _ = writer.flush(); + return Ok(()); + } + None => break, + } + } + // Terminating chunk. 
+ let _ = writer.write_all(b"0\r\n\r\n"); + let _ = writer.flush(); + } } eprintln!("{} {} {} -> {}", peer, method, path, status); diff --git a/tests/httpd_imports.rs b/tests/httpd_imports.rs index a16a4a22..626921c2 100644 --- a/tests/httpd_imports.rs +++ b/tests/httpd_imports.rs @@ -8,10 +8,9 @@ //! reachable over HTTP, that a missing import surfaces a diagnostic, and that //! plain single-file handlers still work. -use std::io::{Read, Write}; +use std::io::{BufRead, BufReader, Read, Write}; use std::net::TcpStream; use std::process::{Child, Command}; -use std::time::{Duration, Instant}; fn ilo() -> Command { Command::new(env!("CARGO_BIN_EXE_ilo")) @@ -23,10 +22,17 @@ fn free_port() -> u16 { listener.local_addr().expect("local_addr").port() } -/// Spawn `ilo httpd --port ` and wait until the port accepts -/// connections (or time out). Returns the child so the caller can kill it. +/// Spawn `ilo httpd --port ` and wait until it logs that it is +/// listening (or time out). Returns the child so the caller can kill it. +/// +/// Readiness is detected by reading the child's stderr for the +/// `ilo httpd listening on` line, NOT by probing the port with a TCP connect. +/// A raw connect probe is itself an accepted connection that httpd dispatches +/// to a handler thread; under load that spurious startup request races the +/// real test request (ILO-505). Waiting on the log line avoids running the +/// handler during startup at all. fn spawn_httpd(handler: &std::path::Path, port: u16) -> Child { - let child = ilo() + let mut child = ilo() .args([ "httpd", "--port", @@ -38,13 +44,18 @@ fn spawn_httpd(handler: &std::path::Path, port: u16) -> Child { .spawn() .expect("spawn ilo httpd"); - // Poll the port until it's listening. 
- let deadline = Instant::now() + Duration::from_secs(10); - while Instant::now() < deadline { - if TcpStream::connect(("127.0.0.1", port)).is_ok() { - return child; + // Read stderr until the server logs that it is listening, rather than + // probing the port (which would trigger a spurious startup handler call). + let stderr = child.stderr.take().expect("piped stderr"); + let mut reader = BufReader::new(stderr); + for _ in 0..100 { + let mut line = String::new(); + if reader.read_line(&mut line).unwrap_or(0) == 0 { + break; // EOF: server exited before logging readiness + } + if line.contains("listening on") { + break; } - std::thread::sleep(Duration::from_millis(50)); } child } diff --git a/tests/httpd_streaming.rs b/tests/httpd_streaming.rs new file mode 100644 index 00000000..3b9a042e --- /dev/null +++ b/tests/httpd_streaming.rs @@ -0,0 +1,326 @@ +//! Integration tests for `ilo httpd`'s lazy streaming response body (ILO-482). +//! +//! Before ILO-482 `handle_http_connection` materialised the entire response +//! body before writing the first byte, so a handler could not hold a +//! connection open and emit chunks as they were produced (true SSE / +//! long-poll). These tests prove a handler returning a lazy line iterator +//! (`get-stream`, here proxying a test-controlled slow upstream) streams each +//! chunk incrementally, that the eager `L t` body still works unchanged, and +//! that a client disconnecting mid-stream does not panic the server. + +use std::io::{BufRead, BufReader, Read, Write}; +use std::net::{TcpListener, TcpStream}; +use std::process::{Child, Command}; +use std::sync::mpsc; +use std::time::{Duration, Instant}; + +fn ilo() -> Command { + Command::new(env!("CARGO_BIN_EXE_ilo")) +} + +/// Pick a free TCP port by binding to :0 and reading back the assigned port. 
+fn free_port() -> u16 { + let listener = TcpListener::bind("127.0.0.1:0").expect("bind ephemeral port"); + listener.local_addr().expect("local_addr").port() +} + +/// Spawn `ilo httpd --port ` and wait until it logs that it is +/// listening (or time out). Returns the child so the caller can kill+wait it. +/// +/// Readiness is detected by reading the child's stderr for the +/// `ilo httpd listening on` line, NOT by probing the port with a TCP connect. +/// A raw connect probe is itself an accepted connection: httpd spawns a handler +/// thread for it, and for a handler that proxies an upstream via `get-stream` +/// that thread consumes the upstream's single `accept()` before the real test +/// request ever arrives, so the test sees an empty body. Waiting on the log +/// line avoids triggering the handler during startup. +fn spawn_httpd(handler: &std::path::Path, port: u16) -> Child { + let mut child = ilo() + .args([ + "httpd", + "--port", + &port.to_string(), + handler.to_str().unwrap(), + ]) + .stderr(std::process::Stdio::piped()) + .stdout(std::process::Stdio::piped()) + .spawn() + .expect("spawn ilo httpd"); + + // Read stderr until the server logs that it is listening. The pipe read + // blocks, so a server that never starts is bounded by the test runner's + // own timeout; we cap the line count as a belt-and-braces fallback. + let stderr = child.stderr.take().expect("piped stderr"); + let mut reader = BufReader::new(stderr); + for _ in 0..100 { + let mut line = String::new(); + if reader.read_line(&mut line).unwrap_or(0) == 0 { + break; // EOF: server exited before logging readiness + } + if line.contains("listening on") { + break; + } + } + child +} + +/// A tiny test-controlled upstream that responds to one GET with a chunked +/// body, emitting `n` lines spaced `gap` apart. Each emitted line is also sent +/// on `tx` so the test can observe production timing. Runs on its own thread; +/// returns the bound port. 
+///
+/// Each event line is padded past 16 KiB on purpose. The handler proxies this
+/// upstream with `get-stream`, whose underlying `minreq::ResponseLazy::read`
+/// fills the wrapping `BufReader`'s 16 KiB backing buffer before returning a
+/// line. With sub-buffer payloads that read drains the whole (small) response
+/// in one go, masking the lazy path; padding past the buffer forces one line
+/// per `read`, so the proxy genuinely yields each event as it arrives and the
+/// test measures the streaming plumbing rather than minreq's buffer size.
+/// (The buffer-granularity quirk in `get-stream` is tracked as a follow-up;
+/// `tail-file` is the source crew's `/events/stream` ultimately needs.)
+const PAD: usize = 20_000;
+
+fn spawn_slow_upstream(n: usize, gap: Duration, tx: mpsc::Sender<usize>) -> u16 {
+    let listener = TcpListener::bind("127.0.0.1:0").expect("bind upstream");
+    let port = listener.local_addr().unwrap().port();
+    std::thread::spawn(move || {
+        // Serve exactly one connection then return.
+        if let Ok((mut sock, _)) = listener.accept() {
+            // Drain the request headers.
+ let mut reader = BufReader::new(sock.try_clone().unwrap()); + loop { + let mut line = String::new(); + if reader.read_line(&mut line).unwrap_or(0) == 0 { + break; + } + if line == "\r\n" || line == "\n" { + break; + } + } + let header = "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nTransfer-Encoding: chunked\r\nConnection: close\r\n\r\n"; + if sock.write_all(header.as_bytes()).is_err() { + return; + } + let _ = sock.flush(); + for i in 0..n { + let data = format!("event-{i}-{}\n", "x".repeat(PAD)); + let chunk = format!("{:x}\r\n{}\r\n", data.len(), data); + if sock.write_all(chunk.as_bytes()).is_err() { + return; + } + let _ = sock.flush(); + let _ = tx.send(i); + std::thread::sleep(gap); + } + let _ = sock.write_all(b"0\r\n\r\n"); + let _ = sock.flush(); + } + }); + port +} + +/// A handler returning a lazy body (`get-stream` over a slow upstream) streams +/// each chunk to the client as the upstream produces it — the client reads an +/// early chunk before the upstream has emitted the final one, proving no full +/// buffering. +#[test] +fn lazy_body_streams_incrementally() { + let dir = tempfile::tempdir().expect("tempdir"); + + // Slow upstream: 4 events, 200ms apart. + let (tx, upstream_rx) = mpsc::channel(); + let upstream_port = spawn_slow_upstream(4, Duration::from_millis(200), tx); + + // Handler proxies the upstream stream as its lazy response body. + let handler_src = format!( + "type rsp{{status:n;body:_}}\nhandler req:_>rsp\n rsp status:200 body:(get-stream \"http://127.0.0.1:{upstream_port}/\")\n" + ); + std::fs::write(dir.path().join("handler.ilo"), handler_src).expect("write handler"); + + let port = free_port(); + let mut child = spawn_httpd(&dir.path().join("handler.ilo"), port); + + // Open the connection and read incrementally. 
+ let mut stream = TcpStream::connect(("127.0.0.1", port)).expect("connect to httpd"); + stream + .set_read_timeout(Some(Duration::from_secs(10))) + .unwrap(); + stream + .write_all(b"GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n") + .expect("write request"); + stream.flush().ok(); + + let mut reader = BufReader::new(stream); + + // Read the response status + headers. + let mut saw_chunked = false; + loop { + let mut line = String::new(); + reader.read_line(&mut line).expect("read header line"); + if line.to_lowercase().contains("transfer-encoding: chunked") { + saw_chunked = true; + } + if line == "\r\n" { + break; + } + } + assert!( + saw_chunked, + "expected chunked transfer-encoding for lazy body" + ); + + // Read the first event chunk. We must receive it well before the upstream + // has emitted all 4 events (which takes ~800ms total). + let first_seen = Instant::now(); + let mut got_first = String::new(); + // chunk size line + let mut size_line = String::new(); + reader.read_line(&mut size_line).expect("read chunk size"); + let want = usize::from_str_radix(size_line.trim(), 16).expect("hex chunk size"); + let mut buf = vec![0u8; want]; + reader.read_exact(&mut buf).expect("read chunk body"); + got_first.push_str(&String::from_utf8_lossy(&buf)); + let mut crlf = [0u8; 2]; + reader.read_exact(&mut crlf).ok(); + let first_elapsed = first_seen.elapsed(); + + assert!( + got_first.contains("event-0"), + "expected first chunk to be event-0, got: {got_first:?}" + ); + // The upstream has produced at most 1-2 events by now, definitely not all 4. + assert!( + first_elapsed < Duration::from_millis(700), + "first chunk arrived too late ({first_elapsed:?}); body was likely buffered" + ); + + // Now drain the remaining chunks and collect every event prefix. 
+    let event_prefix = |s: &str| s.trim().split('-').take(2).collect::<Vec<_>>().join("-");
+    let mut events = vec![event_prefix(&got_first)];
+    loop {
+        let mut size_line = String::new();
+        if reader.read_line(&mut size_line).unwrap_or(0) == 0 {
+            break;
+        }
+        let trimmed = size_line.trim();
+        if trimmed.is_empty() {
+            continue;
+        }
+        let want = match usize::from_str_radix(trimmed, 16) {
+            Ok(n) => n,
+            Err(_) => break,
+        };
+        if want == 0 {
+            break; // terminating chunk
+        }
+        let mut buf = vec![0u8; want];
+        if reader.read_exact(&mut buf).is_err() {
+            break;
+        }
+        events.push(event_prefix(&String::from_utf8_lossy(&buf)));
+        let mut crlf = [0u8; 2];
+        reader.read_exact(&mut crlf).ok();
+    }
+
+    child.kill().ok();
+    child.wait().ok();
+
+    // Drain the upstream observer channel so the thread can finish.
+    while upstream_rx.try_recv().is_ok() {}
+
+    assert_eq!(
+        events,
+        vec!["event-0", "event-1", "event-2", "event-3"],
+        "expected all four events in order, got: {events:?}"
+    );
+}
+
+/// The eager `L t` body (each list element a chunk) still works unchanged.
+#[test] +fn eager_list_body_still_works() { + let dir = tempfile::tempdir().expect("tempdir"); + + std::fs::write( + dir.path().join("handler.ilo"), + "type rsp{status:n;body:_}\nhandler req:_>rsp\n rsp status:200 body:[\"a\" \"b\" \"c\"]\n", + ) + .expect("write handler"); + + let port = free_port(); + let mut child = spawn_httpd(&dir.path().join("handler.ilo"), port); + + let mut stream = TcpStream::connect(("127.0.0.1", port)).expect("connect"); + stream + .write_all(b"GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n") + .expect("write request"); + stream.flush().ok(); + let mut resp = String::new(); + stream.read_to_string(&mut resp).expect("read response"); + + child.kill().ok(); + child.wait().ok(); + + assert!( + resp.to_lowercase().contains("transfer-encoding: chunked"), + "expected chunked encoding for list body, got: {resp:?}" + ); + for want in ["a", "b", "c"] { + assert!(resp.contains(want), "expected chunk {want:?} in {resp:?}"); + } +} + +/// A client disconnecting mid-stream must not panic the server: the connection +/// thread exits cleanly and the server keeps serving subsequent requests. +#[test] +fn client_disconnect_midstream_is_clean() { + let dir = tempfile::tempdir().expect("tempdir"); + + // Slow upstream the handler proxies. Generous count so the stream is still + // open when we hang up. + let (tx, upstream_rx) = mpsc::channel(); + let upstream_port = spawn_slow_upstream(50, Duration::from_millis(50), tx); + + let handler_src = format!( + "type rsp{{status:n;body:_}}\nhandler req:_>rsp\n rsp status:200 body:(get-stream \"http://127.0.0.1:{upstream_port}/\")\n" + ); + std::fs::write(dir.path().join("handler.ilo"), handler_src).expect("write handler"); + + let port = free_port(); + let mut child = spawn_httpd(&dir.path().join("handler.ilo"), port); + + // Connect, read just the first chunk, then drop the socket mid-stream. 
+ { + let mut stream = TcpStream::connect(("127.0.0.1", port)).expect("connect"); + stream + .set_read_timeout(Some(Duration::from_secs(5))) + .unwrap(); + stream + .write_all(b"GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n") + .expect("write request"); + stream.flush().ok(); + let mut buf = [0u8; 64]; + // Read something (headers + first chunk) then drop the stream below. + let _ = stream.read(&mut buf); + // stream dropped here -> client disconnect mid-stream + } + + // Give the server a moment to notice the broken pipe on its next write. + std::thread::sleep(Duration::from_millis(200)); + + // The server must still be alive and able to serve. A handler-error or + // panic would have torn down the process or the accept loop. + let still_up = TcpStream::connect(("127.0.0.1", port)).is_ok(); + + child.kill().ok(); + child.wait().ok(); + while upstream_rx.try_recv().is_ok() {} + + assert!( + still_up, + "server should keep accepting connections after a client disconnect" + ); + + // The process should not have crashed; killing a healthy child yields a + // signal-terminated status (not a panic exit we triggered ourselves). + // Nothing more to assert — reaching here without a hang is the signal. +}
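
Reviewer note: the wire framing that the `BodyShape::Lazy` arm writes and that the tests parse by hand can be sketched in isolation. The block below is a minimal, standalone illustration and is not code from this change; `encode_chunk` and `decode_chunks` are hypothetical helper names. It assumes the framing matches what the diff shows: the stripped newline is re-attached, each line becomes one chunk, and a zero-length chunk terminates the body.

```rust
use std::io::{BufRead, BufReader, Read};

/// Frame one line as an HTTP/1.1 chunk: hex length, CRLF, payload, CRLF.
/// The trailing '\n' is re-attached, mirroring the Lazy arm in the diff.
/// (Hypothetical helper, for illustration only.)
fn encode_chunk(line: &str) -> Vec<u8> {
    let mut data = line.as_bytes().to_vec();
    data.push(b'\n');
    let mut out = format!("{:x}\r\n", data.len()).into_bytes();
    out.extend_from_slice(&data);
    out.extend_from_slice(b"\r\n");
    out
}

/// Decode a chunked body into its payloads, stopping at the zero-length
/// terminating chunk (or at the first malformed size line).
/// (Hypothetical helper, modelled on the parsing loop in the tests.)
fn decode_chunks(body: &[u8]) -> Vec<String> {
    let mut reader = BufReader::new(body);
    let mut out = Vec::new();
    loop {
        let mut size_line = String::new();
        if reader.read_line(&mut size_line).unwrap_or(0) == 0 {
            break; // EOF before the terminating chunk
        }
        let size = match usize::from_str_radix(size_line.trim(), 16) {
            Ok(n) => n,
            Err(_) => break,
        };
        if size == 0 {
            break; // terminating chunk
        }
        let mut buf = vec![0u8; size];
        if reader.read_exact(&mut buf).is_err() {
            break;
        }
        out.push(String::from_utf8_lossy(&buf).into_owned());
        // Consume the CRLF that follows every chunk payload.
        let mut crlf = [0u8; 2];
        let _ = reader.read_exact(&mut crlf);
    }
    out
}

fn main() {
    let mut wire = Vec::new();
    for line in ["event-0", "event-1"] {
        wire.extend(encode_chunk(line));
    }
    wire.extend_from_slice(b"0\r\n\r\n");

    let decoded = decode_chunks(&wire);
    assert_eq!(decoded, vec!["event-0\n", "event-1\n"]);
    println!("{decoded:?}");
}
```

Because each line is framed and flushed independently, a client can act on a chunk as soon as its payload and trailing CRLF arrive, which is the property `lazy_body_streams_incrementally` checks with its timing assertion.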