Widen nested_type Callable branch on introspection failure (#673)#676
Merged
Conversation
43cdd8b to
7b3f2f3
Compare
Closes #673. The Callable branch of `nested_type` could raise on three real values where the documented `Box(type(value))` fallback was expected: - `pytest.mark.parametrize` (a `MarkDecorator`): `typing.get_overloads` raises `AttributeError` because `MarkDecorator` lacks `__qualname__`. - `dict.get` (a `method_descriptor`): same shape — missing `__module__`. - An `Operation` wrapping a callable with a stringified annotation that does not resolve in scope: `inspect.signature` consults `Operation.__signature__`, which calls `typing.get_type_hints`, which raises `NameError`. Per the canonicalize-widening principle (eb8680 on PR #613): when introspection cannot construct a more precise `TypeExpression`, the right move is to widen rather than propagate the failure. - Wrap `typing.get_overloads(value)` and `inspect.signature(value)` with broader `except` clauses (adding `AttributeError`, `TypeError`, `NameError`) that fall back to `Box(type(value))`. - The `Operation` branch propagates the widening when its Callable delegate returns a bare type rather than a parameterised one. Adds three regression tests pinning each failure mode.
7b3f2f3 to
2b8c274
Compare
eb8680
approved these changes
Jun 8, 2026
datvo06
added a commit
that referenced
this pull request
Jun 8, 2026
After merging master (which now contains the #675 collect_imports fix and the #676 nested_type widening), several touch-ups: - `_define_lexical_reader` is now a `_LexicalVariableTool.define` classmethod on a Tool subclass. The reader closes over the value snapshot rather than `env[name]` because tools are reconstructed fresh each `call_assistant` invocation. - Add a runtime assertion in `_LexicalVariableTool.define` that Tool/Template values must not be re-wrapped as lexical readers. - Narrow `_collect_tools`'s catch tuple to the three Pydantic schema errors plus `TypeError`. `TypeError` stays because the `Encodable[T]` registry raises it from `_pydantic_type_operation`, `_pydantic_type_term`, and `_pydantic_callable`'s incomplete- signature path. `AttributeError`/`NameError` are gone now that #673 widens at the source. - `test_template_synthesis_uses_lexical_reader` (Test A) unskipped; fixtures recorded against gpt-4o-mini and committed. - `test_lexical_reader_skips_marker_objects` removed: with #673, `pytest.mark.parametrize` no longer raises in `nested_type` and the `MarkDecorator` class itself is now schema-encodable through Pydantic's dataclass detection. The test pinned the old catch path; the new contract is "the value flows through". - `test_collect_tools_real_tools_take_precedence_over_value_readers` removed: the invariant (Tools never get wrapped as lexical readers) is now enforced by the runtime assertion inside `_LexicalVariableTool.define`. - `test_collect_tools_skips_agent_instances_but_exposes_their_tools` asserts on values (`inst.t in result.values()`) instead of the internal `agent__method_name` naming convention. - Live-read semantics tests (`returns_live_value`, `rebind`, `raises_when_name_deleted`) replaced with snapshot-semantics counterparts (`returns_captured_value`, `snapshot_survives_rebind`, `snapshot_survives_deletion`).
eb8680
pushed a commit
that referenced
this pull request
Jun 9, 2026
* Add synthetic readers for lexical context (closes #497) Templates collect synthetic read-only Tools for non-Tool symbols in their lexical scope, alongside the existing real-Tool collection. The LLM can call these readers to inspect lexical state on demand instead of having the entire scope dumped into the system prompt. Two reader flavors via singledispatch on the value's type: - Definition-readers for classes and functions return text via pydoc.render_doc (level="short", default — byte-equivalent to help(obj)) or inspect.getsource (level="full"). They bypass Encodable and just return str. - Value-readers for everything else return the live value, encoded through the existing Encodable pipeline. Probe is TypeAdapter(Encodable[T]).json_schema(); on any failure (Pydantic schema error, unencodable types like Term/Operation/TypeVar) the symbol is silently skipped. _collect_synthetic_readers is wired in two places: call_assistant (sees template.__context__ + bound args, mirroring Python call semantics) and Template.tools (sees template.__context__ only). Real Tools collected by _collect_tools take precedence — synthetic readers fill the gap. A short static preface sentence is appended to Template.__system_prompt__ so the LLM knows the read-only-readers category exists. The structured tools array carries per-tool semantics; the preface does not enumerate them. Two existing assertions in test_handlers_llm_template.py flip from 'local_variable not in a.f.tools' to 'in', reflecting the new behavior. 19 new unit tests cover the singledispatch matrix, the probe contract, live-read semantics, the BaseModel-via-metaclass dispatch case, the Box-via-TypeError-chain skip path, and the system-prompt preface. One recorded-fixture integration test exercises the end-to-end LLM-reads-lexical-value path. hide=/expose= knob deferred to a follow-up. * Fix mypy types in synthetic-reader probe * Constrain test_tool_calling prompt against reader exploration Adds an explicit instruction to generate_good_poem to ignore any read-only lexical reader tools that may appear in the tool list. With synthetic readers now exposing module-level imports/classes as inspectable tools, the LLM was exploring those instead of finishing the task, exceeding max_calls=4. * Replace synthetic-reader singledispatch with _LexicalVariableTool Address review feedback on PR #670: - Add `_LexicalVariableTool[T](Tool[[], T])` in completions.py with a classmethod `define(env, *, name)` that probes `TypeAdapter(Encodable[typ]).json_schema()` and lets schema failures propagate to the call site. - Inline reader generation into `_collect_tools`; remove `_collect_synthetic_readers`, `_build_synthetic_reader`, `_build_definition_reader`, all `@functools.singledispatch.register` handlers for `type`/`FunctionType`/`MethodType`/`BuiltinFunctionType`/ `ModuleType`/`Tool`/`Agent`, and the `Literal["short", "full"]` toggle. - Register passthrough handlers in encoding.py for `types.ModuleType`, `types.FunctionType`, `types.BuiltinFunctionType`, `types.MethodType`, `type`, and `Agent`. They preempt the broad `_pydantic_callable` fallback so Pydantic's natural schema-generation error fires for these types during the probe; the call site catches and skips. - Delete `_LEXICAL_READERS_PREFACE` and the system-prompt injection path; per-instance `tool_fn.__doc__` carries the framing the LLM sees, scoped to one specific lexical variable per reader. - Widen the probe failure catch tuple to cover the empirically observed failure modes (`PydanticInvalidForJsonSchema`, `PydanticUserError`, `TypeError`, `AttributeError`, `NameError`). - Filter `_collect_tools` to identifier-only, non-dunder names to skip the `@py_builtins`/`@py_assert*` names that pytest's assertion rewriting injects into module globals. - Move `_known_data` from module scope into the test function above `report_sum` in `test_llm_reads_lexical_value`. - Add Test A `test_template_synthesis_uses_lexical_reader` (skipped pending fixture recording with a real API key) and Test B `test_template_skips_lexical_classes` (no LLM, locks the skip-via- catch contract using the #497 `Hand`/`Finger` example). - Rewrite `test_handlers_llm_template.py` PR545 section against the new entry points. Add positive-skip coverage for modules, user classes, unannotated functions/methods, builtins, and Agents; pin that pytest's `MarkDecorator` and `__builtins__` are skipped without aborting collection; pin that annotated callables ARE exposed (the "annotated callable" caveat). * Skip class values at _LexicalVariableTool.define `nested_type(SomeClass)` routes through its Callable branch and extracts `__init__`'s signature, returning `Callable[Args, Return]`. For dataclass-like classes with annotated constructors that schema generates fine via `_pydantic_callable`, bypassing the `type` passthrough in `Encodable[T]`. Add an `isinstance(value, type)` pre-check at the top of `_LexicalVariableTool.define` so the skip-via-catch direction stays consistent for both bare classes (caught via `Encodable[type]`) and dataclass-like classes (caught here). * Collapse skip mechanism into one _is_synthetic_reader_eligible predicate Replaces two parallel skip mechanisms (encoding.py passthrough handlers that provoke a Pydantic schema error, plus an isinstance raise in _LexicalVariableTool.define that forges PydanticSchemaGenerationError from outside Pydantic) with a single predicate in completions.py. - Add _is_synthetic_reader_eligible(value) -> bool that rejects values whose type is in _NON_READER_TYPES (type, Module, Function, Method, BuiltinFunction, Agent, Tool) and probes Encodable[T] schema generation for the rest. - Use the predicate as the third branch's gate in _collect_tools; no try/except around tool construction. - Strip the isinstance(value, type) raise and the in-define probe from _LexicalVariableTool.define; the class is now purely constructive and assumes its caller has already gated on the predicate. - Drop the six TypeToPydanticType registrations and the _pydantic_type_passthrough function from encoding.py; drop the now- unused Agent import there. - Re-add `import types` to completions.py for the predicate's isinstance tuple. Behavior change: annotated callables in lexical scope are now skipped along with unannotated ones, matching how class objects (which also resolve to Callable[Args, Return] via nested_type) are treated. The test that previously pinned "annotated callables ARE exposed" is inverted to pin "they are skipped". * Pure-Encodable reader collection: drop isinstance pre-filter Replace the class+predicate pair (_LexicalVariableTool + _is_synthetic_reader_eligible) with a single free function _define_lexical_reader that probes Encodable[T].json_schema() and lets failures propagate. _collect_tools catches the probe failures inline (six-exception tuple no longer named — the catch is the only consumer). Behavior consequence: classes (plain and dataclass-shaped), unannotated functions, lambdas, bound methods, and builtins are now exposed as readers because nested_type resolves them to a Callable shape that _pydantic_callable schematises. Modules and Agent-subclass instances still naturally fail the probe. The skip-set is whatever Encodable rejects — no per-type code path remains. Tests: - Replace _LexicalVariableTool.define call sites with _define_lexical_reader. - Replace _is_synthetic_reader_eligible checks with collect-and-check or pytest.raises. - Flip Test B (provider) from "classes skipped" to "classes exposed". - Flip the skips_user_classes / skips_*_functions / skips_*_methods / skips_builtin_functions tests into exposure tests parametrised over the Callable-shaped categories. - Keep skips for modules and Agent instances (still genuinely fail Encodable). - Drop the substring assertion on "lexical variable" from the doc test; assert the variable name appears instead. * Black-format the test parametrize call * Drop dunder filter from synthetic-reader gate; link Test A to #674 Two minor adjustments: 1. `_collect_tools` only filters by `name.isidentifier()` now; the `not name.startswith("__")` clause was hygienic, not load-bearing. Module dunders that happen to encode (e.g. `__name__: str`) flow through as readers; module dunders that don't (`__builtins__`, `__class__` of an Agent) still naturally fail the Encodable probe. 2. `test_template_synthesis_uses_lexical_reader` (Test A) is re-skipped pending #674. Initial recording attempt revealed a pre-existing synthesizer bug: `collect_imports` drops `_`-prefixed module imports even when referenced by the emitted variable stubs, so the pytest `request` fixture's `_pytest.fixtures.TopRequest` type crashes mypy_type_check. Skip reason references the issue. * Post-#675/#676 fixup: subclass + snapshot semantics, record Test A After merging master (which now contains the #675 collect_imports fix and the #676 nested_type widening), several touch-ups: - `_define_lexical_reader` is now a `_LexicalVariableTool.define` classmethod on a Tool subclass. The reader closes over the value snapshot rather than `env[name]` because tools are reconstructed fresh each `call_assistant` invocation. - Add a runtime assertion in `_LexicalVariableTool.define` that Tool/Template values must not be re-wrapped as lexical readers. - Narrow `_collect_tools`'s catch tuple to the three Pydantic schema errors plus `TypeError`. `TypeError` stays because the `Encodable[T]` registry raises it from `_pydantic_type_operation`, `_pydantic_type_term`, and `_pydantic_callable`'s incomplete- signature path. `AttributeError`/`NameError` are gone now that #673 widens at the source. - `test_template_synthesis_uses_lexical_reader` (Test A) unskipped; fixtures recorded against gpt-4o-mini and committed. - `test_lexical_reader_skips_marker_objects` removed: with #673, `pytest.mark.parametrize` no longer raises in `nested_type` and the `MarkDecorator` class itself is now schema-encodable through Pydantic's dataclass detection. The test pinned the old catch path; the new contract is "the value flows through". - `test_collect_tools_real_tools_take_precedence_over_value_readers` removed: the invariant (Tools never get wrapped as lexical readers) is now enforced by the runtime assertion inside `_LexicalVariableTool.define`. - `test_collect_tools_skips_agent_instances_but_exposes_their_tools` asserts on values (`inst.t in result.values()`) instead of the internal `agent__method_name` naming convention. - Live-read semantics tests (`returns_live_value`, `rebind`, `raises_when_name_deleted`) replaced with snapshot-semantics counterparts (`returns_captured_value`, `snapshot_survives_rebind`, `snapshot_survives_deletion`). * stronger test * Identity-check returned values in lexical-reader exposure test * Drop docstring-substring test; remove now-unused type-ignore * Gate synthetic lexical-reader generation behind LexicalReaders handler Per eb8680: needing to coach the LLM around noisy lexical-reader tools in `test_handlers_llm_tool_calling_poem.py` was a smell. Make the generation opt-in instead. - New `expose_lexical_readers()` Operation in `completions.py`, default return `False`. `_collect_tools` gates the reader branch on it. - New `LexicalReaders` ObjectInterpretation overrides the Operation to return `True`; users install it for the call sites where the LLM should see closure state. - Revert the poem prompt-hack: the docstring no longer has to ask the LLM to ignore read-only lexical readers, because they are now off by default. - Test A (`test_template_synthesis_uses_lexical_reader`) and the reader-integration test (`test_llm_reads_lexical_value`) install `LexicalReaders` in their handler stack. Both fixtures re-recorded against gpt-4o-mini. - Template tests that exercise reader generation install the handler. New `test_lexical_readers_off_by_default` and `test_lexical_readers_handler_enables_collection` pin both sides of the gate. - `test_template_method` / `test_template_method_nested_class`: drop the side-note `"local_variable" in tools` assertions; those pinned implicit reader generation. The core method-template tool collection (random, reverse, etc.) is the actual point and still passes. - `test_template_exposes_lexical_classes` now pins both off (default) and on (handler installed) — the #497 motivating example with explicit gate semantics. * remove weak test * Format * Promote collect_tools to a public Operation; LexicalReaders overrides it Replaces the narrow `expose_lexical_readers` bool gate with a general extension point: `collect_tools` is now an `Operation` whose default rule does the minimal Tool/Template/Agent collection (including the same-Tool-different-name dedup). Handlers override it to customise what gets exposed to the LLM. `LexicalReaders` becomes an `ObjectInterpretation` that `@implements(collect_tools)`: call `fwd()` to get the base set, then add a synthetic `_LexicalVariableTool` for each plain value in env whose `Encodable[T]` accepts it. Renames the internal `_collect_tools` to public `collect_tools` everywhere: `Template.tools`, `call_assistant`, and the template tests. No behaviour change beyond the design promotion. * Drop redundant Tool|Template|Agent isinstance check in LexicalReaders
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Close #673.
Fix
nested_typeCallable branch was raising rather than widening toBox(type(value))on three real values:AttributeError/TypeErrorfromtyping.get_overloads(value)— coverspytest.mark.parametrize(MarkDecoratorlacks__qualname__) anddict.get(method_descriptorlacks__module__).NameError/AttributeError/TypeErrorfrominspect.signature(value)— coversOperationwhose__signature__runstyping.get_type_hintsand hits an unresolvable forward-ref.Operationbranch when its Callable delegate returns a bare type (typing.get_argswas failing to unpack).Per the canonicalize-widening principle on #613.
Tests
Added corresponding tests.