Skip to content

Widen nested_type Callable branch on introspection failure (#673)#676

Merged
eb8680 merged 1 commit into
masterfrom
dn-fix-673-nested-type-widening
Jun 8, 2026
Merged

Widen nested_type Callable branch on introspection failure (#673)#676
eb8680 merged 1 commit into
masterfrom
dn-fix-673-nested-type-widening

Conversation

@datvo06

@datvo06 datvo06 commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Close #673.

Fix

nested_type Callable branch was raising rather than widening to Box(type(value)) on three real values:

  • Widen on AttributeError / TypeError from typing.get_overloads(value) — covers pytest.mark.parametrize (MarkDecorator lacks __qualname__) and dict.get (method_descriptor lacks __module__).
  • Widen on NameError / AttributeError / TypeError from inspect.signature(value) — covers Operation whose __signature__ runs typing.get_type_hints and hits an unresolvable forward-ref.
  • Propagate the widening through the Operation branch when its Callable delegate returns a bare type (typing.get_args was failing to unpack).

Per the canonicalize-widening principle on #613.

Tests

Added corresponding tests.

@eb8680 eb8680 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is missing the eager-annotation case from #673

Comment thread tests/test_internals_unification.py
@datvo06 datvo06 force-pushed the dn-fix-673-nested-type-widening branch from 43cdd8b to 7b3f2f3 Compare June 8, 2026 15:54
Closes #673.

The Callable branch of `nested_type` could raise on three real values
where the documented `Box(type(value))` fallback was expected:

- `pytest.mark.parametrize` (a `MarkDecorator`): `typing.get_overloads`
  raises `AttributeError` because `MarkDecorator` lacks `__qualname__`.
- `dict.get` (a `method_descriptor`): same shape — missing `__module__`.
- An `Operation` wrapping a callable with a stringified annotation that
  does not resolve in scope: `inspect.signature` consults
  `Operation.__signature__`, which calls `typing.get_type_hints`, which
  raises `NameError`.

Per the canonicalize-widening principle (eb8680 on PR #613): when
introspection cannot construct a more precise `TypeExpression`, the
right move is to widen rather than propagate the failure.

- Wrap `typing.get_overloads(value)` and `inspect.signature(value)`
  with broader `except` clauses (adding `AttributeError`, `TypeError`,
  `NameError`) that fall back to `Box(type(value))`.
- The `Operation` branch propagates the widening when its Callable
  delegate returns a bare type rather than a parameterised one.

Adds three regression tests pinning each failure mode.
@datvo06 datvo06 force-pushed the dn-fix-673-nested-type-widening branch from 7b3f2f3 to 2b8c274 Compare June 8, 2026 16:04
@eb8680 eb8680 merged commit 961af72 into master Jun 8, 2026
29 checks passed
@eb8680 eb8680 deleted the dn-fix-673-nested-type-widening branch June 8, 2026 16:30
datvo06 added a commit that referenced this pull request Jun 8, 2026
After merging master (which now contains the #675 collect_imports fix
and the #676 nested_type widening), several touch-ups:

- `_define_lexical_reader` is now a `_LexicalVariableTool.define`
  classmethod on a Tool subclass.  The reader closes over the value
  snapshot rather than `env[name]` because tools are reconstructed
  fresh each `call_assistant` invocation.
- Add a runtime assertion in `_LexicalVariableTool.define` that
  Tool/Template values must not be re-wrapped as lexical readers.
- Narrow `_collect_tools`'s catch tuple to the three Pydantic schema
  errors plus `TypeError`.  `TypeError` stays because the
  `Encodable[T]` registry raises it from `_pydantic_type_operation`,
  `_pydantic_type_term`, and `_pydantic_callable`'s incomplete-
  signature path.  `AttributeError`/`NameError` are gone now that
  #673 widens at the source.
- `test_template_synthesis_uses_lexical_reader` (Test A) unskipped;
  fixtures recorded against gpt-4o-mini and committed.
- `test_lexical_reader_skips_marker_objects` removed: with #673,
  `pytest.mark.parametrize` no longer raises in `nested_type` and the
  `MarkDecorator` class itself is now schema-encodable through
  Pydantic's dataclass detection.  The test pinned the old catch
  path; the new contract is "the value flows through".
- `test_collect_tools_real_tools_take_precedence_over_value_readers`
  removed: the invariant (Tools never get wrapped as lexical readers)
  is now enforced by the runtime assertion inside
  `_LexicalVariableTool.define`.
- `test_collect_tools_skips_agent_instances_but_exposes_their_tools`
  asserts on values (`inst.t in result.values()`) instead of the
  internal `agent__method_name` naming convention.
- Live-read semantics tests (`returns_live_value`, `rebind`,
  `raises_when_name_deleted`) replaced with snapshot-semantics
  counterparts (`returns_captured_value`, `snapshot_survives_rebind`,
  `snapshot_survives_deletion`).
eb8680 pushed a commit that referenced this pull request Jun 9, 2026
* Add synthetic readers for lexical context (closes #497)

Templates collect synthetic read-only Tools for non-Tool symbols in
their lexical scope, alongside the existing real-Tool collection. The
LLM can call these readers to inspect lexical state on demand instead
of having the entire scope dumped into the system prompt.

Two reader flavors via singledispatch on the value's type:

- Definition-readers for classes and functions return text via
  pydoc.render_doc (level="short", default — byte-equivalent to
  help(obj)) or inspect.getsource (level="full"). They bypass
  Encodable and just return str.
- Value-readers for everything else return the live value, encoded
  through the existing Encodable pipeline. Probe is
  TypeAdapter(Encodable[T]).json_schema(); on any failure (Pydantic
  schema error, unencodable types like Term/Operation/TypeVar) the
  symbol is silently skipped.

_collect_synthetic_readers is wired in two places: call_assistant
(sees template.__context__ + bound args, mirroring Python call
semantics) and Template.tools (sees template.__context__ only).
Real Tools collected by _collect_tools take precedence — synthetic
readers fill the gap.

A short static preface sentence is appended to Template.__system_prompt__
so the LLM knows the read-only-readers category exists. The structured
tools array carries per-tool semantics; the preface does not enumerate
them.

Two existing assertions in test_handlers_llm_template.py flip from
'local_variable not in a.f.tools' to 'in', reflecting the new
behavior. 19 new unit tests cover the singledispatch matrix, the
probe contract, live-read semantics, the BaseModel-via-metaclass
dispatch case, the Box-via-TypeError-chain skip path, and the
system-prompt preface. One recorded-fixture integration test
exercises the end-to-end LLM-reads-lexical-value path.

hide=/expose= knob deferred to a follow-up.

* Fix mypy types in synthetic-reader probe

* Constrain test_tool_calling prompt against reader exploration

Adds an explicit instruction to generate_good_poem to ignore any
read-only lexical reader tools that may appear in the tool list.
With synthetic readers now exposing module-level imports/classes as
inspectable tools, the LLM was exploring those instead of finishing
the task, exceeding max_calls=4.

* Replace synthetic-reader singledispatch with _LexicalVariableTool

Address review feedback on PR #670:

- Add `_LexicalVariableTool[T](Tool[[], T])` in completions.py with a
  classmethod `define(env, *, name)` that probes
  `TypeAdapter(Encodable[typ]).json_schema()` and lets schema failures
  propagate to the call site.

- Inline reader generation into `_collect_tools`; remove
  `_collect_synthetic_readers`, `_build_synthetic_reader`,
  `_build_definition_reader`, all `@functools.singledispatch.register`
  handlers for `type`/`FunctionType`/`MethodType`/`BuiltinFunctionType`/
  `ModuleType`/`Tool`/`Agent`, and the `Literal["short", "full"]` toggle.

- Register passthrough handlers in encoding.py for `types.ModuleType`,
  `types.FunctionType`, `types.BuiltinFunctionType`, `types.MethodType`,
  `type`, and `Agent`. They preempt the broad `_pydantic_callable`
  fallback so Pydantic's natural schema-generation error fires for
  these types during the probe; the call site catches and skips.

- Delete `_LEXICAL_READERS_PREFACE` and the system-prompt injection
  path; per-instance `tool_fn.__doc__` carries the framing the LLM
  sees, scoped to one specific lexical variable per reader.

- Widen the probe failure catch tuple to cover the empirically
  observed failure modes (`PydanticInvalidForJsonSchema`,
  `PydanticUserError`, `TypeError`, `AttributeError`, `NameError`).

- Filter `_collect_tools` to identifier-only, non-dunder names to skip
  the `@py_builtins`/`@py_assert*` names that pytest's assertion
  rewriting injects into module globals.

- Move `_known_data` from module scope into the test function above
  `report_sum` in `test_llm_reads_lexical_value`.

- Add Test A `test_template_synthesis_uses_lexical_reader` (skipped
  pending fixture recording with a real API key) and Test B
  `test_template_skips_lexical_classes` (no LLM, locks the skip-via-
  catch contract using the #497 `Hand`/`Finger` example).

- Rewrite `test_handlers_llm_template.py` PR545 section against the
  new entry points. Add positive-skip coverage for modules, user
  classes, unannotated functions/methods, builtins, and Agents; pin
  that pytest's `MarkDecorator` and `__builtins__` are skipped
  without aborting collection; pin that annotated callables ARE
  exposed (the "annotated callable" caveat).

* Skip class values at _LexicalVariableTool.define

`nested_type(SomeClass)` routes through its Callable branch and extracts
`__init__`'s signature, returning `Callable[Args, Return]`.  For
dataclass-like classes with annotated constructors that schema generates
fine via `_pydantic_callable`, bypassing the `type` passthrough in
`Encodable[T]`.

Add an `isinstance(value, type)` pre-check at the top of
`_LexicalVariableTool.define` so the skip-via-catch direction stays
consistent for both bare classes (caught via `Encodable[type]`) and
dataclass-like classes (caught here).

* Collapse skip mechanism into one _is_synthetic_reader_eligible predicate

Replaces two parallel skip mechanisms (encoding.py passthrough handlers
that provoke a Pydantic schema error, plus an isinstance raise in
_LexicalVariableTool.define that forges PydanticSchemaGenerationError
from outside Pydantic) with a single predicate in completions.py.

- Add _is_synthetic_reader_eligible(value) -> bool that rejects values
  whose type is in _NON_READER_TYPES (type, Module, Function, Method,
  BuiltinFunction, Agent, Tool) and probes Encodable[T] schema
  generation for the rest.
- Use the predicate as the third branch's gate in _collect_tools; no
  try/except around tool construction.
- Strip the isinstance(value, type) raise and the in-define probe from
  _LexicalVariableTool.define; the class is now purely constructive
  and assumes its caller has already gated on the predicate.
- Drop the six TypeToPydanticType registrations and the
  _pydantic_type_passthrough function from encoding.py; drop the now-
  unused Agent import there.
- Re-add `import types` to completions.py for the predicate's
  isinstance tuple.

Behavior change: annotated callables in lexical scope are now skipped
along with unannotated ones, matching how class objects (which also
resolve to Callable[Args, Return] via nested_type) are treated. The
test that previously pinned "annotated callables ARE exposed" is
inverted to pin "they are skipped".

* Pure-Encodable reader collection: drop isinstance pre-filter

Replace the class+predicate pair (_LexicalVariableTool + _is_synthetic_reader_eligible) with a single free function _define_lexical_reader that probes Encodable[T].json_schema() and lets failures propagate.

_collect_tools catches the probe failures inline (six-exception tuple
no longer named — the catch is the only consumer).

Behavior consequence: classes (plain and dataclass-shaped),
unannotated functions, lambdas, bound methods, and builtins are now
exposed as readers because nested_type resolves them to a Callable
shape that _pydantic_callable schematises. Modules and Agent-subclass
instances still naturally fail the probe. The skip-set is whatever
Encodable rejects — no per-type code path remains.

Tests:
- Replace _LexicalVariableTool.define call sites with _define_lexical_reader.
- Replace _is_synthetic_reader_eligible checks with collect-and-check or pytest.raises.
- Flip Test B (provider) from "classes skipped" to "classes exposed".
- Flip the skips_user_classes / skips_*_functions / skips_*_methods / skips_builtin_functions tests into exposure tests parametrised over the Callable-shaped categories.
- Keep skips for modules and Agent instances (still genuinely fail Encodable).
- Drop the substring assertion on "lexical variable" from the doc test; assert the variable name appears instead.

* Black-format the test parametrize call

* Drop dunder filter from synthetic-reader gate; link Test A to #674

Two minor adjustments:

1. `_collect_tools` only filters by `name.isidentifier()` now; the
   `not name.startswith("__")` clause was hygienic, not load-bearing.
   Module dunders that happen to encode (e.g. `__name__: str`) flow
   through as readers; module dunders that don't (`__builtins__`,
   `__class__` of an Agent) still naturally fail the Encodable probe.
2. `test_template_synthesis_uses_lexical_reader` (Test A) is re-skipped
   pending #674. Initial recording attempt revealed a pre-existing
   synthesizer bug: `collect_imports` drops `_`-prefixed module
   imports even when referenced by the emitted variable stubs, so the
   pytest `request` fixture's `_pytest.fixtures.TopRequest` type
   crashes mypy_type_check. Skip reason references the issue.

* Post-#675/#676 fixup: subclass + snapshot semantics, record Test A

After merging master (which now contains the #675 collect_imports fix
and the #676 nested_type widening), several touch-ups:

- `_define_lexical_reader` is now a `_LexicalVariableTool.define`
  classmethod on a Tool subclass.  The reader closes over the value
  snapshot rather than `env[name]` because tools are reconstructed
  fresh each `call_assistant` invocation.
- Add a runtime assertion in `_LexicalVariableTool.define` that
  Tool/Template values must not be re-wrapped as lexical readers.
- Narrow `_collect_tools`'s catch tuple to the three Pydantic schema
  errors plus `TypeError`.  `TypeError` stays because the
  `Encodable[T]` registry raises it from `_pydantic_type_operation`,
  `_pydantic_type_term`, and `_pydantic_callable`'s incomplete-
  signature path.  `AttributeError`/`NameError` are gone now that
  #673 widens at the source.
- `test_template_synthesis_uses_lexical_reader` (Test A) unskipped;
  fixtures recorded against gpt-4o-mini and committed.
- `test_lexical_reader_skips_marker_objects` removed: with #673,
  `pytest.mark.parametrize` no longer raises in `nested_type` and the
  `MarkDecorator` class itself is now schema-encodable through
  Pydantic's dataclass detection.  The test pinned the old catch
  path; the new contract is "the value flows through".
- `test_collect_tools_real_tools_take_precedence_over_value_readers`
  removed: the invariant (Tools never get wrapped as lexical readers)
  is now enforced by the runtime assertion inside
  `_LexicalVariableTool.define`.
- `test_collect_tools_skips_agent_instances_but_exposes_their_tools`
  asserts on values (`inst.t in result.values()`) instead of the
  internal `agent__method_name` naming convention.
- Live-read semantics tests (`returns_live_value`, `rebind`,
  `raises_when_name_deleted`) replaced with snapshot-semantics
  counterparts (`returns_captured_value`, `snapshot_survives_rebind`,
  `snapshot_survives_deletion`).

* stronger test

* Identity-check returned values in lexical-reader exposure test

* Drop docstring-substring test; remove now-unused type-ignore

* Gate synthetic lexical-reader generation behind LexicalReaders handler

Per eb8680: needing to coach the LLM around noisy lexical-reader tools
in `test_handlers_llm_tool_calling_poem.py` was a smell.  Make the
generation opt-in instead.

- New `expose_lexical_readers()` Operation in `completions.py`, default
  return `False`.  `_collect_tools` gates the reader branch on it.
- New `LexicalReaders` ObjectInterpretation overrides the Operation to
  return `True`; users install it for the call sites where the LLM
  should see closure state.
- Revert the poem prompt-hack: the docstring no longer has to ask the
  LLM to ignore read-only lexical readers, because they are now off
  by default.
- Test A (`test_template_synthesis_uses_lexical_reader`) and the
  reader-integration test (`test_llm_reads_lexical_value`) install
  `LexicalReaders` in their handler stack.  Both fixtures re-recorded
  against gpt-4o-mini.
- Template tests that exercise reader generation install the handler.
  New `test_lexical_readers_off_by_default` and
  `test_lexical_readers_handler_enables_collection` pin both sides of
  the gate.
- `test_template_method` / `test_template_method_nested_class`:
  drop the side-note `"local_variable" in tools` assertions; those
  pinned implicit reader generation.  The core method-template tool
  collection (random, reverse, etc.) is the actual point and still
  passes.
- `test_template_exposes_lexical_classes` now pins both off (default)
  and on (handler installed) — the #497 motivating example with
  explicit gate semantics.

* remove weak test

* Format

* Promote collect_tools to a public Operation; LexicalReaders overrides it

Replaces the narrow `expose_lexical_readers` bool gate with a
general extension point: `collect_tools` is now an `Operation` whose
default rule does the minimal Tool/Template/Agent collection
(including the same-Tool-different-name dedup).  Handlers override
it to customise what gets exposed to the LLM.

`LexicalReaders` becomes an `ObjectInterpretation` that
`@implements(collect_tools)`: call `fwd()` to get the base set,
then add a synthetic `_LexicalVariableTool` for each plain value
in env whose `Encodable[T]` accepts it.

Renames the internal `_collect_tools` to public `collect_tools`
everywhere: `Template.tools`, `call_assistant`, and the template
tests.  No behaviour change beyond the design promotion.

* Drop redundant Tool|Template|Agent isinstance check in LexicalReaders
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

nested_type Callable-branch is brittle on values lacking __qualname__/__module__

2 participants