fix(context): always return inner dict from process_sources #54
Open
0xghost42 wants to merge 1 commit into
process_sources() is fed a SearchResult from serp_search.get_sources but
its consumer build_context() reads .get('organic') / .get('topStories')
on a plain dict (build_context.py:55-90). Three of the five return paths
already unwrap the inner dict via sources.data:
    return sources.data
    return self._update_sources_with_content(sources.data, ...)
The remaining two paths return the SearchResult wrapper instead:
    if not valid_sources:
        return sources  # <-- SearchResult
    except Exception as e:
        ...
        return sources  # <-- SearchResult
Whenever the SERP response has no organic results (empty query, rate
limit, quota exhaustion) or scraping/reranking raises, build_context()
hits the wrapper and crashes with
    AttributeError: 'SearchResult' object has no attribute 'get'
mirroring the report in issue sentient-agi#15. The reporter's root cause turned out
to be an install issue ('No module named src') but the underlying
inconsistency is real: any exception inside process_sources surfaces as
this misleading AttributeError downstream.
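
A minimal sketch of the failure mode described above. The `SearchResult` class, `build_context` function, and `leaky_process_sources` helper here are simplified, hypothetical stand-ins written for illustration; they are not the project's actual implementations. The sketch only shows why a wrapper object reaching dict-style code produces this particular `AttributeError`.

```python
from typing import Optional


class SearchResult:
    """Hypothetical stand-in for the wrapper returned by the SERP layer."""

    def __init__(self, data: Optional[dict] = None, error: Optional[str] = None):
        self.data = data
        self.error = error


def build_context(sources_result: dict) -> str:
    """Simplified consumer: reads keys with .get(), which only a dict offers."""
    organic = sources_result.get("organic", [])
    return "\n".join(item.get("snippet", "") for item in organic)


def leaky_process_sources(sources: SearchResult):
    """Mimics the two leaking branches: hands back the wrapper, not the dict."""
    return sources


wrapper = SearchResult(data={"organic": []})
try:
    build_context(leaky_process_sources(wrapper))
    crash_message = None
except AttributeError as exc:
    # The wrapper has no .get(), so the consumer fails here.
    crash_message = str(exc)

print(crash_message)
```

The exception is raised in the consumer, far from the branch that returned the wrong type, which is why the traceback points away from the real cause.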
Unwrap to sources.data in both branches, falling back to {} when data is
None (the error-result case from SerperAPI/SearXNG). build_context()
already tolerates an empty dict (returns ''), so the agent now degrades
gracefully instead of crashing.
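
The unwrap-with-fallback idea can be sketched as follows. This assumes a hypothetical `unwrap` helper and the same simplified stand-ins for `SearchResult` and `build_context`; the actual change is made inline in the two branches of `process_sources` and may be spelled differently.

```python
from typing import Optional


class SearchResult:
    """Hypothetical stand-in for the SERP wrapper; data is None on error."""

    def __init__(self, data: Optional[dict] = None, error: Optional[str] = None):
        self.data = data
        self.error = error


def unwrap(sources: SearchResult) -> dict:
    """Return the inner dict, or an empty dict for an error result."""
    return sources.data if sources.data is not None else {}


def build_context(sources_result: dict) -> str:
    """Simplified consumer that tolerates a dict with no 'organic' key."""
    organic = sources_result.get("organic", [])
    return "\n".join(item.get("snippet", "") for item in organic)


error_result = SearchResult(error="quota exhausted")  # data stays None
empty_result = SearchResult(data={"organic": []})     # no organic hits

context_from_error = build_context(unwrap(error_result))
context_from_empty = build_context(unwrap(empty_result))

print(repr(context_from_error))
print(repr(context_from_empty))
```

In this sketch both cases yield an empty context string, so the caller sees "no context" rather than an exception.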
Return annotation updated from List[dict] to dict to match the actual
contract; sources parameter annotation dropped (was List[dict], actually
SearchResult — leaving it untyped rather than importing SearchResult
into this module just for the hint).
Summary
`SourceProcessor.process_sources` is invoked with a `SearchResult` from `serp_search.get_sources`, and its return value is fed straight into `build_context`, which reads `.get('organic')` / `.get('topStories')` / `.get('answerBox')` on a plain dict (`context_building/build_context.py:55-90`).

Three of the five return paths inside `process_sources` already unwrap the inner dict: `return sources.data` and `return self._update_sources_with_content(sources.data, ...)`.

The other two leak the `SearchResult` wrapper: the `if not valid_sources:` branch and the `except Exception` handler both end in `return sources`.

Whenever the SERP call returns no organic results (empty query, rate limiting, quota exhaustion, etc.) or the scrape/rerank step raises, `build_context` then crashes with `AttributeError: 'SearchResult' object has no attribute 'get'`.

This is the same error surfaced in #15. The reporter's root cause turned out to be an install issue (`No module named 'src'`) that ended up in the `except` branch, but the underlying inconsistency is real: any future exception inside `process_sources` will surface as the same misleading `AttributeError` downstream rather than being logged and recovered.

Change
`src/opendeepsearch/context_building/process_sources_pro.py`:

- Unwrap to `sources.data` in both leaking branches, falling back to `{}` when `data` is `None` (the error-result case: `SearchResult.__init__` leaves `data=None` when constructed with `error=...`). `build_context` already tolerates an empty dict (its `extract_information(sources_result.get('organic', []))` chain returns `''` cleanly), so the agent now degrades gracefully instead of crashing.
- Return annotation updated from `List[dict]` to `dict` to match the actual contract.
- `sources` parameter annotation dropped: it was typed `List[dict]` but is in fact a `SearchResult`, and importing `SearchResult` into this module just for the hint is more churn than the fix warrants.

Verification
- `python -m py_compile src/opendeepsearch/context_building/process_sources_pro.py` clean.
- Exercised both leaking branches against `build_context` and confirmed they previously triggered `AttributeError: 'SearchResult' object has no attribute 'get'`; after the fix, `build_context` receives `{}` and returns `''`.
- The three unwrapping return paths (`return sources.data`, `return sources.data` via wiki branch, `return self._update_sources_with_content(sources.data, ...)`) are unchanged.

Out of scope
The wider type-annotation mismatch (`_get_valid_sources` / `_update_sources_with_content` declare `List[dict]` but operate on the inner SERP dict) is left for a separate cleanup PR; the runtime bug is what this PR addresses.