[fix] Resolve code review gaps in v0.72.0 (#3312)
Conversation
Pull request overview
This PR refactors pagination for sessions and users endpoints from using a simple next_cursor (datetime) field to using a more comprehensive windowing object structure. The changes affect the API response format, database queries, and frontend pagination logic. Additionally, the PR improves handling of reference attributes in the SDK's tracing processors to support BaseModel objects and nested dictionary structures.
Key changes:
- Replaced the `next_cursor` field with a `windowing` object containing pagination state in API responses
- Updated frontend query logic to extract the `next` cursor from the `windowing` object
- Enhanced SDK tracing processors to handle BaseModel references by serializing them to dictionaries
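As a rough illustration of the shape change described above, the sketch below contrasts the old bare cursor with the new nested object. It uses plain dataclasses rather than the project's actual Pydantic models, and the field names (`next`, `limit`) are assumptions drawn from the PR summary, not confirmed against the codebase:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Windowing:
    # Pagination state bundled into one object
    # (previously the response carried only a bare datetime cursor).
    next: Optional[datetime] = None
    limit: Optional[int] = None


@dataclass
class SessionIdsResponse:
    session_ids: list
    windowing: Optional[Windowing] = None  # replaces: next_cursor: Optional[datetime]


response = SessionIdsResponse(
    session_ids=["s1", "s2"],
    windowing=Windowing(next=datetime(2025, 1, 1, tzinfo=timezone.utc), limit=50),
)

# Frontend/store logic now reads the cursor from the nested object
# instead of a top-level next_cursor field:
next_cursor = response.windowing.next if response.windowing else None
```

The advantage of the object form is that additional pagination state (page size, direction, stop markers) can be added later without another breaking change to the response schema.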
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| api/oss/src/apis/fastapi/tracing/router.py | Modified sessions and users endpoints to return windowing object instead of next_cursor; constructs windowing with pagination state |
| api/oss/src/apis/fastapi/tracing/models.py | Updated SessionIdsResponse and UserIdsResponse models to replace next_cursor field with windowing field |
| api/oss/src/dbs/postgres/tracing/dao.py | Added support for windowing.next parameter in sessions and users queries |
| web/oss/src/state/newObservability/stores/sessionsStore.ts | Updated store to extract cursor from result.windowing.next instead of result.next_cursor |
| web/oss/src/state/newObservability/atoms/queries.ts | Updated query atom to use windowing.next for pagination and extract nextWindowing from response |
| sdk/agenta/sdk/tracing/processors.py | Added handling for primitive type references from baggage |
| sdk/agenta/sdk/engines/tracing/processors.py | Enhanced reference processing to serialize BaseModel objects and handle nested dictionaries |
| sdk/agenta/sdk/decorators/tracing.py | Improved baggage attachment to support BaseModel references and nested dictionary structures |
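The reference-handling change in the SDK processors can be sketched as a small normalizer: BaseModel-like objects are dumped to plain dicts and nested dictionaries are walked recursively before being attached as baggage. The function name and exact rules below are assumptions based on the PR summary, not the SDK's real API:

```python
from typing import Any


def serialize_reference(value: Any) -> Any:
    """Illustrative normalizer for reference attributes attached to tracing baggage.

    Objects exposing `model_dump()` (Pydantic v2 BaseModel) are converted to
    plain dicts, nested dicts are processed recursively, and primitives pass
    through unchanged. This mirrors the behavior described in the PR overview;
    it is a sketch, not the actual processor code.
    """
    if hasattr(value, "model_dump"):  # BaseModel-like object
        return serialize_reference(value.model_dump())
    if isinstance(value, dict):  # recurse into nested dictionaries
        return {k: serialize_reference(v) for k, v in value.items()}
    return value  # primitives (str, int, ...) pass through


class FakeModel:  # stand-in for a Pydantic BaseModel
    def model_dump(self):
        return {"id": "app-123", "meta": {"version": 1}}


print(serialize_reference({"application": FakeModel(), "slug": "demo"}))
# → {'application': {'id': 'app-123', 'meta': {'version': 1}}, 'slug': 'demo'}
```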
Pull request overview
Copilot reviewed 15 out of 16 changed files in this pull request and generated 8 comments.
web/oss/src/components/pages/observability/components/SessionsTable/index.tsx
Pull request overview
Copilot reviewed 25 out of 27 changed files in this pull request and generated 8 comments.
web/oss/src/components/pages/observability/components/ObservabilityHeader/index.tsx
web/oss/src/components/pages/observability/components/ObservabilityTable/index.tsx
v0.72.1 v0.72.0
Pull request overview
Copilot reviewed 26 out of 29 changed files in this pull request and generated 1 comment.
Pull request overview
Copilot reviewed 26 out of 29 changed files in this pull request and generated 8 comments.
web/oss/src/components/pages/observability/components/ObservabilityHeader/index.tsx
Outdated
Pull request overview
Copilot reviewed 27 out of 30 changed files in this pull request and generated 6 comments.
```diff
 from fastapi import HTTPException
 from numpy._core._multiarray_umath import array
-from autoevals.ragas import Faithfulness, ContextRelevancy
+# from autoevals.ragas import Faithfulness, ContextRelevancy  # Commented out due to autoevals removal
```
Copilot AI · Dec 25, 2025
The import statement is commented out but still present. It's cleaner to remove commented-out imports entirely rather than leaving them in the code, as they add no value and create clutter.
```python
# from autoevals.ragas import Faithfulness, ContextRelevancy  # Commented out due to autoevals removal
```
```python
import numpy as np
from openai import AsyncOpenAI
from fastapi import HTTPException
from numpy._core._multiarray_umath import array
```
Copilot AI · Dec 25, 2025
The unused import 'array' from numpy._core._multiarray_umath is still present at line 11. If it's not used elsewhere in the file, this import should be removed to clean up the dependencies.
```python
from numpy._core._multiarray_umath import array
```
```python
# COMMENTED OUT: RAG evaluation functions removed due to autoevals dependency removal
# async def measure_rag_consistency(
#     input: EvaluatorInputInterface,
# ) -> EvaluatorOutputInterface:
#     openai_api_key = input.credentials.get("OPENAI_API_KEY", None)
#     if not openai_api_key:
#         raise Exception(
#             "No OpenAI key was found. RAG evaluator requires a valid OpenAI API key to function. Please configure your OpenAI API and try again."
#         )
#
#     # Initialize RAG evaluator to calculate faithfulness score
#     faithfulness = Faithfulness(api_key=openai_api_key)
#     eval_score = await faithfulness._run_eval_async(
#         output=input.inputs["answer_key"],
#         input=input.inputs["question_key"],
#         context=input.inputs["contexts_key"],
#     )
#     return {"outputs": {"score": eval_score.score}}


# COMMENTED OUT: RAG faithfulness function removed due to autoevals dependency removal
# async def rag_faithfulness(
#     inputs: Dict[str, Any],  # pylint: disable=unused-argument
#     output: Union[str, Dict[str, Any]],
#     data_point: Dict[str, Any],  # pylint: disable=unused-argument
#     app_params: Dict[str, Any],  # pylint: disable=unused-argument
#     settings_values: Dict[str, Any],  # pylint: disable=unused-argument
#     lm_providers_keys: Dict[str, Any],  # pylint: disable=unused-argument
# ) -> Result:
#     try:
#         if isinstance(output, str):
#             log.error("'output' is most likely not BaseResponse.")
#             raise NotImplementedError(
#                 "Please update the SDK to the latest version, which supports RAG evaluators."
#             )
#
#         # Get required keys for rag evaluator
#         mapping_keys = remove_trace_prefix(settings_values=settings_values)
#         question_key: Union[str, None] = mapping_keys.get("question_key", None)
#         answer_key: Union[str, None] = mapping_keys.get("answer_key", None)
#         contexts_key: Union[str, None] = mapping_keys.get("contexts_key", None)
#
#         if None in [question_key, answer_key, contexts_key]:
#             log.error(
#                 f"Missing evaluator settings ? {['question', question_key is None, 'answer', answer_key is None, 'context', contexts_key is None]}"
#             )
#             raise ValueError(
#                 "Missing required configuration keys: 'question_key', 'answer_key', or 'contexts_key'. Please check your evaluator settings and try again."
#             )
#
#         # Turn distributed trace into trace tree
#         trace = {}
#         version = output.get("version")
#         if version == "3.0":
#             trace = output.get("tree", {})
#         elif version == "2.0":
#             trace = output.get("trace", {})
#
#         trace = process_distributed_trace_into_trace_tree(trace, version)
#
#         # Get value of required keys for rag evaluator
#         question_val: Any = get_field_value_from_trace_tree(
#             trace, question_key, version
#         )
#         answer_val: Any = get_field_value_from_trace_tree(trace, answer_key, version)
#         contexts_val: Any = get_field_value_from_trace_tree(
#             trace, contexts_key, version
#         )
#
#         if None in [question_val, answer_val, contexts_val]:
#             log.warn(
#                 f"Missing trace field ? {['question', question_val is None, 'answer', answer_val is None, 'context', contexts_val is None]}"
#             )
#
#             message = ""
#             if question_val is None:
#                 message += (
#                     f"'question_key' is set to {question_key} which can't be found. "
#                 )
#             if answer_val is None:
#                 message += f"'answer_key' is set to {answer_key} which can't be found. "
#             if contexts_val is None:
#                 message += (
#                     f"'contexts_key' is set to {contexts_key} which can't be found. "
#                 )
#             message += "Please check your evaluator settings and try again."
#
#             raise ValueError(message)
#
#         measurement = await measure_rag_consistency(
#             input=EvaluatorInputInterface(
#                 **{
#                     "inputs": {
#                         "question_key": question_val,
#                         "contexts_key": contexts_val,
#                         "answer_key": answer_val,
#                     },
#                     "settings": settings_values,
#                     "credentials": lm_providers_keys,
#                 }
#             )
#         )
#         return Result(type="number", value=measurement["outputs"]["score"])
#
#     except Exception:
#         return Result(
#             type="error",
#             value=None,
#             error=Error(
#                 message="Error during RAG Faithfulness evaluation",
#                 stacktrace=str(traceback.format_exc()),
#             ),
#         )


# COMMENTED OUT: RAG evaluation functions removed due to autoevals dependency removal
# async def measure_context_coherence(
#     input: EvaluatorInputInterface,
# ) -> EvaluatorOutputInterface:
#     openai_api_key = input.credentials.get("OPENAI_API_KEY", None)
#     if not openai_api_key:
#         raise Exception(
#             "No OpenAI key was found. RAG evaluator requires a valid OpenAI API key to function. Please configure your OpenAI API and try again."
#         )
#
#     # Initialize RAG evaluator to calculate context relevancy score
#     context_rel = ContextRelevancy(api_key=openai_api_key)
#     eval_score = await context_rel._run_eval_async(
#         output=input.inputs["answer_key"],
#         input=input.inputs["question_key"],
#         context=input.inputs["contexts_key"],
#     )
#     return {"outputs": {"score": eval_score.score}}


# COMMENTED OUT: RAG context relevancy function removed due to autoevals dependency removal
# async def rag_context_relevancy(
#     inputs: Dict[str, Any],  # pylint: disable=unused-argument
#     output: Union[str, Dict[str, Any]],
#     data_point: Dict[str, Any],  # pylint: disable=unused-argument
#     app_params: Dict[str, Any],  # pylint: disable=unused-argument
#     settings_values: Dict[str, Any],  # pylint: disable=unused-argument
#     lm_providers_keys: Dict[str, Any],  # pylint: disable=unused-argument
# ) -> Result:
#     try:
#         if isinstance(output, str):
#             log.error("'output' is most likely not BaseResponse.")
#             raise NotImplementedError(
#                 "Please update the SDK to the latest version, which supports RAG evaluators."
#             )
#
#         # Get required keys for rag evaluator
#         mapping_keys = remove_trace_prefix(settings_values=settings_values)
#         question_key: Union[str, None] = mapping_keys.get("question_key", None)
#         answer_key: Union[str, None] = mapping_keys.get("answer_key", None)
#         contexts_key: Union[str, None] = mapping_keys.get("contexts_key", None)
#
#         if None in [question_key, answer_key, contexts_key]:
#             log.error(
#                 f"Missing evaluator settings ? {['question', question_key is None, 'answer', answer_key is None, 'context', contexts_key is None]}"
#             )
#             raise ValueError(
#                 "Missing required configuration keys: 'question_key', 'answer_key', or 'contexts_key'. Please check your evaluator settings and try again."
#             )
#
#         # Turn distributed trace into trace tree
#         trace = {}
#         version = output.get("version")
#         if version == "3.0":
#             trace = output.get("tree", {})
#         elif version == "2.0":
#             trace = output.get("trace", {})
#
#         trace = process_distributed_trace_into_trace_tree(trace, version)
#
#         # Get value of required keys for rag evaluator
#         question_val: Any = get_field_value_from_trace_tree(
#             trace, question_key, version
#         )
#         answer_val: Any = get_field_value_from_trace_tree(trace, answer_key, version)
#         contexts_val: Any = get_field_value_from_trace_tree(
#             trace, contexts_key, version
#         )
#
#         if None in [question_val, answer_val, contexts_val]:
#             log.warn(
#                 f"Missing trace field ? {['question', question_val is None, 'answer', answer_val is None, 'context', contexts_val is None]}"
#             )
#
#             message = ""
#             if question_val is None:
#                 message += (
#                     f"'question_key' is set to {question_key} which can't be found. "
#                 )
#             if answer_val is None:
#                 message += f"'answer_key' is set to {answer_key} which can't be found. "
#             if contexts_val is None:
#                 message += (
#                     f"'contexts_key' is set to {contexts_key} which can't be found. "
#                 )
#             message += "Please check your evaluator settings and try again."
#
#             raise ValueError(message)
#
#         measurement = await measure_context_coherence(
#             input=EvaluatorInputInterface(
#                 **{
#                     "inputs": {
#                         "question_key": question_val,
#                         "contexts_key": contexts_val,
#                         "answer_key": answer_val,
#                     },
#                     "settings": settings_values,
#                     "credentials": lm_providers_keys,
#                 }
#             )
#         )
#         return Result(type="number", value=measurement["outputs"]["score"])
#
#     except Exception:
#         return Result(
#             type="error",
#             value=None,
#             error=Error(
#                 message="Error during RAG Context Relevancy evaluation",
#                 stacktrace=str(traceback.format_exc()),
#             ),
#         )
```
Copilot AI · Dec 25, 2025
Instead of commenting out large blocks of code (230+ lines), it would be better to remove them entirely. Commented-out code adds maintenance burden and clutter. Version control systems preserve the code history if it needs to be recovered later.
```python
                stacktrace=str(traceback.format_exc()),
            ),
        )
# COMMENTED OUT: RAG evaluation functions removed due to autoevals dependency removal
```
Copilot AI · Dec 25, 2025
The comment uses UPPERCASE for emphasis ("COMMENTED OUT"). While this makes it visible, a more standard approach would be to use a TODO or FIXME tag with a tracking issue reference. For example: "# TODO(#issue-number): Remove RAG evaluation functions after autoevals dependency is fully removed"
```typescript
nextWindowing: null,
// Store the entire windowing object as the cursor for next page
nextCursor: nextWindowing || null,
nextWindowing: nextWindowing,
```
Copilot AI · Dec 25, 2025
The nextCursor and nextWindowing are set to the same value on lines 226-227. This redundancy suggests one of these fields may be unnecessary. Consider removing the redundant field or documenting why both are needed.
```typescript
nextWindowing: nextWindowing,
```
```typescript
// Start fresh animation
setProgress(0)
```
Copilot AI · Dec 25, 2025
The progress animation uses setProgress(0) twice in the same effect (lines 61-62). The first setProgress(0) on line 62 is redundant since line 57 or the effect dependency already resets it. This is a minor inefficiency but doesn't affect functionality.
```typescript
// Start fresh animation
setProgress(0)
```
No description provided.