
Flaky test: test_temporal_grounding_integration_last_year #74

@abrookins

Example failing run:

uv run pytest --run-api-tests
============================= test session starts ==============================
platform linux -- Python 3.12.11, pytest-8.4.1, pluggy-1.6.0
rootdir: /home/runner/work/agent-memory-server/agent-memory-server
configfile: pytest.ini
testpaths: tests
plugins: anyio-4.9.0, asyncio-1.0.0, langsmith-0.4.2, cov-6.2.1, xdist-3.7.0
asyncio: mode=Mode.AUTO, asyncio_default_fixture_loop_scope=function, asyncio_default_test_loop_scope=function
collected 487 items

tests/integration/test_vectorstore_factory_integration.py ........       [  1%]
tests/test_api.py ....................................                   [  9%]
tests/test_auth.py ..................................................... [ 19%]
                                                                         [ 19%]
tests/test_cli.py ...........................                            [ 25%]
tests/test_client_api.py .............                                   [ 28%]
tests/test_client_enhancements.py ........................               [ 33%]
tests/test_client_tool_calls.py ..............................           [ 39%]
tests/test_context_percentage_calculation.py ........                    [ 40%]
tests/test_contextual_grounding.py ..                                    [ 41%]
tests/test_contextual_grounding_integration.py .F..s                     [ 42%]
tests/test_dependencies.py ..........                                    [ 44%]
tests/test_extraction.py .........                                       [ 46%]
tests/test_extraction_logic_fix.py ....                                  [ 47%]
tests/test_forgetting.py ......                                          [ 48%]
tests/test_forgetting_job.py ..                                          [ 48%]
tests/test_full_integration.py sssssssssssssssssssssssss                 [ 53%]
tests/test_llm_judge_evaluation.py ...s.......                           [ 56%]
tests/test_llms.py ..................                                    [ 59%]
tests/test_long_term_memory.py ....................                      [ 63%]
tests/test_mcp.py ...............                                        [ 66%]
tests/test_memory_compaction.py .....                                    [ 67%]
tests/test_memory_strategies.py ......................                   [ 72%]
tests/test_models.py .....                                               [ 73%]
tests/test_prompt_security.py .................                          [ 77%]
tests/test_query_optimization_errors.py ...........                      [ 79%]
tests/test_recency_aggregation.py ..                                     [ 79%]
tests/test_recent_messages_limit.py ........                             [ 81%]
tests/test_summarization.py ....                                         [ 82%]
tests/test_thread_aware_grounding.py s..s                                [ 82%]
tests/test_token_auth.py ...................                             [ 86%]
tests/test_token_cli.py ............                                     [ 89%]
tests/test_tool_contextual_grounding.py ....                             [ 90%]
tests/test_vectorstore_adapter.py ............                           [ 92%]
tests/test_working_memory.py ........                                    [ 94%]
tests/test_working_memory_reconstruction.py .....                        [ 95%]
tests/test_working_memory_strategies.py ............                     [ 97%]
tests/unit/test_factory_patterns.py ...........                          [100%]

=================================== FAILURES ===================================
_ TestContextualGroundingIntegration.test_temporal_grounding_integration_last_year _

self = <tests.test_contextual_grounding_integration.TestContextualGroundingIntegration object at 0x7fb2228ff2c0>

    async def test_temporal_grounding_integration_last_year(self):
        """Integration test for temporal grounding with real LLM"""
        example = ContextualGroundingBenchmark.get_temporal_grounding_examples()[0]
        session_id = f"test-temporal-{ulid.ULID()}"
    
        # Set up conversation context
        await self.create_test_conversation_with_context(
            example["messages"], example["context_date"], session_id
        )
    
        # Use thread-aware extraction
        from agent_memory_server.long_term_memory import (
            extract_memories_from_session_thread,
        )
    
        extracted_memories = await extract_memories_from_session_thread(
            session_id=session_id,
            namespace="test-namespace",
            user_id="test-integration-user",
        )
    
        # Verify extraction was successful
>       assert len(extracted_memories) >= 1, "Expected at least one extracted memory"
E       AssertionError: Expected at least one extracted memory
E       assert 0 >= 1
E        +  where 0 = len([])

tests/test_contextual_grounding_integration.py:349: AssertionError
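
The traceback shows the extraction call returning an empty list on this run. The issue does not state a root cause. Assuming the flakiness comes from nondeterministic output of the real LLM (an assumption, not something confirmed above), one possible mitigation is to retry the extraction a few times before asserting. The helper below is a minimal sketch of that idea; `retry_until_nonempty` and its parameters are hypothetical names and are not part of `agent_memory_server`.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_until_nonempty(
    call: Callable[[], Awaitable[list[T]]],
    attempts: int = 3,
    delay_seconds: float = 0.0,
) -> list[T]:
    """Await `call` up to `attempts` times and return the first non-empty list.

    Hypothetical helper: if every attempt returns an empty list, the last
    (empty) result is returned so the caller's assertion still fails loudly.
    """
    result: list[T] = []
    for attempt in range(attempts):
        result = await call()
        if result:
            return result
        # Sleep only between attempts, never after the final one.
        if delay_seconds and attempt < attempts - 1:
            await asyncio.sleep(delay_seconds)
    return result
```

In the failing test, the call to `extract_memories_from_session_thread(...)` could be wrapped in a zero-argument `lambda` and passed as `call`. Whether that function is safe to invoke more than once for the same session is not shown in this issue, so that would need checking before relying on this approach.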
