
fix(db): flush session before accessing deduplication_event.id (fixes #5496)#6228

Open
asheesh-devops wants to merge 1 commit into keephq:main from asheesh-devops:fix/5496-deferred-loader-flush

Conversation

@asheesh-devops

Summary

Fixes KeyError: "Deferred loader for attribute 'id' failed to populate correctly" in create_deduplication_event() under high alert load (~1000+ alerts/minute).

Root Cause

In create_deduplication_event(), session.add(deduplication_event) is followed by session.commit(), which expires all ORM attributes. When deduplication_event.id is then accessed in the logging call, SQLAlchemy triggers a deferred load to re-fetch the ID, and that extra DB round-trip fails under high load:

```python
# Current (broken under load):
session.add(deduplication_event)
session.commit()  # Expires all attributes
logger.debug(..., extra={"deduplication_event_id": deduplication_event.id})  # Deferred load fails!
```

The Fix

Add session.flush() to force the INSERT and populate the auto-generated ID, then capture it in a local variable before commit() expires the ORM object:

```python
# Before (broken):
session.add(deduplication_event)
session.commit()
logger.debug(..., extra={"deduplication_event_id": deduplication_event.id})

# After (fixed):
session.add(deduplication_event)
session.flush()  # Force INSERT, ID now populated
deduplication_event_id = deduplication_event.id  # Capture before expire
session.commit()
logger.debug(..., extra={"deduplication_event_id": deduplication_event_id})
```

This follows the same session.add() → session.flush() pattern already used in create_incident_from_dict() (line 2420) and _create_application_based_incident() (fixed in #6174).

Changes

  • keep/api/core/db.py — add session.flush() and capture ID before commit in create_deduplication_event()

Testing

  • Verified the same flush() → capture → commit() pattern is used in create_incident_from_dict() (line 2420-2425)
  • The fix is identical in nature to #6174 (fix(topology): persist incident before adding alerts, fixes #5463), which was merged for the same class of bug in topology processing
  • No behavioral change — only timing of when the ID is read from the ORM object

Fixes #5496 (Issue 2: Deferred loader failure after INSERT)

…eephq#5496)

Add session.flush() after session.add() and capture the ID before
session.commit() expires attributes. Prevents KeyError on deferred
loader under high alert load.
@dosubot dosubot bot added size:XS This PR changes 0-9 lines, ignoring generated files. Bug Something isn't working labels Apr 9, 2026


Development

Successfully merging this pull request may close these issues.

[🐛 Bug]: Critical database connection issues under high alert load
