[PQE-401] Kpom/safety harnesses by kpom-specter · Pull Request #48 · SpecterOps/DAWGS

kpom-specter · 2026-03-25T18:30:34Z

Add opengraph package and integration test harness

Summary

Add opengraph/ package for graph serialization (load, export, validate) backed by graph.Database interfaces
Add integration/ test harness that loads JSON graph datasets and runs Cypher queries against a real database to verify correctness
Harness is driver-agnostic via -driver flag (defaults to pg)

What this enables

Today dawgs only tests "did the SQL generate correctly" (translation tests). This PR adds tests that verify "did I get the correct answer" — load a known graph, run a query, assert the results. This is the safety net needed before making changes to the BFS harness, edges_to_path, or the query planner.

opengraph package

Load(ctx, db, reader) — decode JSON, validate, write nodes/edges, return ID mapping
Export(ctx, db, writer) — read all nodes/edges, write indented JSON
Validate(doc) — structural checks (duplicate IDs, dangling edge refs, empty fields)
8 unit tests, 3 integration tests (load, export, round-trip)

integration test harness

SetupDB(t) — connect to database, assert schema, cleanup on teardown
LoadDataset(t, db, ctx, name) — load a named JSON fixture from testdata/
QueryPaths, AssertPaths, AssertIDSet — query and assertion helpers
ClearGraph — wipe nodes/edges between subtests
integration/testdata/local/ gitignored for custom datasets

Test suites (5 suites, 23 tests)

Suite	What it tests	Datasets
`TestMatchNodesByKind`	Node counts after load	all 6
`TestMatchEdgesByKind`	Edge counts after load	all 6
`TestMatchReturnNodes`	Single-hop neighbor queries	3
`TestAllShortestPaths`	BFS correctness via `allShortestPaths`	all 6
`TestVariableLengthTraversal`	Variable-length reachability	5

Running

# Unit tests
go test ./opengraph/ -run TestValidate

# Integration tests (requires PG)
PG_CONNECTION_STRING="postgresql://..." go test -tags=manual_integration ./opengraph/ -v
PG_CONNECTION_STRING="postgresql://..." go test -tags=manual_integration ./integration/ -v

Summary by CodeRabbit

Release Notes

New Features

Export graph databases to JSON format for backup and test data management
Benchmarking tool for measuring query performance across datasets with detailed timing reports
Integration testing framework supporting data-driven Cypher query validation

Documentation

Added guides for export and benchmark commands

…nd validation

coderabbitai · 2026-03-25T18:30:48Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

Walkthrough

The pull request introduces OpenGraph JSON serialization/deserialization utilities, a comprehensive integration testing harness for graph databases, an export command to persist graphs to JSON, a benchmarking CLI tool with scenario execution, and extensive integration test fixtures covering various Cypher query patterns.

Changes

Cohort / File(s)	Summary
OpenGraph Core Types & Serialization `opengraph/types.go`, `opengraph/validate.go`, `opengraph/validate_test.go`, `opengraph/load.go`, `opengraph/export.go`	Introduces OpenGraph JSON document model with Node/Edge types, validation logic enforcing node ID uniqueness and edge referential integrity, loading/writing graph data to/from databases with IDMap tracking, and exporting databases to JSON format with proper node/edge resolution.
OpenGraph Integration Tests `opengraph/opengraph_integration_test.go`	Adds manual integration tests validating Load, Export, and round-trip functionality against a live PostgreSQL database with setupTestDB helper.
Integration Test Infrastructure `integration/harness.go`, `integration/cypher_test.go`, `integration/local_dataset_test.go`	Provides test harness with SetupDB, ClearGraph, LoadDataset utilities; comprehensive Cypher query test discovery and execution framework; and local dataset testing with node/kind-based helpers.
Benchmark CLI & Infrastructure `cmd/benchmark/main.go`, `cmd/benchmark/runner.go`, `cmd/benchmark/report.go`, `cmd/benchmark/scenarios.go`, `cmd/benchmark/README.md`	Implements benchmarking tool with scenario execution, per-iteration timing statistics (Median, P95, Max), Markdown report generation, and predefined scenarios for multiple datasets including traversal and shortest-path queries.
Export Command `cmd/export/main.go`, `cmd/export/README.md`	Adds export CLI entrypoint supporting custom output filenames and `PGCONN` environment variable override for exporting graph databases to JSON.
Integration Test Fixtures `integration/testdata/base.json`, `integration/testdata/quantifier.json`, `integration/testdata/cases/*.json`	Provides 1 core dataset and 30+ test case files covering aggregation, deletion, expansion, multipart flows, node filtering, pattern binding, quantifiers, shortest paths, stepwise traversal, and updates, with both pre-loaded and inline fixture variants.
Git Configuration & Documentation `.gitignore`, `integration/BENCHMARKS.md`	Updates git ignore to exclude `integration/testdata/local/` and adds dated benchmark report documentation.

Sequence Diagram(s)

sequenceDiagram
    participant Test as Integration Test
    participant Harness as Test Harness
    participant OpenGraph as OpenGraph Module
    participant DB as Graph Database
    
    Test->>Harness: SetupDB(datasets...)
    Harness->>OpenGraph: ParseDocument(dataset.json)
    OpenGraph-->>Harness: Document
    Harness->>Harness: Aggregate node/edge kinds
    Harness->>DB: AssertSchema(kinds)
    DB-->>Harness: Schema asserted
    Harness->>Test: Database connection ready
    
    Test->>Harness: LoadDataset(name)
    Harness->>OpenGraph: ParseDocument(dataset.json)
    OpenGraph-->>Harness: Document
    Harness->>OpenGraph: WriteGraph(db, graph)
    OpenGraph->>DB: WriteTransaction
    OpenGraph->>DB: Create nodes
    DB-->>OpenGraph: Node IDs mapped
    OpenGraph->>DB: BatchOperation (edges)
    OpenGraph->>DB: Create relationships
    DB-->>OpenGraph: IDMap
    OpenGraph-->>Harness: IDMap
    Harness-->>Test: IDMap
    
    Test->>Harness: QueryPaths(cypher)
    Harness->>DB: ReadTransaction
    DB->>DB: Execute query
    DB-->>Harness: Rows of graph.Path
    Harness-->>Test: []graph.Path
    
    Test->>Harness: AssertPaths(paths, idMap, expected)
    Harness->>Harness: Convert paths to fixture ID signatures
    Harness->>Harness: Compare signatures
    Harness-->>Test: Assertion result

sequenceDiagram
    participant CLI as Export CLI
    participant OpenGraph as OpenGraph Module
    participant DB as Graph Database
    participant File as Output File
    
    CLI->>DB: NewPool + Open database
    DB-->>CLI: Database connection
    CLI->>OpenGraph: Export(ctx, db, file)
    OpenGraph->>DB: ReadTransaction
    DB->>DB: Fetch all nodes (cursor)
    DB-->>OpenGraph: Nodes
    OpenGraph->>OpenGraph: Build ID map (numeric → string)
    DB->>DB: Fetch all relationships (cursor)
    DB-->>OpenGraph: Relationships
    OpenGraph->>OpenGraph: Resolve edge IDs via map
    OpenGraph->>OpenGraph: Construct Document
    OpenGraph->>File: json.Encoder.Encode(Document)
    File-->>CLI: Written successfully

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 Behold the OpenGraph dance,
Where rabbits load and export with chance,
Benchmarks hop through datasets wide,
While integration tests run side by side!
Graph JSON flows like morning dew—
A fuzzy new framework, tried and true! 🌿✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 35.71% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The PR title 'PQE-401 Kpom/safety harnesses' references a ticket ID and branch name but does not clearly summarize the main changes being introduced.	Consider a more descriptive title that highlights the primary change, such as 'Add opengraph package and integration test harness' or 'Add OpenGraph JSON loading/export and integration testing utilities'.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description check	✅ Passed	The PR description is comprehensive and well-structured, covering objectives, implementation details, and testing instructions, though it does not follow the repository template with formal type/testing checkboxes.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch kpom/safety-harnesses

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (4)

integration/shortest_path_test.go (1)

50-53: Isolate subtests by clearing graph state before each dataset load.

Using a shared DB without ClearGraph can make subtests interdependent as fixtures grow.

♻️ Suggested change

 for _, tt := range tests {
 	t.Run(tt.dataset, func(t *testing.T) {
+		ClearGraph(t, db, ctx)
 		idMap := LoadDataset(t, db, ctx, tt.dataset)
 		paths := QueryPaths(t, ctx, db, ASPQuery(idMap, tt.start, tt.end))
 		AssertPaths(t, paths, idMap, tt.expected)
 	})
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@integration/shortest_path_test.go` around lines 50 - 53, The subtests reuse a
shared DB and don't clear graph state, causing interdependent fixtures; before
loading each dataset call ClearGraph to reset DB state so LoadDataset starts
from a clean graph. Update the subtest setup in the loop where LoadDataset,
ASPQuery, and QueryPaths are invoked to invoke ClearGraph(ctx, db) (or the
project’s equivalent ClearGraph function) immediately prior to LoadDataset(t,
db, ctx, tt.dataset) so each t.Run uses an isolated graph state.

integration/traversal_test.go (1)

48-51: Clear DB state per subtest to keep traversal fixtures independent.

This keeps expected reachability assertions insulated from prior fixture loads.

♻️ Suggested change

 for _, tt := range tests {
 	t.Run(tt.dataset, func(t *testing.T) {
+		ClearGraph(t, db, ctx)
 		idMap := LoadDataset(t, db, ctx, tt.dataset)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@integration/traversal_test.go` around lines 48 - 51, Subtests share DB state
causing fixture interference; before calling LoadDataset(t, db, ctx, tt.dataset)
inside t.Run, reset the DB to a clean state so each subtest's fixtures are
independent (e.g., invoke a Reset/Truncate helper or run a transaction
rollback/clean function against the shared db). Ensure the reset runs for each
iteration (before LoadDataset) and optionally defer a per-subtest cleanup to
guarantee isolation for the idMap and subsequent reachability assertions.

opengraph/export.go (1)

57-65: Fail fast instead of silently synthesizing endpoint IDs.

If an edge endpoint is missing from idToString, export should return an error rather than emitting potentially inconsistent references.

♻️ Suggested change

 				startStr, ok := idToString[rel.StartID]
 				if !ok {
-					startStr = strconv.FormatInt(int64(rel.StartID), 10)
+					return fmt.Errorf("relationship %d has unmapped start node ID %d", rel.ID, rel.StartID)
 				}

 				endStr, ok := idToString[rel.EndID]
 				if !ok {
-					endStr = strconv.FormatInt(int64(rel.EndID), 10)
+					return fmt.Errorf("relationship %d has unmapped end node ID %d", rel.ID, rel.EndID)
 				}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@opengraph/export.go` around lines 57 - 65, The code currently synthesizes
string IDs when a mapping for rel.StartID or rel.EndID is missing in idToString;
change this to fail fast by returning an error from the export function (the
function that iterates relations and uses idToString) whenever a StartID or
EndID lookup is missing, instead of generating a numeric fallback. Locate the
loop using idToString and rel.StartID/rel.EndID, replace the fallback branches
with early returns of a descriptive error that includes the missing numeric ID
and whether it was the start or end, and update the function signature and any
callers to propagate/handle the error accordingly.

integration/match_test.go (1)

119-121: Apply the same graph reset pattern in TestMatchReturnNodes.

This test is the only one in this file that skips ClearGraph between dataset subtests.

♻️ Suggested change

 for _, tt := range tests {
 	t.Run(tt.dataset, func(t *testing.T) {
+		ClearGraph(t, db, ctx)
 		idMap := LoadDataset(t, db, ctx, tt.dataset)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@integration/match_test.go` around lines 119 - 121, TestMatchReturnNodes
currently omits the ClearGraph call between dataset subtests, causing state
leakage; update TestMatchReturnNodes to follow the same pattern used elsewhere
by calling ClearGraph(ctx, db) (or the test helper that resets the graph) before
each t.Run/LoadDataset invocation and after each subtest as appropriate so each
dataset subtest starts with a clean graph; locate TestMatchReturnNodes and
insert the ClearGraph invocation around the LoadDataset/tt.dataset usage to
mirror other tests' reset pattern.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@integration/harness.go`:
- Around line 163-167: ASPQuery currently indexes idMap[start] and idMap[end]
directly which returns 0 for missing keys; modify ASPQuery to accept a
*testing.T parameter, validate both keys exist in the idMap (use the comma-ok
form) and call t.Fatalf with a clear message if either is missing, then build
and return the query using the validated IDs; update the single call site in
integration/shortest_path_test.go to pass the test *testing.T into ASPQuery.

In `@opengraph/load.go`:
- Around line 54-60: When persisting nodes in opengraph/load.go, preserve the
logical node.ID by adding it into the node properties (e.g., under key
"_opengraph_id") before calling
tx.CreateNode(graph.AsProperties(node.Properties), ...), so the original ID is
stored alongside the DB-generated ID; continue to populate nodeMap[node.ID] =
dbNode.ID as before. Then update opengraph/export.go to prefer the stored
"_opengraph_id" property when exporting nodes, falling back to the DB ID only if
"_opengraph_id" is absent. Ensure you reference the CreateNode call and nodeMap
handling in load.go and the export path in export.go when making these changes.
- Around line 49-53: WriteGraph currently dereferences g (len(g.Nodes)) and will
panic if a nil *Graph is passed; add a nil check at the start of WriteGraph
(e.g., if g == nil) and return a zero-valued IDMap (or nil) with a descriptive
error instead of continuing; adjust callers/tests if they rely on panics and
ensure the check is placed before nodeMap := make(IDMap, len(g.Nodes)) so
symbols to modify are WriteGraph, g and g.Nodes.

In `@opengraph/opengraph_integration_test.go`:
- Around line 173-174: The test currently ignores the error from
json.Marshal(original) before calling Load; change the marshal to capture and
check the error (e.g., inputBytes, err := json.Marshal(original); if err != nil
{ t.Fatalf("json.Marshal failed: %v", err) } or use your test helper like
require.NoError) so marshal failures are reported immediately instead of
surfacing as confusing Load errors; update the code around the json.Marshal call
and ensure bytes.NewReader(inputBytes) is only used after the successful
marshal.

---

Nitpick comments:
In `@integration/match_test.go`:
- Around line 119-121: TestMatchReturnNodes currently omits the ClearGraph call
between dataset subtests, causing state leakage; update TestMatchReturnNodes to
follow the same pattern used elsewhere by calling ClearGraph(ctx, db) (or the
test helper that resets the graph) before each t.Run/LoadDataset invocation and
after each subtest as appropriate so each dataset subtest starts with a clean
graph; locate TestMatchReturnNodes and insert the ClearGraph invocation around
the LoadDataset/tt.dataset usage to mirror other tests' reset pattern.

In `@integration/shortest_path_test.go`:
- Around line 50-53: The subtests reuse a shared DB and don't clear graph state,
causing interdependent fixtures; before loading each dataset call ClearGraph to
reset DB state so LoadDataset starts from a clean graph. Update the subtest
setup in the loop where LoadDataset, ASPQuery, and QueryPaths are invoked to
invoke ClearGraph(ctx, db) (or the project’s equivalent ClearGraph function)
immediately prior to LoadDataset(t, db, ctx, tt.dataset) so each t.Run uses an
isolated graph state.

In `@integration/traversal_test.go`:
- Around line 48-51: Subtests share DB state causing fixture interference;
before calling LoadDataset(t, db, ctx, tt.dataset) inside t.Run, reset the DB to
a clean state so each subtest's fixtures are independent (e.g., invoke a
Reset/Truncate helper or run a transaction rollback/clean function against the
shared db). Ensure the reset runs for each iteration (before LoadDataset) and
optionally defer a per-subtest cleanup to guarantee isolation for the idMap and
subsequent reachability assertions.

In `@opengraph/export.go`:
- Around line 57-65: The code currently synthesizes string IDs when a mapping
for rel.StartID or rel.EndID is missing in idToString; change this to fail fast
by returning an error from the export function (the function that iterates
relations and uses idToString) whenever a StartID or EndID lookup is missing,
instead of generating a numeric fallback. Locate the loop using idToString and
rel.StartID/rel.EndID, replace the fallback branches with early returns of a
descriptive error that includes the missing numeric ID and whether it was the
start or end, and update the function signature and any callers to
propagate/handle the error accordingly.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0474eeed-7027-43be-b15d-107ed10ce53a

📥 Commits

Reviewing files that changed from the base of the PR and between ec6ffe0 and 218ab0f.

📒 Files selected for processing (17)

.gitignore
integration/harness.go
integration/match_test.go
integration/shortest_path_test.go
integration/testdata/dead_end.json
integration/testdata/diamond.json
integration/testdata/direct_shortcut.json
integration/testdata/disconnected.json
integration/testdata/linear.json
integration/testdata/wide_diamond.json
integration/traversal_test.go
opengraph/export.go
opengraph/load.go
opengraph/opengraph_integration_test.go
opengraph/types.go
opengraph/validate.go
opengraph/validate_test.go

integration/harness.go

opengraph/load.go

opengraph/opengraph_integration_test.go

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

integration/match_test.go (1)

122-125: ⚠️ Potential issue | 🟡 Minor

Add validation for missing fixture ID key.

Direct map access idMap[tt.queryID] returns 0 if the key doesn't exist, silently producing a query with an invalid node ID. This mirrors the same pattern previously flagged for ASPQuery. Validate the key exists to fail fast on test configuration errors.

Proposed fix

 		t.Run(tt.dataset, func(t *testing.T) {
+			ClearGraph(t, db, ctx)
 			idMap := LoadDataset(t, db, ctx, tt.dataset)
 
+			startID, ok := idMap[tt.queryID]
+			if !ok {
+				t.Fatalf("fixture ID %q not found in dataset %q", tt.queryID, tt.dataset)
+			}
+
 			cypher := fmt.Sprintf(
 				"MATCH (s)-[:EdgeKind1]->(e) WHERE id(s) = %d RETURN e",
-				idMap[tt.queryID],
+				startID,
 			)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@integration/match_test.go` around lines 122 - 125, The test constructs a
Cypher query using idMap[tt.queryID] which returns 0 for missing keys and
produces an invalid query; before building cypher, look up the ID with the
comma-ok pattern (e.g., value, ok := idMap[tt.queryID]) and if !ok fail the test
immediately (t.Fatalf or t.Fatalf-like helper) with a clear message mentioning
tt.queryID and idMap, then use the retrieved value when formatting the cypher
string.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@integration/match_test.go`:
- Around line 118-129: TestMatchReturnNodes is missing a ClearGraph call,
causing datasets to accumulate across subtests; before calling LoadDataset(t,
db, ctx, tt.dataset) inside the t.Run loop, invoke ClearGraph(t, db, ctx) so
each subtest starts with a clean DB state; update the loop where LoadDataset is
used (referencing LoadDataset, ClearGraph, QueryNodeIDs, and AssertIDSet) to
call ClearGraph first to prevent cross-test contamination.

---

Duplicate comments:
In `@integration/match_test.go`:
- Around line 122-125: The test constructs a Cypher query using
idMap[tt.queryID] which returns 0 for missing keys and produces an invalid
query; before building cypher, look up the ID with the comma-ok pattern (e.g.,
value, ok := idMap[tt.queryID]) and if !ok fail the test immediately (t.Fatalf
or t.Fatalf-like helper) with a clear message mentioning tt.queryID and idMap,
then use the retrieved value when formatting the cypher string.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: dfbb35a1-b0d6-45a0-afc7-086b5b8e8e51

📥 Commits

Reviewing files that changed from the base of the PR and between 218ab0f and 2121e36.

📒 Files selected for processing (4)

integration/harness.go
integration/match_test.go
integration/shortest_path_test.go
integration/traversal_test.go

✅ Files skipped from review due to trivial changes (1)

integration/shortest_path_test.go

🚧 Files skipped from review as they are similar to previous changes (1)

integration/traversal_test.go

integration/match_test.go

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (2)

integration/harness.go (2)

38-41: Flag names may conflict with other test packages.

Package-level flag.String calls use generic names like -driver and -connection. If other packages in the same test binary define flags with the same names, this will cause a panic at init time.

Consider prefixing with a namespace (e.g., -integration.driver, -integration.connection) to avoid collisions.

♻️ Suggested change

 var (
-	driverFlag  = flag.String("driver", "pg", "database driver to test against (pg, neo4j)")
-	connStrFlag = flag.String("connection", "", "database connection string (overrides PG_CONNECTION_STRING env var)")
+	driverFlag  = flag.String("integration.driver", "pg", "database driver to test against (pg, neo4j)")
+	connStrFlag = flag.String("integration.connection", "", "database connection string (overrides PG_CONNECTION_STRING env var)")
 )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@integration/harness.go` around lines 38 - 41, The package-level flags
driverFlag and connStrFlag use generic names ("driver", "connection") that can
collide with flags from other test packages; change their flag.String key names
to a namespaced prefix (for example "integration.driver" and
"integration.connection") and update the corresponding usage/help strings to
reflect the new names so consumers know the namespace, ensuring the variables
driverFlag and connStrFlag remain but the first argument to flag.String is the
prefixed name to avoid init-time conflicts.

198-214: Mutating input slices may cause unexpected side effects.

sort.Strings(got) and sort.Strings(expected) modify the caller's slices in place. While this may not cause issues in current tests, it's defensive to copy slices before sorting to avoid surprises if callers reuse them.

♻️ Suggested fix

 func AssertIDSet(t *testing.T, got, expected []string) {
 	t.Helper()

-	sort.Strings(got)
-	sort.Strings(expected)
+	gotSorted := make([]string, len(got))
+	copy(gotSorted, got)
+	sort.Strings(gotSorted)
+
+	expectedSorted := make([]string, len(expected))
+	copy(expectedSorted, expected)
+	sort.Strings(expectedSorted)

-	if len(got) != len(expected) {
-		t.Fatalf("ID set length: got %d, want %d\n  got:  %v\n  want: %v", len(got), len(expected), got, expected)
+	if len(gotSorted) != len(expectedSorted) {
+		t.Fatalf("ID set length: got %d, want %d\n  got:  %v\n  want: %v", len(gotSorted), len(expectedSorted), gotSorted, expectedSorted)
 	}

-	for i := range got {
-		if got[i] != expected[i] {
-			t.Fatalf("ID set mismatch at index %d:\n  got:  %v\n  want: %v", i, got, expected)
+	for i := range gotSorted {
+		if gotSorted[i] != expectedSorted[i] {
+			t.Fatalf("ID set mismatch at index %d:\n  got:  %v\n  want: %v", i, gotSorted, expectedSorted)
 		}
 	}
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@integration/harness.go` around lines 198 - 214, AssertIDSet mutates the
caller's slices by calling sort.Strings on the passed-in got and expected; make
shallow copies first (e.g., gotCopy and expectedCopy using make and copy) and
call sort.Strings on those copies, then perform the length check and element
comparisons against the copies so the original slices passed into AssertIDSet
are not modified.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@integration/harness.go`:
- Around line 64-75: When pg.NewPool succeeds but dawgs.Open fails, the created
pool is leaked; after creating the pool (pg.NewPool) assign it to cfg.Pool and
then call dawgs.Open, but if dawgs.Open returns an error explicitly Close the
pool (call pool.Close() or the pool's appropriate cleanup method) before calling
t.Fatalf; reference the symbols driverFlag, pg.DriverName, pg.NewPool, cfg.Pool
and dawgs.Open so the cleanup is done in the same block where the pool is
created.

In `@integration/match_test.go`:
- Around line 122-125: The test builds a Cypher using idMap[tt.queryID] without
checking the map membership, which silently yields 0 for missing keys; update
the test around the cypher construction to first lookup the id with the comma-ok
pattern (checking idMap and tt.queryID), and if the id is missing call t.Fatalf
(or t.Fatalf equivalent) with a clear error message so the test fails fast
instead of producing "id(s) = 0"; only then use the validated id to build the
cypher via fmt.Sprintf.

In `@integration/traversal_test.go`:
- Around line 46-56: The test loop calls LoadDataset without clearing prior
data, causing data accumulation; before calling LoadDataset in the t.Run
anonymous function add a call to ClearGraph (or the test helper that empties the
DB) so each subtest starts with a fresh graph; specifically modify the anonymous
test body containing LoadDataset(t, db, ctx, tt.dataset) to invoke
ClearGraph(ctx, db) (or the existing ClearGraph helper) immediately prior to
LoadDataset, ensuring isolation for the QueryNodeIDs/assertions.
- Around line 50-53: The code constructs the Cypher query using idMap[tt.start]
which returns 0 when the key is missing and can produce an invalid/unintended
query; change the lookup in the test (where cypher := fmt.Sprintf(...
idMap[tt.start] ...)) to validate presence first—e.g., check the map with the
comma-ok pattern and fail the test (t.Fatalf) or call a small helper (e.g.,
getIDOrFail/idFromMapOrFail) that panics/fails the test if tt.start is not
present—then use the validated id value when building the cypher string.

---

Nitpick comments:
In `@integration/harness.go`:
- Around line 38-41: The package-level flags driverFlag and connStrFlag use
generic names ("driver", "connection") that can collide with flags from other
test packages; change their flag.String key names to a namespaced prefix (for
example "integration.driver" and "integration.connection") and update the
corresponding usage/help strings to reflect the new names so consumers know the
namespace, ensuring the variables driverFlag and connStrFlag remain but the
first argument to flag.String is the prefixed name to avoid init-time conflicts.
- Around line 198-214: AssertIDSet mutates the caller's slices by calling
sort.Strings on the passed-in got and expected; make shallow copies first (e.g.,
gotCopy and expectedCopy using make and copy) and call sort.Strings on those
copies, then perform the length check and element comparisons against the copies
so the original slices passed into AssertIDSet are not modified.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2ec080fd-87b7-4eed-8415-8aeed18b1735

📥 Commits

Reviewing files that changed from the base of the PR and between 2121e36 and f6372ea.

📒 Files selected for processing (5)

integration/harness.go
integration/match_test.go
integration/shortest_path_test.go
integration/traversal_test.go
opengraph/load.go

✅ Files skipped from review due to trivial changes (1)

integration/shortest_path_test.go

🚧 Files skipped from review as they are similar to previous changes (1)

opengraph/load.go

integration/harness.go

integration/match_test.go

integration/traversal_test.go

Support loading arbitrary opengraph datasets by deriving the schema from the dataset's node/edge kinds instead of hardcoding them. Add batch edge writes for performance and a flag-driven test for large local datasets.

…opengraph doc and one to run benchmarks

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (7)

integration/shortest_path_test.go (1)

57-60: Direct map indexing may produce invalid queries for missing fixture IDs.

idMap[tt.start] and idMap[tt.end] return 0 if the keys don't exist, creating queries with id(s) = 0 that may silently match nothing or an unintended node. For controlled test fixtures this is low-risk, but adding validation would catch dataset/test case mismatches early.

♻️ Optional validation

+			startID, ok := idMap[tt.start]
+			if !ok {
+				t.Fatalf("fixture ID %q not found in dataset %q", tt.start, tt.dataset)
+			}
+			endID, ok := idMap[tt.end]
+			if !ok {
+				t.Fatalf("fixture ID %q not found in dataset %q", tt.end, tt.dataset)
+			}
+
 			cypher := fmt.Sprintf(
 				"MATCH p = allShortestPaths((s)-[:EdgeKind1*1..]->(e)) WHERE id(s) = %d AND id(e) = %d RETURN p",
-				idMap[tt.start], idMap[tt.end],
+				startID, endID,
 			)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@integration/shortest_path_test.go` around lines 57 - 60, The cypher string is
built using direct map indexing idMap[tt.start] and idMap[tt.end], which yields
0 for missing keys and can produce invalid queries; update the test to look up
values with the comma-ok form (e.g., startID, ok := idMap[tt.start]) and
similarly for endID, validate ok for both and call t.Fatalf or t.Fatalf-like
failure with a clear message if either is missing, then use the validated
startID/endID in fmt.Sprintf to build the cypher; reference idMap, tt.start,
tt.end, cypher and fmt.Sprintf when making the change.

integration/harness.go (1)

110-120: Use defer for file close to ensure cleanup on all exit paths.

Currently f.Close() is called inline after ParseDocument, but if ParseDocument panics, the file handle leaks. Using defer ensures the file is closed regardless of how the function exits.

♻️ Suggested fix

 	for _, name := range datasets {
 		f, err := os.Open(datasetPath(name))
 		if err != nil {
 			t.Fatalf("failed to open dataset %q for kind scanning: %v", name, err)
 		}
+		defer f.Close()

 		doc, err := opengraph.ParseDocument(f)
-		f.Close()
 		if err != nil {
 			t.Fatalf("failed to parse dataset %q: %v", name, err)
 		}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@integration/harness.go` around lines 110 - 120, The file opened with
os.Open(datasetPath(name)) must be closed via defer immediately after a
successful open to guarantee cleanup on all exit paths; update the loop around
opengraph.ParseDocument to call defer f.Close() right after the os.Open success,
and to avoid accumulating defers across iterations either wrap the per-dataset
logic in a short anonymous func (e.g., func() { f, err := os.Open(...); if err
!= nil { ... }; defer f.Close(); doc, err := opengraph.ParseDocument(f); ...
}()) or explicitly close f on every return/error path—modify the loop that uses
os.Open, datasetPath, opengraph.ParseDocument and f.Close accordingly.

cmd/benchmark/scenarios.go (1)

80-99: Missing key validation in query builders may produce invalid queries.

shortestPathQuery, traversalQuery, and matchReturnQuery use direct map indexing (idMap[start], idMap[end]) without checking if the keys exist. If a key is missing, the query will use 0 as the node ID, which may silently match nothing or an unintended node.

In contrast, phantomScenarios (lines 188, 218-219) validates key presence before building queries. Apply the same pattern here for consistency and to fail fast on misconfigured scenarios.
♻️ Suggested validation pattern
 func shortestPathQuery(idMap opengraph.IDMap, start, end string) func(tx graph.Transaction) error {
+	startID, startOK := idMap[start]
+	endID, endOK := idMap[end]
+	if !startOK || !endOK {
+		return func(tx graph.Transaction) error {
+			return fmt.Errorf("missing fixture ID: start=%q (found=%v), end=%q (found=%v)", start, startOK, end, endOK)
+		}
+	}
 	return cypherQuery(fmt.Sprintf(
 		"MATCH p = allShortestPaths((s)-[*1..]->(e)) WHERE id(s) = %d AND id(e) = %d RETURN p",
-		idMap[start], idMap[end],
+		startID, endID,
 	))
 }
Apply similar changes to traversalQuery and matchReturnQuery.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/benchmark/scenarios.go` around lines 80 - 99, shortestPathQuery,
traversalQuery, and matchReturnQuery currently index idMap[start] / idMap[end]
without checking presence; update each builder to validate keys like
phantomScenarios does and fail fast: check for start (and end in
shortestPathQuery) using the comma-ok idiom (e.g., id, ok := idMap[start]); if a
key is missing return/build a failing func or panic with a clear message (e.g.,
"missing idMap entry for start/ end") instead of using zero, so the generated
cypherQuery isn't silently invalid. Ensure you reference and modify the
functions shortestPathQuery, traversalQuery, and matchReturnQuery accordingly.

cmd/benchmark/main.go (3)

125-125: Use filepath.Join for cross-platform path construction.

String concatenation with / may not work correctly on Windows.

♻️ Proposed fix

Add import:

import "path/filepath"

Then:

-		path := *datasetDir + "/" + ds + ".json"
+		path := filepath.Join(*datasetDir, ds+".json")

Apply similarly in scanKinds at line 178.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cmd/benchmark/main.go` at line 125, Replace the manual string concat that
builds file paths (the assignment to variable "path" using *datasetDir + "/" +
ds + ".json") with filepath.Join to ensure cross-platform correctness; add the
import "path/filepath" and change the construction to join *datasetDir and ds +
".json" (and make the same change inside the scanKinds function where paths are
built) so Windows path separators are handled correctly.

174-196: Silent errors in scanKinds could mask configuration issues.

If all dataset files fail to open or parse (e.g., wrong datasetDir path), the function returns empty kinds, leading to a schema with no node/edge types. Consider logging failures to stderr for debugging.

♻️ Proposed fix

 func scanKinds(datasetDir string, datasets []string) (graph.Kinds, graph.Kinds) {
 	var nodeKinds, edgeKinds graph.Kinds
 
 	for _, ds := range datasets {
 		path := datasetDir + "/" + ds + ".json"
 		f, err := os.Open(path)
 		if err != nil {
+			fmt.Fprintf(os.Stderr, "warning: could not open %s: %v\n", path, err)
 			continue
 		}
 
 		doc, err := opengraph.ParseDocument(f)
 		f.Close()
 		if err != nil {
+			fmt.Fprintf(os.Stderr, "warning: could not parse %s: %v\n", path, err)
 			continue
 		}
 
 		nk, ek := doc.Graph.Kinds()
 		nodeKinds = nodeKinds.Add(nk...)
 		edgeKinds = edgeKinds.Add(ek...)
 	}
 
 	return nodeKinds, edgeKinds
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cmd/benchmark/main.go` around lines 174 - 196, The scanKinds function
currently swallows file-open and parse errors which can produce an empty schema
silently; update scanKinds to log failures to stderr (including the dataset path
and the error) when os.Open or opengraph.ParseDocument returns an error so
configuration/path issues are visible. Specifically, inside scanKinds where
os.Open(path) and opengraph.ParseDocument(f) now just continue, add logging
calls that print the path and err (using fmt.Fprintf(os.Stderr, ...) or
log.Printf) before continuing; keep returning nodeKinds and edgeKinds as before
but ensure every open/parse error for the given dataset is reported for
debugging.

50-56: Consider a driver-agnostic env var name.

PG_CONNECTION_STRING is used as fallback even when -driver=neo4j. A generic name like DB_CONNECTION_STRING would be clearer, or add separate env vars per driver.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cmd/benchmark/main.go` around lines 50 - 56, The code uses
PG_CONNECTION_STRING as the fallback for conn (derived from connStr) even when a
different driver is selected; update the fallback logic in the block that
initializes conn (the conn := *connStr section) to use a driver-agnostic env var
(e.g., DB_CONNECTION_STRING) or to check driver-specific env vars based on the
-driver flag (inspect the driver flag variable, then try DRIVER-specific env var
like NEO4J_CONNECTION_STRING before falling back to DB_CONNECTION_STRING);
ensure the fatal(...) message remains accurate if no connection string is found.

cmd/benchmark/report.go (1)

34-56: Consider handling or removing the error return.

writeMarkdown always returns nil, making the error return type misleading. Either propagate write errors from fmt.Fprintf or simplify the signature.

♻️ Option A: Propagate write errors

 func writeMarkdown(w io.Writer, r Report) error {
-	fmt.Fprintf(w, "# Benchmarks — %s @ %s (%s, %d iterations)\n\n", r.Driver, r.GitRef, r.Date, r.Iterations)
-	fmt.Fprintf(w, "| Query | Dataset | Median | P95 | Max |\n")
-	fmt.Fprintf(w, "|-------|---------|-------:|----:|----:|\n")
+	if _, err := fmt.Fprintf(w, "# Benchmarks — %s @ %s (%s, %d iterations)\n\n", r.Driver, r.GitRef, r.Date, r.Iterations); err != nil {
+		return err
+	}
+	if _, err := fmt.Fprintf(w, "| Query | Dataset | Median | P95 | Max |\n"); err != nil {
+		return err
+	}
+	if _, err := fmt.Fprintf(w, "|-------|---------|-------:|----:|----:|\n"); err != nil {
+		return err
+	}
 
 	for _, res := range r.Results {
 		// ...
+		if _, err := fmt.Fprintf(w, "| %s | %s | %s | %s | %s |\n", ...); err != nil {
+			return err
+		}
 	}
 
-	fmt.Fprintln(w)
+	_, err := fmt.Fprintln(w)
+	return err
-	return nil
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cmd/benchmark/report.go` around lines 34 - 56, The function writeMarkdown
currently declares an error return but always returns nil; update writeMarkdown
to either propagate write errors from the fmt.Fprintf / fmt.Fprintln calls
(capture each call's returned error and return it on first failure, e.g., if err
:= fmt.Fprintf(...); err != nil { return err }) or remove the error return from
the signature and return nothing; locate writeMarkdown and the fmt.Fprintf /
fmt.Fprintln invocations to implement the chosen fix (ensure callers are updated
if you change the signature).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/benchmark/runner.go`:
- Around line 67-77: computeStats currently assumes durations is non-empty and
will panic on index access if called with an empty slice; update computeStats to
check n := len(durations) and if n == 0 return a zero-valued Stats (e.g.,
Stats{}) immediately, otherwise proceed with the existing sort and index-based
Median/P95/Max computation so the function never panics when given an empty
slice.

In `@integration/shortest_path_test.go`:
- Around line 53-65: The tests are accumulating data because each subtest calls
LoadDataset without clearing prior data; update the subtest body inside the
t.Run to invoke ClearGraph (the existing ClearGraph helper) before calling
LoadDataset so the DB is reset per subtest (i.e., call ClearGraph with the same
context/db parameters used by LoadDataset and QueryPaths at the start of the
anonymous func in the for loop). Ensure the ClearGraph call uses the correct
signature (ctx, db or t, db, ctx as appropriate) so idMap from LoadDataset
refers to a fresh graph for each tt.

In `@opengraph/load.go`:
- Around line 67-103: The current Load implementation writes nodes via
db.WriteTransaction and then creates edges via db.BatchOperation which uses a
separate transaction, risking orphan nodes if edge creation fails; update Load
so node creation and edge creation occur in the same transactional context
(e.g., perform both node CreateNode and relationship creation using
CreateRelationshipByIDs inside the same db.WriteTransaction callback or, if the
driver exposes a way to obtain a batch from the active graph.Transaction, reuse
that instead of calling db.BatchOperation), ensuring you still use
graph.Transaction, CreateNode and CreateRelationshipByIDs and preserve the
nodeMap mapping; alternatively, if you keep separate calls, add a clear comment
in Load documenting that partial writes are possible and acceptable for
test-only usage.

---

Nitpick comments:
In `@cmd/benchmark/main.go`:
- Line 125: Replace the manual string concat that builds file paths (the
assignment to variable "path" using *datasetDir + "/" + ds + ".json") with
filepath.Join to ensure cross-platform correctness; add the import
"path/filepath" and change the construction to join *datasetDir and ds + ".json"
(and make the same change inside the scanKinds function where paths are built)
so Windows path separators are handled correctly.
- Around line 174-196: The scanKinds function currently swallows file-open and
parse errors which can produce an empty schema silently; update scanKinds to log
failures to stderr (including the dataset path and the error) when os.Open or
opengraph.ParseDocument returns an error so configuration/path issues are
visible. Specifically, inside scanKinds where os.Open(path) and
opengraph.ParseDocument(f) now just continue, add logging calls that print the
path and err (using fmt.Fprintf(os.Stderr, ...) or log.Printf) before
continuing; keep returning nodeKinds and edgeKinds as before but ensure every
open/parse error for the given dataset is reported for debugging.
- Around line 50-56: The code uses PG_CONNECTION_STRING as the fallback for conn
(derived from connStr) even when a different driver is selected; update the
fallback logic in the block that initializes conn (the conn := *connStr section)
to use a driver-agnostic env var (e.g., DB_CONNECTION_STRING) or to check
driver-specific env vars based on the -driver flag (inspect the driver flag
variable, then try DRIVER-specific env var like NEO4J_CONNECTION_STRING before
falling back to DB_CONNECTION_STRING); ensure the fatal(...) message remains
accurate if no connection string is found.

In `@cmd/benchmark/report.go`:
- Around line 34-56: The function writeMarkdown currently declares an error
return but always returns nil; update writeMarkdown to either propagate write
errors from the fmt.Fprintf / fmt.Fprintln calls (capture each call's returned
error and return it on first failure, e.g., if err := fmt.Fprintf(...); err !=
nil { return err }) or remove the error return from the signature and return
nothing; locate writeMarkdown and the fmt.Fprintf / fmt.Fprintln invocations to
implement the chosen fix (ensure callers are updated if you change the
signature).

In `@cmd/benchmark/scenarios.go`:
- Around line 80-99: shortestPathQuery, traversalQuery, and matchReturnQuery
currently index idMap[start] / idMap[end] without checking presence; update each
builder to validate keys like phantomScenarios does and fail fast: check for
start (and end in shortestPathQuery) using the comma-ok idiom (e.g., id, ok :=
idMap[start]); if a key is missing return/build a failing func or panic with a
clear message (e.g., "missing idMap entry for start/ end") instead of using
zero, so the generated cypherQuery isn't silently invalid. Ensure you reference
and modify the functions shortestPathQuery, traversalQuery, and matchReturnQuery
accordingly.

In `@integration/harness.go`:
- Around line 110-120: The file opened with os.Open(datasetPath(name)) must be
closed via defer immediately after a successful open to guarantee cleanup on all
exit paths; update the loop around opengraph.ParseDocument to call defer
f.Close() right after the os.Open success, and to avoid accumulating defers
across iterations either wrap the per-dataset logic in a short anonymous func
(e.g., func() { f, err := os.Open(...); if err != nil { ... }; defer f.Close();
doc, err := opengraph.ParseDocument(f); ... }()) or explicitly close f on every
return/error path—modify the loop that uses os.Open, datasetPath,
opengraph.ParseDocument and f.Close accordingly.

In `@integration/shortest_path_test.go`:
- Around line 57-60: The cypher string is built using direct map indexing
idMap[tt.start] and idMap[tt.end], which yields 0 for missing keys and can
produce invalid queries; update the test to look up values with the comma-ok
form (e.g., startID, ok := idMap[tt.start]) and similarly for endID, validate ok
for both and call t.Fatalf or t.Fatalf-like failure with a clear message if
either is missing, then use the validated startID/endID in fmt.Sprintf to build
the cypher; reference idMap, tt.start, tt.end, cypher and fmt.Sprintf when
making the change.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6f6f9137-eb21-4d6c-acf7-aa8d2ab33c2c

📥 Commits

Reviewing files that changed from the base of the PR and between f6372ea and 08ee9c0.

📒 Files selected for processing (15)

cmd/benchmark/README.md
cmd/benchmark/main.go
cmd/benchmark/report.go
cmd/benchmark/runner.go
cmd/benchmark/scenarios.go
cmd/export/README.md
cmd/export/main.go
integration/BENCHMARKS.md
integration/harness.go
integration/local_dataset_test.go
integration/match_test.go
integration/shortest_path_test.go
integration/traversal_test.go
opengraph/load.go
opengraph/types.go

✅ Files skipped from review due to trivial changes (4)

integration/BENCHMARKS.md
cmd/export/README.md
cmd/benchmark/README.md
integration/traversal_test.go

cmd/benchmark/runner.go

integration/shortest_path_test.go

opengraph/load.go

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@integration/cypher_test.go`:
- Around line 153-184: The parseAssertion function currently ignores
json.Unmarshal errors for object-valued assertions and allows multi-key
assertion objects; update parseAssertion to (1) enforce the object has exactly
one key (fail with t.Fatalf if len(obj) != 1) and (2) check and
propagate/unhandled json.Unmarshal errors for each case ("row_count",
"at_least_int", "exact_int", "contains_node_with_prop") — if unmarshalling val
into n, n int/int64, or pair [2]string fails, call t.Fatalf with the json error;
on success return the existing assertRowCount, assertAtLeastInt64,
assertExactInt64, or assertContainsNodeWithProp results as before.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 34f58fde-8c32-4a3b-9121-a16e3cdfd96e

📥 Commits

Reviewing files that changed from the base of the PR and between 08ee9c0 and 14098d4.

📒 Files selected for processing (24)

integration/cypher_test.go
integration/testdata/base.json
integration/testdata/cases/aggregation.json
integration/testdata/cases/aggregation_inline.json
integration/testdata/cases/delete.json
integration/testdata/cases/delete_inline.json
integration/testdata/cases/expansion.json
integration/testdata/cases/expansion_inline.json
integration/testdata/cases/multipart.json
integration/testdata/cases/multipart_inline.json
integration/testdata/cases/nodes.json
integration/testdata/cases/nodes_inline.json
integration/testdata/cases/pattern_binding.json
integration/testdata/cases/pattern_binding_inline.json
integration/testdata/cases/quantifiers.json
integration/testdata/cases/quantifiers_inline.json
integration/testdata/cases/shortest_paths.json
integration/testdata/cases/shortest_paths_inline.json
integration/testdata/cases/stepwise.json
integration/testdata/cases/stepwise_inline.json
integration/testdata/cases/update.json
integration/testdata/cases/update_inline.json
integration/testdata/quantifier.json
opengraph/load.go

✅ Files skipped from review due to trivial changes (21)

integration/testdata/quantifier.json
integration/testdata/cases/update.json
integration/testdata/cases/shortest_paths.json
integration/testdata/cases/delete.json
integration/testdata/base.json
integration/testdata/cases/multipart.json
integration/testdata/cases/pattern_binding.json
integration/testdata/cases/delete_inline.json
integration/testdata/cases/aggregation_inline.json
integration/testdata/cases/aggregation.json
integration/testdata/cases/shortest_paths_inline.json
integration/testdata/cases/expansion_inline.json
integration/testdata/cases/quantifiers.json
integration/testdata/cases/update_inline.json
integration/testdata/cases/multipart_inline.json
integration/testdata/cases/pattern_binding_inline.json
integration/testdata/cases/nodes_inline.json
integration/testdata/cases/stepwise_inline.json
integration/testdata/cases/expansion.json
integration/testdata/cases/stepwise.json
integration/testdata/cases/quantifiers_inline.json

🚧 Files skipped from review as they are similar to previous changes (1)

opengraph/load.go

integration/cypher_test.go

ddlees

LGTM. I understand the impetus for this separate benchmark utility; that said, my only feedback is that we should consider documenting clear guidelines for when contributors should throw scenarios into the benchmark utility versus the built-in Go bench-testing platform. E.g. measuring allocs, cpu/memory profiling, pprof, etc.

kpom-specter added 4 commits March 24, 2026 16:49

feat(opengraph): add graph serialization package with load, export, a…

dafd4a4

…nd validation

fix(opengraph): simplify Load API and fix integration test setup

2c24c30

feat: add integration test harness with driver-agnostic dataset loading

fdf3e02

feat: add driver-agnostic integration test harness with dataset loading

218ab0f

coderabbitai bot reviewed Mar 25, 2026

View reviewed changes

integration/harness.go Outdated Show resolved Hide resolved

opengraph/load.go Show resolved Hide resolved

opengraph/load.go Show resolved Hide resolved

opengraph/opengraph_integration_test.go Show resolved Hide resolved

coderabbitai bot reviewed Mar 25, 2026

View reviewed changes

integration/match_test.go Outdated Show resolved Hide resolved

refactor: extract QueryNodeIDs helper and remove ASPQuery from harness

f6372ea

kpom-specter force-pushed the kpom/safety-harnesses branch from 2121e36 to f6372ea Compare March 25, 2026 19:07

coderabbitai bot reviewed Mar 25, 2026

View reviewed changes

integration/harness.go Show resolved Hide resolved

integration/match_test.go Outdated Show resolved Hide resolved

integration/traversal_test.go Outdated Show resolved Hide resolved

integration/traversal_test.go Outdated Show resolved Hide resolved

kpom-specter added 2 commits March 30, 2026 14:09

feat(cmd/export, cmd/benchmark): Added two new endpoints to export a …

08ee9c0

…opengraph doc and one to run benchmarks

coderabbitai bot reviewed Mar 30, 2026

View reviewed changes

cmd/benchmark/runner.go Show resolved Hide resolved

integration/shortest_path_test.go Outdated Show resolved Hide resolved

opengraph/load.go Show resolved Hide resolved

(chore): REAL test cases

14098d4

coderabbitai bot reviewed Mar 31, 2026

View reviewed changes

integration/cypher_test.go Show resolved Hide resolved

(chore): Update benchmarks to use current datasets

7a759ed

kpom-specter changed the title ~~Kpom/safety harnesses~~ [PQE-401] Kpom/safety harnesses Apr 1, 2026

ddlees approved these changes Apr 1, 2026

View reviewed changes

kpom-specter merged commit f289bda into main Apr 1, 2026
2 checks passed

Conversation

kpom-specter commented Mar 25, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Add opengraph package and integration test harness

Summary

What this enables

opengraph package

integration test harness

Test suites (5 suites, 23 tests)

Running

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai bot commented Mar 25, 2026 • edited by atlassian bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ddlees left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

kpom-specter commented Mar 25, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 25, 2026 •

edited by atlassian bot

Loading