feat: integrate sproink Rust spreading activation via CGO#268
feat: integrate sproink Rust spreading activation via CGO#268nvandessel merged 33 commits intomainfrom
Conversation
Add Activator interface to decouple spreading activation consumers from the concrete Engine implementation, enabling the upcoming NativeEngine (sproink CGO) to be used interchangeably. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements IDMap for the sproink FFI layer. Provides GetOrAssign (sequential ID assignment), ToU32/ToUUID (bidirectional lookup), and Len. Includes TDD tests for roundtrip, contiguity, idempotency, unknown-UUID lookup, out-of-range panic, and length tracking. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Map floop store.EdgeKind values to sproink uint8 constants: conflicts→1, overrides/deprecated-to/merged-into→2, feature-affinity→4, everything else→0 (positive). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add GetAllEdges for bulk edge retrieval (sproink CSR graph building) and atomic Version counter for stale-graph detection. MultiGraphStore delegates both to localStore. All mutating methods bump version via sync/atomic. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Thin Go wrappers around all sproink C functions: graph build/free, activate, results extraction, co-activation pairs, and Oja update. The nocgo stub provides compile-time Activator conformance with runtime errors when CGO is disabled. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Loads edges from store, builds IDMap with isolated node support, pre-materializes deduplicated affinity edges, applies temporal decay, and calls sproink_graph_build via CGO FFI. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…proink FFI NativeEngine wraps the sproink CSR graph for spreading activation. It detects store staleness via version checks before acquiring a read lock, rebuilds the graph transparently, and maps FFI results back to UUIDs with seed-source attribution and activation-descending sort. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Verify that non-CGO builds return expected errors from NativeEngine constructor and all methods, ensuring clean build separation between CGO and non-CGO paths. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…patch Construct spreading Activator once at server startup (NativeEngine via sproink FFI when ExtendedGraphStore is available, pure-Go Engine as fallback). Replace per-call engine creation in handleFloopActive with s.activator.Activate(). Close NativeEngine via io.Closer assertion in Server.Close(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement high-level FFI wrappers for sproink's co-activation pair extraction and Oja weight update. Both wrappers are tested for parity with Go-side implementations. Handler keeps Go-side functions per plan. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Verify both engines produce identical results for the same inputs. Tests cover positive graphs, conflict suppression, directional suppressive edges (overrides/merged-into), empty seeds, single seed with no neighbors, inhibition toggle, and multi-seed overlap. MinActivation raised to 0.05 in test config to avoid sigmoid(0) phantom entries. Rank order comparison tolerates tied activations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
NativeEngine and Go Engine can produce slightly different results for nodes near MinActivation threshold due to propagation order differences (Go uses map iteration; sproink uses CSR with bidirectional storage). Nodes present in only one result set with activation below 2x MinActivation are now tolerated as boundary effects. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #268 +/- ##
==========================================
- Coverage 78.81% 78.75% -0.06%
==========================================
Files 186 189 +3
Lines 18646 18779 +133
==========================================
+ Hits 14695 14789 +94
- Misses 2731 2757 +26
- Partials 1220 1233 +13 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Greptile SummaryThis PR replaces per-node SQLite edge queries with a single FFI call into a Rust CSR graph (sproink) for spreading activation, introducing a Confidence Score: 5/5
|
| Filename | Overview |
|---|---|
| internal/spreading/sproink_cgo.go | Core NativeEngine implementation: all previously flagged issues (version race, TOCTOU, double-free, thundering herd) are fixed; one P2 remains — IDMap.ToUUID panics on out-of-range FFI node IDs without a defensive bounds check. |
| internal/spreading/graph_loader.go | Bulk graph loading with temporal decay, affinity edge materialisation, and empty-store handling; correct TOCTOU-safe pattern (version snapshot before reading edges) applied by callers. |
| internal/store/multi.go | GetAllEdges and Version() now merge local + global stores; Version() uses addition (monotonically non-decreasing), global GetAllEdges errors are propagated rather than silently swallowed. |
| internal/store/sqlite.go | Adds GetAllEdges, Version(), and bumpVersion(); all write paths (AddNode, UpdateNode, DeleteNode, AddEdge, RemoveEdge) correctly bump the atomic version counter. |
| internal/mcp/server.go | NativeEngine wired into Server; io.Closer Close() called on shutdown after workerWg.Wait() ensuring no active activations race with cleanup. |
| .github/workflows/ci.yml | All prior CI cache-logic bugs fixed; one P2 remains — cache key based on sproink.h hash does not detect Rust source-only changes, meaning a bug-fix in sproink without a header change reuses the stale cached binary. |
| internal/spreading/idmap.go | Clean bidirectional UUID ↔ uint32 mapping; ToUUID intentionally panics on out-of-range — defensive callers needed at FFI boundary. |
| internal/store/sqlite_extended.go | TouchEdges and BatchUpdateEdgeWeights now correctly bump version; PruneWeakEdges also bumped. |
| internal/spreading/sproink_nocgo.go | Correct stub for CGO_ENABLED=0 builds; NewNativeEngine returns error, activator falls back to pure-Go Engine. |
| internal/spreading/sproink_pairs_cgo.go | FFI pair extraction and Oja update wrappers; nativeActivateAndExtractPairs correctly holds RLock before accessing idmap/graph; NativeExtractPairs exported but parameter type (C.SproinkResults) is inaccessible from outside the package. |
| internal/spreading/parity_test.go | 8 parity scenarios with appropriate epsilon tolerance; boundary zone handling for nodes near MinActivation threshold is well-reasoned. |
Sequence Diagram
sequenceDiagram
participant H as handler_active
participant NE as NativeEngine
participant S as Store (Multi/SQLite)
participant FF as sproink FFI (Rust)
H->>NE: Activate(ctx, seeds)
NE->>S: Version() [atomic]
alt "store version > engine version"
NE->>NE: Rebuild(ctx) [write lock]
NE->>S: Version() [snapVersion]
NE->>S: GetAllEdges(ctx)
S-->>NE: []Edge (local + global)
NE->>S: "QueryNodes(kind=behavior)"
S-->>NE: []Node (isolated nodes)
NE->>FF: sproink_graph_build(numNodes, edges, weights, kinds)
FF-->>NE: "*SproinkGraph (CSR)"
NE->>NE: atomic.Store(version, snapVersion)
end
NE->>NE: RLock
NE->>NE: map seeds → u32 via IDMap
NE->>FF: sproink_activate(graph, seeds, params)
FF-->>NE: "*SproinkResults"
NE->>NE: map u32 results → UUID via IDMap
NE->>NE: RUnlock
NE-->>H: []Result (sorted by activation desc)
H->>S: TouchEdges + Hebbian learning (Go-side)
Reviews (20): Last reviewed commit: "fix(test): stabilize hybrid scorer tests..." | Re-trigger Greptile
- P1: Use sync/atomic for NativeEngine.version to prevent data race between Activate (read) and Rebuild (write) - P1: MultiGraphStore.GetAllEdges now merges local + global store edges so NativeEngine sees the same graph as the pure-Go Engine - P1: MultiGraphStore.Version sums both stores' versions for staleness detection across both scopes - P2: Rebuild re-checks version after acquiring write lock to prevent thundering herd (N goroutines observe staleness, only first rebuilds) - Widen parity test epsilon for mixed conflict+positive edges (different propagation order causes different suppression magnitudes) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Set e.graph = nil immediately after sproinkGraphFree in Rebuild to prevent double-free if loadGraph fails on retry - Add rows.Err() check after GetAllEdges iteration loop Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- P1: Capture version snapshot BEFORE loadGraph in Rebuild to prevent racing writes from being silently missed (TOCTOU fix) - P2: Add bumpVersion() to TouchEdges so last_activated updates trigger CSR graph rebuilds with fresh temporal decay values Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Commit third_party/sproink/include/sproink.h (C header needed for CGO builds; binaries remain gitignored) - Fix nil context lint errors in nocgo tests (use context.TODO()) - Add #cgo CFLAGS to sproink_pairs_cgo.go for independent compilation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Two issues found — one correctness bug, one dead code cleanup.
P1 — TOCTOU in NewNativeEngine (same class of bug as the one fixed in Rebuild): The Rebuild method correctly captures snapVersion before calling loadGraph, but NewNativeEngine still calls s.Version() after loadGraph returns. If a concurrent write lands between GetAllEdges inside loadGraph and the s.Version() call, the engine is initialised with a stale graph but a version counter that already reflects the post-write state. The next Activate call will then observe store.Version() <= e.version and skip the rebuild, silently serving a stale graph until another write arrives.
P2 — Dead else-branch in activator selection: *MultiGraphStore now implements ExtendedGraphStore in full (it gained GetAllEdges and Version() in this PR). The store.GraphStore(graphStore).(store.ExtendedGraphStore) assertion will therefore always succeed, making the else branch (activator = spreading.NewEngine(graphStore, spreadConfig)) unreachable dead code. The intermediate store.GraphStore(...) widening conversion is also redundant.
Capture store version before loadGraph (same pattern as Rebuild) to prevent a racing write from being silently missed at startup. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Build libsproink.a from source in the test-cgo CI job so CGO+LanceDB tests can link against it. Uses dtolnay/rust-toolchain action and caches the Rust build artifacts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Handle Windows Rust output (sproink.lib → libsproink.a) in CI - Make TokenStats test engine-agnostic: measure seed baseline dynamically then verify user behaviors add expected deltas, rather than hardcoding absolute counts that depend on which engine activates Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rust defaults to MSVC on Windows but Go CGO uses MinGW. Build sproink with --target x86_64-pc-windows-gnu to produce a MinGW-compatible static library. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prevents git clone failure on cache hit when sproink-build/target/ is restored but libsproink.a hasn't been copied to third_party/ yet. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Check for .git subdir (not just directory) before cloning sproink, since actions/cache restores target/ creating a non-git directory - Add --allow-multiple-definition for Windows to resolve duplicate rust_eh_personality symbols when linking both sproink and lancedb static libraries (both bundle Rust stdlib) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The build guard was checking third_party/sproink/lib/.../libsproink.a which is never cached (only sproink-build/target/ is). On cache hit, this always re-entered the build block where cargo build failed because only target/ was restored without Cargo.toml and src/. Now checks the cached artifact path directly and always copies to third_party/ after build or cache restore. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cache restores sproink-build/target/ but not .git, so the .git guard always triggered rm + fresh clone, defeating the cache. Now checks for Cargo.toml which exists after clone and won't conflict with cache restore of target/. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
On cache hit, sproink-build/ exists (containing target/) but has no source files. git clone fails because the directory is non-empty. Now clones to a temp dir, preserves any cached target/, then renames. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…adation Silent degradation on global store errors taints the version snapshot in Rebuild — snapVersion includes the global counter but the graph is built from local-only edges. Subsequent Activate calls see version as current and permanently skip rebuild. Propagating the error forces a retry on the next activation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
time.Since(CreatedAt) can be 0 on Windows (15ms clock) when the behavior is created immediately before scoring, causing baseLevelScore to diverge between the two Score() calls in formula-verification tests. Set CreatedAt to 24h ago so age is large and stable regardless of clock resolution. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The test-release CGO build step was missing the sproink library that was added in PR #268. Without it the linker fails with "cannot find -lsproink". Adds Rust toolchain setup, build cache, and sproink compilation matching the pattern used in ci.yml. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Activatorinterface withNativeEngine(sproink via CGO) and existingEngine(pure-Go fallback)CGO_ENABLED=0or missing libsproinkGetAllEdgesandVersion()toExtendedGraphStorefor bulk graph loading and staleness detectionArchitecture
Key Design Decisions
Files (23 changed, +2381/-11)
activator.go,engine.go(conformance check)idmap.go,edgekind.gosproink_cgo.go,sproink_nocgo.go,sproink_pairs_cgo.gograph_loader.gostore.go,sqlite.go,sqlite_extended.go,multi.goserver.go,handler_active.goTest plan
CGO_ENABLED=1 go test ./internal/spreading/... -race— all passCGO_ENABLED=0 go test ./internal/spreading/...— nocgo stub tests passgo test ./internal/store/...— store tests passFollow-up beads
floop-s8l(P2) — CI: Add Rust toolchain for sproink buildsfloop-1wi(P3) — Wire FFI pair extraction into handlerfloop-5ba(P3) — Expose temporal decay + step snapshots via sproink FFIfloop-gkd(P3) — Performance benchmarks🤖 Generated with Claude Code