harden redis cache against stalled connections by efstajas · Pull Request #1929 · drips-network/app

efstajas · 2026-06-28T08:32:34Z

Follow-up to today's Filecoin app incident.

The Filecoin app instance's Redis connection went half-open: Railway drops idle internal TCP connections, and a low-traffic deployment like Filecoin lets the socket sit idle long enough to get dropped. node-redis kept sending commands into the dead socket with no reply, so every cache read — explore page and project pages alike — hung for tens of seconds to minutes, blocking SSR and tripping the health check into 500s. Mainnet was unaffected because its constant traffic keeps the socket warm. Evicting the cache key didn't help (the value was never the problem); a restart fixed it by re-establishing the connection.

Two changes so a bad connection can't take a deployment down again:

redis.ts: add pingInterval: 10000 so the client PINGs on idle and detects/reconnects a dead socket instead of queueing commands into the void. This is the actual root-cause fix.
cached.ts: bound the cache read with a 1s timeout and fall through to the fetcher on timeout/error, so a degraded cache makes pages a bit slower rather than hanging them. Also stops silently swallowing write failures.
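The cached.ts change described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual diff: the `CacheClient` interface, `withTimeout`, and the `cached` signature are assumptions made for the example, and the real helper likely differs in shape (serialization, TTLs, key handling).

```typescript
// Hypothetical minimal cache interface; in the app this role is played by the node-redis client.
interface CacheClient {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<unknown>;
}

// 1s read budget, as described in the PR text.
const READ_TIMEOUT_MS = 1000;

// Rejects if `promise` does not settle within `ms`; always clears its timer.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

async function cached<T>(
  cache: CacheClient,
  key: string,
  fetcher: () => Promise<T>,
  timeoutMs: number = READ_TIMEOUT_MS,
): Promise<T> {
  try {
    // Bounded read: a stalled connection costs at most `timeoutMs`.
    const hit = await withTimeout(cache.get(key), timeoutMs);
    if (hit !== null) return JSON.parse(hit) as T;
  } catch (err) {
    // Timeout, connection error, or unparsable entry: fall through to the fetcher.
    console.warn(`cache read failed for ${key}, falling back to fetcher`, err);
  }

  const fresh = await fetcher();

  // Fire-and-forget write; failures are logged instead of silently swallowed.
  cache.set(key, JSON.stringify(fresh)).catch((err) => {
    console.error(`cache write failed for ${key}`, err);
  });

  return fresh;
}
```

With a cache whose `get` never settles, `cached` resolves with the fetcher's value after roughly `timeoutMs` instead of hanging the request.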

Net effect: a stalled cache now means slightly slower uncached pages, not a downed app.

One thing left deliberately out of scope: a few endpoints (api/tlv, api/projects, fiat price, embed) still do direct redis.get reads outside cached(). pingInterval protects them from the indefinite-wedge failure mode too, but wrapping them in the same timeout helper would be a reasonable follow-up.
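For the follow-up mentioned above, one possible shape for guarding a direct read is a helper that treats a timeout or error as a cache miss. The name `readOrNull` and its signature are hypothetical and not part of this PR; the 1s default simply mirrors the timeout used in cached.ts.

```typescript
// Hypothetical helper: resolves to null if the read fails or takes longer than `timeoutMs`.
async function readOrNull<T>(
  read: () => Promise<T | null>,
  timeoutMs: number = 1000,
): Promise<T | null> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timedOut = new Promise<null>((resolve) => {
    timer = setTimeout(() => resolve(null), timeoutMs);
  });
  try {
    // Whichever settles first wins; a stalled read loses to the timer.
    return await Promise.race([read(), timedOut]);
  } catch (err) {
    console.warn('direct cache read failed', err);
    return null;
  } finally {
    clearTimeout(timer);
  }
}
```

An endpoint could then call something like `await readOrNull(() => redis.get(key))` and treat `null` as a miss, where `redis` and `key` stand in for whatever the endpoint already uses.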

Copilot

Pull request overview

This PR hardens the server-side Redis cache against half-open/stalled connections that can otherwise hang SSR and trigger health-check failures (as seen in the Filecoin deployment), ensuring cache degradation falls back to fresh fetches instead of wedging requests.

Changes:

Configure the Redis client to proactively PING on an idle interval (pingInterval: 10000) to detect and reconnect dead sockets.
Bound Redis cache reads with a short timeout (1s) and fall back to the fetcher on timeout/error.
Stop silently swallowing Redis write failures by logging async set() errors.
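As a rough illustration of the first change, the client options might look like the fragment below. It is shown as a plain object so it has no dependencies; the URL is a placeholder assumption, and the surrounding wiring in redis.ts is not shown in this PR page.

```typescript
// Options intended for node-redis's createClient(); only pingInterval comes from the PR.
const redisOptions = {
  // Placeholder URL (assumption); the real value comes from the deployment's environment.
  url: 'redis://localhost:6379',
  // Send a PING every 10 seconds so an idle or half-open socket is noticed
  // and the client can reconnect, rather than commands queueing with no reply.
  pingInterval: 10_000,
};

// Assumed wiring, for context only:
//   import { createClient } from 'redis';
//   const redis = createClient(redisOptions);
//   redis.on('error', (err) => console.error('redis client error', err));
//   await redis.connect();
```

The 10-second interval presumably needs to be shorter than the platform's idle-connection cutoff so the socket never sits idle long enough to be dropped.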

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File	Description
src/routes/api/redis.ts	Adds Redis `pingInterval` configuration to detect/recover from idle-dropped TCP connections.
src/lib/utils/cache/remote/cached.ts	Adds a read timeout + error fallback for cache reads and logs cache write failures to avoid SSR hangs.


harden redis cache against stalled connections

4264f6b

efstajas requested a review from Copilot June 28, 2026 08:34

Copilot started reviewing on behalf of efstajas June 28, 2026 08:34

Copilot AI reviewed Jun 28, 2026

efstajas wants to merge 1 commit into main from harden-redis-cache-against-stalled-connections
