Skip to content

Replace wal_sender_timeout-based liveness with TCP keepalive.#373

Open
ibrarahmad wants to merge 2 commits intomainfrom
FEED_BACK
Open

Replace wal_sender_timeout-based liveness with TCP keepalive.#373
ibrarahmad wants to merge 2 commits intomainfrom
FEED_BACK

Conversation

@ibrarahmad
Copy link
Copy Markdown
Contributor

@ibrarahmad ibrarahmad commented Mar 4, 2026

The apply worker previously relied on wal_sender_timeout as both a server-side disconnect trigger and an indirect keepalive pressure on the subscriber. This caused spurious disconnects in two scenarios: a flood of 'w' messages keeping the subscriber too busy to send 'r' feedback in time, and large transactions whose apply time exceeded wal_sender_timeout.

The workaround was maybe_send_feedback(), which force-sent 'r' after every 10 'w' messages or wal_sender_timeout/2, whichever came first. This was a fragile band-aid that coupled subscriber behavior to a server GUC it cannot control.

Replace the entire mechanism with a clean two-layer model:

  • TCP keepalive (keepalives_idle=10, keepalives_interval=5, keepalives_count=3) is the primary liveness detector on both sides. A dead network or crashed host is detected in ~25 seconds.

  • wal_sender_timeout=0 is set on replication connections so the walsender never disconnects due to missing 'r' feedback. Liveness on the server side is now handled entirely by TCP keepalive.

  • spock.apply_idle_timeout (default 300s) is a subscriber-side safety net for a hung-but-connected walsender whose TCP keepalive probes are answered by the kernel but sends no data. Set to 0 to disable.

Fix a bug in last_receive_timestamp handling: it was updated unconditionally after every PQgetCopyData call, including when r==0 (no data available). Each 1-second WL_TIMEOUT spin silently reset the timer, making apply_idle_timeout never fire. Move the update to after the r==0 guard so it reflects actual data receipt only.

Remove maybe_send_feedback() as it is no longer needed.

SPOC-419

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 4, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 97c781cc-c795-448f-bde8-acdc5a68ac81

📥 Commits

Reviewing files that changed from the base of the PR and between e4c970d and ed05b18.

📒 Files selected for processing (4)
  • docs/configuring.md
  • include/spock.h
  • src/spock.c
  • src/spock_apply.c
💤 Files with no reviewable changes (1)
  • docs/configuring.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • include/spock.h

📝 Walkthrough

Walkthrough

Adds a new apply-worker idle-timeout GUC, exposes it in the header, adjusts replication connection parameters and TCP keepalive defaults, disables server-side wal_sender_timeout for replication connections, and refactors apply-worker timeout and feedback handling (removes internal feedback helper and updates idle-timer behavior).

Changes

Cohort / File(s) Summary
Header Export
include/spock.h
Removed extern int spock_feedback_frequency; and added extern int spock_apply_idle_timeout;.
GUC Configuration & Connection Setup
src/spock.c
Added int spock_apply_idle_timeout = 300 and registered spock.apply_idle_timeout GUC (seconds). Increased CONN_PARAM_ARRAY_SIZE from 9→10, adjusted TCP keepalive defaults (keepalives_idle 20→10, keepalives_interval 20→5, keepalives_count 5→3), and appended -c wal_sender_timeout=0 to replication connection options. Removed registration of spock.feedback_frequency.
Apply Worker Timeout & Feedback
src/spock_apply.c
Removed static maybe_send_feedback(...) and its scheduling/trigger logic. apply_work() now uses spock_apply_idle_timeout (seconds→ms) for WL_TIMEOUT checks only when >0, updates last_receive_timestamp only after actual COPY data (r > 0), and no longer forces feedback from the 'w' message path.
Documentation
docs/configuring.md
Removed the spock.feedback_frequency configuration option documentation and its description.

Poem

🐇 I watched the WALs as twilight crept,
A tiny timeout softly kept,
No more forced pings to break the hush,
I count the seconds, quiet as a brush,
Reconnect, resume, then nuzzle and leap!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: replacing wal_sender_timeout-based liveness detection with TCP keepalive as the primary mechanism.
Description check ✅ Passed The description clearly relates to the changeset, providing detailed context about why the change was needed and explaining the new two-layer liveness detection model.
Docstring Coverage ✅ Passed Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch FEED_BACK

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@ibrarahmad ibrarahmad requested a review from mason-sharp March 4, 2026 06:55
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/spock.c (1)

345-356: Consider making TCP keepalive parameters configurable.

The hardcoded values (idle=10s, interval=5s, count=3) result in ~25s dead connection detection. While reasonable for most deployments, high-latency or unreliable network environments may experience false-positive disconnects.

Consider exposing these as GUCs (e.g., spock.keepalives_idle, spock.keepalives_interval, spock.keepalives_count) to allow tuning without code changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spock.c` around lines 345 - 356, Replace the hardcoded TCP keepalive
literals in the keys/vals block with configurable GUC-backed values: define GUCs
(e.g., spock.keepalives_idle, spock.keepalives_interval, spock.keepalives_count)
as int variables (suggest names spock_keepalives_idle,
spock_keepalives_interval, spock_keepalives_count) during module initialization
(e.g., in _PG_init or the existing GUC registration area), register them with
DefineCustomIntVariable, and then use those variables' stringified values when
populating vals[] for the keys "keepalives_idle", "keepalives_interval", and
"keepalives_count" in the code that sets keys[i]/vals[i]; keep the default
values 10/5/3 and ensure bounds checking on the GUCs (positive integers) when
registering.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/spock.c`:
- Around line 345-356: Replace the hardcoded TCP keepalive literals in the
keys/vals block with configurable GUC-backed values: define GUCs (e.g.,
spock.keepalives_idle, spock.keepalives_interval, spock.keepalives_count) as int
variables (suggest names spock_keepalives_idle, spock_keepalives_interval,
spock_keepalives_count) during module initialization (e.g., in _PG_init or the
existing GUC registration area), register them with DefineCustomIntVariable, and
then use those variables' stringified values when populating vals[] for the keys
"keepalives_idle", "keepalives_interval", and "keepalives_count" in the code
that sets keys[i]/vals[i]; keep the default values 10/5/3 and ensure bounds
checking on the GUCs (positive integers) when registering.

ℹ️ Review info
Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 69aec04d-1d81-4de7-913f-14b9f06f8761

📥 Commits

Reviewing files that changed from the base of the PR and between 50764f4 and 1b0262a.

📒 Files selected for processing (3)
  • include/spock.h
  • src/spock.c
  • src/spock_apply.c

* kernel ACKs them, but no data is being sent.
*/
if (rc & WL_TIMEOUT)
if (rc & WL_TIMEOUT && spock_apply_idle_timeout > 0)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like if walsender just doesn't have data to send for a long time, subscriber will restart. Am I wrong?

It would be better to modify walsender little: skip keepalive messages being busy and rely on TCP status. But send keepalive messages if no data arrives from the WAL. In this case we don't need any subscriber-side GUC at all.

@mason-sharp
Copy link
Copy Markdown
Member

@ibrarahmad Needs rebase

Copy link
Copy Markdown
Member

@mason-sharp mason-sharp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment about having a non-zero wal_sender_timeout.

Also, needs a rebase.

Also, could use that test file.

if (replication)
{
keys[i] = "options";
vals[i] = "-c wal_sender_timeout=0";
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should be non-zero, like 300000. Less?

What if there is a proxy that blocks the sender when it is trying to send WAL? The keepalives sent from the idle receiver won't make a difference. Meanwhile, WAL is accumulating.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
src/spock_apply.c (1)

2852-2863: ⚠️ Potential issue | 🟠 Major

This can reconnect an idle-but-healthy upstream.

last_receive_timestamp only moves when CopyData arrives. Because the replication connection now runs with wal_sender_timeout = 0, PostgreSQL's walsender returns early from WalSndKeepaliveIfNecessary() and won't send protocol keepalives, so a publisher that is simply caught up and idle can still hit this reconnect path after spock.apply_idle_timeout. As written, this safety net can't distinguish “hung” from “idle”; it needs a sender heartbeat that survives wal_sender_timeout = 0, or this GUC should default to 0. (doxygen.postgresql.org)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spock_apply.c` around lines 2852 - 2863, The idle-timeout check using
last_receive_timestamp can disconnect a healthy but caught-up publisher because
CopyData isn't updated by walsender keepalives when wal_sender_timeout = 0;
update the logic so spock.apply_idle_timeout does not trigger in that case:
either change the GUC spock_apply_idle_timeout default to 0, or add a guard in
the timeout branch (the block that sets MySpockWorker->worker_status =
SPOCK_WORKER_STATUS_STOPPED and elog(ERROR, ...)) to skip reconnect when the
upstream is only sending WAL keepalives (i.e., detect/update a separate
last_keepalive_timestamp on WalSndKeepaliveIfNecessary/keepalive handling and
use that for idle detection), or check wal_sender_timeout and treat
spock_apply_idle_timeout as disabled when wal_sender_timeout == 0; update
references to last_receive_timestamp, CopyData handling, and the reconnect path
accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/spock_apply.c`:
- Around line 2852-2863: The idle-timeout check using last_receive_timestamp can
disconnect a healthy but caught-up publisher because CopyData isn't updated by
walsender keepalives when wal_sender_timeout = 0; update the logic so
spock.apply_idle_timeout does not trigger in that case: either change the GUC
spock_apply_idle_timeout default to 0, or add a guard in the timeout branch (the
block that sets MySpockWorker->worker_status = SPOCK_WORKER_STATUS_STOPPED and
elog(ERROR, ...)) to skip reconnect when the upstream is only sending WAL
keepalives (i.e., detect/update a separate last_keepalive_timestamp on
WalSndKeepaliveIfNecessary/keepalive handling and use that for idle detection),
or check wal_sender_timeout and treat spock_apply_idle_timeout as disabled when
wal_sender_timeout == 0; update references to last_receive_timestamp, CopyData
handling, and the reconnect path accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5f499ec7-1af5-4d57-bcd9-56189b254fe8

📥 Commits

Reviewing files that changed from the base of the PR and between 1b0262a and e4c970d.

📒 Files selected for processing (3)
  • include/spock.h
  • src/spock.c
  • src/spock_apply.c
✅ Files skipped from review due to trivial changes (1)
  • include/spock.h
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/spock.c

Ibrar Ahmed and others added 2 commits March 31, 2026 11:55
The apply worker previously relied on wal_sender_timeout as both a
server-side disconnect trigger and an indirect keepalive pressure on
the subscriber.  This caused spurious disconnects in two scenarios:
a flood of 'w' messages keeping the subscriber too busy to send 'r'
feedback in time, and large transactions whose apply time exceeded
wal_sender_timeout.

The workaround was maybe_send_feedback(), which force-sent 'r' after
every 10 'w' messages or wal_sender_timeout/2, whichever came first.
This was a fragile band-aid that coupled subscriber behavior to a
server GUC it cannot control.

Replace the entire mechanism with a clean two-layer model:

- TCP keepalive (keepalives_idle=10, keepalives_interval=5,
  keepalives_count=3) is the primary liveness detector on both sides.
  A dead network or crashed host is detected in ~25 seconds.

- wal_sender_timeout=0 is set on replication connections so the
  walsender never disconnects due to missing 'r' feedback.  Liveness
  on the server side is now handled entirely by TCP keepalive.

- spock.apply_idle_timeout (default 300s) is a subscriber-side safety
  net for a hung-but-connected walsender whose TCP keepalive probes
  are answered by the kernel but sends no data.  Set to 0 to disable.

Fix a bug in last_receive_timestamp handling: it was updated
unconditionally after every PQgetCopyData call, including when r==0
(no data available).  Each 1-second WL_TIMEOUT spin silently reset
the timer, making apply_idle_timeout never fire.  Move the update to
after the r==0 guard so it reflects actual data receipt only.

Remove maybe_send_feedback() as it is no longer needed.
@mason-sharp
Copy link
Copy Markdown
Member

Did a rebase + one additional commit

@codacy-production
Copy link
Copy Markdown

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics -1 duplication

Metric Results
Duplication -1

View in Codacy

TIP This summary will be updated as you push new changes. Give us feedback

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants