Skip to content

fix(api-client): force WebSocket reconnect after sleep when socket stays OPEN [WPB-21916]#21416

Open
thisisamir98 wants to merge 2 commits into
devfrom
WPB-21916-back-from-sleep
Open

fix(api-client): force WebSocket reconnect after sleep when socket stays OPEN [WPB-21916]#21416
thisisamir98 wants to merge 2 commits into
devfrom
WPB-21916-back-from-sleep

Conversation

@thisisamir98
Copy link
Copy Markdown
Collaborator

@thisisamir98 thisisamir98 commented Jun 2, 2026

BugWPB-21916 [Web/Desktop] : Missing messages in Web - Prod

After laptop sleep, the TCP connection can die while the browser still reports readyState as OPEN. The sleep handler detected wake correctly, but reconnect was gated on isDisconnected (socket === CLOSED), so the common zombie-OPEN case was skipped and the client could stay silent with no messages and no offline banner.

Always call reconnect on wake, regardless of socket state. Add Datadog-friendly logs for wake detection, reconnect initiation, and successful reopen (including idle time and unansweredPing context). Add regression tests to prevent reintroducing the isDisconnected gate.

…ays OPEN

After laptop sleep, the TCP connection can die while the browser still
reports readyState as OPEN. The sleep handler detected wake correctly,
but reconnect was gated on isDisconnected (socket === CLOSED), so the
common zombie-OPEN case was skipped and the client could stay silent
with no messages and no offline banner.

Always call reconnect on wake, regardless of socket state. Add
Datadog-friendly logs for wake detection, reconnect initiation, and
successful reopen (including idle time and unansweredPing context).
Add regression tests to prevent reintroducing the isDisconnected gate.
@thisisamir98 thisisamir98 changed the title fix(api-client): force WebSocket reconnect after sleep when socket stays OPEN fix(api-client): force WebSocket reconnect after sleep when socket stays OPEN [WPB-21916] Jun 2, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 2, 2026

🔗 Download Full Report Artifact

🧪 Playwright Test Summary

  • Passed: 15
  • Failed: 0
  • Skipped: 0
  • 🔁 Flaky: 0
  • 📊 Total: 15
  • Total Runtime: 133.6s (~ 2 min 14 sec)

reconnect,
} as never;

expect(sleepWakeCallback).toBeDefined();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why we have to assert if the sleepWakeCallback is defined.

const mockedOnBackFromSleep = jest.mocked(onBackFromSleep);

describe('ReconnectingWebsocket back from sleep', () => {
let sleepWakeCallback: (() => void) | undefined;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of undefined if we pass an empty function we don't have to use Non-Null Assertion !

}

const state = this.getState();
const timeSinceLastMessage = this.lastMessageTimestamp > 0 ? Date.now() - this.lastMessageTimestamp : undefined;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why undefined instead of 0?

The ping loop previously stopped on close, ping timeout, and each
connect(), leaving gaps where a zombie OPEN socket could stay undetected
until the next manual recovery path.

Start pinging once on first connect and keep the interval running until
explicit disconnect. Reschedule the timer after wake-from-sleep so ticks
resume promptly when OS timers were suspended. Ping ticks now force
reconnect on CLOSED sockets and unanswered pong timeouts without stopping
the interval, and reset ping state on open/close.

Add tests for persistent pinging across close and ping timeout, CLOSED
reconnect on ping tick, and sleep wake rescheduling.
@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented Jun 2, 2026


// Force reconnect even if the browser keeps the socket in OPEN state after sleep.
this.sleepReconnectPending = true;
this.reschedulePinging();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

System sleeps → Browser keeps socket "OPEN" → System wakes →
handleBackFromSleep() calls reschedulePinging() but isPingingEnabled is false →
Socket stays "zombie OPEN" → Connection appears healthy but is dead

Fix: Add explicit logging and ensure pinging is restored:

// Ensure pinging is active before rescheduling
  if (!this.isPingingEnabled) {
    this.logger.warn('Back from sleep detected but pinging is disabled; connection may become unresponsive');
  }
  
  const state = this.getState();
  // ... rest of logic
  
  this.sleepReconnectPending = true;
  this.reschedulePinging();
  this.socket.reconnect();

if (state === WEBSOCKET_STATE.CLOSED) {
this.logger.info('Ping tick — socket CLOSED, forcing reconnect');
this.hasUnansweredPing = false;
this.socket.reconnect();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to let the reconnecting-websocket library handle reconnection timing? No guarantee that reconnect() actually establishes a connection before the next ping tick (20ms later).Could trigger rapid re-reconnect loops.

* Starts the ping interval if pinging is enabled and not already running.
* The interval is kept alive across reconnects and is only cleared on disconnect/cleanup.
*/
private ensurePinging(): void {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ensurePinging and reschedulePinging can create a possible race condition. No locking mechanism prevents both from creating intervals simultaneously

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants