Skip to content

Conversation

@EvenLjj
Copy link
Collaborator

@EvenLjj EvenLjj commented Nov 12, 2025

fix shared channel remove when rpc sendMag issue

Summary by CodeRabbit

  • Bug Fixes

    • Improved channel lifecycle: stronger reference-counting and safer shutdown with timeout and interruption handling.
  • Refactor

    • Adjusted synchronization and logging behavior around channel management to streamline connect/disconnect flows.
  • Tests

    • Added test coverage verifying shared channel usage and lifecycle across multiple transport instances.

@sofastack-cla sofastack-cla bot added bug Something isn't working question Further information is requested cla:yes CLA is ok size/L labels Nov 12, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 12, 2025

Walkthrough

Reference-counted gRPC channel handling was added and logging introduced. ReferenceCountManagedChannel now holds a final grpcChannel, decrements the reference count on shutdown and fully shuts the underlying channel when count ≤ 0. TripleClientTransport updates increment/decrement usage and disconnect/cleanup behavior. A test verifies shared-channel lifecycle.

Changes

Cohort / File(s) Summary
Reference Counted Channel
remoting/remoting-triple/src/main/java/com/alipay/sofa/rpc/transport/triple/ReferenceCountManagedChannel.java
Made grpcChannel final, added LOGGER, relocated referenceCount, added logging, changed shutdown() to decrement reference count, shut underlying channel and await termination up to 5s when count ≤ 0, and handle InterruptedException.
Transport integration
remoting/remoting-triple/src/main/java/com/alipay/sofa/rpc/transport/triple/TripleClientTransport.java
Made lock static, removed local logging usage, simplified disconnect to call shutdown() (removed await/InterruptedException handling), use remove(key, value) for channelMap cleanup, set channel to null after removal, and call incrementAndGetCount() when reusing or after creating a ReferenceCountManagedChannel.
Tests
remoting/remoting-triple/src/test/java/com/alipay/sofa/rpc/transport/triple/TripleClientTransportTest.java
Added test() validating shared-channel reuse and lifecycle (destroy/connect) and providerToClientConfig() helper; added required imports.

Sequence Diagram(s)

sequenceDiagram
    participant ClientTransport as TripleClientTransport
    participant RefChannel as ReferenceCountManagedChannel
    participant GRPC as ManagedChannel

    Note right of ClientTransport: obtain shared channel
    ClientTransport->>RefChannel: getSharedChannel()
    alt existing channel
        ClientTransport->>RefChannel: incrementAndGetCount()
        RefChannel->>RefChannel: referenceCount++
    else create new
        ClientTransport->>RefChannel: new ReferenceCountManagedChannel(GRPC)
        ClientTransport->>RefChannel: incrementAndGetCount()
        RefChannel->>RefChannel: referenceCount = 1
    end

    Note right of ClientTransport: shutdown flow
    ClientTransport->>RefChannel: shutdown()
    RefChannel->>RefChannel: referenceCount--
    alt referenceCount ≤ 0
        RefChannel->>GRPC: shutdown()
        RefChannel->>GRPC: awaitTermination(5s)
        RefChannel->>RefChannel: log termination/result
    else referenceCount > 0
        RefChannel-->>ClientTransport: return channel (no shutdown)
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~35 minutes

  • Pay attention to:
    • changing lock to static (cross-instance synchronization effects)
    • correct placement/timing of referenceCount increments/decrements (race conditions)
    • shutdown ordering and InterruptedException handling in ReferenceCountManagedChannel
    • map removal using remove(key, value) and subsequent null assignment

Possibly related PRs

Suggested reviewers

  • chuailiwu
  • sunhailin-Leo

Poem

🐇
I count the hops, the channels bright,
Each friend that joins my shared delight,
When tails depart and numbers fall,
I close the gate and hush the hall,
A quiet hop — the tunnel's light. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 14.29% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: fixing the removal of shared channels when RPC sendMsg issues occur. It clearly relates to the core modifications across the three files which implement reference counting and proper channel lifecycle management.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0ba7df3 and 0f66808.

📒 Files selected for processing (1)
  • remoting/remoting-triple/src/main/java/com/alipay/sofa/rpc/transport/triple/TripleClientTransport.java (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • remoting/remoting-triple/src/main/java/com/alipay/sofa/rpc/transport/triple/TripleClientTransport.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build
  • GitHub Check: sca

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@EvenLjj EvenLjj changed the title fix shared channel remove when rpc sendMag issue fix triple shared channel remove when rpc sendMag issue Nov 12, 2025
@EvenLjj EvenLjj changed the title fix triple shared channel remove when rpc sendMag issue fix triple shared channel remove when rpc sendMsg issue Nov 12, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
remoting/remoting-triple/src/main/java/com/alipay/sofa/rpc/transport/triple/ReferenceCountManagedChannel.java (1)

61-64: Preserve the interrupt flag after catching InterruptedException

We’re swallowing the interrupt when awaitTermination is interrupted, so upstream code will keep running as if nothing happened. Please restore the thread’s interrupt status after logging so callers can react properly.

-        } catch (InterruptedException e) {
-            LOGGER.warn("Triple channel shut down interrupted.");
+        } catch (InterruptedException e) {
+            LOGGER.warn("Triple channel shut down interrupted.");
+            Thread.currentThread().interrupt();
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a073684 and 39f1dde.

📒 Files selected for processing (3)
  • remoting/remoting-triple/src/main/java/com/alipay/sofa/rpc/transport/triple/ReferenceCountManagedChannel.java (3 hunks)
  • remoting/remoting-triple/src/main/java/com/alipay/sofa/rpc/transport/triple/TripleClientTransport.java (3 hunks)
  • remoting/remoting-triple/src/test/java/com/alipay/sofa/rpc/transport/triple/TripleClientTransportTest.java (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
remoting/remoting-triple/src/test/java/com/alipay/sofa/rpc/transport/triple/TripleClientTransportTest.java (4)
core/api/src/main/java/com/alipay/sofa/rpc/client/ProviderInfo.java (1)
  • ProviderInfo (35-528)
core/api/src/main/java/com/alipay/sofa/rpc/config/ConsumerConfig.java (1)
  • ConsumerConfig (71-1044)
core/api/src/main/java/com/alipay/sofa/rpc/transport/ClientTransportConfig.java (1)
  • ClientTransportConfig (41-299)
core/api/src/main/java/com/alipay/sofa/rpc/transport/ClientTransportFactory.java (1)
  • ClientTransportFactory (36-189)
remoting/remoting-triple/src/main/java/com/alipay/sofa/rpc/transport/triple/ReferenceCountManagedChannel.java (1)
core/api/src/main/java/com/alipay/sofa/rpc/log/LoggerFactory.java (1)
  • LoggerFactory (29-70)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build
  • GitHub Check: sca

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
remoting/remoting-triple/src/main/java/com/alipay/sofa/rpc/transport/triple/TripleClientTransport.java (1)

252-252: Consider deferring shutdownNow() or using plain shutdown().

Calling shutdownNow() on an unavailable channel outside the synchronized block could interfere with other threads that are concurrently checking or using the same channel instance. While the subsequent double-check inside the synchronized block mitigates some risk, this aggressive shutdown might cause unexpected behavior.

Consider either:

  • Moving the shutdown logic inside the synchronized block after confirming the channel is still unavailable, or
  • Using shutdown() instead of shutdownNow() to allow graceful termination
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 39f1dde and 0ba7df3.

📒 Files selected for processing (1)
  • remoting/remoting-triple/src/main/java/com/alipay/sofa/rpc/transport/triple/TripleClientTransport.java (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build
  • GitHub Check: sca
🔇 Additional comments (1)
remoting/remoting-triple/src/main/java/com/alipay/sofa/rpc/transport/triple/TripleClientTransport.java (1)

77-77: The static lock design is sound and not a practical bottleneck.

While the static lock does create a global synchronization point across all TripleClientTransport instances, this is not problematic in practice. The lock is only held briefly during map operations (get, put, incrementAndGetCount()—all microsecond-scale operations). Critically, slow I/O operations like awaitTermination() occur outside the synchronized block. Additionally, reference counting uses AtomicInteger for lock-free increments/decrements, and instances are typically limited to one per distinct remote service—usually a handful to perhaps 50-100 even in large microservice applications. The implementation properly decrementing via ReferenceCountManagedChannel.shutdown() confirms the design is correct.

@EvenLjj EvenLjj merged commit 7e6ca00 into sofastack:master Nov 13, 2025
5 of 9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working cla:yes CLA is ok question Further information is requested size/L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants