8317801: java/net/Socket/asyncClose/Race.java fails intermittently (aix)#4294

Open
shruacha1234 wants to merge 6 commits into openjdk:master from shruacha1234:backport_8f121a17
Conversation

@shruacha1234
Contributor

@shruacha1234 shruacha1234 commented Mar 15, 2026

This pull request contains a backport of commit 8f121a17 from the openjdk/jdk repository.
OpenJDK bug: https://bugs.openjdk.org/browse/JDK-8317801

This fix resolves a race condition in socket close handling that led to intermittent failures in Race.java on AIX in JDK17u-dev.
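For readers unfamiliar with asynchronous close, the following is a minimal sketch of the behaviour the test area exercises: one thread blocks in a socket read while another thread closes the socket, and the blocked reader is expected to wake up with a `SocketException`. It is not the Race.java test itself; the class and method names (`AsyncCloseSketch`, `closeWhileReading`) are made up for illustration, and the 200 ms pause is an arbitrary assumption that merely makes it likely the reader is already blocked.

```java
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration only; not part of the JDK or of the Race.java test.
class AsyncCloseSketch {

    // Returns whatever the reader thread caught, or null if it caught nothing.
    static Throwable closeWhileReading() throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        AtomicReference<Throwable> seen = new AtomicReference<>();

        try (ServerSocket ss = new ServerSocket(0, 0, loopback);
             Socket client = new Socket(loopback, ss.getLocalPort());
             Socket peer = ss.accept()) {

            // The peer never writes, so this read blocks until the socket is closed.
            Thread reader = new Thread(() -> {
                try {
                    client.getInputStream().read();
                } catch (Throwable t) {
                    seen.set(t);
                }
            });
            reader.start();

            Thread.sleep(200);   // assumption: long enough for the reader to block
            client.close();      // asynchronous close from another thread
            reader.join(5000);   // a hang here would be the kind of failure at issue
        }
        return seen.get();
    }
}
```

If the close were to win the race before the read starts, the reader would still see a `SocketException`, so the outcome does not depend on the timing assumption.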

The original patch didn’t apply cleanly to the JDK17u-dev branch due to differences in the NIO dispatcher implementation:

  • UnixDispatcher introduced in later JDK versions does not exist in JDK17.
  • Virtual thread related logic in the original patch is not present in JDK17.
  • The dispatcher hierarchy differs slightly.

To adapt the change for JDK17:

  • NativeDispatcher.preClose was updated to accept the reader and writer thread IDs.
  • Thread signalling logic was moved into FileDispatcherImpl.
  • Channel implementations (SocketChannelImpl, DatagramChannelImpl, ServerSocketChannelImpl, etc.) were updated to delegate signalling through the dispatcher rather than performing it locally.

These changes preserve the functional intent of the upstream patch, ensuring that threads blocked in I/O operations are correctly signalled when a file descriptor is pre-closed.
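The shape of this adaptation can be sketched roughly as follows. This is a simplified model, not the JDK 17 sources: `SketchDispatcher`, `SketchFileDispatcher` and `SketchChannel` are hypothetical stand-ins, thread IDs are plain `long` values with 0 meaning "no thread", the native pre-close and the thread signal are replaced by log entries, and the order shown (pre-close first, then signal) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for NativeDispatcher: preClose now also receives
// the IDs of the threads that may be blocked reading or writing on the fd.
abstract class SketchDispatcher {
    abstract void preClose(int fd, long readerThread, long writerThread);
}

// Hypothetical stand-in for FileDispatcherImpl: the signalling lives here.
class SketchFileDispatcher extends SketchDispatcher {
    final List<String> log = new ArrayList<>();

    @Override
    void preClose(int fd, long readerThread, long writerThread) {
        preClose0(fd);                        // stands in for the native pre-close
        if (readerThread != 0) signal(readerThread);
        if (writerThread != 0) signal(writerThread);
    }

    private void preClose0(int fd) {
        log.add("preClose0(" + fd + ")");
    }

    private void signal(long threadId) {      // stands in for signalling a native thread
        log.add("signal(" + threadId + ")");
    }
}

// Hypothetical stand-in for a channel implementation: it no longer signals
// the blocked threads itself, it only hands their IDs to the dispatcher.
class SketchChannel {
    private final SketchDispatcher nd;
    private final int fd;
    volatile long readerThread;               // 0 when no thread is blocked in a read
    volatile long writerThread;               // 0 when no thread is blocked in a write

    SketchChannel(SketchDispatcher nd, int fd) {
        this.nd = nd;
        this.fd = fd;
    }

    void close() {
        nd.preClose(fd, readerThread, writerThread);
    }
}
```

In this model a channel with a blocked reader and no writer results in exactly one pre-close followed by one signal, both issued from the dispatcher.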

Testing: local AIX build, java_io, java_nio, jdk_net
Additionally, executed java/net/Socket/asyncClose/Race.java 500 times to check for intermittent failures. No failures were observed.


Progress

  • Change must not contain extraneous whitespace
  • JDK-8317801 needs maintainer approval
  • Commit message must refer to an issue
  • Change must be properly reviewed (2 reviews required, with at least 2 Reviewers)

Issue

  • JDK-8317801: java/net/Socket/asyncClose/Race.java fails intermittently (aix) (Bug - P4 - Rejected)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk17u-dev.git pull/4294/head:pull/4294
$ git checkout pull/4294

Update a local copy of the PR:
$ git checkout pull/4294
$ git pull https://git.openjdk.org/jdk17u-dev.git pull/4294/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 4294

View PR using the GUI difftool:
$ git pr show -t 4294

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk17u-dev/pull/4294.diff

Using Webrev

Link to Webrev Comment

Signed-off-by: Shruthi <Shruthi.Shruthi1@ibm.com>
@bridgekeeper

bridgekeeper bot commented Mar 15, 2026

👋 Welcome back sacharya! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Mar 15, 2026

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk openjdk bot changed the title from "Backport 8f121a173ca2534c706682f6c68fbbb0b94ec057" to "8317801: java/net/Socket/asyncClose/Race.java fails intermittently (aix)" Mar 15, 2026
@openjdk

openjdk bot commented Mar 15, 2026

This backport pull request has now been updated with issue from the original commit.

@openjdk openjdk bot added the backport (Port of a pull request already in a different code base) and rfr (Pull request is ready for review) labels Mar 15, 2026
@mlbridge

mlbridge bot commented Mar 15, 2026

Webrevs

@jerboaa
Contributor

jerboaa commented Mar 16, 2026

This backport is being done very differently in JDK 17u because for JDK 21u the following dependencies were included first in the JDK 21 release before doing the actual backport for this bug:

It would have helped if this information was provided to the reader. If we end up going this route without the enhancement backports (which should be fine IMO), then we should more accurately model it to pre-existing JDK 17u code. A review follows shortly.

@jerboaa
Contributor

jerboaa commented Mar 16, 2026

/reviewers 2 Reviewer

@openjdk

openjdk bot commented Mar 16, 2026

@jerboaa
The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 2 Reviewers).

Signed-off-by: Shruthi <Shruthi.Shruthi1@ibm.com>
Signed-off-by: Shruthi <Shruthi.Shruthi1@ibm.com>
Contributor

@jerboaa jerboaa left a comment

The build failure should be resolved in a different way (rename the shared static method).

Signed-off-by: Shruthi <Shruthi.Shruthi1@ibm.com>
Signed-off-by: Shruthi <Shruthi.Shruthi1@ibm.com>
Contributor

@jerboaa jerboaa left a comment

This looks OK to me. Nit: Please make static native void preClose0 in FileDispatcherImpl.java private as it's only used in that class now.

Signed-off-by: Shruthi <Shruthi.Shruthi1@ibm.com>
@jerboaa
Contributor

jerboaa commented Mar 24, 2026

@tstuefe Could you please help with a second review for this unclean backport? Thanks!

@tstuefe
Member

tstuefe commented Mar 24, 2026

I am not an expert in that area, and this touches shared coding, so I am a bit apprehensive about this one. Tests should definitely be done on all platforms, not only AIX. Please ask SAP, specifically @JoKern65 , for a review.

@shruacha1234
Contributor Author

@JoKern65 Can you please review this PR?


@JoKern65 JoKern65 left a comment

I cannot test it on AIX at SAP, because we do not support AIX on jdk17. But the changes look good and should not introduce a different behavior on other platforms than before. And for AIX the change results in the same already established logic of the higher releases.
So for me it looks good.

@shruacha1234
Contributor Author

/approval JDK-8317801 request Backport fix that resolves a race condition in socket close handling, which led to intermittent failures in java/net/Socket/asyncClose/Race.java on AIX in 17u-dev. The upstream fix from 21u could not be applied directly due to differences in the NIO dispatcher design (absence of UnixDispatcher, related refactorings, and virtual thread support). It was adapted by updating NativeDispatcher.preClose to take reader/writer thread IDs, moving signalling into FileDispatcherImpl, and modifying channel implementations to delegate signalling through the dispatcher. This preserves the original fix's intent of correctly signalling threads blocked in I/O during pre-close. Tested with AIX build, java_io, java_nio, jdk_net, and 500 iterations of Race.java with no failures.

@openjdk

openjdk bot commented Mar 27, 2026

@shruacha1234
JDK-8317801: The approval request has been created successfully.

@openjdk openjdk bot added the approval (Requires approval; will be removed when approval is received) label Mar 27, 2026
@jerboaa
Contributor

jerboaa commented Mar 27, 2026

This will still need a second review from an updates reviewer.

Member

@tstuefe tstuefe left a comment

Okay, patch looks functionally equivalent to upstream patch. Thanks @JoKern65 for looking at this.

However, please run the full test suite (at least jdk tier1 and tier2) on all platforms, not only AIX, before pushing, and make sure this does not cause regressions.

@jerboaa I think we should make @JoKern65 updates reviewer. Does he have to be reviewer in mainline for that to happen?

@jerboaa
Contributor

jerboaa commented Mar 27, 2026

> @jerboaa I think we should make @JoKern65 updates reviewer. Does he have to be reviewer in mainline for that to happen?

Not necessarily, but it does help.

@openjdk openjdk bot removed the approval (Requires approval; will be removed when approval is received) label Mar 30, 2026
@shruacha1234
Contributor Author

I ran jdk:tier1 and jdk:tier2 tests on AIX, Linux, and macOS, and observed no regressions. The reported failures are the same both with and without my fix.

Labels

backport (Port of a pull request already in a different code base), rfr (Pull request is ready for review)

4 participants