Bug fix in Narwhal site config: revert back from OFI to UCX #1916

Draft
climbfuji wants to merge 1 commit into JCSDA:develop from climbfuji:bugfix/narwhal_ofi_ucx

@climbfuji
Collaborator

Description

All in the title.

While the full tests I ran on Narwhal indicated that OFI was fine, repeated test runs of our operational configuration showed a significant slowdown and random errors when reading and writing HDF5 files.

Dependencies

None

Issues addressed

None created

Applications affected

None

Systems affected

Narwhal

Testing

  • CI: Note whether the automatic tests (GitHub actions tests that run automatically for every commit) pass or not
    • GitHub actions CI tests pass
    • GitHub actions CI tests do not pass (provide explanation)
    • GitHub actions CI tests skipped (provide explanation if necessary)
  • New tests added: List and describe any new tests added to GitHub actions
    • ...
  • Additional testing: Add information on any additional tests conducted
    • IN PROGRESS TESTING ON NARWHAL

Checklist

  • This PR addresses one issue/problem/enhancement or has a very good reason for not doing so.
  • These changes have been tested on the affected systems and applications.
  • All dependency PRs/issues have been resolved and this PR can be merged.
  • All necessary updates to the documentation (spack-stack wiki) will be made when this PR is merged

Projects

Status: In Progress
