cijohnson (Contributor) requested changes on Dec 11, 2025 and left a comment:
I see `sys.path.insert( 0, './lib' )` used in multiple files; are these still required with the new cvs pkg?
- Added detect_distro() function to identify Linux distribution
- Added package name translation for RHEL/SUSE equivalents
- Added multi-distro package management functions (install_package, update_package_cache, map_packages)
- Added Docker installation support for RHEL/CentOS and SUSE
- Updated test files with proper cvs.lib imports for multi-distro functions

Supports: Debian/Ubuntu (apt-get), RHEL/CentOS/Rocky/Alma (dnf), SUSE (zypper)
… commands
- Replace hardcoded 'apt update' and 'apt-get install' with detect_distro() and install_package()
- Add proper package name translation using map_packages()
- Update all affected test files:
  - tests/health/install/install_babelstream.py
  - tests/health/install/install_rocblas.py
  - tests/health/install/install_rvs.py
  - tests/health/rocblas_cvs.py
  - tests/ibperf/install_ibperf_tools.py
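The flow these two commits describe (detect the distribution, translate package names, then build the install command) might look roughly like the sketch below. It is not the PR's code: `detect_family`, `translate_packages`, `build_install_command`, and both lookup tables are hypothetical stand-ins, and the single package mapping is an illustrative entry that should be checked against the target repositories.

```python
# Hypothetical sketch; names and tables are illustrative, not the PR's code.

# os-release ID (or ID_LIKE token) -> package-manager family.
FAMILY_BY_ID = {
    "debian": "debian", "ubuntu": "debian",
    "rhel": "rhel", "centos": "rhel", "rocky": "rhel",
    "almalinux": "rhel", "fedora": "rhel",
    "sles": "suse", "suse": "suse", "opensuse-leap": "suse",
}

# Debian-style package name -> equivalent per family (illustrative entry).
PACKAGE_MAP = {
    "rhel": {"libpci-dev": "pciutils-devel"},
    "suse": {"libpci-dev": "pciutils-devel"},
}

# Non-interactive install command prefix per family.
INSTALL_PREFIX = {
    "debian": ["apt-get", "install", "-y"],
    "rhel": ["dnf", "install", "-y"],
    "suse": ["zypper", "--non-interactive", "install"],
}


def parse_os_release(text):
    """Parse KEY=value lines in os-release format into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def detect_family(os_release_text):
    """Return 'debian', 'rhel', or 'suse' from os-release content."""
    fields = parse_os_release(os_release_text)
    # Try ID first, then fall back to each ID_LIKE token.
    candidates = [fields.get("ID", "")] + fields.get("ID_LIKE", "").split()
    for candidate in candidates:
        family = FAMILY_BY_ID.get(candidate.lower())
        if family:
            return family
    raise ValueError("unsupported distribution: %r" % fields.get("ID"))


def translate_packages(packages, family):
    """Map Debian-style names to the family's names; unknown names pass through."""
    mapping = PACKAGE_MAP.get(family, {})
    return [mapping.get(name, name) for name in packages]


def build_install_command(packages, family):
    """Build the argv list for a non-interactive install."""
    return INSTALL_PREFIX[family] + translate_packages(packages, family)
```

In practice the text passed to `detect_family` would come from `/etc/os-release` on the target node. Returning an argv list rather than a shell string keeps the sketch independent of how the command is eventually executed.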
atnair-amd added a commit that referenced this pull request on Apr 17, 2026:
Consolidated fix for four small, narrow bugs in cvs/lib/linux_utils.py.
Each fix has its own regression test in cvs/lib/unittests/test_linux_utils.py
following the existing MagicMock pattern; all four new tests fail against
the pre-fix code and pass here.
1. get_rdma_nic_dict missing match-guard (real impact on banff today):
match.group(1) was called without checking that the strict inner
pattern matched. DOWN rdma links omit the `netdev <name>` clause, so
the first DOWN device raised AttributeError and aborted the parse for
all nodes. The sibling get_active_rdma_nic_dict already guards with
`if match:`; add the same guard here. Caller check_cluster_health.py:73
wraps the call in try/except, so the health-report HTML was silently
missing RDMA NIC data on banff.
Live observation on banff-cyxtera-s70-2:
    sudo rdma link
        link mlx5_0/1 ... state DOWN physical_state DISABLED
        ...
        link mlx5_8/1 state ACTIVE physical_state LINK_UP netdev ens14np0
    ### get_rdma_nic_dict on banff ###
        AttributeError: 'NoneType' object has no attribute 'group'
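A minimal sketch of the match-guard from fix 1, assuming a strict pattern that requires the trailing `netdev <name>` clause. The regex, the function name `parse_rdma_links`, and the output keys are hypothetical; the real get_rdma_nic_dict may be structured differently. The sample lines are simplified from the observation above.

```python
import re

# Assumed strict pattern: it only matches lines that end with `netdev <name>`.
STRICT_LINK_RE = re.compile(
    r"link\s+(\S+)/(\d+)\s+state\s+(\S+)\s+physical_state\s+(\S+)\s+netdev\s+(\S+)"
)


def parse_rdma_links(output):
    """Collect RDMA links that carry a netdev; skip lines that do not match."""
    nics = {}
    for line in output.splitlines():
        match = STRICT_LINK_RE.search(line)
        # Guard: DOWN links omit `netdev <name>`, so search() returns None.
        # Without this check, match.group(1) raises AttributeError.
        if match:
            device, port, state, physical_state, netdev = match.groups()
            nics[device] = {
                "port": port,
                "state": state,
                "physical_state": physical_state,
                "netdev": netdev,
            }
    return nics


SAMPLE = (
    "link mlx5_0/1 state DOWN physical_state DISABLED\n"
    "link mlx5_8/1 state ACTIVE physical_state LINK_UP netdev ens14np0\n"
)
```

With the guard in place, the DOWN link is skipped and the ACTIVE link is still recorded, instead of the first DOWN device aborting the whole parse.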
2. get_ip_addr_dict int_nam leaks across nodes (latent, multi-node):
int_nam was initialized once outside the per-node loop, so after
parsing node A it retained A's last interface name. If node B's first
matching line was a property line (mtu/state/mac/inet/inet6) rather
than an interface header, the code did
`ip_dict['nodeB']['<nodeA-iface>']['mtu'] = ...` and raised KeyError.
Move `int_nam = None` inside the per-node loop and add an
`if int_nam is None: continue` guard after the header block so
property handlers no-op until the first header is seen.
Header lines carry `mtu N`/`state UP` inline, so they still fall
through because int_nam is just set by the header block above.
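The per-node reset and the `if int_nam is None: continue` guard from fix 2 can be sketched as follows. The function name `parse_ip_addr`, the regexes, and the input shape (a dict of node name to `ip addr` output) are assumptions for illustration; only the `int_nam` handling mirrors the description above.

```python
import re

HEADER_RE = re.compile(r"^\d+:\s+([^:@\s]+)")  # e.g. "2: eth0: <...> mtu 1500"
MTU_RE = re.compile(r"\bmtu\s+(\d+)")
INET_RE = re.compile(r"^\s+inet\s+(\S+)")


def parse_ip_addr(outputs_by_node):
    """Build {node: {interface: {...}}} from per-node `ip addr` text."""
    ip_dict = {}
    for node, output in outputs_by_node.items():
        ip_dict[node] = {}
        int_nam = None  # reset per node so no name leaks from the previous node
        for line in output.splitlines():
            header = HEADER_RE.match(line)
            if header:
                int_nam = header.group(1)
                ip_dict[node][int_nam] = {}
            if int_nam is None:
                continue  # property line seen before any header: ignore it
            # Header lines carry `mtu N` inline, so they fall through to here.
            mtu = MTU_RE.search(line)
            if mtu:
                ip_dict[node][int_nam]["mtu"] = int(mtu.group(1))
            inet = INET_RE.match(line)
            if inet:
                ip_dict[node][int_nam].setdefault("inet", []).append(inet.group(1))
    return ip_dict


OUTPUTS = {
    "nodeA": (
        "2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP\n"
        "    inet 10.0.0.1/24 scope global eth0\n"
    ),
    # nodeB's output starts with a property line, the case that raised KeyError.
    "nodeB": (
        "    inet 10.0.0.2/24 scope global ens1\n"
        "3: ens1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 state UP\n"
    ),
}
```

Here nodeB's leading `inet` line is dropped rather than being attributed to nodeA's `eth0`.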
3. get_linux_perf_tuning_dict never returns (no callers today, primes
the function for future wiring-in):
The function built out_dict but ended without a `return`, so every
caller received None. The docstring itself flagged it. Append the
missing return and update the docstring. Live-confirmed on banff
(TYPE: NoneType, VALUE: None).
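The effect described in fix 3 is easy to reproduce in isolation. The two functions below are hypothetical stand-ins, not get_linux_perf_tuning_dict itself; they differ only in the final `return`.

```python
def tuning_dict_without_return(settings):
    """Builds the dict but never returns it, so callers get None."""
    out_dict = {}
    for key, value in settings.items():
        out_dict[key] = str(value)
    # No return statement: Python implicitly returns None.


def tuning_dict_with_return(settings):
    """Same body, with the missing return appended."""
    out_dict = {}
    for key, value in settings.items():
        out_dict[key] = str(value)
    return out_dict
```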
4. get_dns_dict dead branches (no callers today, same reasoning):
Duplicated `elif re.search('Protocols', ...)` branch and every body
was `print('')`, so dns_dict[node] was always {}. Replace the dead
branches with proper regex captures for Protocols (collected as a
list since it appears once globally and once per Link),
Current DNS Server, DNS Servers (space-split list), and DNS Domain.
Live-confirmed on banff: resolvectl returned usable data but
dns_dict was {'banff-...': {}}.
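The regex captures described in fix 4 could take roughly the following shape. This is a sketch under stated assumptions: the function name `parse_resolvectl_status`, the output key names, and the sample text (modelled on typical `resolvectl status` output, not captured from banff) are all illustrative.

```python
import re

PROTOCOLS_RE = re.compile(r"^\s*Protocols:\s+(.+)$")
CURRENT_SERVER_RE = re.compile(r"^\s*Current DNS Server:\s+(\S+)")
SERVERS_RE = re.compile(r"^\s*DNS Servers:\s+(.+)$")
DOMAIN_RE = re.compile(r"^\s*DNS Domain:\s+(.+)$")


def parse_resolvectl_status(output):
    """Extract DNS settings from `resolvectl status` text."""
    dns = {"protocols": []}
    for line in output.splitlines():
        match = PROTOCOLS_RE.match(line)
        if match:
            # Protocols appears once globally and once per Link: keep all.
            dns["protocols"].append(match.group(1).strip())
            continue
        match = CURRENT_SERVER_RE.match(line)
        if match:
            dns["current_dns_server"] = match.group(1)
            continue
        match = SERVERS_RE.match(line)
        if match:
            dns["dns_servers"] = match.group(1).split()
            continue
        match = DOMAIN_RE.match(line)
        if match:
            dns["dns_domain"] = match.group(1).strip()
    return dns


SAMPLE = """Global
       Protocols: -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub

Link 2 (eth0)
    Current Scopes: DNS
         Protocols: +DefaultRoute -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 10.0.0.2
       DNS Servers: 10.0.0.2 10.0.0.3
        DNS Domain: example.internal
"""
```

Anchoring each pattern at the start of the line keeps `Current DNS Server:` and `DNS Servers:` from being confused with each other.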
Coordination: no overlap with PR #15 (multi-OS additions appended after
line 814) or PR #122 (logging-only conversions elsewhere in the file).
Motivation
Makes the tests OS-agnostic so they run across Linux distributions instead of assuming apt-based systems.
Technical Details
After the changes, I ran the install RVS command:
pytest -vvv --log-file=../logs/rvs_cvs_test_mi325__install_rvs.log -s tests/health/install/install_rvs.py --cluster_file ./input/cluster_file/manojsk_cluster.json --config_file ./input/config_file/health/mi300_health_config.json
I used the same command on RHEL, and it works: