fix: add theoretical TFlops for H200 GPU #1422

roclark · 2025-10-24T15:46:42Z

What does this PR do?

Adds the theoretical TFLOPS for H200 GPUs so that achieved FLOPS can be compared against the hardware peak when measuring efficiency.
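To illustrate why a theoretical figure is needed, here is a minimal sketch of an efficiency calculation. The function name `flops_efficiency` and its parameters are hypothetical and are not taken from `nemo_rl`; the sketch only assumes that efficiency is the ratio of achieved to theoretical TFLOPS.

```python
def flops_efficiency(achieved_tflops: float, theoretical_tflops: float) -> float:
    """Return achieved throughput as a fraction of the theoretical peak.

    Hypothetical helper for illustration; not part of nemo_rl.
    """
    if theoretical_tflops <= 0:
        raise ValueError("theoretical_tflops must be positive")
    return achieved_tflops / theoretical_tflops


# Example: a job sustaining 400 TFLOPS per GPU in bfloat16 on hardware
# whose theoretical entry is 1979 / 2 TFLOPS.
ratio = flops_efficiency(400.0, 1979 / 2)
print(f"{ratio:.1%}")
```

Without a table entry for the device, the denominator is unknown and no such ratio can be reported, which is the gap this PR closes for H200.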

Issues

N/A

Usage

N/A

Before your PR is "Ready for review"

Pre checks:

- Make sure you read and followed Contributor guidelines.
- Did you write any new necessary tests?
- Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests.
- Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

Confirmed the format of the GPU output on a cluster with H200s:

$ python3 -c 'import torch; print(torch.cuda.get_device_name())'
NVIDIA H200

Confirmed the theoretical TFlops against the H200 data sheet.

Summary by CodeRabbit

New Features
- Extended GPU performance tracking support to include NVIDIA H200 with bfloat16 and float32 precision types.

Added the theoretical TFlops for H200 GPUs which is equivalent to H100 80GB HBM3 estimates. Signed-Off-By: Robert Clark <[email protected]>

coderabbitai · 2025-10-24T15:50:30Z

Walkthrough

The pull request adds theoretical TFLOPS benchmark entries for NVIDIA H200 GPUs in bfloat16 and float32 data types to the THEORETICAL_TFLOPS lookup table, extending device-dtype coverage without modifying control flow or behavioral logic.

Changes

| Cohort / File(s) | Summary |
|---|---|
| H200 TFLOPS Benchmark Additions `nemo_rl/utils/flops_tracker.py` | Adds two THEORETICAL_TFLOPS entries for H200: bfloat16 (1979/2 TFLOPS) and float32 (989/2 TFLOPS with TF32 conditional, else 67.0), mirroring existing H100 structure. |
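The entries summarized above can be sketched as a small lookup table. This is an illustration only, not the code in `nemo_rl/utils/flops_tracker.py`: the key shape (device name, dtype), the string dtypes, the `use_tf32` flag, and the helper names are all assumptions made so the sketch runs without `torch`. The values are the ones quoted in the summary; the division by 2 presumably converts data sheet figures quoted with sparsity into dense figures.

```python
from typing import Dict, Tuple


def build_theoretical_tflops(use_tf32: bool) -> Dict[Tuple[str, str], float]:
    """Hypothetical stand-in for the THEORETICAL_TFLOPS table.

    Keys are (device name, dtype name); dtype is a plain string here,
    which is an assumption made to avoid importing torch.
    """
    # float32 uses the TF32 tensor-core figure when TF32 is enabled,
    # otherwise the plain FP32 figure.
    fp32 = 989 / 2 if use_tf32 else 67.0
    return {
        ("NVIDIA H100 80GB HBM3", "bfloat16"): 1979 / 2,
        ("NVIDIA H100 80GB HBM3", "float32"): fp32,
        # H200 entries mirror the H100 ones, as the PR describes.
        ("NVIDIA H200", "bfloat16"): 1979 / 2,
        ("NVIDIA H200", "float32"): fp32,
    }


def lookup_tflops(device_name: str, dtype: str, use_tf32: bool = True) -> float:
    """Return the theoretical TFLOPS, or raise if the pair is unknown."""
    table = build_theoretical_tflops(use_tf32)
    key = (device_name, dtype)
    if key not in table:
        raise ValueError(f"No theoretical TFLOPS entry for {key}")
    return table[key]


print(lookup_tflops("NVIDIA H200", "bfloat16"))         # 989.5
print(lookup_tflops("NVIDIA H200", "float32"))          # 494.5
print(lookup_tflops("NVIDIA H200", "float32", False))   # 67.0
```

The device name string must match what `torch.cuda.get_device_name()` reports, which is why the PR description records the exact output `NVIDIA H200`.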

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

feat: Update Theoretical TFLOPS #1236: Adds B200/B300/GB200/GB300 TFLOPS entries to the same THEORETICAL_TFLOPS dictionary, extending benchmark coverage for additional GPU architectures.

Suggested reviewers

guyueh1
terrykong

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Test Results For Major Changes | ✅ Passed | The PR adds two new entries to the THEORETICAL_TFLOPS dictionary for the NVIDIA H200 GPU with specific bfloat16 and float32 values, mirroring the H100 entries. This is a minor change: purely a data addition to a lookup table with no new logic, behavioral changes, or code modifications that could affect numerics, convergence, or performance. The existing test file `test_flops_counter.py` validates FLOPS calculations for various models and configurations, and while it doesn't specifically test H200 values, the PR description explicitly states that pre-checks were completed including running unit and functional tests locally with no issues reported. The H200 TFLOPS values match official NVIDIA specifications and align with the established H100 values in the same table, making them straightforward reference data additions. |
| Title check | ✅ Passed | The title accurately describes the main change: adding theoretical TFLOPS values for H200 GPU to the benchmark table in the flops_tracker.py file. |


terrykong

@guyueh1 to review

terrykong · 2025-11-04T17:43:13Z

@guyueh1 bump

terrykong · 2025-11-20T00:25:29Z

closing in favor of #1543 which has some tests

Add theoretical TFlops for H200 GPU

5363162

Added the theoretical TFlops for H200 GPUs which is equivalent to H100 80GB HBM3 estimates. Signed-Off-By: Robert Clark <[email protected]>

roclark requested a review from a team as a code owner October 24, 2025 15:46

roclark changed the title ~~Add theoretical TFlops for H200 GPU~~ fix: Add theoretical TFlops for H200 GPU Oct 24, 2025

roclark changed the title ~~fix: Add theoretical TFlops for H200 GPU~~ fix: add theoretical TFlops for H200 GPU Oct 24, 2025

terrykong reviewed Oct 24, 2025

euronymous-aithal requested a review from guyueh1 October 24, 2025 17:24

terrykong added the CI:L0 Run doctests and unit tests label Nov 4, 2025

terrykong temporarily deployed to nemo-ci November 4, 2025 17:43 — with GitHub Actions Inactive

terrykong temporarily deployed to nemo-ci November 4, 2025 17:49 — with GitHub Actions Inactive

youngeunkwon0405 mentioned this pull request Nov 6, 2025

fix: Make the optimizer offloading optional #1404

Merged

4 tasks

terrykong closed this Nov 20, 2025
