Use centroids for SAH partition #1096

StafaH · 2025-11-20T22:55:59Z

This PR improves the quality of SAH BVH builds by using centroids rather than the full bounds when deciding which axis to split along. Elongated bounds can bias the axis choice, using centroids avoid this.

Before your PR is "Ready for review"

All commits are signed-off to indicate that your contribution adheres to the Developer Certificate of Origin requirements
Necessary tests have been added
Documentation is up-to-date
Auto-generated files modified by compiling Warp and building the documentation have been updated (e.g. __init__.pyi, functions.rst)
Code passes formatting and linting checks with pre-commit run -a

Summary by CodeRabbit

Refactor
- BVH partitioning now selects split axis using object centroids instead of bounds, yielding improved build quality and traversal performance in some scenarios.
- Added a guard to robustly handle degenerate/zero-range partition cases and avoid invalid split points.
Documentation
- Changelog updated to note the behavioral change in BVH SAH partitioning (no public API signature change).

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2025-11-20T22:56:09Z

📝 Walkthrough

Walkthrough

Modifies the BVH SAH partition computation in bvh.cpp to use centroid-based bounds instead of full bounds for determining split axes and bucket ranges. Adds a guard condition to handle degenerate cases when range end ≤ range start.

Changes

Cohort / File(s)	Change Summary
BVH SAH Partition Logic `warp/native/bvh.cpp`	Replaces full bounds (calc_bounds) with centroid bounds (centroid_bounds) for SAH partition computation, affecting split axis selection and bucket range calculation. Adds guard: if range_end ≤ range_start, returns range_start. Preserves existing bucket aggregation and SAH cost calculation.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10–15 minutes

SAH algorithm semantics: Verify that using centroid bounds instead of full bounds is algorithmically correct for this partitioning strategy
Guard condition logic: Confirm the degenerate case handling (range_end ≤ range_start) produces expected behavior and matches design intent
Performance impact: Consider whether this change affects BVH tree quality or construction time

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Use centroids for SAH partition' directly and accurately describes the main change: replacing bounds-based calculations with centroid-based bounds in the SAH partitioning algorithm.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

📜 Recent review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4af3378 and 595473a.

📒 Files selected for processing (2)

CHANGELOG.md (1 hunks)
warp/native/bvh.cpp (1 hunks)

🧰 Additional context used

🧠 Learnings (3)

📓 Common learnings

Learnt from: StafaH
Repo: NVIDIA/warp PR: 925
File: warp/native/bvh.h:343-357
Timestamp: 2025-09-28T18:49:59.528Z
Learning: In warp/native/bvh.h, CPU-built BVHs now allocate keys (may be zero-initialized), so bvh_get_group_root doesn't need null pointer guards for bvh.keys. User will document proper usage guidelines.

Learnt from: StafaH
Repo: NVIDIA/warp PR: 925
File: warp/native/bvh.h:232-234
Timestamp: 2025-09-28T18:45:51.495Z
Learning: In warp/native/bvh.h, the __ldg float4 casting pattern in bvh_load_node is acceptable because it's guarded by #ifdef __CUDA_ARCH__ and only runs in CUDA device code where alignment/aliasing behavior is predictable.

Learnt from: StafaH
Repo: NVIDIA/warp PR: 925
File: warp/native/bvh.h:359-369
Timestamp: 2025-09-28T18:45:45.955Z
Learning: StafaH prefers to defer the BVH shared stack stride issue (hardcoded WP_BVH_BLOCK_DIM vs runtime blockDim.x) for a future update rather than implementing the dynamic shared memory solution now.

📚 Learning: 2025-09-28T18:49:59.528Z

Learnt from: StafaH
Repo: NVIDIA/warp PR: 925
File: warp/native/bvh.h:343-357
Timestamp: 2025-09-28T18:49:59.528Z
Learning: In warp/native/bvh.h, CPU-built BVHs now allocate keys (may be zero-initialized), so bvh_get_group_root doesn't need null pointer guards for bvh.keys. User will document proper usage guidelines.

Applied to files:

CHANGELOG.md
warp/native/bvh.cpp

📚 Learning: 2025-09-28T18:45:48.198Z

Learnt from: StafaH
Repo: NVIDIA/warp PR: 925
File: warp/types.py:4097-4100
Timestamp: 2025-09-28T18:45:48.198Z
Learning: The num_groups parameter has been removed from the Warp BVH constructor API, so validation checks related to num_groups consistency are no longer needed.

Applied to files:

CHANGELOG.md

🧬 Code graph analysis (1)

warp/native/bvh.cpp (1)

warp/native/vec.h (1)

longest_axis (2087-2101)

🔇 Additional comments (3)

CHANGELOG.md (1)

22-22: LGTM!

The changelog entry clearly documents the behavioral change to use centroids instead of bounds for SAH partition axis determination, with appropriate context about build quality improvements.

warp/native/bvh.cpp (2)

396-408: LGTM! Correct implementation of centroid-based partitioning.

The change to compute centroid_bounds from item centers and use it for axis selection and bucket range determination follows the standard SAH BVH construction approach (as described in the Physics Based Rendering book, per the PR discussion). This avoids bias introduced by elongated bounds and should improve build quality.

410-416: LGTM! Proper guard for degenerate centroid distribution.

The guard condition correctly handles the edge case where all item centroids lie at the same position along the split axis (range_end ≤ range_start), preventing division-by-zero in the bucket computation at line 422. Returning range_start causes the partition to produce an empty left side, which is handled by the existing fallback logic in build_recursive (lines 548-552).

Tip

📝 Customizable high-level summaries are now available in beta!

You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting.

Provide your own instructions using the high_level_summary_instructions setting.
Format the summary however you like (bullet lists, tables, multi-section layouts, contributor stats, etc.).
Use high_level_summary_in_walkthrough to move the summary from the description to the walkthrough section.

Example instruction:

"Divide the high-level summary into five sections:

📝 Description — Summarize the main change in 50–60 words, explaining what was done.

📓 References — List relevant issues, discussions, documentation, or related PRs.

📦 Dependencies & Requirements — Mention any new/updated dependencies, environment variable changes, or configuration updates.

📊 Contributor Summary — Include a Markdown table showing contributions:
| Contributor | Lines Added | Lines Removed | Files Changed |

✔️ Additional Notes — Add any extra reviewer context.
Keep each section concise (under 200 words) and use bullet or numbered lists for clarity."

Note: This feature is currently in beta for Pro-tier users, and pricing will be announced later.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

greptile-apps · 2025-11-20T22:57:16Z

Greptile Overview

Greptile Summary

This PR improves the quality of Surface Area Heuristic (SAH) BVH builds by using centroids instead of full bounds when determining the split axis.

Key Changes:

Modified partition_sah_indices in warp/native/bvh.cpp to compute a bounding box of item centroids rather than using the full primitive bounds for axis selection
The split axis is now determined by longest_axis(centroid_bounds.edges()) instead of longest_axis(range_bounds.edges())
This prevents elongated primitive bounds from biasing the axis choice, resulting in better spatial partitioning
Bucket assignment still uses centroids (unchanged), but now both axis selection and bucket range are consistently based on centroid distribution
Updated CHANGELOG.md with proper documentation

Technical Correctness:
The implementation correctly computes centroids, handles the degenerate case where range_end <= range_start, and maintains consistency with the existing SAH algorithm. The change aligns with best practices in BVH construction literature where centroid-based splitting typically produces higher quality trees for elongated primitives.

Confidence Score: 5/5

This PR is safe to merge with minimal risk
The change is well-understood, algorithmically sound, and follows established BVH construction best practices. The modification is localized to a single function, properly handles edge cases (degenerate partitions), and improves upon the existing implementation without introducing new failure modes. The CHANGELOG is properly updated, and the author confirms tests pass.
No files require special attention

Important Files Changed

File Analysis

Filename	Score	Overview
warp/native/bvh.cpp	5/5	Modified SAH partition to use centroids for axis selection instead of full bounds, improving BVH quality
CHANGELOG.md	5/5	Added changelog entry documenting the SAH BVH centroid-based partitioning improvement

Sequence Diagram

sequenceDiagram
    participant Caller as build_recursive
    participant SAH as partition_sah_indices
    participant Bounds as bounds3
    
    Caller->>SAH: Call with lowers, uppers, indices, range_bounds
    
    Note over SAH: Old: Used range_bounds.edges() for axis selection<br/>New: Compute centroid_bounds from item centers
    
    loop For each item in [start, end)
        SAH->>SAH: Compute item_center = 0.5 * (lower + upper)
        SAH->>Bounds: centroid_bounds.add_point(item_center)
    end
    
    SAH->>SAH: edges = centroid_bounds.edges()
    SAH->>SAH: split_axis = longest_axis(edges)
    
    Note over SAH: Use centroid_bounds for range_start/range_end<br/>instead of full bounds
    
    SAH->>SAH: range_start = centroid_bounds.lower[split_axis]
    SAH->>SAH: range_end = centroid_bounds.upper[split_axis]
    
    alt range_end <= range_start (degenerate case)
        SAH-->>Caller: Return range_start (triggers midpoint fallback)
    else Normal case
        loop For each item in [start, end)
            SAH->>SAH: Assign item to bucket based on centroid position
            SAH->>SAH: Accumulate item bounds in bucket
        end
        
        SAH->>SAH: Compute SAH costs for each split point
        SAH->>SAH: Find minimum cost split
        SAH-->>Caller: Return split_point on split_axis
    end
    
    Caller->>Caller: Partition indices using split_point
    Caller->>Caller: Recursively build left and right subtrees

greptile-apps

_{1 file reviewed, no comments}

_{Edit Code Review Agent Settings | Greptile}

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

warp/native/bvh.cpp (1)
330-336: Optional: Consider centroid-based axis selection for consistency.

For consistency with the SAH approach, partition_median could also compute centroid bounds and use longest_axis(centroid_bounds.edges()) instead of longest_axis(range_bounds.edges()). The partitioning predicate already uses centroids (lines 318-319), so this would align the axis-selection logic across both SAH and median methods.

Example refactor:
 int TopDownBVHBuilder::partition_median(const vec3* lowers, const vec3* uppers, int* indices, int start, int end, bounds3 range_bounds)
 {
     assert(end-start >= 2);
 
-    vec3 edges = range_bounds.edges();
+    bounds3 centroid_bounds;
+    for (int i = start; i < end; ++i)
+    {
+        vec3 item_center = 0.5f * (lowers[indices[i]] + uppers[indices[i]]);
+        centroid_bounds.add_point(item_center);
+    }
+    vec3 edges = centroid_bounds.edges();
 
     int axis = longest_axis(edges);

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f5857bc and 4af3378.

📒 Files selected for processing (1)

warp/native/bvh.cpp (1 hunks)

🧰 Additional context used

🧠 Learnings (2)

📓 Common learnings

Learnt from: StafaH
Repo: NVIDIA/warp PR: 925
File: warp/native/bvh.h:343-357
Timestamp: 2025-09-28T18:49:59.528Z
Learning: In warp/native/bvh.h, CPU-built BVHs now allocate keys (may be zero-initialized), so bvh_get_group_root doesn't need null pointer guards for bvh.keys. User will document proper usage guidelines.

Learnt from: StafaH
Repo: NVIDIA/warp PR: 925
File: warp/native/bvh.h:232-234
Timestamp: 2025-09-28T18:45:51.495Z
Learning: In warp/native/bvh.h, the __ldg float4 casting pattern in bvh_load_node is acceptable because it's guarded by #ifdef __CUDA_ARCH__ and only runs in CUDA device code where alignment/aliasing behavior is predictable.

Learnt from: StafaH
Repo: NVIDIA/warp PR: 925
File: warp/native/bvh.h:359-369
Timestamp: 2025-09-28T18:45:45.955Z
Learning: StafaH prefers to defer the BVH shared stack stride issue (hardcoded WP_BVH_BLOCK_DIM vs runtime blockDim.x) for a future update rather than implementing the dynamic shared memory solution now.

📚 Learning: 2025-09-28T18:49:59.528Z

Learnt from: StafaH
Repo: NVIDIA/warp PR: 925
File: warp/native/bvh.h:343-357
Timestamp: 2025-09-28T18:49:59.528Z
Learning: In warp/native/bvh.h, CPU-built BVHs now allocate keys (may be zero-initialized), so bvh_get_group_root doesn't need null pointer guards for bvh.keys. User will document proper usage guidelines.

Applied to files:

warp/native/bvh.cpp

🧬 Code graph analysis (1)

warp/native/bvh.cpp (1)

warp/native/vec.h (1)

longest_axis (2087-2101)

🔇 Additional comments (2)

warp/native/bvh.cpp (2)

396-404: LGTM! Centroid-based axis selection improves SAH quality.

The change correctly computes centroid bounds and uses them for split axis selection, avoiding bias from elongated item bounds. This is the standard approach for high-quality SAH BVH construction.

407-416: Guard condition correctly handles degenerate centroid case.

When all item centroids coincide (zero extent along split axis), this guard prevents division by zero and triggers the fallback to midpoint split at lines 548-552. The comment accurately describes the behavior.

shi-eric · 2025-11-24T23:12:11Z

@StafaH (not sure if I left a comment somewhere) Can you also add a changelog entry for this change?

shi-eric · 2025-11-24T23:14:46Z

Maybe seek @AnkaChan's guidance on whether this is considered as a bug or a change

AnkaChan · 2025-11-24T23:47:33Z

This is a mis-implementation that hinders the performance. Current implementation does not accord with the one described in Physics Based Rendering book. Mustafa changed it back to be aligned with the algorithm described in the book. I would say it's a change. Because the old implementation is totally a viable algorithm, it's just that its quality is a bit lower.

Signed-off-by: Mustafa Haiderbhai <[email protected]> Update changelog Signed-off-by: Mustafa Haiderbhai <[email protected]>

StafaH · 2025-11-25T15:32:23Z

@shi-eric Changelog is updated, and linked to a change rather than a bug.

greptile-apps

_{2 files reviewed, no comments}

_{Edit Code Review Agent Settings | Greptile}

AnkaChan · 2025-11-25T18:42:01Z

I think this looks all good to me!

greptile-apps bot reviewed Nov 20, 2025

View reviewed changes

coderabbitai bot reviewed Nov 20, 2025

View reviewed changes

shi-eric requested a review from AnkaChan November 20, 2025 23:51

shi-eric assigned StafaH Nov 20, 2025

shi-eric added this to the 1.11.0 milestone Nov 20, 2025

Use centroids for SAH partition

595473a

Signed-off-by: Mustafa Haiderbhai <[email protected]> Update changelog Signed-off-by: Mustafa Haiderbhai <[email protected]>

StafaH force-pushed the stafah/centroid branch from 4af3378 to 595473a Compare November 25, 2025 15:31

greptile-apps bot reviewed Nov 25, 2025

View reviewed changes

AnkaChan approved these changes Nov 25, 2025

View reviewed changes

shi-eric merged commit 2ae4c93 into NVIDIA:main Nov 26, 2025
16 checks passed

shi-eric linked an issue Nov 26, 2025 that may be closed by this pull request

[REQ] Modify SAH Constructor to use centroids rather than bounds #1102

Closed

Use centroids for SAH partition #1096

Use centroids for SAH partition #1096

Uh oh!

Conversation

StafaH commented Nov 20, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Before your PR is "Ready for review"

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Nov 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Pre-merge checks and finishing touches

Uh oh!

greptile-apps bot commented Nov 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Overview

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

shi-eric commented Nov 24, 2025

Uh oh!

shi-eric commented Nov 24, 2025

Uh oh!

AnkaChan commented Nov 24, 2025

Uh oh!

StafaH commented Nov 25, 2025

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Uh oh!

AnkaChan commented Nov 25, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

StafaH commented Nov 20, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Nov 20, 2025 •

edited

Loading

greptile-apps bot commented Nov 20, 2025 •

edited

Loading