Skip to content

Conversation

@StafaH
Copy link
Contributor

@StafaH StafaH commented Nov 20, 2025

This PR improves the quality of SAH BVH builds by using centroids rather than the full bounds when deciding which axis to split along. Elongated bounds can bias the axis choice, using centroids avoid this.

Before your PR is "Ready for review"

  • All commits are signed-off to indicate that your contribution adheres to the Developer Certificate of Origin requirements
  • Necessary tests have been added
  • Documentation is up-to-date
  • Auto-generated files modified by compiling Warp and building the documentation have been updated (e.g. __init__.pyi, functions.rst)
  • Code passes formatting and linting checks with pre-commit run -a

Summary by CodeRabbit

  • Refactor

    • BVH partitioning now selects split axis using object centroids instead of bounds, yielding improved build quality and traversal performance in some scenarios.
    • Added a guard to robustly handle degenerate/zero-range partition cases and avoid invalid split points.
  • Documentation

    • Changelog updated to note the behavioral change in BVH SAH partitioning (no public API signature change).

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Nov 20, 2025

📝 Walkthrough

Walkthrough

Modifies the BVH SAH partition computation in bvh.cpp to use centroid-based bounds instead of full bounds for determining split axes and bucket ranges. Adds a guard condition to handle degenerate cases when range end ≤ range start.

Changes

Cohort / File(s) Change Summary
BVH SAH Partition Logic
warp/native/bvh.cpp
Replaces full bounds (calc_bounds) with centroid bounds (centroid_bounds) for SAH partition computation, affecting split axis selection and bucket range calculation. Adds guard: if range_end ≤ range_start, returns range_start. Preserves existing bucket aggregation and SAH cost calculation.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10–15 minutes

  • SAH algorithm semantics: Verify that using centroid bounds instead of full bounds is algorithmically correct for this partitioning strategy
  • Guard condition logic: Confirm the degenerate case handling (range_end ≤ range_start) produces expected behavior and matches design intent
  • Performance impact: Consider whether this change affects BVH tree quality or construction time

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Use centroids for SAH partition' directly and accurately describes the main change: replacing bounds-based calculations with centroid-based bounds in the SAH partitioning algorithm.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4af3378 and 595473a.

📒 Files selected for processing (2)
  • CHANGELOG.md (1 hunks)
  • warp/native/bvh.cpp (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: StafaH
Repo: NVIDIA/warp PR: 925
File: warp/native/bvh.h:343-357
Timestamp: 2025-09-28T18:49:59.528Z
Learning: In warp/native/bvh.h, CPU-built BVHs now allocate keys (may be zero-initialized), so bvh_get_group_root doesn't need null pointer guards for bvh.keys. User will document proper usage guidelines.
Learnt from: StafaH
Repo: NVIDIA/warp PR: 925
File: warp/native/bvh.h:232-234
Timestamp: 2025-09-28T18:45:51.495Z
Learning: In warp/native/bvh.h, the __ldg float4 casting pattern in bvh_load_node is acceptable because it's guarded by #ifdef __CUDA_ARCH__ and only runs in CUDA device code where alignment/aliasing behavior is predictable.
Learnt from: StafaH
Repo: NVIDIA/warp PR: 925
File: warp/native/bvh.h:359-369
Timestamp: 2025-09-28T18:45:45.955Z
Learning: StafaH prefers to defer the BVH shared stack stride issue (hardcoded WP_BVH_BLOCK_DIM vs runtime blockDim.x) for a future update rather than implementing the dynamic shared memory solution now.
📚 Learning: 2025-09-28T18:49:59.528Z
Learnt from: StafaH
Repo: NVIDIA/warp PR: 925
File: warp/native/bvh.h:343-357
Timestamp: 2025-09-28T18:49:59.528Z
Learning: In warp/native/bvh.h, CPU-built BVHs now allocate keys (may be zero-initialized), so bvh_get_group_root doesn't need null pointer guards for bvh.keys. User will document proper usage guidelines.

Applied to files:

  • CHANGELOG.md
  • warp/native/bvh.cpp
📚 Learning: 2025-09-28T18:45:48.198Z
Learnt from: StafaH
Repo: NVIDIA/warp PR: 925
File: warp/types.py:4097-4100
Timestamp: 2025-09-28T18:45:48.198Z
Learning: The num_groups parameter has been removed from the Warp BVH constructor API, so validation checks related to num_groups consistency are no longer needed.

Applied to files:

  • CHANGELOG.md
🧬 Code graph analysis (1)
warp/native/bvh.cpp (1)
warp/native/vec.h (1)
  • longest_axis (2087-2101)
🔇 Additional comments (3)
CHANGELOG.md (1)

22-22: LGTM!

The changelog entry clearly documents the behavioral change to use centroids instead of bounds for SAH partition axis determination, with appropriate context about build quality improvements.

warp/native/bvh.cpp (2)

396-408: LGTM! Correct implementation of centroid-based partitioning.

The change to compute centroid_bounds from item centers and use it for axis selection and bucket range determination follows the standard SAH BVH construction approach (as described in the Physics Based Rendering book, per the PR discussion). This avoids bias introduced by elongated bounds and should improve build quality.


410-416: LGTM! Proper guard for degenerate centroid distribution.

The guard condition correctly handles the edge case where all item centroids lie at the same position along the split axis (range_end ≤ range_start), preventing division-by-zero in the bucket computation at line 422. Returning range_start causes the partition to produce an empty left side, which is handled by the existing fallback logic in build_recursive (lines 548-552).

Tip

📝 Customizable high-level summaries are now available in beta!

You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting.

  • Provide your own instructions using the high_level_summary_instructions setting.
  • Format the summary however you like (bullet lists, tables, multi-section layouts, contributor stats, etc.).
  • Use high_level_summary_in_walkthrough to move the summary from the description to the walkthrough section.

Example instruction:

"Divide the high-level summary into five sections:

  1. 📝 Description — Summarize the main change in 50–60 words, explaining what was done.
  2. 📓 References — List relevant issues, discussions, documentation, or related PRs.
  3. 📦 Dependencies & Requirements — Mention any new/updated dependencies, environment variable changes, or configuration updates.
  4. 📊 Contributor Summary — Include a Markdown table showing contributions:
    | Contributor | Lines Added | Lines Removed | Files Changed |
  5. ✔️ Additional Notes — Add any extra reviewer context.
    Keep each section concise (under 200 words) and use bullet or numbered lists for clarity."

Note: This feature is currently in beta for Pro-tier users, and pricing will be announced later.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@greptile-apps
Copy link

greptile-apps bot commented Nov 20, 2025

Greptile Overview

Greptile Summary

This PR improves the quality of Surface Area Heuristic (SAH) BVH builds by using centroids instead of full bounds when determining the split axis.

Key Changes:

  • Modified partition_sah_indices in warp/native/bvh.cpp to compute a bounding box of item centroids rather than using the full primitive bounds for axis selection
  • The split axis is now determined by longest_axis(centroid_bounds.edges()) instead of longest_axis(range_bounds.edges())
  • This prevents elongated primitive bounds from biasing the axis choice, resulting in better spatial partitioning
  • Bucket assignment still uses centroids (unchanged), but now both axis selection and bucket range are consistently based on centroid distribution
  • Updated CHANGELOG.md with proper documentation

Technical Correctness:
The implementation correctly computes centroids, handles the degenerate case where range_end <= range_start, and maintains consistency with the existing SAH algorithm. The change aligns with best practices in BVH construction literature where centroid-based splitting typically produces higher quality trees for elongated primitives.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The change is well-understood, algorithmically sound, and follows established BVH construction best practices. The modification is localized to a single function, properly handles edge cases (degenerate partitions), and improves upon the existing implementation without introducing new failure modes. The CHANGELOG is properly updated, and the author confirms tests pass.
  • No files require special attention

Important Files Changed

File Analysis

Filename Score Overview
warp/native/bvh.cpp 5/5 Modified SAH partition to use centroids for axis selection instead of full bounds, improving BVH quality
CHANGELOG.md 5/5 Added changelog entry documenting the SAH BVH centroid-based partitioning improvement

Sequence Diagram

sequenceDiagram
    participant Caller as build_recursive
    participant SAH as partition_sah_indices
    participant Bounds as bounds3
    
    Caller->>SAH: Call with lowers, uppers, indices, range_bounds
    
    Note over SAH: Old: Used range_bounds.edges() for axis selection<br/>New: Compute centroid_bounds from item centers
    
    loop For each item in [start, end)
        SAH->>SAH: Compute item_center = 0.5 * (lower + upper)
        SAH->>Bounds: centroid_bounds.add_point(item_center)
    end
    
    SAH->>SAH: edges = centroid_bounds.edges()
    SAH->>SAH: split_axis = longest_axis(edges)
    
    Note over SAH: Use centroid_bounds for range_start/range_end<br/>instead of full bounds
    
    SAH->>SAH: range_start = centroid_bounds.lower[split_axis]
    SAH->>SAH: range_end = centroid_bounds.upper[split_axis]
    
    alt range_end <= range_start (degenerate case)
        SAH-->>Caller: Return range_start (triggers midpoint fallback)
    else Normal case
        loop For each item in [start, end)
            SAH->>SAH: Assign item to bucket based on centroid position
            SAH->>SAH: Accumulate item bounds in bucket
        end
        
        SAH->>SAH: Compute SAH costs for each split point
        SAH->>SAH: Find minimum cost split
        SAH-->>Caller: Return split_point on split_axis
    end
    
    Caller->>Caller: Partition indices using split_point
    Caller->>Caller: Recursively build left and right subtrees
Loading

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 file reviewed, no comments

Edit Code Review Agent Settings | Greptile

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
warp/native/bvh.cpp (1)

330-336: Optional: Consider centroid-based axis selection for consistency.

For consistency with the SAH approach, partition_median could also compute centroid bounds and use longest_axis(centroid_bounds.edges()) instead of longest_axis(range_bounds.edges()). The partitioning predicate already uses centroids (lines 318-319), so this would align the axis-selection logic across both SAH and median methods.

Example refactor:

 int TopDownBVHBuilder::partition_median(const vec3* lowers, const vec3* uppers, int* indices, int start, int end, bounds3 range_bounds)
 {
     assert(end-start >= 2);
 
-    vec3 edges = range_bounds.edges();
+    bounds3 centroid_bounds;
+    for (int i = start; i < end; ++i)
+    {
+        vec3 item_center = 0.5f * (lowers[indices[i]] + uppers[indices[i]]);
+        centroid_bounds.add_point(item_center);
+    }
+    vec3 edges = centroid_bounds.edges();
 
     int axis = longest_axis(edges);
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f5857bc and 4af3378.

📒 Files selected for processing (1)
  • warp/native/bvh.cpp (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: StafaH
Repo: NVIDIA/warp PR: 925
File: warp/native/bvh.h:343-357
Timestamp: 2025-09-28T18:49:59.528Z
Learning: In warp/native/bvh.h, CPU-built BVHs now allocate keys (may be zero-initialized), so bvh_get_group_root doesn't need null pointer guards for bvh.keys. User will document proper usage guidelines.
Learnt from: StafaH
Repo: NVIDIA/warp PR: 925
File: warp/native/bvh.h:232-234
Timestamp: 2025-09-28T18:45:51.495Z
Learning: In warp/native/bvh.h, the __ldg float4 casting pattern in bvh_load_node is acceptable because it's guarded by #ifdef __CUDA_ARCH__ and only runs in CUDA device code where alignment/aliasing behavior is predictable.
Learnt from: StafaH
Repo: NVIDIA/warp PR: 925
File: warp/native/bvh.h:359-369
Timestamp: 2025-09-28T18:45:45.955Z
Learning: StafaH prefers to defer the BVH shared stack stride issue (hardcoded WP_BVH_BLOCK_DIM vs runtime blockDim.x) for a future update rather than implementing the dynamic shared memory solution now.
📚 Learning: 2025-09-28T18:49:59.528Z
Learnt from: StafaH
Repo: NVIDIA/warp PR: 925
File: warp/native/bvh.h:343-357
Timestamp: 2025-09-28T18:49:59.528Z
Learning: In warp/native/bvh.h, CPU-built BVHs now allocate keys (may be zero-initialized), so bvh_get_group_root doesn't need null pointer guards for bvh.keys. User will document proper usage guidelines.

Applied to files:

  • warp/native/bvh.cpp
🧬 Code graph analysis (1)
warp/native/bvh.cpp (1)
warp/native/vec.h (1)
  • longest_axis (2087-2101)
🔇 Additional comments (2)
warp/native/bvh.cpp (2)

396-404: LGTM! Centroid-based axis selection improves SAH quality.

The change correctly computes centroid bounds and uses them for split axis selection, avoiding bias from elongated item bounds. This is the standard approach for high-quality SAH BVH construction.


407-416: Guard condition correctly handles degenerate centroid case.

When all item centroids coincide (zero extent along split axis), this guard prevents division by zero and triggers the fallback to midpoint split at lines 548-552. The comment accurately describes the behavior.

@shi-eric shi-eric requested a review from AnkaChan November 20, 2025 23:51
@shi-eric shi-eric added this to the 1.11.0 milestone Nov 20, 2025
@shi-eric
Copy link
Contributor

@StafaH (not sure if I left a comment somewhere) Can you also add a changelog entry for this change?

@shi-eric
Copy link
Contributor

Maybe seek @AnkaChan's guidance on whether this is considered as a bug or a change

@AnkaChan
Copy link
Contributor

This is a mis-implementation that hinders the performance. Current implementation does not accord with the one described in Physics Based Rendering book. Mustafa changed it back to be aligned with the algorithm described in the book. I would say it's a change. Because the old implementation is totally a viable algorithm, it's just that its quality is a bit lower.

Signed-off-by: Mustafa Haiderbhai <[email protected]>

Update changelog

Signed-off-by: Mustafa Haiderbhai <[email protected]>
@StafaH
Copy link
Contributor Author

StafaH commented Nov 25, 2025

@shi-eric Changelog is updated, and linked to a change rather than a bug.

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@AnkaChan
Copy link
Contributor

I think this looks all good to me!

@shi-eric shi-eric merged commit 2ae4c93 into NVIDIA:main Nov 26, 2025
16 checks passed
@shi-eric shi-eric linked an issue Nov 26, 2025 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[REQ] Modify SAH Constructor to use centroids rather than bounds

3 participants