-
Notifications
You must be signed in to change notification settings - Fork 394
Fix Explicit release of temporaries due to ref-cycle in fem.borrow_temporary #1075
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix Explicit release of temporaries due to ref-cycle in fem.borrow_temporary #1075
Conversation
Adds explicit release/detach handling in core fem code; full audit and examples to follow in later commits. Signed-off-by: Cucchi01 <[email protected]>
📝 WalkthroughWalkthroughThe pull request implements comprehensive lifecycle management for temporary buffers and arrays across FEM modules. It adds destructors, ownership tracking mechanisms, and explicit release calls to ensure timely deallocation of temporary resources and prevent memory retention. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (4)
warp/_src/fem/geometry/trimesh.py (1)
299-310: Code duplication: identical helpers across geometry modules.The
_replace_owned_arrayand_release_owned_temporariesmethods are duplicated across trimesh.py, tetmesh.py, hexmesh.py, and quadmesh.py. While the pattern is consistent, consider extracting these to a base class or mixin to reduce duplication.Minor style note: Line 300 uses
getattr(self, attr_name, None), which is slightly cleaner than theif hasattr+getattrpattern used in tetmesh.py (line 414), hexmesh.py (line 559), and quadmesh.py (line 295). Consider standardizing on the simpler approach.warp/_src/fem/geometry/tetmesh.py (1)
413-425: Code duplication and minor style inconsistency.These helper methods are duplicated across trimesh.py, tetmesh.py, hexmesh.py, and quadmesh.py. Consider extracting to a shared base class or mixin.
Lines 414-417: The
if hasattr+getattrpattern is slightly more verbose than trimesh.py'sgetattr(self, attr_name, None)approach (line 300). Both are correct, but the simpler version is more concise.warp/_src/fem/geometry/hexmesh.py (1)
558-570: Code duplication: consider refactoring helpers to base class.The
_replace_owned_arrayand_release_owned_temporariesmethods are duplicated across all geometry modules (trimesh.py, tetmesh.py, hexmesh.py, quadmesh.py). While this follows the existing pattern, extracting these to a shared base class would reduce duplication and ensure consistency.Minor: Lines 559-562 use
if hasattr+getattr, while trimesh.py (line 300) uses the simplergetattr(self, attr_name, None).warp/_src/fem/geometry/quadmesh.py (1)
294-306: Code duplication: identical helpers across geometry classes.These methods are duplicated in trimesh.py, tetmesh.py, hexmesh.py, and quadmesh.py. Consider refactoring to a shared base class or mixin to eliminate duplication and ensure uniform behavior.
Style note: Lines 295-298 use
if hasattr+getattr, while trimesh.py (line 300) uses the more concisegetattr(self, attr_name, None)pattern.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
warp/_src/fem/adaptivity.py(8 hunks)warp/_src/fem/domain.py(2 hunks)warp/_src/fem/geometry/adaptive_nanogrid.py(6 hunks)warp/_src/fem/geometry/hexmesh.py(4 hunks)warp/_src/fem/geometry/nanogrid.py(6 hunks)warp/_src/fem/geometry/partition.py(5 hunks)warp/_src/fem/geometry/quadmesh.py(4 hunks)warp/_src/fem/geometry/tetmesh.py(4 hunks)warp/_src/fem/geometry/trimesh.py(4 hunks)warp/_src/fem/integrate.py(6 hunks)warp/_src/fem/quadrature/pic_quadrature.py(4 hunks)warp/_src/fem/space/partition.py(5 hunks)warp/_src/fem/space/restriction.py(6 hunks)warp/_src/fem/utils.py(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (13)
warp/_src/fem/geometry/quadmesh.py (7)
warp/_src/types.py (2)
array(2366-3651)array(4585-4591)warp/_src/fem/geometry/hexmesh.py (2)
_release_owned_temporaries(565-570)_replace_owned_array(558-563)warp/_src/fem/geometry/partition.py (2)
_release_owned_temporaries(353-358)_replace_owned_array(346-351)warp/_src/fem/geometry/tetmesh.py (2)
_release_owned_temporaries(420-425)_replace_owned_array(413-418)warp/_src/fem/geometry/trimesh.py (2)
_release_owned_temporaries(305-310)_replace_owned_array(299-303)warp/_src/fem/quadrature/pic_quadrature.py (2)
_release_owned_temporaries(344-357)_replace_owned_array(337-342)warp/_src/fem/utils.py (1)
masked_indices(178-227)
warp/_src/fem/utils.py (1)
warp/_src/fem/cache.py (1)
borrow_temporary_like(627-645)
warp/_src/fem/geometry/trimesh.py (5)
warp/_src/fem/geometry/hexmesh.py (2)
_release_owned_temporaries(565-570)_replace_owned_array(558-563)warp/_src/fem/geometry/partition.py (2)
_release_owned_temporaries(353-358)_replace_owned_array(346-351)warp/_src/fem/geometry/quadmesh.py (2)
_release_owned_temporaries(301-306)_replace_owned_array(294-299)warp/_src/fem/geometry/tetmesh.py (2)
_release_owned_temporaries(420-425)_replace_owned_array(413-418)warp/_src/fem/utils.py (1)
masked_indices(178-227)
warp/_src/fem/geometry/tetmesh.py (5)
warp/_src/fem/geometry/hexmesh.py (2)
_release_owned_temporaries(565-570)_replace_owned_array(558-563)warp/_src/fem/geometry/partition.py (2)
_release_owned_temporaries(353-358)_replace_owned_array(346-351)warp/_src/fem/geometry/quadmesh.py (2)
_release_owned_temporaries(301-306)_replace_owned_array(294-299)warp/_src/fem/geometry/trimesh.py (2)
_release_owned_temporaries(305-310)_replace_owned_array(299-303)warp/_src/fem/utils.py (1)
masked_indices(178-227)
warp/_src/fem/geometry/partition.py (4)
warp/_src/fem/geometry/hexmesh.py (2)
_release_owned_temporaries(565-570)_replace_owned_array(558-563)warp/_src/fem/geometry/tetmesh.py (2)
_release_owned_temporaries(420-425)_replace_owned_array(413-418)warp/_src/fem/cache.py (1)
borrow_temporary(594-624)warp/_src/fem/utils.py (1)
masked_indices(178-227)
warp/_src/fem/quadrature/pic_quadrature.py (3)
warp/_src/fem/geometry/tetmesh.py (2)
_release_owned_temporaries(420-425)_replace_owned_array(413-418)warp/_src/fem/cache.py (4)
borrow_temporary(594-624)detach(447-450)detach(526-528)TemporaryStore(425-582)warp/_src/fem/utils.py (1)
compress_node_indices(81-167)
warp/_src/fem/geometry/nanogrid.py (3)
warp/_src/fem/geometry/hexmesh.py (1)
_release_owned_temporaries(565-570)warp/_src/fem/utils.py (1)
masked_indices(178-227)warp/_src/fem/cache.py (2)
detach(447-450)detach(526-528)
warp/_src/fem/space/restriction.py (6)
warp/_src/fem/geometry/partition.py (2)
rebuild(461-493)_release_owned_temporaries(353-358)warp/_src/fem/geometry/quadmesh.py (1)
_release_owned_temporaries(301-306)warp/_src/fem/geometry/tetmesh.py (1)
_release_owned_temporaries(420-425)warp/_src/fem/quadrature/pic_quadrature.py (1)
_release_owned_temporaries(344-357)warp/_src/fem/cache.py (1)
borrow_temporary_like(627-645)warp/_src/fem/space/topology.py (1)
element_node_indices(118-158)
warp/_src/fem/domain.py (2)
warp/_src/fem/utils.py (1)
masked_indices(178-227)warp/_src/fem/cache.py (2)
detach(447-450)detach(526-528)
warp/_src/fem/integrate.py (2)
warp/_src/types.py (4)
numpy(3356-3379)numpy(3816-3819)array(2366-3651)array(4585-4591)warp/_src/fem/cache.py (1)
_release_temporary(577-582)
warp/_src/fem/geometry/adaptive_nanogrid.py (3)
warp/_src/fem/utils.py (1)
masked_indices(178-227)warp/_src/fem/geometry/nanogrid.py (1)
_replace_boundary_face_indices(132-136)warp/_src/fem/cache.py (2)
detach(447-450)detach(526-528)
warp/_src/fem/space/partition.py (10)
warp/_src/fem/geometry/hexmesh.py (1)
_release_owned_temporaries(565-570)warp/_src/fem/geometry/nanogrid.py (1)
_release_owned_temporaries(138-141)warp/_src/fem/geometry/partition.py (1)
_release_owned_temporaries(353-358)warp/_src/fem/geometry/quadmesh.py (1)
_release_owned_temporaries(301-306)warp/_src/fem/geometry/tetmesh.py (1)
_release_owned_temporaries(420-425)warp/_src/fem/geometry/trimesh.py (1)
_release_owned_temporaries(305-310)warp/_src/fem/quadrature/pic_quadrature.py (1)
_release_owned_temporaries(344-357)warp/_src/fem/space/restriction.py (2)
_release_owned_temporaries(247-256)node_count(177-179)warp/_src/types.py (3)
array(2366-3651)array(4585-4591)dtype(4758-4764)warp/_src/fem/cache.py (1)
borrow_temporary(594-624)
warp/_src/fem/geometry/hexmesh.py (7)
warp/_src/types.py (2)
array(2366-3651)array(4585-4591)warp/_src/fem/geometry/nanogrid.py (1)
_release_owned_temporaries(138-141)warp/_src/fem/geometry/partition.py (2)
_release_owned_temporaries(353-358)_replace_owned_array(346-351)warp/_src/fem/geometry/quadmesh.py (2)
_release_owned_temporaries(301-306)_replace_owned_array(294-299)warp/_src/fem/geometry/tetmesh.py (2)
_release_owned_temporaries(420-425)_replace_owned_array(413-418)warp/_src/fem/geometry/trimesh.py (2)
_release_owned_temporaries(305-310)_replace_owned_array(299-303)warp/_src/fem/utils.py (1)
masked_indices(178-227)
🪛 Ruff (0.14.3)
warp/_src/fem/domain.py
465-465: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (20)
warp/_src/fem/geometry/partition.py (7)
180-185: LGTM: Proper destructor for resource cleanup.The addition of
__del__to ensure timely release of owned temporaries is appropriate and aligns with the PR's goal of fixing delayed buffer releases.
312-335: LGTM: Explicit temporary management pattern.The refactoring from direct assignment to using intermediate variables +
_replace_owned_array()+ explicitrelease()correctly ensures timely return of temporaries to the pool. This pattern successfully breaks the reference cycle described in the PR objectives.
341-344: LGTM: Clean encapsulation for atomic updates.The helper method provides a clear interface for replacing all three side index arrays atomically.
346-351: LGTM: Robust replacement with identity check.The identity check
old_value is not new_valuecorrectly prevents releasing an array whenmasked_indicesreuses the passed-in buffer, ensuring safe resource management.
353-358: LGTM: Standard cleanup pattern.The cleanup method follows the same pattern as similar code in
hexmesh.pyandtetmesh.py. Note that this will callrelease()multiple times if attributes share the same object (addressed in separate comment on lines 242-243).
456-459: LGTM: Proper cleanup in inheritance hierarchy.Releasing class-specific temporaries (
_cells,_partition_cells) before callingsuper().__del__()ensures the parent class can clean up inherited temporaries. The explicitsuper().__del__()call is appropriate here since the parent's destructor is central to this PR's resource management goals.
477-485: LGTM: Consistent explicit management pattern.The refactoring to use intermediate variables and
_replace_owned_array()aligns with the changes incompute_side_indices_from_cells, ensuring consistent and timely temporary management across rebuild operations.warp/_src/fem/utils.py (3)
127-129: Ownership tracking logic looks correct.The function correctly tracks whether it allocated
unique_node_indices(and thus owns it) or whether it was provided by the caller. This ensures proper lifecycle management where only owned temporaries are released.
131-133: Ownership tracking is consistent.The ownership logic for
unique_node_countmirrors the approach forunique_node_indices, maintaining consistency in resource management.
161-164: Conditional release logic is correct.The function properly releases owned temporaries only when they're not returned to the caller. The dual ownership checks ensure that:
- Provided arrays are never inappropriately released
- Allocated temporaries are cleaned up when not needed
- Returned temporaries transfer ownership to the caller
warp/_src/fem/domain.py (3)
466-468: Ownership semantics are correctly implemented.When
element_indicesis provided by the caller, the Subdomain correctly marks it as not owned. The comment clearly explains the rationale, ensuring that caller-provided temporaries remain the caller's responsibility.
490-495: Destructor implementation is defensively coded.The destructor correctly implements defensive checks:
- Uses
getattrwith defaults to handle partial initialization- Checks for
releasemethod before calling- Clears the reference after release to prevent double-release
The ownership-based conditional ensures that only Subdomain-allocated temporaries are released, while caller-provided arrays remain untouched.
458-463: No issues found—the detach()/release() pattern is correct and safe.The code properly manages temporary arrays:
detach()removes the array from the temporary pool's auto-recycling and updates its deleter to use the allocator's default cleanup- The detached array retains its
release()method throughout its lifetimeelement_global_to_local.release()immediately returns the unused temporary to the pool- The destructor's defensive check (
hasattr(element_indices, "release")) safely callsrelease()on the detached array for explicit cleanupThis pattern is used consistently throughout the codebase (as seen in trimesh.py, tetmesh.py, quadmesh.py, etc.) and is the intended way to convert temporary arrays into owned storage.
warp/_src/fem/space/restriction.py (1)
239-256: Solid helper for temporary ownership.Centralizing the release logic here keeps every caller honest and pairs nicely with the destructor hook—no issues spotted.
warp/_src/fem/integrate.py (1)
1327-1337: Great call on the explicit release path.Releasing the borrowed accumulator before returning keeps the temporary pool tight without changing control flow—nice polish.
warp/_src/fem/geometry/trimesh.py (2)
98-99: LGTM: Destructor ensures owned temporaries are released.The destructor correctly invokes
_release_owned_temporariesto clean up owned temporary arrays, preventing memory retention when the mesh object is destroyed.
291-295: LGTM: Proper lifecycle for boundary indices.The code correctly:
- Calls
masked_indicesto extract boundary edge data- Detaches and stores
boundary_edge_indicesvia_replace_owned_array- Releases the unused
boundary_edge_global_to_localmapping immediatelyThis ensures the owned temporary is managed and the temporary mapping is returned to the pool.
warp/_src/fem/geometry/tetmesh.py (1)
96-97: LGTM: Destructor ensures cleanup of owned temporaries.The destructor correctly delegates to
_release_owned_temporariesto free vertex and boundary topology arrays.warp/_src/fem/geometry/hexmesh.py (1)
187-188: LGTM: Destructor properly releases owned temporaries.The destructor ensures that vertex and boundary face topology arrays are cleaned up when the mesh is destroyed.
warp/_src/fem/geometry/quadmesh.py (1)
92-93: LGTM: Destructor ensures proper cleanup.The destructor correctly calls
_release_owned_temporariesto free vertex and boundary edge topology arrays.
| empty = cache.borrow_temporary(temporary_store, dtype=int, shape=(0,), device=device) | ||
| self._set_partition_side_arrays(empty, empty, empty) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical: Multiple release calls on same temporary object.
Assigning the same empty temporary to all three side arrays will cause release() to be called multiple times on the same object when _release_owned_temporaries() or subsequent _replace_owned_array() calls execute. This creates a double/triple-free scenario that could corrupt the temporary pool or cause crashes.
Apply this diff to create separate empty temporaries:
- empty = cache.borrow_temporary(temporary_store, dtype=int, shape=(0,), device=device)
- self._set_partition_side_arrays(empty, empty, empty)
+ empty_partition = cache.borrow_temporary(temporary_store, dtype=int, shape=(0,), device=device)
+ empty_boundary = cache.borrow_temporary(temporary_store, dtype=int, shape=(0,), device=device)
+ empty_frontier = cache.borrow_temporary(temporary_store, dtype=int, shape=(0,), device=device)
+ self._set_partition_side_arrays(empty_partition, empty_boundary, empty_frontier)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| empty = cache.borrow_temporary(temporary_store, dtype=int, shape=(0,), device=device) | |
| self._set_partition_side_arrays(empty, empty, empty) | |
| empty_partition = cache.borrow_temporary(temporary_store, dtype=int, shape=(0,), device=device) | |
| empty_boundary = cache.borrow_temporary(temporary_store, dtype=int, shape=(0,), device=device) | |
| empty_frontier = cache.borrow_temporary(temporary_store, dtype=int, shape=(0,), device=device) | |
| self._set_partition_side_arrays(empty_partition, empty_boundary, empty_frontier) |
🤖 Prompt for AI Agents
In warp/_src/fem/geometry/partition.py around lines 242 to 243, the code assigns
the same temporary `empty` to all three side arrays which leads to multiple
release() calls on the same temporary; to fix this, call cache.borrow_temporary
separately three times to create three distinct empty temporaries (one per side
array) with the same dtype/shape/device and pass those three distinct
temporaries into self._set_partition_side_arrays so each owned temporary is
released exactly once.
|
Thanks @Cucchi01 for this thorough PR! I 'll have a closer look a the changes. In the meantime: here's a branch https://github.com/gdaviet/warp/tree/fix/temporary_self_reference that marks the Edit: I have switched to a less intrusive construct, using an array view. Downside is that we lose the deprecation warning, but at least it does not affect other array instances |
Yes, in this case releasing the old temporary first should be fine (there is not reason for the struct argument lifetime to exceed that of the class attribute), though in some situations I would still prefer to have the garbage collector be in charge of the release (this was the main reason for the refactor of the Temporary object, as the old way of having separate temporary and arrays was creating too much logic related external accesses). |
|
Perfect, then I am going to use your private branch for the moment @gdaviet. Thank you! |
|
When investigating this I actually found a couple more ref cycles. The fixes are now merged back to the main branch, and the memory usage is now much more stable! |
Context / Problem
fem.borrow_temporary()gives backwarp.cache.Temporaryobjects whosearrayattribute self-references the temporary. If we simply drop local references, the buffers keep a ref-cycle and only return to the pool when Python’s GC runs, contributing to situation like this issue in the related Newton repository newton-physics/newton#1042Possible alternatives in my opinion
Wrap
temporary.arrayin aweakref.proxy.temp.arrayor passes it to kernels now receives a proxy, sowp.launchcan’t infer types (weakref.ProxyTypeshows up as the dtype).Patch
wp.arrayitself (inside Warp) soarraybecomes a property and the temporary installs weakref-backedrelease/detach.wp.array: every array-like would be affected even if they never touchfem.borrow_temporary.Explicit release/detach in the whole codebase.
TemporaryAPI but audit the Warp library to release or detach buffers.Notes
self.attr = fem.borrow_temporary(...), in my changes the old buffer is released first.Even in situation in which the temporary could be returned externally, e.g.:
Please correct me if this is not the expected behavior.
Still pending
fem.integrate()call sites outside the core solver still need to be changed. I'm going to do a follow-up commit to cover those.Before your PR is "Ready for review"
__init__.pyi,functions.rst)pre-commit run -aSummary by CodeRabbit
Bug Fixes
Performance