Skip to content

Fix O(N²) entity addition in CallbackGroup (#2942)#3166

Open
PavelGuzenfeld wants to merge 3 commits into
ros2:rollingfrom
PavelGuzenfeld:fix/callback-group-quadratic-add-2942
Open

Fix O(N²) entity addition in CallbackGroup (#2942)#3166
PavelGuzenfeld wants to merge 3 commits into
ros2:rollingfrom
PavelGuzenfeld:fix/callback-group-quadratic-add-2942

Conversation

@PavelGuzenfeld

@PavelGuzenfeld PavelGuzenfeld commented Jun 6, 2026

Copy link
Copy Markdown

Fixes #2942

The add_* methods (add_timer, add_subscription, add_service, add_client, add_waitable) scanned the whole entity list to drop expired weak_ptrs on every call, so adding N entities was O(N²). In the #2942 reproducer (10,000 timers) that scan is ~96% of the wall time.

This moves the expired-entry cleanup out of the add_* methods — which become a plain push_back — and into collect_all_ptrs, which already walks every entry on each executor spin and so can compact in the same pass at no extra cost (erase/remove_if). collect_all_ptrs is now non-const; there's no public API or ABI change.

It's safe because collect_all_ptrs already skipped expired entries and now also removes them, the executor calls it every spin so cleanup stays prompt, and the find_*_ptrs_if helpers already tolerate expired weak_ptrs.

Reproducer from #2942 (10,000 timers): 429 ms → 6 ms.

Adds test_callback_group_entity_cleanup.cpp covering compaction during collect_all_ptrs, live-only collection, non-quadratic insertion, interleaved add/remove, and mixed entity types.


Some of this change was produced with Claude Opus 4.6 (Anthropic).

Move expired weak_ptr cleanup from add_* methods (O(N) per call, O(N²)
total when adding N entities) into collect_all_ptrs via a new
collect_and_compact helper. collect_all_ptrs already iterates all
entries on every executor spin, so cleanup happens at zero extra cost.

The add_* methods become O(1) (just push_back). This fixes the
quadratic slowdown reported in ros2#2942 (429ms → 6ms for 10,000 timers).

This change is safe because:
- collect_all_ptrs already skipped expired entries; now it also removes
  them in the same pass
- The executor calls collect_all_ptrs on every spin iteration, so
  expired entries are cleaned up promptly
- All find_*_ptrs_if methods already handle expired weak_ptrs gracefully

Fixes ros2#2942

Signed-off-by: Pavel Guzenfeld <pavelguzenfeld@gmail.com>
Generated-by: Claude Opus 4.6 (Anthropic)
Tests verify that:
- Expired weak_ptrs are compacted during collect_all_ptrs
- collect_all_ptrs only yields live entities
- Adding N timers is not quadratic (5000 timers in < 5s)
- Interleaved add/remove cycles produce correct entity counts
- Mixed entity types (timers + subscriptions) are cleaned independently

Signed-off-by: Pavel Guzenfeld <pavelguzenfeld@gmail.com>
Generated-by: Claude Opus 4.6 (Anthropic)
- Remove mutable from entity vectors, make collect_all_ptrs non-const
- Replace collect_and_compact helper with std::remove_if (erase-remove idiom)
- Remove leaked test_time_source_deadlock CMake entry from another branch
- Fix uncrustify formatting in test file

Signed-off-by: Pavel Guzenfeld <pavelguzenfeld@gmail.com>
Generated-by: Claude Opus 4.6 (Anthropic)
@skyegalaxy skyegalaxy requested a review from wjwwood June 11, 2026 16:48
@iv461

iv461 commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

I would like to comment on this PR since I'm the author of the issue: This issue is both not really relevant for real use cases of rclcpp (or are you really adding more than a couple hundred entities in the same callback group ?) and also difficult to fix properly.
What you are doing is simply moving the linear search for expired references to be done on every wake of the executor, which will slow down the execution of nodes and increase CPU usage. Probably not significantly unless you have at least hundreds of entities, but still, it does not solve the fundamental algorithmic problem of having to scan for expired references.
So, I would suggest not merging this PR.

The reason I created the issue was a task throughput benchmark that I did, which is a standard benchmark for event loop implementations, but still only a benchmark.
I think the proper fix for this issue is to remove the need for weak references at all, since this eliminates the need for a search for expired ones. For example, the executor can keep strong references to the entities, the entities store their callback group ID, instead of the callback group object keeping references to the entities that belong to it. But this likely requires a large refactoring and API-breaking changes.

Edit: I overlooked a linear search is already happening in the function collect_all_ptrs, so adding another does not slow down asymptotically. So there is no concern of increased CPU usage. Therefore, I would actually suggest merging this PR, since it fixes the quadratic time algorithm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Why does adding N entities to a CallbackGroup take N^2, i.e. quadratic time ?

2 participants