Skip to content

Conversation

@kyle-basis
Copy link
Contributor

When calling setTransform(), rotation data would flow as ros_msgs::Quaternion -> tf2::Quaternion -> tf2::Matrix3x3, then reconvert to tf2::Quaternion 12 times to check validity of the rotation, then do one final conversion to actually store it inside the buffer.

When calling lookupTransform(), we'd go from tf2::Quaternion->tf2::Matrix3x3->tf2::Quaternion (x4)->ros_msgs::Quaternion.

I've fixed this in two parts -

  1. Change setTransformImpl and lookupTransformImpl to take a vector+quat, eliminating both the roundtrip through tf2::Transfrom and the possible footgun of getRotation
  2. In places that still need a tf2::Transform, only call getRotation() once, and reusing the result afterwards
  3. I did not change the reference frame version of lookupTransform nor lookupVelocity, those are far less mechanical changes - would appreciate if someone fixed these up in a followup PR

Some notes:

  1. getRotation() is poorly named - would recommend deprecating and renaming to calculateRotation or similar. I shot myself in the foot implementing tf2_basis and not realizing that origin is by reference and rotation is by value.
  2. Part of the reason this happened is Draft: Break geometry_msgs dependency in tf2 #732 - there are conversion functions that actually do the conversion correctly (not converting once per component) but they are unable to be used due to circular reference
  3. It feels like tf2::Transform should really be a vector+quat, but it would be an API change

bool setTransformImpl(
const tf2::Transform & transform_in, const std::string frame_id,
const std::string child_frame_id, const TimePoint stamp,
const tf2::Vector3 & origin_in, const tf2::Quaternion & rotation_in, const std::string & frame_id,
Copy link
Contributor Author

@kyle-basis kyle-basis Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: I changed the string values into references - this really should be a std::string_view - it shouldn't break any code, and ROS is c++17 now - but it's somewhat a moot point as SSO will kick in for most char* conversions.

@kyle-basis
Copy link
Contributor Author

@kscottz - tagging as requested

@kyle-basis
Copy link
Contributor Author

Note: I've broken all tests, I'll fix it - the base idea is sound, however

@kyle-basis
Copy link
Contributor Author

Never mind - appears to just be tests using hardcoded quaternion values, rather than actually checking equality - I'm happy to fix this either by manually fixing the test quaternions, or by doing some nicer equality check. This does mean that downstream users will sometimes get out different (but equiv) rotations from the library than they got before.

@kyle-basis
Copy link
Contributor Author

Pushed what should hopefully fix the failing tests. q == -q

@kyle-basis
Copy link
Contributor Author

Hit a failed test in CI, I don't think this is my fault - ci flake?

02:29:00 5:  - /usr/bin/python3 -m launch_testing.launch_test /tmp/ws/src/geometry2/test_tf2/test/buffer_client_tester.launch.py --junit-xml=/tmp/ws/test_results/test_tf2/test_buffer_client_tester.launch.py.xunit.xml --package-name=test_tf2
02:29:00 5: [INFO] [launch]: All log files can be found below /home/buildfarm/.ros/log/2024-12-04-02-29-00-675074-4d63d25031d7-4049
02:29:00 5: [INFO] [launch]: Default logging verbosity is set to INFO
02:29:00 5: test_termination (test_tf2.TestBufferClient.test_termination) ... [INFO] [static_transform_publisher-1]: process started with pid [4053]
02:29:00 5: [INFO] [test_buffer_server-2]: process started with pid [4054]
02:29:00 5: [INFO] [test_buffer_client-3]: process started with pid [4055]
02:29:00 5: [INFO] [python3-4]: process started with pid [4056]
02:29:00 5: [static_transform_publisher-1] [INFO] [1733308140.776567271] [static_transform_publisher_VNHQnJSRyXBSDNpK]: Spinning until stopped - publishing transform
02:29:00 5: [static_transform_publisher-1] translation: ('5.000000', '6.000000', '7.000000')
02:29:00 5: [static_transform_publisher-1] rotation: ('0.000000', '0.000000', '0.000000', '1.000000')
02:29:00 5: [static_transform_publisher-1] from 'a' to 'b'
02:29:00 5: [test_buffer_client-3] [==========] Running 2 tests from 1 test suite.
02:29:01 5: [test_buffer_client-3] [----------] Global test environment set-up.
02:29:01 5: [test_buffer_client-3] [----------] 2 tests from tf2_ros
02:29:01 5: [test_buffer_client-3] [ RUN      ] tf2_ros.buffer_client
02:29:01 5: [test_buffer_client-3] FOOO
02:29:01 5: [test_buffer_client-3] [INFO] [1733308142.718698560] [tf_action_node]: p1: (0.00, 0.00, 0.00), p2: (-5.00, -6.00, -7.00)
02:29:02 5: [test_buffer_client-3] [       OK ] tf2_ros.buffer_client (1014 ms)
02:29:02 5: [test_buffer_client-3] [ RUN      ] tf2_ros.buffer_client_different_types
02:29:02 5: [test_buffer_client-3] [ERROR] [1733308142.795791742] [tf_action_node]: Bullet: (-1.0000, 0.0000, 0.0000)
02:29:02 5: [test_buffer_client-3] [ERROR] [1733308142.795866165] [tf_action_node]: KDL: (0.0000, 0.0000, 0.0000)
02:29:02 5: [test_buffer_client-3] /tmp/ws/src/geometry2/test_tf2/test/test_buffer_client.cpp:127: Failure
02:29:02 5: [test_buffer_client-3] The difference between b1[0] and -5.0 is 4, which exceeds EPS, where
02:29:02 5: [test_buffer_client-3] b1[0] evaluates to -1,
02:29:02 5: [test_buffer_client-3] -5.0 evaluates to -5, and
02:29:02 5: [test_buffer_client-3] EPS evaluates to 0.001.
02:29:02 5: [test_buffer_client-3] 
02:29:02 5: [test_buffer_client-3] /tmp/ws/src/geometry2/test_tf2/test/test_buffer_client.cpp:128: Failure
02:29:02 5: [test_buffer_client-3] The difference between b1[1] and -6.0 is 6, which exceeds EPS, where
02:29:02 5: [test_buffer_client-3] b1[1] evaluates to 0,
02:29:02 5: [test_buffer_client-3] -6.0 evaluates to -6, and
02:29:02 5: [test_buffer_client-3] EPS evaluates to 0.001.
02:29:02 5: [test_buffer_client-3] 
02:29:02 5: [test_buffer_client-3] /tmp/ws/src/geometry2/test_tf2/test/test_buffer_client.cpp:129: Failure
02:29:02 5: [test_buffer_client-3] The difference between b1[2] and -7.0 is 7, which exceeds EPS, where
02:29:02 5: [test_buffer_client-3] b1[2] evaluates to 0,
02:29:02 5: [test_buffer_client-3] -7.0 evaluates to -7, and
02:29:02 5: [test_buffer_client-3] EPS evaluates to 0.001.
02:29:02 5: [test_buffer_client-3] 
02:29:02 5: [test_buffer_client-3] [  FAILED  ] tf2_ros.buffer_client_different_types (74 ms)
02:29:02 5: [test_buffer_client-3] [----------] 2 tests from tf2_ros (1089 ms total)
02:29:02 5: [test_buffer_client-3] 
02:29:02 5: [test_buffer_client-3] [----------] Global test environment tear-down
02:29:02 5: [test_buffer_client-3] [==========] 2 tests from 1 test suite ran. (1089 ms total)
02:29:02 5: [test_buffer_client-3] [  PASSED  ] 1 test.
02:29:02 5: [test_buffer_client-3] [  FAILED  ] 1 test, listed below:
02:29:02 5: [test_buffer_client-3] [  FAILED  ] tf2_ros.buffer_client_different_types
02:29:02 5: [test_buffer_client-3] 
02:29:02 5: [test_buffer_client-3]  1 FAILED TEST

Locally:

[test_buffer_client-3] [==========] Running 2 tests from 1 test suite.
[test_buffer_client-3] [----------] Global test environment set-up.
[test_buffer_client-3] [----------] 2 tests from tf2_ros
[test_buffer_client-3] [ RUN      ] tf2_ros.buffer_client
[test_buffer_client-3] FOOO
[test_buffer_client-3] [INFO] [1733348400.995224948] [tf_action_node]: p1: (0.00, 0.00, 0.00), p2: (-5.00, -6.00, -7.00)
[test_buffer_client-3] [       OK ] tf2_ros.buffer_client (403 ms)
[test_buffer_client-3] [ RUN      ] tf2_ros.buffer_client_different_types
[test_buffer_client-3] [ERROR] [1733348401.032240829] [tf_action_node]: Bullet: (-5.0000, -6.0000, -7.0000)
[test_buffer_client-3] [ERROR] [1733348401.032402204] [tf_action_node]: KDL: (0.0000, 0.0000, 0.0000)
[test_buffer_client-3] [       OK ] tf2_ros.buffer_client_different_types (30 ms)
[test_buffer_client-3] [----------] 2 tests from tf2_ros (434 ms total)
[test_buffer_client-3] 
[test_buffer_client-3] [----------] Global test environment tear-down
[test_buffer_client-3] [==========] 2 tests from 1 test suite ran. (434 ms total)
[test_buffer_client-3] [  PASSED  ] 2 tests.
[INFO] [test_buffer_client-3]: process has finished cleanly [pid 254386]
ok

@kscottz
Copy link
Contributor

kscottz commented Dec 4, 2024

Sorry about the delay, had a power outage this morning. I'll take a look but I am not super familiar with the internals of TF / Geometry. Thanks for sticking with this and trying to get it over the line. We really appreciate the help and could always use a few more contributors. ❤️

Note: I've broken all tests, I'll fix it - the base idea is sound, however

I figured you might hit a few flakey tests as anything that relies on float / double ends up being a bit brittle. I can't speak to any of these tests being historically flakey. I think setting the epsilons to five or six sig figs should be sufficient.

This could probably use some documentation

Absolutely. We're starting to have conversations about bringing in someone to address API doc deficiencies but that's still far off. I would need more familiarity to do it myself.

Never mind - appears to just be tests using hardcoded quaternion values, rather than actually checking equality - I'm happy to fix this either by manually fixing the test quaternions, or by doing some nicer equality check. This does mean that downstream users will sometimes get out different (but equiv) rotations from the library than they got before.

This one's on you and what you're up for. I would have to look at the individual tests to make a call. Please do keep in mind that more you change on the lower level code the more likely it is that it may break something up stream. Smaller deltas are always better and get finished faster. You can always circle back

isNan() or isValid() on Vector/Quaternion would be really nice for this, along with a normalization check on quaternion - this code doesn't belong here.

I'll need to look at these a bit more but probably warrants an issue with a good first issue tag.

@clalancette
Copy link
Contributor

I figured you might hit a few flakey tests as anything that relies on float / double ends up being a bit brittle. I can't speak to any of these tests being historically flakey. I think setting the epsilons to five or six sig figs should be sufficient.

For what it is worth, we have no reports of flakey tests from our nightly CI in this package that I can remember, and nothing is reported in https://github.com/ros2/geometry2/issues . So while it is always possible that there is a flakey test here, we do not currently know about any.

EXPECT_NEAR(q_simple.quaternion.z, -1 * M_SQRT1_2, EPS);
EXPECT_NEAR(q_simple.quaternion.w, 0, EPS);
{
const geometry_msgs::msg::QuaternionStamped q_simple = tf_buffer->transform(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might help the reviewers, and readability, if you articulate what's going on in this function call. I've been looking at this for ten minutes and it still isn't quite clear what's going on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an intermediate variable - I think the diff moved your comment though, which line specifically is confusing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, that doesn't really tell me what the test does, it just tells me what the result should be. What would really help readers is something like, // Frame a and frame b differ by [x,y,z] and [p,q,r] at time t, check the resulting quat is <foo>

That's why I dislike this big ball of wax test pattern, when you write one function per test you can name the tests something sane like, "check_basic_frame_transform" and it all makes sense. The original author really should have at least left us a few bread crumb comments. 😢

Anyway, not your fault, thanks for putting up with this stuff.

@kyle-basis
Copy link
Contributor Author

Sorry about that - done

Copy link
Contributor

@ahcorde ahcorde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kyle-basis
Copy link
Contributor Author

No problems found??

@kyle-basis
Copy link
Contributor Author

Fixed. Will work on perf testing today or tomorrow.

@kyle-basis
Copy link
Contributor Author

kyle-basis commented Dec 19, 2024

Will fix the BUFFER_CORE_INTERFACE_HEADER_DEPERCATION warnings, too.

...should I make a PR to fix the spelling of that warning?

Edit: these warnings don't appear to be my fault.

Edit2: the warnings were in fact my fault

@kyle-basis
Copy link
Contributor Author

kyle-basis commented Dec 27, 2024

@kscottz I wrote a benchmark:

#include <benchmark/benchmark.h>

#include "tf2_ros/buffer.h"

static void benchmark_set_transform(benchmark::State & state)
{
    rclcpp::Clock::SharedPtr clock = std::make_shared<rclcpp::Clock>(RCL_SYSTEM_TIME);
    tf2_ros::Buffer buffer(clock);
    buffer.setUsingDedicatedThread(false);

    geometry_msgs::msg::TransformStamped transform;
    transform.header.frame_id = "foo";
    transform.header.stamp.sec = 0;
    transform.header.stamp.nanosec = 0;
    transform.child_frame_id = "bar";
    transform.transform.translation.x = 42.0;
    transform.transform.translation.y = -3.14;
    transform.transform.translation.z = 0.0;
    transform.transform.rotation.x = 0.591922;
    transform.transform.rotation.y = 0.210684;
    transform.transform.rotation.z = -0.762476;
    transform.transform.rotation.w = 0.154502;

    for (auto _ : state) {
        transform.header.stamp.nanosec++;
        buffer.setTransform(transform, "benchmark");   
    }
}

BENCHMARK(benchmark_set_transform);

before:

basis@ubuntu:~/tf2_ws/src/geometry2$ ./build/tf2_ros/buffer_benchmark --benchmark_repetitions=10
2024-12-27T14:54:09-08:00
Running ./build/tf2_ros/buffer_benchmark
Run on (8 X 2201.6 MHz CPU s)
CPU Caches:
  L1 Data 64 KiB (x8)
  L1 Instruction 64 KiB (x8)
  L2 Unified 256 KiB (x8)
  L3 Unified 2048 KiB (x2)
Load Average: 0.57, 1.27, 1.16
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
-------------------------------------------------------------------------
Benchmark                               Time             CPU   Iterations
-------------------------------------------------------------------------
benchmark_set_transform             80170 ns        80080 ns        23146
benchmark_set_transform             94480 ns        94391 ns        23146
benchmark_set_transform             64783 ns        64723 ns        23146
benchmark_set_transform             64649 ns        64590 ns        23146
benchmark_set_transform             85981 ns        85908 ns        23146
benchmark_set_transform             64721 ns        64663 ns        23146
benchmark_set_transform             93638 ns        93547 ns        23146
benchmark_set_transform             99861 ns        99763 ns        23146
benchmark_set_transform            103710 ns       103613 ns        23146
benchmark_set_transform             64942 ns        64883 ns        23146
benchmark_set_transform_mean        81694 ns        81616 ns           10
benchmark_set_transform_median      83076 ns        82994 ns           10
benchmark_set_transform_stddev      15940 ns        15924 ns           10
benchmark_set_transform_cv          19.51 %         19.51 %            10

after

basis@ubuntu:~/tf2_ws/src/geometry2$ ./build/tf2_ros/buffer_benchmark --benchmark_repetitions=10
2024-12-27T14:58:55-08:00
Running ./build/tf2_ros/buffer_benchmark
Run on (8 X 2201.6 MHz CPU s)
CPU Caches:
  L1 Data 64 KiB (x8)
  L1 Instruction 64 KiB (x8)
  L2 Unified 256 KiB (x8)
  L3 Unified 2048 KiB (x2)
Load Average: 2.31, 1.35, 1.16
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
-------------------------------------------------------------------------
Benchmark                               Time             CPU   Iterations
-------------------------------------------------------------------------
benchmark_set_transform             79914 ns        79811 ns        23001
benchmark_set_transform             94020 ns        93897 ns        23001
benchmark_set_transform             64227 ns        64148 ns        23001
benchmark_set_transform             64270 ns        64194 ns        23001
benchmark_set_transform             88543 ns        88430 ns        23001
benchmark_set_transform             64285 ns        64207 ns        23001
benchmark_set_transform             84061 ns        83955 ns        23001
benchmark_set_transform            101408 ns       101276 ns        23001
benchmark_set_transform            106780 ns       106642 ns        23001
benchmark_set_transform             64476 ns        64398 ns        23001
benchmark_set_transform_mean        81198 ns        81096 ns           10
benchmark_set_transform_median      81988 ns        81883 ns           10
benchmark_set_transform_stddev      16436 ns        16412 ns           10
benchmark_set_transform_cv          20.24 %         20.24 %            10

This was run on a Jetson Orin kit, with cpu scaling governer set to performance. TBH the results aren't amazing - it looks to be around 2% faster (and sometimes slightly slower - struggling to actually disable CPU scaling on this machine). This could also be me failing to write a good benchmark. I'd suspected this to be the case - if it were an actual performance problem, someone would have raised a flag before me. I think in terms of code quality this is the correct solution, however.

@scott-robotics
Copy link

scott-robotics commented Jan 6, 2025

Does this not all go away if we use a quaternion for rotation in Transform instead of a matrix? The Transform interface can remain identical -- getBasis() converts a quaternion to matrix instead.

@scott-robotics
Copy link

I was curious about this myself. I replaced the Matrix in tf2::Transform with a Quaternion. 50% speed up in setTransform! (~120ns 😅 )

lookupTransform largely unaffected

see benchmarks in (jazzy / rolling )

@kyle-basis
Copy link
Contributor Author

kyle-basis commented Jan 14, 2025 via email

@clalancette
Copy link
Contributor

While I haven't looked at the details of this exact change, I did want to mention that API breaks are allowed in Rolling. However, if we do this, we'll need to keep the old API with a deprecation warning, and then add in a new API.

@kyle-basis
Copy link
Contributor Author

kyle-basis commented Jan 14, 2025 via email

@jmachowinski
Copy link
Contributor

@ahcorde Can you start a CI ?

@ahcorde
Copy link
Contributor

ahcorde commented Jan 14, 2025

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Linux-rhel Build Status
  • Windows Build Status

Copy link
Contributor

@ahcorde ahcorde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with green CI

Copy link
Contributor

@ahcorde ahcorde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need to deprecated old headers

@jmachowinski
Copy link
Contributor

We still need to deprecated old headers

I'm confused, what headers are you referring to ?

@ahcorde
Copy link
Contributor

ahcorde commented Jan 14, 2025

While I haven't looked at the details of this exact change, I did want to mention that API breaks are allowed in Rolling. However, if we do this, we'll need to keep the old API with a deprecation warning, and then add in a new API.

I'm refering to this comment

@jmachowinski
Copy link
Contributor

This patch currently does not break API. Therefore there is no header to deprecate. The discussion was about a more substantial change that would yield a bigger performance gain.

@ahcorde
Copy link
Contributor

ahcorde commented Jan 14, 2025

ups, I was lost in all the conversation, my bad. Approved

@ahcorde ahcorde merged commit 609deb1 into ros2:rolling Jan 14, 2025
2 checks passed
@kscottz
Copy link
Contributor

kscottz commented Jan 14, 2025

🎉 Good work @kyle-basis! Thanks for your patience.

We really appreciate the improvement.

@kscottz
Copy link
Contributor

kscottz commented Jan 14, 2025

@scott-robotics can you post the results of your benchmark for Rolling or point me to the results?

@scott-robotics
Copy link

@kscottz I ran some benchmarks, building off what @kyle-basis has done. This PR has the same performance impact on setTransform as using Quaternion internally in tf2::Transform and returning by reference.

Rolling (with this PR merged):

-----------------------------------------------------------------------------
Benchmark                                   Time             CPU   Iterations
-----------------------------------------------------------------------------

benchmark_set_transform                   121 ns          121 ns      5214257
benchmark_set_transform_mean              122 ns          122 ns           10
benchmark_set_transform_median            121 ns          121 ns           10
benchmark_set_transform_stddev           5.78 ns         5.78 ns           10
benchmark_set_transform_cv               4.74 %          4.75 %            10

benchmark_get_transform                   838 ns          838 ns       838794
benchmark_get_transform_mean              849 ns          849 ns           10
benchmark_get_transform_median            851 ns          851 ns           10
benchmark_get_transform_stddev           14.4 ns         14.4 ns           10
benchmark_get_transform_cv               1.70 %          1.70 %            10

benchmark_resolve_transform              1098 ns         1097 ns       644474
benchmark_resolve_transform_mean         1100 ns         1100 ns           10
benchmark_resolve_transform_median       1098 ns         1098 ns           10
benchmark_resolve_transform_stddev       6.50 ns         6.81 ns           10
benchmark_resolve_transform_cv           0.59 %          0.62 %            10

Return Quaternion by reference ( scott-robotics:use-quaternion-as-basis-rolling, prior to this PR merge):

-----------------------------------------------------------------------------
Benchmark                                   Time             CPU   Iterations
-----------------------------------------------------------------------------

benchmark_set_transform                   121 ns          121 ns      5058187
benchmark_set_transform_mean              123 ns          123 ns           10
benchmark_set_transform_median            121 ns          121 ns           10
benchmark_set_transform_stddev           6.72 ns         6.72 ns           10
benchmark_set_transform_cv               5.48 %          5.48 %            10

benchmark_get_transform                   824 ns          824 ns       823082
benchmark_get_transform_mean              832 ns          832 ns           10
benchmark_get_transform_median            832 ns          832 ns           10
benchmark_get_transform_stddev           13.5 ns         13.5 ns           10
benchmark_get_transform_cv               1.62 %          1.63 %            10

benchmark_resolve_transform              1088 ns         1088 ns       613403
benchmark_resolve_transform_mean         1086 ns         1086 ns           10
benchmark_resolve_transform_median       1087 ns         1087 ns           10
benchmark_resolve_transform_stddev       10.3 ns         10.3 ns           10
benchmark_resolve_transform_cv           0.95 %          0.95 %            10
Run on (22 X 4700 MHz CPU s)
CPU Caches:
  L1 Data 48 KiB (x11)
  L1 Instruction 64 KiB (x11)
  L2 Unified 2048 KiB (x11)
  L3 Unified 24576 KiB (x1)
Load Average: 3.15, 2.60, 1.81
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants