@tenpercent tenpercent commented Jan 16, 2026

Summary

  • Replace lambdas with named functors in container_concat
  • Add make_uniform_tuple helper for repeated value patterns
  • Add container_product helper with O(1) depth fold expression
  • Reduces container_concat instantiations from 186 to 93 (50% reduction)
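As a rough illustration of the fold-expression idea behind container_product, here is a minimal sketch. It assumes the helper multiplies all elements of a tuple-like container, and it uses std::tuple in a hypothetical `sketch` namespace as a stand-in for the library's own tuple type; the real signature in this PR may differ. The point is that a fold expands the whole product in one expression instead of one recursive template instantiation per element.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

namespace sketch {

// Hypothetical helper: multiply every element of the tuple with one
// binary left fold, (1 * ... * pack). No recursive instantiation chain.
template <typename Tuple, std::size_t... Is>
constexpr auto container_product_impl(const Tuple& t, std::index_sequence<Is...>)
{
    return (1 * ... * std::get<Is>(t));
}

// Hypothetical stand-in for container_product, written against std::tuple.
template <typename... Ts>
constexpr auto container_product(const std::tuple<Ts...>& t)
{
    return container_product_impl(t, std::index_sequence_for<Ts...>{});
}

} // namespace sketch
```

For example, `sketch::container_product(std::make_tuple(2, 3, 4))` can be evaluated at compile time.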

Why It Works

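Every lambda expression has its own unique closure type, so each lambda used inside container_concat forced a separate instantiation of the templates it was passed to. The named struct make_tuple_functor is a single type shared across all uses, which halves the instantiation count (186 to 93).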
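The effect can be reproduced outside the library. The sketch below is not the PR's code: `make_tuple_functor` here is a simplified stand-in built on std::tuple, and `call_with` is a hypothetical helper templated on the callable type. It shows that two textually identical lambdas have different types, while a named functor is one type everywhere, so a template taking it is instantiated only once per argument list.

```cpp
#include <tuple>
#include <type_traits>
#include <utility>

// Simplified stand-in for the PR's make_tuple_functor: a single named type
// that every call site can share.
struct make_tuple_functor
{
    template <typename... Ts>
    constexpr auto operator()(Ts&&... xs) const
    {
        return std::make_tuple(std::forward<Ts>(xs)...);
    }
};

// Hypothetical helper templated on the callable type F. Each distinct F
// produces a distinct instantiation of call_with.
template <typename F, typename... Ts>
constexpr auto call_with(F f, Ts... xs)
{
    return f(xs...);
}

// Two textually identical lambdas still have two distinct closure types,
// so call_with<decltype(lambda_a), ...> and call_with<decltype(lambda_b), ...>
// would be separate instantiations.
inline constexpr auto lambda_a = [](auto... xs) { return std::make_tuple(xs...); };
inline constexpr auto lambda_b = [](auto... xs) { return std::make_tuple(xs...); };
static_assert(!std::is_same_v<decltype(lambda_a), decltype(lambda_b)>);
```

Passing `make_tuple_functor{}` from any number of call sites always yields the same `F`, which is the sharing the description above relies on.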

The make_uniform_tuple helper eliminates repeated lambda instantiations for creating tuples with the same value repeated N times.
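A minimal sketch of how such a helper could be written with a plain pack expansion, assuming std::tuple and std::index_sequence in place of the library's own tuple and sequence types. The names live in a hypothetical `sketch` namespace and the actual signature in this PR may differ.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

namespace sketch {

// The index pack is expanded only to repeat `value`; each index itself is
// discarded by the comma operator. No lambda is involved.
template <typename T, std::size_t... Is>
constexpr auto make_uniform_tuple_impl(const T& value, std::index_sequence<Is...>)
{
    return std::make_tuple(((void)Is, value)...);
}

// Hypothetical stand-in for make_uniform_tuple: N copies of `value`.
template <std::size_t N, typename T>
constexpr auto make_uniform_tuple(const T& value)
{
    return make_uniform_tuple_impl(value, std::make_index_sequence<N>{});
}

} // namespace sketch
```

Because no closure type is created, two call sites that ask for the same `N` and `T` reuse the same instantiation.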

Test Plan

  • Added 12 unit tests for container_concat and make_uniform_tuple helpers
  • Waiting for full CI

PR Stack

This PR is part of the build time optimization effort (issue #3575). All PRs now target develop independently:

| # | PR | Description | Status |
|---|----|-------------|--------|
| 1 | #3585 | sequence_gen with __make_integer_seq | Independent |
| 2 | #3628 | generate_identity_sequences + named functors | New (replaces #3588, #3589) |
| 3 | #3590 | container_concat optimization | This PR |
| 4 | #3596 | O(1) pack expansion rewrites | Independent |
| 5 | #3600 | TensorDescriptor/TensorAdaptor lambda elimination | Independent |

Tracking issue: #3575

@tenpercent tenpercent marked this pull request as draft January 16, 2026 15:48
@tenpercent tenpercent force-pushed the mpodkory/generate-tuple-optimizations branch from 59f0c32 to 5190578 January 16, 2026 17:34
@tenpercent tenpercent force-pushed the mpodkory/transform-tensor-descriptor-optimization branch from 885b80f to 0791bad January 16, 2026 20:16
@tenpercent tenpercent force-pushed the mpodkory/generate-tuple-optimizations branch from 5190578 to 887bdf2 January 16, 2026 20:16
@tenpercent tenpercent force-pushed the mpodkory/transform-tensor-descriptor-optimization branch from 0791bad to b26ed88 January 17, 2026 03:37
@tenpercent tenpercent marked this pull request as ready for review January 17, 2026 03:41
@tenpercent tenpercent force-pushed the mpodkory/transform-tensor-descriptor-optimization branch from b26ed88 to 00849ac January 17, 2026 03:51
@tenpercent tenpercent force-pushed the mpodkory/generate-tuple-optimizations branch from 887bdf2 to 02e42dc January 17, 2026 03:51
@tenpercent tenpercent force-pushed the mpodkory/generate-tuple-optimizations branch from 82b6016 to 602c127 January 21, 2026 23:56
@tenpercent tenpercent requested review from a team and ddembeckAMD as code owners January 21, 2026 23:56
@tenpercent tenpercent force-pushed the mpodkory/generate-tuple-optimizations branch from 602c127 to 1713ea7 January 22, 2026 01:00
@tenpercent tenpercent changed the base branch from mpodkory/transform-tensor-descriptor-optimization to develop January 22, 2026 01:03
- Replace lambdas with named functors in container_concat
- Add make_uniform_tuple helper for repeated value patterns
- Add container_product helper with O(1) depth fold expression
- Add merge_sequences_functor and unpack_and_merge_sequences
- Add 16 unit tests for container helpers

Co-Authored-By: Claude <noreply@anthropic.com>
@tenpercent tenpercent force-pushed the mpodkory/generate-tuple-optimizations branch from 47dfd5f to 99872ec January 22, 2026 01:49
Detailed comments explain:
- Why named functors reduce instantiations vs lambdas in container_concat
- Impact: 50% reduction in container_concat (186 → 93 instantiations)
- make_uniform_tuple optimization using pack expansion instead of lambda
- generate_identity_sequences optimization for identity permutations
- When to apply these patterns elsewhere

This documentation helps maintainers understand the build-time optimization
strategies and prevents reverting to less efficient patterns.
@tenpercent tenpercent marked this pull request as draft January 22, 2026 18:50
@cgmillette cgmillette self-assigned this Jan 23, 2026