Add container and tuple optimization helpers #3590

tenpercent · 2026-01-16T07:33:23Z

Summary

- Replace lambdas with named functors in container_concat
- Add make_uniform_tuple helper for repeated value patterns
- Add container_product helper with O(1)-depth fold expression
- Reduce container_concat instantiations from 186 to 93 (50% reduction)
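
As a rough illustration of the fold-expression idea behind container_product, the sketch below multiplies the elements of a `std::tuple` in a single expression. The `sketch` namespace, the `std::tuple` container, and both function signatures are assumptions made for this example; the helper in the PR operates on the library's own container types and may look different.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

namespace sketch {

// Hypothetical implementation detail: expands every tuple element into one
// binary fold over operator*. The fold is a single expression, so no chain
// of recursive template instantiations is needed to walk the elements.
template <typename Tuple, std::size_t... Is>
constexpr auto container_product_impl(const Tuple& t, std::index_sequence<Is...>)
{
    // Binary right fold with 1 as the initial value (the multiplicative identity).
    return (std::get<Is>(t) * ... * 1);
}

// Hypothetical public entry point: generates the index pack and forwards.
template <typename... Ts>
constexpr auto container_product(const std::tuple<Ts...>& t)
{
    return container_product_impl(t, std::index_sequence_for<Ts...>{});
}

} // namespace sketch
```

A call such as `sketch::container_product(std::make_tuple(2, 3, 4))` can be evaluated at compile time. A recursive formulation would instead instantiate one template per element, which is the depth the fold avoids.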

Why It Works

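Every lambda expression in container_concat has its own unique closure type, so each call site caused a separate instantiation of the templates the lambda was passed to. The named struct make_tuple_functor is a single type shared across all uses, which halves the instantiation count.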
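
The effect can be seen in a small standalone sketch. Only the name `make_tuple_functor` comes from this PR; its body here, the `apply_to_pair` helper, and the use of `std::tuple` are assumptions chosen to keep the example self-contained.

```cpp
#include <tuple>
#include <type_traits>
#include <utility>

// Assumed shape of a named functor: one type that every call site can share.
struct make_tuple_functor
{
    template <typename... Ts>
    constexpr auto operator()(Ts&&... xs) const
    {
        return std::make_tuple(std::forward<Ts>(xs)...);
    }
};

// Hypothetical helper templated on the callable type F. Every distinct F
// passed in produces a separate instantiation of apply_to_pair.
template <typename F, typename A, typename B>
constexpr auto apply_to_pair(F f, A a, B b)
{
    return f(a, b);
}

// Two lambdas with identical bodies still have two distinct closure types,
// so passing them to apply_to_pair yields two instantiations.
inline constexpr auto lambda_a = [](auto... xs) { return std::make_tuple(xs...); };
inline constexpr auto lambda_b = [](auto... xs) { return std::make_tuple(xs...); };

static_assert(!std::is_same_v<decltype(lambda_a), decltype(lambda_b)>,
              "identical lambda bodies do not share a type");
```

Passing `make_tuple_functor{}` from any number of call sites always supplies the same `F`, so `apply_to_pair` is instantiated once per argument-type combination rather than once per call site.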

The make_uniform_tuple helper eliminates repeated lambda instantiations for creating tuples with the same value repeated N times.
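
A minimal sketch of how such a helper can avoid lambdas, assuming a `std::tuple` result and an `N`-first template parameter order. The `sketch` namespace and both signatures are hypothetical and not taken from the PR.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

namespace sketch {

// Hypothetical implementation detail: the pack expansion repeats `value`
// once per index. The index itself is discarded via the comma operator,
// so no per-element lambda or generator callable is required.
template <typename T, std::size_t... Is>
constexpr auto make_uniform_tuple_impl(const T& value, std::index_sequence<Is...>)
{
    return std::make_tuple(((void)Is, value)...);
}

// Hypothetical public entry point: builds a tuple holding N copies of value.
template <std::size_t N, typename T>
constexpr auto make_uniform_tuple(const T& value)
{
    return make_uniform_tuple_impl(value, std::make_index_sequence<N>{});
}

} // namespace sketch
```

With this shape, `sketch::make_uniform_tuple<3>(7)` has type `std::tuple<int, int, int>`, and the only templates instantiated are the two functions above plus the index sequence.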

Test Plan

Added 12 unit tests for container_concat and make_uniform_tuple helpers
Waiting for full CI

PR Stack

This PR is part of the build time optimization effort (issue #3575). All PRs now target develop independently:

| # | PR | Description | Status |
|---|----|-------------|--------|
| 1 | #3585 | sequence_gen with `__make_integer_seq` | Independent |
| 2 | #3628 | generate_identity_sequences + named functors | New (replaces #3588, #3589) |
| 3 | #3590 | container_concat optimization | This PR |
| 4 | #3596 | O(1) pack expansion rewrites | Independent |
| 5 | #3600 | TensorDescriptor/TensorAdaptor lambda elimination | Independent |

Tracking issue: #3575

- Replace lambdas with named functors in container_concat
- Add make_uniform_tuple helper for repeated value patterns
- Add container_product helper with O(1) depth fold expression
- Add merge_sequences_functor and unpack_and_merge_sequences
- Add 16 unit tests for container helpers

Co-Authored-By: Claude <noreply@anthropic.com>

Detailed comments explain:
- Why named functors reduce instantiations vs lambdas in container_concat
- Impact: 50% reduction in container_concat (186 → 93 instantiations)
- make_uniform_tuple optimization using pack expansion instead of lambda
- generate_identity_sequences optimization for identity permutations
- When to apply these patterns elsewhere

This documentation helps maintainers understand the build-time optimization strategies and prevents reverting to less efficient patterns.

tenpercent requested review from Snektron, ThomasNing, afagaj, andriy-ca, aosewski, asleepzzz, bartekxk, carlushuang, cgmillette, coderfeli, geyyer, illsilin, poyenc, qianfengz, shumway, vidyasagar-amd and vpietila-amd as code owners January 16, 2026 07:33

tenpercent marked this pull request as draft January 16, 2026 15:48

tenpercent force-pushed the mpodkory/generate-tuple-optimizations branch from 59f0c32 to 5190578 Compare January 16, 2026 17:34

tenpercent force-pushed the mpodkory/transform-tensor-descriptor-optimization branch from 885b80f to 0791bad Compare January 16, 2026 20:16

tenpercent force-pushed the mpodkory/generate-tuple-optimizations branch from 5190578 to 887bdf2 Compare January 16, 2026 20:16

tenpercent mentioned this pull request Jan 16, 2026

Replace nested static_for lambdas with compile-time search helper #3600

Draft

2 tasks

tenpercent force-pushed the mpodkory/transform-tensor-descriptor-optimization branch from 0791bad to b26ed88 Compare January 17, 2026 03:37

tenpercent marked this pull request as ready for review January 17, 2026 03:41

tenpercent force-pushed the mpodkory/transform-tensor-descriptor-optimization branch from b26ed88 to 00849ac Compare January 17, 2026 03:51

tenpercent force-pushed the mpodkory/generate-tuple-optimizations branch from 887bdf2 to 02e42dc Compare January 17, 2026 03:51

tenpercent mentioned this pull request Jan 19, 2026

Add unit tests for template optimization helpers #3610

Closed

tenpercent force-pushed the mpodkory/generate-tuple-optimizations branch from 82b6016 to 602c127 Compare January 21, 2026 23:56

tenpercent requested review from a team and ddembeckAMD as code owners January 21, 2026 23:56

tenpercent force-pushed the mpodkory/generate-tuple-optimizations branch from 602c127 to 1713ea7 Compare January 22, 2026 01:00

tenpercent changed the base branch from mpodkory/transform-tensor-descriptor-optimization to develop January 22, 2026 01:03

tenpercent force-pushed the mpodkory/generate-tuple-optimizations branch from 47dfd5f to 99872ec Compare January 22, 2026 01:49

tenpercent added 2 commits January 22, 2026 02:52

Apply clang-format with -style=file

eed4270

tenpercent marked this pull request as draft January 22, 2026 18:50

cgmillette self-assigned this Jan 23, 2026
