@tenpercent tenpercent commented Jan 16, 2026

Summary

Replace the recursive sequence_map_inverse implementation, which needs O(N) template instantiation depth, with a pack-expansion implementation that needs O(1) depth.

Approach

  • Use a constexpr loop in find_source_index to locate each index of the inverse permutation
  • Build the result with a single pack expansion, giving O(1) template instantiation depth
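
The two steps above can be sketched roughly as follows. This is a minimal standalone illustration, not the code in this PR: `Seq`, `inverse_map`, `inverse_map_impl`, and `inverse_map_t` are hypothetical names, `std::index_sequence` stands in for whatever index generator the library uses, and the body of `find_source_index` is an assumption about how such a lookup could be written (C++14 or later).

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the library's compile-time integer sequence.
template <std::size_t... Is>
struct Seq
{
};

// Constexpr loop: returns the position i for which the i-th element of Is... equals target.
// All iterations run inside this one function specialization.
template <std::size_t... Is>
constexpr std::size_t find_source_index(std::size_t target)
{
    constexpr std::size_t n = sizeof...(Is);
    const std::size_t perm[n == 0 ? 1 : n] = {Is...};
    for(std::size_t i = 0; i < n; ++i)
    {
        if(perm[i] == target)
        {
            return i;
        }
    }
    return n; // not found: the input was not a valid permutation
}

template <typename Perm, typename Positions>
struct inverse_map_impl;

// Single pack expansion over Pos...: no recursive instantiation chain.
template <std::size_t... Is, std::size_t... Pos>
struct inverse_map_impl<Seq<Is...>, std::index_sequence<Pos...>>
{
    using type = Seq<find_source_index<Is...>(Pos)...>;
};

template <typename Perm>
struct inverse_map;

template <std::size_t... Is>
struct inverse_map<Seq<Is...>>
{
    using type =
        typename inverse_map_impl<Seq<Is...>, std::make_index_sequence<sizeof...(Is)>>::type;
};

template <typename Perm>
using inverse_map_t = typename inverse_map<Perm>::type;

// Compile-time checks: a 3-cycle and the identity map.
static_assert(std::is_same<inverse_map_t<Seq<2, 0, 1>>, Seq<1, 2, 0>>::value, "");
static_assert(std::is_same<inverse_map_t<Seq<0, 1, 2, 3>>, Seq<0, 1, 2, 3>>::value, "");
```

In this sketch the lookup costs O(N) constant-evaluation steps per output element, but the number of nested template instantiations does not grow with N.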

Why It Works

Template recursion needs N template instantiations for N iterations, and each instantiation carries its own compile-time overhead. A constexpr loop runs within a single template instantiation, so it avoids that per-instantiation overhead.
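
To make the contrast concrete, here is a small illustrative comparison of the same index search written both ways. Both `find_recursive` and `find_loop` are hypothetical helpers invented for this sketch. They are not taken from the old or new sequence_map_inverse code, and the recursive form is only assumed to resemble the general pattern being replaced.

```cpp
#include <cstddef>

// Recursive formulation: each step names a new specialization, so searching
// N values instantiates a chain of N + 1 class templates.
template <std::size_t Target, std::size_t Pos, std::size_t... Is>
struct find_recursive;

template <std::size_t Target, std::size_t Pos, std::size_t Head, std::size_t... Tail>
struct find_recursive<Target, Pos, Head, Tail...>
{
    // The tail specialization is instantiated even when Head already matches,
    // because its ::value member is named in the expression.
    static constexpr std::size_t value =
        Head == Target ? Pos : find_recursive<Target, Pos + 1, Tail...>::value;
};

template <std::size_t Target, std::size_t Pos>
struct find_recursive<Target, Pos>
{
    static constexpr std::size_t value = Pos; // end of pack: not found
};

// Loop formulation: one function template specialization per array length;
// the iterations are handled by the constant evaluator (C++14 or later).
template <std::size_t N>
constexpr std::size_t find_loop(const std::size_t (&values)[N], std::size_t target)
{
    for(std::size_t i = 0; i < N; ++i)
    {
        if(values[i] == target)
        {
            return i;
        }
    }
    return N; // not found
}

constexpr std::size_t kValues[] = {2, 0, 1};

// Both forms agree on where the value 1 sits in {2, 0, 1}.
static_assert(find_recursive<1, 0, 2, 0, 1>::value == 2, "");
static_assert(find_loop(kValues, 1) == 2, "");
```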

Build Performance Impact

Template Instantiation Reduction (measured on device_grouped_conv3d_fwd_bias_bnorm_clamp_instance target, 248 files):

This confirms the optimization successfully reduces template instantiation overhead by eliminating recursive template patterns in favor of pack expansion.

Test Plan

  • Existing SequenceMapInverse.InverseMap and SequenceMapInverse.InverseIdentityMap tests validate correctness
  • CI

Notes

@tenpercent tenpercent force-pushed the mpodkory/generate-tuple-optimizations branch from 59f0c32 to 5190578 Compare January 16, 2026 17:34
@tenpercent tenpercent force-pushed the mpodkory/recursive-to-pack-expansion branch from 6d792da to f5ada17 Compare January 16, 2026 20:16
@tenpercent tenpercent force-pushed the mpodkory/generate-tuple-optimizations branch from 5190578 to 887bdf2 Compare January 16, 2026 20:16
@tenpercent tenpercent marked this pull request as ready for review January 17, 2026 03:41
@tenpercent tenpercent force-pushed the mpodkory/generate-tuple-optimizations branch from 887bdf2 to 02e42dc Compare January 17, 2026 03:51
@tenpercent tenpercent force-pushed the mpodkory/recursive-to-pack-expansion branch from f5ada17 to 9942fd6 Compare January 17, 2026 03:51
@tenpercent tenpercent force-pushed the mpodkory/recursive-to-pack-expansion branch from 9d67d0d to c4d95f7 Compare January 21, 2026 23:43
@tenpercent tenpercent force-pushed the mpodkory/generate-tuple-optimizations branch from 82b6016 to 602c127 Compare January 21, 2026 23:56
@tenpercent tenpercent force-pushed the mpodkory/recursive-to-pack-expansion branch from c4d95f7 to 631df4f Compare January 21, 2026 23:57
@tenpercent tenpercent force-pushed the mpodkory/generate-tuple-optimizations branch from 602c127 to 1713ea7 Compare January 22, 2026 01:00
@tenpercent tenpercent changed the base branch from mpodkory/generate-tuple-optimizations to develop January 22, 2026 01:04
@tenpercent tenpercent force-pushed the mpodkory/recursive-to-pack-expansion branch 2 times, most recently from cbaf07b to 3b8b37d Compare January 22, 2026 19:52
@tenpercent tenpercent changed the title Rewrite O(N) recursive templates with O(1) pack expansion Replace O(N) recursive sequence_map_inverse with O(1) pack expansion Jan 22, 2026
@tenpercent tenpercent force-pushed the mpodkory/recursive-to-pack-expansion branch from 3b8b37d to 7c9cdf0 Compare January 22, 2026 20:24
@tenpercent tenpercent marked this pull request as draft January 22, 2026 20:31
@tenpercent tenpercent force-pushed the mpodkory/recursive-to-pack-expansion branch 4 times, most recently from d162e26 to f8d808e Compare January 22, 2026 21:11
@tenpercent tenpercent marked this pull request as ready for review January 22, 2026 22:29
Use constexpr loop in find_source_index to locate permutation inverse
indices, then expand via pack expansion for O(1) template instantiation
depth instead of O(N) recursive template instantiation.
@tenpercent tenpercent force-pushed the mpodkory/recursive-to-pack-expansion branch from e921e01 to bd98bd1 Compare January 23, 2026 00:01