Add TurboT2AV SageAttention, SageSLA and FastNorm submodule by liuyuxiang1021 · Pull Request #132 · thu-ml/TurboDiffusion

liuyuxiang1021 · 2026-06-10T10:37:46Z

Summary

This PR adds TurboT2AV as an optional TurboDiffusion submodule for accelerated text-to-audio-video inference.

The submodule's main branch is rebuilt from the original inference branch and adds only inference-time acceleration:

SageAttention for selected unmasked LTX attention modules
TurboDiffusion FastNorm for RMSNorm/LayerNorm paths, including the functional RMSNorm helper used by LTX
TurboDiffusion SLA/SageSLA self-attention adapters with --sla_topk, --sla_block_q, and --sla_block_k controls
inference CLI flags for acceleration and timing
README timing notes for the 40-step teacher, accelerated 4-step student, and SageSLA speed/quality tradeoffs
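For readers unfamiliar with block-sparse attention, the following is a minimal sketch of how a top-k fraction could translate into a set of retained key blocks. It assumes that --sla_topk is a fraction in (0, 1] of key blocks kept per query block and that --sla_block_k is a block size in tokens; the function names and the pooled-score input are hypothetical and do not come from the TurboT2AV or TurboDiffusion code.

```python
import math


def kept_key_blocks(seq_len: int, block_k: int, topk: float) -> int:
    """Number of key blocks each query block keeps (assumed semantics of topk)."""
    if not 0.0 < topk <= 1.0:
        raise ValueError("topk must be in (0, 1]")
    num_k_blocks = math.ceil(seq_len / block_k)
    # Keep at least one block so every query block attends to something.
    return max(1, math.ceil(topk * num_k_blocks))


def select_blocks(pooled_scores: list[float], topk: float) -> list[int]:
    """Indices of the highest-scoring key blocks for one query block.

    pooled_scores is a hypothetical per-block relevance score, for example
    the score between a mean-pooled query block and each mean-pooled key block.
    """
    keep = max(1, math.ceil(topk * len(pooled_scores)))
    ranked = sorted(range(len(pooled_scores)),
                    key=lambda i: pooled_scores[i], reverse=True)
    return sorted(ranked[:keep])


# With 1024 tokens and 64-token key blocks there are 16 key blocks.
print(kept_key_blocks(1024, 64, 1.0))   # all 16 blocks, i.e. dense attention
print(kept_key_blocks(1024, 64, 0.5))   # 8 blocks
print(select_blocks([0.1, 0.9, 0.3, 0.7], 0.5))  # blocks 1 and 3
```

Under this reading, topk=1.0 keeps every block and lower values skip more attention work, which matches the speed/quality tradeoff described in the README notes.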

Not included in this integration:

training scripts or training workflow changes
W8A8 quantization
trained SLA adapter checkpoints

Submodule

Repository: liuyuxiang1021/turbo-t2av
Branch used by the submodule: main
Current commit: c0e4506a40e9a4eb3ed5f34b0fe7dfcd3f26f084
Clean base branch: origin_inference at c99f03b7b615661f63513b3816ea6c62b754c5ce

The parent repository's .gitmodules tracks TurboT2AV on main, and the submodule gitlink is pinned to c0e4506.
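A reviewer could check the tracked branch and the short pin against the full commit hash along these lines. This is only a sketch: the .gitmodules text below is a hypothetical example (the path and url values are assumed, not copied from the PR), and in a real checkout the pinned commit would come from git itself rather than a constant.

```python
import configparser

# Hypothetical .gitmodules content; path and url are assumptions for illustration.
GITMODULES_TEXT = """
[submodule "TurboT2AV"]
    path = TurboT2AV
    url = https://github.com/liuyuxiang1021/turbo-t2av.git
    branch = main
"""

# Full commit hash quoted in the PR description.
PINNED_FULL = "c0e4506a40e9a4eb3ed5f34b0fe7dfcd3f26f084"


def tracked_branch(gitmodules_text: str, name: str) -> str:
    """Return the branch recorded for a submodule in .gitmodules-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(gitmodules_text)
    return parser[f'submodule "{name}"']["branch"]


def matches_short_hash(full_hash: str, short_hash: str) -> bool:
    """True if the abbreviated hash is a prefix of the full hash."""
    return full_hash.startswith(short_hash)


print(tracked_branch(GITMODULES_TEXT, "TurboT2AV"))
print(matches_short_hash(PINNED_FULL, "c0e4506"))
```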

Validation

Submodule checks:

git diff --check
pixi run python -m compileall packages/ltx-distillation/src/ltx_distillation/acceleration.py packages/ltx-distillation/src/ltx_distillation/tools/run_av_inference_eval.py
CLI help includes --attention_type {default,sageattn,sla,sagesla}, --sla_topk, --sla_block_q, and --sla_block_k
Secret/path scan found no real tokens, proxy settings, or local dataset paths in the submitted files
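The CLI surface named in the help check can be pictured with a small argparse sketch. The flag names and the --attention_type choices are taken from this description; the defaults, types, help strings, and the build_parser name are assumptions, and the actual definitions in run_av_inference_eval.py may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the acceleration flags listed in this PR."""
    parser = argparse.ArgumentParser(description="Acceleration flag sketch")
    parser.add_argument(
        "--attention_type",
        choices=["default", "sageattn", "sla", "sagesla"],
        default="default",  # assumed default
        help="attention backend to use",
    )
    # Assumed semantics: fraction of key blocks kept per query block.
    parser.add_argument("--sla_topk", type=float, default=1.0)
    # Assumed semantics: query and key block sizes in tokens.
    parser.add_argument("--sla_block_q", type=int, default=64)
    parser.add_argument("--sla_block_k", type=int, default=64)
    return parser


args = build_parser().parse_args(["--attention_type", "sagesla", "--sla_topk", "0.4"])
print(args.attention_type, args.sla_topk)
```

Using choices makes argparse render the option as --attention_type {default,sageattn,sla,sagesla} in the help output, which is the form quoted in the check above.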

H20 generator-only measurements documented in the README:

512x768, 121 frames, 4 prompts: default 4-step student 2.53 s/video; SageAttention self + FastNorm 2.17 s/video (1.16x)
704x1280, 121 frames, first 4 prompts: SageSLA top-k sweep showed the quality/speed tradeoff; quality-first topk=1.0 reached about 1.10x with lower visual error than sparse settings
1024x1792, 121 frames, first 4 prompts: SageSLA topk=0.4 stress test improved median generator time from 15.97 s/video to 10.58 s/video (1.51x), with visible quality changes
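The speedup figures are ratios of baseline to accelerated time per video. A small sketch of that arithmetic, using the 1024x1792 timings quoted above (the function name is illustrative):

```python
def speedup(baseline_s: float, accelerated_s: float) -> float:
    """Ratio of baseline time to accelerated time; values above 1 mean faster."""
    if baseline_s <= 0 or accelerated_s <= 0:
        raise ValueError("timings must be positive")
    return baseline_s / accelerated_s


# 1024x1792 stress test: 15.97 s/video down to 10.58 s/video.
print(f"{speedup(15.97, 10.58):.2f}x")  # 1.51x
```

Applied to the rounded 512x768 timings, the same ratio is about 1.166, so the reported 1.16x was presumably truncated or computed from unrounded measurements.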

The README also explains why the attention speedup is lower than TurboDiffusion Wan2.1 720p SageSLA results: the default LTX path already uses an efficient attention backend, TurboT2AV has a shorter latent sequence at normal resolution, and non-attention work remains unchanged.
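The last point, that non-attention work remains unchanged, is the usual Amdahl's law argument: if only a fraction of generator time is spent in attention, the end-to-end gain is bounded no matter how fast attention becomes. The sketch below illustrates this; the fractions and kernel speedups are illustrative values, not measurements from TurboT2AV.

```python
def overall_speedup(attention_fraction: float, attention_speedup: float) -> float:
    """Amdahl's law: end-to-end speedup when only the attention share is accelerated."""
    if not 0.0 <= attention_fraction <= 1.0:
        raise ValueError("attention_fraction must be in [0, 1]")
    if attention_speedup <= 0:
        raise ValueError("attention_speedup must be positive")
    remaining = (1.0 - attention_fraction) + attention_fraction / attention_speedup
    return 1.0 / remaining


# Illustrative only: a 2x faster attention kernel at different attention shares.
for fraction in (0.3, 0.5, 0.8):
    print(fraction, round(overall_speedup(fraction, 2.0), 3))
```

A shorter latent sequence lowers the attention share of total time, which in this model shrinks the achievable end-to-end speedup even when the attention kernel itself is much faster.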

liuyuxiang1021 force-pushed the accelerate2 branch 3 times, most recently from 85baa11 to 19dff3a on June 10, 2026 12:49

Commit 8972961: Add TurboT2AV inference acceleration submodule

liuyuxiang1021 force-pushed the accelerate2 branch from 19dff3a to 8972961 on June 10, 2026 13:20

Commit f128604: Update TurboT2AV acceleration submodule

liuyuxiang1021 changed the title from "Add TurboT2AV SageAttention and FastNorm submodule" to "Add TurboT2AV SageAttention, SageSLA and FastNorm submodule" on Jun 17, 2026

liuyuxiang1021 wants to merge 2 commits into thu-ml:main from liuyuxiang1021:accelerate2
