Add TurboT2AV SageAttention, SageSLA and FastNorm submodule #132

Open
liuyuxiang1021 wants to merge 2 commits into thu-ml:main from liuyuxiang1021:accelerate2

liuyuxiang1021 commented on Jun 10, 2026

Summary

This PR adds TurboT2AV as an optional TurboDiffusion submodule for accelerated text-to-audio-video inference.

The submodule's main branch is rebuilt from the original inference branch and adds only inference-time acceleration:

  • SageAttention for selected unmasked LTX attention modules
  • TurboDiffusion FastNorm for RMSNorm/LayerNorm paths, including the functional RMSNorm helper used by LTX
  • TurboDiffusion SLA/SageSLA self-attention adapters with --sla_topk, --sla_block_q, and --sla_block_k controls
  • inference CLI flags for acceleration and timing
  • README timing notes for the 40-step teacher, accelerated 4-step student, and SageSLA speed/quality tradeoffs
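To make the SLA controls concrete, here is a minimal sketch of block-level top-k selection. It is not the TurboT2AV or TurboDiffusion implementation. It assumes that `--sla_topk` is the fraction of key blocks kept per query block (the `topk=1.0` and `topk=0.4` values reported in this PR suggest a fraction) and that `--sla_block_q` and `--sla_block_k` are block sizes in tokens. The function names and the precomputed block scores are hypothetical.

```python
from __future__ import annotations

import math


def num_blocks(seq_len: int, block: int) -> int:
    """Number of blocks needed to cover seq_len tokens (the last block may be partial)."""
    return math.ceil(seq_len / block)


def select_topk_blocks(block_scores: list[list[float]], topk: float) -> list[list[int]]:
    """For each query block (row), keep the highest-scoring key blocks.

    Assumption: `topk` is a fraction in (0, 1] of key blocks to keep,
    and at least one key block is always kept.
    """
    if not 0.0 < topk <= 1.0:
        raise ValueError("topk must be in (0, 1]")
    kept = []
    for row in block_scores:
        keep = max(1, math.ceil(topk * len(row)))
        # Rank key-block indices by score, highest first.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        kept.append(sorted(ranked[:keep]))
    return kept


# Hypothetical scores for 2 query blocks x 4 key blocks.
scores = [[0.1, 0.9, 0.5, 0.3], [0.7, 0.2, 0.6, 0.4]]
sparse = select_topk_blocks(scores, topk=0.5)  # half of the key blocks per row
dense = select_topk_blocks(scores, topk=1.0)   # every key block is kept
```

Under these assumptions, `topk=1.0` keeps every block and so performs no sparsification, while smaller values reduce the attention work at the cost of dropping lower-scoring blocks.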

Not included in this integration:

  • training scripts or training workflow changes
  • W8A8 quantization
  • trained SLA adapter checkpoints

Submodule

  • Repository: liuyuxiang1021/turbo-t2av
  • Branch used by the submodule: main
  • Current commit: c0e4506a40e9a4eb3ed5f34b0fe7dfcd3f26f084
  • Clean base branch: origin_inference at c99f03b7b615661f63513b3816ea6c62b754c5ce

The parent repository .gitmodules tracks TurboT2AV on main, and the submodule gitlink is pinned to c0e4506.

Validation

Submodule checks:

  • git diff --check
  • pixi run python -m compileall packages/ltx-distillation/src/ltx_distillation/acceleration.py packages/ltx-distillation/src/ltx_distillation/tools/run_av_inference_eval.py
  • CLI help includes --attention_type {default,sageattn,sla,sagesla}, --sla_topk, --sla_block_q, and --sla_block_k
  • Secret/path scan found no real tokens, proxy settings, or local dataset paths in the submitted files
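The CLI surface described in the help check can be sketched with `argparse`. The flag names and the `--attention_type` choices come from this PR; the defaults, types, and the `build_parser` helper are assumptions for illustration and may differ from `run_av_inference_eval.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the acceleration flags named in this PR."""
    parser = argparse.ArgumentParser(description="Sketch of acceleration flags")
    parser.add_argument(
        "--attention_type",
        choices=["default", "sageattn", "sla", "sagesla"],
        default="default",  # assumed default
    )
    # The defaults and types below are assumptions, not values taken from the submodule.
    parser.add_argument("--sla_topk", type=float, default=1.0)
    parser.add_argument("--sla_block_q", type=int, default=64)
    parser.add_argument("--sla_block_k", type=int, default=64)
    return parser


parser = build_parser()
args = parser.parse_args(["--attention_type", "sagesla", "--sla_topk", "0.4"])
help_text = parser.format_help()
```

With this shape, `help_text` lists the choices as `{default,sageattn,sla,sagesla}`, which is the string the validation step looks for.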

H20 generator-only measurements documented in the README:

  • 512x768, 121 frames, 4 prompts: default 4-step student 2.53s/video; SageAttention self + FastNorm 2.17s/video (1.16x)
  • 704x1280, 121 frames, first 4 prompts: a SageSLA top-k sweep showed the quality/speed tradeoff; the quality-first setting topk=1.0 reached about 1.10x with lower visual error than the sparser settings
  • 1024x1792, 121 frames, first 4 prompts: SageSLA topk=0.4 stress test improved median generator time from 15.97s/video to 10.58s/video (1.51x), with visible quality changes
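The speedup factors above are the ratio of baseline to accelerated time per video. The short sketch below recomputes them from the rounded timings quoted in this PR; the `speedup` helper is illustrative only. The rounded inputs give about 1.166 and 1.509, which agrees with the reported 1.16x and 1.51x up to rounding of the published timings.

```python
def speedup(baseline_s: float, accelerated_s: float) -> float:
    """Ratio of baseline time to accelerated time (greater than 1 means faster)."""
    if baseline_s <= 0 or accelerated_s <= 0:
        raise ValueError("timings must be positive")
    return baseline_s / accelerated_s


# (baseline s/video, accelerated s/video), as quoted in the PR description.
measurements = {
    "512x768, SageAttention self + FastNorm": (2.53, 2.17),
    "1024x1792, SageSLA topk=0.4": (15.97, 10.58),
}

for name, (baseline, accelerated) in measurements.items():
    print(f"{name}: {speedup(baseline, accelerated):.3f}x")
```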

The README also explains why the attention speedup is lower than TurboDiffusion Wan2.1 720p SageSLA results: the default LTX path already uses an efficient attention backend, TurboT2AV has a shorter latent sequence at normal resolution, and non-attention work remains unchanged.
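The point that unchanged non-attention work caps the gain is an instance of Amdahl's law. The sketch below uses hypothetical fractions and speedups, not measurements from TurboT2AV, to show how the end-to-end speedup depends on the share of time spent in attention.

```python
def overall_speedup(attention_fraction: float, attention_speedup: float) -> float:
    """Amdahl's law: end-to-end speedup when only the attention share is accelerated.

    attention_fraction: share of baseline runtime spent in attention (0..1).
    attention_speedup: factor by which attention alone gets faster.
    """
    if not 0.0 <= attention_fraction <= 1.0:
        raise ValueError("attention_fraction must be in [0, 1]")
    if attention_speedup <= 0:
        raise ValueError("attention_speedup must be positive")
    remaining = (1.0 - attention_fraction) + attention_fraction / attention_speedup
    return 1.0 / remaining


# Hypothetical numbers: a 2x faster attention kernel on two different workloads.
short_sequence = overall_speedup(attention_fraction=0.3, attention_speedup=2.0)
long_sequence = overall_speedup(attention_fraction=0.5, attention_speedup=2.0)
```

In this illustration the same kernel speedup yields a smaller end-to-end gain when attention is a smaller share of the runtime. This matches the README's reasoning about shorter latent sequences at normal resolution.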

liuyuxiang1021 force-pushed the accelerate2 branch 3 times, most recently from 85baa11 to 19dff3a on June 10, 2026 12:49
liuyuxiang1021 changed the title from "Add TurboT2AV SageAttention and FastNorm submodule" to "Add TurboT2AV SageAttention, SageSLA and FastNorm submodule" on Jun 17, 2026