Both Sora and Stable Diffusion 3 adopt diffusion transformers, but do we really need a super large DiT for all sampling steps for generation?
🧐 No 🙅‍♂️
Introducing Trajectory Stitching (T-Stitch), a training-free method that complements existing efficient sampling techniques by dynamically allocating computation across denoising steps.
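The core idea is that early denoising steps shape only coarse, low-frequency structure, so a small diffusion model can handle them, with the large model reserved for the later steps that refine details. A minimal sketch of this schedule, using hypothetical stand-in denoisers rather than the official NVlabs/T-Stitch API:

```python
# Sketch of the T-Stitch scheduling idea (hypothetical interfaces,
# NOT the official T-Stitch implementation): run a cheap small model
# for the first fraction of the denoising trajectory, then switch
# to the large model. Both models must operate in the same latent space.
import numpy as np

def small_denoiser(x, t):
    # Stand-in for a small, fast DiT: a coarse denoising update.
    return x - 0.1 * x / (t + 1)

def large_denoiser(x, t):
    # Stand-in for a large, slow DiT: a finer denoising update.
    return x - 0.05 * x / (t + 1)

def t_stitch_sample(x, num_steps=50, small_frac=0.4):
    """Denoise `x` over `num_steps`, using the small model for the
    first `small_frac` of the trajectory and the large model after."""
    switch_step = int(num_steps * small_frac)
    for step_idx in range(num_steps):
        t = num_steps - 1 - step_idx  # timesteps run from high noise to low
        model = small_denoiser if step_idx < switch_step else large_denoiser
        x = model(x, t)
    return x

sample = t_stitch_sample(np.ones(4), num_steps=50, small_frac=0.4)
```

Because the method only changes which pre-trained model is called at each step, it needs no retraining and the `small_frac` ratio gives a direct speed/quality dial.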
Paper:
Code: https://github.com/NVlabs/T-Stitch (official PyTorch implementation)
T-Stitch: Accelerating Sampling in Pre-trained Diffusion Models with Trajectory Stitching