Training Diffusion Transformers with Muon

How fast can we train DiTs with the Muon optimizer?

May 31, 2026 · 6 min · Sven Lüpke