Simplifying, stabilizing, and scaling continuous-time consistency models

6 points | by tosh 2 days ago

3 comments

devindotcom 2 days ago
I understand diffusion models somewhat, but I simply don't understand how the actual denoising steps can be skipped in this method. Like isn't the whole point of diffusion, which this uses as a basis, that it must go stepwise from noise to image? Even distilling a diffusion model and accelerating it, it feels crazy that it could go from taking 50 steps to taking 1 or 2.
I've looked at a few papers but none seems to explain in simple terms what is going on here. It's like saying, we made a new car where you turn it on and you are at your destination. If anyone has come across an accessible explanation of this approach I'd love to look at it.
[-]
- htrp a day ago
  to use an imperfect analogy, they're learning how to get from point a to point b in a straight line versus having to drive around a meandering path through the neighborhood taking every back road
surprisetalk 2 days ago
Read the paper: https://arxiv.org/abs/2410.11081