PaperSummary02: Denoising Diffusion Probabilistic Models
Jan 2, 2025
The paper refines the diffusion framework, originally derived from nonequilibrium thermodynamics, and focuses it on generating high-quality images. The core concept: a forward diffusion process progressively adds noise to data until the signal is destroyed, while a learned reverse process iteratively removes this noise to reconstruct high-quality images.
The key points are:
- Forward process: a fixed Markov chain that gradually adds Gaussian noise to the data according to a variance schedule.
- Reverse process: a learned Markov chain in which a neural network predicts and removes noise step by step; its parameterization connects to denoising score matching and annealed Langevin dynamics.
- Training objective: a variational lower bound is optimized; it simplifies to a reweighted loss resembling denoising score matching over multiple noise levels, which makes training efficient.
- The model’s sampling process can be viewed as a progressive lossy decompression scheme, similar to autoregressive decoding but generalized to iterative refinement.
- The reverse process uses a U-Net-based architecture with self-attention and group normalization. The diffusion timestep is encoded using sinusoidal position embeddings.
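To make the first three points concrete, here is a minimal NumPy sketch of the forward process and the simplified training loss, assuming the paper's linear variance schedule (β₁ = 1e-4 to β_T = 0.02, T = 1000). The `eps_theta` model here is a zero-output placeholder, not a real U-Net.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)      # variance schedule beta_t
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)         # abar_t = prod_{s<=t} alpha_s

def q_sample(x0, t, eps):
    """Closed-form forward process: x_t = sqrt(abar_t) x0 + sqrt(1 - abar_t) eps."""
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

def simple_loss(eps_theta, x0, t, rng):
    """L_simple: MSE between the true noise and the network's noise prediction."""
    eps = rng.standard_normal(x0.shape)
    xt = q_sample(x0, t, eps)
    return np.mean((eps - eps_theta(xt, t)) ** 2)

rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 8))                 # toy "images"
dummy_model = lambda xt, t: np.zeros_like(xt)    # placeholder eps_theta
loss = simple_loss(dummy_model, x0, t=500, rng=rng)
```

Because `q_sample` is available in closed form, training can draw a random timestep `t` per example rather than simulating the whole chain, which is what makes the objective cheap to optimize.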
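The sinusoidal timestep encoding mentioned above can be sketched as follows; this is the Transformer-style formulation, and the dimension choice here is illustrative rather than taken from the paper.

```python
import numpy as np

def timestep_embedding(t, dim):
    """Transformer-style sinusoidal embedding of the diffusion timestep t."""
    half = dim // 2
    # Geometrically spaced frequencies, as in the original Transformer.
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    args = t * freqs
    return np.concatenate([np.sin(args), np.cos(args)])

emb = timestep_embedding(t=250, dim=128)  # one vector per timestep
```

In practice this embedding is passed through a small MLP and added into each residual block, so the shared network can condition its denoising behavior on how noisy the input is.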
The framework generates high-quality, diverse, and detailed images with sample quality comparable to GANs and other advanced generative models. It is flexible, scalable, and exhibits excellent inductive biases for image synthesis.
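The iterative refinement described in the key points corresponds to ancestral sampling: start from pure noise and apply the learned reverse step T times. A self-contained sketch, again with a placeholder noise predictor in place of the trained U-Net and σ_t² = β_t as one of the variance choices discussed in the paper:

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def p_sample_loop(eps_theta, shape, rng):
    """Ancestral sampling: denoise from x_T ~ N(0, I) down to x_0."""
    x = rng.standard_normal(shape)
    for t in range(T - 1, -1, -1):
        z = rng.standard_normal(shape) if t > 0 else np.zeros(shape)
        # Posterior mean: (x_t - beta_t / sqrt(1 - abar_t) * eps_hat) / sqrt(alpha_t)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps_theta(x, t)) / np.sqrt(alphas[t])
        x = mean + np.sqrt(betas[t]) * z   # sigma_t^2 = beta_t
    return x

rng = np.random.default_rng(0)
dummy_model = lambda x, t: np.zeros_like(x)  # placeholder noise predictor
sample = p_sample_loop(dummy_model, (2, 4), rng)
```

Each step subtracts the predicted noise and re-injects a smaller amount of fresh noise, which is the "progressive decompression" view of sampling.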