PaperSummary04 : Fine-tuning diffusion models with limited data

Poonam Saini
Jan 4, 2025


The paper proposes Adapter-Augmented Attention Fine-tuning (A3FT), a method for fine-tuning diffusion models efficiently on small datasets with minimal overfitting and faster convergence. Overfitting is a critical issue when a pre-trained diffusion model is fine-tuned on limited data. A3FT addresses it by updating only the attention blocks rather than all parameters, and by adding a time-aware adapter module to the pre-trained model to improve learning.
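To make "fine-tune only the attention blocks" concrete, here is a minimal sketch in plain Python. It is not the paper's code: the parameter names, the `select_trainable` helper, and the choice of the keywords `attn` and `adapter` are all hypothetical, and it assumes the newly added adapter is trained alongside the attention blocks. In a real framework the resulting flags would decide which parameters receive gradient updates.

```python
def select_trainable(param_names, keywords=("attn", "adapter")):
    """Map each parameter name to True (train) or False (keep frozen).

    Assumption: attention and adapter parameters can be recognised by a
    substring of their name. Real models may use a different naming scheme.
    """
    return {name: any(key in name for key in keywords) for name in param_names}


# Hypothetical parameter names for a small U-Net-style denoiser.
names = [
    "down.0.resnet.conv1.weight",
    "down.0.attn.to_q.weight",
    "down.0.attn.to_k.weight",
    "mid.adapter.time_fusion.weight",
    "up.0.resnet.conv2.weight",
]

flags = select_trainable(names)
trainable = [name for name, train in flags.items() if train]
frozen = [name for name, train in flags.items() if not train]
```

Everything listed in `frozen` keeps its pre-trained value, which is the mechanism the summary credits with reducing overfitting on small datasets.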

The key points are:

  1. Time-Fusion Module: the time-aware adapter combines feature vectors with time embeddings.
  2. Cross-Attention Module: applies attention using the time-aware features.
  3. Time-Scaling Module: regulates the adapter's influence based on the time step.
  4. Low overhead: the method adds minimal overhead, only ~1.8% of the model's parameters.
  5. Results: the proposed method outperforms both naive fine-tuning and attention-block-only fine-tuning in terms of cFID and KID scores.
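The three modules in the list can be sketched as one small forward pass. This is an illustration under stated assumptions, not the paper's architecture: fusion is assumed to be "add a linear projection of the time embedding", attention is a single head without learned projections, and the time scale is assumed to be a sigmoid gate. All function names, shapes, and weights are hypothetical. Plain Python is used so the sketch runs without a deep-learning framework.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def time_fusion(tokens, t_emb, W_t):
    """Assumed fusion: add a projected time embedding to every feature vector."""
    shift = matvec(W_t, t_emb)
    return [[x + s for x, s in zip(tok, shift)] for tok in tokens]


def cross_attention(queries, context):
    """Scaled dot-product attention; context serves as both keys and values."""
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in context]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, context)) for j in range(d)])
    return out


def time_scale(t_emb, w_s):
    """Assumed gate in (0, 1): sigmoid of a linear function of the time embedding."""
    z = sum(w * e for w, e in zip(w_s, t_emb))
    return 1.0 / (1.0 + math.exp(-z))


def adapter(tokens, t_emb, W_t, w_s):
    """Residual adapter: features + gate * attention over time-aware features."""
    fused = time_fusion(tokens, t_emb, W_t)
    attended = cross_attention(tokens, fused)
    gate = time_scale(t_emb, w_s)
    return [[x + gate * a for x, a in zip(tok, att)]
            for tok, att in zip(tokens, attended)]


random.seed(0)
d_model, d_time, n_tokens = 4, 3, 5
tokens = [[random.gauss(0, 1) for _ in range(d_model)] for _ in range(n_tokens)]
t_emb = [random.gauss(0, 1) for _ in range(d_time)]
W_t = [[random.gauss(0, 0.1) for _ in range(d_time)] for _ in range(d_model)]
w_s = [0.0] * d_time  # zero weights make the gate exactly 0.5

out = adapter(tokens, t_emb, W_t, w_s)
gate = time_scale(t_emb, w_s)
```

The output keeps the input shape (5 tokens of width 4), so a block like this can sit next to an existing attention layer as a residual branch. The gate is the only place where the time step changes how strongly the adapter contributes.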

A3FT thus offers a faster and more efficient way to adapt diffusion models for high-quality image generation while addressing overfitting and computational cost.


Written by Poonam Saini

PhD Student, Research Associate @ Ulm University
