A Survey on Generative Modeling with Limited Data, Few Shots, and Zero Shot



Singapore University of Technology and Design

Summary

In machine learning, generative modeling aims to learn to generate new data that is statistically similar to the training data distribution. In this paper, we survey learning generative models under limited data, few shots, and zero shot, referred to as Generative Modeling under Data Constraint (GM-DC). This is an important topic when data acquisition is challenging, e.g., in healthcare. We discuss background and challenges, and propose two taxonomies: one on GM-DC tasks and another on GM-DC approaches. Importantly, we study the interactions between different GM-DC tasks and approaches. Furthermore, we highlight research gaps, research trends, and potential avenues for future exploration.

On this webpage, we provide a high-level view of different aspects of our survey. For a more detailed and technical discussion, please see our paper.

GM-DC Research Landscape Overview

The following Sankey diagram illustrates the research landscape of GM-DC, covering the works proposed in GM-DC and the interactions between GM-DC tasks and approaches. More details on the definitions of the tasks and approaches can be found in the respective sections of the proposed taxonomy.

An interactive version of this Sankey diagram can be found here.

Sankey

Trends, Technical Evolution, and Statistics

Our comprehensive analysis of the GM-DC research landscape, coupled with a meticulous analysis of each individual work, enables us to extract trends, technical evolution, and detailed statistics from the literature.

The overall publication statistics in GM-DC, illustrated below, show increasing interest from the research community in this topic. For example, more works were published by late July 2023 than in the whole of 2022.

Fig1

Overall publication statistics in GM-DC. GM-DC Publications (Left): GM-DC publication trends indicate rising interest in this area. We remark that the previous survey covers only ~27% of the publications discussed in our survey. Publication Venues (Right): The distribution of publications across major machine learning and computer vision venues, other venues, and arXiv.


The detailed statistics of the publications in GM-DC are illustrated below. The definition of the tasks and approaches can be found in the respective sections of the proposed taxonomy.

Fig2

Analysis of publications in GM-DC. Data Constraints: Different types of data constraints studied in GM-DC. Models: Different types of models studied, including Generative Adversarial Networks (GAN), Diffusion Models (DM), and Variational Auto-Encoders (VAE). Tasks: Different GM-DC tasks studied; see the task taxonomy table for details on task definitions in our proposed task taxonomy. Approaches: Different approaches applied to address GM-DC; more details on our proposed taxonomy of approaches can be found in the approaches taxonomy table.


Generative Models. Our study shows that three types of generative models are used in GM-DC:

  • Generative Adversarial Network (GAN)
  • Diffusion Model (DM)
  • Variational Auto-Encoder (VAE), and very recently its variant Vector-Quantized VAE (VQ-VAE)


Data Constraints. In the majority of GM-DC works, three data-constraint setups can be observed:

  • Limited Data (LD): 50 to 5000 training samples (images) are given
  • Few-Shot (FS): 1 to 50 training samples are given
  • Zero-Shot (ZS): no training samples are given; usually a text prompt is available to define the target domain

Note that these ranges are typical values, as there are no fixed definitions in the literature.
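As a concrete illustration of these typical ranges, the regimes can be expressed as a simple lookup. The `data_regime` helper below is ours, not from the literature, and it resolves the overlapping boundary at 50 samples in favour of few-shot:

```python
def data_regime(num_samples: int) -> str:
    """Map a training-set size to the GM-DC data-constraint regime.

    Ranges follow the typical values used in the survey; the
    literature has no fixed definitions.
    """
    if num_samples == 0:
        return "ZS"  # zero-shot: no samples; target usually given by a text prompt
    if num_samples <= 50:
        return "FS"  # few-shot: 1 to 50 samples
    if num_samples <= 5000:
        return "LD"  # limited data: 50 to 5000 samples
    return "unconstrained"
```

For example, training a generator on 1k AFHQ-Dog images (as ADA does) falls in the LD regime, while a 10-sample adaptation (as in CDC) is FS.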


Timeline. The figure below illustrates when the first work for each GM-DC task or approach was introduced, based on our proposed taxonomies.

TimeLine

Proposed Task Taxonomy

We propose a taxonomy of 8 tasks based on the attributes of the GM-DC tasks addressed in the literature. The task definition, an example, and an illustration for each task are given in the table below. Please check our paper for more details.

Our proposed taxonomy for tasks in GM-DC. For each task, we extract its key characteristics. [Attributes] C: Conditional generation, P: Pre-trained generator given, I: Images (as input), TP: Text-Prompt (as input), X: X(Cross)-domain adaptation; [Data Constraint] LD: Limited-Data, FS: Few-Shot, ZS: Zero-Shot. ✗/✓ denote absence/presence, respectively.
Task | Attributes (C, P, I, TP, X) | Data Constraint (LD, FS, ZS)

uGM-1
Description:   Given $K$ samples from a domain D, learn to generate diverse and high-quality samples from D.
Example:   ADA learns a StyleGAN2 using 1k images from AFHQ-Dog.

uGM-2
Description:   Given a pre-trained generator on a source domain D$_s$ and $K$ samples from a target domain D$_t$, learn to generate diverse and high-quality samples from D$_t$.
Example:   CDC adapts a pre-trained GAN on FFHQ to Sketches using 10 samples.

uGM-3
Description:   Given a pre-trained generator on a source domain D$_s$ and a text prompt describing a target domain D$_t$, learn to generate diverse and high-quality samples from D$_t$.
Example:   NADA adapts a pre-trained GAN on FFHQ to the painting domain using `Fernando Botero Painting' as input.

cGM-1
Description:   Given $K$ samples with class labels from a domain D, learn to generate diverse and high-quality samples conditioning on the class labels from D.
Example:   CbC trains a conditional generator on 20 classes of ImageNet Carnivores using 100 images per class.

cGM-2
Description:   Given a pre-trained generator on the seen classes $C_{seen}$ of a domain D and $K$ samples with class labels from unseen classes $C_{unseen}$ of D, learn to generate diverse and high-quality samples conditioning on the class labels for $C_{unseen}$ from D.
Example:   LofGAN learns from 85 classes of Flowers to generate images for an unseen class with only 3 samples.

cGM-3
Description:   Given a pre-trained generator on a source domain D$_s$ and $K$ samples with class labels from a target domain D$_t$, learn to generate diverse and high-quality samples conditioning on the class labels from D$_t$.
Example:   VPT adapts a pre-trained conditional generator on ImageNet to Places365 with 500 images per class.

IGM
Description:   Given $K$ samples (usually $K$=1) and assuming rich internal distribution for patches within these samples, learn to generate diverse and high-quality samples with the same internal patch distribution.
Example:   SinDDM trains a generator using a single image of Marina Bay Sands, and generates variants of it.

SGM
Description:   Given a pre-trained generator, $K$ samples of a particular subject, and a text prompt, learn to generate diverse and high-quality samples containing the same subject.
Example:   DreamBooth trains a generator using 4 images of a particular backpack and adapts it with a text-prompt to be in the `grand canyon'.

Proposed Approaches Taxonomy

We propose 7 categories of approaches for GM-DC: Transfer Learning, Data Augmentation, Network Architectures, Multi-Task Objectives, Exploiting Frequency Components, Meta-Learning, and Modeling Internal Patch Distribution. For more details on each category and a comprehensive review of these works, please check our paper. The list of papers, including the link and code for each paper, can also be found in our GitHub Repository for GM-DC.

Our proposed taxonomy for approaches in GM-DC. For each approach, the addressed GM-DC tasks (see Tab. 2 for task definitions) and the data constraints are indicated. A detailed list of methods under each sub-category is also tabulated (some methods are under multiple categories). ✗/✓ denote the absence/presence of the tasks commonly addressed by each approach, and the data constraints usually considered: LD: Limited-Data, FS: Few-Shot, and ZS: Zero-Shot.
Transfer Learning (Sec. 4.1)
Description: Improve GM-DC on target domain by knowledge of a generator pre-trained on source domain (with numerous and diverse samples).
Task: uGM-1   uGM-2   uGM-3   cGM-1   cGM-2   cGM-3   IGM   SGM Data constraint: LD   FS   ZS
1) Regularizer-based Fine-Tuning: Explore regularizers to preserve the source generator's knowledge.
Methods:   TGAN, BSA, FreezeD, EWC, CDC, cGANTransfer, W3, C3, DCL, RSSA, fairTL, GenOSDA, SVD, D3-TGAN, JoJoGAN, KDFSIG, CtlGAN, ICGAN, MaskD, F3, DDPM-PA, DWSC, CSR, ProSC
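As a rough sketch of this idea (not any specific method's implementation), an EWC-style penalty weights the squared deviation of each adapted parameter from its source value by a per-parameter importance score (e.g., a Fisher information estimate), so parameters that mattered in the source model resist change during fine-tuning. The names `ewc_penalty`, `theta`, `theta_src`, and `fisher` are illustrative:

```python
def ewc_penalty(theta, theta_src, fisher, lam=1.0):
    """Elastic-weight-consolidation-style regularizer: penalize changes to
    parameters that were important (high importance value) in the source
    generator, preserving its knowledge during target-domain fine-tuning.

    theta     -- adapted parameters (flat list of floats)
    theta_src -- frozen source-model parameters
    fisher    -- per-parameter importance estimates
    lam       -- regularization strength
    """
    return lam * sum(f * (t - s) ** 2
                     for t, s, f in zip(theta, theta_src, fisher))
```

The penalty is added to the ordinary generative training loss; unimportant parameters (importance near 0) remain free to adapt to the target domain.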
2) Latent Space: Explore latent space of source generator to identify suitable knowledge for adaptation.
Methods:   MineGAN, MineGAN++, LCL, WeditGAN, GenDA, SiSTA, MultiDiffusion
3) Modulation: Leverage trainable modulation weights on top of frozen weights of the source generator.
Methods:   AdaFMGAN, GAN-Memory, CAM-GAN, AdAM, DynaGAN, HyperDomainNet
4) Natural Language-guided: Use the feedback of vision-language models to adapt the source generator with text prompts.
Methods:   StyleGAN-NADA, MTG, HyperDomainNet, DiFa, OneCLIP, IPL, SINE, DreamBooth, MCC, Textual-Inversion, SpecialistDiffusion, BLIP-Diffusion
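The core signal in StyleGAN-NADA-style adaptation can be sketched as a directional loss in CLIP embedding space: the shift between source-generator and adapted-generator image embeddings should align with the shift between the source and target text-prompt embeddings. The sketch below uses plain lists as stand-ins for CLIP embeddings; the function names are illustrative:

```python
import math


def _cos(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)


def directional_loss(img_src, img_tgt, txt_src, txt_tgt):
    """Directional (NADA-style) loss: the image-space direction from the
    source to the adapted generator should match the text-space direction
    from the source prompt to the target prompt. Zero when aligned."""
    d_img = [a - b for a, b in zip(img_tgt, img_src)]
    d_txt = [a - b for a, b in zip(txt_tgt, txt_src)]
    return 1.0 - _cos(d_img, d_txt)
```

Minimizing this loss over the generator's weights steers generated images along the text-specified domain shift without any target-domain training images.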
5) Adaptation-Aware: Preserve the source generator's knowledge that is important to the adaptation task.
Methods:   AdAM, RICK
6) Prompt Tuning: Freeze the source generator and add/generate visual prompts to guide generation for the target domain.
Methods:   VPT
Data Augmentation (Sec. 4.2)
Description: Improve GM-DC by increasing coverage of the data distribution by applying various transformations on the given samples.
Task: uGM-1   uGM-2   uGM-3   cGM-1   cGM-2   cGM-3   IGM   SGM Data constraint: LD   FS   ZS
1) Image-Level Augmentation: Apply data transformations on image space.
Methods:   ADA, DiffAugment, IAG, DiffusionGAN, bCR, CR-GAN, APA, PatchDiffusion
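The key idea behind methods such as ADA and DiffAugment is to augment both real and generated images before the discriminator sees them, so the discriminator never observes clean reals and cannot simply memorize the small training set. A minimal sketch, with a single brightness jitter standing in for the richer, differentiable augmentation pipelines of the actual methods (images here are flat lists of pixel values in [0, 1]):

```python
import random


def diff_augment(batch, rng):
    """Apply one shared transform to every image in a batch; here a
    simple brightness shift (the real methods use differentiable
    color / translation / cutout pipelines)."""
    shift = rng.uniform(-0.2, 0.2)
    return [[min(1.0, max(0.0, px + shift)) for px in img] for img in batch]


def discriminator_step(d, reals, fakes, rng):
    """Core of ADA/DiffAugment-style training: augment BOTH real and
    fake batches before scoring them with the discriminator d."""
    return d(diff_augment(reals, rng)), d(diff_augment(fakes, rng))
```

Because the same augmentations hit real and fake images alike, the generator is not encouraged to produce augmented-looking samples, yet the discriminator's effective training set is greatly enlarged.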
2) Feature-Level Augmentation: Apply data transformations on the feature space.
Methods:   AdvAug, AFI
3) Transformation-Driven Design: Leverage the information of individual transformations to design an efficient learning mechanism.
Methods:   DAG, SSGAN-LA
Network Architectures (Sec. 4.3)
Description: Design specific architectures for the generator to improve its learning under data constraints.
Task: uGM-1   uGM-2   uGM-3   cGM-1   cGM-2   cGM-3   IGM   SGM Data constraint: LD   FS   ZS
1) Feature Enhancement: Design additional modules/layers to enhance/retain the feature maps of the generator for better generative modeling.
Methods:   FastGAN, MoCA, DFSGAN, SCHA-VAE
2) Ensemble Large Pre-trained Vision Models: Improve architecture by integrating pre-trained vision models to enable more accurate GM-DC.
Methods:   Vision-aided GAN, ProjectedGAN
3) Dynamic Network Architecture: Improve generative learning with limited data by evolving the generator architecture during training.
Methods:   CbC, DynamicD, AdvAug, Re-GAN, AutoInfoGAN
Multi-Task Objectives (Sec. 4.4)
Description: Introduce additional task(s) to extract generalizable representations that are useful for all tasks, to reduce overfitting under data constraints.
Task: uGM-1   uGM-2   uGM-3   cGM-1   cGM-2   cGM-3   IGM   SGM Data constraint: LD   FS   ZS
1) Regularizer: Add an additional task objective as a regularizer to prevent undesirable behaviour while training the generative model.
Methods:   LeCam, DigGAN, MDL, RegLA
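As an illustration of this idea, a LeCam-style regularizer anchors the discriminator's real scores to a moving average of its fake scores (and vice versa), limiting how sharply the discriminator can separate the two distributions under limited data. A simplified sketch with scalar discriminator outputs; the class name and decay value are illustrative:

```python
from statistics import mean


class LeCamReg:
    """LeCam-style regularizer sketch: keep discriminator outputs on real
    and fake batches close to moving-average anchors of each other,
    discouraging the overconfident discriminator that overfitting on
    small datasets tends to produce."""

    def __init__(self, decay=0.99):
        self.decay = decay
        self.ema_real = 0.0
        self.ema_fake = 0.0

    def __call__(self, d_real, d_fake):
        # Track exponential moving averages of D's mean outputs.
        self.ema_real = self.decay * self.ema_real + (1 - self.decay) * mean(d_real)
        self.ema_fake = self.decay * self.ema_fake + (1 - self.decay) * mean(d_fake)
        # Penalize real scores far from the fake anchor, and vice versa.
        r = mean([(x - self.ema_fake) ** 2 for x in d_real])
        f = mean([(x + self.ema_real) ** 2 for x in d_fake])
        return r + f
```

The returned value is added, with a small weight, to the discriminator loss at each step.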
2) Contrastive Learning: Introduce a pretext task to enhance the learning process of the generative model.
Methods:   InsGen, FakeCLR, DCL, C3, CtlGAN, IAG, CML-GAN
3) Masking: Mask part of the image/information to increase task hardness and prevent learning trivial solutions.
Methods:   MaskedGAN, MaskD, DMD
4) Knowledge Distillation: Add a task objective that enforces the generator to follow a strong teacher.
Methods:   KD-DLGAN, KDFSIG
5) Prototype Learning: Emphasize learning prototypes for samples/concepts within the distribution as an additional task objective.
Methods:   ProtoGAN, MoCA
6) Other Multi-Task Objectives: Apply other types of multi-task objectives including co-training, patch-level learning, and diffusion.
Methods:   GenCo, PatchDiffusion, AnyRes-GAN, DiffusionGAN, D2C, AdaptiveIMLE, FSDM
Exploiting Frequency Components (Sec. 4.5)
Description: Exploit frequency components to improve learning the generative model by reducing frequency bias.
Task: uGM-1   uGM-2   uGM-3   cGM-1   cGM-2   cGM-3   IGM   SGM Data constraint: LD   FS   ZS
Methods:   FreGAN, WaveGAN, MaskedGAN, GenCo
Meta-Learning (Sec. 4.6)
Description: Learn meta-knowledge from seen classes to improve generator learning for unseen classes.
Task: uGM-1   uGM-2   uGM-3   cGM-1   cGM-2   cGM-3   IGM   SGM Data constraint: LD   FS   ZS
1) Optimization: Learn initialization weights from the seen classes as meta-knowledge to enable quick adaptation to unseen classes.
Methods:   GMN, FIGR, Dawson, FAML, CML-GAN
2) Transformation: Learn sample transformations from the seen classes as meta-knowledge and use them for sample generation for unseen classes.
Methods:   DAGAN, DeltaGAN, Disco, AGE, SAGE, HAE, LSO
3) Fusion: Learn to fuse the samples of the seen classes as meta-knowledge, and apply learned meta-knowledge to generation for unseen classes.
Methods:   MatchingGAN, F2GAN, LofGAN, WaveGAN, AMMGAN
Modeling Internal Patch Distribution (Sec. 4.7)
Description: Learn the internal patch distribution within one image to generate diverse samples with the same visual content (patch distribution).
Task: uGM-1   uGM-2   uGM-3   cGM-1   cGM-2   cGM-3   IGM   SGM Data constraint: LD   FS   ZS
1) Progressive Training: Train a generative model progressively to learn the patch distribution at different scales/noise levels.
Methods:   SinDiffusion, SinDDM, Deff-GAN, BlendGAN, SinGAN, ConSinGAN
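The multi-scale setup used by SinGAN-style methods can be sketched as an image pyramid built from the single training image, with one generator stage trained per scale, coarse-to-fine. The sketch below represents an image as a 2-D list and uses plain 2x2 average pooling as a stand-in for proper downsampling:

```python
def image_pyramid(img, num_scales):
    """Build the multi-scale pyramid used in SinGAN-style progressive
    training: the single training image at progressively coarser
    resolutions. Downscaling here is simple 2x2 average pooling on a
    2-D list of pixel values; pyramid[0] is finest, pyramid[-1] coarsest."""
    pyramid = [img]
    for _ in range(num_scales - 1):
        cur = pyramid[-1]
        h, w = len(cur) // 2, len(cur[0]) // 2
        if h < 1 or w < 1:
            break  # cannot downscale further
        pooled = [[(cur[2 * i][2 * j] + cur[2 * i][2 * j + 1]
                    + cur[2 * i + 1][2 * j] + cur[2 * i + 1][2 * j + 1]) / 4.0
                   for j in range(w)]
                  for i in range(h)]
        pyramid.append(pooled)
    return pyramid
```

Training starts at the coarsest scale, where patches capture global layout, and each finer stage refines the upsampled output of the previous one, so patch statistics are learned at every scale.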
2) Non-progressive Training: Train a generative model at a single scale/noise level, but with changes to the model's architecture.
Methods:   SinFusion, One-Shot GAN

BibTeX

@article{abdollahzadeh2023survey,
  author        = {Milad Abdollahzadeh
                   and Touba Malekzadeh
                   and Christopher T. H. Teo
                   and Keshigeyan Chandrasegaran
                   and Guimeng Liu
                   and Ngai-Man Cheung},
  title         = {A Survey on Generative Modeling with Limited Data, Few Shots, and Zero Shot},
  year          = {2023},
  eprint        = {2307.14397},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV}
}