Introduction to Generative Adversarial Networks

Marco Bertini, Lorenzo Seidenari, June-July 2019; 12 hours, 3 CFU

Recent advances in computer vision and machine learning are mostly directed towards supervised methods, where the coupling of images and their semantic information is exploited to train deep convolutional neural networks. This approach is also regarded as discriminative and formally models the conditional probability P(y | I), where y is the semantic label (e.g. cat, dog, person) and I is the input image.

Generative models instead aim at modelling P(I), that is, the probability of observing a certain image I. An interesting property of generative models is the ability to sample from the learned distribution, which in our case means creating unseen images.

This short course will first cover the basics of deep architectures and the main concepts of generative models, both explicit and implicit. The course will then cover the recent approach of Generative Adversarial Networks (GANs), their relation to Nash equilibria, and how to train such models in a stable manner. Finally, we will learn how to condition image generation, leading to conditional GAN (cGAN) models. cGANs will help us solve many interesting applications such as compression artifact removal, super-resolution, and style and texture transfer.

We will provide examples in Python that can be used to reproduce results.


Lecture 1)

Deep Learning and Convolutional Network refresher. To make the course self-contained, we give a brief introduction to modern deep learning architectures and algorithms, with a focus on CNNs for image classification. Adversarial examples: how to attack a neural network, or how to make it more general. Generative models vs. discriminative models. Explicit vs. implicit models. Fitting explicit models, e.g. mixtures of Gaussians.
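As a concrete taste of fitting an explicit generative model, the following sketch runs EM for a two-component 1D Gaussian mixture and then samples from the fitted model. It uses only NumPy; the data, initialisation, and iteration count are illustrative choices of ours, not prescribed by the course.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data: two well-separated Gaussian clusters
x = np.concatenate([rng.normal(-2.0, 0.5, 500), rng.normal(3.0, 1.0, 500)])

# Initial mixture parameters (means, std devs, mixing weights)
mu = np.array([-1.0, 1.0])
sigma = np.array([1.0, 1.0])
pi = np.array([0.5, 0.5])

def gauss(v, m, s):
    return np.exp(-0.5 * ((v - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

for _ in range(50):
    # E-step: responsibility of each component for each point
    r = pi * gauss(x[:, None], mu, sigma)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate parameters from the responsibilities
    n_k = r.sum(axis=0)
    mu = (r * x[:, None]).sum(axis=0) / n_k
    sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / n_k)
    pi = n_k / len(x)

# Sampling: pick a component by weight, then draw from its Gaussian
comp = rng.choice(2, size=5, p=pi)
samples = rng.normal(mu[comp], sigma[comp])
```

Sampling is what distinguishes a generative model: once P(I) (here, a density over 1D points) is fitted, new "unseen" data can be drawn from it.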

Lecture 2)

Generative Adversarial Models. Combining generative models with the adversarial learning framework. The optimization problem and its relation to Nash equilibrium. Failure modes and countermeasures to stabilize training. Quantitative evaluation of generated image quality: Inception Score and Fréchet Inception Distance (FID).
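To make the FID concrete, the sketch below computes the Fréchet distance between Gaussian fits of two feature sets, ||μ₁−μ₂||² + Tr(Σ₁+Σ₂−2(Σ₁Σ₂)^½. The trace of the matrix square root is computed via the eigenvalues of the product (real and non-negative for PSD covariances); the synthetic "features" stand in for Inception activations, which is our simplification.

```python
import numpy as np

rng = np.random.default_rng(0)

def fid(feats_real, feats_fake):
    """Fréchet distance between Gaussian fits of two feature sets."""
    mu1, mu2 = feats_real.mean(0), feats_fake.mean(0)
    c1 = np.cov(feats_real, rowvar=False)
    c2 = np.cov(feats_fake, rowvar=False)
    # Tr((C1 C2)^(1/2)) via eigenvalues of the product matrix
    tr_sqrt = np.sqrt(np.abs(np.linalg.eigvals(c1 @ c2))).sum()
    return float(np.sum((mu1 - mu2) ** 2)
                 + np.trace(c1) + np.trace(c2) - 2.0 * tr_sqrt)

real = rng.normal(0.0, 1.0, size=(2000, 8))        # "real" features
fake_good = rng.normal(0.0, 1.0, size=(2000, 8))   # matching distribution
fake_bad = rng.normal(1.5, 1.0, size=(2000, 8))    # mean-shifted distribution

fid_same = fid(real, fake_good)   # near zero
fid_shift = fid(real, fake_bad)   # large: dominated by the squared mean shift
```

A lower FID means the generated feature distribution is closer to the real one; in practice both sets are Inception-network activations rather than raw samples.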

Lecture 3)

Advanced GAN formulations. Avoiding mode collapse: Wasserstein GANs. Autoencoder GANs.

Training multiple independent GANs (SGANs). Reducing vanishing gradients: Least Squares GANs.
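The formulations above differ mainly in the discriminator (or critic) objective. As a side-by-side sketch, here are the three discriminator losses as plain NumPy functions of the scores on a batch; the example score values are ours, chosen to show how a near-perfect discriminator is scored under each objective.

```python
import numpy as np

def d_loss_vanilla(d_real, d_fake):
    # Standard GAN: D outputs probabilities; minimize -[log D(x) + log(1 - D(G(z)))]
    return float(-(np.log(d_real) + np.log(1.0 - d_fake)).mean())

def d_loss_lsgan(d_real, d_fake):
    # Least Squares GAN: D outputs raw scores, regressed to 1 (real) and 0 (fake)
    return float(0.5 * ((d_real - 1.0) ** 2 + d_fake ** 2).mean())

def d_loss_wgan(d_real, d_fake):
    # Wasserstein critic: maximize the score gap between real and fake (no sigmoid)
    return float(-(d_real.mean() - d_fake.mean()))

# Scores from a near-perfect discriminator: real close to 1, fake close to 0
d_real = np.array([0.99, 0.98])
d_fake = np.array([0.01, 0.02])
```

The least-squares objective penalizes samples even on the correct side of the decision boundary if their scores are far from the target, which is what mitigates the vanishing-gradient problem of the saturating vanilla loss.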

Lecture 4)

Conditional image generation. Conditioning the generator using discrete variables (text or category labels). Conditioning using images, in both paired and unpaired settings. Applications: super-resolution, artifact removal, style transfer.
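A common way to condition on a discrete class is to concatenate a class code to the noise vector before it enters the generator. The sketch below shows this input construction with a one-hot code; the dimensions and function name are illustrative, not from a specific cGAN implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

NOISE_DIM, N_CLASSES = 16, 10

def make_generator_input(batch_labels, noise_dim=NOISE_DIM, n_classes=N_CLASSES):
    """Condition the generator by appending a one-hot class code to the noise z."""
    z = rng.normal(size=(len(batch_labels), noise_dim))
    onehot = np.eye(n_classes)[batch_labels]   # one row of the identity per label
    return np.concatenate([z, onehot], axis=1)

# Generator inputs for one sample of class 3 and one of class 7
x = make_generator_input([3, 7])
```

In a full cGAN the discriminator receives the same conditioning information, so it can reject samples that are realistic but belong to the wrong class; learned embeddings are often used in place of one-hot codes for large label sets.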

Evaluation

Students can either present a paper on generative models from a recent NIPS, CVPR, ICCV or ECCV conference, or perform image generation experiments on a dataset of their choice.