Large Scale GAN Training for High Fidelity Natural Image Synthesis (BigGAN)
At this reading group we discussed a paper that achieved state-of-the-art results for image generation on the ImageNet dataset.
Abstract
Despite recent progress in generative image modeling, successfully generating high-resolution, diverse samples from complex datasets such as ImageNet remains an elusive goal. To this end, we train Generative Adversarial Networks at the largest scale yet attempted, and study the instabilities specific to such scale. We find that applying orthogonal regularization to the generator renders it amenable to a simple “truncation trick”, allowing fine control over the trade-off between sample fidelity and variety by truncating the latent space. Our modifications lead to models which set the new state of the art in class-conditional image synthesis. When trained on ImageNet at 128x128 resolution, our models (BigGANs) achieve an Inception Score (IS) of 166.3 and Frechet Inception Distance (FID) of 9.6, improving over the previous best IS of 52.52 and FID of 18.65.
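To make the "truncation trick" concrete: the idea is to sample the generator's latent vector z from a truncated normal, restricting every coordinate to a chosen magnitude threshold. Shrinking the threshold pushes samples toward the mode of the latent distribution, boosting fidelity at the cost of variety. Below is a minimal NumPy/SciPy sketch of such a sampler; the function name, signature, and example latent size are illustrative assumptions, not code from the paper.

```python
import numpy as np
from scipy.stats import truncnorm

def truncated_z_sample(batch_size, dim_z, truncation, seed=None):
    """Sample latents from a standard normal truncated to
    [-truncation, truncation] in every coordinate.

    Smaller thresholds concentrate z near the mode (higher fidelity,
    lower variety); larger thresholds approach plain N(0, I) sampling.
    """
    rng = np.random.RandomState(seed)
    return truncnorm.rvs(-truncation, truncation,
                         size=(batch_size, dim_z), random_state=rng)

# Hypothetical usage: 8 latents of dimension 128 with aggressive truncation,
# favoring sample fidelity over diversity.
z = truncated_z_sample(batch_size=8, dim_z=128, truncation=0.5)
```

The paper notes that truncation only works well when the generator is smoothed, which is why the authors apply orthogonal regularization; without it, many models produce saturation artifacts when fed truncated latents.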
Here is the link to the paper: https://arxiv.org/abs/1809.11096. The authors also included a Google Colab implementation of the pretrained generator which highlights some of the results they discussed. There is also an independent blog post that discusses the paper.
The paper mentions various other methods and techniques used for training GANs. For those interested, here are some resources that explain some of them: