---
datasets:
- cifar10
- https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/
---

A GAN trained on [CIFAR10 (Airplane)](https://www.tensorflow.org/datasets/catalog/cifar10) and [FGVC Aircraft](https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/) images. The model combines [Progressive Growing](https://arxiv.org/pdf/1710.10196.pdf) with [Spectral Normalization](https://arxiv.org/pdf/1802.05957.pdf).

Try out this model [here](https://huggingface.co/spaces/PrakhAI/AIPlane).

| Generated Images | Real Images (for comparison) |
| -------- | --------- |
|  |  |

# Training Progression
<video width="50%" controls>
<source src="https://cdn-uploads.huggingface.co/production/uploads/649f9483d76ca0fe679011c2/qFlnTITZwS3DSTxLp0Oa8.mp4" type="video/mp4">
</video>

# Details
[Colab Notebook](https://colab.research.google.com/drive/1b4KFZOnLERwQW_3jQ8FMABepKEAcDIK7?usp=sharing)

The model generates 32x32 images of airplanes. It was trained on an NVIDIA T4 Colab runtime.

The Critic consists of strided convolutional layers (3x3 kernel) for downsampling, with Leaky ReLU activations. The Critic uses [Spectral Normalization](https://arxiv.org/pdf/1802.05957.pdf), described in more detail [here](#spectral-normalization).

The Generator uses strided transposed convolutions (2x2 kernel) for upsampling, with ReLU activations. The Generator uses the variant of pixel-level Local Response Normalization proposed in the [Progressive Growing](https://arxiv.org/pdf/1710.10196.pdf) paper.

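The pixel-level normalization used by the Generator can be illustrated with a minimal sketch. It follows the formula from the Progressive Growing paper, where each pixel's feature vector is divided by the square root of its mean squared value plus a small epsilon. The function names `pixel_norm` and `pixel_norm_image` are hypothetical and written in dependency-free Python for clarity; the actual model code in the notebook may be organized differently.

```python
import math


def pixel_norm(features, eps=1e-8):
    """Rescale one pixel's feature vector to (approximately) unit RMS.

    `features` holds the channel values at a single spatial location.
    `eps` guards against division by zero for an all-zero vector.
    """
    mean_sq = sum(v * v for v in features) / len(features)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale for v in features]


def pixel_norm_image(image, eps=1e-8):
    """Apply pixel_norm at every location of an H x W x C nested list."""
    return [[pixel_norm(pixel, eps) for pixel in row] for row in image]


if __name__ == "__main__":
    normalized = pixel_norm([3.0, 4.0])
    rms = math.sqrt(sum(v * v for v in normalized) / len(normalized))
    print(normalized, rms)
```

Because the normalization is applied independently at each pixel and has no learned parameters, it keeps feature magnitudes in check without depending on batch statistics.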
# Spectral Normalization

Spectral Normalization is a technique for stabilizing GAN training, proposed in [this paper](https://arxiv.org/pdf/1802.05957.pdf).

It constrains the Critic (Discriminator) to be Lipschitz continuous with respect to the space of input images, which bounds its gradients and helps avoid exploding gradients.

In practice, Spectral Normalization works very well to stabilize training of this GAN, as the example below shows (comparison at equivalent points during training):

| Batch Normalization | Spectral Normalization |
| ----------- | ------------ |
|  |  |

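The core operation behind Spectral Normalization is dividing a weight matrix by an estimate of its largest singular value, obtained by power iteration. The sketch below illustrates that idea on a small dense matrix in dependency-free Python. The helper names, the fixed all-ones starting vector, and the iteration count are assumptions made for illustration; this is not the code used to train the model.

```python
import math


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def l2_normalize(vec, eps=1e-12):
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]


def spectral_normalize(weight, n_iters=50):
    """Return (weight / sigma, sigma), where sigma estimates the largest
    singular value of `weight` via power iteration.

    Assumes a non-zero weight matrix. The all-ones start is a deterministic
    choice for this sketch; a random start is the more common choice.
    """
    weight_t = transpose(weight)
    u = l2_normalize([1.0] * len(weight))
    for _ in range(n_iters):
        v = l2_normalize(matvec(weight_t, u))
        u = l2_normalize(matvec(weight, v))
    sigma = sum(ui * wi for ui, wi in zip(u, matvec(weight, v)))
    return [[w / sigma for w in row] for row in weight], sigma


if __name__ == "__main__":
    normalized, sigma = spectral_normalize([[3.0, 0.0], [0.0, 1.0]])
    print(sigma)
    print(normalized)
```

After normalization the matrix has a largest singular value of about 1, so the corresponding linear layer cannot stretch any input direction by more than that factor. In the paper, the estimate is refreshed cheaply during training by reusing the vector `u` across steps rather than running many iterations each time.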
# Progressive Growing

Progressive Growing of GAN resolutions was proposed to improve the quality and stability of GAN training, especially for high-resolution models (e.g. 1024x1024).

For 32x32 images of airplanes, even a short initial round of Progressive Growing provides a significant improvement (comparison at equivalent points during training):

| Flat Growing | Progressive Growing |
| ----------- | ------------ |
|  |  |

The Generator for this model produces 4x4, 8x8, 16x16 and 32x32 images, which form the inputs to the Critic. Each resolution is associated with a 'weight' (α<sub>4</sub>, α<sub>8</sub>, α<sub>16</sub>, α<sub>32</sub>), which indicates how much the training focuses on that resolution at any given time.

At the beginning of training, α<sub>4</sub>=1, α<sub>8</sub>=0, α<sub>16</sub>=0, α<sub>32</sub>=0. Towards the end, α<sub>4</sub>=0, α<sub>8</sub>=0, α<sub>16</sub>=0, α<sub>32</sub>=1.
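
Only the start and end values of the weights are stated above, so the schedule in between is not specified. As one possible illustration, the sketch below assumes the focus cross-fades linearly from each resolution to the next as training progresses. The function `resolution_weights`, the `progress` parameter, and the linear cross-fade are all assumptions for this example and may not match the schedule used in the notebook.

```python
RESOLUTIONS = (4, 8, 16, 32)


def resolution_weights(progress):
    """Map training progress in [0, 1] to weights {4: a4, 8: a8, 16: a16, 32: a32}.

    Assumption: focus cross-fades linearly between neighbouring resolutions,
    so at most two weights are non-zero and they always sum to 1.
    """
    progress = min(max(progress, 0.0), 1.0)
    # 0.0 means "4x4 only", 3.0 means "32x32 only".
    position = progress * (len(RESOLUTIONS) - 1)
    lower = min(int(position), len(RESOLUTIONS) - 2)
    frac = position - lower
    weights = [0.0] * len(RESOLUTIONS)
    weights[lower] = 1.0 - frac
    weights[lower + 1] = frac
    return dict(zip(RESOLUTIONS, weights))


if __name__ == "__main__":
    for progress in (0.0, 0.25, 0.5, 1.0):
        print(progress, resolution_weights(progress))
```

With this schedule, `resolution_weights(0.0)` puts all weight on 4x4 and `resolution_weights(1.0)` puts all weight on 32x32, matching the start and end states described above.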