Deep learning models require substantial amounts of data to achieve high accuracy and robustness. However, the scarcity of authentic data presents a significant challenge. This project aims to address this issue by generating synthetic images to supplement the limited available data, exploring the impact of this augmentation on the robustness and accuracy of deep learning classification models.
The goal of this project is to enhance the performance of a Convolutional Neural Network (CNN) classifier for emotion recognition by addressing the challenge of limited data availability. The objectives are as follows:
- Synthesize Data: Employ a Generative Adversarial Network (GAN) to generate synthetic images that mimic the real images in the dataset.
- Iterative Refinement: Evaluate the GAN-generated images with the CNN classifier and retain only those the classifier scores with high confidence; the retained images are used to further train the GAN. This feedback loop is designed to progressively improve the quality of the synthetic images.
- Model Enhancement: Leverage the refined synthetic data to bolster the training set for the CNN classifier. By incorporating a larger and more diverse set of training examples, the goal is to improve the classifier's ability to accurately recognize the five target emotions.
- Performance Evaluation: Conduct a thorough comparison of the CNN classifier's performance when trained on the original dataset versus the enhanced dataset containing high-confidence GAN-generated images.
This project aims to improve the accuracy and robustness of the CNN emotion classifier by utilizing GAN-generated data to overcome the challenges associated with data scarcity.
The dataset used for this project, FER2013, is available on Kaggle at the following link: FER2013 Kaggle Dataset.
A Generative Adversarial Network (GAN) is trained using the WGAN-GP (Wasserstein GAN with Gradient Penalty) framework to produce synthetic images that resemble real human emotions. The CNN classifier is first trained on the authentic dataset to establish a baseline. It is then used to assess the quality of the GAN-generated images, and only those that meet a defined confidence threshold are added to the training set. This process is repeated to refine the quality of the generated images and improve the classifier's performance.
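The loop below is a minimal sketch of this pipeline, assuming a PyTorch-style implementation. The helper callables (`train_wgan_gp`, `generate_images`, `filter_by_confidence`, `train_classifier`) are illustrative placeholders, not functions from this repository.

```python
def iterative_refinement(generator, critic, classifier, real_loader,
                         train_wgan_gp, generate_images,
                         filter_by_confidence, train_classifier,
                         num_rounds=5, threshold=0.9, n_images=5000):
    """Run the GAN <-> classifier feedback loop (hypothetical helper names)."""
    for _ in range(num_rounds):
        # 1. Train / fine-tune the GAN under the WGAN-GP objective.
        generator, critic = train_wgan_gp(generator, critic, real_loader)
        # 2. Sample synthetic faces from the current generator.
        synthetic = generate_images(generator, n_images)
        # 3. Keep only images the classifier scores above the confidence threshold.
        accepted = filter_by_confidence(classifier, synthetic, threshold)
        # 4. Retrain the classifier on real plus accepted synthetic images.
        classifier = train_classifier(classifier, real_loader, accepted)
    return generator, classifier
```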
- An initial warm-up of the generator is employed to kickstart the GAN's learning process.
- By warming up the generator, we aim to produce initial synthetic images that already bear some resemblance to the target dataset, improving the efficiency of the subsequent epochs (a minimal warm-up sketch follows this list).
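One plausible reading of this warm-up, sketched below, is to pretrain the generator for a few epochs against real batches using the perceptual loss described next. The optimizer, learning rate, latent size, and loss choice are assumptions for illustration; the project's exact warm-up recipe may differ.

```python
import torch

def warm_up_generator(generator, real_loader, perceptual_loss,
                      epochs=2, latent_dim=128, device="cpu"):
    """Hypothetical warm-up: nudge the generator toward face-like outputs
    before adversarial training by minimizing a perceptual loss against
    real batches."""
    opt = torch.optim.Adam(generator.parameters(), lr=1e-4)
    generator.train()
    for _ in range(epochs):
        for real, _ in real_loader:
            real = real.to(device)
            z = torch.randn(real.size(0), latent_dim, device=device)
            fake = generator(z)
            loss = perceptual_loss(fake, real)  # match feature statistics of real faces
            opt.zero_grad()
            loss.backward()
            opt.step()
    return generator
```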
Perceptual Loss: This loss function evaluates the difference in feature representations between the real and generated images. By minimizing this loss, the generator is trained to create images that not only fool the discriminator but also closely resemble the feature distribution of real images, enhancing the visual quality of the generated images.
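A minimal sketch of such a perceptual loss is shown below, assuming a frozen, pretrained VGG16 from torchvision as the feature extractor; the backbone and the feature layer are assumptions for illustration, not necessarily the ones used in this project.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg16, VGG16_Weights

class PerceptualLoss(torch.nn.Module):
    """Compare images in the feature space of a frozen, pretrained VGG16."""
    def __init__(self, layer_index=16):
        super().__init__()
        features = vgg16(weights=VGG16_Weights.DEFAULT).features[:layer_index]
        for p in features.parameters():
            p.requires_grad = False
        self.features = features.eval()

    def forward(self, fake, real):
        # FER2013 images are single-channel; VGG expects 3 channels.
        if fake.size(1) == 1:
            fake = fake.repeat(1, 3, 1, 1)
            real = real.repeat(1, 3, 1, 1)
        return F.mse_loss(self.features(fake), self.features(real))
```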
Generator Loss: The generator loss motivates the generator to create images that the discriminator will classify as real. It is a measure of the generator's success in deceiving the discriminator, and minimizing this loss improves the generator's ability to produce realistic images.
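Under the WGAN formulation, this amounts to minimizing the negated critic score on generated images; the sketch below also shows how a weighted perceptual term could be folded in, with the weight `lam` being an illustrative assumption.

```python
def generator_loss(critic, fake_images, perceptual_term=None, lam=1.0):
    """WGAN generator objective: maximize the critic's score on fakes,
    i.e. minimize its negation, optionally plus a perceptual term."""
    loss = -critic(fake_images).mean()
    if perceptual_term is not None:
        loss = loss + lam * perceptual_term
    return loss
```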
Discriminator Loss:
- The discriminator (critic) loss incorporates a gradient penalty that keeps the norm of the critic's gradients close to 1 on points between real and generated images, enforcing a soft Lipschitz constraint. This regularization promotes stable training and improves the quality of the generated images by maintaining a balanced adversarial relationship with the generator (a minimal gradient-penalty sketch follows).
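The sketch below follows the standard WGAN-GP recipe: score the critic on random interpolations of real and fake images and penalize deviations of the gradient norm from 1. The penalty weight and how it enters the critic loss are left to the caller.

```python
import torch

def gradient_penalty(critic, real, fake, device="cpu"):
    """WGAN-GP penalty: push the critic's gradient norm toward 1 on points
    interpolated between real and generated images."""
    eps = torch.rand(real.size(0), 1, 1, 1, device=device)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(interp)
    grads = torch.autograd.grad(outputs=scores, inputs=interp,
                                grad_outputs=torch.ones_like(scores),
                                create_graph=True)[0]
    grad_norm = grads.view(grads.size(0), -1).norm(2, dim=1)
    return ((grad_norm - 1) ** 2).mean()

# Critic loss = critic(fake).mean() - critic(real).mean() + lambda_gp * gradient_penalty(...)
```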
- GAN training yields reasonably good generated images, though not all are usable. By inspection, we can identify promising images for the next stage.
- In Stage 2, we employ the classifier to select the best images, which are used both to reinforce the GAN and to enhance the classifier's accuracy.
- The GAN-generated images are evaluated by the CNN classifier to ensure they are of high enough quality to be used for training. The classifier's confidence levels are used to filter out less convincing images (a minimal filtering sketch follows this list).
- This process is iterated, with each cycle improving the classifier's performance and the GAN's synthetic image quality.
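The sketch below shows one way to implement the confidence-based filtering step, assuming the classifier outputs raw logits; the 0.9 threshold and the use of predicted classes as pseudo-labels are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def filter_by_confidence(classifier, images, threshold=0.9):
    """Keep only generated images that the classifier labels with high
    confidence, returning the accepted images and their pseudo-labels."""
    classifier.eval()
    probs = F.softmax(classifier(images), dim=1)
    conf, labels = probs.max(dim=1)
    keep = conf >= threshold
    return images[keep], labels[keep]
```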
- The iterative refinement process leads to a measurable improvement in the CNN classifier's performance.
- The project demonstrates that GAN-generated images can effectively supplement a limited dataset, leading to better generalization and robustness of the classifier.
- Investigating more sophisticated GAN architectures and training strategies to further enhance image quality.
- Expanding the approach to other domains where data scarcity is a problem.
- Exploring the use of synthetic data in other machine learning tasks beyond classification.
This ongoing research effort was initiated during the summer of 2023 and recently revisited. The progression and outcomes of this project pave the way for future research in multiple areas:
- Further optimization of GAN architectures to improve the quality and diversity of synthetic data.
- Application of synthetic data generation to a wider range of deep learning tasks and challenges.
- Exploration of the impact of synthetic data on model robustness in various real-world scenarios.
The learnings and methodologies from this project could significantly contribute to fields where data is scarce or privacy concerns limit the availability of real datasets. By continuing to refine and adapt these techniques, we can push the boundaries of what's possible with synthetic data in machine learning.