
---
runme:
  id: 01HSEEWGP8NWV3ZH52D49ZNGHM
  version: v3
---

# CPSC542 Assignment 2

## Student Information

- Student Name: Devyn Miller
- Student ID: 2409539

## Collaboration

- In-class collaborator: Hayden Fargo (we did not work on the same code, only bounced ideas off one another)

## Resources

### Data Source

- Data source: the `oxford_iiit_pet` dataset, loaded via `tfds.load('oxford_iiit_pet:3.*.*', with_info=True)` (see the sketch below)
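
A minimal loading sketch; the split names and feature keys are those published by TFDS for this dataset:

```python
import tensorflow_datasets as tfds

# Load the dataset together with its metadata (split sizes, feature specs).
dataset, info = tfds.load('oxford_iiit_pet:3.*.*', with_info=True)
train_ds, test_ds = dataset['train'], dataset['test']

# Each example is a dict whose features include 'image' and 'segmentation_mask'.
print(info.splits)
```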

### Code Repository

## Project Organization and Pipeline Overview

This project builds an image segmentation pipeline that applies deep learning and computer vision techniques to images from the Oxford-IIIT Pet Dataset, with the goal of accurately distinguishing pets from their backgrounds across varied settings and poses. The pipeline is built with the TensorFlow and Keras libraries. Through careful organization and documentation, I aimed to create a transparent and reproducible workflow that addresses the challenges of pet image segmentation.

## Project Structure

- `README.md`: Project overview, setup instructions, and additional notes.
- `src/`: Source code for the project.
  - `preprocessing.py`: Functions for data loading and preprocessing.
  - `augmentation.py`: Data augmentation techniques.
  - `model.py`: The U-Net model architecture.
  - `training.py`: The model training process.
  - `metrics.py`: Evaluation metrics and performance analysis.
  - `grad_cam.py`: Grad-CAM visualizations for model interpretation.
- `figures/`: Static figures and plots.
  - Various PNG images from the initial exploratory data analysis.
  - `model_structures/`: PNG images of the model architectures I experimented with.
- `main.ipynb`: Main notebook with the project walkthrough, including EDA and results.
- `eda.ipynb`: Additional EDA not shown in `main.ipynb`.
- `model_history.json`: The model's training history, saved for analysis.
- `models/`: The trained model is hosted externally due to size constraints.

## Data Preprocessing

I started by preprocessing the data to make it suitable for training a deep learning model. This involved loading the dataset using TensorFlow Datasets (TFDS), normalizing the pixel values of the images to the range [0, 1], and resizing both the images and their corresponding segmentation masks to a uniform dimension of 128x128 pixels. The preprocessing steps are encapsulated in the `preprocess` function within `src/preprocessing.py`.
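
A minimal sketch of those steps, assuming the common TFDS segmentation recipe; the actual `preprocess` in `src/preprocessing.py` may differ in details such as the mask label shift:

```python
import tensorflow as tf

IMG_SIZE = 128

def preprocess(datapoint):
    # Resize image and mask to a uniform 128x128.
    image = tf.image.resize(datapoint['image'], (IMG_SIZE, IMG_SIZE))
    mask = tf.image.resize(datapoint['segmentation_mask'],
                           (IMG_SIZE, IMG_SIZE),
                           method='nearest')  # nearest keeps labels discrete
    # Normalize pixel values to [0, 1].
    image = tf.cast(image, tf.float32) / 255.0
    # TFDS masks are labeled {1, 2, 3}; shifting to {0, 1, 2} is the usual
    # convention for sparse losses (an assumption about the project's setup).
    mask -= 1
    return image, mask

# train_ds / test_ds come from the loading sketch above.
train = train_ds.map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
test = test_ds.map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
```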

## Data Augmentation

To improve the model's generalization capability, I implemented data augmentation techniques, including random horizontal flipping of the images and masks. This augmentation is performed on the fly during training to introduce variability in the training data without increasing its size. The augmentation logic is defined in the `augment` function in `src/augmentation.py`.
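
A minimal sketch of that augmentation, assuming the image and mask are flipped together so they stay aligned; the batch size here is illustrative:

```python
import tensorflow as tf

def augment(image, mask):
    # Flip image and mask together half the time so they stay aligned.
    if tf.random.uniform(()) > 0.5:
        image = tf.image.flip_left_right(image)
        mask = tf.image.flip_left_right(mask)
    return image, mask

# Applied per element during training, so the dataset size is unchanged.
train_batches = (train.map(augment, num_parallel_calls=tf.data.AUTOTUNE)
                      .batch(64)
                      .prefetch(tf.data.AUTOTUNE))
```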

## Model Architecture

For the segmentation task, I utilized a U-Net architecture, known for its effectiveness in image segmentation tasks. The U-Net model comprises a pretrained MobileNetV2 as the encoder and a series of upsample blocks as the decoder, facilitating precise localization. The model architecture is defined in `src/model.py`.
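
A sketch of such a model, following the well-known TensorFlow segmentation tutorial's pairing of a frozen MobileNetV2 encoder with transposed-convolution upsample blocks; the exact skip layers and filter counts in `src/model.py` may differ:

```python
import tensorflow as tf
from tensorflow.keras import layers

def upsample_block(filters):
    # Transposed convolution doubles the spatial resolution.
    return tf.keras.Sequential([
        layers.Conv2DTranspose(filters, 3, strides=2, padding='same'),
        layers.BatchNormalization(),
        layers.ReLU(),
    ])

def build_unet(output_channels=3, input_shape=(128, 128, 3)):
    base = tf.keras.applications.MobileNetV2(
        input_shape=input_shape, include_top=False)
    # Feature maps at decreasing resolutions used as skip connections.
    skip_names = ['block_1_expand_relu',   # 64x64
                  'block_3_expand_relu',   # 32x32
                  'block_6_expand_relu',   # 16x16
                  'block_13_expand_relu',  # 8x8
                  'block_16_project']      # 4x4 (bottleneck)
    skips = [base.get_layer(name).output for name in skip_names]
    encoder = tf.keras.Model(inputs=base.input, outputs=skips)
    encoder.trainable = False  # keep the pretrained encoder frozen

    inputs = layers.Input(shape=input_shape)
    *skips, x = encoder(inputs)
    # Decoder: upsample and concatenate with the matching skip connection.
    for filters, skip in zip([512, 256, 128, 64], reversed(skips)):
        x = upsample_block(filters)(x)
        x = layers.Concatenate()([x, skip])
    # Final upsample back to 128x128 with one logit per class.
    outputs = layers.Conv2DTranspose(
        output_channels, 3, strides=2, padding='same')(x)
    return tf.keras.Model(inputs, outputs)
```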

## Training

The model is compiled and trained using the Adam optimizer and Sparse Categorical Crossentropy loss function, suitable for multi-class segmentation tasks. Training involves feeding the preprocessed and augmented images to the model, with early stopping implemented to prevent overfitting. The training process is managed by the `train_model` function in `src/training.py`.
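
A minimal sketch of that setup, reusing names from the sketches above; the epoch count and patience are illustrative, not the project's actual values:

```python
import tensorflow as tf

model = build_unet(output_channels=3)
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])

# Stop when validation loss plateaus, to prevent overfitting.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss', patience=3, restore_best_weights=True)

test_batches = test.batch(64)
history = model.fit(train_batches,
                    validation_data=test_batches,
                    epochs=20,
                    callbacks=[early_stop])
```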

## Evaluation and Visualization

Post-training, the model's performance is evaluated using metrics such as accuracy, precision, recall, F1 score, and Intersection over Union (IoU). Additionally, I implemented Grad-CAM visualizations to interpret the model's predictions, highlighting the regions of the image that contributed most to the segmentation decision. The evaluation and visualization steps are detailed in `src/metrics.py` and `src/grad_cam.py`.
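
For illustration, a generic Grad-CAM sketch adapted to segmentation by averaging a class's logits over all pixels; the conv layer name is left as a parameter because the right choice depends on the architecture, and `src/grad_cam.py` may implement this differently:

```python
import tensorflow as tf

def grad_cam(model, image, class_index, conv_layer_name):
    # Model mapping the input to (target conv activations, predictions).
    grad_model = tf.keras.Model(
        model.inputs,
        [model.get_layer(conv_layer_name).output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[tf.newaxis, ...])
        # For segmentation, average the class logit over all pixels.
        score = tf.reduce_mean(preds[..., class_index])
    grads = tape.gradient(score, conv_out)
    # Channel weights: global-average-pooled gradients.
    weights = tf.reduce_mean(grads, axis=(0, 1, 2))
    cam = tf.reduce_sum(conv_out[0] * weights, axis=-1)
    cam = tf.nn.relu(cam)  # keep only positive influence
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()
```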

## Conclusion

This project demonstrates a structured approach to solving an image segmentation problem using deep learning. By carefully preprocessing the data, augmenting it to enhance model robustness, and employing a powerful U-Net architecture, I was able to achieve precise segmentation of pets from their backgrounds. The project's modular design ensures each component is easily understandable and modifiable for future enhancements or adaptations to similar tasks.

## Additional Notes

Please read the notes below.

1. The U-Net model was too large to push to GitHub and can be found here.
2. All of the plots and figures (EDA, Grad-CAM, etc.) for this project can be found in the Jupyter notebook `main.ipynb`. The report contains only the metrics table and graphs showing model performance.
3. As described in my writeup, I modularized the project from an initial Jupyter notebook that contained everything. The PNG images saved in the `figures/` folder are from that initial notebook, while those in `main.ipynb` were generated by running the modularized project.
4. I forgot to include part of the EDA in my pipeline, but it is fairly simple and can be found in the Jupyter notebook `eda.ipynb`.
