Completed deep learning and NLP labs: #38
What changes are you trying to make? (e.g. Adding or removing code, refactoring existing code, adding reports)
a) Preprocessed image and text data, including augmentation for images and tokenization for text.
b) Designed and implemented neural network models (a CNN for image classification, an RNN/LSTM for text classification); see the sketch after this list.
c) Applied embeddings (Word2Vec/GloVe) and padding for text processing.
d) Trained models using appropriate optimizers and loss functions, tuning hyperparameters (learning rate, batch size, epochs).
e) Evaluated models using relevant metrics (accuracy, F1-score, confusion matrix).
f) Saved trained models and implemented inference pipelines for image and text tasks.
g) Documented lab results and insights from model training and evaluation.
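As referenced in (b), here is a minimal sketch of the image pipeline, assuming TensorFlow/Keras; the input shape, class count, and training call are placeholders rather than the lab's actual configuration:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 10  # placeholder class count

model = models.Sequential([
    layers.Input(shape=(64, 64, 3)),          # placeholder image size
    layers.RandomFlip("horizontal"),          # augmentation: random flips
    layers.RandomRotation(0.1),               # augmentation: small rotations
    layers.Rescaling(1.0 / 255),              # normalization to [0, 1]
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dropout(0.5),                      # dropout to curb overfitting
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
# model.fit(x_train, y_train, validation_split=0.1, epochs=10, batch_size=32)
# model.save("cnn_image_classifier.keras")  # persisted for the inference pipeline (f)
```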
What did you learn from the changes you have made?
From these labs, I learned how to effectively preprocess image and text data for deep learning tasks. For image data, I worked with resizing, normalization, and augmentation techniques to improve model generalization. I also explored CNNs for image classification. For NLP tasks, I applied tokenization, padding, and word embeddings (such as Word2Vec) to process text data. I became familiar with RNNs and LSTMs for text classification tasks and fine-tuned various hyperparameters such as learning rate, batch size, and epochs to optimize model performance. I also learned how to evaluate model performance using accuracy, F1-score, and confusion matrices.
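A minimal sketch of the text pipeline described here (tokenization, padding, embeddings, LSTM), assuming TensorFlow/Keras; the corpus, vocabulary size, and sequence length are placeholders:

```python
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras import layers, models

texts = ["an example document", "another short text"]  # placeholder corpus
VOCAB_SIZE, MAX_LEN = 10_000, 100

# Tokenize the raw text into integer sequences
tokenizer = Tokenizer(num_words=VOCAB_SIZE, oov_token="<OOV>")
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)

# Pad every sequence to a uniform length
padded = pad_sequences(sequences, maxlen=MAX_LEN, padding="post")

model = models.Sequential([
    # A learned embedding layer; pre-trained Word2Vec/GloVe vectors could
    # instead be loaded here as initial weights
    layers.Embedding(VOCAB_SIZE, 128),
    layers.LSTM(64),
    layers.Dense(1, activation="sigmoid"),  # binary classification head
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```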
Was there another approach you were thinking about making? If so, what approach(es) were you thinking of?
Initially, I considered using traditional machine learning methods (e.g., Random Forest or SVM) for text classification, especially when working with smaller datasets. For image classification, I thought about using transfer learning with pre-trained models like ResNet or VGG to speed up convergence and improve results. However, I ultimately chose to build custom CNN and RNN/LSTM models from scratch to better understand the fundamentals of deep learning. Additionally, for text tasks, I considered experimenting with transformer models (like BERT) but decided to focus on simpler approaches first.
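For reference, a sketch of the transfer-learning alternative I considered but did not take: a frozen pre-trained ResNet50 backbone with a new classification head. This assumes TensorFlow/Keras, and the input shape and class count are placeholders:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Load an ImageNet-pretrained ResNet50 without its classification head
base = tf.keras.applications.ResNet50(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3)
)
base.trainable = False  # freeze the backbone for faster convergence

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(10, activation="softmax"),  # placeholder class count
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```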
Were there any challenges? If so, what issue(s) did you face? How did you overcome it?
Yes, there were several challenges. For the image classification tasks, one challenge was overfitting due to limited data. I overcame this by using data augmentation techniques (e.g., rotations and flips) to artificially expand the dataset and by adding dropout layers to the model. In the NLP tasks, the biggest challenge was handling long sequences of text and ensuring they were padded correctly. I managed this by applying consistent padding strategies and by using RNNs/LSTMs with attention mechanisms to handle variable-length sequences. Another issue was hyperparameter tuning; I used a combination of grid search and manual tuning to find good values for batch size, learning rate, and number of epochs.
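A minimal sketch of the grid search mentioned above, assuming TensorFlow/Keras; build_model(), x_train, and y_train are hypothetical placeholders for the lab's model factory and training data:

```python
import itertools
import tensorflow as tf

learning_rates = [1e-2, 1e-3, 1e-4]
batch_sizes = [16, 32, 64]
best = {"val_acc": 0.0}

for lr, bs in itertools.product(learning_rates, batch_sizes):
    model = build_model()  # hypothetical: returns a fresh, uncompiled model
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    history = model.fit(x_train, y_train, validation_split=0.2,
                        epochs=5, batch_size=bs, verbose=0)
    val_acc = max(history.history["val_accuracy"])
    if val_acc > best["val_acc"]:  # keep the best configuration seen so far
        best = {"val_acc": val_acc, "lr": lr, "batch_size": bs}

print(best)
```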
How were these changes tested?
The changes were tested by splitting the data into training and testing sets and evaluating each model's performance on the held-out test set. I used accuracy, precision, recall, and F1-score, along with confusion matrices, for both the image and text classification tasks. For the image models, I also visualized the training/validation loss curves to detect signs of overfitting, and for the NLP tasks I performed cross-validation to ensure robustness. The models were tested on real-world examples and compared against baseline models to assess improvements.
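A minimal sketch of the evaluation step, assuming scikit-learn; y_test and y_pred stand in for the real test labels and model predictions:

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)

y_test = [0, 1, 1, 0, 1]  # placeholder ground-truth labels
y_pred = [0, 1, 0, 0, 1]  # placeholder model predictions

print("accuracy :", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred))
print("recall   :", recall_score(y_test, y_pred))
print("f1       :", f1_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))
```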
A reference to a related issue in your repository (if applicable)
I used the slides and labs in the 01_materials directory, along with the websites mentioned during the sessions.
Checklist