Final project for UCB MIDS W266 NLP class
The full write-up of our experiments and findings is published as "RoBERTa and Transfer Learning to Predict Review Helpfulness".
The data sets used in this project can be found in the /data folder. We use review data sourced from Amazon and from Yelp:
/data
    /amazon
    /yelp
Initial EDA and creation of the Yelp data sets can be found in Yelp_Data_EDA.ipynb
Processing of the Amazon data can be found in Amazon-data-processing-LARGE.ipynb and Amazon-data-processing-SMALL.ipynb
Recreation of the baseline model from Bilal et al. is in Bilal_et_al_Baseline.ipynb
In Bilal_et_al_baseline_on_Yelp_data.ipynb we fine-tune the baseline model on the Yelp data set to create a new baseline.
Fine-tuning of the RoBERTa model can be found in RoBERTa.ipynb
Model training using transfer learning techniques can be found in transfer_learning.py, train_bilal_baseline.py, train_amazon.py, and train_amazon_large.py
The evaluation of the transfer learning models can be found in Evaluating Transfer Learning Models.ipynb
Saved models can be found in /results. Note that not all models are stored there due to size constraints.