Performed Exploratory Data Analysis, Data Cleaning, Data Visualization and Text Featurization (BoW, TF-IDF, Word2Vec). Built several ML models such as KNN, Naive Bayes, Logistic Regression and SVM.
Given a text review, determine whether its sentiment is positive or negative. Data Source: https://www.kaggle.com/snap/amazon-fine-food-reviews
The Amazon Fine Food Reviews dataset consists of reviews of fine foods from Amazon.
- Number of reviews: 568,454
- Number of users: 256,059
- Number of products: 74,258
- Timespan: Oct 1999 - Oct 2012
- Number of Attributes/Columns in data: 10
  - Id
  - ProductId - unique identifier for the product
  - UserId - unique identifier for the user
  - ProfileName
  - HelpfulnessNumerator - number of users who found the review helpful
  - HelpfulnessDenominator - number of users who indicated whether they found the review helpful or not
  - Score - rating between 1 and 5
  - Time - timestamp for the review
  - Summary - brief summary of the review
  - Text - text of the review
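The columns above do not include a sentiment label, so one has to be derived. A minimal sketch of one possible approach, assuming that scores above 3 count as positive, scores below 3 as negative, and a score of 3 is left unlabeled. These thresholds, the helper names and the sample rows are illustrative assumptions, not taken from the dataset:

```python
def score_to_sentiment(score):
    """Map a 1-5 Score to a sentiment label (the thresholds are an assumption)."""
    if score > 3:
        return "positive"
    if score < 3:
        return "negative"
    return None  # a score of 3 is treated as neutral and left unlabeled


def helpfulness_is_consistent(row):
    """Users who found a review helpful are a subset of users who voted on it."""
    return row["HelpfulnessNumerator"] <= row["HelpfulnessDenominator"]


# Made-up rows that reuse the dataset's column names; not real reviews.
rows = [
    {"Score": 5, "HelpfulnessNumerator": 2, "HelpfulnessDenominator": 3, "Text": "Tasty and fresh."},
    {"Score": 1, "HelpfulnessNumerator": 0, "HelpfulnessDenominator": 1, "Text": "Arrived stale."},
    {"Score": 3, "HelpfulnessNumerator": 1, "HelpfulnessDenominator": 1, "Text": "It was okay."},
    {"Score": 4, "HelpfulnessNumerator": 5, "HelpfulnessDenominator": 2, "Text": "Good value."},
]

# Keep rows with consistent helpfulness counts and a non-neutral score.
labeled = [
    (row["Text"], score_to_sentiment(row["Score"]))
    for row in rows
    if helpfulness_is_consistent(row) and score_to_sentiment(row["Score"]) is not None
]
```

Here the neutral row and the row whose helpful votes exceed its total votes are both dropped, leaving one positive and one negative example.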
Topics and techniques learnt:
- Exploratory data analysis
- t-SNE
- Sentiment analysis
- Featurizations such as Bag of Words, TF-IDF and Word2Vec, and text processing.
- KNN
- Naive Bayes
- Logistic Regression
- Support Vector Machine
- Decision Tree
- Random Forest and XGBoost
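To make the featurization and KNN items above concrete, here is a small standard-library sketch of Bag of Words, TF-IDF and a cosine-similarity nearest-neighbour vote. It is not the project's implementation: the function names and toy reviews are hypothetical, and the plain idf = log(N / df) weighting is an assumption (libraries such as scikit-learn use smoothed variants by default).

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocab(docs):
    """Map every token seen in the training documents to a column index."""
    tokens = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def bow_vector(text, vocab):
    """Bag of Words: count how often each vocabulary token occurs in the text."""
    vec = [0] * len(vocab)
    for tok, count in Counter(tokenize(text)).items():
        if tok in vocab:  # tokens unseen during training are ignored
            vec[vocab[tok]] = count
    return vec


def tfidf_vectors(count_vectors):
    """Reweight count vectors as tf * idf, using the plain idf = log(N / df)."""
    if not count_vectors:
        return []
    n_docs = len(count_vectors)
    n_terms = len(count_vectors[0])
    df = [sum(1 for v in count_vectors if v[j] > 0) for j in range(n_terms)]
    weighted = []
    for v in count_vectors:
        total = sum(v) or 1  # avoid dividing by zero for empty documents
        weighted.append([
            (v[j] / total) * math.log(n_docs / df[j]) if df[j] else 0.0
            for j in range(n_terms)
        ])
    return weighted


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn_predict(query_vec, train_vecs, train_labels, k=3):
    """Majority vote among the k training vectors most similar to the query."""
    ranked = sorted(
        zip(train_vecs, train_labels),
        key=lambda pair: cosine(query_vec, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Made-up toy reviews, not rows from the dataset.
train_docs = [
    "great taste, great price",
    "tasty and fresh",
    "stale and bland",
    "awful taste, arrived stale",
]
train_labels = ["positive", "positive", "negative", "negative"]

vocab = build_vocab(train_docs)
train_bow = [bow_vector(doc, vocab) for doc in train_docs]
train_tfidf = tfidf_vectors(train_bow)

# Classify a new review by its single nearest neighbour in Bag of Words space.
prediction = knn_predict(bow_vector("fresh and tasty", vocab), train_bow, train_labels, k=1)
```

Bag of Words ignores word order, so "fresh and tasty" maps to the same vector as the training review "tasty and fresh" and inherits its label. On the full dataset a sparse representation would be needed, since dense lists of this kind do not scale to a large vocabulary.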
An enlightening introduction to machine learning that provided the skills to work on future projects.