Web Scraping & Logistic Regression

Project Summary

In this project I performed sentiment analysis with logistic regression and NLTK to identify the top 10 sushi restaurants from their Yelp reviews.

The resulting logistic regression model performs a binary classification of restaurant quality (Thumbs-Up or Thumbs-Down) based on a restaurant's top reviews on Yelp. The AUC is 0.75.
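
As a reading aid for the AUC figure, here is a minimal sketch of what the metric measures: the probability that a randomly chosen Thumbs-Up example receives a higher score than a randomly chosen Thumbs-Down example. The function name `auc_score` and the labels and scores below are hypothetical, made up for illustration; they are not the project's data or code. The toy values were picked so that three of the four positive/negative pairs are ranked correctly.

```python
def auc_score(labels, scores):
    """Probability that a random positive outscores a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one example of each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = Thumbs-Up, 0 = Thumbs-Down, scores = predicted probabilities.
example_labels = [1, 1, 0, 0]
example_scores = [0.9, 0.4, 0.6, 0.2]
example_auc = auc_score(example_labels, example_scores)
print(example_auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking. In practice, scikit-learn's `sklearn.metrics.roc_auc_score` computes the same quantity from true labels and predicted scores.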

Python packages used: scikit-learn, BeautifulSoup, pandas, NLTK, Matplotlib

Dataset

Top 10 reviews per restaurant for the 60 most popular sushi restaurants in Bellevue, WA, according to Yelp.com.