Tweet_Sentiment_Analysis

An end to end data pipeline project analyzing the sentiments of tweets about Thanksgiving.

In this project, tweets were scraped to an S3 bucket in AWS using Kinesis Firehose

I utilized PySpark using the Databricks environment to clean and transform the raw tweet data. I then used vader to create sentiments for a portion of the tweets, and saved these back into an S3 bucket.
I used the cleaned tweets sample to create a logistic regression model to predict the sentiment of the remaining tweets.
Using Amazon Athena, I used SQL to create tables that were used to create a dashboard using QuickSight.