A project to analyze public sentiment about COVID-19 using Indonesian tweets. It processes data, applies Logistic Regression, and classifies sentiments as positive, negative, or neutral while handling data imbalance effectively.

muhakbarhamid21/sentiment-analysis-covid-19-twitter

Sentiment Analysis COVID 19 Twitter

System Flowchart

Figure: system flowchart (sys_flow)

Dataset

https://www.kaggle.com/datasets/dionisiusdh/covid19-indonesian-twitter-sentiment

Method

Preprocessing

  • Case Folding
  • Tokenizing
  • Filtering
  • Word Handling
  • Stemming
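
The five steps above can be sketched as a small pipeline. This is a minimal illustration using only the Python standard library, not the repository's actual code. It assumes that "Filtering" means stopword removal and "Word Handling" means slang normalization. The `STOPWORDS`, `SLANG`, and `SUFFIXES` tables are hypothetical and deliberately tiny, and the suffix stripper is only a naive stand-in for a real Indonesian stemmer.

```python
import re

# Hypothetical, deliberately tiny resources for illustration only.
STOPWORDS = {"yang", "di", "dan", "ini", "itu"}
SLANG = {"gak": "tidak", "ga": "tidak", "bgt": "banget"}
SUFFIXES = ("nya", "kan", "lah")  # naive stand-in for a real stemmer


def case_fold(text):
    """Lowercase, then drop URLs, mentions, digits, and punctuation."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    return re.sub(r"[^a-z\s]", " ", text)


def tokenize(text):
    """Split on whitespace."""
    return text.split()


def filter_tokens(tokens):
    """Assumed meaning of 'Filtering': remove stopwords."""
    return [t for t in tokens if t not in STOPWORDS]


def handle_words(tokens):
    """Assumed meaning of 'Word Handling': map slang to standard forms."""
    return [SLANG.get(t, t) for t in tokens]


def stem(token):
    """Strip one known suffix if at least four characters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 4:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Apply the steps in the order listed in this README."""
    tokens = tokenize(case_fold(text))
    tokens = filter_tokens(tokens)
    tokens = handle_words(tokens)
    return [stem(t) for t in tokens]


tokens = preprocess("Vaksinnya GAK sakit di lengan! https://t.co/abc @user")
print(tokens)
```

For real tweets, a full stopword list and a dictionary-based Indonesian stemmer would likely replace the toy tables.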

Feature Selection

  • TF-IDF
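
To show what TF-IDF weighting computes, here is a minimal sketch of the plain textbook variant (term frequency times `ln(N / df)`). The function name `tf_idf` and the toy documents are assumptions for illustration. Library implementations such as scikit-learn's `TfidfVectorizer` use a smoothed IDF and L2 normalization by default, so their values will differ from these.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    tf  = term count / document length
    idf = ln(N / df), where df is the number of documents containing the term
    """
    n_docs = len(docs)
    # Count each term once per document to get document frequencies.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Toy, already-preprocessed documents (hypothetical).
docs = [
    ["vaksin", "aman"],
    ["vaksin", "sakit"],
    ["covid", "naik", "naik"],
]
vectors = tf_idf(docs)
for doc, vector in zip(docs, vectors):
    print(doc, vector)
```

Terms that occur in fewer documents ("aman") get a higher weight than terms shared across documents ("vaksin").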

Classification

  • Logistic Regression
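
For three classes, Logistic Regression becomes multinomial (softmax) regression. The sketch below implements it from scratch with batch gradient descent to show the mechanics. It is not the repository's code, and a library implementation would be the usual choice in practice. The feature columns, the label encoding (0 = positive, 1 = negative, 2 = neutral), and the hyperparameters are all assumptions for illustration.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(W, b, x):
    """Class probabilities for one feature vector."""
    scores = [sum(w * xj for w, xj in zip(row, x)) + bc for row, bc in zip(W, b)]
    return softmax(scores)


def predict(W, b, x):
    """Index of the most probable class."""
    probs = predict_proba(W, b, x)
    return max(range(len(probs)), key=probs.__getitem__)


def train(X, y, n_classes, lr=0.5, epochs=200):
    """Fit weights by batch gradient descent on the mean cross-entropy loss."""
    n_features = len(X[0])
    W = [[0.0] * n_features for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        grad_W = [[0.0] * n_features for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, label in zip(X, y):
            probs = predict_proba(W, b, x)
            for c in range(n_classes):
                # Gradient of cross-entropy w.r.t. the score of class c.
                err = probs[c] - (1.0 if c == label else 0.0)
                grad_b[c] += err
                for j, xj in enumerate(x):
                    grad_W[c][j] += err * xj
        for c in range(n_classes):
            b[c] -= lr * grad_b[c] / len(X)
            for j in range(n_features):
                W[c][j] -= lr * grad_W[c][j] / len(X)
    return W, b


# Toy feature vectors: hypothetical counts of three terms per tweet.
X = [[1, 0, 0], [2, 0, 0], [0, 1, 0], [0, 2, 0], [0, 0, 1], [0, 0, 2]]
y = [0, 0, 1, 1, 2, 2]  # assumed encoding: 0 positive, 1 negative, 2 neutral
W, b = train(X, y, n_classes=3)
print([predict(W, b, x) for x in X])
```

In the real pipeline the feature vectors would be the TF-IDF weights rather than toy counts.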

Handling Imbalance

  • Undersampling
  • Oversampling
  • SMOTE
  • Cost-Sensitive Learning
  • Bagging
  • Tomek Links
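
The two simplest techniques in this list, random undersampling and random oversampling, can be sketched with the standard library alone. The function `resample` and its `strategy` argument are hypothetical names for this illustration. SMOTE, Tomek Links, and Bagging are not shown here, as they are typically taken from a library such as imbalanced-learn.

```python
import random
from collections import Counter, defaultdict


def resample(samples, labels, strategy, seed=0):
    """Balance classes by random under- or oversampling.

    strategy="under": shrink every class to the size of the smallest one.
    strategy="over":  grow every class to the size of the largest one
                      by duplicating randomly chosen samples.
    """
    if strategy not in ("under", "over"):
        raise ValueError("strategy must be 'under' or 'over'")
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    sizes = [len(items) for items in by_class.values()]
    target = min(sizes) if strategy == "under" else max(sizes)

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        if len(items) > target:
            chosen = rng.sample(items, target)  # drop without replacement
        else:
            extra = rng.choices(items, k=target - len(items))  # duplicates
            chosen = items + extra
        out_samples.extend(chosen)
        out_labels.extend([label] * target)
    return out_samples, out_labels


# Toy imbalanced data set (hypothetical sample ids and labels).
samples = list(range(11))
labels = ["positive"] * 5 + ["negative"] * 4 + ["neutral"] * 2

_, under_labels = resample(samples, labels, "under")
_, over_labels = resample(samples, labels, "over")
print(Counter(under_labels))
print(Counter(over_labels))
```

Resampling should be applied to the training split only, so that duplicated or dropped tweets do not distort the evaluation.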

Data Exploration

Class

  • Positive (positif): 23521
  • Negative (negatif): 20055
  • Neutral (netral): 9383

Figure: total number of sentiments per class
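
The class counts above add up to 52959 tweets, split roughly 44% positive, 38% negative, and 18% neutral. As a sketch of how Cost-Sensitive Learning could use these numbers, the snippet below derives per-class weights with the common "balanced" heuristic, `n_samples / (n_classes * class_count)`. Whether the project uses exactly this weighting is an assumption.

```python
# Counts from the class listing above (labels given in English).
counts = {"positive": 23521, "negative": 20055, "neutral": 9383}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

# "Balanced" heuristic: rarer classes receive proportionally larger weights.
weights = {label: total / (len(counts) * n) for label, n in counts.items()}

# Ratio between the largest and the smallest class.
imbalance_ratio = max(counts.values()) / min(counts.values())

for label in counts:
    print(f"{label:8s} share={shares[label]:.3f} weight={weights[label]:.3f}")
print(f"imbalance ratio: {imbalance_ratio:.2f}")
```

With these weights, a misclassified neutral tweet costs about 2.5 times as much as a misclassified positive one, which matches the ratio between the two class sizes.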

Top Features

  • Positive Class (figure: top features)
  • Negative Class (figure: top features)
  • Neutral Class (figure: top features)

Word Cloud

  • Positive Class (figure: word cloud)
  • Negative Class (figure: word cloud)
  • Neutral Class (figure: word cloud)

Evaluation

Accuracy

Figures: model accuracy comparison (two charts)

Classification Report

Figure: classification report
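
Since the classes are imbalanced, accuracy alone can hide weak performance on the smaller class, which is why a per-class report is useful. The sketch below shows how accuracy, precision, recall, and F1 are computed from true and predicted labels. The function name and the toy labels are hypothetical and are not the project's results.

```python
def classification_report(y_true, y_pred):
    """Return overall accuracy and per-class precision, recall, F1, support."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    report = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": tp + fn,  # number of true samples of this class
        }
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return accuracy, report


# Toy labels for illustration only.
y_true = ["positive", "positive", "negative", "negative", "neutral", "neutral"]
y_pred = ["positive", "positive", "negative", "positive", "neutral", "negative"]

accuracy, report = classification_report(y_true, y_pred)
print(f"accuracy: {accuracy:.3f}")
for label, scores in report.items():
    print(label, scores)
```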
