Sentiment Analysis of Persian Comments

Implementation of NLP models, written in Python, that classify the sentiment of Persian comments.

Requirements

See requirements.txt; install the dependencies with pip install -r requirements.txt.

numpy for some calculations
pandas for data read/write
scikit-learn for classifiers
gensim for Word2Vec model

About Sentiment Analysis

Sentiment analysis (also known as opinion mining or emotion AI) is the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information. Sentiment analysis is widely applied to voice of the customer materials such as reviews and survey responses, online and social media, and healthcare materials for applications that range from marketing to customer service to clinical medicine [Wiki]. The following figure shows the workflow of sentiment analysis [monkeylearn].

Project structure

📂 directory [sentiment_data]

This directory contains the data used in the project. train.csv holds labeled comments for training (we split it into training and validation sets to find the best model). test.csv holds unlabeled comments; we predict a label for each row with our proposed model and save the result in test-labeled.csv.
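
The train-validation split described above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the column names "comment" and "label", the 75/25 ratio, and the small in-memory DataFrame standing in for train.csv are all assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for train.csv; the column names "comment" and "label"
# are assumptions, not taken from the real file.
df = pd.DataFrame({
    "comment": [
        "عالی بود", "خیلی بد بود", "راضی هستم", "اصلا خوب نبود",
        "کیفیت خوبی داشت", "پشیمان شدم", "خرید خوبی بود", "افتضاح بود",
    ],
    "label": [1, 0, 1, 0, 1, 0, 1, 0],
})

# Hold out 25% of the labeled rows for validation (the ratio is an assumption).
# stratify keeps the class balance the same in both parts.
train_df, val_df = train_test_split(
    df, test_size=0.25, stratify=df["label"], random_state=42
)

print(len(train_df), len(val_df))
```

With the real data, the DataFrame would come from pd.read_csv on train.csv instead of being built inline.
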
📂 directory [nlp_files]

This directory contains auxiliary files used by the NLP models (e.g. stop-words).
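
As an illustration of how a stop-word list is typically used, the sketch below drops stop-words from a whitespace-tokenized comment. The helper remove_stop_words and the tiny inline stop-word set are hypothetical; in the project the set would presumably be loaded from a file in this directory, whose name and format are not assumed here.

```python
def remove_stop_words(text, stop_words):
    """Split text on whitespace and drop every token found in stop_words."""
    return [token for token in text.split() if token not in stop_words]


# Hypothetical stop-word set for illustration only; the real list lives in
# a file under nlp_files.
stop_words = {"و", "به", "از", "که", "این"}

tokens = remove_stop_words("این محصول از نظر کیفیت عالی است", stop_words)
print(tokens)
```

Whitespace splitting is the simplest possible tokenizer; a Persian-aware tokenizer and normalizer would usually handle punctuation and character variants better.
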
📄 [phase1.ipynb]

Contains the steps used to find the best method for the data. This notebook uses only train.csv, split into training and validation sets. The final results of all 18 models evaluated in this phase are:

Classifiers	TF-IDF	Word2Vec
KNN(n=4)	0.4619	0.6131
KNN(n=8)	0.6940	0.6298
KNN(n=16)	0.7524	0.6238
SVM(linear)	0.7905	0.4774
SVM(poly)	0.6417	0.6452
SVM(rbf)	0.7905	0.6810
XGB(n=50)	0.7214	0.6571
XGB(n=100)	0.7298	0.6786
XGB(n=150)	0.7381	0.6714
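
A comparison like the TF-IDF column of the table above could be produced along the following lines. This is a hedged sketch on toy data, not the notebook's code: it assumes that n in KNN(n=4) means n_neighbors and that the reported scores are accuracy, neither of which the table states. It covers only the KNN and SVM rows.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Toy training and validation data (1 = positive, 0 = negative).
train_texts = [
    "کیفیت عالی بود", "خیلی خوب بود", "بسیار راضی هستم",
    "کیفیت بد بود", "خیلی ضعیف بود", "اصلا راضی نیستم",
]
train_labels = [1, 1, 1, 0, 0, 0]
val_texts = ["کیفیت خوب بود", "خیلی بد بود"]
val_labels = [1, 0]

# Fit the TF-IDF vocabulary on the training split only, then reuse it
# to transform the validation split.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_val = vectorizer.transform(val_texts)

# Assumption: "n" in the table is the number of neighbors.
classifiers = {
    "KNN(n=4)": KNeighborsClassifier(n_neighbors=4),
    "SVM(linear)": SVC(kernel="linear"),
    "SVM(poly)": SVC(kernel="poly"),
    "SVM(rbf)": SVC(kernel="rbf"),
}

# Assumption: the metric is accuracy on the validation split.
scores = {}
for name, clf in classifiers.items():
    clf.fit(X_train, train_labels)
    scores[name] = accuracy_score(val_labels, clf.predict(X_val))

for name, score in scores.items():
    print(f"{name}\t{score:.4f}")
```

The scores printed for this toy data say nothing about the real dataset; only the structure of the loop is meant to carry over.
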

📄 [phase2.ipynb]

This notebook implements the method selected in phase1. The result of applying the proposed model to test.csv is saved in test-labeled.csv.
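
The overall flow of this phase (fit on the labeled data, predict the unlabeled rows, write them out) might look like the sketch below. The README does not say which model was selected, so TF-IDF with a linear SVM is used here only because it is among the highest-scoring rows in the phase1 table. The column names and the in-memory data are assumptions, and the CSV is written to a string buffer rather than to test-labeled.csv.

```python
import io

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Toy stand-ins for train.csv and test.csv; column names are assumptions.
train_df = pd.DataFrame({
    "comment": [
        "کیفیت عالی بود", "خیلی خوب بود", "بسیار راضی هستم",
        "کیفیت بد بود", "خیلی ضعیف بود", "اصلا راضی نیستم",
    ],
    "label": [1, 1, 1, 0, 0, 0],
})
test_df = pd.DataFrame({"comment": ["خیلی عالی بود", "کیفیت ضعیف بود"]})

# Assumed model choice: TF-IDF features feeding a linear SVM.
model = make_pipeline(TfidfVectorizer(), SVC(kernel="linear"))
model.fit(train_df["comment"], train_df["label"])

# Attach the predicted label to each unlabeled row.
labeled = test_df.copy()
labeled["label"] = model.predict(test_df["comment"])

# Serialize to CSV; pass a file path instead of the buffer to write a file.
buffer = io.StringIO()
labeled.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Wrapping the vectorizer and classifier in one pipeline ensures the test comments are transformed with the vocabulary learned from the training data.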

mohammadAbbasniya/NLP_persian_sentiment_analysis
