"Movie Review Sentiment Classification" is a Python machine learning project that classifies movie reviews as positive or negative. Using the IMDB 50K review dataset, it applies natural language processing techniques and several machine learning models to determine the sentiment of each review. A classifier of this kind can help gauge audience opinion on films.
- Data Preprocessing: Cleans movie reviews by removing HTML tags, special characters, and stopwords, then applies stemming.
- Exploratory Data Analysis: Provides visual representations of data distributions, including sentiment counts and word frequencies.
- Machine Learning Models: Trains several classifiers, including Gaussian Naive Bayes, Multinomial Naive Bayes, Bernoulli Naive Bayes, an MLP classifier, and a k-nearest neighbors (KNN) classifier.
- Model Evaluation: Evaluates models using accuracy scores, confusion matrices, and classification reports.
- Prediction Capability: Allows users to input new movie reviews for sentiment prediction.
- Python
- Libraries: NumPy, Pandas, NLTK, Scikit-learn, Matplotlib, Seaborn, WordCloud
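
The preprocessing feature listed above can be illustrated with a minimal sketch. It uses only the standard library so that it runs without downloads. The `clean_review` function, the small `STOPWORDS` set, and the optional `stem` parameter are hypothetical names chosen for this example and are not taken from the project's code, which would more likely rely on NLTK's stopword list and one of its stemmers.

```python
import html
import re

# Tiny illustrative stopword set (an assumption for this sketch); the project
# itself would more likely use NLTK's English stopword list.
STOPWORDS = {"a", "an", "and", "the", "is", "it", "this", "was", "of", "to", "in"}

TAG_RE = re.compile(r"<[^>]+>")          # matches HTML tags such as <br />
NON_LETTER_RE = re.compile(r"[^a-z\s]")  # matches anything but a-z and whitespace


def clean_review(text, stem=None):
    """Return one review as a cleaned, space-joined string of tokens."""
    text = TAG_RE.sub(" ", text)         # drop HTML tags
    text = html.unescape(text).lower()   # decode entities, normalise case
    text = NON_LETTER_RE.sub(" ", text)  # strip digits, punctuation, symbols
    tokens = [t for t in text.split() if t not in STOPWORDS]
    if stem is not None:
        # Optional stemming step: any callable mapping a word to its stem.
        tokens = [stem(t) for t in tokens]
    return " ".join(tokens)


print(clean_review("This movie was <br /><b>GREAT</b> &amp; fun!!"))
```

To add stemming, a callable such as `nltk.stem.PorterStemmer().stem` could be passed as `stem`, assuming NLTK is installed.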
The project uses the IMDB 50K movie review dataset, which can be found at: IMDB Dataset Link.
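
The modelling, evaluation, and prediction features can be sketched in a few lines of scikit-learn. This is an illustration under stated assumptions, not the project's actual code: the toy reviews stand in for the real dataset, bag-of-words features from `CountVectorizer` are assumed because the feature extraction step is not described here, and `predict_sentiment` is a hypothetical helper name. Multinomial Naive Bayes is used because it is one of the listed models.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy stand-in data (an assumption); the project trains on the IMDB reviews.
train_texts = [
    "great film loved it",
    "wonderful acting great story",
    "loved the wonderful cast",
    "terrible film hated it",
    "awful acting boring story",
    "hated the boring plot",
]
train_labels = ["positive"] * 3 + ["negative"] * 3
test_texts = ["great wonderful story", "boring awful film"]
test_labels = ["positive", "negative"]

# Bag-of-words counts feeding a Multinomial Naive Bayes classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

# Evaluate with the three metrics named in the feature list.
predicted = model.predict(test_texts)
print("accuracy:", accuracy_score(test_labels, predicted))
print(confusion_matrix(test_labels, predicted, labels=["negative", "positive"]))
print(classification_report(test_labels, predicted))


def predict_sentiment(review):
    """Classify a single new review (hypothetical helper)."""
    return model.predict([review])[0]


print(predict_sentiment("loved the great acting"))
```

Wrapping the vectorizer and classifier in one pipeline means a new review goes through the same feature extraction as the training data before it is classified.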
Contributions to enhance and expand the functionality of this project are welcome. Please follow the standard procedures for contributing to a Python project.