This repository contains a collection of Natural Language Processing (NLP) projects implemented in Python. These projects are designed for beginners to learn and practice various NLP techniques.
Word Frequency Counter
- Counts the frequency of words in a given text.
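As a rough illustration of the idea (not necessarily how the script in this repository does it), word counting can be sketched with the standard library. The function name `word_frequencies` and the rule that a word is a run of letters, digits, or apostrophes are assumptions made for this example.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return a Counter mapping each lowercased word to its count."""
    # Assumption: a "word" is a run of letters, digits, or apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(words)


counts = word_frequencies("The cat sat. The cat ran!")
print(counts.most_common(2))  # [('the', 2), ('cat', 2)]
```

Lowercasing before counting means "The" and "the" are treated as the same word, which is usually what a beginner analysis wants.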
Stop Word Removal
- Removes common stop words from a text to focus on more meaningful content.
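A minimal sketch of stop word removal might look like the following. The `STOP_WORDS` set is a tiny hand-picked list used only for illustration, and `remove_stop_words` is a hypothetical name; real projects often take a much longer list from a library.

```python
# Assumption: a tiny hand-picked stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "is", "in", "of", "on", "to"}


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Return the whitespace-separated words of `text` not in `stop_words`."""
    # Compare in lowercase so "The" is filtered just like "the".
    return [word for word in text.split() if word.lower() not in stop_words]


filtered = remove_stop_words("The cat is on the mat")
print(filtered)  # ['cat', 'mat']
```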
Combined Word Frequency and Stop Word Removal
- A script that combines both word frequency counting and stop word removal.
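One way the two steps could be chained is to filter out stop words before counting. This is only a sketch under assumed names (`content_word_frequencies`, `STOP_WORDS`) and an assumed, deliberately short stop-word list.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list, not a complete one.
STOP_WORDS = {"a", "an", "and", "the", "is", "in", "of", "on", "to"}


def content_word_frequencies(text, stop_words=STOP_WORDS):
    """Count words in `text`, ignoring any that appear in `stop_words`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(word for word in words if word not in stop_words)


top = content_word_frequencies("The dog and the cat saw the dog").most_common(1)
print(top)  # [('dog', 2)]
```

Without the filter, "the" would be the most frequent word here; removing stop words lets the content word "dog" surface instead.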
Text Normalization
- Converts text to a standard format for consistency in analysis.
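What counts as a "standard format" depends on the task. The sketch below assumes one common recipe: Unicode NFKC normalization, lowercasing, punctuation removal, and whitespace collapsing. The function name `normalize_text` is hypothetical.

```python
import re
import unicodedata


def normalize_text(text):
    """Lowercase, strip punctuation, and collapse whitespace in `text`."""
    # NFKC folds compatibility characters (e.g. full-width letters) together.
    text = unicodedata.normalize("NFKC", text)
    text = text.lower()
    # Drop anything that is not a word character or whitespace.
    # Note: \w keeps letters, digits, and underscores.
    text = re.sub(r"[^\w\s]", "", text)
    # Collapse runs of whitespace into single spaces and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


print(normalize_text("  Hello,   WORLD!  "))  # hello world
```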
The following projects are planned for this repository:
Sentiment Analysis
- Determines the sentiment (positive, negative, neutral) of a given text.
N-gram Generation
- Creates n-grams (contiguous sequences of n items) from a given text.
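Since this project is still planned, the following is only a sketch of the technique, not the repository's implementation. It assumes the text has already been split into a list of tokens; the name `ngrams` is hypothetical.

```python
def ngrams(tokens, n):
    """Return every contiguous run of `n` tokens as a tuple."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sequence of length L has L - n + 1 windows of size n
    # (none at all when L < n, because the range is then empty).
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


bigrams = ngrams("the quick brown fox".split(), 2)
print(bigrams)  # [('the', 'quick'), ('quick', 'brown'), ('brown', 'fox')]
```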
Spam Email Classification
- Classifies emails as spam or not spam based on their content.
Language Detection
- Identifies the language of a given text.
Text Tokenization
- Breaks down text into individual words or subwords.
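As a hedged sketch of word-level tokenization (subword tokenization is not covered here), a regular expression can split text into words and standalone punctuation marks. The pattern and the name `tokenize` are assumptions for this example.

```python
import re

# Assumption: a token is either a run of word characters or a single
# non-space, non-word character such as a punctuation mark.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Split `text` into word and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


tokens = tokenize("Hello, world! It's 2024.")
print(tokens)
# ['Hello', ',', 'world', '!', 'It', "'", 's', '2024', '.']
```

Note that this simple pattern splits contractions such as "It's" into three tokens; a fuller tokenizer would treat them more carefully.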
To use these scripts, make sure you have Python installed on your system. Clone this repository and navigate to the project directory.
git clone
cd beginner-nlp-projects
Each project is contained in its own Python script. To run a script, invoke the Python interpreter on it (python followed by the script's filename), using the name of the script you want to run.
Contributions to this project are welcome! Please feel free to submit a Pull Request.
This project is open source and available under the BSD 3-Clause license.