A machine learning project that uses Natural Language Processing (NLP) to distinguish real disaster-related tweets from non-disaster tweets.
TweetAlert is an intelligent system that helps emergency responders and disaster management teams quickly identify genuine disaster-related social media content. Using advanced machine learning techniques, it differentiates between tweets about actual emergencies (e.g., "Forest fire spreading near downtown!") and non-emergency tweets using similar language (e.g., "This new album is fire!").
.
├── .github/              # GitHub Actions workflows
├── ML/                   # Core ML implementation
│   ├── data/             # Training and test datasets
│   ├── dataset/          # Data loading and processing
│   ├── helper_functions/ # Utility functions
│   ├── modelling/        # Model implementations
│   └── predictions/      # Model outputs
├── tests/                # Test suite
└── wandb/                # Weights & Biases logging
- Binary classification of tweets (disaster vs non-disaster)
- PyTorch-based implementation
- Multiple model architectures (a minimal sketch follows this list)
- Weights & Biases integration for experiment tracking
- Comprehensive test coverage
- GPU acceleration support
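The actual model code lives under `ML/modelling/` and is not reproduced here. As a point of reference, the following is a minimal, hypothetical PyTorch sketch of a binary tweet classifier (embedding + LSTM + linear head); the class name, dimensions, and vocabulary size are illustrative, not the project's actual values.

```python
# Minimal sketch of a binary tweet classifier in PyTorch.
# TweetClassifier and its hyperparameters are illustrative placeholders,
# not the classes defined in ML/modelling/.
import torch
import torch.nn as nn

class TweetClassifier(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int = 128, hidden_dim: int = 64):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, 1)  # single logit: disaster vs. non-disaster

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        embedded = self.embedding(token_ids)      # (batch, seq_len, embed_dim)
        _, (hidden, _) = self.lstm(embedded)      # hidden: (1, batch, hidden_dim)
        return self.fc(hidden[-1]).squeeze(-1)    # (batch,) raw logits

# Example forward pass on random token ids
model = TweetClassifier(vocab_size=10_000)
dummy_batch = torch.randint(1, 10_000, (8, 40))   # 8 tweets, 40 tokens each
probs = torch.sigmoid(model(dummy_batch))          # probability of "real disaster"
```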
- Python 3.7+
- PyTorch
- torchvision
- torchtext
- pandas
- numpy
- scikit-learn
- wandb
- matplotlib
- tqdm
- Clone the repository
git clone https://github.com/Programmer-RD-AI/NLP-Disaster-Tweets.git
- Install dependencies
pip install -r requirements.txt
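Optionally, confirm that PyTorch installed correctly and whether a CUDA GPU is visible (the project supports GPU acceleration). This quick check is generic and not part of the repository:

```python
# Verify the PyTorch install and check for a CUDA-capable GPU.
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```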
To train the model:
python run.py
Monitor training progress in the Weights & Biases dashboard.
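`run.py` already handles the Weights & Biases integration; the sketch below only illustrates the typical `wandb.init` / `wandb.log` pattern it relies on. The project name, config values, and metrics are placeholders.

```python
# Hypothetical sketch of W&B experiment tracking; run.py does this for you.
# Requires `wandb login` (or WANDB_API_KEY) to be set up beforehand.
import random
import wandb

run = wandb.init(project="NLP-Disaster-Tweets", config={"lr": 1e-3, "epochs": 3})

for epoch in range(run.config.epochs):
    train_loss = random.random()   # placeholder for the real epoch loss
    val_acc = random.random()      # placeholder for validation accuracy
    wandb.log({"epoch": epoch, "train_loss": train_loss, "val_acc": val_acc})

run.finish()
```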
The project uses two main datasets:
- train.csv: Labeled tweets for training
- test.csv: Unlabeled tweets for prediction
Labels:
- 1: Real disaster
- 0: Not a real disaster
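For a quick look at the data, the snippet below loads the CSVs with pandas and holds out a stratified validation split. It assumes the files sit under `ML/data/` and follow the standard Kaggle disaster-tweets layout (`id`, `keyword`, `location`, `text`, `target`); adjust the paths and column names if your copies differ.

```python
# Load the datasets and inspect the label distribution.
# Paths and column names are assumptions based on the usual Kaggle layout.
import pandas as pd
from sklearn.model_selection import train_test_split

train_df = pd.read_csv("ML/data/train.csv")
test_df = pd.read_csv("ML/data/test.csv")

print(train_df["target"].value_counts())   # 1 = real disaster, 0 = not

# Hold out 20% of the labeled tweets for validation, stratified on the label
train_texts, val_texts, train_labels, val_labels = train_test_split(
    train_df["text"], train_df["target"],
    test_size=0.2, stratify=train_df["target"], random_state=42,
)
```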
- Fork the repository
- Create a feature branch
- Commit changes
- Push to the branch
- Open a pull request
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.