Barclays-Hack-O-Hire

Automated Email Classification Machine Learning Model

This project presents a machine learning model that classifies emails automatically. The text is preprocessed with NLTK, converted to numerical features with TF-IDF, and classified by two algorithms, a Support Vector Machine and Naive Bayes, whose performance is then compared.

Libraries and Techniques Used:

  • NLTK (Natural Language Toolkit): Used for data preprocessing tasks such as tokenization, stemming, and stop-word removal.
  • TF-IDF (Term Frequency-Inverse Document Frequency): Used to convert text into numerical features that weight each term by its importance within a document relative to the whole collection.
  • Support Vector Machine (SVM) and Naive Bayes: The two classification algorithms that are trained on the same features and compared against each other.
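The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the toy emails, the `billing`/`support` labels, the small `STOP_WORDS` set, and the helper names `preprocess` and `build_model` are all assumptions made for the example. It also assumes scikit-learn's `LinearSVC` and `MultinomialNB` as the SVM and Naive Bayes implementations.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Illustrative stop-word subset (assumption). NLTK's full English list is
# nltk.corpus.stopwords.words("english"), after nltk.download("stopwords").
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "to", "of", "and",
              "for", "in", "on", "at", "my", "i", "it", "this", "not"}

stemmer = PorterStemmer()


def preprocess(text):
    """Lowercase, tokenize, drop punctuation and stop words, then stem."""
    tokens = wordpunct_tokenize(text.lower())
    return [stemmer.stem(t) for t in tokens
            if t.isalpha() and t not in STOP_WORDS]


# Hypothetical toy data; the real project's dataset and classes may differ.
emails = [
    "I was charged twice for my subscription",
    "Please send the invoice for last month",
    "My payment failed and the invoice is wrong",
    "Refund the duplicate charge on my card",
    "The app crashes when I open settings",
    "I cannot log in to my account",
    "Password reset link is not working",
    "The website shows an error on login",
]
labels = ["billing"] * 4 + ["support"] * 4


def build_model(classifier):
    # Passing a callable as `analyzer` makes TfidfVectorizer use our tokens.
    return make_pipeline(TfidfVectorizer(analyzer=preprocess), classifier)


svm_model = build_model(LinearSVC()).fit(emails, labels)
nb_model = build_model(MultinomialNB()).fit(emails, labels)

new_email = ["The invoice shows a duplicate charge"]
print("SVM:", svm_model.predict(new_email)[0])
print("Naive Bayes:", nb_model.predict(new_email)[0])
```

Wrapping the vectorizer and classifier in one pipeline keeps the preprocessing identical for both models, so any difference in their scores comes from the classifier alone.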

Performance Evaluation:

The model's performance is evaluated using two key metrics:

  • Accuracy: The fraction of test emails that are classified correctly.
  • F1-Score: The harmonic mean of precision and recall, which is particularly useful when class distributions are imbalanced.
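To see why the F1-score matters for imbalanced classes, the short sketch below computes both metrics by hand. The `ham`/`spam` labels and the counts are made up for illustration and are not results from this project; the function names are likewise hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_for_class(y_true, y_pred, positive):
    """F1 = 2 * precision * recall / (precision + recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical imbalanced test set: 90 "ham" emails and 10 "spam" emails.
y_true = ["ham"] * 90 + ["spam"] * 10

# A degenerate classifier that labels every email as "ham".
always_ham = ["ham"] * 100

print(accuracy(y_true, always_ham))              # 0.9
print(f1_for_class(y_true, always_ham, "spam"))  # 0.0
```

The degenerate classifier reaches 90% accuracy without detecting a single spam email, while its F1-score for the spam class is zero, which is the gap the second metric is meant to expose.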

Visualization and Evaluation:

  • The test data is visualized to gain insights into the distribution of classes and potential patterns.
  • The accuracy and F1-score are computed for both SVM and Naive Bayes models.
  • The two models are compared on these metrics to determine which one performs better.
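One possible shape for this evaluation step is sketched below, using scikit-learn's `accuracy_score` and `f1_score`. The labels and predictions are invented placeholders, not this project's results, and the choice of macro-averaged F1 as the deciding metric is an assumption. The class distribution is printed as a simple text bar chart; a plotting library could be used instead.

```python
from collections import Counter

from sklearn.metrics import accuracy_score, f1_score

# Placeholder test labels and predictions (made up for illustration only).
y_test = ["billing", "billing", "billing", "billing",
          "support", "support", "sales", "sales"]
predictions = {
    "SVM": ["billing", "billing", "billing", "support",
            "support", "support", "sales", "sales"],
    "Naive Bayes": ["billing", "billing", "billing", "support",
                    "support", "sales", "sales", "sales"],
}

# Class distribution of the test data as a text bar chart.
class_counts = Counter(y_test)
for label, count in class_counts.most_common():
    print(f"{label:<10}{'#' * count} ({count})")

# Accuracy and macro-averaged F1 for each model.
scores = {
    name: {
        "accuracy": accuracy_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred, average="macro"),
    }
    for name, y_pred in predictions.items()
}

for name, result in scores.items():
    print(f"{name}: accuracy={result['accuracy']:.3f}, f1={result['f1']:.3f}")

# Pick the model with the higher macro F1 (assumed tie-breaking metric).
best = max(scores, key=lambda name: scores[name]["f1"])
print("Best by macro F1:", best)
```

Macro averaging gives each class equal weight regardless of how many emails it contains, which fits the concern about imbalanced classes; `average="weighted"` is an alternative if class frequency should count.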

Repository Contents:

This repository includes:

  • Detailed documentation on the project setup, data preprocessing, model implementation, and evaluation.
  • Instructions for reproducing the results and running the classification model.
  • Visualization of test data distribution and model performance metrics.
  • Discussion of the implications and potential enhancements.

Acknowledgments:

We acknowledge the contributions of the open-source community and the developers of NLTK, scikit-learn, and other libraries used in this project.

References:

https://www.youtube.com/watch?feature=shared&v=O2L2Uv9pdDA