A machine learning model that detects whether a Reddit comment from a specified subreddit is cyberbullying. The model is trained on a Twitter dataset of tweets labelled as offensive or non-offensive. The data is stored as a pickled pandas DataFrame, using the pickle and pandas libraries. Each string is then cleaned and reduced to the stem of each word. Three feature extraction methods are used: Bag of Words (BoW), Term Frequency-Inverse Document Frequency (TF-IDF), and a custom approach using lexicons. The first two are paired with a Naive Bayes classifier, while the custom approach uses a Support Vector Machine. The models are evaluated using three metrics: recall, precision, and F1 score.
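To make the Bag of Words plus Naive Bayes path and the three metrics concrete, here is a minimal standard-library sketch. It is not the repository's code: the function names, the crude suffix-stripping "stemmer", and the four toy training sentences are all assumptions made for illustration, and the project itself likely relies on dedicated libraries for stemming and classification.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase, keep letters only, and strip a few suffixes (a crude stand-in for a real stemmer)."""
    tokens = []
    for raw in text.lower().split():
        word = "".join(ch for ch in raw if ch.isalpha())
        for suffix in ("ing", "ed", "s"):
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                word = word[: -len(suffix)]
                break
        if word:
            tokens.append(word)
    return tokens

def train_naive_bayes(texts, labels):
    """Fit a multinomial Naive Bayes model on bag-of-words counts."""
    class_docs = Counter(labels)                      # documents per class
    word_counts = {c: Counter() for c in class_docs}  # word counts per class
    for text, label in zip(texts, labels):
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_docs, word_counts, vocab

def predict(model, text):
    """Return the class with the highest log-posterior, using add-one smoothing."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):
        total_words = sum(word_counts[label].values())
        score = math.log(class_docs[label] / total_docs)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive (offensive) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical toy data: 1 = offensive, 0 = non-offensive.
train_texts = ["you are stupid and ugly", "have a nice day", "stupid idiot", "nice work friend"]
train_labels = [1, 0, 1, 0]
model = train_naive_bayes(train_texts, train_labels)
```

Calling `predict(model, comment)` on each held-out comment and passing the results to `precision_recall_f1` gives the three scores the description mentions. Swapping raw counts for TF-IDF weights would change only the feature step, not the evaluation.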
ljbudz/reddit-bot
A machine learning model for detecting cyberbullying in Reddit comments.