Link: https://github.com/compsocial/CREDBANK-data
Paper link: CREDBANK: A Large-scale Social Media Corpus With Associated Credibility Annotations
Task:
- multi-class classification - social media credibility classification of tweets (5 classes)
- other research-based tasks (credibility of social media, differences in sharing credible and non-credible events, ...)
The CREDBANK corpus was collected between mid-October 2014 and the end of February 2015. It comprises the streaming tweets tracked over this period, the topics in this tweet stream, the classification of those topics as events or non-events, and the events annotated with credibility ratings. The data is spread across four files; the description and location of each file are listed at the source link.
Note: Because of the complex file structure and the total size of the files, no analysis was performed for this dataset. However, at least the meta information from the dataset's README file is provided.
Note2: Labels are provided not for individual tweets but only for the identified events. Tweets are then linked to events, so a tweet-level label has to be derived from the label of its event.
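Since labels exist only at the event level, using the corpus for tweet classification means copying each event's label onto its tweets. The following is a minimal sketch of that step on toy in-memory data. It does not read the actual CREDBANK files: the `Tweet` type, the field names, the identifiers, and the integer class indices 0 to 4 are all hypothetical placeholders, and the real file layout should be taken from the dataset's README.

```python
from typing import Dict, List, NamedTuple, Optional


class Tweet(NamedTuple):
    """Hypothetical minimal tweet record: its own ID and the event it belongs to."""
    tweet_id: str
    event_id: str


def propagate_event_labels(
    tweets: List[Tweet], event_labels: Dict[str, int]
) -> Dict[str, Optional[int]]:
    """Assign each tweet the label of its event.

    Tweets whose event has no label are mapped to None so they can be
    filtered out before training.
    """
    return {t.tweet_id: event_labels.get(t.event_id) for t in tweets}


# Toy data (assumed, not taken from CREDBANK): one of 5 class indices per event.
event_labels = {"event_a": 4, "event_b": 1}
tweets = [
    Tweet("t1", "event_a"),
    Tweet("t2", "event_a"),
    Tweet("t3", "event_b"),
    Tweet("t4", "event_c"),  # event without a label
]

tweet_labels = propagate_event_labels(tweets, event_labels)
labeled_only = {tid: lab for tid, lab in tweet_labels.items() if lab is not None}
```

Every tweet of an event receives the same label under this scheme, so the resulting tweet-level labels are weak labels rather than per-tweet annotations.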