HuffPost is an american website about opinions and news like politcs, culture, wellness, etc. It was founded in 2005 by Andrew Breitbart, Arianna Huffington, Kenneth Lerer, and Jonah Peretti. The dataset provides data from 2012 to 2018.
In this work is made a exploratory data analysis (EDA) and category classification of the news posted in HuffPost using the Headlines and Short Descriptions of these news. For the EDA, Headlines, Short Descriptions and Categories are analysed through the years, and for the classification task, it's used a lighter version of the popular Natural Language Processing (NLP) framework BERT developed by Google, called DistilBERT, in this developed by huggingface.