BrenoAV/Youtube-Trending-Video-Analysis

PySpark — YouTube Trending Video Analysis

Interesting Points

  1. The best time to release a video is 12:00 PM (lunchtime).
  2. The “worst” day to release a video is Saturday.
  3. Tags apparently are not required for a video to reach trending status.
  4. For Brazil: football (soccer) content is still heavily consumed, as is Minecraft content.
  5. The number of views correlates with the number of likes, dislikes, and comments; the strongest correlation is with likes.
  6. Put a lot of exclamation marks in the title.
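
The first, second, and fifth points come down to two simple computations: counting videos per publish hour and weekday, and measuring how strongly views and likes move together. Below is a minimal standard-library sketch of both, not the notebook's actual code. The sample rows are made up, and the column names (`publishedAt`, `view_count`, `likes`) and the timestamp format are assumptions about the Kaggle CSV that should be checked against the real file. In PySpark the same ideas would typically be expressed with `groupBy(...).count()` and `DataFrame.stat.corr`.

```python
from collections import Counter
from datetime import datetime
from math import sqrt

# Hypothetical sample rows; column names are assumed, not taken from the repo.
rows = [
    {"publishedAt": "2021-03-01T12:00:05Z", "view_count": 1000, "likes": 100},
    {"publishedAt": "2021-03-02T12:30:00Z", "view_count": 2000, "likes": 210},
    {"publishedAt": "2021-03-05T18:15:00Z", "view_count": 1500, "likes": 140},
    {"publishedAt": "2021-03-06T09:00:00Z", "view_count": 500, "likes": 45},
]

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def parse_ts(value):
    """Parse an ISO-8601 timestamp with a trailing 'Z' (assumed format, UTC)."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def count_by_hour(rows):
    """Count videos per publish hour (0-23, in the timestamp's own timezone)."""
    return Counter(parse_ts(r["publishedAt"]).hour for r in rows)


def count_by_weekday(rows):
    """Count videos per publish weekday name."""
    return Counter(WEEKDAYS[parse_ts(r["publishedAt"]).weekday()] for r in rows)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


hours = count_by_hour(rows)
weekdays = count_by_weekday(rows)
views_likes = pearson([r["view_count"] for r in rows],
                      [r["likes"] for r in rows])
```

Here `hours.most_common(1)` gives the most frequent publish hour in the sample, and `views_likes` is close to 1 when views and likes rise together.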

Summary

Analysis of YouTube trending videos using the dataset provided by Rishav Sharma on Kaggle. PySpark was used to process the data. The Jupyter Notebook contains my analysis of the data and can serve as a starting point for further analysis.

Note: To analyze another country, change the config.yml file.

Image of a laptop and a tablet with the YouTube logo

"Youtube" by clasesdeperiodismo is licensed under CC BY-SA 2.0.

Kaggle

You can use the Kaggle Public API or download the dataset manually from the website.
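
For the API route, a download could look roughly like the sketch below. It uses the `kaggle` package installed in the next section and assumes your API credentials are already set up (for example in `~/.kaggle/kaggle.json`). The dataset slug and the `data` destination folder are assumptions, not values taken from this repository, so verify the slug on the dataset's Kaggle page.

```python
# Assumed slug for Rishav Sharma's dataset; verify it on the Kaggle dataset page.
DATASET = "rsrishav/youtube-trending-video-dataset"


def download_dataset(dest="data"):
    """Download and unzip the dataset into `dest` (hypothetical folder name)."""
    # Imported lazily because importing the kaggle package attempts
    # authentication and fails when no credentials are configured.
    from kaggle.api.kaggle_api_extended import KaggleApi

    api = KaggleApi()
    api.authenticate()  # reads credentials from kaggle.json or env variables
    api.dataset_download_files(DATASET, path=dest, unzip=True)
    return dest
```

Calling `download_dataset()` once before opening the notebook should leave the CSV files in the `data` folder.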

Python Dependencies

$ pip install --user pyspark pandas matplotlib seaborn PyYaml kaggle

Alternative (container): https://github.com/jupyter/docker-stacks/tree/main/images/pyspark-notebook


MIT License
This project is licensed under the MIT License - see the LICENSE file for details.

© 2023 BrenoAV

About

YouTube trending video analysis from 2020 until 2023
