Welcome to this repository and dashboard, which use NLP (natural language processing) to summarize YouTube comments on one of our favorite shows! BiggerPockets produces YouTube videos, podcasts, and other products focused on real estate investment and financing, offering a roadmap to financial freedom through real estate investing. The BiggerPockets YouTube channel (https://www.youtube.com/c/biggerpockets) has been producing informational videos since 2016, with 1.16M subscribers and 3K videos released so far! (Great content, BTW.) As big fans of the channel, we used NLP and sentiment analysis to draw insights into the audience's responses to each BiggerPockets episode.
This is a collaborative project by two data scientists: Dr. David Henderson (profile: https://github.com/HD013) and Dr. Yingtong "Amanda" Wu (profile: https://github.com/YingtongAamandaWu).
01_Codes: a folder containing Jupyter notebooks with the Python code for the intermediate steps and exploratory data analyses
This is an interactive figure from "Video_polarity_mean_max_min_plotly.html": open the HTML file and hover over the plot to see the mean, max, and min comment polarity for every video on the BiggerPockets YouTube channel. Note: for this analysis, we excluded videos shorter than 30 minutes and videos with fewer than 10 comments.
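For readers who want to see how such a per-video summary could be computed, here is a minimal sketch using only the Python standard library. It is not the code from our notebooks: the function name `summarize_polarity`, the video IDs, the durations, and the polarity scores are all hypothetical, and we assume each comment already has a polarity score between -1 and 1.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical inputs, invented for illustration only.
# durations: video_id -> length in seconds
durations = {"vid_a": 2400, "vid_b": 600, "vid_c": 3000}

# comments: (video_id, polarity) pairs, polarity assumed precomputed in [-1, 1]
comments = (
    [("vid_a", p) for p in [0.5, 0.8, -0.2, 0.0, 0.3, 0.6, 0.1, -0.5, 0.9, 0.4, 0.2, 0.5]]
    + [("vid_b", p) for p in [0.1] * 12]        # enough comments, but too short
    + [("vid_c", p) for p in [0.7, 0.2, -0.1]]  # long enough, but too few comments
)


def summarize_polarity(comments, durations, min_duration_s=30 * 60, min_comments=10):
    """Return mean/max/min polarity per video, skipping videos that fail either filter."""
    by_video = defaultdict(list)
    for video_id, polarity in comments:
        by_video[video_id].append(polarity)

    summary = {}
    for video_id, scores in by_video.items():
        if durations.get(video_id, 0) < min_duration_s:
            continue  # shorter than the minimum duration
        if len(scores) < min_comments:
            continue  # fewer comments than the minimum
        summary[video_id] = {
            "mean": mean(scores),
            "max": max(scores),
            "min": min(scores),
            "n": len(scores),
        }
    return summary


summary = summarize_polarity(comments, durations)
for video_id, stats in summary.items():
    print(video_id, stats)
```

With these made-up inputs, only `vid_a` passes both filters, so it is the only video that would appear in the figure.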
Highlight 2 - How long should a BiggerPockets video be to achieve the highest cost-effectiveness (treating view counts as a proxy for profits)?
We found that a video length of 260 seconds (4 min 20 secs) yields the highest view count per second of video. This seems to be a "sweet spot": it attracts the most views for the least effort spent on video production, without compromising the content :)
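The "view count per second" metric can be sketched as follows. This is a simplified illustration, not the analysis from our notebooks: the `(duration, views)` pairs are invented toy numbers (so they do not reproduce the 260-second result), and the helper name `views_per_second` is hypothetical.

```python
# Hypothetical (duration_seconds, view_count) pairs; toy numbers, not channel data.
videos = [(120, 9000), (300, 60000), (900, 90000), (2400, 120000)]


def views_per_second(duration_s, views):
    """Views earned per second of video length."""
    return views / duration_s


# Pick the video whose length gives the most views per second of content.
best_duration, best_views = max(videos, key=lambda v: views_per_second(*v))
best_rate = views_per_second(best_duration, best_views)

print(f"Best length: {best_duration} s ({best_rate:.0f} views per second)")
```

Dividing views by duration rewards short videos that still draw a large audience, which is why the longest (and most viewed) toy video is not the winner here.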
As a case study, we produced a wordcloud image based on 124 YouTube comments from this recent video: "New Rental Property Mortgages with 3% Interest Rates, 5% Down" (Link https://www.youtube.com/watch?v=IVK5vQg1UvY). Most of the comments are positive, as you can see in the figure below: most comments show polarity values over zero. From the wordcloud image above, "Thank" and "Great" are the two main keywords that popped up consistently in the comments, suggesting that the audience is very thankful for the information shared about low-interest-rate mortgages in 2023!
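The two observations in this case study (the share of comments with polarity over zero, and the most frequent keywords that a wordcloud would draw largest) can be sketched with the standard library alone. The comments, polarity scores, stopword list, and function names below are all hypothetical and are not taken from the video or from our notebooks.

```python
import re
from collections import Counter

# Hypothetical (comment_text, polarity) pairs; polarity assumed precomputed in [-1, 1].
comments = [
    ("Thank you, great information!", 0.8),
    ("Great episode, thank you for sharing", 0.7),
    ("Rates are still too high for me", -0.2),
    ("Thank you for the mortgage tips", 0.4),
]

# A tiny, hand-picked stopword list (an assumption; real lists are much longer).
STOPWORDS = {"you", "for", "the", "are", "too", "me", "still"}


def positive_share(scored_comments):
    """Fraction of comments whose polarity is strictly above zero."""
    positives = sum(1 for _, polarity in scored_comments if polarity > 0)
    return positives / len(scored_comments)


def top_words(scored_comments, n=2):
    """Most frequent non-stopword tokens, i.e. the words a wordcloud would emphasize."""
    words = []
    for text, _ in scored_comments:
        tokens = re.findall(r"[a-z']+", text.lower())
        words.extend(token for token in tokens if token not in STOPWORDS)
    return Counter(words).most_common(n)


print(f"Positive share: {positive_share(comments):.0%}")
print("Top words:", top_words(comments))
```

A wordcloud is essentially a visual rendering of these word counts, so the top entries of the counter correspond to the largest words in the image.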