Netflix Engagement Trends in 2023 #829
pwu97
announced in
2024 Plotnine Contest
Replies: 3 comments
Ahoy @pwu97 , |
@pwu97, the winning submission has been announced here. Thank you for taking part in the contest.
Authors
Peter Wu
Links
LinkedIn, GitHub
Full description
Introduction and Business Context
In its 2024 Q1 quarterly earnings report, Netflix made the surprising announcement that it would stop reporting its quarterly subscriber figures (while still reporting significant milestones for this metric along the way). Its argument was that it had developed new revenue streams, such as advertising and the new extra member feature (notably by cracking down on password sharing), so subscriber count was no longer its true north star metric; engagement (i.e. time spent) was. Accordingly, in 2023 Netflix started releasing engagement reports that give the time spent on each title over 6-month measurement periods.
In my initial analysis of the engagement data from January 2023 to June 2023, I found that roughly 2,500 of the 18,000 total shows account for 80% of the total hours viewed during that measurement period, and roughly 5,000 of the 18,000 account for 90%. The distribution of total hours viewed per show is therefore highly skewed. Let's visualize that more concretely. In the final plot, the goal is to get a sense of the typical engagement for a show on Netflix and to see which shows gained or lost the most engagement in 2023.
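The concentration figures above come down to sorting shows by hours viewed and counting how many are needed to reach a given share of the total. Here is a minimal sketch of that calculation. The function name titles_needed_for_share and the toy numbers are hypothetical and are not taken from the Netflix report or from my notebook.

```python
def titles_needed_for_share(hours_viewed, share):
    """Count how many of the most-watched titles reach `share` of total hours.

    `hours_viewed` is any iterable of non-negative numbers (one per title),
    and `share` is a fraction in (0, 1], e.g. 0.80 for 80%.
    """
    if not 0 < share <= 1:
        raise ValueError("share must be in the interval (0, 1]")

    ranked = sorted(hours_viewed, reverse=True)  # most-watched titles first
    total = sum(ranked)
    if total <= 0:
        raise ValueError("total hours viewed must be positive")

    running = 0
    for count, hours in enumerate(ranked, start=1):
        running += hours
        if running >= share * total:
            return count
    return len(ranked)


# Toy data for illustration only (NOT real Netflix figures).
toy_hours = [450, 250, 120, 70, 50, 30, 15, 8, 5, 2]

print(titles_needed_for_share(toy_hours, 0.80))  # titles covering 80% of hours
print(titles_needed_for_share(toy_hours, 0.90))  # titles covering 90% of hours
```

Run on the real engagement report, the same idea would be applied to the hours-viewed column for the January to June 2023 period.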
Plot Submission
The plot below is my submission for the 2024 Plotnine Contest:
Some simplifying assumptions made:
Compelling aspects of the plot
scale_y_log10() for the y-scale is particularly appealing because the distribution of the average hours per show per day is heavily right-skewed. Furthermore, using geom_label() complements this well because some of the labeled numbers (like 8.12M and 845K) signal to the viewer that the scale is logarithmic rather than linear.
Comments on the plot and other things I tried
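On the label point under "Compelling aspects of the plot": abbreviated labels such as 8.12M and 845K can be produced with a small formatting helper before the values are passed to geom_label(). The sketch below is an assumption about how that could be done; format_hours is a hypothetical name and is not necessarily what the notebook uses. It also shows why such labels hint at the log scale: the two example values differ by almost a full factor of ten.

```python
import math


def format_hours(value):
    """Abbreviate a number with a K/M/B suffix, keeping at most 2 decimals."""
    for threshold, suffix in ((1_000_000_000, "B"), (1_000_000, "M"), (1_000, "K")):
        if abs(value) >= threshold:
            scaled = value / threshold
            break
    else:
        scaled, suffix = value, ""

    # Format with two decimals, then drop trailing zeros and a dangling dot.
    text = f"{scaled:.2f}".rstrip("0").rstrip(".")
    return text + suffix


print(format_hours(8_120_000))  # 8.12M
print(format_hours(845_000))    # 845K

# On a log10 axis, equal vertical distances correspond to equal ratios.
# The gap between these two labels is close to one decade (a factor of ten).
gap_in_decades = math.log10(8_120_000) - math.log10(845_000)
print(round(gap_in_decades, 2))
```

One way to use this with plotnine would be to store the formatted strings in a column of the data frame and map that column to the label aesthetic of geom_label().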
Code repository
GitHub Repo
Jupyter Notebook