Reddit r/science post topic analysis

Report

Data source

In December 2022, I scraped the top 100 posts for each of three time windows (month, year, and all-time) from reddit.com/r/science.

I also scraped post titles from the homepages of Frontiers in Science and Nature.

Data scraping with PRAW

PRAW (the Python Reddit API Wrapper) is a Python package for accessing Reddit's API, so the scraping step is written in Python.

To get a read-only list of posts, I used scraper.py. To run this code yourself, you'll need to follow the instructions in this guide and use your own credentials.
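
As a rough illustration of the read-only scrape described above, here is a minimal sketch that collects the top posts for each time window. It is not the contents of scraper.py: the function names `make_reddit` and `collect_top_posts`, the choice of fields, and the credential parameters are assumptions made for illustration. The PRAW calls it relies on (`praw.Reddit`, `subreddit.top(time_filter=..., limit=...)`, and the submission attributes) are part of PRAW's documented interface.

```python
from datetime import datetime, timezone

# The three time windows named under "Data source".
TIME_FILTERS = ("month", "year", "all")


def make_reddit(client_id, client_secret, user_agent):
    """Build a read-only PRAW client (omitting username/password keeps it read-only)."""
    import praw  # imported lazily so collect_top_posts can be used without PRAW installed

    return praw.Reddit(
        client_id=client_id,
        client_secret=client_secret,
        user_agent=user_agent,
    )


def collect_top_posts(subreddit, limit=100, time_filters=TIME_FILTERS):
    """Return one dict per post for each time filter.

    `subreddit` is expected to behave like PRAW's Subreddit object, i.e. to
    expose .top(time_filter=..., limit=...) yielding submissions.
    """
    rows = []
    for time_filter in time_filters:
        for post in subreddit.top(time_filter=time_filter, limit=limit):
            rows.append(
                {
                    "time_filter": time_filter,
                    "id": post.id,
                    "title": post.title,
                    "score": post.score,
                    "num_comments": post.num_comments,
                    # created_utc is a Unix timestamp; store it as ISO 8601 text
                    "created": datetime.fromtimestamp(
                        post.created_utc, tz=timezone.utc
                    ).isoformat(),
                }
            )
    return rows
```

With your own credentials, `collect_top_posts(make_reddit(...).subreddit("science"))` would return up to 300 rows, which could then be loaded into a pandas DataFrame and written to a CSV file for the R scripts.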

Python packages

praw
pandas

Post data analysis

I investigated which topics are most prevalent among the top posts on r/science.

Title text analysis

I looked at the frequency of words in top post titles. I also made a wordcloud.
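
The analysis in this repository was done in R (see the packages below). Purely to illustrate the idea of counting word frequencies in titles, here is a small Python sketch of the same technique. The function name `title_word_counts` and the tiny stop-word list are assumptions for illustration, not the author's method.

```python
import re
from collections import Counter

# A deliberately small, illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "by", "for", "from", "in",
    "is", "it", "of", "on", "that", "the", "to", "with",
}


def title_word_counts(titles, stop_words=STOP_WORDS):
    """Count how often each non-stop word appears across all titles."""
    counts = Counter()
    for title in titles:
        # Lowercase, then keep alphabetic words (allowing one inner apostrophe).
        words = re.findall(r"[a-z]+(?:'[a-z]+)?", title.lower())
        counts.update(word for word in words if word not in stop_words)
    return counts
```

Calling `title_word_counts(titles).most_common(20)` gives the twenty most frequent words, which is the kind of table a word cloud is typically drawn from.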

R packages

tidyverse
lubridate
textdata (Sentiment analysis)
tidytext
wordcloud
readr
gghighlight
skimr

Notes

Because I scraped with Python and analyzed with R, I created a Python virtual environment in my project folder for the scraping work.
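
One way to set up such a project-local environment is Python's standard `venv` module. The sketch below is roughly equivalent to running `python -m venv .venv` in the project folder; the directory name `.venv` and the helper name `create_project_venv` are assumptions, not details taken from this repository.

```python
import venv
from pathlib import Path


def create_project_venv(project_dir, name=".venv", with_pip=True):
    """Create a virtual environment inside project_dir and return its path."""
    env_dir = Path(project_dir) / name
    # EnvBuilder is the standard-library API behind `python -m venv`.
    venv.EnvBuilder(with_pip=with_pip).create(str(env_dir))
    return env_dir
```

After activating the environment, `pip install praw pandas` installs the two packages listed under "Python packages".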

lfontanills/reddit-science-analysis
