This is the spiritual successor to the original rising threads bot, developed for reddit.com.
The basic concept of this bot is to look at a new post and determine whether or not it is likely to become popular. The process for doing this is as follows:
- Catalog parameters of new posts as they're submitted. Assign new posts a default position of 1000.
- Scan the top 1000 posts on reddit for submissions that were cataloged this way and update their recorded position.
- Use cataloged posts that are at least 10 hours old (by which time they either have or have not made it to the front page) as the training data for a linear regression model. The inputs are the post parameters; the output is the post's top position.
- Use the linear regression model to generate predictive scores for new posts.
- Log each post with its predicted score in a data structure that holds 10000 elements, evicts them first-in-first-out, and keeps them sorted by predicted score.
- Compute the ratio of posts that achieved a position better than 100 to all posts cataloged.
- If a post inserted into the data structure ranks, as a fraction of the stored posts, better than this ratio, flag it as a probable popular post.
- Post the flagged submission to reddit.
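The last few steps (the bounded first-in-first-out store, the popularity ratio, and the flagging rule) can be sketched as follows. This is a minimal illustration using only the standard library, not the bot's actual code: the names `ScoreWindow`, `popular_ratio`, and `is_probably_popular` are hypothetical, and it assumes that the predicted score is a predicted top position (so lower is better) and that "better than 100" means a position numerically below 100.

```python
import bisect
from collections import deque


class ScoreWindow:
    """Holds the most recent `capacity` predicted scores.

    Hypothetical helper: eviction is first-in-first-out, and a second
    list keeps the same scores sorted in ascending order.
    """

    def __init__(self, capacity=10000):
        self.capacity = capacity
        self._fifo = deque()  # insertion order, oldest on the left
        self._sorted = []     # same scores, ascending

    def add(self, score):
        """Insert a score; return the fraction of stored scores that are
        strictly lower (assumed: better) than it."""
        if len(self._fifo) == self.capacity:
            oldest = self._fifo.popleft()
            # Remove one occurrence of the evicted score from the sorted list.
            del self._sorted[bisect.bisect_left(self._sorted, oldest)]
        self._fifo.append(score)
        index = bisect.bisect_left(self._sorted, score)
        self._sorted.insert(index, score)
        return index / len(self._sorted)


def popular_ratio(top_positions, cutoff=100):
    """Share of cataloged posts whose best position beat the cutoff."""
    if not top_positions:
        return 0.0
    return sum(1 for p in top_positions if p < cutoff) / len(top_positions)


def is_probably_popular(window, predicted_position, ratio):
    """Flag a post whose rank fraction in the window is below the ratio."""
    return window.add(predicted_position) < ratio


# Example with made-up numbers: 1 of 4 cataloged posts reached the top 100.
ratio = popular_ratio([1000, 40, 1000, 350])
window = ScoreWindow(capacity=3)
for predicted in (500, 900, 100):
    window.add(predicted)
print(ratio, is_probably_popular(window, 50, ratio))
```

With a nearly empty window almost any post ranks near the top, so a real deployment would probably wait until the window has filled before acting on the flag.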
To run the bot, from the main directory:
python3 rising_threads/main.py
Requirements:
- Python 3.5
- Flask
- PyMongo
- scikit-learn (sklearn)
- PRAW
- A MongoDB instance
License: MIT
Author: The1RGood / Randy Goodman
Contact: [email protected]