This Python bot automatically posts updates from an RSS feed to your Bluesky feed. It checks the feed regularly, retrieves new articles, and creates posts with clickable links and optional header images. The bot uses a database to avoid duplicate posts by tracking previously posted articles, even across sessions.
- Automatically fetches and posts new articles from the RSS feed
- Creates formatted posts with clickable links to articles
- Smart image handling:
  - Configurable image posting (can be enabled/disabled)
  - Configurable image source priorities (og:image, twitter:image, etc.)
  - Automatically retrieves header images from article pages
  - Compresses images to meet Bluesky's size requirements
  - Falls back to text-only posts if image processing fails
- Robust duplicate prevention:
  - Configurable duplicate detection strategies
  - Optional database tracking of posted articles
  - Optional backup check against recent Bluesky posts
  - Configurable automatic database synchronization
- Reliable operation:
  - Implements exponential backoff for login attempts
  - Handles rate limits automatically
  - Includes comprehensive error handling and logging
- Highly configurable through YAML:
  - Customizable post format
  - Customizable check intervals
  - Adjustable retry settings
  - Configurable logging levels
  - Minimum post date filtering
- Python 3.8+
- Libraries:
  - requests
  - feedparser
  - beautifulsoup4
  - Pillow
  - atproto
  - python-dotenv
  - pyyaml
Clone this repository to your local environment or preferred hosting environment:
git clone https://github.com/MisterClean/ChicagoYIMBYblueskybot.git
cd ChicagoYIMBYblueskybot
Install the necessary Python libraries using pip:
pip install -r requirements.txt
Create a .env file in the root directory with your credentials:
BLUESKY_USERNAME=[email protected]
BLUESKY_PASSWORD=yourpassword
RSS_FEED_URL=https://example.com/feed
The bot uses a config.yaml file for extensive customization. Here's a comprehensive example with all available options:
# RSS Feed Configuration
rss:
  min_post_date: "2024-11-13"  # Format: YYYY-MM-DD
  image_sources:               # Order determines priority
    use_og_image: true         # Use OpenGraph image meta tag
    use_twitter_image: true    # Use Twitter card image meta tag
    use_wp_post_image: true    # Use WordPress featured image
    use_first_image: true      # Fallback to first img tag found

# Bot Settings
bot:
  check_interval: 600  # seconds
  max_retries: 5
  initial_delay: 10
  posts_to_check: 50
  include_images: true  # Master switch for image posting
  post_format: "{title}\n\nRead more: {link}"  # Supports {title} and {link} placeholders
  duplicate_detection:
    check_database: true         # Use SQLite database for duplicate checking
    check_bluesky_backup: true   # Backup check against recent posts
    auto_sync_to_database: true  # Auto-sync posts found on Bluesky to database

# Logging
logging:
  level: INFO
  format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
- min_post_date: Minimum date for posts to be considered
- image_sources: Configure which image sources to try and in what order
  - use_og_image: Try OpenGraph meta tags
  - use_twitter_image: Try Twitter card meta tags
  - use_wp_post_image: Try WordPress featured image
  - use_first_image: Fall back to first image found in page

- check_interval: Time between RSS feed checks in seconds
- max_retries: Maximum login retry attempts
- initial_delay: Initial delay between retries
- posts_to_check: Number of recent posts to check for duplicates
- include_images: Master switch to enable/disable image posting
- post_format: Template for post text (supports {title} and {link} placeholders)
- duplicate_detection: Configure duplicate post detection strategy
  - check_database: Enable/disable database checking
  - check_bluesky_backup: Enable/disable checking recent Bluesky posts
  - auto_sync_to_database: Enable/disable automatic database synchronization

- level: Logging level (INFO, DEBUG, WARNING, ERROR)
- format: Log message format
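
To illustrate how the image_sources priority order might be applied, here is a minimal sketch using only the standard library. It is not the bot's actual code (the project lists beautifulsoup4 as a dependency and most likely uses that instead); the class `ImageCandidateParser` and the function `pick_image_url` are hypothetical names. It assumes the WordPress featured image is an img tag carrying the `wp-post-image` class, and that the mapping keeps the order written in config.yaml.

```python
from html.parser import HTMLParser
from typing import Dict, Optional


class ImageCandidateParser(HTMLParser):
    """Collects the first candidate image URL for each source type."""

    def __init__(self) -> None:
        super().__init__()
        self.candidates: Dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and a.get("content"):
            if a.get("property") == "og:image":
                self.candidates.setdefault("use_og_image", a["content"])
            elif a.get("name") == "twitter:image":
                self.candidates.setdefault("use_twitter_image", a["content"])
        elif tag == "img" and a.get("src"):
            # Assumption: WordPress marks the featured image with this class.
            if "wp-post-image" in (a.get("class") or "").split():
                self.candidates.setdefault("use_wp_post_image", a["src"])
            self.candidates.setdefault("use_first_image", a["src"])


def pick_image_url(html: str, image_sources: Dict[str, bool]) -> Optional[str]:
    """Return the URL from the first enabled source that is present, else None."""
    parser = ImageCandidateParser()
    parser.feed(html)
    # Dicts preserve insertion order, so the config order acts as the priority.
    for source, enabled in image_sources.items():
        if enabled and source in parser.candidates:
            return parser.candidates[source]
    return None
```

With all four sources enabled, a page that has both an og:image meta tag and ordinary img tags would yield the og:image URL; disabling use_og_image moves the choice to the next enabled source.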
Start the bot using:
python src/main.py
- RSS Fetching: Regularly checks the RSS feed for new articles
- Duplicate Detection (configurable):
  - Checks SQLite database if enabled
  - Verifies against recent Bluesky posts if enabled
  - Optionally syncs posts found on Bluesky to database
- Image Processing (if enabled):
  - Attempts to extract header image using configured sources
  - Compresses image to meet Bluesky's size limits
  - Gracefully falls back to text-only if image processing fails
- Post Formatting:
  - Creates formatted post using configured template
  - Adds clickable link to article
  - Attaches processed image if available and enabled
- Error Handling:
  - Implements exponential backoff for login attempts
  - Handles rate limits automatically
  - Logs all operations for debugging
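
As a rough illustration of the RSS fetching step combined with the min_post_date setting, the sketch below filters feed entries by publication date. It assumes entries look like feedparser entries, which expose a `published_parsed` value as a `time.struct_time`. The function name `filter_by_min_date`, the inclusive cutoff, and the choice to skip undated entries are all assumptions, not confirmed behavior of the bot.

```python
import time
from datetime import date
from typing import Any, Iterable, List, Mapping


def filter_by_min_date(entries: Iterable[Mapping[str, Any]],
                       min_post_date: str) -> List[Mapping[str, Any]]:
    """Keep entries published on or after min_post_date (YYYY-MM-DD)."""
    cutoff = date.fromisoformat(min_post_date)
    kept = []
    for entry in entries:
        parsed = entry.get("published_parsed")
        if parsed is None:
            continue  # Assumption: entries without a date are skipped.
        published = date(parsed.tm_year, parsed.tm_mon, parsed.tm_mday)
        if published >= cutoff:  # Assumption: the cutoff day itself is included.
            kept.append(entry)
    return kept


if __name__ == "__main__":
    sample = [
        {"title": "old", "published_parsed": time.strptime("2024-11-12", "%Y-%m-%d")},
        {"title": "new", "published_parsed": time.strptime("2024-11-14", "%Y-%m-%d")},
    ]
    print([e["title"] for e in filter_by_min_date(sample, "2024-11-13")])
```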
Article Title
Read more: https://example.com/article-link
[Attached Image]
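
For a post like the one above, the link only becomes clickable on Bluesky if the post record carries a rich-text facet whose offsets are counted in UTF-8 bytes rather than characters. The sketch below renders the post_format template and builds such a facet by hand. The function name `build_post` is hypothetical, and the bot itself may rely on helpers from the atproto library instead.

```python
from typing import Any, Dict, List, Tuple

DEFAULT_FORMAT = "{title}\n\nRead more: {link}"


def build_post(title: str, link: str,
               post_format: str = DEFAULT_FORMAT) -> Tuple[str, List[Dict[str, Any]]]:
    """Render the post text and a link facet covering the URL, if present."""
    text = post_format.format(title=title, link=link)
    start_char = text.find(link) if link else -1
    if start_char == -1:
        return text, []  # Template has no {link}; nothing to make clickable.
    # Facet indices are byte offsets into the UTF-8 encoding of the text.
    byte_start = len(text[:start_char].encode("utf-8"))
    byte_end = byte_start + len(link.encode("utf-8"))
    facet = {
        "index": {"byteStart": byte_start, "byteEnd": byte_end},
        "features": [{"$type": "app.bsky.richtext.facet#link", "uri": link}],
    }
    return text, [facet]
```

Counting bytes matters as soon as the title contains non-ASCII characters: a title such as "Café opens" shifts the link's byte offset one past its character offset.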
- Exponential backoff for login attempts
- Automatic retry on rate limits
- Graceful fallbacks for image processing
- Comprehensive logging for debugging
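
The exponential backoff for login attempts could look roughly like the following sketch, which uses the max_retries and initial_delay settings from config.yaml. The helper name `retry_with_backoff` and the doubling factor are assumptions; the README does not state the exact multiplier the bot uses. The `sleep` parameter exists only so the delays can be observed without waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(action: Callable[[], T],
                       max_retries: int = 5,
                       initial_delay: float = 10,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call action until it succeeds, doubling the wait after each failure."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    delay = initial_delay
    for attempt in range(1, max_retries + 1):
        try:
            return action()
        except Exception:
            if attempt == max_retries:
                raise  # Out of attempts: surface the last error to the caller.
            sleep(delay)
            delay *= 2  # Assumption: the delay doubles on every retry.
    raise RuntimeError("unreachable")
```

With the defaults from the example configuration, five failing attempts would wait 10, 20, 40, and 80 seconds between tries before giving up.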
- Configurable primary database check
- Optional backup check against recent posts
- Optional automatic database synchronization
- Configurable number of posts to check
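
The duplicate checks listed above can be sketched with the standard sqlite3 module. The table name `posted_articles`, the function names, and the idea of matching an article by looking for its link inside recent post texts are all assumptions made for illustration; the bot's real schema and matching rule may differ. The three keyword flags mirror the duplicate_detection options in config.yaml.

```python
import sqlite3
from typing import Iterable


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the tracking database and create the table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS posted_articles (link TEXT PRIMARY KEY)")
    return conn


def mark_posted(conn: sqlite3.Connection, link: str) -> None:
    """Record a link as posted; repeated calls for the same link are harmless."""
    with conn:
        conn.execute("INSERT OR IGNORE INTO posted_articles (link) VALUES (?)", (link,))


def is_duplicate(conn: sqlite3.Connection, link: str,
                 recent_post_texts: Iterable[str],
                 check_database: bool = True,
                 check_bluesky_backup: bool = True,
                 auto_sync_to_database: bool = True) -> bool:
    """Return True if the link was already posted according to enabled checks."""
    if check_database:
        row = conn.execute(
            "SELECT 1 FROM posted_articles WHERE link = ?", (link,)
        ).fetchone()
        if row is not None:
            return True
    # Backup check: simple substring match against recent post texts.
    if check_bluesky_backup and any(link in text for text in recent_post_texts):
        if auto_sync_to_database:
            mark_posted(conn, link)  # Repair the local record.
        return True
    return False
```

Here `recent_post_texts` stands in for the text of the last posts_to_check posts fetched from Bluesky. A plain substring match can misfire when one URL is a prefix of another, so a real implementation would likely compare normalized links.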
- Built-in delays between posts
- Automatic handling of API rate limits
- Configurable retry settings
- Login Failures: The bot will automatically retry with exponential backoff
- Image Processing: Falls back to text-only posts if image processing fails
- Database Sync: Optionally repairs database if posts are found on Bluesky but missing from local storage
- Rate Limits: Handled automatically with configurable delays
- Configurable logging levels in config.yaml
- Detailed error messages for debugging
- Operation tracking for monitoring
This project is open-source and available under the MIT License.