Parsely is a tool for scraping and managing movie and TV show lists from Trakt, Letterboxd, and MDBList.
Parsely helps you collect and organize movie and TV show titles from various online sources, automatically match them with TMDB (The Movie Database) information, and maintain clean, deduplicated lists for your media collection.
- Multi-Site Scraping: Extract titles from Trakt.tv, Letterboxd, and MDBList
- TMDB Integration: Automatically match scraped titles with TMDB IDs
- Smart Caching: Reuse previous TMDB lookups to reduce API calls
- Duplicate Detection: Find and remove duplicates while preserving the best data
- Error Fixing: Automatically fix entries that failed to match properly
- Batch Processing: Handle multiple URLs or files at once
- Parallel Processing: Multi-threaded design for faster operation
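The Smart Caching feature above can be illustrated with a small sketch. This is not Parsely's actual implementation: the `LookupCache` class, the `cache_key` helper, the JSON file layout, and the `fetch` callback are all hypothetical names chosen for illustration. The idea is only that a lookup result is stored under a normalized title/year key so that a repeated title does not trigger a second API call.

```python
import json
from pathlib import Path
from typing import Callable, Dict, Optional


def cache_key(title: str, year: Optional[int]) -> str:
    """Build a stable key: lowercase, collapsed whitespace, optional year."""
    normalized = " ".join(title.lower().split())
    return f"{normalized}|{year if year is not None else ''}"


class LookupCache:
    """Hypothetical on-disk cache mapping title/year keys to TMDB IDs."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.entries: Dict[str, Optional[int]] = {}
        if path.exists():
            # Reuse results saved by an earlier run.
            self.entries = json.loads(path.read_text(encoding="utf-8"))

    def get_or_fetch(
        self,
        title: str,
        year: Optional[int],
        fetch: Callable[[str, Optional[int]], Optional[int]],
    ) -> Optional[int]:
        """Return the cached ID, calling `fetch` only on a cache miss."""
        key = cache_key(title, year)
        if key not in self.entries:
            self.entries[key] = fetch(title, year)
        return self.entries[key]

    def save(self) -> None:
        """Persist the cache so later runs can skip known lookups."""
        self.path.write_text(json.dumps(self.entries, indent=2), encoding="utf-8")
```

Here `fetch` stands in for whatever function performs the real TMDB request. Failed lookups (`None`) are cached too in this sketch, which saves calls but means a retry needs the key removed first.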
- Python 3.7+
- TMDB API key (required)
- MDBList API key (optional)
- Clone the repository:
  git clone https://github.com/amcgready/parsely.git
  cd parsely
- Install the required dependencies:
  pip install -r requirements.txt
- Create a .env file with your API keys (use .env.template as a reference):
  TMDB_API_KEY=your_tmdb_key_here
  ENABLE_TMDB_MATCHING=true
  MDBLIST_API_KEY=your_mdblist_key_here
  INCLUDE_YEAR=true
- Test your configuration:
  python envtest.py
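To make the configuration step concrete, here is a minimal sketch of how the four `.env` settings could be read and checked using only the standard library. It does not reproduce what `envtest.py` actually does; `parse_env_text`, `as_bool`, and `missing_required` are hypothetical helpers, and the set of accepted boolean spellings is an assumption.

```python
from typing import Dict, List, Optional

# Assumed spellings that count as "true" for boolean settings.
TRUE_VALUES = {"1", "true", "yes", "on"}


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=value lines, skipping blanks, comments, and malformed lines."""
    settings: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def as_bool(value: Optional[str], default: bool = False) -> bool:
    """Interpret a setting as a boolean, falling back to `default` if unset."""
    if value is None or not value.strip():
        return default
    return value.strip().lower() in TRUE_VALUES


def missing_required(settings: Dict[str, str]) -> List[str]:
    """List required keys that are absent or empty (only the TMDB key is required)."""
    return [key for key in ("TMDB_API_KEY",) if not settings.get(key)]


if __name__ == "__main__":
    sample = (
        "TMDB_API_KEY=your_tmdb_key_here\n"
        "ENABLE_TMDB_MATCHING=true\n"
        "MDBLIST_API_KEY=your_mdblist_key_here\n"
        "INCLUDE_YEAR=true\n"
    )
    config = parse_env_text(sample)
    print("Missing required keys:", missing_required(config))
    print("TMDB matching enabled:", as_bool(config.get("ENABLE_TMDB_MATCHING")))
```

The MDBList key is left out of the required check because the requirements list marks it as optional.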
Run the main script to launch the interactive menu:
python parsely.py
Scrape a single URL from Trakt, Letterboxd, or MDBList and save to a file.
Process multiple URLs at once and combine the results into a single list.
Monitor URLs for changes and update your lists automatically.
Repair entries that automatic TMDB matching could not resolve.
Find and remove duplicate entries across your lists.
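As a rough illustration of duplicate removal that keeps the most complete record, consider the sketch below. It is not Parsely's own algorithm: the `Entry` type and the rule that two entries are duplicates when their normalized title and year match are assumptions made for this example, and "best" is taken to mean "has a TMDB ID".

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Entry:
    """Hypothetical in-memory form of one list entry."""
    title: str
    year: Optional[int] = None
    tmdb_id: Optional[int] = None


def dedupe_key(entry: Entry) -> Tuple[str, Optional[int]]:
    """Assumed duplicate rule: same case-insensitive title and same year."""
    return (" ".join(entry.title.lower().split()), entry.year)


def deduplicate(entries: Iterable[Entry]) -> List[Entry]:
    """Keep one entry per key, preferring an entry that has a TMDB ID."""
    best: Dict[Tuple[str, Optional[int]], Entry] = {}
    for entry in entries:
        key = dedupe_key(entry)
        current = best.get(key)
        if current is None:
            best[key] = entry
        elif current.tmdb_id is None and entry.tmdb_id is not None:
            # Replace an unmatched entry with a matched one; on a tie,
            # the first occurrence is kept.
            best[key] = entry
    # Dicts preserve insertion order, so first-seen ordering is retained.
    return list(best.values())
```

Keying on the year as well as the title keeps remakes that share a title apart, at the cost of not merging an entry that lacks a year with one that has it.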
Fix both duplicates and errors across multiple files in one operation.
Configure Parsely's behavior including TMDB matching and output format.
Parsely includes Docker support for easy deployment:
# Build the Docker image
docker build -t parsely .
# Run with Docker Compose
docker-compose up
Lists are stored in text files with the following format:
Title (Year) [TMDB_ID] # For TV shows
Title (Year) [movie:TMDB_ID] # For movies
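For scripts that consume these files, the format above can be parsed with a short regular expression. The sketch below is an assumption-laden example rather than Parsely's own parser: `ListEntry`, `parse_line`, and `format_line` are hypothetical names, the trailing `# For ...` notes are treated as explanatory annotations that do not appear in stored lines, and both the year and the bracketed ID are treated as optional.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Title, then optional "(Year)", then optional "[ID]" or "[movie:ID]".
LINE_PATTERN = re.compile(
    r"^(?P<title>.+?)"
    r"(?:\s+\((?P<year>\d{4})\))?"
    r"(?:\s+\[(?:(?P<kind>movie):)?(?P<tmdb_id>\d+)\])?"
    r"\s*$"
)


@dataclass(frozen=True)
class ListEntry:
    """Hypothetical parsed form of one line in a list file."""
    title: str
    year: Optional[int]
    media_type: Optional[str]  # "movie", "tv", or None when there is no ID
    tmdb_id: Optional[int]


def parse_line(line: str) -> Optional[ListEntry]:
    """Parse one line; return None for blank lines."""
    line = line.strip()
    if not line:
        return None
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    tmdb_id = int(match["tmdb_id"]) if match["tmdb_id"] else None
    if tmdb_id is None:
        media_type = None
    else:
        # A bare ID denotes a TV show; the "movie:" prefix denotes a movie.
        media_type = "movie" if match["kind"] else "tv"
    year = int(match["year"]) if match["year"] else None
    return ListEntry(match["title"], year, media_type, tmdb_id)


def format_line(entry: ListEntry) -> str:
    """Render an entry back into the text format."""
    parts = [entry.title]
    if entry.year is not None:
        parts.append(f"({entry.year})")
    if entry.tmdb_id is not None:
        prefix = "movie:" if entry.media_type == "movie" else ""
        parts.append(f"[{prefix}{entry.tmdb_id}]")
    return " ".join(parts)
```

Because the title group is matched lazily, a title that itself contains digits or parentheses still parses as long as the year and ID suffixes come last on the line.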
Contributions are welcome! Feel free to open issues or submit pull requests.
This project is licensed under the MIT License - see the LICENSE file for details.
- TMDB API for metadata
- Trakt.tv for watchlist and collection data
- Letterboxd for curated film lists
- MDBList for list integration