Parsely

A tool for scraping and managing movie and TV show lists from Trakt, Letterboxd, and MDBList.

📋 Overview

Parsely helps you collect and organize movie and TV show titles from various online sources, automatically match them with TMDB (The Movie Database) information, and maintain clean, deduplicated lists for your media collection.

✨ Features

Multi-Site Scraping: Extract titles from Trakt.tv, Letterboxd, and MDBList
TMDB Integration: Automatically match scraped titles with TMDB IDs
Smart Caching: Reuse previous TMDB lookups to reduce API calls
Duplicate Detection: Find and remove duplicates while preserving the best data
Error Fixing: Automatically fix entries that failed to match properly
Batch Processing: Handle multiple URLs or files at once
Parallel Processing: Multi-threaded design for faster operation
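
The caching and parallel-processing features above could be combined roughly as follows. This is a minimal sketch, not Parsely's actual code: `match_titles`, `lookup`, and `cache` are hypothetical names, and `lookup` stands in for whatever function performs the real TMDB request.

```python
from concurrent.futures import ThreadPoolExecutor


def match_titles(titles, lookup, cache, max_workers=4):
    """Resolve titles to IDs, calling `lookup` only for titles missing from `cache`.

    `lookup` is a caller-supplied function (hypothetical here) that takes a
    title and returns its ID. `cache` is a plain dict that is updated in place.
    """
    # Drop repeated titles (keeping order) and skip anything already cached.
    missing = [t for t in dict.fromkeys(titles) if t not in cache]

    # Run the remaining lookups on a small thread pool.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for title, result in zip(missing, pool.map(lookup, missing)):
            cache[title] = result

    return {t: cache[t] for t in titles}
```

Persisting `cache` between runs (for example as a JSON file) would let later runs skip lookups that already succeeded.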

🚀 Getting Started

Prerequisites

Python 3.7+
TMDB API key (required)
MDBList API key (optional)

Installation

Clone the repository:

git clone https://github.com/amcgready/parsely.git
cd parsely

Install the required dependencies:
```
pip install -r requirements.txt
```

Create a .env file with your API keys (use .env.template as a reference):

TMDB_API_KEY=your_tmdb_key_here
ENABLE_TMDB_MATCHING=true
MDBLIST_API_KEY=your_mdblist_key_here
INCLUDE_YEAR=true
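
For illustration, settings like these could be read and validated in Python as shown below. This is a sketch under assumptions: `load_settings` and `parse_bool` are hypothetical helpers, the defaults are guesses, and it expects the variables to already be in the process environment (how Parsely itself loads the .env file is not shown here).

```python
import os


def parse_bool(value, default=False):
    """Interpret common truthy strings such as 'true', '1', or 'yes'."""
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None):
    """Build a settings dict from environment-style key/value pairs."""
    env = os.environ if env is None else env

    api_key = env.get("TMDB_API_KEY", "").strip()
    if not api_key:
        # The TMDB key is listed as required in the prerequisites.
        raise ValueError("TMDB_API_KEY is required")

    return {
        "tmdb_api_key": api_key,
        # Defaults below are assumptions, not documented behavior.
        "enable_tmdb_matching": parse_bool(env.get("ENABLE_TMDB_MATCHING"), default=True),
        "mdblist_api_key": env.get("MDBLIST_API_KEY", "").strip() or None,
        "include_year": parse_bool(env.get("INCLUDE_YEAR"), default=True),
    }
```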

Test your configuration:
```
python envtest.py
```

Usage

Run the main script to launch the interactive menu:

python parsely.py

📚 Main Features

1. Single URL Scraper

Scrape a single URL from Trakt, Letterboxd, or MDBList and save to a file.
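
Before scraping, a tool like this needs to work out which site a URL belongs to. The sketch below shows one way to do that with the standard library; `detect_site` and the `SITES` mapping are hypothetical and may not match how Parsely routes URLs internally.

```python
from urllib.parse import urlparse

# Hostname -> short site name (illustrative mapping, not Parsely's own).
SITES = {
    "trakt.tv": "trakt",
    "letterboxd.com": "letterboxd",
    "mdblist.com": "mdblist",
}


def detect_site(url):
    """Return the short site name for a supported URL, or None otherwise."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return SITES.get(host)
```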

2. Batch Scraper

Process multiple URLs at once and combine the results into a single list.

3. Monitor Scraper (Coming Soon)

Monitor URLs for changes and update your lists automatically.

4. Fix Errors

Fix entries that could not be automatically matched with TMDB.

5. Manage Duplicates

Find and remove duplicate entries across your lists.
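
One possible approach to deduplication is sketched below. It is an assumption, not Parsely's actual algorithm: entries are treated as duplicates when their title and year match case-insensitively, and an entry that carries an ID tag is preferred over one that does not. The names `dedupe` and `dedupe_key` are hypothetical.

```python
import re

# Matches a trailing ID tag such as [12345] or [movie:12345].
ID_RE = re.compile(r"\[(?:movie:)?\d+\]\s*$")


def dedupe_key(line):
    """Title and year, case-folded, with any trailing ID tag removed."""
    without_id = ID_RE.sub("", line)
    return " ".join(without_id.split()).casefold()


def dedupe(lines):
    """Keep one line per title/year, preferring a line that has an ID tag."""
    best = {}  # key -> chosen line; dicts preserve first-seen order
    for line in lines:
        line = line.strip()
        if not line:
            continue
        key = dedupe_key(line)
        current = best.get(key)
        # Replace only when the stored line lacks an ID and this one has it.
        if current is None or (not ID_RE.search(current) and ID_RE.search(line)):
            best[key] = line
    return list(best.values())
```

Note that this simple key would merge a movie and a TV show that share a title and year; a stricter version could include the media type in the key.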

6. Auto Fix Tool

Fix both duplicates and unmatched entries across multiple files in a single operation.

7. Settings

Configure Parsely's behavior, including TMDB matching and output format.

🛠️ Docker Support

Parsely includes Docker support for easy deployment:

# Build the Docker image
docker build -t parsely .

# Run with Docker Compose
docker-compose up

📝 List Format

Lists are stored in text files with the following format:

Title (Year) [TMDB_ID]       # For TV shows
Title (Year) [movie:TMDB_ID] # For movies
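
Lines in this format can be parsed with a regular expression. The sketch below is illustrative only: `Entry` and `parse_line` are hypothetical names, the example IDs are placeholders, and treating the year as optional is an assumption based on the `INCLUDE_YEAR` setting.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    title: str
    year: Optional[int]
    media_type: str  # "movie" or "tv"
    tmdb_id: int


# Title, optional "(Year)", then "[ID]" for TV or "[movie:ID]" for movies.
LINE_RE = re.compile(
    r"^(?P<title>.+?)"
    r"(?:\s+\((?P<year>\d{4})\))?"
    r"\s+\[(?P<movie>movie:)?(?P<tmdb_id>\d+)\]\s*$"
)


def parse_line(line):
    """Parse one list line into an Entry, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    year = match.group("year")
    return Entry(
        title=match.group("title"),
        year=int(year) if year else None,
        media_type="movie" if match.group("movie") else "tv",
        tmdb_id=int(match.group("tmdb_id")),
    )
```

For example, `parse_line("Example Film (2010) [movie:111]")` yields an `Entry` with `media_type="movie"`, while a line without the `movie:` prefix is treated as a TV show.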

🤝 Contributing

Contributions are welcome! Feel free to open issues or submit pull requests.

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgements

TMDB API for metadata
Trakt.tv for watchlist and collection data
Letterboxd for curated film lists
MDBList for list integration
