A web crawler designed to scrape Pokémon card prices from TCGPlayer.com

License

BSD-3-Clause (LICENSE.txt)
BSD-3-Clause (LICENSE_SCRAPY_PLAYWRIGHT)
Richard-Stump/pokespider

PokeSpider

A web crawler designed to scrape Pokémon card prices from TCGPlayer.com and export them to .csv files.

Installation

  1. Install Python 3, if you do not have it already.

  2. Create a new virtual environment:

    python -m venv .venv
  3. Enter the virtual environment:

    PowerShell:

    . .venv\Scripts\Activate.ps1

    cmd.exe:

    .venv\Scripts\activate.bat

    Linux:

    source .venv/bin/activate
  4. Install dependencies:

    pip install -r requirements
    playwright install
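
After installing, the minimum versions listed under Dependencies can be checked from inside the virtual environment. The following is a minimal sketch and not part of the project: `MINIMUM_VERSIONS`, `parse_version`, `meets_minimum`, and `check_dependencies` are hypothetical names, and the parsing only handles plain dotted numbers such as `2.11.0`.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the Dependencies table in this README.
MINIMUM_VERSIONS = {
    "scrapy": "2.11.0",
    "playwright": "1.15",
    "wxPython": "4.2.1",
}


def parse_version(text):
    """Turn '2.11.0' into (2, 11, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two dotted version strings numerically, padding with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_dependencies(minimums=MINIMUM_VERSIONS):
    """Return {package: status string} for each listed dependency."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
            continue
        ok = meets_minimum(installed, minimum)
        report[name] = f"{installed} ({'ok' if ok else 'below ' + minimum})"
    return report


if __name__ == "__main__":
    for package, status in check_dependencies().items():
        print(f"{package}: {status}")
```

scrapy-playwright is left out of the check because this project bundles it in source form rather than installing it as a package.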

Running

  1. Enter the virtual environment if you are not in it already (see step 3 of the installation instructions).
  2. Run the crawler with the following command:
    scrapy crawl main
  3. A window will pop up with a list of sets that can be scraped. Check the ones you want, then close the window.
  4. Wait for the crawl to finish. The scraped prices are exported to .csv files.
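
Once the crawl finishes, an exported file can be sanity-checked with the standard library. This is only a sketch: `summarize_csv` is a hypothetical helper, and it makes no assumption about the column names the project's pipeline writes, so the header is read from the file itself.

```python
import csv


def summarize_csv(path):
    """Return (column_names, row_count) for a CSV file with a header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Iterating consumes the data rows; the header is read on first access.
        row_count = sum(1 for _ in reader)
        columns = list(reader.fieldnames or [])
    return columns, row_count
```

Calling `summarize_csv` with the path of one of the exported files returns its header names and the number of data rows, which is a quick way to confirm that a set was actually scraped.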

Other Notes:

Important Files for Making Edits

| File | Purpose |
| --- | --- |
| settings.py | Settings for Scrapy and the spider. |
| pipelines.py | Pipeline that takes items and outputs them to CSV files. |
| items.py | The data structure for the scraped data. |
| spiders/main_spider.py | The spider code that handles requesting and parsing data. |
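
To illustrate how `items.py` and `pipelines.py` relate, here is a minimal sketch of an item type and a CSV-writing pipeline. It is not the project's actual code: `CardPrice`, its fields, `CsvExportPipeline`, and the one-file-per-set layout are all assumptions made for illustration. It relies only on the Scrapy conventions that an item pipeline is a plain class with a `process_item(self, item, spider)` method and optional `open_spider`/`close_spider` hooks, and that dataclass instances can serve as items.

```python
import csv
from dataclasses import asdict, dataclass, fields
from pathlib import Path


@dataclass
class CardPrice:
    """Hypothetical item: one scraped price for one card."""
    set_name: str
    card_name: str
    market_price: float


class CsvExportPipeline:
    """Hypothetical pipeline: writes items to <output_dir>/<set_name>.csv."""

    def __init__(self, output_dir="output"):
        self.output_dir = Path(output_dir)
        self._files = {}    # set_name -> open file handle
        self._writers = {}  # set_name -> csv.DictWriter

    def open_spider(self, spider):
        self.output_dir.mkdir(parents=True, exist_ok=True)

    def process_item(self, item, spider):
        writer = self._writers.get(item.set_name)
        if writer is None:
            # First item for this set: create the file and write the header.
            path = self.output_dir / f"{item.set_name}.csv"
            handle = path.open("w", newline="", encoding="utf-8")
            names = [f.name for f in fields(item)]
            writer = csv.DictWriter(handle, fieldnames=names)
            writer.writeheader()
            self._files[item.set_name] = handle
            self._writers[item.set_name] = writer
        writer.writerow(asdict(item))
        return item  # returned so any later pipeline stage still sees it

    def close_spider(self, spider):
        for handle in self._files.values():
            handle.close()
        self._files.clear()
        self._writers.clear()
```

In Scrapy, a pipeline class like this only runs once it is listed in the `ITEM_PIPELINES` setting in `settings.py`.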

Dependencies:

| Dependency | Min Version | Reason Used | Notes |
| --- | --- | --- | --- |
| scrapy | 2.11.0 | Framework that orchestrates the scraping process and provides a CLI tool for running the scraper. | |
| playwright | 1.15 | Runs a headless browser that downloads dynamic content. | |
| scrapy-playwright | Special | Implements a Scrapy download handler that lets Scrapy download pages using Playwright. | This project uses a fork of scrapy-playwright that lets it run on Windows, rather than just Linux. The fork is included in source form in this project rather than as a submodule. |
| wxPython | 4.2.1 | Used to implement the set selector window. | |
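
For reference, upstream scrapy-playwright is wired into Scrapy through `settings.py` roughly as shown here. This sketch follows the upstream package's documented setup. Because this project bundles a fork in source form, the handler's import path may differ, so treat the module path as an assumption and check the project's own `settings.py`.

```python
# Sketch of settings.py entries, based on upstream scrapy-playwright.
# The "scrapy_playwright.handler" module path is an assumption for the
# fork bundled with this project.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright requires Twisted's asyncio-based reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

# Options passed to Playwright when launching the browser; headless mode
# matches the headless browser described in the Dependencies table.
PLAYWRIGHT_LAUNCH_OPTIONS = {"headless": True}
```

With upstream scrapy-playwright, individual requests opt in to browser rendering by setting `meta={"playwright": True}` on the request. Whether the bundled fork behaves identically is not confirmed here.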
