scrapy-site-downloader

Overview

Template project for downloading a site with Scrapy. It crawls a given website and saves the pages as HTML files, restricted to the domain and URL filters you configure.

1. Clone this repository and `cd` into it.
2. Install the dependencies:
   ```
   pip install -r requirements.txt
   ```
3. Configure `crawler/spiders/site.py` for the site you want to crawl.
4. Start the downloader. Run this from the repository root:
   ```
   scrapy crawl site
   ```
5. Refer to the Scrapy documentation for best practices and other configuration options.
6. When the crawler finishes, the HTML files will be in the `html` directory of the repository.
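
After a crawl, it can be useful to check quickly that pages were actually saved. The following is a minimal sketch, not part of this template: the helper name `summarize_html_dir` is hypothetical, and it assumes the saved pages use a `.html` extension, which this README does not confirm. Adjust `pattern` if the spider names its files differently.

```python
from pathlib import Path


def summarize_html_dir(html_dir="html", pattern="*.html"):
    """Return (file_count, total_bytes) for saved pages under html_dir.

    Hypothetical helper: the ".html" extension is an assumption about
    how the spider names its output files.
    """
    root = Path(html_dir)
    if not root.is_dir():
        # Nothing saved yet, or the script was not run from the repository root.
        return 0, 0
    # Search subdirectories too, in case pages are saved in nested folders.
    files = [p for p in root.rglob(pattern) if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


if __name__ == "__main__":
    count, total = summarize_html_dir()
    print(f"{count} HTML files, {total} bytes")
```

Like the crawl command, run this from the repository root so that the relative `html` path resolves correctly. A count of zero usually means the crawl saved nothing or the script was run from a different directory.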
