Mingtao edited this page Oct 8, 2018 · 3 revisions

Install Scrapy

pip install Scrapy

Create a new project

scrapy startproject tutorial

Edit example.py in tutorial/spiders

import scrapy


class ExampleSpider(scrapy.Spider):
    name = 'example'  # spider name used by "scrapy crawl"
    allowed_domains = ['projectcomputing.com']
    start_urls = ['http://overproof.projectcomputing.com/showRequest/1538722038978085']

    def parse(self, response):
        # Collect every text node inside <td name="rn"> cells
        after_correction_list = response.css('td[name="rn"] ::text').getall()
        # Join the fragments, then collapse runs of whitespace into single spaces
        after_correction = ' '.join(after_correction_list)
        after_correction = ' '.join(after_correction.split())
        print(after_correction)
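
The two join calls in parse act as a whitespace-normalization step: the selector returns many small text fragments, some padded with newlines and spaces, and the spider wants one clean string. Below is a minimal standalone sketch of just that step, runnable without Scrapy. The function name normalize_whitespace and the sample fragments are made up for illustration and are not part of the spider.

```python
def normalize_whitespace(fragments):
    """Join text fragments and collapse whitespace runs into single spaces."""
    joined = ' '.join(fragments)
    # str.split() with no arguments splits on any run of whitespace and
    # drops leading/trailing whitespace, so re-joining leaves single spaces.
    return ' '.join(joined.split())


# Hypothetical fragments, shaped like what a ::text selector might return
sample = ['\n  The quick ', 'brown\n', '   fox  ']
print(normalize_whitespace(sample))
```

Joining first and splitting afterwards means a fragment that is only whitespace disappears instead of leaving a double space behind.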

Run the project (from inside the tutorial directory)

scrapy crawl example

XPath cheat sheet

https://devhints.io/xpath
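
The spider above uses a CSS selector. The same cells could probably be reached with XPath as well: `td[name="rn"] ::text` corresponds roughly to `//td[@name="rn"]//text()`, which in Scrapy would be passed to response.xpath(...). The sketch below tries the attribute predicate offline with the standard library's xml.etree.ElementTree. ElementTree supports only a subset of XPath, so it uses itertext() in place of `//text()`. The sample markup, including the "ro" cells, is hypothetical and only stands in for the real page.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup standing in for the real page
SAMPLE = """
<table>
  <tr><td name="ro">Tke quick</td><td name="rn">The <b>quick</b></td></tr>
  <tr><td name="ro">brovvn fox</td><td name="rn">brown fox</td></tr>
</table>
"""

root = ET.fromstring(SAMPLE)

# Attribute predicate: select only <td> elements whose name is "rn"
cells = root.findall(".//td[@name='rn']")

# itertext() walks every descendant text node, similar to //text() in full XPath
fragments = [text for cell in cells for text in cell.itertext()]

# Same whitespace clean-up as in the spider's parse method
result = ' '.join(' '.join(fragments).split())
print(result)
```

ElementTree needs well-formed XML, so this is only a way to experiment with expressions from the cheat sheet. Real pages are better handled by Scrapy's own selectors.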
