Scraping

Contents

Learning Goals

  • understand how the general process of scraping works, including the structure of a website
  • learn how to identify specific data in a page
  • learn how to reorganize that data into a reusable format
  • learn about the challenges posed by different publishing formats
  • learn how to use quickscrape with your own URLs
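
The first three goals can be illustrated with a minimal sketch in Python, using only the standard library. It is not quickscrape and does not use quickscrape's scraper definition format. The sample HTML, the `DEFINITION` mapping, and the names `MetaScraper` and `scrape` are all hypothetical, chosen only to show the idea: a definition says which parts of the page source to pick out, and the result is reorganized into JSON.

```python
import json
from html.parser import HTMLParser

# Hypothetical page source, standing in for a downloaded article page.
SAMPLE_HTML = """
<html>
  <head>
    <title>An Example Article</title>
    <meta name="citation_title" content="An Example Article">
    <meta name="citation_author" content="A. Author">
    <meta name="citation_author" content="B. Author">
  </head>
  <body><p>Body text that is ignored here.</p></body>
</html>
"""

# Hypothetical scraper definition: output field -> name of the <meta> tag.
DEFINITION = {
    "title": "citation_title",
    "authors": "citation_author",
}


class MetaScraper(HTMLParser):
    """Collect the content of <meta> tags that are named in a definition."""

    def __init__(self, definition):
        super().__init__()
        # Invert the definition so a meta name maps to its output field.
        self.fields = {meta: field for field, meta in definition.items()}
        self.results = {field: [] for field in definition}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        field = self.fields.get(attributes.get("name"))
        if field is not None and "content" in attributes:
            self.results[field].append(attributes["content"])


def scrape(html, definition):
    """Apply a definition to page source and return the captured values."""
    parser = MetaScraper(definition)
    parser.feed(html)
    parser.close()
    return parser.results


result = scrape(SAMPLE_HTML, DEFINITION)
print(json.dumps(result, indent=2))
```

Comparing `SAMPLE_HTML`, `DEFINITION`, and the printed JSON side by side mirrors the hands-on activity: source, scraper definition, and scraping result.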

Activities and Methods

  • compare the website source with the scraper definition and the resulting scraping output
  • write a scraper

Duration

  • 45 min
    • 10 min presentation
    • 5 min demo of quickscrape
    • 20 min hands-on example, comparing the source HTML with the scraper definition and output
    • 10 min reserve for questions

Prerequisites

  • command line

Resources

  • ready-made executable examples
  • two or three tested links with example output
  • slides [LINK]

Watch for