Releases: cldellow/datasette-scraper

0.5.3

04 Mar 22:11

Make the UI less daunting by hiding all tables except dss_crawl and improving links from dss_crawl to dss_job (#51)

0.5.2

01 Mar 03:33

Fix crawl status page when dss database is not the primary database (#49)

0.5.1

28 Feb 04:07

Support installing datasette-scraper into a database with pre-existing tables (#48)

0.5

22 Jan 18:35
  • feature: generic support for extracting JSON-LD data
  • feature: specific support for extracting JSON-LD Product data
  • feature: add discover-allow to specify an allowlist of patterns to crawl
  • enhancement: seed-sitemaps only activates for seeds that are at the top level of the domain
  • enhancement: extract_from_response can delete existing entries
  • enhancement: extract_from_response can add indexed entries with the @ sigil
  • enhancement: extract_from_response skips doing writes that wouldn't change the database
  • enhancement: prune pages that exceed max depth/max page limit earlier
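
To illustrate what extracting JSON-LD Product data from a page involves, here is a minimal sketch using only the Python standard library. It is not datasette-scraper's implementation: the names `JsonLdParser` and `extract_products` are hypothetical, and the sketch assumes the data sits in `<script type="application/ld+json">` tags with a plain string `@type` (it ignores `@graph` wrappers and list-valued `@type`).

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Hypothetical helper: collects parsed JSON-LD blocks from an HTML page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._chunks = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        # JSON-LD is conventionally embedded as <script type="application/ld+json">.
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        # Script content may arrive in several pieces, so buffer it.
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                parsed = json.loads("".join(self._chunks))
            except json.JSONDecodeError:
                return  # skip malformed blocks rather than failing the whole page
            # A block may hold one object or a list of objects.
            self.items.extend(parsed if isinstance(parsed, list) else [parsed])


def extract_products(html):
    """Return JSON-LD objects whose @type is exactly "Product" (a simplification)."""
    parser = JsonLdParser()
    parser.feed(html)
    parser.close()
    return [
        item
        for item in parser.items
        if isinstance(item, dict) and item.get("@type") == "Product"
    ]


if __name__ == "__main__":
    page = (
        '<script type="application/ld+json">'
        '{"@context": "https://schema.org", "@type": "Product", "name": "Widget"}'
        "</script>"
    )
    print([p["name"] for p in extract_products(page)])
```

A real crawler would likely also need to handle `@graph` containers and nested types; those are left out here to keep the sketch short.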

0.4

01 Jan 22:13
0.4

0.3

01 Jan 22:06
I don't know what I'm doing...

Perhaps we need to explicitly say which packages we publish? Perhaps
spending a few hours trying random things will save me a few minutes
learning a new package management system.

0.2

01 Jan 21:55
0.2

0.1

01 Jan 21:45
--verbose