0.5

@cldellow cldellow released this 22 Jan 18:35
· 8 commits to master since this release
  • feature: generic support for extracting JSON-LD data
  • feature: specific support for extracting JSON-LD Product data
  • feature: add discover-allow to specify an allowlist of patterns to crawl
  • enhancement: seed-sitemaps only activates for seeds that are at the top level of the domain
  • enhancement: extract_from_response can delete existing entries
  • enhancement: extract_from_response can add indexed entries with @ sigil
  • enhancement: extract_from_response skips writes that wouldn't change the database
  • enhancement: prune pages that exceed max depth/max page limit earlier
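
For readers unfamiliar with JSON-LD extraction, the following is a minimal sketch of the general technique, not this project's implementation. It assumes the structured data is embedded in `<script type="application/ld+json">` tags and that Product entries appear at the top level with a plain string `@type`. The names `JsonLdParser` and `extract_products` are hypothetical and exist only for this illustration.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._chunks = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        # Script content may arrive in several pieces, so buffer it.
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                parsed = json.loads("".join(self._chunks))
            except json.JSONDecodeError:
                return  # skip malformed blocks rather than failing the page
            # A block may hold a single object or a list of objects.
            self.items.extend(parsed if isinstance(parsed, list) else [parsed])


def extract_products(html):
    """Return top-level JSON-LD objects whose @type is exactly "Product"."""
    parser = JsonLdParser()
    parser.feed(html)
    parser.close()
    return [
        item
        for item in parser.items
        if isinstance(item, dict) and item.get("@type") == "Product"
    ]
```

Real pages often nest products inside an `@graph` array or give `@type` as a list; this sketch deliberately ignores those cases to stay short.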