v0.9.0

@jonatansteller jonatansteller released this 04 Nov 13:18
· 4 commits to main since this release
  • Full rewrite with a modular architecture
  • Any combination of Feed and FeedElement
  • Support for RDF (schema.org), XML (CMIF, LIDO), Beacon, ZIP ingest
  • Log but accept missing feed elements
  • Less memory hoarding with large datasets
  • Look-up routine for authority files
  • Single template to generate nfdicore/cto triples
  • Template adapted to current nfdicore/cto version
  • Automatically create ARK IDs for nfdicore/cto
  • Prep work for further serialisations such as DCAT
  • New command-line interface and argument parsing
  • A -quiet option prevents reporting intermediate progress
  • Provide optional OCI (Podman/Docker) container set-up
  • Observe rules laid out in robots.txt files
  • Recognise http and https namespaces in schema.org sources
  • Provide log files for scraping runs
  • Switch to httpx
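
Two of the items above, observing robots.txt rules and recognising both http and https schema.org namespaces, can be illustrated with a minimal sketch. This is not the project's actual code: the function names `is_allowed` and `normalise_schema_uri`, the `example-bot` user agent, and the example URLs are hypothetical, and only the Python standard library is used.

```python
from urllib.robotparser import RobotFileParser

# Both spellings of the schema.org namespace occur in the wild.
SCHEMA_HTTP = "http://schema.org/"
SCHEMA_HTTPS = "https://schema.org/"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against the text of an already downloaded robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def normalise_schema_uri(uri: str) -> str:
    """Map the http variant of a schema.org URI to the https variant."""
    if uri.startswith(SCHEMA_HTTP):
        return SCHEMA_HTTPS + uri[len(SCHEMA_HTTP):]
    return uri  # already https, or not a schema.org URI at all


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /private/\n"
    print(is_allowed(robots, "example-bot", "https://example.org/private/a"))
    print(is_allowed(robots, "example-bot", "https://example.org/public/a"))
    print(normalise_schema_uri("http://schema.org/Person"))
```

Normalising to a single namespace before comparing terms is one possible design choice; a scraper could equally well match against both variants without rewriting the URIs.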