Accord-Project/zotero-bibliography

Zotero Bibliography

Introduction

The ACCORD project uses a shared Zotero bibliography, "Semantic BIM".

The Gdoc "Zotero Productivity # Tag Normalization" describes ideas on how to use Zotero and additional goodies, and should become an agreed "local guide" on how we use it.

This project includes Zotero exports (in various formats) and other files for processing Zotero data. In particular, we have these ideas:

  • Normalize Tags
  • Map tags to Wikidata (WD) and OpenAlex topics. However, WD is missing even our root topic "automated compliance checking", so we'll need to build it up there
  • Build a Mind Map of the domain, perhaps using Zotero export to "The Brain"
  • Build a comprehensive Citation Network
  • Build a semantic Knowledge Graph

Zotero Export Formats

Intro

  • Install zotero-better-bibtex, which doubles the number of output formats
  • Right-click on the collection "Semantic BIM"
  • Export
  • Try out each one of the formats
    • Select UTF-8
    • Select maximum options (notes etc)
  • Keep Tools > Developer > Error Console open so you can record errors
    • "Clear" this often if the list gets too large

For RDF formats, convert them to Turtle for easier reading and validation, eg Zotero RDF was invalid 9 years ago (see aurimasv/zotero-import-export-formats#4):

riot --formatted=turtle "Zotero.rdf" > "Zotero RDF.ttl"
11:08:17 ERROR riot            :: [line: 1380, col: 59] {E205} rdf:resource is not allowed as an element tag here.
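
When many exports are validated, it can help to collect the riot messages in a structured form. The following is a minimal sketch that parses a log line shaped like the one quoted above. The regular expression is modelled on that single line only, so other riot messages may not match; the names `RiotMessage` and `parse_riot_line` are made up for this illustration.

```python
import re
from typing import NamedTuple, Optional


class RiotMessage(NamedTuple):
    level: str
    line: int
    col: int
    code: str
    text: str


# Assumption: the layout follows the one log line quoted in this README.
_RIOT_RE = re.compile(
    r"(?P<level>ERROR|WARN)\s+riot\s+::\s+"
    r"\[line:\s*(?P<line>\d+),\s*col:\s*(?P<col>\d+)\s*\]\s+"
    r"(?:\{(?P<code>[A-Z]\d+)\}\s+)?(?P<text>.*)"
)


def parse_riot_line(log_line: str) -> Optional[RiotMessage]:
    """Return the parsed message, or None if the line has another shape."""
    m = _RIOT_RE.search(log_line)
    if m is None:
        return None
    return RiotMessage(
        m["level"], int(m["line"]), int(m["col"]), m["code"] or "", m["text"].strip()
    )


sample = (
    "11:08:17 ERROR riot            :: [line: 1380, col: 59] "
    "{E205} rdf:resource is not allowed as an element tag here."
)
msg = parse_riot_line(sample)
print(msg.line, msg.col, msg.code)  # 1380 59 E205
```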

TODO:

BibLatex

biblatex.bib

  • human-readable
  • exports abstract, extra (note), tags (keywords), notes (annotation)
  • generates good citation keys
  • handles special chars in tags
  • needlessly quotes uppercase words {...} (but tools take care of this)
	keywords = {@read, {GraphQL}, {GraphSPARQL}, knowledge graphs, linked data, {SPARQL}},
	annotation = {How does this compare to Ontotext Semantic Objects {GraphQL}?
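
The protective braces are easy to remove when tags are read back from the export. This is a minimal sketch, assuming the keywords value is a comma-separated list and that braces wrap a whole tag (as in {GraphQL} or {{BIM}}); tags that contain commas, or braces around only part of a tag, are not handled. The function name `split_keywords` is hypothetical.

```python
def split_keywords(field_value: str) -> list[str]:
    """Split a keywords value and drop braces that wrap a whole tag."""
    tags = []
    for raw in field_value.split(","):
        tag = raw.strip()
        # Peel off one or more brace layers around the whole tag, eg {{BIM}}
        while len(tag) >= 2 and tag.startswith("{") and tag.endswith("}"):
            tag = tag[1:-1].strip()
        if tag:
            tags.append(tag)
    return tags


keywords = "@read, {GraphQL}, {GraphSPARQL}, knowledge graphs, linked data, {SPARQL}"
print(split_keywords(keywords))
# ['@read', 'GraphQL', 'GraphSPARQL', 'knowledge graphs', 'linked data', 'SPARQL']
```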

Missing:

  • collection
  • year, month (but "date" is there)
  • date added, modified
  • TODO

Better BibLatex

better-biblatex.bib

The field ordering differs from BibLatex, so the two exports are not easy to compare.

  • Better and more flexible citekey generation
  • Excessive quoting of capitalized words, eg {{BIM}}
  • TODO

BibTex

TODO

Should be pretty similar to BibLatex, but maybe doesn't support UTF-8 as well?

Better BibTex

TODO

Better BibTex JSON

better-bibtex-json.json

Good JSON that mimics the Bibtex fields

Zotero RDF

zotero.rdf, zotero.ttl

  • Uses relative URLs but doesn't define @base
  • Tags are in dc:subject:
    • manually added as string: "hybrid" , "RDF"
    • ingested with metadata as node: [rdf:type z:AutomaticTag; rdf:value "Data interoperability"]
  • dc:date is not normalized, comes as entered (although Zotero recognizes the d-m-y parts)
  • 56% of items also have dcterms:dateSubmitted (eg "2021-05-09 19:57:03", not quite XSD format). This is the creation date, but it may be present only for items that were modified after being added. Also, the time offsets between what I see in Zotero and what is recorded are inconsistent.
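
The dcterms:dateSubmitted values differ from the xsd:dateTime lexical form only by the space between date and time. A minimal sketch of the conversion, assuming every value has the "YYYY-MM-DD HH:MM:SS" shape shown above (the function name `to_xsd_datetime` is hypothetical). No timezone is attached, because the offsets are reported as inconsistent and the true zone is unknown.

```python
from datetime import datetime


def to_xsd_datetime(value: str) -> str:
    """Turn 'YYYY-MM-DD HH:MM:SS' into the xsd:dateTime lexical form."""
    parsed = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    # Deliberately left without a timezone: the real offset is not known.
    return parsed.isoformat()


print(to_xsd_datetime("2021-05-09 19:57:03"))  # 2021-05-09T19:57:03
```

Values that do not match the assumed shape raise ValueError, which makes unexpected date formats visible instead of silently passing them through.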

Bibliontology RDF

bibliontology.ttl, bibliontology.rdf

These are partial files due to this error:

[JavaScript Error: "item.creators is undefined" {file: "chrome://zotero/content/xpcom/translation/translate_firefox.js line 425 > eval" line: 1097}]

The only item without a creator is a standalone note (which is just text and has no metadata).
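
One way to spot such items before exporting is to look for records that have no creators at all. The sketch below works on a plain list of dictionaries; the keys `itemType`, `title` and `creators` and the sample items are assumptions for illustration, not the structure used by the Zotero translator that raised the error.

```python
def split_by_creators(items):
    """Separate items that have creators from those that do not."""
    with_creators, without_creators = [], []
    for item in items:
        # .get() tolerates a missing key, which is the case that failed above
        if item.get("creators"):
            with_creators.append(item)
        else:
            without_creators.append(item)
    return with_creators, without_creators


# Invented sample data
library = [
    {"itemType": "journalArticle", "title": "Paper A", "creators": ["Doe, Jane"]},
    {"itemType": "note", "title": "Standalone note"},
]
ok, suspicious = split_by_creators(library)
print([item["title"] for item in suspicious])  # ['Standalone note']
```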

COiNS

coins.html

Metadata embedded in HTML spans. COinS is an abbreviation for "ContextObjects in Spans". It is basically an HTML span tag with some metadata inside it. One of the use cases is embedding bibliographic citations in web pages.
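
To read the metadata back, the spans can be parsed with the Python standard library. This is a minimal sketch, assuming the usual COinS layout: a span with class "Z3988" whose title attribute holds URL-encoded key/value pairs. The sample span is invented and is not taken from coins.html; `CoinsParser` is a hypothetical name.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs


class CoinsParser(HTMLParser):
    """Collect the key/value pairs of every COinS span in a page."""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if tag == "span" and "Z3988" in classes and attr.get("title"):
            # HTMLParser has already unescaped &amp; in the attribute value
            self.records.append(parse_qs(attr["title"]))


# Invented sample span
html = (
    '<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft.genre=article'
    '&amp;rft.atitle=Example%20title&amp;rft.au=Doe%2C%20Jane"></span>'
)
parser = CoinsParser()
parser.feed(html)
print(parser.records[0]["rft.atitle"])  # ['Example title']
```

`parse_qs` returns a list per key, which suits repeated fields such as multiple authors.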

Collected Notes

collected-notes.html, collected-notes.md

  • Dump of library content:
    • Each collection is a section
    • Each item is a line, followed by
    • All notes about the item, including images
  • If an item is in 2 collections, it is printed twice, with all its notes

Citation Graph

citation-graph.dot

Graphviz dot source to generate a citation graph.

  • But because we don't have any "related" links, there's no graph
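
To illustrate why the export is empty of edges, here is a minimal sketch that writes Graphviz dot text from a set of items and a list of "related" links. The function name `to_dot`, the item keys and the labels are made up for this example; the real exporter may produce different dot source.

```python
def to_dot(items, related):
    """items: {key: label}; related: iterable of (from_key, to_key) pairs."""
    lines = ["digraph citations {"]
    for key, label in items.items():
        escaped = label.replace('"', '\\"')
        lines.append(f'  "{key}" [label="{escaped}"];')
    for src, dst in related:
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


# Invented sample data
items = {"a": "Paper A", "b": "Paper B"}
no_links = to_dot(items, [])                 # nodes only, no edges
with_link = to_dot(items, [("a", "b")])      # one edge a -> b
print(with_link)
```

With an empty `related` list the output contains only isolated nodes, which matches the situation described above.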

Better CSL YAML

better-csl-yaml.yaml

Citation Key

better-citation-key.txt

A long \cite{} with the keys of all selected items
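
The same kind of string can be built from any list of citation keys. A minimal sketch, with invented keys and a hypothetical function name `cite_command`:

```python
def cite_command(keys):
    """Build one \\cite{...} command from a list of citation keys."""
    # Drop duplicates but keep the first-seen order
    unique = list(dict.fromkeys(keys))
    return "\\cite{" + ",".join(unique) + "}"


print(cite_command(["doe2021", "smith2019", "doe2021"]))  # \cite{doe2021,smith2019}
```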

Export formats comparison

The results are based on comparing 4 item types:

  • Journal article
  • Video recording
  • Conference paper
  • Blog post

Result table

| Export format | Missing values | Is a candidate | Additional notes |
|---|---|---|---|
| CSV | | | Looks fine; the only concern is the CSV format itself |
| bibliontology.rdf | Item type, Publication year, Date modified, Access date | | Looks like all "important" data are here |
| better-bibtex-json.json | Publication year, Short title | | There might be more tags than in the CSV file for some reason. A few values are missing, but nothing critical |
| zotero.rdf | Key, Publication year, Date modified | | Looks good overall. That's an RDF format |
| MODS (XML data format) | Key, Publication year, Date modified, Running time, Access date | | A few values are missing in comparison to CSV |
| endnote.xml | Key, Date added, Date modified, Access date | | Looks fine: the XML format fits us and only a few dates are missing |
| better-csl.json | Key, Publication year, Access date, Manual tags, Automatic tags | | Tags are missing, but this information looks important |
| ref-works-tagged.txt | Key, Date modified, Library catalog, Running time, Short title | | Unsatisfactory format |
| RIS | Key, Date, Date modified, Running time, Short title | | Unsatisfactory format |
| wikipedia-citation-templates.txt | Key, Item type, Abstract, Date added, Date modified, Library catalog, Manual tags, Automatic tags, Language | | Unsatisfactory format. A lot of missing data |
| wikidata-quick-statements.txt | Key, Item type, Publication year, Abstract, Date added, Date modified, Access date, Library catalog, Automatic tags, Running time, ISBN, Publisher, Conference name | | Unsatisfactory format. A lot of missing data |
| unqualified-dublin-core.rdf | Key, Publication year, Publication title, Abstract note, Date added, Date modified, Access date, Pages, Issue, Volume, Journal abbreviation, Language, Library catalog, Extra, Manual tags, Automatic tags | | A lot of missing data |
| TEI (XML data format) | Abstract note, Date, Date added, Date modified, Access date, Pages, Language, Library catalog, Extra, Manual tags, ISBN, DOI, Automatic tags, Conference name | | A lot of missing data |
| simple-evernote-export.enex | Key, Item type, Publication year, Author, Publication title, ISSN, ISBN, DOI, Url, Abstract note, Date, Date added, Date modified, Access date, Pages, Issue, Volume, Journal abbreviation, Language, Library catalog, Extra | | Unsatisfactory format. A lot of missing data |
| refer:biblx.txt | Key, Publication year, ISSN, DOI, Date added, Date modified, Access date, Library catalog, Extra, Running time, Conference name, Short title | | A lot of missing data |
| csl.json | Key, Publication year, Date added, Date modified, Short title, Language, Manual tags, Automatic tags | | A lot of missing data |
| collected-notes.html | No data when exporting video recording, conference paper, blog post. For journal article, only title, authors and date are exported | | A lot of missing data |
| citation-graph.dot | No data | | |
| bookmarks.html | Key, Item type, Publication year, Author, Publication title, ISSN, DOI, Abstract note, Date, Date added, Date modified, Access date, Pages, Issue, Volume, Journal abbreviation, Language, Library catalog, Extra, Running time, Short title | | A lot of missing data |
| bibtex.bib | Key, Item type, Date, Date added, Date modified, Access date, Library catalog, Running time, Conference name, Short title | | A lot of missing data |
| biblatex.bib | Key, Item type, Publication year, Date added, Date modified, Access date, Library catalog, Running time, Short title | | A lot of missing data |
| better-csl-yaml.yaml | Key, Publication year, Date, Date added, Date modified, Short title, Language, Manual tags, Automatic tags | | Might be a candidate if we don't care about the missing fields. From my point of view, tags might be important |
| better-bibtex-citation-key-quick-copy.txt | No data | | |
| better-bibtex.bib | Key, Publication year, Title, Url, Date added, Date modified, Access date, Library catalog, Running time, Conference name, Short title | | A lot of missing data |
| better-biblatex.bib | Key, Item type, Publication year, Date, Date modified, Library catalog, Running time, Short title | | Might be a candidate, but there are other formats with less missing data |
| coins.html | Key, Item type, Publication year, Url, Abstract, Date added, Date modified, Access date, Library catalog, Extra, Manual tags, Automatic tags, Running time | | Unsatisfactory format. A lot of missing data |
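
A "Missing values" column of this kind can be derived mechanically by treating one export as the baseline and taking a set difference per format. The sketch below assumes the field names of each export have already been extracted into lists; the function name `missing_fields`, the format names and the field lists are invented and do not reproduce the results in the table.

```python
def missing_fields(baseline, exports):
    """Map each export name to the baseline fields it lacks, sorted by name."""
    base = set(baseline)
    return {name: sorted(base - set(fields)) for name, fields in exports.items()}


# Invented sample data, not the actual comparison
baseline = ["Key", "Item type", "Title", "Publication year", "Short title"]
exports = {
    "format-a": ["Key", "Item type", "Title", "Publication year", "Short title"],
    "format-b": ["Item type", "Title"],
}
report = missing_fields(baseline, exports)
print(report["format-b"])  # ['Key', 'Publication year', 'Short title']
```

Fields that an export adds beyond the baseline are ignored here; a second difference in the other direction would surface them.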

Normalize Tags

See tag_normalization/README.md.
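
As a rough illustration of what tag normalization can involve, the sketch below groups tags that differ only in letter case or spacing. This is an assumption about one possible rule, not the procedure described in tag_normalization/README.md; the function name `group_tag_variants` and the sample tags are made up.

```python
from collections import defaultdict


def group_tag_variants(tags):
    """Group tags that are equal after case folding and whitespace cleanup."""
    groups = defaultdict(set)
    for tag in tags:
        key = " ".join(tag.split()).casefold()
        groups[key].add(tag)
    # Keep only the keys that really have more than one spelling
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


# Invented sample tags
variants = group_tag_variants(["Linked Data", "linked data", "RDF", "linked  data"])
print(variants)
```

Each group can then be reviewed by hand to pick the preferred spelling before tags are renamed in Zotero.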

About

Shared Zotero bibliography
