# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: page lengths of novels in the DNB catalogue
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Robert
    family-names: Jäschke
    affiliation: Humboldt-Universität zu Berlin
    email: [email protected]
    orcid: 'https://orcid.org/0000-0003-3271-9653'
  - given-names: Frank
    family-names: Fischer
    affiliation: Freie Universität Berlin
    orcid: 'https://orcid.org/0000-0003-2419-6629'
    email: [email protected]
repository-code: 'https://github.com/weltliteratur/dnb'
abstract: >-
  We analyse the number of pages of novels (i.e., fictional
  literary works) in the German National Library (DNB).
  It is not trivial to extract all novels from a big
  catalogue like that of the German National Library.
  “Librarians estimate that genre information is present in
  the expected MARC field for less than a quarter of the
  volumes in HathiTrust Digital Library,” (Underwood et al.
  2013) and we encounter the same problem, which calls for
  an innovative solution.
  Our approach is to
  1. extract a list of writers from Wikidata together with
  their GND ID
  2. download linked data about the DNB books
  3. join the writer list with the list of books using the
  GND ID
  This repository documents the evolution of this process,
  which turned out to be not as straightforward as it seemed.
  One reason is the size of the data and the complexity of
  queries.
license: GPL-3.0
commit: e0a25ce6c35108dcb788c6eed1297e762d11317a
date-released: '2021-09-22'