EASY-2209 Upgrade to v4.0 of license for GDPR
* new version of the depositor agreement
* uses only few emd-fields
* no longer needs a human readable version for the audience
* no longer needs fsrdb
* the font has changed, otherwise printing the document failed
* added easy-licenses resources for appendix3
jo-pol authored and janvanmansum committed Nov 29, 2019
1 parent 039e3f8 commit 24c706f
Showing 37 changed files with 879 additions and 1,300 deletions.
70 changes: 18 additions & 52 deletions Functional requirements.md
@@ -15,7 +15,7 @@ and an `OutputStream` to which the output is written as its arguments and return

## New design
The new design of `easy-deposit-agreement-creator` consists of (1) an API which can be called from within other
modules that are dependents of this module and (2) a command line tool. In case the latter is used,
modules which depend on this module and (2) a command line tool. In case the latter is used,
we assume that the dataset as well as the depositor data are already present in EASY. Again notice
that this command line tool is not intended to ingest the newly generated deposit agreement into EASY!

@@ -24,7 +24,7 @@ The input and output of both parts of `easy-deposit-agreement-creator` are as fo
* *input (via API call):* either one of
* dataset identifier, `OutputStream` - *used in modification tools*
* EMD object, depositor object, `OutputStream` - *used in the business-layer*
* EMD object, depositor identifier, list of file metadata objects, `OutputStream` - *used in Easy-Stage-Dataset*
* EMD object, depositor identifier, an obsolete list of file metadata objects, `OutputStream` - *used in Easy-Stage-Dataset*
* dataset object, `OutputStream`
* *output (via command line):* pdf document with the deposit agreement
* *output (via API call):* `Unit`
@@ -40,29 +40,23 @@ The input and output of both parts of `easy-deposit-agreement-creator` are as fo
* `easy-deposit-agreement-creator` will *not* send emails to depositors whose datasets are modified or newly ingested.
* No data is written to the databases; this module only reads data!

### Additions relative to the former version
* In the list of files in the dataset the access category for each file needs to be included.
* An explanation of the distinct access categories from the previous item.

## Resources
The deposit agreement is generated from a series of template files with placeholders. Using the resources listed
below this module can resolve these placeholders and transform the whole text into a pdf.

### Template files
* `Appendix.html` - an appendix with more information about the CC0 access category
* `styles.css` - styling for the various elements
* `Agreement.html` - the main template with the content for the footer and parsing the other templates
* `Body.html` - the main content of the deposit agreement text
* `Appendix1.html` - an appendix with the chosen access rights, license and an optional embargo statement
* `Appendix2.html` - an appendix with the DANS license, which may or may not be applicable, as explained in `Body.html`
* `dans_logo.png` - the DANS logo to be displayed in the header of each page
* `Embargo.html` - an optional text in case the dataset is under embargo
* `FileTable.html` - an overview of all the files in the dataset, showing the file path, checksum and access category
* `Agreement.html` - the main file in which all the other html files are merged together
* `agreement_version.txt` - the version of the agreement to be displayed in the footer of each page of the document
* `DriveByData.png` - background in the footer
* `MetadataTerms.properties` - a mapping between terms from the metadata and the text to be displayed in the agreement
* `Table.html` - an overview of the metadata of the dataset

### Data resources
* *Fedora* - metadata of the dataset is stored in Fedora. The EMD datastream dissemination contains the metadata of the dataset itself, the AMD datastream dissemination contains the depositor identifier, the EASY_FILE datastream and EASY_FILE_METADATA datastream dissemination contain the data of the files in the dataset.
* *Fedora* - metadata of the dataset is stored in Fedora. The EMD datastream dissemination contains the metadata of the dataset itself, the AMD datastream dissemination contains the depositor identifier.
* *LDAP* - the depositor data required for the agreement is stored in LDAP.
* *RiSearch* - this is part of Fedora and provides the relation between the dataset and the files.

### Required data in the template
Besides the dataset's metadata and the list of files contained in the dataset, several other values
@@ -71,14 +65,13 @@ are required in the creation of the deposit agreement.
| Data | Used in | Stored in |
|------|---------|-----------|
| Dataset - identifier | all occasions where a query for (a part of) the dataset in Fedora is required | application parameter |
| Dataset - DANS managed DOI | template `Body.html` | `emd:identifier // dc:identifier` |
| Dataset - encoded DANS managed DOI | template `Body.html`, see the link on the managed DOI above | `let id = emd:identifier // dc:identifier in (id@eas:identification-system ++ "/" ++ id.value)` |
| Dataset - date submitted | template `Body.html` | `emd:date // eas:dateSubmitted` |
| Dataset - preferred title | template `Body.html` | `emd:title // dc:title` |
| Dataset - access category | template `Agreement.html` | `emd:rights // dct:accessRights` or `dc:rights` (*these are always the same, only in different schemas. Therefore we can always use the value from EMD to minimize the number of Fedora queries*)|
| Dataset - is under embargo | code `LicenseComposer.java:193` | to be calculated based on the current date and `Dataset - date available` below |
| Dataset - date available | template `Embargo.html` | `emd:date // eas:available` |
| Current time | template `Tail.html`, this is the timestamp of creating the deposit agreement: `new org.joda.time.DateTime().toString("YYYY-MM-dd HH:mm:ss")` | calculated at runtime |
| Dataset - DANS managed DOI | template `Header.html` | `emd:identifier // dc:identifier` |
| Dataset - encoded DANS managed DOI | template `Header.html`, see the link on the managed DOI above | `let id = emd:identifier // dc:identifier in (id@eas:identification-system ++ "/" ++ id.value)` |
| Dataset - date submitted | template `Header.html` | `emd:date // eas:dateSubmitted` |
| Dataset - preferred title | template `Header.html` | `emd:title // dc:title` |
| Dataset - open access | template `Appendix1.html` | `emd:rights // dct:accessRights` or `dc:rights` <br> (*these are always the same, only in different schemas. Therefore we can always use the value from EMD to minimize the number of Fedora queries*)|
| Dataset - is under embargo | template `Appendix1.html` | to be calculated based on the current date and `Dataset - date available` below |
| Dataset - date available | template `Appendix1.html` | `emd:date // eas:available` |
| EasyUser - displayName | template `Body.html` | LDAP user database - `(givenName <> initials)? + dansPrefixes? + sn?` |
| EasyUser - organization | template `Body.html` | LDAP user database - `o` |
| EasyUser - address | template `Body.html` | LDAP user database - `postalAddress` |
@@ -88,32 +81,6 @@ are required in the creation of the deposit agreement.
| EasyUser - telephone | template `Body.html` | LDAP user database - `telephoneNumber` |
| EasyUser - email | template `Body.html` | LDAP user database - `mail` |
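
The table derives `Dataset - is under embargo` from the current date and `Dataset - date available`. A minimal sketch of that check in Python, assuming a dataset counts as under embargo while its available date still lies in the future. The function name `is_under_embargo` and the use of plain dates are illustrative assumptions, not taken from the module itself:

```python
from datetime import date
from typing import Optional


def is_under_embargo(date_available: date, today: Optional[date] = None) -> bool:
    """Return True while the dataset's available date is still in the future.

    `today` can be passed explicitly to make the check reproducible;
    it defaults to the current date.
    """
    if today is None:
        today = date.today()
    return date_available > today


if __name__ == "__main__":
    # Hypothetical dataset that becomes available on 1 June 2020.
    print(is_under_embargo(date(2020, 6, 1), today=date(2019, 11, 29)))
```

Passing `today` explicitly keeps the calculation testable; whether the real code treats the available date itself as still embargoed is not stated in this document.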

### Displaying the dataset metadata
* For each term in the metadata the *qualified name* is calculated ([namespace].[name]) and mapped to the corresponding value in `MetadataTerms.properties`.
* If the term equals **AUDIENCE**, all associated *discipline identifiers* are queried from Fedora and displayed as a comma-separated `String`.
* If the term equals **ACCESSRIGHTS**, it is mapped to the corresponding string representation (see below).
* For all other terms the values are displayed as a comma-separated `String`.
* Every term corresponds to one row in the table.

### Displaying the files in the dataset
* All files contained in the dataset are retrieved from Fedora using `RiSearch`.
* For each file the SHA1-hash is queried.
* Each file (path and hash) corresponds to one row in the table.
* In case the dataset does not contain any files, the text "*No uploaded files*" is added instead of the table.
* In case the SHA1-hash of a file is not calculated, the alternative text "*------------- not-calculated -------------*" is used

### Mapping of access categories
| Access Category | Agreement snippet | String representation |
|-----------------|-----------------|-----------------------|
| ANONYMOUS_ACCESS | OpenAccess.html | `"Anonymous"` |
| OPEN_ACCESS | OpenAccess.html | `"Open Access"` |
| OPEN_ACCESS_FOR_REGISTERED_USERS | OpenAccessForRegisteredUsers.html | `"Open access for registered users"` |
| GROUP_ACCESS | RestrictGroup.html | `"Restricted -'archaeology' group"` |
| REQUEST_PERMISSION | RestrictRequest.html | `"Restricted -request permission"` |
| ACCESS_ELSEWHERE | OtherAccess.html | `"Elsewhere"` |
| NO_ACCESS | OtherAccess.html | `"Other"` |
| FREELY_AVAILABLE | OpenAccess.html | `"Open Access"` |

## Page layout
* The document has an A4 page size and the following margins (top-right-bottom-left): 2.5cm 1.5cm 2cm 1.5cm
* Every page has a header with the DANS logo
@@ -134,12 +101,11 @@ resolved. This only happens when the property `runtime.references.strict = true`
Velocity properties file. Besides that Velocity requires the path to the resources to be set using
the property `file.resource.loader.path`. As an extra parameter we added `template.file.name`,
holding the name of the file to be resolved by Velocity. This file is supposed to be present inside
the `file.resource.loader.path` folder. All these parameters are set in the
`velocity-engine.properties` file in `src/main/assembly/dist/res/`.
the `file.resource.loader.path` folder. All these parameters are [hard coded](https://github.com/DANS-KNAW/easy-deposit-agreement-creator/blob/230a1e1ffaf24213f71277c1de1dbb9cd08daf96/src/main/scala/nl/knaw/dans/easy/agreement/internal/package.scala#L43-L51).
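
The effect of strict reference resolution can be illustrated outside Velocity. The sketch below uses Python's standard `string.Template`, not Velocity, as an analogy: `substitute` raises a `KeyError` when a `$Placeholder` has no value, much as Velocity fails on an unresolved reference when `runtime.references.strict = true`. The helper name `resolve` and the sample text are assumptions for illustration only:

```python
from string import Template


def resolve(template_text: str, values: dict) -> str:
    """Fill every $Placeholder in the text.

    Raises KeyError if a placeholder has no value, which mirrors the
    fail-fast behaviour of strict reference resolution.
    """
    return Template(template_text).substitute(values)


if __name__ == "__main__":
    text = "This agreement has been accepted on $DateSubmitted."
    print(resolve(text, {"DateSubmitted": "2019-11-29"}))
    try:
        resolve(text, {})  # no value supplied for $DateSubmitted
    except KeyError as missing:
        print("unresolved placeholder:", missing)
```

Failing fast here prevents a document with a literal, unresolved placeholder from being turned into a pdf.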

### WeasyPrint
The transformation from html to pdf is done using the WeasyPrint command line tool. This tool is
installed on the servers (*deasy*, *teasy* and *easy01*). For running it locally (during
development) we recommend using the `localrun.sh` script. Indicate this in the
installed on the servers (*deasy*, *teasy* and *easy11*). For running it locally (during
development) we recommend using the `src/main/assembly/dist/res/pdfgen.sh` script. Indicate this in the
`application.properties` located in `src/main/assembly/dist/cfg/` and fill in the `...`
placeholders in the script.
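
For a quick local check that WeasyPrint itself works, the command line tool can also be invoked directly. The following Python sketch assumes that `weasyprint` is on the `PATH` and that its basic `weasyprint <input> <output>` form is used. The function names and file names are hypothetical, and the sketch does not reproduce what `pdfgen.sh` does:

```python
import subprocess
from typing import List


def weasyprint_command(html_path: str, pdf_path: str) -> List[str]:
    """Build the argument list for the basic WeasyPrint invocation."""
    return ["weasyprint", html_path, pdf_path]


def html_to_pdf(html_path: str, pdf_path: str) -> None:
    """Run WeasyPrint; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(weasyprint_command(html_path, pdf_path), check=True)


if __name__ == "__main__":
    # Only prints the command; call html_to_pdf(...) to actually run it.
    print(" ".join(weasyprint_command("agreement.html", "agreement.pdf")))
```

Keeping the command construction separate from its execution makes it possible to inspect or test the command without WeasyPrint being installed.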
1 change: 1 addition & 0 deletions debug-init-env.sh
@@ -20,3 +20,4 @@ DATADIR=data

touch $DATADIR/easy-deposit-agreement-creator.log
cp src/test/resources/debug-config/pdfgen.sh $HOMEDIR/res/
cp -r target/easy-licenses/licenses $HOMEDIR/res/licenses
Empty file added docs/01_manual.md
Empty file.
4 changes: 4 additions & 0 deletions docs/index.md
@@ -70,6 +70,10 @@ yum install redhat-rpm-config python-devel python-pip python-lxml cairo pango gd

After this, `weasyprint --help` is supposed to show the appropriate help page.

Python may, however, complain about `unknown locale`. If it does, add the following to your profile:

    export LC_ALL=en_US.UTF-8
    export LANG=en_US.UTF-8

BUILDING FROM SOURCE
--------------------
46 changes: 46 additions & 0 deletions pom.xml
@@ -37,6 +37,7 @@
<properties>
<main-class>nl.knaw.dans.easy.agreement.Command</main-class>
<easy.emd.version>3.8.0</easy.emd.version>
<easy-licenses.version>1.0.3</easy-licenses.version>
</properties>

<scm>
@@ -175,6 +176,33 @@
</repository>
</repositories>

<build>
<plugins>
<plugin>
<artifactId>maven-dependency-plugin</artifactId>
<executions>
<execution>
<id>resources</id>
<phase>generate-resources</phase>
<goals>
<goal>unpack</goal>
</goals>
<configuration>
<artifactItems>
<artifactItem>
<groupId>nl.knaw.dans.easy</groupId>
<artifactId>easy-licenses</artifactId>
<version>${easy-licenses.version}</version>
<outputDirectory>${project.build.directory}/easy-licenses</outputDirectory>
</artifactItem>
</artifactItems>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>

<profiles>
<profile>
<id>rpm</id>
@@ -219,6 +247,24 @@
</source>
</sources>
</mapping>
<mapping>
<directory>/var/opt/${dans-provider-name}/resource/${project.artifactId}/template/licenses</directory>
<configuration>${rpm-replace-configuration}</configuration>
<sources>
<source>
<location>target/easy-licenses/licenses</location>
</source>
</sources>
</mapping>
<mapping>
<directory>/opt/${dans-provider-name}/${project.artifactId}/res/template/licenses</directory>
<configuration>${rpm-replace-configuration}</configuration>
<sources>
<source>
<location>target/easy-licenses/licenses</location>
</source>
</sources>
</mapping>
</mappings>
</configuration>
</plugin>
4 changes: 0 additions & 4 deletions src/main/assembly/dist/cfg/application.properties
@@ -1,12 +1,8 @@
fcrepo.url=http://localhost:8080/fedora
fcrepo.user=fedoraAdmin
fcrepo.password=changeme
fsrdb.db-connection-url=jdbc:postgresql://localhost:5432/easy_db
fsrdb.db-connection-username=easy_webui
fsrdb.db-connection-password=changeme
auth.ldap.url=ldap://localhost
auth.ldap.user=cn=ldapadmin,dc=dans,dc=knaw,dc=nl
auth.ldap.password=changeme
agreement.resources=/var/opt/dans.knaw.nl/resource/easy-deposit-agreement-creator
agreement.fileLimit=500
daemon.http.port=20130
Binary file added src/main/assembly/dist/res/DrivenByData.png
1 change: 0 additions & 1 deletion src/main/assembly/dist/res/agreement_version.txt

This file was deleted.

Binary file modified src/main/assembly/dist/res/dans_logo.png
119 changes: 30 additions & 89 deletions src/main/assembly/dist/res/template/Agreement.html
@@ -2,57 +2,7 @@
<html lang="en">
<head>
<style>
@page {
margin: 2.5cm 1.5cm 2cm 1.5cm;
size: A4;
@top-left {
content: "";
width: 100%;
height: 100%;
background: url(data:image/jpg;base64,$DansLogo) no-repeat 0 0;
background-position: left center;
background-size: 100px 30px;
}
@bottom-center {
content: "$FooterText - page " counter(page) "/" counter(pages);
font: 14px "Bitstream Charter";
}
}
body {
font: 14px "Bitstream Charter";
text-align: justify;
}
ol li ul {
list-style-type: disc;
}
ol {
list-style-type: lower-alpha;
}
table {
border-collapse: collapse;
border-spacing: 0;
display: table;
border: 1px solid black;
margin-left: auto;
margin-right: auto;
}
tr {
border-bottom: 1px solid black;
}
table td, table th {
padding: 6px 8px;
display: table-cell;
vertical-align: middle;
border-right: 1px solid black;
}
.error {
color: red;
font-size: 150%;
font-weight: bold;
}
.inline-header {
font-weight: bold;
}
#parse("style.css")
</style>
</head>
<body>
@@ -64,48 +14,39 @@
#end

#parse("Body.html")

#if ($OpenAccess)
<p class="inline-header">[Open Access: unlimited access without registration of user registration]</p>
<p>The Depositor agrees to the dataset being made available in accordance with the conditions of the Creative Commons Zero Waiver, the CC0 1.0 Universal Public Domain Dedication (Appendix 1). In doing so, the Depositor renounces all possible rights relating to the dataset.</p>
#elseif ($OpenAccessForRegisteredUsers)
<p class="inline-header">[Open Access for Registered Users: unlimited access for registered users]</p>
<p>The Repository is permitted to make the dataset available to all persons, legal entities and organisations registered with the Repository.</p>
#elseif ($OtherAccess)
<p class="inline-header">[Other Access: the data are not available via EASY]</p>
<p>The dataset will be made available by means of another method to be agreed with the Repository.</p>
#elseif ($RestrictGroup)
<p class="inline-header">[Restricted Access: access restricted to registered persons or group members, N.B. only for archeology]</p>
<p>The Depositor may grant access permission in advance for persons, legal entities and organisations that belong to one of the user groups specified by DANS and/or the Depositor.</p>
#elseif ($RestrictRequest)
<p class="inline-header">[Restricted Access: access with the permission of the Repository]</p>
<p>The Repository is permitted to make the dataset available to persons, legal entities and organisations registered with the Depositor only after receiving express permission from the Depositor.<p>
#else
<!-- default access right
never expect to be in this case, but I can't find an error function in Velocity
so I use a error message in the pdf itself -->
<p class="error">NO VALID VALUE FOR THE ACCESS CATEGORY FOUND!!!</p>
#end

#if ($UnderEmbargo)
<p class="inline-header">You have additionally chosen:</p>
<p class="inline-header">[Temporary restriction: Embargo]; only possible if Open Access, Open Access for Registered Users or Restricted Access has been chosen</p>
<p>The dataset will be temporarily unavailable until $DateAvailable, commencing on the date of publication. The embargo period cannot be longer than two years and cannot be extended. When this period elapses, one of the special provisions set out above shall automatically apply. An extension of this period is only possible in consultation with the Depositor.</p>
#if (! $IsSample)
<p>This agreement has been accepted by both parties on $DateSubmitted upon completion of the deposit process via easy.dans.knaw.nl.</p>
#end

<p class="inline-header">The Depositor hereby agrees to the above provisions and the general code(s) of conduct referred to in this document.</p>

#parse("Table.html")

#if ($HasFiles)
<br/>
#parse("FileTable.html")
#else
<p>No uploaded files.</p>
#parse("Appendix1.html")
#parse("Appendix2.html")
#if ($OpenAccess)
<h1 style="page-break-before: always">Appendix 3 Legal text of chosen public-domain statement or Open Access Licence</h1>
<pre>
#parse($Appendix3)
</pre>
#end

<br/>
#parse("Appendix.html")
<footer>
<p>
Deposit agreement
<br>
Version 26-09-2019
</p>
<p>
DANS promotes sustainable access to digital research data.
See <a href="https://www.dans.knaw.nl">www.dans.knaw.nl</a> for more information.
</p>
<p>
<strong>Data Archiving and Networked Services (DANS)</strong>
<br>
Anna van Saksenlaan 51 | 2593 HW Den Haag
<br>
070 349 44 50 | [email protected] | www.dans.knaw.nl
<br>
CoC 54667089 | DANS is an institute of KNAW and NWO
</p>
</footer>

</body>
</html>