mod-erm-usage-harvester

This software is distributed under the terms of the Apache License, Version 2.0. See the file "LICENSE" for more information.

Introduction

Module for harvesting counter reports.

Requirements

The module needs to know the Okapi URL (see "Setting the Okapi URL").
For scheduled harvesting you need to provide user credentials (see "Periodic harvesting").
Environment variables for database connectivity need to be provided.

Installation

$ git clone ...
$ cd mod-erm-usage-harvester
$ mvn clean install

Run plain jar

$ env OKAPI_URL=http://127.0.0.1:9130 java -jar \
  mod-erm-usage-harvester-bundle/target/mod-erm-usage-harvester-bundle-fat.jar

Run via Docker

Build docker image

$ docker build -t mod-erm-usage-harvester .

Run docker image

$ docker run -e OKAPI_URL=http://127.0.0.1:9130 -p 8081:8081 mod-erm-usage-harvester

Configuration

Listening port

The default listening port is 8081. To change it, pass the -Dhttp.port parameter when running the jar file, or map a different host port with the -p flag of docker run.

Setting the Okapi URL

Use the environment variable named OKAPI_URL to provide the URL to Okapi.

Proxy configuration

If you are running the plain jar, proxy settings are configured via JVM system properties:

http.proxyHost, http.proxyPort, https.proxyHost, https.proxyPort, http.nonProxyHosts

If you are running the Docker container, they are configured via environment variables:

HTTP_PROXY, HTTPS_PROXY, NO_PROXY

These get translated into JVM system properties by the base image.
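
The mapping between the two sets of names can be illustrated with a small sketch. This README does not show how the base image performs the translation, so the function below is only an assumed approximation: `proxy_env_to_jvm_flags` is a hypothetical helper, and the proxy host used in the usage note is made up.

```python
from urllib.parse import urlsplit


def proxy_env_to_jvm_flags(env):
    """Translate HTTP_PROXY / HTTPS_PROXY / NO_PROXY values into -D flags.

    Hypothetical helper: an approximation of the translation, not the
    base image's actual script.
    """
    flags = []
    for var, scheme in (("HTTP_PROXY", "http"), ("HTTPS_PROXY", "https")):
        value = env.get(var)
        if not value:
            continue
        # Accept both "http://host:port" and bare "host:port" values.
        parts = urlsplit(value if "://" in value else "http://" + value)
        flags.append(f"-D{scheme}.proxyHost={parts.hostname}")
        if parts.port is not None:
            flags.append(f"-D{scheme}.proxyPort={parts.port}")

    no_proxy = env.get("NO_PROXY")
    if no_proxy:
        # NO_PROXY is comma-separated; http.nonProxyHosts uses "|" and "*" wildcards.
        hosts = [h.strip() for h in no_proxy.split(",") if h.strip()]
        hosts = ["*" + h if h.startswith(".") else h for h in hosts]
        flags.append("-Dhttp.nonProxyHosts=" + "|".join(hosts))
    return flags
```

For example, `proxy_env_to_jvm_flags({"HTTP_PROXY": "http://proxy.example.org:3128"})` yields the `http.proxyHost` and `http.proxyPort` flags that would otherwise be passed by hand when running the plain jar.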

Quartz scheduler

Quartz configuration is located in quartz.properties. To use another file, set the system property org.quartz.properties to the path of that file. You can also set individual Quartz properties using system properties (e.g. -Dorg.quartz.threadPool.threadCount=8). The org.quartz.threadPool.threadCount property controls how many providers are harvested concurrently.
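
As a rough illustration of how these system properties end up on the command line, the sketch below assembles a `java` invocation. `build_java_command` is a hypothetical helper and the path `/etc/harvester/quartz.properties` is an assumed example location; only the property names and the jar path come from this README.

```python
def build_java_command(jar_path, system_properties):
    """Return an argv list with -D flags placed before -jar.

    JVM options must precede -jar; anything after the jar path is
    passed to the application instead of the JVM.
    """
    flags = [f"-D{key}={value}" for key, value in system_properties.items()]
    return ["java", *flags, "-jar", jar_path]


command = build_java_command(
    "mod-erm-usage-harvester-bundle/target/mod-erm-usage-harvester-bundle-fat.jar",
    {
        # Assumed example path to a custom Quartz configuration file.
        "org.quartz.properties": "/etc/harvester/quartz.properties",
        # Harvest up to eight providers concurrently.
        "org.quartz.threadPool.threadCount": 8,
    },
)
print(" ".join(command))
```

The resulting list can be handed to `subprocess.run` together with an environment that sets OKAPI_URL.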

Hazelcast

The default Quartz configuration uses the HazelcastJobStore for clustering, which relies on Hazelcast. By default the standard configuration shipped with Hazelcast is used. You can supply your own XML or YAML configuration through the hazelcast.config system property or by placing it in the working directory. If you are using clustering, make sure that member discovery is working by inspecting the logs. You might want to tailor the Hazelcast configuration to suit your particular deployment environment. The Hazelcast documentation describes the available discovery mechanisms.

Periodic harvesting

Periodic harvesting requires the module to log in using user credentials. These credentials are defined separately for each tenant via the environment variables {TENANT}_USER_NAME and {TENANT}_USER_PASS, where {TENANT} serves as a placeholder for the tenant ID and must be in uppercase. The user also needs the ermusageharvester.start-all.get permission.

Example for tenant 'diku':

DIKU_USER_NAME=mod-erm-usage-harvester
DIKU_USER_PASS=password123
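
The naming rule can be sketched as follows. This is an illustration of the convention described above, not the module's own lookup code; `tenant_credentials` is a hypothetical helper.

```python
import os


def tenant_credentials(tenant_id, env=os.environ):
    """Look up {TENANT}_USER_NAME / {TENANT}_USER_PASS for a tenant.

    The tenant ID is uppercased to form the variable prefix. Returns a
    (name, password) tuple, or None if either variable is missing.
    """
    prefix = tenant_id.upper()
    name = env.get(f"{prefix}_USER_NAME")
    password = env.get(f"{prefix}_USER_PASS")
    if name is None or password is None:
        return None
    return name, password
```

With the two variables above set, `tenant_credentials("diku")` returns `("mod-erm-usage-harvester", "password123")`; for a tenant without credentials it returns `None`.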

Periodic harvesting is set up through the erm-usage-harvester/periodic API. Configuration is done for each tenant separately by using the X-Okapi-Tenant header. See PeriodicConfig and periodic.raml.

Example:

curl --request POST \
  --url http://localhost:9130/erm-usage-harvester/periodic \
  --header 'content-type: application/json' \
  --header 'x-okapi-tenant: diku' \
  --data '{
  "startAt": "2019-01-01T08:00:00.000+0000",
  "periodicInterval": "daily"
}'

This request will create a schedule which triggers harvesting for tenant diku each day at 8am UTC starting on 2019-01-01.
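
The same request can be built from a script. The sketch below uses only the Python standard library and mirrors the curl call above; `build_periodic_request` is a hypothetical helper, and the request is constructed but not sent.

```python
import json
import urllib.request


def build_periodic_request(okapi_url, tenant, start_at, interval):
    """Build (but do not send) the POST request for the periodic API."""
    body = json.dumps({"startAt": start_at, "periodicInterval": interval})
    return urllib.request.Request(
        url=okapi_url.rstrip("/") + "/erm-usage-harvester/periodic",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Okapi-Tenant": tenant,
        },
    )


request = build_periodic_request(
    "http://localhost:9130", "diku", "2019-01-01T08:00:00.000+0000", "daily"
)
# urllib.request.urlopen(request) would send it to a running Okapi instance.
```

Keeping construction separate from sending makes it easy to inspect the URL, headers and body before anything reaches the server.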

Note: Using "periodicInterval": "monthly" together with a startAt day greater than 28 results in a 'last day of month' schedule.

Example 2:

{
  "startAt": "2019-01-29T08:00:00.000+0000",
  "periodicInterval": "monthly"
}

This configuration triggers harvesting on the last day of every month at 8am UTC, starting on 2019-01-31 and followed by 2019-02-28, 2019-03-31, 2019-04-30, and so on.
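
The date rule can be reproduced with a few lines of Python. This sketch only computes the resulting calendar dates; it makes no claim about how the module builds its Quartz trigger, and `monthly_fire_dates` is a hypothetical name.

```python
import calendar
from datetime import date


def monthly_fire_dates(start, count):
    """Return the first `count` fire dates of a monthly schedule.

    A start day greater than 28 is treated as "last day of month";
    otherwise the same day of each month is used.
    """
    dates = []
    year, month = start.year, start.month
    while len(dates) < count:
        last_day = calendar.monthrange(year, month)[1]
        day = last_day if start.day > 28 else start.day
        dates.append(date(year, month, day))
        # Advance to the next month, rolling over the year in December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return dates


for fire_date in monthly_fire_dates(date(2019, 1, 29), 4):
    print(fire_date.isoformat())
```

A start date of 2019-01-15, by contrast, would simply produce the 15th of each month.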

ServiceEndpoint implementations

The ServiceEndpoint implementation defines how reports are fetched for a provider. To provide additional implementations you will need to implement the ServiceEndpointProvider interface and make it available on the classpath.

So far 3 implementations are provided:

mod-erm-usage-harvester-cs41 – Counter Sushi 4.1
mod-erm-usage-harvester-cs50 – Counter Sushi 5.0 API
mod-erm-usage-harvester-nss – Germany's National Statistics Server

Implementations available at runtime can be listed at /erm-usage-harvester/impl.

{
  "implementations": [
    {
      "name": "Counter-Sushi 4.1",
      "description": "SOAP-based implementation for CounterSushi 4.1",
      "type": "cs41",
      "isAggregator": false
    },
    {
      "name": "Counter-Sushi 5.0",
      "description": "Implementation for Counter/Sushi 5",
      "type": "cs50",
      "isAggregator": false
    },
    {
      "name": "Nationaler Statistikserver",
      "description": "Implementation for Germanys National Statistics Server (https://sushi.redi-bw.de).",
      "type": "NSS",
      "isAggregator": true,
      "configurationParameters": [
        "apiKey",
        "requestorId",
        "customerId",
        "reportRelease"
      ]
    }
  ]
}
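
A client might condense this listing into a lookup table keyed by implementation type. The sketch below assumes the response shape shown above; `summarize_implementations` is a hypothetical helper, and the payload in the usage example is an abbreviated copy of that response.

```python
import json


def summarize_implementations(payload):
    """Map each implementation type to its aggregator flag and parameters."""
    document = json.loads(payload)
    return {
        impl["type"]: {
            "aggregator": impl.get("isAggregator", False),
            # Implementations without extra parameters omit this key.
            "parameters": impl.get("configurationParameters", []),
        }
        for impl in document.get("implementations", [])
    }


sample = """
{"implementations": [
  {"name": "Counter-Sushi 5.0", "type": "cs50", "isAggregator": false},
  {"name": "Nationaler Statistikserver", "type": "NSS", "isAggregator": true,
   "configurationParameters": ["apiKey", "requestorId", "customerId", "reportRelease"]}
]}
"""
summary = summarize_implementations(sample)
print(summary["NSS"]["parameters"])
```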

mod-erm-usage-harvester-cs50

Request parameters

To enable the creation of standard views, master reports are retrieved with the following additional parameters:

Report  Attributes_To_Show                                                                 Include_Parent_Details
DR      Data_Type|Access_Method
IR      Authors|Publication_Date|Article_Version|Data_Type|YOP|Access_Type|Access_Method   True
PR      Data_Type|Access_Method
TR      Data_Type|Section_Type|YOP|Access_Type|Access_Method

Example:
/reports/dr?requestor_id=xxx&customer_id=xxx&begin_date=2021-01&end_date=2021-12&attributes_to_show=Data_Type|Access_Method
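
To show how the table translates into request URLs, the sketch below builds the path and query string for each master report. It is an illustration rather than the module's client code: `report_path` is a hypothetical helper, and the lowercase query parameter name `include_parent_details` is assumed by analogy with `attributes_to_show` in the example above.

```python
from urllib.parse import urlencode

# Attribute lists taken from the table above.
ATTRIBUTES_TO_SHOW = {
    "dr": ["Data_Type", "Access_Method"],
    "ir": ["Authors", "Publication_Date", "Article_Version", "Data_Type",
           "YOP", "Access_Type", "Access_Method"],
    "pr": ["Data_Type", "Access_Method"],
    "tr": ["Data_Type", "Section_Type", "YOP", "Access_Type", "Access_Method"],
}


def report_path(report, requestor_id, customer_id, begin_date, end_date):
    """Build the relative URL for a master report request."""
    key = report.lower()
    params = {
        "requestor_id": requestor_id,
        "customer_id": customer_id,
        "begin_date": begin_date,
        "end_date": end_date,
        "attributes_to_show": "|".join(ATTRIBUTES_TO_SHOW[key]),
    }
    if key == "ir":
        # Assumed parameter name; only IR requests parent details.
        params["include_parent_details"] = "True"
    # Keep "|" unescaped so the output matches the example above.
    return f"/reports/{key}?" + urlencode(params, safe="|")


print(report_path("DR", "xxx", "xxx", "2021-01", "2021-12"))
```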

Additional processing

Because providers respond in various ways, the provider response is intercepted and adjusted before processing.
This is necessary because some providers use 2xx status codes to send sushi errors, whereas the generated client expects 2xx codes to carry counter reports and other codes to carry sushi errors.
When a response with a 2xx status code is received, the module checks whether the response data structure matches one of the four counter master reports (TR, PR, DR and IR). If it matches, the response is left unchanged. If it does not match, the response is transformed into a 400 - Bad Request response, preserving the original response body in the cases listed below.

Some observations and how they are handled so far:

Providers use 2xx status codes to return sushi errors, not reports (gets routed and handled as 400 with original response body)
Providers return sushi errors as an array instead of an object (array makes it into the response body)
Providers return "null" instead of a sushi error (returns an InvalidReportException: null)
Providers return reports with a Report_Header that contains an Exception object instead of an Exceptions array (not handled, will be interpreted as report without Exceptions)
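
The interception rule described in this section can be sketched roughly as follows. The module checks the response's data structure against the four master reports; how it does so is not shown here, so the `Report_Header.Report_ID` test below is an assumed stand-in, and `adjust_response` and `looks_like_master_report` are hypothetical names.

```python
import json

MASTER_REPORT_IDS = {"TR", "PR", "DR", "IR"}


def looks_like_master_report(body):
    """Assumed check: a JSON object whose header names a master report."""
    if not isinstance(body, dict):
        return False
    header = body.get("Report_Header")
    return isinstance(header, dict) and header.get("Report_ID") in MASTER_REPORT_IDS


def adjust_response(status, body_text):
    """Turn 2xx responses that are not master reports into 400 responses."""
    if not 200 <= status < 300:
        return status, body_text  # non-2xx responses pass through unchanged
    try:
        body = json.loads(body_text)
    except ValueError:
        body = None  # unparseable bodies cannot be reports
    if looks_like_master_report(body):
        return status, body_text
    # Keep the original body so the error details are not lost.
    return 400, body_text


print(adjust_response(200, '[{"Code": 3030}]'))
```

Under this sketch, an error object, an error array and a literal `null` body all come back as 400 with their body intact, while a body with a recognised report header is returned untouched.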

Additional information

Issue tracker

See project MODEUSHARV at the FOLIO issue tracker.

