Merge pull request IQSS#10479 from IQSS/mdc-process-files-via-script
new script for make data count log processing as well as updated docu…
landreev authored Aug 6, 2024
2 parents c5a6a8f + a18c6e4 commit 85e0fc7
Showing 7 changed files with 23 additions and 20 deletions.
3 changes: 3 additions & 0 deletions doc/release-notes/make-data-count-.md
@@ -0,0 +1,3 @@
+### Counter Processor 1.05 Support
+
+This release includes support for counter-processor-1.05 for processing Make Data Count metrics. If you are running Make Data Count support, you should reinstall/reconfigure counter-processor as described in the latest Guides. (For existing installations, note that counter-processor-1.05 requires Python 3, so you will need to follow the full counter-processor install. Also note that if you configure the new version the same way, it will reprocess the days in the current month when it is first run. This is normal and will not affect the metrics in Dataverse.)
2 changes: 1 addition & 1 deletion doc/sphinx-guides/source/_static/util/counter_daily.sh
@@ -1,6 +1,6 @@
#! /bin/bash

-COUNTER_PROCESSOR_DIRECTORY="/usr/local/counter-processor-0.1.04"
+COUNTER_PROCESSOR_DIRECTORY="/usr/local/counter-processor-1.05"
MDC_LOG_DIRECTORY="/usr/local/payara6/glassfish/domains/domain1/logs/mdc"

# counter_daily.sh
8 changes: 4 additions & 4 deletions doc/sphinx-guides/source/admin/make-data-count.rst
@@ -16,7 +16,7 @@ Architecture

Dataverse installations who would like support for Make Data Count must install `Counter Processor`_, a Python project created by California Digital Library (CDL) which is part of the Make Data Count project and which runs the software in production as part of their `DASH`_ data sharing platform.

-.. _Counter Processor: https://github.com/CDLUC3/counter-processor
+.. _Counter Processor: https://github.com/gdcc/counter-processor
.. _DASH: https://cdluc3.github.io/dash/

The diagram below shows how Counter Processor interacts with your Dataverse installation and the DataCite hub, once configured. Dataverse installations using Handles rather than DOIs should note the limitations in the next section of this page.
@@ -84,9 +84,9 @@ Configure Counter Processor

* Change to the directory where you installed Counter Processor.

-* ``cd /usr/local/counter-processor-0.1.04``
+* ``cd /usr/local/counter-processor-1.05``

-* Download :download:`counter-processor-config.yaml <../_static/admin/counter-processor-config.yaml>` to ``/usr/local/counter-processor-0.1.04``.
+* Download :download:`counter-processor-config.yaml <../_static/admin/counter-processor-config.yaml>` to ``/usr/local/counter-processor-1.05``.

* Edit the config file and pay particular attention to the FIXME lines.

@@ -99,7 +99,7 @@ Soon we will be setting up a cron job to run nightly but we start with a single

* Change to the directory where you installed Counter Processor.

-* ``cd /usr/local/counter-processor-0.1.04``
+* ``cd /usr/local/counter-processor-1.05``

* If you are running Counter Processor for the first time in the middle of a month, you will need to create blank log files for the previous days. e.g.:
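
The guide's own example is not shown in this hunk. A minimal sketch of the step, assuming the MDC logs are named ``counter_YYYY-MM-DD.log`` (an assumption; check the naming against your own configuration). The function name and its arguments are hypothetical:

```bash
#!/bin/bash

# Hypothetical helper: create an empty log file for every day of a month
# that precedes the first day with real logs.
# Assumption: log files are named counter_YYYY-MM-DD.log.
create_blank_logs() {
    local log_dir="$1"               # directory holding the MDC logs
    local year_month="$2"            # e.g. 2024-08
    local first_real_day=$((10#$3))  # force base 10 so "08" is not read as octal
    local day
    for ((day = 1; day < first_real_day; day++)); do
        # touch creates a missing file and leaves the contents of an existing one alone
        touch "${log_dir}/counter_${year_month}-$(printf '%02d' "$day").log"
    done
}

# Example call (paths are illustrative):
#   create_blank_logs /usr/local/payara6/glassfish/domains/domain1/logs/mdc 2024-08 6
```

Because ``touch`` does not truncate, running the helper again on a directory that already has logs for some of those days leaves their contents unchanged.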

6 changes: 3 additions & 3 deletions doc/sphinx-guides/source/developers/make-data-count.rst
@@ -1,7 +1,7 @@
Make Data Count
===============

-Support for Make Data Count is a feature of the Dataverse Software that is described in the :doc:`/admin/make-data-count` section of the Admin Guide. In order for developers to work on the feature, they must install Counter Processor, a Python 3 application, as described below. Counter Processor can be found at https://github.com/CDLUC3/counter-processor
+Support for Make Data Count is a feature of the Dataverse Software that is described in the :doc:`/admin/make-data-count` section of the Admin Guide. In order for developers to work on the feature, they must install Counter Processor, a Python 3 application, as described below. Counter Processor can be found at https://github.com/gdcc/counter-processor

.. contents:: |toctitle|
:local:
@@ -49,7 +49,7 @@ Once you are done with your configuration, you can run Counter Processor like th

``su - counter``

-``cd /usr/local/counter-processor-0.1.04``
+``cd /usr/local/counter-processor-1.05``

``CONFIG_FILE=counter-processor-config.yaml python39 main.py``

@@ -82,7 +82,7 @@ Second, if you are also sending your SUSHI report to Make Data Count, you will n

``curl -H "Authorization: Bearer $JSON_WEB_TOKEN" -X DELETE https://$MDC_SERVER/reports/$REPORT_ID``

-To get the ``REPORT_ID``, look at the logs generated in ``/usr/local/counter-processor-0.1.04/tmp/datacite_response_body.txt``
+To get the ``REPORT_ID``, look at the logs generated in ``/usr/local/counter-processor-1.05/tmp/datacite_response_body.txt``

To read more about the Make Data Count api, see https://github.com/datacite/sashimi

16 changes: 8 additions & 8 deletions doc/sphinx-guides/source/installation/prerequisites.rst
@@ -428,7 +428,7 @@ firewalled from your Dataverse installation host).
Counter Processor
-----------------

-Counter Processor is required to enable Make Data Count metrics in a Dataverse installation. See the :doc:`/admin/make-data-count` section of the Admin Guide for a description of this feature. Counter Processor is open source and we will be downloading it from https://github.com/CDLUC3/counter-processor
+Counter Processor is required to enable Make Data Count metrics in a Dataverse installation. See the :doc:`/admin/make-data-count` section of the Admin Guide for a description of this feature. Counter Processor is open source and we will be downloading it from https://github.com/gdcc/counter-processor

Installing Counter Processor
============================
@@ -438,9 +438,9 @@ A scripted installation using Ansible is mentioned in the :doc:`/developers/make
As root, download and install Counter Processor::

cd /usr/local
-wget https://github.com/CDLUC3/counter-processor/archive/v0.1.04.tar.gz
-tar xvfz v0.1.04.tar.gz
-cd /usr/local/counter-processor-0.1.04
+wget https://github.com/gdcc/counter-processor/archive/refs/tags/v1.05.tar.gz
+tar xvfz v1.05.tar.gz
+cd /usr/local/counter-processor-1.05

Installing GeoLite Country Database
===================================
@@ -451,7 +451,7 @@ The process required to sign up, download the database, and to configure automat

As root, change to the Counter Processor directory you just created, download the GeoLite2-Country tarball from MaxMind, untar it, and copy the geoip database into place::

-<download or move the GeoLite2-Country.tar.gz to the /usr/local/counter-processor-0.1.04 directory>
+<download or move the GeoLite2-Country.tar.gz to the /usr/local/counter-processor-1.05 directory>
tar xvfz GeoLite2-Country.tar.gz
cp GeoLite2-Country_*/GeoLite2-Country.mmdb maxmind_geoip

@@ -461,12 +461,12 @@ Creating a counter User
As root, create a "counter" user and change ownership of Counter Processor directory to this new user::

useradd counter
-chown -R counter:counter /usr/local/counter-processor-0.1.04
+chown -R counter:counter /usr/local/counter-processor-1.05

Installing Counter Processor Python Requirements
================================================

-Counter Processor version 0.1.04 requires Python 3.7 or higher. This version of Python is available in many operating systems, and is purportedly available for RHEL7 or CentOS 7 via Red Hat Software Collections. Alternately, one may compile it from source.
+Counter Processor version 1.05 requires Python 3.7 or higher. This version of Python is available in many operating systems, and is purportedly available for RHEL7 or CentOS 7 via Red Hat Software Collections. Alternately, one may compile it from source.

The following commands are intended to be run as root but we are aware that Pythonistas might prefer fancy virtualenv or similar setups. Pull requests are welcome to improve these steps!

@@ -477,7 +477,7 @@ Install Python 3.9::
Install Counter Processor Python requirements::

python3.9 -m ensurepip
-cd /usr/local/counter-processor-0.1.04
+cd /usr/local/counter-processor-1.05
pip3 install -r requirements.txt

See the :doc:`/admin/make-data-count` section of the Admin Guide for how to configure and run Counter Processor.
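
The commands in this file repeat the version string in every path, which is why a version bump touches so many lines. A minimal sketch of the same download-and-unpack steps with the version held in one variable. The variable and function names are illustrative and do not come from the guide; the URL pattern and install path are the ones used in the hunks above:

```bash
#!/bin/bash

# Sketch only: names below are hypothetical, not part of the Dataverse guides.
COUNTER_PROCESSOR_VERSION="1.05"

# Build the release tarball URL for a given version (pattern taken from the guide).
counter_processor_url() {
    echo "https://github.com/gdcc/counter-processor/archive/refs/tags/v${1}.tar.gz"
}

# Build the install directory for a given version.
counter_processor_dir() {
    echo "/usr/local/counter-processor-${1}"
}

# Download, unpack, and hand the directory over to the "counter" user.
# Assumes it runs as root and that the "counter" user already exists.
install_counter_processor() {
    local version="$1"
    cd /usr/local || return 1
    wget "$(counter_processor_url "$version")" || return 1
    tar xvfz "v${version}.tar.gz" || return 1
    chown -R counter:counter "$(counter_processor_dir "$version")"
}

# Example call, as root:
#   install_counter_processor "$COUNTER_PROCESSOR_VERSION"
```

The GeoLite2 database and the Python requirements still need to be installed into the resulting directory as described in the sections above.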
@@ -117,10 +117,10 @@ public class DatasetMetrics implements Serializable {
* For an example of sending various metric types (total-dataset-requests,
* unique-dataset-investigations, etc) for a given month (2018-04) per
* country (DK, US, etc.) see
-* https://github.com/CDLUC3/counter-processor/blob/5ce045a09931fb680a32edcc561f88a407cccc8d/good_test.json#L893
+* https://github.com/gdcc/counter-processor/blob/5ce045a09931fb680a32edcc561f88a407cccc8d/good_test.json#L893
*
* counter-processor uses GeoLite2 for IP lookups according to their
-* https://github.com/CDLUC3/counter-processor#download-the-free-ip-to-geolocation-database
+* https://github.com/gdcc/counter-processor#download-the-free-ip-to-geolocation-database
*/
@Column(nullable = true)
private String countryCode;
@@ -27,15 +27,15 @@
* How to Make Your Data Count July 10th, 2018).
*
* The recommended starting point to implement Make Data Count is
-* https://github.com/CDLUC3/Make-Data-Count/blob/master/getting-started.md
+* https://github.com/gdcc/Make-Data-Count/blob/master/getting-started.md
* which specifically recommends reading the "COUNTER Code of Practice for
* Research Data" mentioned in the user facing docs.
*
* Make Data Count was first implemented in DASH. Here's an example dataset:
* https://dash.ucmerced.edu/stash/dataset/doi:10.6071/M3RP49
*
* For processing logs we could try DASH's
-* https://github.com/CDLUC3/counter-processor
+* https://github.com/gdcc/counter-processor
*
* Next, DataOne implemented it, and you can see an example dataset here:
* https://search.dataone.org/view/doi:10.5063/F1Z899CZ