* DEV-1258 Update README.md Both @moseshll and @Ronster2018 updated and expanded on several sections of the Readme. --------- Co-authored-by: K'Ron Simmons <[email protected]>
1 parent 38dd950, commit d75af3a. Showing 1 changed file with 114 additions and 20 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,52 +1,146 @@ | ||
# CRMS: Copyright Review Management System | ||
<br/> | ||
<p align="center"> | ||
CRMS: Copyright Review Management System | ||
|
||
![Run Tests](https://github.com/hathitrust/crms/workflows/Run%20Tests/badge.svg) [![Coverage Status](https://coveralls.io/repos/github/hathitrust/crms/badge.svg?branch=main)](https://coveralls.io/github/hathitrust/crms?branch=main) | ||
|
||
A web app and suite of tools for performing copyright review projects. | ||
</p> | ||
<br/> | ||
<br/> | ||
|
||
![Run Tests](https://github.com/hathitrust/crms/workflows/Run%20Tests/badge.svg) [![Coverage Status](https://coveralls.io/repos/github/hathitrust/crms/badge.svg?branch=main)](https://coveralls.io/github/hathitrust/crms?branch=main) | ||
|
||
A web app and suite of tools for performing copyright review projects. | ||
|
||
[Copyright Review Program at HathiTrust](https://www.hathitrust.org/copyright-review "HathiTrust CRMS home") | ||
## Table Of Contents | ||
|
||
[Internal University of Michigan Confluence Page](https://tools.lib.umich.edu/confluence/display/HAT/CRMS+System "Internal University of Michigan Confluence Page") | ||
* [About the Project](#about-the-project) | ||
* [Built With](#built-with) | ||
* [Prerequisites](#prerequisites) | ||
* [Installation](#installation) | ||
* [Project Structure](#project-structure) | ||
* [Functionality](#functionality) | ||
* [Usage](#usage) | ||
* [Tests](#tests) | ||
* [Hosting](#hosting) | ||
* [Resources](#resources) | ||
|
||
## About The Project | ||
CRMS is a web app and a suite of tools for performing copyright review projects. | ||
The primary purpose of the user interface portion is to allow trained reviewers to | ||
navigate, research, and enter data relevant to the ultimate disposition of HathiTrust | ||
materials whose copyright status can be investigated. CRMS also provides a convenient | ||
place to record licensing agreements (e.g., Creative Commons) with rights holders. | ||
|
||
Copyright and licensing determinations ultimately get exported as text files for | ||
insertion into the Rights Database by parts of the HathiTrust infrastructure external | ||
to CRMS. CRMS does not have write access to the Rights Database. | ||
|
||
The "suite of tools" refers to scripts that typically run as cron jobs to manage workflows. | ||
A copyright review proceeds via several stages (typically more than one person submits | ||
data independently on a volume) so there is a "nightly processing" stage which moves | ||
things along. | ||
|
||
There are also scripts that run only once a year (manually or otherwise), around the | ||
January 1 "public domain day" rollover, when a swathe of works falls into the public | ||
domain or some other rights category. | ||
|
||
## Built With | ||
- Perl | ||
- MariaDB | ||
- [Template Toolkit](https://template-toolkit.org/) | ||
|
||
## Prerequisites | ||
* Docker | ||
* Git ssh access to the repository | ||
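
A quick way to confirm the two prerequisites before installing is a small shell check. This is a minimal sketch, not part of the CRMS repository: the function name `check_prereqs` is hypothetical, and it only verifies that the named commands are on `PATH`. It does not check that the Docker daemon is running or that your SSH key is accepted by the repository host.

```sh
# check_prereqs is a hypothetical helper, not part of CRMS.
# It reports each named command that is not found on PATH and
# returns non-zero if any of them are missing.
check_prereqs() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_prereqs git docker` prints nothing and succeeds when both tools are installed.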
|
||
## Installation | ||
1. Clone the repo & cd into `crms/` | ||
```sh | ||
git clone git@github.com:hathitrust/crms.git | ||
cd crms/ | ||
``` | ||
|
||
2. Pull the Post Zephir Processing repo as a submodule | ||
|
||
```sh | ||
git submodule init | ||
git submodule update | ||
``` | ||
|
||
3. Stand up the docker environment and run the tests. | ||
- This will run two database services and make sure that the local MariaDB connections are healthy. | ||
``` | ||
docker compose build | ||
docker compose up -d mariadb mariadb-ht | ||
docker compose run --rm test | ||
``` | ||
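
The post-clone steps above can be chained so that a failure at any point stops the sequence. This is a sketch only: `run`, `setup_and_test`, and the `DRY_RUN` variable are hypothetical names introduced here for illustration, while the `git` and `docker compose` commands are exactly the ones listed in the steps above.

```sh
# run: print the command when DRY_RUN=1, otherwise execute it.
# DRY_RUN is a hypothetical variable, not something CRMS reads.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# setup_and_test: the documented steps in order; && stops at the first failure.
setup_and_test() {
  run git submodule init &&
  run git submodule update &&
  run docker compose build &&
  run docker compose up -d mariadb mariadb-ht &&
  run docker compose run --rm test
}
```

From the repository root, `DRY_RUN=1 setup_and_test` previews the commands without running them, and `setup_and_test` runs them.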
|
||
By default the `test` service produces a `Devel::Cover` HTML report using | ||
`scripts/test.sh`. The other script, `scripts/test_and_cover.sh`, is for upload to | ||
Coveralls and is used in the GitHub action. | ||
|
||
## What is Where | ||
|
||
### Project Structure | ||
- `bib_rights` Two miscellaneous scripts related to bibliographic rights. | ||
- `bin` For the most part these are actions and reports run as cron jobs | ||
- `cgi` Main entry point `cgi/crms` as well as Perl modules and view templates | ||
- This is the directory most in need of reorganization. In future much of its | ||
content will be migrated to `lib` and `views`. | ||
- `docker` Database seeds | ||
- `lib` Perl modules (new development and refactored modules from `cgi`) | ||
- `prep` Destination for some log files and reports | ||
- `scripts` Testing wrappers run as part of development or by GitHub | ||
- `t` Tests | ||
- `web` Static assets including images, JS, CSS | ||
|
||
`cgi` is the directory most in need of reorganization. In future much of its | ||
content will be migrated to `lib` and `views`. | ||
|
||
## Hello World | ||
## Functionality | ||
- CRMS has its own database, called `crms`, to which it has write access. | ||
- CRMS also uses the `ht` database/view to which it only has read access. | ||
- In some cases it connects to the `ht_repository` database -- this is a legacy | ||
usage and we should standardize on using only `ht` wherever possible. | ||
- CRMS does not follow an MVC architecture. Most functionality is exposed via a | ||
top-level `CRMS` object (`cgi/CRMS.pm`). | ||
- Lacking an object-oriented data model, most operations are or can be accomplished | ||
with the MySQL wrappers exposed by the `CRMS` object: `SelectAll` (aref), | ||
`SimpleSqlGet` (single value), and `PrepareSubmitSql` (for `INSERT`s). | ||
- Routines that retrieve database data tend to return data structures rather than | ||
objects, for example `$crms->Menus()`. | ||
- `cgi/crms` is the main entry point for the CRMS web app. | ||
- There are a few other, more API-like, scripts in the same directory. They can | ||
be identified by a lack of file extension. Most return JSON. | ||
|
||
Most functionality is exposed via the top-level `CRMS` object, including | ||
the all-important `SelectAll` (aref), `SimpleSqlGet` (single value), and | ||
`PrepareSubmitSql` wrappers around `DBI` functions. | ||
|
||
## Usage | ||
Create a CRMS object. | ||
```perl | ||
# SDRROOT is the critical proprioceptive environment variable | ||
use lib $ENV{'SDRROOT'} . '/crms/cgi'; | ||
use CRMS; | ||
my $crms = CRMS->new; | ||
# List all the HTIDs and their priority in project 1 (Core) | ||
``` | ||
|
||
Basic database operation: list all the HTIDs and their priority in project 1 (Core) queue | ||
```perl | ||
my $sql = "SELECT id,priority FROM queue WHERE project=?"; | ||
my $aref = $crms->SelectAll($sql, 1); | ||
print "$_->[0], $_->[1]\n" for @$aref; | ||
say "$_->[0], $_->[1]" for @$aref; | ||
``` | ||
|
||
Get the current rights for a volume. | ||
```perl | ||
say $crms->CurrentRightsString("mdp.35112101180794"); | ||
# "pd/add" | ||
``` | ||
|
||
## Tests | ||
- Tests are run by the previously-mentioned `docker compose run --rm test` command. | ||
- By default the `test` service produces a `Devel::Cover` HTML report using | ||
`scripts/test.sh`. The other script, `scripts/test_and_cover.sh`, is for upload to | ||
Coveralls and is used in the GitHub action. | ||
|
||
## Hosting | ||
- This is not hosted in the Kubernetes cluster. | ||
- It is hosted on the HathiTrust web servers, like the other Babel applications. | ||
|
||
## Resources | ||
- Two of the cron jobs (`bin/pdd_collection*`) talk to a Collection Builder script which is part of | ||
the [monolithic babel app](https://github.com/hathitrust/babel/blob/7865e2516727ee7c6351c1bfe192ce29b7b442f7/mb/scripts/batch-collection.pl). | ||
- Bibliographic queries are run against the HathiTrust Bib API in `cgi/Metadata.pm`. | ||
- This also relies on [Post Zephir Processing](https://github.com/hathitrust/post_zephir_processing/) | ||
- [Copyright Review Program at HathiTrust](https://www.hathitrust.org/copyright-review "HathiTrust CRMS home") | ||
- [Internal University of Michigan Confluence Page](https://tools.lib.umich.edu/confluence/display/HAT/CRMS+System "Internal University of Michigan Confluence Page") |