Releases: ENCODE-DCC/caper

v0.8.0

31 Mar 17:42
533b060

Parameters

Deprecated parameters:

  • --use-netrc: Autouri now uses ~/.netrc by default.
  • --http-user and --http-password: Use ~/.netrc to access private URLs.
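
For reference, a minimal ~/.netrc entry uses the standard netrc format (the hostname and credentials below are placeholders; keep the file private, e.g. chmod 600 ~/.netrc):

    machine storage.example.com
    login my-username
    password my-password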

Change of parameters:

  • --use-gsutil-over-aws-s3 -> --use-gsutil-for-s3: Autouri uses the gsutil CLI only for direct transfers between S3 and GCS buckets. Otherwise, it always uses Python libraries like google-cloud-storage and boto3.

Added parameters:

  • --debug and --verbose: For better logging.

New features

Localization and preventing repetitive file transfer

  • When the new localization module makes a copy of a source file in a destination cache directory, it compares the md5 hashes of source and destination if a file already exists at the destination. All bucket URIs (s3://, gs://) and most URLs provide md5 hash information in their headers. If the md5 hashes match, Caper skips the unnecessary file transfer. For local paths, Caper calculates the md5 hash and stores the md5 hash string in a .md5 file, since md5 hash calculation is expensive. This happens only when Caper writes to local storage (i.e. when localizing files in a local cache). A .md5 file is not valid if its modification time (mtime) is older than that of the file itself.
  • If the md5 hash comparison fails, Caper compares file sizes and mtimes instead. If the file sizes match and the destination's mtime is newer, Caper skips the file transfer. See the sketch below.
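
A simplified, local-path-only sketch of this decision (hypothetical helper names, not Autouri's actual API; for bucket URIs and URLs the md5 would come from object headers instead):

    import hashlib
    import os

    def md5_of_local_file(path):
        # Trust an existing .md5 sidecar only if it is at least as new as the file.
        md5_file = path + '.md5'
        if os.path.exists(md5_file) \
                and os.path.getmtime(md5_file) >= os.path.getmtime(path):
            with open(md5_file) as fp:
                return fp.read().strip()
        h = hashlib.md5()
        with open(path, 'rb') as fp:
            for chunk in iter(lambda: fp.read(1 << 20), b''):
                h.update(chunk)
        md5 = h.hexdigest()
        # Cache the expensive calculation in a .md5 sidecar (local storage only).
        with open(md5_file, 'w') as fp:
            fp.write(md5)
        return md5

    def should_skip_transfer(src, dst):
        # Skip the copy if the destination already exists and matches the source.
        if not os.path.exists(dst):
            return False
        try:
            return md5_of_local_file(src) == md5_of_local_file(dst)
        except OSError:
            # md5 comparison failed: fall back to size + mtime.
            return os.path.getsize(src) == os.path.getsize(dst) \
                and os.path.getmtime(dst) > os.path.getmtime(src)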

File locking

  • Caper uses stable file locking, tested with up to 50 threads (local storage) or 10 threads (cloud URIs) competing to write to the same file.
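
Purely as an illustration of the pattern (Python's third-party filelock package as a stand-in; this is not Caper's actual implementation):

    from multiprocessing.pool import ThreadPool

    from filelock import FileLock  # pip install filelock

    TARGET = 'shared.txt'

    def append_line(i):
        # Every writer must hold the lock before touching the shared file.
        with FileLock(TARGET + '.lock', timeout=60):
            with open(TARGET, 'a') as fp:
                fp.write('thread %d\n' % i)

    # 50 local threads competing to write to the same file, as in the tests.
    with ThreadPool(50) as pool:
        pool.map(append_line, range(50))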

Automatic subworkflow zipping

  • Fixed bugs in the old auto-zipping module.
  • Caper can automatically zip subworkflow WDLs imported in the main WDL. A zip file can also be manually defined with the command-line argument --imports; Caper will skip auto-zipping if --imports is defined. See the sketch after this list.
  • Enabled for caper submit only, i.e. caper run does not use automatic subworkflow zipping since all sub-WDLs are assumed to be already localized for caper run.
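
A simplified sketch of the auto-zipping idea (hypothetical helper; the real module also needs to handle nested directories and URL imports):

    import os
    import re
    import zipfile

    IMPORT_RE = re.compile(r'^\s*import\s+"(.+?)"', flags=re.MULTILINE)

    def zip_subworkflows(main_wdl, zip_file):
        # Collect WDLs imported directly or transitively by the main WDL.
        root = os.path.dirname(os.path.abspath(main_wdl))
        seen, stack = set(), [os.path.abspath(main_wdl)]
        while stack:
            wdl = stack.pop()
            with open(wdl) as fp:
                contents = fp.read()
            for rel in IMPORT_RE.findall(contents):
                sub = os.path.abspath(os.path.join(os.path.dirname(wdl), rel))
                if sub not in seen:
                    seen.add(sub)
                    stack.append(sub)
        # Zip them with paths kept relative to the main WDL's directory.
        with zipfile.ZipFile(zip_file, 'w') as zf:
            for sub in sorted(seen):
                zf.write(sub, arcname=os.path.relpath(sub, root))
        return zip_file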

Womtool validation

  • If --imports is defined or there is an auto-zipped subworkflow zip, Caper creates a temporary directory, puts the main WDL there and unpacks the zip file there. Caper then runs Womtool to validate those WDLs, as sketched below.
  • You can still skip Womtool validation with --ignore-womtool.
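
Roughly, the validation step does the following (a sketch; the womtool path and file names are placeholders):

    import shutil
    import subprocess
    import tempfile
    import zipfile

    def validate_wdl(wdl, imports_zip=None, womtool_jar='womtool.jar'):
        # Work in a temporary directory so unpacked imports sit next to
        # a copy of the main WDL, where Womtool can resolve them.
        with tempfile.TemporaryDirectory() as tmp_d:
            main_wdl = shutil.copy(wdl, tmp_d)
            if imports_zip:
                with zipfile.ZipFile(imports_zip) as zf:
                    zf.extractall(tmp_d)
            subprocess.run(
                ['java', '-jar', womtool_jar, 'validate', main_wdl],
                check=True)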

v0.7.0

10 Mar 00:39
f4a56d0

New features

  • caper init downloads Cromwell/Womtool JARs and adds them to Caper's default conf file ~/.caper/default.conf (or whatever is defined with caper -c) so that Caper can work completely offline once those JARs are installed.
  • Caper used to make a copy of outputs for every re-run workflow (task) on GCP. Added --gcp-call-caching-dup-strat to control this behavior; it now defaults back to reference instead of copy. Define --gcp-call-caching-dup-strat copy to keep making copies for re-run (call-cached) tasks.
  • Caper can soft-link globbed outputs instead of hard-linking them. This is useful on file systems where hard-linking is not allowed (e.g. BeeGFS). Added a flag --soft-glob-output for local backends (local, slurm, sge and pbs). This flag cannot be used with docker (--docker) or docker-based backends (gcp and aws).

Documentation

  • Heartbeat file and how to run multiple caper server instances on a single machine.
  • How to configure Caper for a custom backend.
  • Important notes for storage choices on Sherlock cluster.

Bug fixes

  • metadata.json in the output directory/bucket is now updated correctly both while a workflow is running and after it is done.
  • caper list sent too many requests to get the labels of all workflows. Now it sends a single query to retrieve all workflow information.

v0.6.4

12 Feb 00:40

Improved job submission for SLURM backend (Sherlock, SCG, ...)

  • Fix for the following submission error when the server is busy. Caper now retries sbatch up to 3 times.
    sbatch: error: Batch job submission failed: Socket timed out on send/recv operation
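
Conceptually, the retry looks like this (a hedged sketch, not Caper's actual code):

    import subprocess
    import time

    def sbatch_with_retries(sbatch_args, max_tries=3):
        # Retry sbatch on transient submission failures such as socket timeouts.
        for i in range(max_tries):
            try:
                return subprocess.run(
                    ['sbatch'] + sbatch_args, check=True, capture_output=True)
            except subprocess.CalledProcessError:
                if i == max_tries - 1:
                    raise
                time.sleep(10)  # back off while the scheduler is busy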
    

Added warning for Stanford Sherlock platform (SLURM backend)

  • Do not install Caper, Conda or any executables on $OAK or $SCRATCH. Install them on $HOME or $PI_HOME.

Bug fixes

  • Fix for w['submission'] error.

v0.6.3

21 Dec 16:50

Added a warning for the parameter tmp-dir

Change in default parameters

  • Increased the default java-heap-run from 2G to 3G.

Bug fixes

  • Check for the presence of a metadata.json file when troubleshooting.
  • Fixed the submission = w['submission'] error for caper list.

v0.6.2

09 Dec 19:56

Bug fixes

  • Remove leading/trailing quotes (" and ') from values when reading the conf file (e.g. ~/.caper/default.conf). Users can now use quoted strings in a conf file.
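
The fix amounts to something like this (a sketch, not the exact code):

    def strip_quotes(value):
        # Remove one pair of matching leading/trailing quotes from a conf value.
        value = value.strip()
        if len(value) > 1 and value[0] == value[-1] and value[0] in '"\'':
            return value[1:-1]
        return value

    assert strip_quotes('"~/caper_out"') == '~/caper_out'
    assert strip_quotes("'3G'") == '3G'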

v0.6.1

15 Nov 23:04

Minor update for Croo's new feature (task graph)

Bug fixes

  • Fixed a permission-denied issue with the MySQL shell script for docker.

Updated documentation

  • MySQL docker

v0.6.0

07 Nov 03:15
f01fc90

IMPORTANT: Caper defaults back to NOT using a file-based metadata DB, which means no call-caching (re-using outputs from previous workflows) by default.

IMPORTANT: Even if you still want to use a file-based DB (--db file and --file-db [DB_PATH]), a metadata DB generated by Caper<0.6 (with Cromwell-42) is not compatible with one generated by Caper>=0.6 (with Cromwell-47). Refer to this doc for such a migration.

See this for details about the metadata DB. Define a DB type with db= in your conf file ~/.caper/default.conf to use a metadata DB.

Engine update

  • Upgraded default Cromwell JAR version: 42 -> 47.
    • Some features of Caper only work with 47 (e.g. PostgreSQL support, some bug fixes).

How to choose a DB?

  • You can choose a DB type with --db (or db= in the conf file ~/.caper/default.conf), then define the chosen DB's required parameters (nothing is required for the in-memory DB).
    • Choices: file (unstable), mysql (recommended), postgresql and in-memory (the new default, but no call-caching).
  • mysql is recommended. We provide shell scripts (run_mysql_server_docker.sh and run_mysql_server_singularity.sh) to run a MySQL server with docker/singularity (without root). See the example conf sketched after this list.
  • See details in the "Metadata database" section of the README.
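
For example, a MySQL setup in ~/.caper/default.conf could look like the sketch below (assuming conf keys mirror the CLI flags without the leading dashes, as db= does; all values are placeholders):

    db=mysql
    mysql-db-ip=localhost
    mysql-db-port=3306
    mysql-db-user=cromwell
    mysql-db-password=cromwell
    mysql-db-name=cromwell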

New features

  • Support for PostgreSQL DB for call-caching (Cromwell >= 43)

Change of parameters

  • New:
    • --db: in-memory (default), file (unstable), mysql (recommended) or postgresql (experimental).
    • --mysql-db-name: (optional) cromwell by default
    • --postgresql-db-ip: localhost by default
    • --postgresql-db-port: 5432 by default
    • --postgresql-db-user: (optional) cromwell by default
    • --postgresql-db-password: (optional) cromwell by default
    • --postgresql-db-name: (optional) cromwell by default
  • Deprecated:
    • --no-file-db: The file DB is now disabled by default, since many users reported that it is unstable.

Bug fixes

  • PAPI error code 10 (preemption error) on Google Cloud.
  • Caper didn't run WDLs without --docker on cloud backends (aws and gcp).
    • Some WDLs define a docker image in each task (runtime { docker : }), so users had to specify a dummy docker image (--docker ubuntu:latest) to bypass this error.

v0.5.6

06 Nov 01:18

Bug fix

  • Cloud backends were not working due to failed localization of an input JSON file.

v0.5.5

05 Nov 23:15

Bug fixes

  • Womtool validation failed on remote cloud backends (gcp and aws).

v0.5.4

02 Nov 21:42
641db11

Validation for WDL/input JSON

  • Added Womtool validation for WDL and input JSON files.
  • Useful for finding missing/wrong parameters in an input JSON.

New parameters

  • --womtool: Womtool JAR location (URL or path). Version 42 by default.
  • --ignore-womtool: Flag. Skip Womtool validation.

Added dict_tool.py (for the new tool qc2tsv)

  • Useful dict functions:
    • merge_dict(a, b): Merge dict b into dict a.
    • split_dict(): Split a dict into multiple dicts according to a given "split_rule" (regex).
    • flatten_dict()/unflatten_dict(): Flatten a dict into one with 1-level tuple keys. Each tuple key keeps the hierarchy of the original dict object (sketched below).
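
A minimal sketch of the flatten/unflatten idea (hypothetical implementation; the actual dict_tool.py may differ):

    def flatten_dict(d, parent_key=()):
        # Flatten a nested dict into a 1-level dict keyed by tuples.
        # The tuple key preserves the hierarchy of the original dict.
        flat = {}
        for k, v in d.items():
            key = parent_key + (k,)
            if isinstance(v, dict):
                flat.update(flatten_dict(v, key))
            else:
                flat[key] = v
        return flat

    def unflatten_dict(flat):
        # Inverse of flatten_dict: rebuild the nested dict from tuple keys.
        d = {}
        for key, v in flat.items():
            node = d
            for k in key[:-1]:
                node = node.setdefault(k, {})
            node[key[-1]] = v
        return d

    d = {'qc': {'align': {'mapped': 100}}, 'sample': 'rep1'}
    assert unflatten_dict(flatten_dict(d)) == d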