CHESSComputing/DataBookkeeping

DataBookkeeping Service

Data Bookkeeping service

APIs

public APIs

  • /datasets get all datasets
  • /files get files for a given dataset identifier (did)
  • /dataset/*name get the dataset with the given name
  • /file/*name get the file with the given name
  • /provenance get provenance information for a given dataset identifier (did)

Example

Here are examples of HTTP GET requests:

# look-up all datasets
curl -v http://localhost:8310/datasets

# look-up concrete dataset=/x/y/z
dataset=/x/y/z
curl -v http://localhost:8310/dataset$dataset

# look-up files from a dataset
curl -v "http://localhost:8310/file?dataset=$dataset"
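
The same look-ups can be scripted. The following is a minimal Python sketch that mirrors the curl examples above using only the standard library. The helper names (`datasets_url`, `dataset_url`, `files_url`, `fetch`) are hypothetical and not part of this repository, and the base URL assumes a local instance on port 8310 as in the curl commands. The response format is not described here, so `fetch` simply returns the body as text.

```python
import urllib.request
from urllib.parse import quote, urlencode

# Assumption: a local service instance, as in the curl examples.
BASE_URL = "http://localhost:8310"


def datasets_url(base_url=BASE_URL):
    """URL that lists all datasets."""
    return f"{base_url}/datasets"


def dataset_url(dataset, base_url=BASE_URL):
    """URL for one dataset; the dataset name is appended to /dataset."""
    path = "/" + dataset.lstrip("/")
    # Keep "/" and "=" readable, since names look like /beamline=aaa/btr=bbb/...
    return f"{base_url}/dataset{quote(path, safe='/=')}"


def files_url(dataset, base_url=BASE_URL):
    """URL for the files of a dataset, passed as a query parameter."""
    query = urlencode({"dataset": dataset}, safe="/")
    return f"{base_url}/file?{query}"


def fetch(url, timeout=10):
    """Send an HTTP GET request and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example (requires a running service):
# print(fetch(dataset_url("/x/y/z")))
```

The URL builders are kept separate from `fetch` so that they can be checked without a running service.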

protected APIs

  • HTTP POST requests
    • /dataset create new dataset data
    • /file create new file data
  • HTTP PUT requests
    • /dataset update dataset data
    • /file update file data
  • HTTP DELETE requests
    • /dataset/*name delete dataset
    • /file/*name delete file
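
As an illustration of the DELETE endpoints listed above, here is a small Python sketch that builds (but does not send) a DELETE request. The function name `delete_request` is hypothetical. The use of a bearer token for DELETE is an assumption carried over from the POST example in this README, and the base URL assumes a local instance on port 8310.

```python
import urllib.request
from urllib.parse import quote

# Assumption: a local service instance, as in the curl examples.
BASE_URL = "http://localhost:8310"


def delete_request(kind, name, token, base_url=BASE_URL):
    """Build an HTTP DELETE request for /dataset/*name or /file/*name."""
    if kind not in ("dataset", "file"):
        raise ValueError("kind must be 'dataset' or 'file'")
    path = "/" + name.lstrip("/")
    url = f"{base_url}/{kind}{quote(path, safe='/=')}"
    # Assumption: protected APIs expect the same bearer token as the POST example.
    headers = {"Authorization": f"Bearer {token}"}
    return urllib.request.Request(url, method="DELETE", headers=headers)


# Example (requires a running service and a valid token):
# req = delete_request("dataset", "/x/y/z", token)
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```

Building the request separately from sending it makes it easy to inspect the URL and headers before touching any stored data.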

Example

Here is an example of an HTTP POST request:

# record.json
{
    "parent_did": "/beamline=aaa/btr=bbb/cycle=ccc/sample_name=sss",
    "did": "/beamline=aaa/btr=bbb/cycle=ccc/sample_name=sss/test=child",
    "processing": "processing string, e.g. glibc-123-python-123",
    "osinfo": {"name": "linux-cc7", "kernel": "1-2-3", "version": "cc7-123"},
    "environments": [
      {"name": "galaxy", "version": "version", "details": "details",
          "parent_environment": "conda-123", "os_name": "linux-cc7"},
      {"name": "conda-123", "version": "version", "details": "details",
          "parent_environment": null, "os_name": "linux-cc7",
          "packages": [
              {"name": "numpy", "version": "123"},
              {"name": "matplotlib", "version": "987"}
          ]
      }
    ],
    "scripts": [
      {"name": "reader", "options": "-reader_options", "parent_script": null, "order_idx": 1},
      {"name": "chap", "options": "-chap_options", "parent_script": "myscript", "order_idx": 2}
    ],
    "input_files": [
      {"name": "/tmp/file1.png"},
      {"name": "/tmp/file2.png"}
    ],
    "output_files": [
      {"name": "/tmp/file1.png"}
    ],
    "site": "Cornell",
    "buckets": ["bucketABC"]
}

# inject new record
curl -v -X POST -H "Authorization: Bearer $token" \
    -H "Content-type: application/json" \
    -d@./record.json \
    http://localhost:8310/dataset
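
The curl command above can also be expressed in Python. This is a minimal sketch using only the standard library. The names `post_request` and `send` are hypothetical, the base URL assumes a local instance on port 8310, and how the token is obtained is outside the scope of this example.

```python
import json
import urllib.request

# Assumption: a local service instance, as in the curl example.
BASE_URL = "http://localhost:8310"


def post_request(record, token, base_url=BASE_URL):
    """Build an HTTP POST request that submits a record to /dataset."""
    body = json.dumps(record).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        f"{base_url}/dataset", data=body, method="POST", headers=headers
    )


def send(req, timeout=10):
    """Send the request and return (HTTP status, response body as text)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")


# Example (requires a running service and a valid token):
# with open("record.json") as f:
#     record = json.load(f)
# print(send(post_request(record, token)))
```

Note that `json.load` expects the file to contain only the JSON object, so a leading `# record.json` label line must not be part of the file.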

For more (and up-to-date) examples, please see the data integration area of this repository and look up the JSON input in the int_provenance.json file.