This repository has been archived by the owner on May 7, 2019. It is now read-only.

Merge pull request #1 from david4096/master
Update readme
david4096 authored Feb 7, 2018
2 parents 2ead228 + a047270 commit e2fe283
Showing 3 changed files with 468 additions and 29 deletions.
115 changes: 95 additions & 20 deletions README.md
@@ -1,38 +1,92 @@
# dos-indexd-lambda
# dos-dss-lambda

Presents indexd data over GA4GH compliant methods.
This presents an [AWS Lambda](https://aws.amazon.com/lambda/) microservice
following the [Data Object Service](https://github.com/ga4gh/data-object-service-schemas) schemas.
It allows data in the [Human Cell Atlas Data Store](https://github.com/HumanCellAtlas/data-store)
to be accessed using Data Object Service APIs.

## Using the service

A development version of this service is available at https://spbnq0bc10.execute-api.us-west-2.amazonaws.com/api/ .
To use the service, send API requests with cURL or any HTTP client,
following the [OpenAPI description](https://spbnq0bc10.execute-api.us-west-2.amazonaws.com/api/swagger.json).

```
+------------------+ +--------------+ +--------+
| ga4gh-dos-client |------|dos-dss-lambda|--------|DSS API |
+--------|---------+ +--------------+ +--------+
| |
| |
|------------------swagger.json
```
# Will request the first page of Data Bundles from the service.
curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{}' 'https://spbnq0bc10.execute-api.us-west-2.amazonaws.com/api/ga4gh/dos/v1/databundles/list'
```

This lambda provides a lightweight layer that can be used
to access data in the HCA DSS using GA4GH libraries.
There is also a Python client available that makes it easier to use the service from code.

The lambda accepts DOS requests and converts them into requests against
DSS endpoints. The results are then translated into DOS style messages before
being returned to the client.
```
from ga4gh.dos.client import Client
client = Client("https://spbnq0bc10.execute-api.us-west-2.amazonaws.com/api")
local_client = client.client
models = client.models
local_client.ListDataBundles(body={}).result()
```
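
The list endpoint is paged: `app.py` reads `page_token` from the request body and returns `next_page_token` in the response. A minimal sketch of a paging loop under those assumptions is shown here. `list_page` is a hypothetical callable that takes a request body and returns the response as a dict, and `fake_list_page` is a stand-in for the real service so the sketch runs offline.

```python
def iter_data_bundles(list_page):
    """Yield every data bundle, following next_page_token until it runs out.

    list_page is a hypothetical callable: it takes the request body (a dict)
    and returns the list response (a dict).
    """
    body = {}
    while True:
        response = list_page(body)
        for bundle in response.get('data_bundles', []):
            yield bundle
        token = response.get('next_page_token')
        if not token:
            break
        # Ask for the next page by echoing the token back to the service.
        body = {'page_token': token}


def fake_list_page(body):
    """Stand-in for the real service: two pages of made-up bundles."""
    pages = {
        None: {'data_bundles': [{'id': 'a'}, {'id': 'b'}],
               'next_page_token': 't1'},
        't1': {'data_bundles': [{'id': 'c'}]},
    }
    return pages[body.get('page_token')]


all_ids = [bundle['id'] for bundle in iter_data_bundles(fake_list_page)]
```

With the Python client, `list_page` could wrap `local_client.ListDataBundles(body=body).result()`, though the result may be a model object rather than a dict, so the field access would need adapting.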

To make it easy for developers to create clients against this API, the Open API
description is made available.
For more information refer to the [Data Object Service](https://github.com/ga4gh/data-object-service-schemas).

## Development

### Status

## Try it out!
This software is being actively developed to provide the greatest level of feature parity
between DOS and DSS. It also presents an area to explore features that might extend the DOS
API. Current development items can be seen in [the Issues](https://github.com/DataBiosphere/dos-dss-lambda/issues).

Install chalice: `pip install chalice` and try it out yourself!
### Feature development

The Data Object Service can present many of the features of the DSS API naturally. This
lambda should present a useful client for the latest releases of the DSS API.

In addition, the DOS schemas may be extended to present features available from the DSS, but
not from DOS.

#### DSS Features

* Subscriptions
* Authentication
* Querying
* Storage management

#### DOS Features

* File listing
  * The DSS API presents bundle oriented indices and so listing all the details of files
    can be a challenge.
* Filter by URL
  * Retrieve bundle entries by their URL to satisfy the DOS List request.

### Installing and Deploying

The gateway portion of the AWS Lambda microservice is provided by chalice, so you'll need
to install chalice to develop and deploy the service.

Once you have installed chalice, you can download and deploy your own version of the
service.

```
git clone https://github.com/david4096/dos-dss-lambda.git
pip install chalice
git clone https://github.com/DataBiosphere/dos-dss-lambda.git
cd dos-dss-lambda
chalice deploy
```

Chalice will return an HTTP location that you can issue DOS requests to. You can then use
HTTP requests in the style of the [Data Object Service](https://ga4gh.github.io/data-object-service-schemas).

### Accessing data using DOS client

This will return you a URL you can make GA4GH DOS requests against!
A Python client for the Data Object Service is made available [here](https://github.com/ga4gh/data-object-service-schemas/blob/master/python/ga4gh/dos/client.py).
Install this client and then view the example in [Example Usage](https://github.com/DataBiosphere/dos-dss-lambda/example-usage.ipynb).
This notebook will guide you through basic read access to data in the DSS via DOS.

### Issues

If you have a problem accessing the service or deploying it for yourself, please head
over to [the Issues](https://github.com/DataBiosphere/dos-dss-lambda/issues) to let us know!


## TODO
@@ -41,3 +95,24 @@ This will return you a URL you can make GA4GH DOS requests against!
* Error handling
* Aliases
* Filter by URL

```
+------------------+ +--------------+ +--------+
| ga4gh-dos-client |------|dos-dss-lambda|--------|DSS API |
+--------|---------+ +--------------+ +--------+
| |
| |
|------------------swagger.json
```

This lambda provides a lightweight layer that can be used
to access data in the HCA DSS using GA4GH libraries.

The lambda accepts DOS requests and converts them into requests against
DSS endpoints. The results are then translated into DOS style messages before
being returned to the client.
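
As an illustration of the request conversion, the list handler in `app.py` turns a DOS list request into a DSS `/search` call with `replica`, `per_page` and, when paging, `_scroll_id` query parameters. A minimal sketch of that URL construction follows. The function name `build_dss_search_url` and the `DSS_URL` value are hypothetical placeholders, and using `urlencode` is a choice made here, not what the service does.

```python
from urllib.parse import urlencode


def build_dss_search_url(dss_url, per_page, page_token=None):
    """Build the DSS search URL for one page of a DOS list request."""
    params = [('replica', 'aws'), ('per_page', per_page)]
    if page_token:
        # The DOS page_token is passed through as the DSS _scroll_id.
        params.append(('_scroll_id', page_token))
    return '{}/search?{}'.format(dss_url, urlencode(params))


DSS_URL = 'https://dss.example.org/v1'  # placeholder, not the real endpoint
first_page_url = build_dss_search_url(DSS_URL, 10)
next_page_url = build_dss_search_url(DSS_URL, 10, page_token='abc123')
```

Building the query string with `urlencode` also escapes any characters in the token that are not URL-safe, which plain string formatting would not.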

To make it easy for developers to create clients against this API, the Open API
description is made available.


32 changes: 23 additions & 9 deletions app.py
@@ -38,7 +38,8 @@ def make_urls(object_id, path):
"""
replicas = ['aws', 'azure']
urls = map(
lambda replica: {'url' : '{}/{}/{}?replica={}'.format(DSS_URL, path, object_id, replica)},
lambda replica: {'url' : '{}/{}/{}?replica={}'.format(
DSS_URL, path, object_id, replica)},
replicas)
return urls
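
`make_urls` builds one URL per replica. The use of the `urlparse` module elsewhere in this file suggests the code targets Python 2, where `map` returns a list. A minimal Python 3 sketch of the same idea might use a list comprehension instead, since on Python 3 `map` returns a lazy iterator that `json` cannot serialize directly. The `DSS_URL` value below is a placeholder.

```python
DSS_URL = 'https://dss.example.org/v1'  # placeholder, not the real endpoint


def make_urls(object_id, path):
    """Return one URL dict per replica for the given object."""
    replicas = ['aws', 'azure']
    return [
        {'url': '{}/{}/{}?replica={}'.format(DSS_URL, path, object_id, replica)}
        for replica in replicas
    ]


urls = make_urls('abc', 'files')
```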

@@ -115,21 +116,32 @@ def list_data_bundles():
if req_body and (req_body.get('page_token', None)):
page_token = req_body.get('page_token')
if page_token:
res = requests.post("{}/search?replica=aws&per_page={}&_scroll_id={}".format(DSS_URL, per_page, page_token), json={'es_query': {}})
res = requests.post(
"{}/search?replica=aws&per_page={}&_scroll_id={}".format(
DSS_URL, per_page, page_token), json={'es_query': {}})
else:
res = requests.post("{}/search?replica=aws&per_page={}".format(DSS_URL, per_page), json={'es_query': {}})
res = requests.post(
"{}/search?replica=aws&per_page={}".format(
DSS_URL, per_page), json={'es_query': {}})
# We need to page using the github style
if res.links.get('next', None):
try:
# first _scroll_id item of the query string in the link
# header of the response
next_page_token = urlparse.parse_qs(
urlparse.urlparse(res.links['next']['url']).query)['_scroll_id']
urlparse.urlparse(
res.links['next']['url']).query)['_scroll_id'][0]
except Exception as e:
print(e)
# And convert the fqid message into a DOS id and version
response = {}
response['next_page_token'] = next_page_token
response['data_bundles'] = map(dss_list_bundle_to_dos, res.json()['results'])
return response
try:
response['data_bundles'] = map(dss_list_bundle_to_dos, res.json()['results'])
except Exception as e:
response = e
finally:
return response
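
The `[0]` added in this hunk matters because `parse_qs` maps each query parameter to a list of values, so without it the page token would be a one-element list rather than a string. A minimal Python 3 sketch of the extraction, using `urllib.parse` in place of the Python 2 `urlparse` module, is shown here. The helper name `scroll_id_from_link` and the example URL are hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def scroll_id_from_link(next_url):
    """Return the first _scroll_id value in next_url, or None if absent."""
    values = parse_qs(urlparse(next_url).query).get('_scroll_id')
    return values[0] if values else None


# Hypothetical "next" link of the kind found in a paged response's Link header.
next_url = ('https://dss.example.org/v1/search'
            '?replica=aws&per_page=10&_scroll_id=abc123')
token = scroll_id_from_link(next_url)
```

Returning `None` when the parameter is missing gives the caller an explicit value to check instead of relying on an exception.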

@app.route('/ga4gh/dos/v1/databundles/{data_bundle_id}', methods=['GET'], cors=True)
def get_data_bundle(data_bundle_id):
@@ -143,10 +155,12 @@ def get_data_bundle(data_bundle_id):
if app.current_request.query_params:
version = app.current_request.query_params.get('version', None)
if version:
res = requests.get("{}/bundles/{}?replica=aws&version={}".format(DSS_URL, data_bundle_id, version)).json()
res = requests.get("{}/bundles/{}?replica=aws&version={}".format(
DSS_URL, data_bundle_id, version)).json()
else:
res = requests.get("{}/bundles/{}?replica=aws".format(DSS_URL, data_bundle_id)).json()
return dss_bundle_to_dos(res['bundle'])
res = requests.get(
"{}/bundles/{}?replica=aws".format(DSS_URL, data_bundle_id)).json()
return {'data_bundle': dss_bundle_to_dos(res['bundle'])}


