Release/drs 1.5.0 #408
base: master
Changes from all commits
55ae2be
785cb9c
170a4ba
aa52de7
c533609
54284fc
c9976c9
becd4e1
42a4ba8
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -24,11 +24,23 @@ properties: | |
An arbitrary string to be passed to the `/access` method to get an `AccessURL`. | ||
This string must be unique within the scope of a single object. | ||
Note that at least one of `access_url` and `access_id` must be provided. | ||
cloud: | ||
type: string | ||
description: >- | ||
Name of the cloud service provider that the object belongs to. | ||
If the cloud service is Amazon Web Services, Google Cloud Platform or Azure the values should be `aws`, `gcp`, or `azure` respectively. | ||
example: aws, gcp, or azure | ||
Review comment: Not sure how pedantic you want to be but this is more a repeat of the description and not a valid example. Maybe |
||
region: | ||
type: string | ||
description: >- | ||
Name of the region in the cloud service provider that the object belongs to. | ||
example: us-east-1 | ||
available: | ||
type: boolean | ||
description: >- | ||
Availability of file in the cloud. | ||
This label defines whether this file is immediately accessible via DRS. Any delay, or any need for a thawing mechanism when the file is in offline/archival storage, is classified as 0 or unavailable. | ||
example: true | ||
authorizations: | ||
allOf: | ||
- $ref: './Authorizations.yaml' | ||
|
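A minimal sketch of how a client might use the new `cloud`, `region`, and `available` fields to choose an access method. The helper name `pick_access_method` and the sample data are hypothetical; only the three field names come from the schema in this diff. Treating a missing `available` field as "available" is an assumption, since the description does not say how absence should be read.

```python
from typing import List, Optional


def pick_access_method(access_methods: List[dict], cloud: str,
                       region: Optional[str] = None) -> Optional[dict]:
    """Return the first immediately available access method in the given cloud.

    Hypothetical helper: only the field names `cloud`, `region`, and
    `available` are taken from the schema; the rest is illustrative.
    """
    for method in access_methods:
        if method.get("cloud") != cloud:
            continue
        if region is not None and method.get("region") != region:
            continue
        # Assumption: a missing `available` field counts as available.
        if method.get("available", True):
            return method
    return None


# Illustrative data, not taken from any real DRS server.
methods = [
    {"type": "s3", "access_id": "a1", "cloud": "aws", "region": "us-east-1", "available": False},
    {"type": "s3", "access_id": "a2", "cloud": "aws", "region": "us-east-1", "available": True},
    {"type": "gs", "access_id": "g1", "cloud": "gcp", "available": True},
]
chosen = pick_access_method(methods, "aws", region="us-east-1")
```

Here the archived copy `a1` is skipped because it is marked unavailable, so a client would avoid waiting on a thaw when another copy can be fetched right away.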
|
@@ -6,6 +6,12 @@ properties: | |
maxBulkRequestLength: | ||
type: integer | ||
description: The max length the bulk request endpoints can handle (>= 1) before generating a 413 error e.g. how long can the arrays bulk_object_ids and bulk_object_access_ids be for this server. | ||
objectCount: | ||
Review comment: In the documentation, these stats show as top-level in the "response sample", vs. in a "stats" object in the example. |
||
type: integer | ||
description: The total number of objects in this DRS service. | ||
totalObjectSize: | ||
type: integer | ||
description: The total size of all objects in this DRS service in bytes. As a general best practice, file bytes are counted for each unique file and not cloud mirrors or other redundant copies. | ||
type: | ||
type: object | ||
required: | ||
|
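One possible reading of the `totalObjectSize` guidance, sketched in Python: every record counts toward `objectCount`, but bytes are summed once per unique file. Identifying a "unique file" by its checksum is an assumption, as are the function name `compute_stats` and the record keys `checksum` and `size`; the schema itself does not prescribe how to detect mirrors.

```python
from typing import Dict, Iterable


def compute_stats(records: Iterable[dict]) -> Dict[str, int]:
    """Compute service-info stats from a list of object records.

    Assumption: two records with the same checksum are copies of the same
    file, so their bytes are counted only once.
    """
    object_count = 0
    size_by_checksum: Dict[str, int] = {}
    for record in records:
        object_count += 1
        # Keep the size of the first record seen for each checksum.
        size_by_checksum.setdefault(record["checksum"], record["size"])
    return {
        "objectCount": object_count,
        "totalObjectSize": sum(size_by_checksum.values()),
    }


# Illustrative records; obj-3 is a mirror of obj-1.
records = [
    {"id": "obj-1", "checksum": "aaa", "size": 100},
    {"id": "obj-2", "checksum": "bbb", "size": 250},
    {"id": "obj-3", "checksum": "aaa", "size": 100},
]
stats = compute_stats(records)
```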
@@ -1,7 +1,7 @@ | ||
get: | ||
summary: Retrieve information about this service | ||
description: |- | ||
Returns information about the DRS service | ||
Returns information about the DRS service along with stats pertaining to total object count and cumulative size in bytes. | ||
|
||
Extends the | ||
[v1.0.0 GA4GH Service Info specification](https://github.com/ga4gh-discovery/ga4gh-service-info) | ||
|
@@ -22,9 +22,14 @@ get: | |
... | ||
"type": { | ||
"group": "org.ga4gh", | ||
"artifact": "drs" | ||
"artifact": "drs", | ||
"version": "1.5" | ||
} | ||
... | ||
"stats": { | ||
Review comment: This does not match what the response sample indicates. Additionally, I would prefer some consistency with projects like refget and htsget, where the service-specific additions to the service-info response are nested in an object named after the service artifact.
Review comment: Hey David, could you point me to the specifics for refget and htsget? I would love to understand how they are written so we could incorporate those into DRS.
Review comment: Thanks for pointing this out @davidlougheed, I think it's a good idea to be consistent here so Michael or I will take a look and try to address your feedback ASAP.
Review comment: For refget, see http://samtools.github.io/hts-specs/refget.html - specifically
Review comment: Although I don't feel as strongly about how the stats are nested in the response as much as I feel that they should be nested, in one form or another. |
||
"objectCount": 774560, | ||
"totalObjectSize": 4018437188907752 | ||
} | ||
} | ||
``` | ||
|
||
|
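Because the review thread leaves open where the stats end up in the service-info response, a client may want to tolerate more than one layout. The sketch below checks a `stats` object (as in the example in this diff), then an object named after the service artifact (`drs`, which is only the reviewer's suggestion and not part of the diff), then the top level (as in the response sample the reviewers mention). The function name `extract_stats` is hypothetical.

```python
def extract_stats(service_info: dict) -> dict:
    """Find objectCount and totalObjectSize in a service-info response.

    Layouts tried, in order: nested under "stats", nested under "drs"
    (an assumed name based on the review suggestion), then top-level.
    """
    candidates = (
        service_info.get("stats"),
        service_info.get("drs"),
        service_info,
    )
    for candidate in candidates:
        if (isinstance(candidate, dict)
                and "objectCount" in candidate
                and "totalObjectSize" in candidate):
            return {
                "objectCount": candidate["objectCount"],
                "totalObjectSize": candidate["totalObjectSize"],
            }
    raise KeyError("no DRS stats found in service-info response")


# Values copied from the example in the diff above.
nested = {
    "type": {"group": "org.ga4gh", "artifact": "drs", "version": "1.5"},
    "stats": {"objectCount": 774560, "totalObjectSize": 4018437188907752},
}
result = extract_stats(nested)
```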
@@ -0,0 +1,5 @@ | ||
## DRS and Data Connect | ||
|
||
With DRS objects it may be necessary to attach additional metadata to your objects. We believe that a change to the API of DRS to include metadata is not in the spirit of the DRS spec and in general DRS should have no knowledge of the metadata associated with the objects. We have found that a good GA4GH standard to support this is Data Connect (https://github.com/ga4gh-discovery/data-connect). The general approach would be to have a Data Connect service on your platform and to include "tables" with the ID matching your DRS ID for the same object. This means that if you have metadata associated with an object id `abcd` (ex. additional information about Compound Objects) all you need to do is request the information from the Data Connect client at `/tables/abcd/info`. There are optional functionalities of Data Connect, such as querying of tables, but we do not explore them or give any recommendations here. | ||
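As a rough illustration of the lookup described above, the following sketch builds the Data Connect URL for the table whose name matches a DRS ID. The function name `table_info_url` and the base URL are hypothetical, and the path template simply mirrors the `/tables/abcd/info` form used in this text; it should be checked against the Data Connect version deployed on your platform before use.

```python
from urllib.parse import quote


def table_info_url(data_connect_base: str, drs_id: str,
                   path_template: str = "/tables/{table}/info") -> str:
    """Build the URL of the Data Connect table named after a DRS ID.

    Assumption: the default path template follows the text above and is not
    verified against the Data Connect specification.
    """
    # Percent-encode the ID so characters such as "/" stay within one path segment.
    table = quote(drs_id, safe="")
    return data_connect_base.rstrip("/") + path_template.format(table=table)


# Hypothetical base URL for illustration only.
url = table_info_url("https://example.org/data-connect/", "abcd")
```

Keeping the path as a parameter means the same helper still works if the deployed service exposes table info under a different route.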
|
||
Here is an example of using Data Connect with DRS in the fasp-scripts repository (https://github.com/ga4gh/fasp-scripts/blob/master/notebooks/drs/DRS%20File%20Data.ipynb). In this notebook we can see that data connect is used to get DRS IDs from a platform. Those DRS IDs are then used to gather additional information about the file that might be necessary for analysis. This is just one example of how DRS and Data Connect can interact with each other to gather information about data on a platform. | ||
Review comment: Use "Data Connect" here: "... can see that data connect is used to ...". |
Review comment: Minor question: why was this description deleted?
Review comment: There was a duplicate "description" entry here which caused a silent build failure for our documentation (took a long time for me to find that!!)