Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/issue 334 drs bulk requests #365

Merged
merged 43 commits into from
Sep 25, 2023
Merged
Show file tree
Hide file tree
Changes from 12 commits
Commits
Show all changes
43 commits
Select commit Hold shift + click to select a range
0e4abfa
first attempt to add bulk operations
Aug 16, 2021
a35aedd
bulk operations prototyping
Aug 16, 2021
e7fcf31
bulk operations prototyping
Aug 16, 2021
ac3e6ea
bulk operations prototyping
Aug 16, 2021
7df1396
bulk operations prototyping
Aug 16, 2021
f1734ab
bulk operations prototyping
Aug 16, 2021
1898a0f
bulk operations prototyping
Aug 16, 2021
982ea08
bulk operations prototyping
Aug 16, 2021
b5806c6
bulk operations prototyping
Aug 16, 2021
e9285af
bulk operations prototyping
Aug 16, 2021
c1d72ce
bulk operations prototyping
Aug 16, 2021
a6c268c
bulk operations prototyping
Aug 16, 2021
ba1ba6b
responding to comments on PR#365
briandoconnor Jan 10, 2022
a0f9f33
Create summary.yaml
briandoconnor Feb 4, 2022
264bf07
Create unresolved.yaml
briandoconnor Feb 4, 2022
39a4fb4
Update 200OkDrsObjects.yaml
briandoconnor Feb 4, 2022
d7ccae7
working on bulk schema changes
Feb 7, 2022
0fad71e
updates
Feb 7, 2022
0a84ce2
updates
Feb 7, 2022
ddcf7c6
fixed 200 return object for objects endpoint
briandoconnor Mar 14, 2022
3dbcc96
fixed 200 return object for objects endpoint
briandoconnor Mar 14, 2022
a9de009
Merge branch 'develop' into feature/issue-334-drs-bulk-requests
briandoconnor Sep 22, 2022
efeca29
trying to include passports as a parameter in the bulk body request
briandoconnor Sep 22, 2022
5938922
updating the build badge, trying to get passport to show up on bulk
briandoconnor Sep 22, 2022
0414c4e
working to get passports
briandoconnor Sep 22, 2022
6baeb80
trying to get passport field to show up
briandoconnor Sep 22, 2022
d9e643e
trying to figure out adding passports
briandoconnor Sep 22, 2022
10d627e
trying to get this to build the docs
briandoconnor Sep 22, 2022
815f7df
adding passports
briandoconnor Mar 27, 2023
d3b09b4
working on bulk method for access URLs
briandoconnor Mar 27, 2023
f604464
resolved conflicts in merge of v2 of this feature branch
briandoconnor Mar 27, 2023
4a71e69
added 413 error
briandoconnor Mar 27, 2023
9bf5892
working on adding bulk options endpoint
briandoconnor May 22, 2023
bd79805
adding error 413 for the bulk options request
briandoconnor May 22, 2023
e1ea19c
updated logo and version string
briandoconnor Jul 11, 2023
22c5a82
removing passport from bulk opts endpoint since it does not make sens…
briandoconnor Jul 11, 2023
b511345
adding new yaml
briandoconnor Jul 11, 2023
5ccf068
updating travis URL
briandoconnor Jul 11, 2023
783ef68
updated the security to PassportAuth
briandoconnor Jul 11, 2023
affde66
merged in develop into my feature branch
briandoconnor Jul 11, 2023
19a815b
adding a maxBulkRequestLength variable to service info
briandoconnor Sep 19, 2023
7e607d3
adding better description of variables
briandoconnor Sep 19, 2023
6f29332
adding better description of variables
briandoconnor Sep 25, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions openapi/components/parameters/BulkObjectId.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
name: bulk_object_id
required: true
description: '`DrsObject` identifier'
schema:
type: string
10 changes: 10 additions & 0 deletions openapi/components/responses/200OkAccesses.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
description: The `AccessURL` was found successfully
content:
application/json:
schema:
type: object
properties:
access_urls:
type: array
items:
$ref: '../schemas/BulkAccessURL.yaml'
10 changes: 10 additions & 0 deletions openapi/components/responses/200OkDrsObjects.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
description: The `DrsObjects` were found successfully
content:
application/json:
schema:
type: object
properties:
drsobjects:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some thoughts on the response payload of a bulk DRS request:

  1. Currently drsobjects is a flat list, would it be advantageous to make it an object with the supplied IDs as keys, and the corresponding DRS Object as the value?
  2. There could be a mix of valid and invalid IDs in the overall selection (e.g. no resource by an ID, client not authorized to access that DRS Object). Should we make a slightly more complex object that clearly outlines which IDs could not be returned as Objects and why?

Taking the above points together, here is a proposed example response payload:

{
  "summary": {
    "requested": 5,
    "loaded": 2,
    "unloaded": 3
  },
  "loadedDrsObjects": {
    "123": {
      "id": "123",
      "name": "DRS Object 123",
      ...
    },
    "456": {
      "id": "456",
      "name": "DRS Object 456",
      ...
    }
  },
  "unloadedDrsObjects": {
    "777": 404,
    "778": 404,
    "779": 401 
  }
}

In the above example, the response payload summarizes how many IDs were passed, how many DRS Objects were successfully retrieved, and how many were unsuccessful. Each retrieved DRS Object can be referenced by its ID. Each non-retrieved DRS Object comes with an explanation (in the form of HTTP code) why it wasn't retrieved.

It's a more complex payload overall, but provides more information to the client about how the request was processed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss on today's call

Copy link

@patmagee patmagee Jan 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments on the proposed format:

Summary

Depending on the number of items that are requested and where we end up falling on pagination in a bulk request, the summary information may not be known ahead of time when the initial response is sent. Its possible this number may be per page, instead of in total.

Results Format

It may be nice to return the objects as an array with the ordering mirroring the order of the requested Id's This leads to less surprises and allows the client to iterate over the objects in the order they expect. Additionally, since this is a bulk operation we could interleave the errors that are encountered directly in the array at the appropriate index. (this is using the error message format from beacon here

for example, if I request both [456,457] the following may be a valid response

{
  "drsObjects" [
    {
      "id": 456,
      "name":  "..."
    },
    {
      "id": 457,
      "error":  {
        "errorCode": 404,
        "errorMessage": "DRS object does not exist"
      }
    }
  ]
}

type: array
items:
$ref: '../schemas/DrsObject.yaml'
17 changes: 17 additions & 0 deletions openapi/components/schemas/BulkAccessURL.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
type: object
required:
- url
properties:
object_id:
$ref: '../components/parameters/BulkObjectId.yaml'
url:
type: string
description: A fully resolvable URL that can be used to fetch the actual object bytes.
headers:
type: array
items:
type: string
description: >-
An optional list of headers to include in the HTTP request to `url`.
These headers can be used to provide auth tokens required to fetch the object bytes.
example: 'Authorization: Basic Z2E0Z2g6ZHJz'
6 changes: 5 additions & 1 deletion openapi/data_repository_service.openapi.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@ servers:
serverURL:
default: drs.example.org
description: >
DRS server endpoints MUST be prefixed by the '/ga4gh/drs/v1' endpoint
DRS server endpoints MUST be prefixed by the '/ga4gh/drs/v1' endpoint
path
security:
- {}
Expand Down Expand Up @@ -103,8 +103,12 @@ x-tagGroups:
paths:
/objects/{object_id}:
$ref: ./paths/objects@{object_id}.yaml
/bulk/objects:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about making this endpoint at just POST /objects rather than POST /bulk/objects ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I'll make that change.

We do need to figure out how to enable multi-part POST here so it can be used with Passports. this is a TODO

$ref: ./paths/bulkobjects@{object_id}.yaml
/objects/{object_id}/access/{access_id}:
$ref: ./paths/objects@{object_id}@access@{access_id}.yaml
/bulk/objects/access/{access_id}:
$ref: ./paths/bulkobjects@access@{access_id}.yaml
components:
securitySchemes:
BasicAuth:
Expand Down
40 changes: 40 additions & 0 deletions openapi/paths/bulkobjects@access@{access_id}.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
post:
summary: Get URLs for fetching bytes from multiple objects
description: >-
Returns an array of URL objects that can be used to fetch the bytes of multiple `DrsObject`s.

This method only needs to be called when using an `AccessMethod` that contains an `access_id`
(e.g., for servers that use signed URLs for fetching object bytes).
operationId: GetBulkAccessURL
parameters:
- $ref: '../components/parameters/AccessId.yaml'
responses:
200:
$ref: '../components/responses/200OkAccesses.yaml'
202:
$ref: '../components/responses/202Accepted.yaml'
400:
$ref: '../components/responses/400BadRequest.yaml'
401:
$ref: '../components/responses/401Unauthorized.yaml'
403:
$ref: '../components/responses/403Forbidden.yaml'
404:
$ref: '../components/responses/404NotFoundAccess.yaml'
500:
$ref: '../components/responses/500InternalServerError.yaml'
tags:
- Objects
x-swagger-router-controller: ga4gh.drs.server
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
object_ids:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about renaming this property to selection to harmonize with the DRS + Passport downscoping thread.

@kwrodarmer @mbarkley WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So "object_ids" --> "selection"? I'm fine with that...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It started as selection and, with some discussion, we consciously went to object_ids as there may be other use cases than "selection" for how you get to a list of ids that you want to request. object_ids is neutral as to use case, rather than projecting a particular use case on to DRS which has some generality.

type: array
items:
$ref: '../components/parameters/BulkObjectId.yaml'
description: An array of ObjectIDs
37 changes: 37 additions & 0 deletions openapi/paths/bulkobjects@{object_id}.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
post:
summary: Get info about multiple DrsObjects.
description: >-
Returns an array of object metadata, and a list of access methods that can be used to fetch objects' bytes.
operationId: GetBulkObjects
parameters:
- $ref: '../components/parameters/Expand.yaml'
responses:
200:
$ref: '../components/responses/200OkDrsObjects.yaml'
202:
$ref: '../components/responses/202Accepted.yaml'
400:
$ref: '../components/responses/400BadRequest.yaml'
401:
$ref: '../components/responses/401Unauthorized.yaml'
403:
$ref: '../components/responses/403Forbidden.yaml'
404:
$ref: '../components/responses/404NotFoundDrsObject.yaml'
500:
$ref: '../components/responses/500InternalServerError.yaml'
tags:
- Objects
x-swagger-router-controller: ga4gh.drs.server
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
object_ids:
type: array
items:
$ref: '../components/parameters/BulkObjectId.yaml'
description: An array of ObjectIds