move 'More Background on Compact Identifiers" subdoc to its own folder, build via same process as main spec document
Jeremy Adams committed Jul 5, 2021
1 parent ea838cd commit 034afaf
Showing 23 changed files with 207 additions and 434 deletions.
17 changes: 15 additions & 2 deletions .spec-docs.json
Original file line number Diff line number Diff line change
@@ -1,7 +1,20 @@
{
"apiSpecPath": "openapi/data_repository_service.openapi.yaml",
"docsRoot": "docs",
"defaultBranch": "master",
"branchPathBase": "preview",
"redocTheme": "ga4gh"
"redocTheme": "ga4gh",
"buildPages": [
{
"apiSpecPath": "openapi/data_repository_service.openapi.yaml",
"htmlOutfile": "index.html",
"yamlOutfile": "openapi.yaml",
"jsonOutfile": "openapi.json"
},
{
"apiSpecPath": "pages/more-background-on-compact-identifiers/openapi.yaml",
"htmlOutfile": "more-background-on-compact-identifiers.html",
"yamlOutfile": "more-background-on-compact-identifiers.yaml",
"jsonOutfile": "more-background-on-compact-identifiers.json"
}
]
}
2 changes: 1 addition & 1 deletion .travis.yml
@@ -20,7 +20,7 @@ jobs:
- "12"
before_script:
- npm install -g @redocly/openapi-cli && npm install -g redoc-cli
- npm install -g gh-openapi-docs
- npm install -g @ga4gh/gh-openapi-docs@0.2.2-rc3
script:
- gh-openapi-docs
deploy:
18 changes: 18 additions & 0 deletions openapi/tags/Auth.md
@@ -22,3 +22,21 @@ The DRS API allows implementers to support a variety of different content access
* caller fetches the object bytes from the `url` (passing auth info from the specified headers, if any)

DRS implementers should ensure their solutions restrict access to targets as much as possible, detect exploit attempts through log monitoring, and be prepared to take action if an exploit in their DRS implementation is detected.

## Authentication

### BasicAuth

A valid authorization token must be passed in the 'Authorization' header, e.g. "Basic ${token_string}"

| Security Scheme Type | HTTP |
|----------------------|------|
| **HTTP Authorization Scheme** | basic |

### BearerAuth

A valid authorization token must be passed in the 'Authorization' header, e.g. "Bearer ${token_string}"

| Security Scheme Type | HTTP |
|----------------------|------|
| **HTTP Authorization Scheme** | bearer |
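
As a minimal sketch of the two schemes above, the following Python builds the `Authorization` header for each. The function names are hypothetical, and the Basic token is assumed to be the Base64 encoding of `username:password`, as in standard HTTP Basic authentication; a given DRS server may issue its tokens differently.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build a Basic header (assumes the token is base64 of 'username:password')."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def bearer_auth_header(token: str) -> dict:
    """Build a Bearer header from an already-issued token string."""
    return {"Authorization": f"Bearer {token}"}
```

Either dictionary can be passed as the request headers by whatever HTTP client the caller uses.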
4 changes: 2 additions & 2 deletions openapi/tags/CompactIdentifierBasedURIs.md
@@ -4,7 +4,7 @@ The examples below show the current API interactions with [n2t.net](https://n2t.

## Registering a DRS Server on a Meta-Resolver

See the documentation on the [n2t.net](https://n2t.net/e/compact_ids.html) and [identifiers.org](https://docs.identifiers.org/) meta-resolvers for adding your own compact identifier type and registering your DRS server as a resolver. You can register new prefixes (or mirrors by adding resource provider codes) for free using a simple online form. For more information see [More Background on Compact Identifiers](/data-repository-service-schemas/sources/md/more_background_on_compact_identifiers).
See the documentation on the [n2t.net](https://n2t.net/e/compact_ids.html) and [identifiers.org](https://docs.identifiers.org/) meta-resolvers for adding your own compact identifier type and registering your DRS server as a resolver. You can register new prefixes (or mirrors by adding resource provider codes) for free using a simple online form. For more information see [More Background on Compact Identifiers](./more-background-on-compact-identifiers.html).

## Calling Meta-Resolver APIs for Compact Identifier-Based DRS URIs

@@ -70,4 +70,4 @@ The compact identifier format used by identifiers.org/n2t.net does not percent-e

## Additional Examples

For additional examples, see the document [More Background on Compact Identifiers](/data-repository-service-schemas/sources/md/more_background_on_compact_identifiers).
For additional examples, see the document [More Background on Compact Identifiers](./more-background-on-compact-identifiers.html).
10 changes: 5 additions & 5 deletions openapi/tags/DrsApiPrinciples.md
@@ -15,7 +15,7 @@ For convenience, including when passing content references to a [WES server](htt
There are two styles of DRS URIs, Hostname-based and Compact Identifier-based, both using the `drs://` URI scheme. DRS servers may choose either style when exposing references to their content. DRS clients MUST support resolving both styles.

Tip:
> See [Appendix: Background Notes on DRS URIs](#tag/Appendix:-Background-Notes-on-DRS-URIs) for more information on our design motivations for DRS URIs.
> See [Appendix: Background Notes on DRS URIs](#tag/Background-Notes-on-DRS-URIs) for more information on our design motivations for DRS URIs.
### Hostname-based DRS URIs

@@ -42,13 +42,13 @@ GET https://drs.example.org/ga4gh/drs/v1/objects/314159
The protocol is always https and the port is always the standard 443 SSL port. It is invalid to include a different port in a DRS hostname-based URI.
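
The mapping from a hostname-based DRS URI to its URL can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, and it assumes the URI has the form `drs://<hostname>/<id>` with an ID that is already percent-encoded, resolving to the `/ga4gh/drs/v1/objects/` path shown in the GET request above.

```python
def hostname_drs_uri_to_url(drs_uri: str) -> str:
    """Translate drs://<hostname>/<id> into the https URL for the DRS object."""
    scheme = "drs://"
    if not drs_uri.startswith(scheme):
        raise ValueError("not a DRS URI")
    hostname, sep, object_id = drs_uri[len(scheme):].partition("/")
    if not sep or not hostname or not object_id:
        raise ValueError("expected drs://<hostname>/<id>")
    if ":" in hostname:
        # A port (or any other ':') is invalid in a hostname-based DRS URI.
        raise ValueError("hostname-based DRS URIs must not include a port")
    # The protocol is always https on the standard port, so none is written.
    return f"https://{hostname}/ga4gh/drs/v1/objects/{object_id}"
```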

Tip:
> See the [Appendix: Hostname-Based URIs](#tag/Appendix:-Hostname-Based-URIs) for information on how hostname-based DRS URI resolution to URLs is likely to change in the future, when the DRS v2 major release happens.
> See the [Appendix: Hostname-Based URIs](#tag/Hostname-Based-URIs) for information on how hostname-based DRS URI resolution to URLs is likely to change in the future, when the DRS v2 major release happens.
### Compact Identifier-based DRS URIs

Compact Identifier-based DRS URIs use resolver registry services (specifically, [identifiers.org](https://identifiers.org/) and [n2t.net (Name-To-Thing)](https://n2t.net/)) to provide a layer of indirection between the DRS URI and the DRS server name — the actual DNS name of the DRS server isn’t present in the URI. This approach is based on the Joint Declaration of Data Citation Principles as detailed by [Wimalaratne et al (2018)](https://www.nature.com/articles/sdata201829).

For more information, see the document [More Background on Compact Identifiers](/data-repository-service-schemas/sources/md/more_background_on_compact_identifiers).
For more information, see the document [More Background on Compact Identifiers](./more-background-on-compact-identifiers.html).

Compact Identifiers take the form:

@@ -59,7 +59,7 @@ drs://[provider_code/]namespace:accession
Together, the provider code and the namespace are referred to as the `prefix`. The provider code is optional and is used by identifiers.org/n2t.net for compact identifier resolver mirrors. Both the `provider_code` and `namespace` disallow spaces and punctuation; only lowercase alphanumeric characters, underscores, and dots are allowed (e.g. [A-Za-z0-9._]).

Tip:
> See the [Appendix: Compact Identifier-Based URIs](#tag/Appendix:-Compact-Identifier-Based-URIs) for more background on Compact Identifiers and resolver registry services like identifiers.org/n2t.net (aka meta-resolvers), how to register prefixes, possible caching strategies, and security considerations.
> See the [Appendix: Compact Identifier-Based URIs](#tag/Compact-Identifier-Based-URIs) for more background on Compact Identifiers and resolver registry services like identifiers.org/n2t.net (aka meta-resolvers), how to register prefixes, possible caching strategies, and security considerations.
#### For DRS Servers

@@ -101,7 +101,7 @@ DRS servers can choose to issue either hostname-based or compact identifier-base
|-------------------|----------------|--------------------------|
| URI Durability | URIs are valid for as long as the server operator maintains ownership of the published DNS address. (They can of course point that address at different physical serving infrastructure as often as they’d like.) | URIs are valid for as long as the server operator maintains ownership of the published compact identifier resolver namespace. (They also depend on the meta-resolvers like identifiers.org/n2t.net remaining operational, which is intended to be essentially forever.) |
| Client Efficiency | URIs require minimal client logic, and no network requests, to resolve. | URIs require a small amount of client logic, and 1-2 cacheable network requests, to resolve. |
| Security | Servers have full control over their own security practices. | Server operators, in addition to maintaining their own security practices, should confirm they are comfortable with the resolver registry security practices, including protection against denial of service and namespace-hijacking attacks. (See the [Appendix: Compact Identifier-Based URIs](#tag/Appendix:-Compact-Identifier-Based-URIs) for more information on resolver registry security.) |
| Security | Servers have full control over their own security practices. | Server operators, in addition to maintaining their own security practices, should confirm they are comfortable with the resolver registry security practices, including protection against denial of service and namespace-hijacking attacks. (See the [Appendix: Compact Identifier-Based URIs](#tag/Compact-Identifier-Based-URIs) for more information on resolver registry security.) |

## DRS Datatypes

6 changes: 3 additions & 3 deletions openapi/tags/Motivation.md
@@ -4,7 +4,7 @@
Data sharing requires portable data, consistent with the FAIR data principles (findable, accessible, interoperable, reusable). Today’s researchers and clinicians are surrounded by potentially useful data, but often need bespoke tools and processes to work with each dataset. Today’s data publishers don’t have a reliable way to make their data useful to all (and only) the people they choose. And today’s data controllers are tasked with implementing standard controls of non-standard mechanisms for data access.
</td>
<td style="width:60%">
<img src="/data-repository-service-schemas/sources/img/figure1.png">
<img src="/data-repository-service-schemas/public/img/figure1.png">
<em>
Figure 1: there’s an ocean of data, with many different tools to drink from it, but no guarantee that any tool will work with any subset of the data
</em>
@@ -18,7 +18,7 @@
We need a standard way for data producers to make their data available to data consumers, that supports the control needs of the former and the access needs of the latter. And we need it to be interoperable, so anyone who builds access tools and systems can be confident they’ll work with all the data out there, and anyone who publishes data can be confident it will work with all the tools out there.
</td>
<td style="width:60%">
<img src="/data-repository-service-schemas/sources/img/figure2.png">
<img src="/data-repository-service-schemas/public/img/figure2.png">
<em>
Figure 2: by defining a standard Data Repository API, and adapting tools to use it, every data publisher can now make their data useful to every data consumer
</em>
@@ -49,7 +49,7 @@
</ul>
</td>
<td style="width:25%">
<img src="/data-repository-service-schemas/sources/img/figure3.png">
<img src="/data-repository-service-schemas/public/img/figure3.png">
<em>
Figure 3: a standard Data Repository API enables an ecosystem of data producers and consumers
</em>
29 changes: 29 additions & 0 deletions pages/more-background-on-compact-identifiers/openapi.yaml
@@ -0,0 +1,29 @@
openapi: 3.0.3
info:
title: More Background on Compact Identifiers
version: 1.1.0
x-logo:
url: 'https://www.ga4gh.org/wp-content/themes/ga4gh-theme/gfx/GA-logo-horizontal-tag-RGB.svg'
termsOfService: 'https://www.ga4gh.org/terms-and-conditions/'
contact:
name: GA4GH Cloud Work Stream
email: [email protected]
license:
name: Apache 2.0
url: 'https://raw.githubusercontent.com/ga4gh/data-repository-service-schemas/master/LICENSE'
tags:
- name: About
description:
$ref: ./tags/About.md
- name: Background on Compact Identifier-Based URIs
description:
$ref: ./tags/BackgroundOnCompactIdentiferBasedURIs.md
- name: Registering a DRS Server on a Meta-Resolver
description:
$ref: ./tags/RegisteringOnMetaResolver.md
- name: Example DRS Client Compact Identifier-Based URI Resolution Process - Existing Compact Identifier Provider
description:
$ref: ./tags/ExampleExistingProvider.md
- name: Example DRS Client Compact Identifier-Based URI Resolution Process - Registering a new Compact Identifier for Your DRS Server
description:
$ref: ./tags/ExampleRegisterIdentifier.md
1 change: 1 addition & 0 deletions pages/more-background-on-compact-identifiers/tags/About.md
@@ -0,0 +1 @@
This document contains more examples of resolving compact identifier-based DRS URIs than we could fit in the DRS specification or appendix. It’s provided here for your reference as a supplement to the specification.
@@ -0,0 +1,32 @@
Compact identifiers refer to locally-unique persistent identifiers that have been namespaced to provide global uniqueness. See ["Uniform resolution of compact identifiers for biomedical data"](https://www.biorxiv.org/content/10.1101/101279v3) for an excellent introduction to this topic. By using compact identifiers in DRS URIs, along with a resolver registry (identifiers.org/n2t.net), systems can identify the current resolver when they need to translate a DRS URI into a fetchable URL. This allows a project to issue compact identifiers in DRS URIs without being concerned about whether the project name or DRS hostname changes in the future, because the current resolver can always be found through the identifiers.org/n2t.net registries. Together, the identifiers.org/n2t.net systems support resolver lookup for over 700 compact identifier formats used in the research community, making it possible for a DRS server to use any of these as DRS IDs (or to register a new compact identifier type and resolver service of its own).

We use a DRS URI scheme rather than [Compact URIs (CURIEs)](https://en.wikipedia.org/wiki/CURIE) directly since we feel that systems consuming DRS objects will be better able to differentiate a DRS URI. CURIEs are widely used in the research community, and we feel the fact that they can point to a wide variety of entities (HTML documents, PDFs, identities in data models, etc.) makes it more difficult for systems to unambiguously identify entities as DRS objects.

Still, to make compact identifiers work in DRS URIs we leverage the CURIE format used by identifiers.org/n2t.net. Compact identifiers have the form:

```
prefix:accession
```

The prefix can be divided into a `provider_code` (optional) and a `namespace`. The `accession` here is an ARK, DOI, Data GUID, or another issuer’s local ID for the object being pointed to:

```
[provider_code/]namespace:accession
```

Both the `provider_code` and `namespace` disallow spaces and punctuation; only lowercase alphanumeric characters, underscores, and dots are allowed.

[Examples](https://n2t.net/e/compact_ids.html) include (from n2t.net):

```
PDB:2gc4
Taxon:9606
DOI:10.5281/ZENODO.1289856
ark:/47881/m6g15z54
IGSN:SSH000SUA
```
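
The `[provider_code/]namespace:accession` form can be split with a few lines of Python. This is a minimal sketch with hypothetical names (`CompactIdentifier`, `parse_compact_identifier`); it only separates the components and does not validate the characters allowed in the prefix or normalize case.

```python
from typing import NamedTuple, Optional


class CompactIdentifier(NamedTuple):
    provider_code: Optional[str]
    namespace: str
    accession: str


def parse_compact_identifier(text: str) -> CompactIdentifier:
    """Split '[provider_code/]namespace:accession' into its components."""
    # Split on the first ':' only; the accession may itself contain ':' or '/'.
    prefix, sep, accession = text.partition(":")
    if not sep or not prefix or not accession:
        raise ValueError("expected [provider_code/]namespace:accession")
    if "/" in prefix:
        provider_code, namespace = prefix.split("/", 1)
    else:
        provider_code, namespace = None, prefix
    return CompactIdentifier(provider_code, namespace, accession)
```

Splitting on the colon before looking for a slash matters for identifiers such as `ark:/47881/m6g15z54`, where the slashes belong to the accession rather than to a provider code.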

Tip:
> DRS URIs using compact identifiers with resolvers registered in identifiers.org/n2t.net can be distinguished from the hostname-based DRS URIs below based on the required ":", which is not allowed in a hostname-based URI.
See the documentation on [n2t.net](https://n2t.net/e/compact_ids.html) and [identifiers.org](https://docs.identifiers.org/) for much more information on the compact identifiers used there and details about the resolution process.
@@ -0,0 +1,43 @@
A DRS client identifies the compact identifier components of a DRS URI using the first occurrence of the "/" (optional) and ":" characters. These are not allowed inside the provider_code (optional) or the namespace. The ":" character is not allowed in a hostname-based DRS URI, providing a convenient mechanism to differentiate them. Once the provider_code (optional) and namespace are extracted from a DRS compact identifier-based URI, a client can use services on identifiers.org to identify available resolvers.
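
The ":" test described above might look like the following in Python. The function name and the returned labels are hypothetical, chosen only for illustration.

```python
def drs_uri_style(drs_uri: str) -> str:
    """Return 'compact' or 'hostname' depending on the DRS URI style."""
    scheme = "drs://"
    if not drs_uri.startswith(scheme):
        raise ValueError("not a DRS URI")
    remainder = drs_uri[len(scheme):]
    # ':' is required in compact identifier-based URIs and is not allowed
    # in hostname-based ones, so its presence decides the style.
    return "compact" if ":" in remainder else "hostname"
```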

*Let’s look at a specific example DRS compact identifier-based URI that uses DOIs, a popular compact identifier, and walk through the process that a client would use to resolve it. Keep in mind that the resolution process is the same from the client perspective whether a given DRS server is using an existing compact identifier type (DOIs, ARKs, Data GUIDs) or creating its own compact identifier type and registering it on identifiers.org/n2t.net.*

Starting with the DRS URI:

```
drs://doi:10.5072/FK2805660V
```

with a namespace of "doi", the following GET request will return information about the namespace:

```
GET https://registry.api.identifiers.org/restApi/namespaces/search/findByPrefix?prefix=doi
```

This information then points to resolvers for the "doi" namespace. This "doi" namespace was assigned a namespace ID of 75 by identifiers.org. This "id" has nothing to do with compact identifier accessions (which are used in the URL pattern as `{$id}` below) or DRS IDs. This namespace ID (75 below) is purely an identifiers.org internal ID for use with their APIs:

```
GET https://registry.api.identifiers.org/restApi/resources/search/findAllByNamespaceId?id=75
```

This returns enough information to ultimately identify one or more resolvers, each of which has a URL pattern that, for DRS-supporting systems, provides a URL template for making a successful DRS GET request. For example, the DOI urlPattern is:

```
urlPattern: "https://doi.org/{$id}"
```

The `{$id}` here refers to the accession from the compact identifier (in this example the accession is `10.5072/FK2805660V`). If applicable, a provider code can be supplied in the above requests to specify a particular mirror if there are multiple resolvers for this namespace. In the case of DOIs, you only get a single resolver.
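
The steps so far can be sketched in Python without making any network calls. The helper names are hypothetical; the endpoints and the `{$id}` placeholder are the ones shown above, and parsing of the registry's JSON responses is deliberately left out because their structure is not described here.

```python
from urllib.parse import urlencode

REGISTRY_API = "https://registry.api.identifiers.org/restApi"


def namespace_lookup_url(prefix: str) -> str:
    """URL of the request that returns information about a namespace."""
    query = urlencode({"prefix": prefix})
    return f"{REGISTRY_API}/namespaces/search/findByPrefix?{query}"


def resources_lookup_url(namespace_id: int) -> str:
    """URL of the request that lists resolvers for an internal namespace ID."""
    query = urlencode({"id": namespace_id})
    return f"{REGISTRY_API}/resources/search/findAllByNamespaceId?{query}"


def apply_url_pattern(url_pattern: str, accession: str) -> str:
    """Substitute the accession for the {$id} placeholder in a urlPattern."""
    return url_pattern.replace("{$id}", accession)
```

For the DOI example, `apply_url_pattern("https://doi.org/{$id}", "10.5072/FK2805660V")` yields `https://doi.org/10.5072/FK2805660V`.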

Given this information you now know you can make a GET on the URL:

```
GET https://doi.org/10.5072/FK2805660V
```

*The URL above is valid for a DOI object, but it is not actually a DRS server! Instead, it redirects to a DRS server through a series of HTTPS redirects. This is likely to be common when working with existing compact identifiers like DOIs or ARKs. Regardless, the redirects should eventually lead to a DRS URL that percent-encodes the accession as a DRS ID in a DRS object API call. As a **hypothetical** example, here’s what a redirect to a DRS API URL might ultimately look like. A client doesn’t have to do anything other than follow the HTTPS redirects. The link between the DOI resolver on doi.org and the DRS server URL below is the result of the DRS server registering its data objects with a DOI issuer.*

```
GET https://drs.example.org/ga4gh/drs/v1/objects/10.5072%2FFK2805660V
```

IDs in DRS hostname-based URIs/URLs are always percent-encoded to eliminate ambiguity, even though DRS compact identifier-based URIs and the identifiers.org API do not percent-encode accessions. This was done in order to 1) follow the CURIE conventions of identifiers.org/n2t.net for compact identifier-based DRS URIs and 2) aid readability for users who understand they are working with compact identifiers. **As a general rule of thumb, when using a compact identifier accession as a DRS ID in a DRS API call, make sure to percent-encode it. An easy way for a DRS client to handle this is to get the initial DRS object JSON response from whatever redirects the compact identifier resolves to, then look for the** `self_uri` **in the JSON, which will give you the correctly percent-encoded DRS ID for subsequent DRS API calls such as the** `access` **method.**
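
Where a client does need to encode an accession itself, Python's standard library can do it. This is a sketch only: the function name is hypothetical, and the base URL is the hypothetical `drs.example.org` server from the example above.

```python
from urllib.parse import quote


def drs_object_url(base_url: str, accession: str) -> str:
    """Percent-encode an accession and use it as the DRS ID in an object URL."""
    # safe="" ensures that '/' inside the accession is encoded as %2F too.
    drs_id = quote(accession, safe="")
    return f"{base_url}/ga4gh/drs/v1/objects/{drs_id}"
```

With `base_url="https://drs.example.org"` and the accession `10.5072/FK2805660V`, this produces the same URL as the hypothetical GET request above.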