Initial input for the creation of formal specification documents #72

Closed
wants to merge 10 commits into from
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
.idea/
8 changes: 2 additions & 6 deletions CHANGELOG.md
@@ -6,13 +6,9 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).


## 2022-09

### Changed
- fixed missing mandatory elements in mutlipart message


- fixed missing mandatory elements in multipart message
- Fixed wrong references to IDS-G-pre and RAM in CONTRIBUTING.md

## [Q2/2022]

8 changes: 4 additions & 4 deletions CONTRIBUTING.md
@@ -1,6 +1,6 @@
# Contributing to IDS-G-pre
# Contributing to IDS-G

IDS-G is the official repository of [IDSA](https://www.internationaldataspaces.org) to publish the [IDS-RAM]() and the subsequent specifications.
IDS-G is the official repository of [IDSA](https://www.internationaldataspaces.org) to publish the [IDS-RAM](https://github.com/International-Data-Spaces-Association/IDS-RAM_4_0) and the subsequent specifications.

All content published here is approved by the IDSA Technical Steering Committee and the IDSA Working Groups. Detailed information on the contribution process can be found in the [IDS-G Handbook](Handbook/README.md). Nevertheless, you are very welcome to contribute
to this project when you find a bug, want to suggest an improvement, or have an idea for a useful
@@ -30,7 +30,7 @@ should at least include the following information:

## Labels

The [labels](https://github.com/International-Data-Spaces-Association/ids-g/labels) are listed at the
The [labels](https://github.com/International-Data-Spaces-Association/IDS-G/labels) are listed at the
[issues](https://github.com/International-Data-Spaces-Association/IDS-G/issues).
There are two types of labels: one describes the content of the issue and should be used by the
developer that creates the issue. The other one, starting with `status`, will be added from the
@@ -75,5 +75,5 @@ An example of a very good commit might look like this: `feat![login]: add awesom


## Versioning
IDS-G-pre uses the [SemVer](https://semver.org/) for versioning. The release versions
IDS-G uses the [SemVer](https://semver.org/) for versioning. The release versions
are tagged with their respective version.
110 changes: 110 additions & 0 deletions Communication/CommunicationGuide.md
@@ -0,0 +1,110 @@
# IDS Communication Guide #

## Introduction ##

Interoperability is a major goal of the IDS, so interoperability between IDS Connectors and other components is of high importance. The IDS Communication Guide shall provide the required data structures and the interaction sequences that must be realized for interoperability and that are used for interoperability testing.

The Communication Guide is organized into a modular and composable structure.

## Terms and Definitions ##

### Control Plane vs. Data Plane and in-band vs. out-of-band ###

This section establishes a joint understanding of the terms `in-band` and `out-of-band`, as well as `control plane` and `data plane`:

**Commonalities:** Both term pairs…

- represent the split of a previously joint, combined flow of information into two separate parts
- have a background in technology
- have some overlap, but place the emphasis differently

#### in-band/out-of-band ####

- **origins:** selection of radio frequencies (“bands”) for primary/secondary communication
- the split is motivated mainly by isolation & break-out reasons
- `in-band`: the same `frequency`, `connection` or `means of communication` is used for all transfers
- `out-of-band`: for a selected subset of communication, a different, dedicated band is selected
- **example:** main process is using HTTP, user identity verification subprocess uses SMTP (email)

#### control plane/data plane ####

- **origins:** in a networking device…
- the `control plane` is optimized for customizability and security. It controls the data plane.
- the `data plane` is optimized for speed, throughput and bandwidth. It handles the data payloads.
- the split is motivated mainly by “separation of concerns”
- `control plane`: controls what happens on the data plane
- `data plane`: agnostic of control logic, only used for payload transfers

## Foundation ##

The foundation package contains elements that are commonly used. This includes the standards that serve as the foundation for the Communication Guide.

### Foundational standards ###

[The Foundational Standards list.](./FoundationalStandards/README.md)

### Information Model ###

The common information model is used in every other package. It shall include a base model containing the entities of a data space and their relations.

**Insert entity model after update.**

The realization is based on DCAT for the Data Products and ODRL for Contract Policies.

[The IDS-Information Model is here.](./Infomodel/README.md)

### Identities ###

messages and data types:
protocols: state machines for message flows and interaction patterns:
API binding:

### Trust Frameworks ###

messages and data types:
protocols: state machines for message flows and interaction patterns:
API binding:

### Policies (authorization and Policy Description) ###

messages and data types:
protocols: state machines for message flows and interaction patterns:
API binding:

## Data Sharing (Connector) ##

### Contract Negotiation ###

part of the control plane

messages and data types:
protocols: state machines for message flows and interaction patterns:
API binding:

### Data Transfer ###

part of the data plane. This section covers how data is exchanged, with a focus on communication rather than on how the data plane is built.

messages and data types:
protocols: state machines for message flows and interaction patterns:
API binding:

## Catalog (Publish and query meta-data) ##

messages and data types:
protocols: state machines for message flows and interaction patterns:
API binding:

## Registration ##

messages and data types:
protocols: state machines for message flows and interaction patterns:
API binding:

## Audit logging ##

currently out of scope

## Vocabularies ##

currently out of scope
39 changes: 39 additions & 0 deletions Communication/FoundationalStandards/README.md
@@ -0,0 +1,39 @@
# Foundational Standards #

These Foundational Standards are used in the IDS Communication Guide:

## Attribute Based Access Control (ABAC) ##

Related to Access Control that is not part of the [IDS-RAM](https://github.com/International-Data-Spaces-Association/IDS-G/blob/master/Glossary/README.md#ids-ram-international-data-spaces-reference-architecture-model).
[NIST, Guide to Attribute Based Access Control (ABAC) Definition and Considerations](https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-162.pdf)

## Linked Data Platform (LDP) ##

[W3C, "Linked Data Platform"](https://dvcs.w3.org/hg/ldpwg/raw-file/default/ldp.html)

[W3C, "Linked Data Platform 1.0 Primer"](https://www.w3.org/TR/ldp-primer/)

## Open Digital Rights Language (ODRL) ##

[W3C, ODRL](https://www.w3.org/TR/odrl-model/) as basis
for IDS usage control.

## Resource Description Framework (RDF) ##

[Wikipedia, "Resource Description Framework"](https://en.wikipedia.org/wiki/Resource_Description_Framework)

## Time Ontology in OWL ##

[W3C, "Time Ontology"](https://www.w3.org/TR/owl-time/)

## The Organization Ontology ##

[W3C, "The Organization Ontology"](https://www.w3.org/TR/vocab-org/)

## WebAccessControl (WAC) ##

[W3C, "Web Access Control"](https://www.w3.org/wiki/WebAccessControl)

## eXtensible Access Control Markup Language (XACML) ##

[Wikipedia, "XACML"](https://en.wikipedia.org/wiki/XACML)
File renamed without changes.
File renamed without changes.
13 changes: 13 additions & 0 deletions Specifications/README.md
@@ -0,0 +1,13 @@
# Getting Started

The [Information Model document](./model/information.model.md) defines the core concepts, entities, and relationships that underpin a `Dataspace`.

The [Catalog Protocol document](./catalog/catalog.protocol.md) defines how a `Catalog` is requested from a catalog service by a consumer using an abstract message exchange format.

The [Catalog Binding document](./catalog/catalog.binding.https.md) defines a RESTful API over HTTPS for the `Catalog Protocol`.

The [Bibliography](./notes/bibliography.md) contains links to relevant standards referenced by the above documents.

Note that PlantUML diagrams will be replaced by Draw-IO alternatives.


134 changes: 134 additions & 0 deletions Specifications/catalog/catalog.binding.https.md
@@ -0,0 +1,134 @@
# Catalog HTTPS Binding

## 1 Introduction

This specification defines a RESTful API over HTTPS for the [Catalog Protocol].

The OpenAPI definitions for this specification can be accessed [here](TBD).

## 2 Path Bindings

### 2.1 Prerequisites

1. The `<base>` notation indicates the base URL for a catalog service endpoint. For example, if the base catalog URL is `api.example.com`, the URL `https://<base>/catalog/request`
will map to `https://api.example.com/catalog/request`.

2. All request and response messages must use the `application/json` media type.
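
As a rough illustration of the `<base>` substitution in item 1, the following Python sketch builds an endpoint URL from a base. The helper name `resolve_endpoint` is hypothetical, and accepting a base with a scheme or trailing slash is an assumption of this sketch, not something the binding requires.

```python
def resolve_endpoint(base: str, path: str) -> str:
    """Substitute a catalog service base into the https://<base>/<path> template."""
    # Assumption: tolerate a base that already carries a scheme or a trailing slash.
    host = base.removeprefix("https://").rstrip("/")
    return f"https://{host}/{path.lstrip('/')}"


print(resolve_endpoint("api.example.com", "catalog/request"))
```
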
Contributor

This opens the discussion of JSON vs. JSON-LD. IDS, DCAT, ODRL, Gaia-X, and many others relevant for us go with JSON-LD/RDF rather than plain (and therefore a lot simpler!) JSON.

Author

+1 for simple JSON!

IMO the only thing we should require of a runtime implementation is the ability to parse and write JSON (and, by extension, JSON-LD). There is no need to require any semantic processing in the specification. A simple runtime can be built which is capable of conveying semantic metadata without having to understand it. For example, a DCAT catalog serialized as JSON-LD could be deserialized and JSON-LD expansion applied to nodes so that namespaces are resolved and persisted along with asset attributes in a persistent store. A "semantic query" tool could be used to manipulate the persisted data if someone wanted to do that.


### 2.2 CatalogErrorMessage

In the event of a request error, the catalog service must return an appropriate HTTP code and a [CatalogErrorMessage](./catalog.protocol.md#) in the response body.

| Field | Type | Description |
|---------|---------------|-------------------------------------------------------------|
| code | string | An optional implementation-specific error code. |
| reasons | Array[object] | An optional array of implementation-specific error objects. |
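
To make the table concrete, here is a minimal Python sketch of how a service might serialize a `CatalogErrorMessage` body. Only the field names `code` and `reasons` come from the table; the function name, the sample error code, and the shape of the reason object are hypothetical.

```python
import json
from typing import Any, Optional


def catalog_error_body(code: Optional[str] = None,
                       reasons: Optional[list[dict[str, Any]]] = None) -> str:
    """Serialize a CatalogErrorMessage; both fields are optional and omitted when unset."""
    message: dict[str, Any] = {}
    if code is not None:
        message["code"] = code
    if reasons is not None:
        message["reasons"] = reasons
    return json.dumps(message)


# Hypothetical values: the specification leaves codes and reason objects implementation-specific.
body = catalog_error_body(code="FILTER_UNSUPPORTED",
                          reasons=[{"message": "unknown filter operator"}])
```

The body would be sent together with an appropriate HTTP status code, which this sketch leaves to the surrounding HTTP framework.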

Contributor

How about adding a correlationId attribute to trace an error message through different dataspace components?

Author

I think tracing and observability are implementation details and would not be propagated across organizational boundaries.

### 2.3 The `catalog/request` endpoint

#### 2.3.1 POST

The [CatalogRequestMessage](catalog.protocol.md#1-catalogrequestmessage) corresponds to `POST https://<base>/catalog/request`:

```
POST https://provider.com/catalog/request

Authorization: ...

{
  "@context": {
    "ids": "https://idsa.org/"
  },
  "@type": "ids:CatalogRequest",
  "ids:filter": {}
}
```

Contributor

Why not GET?

Author

Because of the need for filters and query expressions. These become very difficult (and potentially unsafe) to put in URL params.

The `Authorization` header is optional if the catalog service does not require authorization. If present, the contents of the `Authorization` header are detailed in the
[Authorization section](#31-authorization).

The `filter` property is optional. If present, the `filter` property can contain an implementation-specific filter expression or query to be executed as part of the catalog
request.
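
A client-side sketch of the request described above, in Python, might look as follows. It only assembles the method, URL, headers, and body and performs no network call. The function name and the sample filter are hypothetical; the `Authorization` value is passed through untouched because its contents are left open by this specification.

```python
import json
from typing import Any, Optional


def build_catalog_request(base: str,
                          authorization: Optional[str] = None,
                          filter_expr: Optional[dict[str, Any]] = None):
    """Return (method, url, headers, body) for a CatalogRequestMessage over HTTPS."""
    headers = {"Content-Type": "application/json"}
    if authorization is not None:
        # Optional: only sent when the catalog service requires authorization.
        headers["Authorization"] = authorization
    body: dict[str, Any] = {
        "@context": {"ids": "https://idsa.org/"},
        "@type": "ids:CatalogRequest",
    }
    if filter_expr is not None:
        # Optional: the filter expression itself is implementation-specific.
        body["ids:filter"] = filter_expr
    return "POST", f"https://{base}/catalog/request", headers, json.dumps(body)


method, url, headers, body = build_catalog_request("provider.com", filter_expr={})
```
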

#### 2.3.2 OK (200) Response

If the request is successful, the catalog service must return a response body containing a [CatalogMessage](./message/catalog.message.json) which is a profiled DCAT Catalog type
described by the [Catalog Protocol Specification](catalog.protocol.md).

## 3 Technical Considerations

### 3.1 Authorization

A catalog service may require authorization. If the catalog service requires authorization, requests must include an HTTP `Authorization` header with a token. The contents of
the token are undefined but may be an OAuth2, Web DID, or other access token type.

### 3.2 Versioning

- Versioning will be done via URLs. TBD.

### 3.3 Pagination
Contributor

Author

Sure we can discuss it. OData is an older, complex technology. We may be able to get by with something much simpler and more RESTful.


A catalog service may paginate the results of a `CatalogRequestMessage`. Pagination data is specified using [Web Linking](https://datatracker.ietf.org/doc/html/rfc5988)
and the HTTP `Link` header. The `Link` header will contain URLs for navigating to previous and subsequent results. The following response sequence demonstrates pagination, starting with the first page response:

```
Link: <https://provider.com/catalog?page=2&per_page=100>; rel="next"

{
  "@context": {
    "dcat": "http://www.w3.org/ns/dcat#"
  },
  "@type": "dcat:Catalog"
  ...
}
```

Second page response:

```
Link: <https://provider.com/catalog?page=1&per_page=100>; rel="previous"
Link: <https://provider.com/catalog?page=3&per_page=100>; rel="next"

{
  "@type": "dcat:Catalog"
  ...
}
```

Last page response:

```
Link: <https://provider.com/catalog?page=2&per_page=100>; rel="previous"

{
  "@type": "dcat:Catalog"
  ...
}
```
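
A consumer following the pagination sequence above needs to read the `rel` values from the `Link` headers. The Python sketch below handles only the simple `<url>; rel="..."` form used in these examples and is not a complete Web Linking parser; the function name `link_relations` is hypothetical.

```python
import re
from typing import Iterable

# Matches <url>; rel="name" as it appears in the examples above.
_LINK = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def link_relations(link_header_values: Iterable[str]) -> dict[str, str]:
    """Map each rel value to its target URL across one or more Link header values."""
    relations: dict[str, str] = {}
    for value in link_header_values:
        for url, rel in _LINK.findall(value):
            relations[rel] = url
    return relations


second_page = link_relations([
    '<https://provider.com/catalog?page=1&per_page=100>; rel="previous"',
    '<https://provider.com/catalog?page=3&per_page=100>; rel="next"',
])
# A client keeps requesting second_page["next"] until no "next" relation is returned.
```
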

### 3.4 Compression

Catalog services MAY compress responses to a catalog request by setting the `Content-Encoding` header to `gzip` as described in
the [HTTP Semantics specification (RFC 9110)](https://www.rfc-editor.org/rfc/rfc9110.html#name-gzip-coding).
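
On the consumer side, handling an optionally compressed response could be sketched in Python as below. The function name is hypothetical, and the sketch assumes the headers are available as a plain dictionary keyed by the canonical header name; most HTTP client libraries handle `gzip` decoding transparently, so this only illustrates the behaviour.

```python
import gzip
import json
from typing import Any


def decode_catalog_body(headers: dict[str, str], raw: bytes) -> dict[str, Any]:
    """Decompress the body if Content-Encoding is gzip, then parse it as JSON."""
    encoding = headers.get("Content-Encoding", "").strip().lower()
    if encoding == "gzip":
        raw = gzip.decompress(raw)
    return json.loads(raw.decode("utf-8"))


# Round trip with a locally compressed payload standing in for a service response.
payload = json.dumps({"@type": "dcat:Catalog"}).encode("utf-8")
catalog = decode_catalog_body({"Content-Encoding": "gzip"}, gzip.compress(payload))
```
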

## 4 Notes

### 4.1 Asynchronous Interactions

We may want to specify optional support for asynchronous callbacks for the catalog response. This would require adding a `callbackAddress` property and an `@id` to the request:

```
POST https://provider.com/catalog/request

Authorization: ...

{
  "@context": {},
  "@type": "ids:CatalogRequest",
  "@id": "...",
  "ids:callbackAddress": "https://example.com/endpoint"
}
```

Contributor

The @id attribute is always there even if it doesn't look like it. If omitted (making it a BlankNode), a random one is used implicitly. Better let's make @id mandatory.

The `CatalogResponseMessage` would be POSTed back to the endpoint. The response message could be posted multiple times for paginated results and would need to include the
original `@id` value as a `correlationId` and a property indicating whether the contents are complete (or additional responses will be sent).
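
One way a consumer's callback endpoint could reassemble such a sequence is sketched below in Python. This is only an illustration of the idea in this note: the class name is hypothetical, the `complete` property name is an assumption (the note does not name it), and the unprefixed `correlationId` key is likewise assumed.

```python
from collections import defaultdict
from typing import Any, Optional


class CatalogResponseCollector:
    """Groups callback messages by correlationId until the final one arrives."""

    def __init__(self) -> None:
        self._pages: dict[str, list[dict[str, Any]]] = defaultdict(list)

    def receive(self, message: dict[str, Any]) -> Optional[list[dict[str, Any]]]:
        """Store one message; return all messages for the request once complete, else None."""
        request_id = message["correlationId"]
        self._pages[request_id].append(message)
        # Hypothetical "complete" flag: true on the last message of the sequence.
        if message.get("complete", False):
            return self._pages.pop(request_id)
        return None


collector = CatalogResponseCollector()
first = collector.receive({"correlationId": "req-1", "complete": False})
final = collector.receive({"correlationId": "req-1", "complete": True})
```
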