gateways: document Content-Location #471

Merged
merged 6 commits into from
Apr 18, 2024
Changes from 3 commits
52 changes: 39 additions & 13 deletions src/http-gateways/path-gateway.md
@@ -4,7 +4,7 @@ description: >
The comprehensive low-level HTTP Gateway enables the integration of IPFS
resources into the HTTP stack through /ipfs and /ipns namespaces, supporting
both deserialized and verifiable response types.
date: 2023-03-30
date: 2024-04-17
maturity: reliable
editors:
- name: Marcin Rataj
@@ -242,6 +242,16 @@ These are the equivalents:
When both `Accept` HTTP header and `format` query parameter are present,
`Accept` SHOULD take precedence.

A Client SHOULD include the `format` query parameter in the request URL, in
addition to the `Accept` header. This provides the best interoperability and
ensures consistent HTTP cache behavior across various gateway implementations.
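On the client side, this recommendation could be sketched as follows. This is only an illustration: the helper name `with_format` and the `example.com` URL are assumptions, not part of the spec or of any gateway library.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_format(url: str, fmt: str) -> str:
    """Return `url` with its `format` query parameter set to `fmt`.

    Hypothetical helper: any existing `format` value is replaced so the URL
    and the `Accept` header sent with it cannot disagree.
    """
    parts = urlsplit(url)
    # Keep every other query parameter and drop a stale `format`, if present.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "format"
    ]
    query.append(("format", fmt))
    return urlunsplit(parts._replace(query=urlencode(query)))


# The same format is expressed twice: in the URL and in the Accept header.
request_url = with_format("https://example.com/ipfs/cid", "raw")
request_headers = {"Accept": "application/vnd.ipld.raw"}
```

Sending both means a cache that keys only on the URL still stores the raw block separately from the deserialized response.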

A Gateway SHOULD include the
[`Content-Location`](#content-location-response-header) header in the response when:
- the request contains an `Accept` header specifying a well-known response
format, but the URL does not include the `format` query parameter
- the `format` parameter is present, but does not match the format from `Accept`
Comment on lines +245 to +253
Member

@hacdias I've added these clarifications as SHOULDs. The goal here is to push everyone towards including an explicit ?format in requests, which saves a lot of headache when debugging interop/caching problems.

We've always been doing this internally (including both format and Accept); this writes it down as a suggested practice.

Contributor

@aschmahmann aschmahmann Apr 17, 2024

Is it problematic then that some information in Accept headers cannot be expressed in the format parameter (e.g. dfs, duplicates and version for CARs)?

Member

@lidel lidel Apr 17, 2024

Potentially. It could cause issues at some point in the future, when a single gateway supports more than "the Kubo default" and clients actually request a non-default CAR variant, including custom implementations with edge cases like #431.

That is why, to avoid cache override problems in the future, we revised our position on these and added car-* URL params in #472. This allows us to explicitly list them in Content-Location when a non-default CAR was requested (code for rainbow in ipfs/boxo#603).


### `dag-scope` (request query parameter)

Only used on CAR requests, same as :ref[dag-scope] from :cite[trustless-gateway].
@@ -486,12 +496,35 @@ To illustrate, `?filename=testтест.pdf` should produce:
not attempt to render raw bytes. CID and `.bin` file extension should be used
if a custom `filename` was not provided with the request.

### `Content-Location` (response header)

Returned when a non-default content format has been negotiated with the
[`Accept` header](#accept-request-header) but `format` was missing from the URL.

The value of this field SHOULD include
the URL of the resource with the `format` query parameter included, so that
generic HTTP caches can store deserialized, CAR, and block responses separately.

:::note

For example, a request to `/ipfs/{cid}` with `Accept: application/vnd.ipld.raw`
SHOULD return a `Content-Location: /ipfs/{cid}?format=raw` header in order for
block response to be cached separately from deserialized one.

:::
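A gateway-side sketch of this rule, under stated assumptions: the function name `content_location` is hypothetical, the table lists only the three format pairs given for trustless gateways, and only the first media type in `Accept` is inspected (no `q` weighting or content type parameters).

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode

# Well-known content types and their `format` equivalents (illustrative subset).
FORMAT_FOR_CONTENT_TYPE = {
    "application/vnd.ipld.raw": "raw",
    "application/vnd.ipld.car": "car",
    "application/vnd.ipfs.ipns-record": "ipns-record",
}


def content_location(path: str, query: str, accept: Optional[str]) -> Optional[str]:
    """Return a `Content-Location` value, or None when the header is not needed."""
    if not accept:
        return None
    # Simplification: look only at the first media type, ignoring parameters.
    media_type = accept.split(",")[0].split(";")[0].strip().lower()
    negotiated = FORMAT_FOR_CONTENT_TYPE.get(media_type)
    if negotiated is None:
        return None  # not a well-known response format
    params = parse_qsl(query, keep_blank_values=True)
    if dict(params).get("format") == negotiated:
        return None  # the URL already names the negotiated format
    # `format` is missing or mismatched: emit the URL with the negotiated value.
    params = [(key, value) for key, value in params if key != "format"]
    params.append(("format", negotiated))
    return f"{path}?{urlencode(params)}"
```

With these assumptions, a request for `/ipfs/cid` with `Accept: application/vnd.ipld.raw` yields `/ipfs/cid?format=raw`, while a request that already carries `?format=raw` yields no header.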

### `Content-Length` (response header)

Represents the length of returned HTTP payload.

:::warning

<!-- TODO https://github.com/ipfs/specs/issues/461 -->

NOTE: the value may differ from the real size of requested data if compression or chunked `Transfer-Encoding` are used.
<!-- TODO (https://github.com/ipfs/in-web-browsers/issues/194) IPFS clients looking for UnixFS file size should use value from `X-Ipfs-DataSize` instead. -->
See [ipfs/specs#461](https://github.com/ipfs/specs/issues/461).

:::

### `Content-Range` (response header)

@@ -513,8 +546,6 @@ deterministic.
Returned only when response status code is [`301` Moved Permanently](#301-moved-permanently).
The value informs the HTTP client about new URL for requested resource.

This header is more widely used in [SUBDOMAIN_GATEWAY.md](./SUBDOMAIN_GATEWAY.md#location-response-header).

#### Use in directory URL normalization

Gateway MUST return a redirect when a valid UnixFS directory was requested
@@ -530,6 +561,10 @@ It also ensures the same behavior on path gateways (`https://example.com/ipfs/ci
and origin-isolated HTTP contexts `https://cid.ipfs.dweb.link`
or non-HTTP URLs like `ipfs://cid`, where empty path component is implicit `/`.

#### Use in interop with Subdomain Gateway

See [`Location` section](https://specs.ipfs.tech/http-gateways/subdomain-gateway/#location-response-header) of :cite[subdomain-gateway].

### `X-Ipfs-Path` (response header)

Used for HTTP caching and indicating the IPFS address of the data.
@@ -567,15 +602,6 @@ NOTE: while the first CID will change every time any article is changed,
the last root (responsible for specific article or a subdirectory) may not
change at all, allowing for smarter caching beyond what standard Etag offers.

<!-- TODO: https://github.com/ipfs/in-web-browsers/issues/194
- `X-Ipfs-DagSize`
- Indicates the total size of the DAG (raw data + IPLD metadata) representing the requested resource.
- For UnixFS this is equivalent to `CumulativeSize` from `ipfs files stat`
- `X-Ipfs-DataSize`
- Indicates the original byte size of the raw data (not impacted by HTTP transfer encoding or compression), without IPFS/IPLD metadata.
- For UnixFS this is equivalent to `Size` from `ipfs files stat` or `ipfs dag stat`
-->

### `X-Content-Type-Options` (response header)

Optional, present in certain response types:
25 changes: 21 additions & 4 deletions src/http-gateways/trustless-gateway.md
@@ -4,7 +4,7 @@ description: >
The minimal subset of HTTP Gateway response types facilitates data retrieval
via CID and ensures integrity verification, all while eliminating the need to
trust the gateway itself.
date: 2023-06-20
date: 2024-04-17
maturity: reliable
editors:
- name: Marcin Rataj
@@ -90,7 +90,18 @@ mode (no deserialized responses) and `Accept` header is missing.

## Request Query Parameters

### :dfn[dag-scope] (request query parameter)
### :dfn[`format`] (request query parameter)

Same as [`format`](https://specs.ipfs.tech/http-gateways/path-gateway/#format-request-query-parameter) in :cite[path-gateway], but with limited number of supported response types:
- `format=raw` → `application/vnd.ipld.raw`
- `format=car` → `application/vnd.ipld.car`
- `format=ipns-record` → `application/vnd.ipfs.ipns-record`
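The mapping above, combined with the path-gateway rule that `Accept` SHOULD take precedence over `format`, might be resolved roughly like this. The function name `resolve_content_type` is hypothetical, and `Accept` parsing is simplified: the first supported media type wins and `q` values are ignored.

```python
from typing import Optional

# `format` values and the content types they stand for, as listed above.
CONTENT_TYPE_FOR_FORMAT = {
    "raw": "application/vnd.ipld.raw",
    "car": "application/vnd.ipld.car",
    "ipns-record": "application/vnd.ipfs.ipns-record",
}


def resolve_content_type(format_param: Optional[str], accept: Optional[str]) -> Optional[str]:
    """Pick the response content type, preferring `Accept` over `format`."""
    supported = set(CONTENT_TYPE_FOR_FORMAT.values())
    if accept:
        for item in accept.split(","):
            # Drop parameters such as `;version=1` before comparing.
            media_type = item.split(";", 1)[0].strip().lower()
            if media_type in supported:
                return media_type
    if format_param:
        # Fall back to the URL parameter; unknown values resolve to None.
        return CONTENT_TYPE_FOR_FORMAT.get(format_param)
    return None
```

A wildcard such as `Accept: */*` names no well-known format in this sketch, so the `format` parameter decides.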

A Client SHOULD include the `format` query parameter in the request URL, in
addition to the `Accept` header. This provides the best interoperability and
ensures consistent HTTP cache behavior across various gateway implementations.

### :dfn[`dag-scope`] (request query parameter)

Optional, `dag-scope=(block|entity|all)` with default value `all`, only available for CAR requests.
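A small sketch of how a gateway might read this parameter. The function name `parse_dag_scope` is hypothetical, and raising `ValueError` for an unknown value is an assumption; the text above does not say how a gateway should respond to invalid input.

```python
from typing import Optional

# Allowed values, taken from `dag-scope=(block|entity|all)` above.
DAG_SCOPES = ("block", "entity", "all")


def parse_dag_scope(value: Optional[str]) -> str:
    """Return the effective scope, defaulting to `all` when the parameter is absent."""
    if value is None:
        return "all"
    if value not in DAG_SCOPES:
        # Assumption: rejecting unknown values is this sketch's choice.
        raise ValueError(f"unsupported dag-scope: {value!r}")
    return value
```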

@@ -111,7 +122,7 @@ path segments.

When present, returned `Etag` must include unique prefix based on the passed scope type.

### :dfn[entity-bytes] (request query parameter)
### :dfn[`entity-bytes`] (request query parameter)

The optional `entity-bytes=from:to` parameter is available only for CAR
requests.
@@ -203,6 +214,12 @@ If a CAR stream was requested:

MUST be returned and set to `attachment` to ensure requested bytes are not rendered by a web browser.

### `Content-Location` (response header)

Same as in :cite[path-gateway], SHOULD be returned when Trustless Gateway
supports more than a single response format and the `format` query parameter is
missing or does not match well-known format from `Accept` header.

# Block Responses (application/vnd.ipld.raw)

An opaque bytes matching the requested block CID
@@ -217,7 +234,7 @@ A CAR stream for the requested
content type (with optional `order` and `dups` params), path and optional
`dag-scope` and `entity-bytes` URL parameters.

## CAR version
## CAR version (content type parameter)

Value returned in
[`CarV1Header.version`](https://ipld.io/specs/transport/car/carv1/#header)