initial work on an extension to facilitate faceted search
triggered by #52
Paul committed Apr 10, 2021
1 parent 0a5bc7e commit 86b6526
Showing 6 changed files with 189 additions and 0 deletions.
135 changes: 135 additions & 0 deletions proposals/aggregations/README.md
@@ -0,0 +1,135 @@
# OGC API - Records - Term Aggregations

This folder contains the content for the standard extension OGC API - Records - Term Aggregations.

# Overview

This extension enables servers to include Term Aggregations in the items (records) response. Clients can use these aggregations to offer [faceted search](https://en.wikipedia.org/wiki/Faceted_search).

Various backends support faceted search. Examples are [Elastic](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html) and [SOLR](https://solr.apache.org/guide/8_8/json-facet-api.html), with more limited support in [PostgreSQL](https://akorotkov.github.io/blog/2016/06/17/faceted-search/) and [Oracle](https://blogs.oracle.com/apex/apex-192-faceted-search).

Facet statistics are also useful for giving an overview of the spatiotemporal distribution of items in a collection, or in part of one.

## Extending a JSON search result with Term Aggregations

The server includes the Term Aggregations for any search result within a collection. For example, for a collection `cat` the response to

```
GET /collections/cat/items
```

may include the following property `aggregations`. Its content is an array of aggregations.
An aggregation is identified by the name of a collection property and contains the top buckets
for that property, each with its count. A bucket is the number of results in the result set that
match the bucket's key. The property `next` indicates how many additional buckets may still be retrieved.

```
{
  "type": "FeatureCollection",
  "aggregations": [
    {
      "key": "keyword",
      "buckets": [
        {
          "key": "forestry",
          "count": 202
        },
        {
          "key": "marine",
          "count": 150
        }
      ],
      "next": 0
    }
  ],
  "features": [],
  "numberMatched": 375,
  "numberReturned": 0,
  "links": []
}
```
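
A client could turn such a response into a facet list along the following lines. This is a minimal sketch, not part of the proposal: the function name `facet_summary` is hypothetical, the sample data is copied from the example above, and the code coerces counts with `int()` in case a server serialises them as strings.

```python
def facet_summary(response: dict) -> dict:
    """Map each aggregated property name to its buckets and remaining-bucket count."""
    summary = {}
    for agg in response.get("aggregations", []):
        # The schema names this field `key`; `property` is accepted as a fallback.
        name = agg.get("key", agg.get("property"))
        buckets = [(b["key"], int(b["count"])) for b in agg.get("buckets", [])]
        summary[name] = {"buckets": buckets, "more": int(agg.get("next", 0))}
    return summary


# Sample response, mirroring the README example.
sample = {
    "type": "FeatureCollection",
    "aggregations": [
        {
            "key": "keyword",
            "buckets": [
                {"key": "forestry", "count": 202},
                {"key": "marine", "count": 150},
            ],
            "next": 0,
        }
    ],
    "features": [],
    "numberMatched": 375,
    "numberReturned": 0,
    "links": [],
}

for name, facet in facet_summary(sample).items():
    print(name, facet["buckets"], "more:", facet["more"])
```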

## Numerical, spatial and temporal buckets

For continuous values such as numbers, dates and geometries, each bucket can cover a range of values, defined by a minimum and maximum value or by a bounding box.

```
{
  "aggregations": [
    {
      "key": "scale",
      "buckets": [
        {
          "key": "0.1 - 0.001",
          "count": 175
        },
        {
          "key": "0.001 - 0.00001",
          "count": 120
        }
      ],
      "next": 2
    },
    {
      "key": "date",
      "buckets": [
        {
          "key": "1990/01/01 - 1995/01/01",
          "count": 12
        },
        {
          "key": "1995/01/01 - 2000/01/01",
          "count": 340
        },
        {
          "key": "2000/01/01 - 2005/01/01",
          "count": 1200
        }
      ],
      "next": 5
    }
  ]
}
```
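
To reuse a range bucket as a filter, a client has to split its `key` back into two bounds. The sketch below assumes the `lower - upper` string form and the `YYYY/MM/DD` date form shown in the example above; the proposal does not define either format normatively, and the function names are hypothetical.

```python
from datetime import date, datetime


def split_range_key(key: str) -> tuple:
    """Split a 'lower - upper' bucket key into its two bounds as strings."""
    lower, upper = key.split(" - ", 1)
    return lower.strip(), upper.strip()


def parse_numeric_range(key: str) -> tuple:
    """Parse a numeric range key such as '0.1 - 0.001'."""
    lower, upper = split_range_key(key)
    return float(lower), float(upper)


def parse_date_range(key: str) -> tuple:
    """Parse a date range key such as '1990/01/01 - 1995/01/01'."""
    lower, upper = split_range_key(key)
    fmt = "%Y/%m/%d"  # assumed from the example, not specified by the proposal
    return datetime.strptime(lower, fmt).date(), datetime.strptime(upper, fmt).date()


print(parse_numeric_range("0.1 - 0.001"))
print(parse_date_range("1990/01/01 - 1995/01/01"))
```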

For geometries, each bucket key is a bounding box in WGS84, defined by two corner points.

```
{
  "aggregations": [
    {
      "key": "bbox",
      "buckets": [
        {
          "key": "0,0 5,5",
          "count": 175
        },
        {
          "key": "0,5 5,10",
          "count": 120
        },
        {
          "key": "10,5 15,10",
          "count": 77
        }
      ],
      "next": 7
    }
  ]
}
```
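
A bounding-box key can be parsed in a similar way. The sketch below assumes the `x1,y1 x2,y2` string form from the example above. The proposal does not state the axis order (longitude/latitude or latitude/longitude), so the code keeps the values in the order given; `BBox` and `parse_bbox_key` are hypothetical names.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Two corner points, kept in the order they appear in the bucket key."""
    x1: float
    y1: float
    x2: float
    y2: float


def parse_bbox_key(key: str) -> BBox:
    """Parse a bucket key such as '0,5 5,10' into four floats."""
    first, second = key.split()
    x1, y1 = (float(value) for value in first.split(","))
    x2, y2 = (float(value) for value in second.split(","))
    return BBox(x1, y1, x2, y2)


print(parse_bbox_key("0,5 5,10"))
```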

## Interacting with aggregations

In some situations clients are interested only in the aggregations. In other situations the aggregations are not required. These aspects can be controlled via additional query parameters.

| Parameter | Explanation |
| -- | -- |
| `aggregationsOnly` | Default `false`. When set to `true`, returns the aggregations without search results, similar to `&limit=0`. |
| `includeAggregations` | Default `true` (if aggregations are available). Set to `false` to omit the aggregations from the response. |
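
As an illustration, a client could build the request URL for these parameters as follows. This is a sketch only: the helper `items_url`, the base URL and the extra `q` parameter are assumptions made for the example. Parameters are sent only when they differ from their defaults.

```python
from urllib.parse import urlencode


def items_url(base_url: str, collection: str, aggregations_only: bool = False,
              include_aggregations: bool = True, **extra: str) -> str:
    """Build an items URL, adding aggregation parameters only when non-default."""
    params = dict(extra)
    if aggregations_only:
        params["aggregationsOnly"] = "true"
    if not include_aggregations:
        params["includeAggregations"] = "false"
    url = f"{base_url.rstrip('/')}/collections/{collection}/items"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


# Hypothetical server root; only the path below it comes from the README.
print(items_url("https://example.org/api", "cat", aggregations_only=True))
print(items_url("https://example.org/api", "cat", include_aggregations=False, q="forest"))
```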

# Folder structure

This folder is organized as follows:

* openapi - normative OpenAPI components specified by the standard

10 changes: 10 additions & 0 deletions proposals/aggregations/openapi/parameters/aggregationsOnly.yaml
@@ -0,0 +1,10 @@
name: aggregationsOnly
description: parameter can be used to request aggregations only, similar to &limit=0
in: query
required: false
schema:
  type: boolean
  default: false
style: form
explode: false
10 changes: 10 additions & 0 deletions proposals/aggregations/openapi/parameters/includeAggregations.yaml
@@ -0,0 +1,10 @@
name: includeAggregations
description: parameter can be set to false to omit aggregations from the search result
in: query
required: false
schema:
  type: boolean
  default: true
style: form
explode: false
18 changes: 18 additions & 0 deletions proposals/aggregations/openapi/schemas/aggregation.yaml
@@ -0,0 +1,18 @@
Aggregation:
  type: object
  description: An aggregation is linked to a dataset property and contains the top set of occurrences (buckets) of values of the property
  required:
    - key
  properties:
    key:
      type: string
      description: Name of the property
    next:
      type: number
      description: Indication of how many additional buckets exist for this property
    buckets:
      type: array
      description: Buckets for this property
      items:
        $ref: "./bucket.yaml"

4 changes: 4 additions & 0 deletions proposals/aggregations/openapi/schemas/aggregations.yaml
@@ -0,0 +1,4 @@
type: array
description: List of aggregations
items:
  $ref: "./aggregation.yaml"
12 changes: 12 additions & 0 deletions proposals/aggregations/openapi/schemas/bucket.yaml
@@ -0,0 +1,12 @@
type: object
description: A bucket is the number of items sharing a certain value for a property
required:
  - key
  - count
properties:
  key:
    type: string
    description: Bucket value
  count:
    type: number
    description: Number of hits for this value
