Skip to content

Latest commit

 

History

History
284 lines (220 loc) · 21 KB

db-schema-data.md

File metadata and controls

284 lines (220 loc) · 21 KB

EDR data Schema Documentation

Overview

The data schema holds entity ('real-world' things), property (of entities) and relationship (between entities) tables. This schema is an Entity Attribute Value (EAV) database (see edr-dbms-design-principles) for details. EAV databases are somewhat opaque to anyone browsing them, essentially being a rows of UUIDs for foreign keys to cm tables. To offset this, each table has a corresponding view that presents it in a human friendly way by getting labels from the appropriate definition table. The views have the same name as the table they present but suffixed by '__view', e.g data.entity__view for data.entity.

Figure data schema tables.

schema - data

Tables and views

data.binary_object [table]

A binary object store for photos and scanned documents, and raw output from sensors (e.g. spectrometers). Binary objects are linked to a entity/dataset via a key in the xxx_attribute.value JSON object (see the binary-object data type in cm.system__type__data_type). Only one copy of a binary object is to be stored, it may be shared (e.g. a photo of multiple sites) by several entities. Objects should be stored using media types and content encodings support by HTTP web APIs. File names are managed on each attribute that references the binary object as different names may be appropriate in different circumstances.

Column Data type Key FK References Null Default Definition
id text PK NO Binary Object UUID - generated from the md5 hash of the binary object. Allows clients to create objects and refer to them when offline or before the object has been created server-side. Also ensures an object is only stored once.
blobject bytea NO The binary object.
media_type text NO The media type (AKA MIME Type) of the binary object. Values are as defined by the Internet Assigned Numbers Authority (IANA): https://www.iana.org/assignments/media-types/media-types.xhtml.
file_extension text YES The well-known file extension appropriate to the object if it is downloaded as a file.
content_encoding text YES The file compression algorithm used, if applicable. Values are as defined by the Internet Assigned Numbers Authority (IANA) for content encodings returned in the HTTP Accept-Encoding header: https://www.iana.org/assignments/http-parameters/http-parameters.xml#content-coding.
Index Type Columns
pk_binary_object unique; btree id

data.dataset [table]

Datasets (collections) organise entities into logical collections with a common purpose, e.g. a soil survey, and owner. Containers 'own' the entities and, as such, are the primary location in which an entity is managed. Deleting a dataset deletes all member entities.

Column Data type Key FK References Null Default Definition
id uuid PK NO gen_random_uuid() Dataset UUID.
register_id uuid NO The register the governs dataset licensing, and access constraints.
class_model_id uuid NO The class model that governs the types of entities, properties and associations in the dataset.
class_id text NO The type of dataset as defined in cm.class.
default_label text YES The default label for a dataset when displayed. Derived by concatenating one or more attribute values in data.dataset__attribute and populated by a trigger function on that table.
dataset_id uuid YES Parent dataset UUID. Organisation of datasets into hierarchies should be discouraged in favour of organisation into registers (reg.register).
Index Type Columns
fx_dataset__class btree class_id
fx_dataset__dataset btree dataset_id
fx_dataset__register btree register_id
ix_dataset__default_label gin default_label gin_trgm_ops
pk_dataset unique; btree id

data.dataset__attribute [table]

Dataset properties stored as key (attribute_id) value pairs.

Column Data type Key FK References Null Default Definition
id uuid PK NO gen_random_uuid() Dataset attribute UUID
dataset_id uuid NO Dataset UUID
attribute_id text NO The attribute type - the key in a key-value-pair - from cm.attribute.
value jsonb YES A JSON object holding the value of attribute. The data type for a given dataset__attribute.attribute_id is defined in cm.system__type__data_type.schema_json.
Index Type Columns
fx_dataset__attribute__attribute btree attribute_id
fx_dataset__attribute__dataset btree dataset_id
ix_dataset__attribute__value gin value
pk_dataset__attribute unique; btree id

data.dataset__default_extent [table]

The default extent for an dataset when displayed and searched. Stored as a WGS84 (EPSG:4326) lat/long geometry. Derived from a polygon (a concave hull) encompassing the member entity's default location (data.entity__default_location) - populated by a trigger function on data.entity__default_location. Extends data.dataset as not all datasets have a spatial extent.

Column Data type Key FK References Null Default Definition
dataset_id uuid PK NO Dataset UUID.
extent USER-DEFINED NO The default extent of the dataset.
Index Type Columns
pk_dataset__default_extent unique; btree dataset_id
sx_dataset__default_extent__extent gist extent

data.dataset__label_template [table]

Templates for the default_label property for a given class as defined for specific datasets. Used to override the class's default template.

Column Data type Key FK References Null Default Definition
dataset_id uuid PK NO Dataset UUID
class_id text PK NO Class UUID.
system__label_template text YES A template specifying how class attributes may be used to create a label for presentation. Used to override the class's default system__label_template template. Format is text with column values inserted as variables using the attribute identifier to identify the property. E.g.: '${depth} [${designation}]'
Index Type Columns
fx_dataset__label_template__class btree class_id
fx_dataset__label_template__dataset btree dataset_id
pk_dataset__label_template unique; btree dataset_id, class_id

data.entity [table]

Entities are instances of classes that describe environmental or sampling entities, and their parts. Entities must be organised and managed in a dataset.

Column Data type Key FK References Null Default Definition
id uuid PK NO gen_random_uuid() Entity UUID.
dataset_id uuid NO Links an entity to its dataset. Used by Row Level Security policies (where present).
class_id text NO The type of entity as defined in cm.class.
default_label text YES The default label for an entity when displayed. Derived by concatenating one or more attribute values in `data.entity__assertion
Index Type Columns
fx_entity__class btree class_id
fx_entity__dataset btree dataset_id
ix_entity__default_label gin default_label gin_trgm_ops
pk_entity unique; btree id

data.entity__aggregation [table]

A specialised association that combines entities into into aggregates - those that are made up of other entities. If an aggregate entity is deleted, its parts are also deleted (by trigger function data.tf_entity__aggregation__delete()).

Column Data type Key FK References Null Default Definition
id uuid PK NO gen_random_uuid() Entity aggregation UUID.
association_id text NO The association type - the key in a key-value-pair - from cm.association.
entity_id uuid NO The aggregate entity UUID.
associated_entity_id uuid NO The associated entity UUID.
system__order integer YES The order the entity's parts are to be returned in an array or user interface.
Constraint Type Columns Description
ck_entity__aggregation__recursion check entity_id, associated_entity_id Ensures that the associated entity is not already present in the aggregate 'tree'.
uq_entity__aggregation__associated_entity unique associated_entity_id Ensures that the associated_entity_id is unique as an entity can only be a part of one aggregate.
Index Type Columns
fx_entity__aggregation__association btree association_id
fx_entity__aggregation__entity btree entity_id
pk_entity__aggregation unique; btree id
uq_entity__aggregation__associated_entity unique; btree associated_entity_id

data.entity__assertion [table]

Asserted entity properties stored as key (attribute_id) value pairs.

Column Data type Key FK References Null Default Definition
id uuid PK NO gen_random_uuid() Entity attribute UUID
entity_id uuid NO Entity UUID
attribute_id text NO The attribute type - the key in a key-value-pair - from cm.attribute.
value jsonb NO A JSON object holding the value of attribute. The data type for a given entity__assertion.attribute_id is defined in cm.system__type__data_type.schema_json.
Index Type Columns
fx_entity__assertion__attribute btree attribute_id
fx_entity__assertion__entity btree entity_id
ix_entity__assertion__value gin value
pk_entity__assertion unique; btree id

data.entity__association [view]

Associations that link entities with other entities. UNION of data.entity__relationship and data.entity__aggregation. INSERTS and UPDATES can be made against this view - data are directed to the appropriate union table using the instead of trigger tr_entity__association__upsert.

Column Data type Definition
id uuid Entity association UUID.
association_id text The association type - the key in a key-value-pair - from cm.association.
entity_id uuid The entity UUID.
associated_entity_id uuid The associated entity UUID.
system__order integer The order the entity's associated entities are to be returned in an array or user interface.

data.entity__attribute [view]

Entity properties. UNION of data.entity__assertion, data.entity__observation and data.entity__measurement. INSERTS and UPDATES can be made against this view - data are directed to the appropriate union table using the instead of trigger tr_entity__attribute__upsert.

Column Data type Definition
id uuid Entity attribute UUID
entity_id uuid Entity UUID
attribute_id text The attribute type - the key in a key-value-pair - from cm.attribute.
value jsonb A JSON object holding the value of attribute. The data type for a given entity__attribute.type_id is defined in cm.system__type__data_type.schema_json.
procedure__entity_id uuid A workflow, protocol, plan, algorithm, or computational method specifying how to make a observation or measurement.
agent_entity_id uuid The agent used to make the estimate.
time timestamp with time zone The time the value was generated.
quality jsonb A JSON object holding a statistical quantification of uncertainty (e.g. a prediction interval). How uncertainty is expressed will vary as appropriate to the attribute, procedure and/or agent.

data.entity__default_location [table]

The default location (the point, or a centroid for lines or polygons) for an entity when displayed and searched (including spatial joins). Stored as a WGS84 (EPSG:4326) lat/long geometry. Derived from the most accurate location in data.entity__assertion|observation|measurement - populated by a trigger function. Extends data.entity as not all entities have a spatial extent.

Column Data type Key FK References Null Default Definition
entity_id uuid PK NO gen_random_uuid() Entity UUID.
location USER-DEFINED NO The default point location at the surface. This is the centroid of an entity with an areal extent.
Index Type Columns
pk_entity__default_location unique; btree entity_id
sx_entity__default_location__location gist location

data.entity__measurement [table]

Measured entity properties stored as key (attribute_id) value pairs. Measurements are created when detailed metadata describing the procedure and equipment used to make the estimate, when it was made, and its uncertainty is required. Measurements are typically made off-site, using some sort of sensor or software agent.

Column Data type Key FK References Null Default Definition
id uuid PK NO gen_random_uuid() Entity Measurement UUID
entity_id uuid NO Entity UUID
attribute_id text NO The attribute type - the key in a key-value-pair - from cm.attribute.
value jsonb NO A JSON object holding the value of attribute. The data type for a given entity__measurement.type_id is defined in cm.system__type__data_type.schema_json.
procedure__entity_id uuid NO A workflow, protocol, plan, algorithm, or computational method specifying how to make a Measurement.
agent_entity_id uuid YES The agent used to make the estimate.
time timestamp with time zone YES The time the value was generated.
quality jsonb YES A JSON object holding a statistical quantification of uncertainty (e.g. a prediction interval). How uncertainty is expressed will vary as appropriate to the attribute, procedure and/or agent.
Index Type Columns
fx_entity__measurement__attribute btree attribute_id
fx_entity__measurement__entity btree entity_id
ix_entity__measurement__value gin value
pk_entity__measurement unique; btree id

data.entity__observation [table]

Observed entity properties stored as key (attribute_id) value pairs. Observations capture characteristics of an entity that are estimated using some sort of procedure, with a corresponding quality. As a general rule observations are made on-site, i.e. where the entity occurs naturally, by humans while sampling.

Column Data type Key FK References Null Default Definition
id uuid PK NO gen_random_uuid() Entity attribute UUID
entity_id uuid NO Entity UUID
attribute_id text NO The attribute type - the key in a key-value-pair - from cm.attribute.
value jsonb NO A JSON object holding the value of attribute. The data type for a given entity__observation.attribute_id is defined in cm.system__type__data_type.schema_json.
procedure__entity_id uuid YES A workflow, protocol, plan, algorithm, or computational method specifying how to make an Observation.
quality jsonb YES A JSON object holding a statistical quantification of uncertainty (e.g. a prediction interval). How uncertainty is expressed will vary as appropriate to the attribute, procedure and/or sensor.
Index Type Columns
fx_entity__observation__attribute btree attribute_id
fx_entity__observation__entity btree entity_id
ix_entity__observation__value gin value
pk_entity__observation unique; btree id

data.entity__relationship [table]

Relationships link independent entities to each other..

Column Data type Key FK References Null Default Definition
id uuid PK NO gen_random_uuid() Entity relationship UUID.
association_id text NO The association type - the key in a key-value-pair - from cm.association.
entity_id uuid NO The entity UUID.
associated_entity_id uuid NO The associated entity UUID.
system__order integer YES The order the associated entities are to be returned in an array or user interface.
Constraint Type Columns Description
uq_entity__relationship__entities unique association_id, entity_id, associated_entity_id
Index Type Columns
fx_entity__relationship__associated_entity btree associated_entity_id
fx_entity__relationship__association btree association_id
fx_entity__relationship__entity btree entity_id
pk_entity__relationship unique; btree id
uq_entity__relationship__entities unique; btree association_id, entity_id, associated_entity_id

Functions

Function Type Arguments Returns Description
class__label function _attribute_table_prefix; _class_id text Returns a class's (_class_id) formatted label based on the template defined in cm.class
class__label_variable_value function _attribute_table_prefix; _class_id; _variable_identifier text Returns the value for an attribute of a dataset or entity (class) based on a variable in the label template. If multiple values are found, the result is concatenated into delimited string, unless an index is provided with the variable (e.g. ${variable}[1]) in which case the value at the position (or closest to it) is provided.
dataset__extent function _dataset_id geometry Returns a dataset's (_dataset_id) extent as a buffered convex hull (with lightly rounded vertices for aesthetic reasons) enclosing the cmault locations of member entities.
dataset__label__attribute function _dataset_id; _attribute_id boolean Function that tests whether the attribute (_attribute_id) is part of the dataset's (_dataset_id) label template.
entity__aggregates function _entity_id uuid[] Returns an array of uuids of the entities of which an entity (_entity_id) is a part.
entity__label__attribute function _entity_id; _attribute_id boolean Function that tests whether the attribute (_attribute_id) is part of the entity's (_entity_id) label template.
entity__location function _entity_id geometry
entity__location__attribute function _entity_id; _attribute_id boolean Function that tests whether the attribute (_attribute_id) is the entity's (_entity_id) location attribute.
entity__parts function _entity_id uuid[] Returns an array of uuids of the parts (and their parts) of an aggregate entity (_entity_id).
tf_entity__aggregation__delete function trigger A trigger function that ensures the aggregated entities in an aggregation are deleted when the aggregate entity is deleted.
tf_entity__association__delete function trigger An INSTEAD OF trigger function the allows DELETEs on the view data.entity__association to be sent to the correct underlying association table.
tf_entity__association__upsert function trigger An INSTEAD OF trigger function the allows INSERTs and UPDATEs on the view data.entity__association to be sent to the correct underlying association table.
tf_entity__attribute__delete function trigger An INSTEAD OF trigger function the allows DELETEs on the view data.entity__attribute to be sent to the correct underlying attribute table.
tf_entity__attribute__upsert function trigger An INSTEAD OF trigger function the allows INSERTs and UPDATEs on the view data.entity__attribute to be sent to the correct underlying attribute table.