Apache IoTDB: Integrate IoTDB Data Quality Library and enable visualisation in Grafana #52

Merged
merged 5 commits into from
Aug 12, 2024
6 changes: 5 additions & 1 deletion docker/docker-compose-cdsp.yml
@@ -4,9 +4,13 @@ services:

  # Apache IoTDB acting as VSS Data Store
  iotdb-service:
    image: apache/iotdb:1.2.2-standalone
    hostname: iotdb-service
    container_name: iotdb-service
    build:
      context: iotdb # CDSP IoTDB image
      args:
        # Build-arg names are case-sensitive and must match the ARG names in the Dockerfile
        IOTDB_VERSION: 1.2.2
        SERVICE_HOSTNAME: iotdb-service
    restart: on-failure:3
    ports:
      - "6667:6667"
42 changes: 42 additions & 0 deletions docker/iotdb/Dockerfile
@@ -0,0 +1,42 @@
# SPDX-FileCopyrightText: Copyright (c) 2024 Renesas Electronics
# SPDX-License-Identifier: MPL-2.0
#
# Dockerfile for the CDSP build of Apache IoTDB.
# ---------------------------------------------------------------------------

ARG IOTDB_VERSION=1.2.2
ARG IOTDB_IMAGE=apache/iotdb:${IOTDB_VERSION}-standalone

FROM ${IOTDB_IMAGE} AS cdsp-extras

ARG IOTDB_VERSION
ARG IOTDB_IMAGE

# UDF Data Quality library
ARG UDF_ARCHIVE_NAME=apache-iotdb-${IOTDB_VERSION}-library-udf-bin.zip
ARG UDF_ARCHIVE_URL=https://archive.apache.org/dist/iotdb/${IOTDB_VERSION}/${UDF_ARCHIVE_NAME}
ARG UDF_LIB_PATH=apache-iotdb-${IOTDB_VERSION}-library-udf-bin/ext/udf/library-udf.jar
ARG UDF_REG_SCRIPT_NAME=register-UDF.sh
ARG UDF_REG_SCRIPT_PATH=apache-iotdb-${IOTDB_VERSION}-library-udf-bin/tools/${UDF_REG_SCRIPT_NAME}
ARG SERVICE_HOSTNAME=iotdb-service

# General configuration
RUN apt-get update \
    && apt-get install -y unzip \
    && rm -rf /var/lib/apt/lists/*

# Add the optional IoTDB UDF Data Quality Library to the image.
# Note: to use the library functions the registration script must first be run in the running container.
WORKDIR ${IOTDB_HOME}/ext/udf
ADD ${UDF_ARCHIVE_URL} .
RUN unzip -j -o ${UDF_ARCHIVE_NAME} ${UDF_LIB_PATH} ${UDF_REG_SCRIPT_PATH} \
    && sed -i "s/^host=127.0.0.1/host=${SERVICE_HOSTNAME}/g" ${UDF_REG_SCRIPT_NAME} \
    && mv ${UDF_REG_SCRIPT_NAME} ${IOTDB_HOME}/sbin/ \
    && rm ${UDF_ARCHIVE_NAME}

WORKDIR ${IOTDB_HOME}/conf
# Enable IoTDB REST API as it is used by the IoTDB Grafana Connector for queries
RUN sed -i 's/^# enable_rest_service=false/enable_rest_service=true/g' iotdb-common.properties

# Set the entry point path to the IoTDB sbin directory where the IoTDB admin scripts are.
# This mimics the upstream image.
WORKDIR ${IOTDB_HOME}/sbin
2 changes: 1 addition & 1 deletion docs/docs-gen/content/examples/_index.md
@@ -13,7 +13,7 @@ Note: There is not a separate section for the VSS data model because the vast ma
## Data Layer, Processing and Analysis
Data Reduction \| Data Quality \| Events \| Data Streams etc.

Tip: while you wait for some examples, consider how you could use the IoTDB data processing functions in the [UDF library](https://iotdb.apache.org/UserGuide/latest/Reference/UDF-Libraries.html).
Tip: while you wait for some examples, consider how you could use the [IoTDB data processing functions]({{< ref "apache-iotdb#data-processing-functions" >}} "IoTDB data processing").

## Knowledge Layer, Reasoning and Data Models
Data Layer Connector \| A \| B \| C etc.
63 changes: 46 additions & 17 deletions docs/docs-gen/content/manuals/apache-iotdb.md
@@ -111,23 +111,52 @@ At the time of writing the playground docker deployment deploys the single node

Of course if your cloud development requires higher performance then you can integrate the cluster version.

## UDF and UDF library for data processing
Whilst IoTDB has a series of built-in timeseries processing functions you can add your own as User Defined Functions (UDF).

The [UDF section](https://iotdb.apache.org/UserGuide/latest/User-Manual/Database-Programming.html#user-defined-function-udf) of the IoTDB documentation explains how to develop and register your own.

The IoTDB project also maintains the UDF Library, an extensive collection of data processing functions covering:
- Data Quality
- Data Profiling
- Anomaly Detection
- Frequency Domain Analysis
- Data Repair
- Series Discovery
- Machine Learning

The UDF Library is an optional install. How to install the library is documented [here](https://iotdb.apache.org/UserGuide/latest/User-Manual/Database-Programming.html#data-quality-function-library). Documentation for the functions can be found [here](https://iotdb.apache.org/UserGuide/latest/Reference/UDF-Libraries.html).

The combination of built-in and UDF library functions, built on the low-latency queries enabled by IoTDB and its TsFile format, gives you a lot to explore.
## Data processing functions
### Built-in functions
IoTDB has an extensive collection of built-in data processing functions covering areas including:

- Aggregate Functions, such as `SUM`.
- Arithmetic Functions, such as `SIN`.
- Comparison Functions, such as `ON_OFF`.
- String Processing Functions, such as `STRING_CONTAINS`.
- Data Type Conversion Function, such as `CAST`.
- Constant Timeseries Generating Functions, such as `CONST`.
- Selector Functions, such as `TOP_K`.
- Continuous Interval Functions, such as `ZERO_DURATION`.
- Variation Trend Calculation Functions, such as `TIME_DIFFERENCE`.
- Sampling Functions, such as `M4`.
- Change Points Function, such as `CHANGE_POINTS`.

A full function list with examples can be found in the upstream [IoTDB Function reference manual](https://iotdb.apache.org/UserGuide/latest/Reference/Function-and-Expression.html).
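
As a rough illustration of calling a built-in function from client code, the sketch below builds a request for the IoTDB REST API, which the Playground image enables. It is a minimal sketch under several assumptions that are not taken from the Playground documentation: that the REST service is reachable from your host on port `18080`, that the query endpoint is `/rest/v2/query`, that the default `root`/`root` credentials are in use, and that a timeseries named `root.demo.vehicle1.speed` exists (a hypothetical name). Check each of these against your own deployment.

```python
import base64
import json
import urllib.request

# Assumed endpoint: REST service published on the host at port 18080.
REST_QUERY_URL = "http://localhost:18080/rest/v2/query"


def build_query_request(sql, user="root", password="root"):
    """Build (but do not send) a POST request carrying one SQL query."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        REST_QUERY_URL,
        data=json.dumps({"sql": sql}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def run_query(sql):
    """Send the query and return the decoded JSON response body."""
    with urllib.request.urlopen(build_query_request(sql), timeout=10) as resp:
        return json.load(resp)


# Hypothetical device and measurement names, for illustration only.
SUM_SPEED_SQL = "SELECT SUM(speed) FROM root.demo.vehicle1"
```

With the Playground running, calling `run_query(SUM_SPEED_SQL)` would send the aggregate query. Building the request separately from sending it keeps the sketch easy to inspect without a live server.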

### Data Quality Library functions
The IoTDB project also maintains a Data Quality Library, which provides an additional collection of functions covering:
- Data Quality, such as `Accuracy`.
- Data Profiling, such as `Sample`.
- Anomaly Detection, such as `Outlier`.
- Frequency Domain Analysis, such as `HighPass`.
- Data Repair, such as `TimestampRepair`.
- Series Discovery, such as `ConsecutiveSequences`.
- Machine Learning, such as `AR`.

A full function list with examples can be found in the upstream [IoTDB Data Quality Library reference manual](https://iotdb.apache.org/UserGuide/latest/Reference/UDF-Libraries.html).
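
Library functions are called in a query in the same way as built-in functions, optionally with `'key'='value'` attributes after the series name. The helper below is a small sketch of how you might compose such statements in Python. The device and measurement names are hypothetical, and the `wpass` attribute for `HighPass` is taken from the upstream reference manual, so confirm it against the version you are running.

```python
def udf_call(function, series, **attributes):
    """Render `function(series, 'key'='value', ...)` as IoTDB SQL text.

    Attribute values are inserted as-is; this sketch does no quoting or
    escaping, so only pass trusted values.
    """
    args = [series] + [f"'{key}'='{value}'" for key, value in attributes.items()]
    return f"{function}({', '.join(args)})"


def select_udf(function, series, device, **attributes):
    """Build a SELECT statement applying one function to one series."""
    return f"SELECT {udf_call(function, series, **attributes)} FROM {device}"


# Hypothetical device and measurement names, for illustration only.
repair_sql = select_udf("TimestampRepair", "speed", "root.demo.vehicle1")
highpass_sql = select_udf("HighPass", "speed", "root.demo.vehicle1", wpass="0.45")
```

The resulting strings can be passed to whichever IoTDB client you prefer. Remember that the library functions must be registered before such a query will succeed.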

#### Setup
In the upstream IoTDB project the library is an optional install.

For your convenience, the library is already included in the Playground IoTDB image. However, before you can call the functions they must be registered in the running IoTDB instance, which you only need to do once. The script `/iotdb/sbin/register-UDF.sh` is included in the IoTDB image to do this.

Steps:

1. Start the Playground if it is not already running.
2. From your host, execute the following command to run the registration script in the running IoTDB container:
```
sudo docker exec -ti iotdb-service /iotdb/sbin/register-UDF.sh
```
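
If you want to run the registration from a script or CI job rather than an interactive terminal, the `-ti` flags can fail because no TTY is attached. The following is a minimal sketch of a non-interactive wrapper in Python. It assumes the container name and script path shown above, and that the registration script needs no interactive input. The function names are illustrative only.

```python
import subprocess


def registration_command(container="iotdb-service", use_sudo=False):
    """Return the docker command that runs the UDF registration script."""
    cmd = ["docker", "exec", container, "/iotdb/sbin/register-UDF.sh"]
    return ["sudo", *cmd] if use_sudo else cmd


def register_udfs(container="iotdb-service", use_sudo=False):
    """Run the registration script and return its exit code (0 on success)."""
    result = subprocess.run(registration_command(container, use_sudo), check=False)
    return result.returncode
```

Calling `register_udfs(use_sudo=True)` mirrors the command above without requesting a TTY.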


### User Defined Functions
IoTDB also allows you to integrate your own functions as User Defined Functions (UDF). The [UDF section](https://iotdb.apache.org/UserGuide/latest/User-Manual/Database-Programming.html#user-defined-function-udf) of the IoTDB documentation explains how to develop and register your own.

## VISSR (VISS) integration
As part of the initial development of the playground the team extended VISSR to support connections to Apache IoTDB as a VISSR data store backend and upstreamed the support.