Time Series: Improve guidance, structure, layout, and wording
- Expand canonical "time series" entry-point page
- Add dedicated time series sub-pages about:
  - Time Series Basics
  - Advanced Time Series Analysis
  - Connectivity Options
  - Video Tutorials
- Use "time series" 2-gram everywhere
- Improve page about "Industrial Data"
- Improve page about "Document Store"
- ML: Add section about "Exploratory data analysis (EDA)"
amotl committed Feb 28, 2024
1 parent 500ed8d commit d63e5f3
Showing 13 changed files with 753 additions and 34 deletions.
6 changes: 3 additions & 3 deletions docs/admin/sharding-partitioning.rst
@@ -64,7 +64,7 @@ partition as a set of shards. For each partition, the number of shards defined
by ``CLUSTERED INTO x SHARDS`` are created, when a first record with a specific
``partition key`` is inserted.

In the following example - which represents a very simple time-series use-case
In the following example - which represents a very simple time series use-case
- we added another column ``part`` that automatically generates the current
month upon insertion from the ``ts`` column. The ``part`` column is further used
as the ``partition key``.
@@ -132,12 +132,12 @@ Then, to calculate the number of shards, you should consider that the size of ea
shard should roughly be between 5 - 100 GB, and that each node can only manage
up to 1000 shards.

Time-series example
Time series example
-------------------

To illustrate the steps above, let's use them on behalf of an example. Imagine
you want to create a *partitioned table* on a *three-node cluster* to store
time-series data with the following assumptions:
time series data with the following assumptions:

- Inserts: 1.000 records/s
- Record size: 128 byte/record
3 changes: 3 additions & 0 deletions docs/domain/document/index.md
@@ -9,8 +9,11 @@ Storing documents in CrateDB provides the same development convenience like the
document-oriented storage layer of Lotus Notes / Domino, CouchDB, MongoDB, and
PostgreSQL's `JSON(B)` types.

- [](inv:crate-reference#type-object)
- [](inv:cloud#object)
- [CrateDB Objects]
- [Unleashing the Power of Nested Data: Ingesting and Querying JSON Documents with SQL]


[CrateDB Objects]: https://youtu.be/aQi9MXs2irU?feature=shared
[Unleashing the Power of Nested Data: Ingesting and Querying JSON Documents with SQL]: https://youtu.be/S_RHmdz2IQM?feature=shared
129 changes: 116 additions & 13 deletions docs/domain/industrial/index.md
@@ -5,7 +5,7 @@
# Industrial Data

Learn how to use CrateDB in industrial / IIoT / Industry 4.0 scenarios within
engineering, manufacturing, and other operational domains.
engineering, manufacturing, production, and other operational domains.

In the realm of Industrial IoT, dealing with diverse data, ranging from
slow-moving structured data, to high-frequency measurements, presents unique
@@ -15,24 +15,110 @@ The complexities of industrial big data are characterized by its high variety,
unstructured features, different data sampling rates, and how these attributes
influence data storage, retention, and integration.

Today's warehouses are complex systems with a very high degree of automation.
The key to the successful operation of these warehouses lies in having a
holistic view on the entire system based on data from various components like
sensors, PLCs, embedded controllers and software systems.

(rauch)=
## Rauch Insights

::::{info-card}

:::{grid-item}
:columns: 8

{material-outlined}`data_exploration;2em`   **Rauch: High-Speed Production Lines**

_Scaling a high-speed production environment with CrateDB._

In this interview, Arno Breuss, CIO of Rauch Fruchtsäfte, talks about how Rauch
is filling 33 cans per second and how that adds up to 400 data records per
second which are being processed, stored, and analyzed. In total, they are
within the range of one to ten billion records persisted in CrateDB.

- [Rauch: Arno Breuss about High-Speed Production Lines]

Arno explains why their traditional databases weren't capable to deal with so
many data records and unstructured data. He elaborates about the benefits of
CrateDB that made Rauch choose CrateDB over other databases such as PostgreSQL
compatibility, the support for unstructured data and why it is important
for them, and its excellent customer support.

:Industry: {tags-secondary}`Food` {tags-secondary}`Packaging` {tags-secondary}`Production`
:Tags: {tags-primary}`Data Historian` {tags-primary}`Industrial IoT` {tags-primary}`PLC` {tags-primary}`SCADA`
:::

:::{grid-item}  
:columns: 4

<iframe width="240" src="https://www.youtube-nocookie.com/embed/gJPmJ0uXeVs?si=J0w5yG56Ld4fIXfm" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>

**Date:** 28 Jun 2022 \
**Speaker:** Arno Breuss
:::

::::


(tgw)=
## TGW Insights


::::{info-card}

:::{grid-item}
:columns: 8

{material-outlined}`inventory;2em` &nbsp; **TGW: Data acquisition in high-speed logistics**

_Storing, querying, and analyzing industrial IoT data and metadata without
much hassle._

Today's warehouses are complex systems with a very high degree of automation.

In this interview, Alexander Mann of TGW Logistics Group, Digital Core /
Connected Warehouse, talks about how TGW implements key factors to the
successful operation of these warehouses, by having a holistic view on the
entire system acquiring data from various components like sensors, PLCs,
embedded controllers, and software systems.

- [TGW: Alexander Mann about fixing data silos in a high-speed logistics environment]

Alexander states that all these components can be seen as "data silos",
distributed across the entire site, each of them storing just some pieces of
information in various data structures and different ways to access it.

After trying multiple database systems, TGW Logistics moved to CrateDB for
its ability to aggregate different data formats and ability to query this
information without much hassle.

its ability to aggregate different data formats and the ability to query this
information without further ado.

:Industry: {tags-secondary}`Logistics` {tags-secondary}`Shipping`
:Tags: {tags-primary}`Data Historian` {tags-primary}`Industrial IoT` {tags-primary}`PLC` {tags-primary}`SCADA`
:::

:::{grid-item} &nbsp;
:columns: 4

<iframe width="240" src="https://www.youtube-nocookie.com/embed/6dgjVQJtSKI?si=J0w5yG56Ld4fIXfm" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>

**Date:** 22 Jun 2022 \
**Speakers:** Alexander Mann, Jan Weber
:::

::::



::::{info-card}

:::{grid-item}
:columns: 8

{material-outlined}`dashboard;2em` &nbsp; **TGW: Challenges of storing and analyzing industrial data**

In the second presentation, you will learn how TGW leverages CrateDB to build
digital twins of physical warehouses around the world.
digital twins of physical warehouses around the world, by using its unique set
of features suitable for storing and querying complex industrial big data with
high variety, unstructured features, and at different data frequencies.

- [Fixing data silos in a high-speed logistics environment]
- [Challenges of Storing and Analyzing Industrial Data]
- [TGW: Alexander Mann about challenges of storing and analyzing real-world industrial data]

**What's inside**

@@ -47,6 +133,23 @@ digital twins of physical warehouses around the world.
- Real-World Applications: Exploration of actual customer use cases to
illustrate how CrateDB can be applied in various industrial scenarios.

:Industry: {tags-secondary}`Logistics` {tags-secondary}`Shipping`
:Tags: {tags-primary}`Data Historian` {tags-primary}`Industrial IoT` {tags-primary}`Digital Twin`
:::

:::{grid-item} &nbsp;
:columns: 4

<iframe width="240" src="https://www.youtube-nocookie.com/embed/ugQvihToY0k?si=J0w5yG56Ld4fIXfm" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>

**Date:** 5 Oct 2023 \
**Speakers:** Alexander Mann, Georg Traar
:::

::::



[Challenges of Storing and Analyzing Industrial Data]: https://youtu.be/ugQvihToY0k?feature=shared
[Fixing data silos in a high-speed logistics environment]: https://youtu.be/6dgjVQJtSKI?feature=shared
[TGW: Alexander Mann about challenges of storing and analyzing real-world industrial data]: https://youtu.be/ugQvihToY0k?feature=shared
[TGW: Alexander Mann about fixing data silos in a high-speed logistics environment]: https://youtu.be/6dgjVQJtSKI?feature=shared
[Rauch: Arno Breuss about High-Speed Production Lines]: https://youtu.be/gJPmJ0uXeVs?feature=shared