Commit

Refactoring: Move tutorial-like content and references to cratedb-guide
amotl committed Mar 6, 2024
1 parent 28f5228 commit 0b3df8b
Showing 31 changed files with 27 additions and 689 deletions.
Binary file removed docs/_assets/img/cluster-overview.png
Binary file not shown.
Binary file removed docs/_assets/img/grafana-add-panel.png
Binary file not shown.
Binary file removed docs/_assets/img/grafana-admin-create-table.png
Binary file not shown.
Binary file removed docs/_assets/img/grafana-connection.png
Binary file not shown.
Binary file removed docs/_assets/img/grafana-dashboard-final.png
Binary file not shown.
Binary file removed docs/_assets/img/grafana-import.png
Binary file not shown.
Binary file removed docs/_assets/img/grafana-new-dashboard.png
Binary file not shown.
Binary file removed docs/_assets/img/grafana-new-panel.png
Binary file not shown.
Binary file removed docs/_assets/img/grafana-panel1.png
Binary file not shown.
Binary file removed docs/_assets/img/grafana-panel2.png
Binary file not shown.
Binary file removed docs/_assets/img/grafana-psql.png
Binary file not shown.
Binary file removed docs/_assets/img/grafana-search.png
Binary file not shown.
Binary file removed docs/_assets/img/grafana-settings.png
Binary file not shown.
Binary file removed docs/_assets/img/grafana-welcome.png
Binary file not shown.
Binary file removed docs/_assets/img/metabase-add-database.png
Binary file not shown.
Binary file removed docs/_assets/img/metabase-answer.png
Binary file not shown.
Binary file removed docs/_assets/img/metabase-dashboard.png
Binary file not shown.
Binary file removed docs/_assets/img/metabase-database-configuration.png
Binary file not shown.
Binary file removed docs/_assets/img/metabase-question.png
Binary file not shown.
Binary file removed docs/_assets/img/metabase-sync-done.png
Binary file not shown.
27 changes: 4 additions & 23 deletions docs/connect/df.md
@@ -1,9 +1,10 @@
(df)=
(dataframes)=
# Use CrateDB with DataFrame libraries
# CrateDB and DataFrame libraries

This documentation section lists DataFrame libraries and frameworks which can
be used together with CrateDB, and outlines how to use them optimally.
be used together with CrateDB. Hands-on tutorials about them can be found
on the ["connect" section of the CrateDB Guide].


## Dask
@@ -29,11 +30,6 @@ the Python libraries that you know and love, like NumPy, pandas, and scikit-lear
:style: "clear: both"
```

**See also**
- [Guide to efficient data ingestion to CrateDB with pandas and Dask]
- [Efficient batch/bulk INSERT operations with pandas, Dask, and SQLAlchemy]
- [Dask code examples]


## pandas

@@ -49,13 +45,6 @@ and manipulation tool, built on top of the Python programming language.
:style: "clear: both"
```

**See also**
- [Importing Parquet files into CrateDB using Apache Arrow and SQLAlchemy]
- [Guide to efficient data ingestion to CrateDB with pandas]
- [Guide to efficient data ingestion to CrateDB with pandas and Dask]
- [Efficient batch/bulk INSERT operations with pandas, Dask, and SQLAlchemy]
- [pandas code examples]


## Polars

@@ -109,20 +98,12 @@ This allows you to easily integrate Polars into your existing data stack.
:style: "clear: both"
```

**See also**
- [Polars code examples]


[Apache Arrow]: https://arrow.apache.org/
["connect" section of the CrateDB Guide]: inv:guide:*:label#connect
[Dask]: https://www.dask.org/
[Dask code examples]: https://github.com/crate/cratedb-examples/tree/main/by-dataframe/dask
[Dask DataFrames]: https://docs.dask.org/en/latest/dataframe.html
[Dask Futures]: https://docs.dask.org/en/latest/futures.html
[Efficient batch/bulk INSERT operations with pandas, Dask, and SQLAlchemy]: https://cratedb.com/docs/python/en/latest/by-example/sqlalchemy/dataframe.html
[Importing Parquet files into CrateDB using Apache Arrow and SQLAlchemy]: https://community.crate.io/t/importing-parquet-files-into-cratedb-using-apache-arrow-and-sqlalchemy/1161
[pandas]: https://pandas.pydata.org/
[pandas code examples]: https://github.com/crate/cratedb-examples/tree/main/by-dataframe/pandas
[Polars]: https://pola.rs/
[Polars code examples]: https://github.com/crate/cratedb-examples/tree/main/by-dataframe/polars
[Guide to efficient data ingestion to CrateDB with pandas]: https://community.crate.io/t/guide-to-efficient-data-ingestion-to-cratedb-with-pandas/1541
[Guide to efficient data ingestion to CrateDB with pandas and Dask]: https://community.crate.io/t/guide-to-efficient-data-ingestion-to-cratedb-with-pandas-and-dask/1482
16 changes: 4 additions & 12 deletions docs/connect/orm.md
@@ -1,8 +1,9 @@
(orm)=
# Use CrateDB with ORM libraries
# CrateDB and ORM libraries

This documentation section lists ORM libraries and frameworks which can
be used together with CrateDB, and outlines how to use them optimally.
be used together with CrateDB. Hands-on tutorials about them can be found
on the ["connect" section of the CrateDB Guide].


## SQLAlchemy
@@ -24,15 +25,6 @@ frameworks, are using SQLAlchemy as data abstraction library when connecting to
:style: "clear: both"
```

**See also**
- [SQLAlchemy support]
- [SQLAlchemy by example]
- [Code examples]



[Code examples]: https://github.com/crate/cratedb-examples/tree/main/by-language/python-sqlalchemy
["connect" section of the CrateDB Guide]: inv:guide:*:label#connect
[RDBMS]: https://en.wikipedia.org/wiki/RDBMS
[SQLAlchemy]: https://www.sqlalchemy.org/
[SQLAlchemy by example]: https://cratedb.com/docs/python/en/latest/by-example/index.html#sqlalchemy-by-example
[SQLAlchemy support]: https://cratedb.com/docs/python/en/latest/sqlalchemy.html
9 changes: 0 additions & 9 deletions docs/index.md
@@ -145,14 +145,6 @@ Adapters and integrations with machine learning frameworks.
:::


:::{grid-item-card} {material-outlined}`integration_instructions;2em` Software Testing
:link: testing
:link-type: ref

Test frameworks and libraries that support software integration testing with
CrateDB.
:::


::::

@@ -188,7 +180,6 @@ System Metrics <integrate/metrics>
Data Visualization <integrate/visualize>
Business Intelligence <integrate/bi>
Machine Learning <integrate/ml>
Software Testing <integrate/testing>
```

```{toctree}
6 changes: 3 additions & 3 deletions docs/integrate/bi.md
@@ -19,12 +19,12 @@ data analytics and visualizations. Using Power BI Desktop, users can create repo
and dashboards from large datasets.

For connecting to CrateDB with Power BI, you can use the [Power Query PostgreSQL connector].
Earlier versions used the [PostgreSQL ODBC driver]. [](#cratedb-powerbi-desktop) walks
Earlier versions used the [PostgreSQL ODBC driver]. [](inv:guide#powerbi-desktop) walks
you through the process of configuring that correctly.

The [Power BI Service] is an online data analysis and visualization tool, making it
[Power BI Service] is an online data analysis and visualization tool, making it
possible to publish your dashboards, in order to share them with others.
[](#cratedb-powerbi-service) has a corresponding tutorial.
[](inv:guide#powerbi-service) has a corresponding tutorial.

![](https://crate.io/docs/crate/howtos/en/latest/_images/powerbi-table-navigator.png){h=160px}
![](https://crate.io/docs/crate/howtos/en/latest/_images/powerbi-pie-chart.png){h=160px}
76 changes: 3 additions & 73 deletions docs/integrate/etl.md
@@ -2,7 +2,8 @@
# ETL with CrateDB

Use ETL / data pipeline applications and frameworks for transferring data in
and out of CrateDB.
and out of CrateDB. Corresponding tutorials can be found within the
[CrateDB Guide: Integration Tutorials] section of the documentation.


(apache-airflow)=
@@ -30,26 +31,6 @@ to fit the level of abstraction that suits your environment.
[![](https://logowik.com/content/uploads/images/astronomer2824.jpg){w=180px}](https://www.astronomer.io/)
```

A set of starter tutorials.

- [Automating the import of Parquet files with Apache Airflow]
- [Updating stock market data automatically with CrateDB and Apache Airflow]
- [Automating stock data collection and storage with CrateDB and Apache Airflow]

A set of elaborated tutorials, including blueprint implementations.

- [Automating export of CrateDB data to S3 using Apache Airflow]
- [Implementing a data retention policy in CrateDB using Apache Airflow]
- [CrateDB and Apache Airflow: Building a data ingestion pipeline]
- [Building a hot and cold storage data retention policy in CrateDB with Apache Airflow]

Tutorials and resources about configuring the managed variants, Astro and CrateDB Cloud.

- [ETL with Astro and CrateDB Cloud in 30min - fully up in the cloud]
- [ETL pipeline using Apache Airflow with CrateDB (Source)]
- [Run an ETL pipeline with CrateDB and data quality checks]


```{seealso}
[CrateDB and Apache Airflow]
```
@@ -97,10 +78,6 @@ Systems Award].

> Apache Flink greatly expanded the use of stream data-processing.
- [Build a data ingestion pipeline using Kafka, Flink, and CrateDB]
- [Community Day: Stream processing with Apache Flink and CrateDB]
- [Executable stack: Apache Kafka, Apache Flink, and CrateDB]

![](https://flink.apache.org/img/flink-home-graphic.png){h=200px}

:::{dropdown} **Managed Flink**
@@ -122,10 +99,6 @@ A few companies are specializing in offering managed Flink services.
thousands of companies for high-performance data pipelines, streaming analytics,
data integration, and mission-critical applications.

- [Data Ingestion using Kafka and Kafka Connect]
- [Executable stack: Apache Kafka, Apache Flink, and CrateDB]
- [Tutorial: Replicating data to CrateDB with Debezium and Kafka]

```{seealso}
[CrateDB and Apache Kafka]
```
@@ -166,8 +139,6 @@ dbt projects run, for example with [Debezium](#debezium) or with [Airflow](#apac
Afterwards, data analysts can run their dbt projects against this data to produce models
(tables and views) that can be used with a number of [BI tools](#bi-tools).

- [Using dbt with CrateDB]

![](https://www.getdbt.com/ui/img/products/what-is-dbt-main-image.png){h=120px}
![](https://www.getdbt.com/ui/img/products/what-is-dbt-deploy.svg){h=120px}
![](https://www.getdbt.com/ui/img/products/what-is-dbt-eliminate-silos.svg){h=120px}
@@ -212,9 +183,6 @@ scale.
pointing it at your databases, you are able to subscribe to the event stream of
all database update operations.

- [Tutorial: Replicating data to CrateDB with Debezium and Kafka]
- [Webinar: How to replicate data from other databases to CrateDB with Debezium and Kafka]


## Kestra

@@ -236,8 +204,6 @@ Plugins are at the core of Kestra's extensibility. Many plugins are available fr
the Kestra core team, and creating your own is easy. With plugins, you can add new
functionality to Kestra.

- [Setting up data pipelines with CrateDB and Kestra]

![](https://kestra.io/landing/home/ui-3.png){h=120px}
![](https://kestra.io/landing/home/ui-4.png){h=120px}
![](https://kestra.io/landing/features/declarative.svg){h=120px}
@@ -263,9 +229,6 @@ It provides a browser-based editor that makes it easy to wire together flows
using the wide range of elements called "nodes" from the palette that can be
deployed to its runtime in a single-click.

- [Ingesting MQTT messages into CrateDB using Node-RED]
- [Automating recurrent CrateDB queries using Node-RED]

```{seealso}
[CrateDB and Node-RED]
```
@@ -319,12 +282,6 @@ integration engine adhering to the Singer specification.
[Meltano Hub] is the single source of truth to find any Meltano plugins as well
as Singer taps and targets.

- [meltano-target-cratedb]
- [meltano-tap-cratedb]
- [Examples about working with CrateDB and Meltano]

_Please note these adapters are a work in progress._

```{div}
:style: "clear: both"
```
@@ -356,10 +313,6 @@ Integration Services includes a rich set of built-in [tasks][ssis-tasks] and
[transformations][ssis-transformations], graphical tools for building packages, and
an SSIS Catalog database to store, run, and manage packages.

A demo project which uses SSIS and ODBC to read and write data from CrateDB:

- [Using SQL Server Integration Services with CrateDB]

```{div}
:style: "clear: both"
```
@@ -375,52 +328,29 @@ A demo project which uses SSIS and ODBC to read and write data from CrateDB:
[Apache Kafka]: https://kafka.apache.org/
[Apache Kafka on Azure]: https://azuremarketplace.microsoft.com/marketplace/consulting-services/canonical.0001-com-ubuntu-managed-kafka
[Astronomer]: https://www.astronomer.io/
[Automating recurrent CrateDB queries using Node-RED]: https://community.crate.io/t/automating-recurrent-cratedb-queries/788
[Automating export of CrateDB data to S3 using Apache Airflow]: https://community.crate.io/t/cratedb-and-apache-airflow-automating-data-export-to-s3/901
[Automating stock data collection and storage with CrateDB and Apache Airflow]: https://community.crate.io/t/automating-stock-data-collection-and-storage-with-cratedb-and-apache-airflow/990
[Automating the import of Parquet files with Apache Airflow]: https://community.crate.io/t/automating-the-import-of-parquet-files-with-apache-airflow/1247
[Azure Event Hubs for Apache Kafka]: https://learn.microsoft.com/en-us/azure/event-hubs/azure-event-hubs-kafka-overview
[Build a data ingestion pipeline using Kafka, Flink, and CrateDB]: https://dev.to/crate/build-a-data-ingestion-pipeline-using-kafka-flink-and-cratedb-1h5o
[Building a hot and cold storage data retention policy in CrateDB with Apache Airflow]: https://community.crate.io/t/cratedb-and-apache-airflow-building-a-hot-cold-storage-data-retention-policy/934
[Community Day: Stream processing with Apache Flink and CrateDB]: https://crate.io/blog/cratedb-community-day-2nd-edition-summary-and-highlights
[Confluent Cloud]: https://www.confluent.io/confluent-cloud/
[CrateDB and Apache Airflow]: https://crate.io/integrations/cratedb-and-apache-airflow
[CrateDB and Apache Airflow: Building a data ingestion pipeline]: https://community.crate.io/t/cratedb-and-apache-airflow-building-a-data-ingestion-pipeline/926
[CrateDB and Apache Kafka]: https://crate.io/integrations/cratedb-and-kafka
[CrateDB and Kestra]: https://crate.io/integrations/cratedb-and-kestra
[CrateDB and Node-RED]: https://crate.io/integrations/cratedb-and-node-red
[Data Ingestion using Kafka and Kafka Connect]: https://crate.io/docs/crate/howtos/en/latest/integrations/kafka-connect.html
[CrateDB Guide: Integration Tutorials]: inv:guide:*:label#integrate
[dbt]: https://www.getdbt.com/
[dbt Cloud]: https://www.getdbt.com/product/dbt-cloud/
[Debezium]: https://debezium.io/
[DoubleCloud Managed Service for Apache Kafka]: https://double.cloud/services/managed-kafka/
[ETL pipeline using Apache Airflow with CrateDB (Source)]: https://github.com/astronomer/astro-cratedb-blogpost
[ETL with Astro and CrateDB Cloud in 30min - fully up in the cloud]: https://www.astronomer.io/blog/run-etlelt-with-airflow-and-cratedb/
[Examples about working with CrateDB and Meltano]: https://github.com/crate/cratedb-examples/tree/amo/meltano/framework/singer-meltano
[Executable stack: Apache Kafka, Apache Flink, and CrateDB]: https://github.com/crate/cratedb-examples/tree/main/application/apache-kafka-flink
[Flink managed by Confluent]: https://www.datanami.com/2023/05/17/confluents-new-cloud-capabilities-address-data-streaming-hurdles/
[FlowFuse]: https://flowfuse.com/
[FlowFuse Cloud]: https://app.flowforge.com/
[Immerok Cloud]: https://web.archive.org/web/20230602085618/https://www.immerok.io/product
[Implementing a data retention policy in CrateDB using Apache Airflow]: https://community.crate.io/t/implementing-a-data-retention-policy-in-cratedb-using-apache-airflow/913
[Ingesting MQTT messages into CrateDB using Node-RED]: https://community.crate.io/t/ingesting-mqtt-messages-into-cratedb-using-node-red/803
[Introduction to FlowFuse]: https://flowfuse.com/webinars/2023/introduction-to-flowforge/
[Kestra]: https://kestra.io/
[Meltano]: https://meltano.com/
[Meltano Hub]: https://hub.meltano.com/
[meltano-tap-cratedb]: https://github.com/crate-workbench/meltano-tap-cratedb
[meltano-target-cratedb]: https://github.com/crate-workbench/meltano-target-cratedb
[Node-RED]: https://nodered.org/
[Overview about more managed Kafka offerings]: https://keen.io/blog/managed-apache-kafka-vs-diy/
[Run an ETL pipeline with CrateDB and data quality checks]: https://registry.astronomer.io/dags/etl_pipeline/
[Setting up data pipelines with CrateDB and Kestra]: https://community.crate.io/t/setting-up-data-pipelines-with-cratedb-and-kestra-io/1400
[Singer]: https://www.singer.io/
[SQL Server Integration Services]: https://learn.microsoft.com/en-us/sql/integration-services/sql-server-integration-services
[SSIS]: https://en.wikipedia.org/wiki/SQL_Server_Integration_Services
[ssis-tasks]: https://learn.microsoft.com/en-us/sql/integration-services/control-flow/integration-services-tasks
[ssis-transformations]: https://learn.microsoft.com/en-us/sql/integration-services/data-flow/transformations/integration-services-transformations
[Tutorial: Replicating data to CrateDB with Debezium and Kafka]: https://community.crate.io/t/replicating-data-to-cratedb-with-debezium-and-kafka/1388
[Updating stock market data automatically with CrateDB and Apache Airflow]: https://community.crate.io/t/updating-stock-market-data-automatically-with-cratedb-and-apache-airflow/1304
[Using dbt with CrateDB]: https://community.crate.io/t/using-dbt-with-cratedb/1566
[Using SQL Server Integration Services with CrateDB]: https://github.com/crate/cratedb-examples/tree/main/application/microsoft-ssis
[Webinar: How to replicate data from other databases to CrateDB with Debezium and Kafka]: https://crate.io/resources/webinars/lp-wb-debezium-kafka
25 changes: 6 additions & 19 deletions docs/integrate/metrics.md
@@ -2,8 +2,9 @@
# Monitoring and Metrics with CrateDB

Storing metrics data for the long term is a common need in systems monitoring
scenarios. CrateDB offers corresponding integration adapters.

scenarios. CrateDB offers corresponding integration adapters. Relevant tutorials
can be found within the [CrateDB Guide: Integration Tutorials] section of the
documentation.

(prometheus)=
## Prometheus
@@ -45,18 +46,11 @@ Adapter], one can easily store the collected metrics data in CrateDB and
take advantage of its high ingestion and query speed and friendly UI to
massively scale-out Prometheus.


**Resources**

- [CrateDB Prometheus Adapter]
- [Getting Started With Prometheus and CrateDB for Long-Term Storage]
- [Storing long-term metrics with Prometheus in CrateDB]
- [Webinar: Using Prometheus and Grafana with CrateDB Cloud]

![](https://github.com/crate/crate-clients-tools/assets/453543/26b47686-889a-4137-a87f-d6a6b38d56d2){h=200px}

```{seealso}
[CrateDB and Prometheus]
- [CrateDB and Prometheus]
- [CrateDB Prometheus Adapter]
```

```{div}
@@ -93,10 +87,6 @@ a very minimal memory footprint.
- **System telemetry**: Metrics from system telemetry like iptables, Netstat,
NGINX, and HAProxy help provide a full stack view of your apps.

**Resources**

- [Use CrateDB With Telegraf, an Agent for Collecting & Reporting Metrics]

![](https://www.influxdata.com/wp-content/uploads/Main-Diagram_06.01.2022v1.png){h=200px}

```{seealso}
@@ -110,13 +100,10 @@ a very minimal memory footprint.

[CrateDB and Prometheus]: https://cratedb.com/integrations/cratedb-and-prometheus
[CrateDB and Telegraf]: https://crate.io/integrations/cratedb-and-telegraf
[CrateDB Guide: Integration Tutorials]: inv:guide:*:label#integrate
[CrateDB Prometheus Adapter]: https://github.com/crate/cratedb-prometheus-adapter
[Getting Started With Prometheus and CrateDB for Long-Term Storage]: https://cratedb.com/blog/getting-started-prometheus-cratedb-long-term-storage
[Prometheus]: https://prometheus.io/
[Prometheus remote endpoints and storage]: https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage
[remote read]: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_read
[remote write]: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write
[Storing long-term metrics with Prometheus in CrateDB]: https://community.cratedb.com/t/storing-long-term-metrics-with-prometheus-in-cratedb/1012
[Telegraf]: https://www.influxdata.com/time-series-platform/telegraf/
[Use CrateDB With Telegraf, an Agent for Collecting & Reporting Metrics]: https://crate.io/blog/use-cratedb-with-telegraf-an-agent-for-collecting-reporting-metrics
[Webinar: Using Prometheus and Grafana with CrateDB Cloud]: https://cratedb.com/resources/webinars/lp-wb-prometheus-grafana