[CONTENT] Refactoring: Absorb tutorial-like references and content from crate-clients-tools #29

Merged
merged 1 commit into from
Feb 23, 2024
47 changes: 47 additions & 0 deletions docs/getting-started/connect.md
@@ -0,0 +1,47 @@
(connect)=

# Connect

You have a variety of options to connect your custom applications to CrateDB,
and to integrate it with 3rd-party applications, mostly using [CrateDB's
PostgreSQL interface].

This documentation section lists client drivers, libraries, and frameworks,
which can be used together with CrateDB, and outlines how to use them optimally.
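
As a minimal sketch of the PostgreSQL-interface route mentioned above, the following Python helper assembles a PostgreSQL-style connection URI for a CrateDB node. The function name `postgresql_dsn` is hypothetical, and the defaults (port `5432`, user `crate`, schema `doc`) are assumptions based on a stock local CrateDB installation; adjust them to your setup.

```python
from urllib.parse import quote, urlsplit


def postgresql_dsn(host="localhost", port=5432, user="crate", password=None, schema="doc"):
    """Build a PostgreSQL-style connection URI (hypothetical helper).

    Assumption: the node exposes the PostgreSQL wire protocol on `port`,
    and the path component selects the default schema.
    """
    # Percent-encode credentials so characters like "@" or " " stay valid.
    credentials = quote(user, safe="")
    if password:
        credentials += ":" + quote(password, safe="")
    return f"postgresql://{credentials}@{host}:{port}/{schema}"


dsn = postgresql_dsn()
print(dsn)  # prints "postgresql://crate@localhost:5432/doc"

# Sanity check: the URI parses back into the expected components.
parts = urlsplit(dsn)
print(parts.hostname, parts.port, parts.username)
```

The resulting URI can typically be handed to a PostgreSQL client library, for example `psycopg.connect(dsn)`, assuming such a driver is installed and a CrateDB node is reachable.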

- [Drivers and Integrations]
- [Integration Tutorials]

For specific topics, there are code examples for database drivers,
dataframe libraries, and ORM libraries.

- [Database Driver Code Examples]
- SQLAlchemy
- [SQLAlchemy Support], [SQLAlchemy by Example], [SQLAlchemy Code Examples]
- pandas and Dask
- [Importing Parquet files into CrateDB using Apache Arrow and SQLAlchemy]
- [Guide to efficient data ingestion to CrateDB with pandas]
- [Guide to efficient data ingestion to CrateDB with pandas and Dask]
- [Efficient batch/bulk INSERT operations with pandas, Dask, and SQLAlchemy]
- [pandas code examples]
- [Dask code examples]
- Polars
- [Polars code examples]
Comment on lines +22 to +29
Member

Is this the right place for this content? The section is about connecting, while this content looks more like integrations.
Should we add a pandas (or similar) section to the integrations section instead? I think this would also highlight it better.

Member Author

@amotl amotl Feb 22, 2024

I will see what I can do here while compressing the whole "Getting Started" section into a single page, as suggested. Thanks.

Member Author

@amotl amotl Feb 22, 2024

Corresponding improvements have been implemented in a separate PR, in order to make reviewing easier and to separate concerns.





[CrateDB's PostgreSQL interface]: inv:crate-reference:*:label#interface-postgresql
[Dask code examples]: https://github.com/crate/cratedb-examples/tree/main/by-dataframe/dask
[Database Driver Code Examples]: inv:crate-clients-tools:*:label#connect
[Drivers and Integrations]: inv:crate-clients-tools:*:label#index
[Efficient batch/bulk INSERT operations with pandas, Dask, and SQLAlchemy]: https://cratedb.com/docs/python/en/latest/by-example/sqlalchemy/dataframe.html
[Guide to efficient data ingestion to CrateDB with pandas]: https://community.crate.io/t/guide-to-efficient-data-ingestion-to-cratedb-with-pandas/1541
[Guide to efficient data ingestion to CrateDB with pandas and Dask]: https://community.crate.io/t/guide-to-efficient-data-ingestion-to-cratedb-with-pandas-and-dask/1482
[Importing Parquet files into CrateDB using Apache Arrow and SQLAlchemy]: https://community.crate.io/t/importing-parquet-files-into-cratedb-using-apache-arrow-and-sqlalchemy/1161
[Integration Tutorials]: https://community.crate.io/t/overview-of-cratedb-integration-tutorials/1015
[pandas code examples]: https://github.com/crate/cratedb-examples/tree/main/by-dataframe/pandas
[Polars code examples]: https://github.com/crate/cratedb-examples/tree/main/by-dataframe/polars
[SQLAlchemy by Example]: https://cratedb.com/docs/python/en/latest/by-example/index.html#sqlalchemy-by-example
[SQLAlchemy Code Examples]: https://github.com/crate/cratedb-examples/tree/main/by-language/python-sqlalchemy
[SQLAlchemy Support]: https://cratedb.com/docs/python/en/latest/sqlalchemy.html
1 change: 1 addition & 0 deletions docs/getting-started/index.rst
@@ -14,3 +14,4 @@ A collection of CrateDB best practices and tips for common scenarios.

introduction
create-user
connect
1 change: 1 addition & 0 deletions docs/index.md
@@ -157,6 +157,7 @@ reference-architectures/index
topic/analysis/index
topic/timeseries/index
topic/ml/index
topic/testing
Member

Can we re-order the entries here so that the reference-architecture is not in the middle of integrations and topics? I'd suggest moving the reference-architecture up before integrations.

Member Author

Sure, we can take a look at how this looks, independently of semantic matters. Thanks.

Member Author

@amotl amotl Feb 22, 2024

Implemented with 2b4b45d in a separate PR, stacked upon this one.

```


2 changes: 1 addition & 1 deletion docs/integrate/azure-functions.rst
@@ -1,4 +1,4 @@
-.. _cratedb-azure-functions:
+.. _azure-functions:

===========================================================
Data Enrichment using IoT Hubs, Azure Functions and CrateDB
169 changes: 156 additions & 13 deletions docs/integrate/index.md
@@ -1,29 +1,172 @@
(integrate)=

# Integrations

A set of tutorials and guidelines how to integrate CrateDB with 3rd-party
systems.
You have a variety of options to connect and integrate 3rd-party
applications, mostly using [CrateDB's PostgreSQL interface].

This documentation section lists frameworks and applications which can
be used together with CrateDB, and outlines how to use them optimally.


## Apache Airflow / Astronomer

A set of starter tutorials.

- [Automating the import of Parquet files with Apache Airflow]
- [Updating stock market data automatically with CrateDB and Apache Airflow]
- [Automating stock data collection and storage with CrateDB and Apache Airflow]

A set of elaborated tutorials, including blueprint implementations.

- [Automating export of CrateDB data to S3 using Apache Airflow]
- [Implementing a data retention policy in CrateDB using Apache Airflow]
- [CrateDB and Apache Airflow: Building a data ingestion pipeline]
- [Building a hot and cold storage data retention policy in CrateDB with Apache Airflow]

Tutorials and resources about configuring the managed variants, Astro and CrateDB Cloud.

- [ETL with Astro and CrateDB Cloud in 30min - fully up in the cloud]
- [ETL pipeline using Apache Airflow with CrateDB (Source)]
- [Run an ETL pipeline with CrateDB and data quality checks]


## Apache Flink

- {ref}`kafka-connect`
- [Build a data ingestion pipeline using Kafka, Flink, and CrateDB]
- [Community Day: Stream processing with Apache Flink and CrateDB]
- [Executable stack: Apache Kafka, Apache Flink, and CrateDB]


## Apache Kafka

- [Data Ingestion using Kafka and Kafka Connect]
- [Executable stack: Apache Kafka, Apache Flink, and CrateDB]
- [Tutorial: Replicating data to CrateDB with Debezium and Kafka]


## Azure Functions

- {ref}`azure-functions`


## dbt

- [Using dbt with CrateDB]


## Debezium

- [Tutorial: Replicating data to CrateDB with Debezium and Kafka]
- [Webinar: How to replicate data from other databases to CrateDB with Debezium and Kafka]


## Kestra

- [Setting up data pipelines with CrateDB and Kestra]


## MongoDB

- {ref}`integrate-mongodb`


## MySQL

- {ref}`integrate-mysql`


## Node-RED

- [Ingesting MQTT messages into CrateDB using Node-RED]
- [Automating recurrent CrateDB queries using Node-RED]


## Prometheus

- [CrateDB Prometheus Adapter]
- [Getting Started With Prometheus and CrateDB for Long-Term Storage]
- [Storing long-term metrics with Prometheus in CrateDB]
- [Webinar: Using Prometheus and Grafana with CrateDB Cloud]


## Singer / Meltano

- [meltano-target-cratedb]
- [meltano-tap-cratedb]
- [Examples about working with CrateDB and Meltano]

🚧 _Please note these adapters are a work in progress._ 🚧


## SQL Server Integration Services

A demo project that uses SSIS and ODBC to read data from and write data to CrateDB:

- [Using SQL Server Integration Services with CrateDB]


## StreamSets

- {ref}`streamsets`


## Telegraf

- [Use CrateDB With Telegraf, an Agent for Collecting & Reporting Metrics]


[Automating recurrent CrateDB queries using Node-RED]: https://community.cratedb.com/t/automating-recurrent-cratedb-queries/788
[Automating export of CrateDB data to S3 using Apache Airflow]: https://community.cratedb.com/t/cratedb-and-apache-airflow-automating-data-export-to-s3/901
[Automating stock data collection and storage with CrateDB and Apache Airflow]: https://community.cratedb.com/t/automating-stock-data-collection-and-storage-with-cratedb-and-apache-airflow/990
[Automating the import of Parquet files with Apache Airflow]: https://community.cratedb.com/t/automating-the-import-of-parquet-files-with-apache-airflow/1247
[Build a data ingestion pipeline using Kafka, Flink, and CrateDB]: https://dev.to/crate/build-a-data-ingestion-pipeline-using-kafka-flink-and-cratedb-1h5o
[Building a hot and cold storage data retention policy in CrateDB with Apache Airflow]: https://community.cratedb.com/t/cratedb-and-apache-airflow-building-a-hot-cold-storage-data-retention-policy/934
[Community Day: Stream processing with Apache Flink and CrateDB]: https://cratedb.com/blog/cratedb-community-day-2nd-edition-summary-and-highlights
[CrateDB and Apache Airflow: Building a data ingestion pipeline]: https://community.cratedb.com/t/cratedb-and-apache-airflow-building-a-data-ingestion-pipeline/926
[CrateDB's PostgreSQL interface]: inv:crate-reference:*:label#interface-postgresql
[CrateDB Prometheus Adapter]: https://github.com/crate/cratedb-prometheus-adapter
[Data Ingestion using Kafka and Kafka Connect]: https://cratedb.com/docs/crate/howtos/en/latest/integrations/kafka-connect.html
[ETL pipeline using Apache Airflow with CrateDB (Source)]: https://github.com/astronomer/astro-cratedb-blogpost
[ETL with Astro and CrateDB Cloud in 30min - fully up in the cloud]: https://www.astronomer.io/blog/run-etlelt-with-airflow-and-cratedb/
[Examples about working with CrateDB and Meltano]: https://github.com/crate/cratedb-examples/tree/amo/meltano/framework/singer-meltano
[Executable stack: Apache Kafka, Apache Flink, and CrateDB]: https://github.com/crate/cratedb-examples/tree/main/application/apache-kafka-flink
[Getting Started With Prometheus and CrateDB for Long-Term Storage]: https://cratedb.com/blog/getting-started-prometheus-cratedb-long-term-storage
[Implementing a data retention policy in CrateDB using Apache Airflow]: https://community.cratedb.com/t/implementing-a-data-retention-policy-in-cratedb-using-apache-airflow/913
[Ingesting MQTT messages into CrateDB using Node-RED]: https://community.cratedb.com/t/ingesting-mqtt-messages-into-cratedb-using-node-red/803
[meltano-tap-cratedb]: https://github.com/crate-workbench/meltano-tap-cratedb
[meltano-target-cratedb]: https://github.com/crate-workbench/meltano-target-cratedb
[Overview of CrateDB integration tutorials]: https://community.cratedb.com/t/overview-of-cratedb-integration-tutorials/1015
[Run an ETL pipeline with CrateDB and data quality checks]: https://registry.astronomer.io/dags/etl_pipeline/
[Setting up data pipelines with CrateDB and Kestra]: https://community.cratedb.com/t/setting-up-data-pipelines-with-cratedb-and-kestra-io/1400
[Storing long-term metrics with Prometheus in CrateDB]: https://community.cratedb.com/t/storing-long-term-metrics-with-prometheus-in-cratedb/1012
[Tutorial: Replicating data to CrateDB with Debezium and Kafka]: https://community.cratedb.com/t/replicating-data-to-cratedb-with-debezium-and-kafka/1388
[Updating stock market data automatically with CrateDB and Apache Airflow]: https://community.cratedb.com/t/updating-stock-market-data-automatically-with-cratedb-and-apache-airflow/1304
[Use CrateDB With Telegraf, an Agent for Collecting & Reporting Metrics]: https://cratedb.com/blog/use-cratedb-with-telegraf-an-agent-for-collecting-reporting-metrics
[Using dbt with CrateDB]: https://community.cratedb.com/t/using-dbt-with-cratedb/1566
[Using SQL Server Integration Services with CrateDB]: https://github.com/crate/cratedb-examples/tree/main/application/microsoft-ssis
[Webinar: How to replicate data from other databases to CrateDB with Debezium and Kafka]: https://cratedb.com/resources/webinars/lp-wb-debezium-kafka
[Webinar: Using Prometheus and Grafana with CrateDB Cloud]: https://cratedb.com/resources/webinars/lp-wb-prometheus-grafana


:::{tip}
Please also visit the [Overview of CrateDB integration tutorials].
:::



```{toctree}
:maxdepth: 1
:hidden:
mongodb
mysql
```


```{toctree}
:maxdepth: 1
:hidden:
kafka-connect
azure-functions
streamsets
```


:::{tip}
Please also visit the [Overview of CrateDB integration tutorials].
:::


[Overview of CrateDB integration tutorials]: https://community.cratedb.com/t/overview-of-cratedb-integration-tutorials/1015
2 changes: 2 additions & 0 deletions docs/integrate/kafka-connect.rst
@@ -1,3 +1,5 @@
.. _kafka-connect:

============================================
Data Ingestion using Kafka and Kafka Connect
============================================
1 change: 1 addition & 0 deletions docs/integrate/mongodb.rst
@@ -1,5 +1,6 @@
.. highlight:: psql

.. _integrate-mongodb:
.. _migrating-mongodb:

========================
1 change: 1 addition & 0 deletions docs/integrate/mysql.rst
@@ -1,5 +1,6 @@
.. highlight:: psql

.. _integrate-mysql:
.. _migrating-mysql:

======================
2 changes: 1 addition & 1 deletion docs/integrate/streamsets.rst
@@ -1,4 +1,4 @@
-.. _cratedb-streamsets:
+.. _streamsets:

================================================================
Data Stream Pipelines with CrateDB and StreamSets Data Collector