diff --git a/docs/_assets/img/cluster-overview.png b/docs/_assets/img/cluster-overview.png
deleted file mode 100644
index 6b27417..0000000
Binary files a/docs/_assets/img/cluster-overview.png and /dev/null differ
diff --git a/docs/_assets/img/grafana-add-panel.png b/docs/_assets/img/grafana-add-panel.png
deleted file mode 100644
index 2221c00..0000000
Binary files a/docs/_assets/img/grafana-add-panel.png and /dev/null differ
diff --git a/docs/_assets/img/grafana-admin-create-table.png b/docs/_assets/img/grafana-admin-create-table.png
deleted file mode 100644
index 9d72fc4..0000000
Binary files a/docs/_assets/img/grafana-admin-create-table.png and /dev/null differ
diff --git a/docs/_assets/img/grafana-connection.png b/docs/_assets/img/grafana-connection.png
deleted file mode 100644
index 1daf6d0..0000000
Binary files a/docs/_assets/img/grafana-connection.png and /dev/null differ
diff --git a/docs/_assets/img/grafana-dashboard-final.png b/docs/_assets/img/grafana-dashboard-final.png
deleted file mode 100644
index cb5f54b..0000000
Binary files a/docs/_assets/img/grafana-dashboard-final.png and /dev/null differ
diff --git a/docs/_assets/img/grafana-import.png b/docs/_assets/img/grafana-import.png
deleted file mode 100644
index 7b63db4..0000000
Binary files a/docs/_assets/img/grafana-import.png and /dev/null differ
diff --git a/docs/_assets/img/grafana-new-dashboard.png b/docs/_assets/img/grafana-new-dashboard.png
deleted file mode 100644
index 8ea8ac5..0000000
Binary files a/docs/_assets/img/grafana-new-dashboard.png and /dev/null differ
diff --git a/docs/_assets/img/grafana-new-panel.png b/docs/_assets/img/grafana-new-panel.png
deleted file mode 100644
index f971139..0000000
Binary files a/docs/_assets/img/grafana-new-panel.png and /dev/null differ
diff --git a/docs/_assets/img/grafana-panel1.png b/docs/_assets/img/grafana-panel1.png
deleted file mode 100644
index e78fa9b..0000000
Binary files a/docs/_assets/img/grafana-panel1.png and /dev/null differ
diff --git a/docs/_assets/img/grafana-panel2.png b/docs/_assets/img/grafana-panel2.png
deleted file mode 100644
index d8c0401..0000000
Binary files a/docs/_assets/img/grafana-panel2.png and /dev/null differ
diff --git a/docs/_assets/img/grafana-psql.png b/docs/_assets/img/grafana-psql.png
deleted file mode 100644
index a65dc13..0000000
Binary files a/docs/_assets/img/grafana-psql.png and /dev/null differ
diff --git a/docs/_assets/img/grafana-search.png b/docs/_assets/img/grafana-search.png
deleted file mode 100644
index 12c6f4a..0000000
Binary files a/docs/_assets/img/grafana-search.png and /dev/null differ
diff --git a/docs/_assets/img/grafana-settings.png b/docs/_assets/img/grafana-settings.png
deleted file mode 100644
index 8283e65..0000000
Binary files a/docs/_assets/img/grafana-settings.png and /dev/null differ
diff --git a/docs/_assets/img/grafana-welcome.png b/docs/_assets/img/grafana-welcome.png
deleted file mode 100644
index 9f9b602..0000000
Binary files a/docs/_assets/img/grafana-welcome.png and /dev/null differ
diff --git a/docs/_assets/img/metabase-add-database.png b/docs/_assets/img/metabase-add-database.png
deleted file mode 100644
index d71144e..0000000
Binary files a/docs/_assets/img/metabase-add-database.png and /dev/null differ
diff --git a/docs/_assets/img/metabase-answer.png b/docs/_assets/img/metabase-answer.png
deleted file mode 100644
index d8dc13e..0000000
Binary files a/docs/_assets/img/metabase-answer.png and /dev/null differ
diff --git a/docs/_assets/img/metabase-dashboard.png b/docs/_assets/img/metabase-dashboard.png
deleted file mode 100644
index f7440f1..0000000
Binary files a/docs/_assets/img/metabase-dashboard.png and /dev/null differ
diff --git a/docs/_assets/img/metabase-database-configuration.png b/docs/_assets/img/metabase-database-configuration.png
deleted file mode 100644
index 70df67c..0000000
Binary files a/docs/_assets/img/metabase-database-configuration.png and /dev/null differ
diff --git a/docs/_assets/img/metabase-question.png b/docs/_assets/img/metabase-question.png
deleted file mode 100644
index c71c697..0000000
Binary files a/docs/_assets/img/metabase-question.png and /dev/null differ
diff --git a/docs/_assets/img/metabase-sync-done.png b/docs/_assets/img/metabase-sync-done.png
deleted file mode 100644
index 14ef898..0000000
Binary files a/docs/_assets/img/metabase-sync-done.png and /dev/null differ
diff --git a/docs/connect/df.md b/docs/connect/df.md
index 2f4044c..e1b5454 100644
--- a/docs/connect/df.md
+++ b/docs/connect/df.md
@@ -1,9 +1,10 @@
 (df)=
 (dataframes)=
-# Use CrateDB with DataFrame libraries
+# CrateDB and DataFrame libraries
 This documentation section lists DataFrame libraries and frameworks which can
-be used together with CrateDB, and outlines how to use them optimally.
+be used together with CrateDB. Hands-on tutorials about them can be found
+on the ["connect" section of the CrateDB Guide].
 ## Dask
@@ -29,11 +30,6 @@
 the Python libraries that you know and love, like NumPy, pandas, and scikit-learn.
 :style: "clear: both"
 ```
-**See also**
-- [Guide to efficient data ingestion to CrateDB with pandas and Dask]
-- [Efficient batch/bulk INSERT operations with pandas, Dask, and SQLAlchemy]
-- [Dask code examples]
-
 ## pandas
@@ -49,13 +45,6 @@
 and manipulation tool, built on top of the Python programming language.
 :style: "clear: both"
 ```
-**See also**
-- [Importing Parquet files into CrateDB using Apache Arrow and SQLAlchemy]
-- [Guide to efficient data ingestion to CrateDB with pandas]
-- [Guide to efficient data ingestion to CrateDB with pandas and Dask]
-- [Efficient batch/bulk INSERT operations with pandas, Dask, and SQLAlchemy]
-- [pandas code examples]
-
 ## Polars
@@ -109,20 +98,12 @@
 This allows you to easily integrate Polars into your existing data stack.
 :style: "clear: both"
 ```
-**See also**
-- [Polars code examples]
 [Apache Arrow]: https://arrow.apache.org/
+["connect" section of the CrateDB Guide]: inv:guide:*:label#connect
 [Dask]: https://www.dask.org/
-[Dask code examples]: https://github.com/crate/cratedb-examples/tree/main/by-dataframe/dask
 [Dask DataFrames]: https://docs.dask.org/en/latest/dataframe.html
 [Dask Futures]: https://docs.dask.org/en/latest/futures.html
-[Efficient batch/bulk INSERT operations with pandas, Dask, and SQLAlchemy]: https://cratedb.com/docs/python/en/latest/by-example/sqlalchemy/dataframe.html
-[Importing Parquet files into CrateDB using Apache Arrow and SQLAlchemy]: https://community.crate.io/t/importing-parquet-files-into-cratedb-using-apache-arrow-and-sqlalchemy/1161
 [pandas]: https://pandas.pydata.org/
-[pandas code examples]: https://github.com/crate/cratedb-examples/tree/main/by-dataframe/pandas
 [Polars]: https://pola.rs/
-[Polars code examples]: https://github.com/crate/cratedb-examples/tree/main/by-dataframe/polars
-[Guide to efficient data ingestion to CrateDB with pandas]: https://community.crate.io/t/guide-to-efficient-data-ingestion-to-cratedb-with-pandas/1541
-[Guide to efficient data ingestion to CrateDB with pandas and Dask]: https://community.crate.io/t/guide-to-efficient-data-ingestion-to-cratedb-with-pandas-and-dask/1482
diff --git a/docs/connect/orm.md b/docs/connect/orm.md
index ca533b4..afee63d 100644
--- a/docs/connect/orm.md
+++ b/docs/connect/orm.md
@@ -1,8 +1,9 @@
 (orm)=
-# Use CrateDB with ORM libraries
+# CrateDB and ORM libraries
 This documentation section lists ORM libraries and frameworks which can
-be used together with CrateDB, and outlines how to use them optimally.
+be used together with CrateDB. Hands-on tutorials about them can be found
+on the ["connect" section of the CrateDB Guide].
 ## SQLAlchemy
@@ -24,15 +25,6 @@
 frameworks, are using SQLAlchemy as data abstraction library when connecting to
 :style: "clear: both"
 ```
-**See also**
-- [SQLAlchemy support]
-- [SQLAlchemy by example]
-- [Code examples]
-
-
-[Code examples]: https://github.com/crate/cratedb-examples/tree/main/by-language/python-sqlalchemy
+["connect" section of the CrateDB Guide]: inv:guide:*:label#connect
 [RDBMS]: https://en.wikipedia.org/wiki/RDBMS
-[SQLAlchemy]: https://www.sqlalchemy.org/
-[SQLAlchemy by example]: https://cratedb.com/docs/python/en/latest/by-example/index.html#sqlalchemy-by-example
-[SQLAlchemy support]: https://cratedb.com/docs/python/en/latest/sqlalchemy.html
diff --git a/docs/index.md b/docs/index.md
index 136a363..fa886e1 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -145,14 +145,6 @@
 Adapters and integrations with machine learning frameworks.
 :::
-:::{grid-item-card} {material-outlined}`integration_instructions;2em` Software Testing
-:link: testing
-:link-type: ref
-
-Test frameworks and libraries that support software integration testing with
-CrateDB.
-:::
-
 ::::
@@ -188,7 +180,6 @@
 System Metrics
 Data Visualization
 Business Intelligence
 Machine Learning
-Software Testing
 ```
 ```{toctree}
diff --git a/docs/integrate/etl.md b/docs/integrate/etl.md
index b4c029f..9676558 100644
--- a/docs/integrate/etl.md
+++ b/docs/integrate/etl.md
@@ -2,7 +2,8 @@
 # ETL with CrateDB
 Use ETL / data pipeline applications and frameworks for transferring data in
-and out of CrateDB.
+and out of CrateDB. Corresponding tutorials can be found within the
+[CrateDB Guide: Integration Tutorials] section of the documentation.
 (apache-airflow)=
@@ -30,26 +31,6 @@
 to fit the level of abstraction that suits your environment.
 [![](https://logowik.com/content/uploads/images/astronomer2824.jpg){w=180px}](https://www.astronomer.io/)
 ```
-A set of starter tutorials.
-
-- [Automating the import of Parquet files with Apache Airflow]
-- [Updating stock market data automatically with CrateDB and Apache Airflow]
-- [Automating stock data collection and storage with CrateDB and Apache Airflow]
-
-A set of elaborated tutorials, including blueprint implementations.
-
-- [Automating export of CrateDB data to S3 using Apache Airflow]
-- [Implementing a data retention policy in CrateDB using Apache Airflow]
-- [CrateDB and Apache Airflow: Building a data ingestion pipeline]
-- [Building a hot and cold storage data retention policy in CrateDB with Apache Airflow]
-
-Tutorials and resources about configuring the managed variants, Astro and CrateDB Cloud.
-
-- [ETL with Astro and CrateDB Cloud in 30min - fully up in the cloud]
-- [ETL pipeline using Apache Airflow with CrateDB (Source)]
-- [Run an ETL pipeline with CrateDB and data quality checks]
-
-
 ```{seealso}
 [CrateDB and Apache Airflow]
 ```
@@ -97,10 +78,6 @@
 Systems Award].
 > Apache Flink greatly expanded the use of stream data-processing.
-- [Build a data ingestion pipeline using Kafka, Flink, and CrateDB]
-- [Community Day: Stream processing with Apache Flink and CrateDB]
-- [Executable stack: Apache Kafka, Apache Flink, and CrateDB]
-
 ![](https://flink.apache.org/img/flink-home-graphic.png){h=200px}
 :::{dropdown} **Managed Flink**
@@ -122,10 +99,6 @@
 A few companies are specializing in offering managed Flink services.
 thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.
-- [Data Ingestion using Kafka and Kafka Connect]
-- [Executable stack: Apache Kafka, Apache Flink, and CrateDB]
-- [Tutorial: Replicating data to CrateDB with Debezium and Kafka]
-
 ```{seealso}
 [CrateDB and Apache Kafka]
 ```
@@ -166,8 +139,6 @@
 dbt projects run, for example with [Debezium](#debezium) or with [Airflow](#apache-airflow).
 Afterwards, data analysts can run their dbt projects against this data to produce models (tables and views) that can be used with a number of [BI tools](#bi-tools).
-- [Using dbt with CrateDB]
-
 ![](https://www.getdbt.com/ui/img/products/what-is-dbt-main-image.png){h=120px}
 ![](https://www.getdbt.com/ui/img/products/what-is-dbt-deploy.svg){h=120px}
 ![](https://www.getdbt.com/ui/img/products/what-is-dbt-eliminate-silos.svg){h=120px}
@@ -212,9 +183,6 @@
 scale.
 pointing it at your databases, you are able to subscribe to the event stream of all database update operations.
-- [Tutorial: Replicating data to CrateDB with Debezium and Kafka]
-- [Webinar: How to replicate data from other databases to CrateDB with Debezium and Kafka]
-
 ## Kestra
@@ -236,8 +204,6 @@
 Plugins are at the core of Kestra's extensibility. Many plugins are available from
 the Kestra core team, and creating your own is easy. With plugins, you can
 add new functionality to Kestra.
-- [Setting up data pipelines with CrateDB and Kestra]
-
 ![](https://kestra.io/landing/home/ui-3.png){h=120px}
 ![](https://kestra.io/landing/home/ui-4.png){h=120px}
 ![](https://kestra.io/landing/features/declarative.svg){h=120px}
@@ -263,9 +229,6 @@
 It provides a browser-based editor that makes it easy to wire together flows using the wide range of elements called "nodes" from the palette that can be deployed to its runtime in a single-click.
-- [Ingesting MQTT messages into CrateDB using Node-RED]
-- [Automating recurrent CrateDB queries using Node-RED]
-
 ```{seealso}
 [CrateDB and Node-RED]
 ```
@@ -319,12 +282,6 @@
 integration engine adhering to the Singer specification.
 [Meltano Hub] is the single source of truth to find any Meltano plugins as well as Singer taps and targets.
-- [meltano-target-cratedb]
-- [meltano-tap-cratedb]
-- [Examples about working with CrateDB and Meltano]
-
-_Please note these adapters are a work in progress._
-
 ```{div}
 :style: "clear: both"
 ```
@@ -356,10 +313,6 @@
 Integration Services includes a rich set of built-in [tasks][ssis-tasks] and [transformations][ssis-transformations], graphical tools for building packages, and an SSIS Catalog database to store, run, and manage packages.
-A demo project which uses SSIS and ODBC to read and write data from CrateDB:
-
-- [Using SQL Server Integration Services with CrateDB]
-
 ```{div}
 :style: "clear: both"
 ```
@@ -375,52 +328,29 @@
 [Apache Kafka]: https://kafka.apache.org/
 [Apache Kafka on Azure]: https://azuremarketplace.microsoft.com/marketplace/consulting-services/canonical.0001-com-ubuntu-managed-kafka
 [Astronomer]: https://www.astronomer.io/
-[Automating recurrent CrateDB queries using Node-RED]: https://community.crate.io/t/automating-recurrent-cratedb-queries/788
-[Automating export of CrateDB data to S3 using Apache Airflow]: https://community.crate.io/t/cratedb-and-apache-airflow-automating-data-export-to-s3/901
-[Automating stock data collection and storage with CrateDB and Apache Airflow]: https://community.crate.io/t/automating-stock-data-collection-and-storage-with-cratedb-and-apache-airflow/990
-[Automating the import of Parquet files with Apache Airflow]: https://community.crate.io/t/automating-the-import-of-parquet-files-with-apache-airflow/1247
 [Azure Event Hubs for Apache Kafka]: https://learn.microsoft.com/en-us/azure/event-hubs/azure-event-hubs-kafka-overview
-[Build a data ingestion pipeline using Kafka, Flink, and CrateDB]: https://dev.to/crate/build-a-data-ingestion-pipeline-using-kafka-flink-and-cratedb-1h5o
-[Building a hot and cold storage data retention policy in CrateDB with Apache Airflow]: https://community.crate.io/t/cratedb-and-apache-airflow-building-a-hot-cold-storage-data-retention-policy/934
-[Community Day: Stream processing with Apache Flink and CrateDB]: https://crate.io/blog/cratedb-community-day-2nd-edition-summary-and-highlights
 [Confluent Cloud]: https://www.confluent.io/confluent-cloud/
 [CrateDB and Apache Airflow]: https://crate.io/integrations/cratedb-and-apache-airflow
-[CrateDB and Apache Airflow: Building a data ingestion pipeline]: https://community.crate.io/t/cratedb-and-apache-airflow-building-a-data-ingestion-pipeline/926
 [CrateDB and Apache Kafka]: https://crate.io/integrations/cratedb-and-kafka
 [CrateDB and Kestra]: https://crate.io/integrations/cratedb-and-kestra
 [CrateDB and Node-RED]: https://crate.io/integrations/cratedb-and-node-red
-[Data Ingestion using Kafka and Kafka Connect]: https://crate.io/docs/crate/howtos/en/latest/integrations/kafka-connect.html
+[CrateDB Guide: Integration Tutorials]: inv:guide:*:label#integrate
 [dbt]: https://www.getdbt.com/
 [dbt Cloud]: https://www.getdbt.com/product/dbt-cloud/
 [Debezium]: https://debezium.io/
 [DoubleCloud Managed Service for Apache Kafka]: https://double.cloud/services/managed-kafka/
-[ETL pipeline using Apache Airflow with CrateDB (Source)]: https://github.com/astronomer/astro-cratedb-blogpost
-[ETL with Astro and CrateDB Cloud in 30min - fully up in the cloud]: https://www.astronomer.io/blog/run-etlelt-with-airflow-and-cratedb/
-[Examples about working with CrateDB and Meltano]: https://github.com/crate/cratedb-examples/tree/amo/meltano/framework/singer-meltano
-[Executable stack: Apache Kafka, Apache Flink, and CrateDB]: https://github.com/crate/cratedb-examples/tree/main/application/apache-kafka-flink
 [Flink managed by Confluent]: https://www.datanami.com/2023/05/17/confluents-new-cloud-capabilities-address-data-streaming-hurdles/
 [FlowFuse]: https://flowfuse.com/
 [FlowFuse Cloud]: https://app.flowforge.com/
 [Immerok Cloud]: https://web.archive.org/web/20230602085618/https://www.immerok.io/product
-[Implementing a data retention policy in CrateDB using Apache Airflow]: https://community.crate.io/t/implementing-a-data-retention-policy-in-cratedb-using-apache-airflow/913
-[Ingesting MQTT messages into CrateDB using Node-RED]: https://community.crate.io/t/ingesting-mqtt-messages-into-cratedb-using-node-red/803
 [Introduction to FlowFuse]: https://flowfuse.com/webinars/2023/introduction-to-flowforge/
 [Kestra]: https://kestra.io/
 [Meltano]: https://meltano.com/
 [Meltano Hub]: https://hub.meltano.com/
-[meltano-tap-cratedb]: https://github.com/crate-workbench/meltano-tap-cratedb
-[meltano-target-cratedb]: https://github.com/crate-workbench/meltano-target-cratedb
 [Node-RED]: https://nodered.org/
 [Overview about more managed Kafka offerings]: https://keen.io/blog/managed-apache-kafka-vs-diy/
-[Run an ETL pipeline with CrateDB and data quality checks]: https://registry.astronomer.io/dags/etl_pipeline/
-[Setting up data pipelines with CrateDB and Kestra]: https://community.crate.io/t/setting-up-data-pipelines-with-cratedb-and-kestra-io/1400
 [Singer]: https://www.singer.io/
 [SQL Server Integration Services]: https://learn.microsoft.com/en-us/sql/integration-services/sql-server-integration-services
 [SSIS]: https://en.wikipedia.org/wiki/SQL_Server_Integration_Services
 [ssis-tasks]: https://learn.microsoft.com/en-us/sql/integration-services/control-flow/integration-services-tasks
 [ssis-transformations]: https://learn.microsoft.com/en-us/sql/integration-services/data-flow/transformations/integration-services-transformations
-[Tutorial: Replicating data to CrateDB with Debezium and Kafka]: https://community.crate.io/t/replicating-data-to-cratedb-with-debezium-and-kafka/1388
-[Updating stock market data automatically with CrateDB and Apache Airflow]: https://community.crate.io/t/updating-stock-market-data-automatically-with-cratedb-and-apache-airflow/1304
-[Using dbt with CrateDB]: https://community.crate.io/t/using-dbt-with-cratedb/1566
-[Using SQL Server Integration Services with CrateDB]: https://github.com/crate/cratedb-examples/tree/main/application/microsoft-ssis
-[Webinar: How to replicate data from other databases to CrateDB with Debezium and Kafka]: https://crate.io/resources/webinars/lp-wb-debezium-kafka
diff --git a/docs/integrate/metrics.md b/docs/integrate/metrics.md
index 3c04bf0..7232657 100644
--- a/docs/integrate/metrics.md
+++ b/docs/integrate/metrics.md
@@ -2,8 +2,9 @@
 # Monitoring and Metrics with CrateDB
 Storing metrics data for the long term is a common need in systems monitoring
-scenarios. CrateDB offers corresponding integration adapters.
-
+scenarios. CrateDB offers corresponding integration adapters. Relevant tutorials
+can be found within the [CrateDB Guide: Integration Tutorials] section of the
+documentation.
 (prometheus)=
 ## Prometheus
@@ -45,18 +46,11 @@
 Adapter], one can easily store the collected metrics data in CrateDB and take advantage of its high ingestion and query speed and friendly UI to massively scale-out Prometheus.
-
-**Resources**
-
-- [CrateDB Prometheus Adapter]
-- [Getting Started With Prometheus and CrateDB for Long-Term Storage]
-- [Storing long-term metrics with Prometheus in CrateDB]
-- [Webinar: Using Prometheus and Grafana with CrateDB Cloud]
-
 ![](https://github.com/crate/crate-clients-tools/assets/453543/26b47686-889a-4137-a87f-d6a6b38d56d2){h=200px}
 ```{seealso}
-[CrateDB and Prometheus]
+- [CrateDB and Prometheus]
+- [CrateDB Prometheus Adapter]
 ```
 ```{div}
 :style: "clear: both"
 ```
@@ -93,10 +87,6 @@
 a very minimal memory footprint.
 - **System telemetry**: Metrics from system telemetry like iptables, Netstat, NGINX, and HAProxy help provide a full stack view of your apps.
-**Resources**
-
-- [Use CrateDB With Telegraf, an Agent for Collecting & Reporting Metrics]
-
 ![](https://www.influxdata.com/wp-content/uploads/Main-Diagram_06.01.2022v1.png){h=200px}
 ```{seealso}
@@ -110,13 +100,10 @@
 a very minimal memory footprint.
 [CrateDB and Prometheus]: https://cratedb.com/integrations/cratedb-and-prometheus
 [CrateDB and Telegraf]: https://crate.io/integrations/cratedb-and-telegraf
+[CrateDB Guide: Integration Tutorials]: inv:guide:*:label#integrate
 [CrateDB Prometheus Adapter]: https://github.com/crate/cratedb-prometheus-adapter
-[Getting Started With Prometheus and CrateDB for Long-Term Storage]: https://cratedb.com/blog/getting-started-prometheus-cratedb-long-term-storage
 [Prometheus]: https://prometheus.io/
 [Prometheus remote endpoints and storage]: https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage
 [remote read]: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_read
 [remote write]: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write
-[Storing long-term metrics with Prometheus in CrateDB]: https://community.cratedb.com/t/storing-long-term-metrics-with-prometheus-in-cratedb/1012
 [Telegraf]: https://www.influxdata.com/time-series-platform/telegraf/
-[Use CrateDB With Telegraf, an Agent for Collecting & Reporting Metrics]: https://crate.io/blog/use-cratedb-with-telegraf-an-agent-for-collecting-reporting-metrics
-[Webinar: Using Prometheus and Grafana with CrateDB Cloud]: https://cratedb.com/resources/webinars/lp-wb-prometheus-grafana
diff --git a/docs/integrate/ml.md b/docs/integrate/ml.md
index ef8778c..74c0966 100644
--- a/docs/integrate/ml.md
+++ b/docs/integrate/ml.md
@@ -3,7 +3,8 @@
 # Machine Learning with CrateDB
 This documentation section lists machine learning applications and frameworks
-which can be used together with CrateDB.
+which can be used together with CrateDB. Relevant tutorials can be found within
+the [CrateDB Guide: Machine Learning Tutorials] section of the documentation.
 ## LangChain
@@ -33,28 +34,6 @@
 LangChain's conversational memory subsystem.
 :style: "clear: both"
 ```
-**See also**
-- [LangChain and CrateDB]
-
-- CrateDB's `FLOAT_VECTOR` type and its `KNN_MATCH` function can be used for storing and
-  retrieving embeddings, and for conducting similarity searches.
-
-  [![Open on GitHub](https://img.shields.io/badge/Open%20on-GitHub-lightgray?logo=GitHub)](https://github.com/crate/cratedb-examples/blob/main/topic/machine-learning/llm-langchain/vector_search.ipynb) [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/crate/cratedb-examples/blob/main/topic/machine-learning/llm-langchain/vector_search.ipynb) [![Launch Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/crate/cratedb-examples/main?labpath=topic%2Fmachine-learning%2Fllm-langchain%2Fvector_search.ipynb)
-
-- Database tables in CrateDB can be used as a source provider for LangChain documents.
-
-  [![Open on GitHub](https://img.shields.io/badge/Open%20on-GitHub-lightgray?logo=GitHub)](https://github.com/crate/cratedb-examples/blob/main/topic/machine-learning/llm-langchain/document_loader.ipynb) [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/crate/cratedb-examples/blob/main/topic/machine-learning/llm-langchain/document_loader.ipynb) [![Launch Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/crate/cratedb-examples/main?labpath=topic%2Fmachine-learning%2Fllm-langchain%2Fdocument_loader.ipynb)
-
-- CrateDB supports managing LangChain's conversation history.
-
-  [![Open on GitHub](https://img.shields.io/badge/Open%20on-GitHub-lightgray?logo=GitHub)](https://github.com/crate/cratedb-examples/blob/main/topic/machine-learning/llm-langchain/conversational_memory.ipynb) [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/crate/cratedb-examples/blob/main/topic/machine-learning/llm-langchain/conversational_memory.ipynb) [![Launch Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/crate/cratedb-examples/main?labpath=topic%2Fmachine-learning%2Fllm-langchain%2Fconversational_memory.ipynb)
-
-- What can you build with LangChain?
-
-  - [LangChain: Retrieval augmented generation]
-  - [LangChain: Analyzing structured data]
-  - [LangChain: Chatbots]
-
 ## MLflow
@@ -75,16 +54,6 @@
 config, and results.
 :style: "clear: both"
 ```
-**See also**
-- Blog series on "Running Time Series Models in Production using CrateDB"
-
-  Part 1: [Introduction to Time Series Modeling using Machine Learning]
-
-- [MLflow and CrateDB]: Guidelines and runnable code to get started with MLflow and
-  CrateDB, exercising time series anomaly detection and timeseries forecasting /
-  prediction using NumPy, Merlion, and Matplotlib.
-
-  [![Open on GitHub](https://img.shields.io/badge/Open%20on-GitHub-lightgray?logo=GitHub)](https://github.com/crate/cratedb-examples/blob/main/topic/machine-learning/mlops-mlflow/tracking_merlion.ipynb) [![Open in Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/crate/cratedb-examples/blob/main/topic/machine-learning/mlops-mlflow/tracking_merlion.ipynb)
-
 ## PyCaret
@@ -101,23 +70,10 @@
 libraries like scikit-learn, xgboost, ray, lightgbm, and many more. PyCaret provides a
 universal interface to utilize these libraries without needing to know the details of the underlying model architectures and parameters.
-
 ```{div}
 :style: "clear: both"
 ```
-**See also**
-- [AutoML with PyCaret and CrateDB]
-- The `automl_classification_with_pycaret.ipynb` example notebook explores the PyCaret
-  framework and shows how to use it to train different classification models.
-
-  [![Open on GitHub](https://img.shields.io/badge/Open%20on-GitHub-lightgray?logo=GitHub)](https://github.com/crate/cratedb-examples/blob/main/topic/machine-learning/automl/automl_classification_with_pycaret.ipynb) [![Open in Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/crate/cratedb-examples/blob/main/topic/machine-learning/automl/automl_classification_with_pycaret.ipynb)
-
-- The `automl_timeseries_forecasting_with_pycaret.ipynb` example notebook explores the PyCaret
-  framework and shows how to use it to train various timeseries forecasting models.
-
-  [![Open on GitHub](https://img.shields.io/badge/Open%20on-GitHub-lightgray?logo=GitHub)](https://github.com/crate/cratedb-examples/blob/main/topic/machine-learning/automl/automl_timeseries_forecasting_with_pycaret.ipynb) [![Open in Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/crate/cratedb-examples/blob/main/topic/machine-learning/automl/automl_timeseries_forecasting_with_pycaret.ipynb)
-
 ## scikit-learn
@@ -130,39 +86,17 @@
 of the underlying model architectures and parameters.
 [![](https://jupyter.org/assets/logos/rectanglelogo-greytext-orangebody-greymoons.svg){w=180px}](https://jupyter.org/)
 ```
-Using [pandas] and [scikit-learn] to run a regression analysis within a [Jupyter Notebook].
-
-- [Machine Learning and CrateDB: An introduction]
-- [Machine Learning and CrateDB: Getting Started With Jupyter]
-- [Machine Learning and CrateDB: Experiment Design & Linear Regression]
-
-**See also**
-- [Automating financial data collection and storage in CrateDB with Python and pandas 2.0.0]
-- [From data storage to data analysis: Tutorial on CrateDB and pandas]
-
 ```{div}
 :style: "clear: both"
 ```
-[Automating financial data collection and storage in CrateDB with Python and pandas 2.0.0]: https://community.crate.io/t/automating-financial-data-collection-and-storage-in-cratedb-with-python-and-pandas-2-0-0/916
-[AutoML with PyCaret and CrateDB]: https://github.com/crate/cratedb-examples/tree/main/topic/machine-learning/automl
-[From data storage to data analysis: Tutorial on CrateDB and pandas]: https://community.crate.io/t/from-data-storage-to-data-analysis-tutorial-on-cratedb-and-pandas/1440/1
-[Introduction to Time Series Modeling using Machine Learning]: https://cratedb.com/blog/introduction-to-time-series-modeling-with-cratedb-machine-learning-time-series-data
-[Jupyter Notebook]: https://jupyter.org/
+[CrateDB Guide: Machine Learning Tutorials]: inv:guide:*:label#ml
 [LangChain]: https://python.langchain.com/
-[LangChain: Analyzing structured data]: https://python.langchain.com/docs/use_cases/qa_structured/sql
-[LangChain: Chatbots]: https://python.langchain.com/docs/use_cases/chatbots
-[LangChain: Retrieval augmented generation]: https://python.langchain.com/docs/use_cases/question_answering/
 [LangChain adapter for CrateDB]: https://github.com/crate-workbench/langchain
-[LangChain and CrateDB]: https://github.com/crate/cratedb-examples/tree/main/topic/machine-learning/llm-langchain
-[Machine Learning and CrateDB: An introduction]: https://crate.io/blog/machine-learning-and-cratedb-part-one
-[Machine Learning and CrateDB: Getting Started With Jupyter]: https://crate.io/blog/machine-learning-cratedb-jupyter
-[Machine Learning and CrateDB: Experiment Design & Linear Regression]: https://crate.io/blog/machine-learning-and-cratedb-part-three-experiment-design-and-linear-regression
 [MLflow]: https://mlflow.org/
 [mlflow-cratedb]: https://pypi.org/project/mlflow-cratedb/
 [MLflow adapter for CrateDB]: https://github.com/crate-workbench/mlflow-cratedb
-[MLflow and CrateDB]: https://github.com/crate/cratedb-examples/tree/main/topic/machine-learning/mlops-mlflow
 [MLflow Tracking]: https://mlflow.org/docs/latest/tracking.html
 [pandas]: https://pandas.pydata.org/
 [PyCaret]: https://www.pycaret.org
diff --git a/docs/integrate/testing.md b/docs/integrate/testing.md
deleted file mode 100644
index 455057c..0000000
--- a/docs/integrate/testing.md
+++ /dev/null
@@ -1,59 +0,0 @@
-(testing)=
-# Software Testing with CrateDB
-
-Java and Python based test frameworks and libraries that support software
-integration testing with CrateDB.
-
-
-(python-pytest)=
-## Python pytest
-
-The popular [pytest] framework makes it easy to write small tests, but it
-also supports complex functional testing for applications and libraries.
-The [pytest-crate] package manages CrateDB instances for running integration
-tests against them.
-
-It is based on [cr8](#cr8) for the heavy lifting, and additionally provides
-the `crate`, `crate_execute`, and `crate_cursor` pytest fixtures for
-developer convenience.
-
-- [Using "pytest-crate" with CrateDB and pytest]
-
-
-(cr8)=
-(python-unittest)=
-## Python unittest
-
-[cr8], a collection of tools for CrateDB developers, provides primitive
-elements to manage CrateDB single-node and multi-node instances through
-its [run-crate] subsystem, that can be used to create test layers for
-Python's built-in [unittest] framework.
- -- [Using "cr8" test layers with CrateDB and unittest] - - -(testcontainers)= -## Testcontainers - -[Testcontainers] is an open source framework for providing throwaway, -lightweight instances of databases, message brokers, web browsers, or -just about anything that can run in a Docker container. - -CrateDB provides Testcontainers implementations for both Java and Python. - -- [Using "Testcontainers for Java" with CrateDB] -- [Using "Testcontainers for Python" with CrateDB and pytest] -- [Using "Testcontainers for Python" with CrateDB and unittest] - - -[cr8]: https://pypi.org/project/cr8/ -[pytest]: https://docs.pytest.org/ -[pytest-crate]: https://pypi.org/project/pytest-crate/ -[run-crate]: https://pypi.org/project/cr8/#run-crate -[Testcontainers]: https://testcontainers.com/ -[unittest]: https://docs.python.org/3/library/unittest.html -[Using "cr8" test layers with CrateDB and unittest]: https://github.com/crate/cratedb-examples/tree/main/testing/native/python-unittest -[Using "pytest-crate" with CrateDB and pytest]: https://github.com/crate/cratedb-examples/tree/main/testing/native/python-pytest -[Using "Testcontainers for Java" with CrateDB]: https://github.com/crate/cratedb-examples/tree/main/testing/testcontainers/java -[Using "Testcontainers for Python" with CrateDB and pytest]: https://github.com/crate/cratedb-examples/tree/main/testing/testcontainers/python-pytest -[Using "Testcontainers for Python" with CrateDB and unittest]: https://github.com/crate/cratedb-examples/tree/main/testing/testcontainers/python-unittest diff --git a/docs/integrate/visualize.md b/docs/integrate/visualize.md index 6a564bb..5bd42c1 100644 --- a/docs/integrate/visualize.md +++ b/docs/integrate/visualize.md @@ -23,17 +23,6 @@ platform, written in Python. [Preset] offers a managed, elevated, and enterprise-grade SaaS for open-source Apache Superset. 
-**Product:**
-- [Introduction to time series visualization in CrateDB and Apache Superset (Blog)]
-- [Use CrateDB and Apache Superset for Open Source Data Warehousing and Visualization (Blog)]
-- [Introduction to time series visualization in CrateDB and Apache Superset (Webinar)]
-- [Introduction to Time-Series Visualization in CrateDB and Apache Superset (Preset.io)]
-
-**Development:**
-- [Set up Apache Superset with CrateDB]
-- [Set up an Apache Superset development sandbox with CrateDB]
-- [Verify Apache Superset with CrateDB]
-
 ![](https://superset.apache.org/img/hero-screenshot.jpg){h=200px}
 ![](https://github.com/crate/crate-clients-tools/assets/453543/0f8f7bd8-2e30-4aca-bcf3-61fbc81da855){h=200px}
@@ -82,8 +71,6 @@ specific timerange or filter dashboards by any individual attribute of your data
 Use SQL and R to analyze your data and create beautiful, interactive
 dashboards for your entire company in few minutes.
-- [Data Analysis with Cluvio and CrateDB]
-
 ![custom-filters.png](https://github.com/crate/crate-clients-tools/assets/453543/49ca6a35-239e-4915-951c-db6649fd35a4){h=200px}
 ![report-creator.png](https://github.com/crate/crate-clients-tools/assets/453543/844a5ffd-0b92-4c77-8cdd-0b5cc5b392b1){h=200px}
@@ -112,19 +99,6 @@ domain-specific Dash components and applications.
 ![](https://github.com/crate/crate-clients-tools/assets/453543/cc538982-e351-437b-97ec-f1fc6ca34948){h=200px}
 ![](https://github.com/crate/crate-clients-tools/assets/453543/24908861-f0ad-43f3-b229-b2bfcc61596d){h=200px}
-**See also**
-- The `timeseries-queries-and-visualization.ipynb` notebook explores how to access
-  timeseries data from CrateDB via SQL, load it into pandas DataFrames, and visualize
-  it using Plotly.
-
-  It includes advanced time series operations in SQL, like aggregations, window functions,
-  interpolation of missing data, common table expressions, moving averages, JOINs, and
-  the handling of JSON data.
-
-  [![Open on GitHub](https://img.shields.io/badge/Open%20on-GitHub-lightgray?logo=GitHub)](https://github.com/crate/cratedb-examples/blob/main/topic/timeseries/timeseries-queries-and-visualization.ipynb) [![Open in Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/crate/cratedb-examples/blob/main/topic/timeseries/timeseries-queries-and-visualization.ipynb)
-
-- [Explore Dash Examples]
-
 :::{dropdown} **Dash Enterprise**
 ```{div}
 :style: "float: right"
 ```
@@ -173,8 +147,6 @@ run white-labeled data portals.
 :style: "clear: both"
 ```
-- [Introduction to Time Series Visualization in CrateDB and Explo]
-
 ![](https://crate.io/hs-fs/hubfs/Screenshot%202023-07-21%20at%2013.17.45.png?width=2948&height=2312&name=Screenshot%202023-07-21%20at%2013.17.45.png){h=200px}
 ![](https://crate.io/hs-fs/hubfs/Screenshot%202023-07-21%20at%2013.24.01.png?width=2932&height=1716&name=Screenshot%202023-07-21%20at%2013.24.01.png){h=200px}
@@ -197,9 +169,6 @@ Connecting to a CrateDB cluster will use the Grafana PostgreSQL data source adap
 The following tutorials outline how to configure Grafana to connect to CrateDB,
 and how to run a database query.
-**See also**
-- [Using Grafana with CrateDB Cloud]
-
 ![image](../_assets/img/grafana-connection.png){h=200px}
 ![image](../_assets/img/grafana-panel1.png){h=200px}
@@ -257,14 +226,6 @@ explore very large data, and with [GeoViews], you can create geographic plots.
 :style: "clear: both"
 ```
-**See also**
-
-- The `cloud-datashader.ipynb` notebook explores the [HoloViews] and [Datashader] frameworks
-  and outlines how to use them to plot the venerable NYC Taxi dataset, after importing it
-  into a CrateDB Cloud database cluster. _Please note the notebook is a work in progress._
-
-  [![Open on GitHub](https://img.shields.io/badge/Open%20on-GitHub-lightgray?logo=GitHub)](https://github.com/crate/cratedb-examples/blob/amo/cloud-datashader/topic/timeseries/explore/cloud-datashader.ipynb) [![Open in Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/crate/cratedb-examples/blob/amo/cloud-datashader/topic/timeseries/explore/cloud-datashader.ipynb)
-
 ![](https://github.com/crate/crate-clients-tools/assets/453543/7f38dff6-04bc-429e-9d31-6beeb9289c4b){h=200px}
 ![](https://github.com/crate/crate-clients-tools/assets/453543/23561a87-fb4f-4154-9891-1b3068e40579){h=200px}
@@ -283,10 +244,6 @@ with no SQL required.
 Fast analytics with the friendly UX and integrated tooling to let your company
 explore data on their own.
-**See also**
-- [Using Metabase with CrateDB Cloud]
-- [Real-time data analytics with Metabase and CrateDB]
-
 ![image](../_assets/img/metabase-question.png){h=140px}
 ![image](../_assets/img/metabase-dashboard.png){h=140px}
@@ -340,27 +297,6 @@ Based on Plotly, [Dash] is a low-code framework for rapidly building data apps i
 ![](https://github.com/crate/crate-clients-tools/assets/453543/380114a8-7984-4966-929b-6e6d52ddd48a){h=200px}
 ![](https://github.com/crate/crate-clients-tools/assets/453543/f6a99ae7-b730-4587-bd23-499e1be02c92){h=200px}
-**See also**
-
-- The `timeseries-queries-and-visualization.ipynb` notebook explores how to access
-  timeseries data from CrateDB via SQL, load it into pandas DataFrames, and visualize
-  it using Plotly.
-
-  It includes advanced time series operations in SQL, like aggregations, window functions,
-  interpolation of missing data, common table expressions, moving averages, JOINs, and
-  the handling of JSON data.
-
-  [![Open on GitHub](https://img.shields.io/badge/Open%20on-GitHub-lightgray?logo=GitHub)](https://github.com/crate/cratedb-examples/blob/main/topic/timeseries/timeseries-queries-and-visualization.ipynb) [![Open in Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/crate/cratedb-examples/blob/main/topic/timeseries/timeseries-queries-and-visualization.ipynb)
-
-- [Explore Dash Examples]
-
-
-```{toctree}
-:hidden:
-
-- Grafana <../tutorials/grafana>
-- Metabase <../tutorials/metabase>
-```
 [Apache Superset]: https://superset.apache.org/
@@ -369,11 +305,9 @@ Based on Plotly, [Dash] is a low-code framework for rapidly building data apps i
 [CrateDB and Superset]: https://crate.io/integrations/cratedb-and-apache-superset
 [CrateDB and Metabase]: https://crate.io/integrations/cratedb-and-metabase
 [Dash]: https://plotly.com/dash/
-[Data Analysis with Cluvio and CrateDB]: https://community.crate.io/t/data-analysis-with-cluvio-and-cratedb/1571
 [Datashader]: https://datashader.org/
 [Explo]: https://www.explo.co/
 [Explo Explore]: https://www.explo.co/products/explore
-[Explore Dash Examples]: https://plotly.com/examples/
 [GeoViews]: https://geoviews.org/
 [Grafana Cloud]: https://grafana.com/grafana/
 [Grafana Labs]: https://grafana.com/about/team/
@@ -381,10 +315,6 @@ Based on Plotly, [Dash] is a low-code framework for rapidly building data apps i
 [HoloViews]: https://www.holoviews.org/
 [HoloViz]: https://holoviz.org/
 [hvPlot]: https://hvplot.holoviz.org/
-[Introduction to Time Series Visualization in CrateDB and Explo]: https://crate.io/blog/introduction-to-time-series-visualization-in-cratedb-and-explo
-[Introduction to time series visualization in CrateDB and Apache Superset (Blog)]: https://community.crate.io/t/introduction-to-time-series-visualization-in-cratedb-and-superset/1041
-[Introduction to time series visualization in CrateDB and Apache Superset (Webinar)]: https://cratedb.com/resources/webinars/lp-wb-introduction-to-time-series-visualization-in-cratedb-apache-superset
-[Introduction to Time-Series Visualization in CrateDB and Apache Superset (Preset.io)]: https://preset.io/blog/timeseries-cratedb-superset/
 [Metabase]: https://www.metabase.com/
 [Metabase Cloud]: https://www.metabase.com/cloud/
 [Panel]: https://panel.holoviz.org/
@@ -392,11 +322,3 @@ Based on Plotly, [Dash] is a low-code framework for rapidly building data apps i
 [Preset]: https://preset.io/
 [Preset Cloud]: https://preset.io/product/
 [PyViz]: https://pyviz.org/
-[Real-time data analytics with Metabase and CrateDB]: https://www.metabase.com/community_posts/real-time-data-analytics-with-metabase-and-cratedb
-[Set up Apache Superset with CrateDB]: https://community.cratedb.com/t/set-up-apache-superset-with-cratedb/1716
-[Set up an Apache Superset development sandbox with CrateDB]: https://community.crate.io/t/set-up-an-apache-superset-development-sandbox-with-cratedb/1163
-[Time Series with CrateDB]: https://github.com/crate/cratedb-examples/tree/main/topic/timeseries/explore
-[Use CrateDB and Apache Superset for Open Source Data Warehousing and Visualization (Blog)]: https://crate.io/blog/use-cratedb-and-apache-superset-for-open-source-data-warehousing-and-visualization
-[Using Grafana with CrateDB Cloud]: #integrations-grafana
-[Using Metabase with CrateDB Cloud]: #integrations-metabase
-[Verify Apache Superset with CrateDB]: https://github.com/crate/cratedb-examples/tree/main/application/apache-superset
diff --git a/docs/tutorials/grafana.rst b/docs/tutorials/grafana.rst
deleted file mode 100644
index 1a88a74..0000000
--- a/docs/tutorials/grafana.rst
+++ /dev/null
@@ -1,229 +0,0 @@
-.. _integrations-grafana:
-.. _visualize-data-with-grafana:
-
-===========================
-Visualize data with Grafana
-===========================
-
-`Grafana`_ is an open-source tool that helps you build real-time dashboards,
-graphs, and all sorts of data visualizations. It is the perfect complement
-to CrateDB, which is purpose-built for monitoring large volumes of machine
-data in real-time.
-
-For the purposes of this guide, it is assumed that you
-have a cluster up and running and can access the Console. If not, please refer
-to the :ref:`tutorial on how to deploy a cluster for the first time
-`.
-
-.. rubric:: Table of contents
-
-.. contents::
-   :local:
-
-
-.. _grafana-load-dataset:
-
-Load a sample dataset
-=====================
-
-To visualize data with Grafana, a dataset is needed first. In this sample,
-demo data is added directly via the CrateDB Cloud Console. To import the data
-go to the Overview page of your deployed cluster.
-
-.. image:: ../_assets/img/cluster-overview.png
-   :alt: Cloud Console Clusters overview
-
-Once on the Overview page, click on the *import the demo data* link in the
-"Next steps" section of the Console. A window with 2 SQL statements will
-appear. The first of them creates a table that will host the data from NYC
-Taxi & Limousine Commission which is used in this example. The second
-statement imports the data into the table created in the first step. These
-statements must be executed in the shown order. First "1. Create the table"
-and then "2. Import the data".
-
-.. image:: ../_assets/img/grafana-import.png
-   :alt: Importing data to Admin UI
-
-When you click on either of the *Execute* buttons, you will be brought to the
-CrateDB Admin UI which is the admin UI of your cluster. When accessing it for
-the first time, you will need the username and password that you set when you
-deployed the cluster.
-
-.. image:: ../_assets/img/grafana-admin-create-table.png
-   :alt: Creating table in Admin UI
-
-After executing the second SQL statement, the "nyc_taxi" table will be
-populated with data. Depending on your cluster configuration this can take
-around 40 minutes.
-
-.. _grafana-install:
-
-Install Grafana
-===============
-
-To install Grafana locally, refer to the `Grafana documentation`_. In this
-guide, local installation is used but you can also use Grafana cloud
-deployment.
-
-
-.. _grafana-connect:
-
-Connect Grafana to CrateDB Cloud
-================================
-
-After setting up and logging into Grafana, you should be greeted by
-Grafana Home page.
-
-.. image:: ../_assets/img/grafana-welcome.png
-   :alt: Grafana Home page
-
-To visualize the data, you must add a data source. To do this, click on the
-cogwheel "Settings" icon in the left menu bar. This should take you to the
-Data sources Configuration page.
-
-.. image:: ../_assets/img/grafana-settings.png
-   :alt: Grafana Settings
-
-Once there, click on the *Add data source* button. Here, look up and choose
-"PostgreSQL".
-
-.. image:: ../_assets/img/grafana-search.png
-   :alt: Grafana integrations
-
-Once "PostgreSQL" is chosen, you will be brought to a form that you must fill
-out to connect to the CrateDB Cloud. A completed example might look like the
-screenshot below.
-
-.. image:: ../_assets/img/grafana-connection.png
-   :alt: Grafana connection form
-
-The *host* and *user* credentials may appear differently to you. The host can
-be found on the Overview page of your cluster on CrateDB Cloud under the
-*Learn how to connect to the cluster* link. You will want to use the psql
-link. Depending on the region where your cluster is deployed it might look
-something like:
-
-.. code-block:: console
-
-   samplecluster.aks1.eastus.azure.cratedb-dev.net
-
-.. image:: ../_assets/img/grafana-psql.png
-   :alt: Grafana psql connection
-
-After submitting all that to the Grafana connection form, it should return
-"Database Connection OK". Then, the connection is established and you can move
-on to creating some dashboards.
-
-
-.. _grafana-first-dashboard:
-
-Build your first Grafana dashboard
-==================================
-
-Now that you've got the data imported to CrateDB Cloud and Grafana connected
-to it, it's time to visualize that data. In Grafana this is done using
-Dashboards. To create a new dashboard click on the *Create your first
-dashboard* on the Grafana homepage. You will be greeted by a dashboard
-creation page.
-
-.. image:: ../_assets/img/grafana-new-dashboard.png
-   :alt: Grafana Dashboard creation
-
-In Grafana, dashboards are composed of individual blocks called panels, to
-which you can assign different visualization types and individual queries.
-First, click on *Add new panel*.
-
-That will bring you to the panel creation page. Here you define the
-query for your panel, the type of visualization (like graphs, stats, tables,
-or bar charts), and the time range. Grafana offers a lot of options for data
-visualization, so this guide will showcase two simple use-cases. It is
-recommended to look into the documentation on `Grafana panels`_.
-
-To create a panel, you start by defining the query. To do that click on the
-*Edit SQL* button.
-
-.. image:: ../_assets/img/grafana-new-panel.png
-   :alt: Grafana panel creation
-
-A console into which you write the SQL statements will appear. This panel will
-plot the number of rides per day in the first week of July 2019:
-
-.. code-block:: console
-
-   SELECT date_trunc('day', pickup_datetime) AS time,
-          COUNT(*) AS rides
-   FROM nyc_taxi
-   WHERE pickup_datetime BETWEEN '2019-07-01T00:00:00' AND '2019-07-07T23:59:59'
-   GROUP BY 1
-   ORDER BY 1;
-
-.. NOTE::
-
-   Something important to know about the "Time series" format mode in Grafana
-   is that your query needs to return a column called "time". Grafana will
-   identify this as your time metric, so make sure the column has the proper
-   datatype (any datatype representing an `epoch time`_). In this query,
-   we're labeling pickup_datetime as "time" for this reason.
-
-Once you input these SQL statements, there are a couple of adjustments you can
-make:
-
-- On the top of the panel, select the appropriate time range for your
-  panel—in this case, from July 1st to July 7th, 2019:
-
-- Under "Settings" on the right, define the name of your panel.
-
-- Under "Display", select "Bars".
-
-After that, you should get a panel similar to this:
-
-.. image:: ../_assets/img/grafana-panel1.png
-   :alt: Grafana panel 1
-
-When you're satisfied with the look of the panel, click *Apply*. This will
-bring you back to the overview of the dashboard. Now it will have 1 panel
-created in it. Click on the *Add panel* in the top menu bar and you can create
-another one.
-
-.. image:: ../_assets/img/grafana-add-panel.png
-   :alt: Grafana add another panel to dashboard
-
-Another question worth asking might be: What was the average distance per ride
-per day? To find this out, input the following SQL statement into the console
-of the new panel:
-
-.. code-block:: console
-
-   SELECT
-       date_trunc('day', pickup_datetime) AS time,
-       COUNT(*) as rides,
-       SUM(trip_distance) as total_distance,
-       SUM(trip_distance) / COUNT(*) AS average_distance_per_ride
-   FROM nyc_taxi
-   WHERE pickup_datetime BETWEEN '2019-07-01T00:00:00' AND '2019-07-07T23:59:59'
-   GROUP BY time
-   ORDER BY 1;
-
-Under the graph itself, click on the *average_distance_per_ride*. This will
-show only the value we are interested in. Also, in the right menu under "Graph
-style" select "Bars" once again. After that, you should have a panel similar
-to this:
-
-.. image:: ../_assets/img/grafana-panel2.png
-   :alt: Grafana panel 2
-
-When you're happy with the panel, click *Apply*. Now, when brought back to the
-Dashboard overview, you will have a collection of two very useful graphs.
-
-.. image:: ../_assets/img/grafana-dashboard-final.png
-   :alt: Grafana completed dashboard
-
-Now you know how to get started with data visualization in Grafana. To find
-out more, refer to the `Grafana documentation`_.
-
-
-
-.. _Grafana: https://www.grafana.com/
-.. _Grafana documentation: https://grafana.com/docs/grafana/latest/?pg=oss-graf&plcmt=quick-links
-.. _Grafana panels: https://grafana.com/docs/grafana/latest/panels/
-.. _epoch time: https://en.wikipedia.org/wiki/Unix_time
diff --git a/docs/tutorials/metabase.rst b/docs/tutorials/metabase.rst
deleted file mode 100644
index dcb4e58..0000000
--- a/docs/tutorials/metabase.rst
+++ /dev/null
@@ -1,111 +0,0 @@
-.. _integrations-metabase:
-
-Visualize data with Metabase
-============================
-
-This tutorial introduces `Metabase`_, an ultimate data analysis and visualization
-tool that unlocks the full potential of your data.
-
-.. rubric:: Table of contents
-
-.. contents::
-   :local:
-
-.. _metabase-prereqs:
-
-Prerequisites
--------------
-
-First, you will need a running cluster. You can use Metabase with both
-:ref:`Cloud clusters ` and :ref:`Edge clusters
-`.
-
-To use Metabase, you must have an existing data set in your CrateDB cluster.
-Feel free to use the sample dataset available in the `Cloud Console`_ or
-import your own data similarly to how it's done `in this how-to`_ .
-
-.. _integration-metabase-config:
-
-Initial configuration
----------------------
-
-Metabase offers both cloud version and local installation. Whichever you
-choose, the first step will be adding your CrateDB cluster as a new database.
-To do that, go to the ``Admin Settings`` -> ``Setup``, and choose
-the ``Add a database`` option.
-
-.. image:: ../_assets/img/metabase-add-database.png
-   :alt: Add new database
-
-Database configuration is relatively simple, these are the necessary fields:
-
-- Database type (PostgreSQL)
-- Display name
-- Host (the URI of your cluster)
-- Database name
-- Username
-- Password
-
-.. NOTE::
-
-   Make sure you also select "Use a secure connection (SSL)" option, unless
-   your cluster is not configured for SSL.
-
-.. image:: ../_assets/img/metabase-database-configuration.png
-   :alt: Configure new database
-
-After submitting your details, Metabase will sync with your CrateDB cluster for
-a few moments. When that completes, you will get a message saying, "Syncing
-complete".
-
-.. image:: ../_assets/img/metabase-sync-done.png
-   :alt: Database sync complete
-
-.. _integration-metabase-questions:
-
-Questions
----------
-
-Now you are ready to visualize your data. Metabase works by asking questions.
-You ask a question, and Metabase answers it in a visual form. These questions
-can then be saved to form dashboards. To ask a question, go to ``Home`` and
-click on ``New`` -> ``Question`` in the upper right corner. Then select the
-database and a table from it.
-
-As an example, we ask about the Average tip amount,
-sorted by the passenger count.
-
-.. image:: ../_assets/img/metabase-question.png
-   :alt: Asking a question
-
-Metabase then provides a visualization of that question.
-
-.. image:: ../_assets/img/metabase-answer.png
-   :alt: Answer
-
-The answer that you get can be saved. When you save a question, you will also
-be asked if you want to add it to a dashboard. Dashboards provide an easy way
-to monitor your data.
-
-.. image:: ../_assets/img/metabase-dashboard.png
-   :alt: Dashboard
-
-.. _integration-metabase-conclusion:
-
-Conclusion
-----------
-
-This was an introductory tutorial into the data visualization tool Metabase.
-Metabase offers a quick and intuitive way to make sense of your data with
-interactive dashboards, automated reporting, and more.
-
-If you'd like to see how the other questions were configured, feel free to
-check out the `video tutorial`_ on this topic.
-
-If this integration could benefit you, feel free to head to `Cloud Console`_
-and get started!
-
-.. _Cloud Console: https://console.cratedb.cloud/
-.. _Metabase: https://www.metabase.com/
-.. _video tutorial: https://www.youtube.com/watch?v=veuR_76njCo
-.. _in this how-to: https://community.crate.io/t/importing-data-to-cratedb-cloud-clusters/1467
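Both panel queries in the removed Grafana tutorial follow the same pattern: truncate `pickup_datetime` to the day, then aggregate per group. As a rough, self-contained sketch of what those panels compute (using a few made-up sample rows, not the real NYC taxi data), the same day-bucketing and aggregation can be reproduced in plain Python:

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows of (pickup_datetime, trip_distance);
# the tutorial runs the equivalent SQL against the "nyc_taxi" table.
rides = [
    (datetime(2019, 7, 1, 9, 30), 2.5),
    (datetime(2019, 7, 1, 18, 5), 4.0),
    (datetime(2019, 7, 2, 8, 15), 1.5),
]

# Mirrors: SELECT date_trunc('day', pickup_datetime) AS time,
#                 COUNT(*) AS rides,
#                 SUM(trip_distance) / COUNT(*) AS average_distance_per_ride
#          FROM nyc_taxi GROUP BY 1 ORDER BY 1;
per_day = defaultdict(lambda: [0, 0.0])  # day -> [ride count, total distance]
for pickup, distance in rides:
    day = pickup.date()  # the "date_trunc('day', ...)" step
    per_day[day][0] += 1
    per_day[day][1] += distance

for day in sorted(per_day):
    count, total = per_day[day]
    print(day, count, total / count)
```

The key detail carried over from the tutorial's note: the grouping key (here `day`, in Grafana the column aliased `AS time`) must be a proper temporal value, since Grafana's "Time series" format identifies the time axis by that column.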