From 97ad0ad77d89a66e24a435667d2d43f19bb8794d Mon Sep 17 00:00:00 2001
From: Andrew Lamb
Date: Thu, 12 Sep 2024 11:54:02 -0400
Subject: [PATCH] Improve comments on target user and unify summaries (#12418)
---
README.md | 25 ++++++++++++++++++++++---
datafusion/core/src/lib.rs | 24 ++++++++++++++----------
docs/source/index.rst | 25 +++++++++++++++++--------
3 files changed, 53 insertions(+), 21 deletions(-)
diff --git a/README.md b/README.md
index 816dc77714d2..bb8526c24e2c 100644
--- a/README.md
+++ b/README.md
@@ -41,9 +41,28 @@
-Apache DataFusion is a very fast, extensible query engine for building high-quality data-centric systems in
-[Rust](http://rustlang.org), using the [Apache Arrow](https://arrow.apache.org)
-in-memory format. [Python Bindings](https://github.com/apache/datafusion-python) are also available. DataFusion offers SQL and Dataframe APIs, excellent [performance](https://benchmark.clickhouse.com/), built-in support for CSV, Parquet, JSON, and Avro, extensive customization, and a great community.
+DataFusion is an extensible query engine written in [Rust] that
+uses [Apache Arrow] as its in-memory format. DataFusion's target users are
+developers building fast and feature-rich database and analytic systems,
+customized to particular workloads. See [use cases] for examples.
+
+"Out of the box," DataFusion offers [SQL] and [`Dataframe`] APIs,
+excellent [performance], built-in support for CSV, Parquet, JSON, and Avro,
+extensive customization, and a great community.
+[Python Bindings] are also available.
+
+DataFusion features a full query planner, a columnar, streaming, multi-threaded,
+vectorized execution engine, and partitioned data sources. You can
+customize DataFusion at almost all points including additional data sources,
+query languages, functions, custom operators and more.
+See the [Architecture] section for more details.
+
+[rust]: http://rustlang.org
+[apache arrow]: https://arrow.apache.org
+[use cases]: https://datafusion.apache.org/user-guide/introduction.html#use-cases
+[python bindings]: https://github.com/apache/datafusion-python
+[performance]: https://benchmark.clickhouse.com/
+[architecture]: https://datafusion.apache.org/contributor-guide/architecture.html
Here are links to some important information
diff --git a/datafusion/core/src/lib.rs b/datafusion/core/src/lib.rs
index 9c368415bb05..63d4fbc0bba5 100644
--- a/datafusion/core/src/lib.rs
+++ b/datafusion/core/src/lib.rs
@@ -17,24 +17,28 @@
#![warn(missing_docs, clippy::needless_borrow)]
//! [DataFusion] is an extensible query engine written in Rust that
-//! uses [Apache Arrow] as its in-memory format. DataFusion help developers
-//! build fast and feature rich database and analytic systems, customized to
-//! particular workloads. See [use cases] for examples
+//! uses [Apache Arrow] as its in-memory format. DataFusion's target users are
+//! developers building fast and feature-rich database and analytic systems,
+//! customized to particular workloads. See [use cases] for examples.
//!
-//! "Out of the box," DataFusion quickly runs complex [SQL] and
-//! [`DataFrame`] queries using a full-featured query planner, a columnar,
-//! streaming, multi-threaded, vectorized execution engine, and partitioned data
-//! sources (Parquet, CSV, JSON, and Avro).
+//! "Out of the box," DataFusion offers [SQL] and [`DataFrame`] APIs,
+//! excellent [performance], built-in support for CSV, Parquet, JSON, and Avro,
+//! extensive customization, and a great community.
+//! [Python Bindings] are also available.
//!
-//! DataFusion is designed for easy customization such as
-//! additional data sources, query languages, functions, custom
-//! operators and more. See the [Architecture] section for more details.
+//! DataFusion features a full query planner, a columnar, streaming, multi-threaded,
+//! vectorized execution engine, and partitioned data sources. You can
+//! customize DataFusion at almost all points including additional data sources,
+//! query languages, functions, custom operators and more.
+//! See the [Architecture] section below for more details.
//!
//! [DataFusion]: https://datafusion.apache.org/
//! [Apache Arrow]: https://arrow.apache.org
//! [use cases]: https://datafusion.apache.org/user-guide/introduction.html#use-cases
//! [SQL]: https://datafusion.apache.org/user-guide/sql/index.html
//! [`DataFrame`]: dataframe::DataFrame
+//! [performance]: https://benchmark.clickhouse.com/
+//! [Python Bindings]: https://github.com/apache/datafusion-python
//! [Architecture]: #architecture
//!
//! # Examples
diff --git a/docs/source/index.rst b/docs/source/index.rst
index bb5ea430a321..4c67e808a4dd 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -32,14 +32,23 @@ Apache DataFusion
Fork
-DataFusion is a very fast, extensible query engine for building high-quality data-centric systems in
-`Rust <http://rustlang.org>`_, using the `Apache Arrow <https://arrow.apache.org>`_
-in-memory format.
-
-DataFusion offers SQL and Dataframe APIs, excellent
-`performance <https://benchmark.clickhouse.com/>`_, built-in support for
-CSV, Parquet, JSON, and Avro, extensive customization, and a great
-community.
+
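+DataFusion is an extensible query engine written in `Rust <http://rustlang.org>`_ that
+uses `Apache Arrow <https://arrow.apache.org>`_ as its in-memory format. DataFusion's target users are
+developers building fast and feature-rich database and analytic systems,
+customized to particular workloads. See `use cases <https://datafusion.apache.org/user-guide/introduction.html#use-cases>`_ for examples.
+
+"Out of the box," DataFusion offers `SQL <https://datafusion.apache.org/user-guide/sql/index.html>`_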
+and `Dataframe `_ APIs,
+excellent `performance <https://benchmark.clickhouse.com/>`_, built-in support for CSV, Parquet, JSON, and Avro,
+extensive customization, and a great community.
+`Python Bindings <https://github.com/apache/datafusion-python>`_ are also available.
+
+DataFusion features a full query planner, a columnar, streaming, multi-threaded,
+vectorized execution engine, and partitioned data sources. You can
+customize DataFusion at almost all points including additional data sources,
+query languages, functions, custom operators and more.
+See the `Architecture <https://datafusion.apache.org/contributor-guide/architecture.html>`_ section for more details.
To get started, see