feat: Add a plot to the tour
bjchambers committed Aug 21, 2023
1 parent 03becc8 commit 389f3d9
Showing 5 changed files with 50 additions and 10 deletions.
3 changes: 2 additions & 1 deletion python/docs/.gitignore
@@ -1,4 +1,5 @@
_build
.jupyter_cache
jupyter_execute
source/reference/apidocs
source/reference/apidocs
source/iframe_figures
3 changes: 3 additions & 0 deletions python/docs/source/conf.py
@@ -29,6 +29,7 @@
html_favicon = "_static/favicon.png"
html_logo = "_static/kaskada.svg"
html_title = "Kaskada Timestreams"
html_js_files = ["https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.4/require.min.js"]

html_theme_options: Dict[str, Any] = {
"repository_url": "https://github.com/kaskada-ai/kaskada",
@@ -111,3 +112,5 @@
autosummary_generate = True

napoleon_preprocess_types = True

suppress_warnings = ["mystnb.unknown_mime_type"]
12 changes: 5 additions & 7 deletions python/docs/source/index.md
@@ -2,8 +2,6 @@
hide-toc: true
html_theme.sidebar_secondary.remove: true
---
# Introduction

<div class="px-4 py-5 my-5 text-center">
<img class="d-block mx-auto mb-4" src="_static/kaskada.svg" alt="" width="auto">
<h1 class="display-5 fw-bold">Real-Time AI without the fuss.</h1>
@@ -13,7 +11,7 @@ html_theme.sidebar_secondary.remove: true
</div>
</div>

## Kaskada completes the Real-Time AI stack, providing...
# Kaskada completes the Real-Time AI stack, providing...

```{gallery-grid}
:grid-columns: 1 2 2 3
@@ -27,15 +25,15 @@ html_theme.sidebar_secondary.remove: true
```


### Real-time AI in minutes
## Real-time AI in minutes

Connect and compute over databases, streaming data, _and_ data loaded dynamically using Python.
Kaskada is seamlessly integrated with Python's ecosystem of AI/ML tooling so you can load data, process it, train and serve models all in the same place.

There's no infrastructure to provision (and no JVM hiding under the covers), so you can jump right in - check out the [Quick Start](quickstart).


### Built for scale and reliability
## Built for scale and reliability

Implemented in [Rust](https://www.rust-lang.org/) using [Apache Arrow](https://arrow.apache.org/), Kaskada's compute engine uses columnar data to efficiently execute large historic and high-throughput streaming queries.
Every operation in Kaskada is implemented incrementally, allowing automatic recovery if the process is terminated or killed.
@@ -60,7 +58,7 @@ kd.init_session()
messages = kd.sources.PyList(
rows = pyarrow.parquet.read_table("./messages.parquet")
.to_pylist(),
time_column_name = "ts",
time_column_name = "ts",
key_column_name = "channel",
)

@@ -78,7 +76,7 @@ conversations = ( messages

# Handle each conversation as it occurs
async for row in conversations.run(materialize=True).iter_rows_async():

# Use a pre-trained model to identify interested users
prompt = "\n\n".join([f' {msg["user"]} --> {msg["text"]} ' for msg in row["result"]])
res = openai.Completion.create(
41 changes: 39 additions & 2 deletions python/docs/source/tour.md
@@ -1,3 +1,12 @@
---
file_format: mystnb
kernelspec:
name: python3
display_name: Python 3
mystnb:
execution_mode: cache
---

% Level: Beginner
% Goal: Overview of the key features of Kaskada focused on explaining *why* you want them.
% Audience: Someone who has read the landing page and wants to understand what Kaskada can do for them.
@@ -8,6 +17,28 @@ This provides an overview of the key features in Kaskada that enable feature eng
The [Quick Start](quickstart) has details on how you can quickly get started running Kaskada queries.
For a more complete explanation, see the User Guide.

This tour uses Kaskada and Plotly to render the illustrations.
The initial setup / data is below.

```{code-cell}
---
tags: [hide-cell]
---
import kaskada as kd
kd.init_session()
single_entity = "\n".join(
[
"time,key,m,n",
"1996-12-19T16:39:57,A,5,10",
"1996-12-20T16:39:59,A,17,6",
"1996-12-22T16:40:00,A,,9",
"1996-12-23T16:40:01,A,12,",
"1996-12-24T16:40:02,A,,",
]
)
single_entity = kd.sources.CsvString(single_entity, time_column_name="time", key_column_name="key")
```

## Events and Aggregations

Every Kaskada query operates on one or more _sources_ containing events.
@@ -20,8 +51,14 @@ A natural question to ask about the purchases is the total--or `sum`--of all pur
This is accomplished by _aggregating_ the events.
The results of an aggregation change over time as additional events occur.

```{todo}
Port an example showing timestreams and aggregations.
```{code-cell}
---
tags: [remove-input]
---
kd.plot.render(
kd.plot.Plot(single_entity.col("m"), name="m"),
kd.plot.Plot(single_entity.col("m").sum(), name="sum of m")
)
```

The User Guide has [more details on aggregation](guide/aggregation.md), including how to use windows to control which events are aggregated.
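
As a rough illustration of what the new plot in `tour.md` is meant to convey, the sketch below computes a running sum over the sample `m` column in plain Python. It deliberately does not use Kaskada: `running_sum` and `SINGLE_ENTITY_CSV` are hypothetical names introduced here, and the sketch assumes that an empty cell is treated as missing and leaves the running total unchanged.

```python
import csv
import io

# Sample data copied from the setup cell added to tour.md in this commit.
SINGLE_ENTITY_CSV = "\n".join(
    [
        "time,key,m,n",
        "1996-12-19T16:39:57,A,5,10",
        "1996-12-20T16:39:59,A,17,6",
        "1996-12-22T16:40:00,A,,9",
        "1996-12-23T16:40:01,A,12,",
        "1996-12-24T16:40:02,A,,",
    ]
)


def running_sum(csv_text, column):
    """Return (time, value, running_total) for each event, in file order.

    Assumption: an empty cell is treated as missing and does not change the total.
    """
    total = 0
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        raw = record[column]
        value = int(raw) if raw else None
        if value is not None:
            total += value
        rows.append((record["time"], value, total))
    return rows


# Print one line per event: timestamp, raw value, total so far.
for time, value, total in running_sum(SINGLE_ENTITY_CSV, "m"):
    print(time, value, total)
```

Each output row pairs an event with the total accumulated up to that event, which is the "result changes over time" behavior the tour text describes.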
1 change: 1 addition & 0 deletions python/pyproject.toml
@@ -89,6 +89,7 @@ myst-parser = {version = ">=0.16.1"}
# Sphinx to < 6. Once a new release occurs we can upgrade to `0.18.0` or newer.
# https://github.com/executablebooks/MyST-NB/issues/530
myst-nb = { git = "https://github.com/executablebooks/MyST-NB.git", rev = "3d6a5d1"}
plotly = {version = "^5.16.1"}

[tool.poetry.group.test]
# Dependencies for testing