docs/windowing.md
Lines changed: 228 additions & 8 deletions
@@ -9,7 +9,9 @@ With windows, you can calculate such aggregations as:
9
9
- Total number of website visitors for every hour
10
10
- The average speed of a vehicle over the last 10 minutes
11
11
- Maximum temperature of a sensor observed over 30 second ranges
12
-
- Give an user a reward after 10 succesful actions
12
+
- Give a user a reward after 10 successful actions
13
+
- Track user activity sessions on a website
14
+
- Detect fraud patterns in financial transactions
13
15
14
16
15
17
## Types of Time in Streaming
@@ -500,6 +502,220 @@ sdf = (
500
502
501
503
```
502
504
505
+
## Session Windows
506
+
507
+
Session windows group events that occur within a specified timeout period of each other. Unlike fixed-time windows (tumbling, hopping, sliding), session windows have dynamic durations based on the actual timing of events, making them ideal for user activity tracking, fraud detection, and other event-driven scenarios.
508
+
509
+
A session starts with the first event and is extended each time a new event arrives within the timeout period. The session closes once the timeout period elapses with no new events.
510
+
511
+
Key characteristics of session windows:
512
+
513
+
- **Dynamic boundaries**: Each session can have different start and end times based on actual events
514
+
- **Activity-based**: Sessions extend automatically when events arrive within the timeout period
515
+
- **Event-driven closure**: Sessions close when no events arrive within the timeout period
516
+
- **Grace period support**: Late events can still extend sessions if they arrive within the grace period
517
+
518
+
### How Session Windows Work
519
+
520
+
```
521
+
Time: 0 5 10 15 20 25 30 35 40 45 50
522
+
Events: A(0) B(10) C(25) D(30) E(45)
523
+
524
+
Timeout: 10 seconds
525
+
Grace: 2 seconds
526
+
527
+
Session 1: [0, 20] - Events A, B (B extends the session from A)
528
+
Session 2: [25, 40] - Events C, D (D extends the session from C)
529
+
Session 3: [45, 55] - Event E (session will close at 55 if no more events)
530
+
```
531
+
532
+
In this example:
533
+
- Event A starts Session 1 at time 0; the session would time out at time 10
534
+
- Event B arrives at time 10, extending Session 1 so that it times out at time 20
535
+
- Event C arrives at time 25, starting Session 2 (too late for Session 1)
536
+
- Event D arrives at time 30, extending Session 2 so that it times out at time 40
537
+
- Event E arrives at time 45, starting Session 3
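
The walkthrough above can be reproduced with a small standalone sketch. It is plain Python rather than Quix Streams code: `assign_sessions` is a hypothetical helper, and it assumes that an event joins the open session when its timestamp is at most `timeout + grace` after the previous event, and that a session ends at the last event time plus the timeout.

```python
from typing import List, Optional, Tuple


def assign_sessions(
    timestamps: List[int], timeout: int, grace: int = 0
) -> List[Tuple[int, int]]:
    """Group event timestamps into (start, end) sessions.

    Illustrative only: `end` is the last event time plus the timeout.
    """
    sessions: List[Tuple[int, int]] = []
    start: Optional[int] = None
    last: Optional[int] = None
    for ts in sorted(timestamps):
        if last is not None and ts <= last + timeout + grace:
            last = ts  # the event extends the open session
        else:
            if start is not None and last is not None:
                sessions.append((start, last + timeout))
            start = last = ts  # the event opens a new session
    if start is not None and last is not None:
        sessions.append((start, last + timeout))
    return sessions


# Events A, B, C, D, E from the example (times in seconds)
sessions = assign_sessions([0, 10, 25, 30, 45], timeout=10, grace=2)
print(sessions)
```

The sketch processes a finished list of timestamps for a single key; a streaming implementation has to keep the open session in state and decide when to emit it.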
538
+
539
+
### Basic Session Window Example
540
+
541
+
Imagine you want to track user activity sessions on a website, where a session continues as long as user actions occur within 30 minutes of each other:
- **`timeout_ms`**: The session timeout period. If no new events arrive within this period, the session will be closed. Can be specified as either an `int` (milliseconds) or a `timedelta` object.
700
+
701
+
- **`grace_ms`**: The grace period for data arrival. Allows late-arriving data to be included in the session, even if it arrives after the session has theoretically timed out. Can be specified as either an `int` (milliseconds) or a `timedelta` object.
702
+
703
+
- **`name`**: Optional unique identifier for the window. If not provided, it will be automatically generated based on the window's properties.
704
+
705
+
- **`on_late`**: Optional callback to react to late records that cannot extend any existing session. Use this to customize logging or route late events to a dead-letter queue.
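
Since `timeout_ms` and `grace_ms` accept either an `int` or a `timedelta`, it can help to see how the two forms relate. The following is a minimal sketch, not the library's own conversion code; `to_ms` is a hypothetical helper that rounds sub-millisecond values to the nearest millisecond.

```python
from datetime import timedelta
from typing import Union


def to_ms(value: Union[int, timedelta]) -> int:
    """Normalize an int (already milliseconds) or a timedelta to milliseconds."""
    if isinstance(value, timedelta):
        # Dividing two timedeltas yields a float number of milliseconds
        return round(value / timedelta(milliseconds=1))
    return value


print(to_ms(timedelta(minutes=30)))  # 1800000
print(to_ms(5000))                   # 5000
```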
706
+
707
+
### Session Window Behavior
708
+
709
+
**Session Creation**: A new session starts when an event arrives and no existing session can accommodate it (i.e., all existing sessions have timed out).
710
+
711
+
**Session Extension**: An existing session is extended when an event arrives within `timeout + grace_period` of the session's last activity.
712
+
713
+
**Session Closure**: A session closes when the current time exceeds `session_end_time + grace_period`, where `session_end_time = last_event_time + timeout`.
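
The extension and closure rules can be written as two small predicates. This is a sketch under stated assumptions: the function names are hypothetical, "within" is treated as inclusive, and "exceeds" as strictly greater than.

```python
def can_extend(event_time: int, last_event_time: int, timeout: int, grace: int) -> bool:
    """True if the event falls within timeout + grace of the last activity."""
    return event_time <= last_event_time + timeout + grace


def is_closed(current_time: int, last_event_time: int, timeout: int, grace: int) -> bool:
    """True once the current time exceeds session_end_time + grace."""
    session_end_time = last_event_time + timeout
    return current_time > session_end_time + grace


# Last event at 30 s, 10 s timeout, 2 s grace
print(can_extend(41, 30, 10, 2))  # True: 41 <= 30 + 10 + 2
print(is_closed(43, 30, 10, 2))   # True: 43 > (30 + 10) + 2
```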
714
+
715
+
**Key Grouping**: Like all windows in Quix Streams, sessions are grouped by message key. Each key maintains its own independent sessions.
716
+
717
+
**Event Time**: Sessions use event time (from Kafka message timestamps) rather than processing time.
718
+
503
719
## Lateness and Out-of-Order Processing
504
720
When working with event time, some events may be processed later than they're supposed to.
505
721
Such events are called **"out-of-order"** because they violate the expected order of time in the data stream.
@@ -540,7 +756,7 @@ The appropriate value for a grace period varies depending on the use case.
540
756
### Reacting on late events
541
757
!!! info New in v3.8.0
542
758
543
-
To react on late records coming into time windows, you can pass the `on_late` callbacks to `.tumbling_window()`, `.hopping_window()` and `.sliding_window()` methods.
759
+
To react to late records arriving in time windows, you can pass an `on_late` callback to the `.tumbling_window()`, `.hopping_window()`, `.sliding_window()`, and `.session_window()` methods.
544
760
545
761
You can use this callback to customize the logging of such messages or to send them to some dead-letter queue, for example.
546
762
@@ -667,6 +883,8 @@ In this strategy, messages advance time and close only windows with the **same**
667
883
668
884
If some message keys appear irregularly in the stream, the latest windows can remain unprocessed until the message with the same key is received.
669
885
886
+
Session windows also support both closing strategies. With **key** strategy, sessions for each key close independently. With **partition** strategy, any message can advance time and close sessions for all keys in the partition.
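
To make the difference concrete for sessions, here is a simplified simulation in plain Python. It is not Quix Streams code: `sessions_closed_by` is a hypothetical helper, each key is assumed to have a single open session, and a session is assumed to close once the observed time exceeds its last event time plus timeout plus grace.

```python
from typing import Dict, List


def sessions_closed_by(
    message_key: str,
    message_time: int,
    last_event_times: Dict[str, int],
    timeout: int,
    grace: int,
    strategy: str,
) -> List[str]:
    """Return the keys whose open session closes when the message arrives."""
    closed = []
    for key, last_event_time in last_event_times.items():
        # With the "key" strategy a message only advances time for its own key
        if strategy == "key" and key != message_key:
            continue
        if message_time > last_event_time + timeout + grace:
            closed.append(key)
    return sorted(closed)


# Last event time per key, in seconds; a message for "user-b" arrives at 35 s
open_sessions = {"user-a": 10, "user-b": 30}
print(sessions_closed_by("user-b", 35, open_sessions, 10, 2, "key"))        # []
print(sessions_closed_by("user-b", 35, open_sessions, 10, 2, "partition"))  # ['user-a']
```

With the key strategy, the idle session of `user-a` stays open until `user-a` sends another message; with the partition strategy, the message from `user-b` is enough to close it.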
887
+
670
888
```python
671
889
from datetime import timedelta
672
890
from quixstreams import Application
@@ -780,7 +998,7 @@ described in [the "Updating Kafka Headers" section](./processing.md#updating-kaf
780
998
781
999
Here are some general concepts about how windowed aggregations are implemented in Quix Streams:
782
1000
783
-
- Only time-based windows are supported.
1001
+
- Time-based windows (tumbling, hopping, sliding, session) and count-based windows are supported.
784
1002
- Every window is grouped by the current Kafka message key.
785
1003
- Messages with `None` key will be ignored.
786
1004
- The minimal window unit is a **millisecond**. More fine-grained values (e.g. microseconds) will be rounded to the nearest millisecond.
@@ -794,10 +1012,12 @@ window specification.
794
1012
795
1013
The state store name is auto-generated by default using the following window attributes:
796
1014
797
-
- Window type: `"tumbling"` or `"hopping"`
798
-
- Window parameters: `duration_ms` and `step_ms`
1015
+
- Window type: `"tumbling"`, `"hopping"`, `"sliding"`, or `"session"`
1016
+
- Window parameters: `duration_ms` and `step_ms` for fixed-time windows, `timeout_ms` for session windows
799
1017
800
-
E.g. a store name for a hopping window of 30 seconds with a 5 second step will be `hopping_window_30000_5000`.
1018
+
Examples:
1019
+
- A hopping window of 30 seconds with a 5 second step: `hopping_window_30000_5000`
1020
+
- A session window with a 30 second timeout: `session_window_30000`
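
The naming pattern in these two examples can be mimicked with a short helper. This is only an illustration of the pattern: `store_name` is a hypothetical function, not the library's implementation, and names for other window types may be built differently.

```python
def store_name(window_type: str, *params_ms: int) -> str:
    """Join the window type and its millisecond parameters with underscores."""
    return "_".join([f"{window_type}_window", *map(str, params_ms)])


print(store_name("hopping", 30000, 5000))  # hopping_window_30000_5000
print(store_name("session", 30000))        # session_window_30000
```

Because the parameters are part of the name, changing any of them yields a different store name.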
801
1021
802
1022
### Updating Window Definitions
803
1023
@@ -807,8 +1027,8 @@ When you change the definition of the window (e.g. its size), the data in the st
807
1027
808
1028
Quix Streams handles some of the situations, like:
809
1029
810
-
- Updating window type (e.g. from tumbling to hopping)
811
-
- Updating window period or step
1030
+
- Updating window type (e.g. from tumbling to hopping, from hopping to session)
1031
+
- Updating window period, step, or timeout
812
1032
- Adding/Removing/Updating an aggregation function (except `Reduce()`)
813
1033
814
1034
Updating the window type and parameters will change the name of the underlying state store, and the new window definition will use a different one.