perf(weave): push heavy conditions into WHERE for calls stream query #3501

Open · wants to merge 7 commits into master

Conversation

@gtarpenning (Member) commented Jan 28, 2025

Description

Push heavy conditions into the WHERE clause before aggregating in the calls query. In testing, this did not reduce query duration, but it decreased max memory usage by 10x.

When tested in the ClickHouse console on one of the historically impossibly-bad queries, this change allows it to actually complete, although it still takes 20+ seconds...

Technically, the two queries are not equivalent. In prod, grouping before filtering lets us include additional rows that share a call_id but whose dynamic fields don't match the filters. I think the aggregation functions built into the table (going from call_parts to calls_merged) mitigate most of the common cases of duplicate rows (like deleted_at or display_name).
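
For context, a rough sketch of the table shape being described; the names and column types below are hypothetical and simplified, not the actual weave DDL. Start and end events land as separate rows, and the AggregatingMergeTree engine collapses them into one row per call during background merges:

CREATE TABLE calls_merged_demo
(
    project_id  String,
    id          String,
    -- start and end events populate different columns; `any` keeps
    -- the first value seen when parts are merged
    started_at  SimpleAggregateFunction(any, Nullable(DateTime64(3))),
    output_dump SimpleAggregateFunction(any, Nullable(String))
)
ENGINE = AggregatingMergeTree
ORDER BY (project_id, id);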

Example difference between master and branch query structure:
Master

WITH filtered_calls AS (...)
SELECT ...
FROM calls_merged
WHERE calls_merged.project_id = '<>'
  AND (calls_merged.id IN filtered_calls)
GROUP BY (calls_merged.project_id, calls_merged.id)
HAVING position(JSON_VALUE(any(calls_merged.output_dump), '$."prompt"'), 'ripples') > 0
ORDER BY any(calls_merged.started_at) DESC

Branch

WITH filtered_calls AS (...)
SELECT ...
FROM calls_merged
WHERE calls_merged.project_id = '<>'
  AND (calls_merged.id IN filtered_calls)
  AND position(JSON_VALUE(calls_merged.output_dump, '$."prompt"'), 'ripples') > 0
GROUP BY (calls_merged.project_id, calls_merged.id)
ORDER BY any(calls_merged.started_at) DESC

Testing

Back-to-back testing in a local environment with 20,000 calls with very large payloads:
[Screenshot: query timing and memory stats, 2025-01-27]

Query used to generate the stats above:

SELECT 
    event_time, 
    query_duration_ms / 1000 AS query_time_sec,
    read_rows,
    read_bytes / 1024 / 1024 AS read_mb,
    memory_usage / 1024 / 1024 AS max_memory_mb
FROM system.query_log
WHERE query LIKE '--%ripples%'
  AND type = 'QueryFinish'
ORDER BY event_time DESC
LIMIT 10;
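
(The LIKE pattern works because each test query was prefixed with a SQL comment containing a marker string; the exact tag below is illustrative, not from the PR:)

-- calls stream test: ripples
SELECT ... FROM calls_merged ...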

Master:
[Screenshot: filter-timing-master-2]

Branch:
[Screenshot: filter-timing-branch-3]

@gtarpenning gtarpenning changed the title mvp, bad techers but technically working mvp, bad tekkers but technically working Jan 28, 2025
@gtarpenning gtarpenning changed the title mvp, bad tekkers but technically working mvp, bad tekkers but technically working (?) Jan 28, 2025
# Compare the formatted SQL first so a failure shows a readable diff
exp_formatted = sqlparse.format(exp_query, reindent=True)
found_formatted = sqlparse.format(query, reindent=True)

assert exp_formatted == found_formatted
assert exp_params == params
@gtarpenning (Member, Author) commented on the snippet above:

easier to debug in this order

@gtarpenning gtarpenning changed the title mvp, bad tekkers but technically working (?) perf(weave): push heavy conditions into WHERE for calls stream query Jan 28, 2025
@gtarpenning gtarpenning marked this pull request as ready for review January 28, 2025 20:50
@gtarpenning gtarpenning requested a review from a team as a code owner January 28, 2025 20:50
@gtarpenning gtarpenning requested a review from tssweeney January 28, 2025 22:21
@tssweeney (Collaborator) left a comment:

Before diving into the code, I have a concern:

I believe this will fail to return rows that have not yet been merged by the AMT (AggregatingMergeTree). Referencing the example in the description: what happens if the start and end events are not merged? The result will contain the unmerged rows without their start events!

@gtarpenning (Member, Author) replied:

> Before diving into the code, I have a concern:
>
> I believe this will fail to return rows that have not yet been merged by the AMT (AggregatingMergeTree). Referencing the example in the description: what happens if the start and end events are not merged? The result will contain the unmerged rows without their start events!

Hmm, looking at the query plan I do think this is correct, although I'm not sure how often this will happen in practice. Some dumb workarounds immediately come to mind, like always including all the start events when conditioning on an end event, and vice versa @tssweeney
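
A minimal sketch of the failure mode under discussion, reusing the hypothetical calls_merged_demo table from the description (illustrative only; SYSTEM STOP MERGES just keeps the two parts from merging during the experiment):

SYSTEM STOP MERGES calls_merged_demo;

-- Start and end events arrive as separate inserts, i.e. separate parts:
INSERT INTO calls_merged_demo VALUES ('p1', 'c1', now64(3), NULL);
INSERT INTO calls_merged_demo VALUES ('p1', 'c1', NULL, '{"prompt": "ripples"}');

-- The branch query's row-level WHERE drops the start row before
-- aggregation, so any(started_at) never sees a non-NULL value:
SELECT id, any(started_at) AS started_at
FROM calls_merged_demo
WHERE position(JSON_VALUE(output_dump, '$."prompt"'), 'ripples') > 0
GROUP BY id;
-- -> started_at comes back NULL for the unmerged call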

@gtarpenning (Member, Author) added:

@tssweeney We could also force merges by using FINAL... In large projects this will very likely be more performant than the GROUP BY... In local testing, using a query that filters down to 200 rows, FINAL used 6x less memory than GROUP BY.

I'm still not exactly sure of the best way to repro the conditions that would lead to the issue; merges are hard to predict... And the aggregation functions appear in my testing to actually be working as expected (i.e., the query planner reports unmerged parts of the table, but filtering on the inputs still always returns the outputs as well). I'll use QA tomorrow.
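
For reference, a sketch of the FINAL variant mentioned above (assumed shape against the demo table, not the actual generated query). FINAL merges row versions at read time, so the row-level filter sees fully merged calls and the unmerged-parts problem goes away:

SELECT id, started_at, output_dump
FROM calls_merged_demo FINAL
WHERE project_id = 'p1'
  AND position(JSON_VALUE(output_dump, '$."prompt"'), 'ripples') > 0
ORDER BY started_at DESC;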

assert res[0].inputs["param"]["value1"] == "hello"

# Does the query return the output?
assert res[0].output["d"] == 5
@gtarpenning (Member, Author) commented on the test above:

This test highlights the error case that the query creates.
