Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: allow full_sorting_merge in funnels #26610

Merged
merged 11 commits into from
Dec 4, 2024
Merged

fix: allow full_sorting_merge in funnels #26610

merged 11 commits into from
Dec 4, 2024

Conversation

aspicer
Copy link
Contributor

@aspicer aspicer commented Dec 3, 2024

Problem

Joining against person_distinct is causing memory issues on certain queries for certain users

https://posthog.slack.com/archives/C076E99B152/p1733217417080439

Changes

Allow "full_sorting_merge" for funnels, which greatly reduces memory usage.

https://clickhouse.com/docs/en/operations/settings/settings#join_algorithm

@aspicer aspicer requested a review from a team December 3, 2024 17:23
@aspicer aspicer changed the title fix: auto join fix: allow full_sorting_merge in funnels Dec 3, 2024
@aspicer aspicer removed the request for review from a team December 3, 2024 17:27
@aspicer aspicer requested a review from a team December 4, 2024 09:15
@@ -37,6 +37,8 @@
)
from posthog.types import EntityNode, ExclusionEntityNode

JOIN_ALGOS = "direct,parallel_hash,hash,full_sorting_merge"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for memory issues why don't we go for auto? performance concerns?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when i tested it, it took longer and timed out on metabase.

@@ -195,7 +196,9 @@ def get_query(self) -> ast.SelectQuery:
""",
{"fill_query": fill_query, "inner_select": inner_select},
)
return cast(ast.SelectQuery, s)
s = cast(ast.SelectQuery, s)
s.settings = HogQLQuerySettings(join_algorithm=JOIN_ALGOS)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be safer to first check emptiness and then override s.settings, cause if in future parse_select starts setting query settings as well, those will be lost

@aspicer aspicer merged commit fd23e4f into master Dec 4, 2024
89 checks passed
@aspicer aspicer deleted the aspicer/join_auto branch December 4, 2024 13:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants