
Add caching to recursive simplify_once calls #797

Merged (9 commits) on Feb 6, 2024

Conversation

@rjzamora (Member) commented Jan 22, 2024

The simplify logic currently collects a dependents dictionary, and then recursively calls simplify_once on all Expr objects in the expression graph. This PR adds a simple caching mechanism to avoid unnecessary repetition of the same logic (which is particularly problematic for column-name reassignment).

Note that an earlier version of this PR stored the cache in the Expr instance itself. The current approach is a bit lighter weight, but still provides a 10x performance boost for TPCh Q1 optimization.

Closes #796

Possible mitigation for #796: there may be better ways to implement the caching we need long term, but this approach seems to work reasonably well.

With this PR, the optimization stage of TPCh Q1 is still slightly slower than it was before #395, but the overall runtime is back to something reasonable (down from roughly 26s to 10s on my machine).
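
For orientation, here is a minimal, self-contained sketch of the caching pattern this PR describes. It is illustrative only, not the dask_expr code: Node, _rewrite, and tokenize_key stand in for Expr, the per-class simplification rules, and _tokenize_deterministic.

# Illustrative sketch only; not the dask_expr implementation.
def tokenize_key(dependents: dict) -> str:
    # Stand-in for _tokenize_deterministic(sorted(dependents.keys()))
    return "|".join(sorted(dependents))


class Node:
    def __init__(self, *operands):
        self.operands = list(operands)
        self._simplified = {}  # dependents-key -> previously simplified result

    def _rewrite(self, dependents: dict) -> "Node":
        # Per-class simplification rule; identity by default.
        return self

    def simplify_once(self, dependents: dict) -> "Node":
        # Skip the pass entirely if we already simplified for these dependents.
        key = tokenize_key(dependents)
        if key in self._simplified:
            return self._simplified[key]

        expr = self._rewrite(dependents)
        if expr is self:
            # No local rewrite: recurse into Node operands exactly once.
            new_operands = [
                op.simplify_once(dependents) if isinstance(op, Node) else op
                for op in self.operands
            ]
            if any(new is not old for new, old in zip(new_operands, self.operands)):
                expr = type(self)(*new_operands)

        self._simplified[key] = expr  # cache the result for these dependents
        return expr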

Review comment on dask_expr/_core.py, lines 283 to 286:
# Check if we've already simplified for these dependents
key = _tokenize_deterministic(sorted(dependents.keys()))
if key in self._simplified:
return self._simplified[key]
Member

Does this change anything for you? I tried the same thing and it actually made things even slower.

Member Author

Interesting. Yes, it certainly speeds things up a lot for me, but I only tried Q1.

Member

Yeah, sorry. I had a bug in my cache :)

@fjetter (Member) commented Jan 23, 2024

After profiling this, I am confused about why this kind of cache would help us at all. Everything that lights up the profile is essentially a computation of _meta. Some of them can be optimized (see also #796 (comment)), but the fundamental problem is that we're generating a lot of intermediate expressions and we have to compute _meta for almost all of them, which is unfortunately not always trivial.

How is this cache helping with that? I do indeed see fewer intermediates, but I don't understand how this changes anything about the performance. I ran a test where I keep the cache but never access the cached element:

diff --git a/dask_expr/_core.py b/dask_expr/_core.py
index 3388694..86ae404 100644
--- a/dask_expr/_core.py
+++ b/dask_expr/_core.py
@@ -282,8 +282,8 @@ class Expr:
         """
         # Check if we've already simplified for these dependents
         key = _tokenize_deterministic(sorted(dependents.keys()))
-        if key in self._simplified:
-            return self._simplified[key]
+        # if key in self._simplified:
+        #     return self._simplified[key]

         expr = self

@@ -315,6 +315,8 @@ class Expr:
                 if isinstance(operand, Expr):
                     new = operand.simplify_once(dependents=dependents)
                     if new._name != operand._name:
+                        if key in self._simplified:
+                            print("Already simplified but changed")
                         changed = True
                 else:
                     new = operand

And indeed... I get a lot of "Already simplified but changed" prints, so even though the expression already passed through this once, there is apparently still more work to do. This cache prevents that work, which is why it's so much faster. However, will this yield the same results?

@rjzamora (Member Author) commented Jan 23, 2024

@fjetter - Thanks for looking into this.

> And indeed... I get a lot of "Already simplified but changed" prints, so even though the expression already passed through this once, there is apparently still more work to do. This cache prevents that work, which is why it's so much faster. However, will this yield the same results?

Right, my initial assumption here was that the new "dependents-informed" optimization loop was simply re-generating expressions that it didn't actually need to regenerate, and was therefore regenerating _meta that it had already generated before.

I'll admit that I haven't made a solid attempt to poke holes in my quick/simple "fix" yet. However, my assumption is that if our dependents graph has the same nodes, and our expression is the same, then we have already performed that "simplify" logic before. If something about the dependencies had changed, then the node names would have changed.
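
A toy illustration of that argument (all names here are hypothetical; tokenize_key again stands in for _tokenize_deterministic): expression names are content-derived, so if a dependency changes, the dependent's name changes, and so does the cache key built from the sorted dependent names.

from hashlib import sha1


def name(op: str, *children: str) -> str:
    # Content-derived node name, loosely analogous to Expr._name
    return op + "-" + sha1("|".join((op,) + children).encode()).hexdigest()[:8]


def tokenize_key(dependents: dict) -> str:
    # Stand-in for _tokenize_deterministic(sorted(dependents.keys()))
    return sha1("|".join(sorted(dependents)).encode()).hexdigest()[:8]


reader = name("read_parquet")
assign_v1 = name("assign", reader)
assign_v2 = name("assign", reader, "extra_column")  # a dependency changed

key_v1 = tokenize_key({reader: [], assign_v1: [reader]})
key_v2 = tokenize_key({reader: [], assign_v2: [reader]})
assert key_v1 != key_v2  # the result cached under key_v1 is never reused here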

@fjetter (Member) commented Jan 23, 2024

I think this cache must only be populated if the expr is indeed identical in the end, so:

diff --git a/dask_expr/_core.py b/dask_expr/_core.py
index 3388694..b3f70a7 100644
--- a/dask_expr/_core.py
+++ b/dask_expr/_core.py
@@ -324,8 +324,8 @@ class Expr:
                 expr = type(expr)(*new_operands)

             break
-
-        self._simplified[key] = expr  # Cache the result
+        if expr is self:
+            self._simplified[key] = expr  # Cache the result
         return expr

     def simplify(self) -> Expr:

With this I still get almost 3k cache hits.

You could also check the name, but you'd have to check that the expression was not modified. I think an equivalent would be:

if expr._name == key:
    self._simplified[key] = expr

but we can't populate the cache immediately.

@fjetter mentioned this pull request on Jan 23, 2024
@fjetter (Member) commented Jan 23, 2024

I tried the caching approach in #798 as well, and all the changes combined make everything pleasantly fast.
Over there, I am using an attribute set on the expression that remembers whether the expression is fully simplified or not. This feels less invasive than a mapping on every instance.
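
Roughly, the shape of that flag-based idea (a sketch only, not the code in #798; _fully_simplified and _rewrite are hypothetical names):

class Node:
    def __init__(self, *operands):
        self.operands = list(operands)
        self._fully_simplified = False

    def _rewrite(self, dependents: dict) -> "Node":
        # Per-class simplification rule; identity by default.
        return self

    def simplify_once(self, dependents: dict) -> "Node":
        if self._fully_simplified:
            return self  # already known to be stable; skip the pass
        expr = self._rewrite(dependents)
        if expr is self:
            # Only mark as done when a full pass produced no change,
            # so the flag never hides a pending rewrite.
            self._fully_simplified = True
        return expr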

@mrocklin (Member) commented Jan 23, 2024 via email

@rjzamora (Member Author) commented:

> Over there, I am using an attribute set on the expression that remembers whether the expression is fully simplified or not. This feels less invasive than a mapping on every instance.

I do like the feel of that better, but tests are failing. I think you really do need to keep track of whether the expression was simplified for the specific dependents in question (not just that the expression was simplified before and it didn't change).

@rjzamora changed the title from "Cache simplified expressions within Expr" to "Add caching to recursive simplify_once calls" on Feb 5, 2024
@rjzamora marked this pull request as ready for review on February 5, 2024 18:00
@rjzamora (Member Author) commented Feb 5, 2024

I know this PR wasn't the best solution in its original form. However, I think it should be a reasonable solution for #796 in its revised form. The optimization time becomes negligible (<0.5s) for Q1 on my machine (a ~10x improvement).

Resolved (outdated) review threads on dask_expr/_core.py and dask_expr/io/_delayed.py
@phofl (Collaborator) left a comment

Yeah, I intended to do the same; this makes sense performance-wise.

My other PR is still worthwhile btw; I would appreciate it if you could take a look at #842.

@rjzamora (Member Author) commented Feb 6, 2024

Thanks for the review, @phofl!

> My other PR is still worthwhile btw; I would appreciate it if you could take a look at #842.

Yes, I agree that's worthwhile as well. I'll definitely take a look.

@phofl merged commit b5d17ad into dask:main on Feb 6, 2024
7 checks passed
@phofl (Collaborator) commented Feb 6, 2024

thx

@rjzamora deleted the improve-simplify-caching branch on February 6, 2024 14:07
Successfully merging this pull request may close these issues:

[Bug] Optimization is now much slower in TPCh benchmarks