
[RFC] Cache Expr objects #280

Open · wants to merge 1 commit into main
Conversation

mrocklin
Member

Much of our overhead comes from doing computational work in other libraries (pandas, arrow, ...) that could be cached. We do cache a lot of stuff today, but we store those caches on the object itself. When we then recreate objects (for example during optimization) we lose those caches.

One solution is to cache the objects themselves, so that `Op(...) is Op(...)`. This technique is a bit magical, but it is used in other projects like SymPy, where it has had good performance impacts (although they use it because they create many more, much smaller objects).

Maybe this isn't relevant for us. Ideally we wouldn't recreate objects often in optimization (this is why we return the original object if the arguments match). But maybe it's hard to be careful everywhere. If so, this might provide a bit of a sledgehammer approach.

This isn't done yet; in particular, there are open questions about non-hashable inputs like pandas DataFrames. Hopefully it is a useful proof of concept.

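Below is a minimal sketch of what the interning idea could look like, assuming expressions are built from a flat tuple of operands. The names here (`_Cached`, `Expr`, `Add`) are hypothetical and not dask-expr's actual API; unhashable operands (e.g. a pandas DataFrame) simply bypass the cache, which is the open question mentioned above.

```python
import weakref


class _Cached(type):
    """Metaclass that interns instances: identical arguments return the same object."""

    _cache = weakref.WeakValueDictionary()

    def __call__(cls, *args):
        key = (cls, args)
        try:
            hash(key)  # fails if any operand is unhashable
        except TypeError:
            # e.g. a pandas DataFrame operand: fall back to a fresh object
            return super().__call__(*args)
        try:
            return _Cached._cache[key]
        except KeyError:
            obj = super().__call__(*args)
            _Cached._cache[key] = obj
            return obj


class Expr(metaclass=_Cached):
    def __init__(self, *operands):
        self.operands = operands


class Add(Expr):
    pass


# Identical constructions return the very same object, so any caches
# stored on the instance survive re-creation during optimization.
assert Add(1, 2) is Add(1, 2)
assert Add(1, 2) is not Add(1, 3)
```

Using a `WeakValueDictionary` keeps the cache from pinning expressions in memory once nothing else references them; it only changes how long the interning lasts, not the `Op(...) is Op(...)` behavior while objects are alive.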