Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

This reworks the transform function somewhat to speed it up. #64

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

ThosRTanner
Copy link
Contributor

@ThosRTanner ThosRTanner commented Apr 1, 2024

This takes the time taken for the ls example from 2.6s to 0.5s on my PC.

The problem mostly seems to be related to the eq function using repr, so I've cached the repr value . I've also rearranged the transform function a little to avoid it finding out if there were any matching children and then getting the first child in the list.

Found one small tweak to improve it but I think this as good as it'll get without a rewrite

Fixes #16

This takes the time taken for the ls example from 2.6s to 0.5s on my PC.
Can do this as all the reprs of the classes have the class name in the
__repr__ string so can't possibly match
@ThosRTanner ThosRTanner marked this pull request as ready for review April 4, 2024 20:26
@ThosRTanner ThosRTanner marked this pull request as draft May 13, 2024 18:39
@ThosRTanner
Copy link
Contributor Author

made draft as attempting to make the relevant classes only have properties reveals that some of them can be changed which would potentially mess up the comparisons.

@NickCrews
Copy link
Contributor

This seems like a great idea @ThosRTanner .

Can you explain a little more what the problem is that needs to be fixed? You are saying that the objects are mutable, so that the cached value can become stale? We would need to make all the objects be immutable for this caching to work?

small suggestion for the implementation, could you do

def __repr__(self):
    if self.__repr is None:
        self.__repr = ...
    return self.__repr

and then this allows __eq__ etc to just use the plain return repr(self) == repr(other)?

@ThosRTanner
Copy link
Contributor Author

ThosRTanner commented Jul 12, 2024

Basically, yes. If you're going to use the hash of the object so you can use it as a key in a dict, then if the hash changes due to the object mutating, then you won't be able to find the object by the key any more. Then you might add it a 2nd time, and you might end up finding it twice if you iterated over the dict (depending a lot on how python does it).

I thought I'd be safe to do this and did the PR, then thought I should change all the attributes into properties and discovered they were sometimes getting updated. I don't know what effects this has, but I'm pretty sure it should have some, so I turned this into a draft.

@ThosRTanner
Copy link
Contributor Author

ThosRTanner commented Jul 12, 2024

on the bit about simplifying the eq - as I understand it it's not really good practice to omit the type check (and having it actually sped things up...) but the other part - yeah. that's much easier to understand, thanks. I am an idiot. I think. There was some reason for doing it like it is, because the various repr methods in the subclasses are different.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Performance penalty: combinatorial explosion in transform
2 participants