Tree: various speedups #887


matthiasdiener (Collaborator) commented Nov 27, 2024

  • make dataclass non-frozen
  • use mutate() for cases where a Map is modified multiple times
  • remove asserts for cases that would fail immediately anyway
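
The batching idea behind the second bullet can be sketched with the standard library alone. `FrozenMap`, `_Mutation`, and their methods are hypothetical stand-ins written for this illustration; they are not the Map class that `Tree` uses, and a real persistent map shares structure between versions, so the saving there is smaller than in this copy-based model.

```python
from types import MappingProxyType


class FrozenMap:
    """Hypothetical immutable map, for illustration only."""

    def __init__(self, data=None):
        # Copy the input and expose it through a read-only view.
        self._data = MappingProxyType(dict(data or {}))

    def set(self, key, value):
        # Every call builds a complete new FrozenMap.
        new = dict(self._data)
        new[key] = value
        return FrozenMap(new)

    def mutate(self):
        # Hand out one private, mutable working copy for batched edits.
        return _Mutation(dict(self._data))

    def __getitem__(self, key):
        return self._data[key]

    def __len__(self):
        return len(self._data)


class _Mutation:
    """Hypothetical mutable builder returned by FrozenMap.mutate()."""

    def __init__(self, data):
        self._data = data

    def set(self, key, value):
        self._data[key] = value  # in place, no intermediate map

    def finish(self):
        return FrozenMap(self._data)


empty = FrozenMap()

# One intermediate FrozenMap per modification.
one_by_one = empty.set("a", 1).set("b", 2).set("c", 3)

# A single working copy, frozen once at the end.
batch = empty.mutate()
for key, value in [("a", 1), ("b", 2), ("c", 3)]:
    batch.set(key, value)
batched = batch.finish()
```

Both paths produce equal contents and leave `empty` untouched; the batched path simply creates fewer intermediate objects when several modifications happen back to back.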

For the following microbenchmark:

from loopy.schedule.tree import Tree

def run_t():
    tree = Tree.from_root(-1)

    for i in range(100):
        tree = tree.add_node(i, i-1)

    for i in range(100):
        tree = tree.replace_node(i, 101+i)

    for i in range(100):
        tree = tree.move_node(101+i, -1)

from timeit import timeit

print(timeit(run_t, number=1000))

On this microbenchmark, this PR reduces the run time (total seconds for 1000 calls, as printed by timeit):

| Branch | Without -O | With -O |
|---|---|---|
| main | 0.495 | 0.435 |
| tree-speedups (78c19c8) | 0.408 | 0.347 |

- @dataclass(frozen=True)
+ # Not frozen because it is slower. Tree objects are immutable, and offer no
+ # way to mutate the tree.
+ @dataclass(frozen=False)
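
A minimal sketch of the cost being traded here, assuming two hypothetical classes `FrozenPair` and `PlainPair` that are not part of loopy. The generated `__init__` of a frozen dataclass assigns each field through `object.__setattr__`, which usually makes construction slower; the exact numbers depend on the Python version and machine, so none are claimed here.

```python
from dataclasses import dataclass, FrozenInstanceError
from timeit import timeit


@dataclass(frozen=True)
class FrozenPair:  # hypothetical class, for illustration only
    a: int
    b: int


@dataclass(frozen=False)
class PlainPair:  # hypothetical class, for illustration only
    a: int
    b: int


# Time construction only; results vary by interpreter and machine.
t_frozen = timeit(lambda: FrozenPair(1, 2), number=100_000)
t_plain = timeit(lambda: PlainPair(1, 2), number=100_000)
print(f"frozen: {t_frozen:.3f}s  plain: {t_plain:.3f}s")

# The protection given up: only the frozen class rejects assignment.
try:
    FrozenPair(1, 2).a = 5
    frozen_rejects = False
except FrozenInstanceError:
    frozen_rejects = True

p = PlainPair(1, 2)
p.a = 5  # allowed on the non-frozen class
```

With `frozen=False`, immutability becomes a convention of the class's methods rather than something the dataclass machinery enforces.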
Owner


Suggested change
@dataclass(frozen=False)
@dataclass(frozen=__debug__)

(And adapt the comment suitably?)

Collaborator Author


$ ./run-mypy.sh
loopy/schedule/tree.py:55: error: "frozen" argument must be a True or False literal  [literal-required]

🤷
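
For context, a small sketch of what the suggestion does at runtime, assuming a hypothetical `Node` class. Python accepts any truthy or falsy value for `frozen`, so `frozen=__debug__` freezes the class in a normal run and leaves it unfrozen under `python -O`; the mypy error above is a static-checking restriction, not a runtime failure.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=__debug__)  # fine at runtime; mypy requires a literal here
class Node:  # hypothetical class, for illustration only
    value: int


def is_frozen_at_runtime() -> bool:
    """Report whether assigning to a field is rejected."""
    node = Node(1)
    try:
        node.value = 2
    except FrozenInstanceError:
        return True
    return False


frozen_now = is_frozen_at_runtime()
print(f"__debug__={__debug__}, frozen={frozen_now}")
```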


def is_root(self, node: NodeT) -> bool:
assert node in self
# cast-reason: parent_of_node can not be None (as per is_root check)
return 1 + self.depth(cast(NodeT, parent_of_node))
Owner


Use not_none?

Collaborator Author


In that case, wouldn't it be easier just to leave the assert there, instead of the not_none?

Contributor

@alexfikl alexfikl Nov 27, 2024


Would something like this work?

parent_of_node = self.parent(node)
if parent_of_node is None:
    return 0

return 1 + self.depth(parent_of_node)

No need to hide the None check in another function (which tricks mypy), and this only does the parent lookup once.
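
A self-contained sketch of that pattern, assuming a hypothetical `TinyTree` that stores each node's parent in a plain dict. It is mutable and much simpler than loopy's `Tree`; only the body of `depth` mirrors the suggestion.

```python
from typing import Dict, Hashable, Optional


class TinyTree:
    """Hypothetical minimal tree: maps each node to its parent (None for the root)."""

    def __init__(self, root: Hashable) -> None:
        self._parent: Dict[Hashable, Optional[Hashable]] = {root: None}

    def add_node(self, node: Hashable, parent: Hashable) -> None:
        if parent not in self._parent:
            raise KeyError(parent)
        self._parent[node] = parent

    def parent(self, node: Hashable) -> Optional[Hashable]:
        return self._parent[node]

    def depth(self, node: Hashable) -> int:
        # One lookup; the explicit None check lets a type checker narrow
        # Optional[...] without a cast.
        parent_of_node = self.parent(node)
        if parent_of_node is None:
            return 0

        return 1 + self.depth(parent_of_node)


# Build a chain -1 -> 0 -> 1 -> 2 -> 3 -> 4, like the microbenchmark's shape.
tree = TinyTree(-1)
for i in range(5):
    tree.add_node(i, i - 1)

root_depth = tree.depth(-1)
leaf_depth = tree.depth(4)
```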

Collaborator Author


That sounds good, thanks! cdc3235

Owner

inducer commented Nov 27, 2024

What's the performance impact with/without when using python -O?

@matthiasdiener
Collaborator Author

> What's the performance impact with/without when using python -O?

I added a table in the first comment that compares with -O.
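
The -O comparison is relevant because `python -O` strips `assert` statements, so removing asserts by hand only helps runs without -O. Below is a hedged sketch of the kind of redundant assert the PR description mentions; the `parents` dict and both functions are hypothetical and do not come from loopy.

```python
parents = {"root": None, "child": "root"}  # hypothetical data


def parent_checked(node):
    assert node in parents  # redundant: the lookup below fails anyway
    return parents[node]


def parent_unchecked(node):
    return parents[node]  # raises KeyError for an unknown node


def failure_kind(func, node):
    """Return the name of the exception raised by func(node), or None."""
    try:
        func(node)
    except (AssertionError, KeyError) as exc:
        return type(exc).__name__
    return None


checked_failure = failure_kind(parent_checked, "missing")
unchecked_failure = failure_kind(parent_unchecked, "missing")
print(checked_failure, unchecked_failure)
```

Either way an unknown node fails immediately; without the assert the error is a `KeyError` in every mode, and the membership test is not paid for twice.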

@alexfikl
Contributor

The mypy failures are probably due to inducer/cgen#50 (which adds py.typed, so now it gets checked).

matthiasdiener and others added 4 commits November 27, 2024 23:08
- make dataclass non-frozen
- use mutate() for cases where a Map is modified multiple times
- remove asserts for cases that would fail immediately anyway
Co-authored-by: Alexandru Fikl <[email protected]>