Internalize multi-tree comparisons in C++ #57

Open
2 tasks
ms609 opened this issue Jul 16, 2021 · 0 comments

ms609 commented Jul 16, 2021

When comparing all pairs of trees, we could attain faster results by:

  • Loading all trees into C++ and converting to split lists once (rather than for each pair)
  • Storing a sorted list of splits alongside a list of their properties
    • Use a k-way merge to produce a single index of all unique splits
    • Each tree will then be represented as a series of links to splits
    • Each unique split can have its properties (in_split) calculated and stored once
    • It may also be possible to compare each pair of unique splits only once, if this doesn't consume too much memory.
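
The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the package's implementation: it assumes each split fits in a single 64-bit mask (at most 64 leaves) and that each tree's splits arrive already sorted. The names `Split`, `SplitIndex`, `count_leaves` and `build_split_index` are hypothetical, and `in_split` is taken here to mean the number of leaves on the marked side of a split.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <queue>
#include <tuple>
#include <vector>

// Assumption: one split is a bitmask over at most 64 leaves.
using Split = std::uint64_t;

// Hypothetical container for the shared index described in the issue.
struct SplitIndex {
  std::vector<Split> unique_splits;  // sorted, duplicates removed
  std::vector<int> in_split;         // leaf count, computed once per unique split
  std::vector<std::vector<std::size_t>> tree_links;  // per tree: indices into unique_splits
};

// Count set bits by repeatedly clearing the lowest one.
inline int count_leaves(Split s) {
  int n = 0;
  for (; s != 0; s &= s - 1) ++n;
  return n;
}

// k-way merge of per-tree split lists; each list must be sorted ascending.
SplitIndex build_split_index(const std::vector<std::vector<Split>>& trees) {
  // Heap entries: (split value, tree number, position within that tree).
  using Item = std::tuple<Split, std::size_t, std::size_t>;
  std::priority_queue<Item, std::vector<Item>, std::greater<Item>> heap;

  SplitIndex idx;
  idx.tree_links.resize(trees.size());
  for (std::size_t t = 0; t < trees.size(); ++t) {
    assert(std::is_sorted(trees[t].begin(), trees[t].end()));
    idx.tree_links[t].reserve(trees[t].size());
    if (!trees[t].empty()) heap.emplace(trees[t][0], t, std::size_t{0});
  }

  while (!heap.empty()) {
    const Item top = heap.top();
    heap.pop();
    const Split split = std::get<0>(top);
    const std::size_t t = std::get<1>(top);
    const std::size_t pos = std::get<2>(top);

    // Splits leave the heap in ascending order, so a new unique split
    // is exactly one that differs from the last one recorded.
    if (idx.unique_splits.empty() || idx.unique_splits.back() != split) {
      idx.unique_splits.push_back(split);
      idx.in_split.push_back(count_leaves(split));
    }
    idx.tree_links[t].push_back(idx.unique_splits.size() - 1);

    // Advance within the tree that supplied this split.
    if (pos + 1 < trees[t].size()) heap.emplace(trees[t][pos + 1], t, pos + 1);
  }
  return idx;
}
```

With three trees whose sorted split lists are `{0x3, 0x7}`, `{0x3, 0xC}` and `{0x7, 0xC}`, the index holds three unique splits and each tree is reduced to two links into it. If every pair of unique splits were then compared once, a table over U unique splits would need U(U-1)/2 entries, which is where the memory concern in the last point comes from.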