PERF/WIP: Parallel subnet solving #507

Open
wants to merge 1 commit into master
Conversation

nkeim
Contributor

nkeim commented Jul 2, 2018

Inspired by #499, this uses numba's nogil option so that the subnet solver can be multithreaded. Threading lets us avoid worrying very much about shared mutable state in the linker, and it also has lower overhead than multiprocessing. The current implementation uses the concurrent.futures module, so it is not compatible with older Python versions; that is why some of the tests fail.
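For context, here is a minimal, hypothetical sketch of the general approach (not the code in this PR): a numba kernel compiled with nogil=True, dispatched through concurrent.futures.ThreadPoolExecutor so that several subnets can be solved concurrently. All function and variable names below are illustrative.

```python
# Hypothetical sketch only: not the implementation in this PR.
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from numba import njit


@njit(nogil=True, cache=True)
def solve_subnet(distances):
    # Toy stand-in for the recursive subnet assignment. With nogil=True the
    # compiled code runs outside the GIL, so threads can execute in parallel.
    total = 0.0
    for i in range(distances.shape[0]):
        total += distances[i].min()
    return total


def solve_all_subnets(subnets, max_workers=4):
    # Threads (rather than processes) keep the linker's shared state cheap to
    # access and avoid pickling overhead.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(solve_subnet, subnets))


# Example: each "subnet" stands in for a candidate-distance matrix.
subnets = [np.random.rand(5, 5) for _ in range(100)]
costs = solve_all_subnets(subnets)
```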

I have confirmed that this lets linking use multiple CPUs. However, I have not confirmed that it gives a significant speedup for common workloads, so I invite interested users to try it out! At some point I will try it on my own data, which has lots of subnets.

Obviously this is also missing tests and an API to control the new feature; right now, multithreading is always enabled.

@apiszcz
Contributor

apiszcz commented Jul 2, 2018 via email

@nkeim
Contributor Author

nkeim commented Jul 2, 2018

@apiszcz Thanks! So for at least some workloads, the additional overhead is not worthwhile.

One option would be to send a subnet computation to the threads only if it contains many particles. That could be done via a shared queue.
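As a rough illustration of that idea (purely hypothetical; the threshold and helper names are not from trackpy), small subnets could be solved inline and only large ones handed to the pool. Rather than an explicit shared queue, this sketch relies on the ThreadPoolExecutor's internal work queue to do the sharing.

```python
from concurrent.futures import ThreadPoolExecutor

LARGE_SUBNET = 8  # particles; would need to be tuned empirically


def link_subnets(subnets, solve, max_workers=4):
    results = [None] * len(subnets)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pending = {}
        for i, subnet in enumerate(subnets):
            if len(subnet) >= LARGE_SUBNET:
                # Big subnet: the solve time dwarfs the dispatch overhead.
                pending[pool.submit(solve, subnet)] = i
            else:
                # Small subnet: cheaper to solve in the calling thread.
                results[i] = solve(subnet)
        for future, i in pending.items():
            results[i] = future.result()
    return results
```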

@apiszcz
Contributor

apiszcz commented Jul 2, 2018 via email

@speedymcs

Is this still being considered? Yesterday I ran into a problem with large subnetworks and thought that using multiple CPUs might help. For my test coordinates, with about 2700 trajectories and 480 steps, the current trackpy implementation is much faster than the code in this PR; however, my 12 cores each report only about 10-12% usage.

@nkeim
Contributor Author

nkeim commented Aug 9, 2019

Hi @speedymcs ! I'm really glad you tried this patch. I'm not actively working on this right now, in part because I don't have obvious ideas about why it is slower or how to speed it up. I guess it is more of a proof-of-concept (that subnets can be solved in parallel, not that doing so is faster!).

If you have slow, large subnets, right now the best things to try (besides reducing search_range and filtering out spurious features) are adaptive linking and prediction.
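For anyone landing here, typical usage of those two options looks roughly like this (parameter values are placeholders, and `features` is assumed to be a DataFrame of located features, e.g. from `tp.batch`):

```python
import trackpy as tp

# Adaptive linking: shrink the search range on oversized subnets instead of
# giving up on them.
t = tp.link_df(features, search_range=13, adaptive_stop=2, adaptive_step=0.5)

# Prediction: a velocity-based predictor can keep candidate subnets small
# when particles drift between frames.
pred = tp.predict.NearestVelocityPredict()
t = pred.link_df(features, search_range=13)
```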

@speedymcs

@nkeim Okay, thanks! In the past I used the multiprocessing package for my own (crude) tracking implementation; it spread the CPU load very well, but there were stability issues. Anyway, I'll look into adaptive linking next.

I tried to find out where in the image the oversize error occurs and recently asked a question on Stack Overflow about that; excuse the shameless plug :)

@tacaswell
Member

I responded on SO, but the very short version is that you should find a way to avoid large sub-nets; they lead to both long run times and unreliable linking.
