PERF/WIP: Parallel subnet solving #507

Open
wants to merge 1 commit into master
Conversation

nkeim
Contributor

nkeim commented Jul 2, 2018

Inspired by #499, this uses numba's nogil option so that the subnet solver can be multithreaded. Threading lets us avoid worrying very much about shared mutable state in the linker, and it also has lower overhead than multiprocessing. The current implementation uses the concurrent.futures module, so it is not compatible with older Python versions; that is why some of the tests fail.
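For context, here is a minimal, hypothetical sketch of the general approach (not the code in this PR): a numba kernel compiled with nogil=True, dispatched through concurrent.futures.ThreadPoolExecutor so that several subnets can be solved concurrently. All function and variable names below are illustrative.

```python
# Hypothetical sketch only: not the implementation in this PR.
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from numba import njit


@njit(nogil=True, cache=True)
def solve_subnet(distances):
    # Toy stand-in for the recursive subnet assignment. With nogil=True the
    # compiled code runs outside the GIL, so threads can execute in parallel.
    total = 0.0
    for i in range(distances.shape[0]):
        total += distances[i].min()
    return total


def solve_all_subnets(subnets, max_workers=4):
    # Threads (rather than processes) keep the linker's shared state cheap to
    # access and avoid pickling overhead.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(solve_subnet, subnets))


# Example: each "subnet" stands in for a candidate-distance matrix.
subnets = [np.random.rand(5, 5) for _ in range(100)]
costs = solve_all_subnets(subnets)
```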

I have confirmed that this lets linking use multiple CPUs. However, I have not confirmed that it gives a significant speedup for common workloads, so I invite interested users to try it out! At some point I will try it on my own data, which has lots of subnets.

Obviously this is also missing tests and an API to control the new feature; right now, multithreading is always enabled.

@apiszcz
Contributor

apiszcz commented Jul 2, 2018 via email

@nkeim
Contributor Author

nkeim commented Jul 2, 2018

@apiszcz Thanks! So for at least some workloads, the additional overhead is not worthwhile.

One option would be to send a subnet computation to the threads only if it contains many particles. That could be done via a shared queue.
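As a rough illustration of that idea (purely hypothetical; the threshold and helper names are not from trackpy), small subnets could be solved inline and only large ones handed to the pool. Rather than an explicit shared queue, this sketch relies on the ThreadPoolExecutor's internal work queue to do the sharing.

```python
from concurrent.futures import ThreadPoolExecutor

LARGE_SUBNET = 8  # particles; would need to be tuned empirically


def link_subnets(subnets, solve, max_workers=4):
    results = [None] * len(subnets)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pending = {}
        for i, subnet in enumerate(subnets):
            if len(subnet) >= LARGE_SUBNET:
                # Big subnet: the solve time dwarfs the dispatch overhead.
                pending[pool.submit(solve, subnet)] = i
            else:
                # Small subnet: cheaper to solve in the calling thread.
                results[i] = solve(subnet)
        for future, i in pending.items():
            results[i] = future.result()
    return results
```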

@apiszcz
Contributor

apiszcz commented Jul 2, 2018 via email

@speedymcs

Is this still being considered? Yesterday I ran into a problem with large subnetworks and thought that using multiple CPUs might help. For my test coordinates, with about 2700 trajectories and 480 steps, the current trackpy implementation is much faster than the code in this PR; however, my 12 cores each report only about 10-12% usage.

@nkeim
Contributor Author

nkeim commented Aug 9, 2019

Hi @speedymcs ! I'm really glad you tried this patch. I'm not actively working on this right now, in part because I don't have obvious ideas about why it is slower or how to speed it up. I guess it is more of a proof-of-concept (that subnets can be solved in parallel, not that doing so is faster!).

If you have slow, large subnets, right now the best things to try (besides reducing search_range and filtering out spurious features) are adaptive linking and prediction.
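For anyone landing here, typical usage of those two options looks roughly like this (parameter values are placeholders, and `features` is assumed to be a DataFrame of located features, e.g. from `tp.batch`):

```python
import trackpy as tp

# Adaptive linking: shrink the search range on oversized subnets instead of
# giving up on them.
t = tp.link_df(features, search_range=13, adaptive_stop=2, adaptive_step=0.5)

# Prediction: a velocity-based predictor can keep candidate subnets small
# when particles drift between frames.
pred = tp.predict.NearestVelocityPredict()
t = pred.link_df(features, search_range=13)
```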

@speedymcs

@nkeim Okay, thanks! In the past I used the multiprocessing package for my own (crude) tracking implementation; it spread the CPU load very well, but there were stability issues. Anyway, I'll look into adaptive linking next.

I tried to find out where in the image the oversize error occurs and recently asked a question on Stack Overflow about that; excuse the shameless plug :)

@tacaswell
Member

I responded on SO, but the very short version is that you should find a way to avoid large sub-nets; they lead to both long run times and unreliable linking.
