
Common data sending may be fully parallel #334

Open
mschubert opened this issue Sep 26, 2024 · 0 comments
mschubert commented Sep 26, 2024

Currently, the idea behind using 3 ZeroMQ I/O threads to send messages from the master to its workers is that 3 workers are served in parallel, and that the next send starts as soon as one of them completes.

That may not be true. With an example similar to this:

common = runif(1e8) # 1e8 doubles, about 800 MB (763 MiB)
clustermq::Q(mean, i=1:10, n_jobs=10, const=list(x=common), log_worker=TRUE)
# check the worker logs for the call waits; they should be staggered

we observed all workers processing their first call after an almost identical delay (the wait value in the logs):

# cat *.log | grep 'call 1'
2024-10-07 08:47:51.170754 | > call 1 (68.241s wait)
2024-10-07 08:47:50.077038 | > call 1 (67.155s wait)
2024-10-07 08:47:50.502426 | > call 1 (67.577s wait)
2024-10-07 08:47:50.975683 | > call 1 (68.048s wait)
2024-10-07 08:47:45.277124 | > call 1 (62.356s wait)
2024-10-07 08:47:50.786405 | > call 1 (67.859s wait)
2024-10-07 08:47:48.356261 | > call 1 (65.435s wait)
2024-10-07 08:47:51.202221 | > call 1 (68.269s wait)
2024-10-07 08:47:51.098968 | > call 1 (68.169s wait)
2024-10-07 08:47:49.407605 | > call 1 (66.486s wait)
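
To quantify how close together these waits are, the log lines can be parsed directly. This is a minimal sketch that uses only base R and the ten lines shown above; the variable names (`log_lines`, `waits`) are made up for illustration, and in practice the lines would come from reading the worker log files rather than being pasted in.

```r
# Log lines copied verbatim from the grep output above
log_lines <- c(
  "2024-10-07 08:47:51.170754 | > call 1 (68.241s wait)",
  "2024-10-07 08:47:50.077038 | > call 1 (67.155s wait)",
  "2024-10-07 08:47:50.502426 | > call 1 (67.577s wait)",
  "2024-10-07 08:47:50.975683 | > call 1 (68.048s wait)",
  "2024-10-07 08:47:45.277124 | > call 1 (62.356s wait)",
  "2024-10-07 08:47:50.786405 | > call 1 (67.859s wait)",
  "2024-10-07 08:47:48.356261 | > call 1 (65.435s wait)",
  "2024-10-07 08:47:51.202221 | > call 1 (68.269s wait)",
  "2024-10-07 08:47:51.098968 | > call 1 (68.169s wait)",
  "2024-10-07 08:47:49.407605 | > call 1 (66.486s wait)"
)

# Extract the number in front of "s wait)" from each line
waits <- as.numeric(sub(".*\\(([0-9.]+)s wait\\).*", "\\1", log_lines))

range(waits)        # shortest and longest wait, in seconds
diff(range(waits))  # spread between the first and last worker
diff(sort(waits))   # gaps between consecutive workers
```

All ten waits fall between 62.356 s and 68.269 s, a spread of under 6 seconds, which is not what sends staggered in groups of 3 would be expected to produce.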

The wait times here should be staggered, and the first worker should wait less than 30 seconds (1 Gbit/s link, 3 sends in parallel).
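
The numbers behind that estimate can be checked with a back-of-the-envelope calculation. This is only a rough sketch under stated assumptions: the payload is taken to be the raw 1e8 doubles at 8 bytes each, the link is assumed to deliver a full 1 Gbit/s shared equally between concurrent sends, and serialization and protocol overhead are ignored. The helper `batch_seconds` is hypothetical, not part of clustermq.

```r
# Assumptions (not measured): raw payload only, ideal 1 Gbit/s link,
# bandwidth split equally between concurrent sends, no overhead.
size_bits <- 1e8 * 8 * 8   # 1e8 doubles * 8 bytes * 8 bits, per worker
link_bps  <- 1e9           # 1 Gbit/s

# Seconds until k concurrent sends that share the link have all finished
batch_seconds <- function(k) k * size_bits / link_bps

batch_seconds(1)   # one send using the whole link: 6.4 s
batch_seconds(3)   # three sends sharing the link: 19.2 s
batch_seconds(10)  # ten sends sharing the link: 64 s
```

Under these assumptions the first three workers would be done after about 19 seconds, while ten fully parallel sends would take about 64 seconds. The observed waits of 62 to 68 seconds are close to the latter figure, which is consistent with all ten transfers sharing the link at once, although it does not prove it.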

We should never send common data to all workers in parallel: with many workers this saturates a slow network interface and leaves too little bandwidth for other communication.

Related: #213
