Currently, the idea behind using three ZeroMQ I/O threads to send messages from the master to its workers is that three workers are served in parallel, and as soon as one send completes, the next one starts.
That may not be true. With an example similar to this:
common = runif(1e8) # 700 Mb
clustermq::Q(mean, i=1:10, n_jobs=10, const=list(x=common), log_worker=TRUE)
# check the logs for call waits, they should be staggered
we observed all workers processing their first call with almost identical delay (wait in logs).
The wait time here should be staggered, and under 30 seconds (1 Gbit, 3 sends in parallel) for the first worker.
We should never send common data to all workers in parallel: with many workers this saturates a slow network interface and does not leave enough bandwidth for other communication.
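For reference, a rough back-of-envelope sketch of the wait times one would expect under the two behaviours. It assumes the payload is just the raw 1e8 doubles at 8 bytes each (ignoring serialization overhead), that the 1 Gbit link is fully available, and that concurrent sends share it equally. The helper `send_done` is made up for this illustration and is not part of clustermq.

```r
# Back-of-envelope estimate; all figures are assumptions, not measurements.
payload_bytes   <- 1e8 * 8   # 1e8 doubles, ignoring serialization overhead
link_bits_per_s <- 1e9       # 1 Gbit/s link, assumed fully available
n_workers       <- 10

# Hypothetical helper: time (in seconds) at which each worker has received
# the common data, if at most `concurrent` sends run at once and share the
# link equally. Equal payloads mean each batch finishes together.
send_done <- function(concurrent) {
  batch       <- ceiling(seq_len(n_workers) / concurrent)  # batch index per worker
  batch_sizes <- tabulate(batch)                           # workers per batch
  batch_end   <- cumsum(batch_sizes * payload_bytes * 8 / link_bits_per_s)
  batch_end[batch]
}

staggered   <- send_done(3)          # intended: 3 sends at a time
all_at_once <- send_done(n_workers)  # observed: everything in parallel

print(round(staggered, 1))
print(round(all_at_once, 1))
```

Under these assumptions the first three workers would wait about 19 seconds with three concurrent sends, whereas sending to all ten at once makes every worker wait about 64 seconds, which would match the nearly identical delays seen in the logs.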
Related: #213