Add support for configuring Dask distributed #2040
Comments
Another question: would we like to be able to configure Dask distributed from the command line? Or at least pass in the scheduler address if we already have a Dask cluster running, e.g. started from a Jupyter notebook?
Cheers @bouweandela - sorry I slacked on this - I'll come back with a deeper and more meaningful analysis (yeah, beware 🤣), but before I do that, here are two quick comments:
Thanks a lot for your work @bouweandela! I would also suggest not to put Dask-related settings in the
It could be nice to have the possibility to use something like
Yes, that's a good point. I'm just worried that this can make it more complicated for us, the developers: time to get answers from HPC admins, updates in the software stack, number of machines supported, ... Perhaps we could simply link to specific documentation on Dask usage if HPC centers provide that (here is an example for DKRZ).
This is still an ongoing discussion, so it needs reopening.
Suggestion by @sloosvel:
At the workshop at SMHI, agreement was reached that a new configuration file format would be acceptable. I will make a proposal, but this will not be implemented in time for v2.10.
Since iris 3.6, it is possible to use Dask distributed with iris. This is a great new feature that will allow for better memory handling and distributed computing. See #1714 for an example implementation. However, it does require some extra configuration.
My proposal would be to allow users to specify the arguments to distributed.Client and to the associated cluster, e.g. distributed.LocalCluster or dask_jobqueue.SLURMCluster, in this configuration. This could either be added under a new key in config-user.yml or in a new configuration file in the ~/.esmvaltool directory.

Add to the user configuration file
We could add these new options to config-user.yml under a new key dask, e.g.:

Example config-user.yml settings for running locally using a LocalCluster:
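As a rough sketch (the key names cluster, client, and type below are illustrative, not a fixed schema), the cluster section could hold the keyword arguments for distributed.LocalCluster and the client section those for distributed.Client:

```yaml
# Illustrative sketch only: `cluster` holds keyword arguments for
# distributed.LocalCluster, `client` holds keyword arguments for
# distributed.Client; key names are not a settled schema.
dask:
  cluster:
    type: distributed.LocalCluster
    n_workers: 2
    threads_per_worker: 4
    memory_limit: 8GiB
  client: {}
```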
Example settings for using an externally managed cluster (e.g. one set up from a Jupyter notebook):
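A sketch for this case could contain only a client section with the scheduler address (the address below is a placeholder):

```yaml
# Illustrative sketch: no cluster is started; the client connects to an
# already running scheduler. The address is a placeholder.
dask:
  client:
    address: "tcp://127.0.0.1:8786"
```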
Example settings for running on Levante:
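A sketch based on dask_jobqueue.SLURMCluster; the queue, account, and resource numbers below are placeholders, not verified Levante settings:

```yaml
# Illustrative sketch: `cluster` holds keyword arguments for
# dask_jobqueue.SLURMCluster; queue, account, and resources are placeholders.
dask:
  cluster:
    type: dask_jobqueue.SLURMCluster
    queue: compute
    account: my_slurm_account
    cores: 128
    memory: 256GiB
    walltime: "01:00:00"
    n_workers: 2
  client: {}
```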
New configuration file
Or, we could add the new configuration in a separate file, e.g. called ~/.esmvaltool/dask.yml or ~/.esmvaltool/dask-distributed.yml.

Example dask.yml settings for running locally using a LocalCluster:
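For the separate-file variant, the content could mirror the sketch above; whether a top-level dask key is still needed is an open design choice. For example:

```yaml
# Illustrative sketch of ~/.esmvaltool/dask.yml: the same information as above,
# but in a standalone file, so the top-level `dask` key may be unnecessary.
cluster:
  type: distributed.LocalCluster
  n_workers: 2
  threads_per_worker: 4
  memory_limit: 8GiB
client: {}
```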
Example settings for using an externally managed cluster (e.g. one set up from a Jupyter notebook):
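Again only a client section pointing at an external scheduler (placeholder address):

```yaml
# Illustrative sketch: connect to an externally managed scheduler.
client:
  address: "tcp://127.0.0.1:8786"
```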
Example settings for running on Levante:
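And a corresponding SLURM sketch (placeholder queue, account, and resources):

```yaml
# Illustrative sketch: SLURMCluster keyword arguments in a standalone file.
cluster:
  type: dask_jobqueue.SLURMCluster
  queue: compute
  account: my_slurm_account
  cores: 128
  memory: 256GiB
  walltime: "01:00:00"
  n_workers: 2
client: {}
```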
@ESMValGroup/esmvaltool-coreteam Does anyone have an opinion on what the best approach is here? A new file, or adding to config-user.yml?