Add command anemoi-training train ... #3

Closed
wants to merge 14 commits into from

Conversation

b8raoult (Collaborator)

Implement loading of the hydra config in commands/train.py and add documentation.

training:
  max_epochs: 3

token:
Member

I know this is just an early example, but FYI: the token will not be stored in the user config. It will live in its own config file, generated by code via the anemoi-training mlflow login command rather than input by the user.

@JesperDramsch (Member) left a comment

This seems like a good step toward enabling additional functionality through user-defined configs.

I do wonder how this will interact with the work @theissenhelen has started on config validation / ConfigStores, but it's probably a good prototype that we can base that work on.

Member

Is this "just an example"?

# Add user config
user_config = config_path("training.yaml")

if os.path.exists(user_config):
Member

This should use pathlib.
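A pathlib version of the snippet above might look like the sketch below. Note that config_path here is a hypothetical stand-in for the helper used in the PR, assumed to return a path under ~/.anemoi/training:

```python
from pathlib import Path


def config_path(name: str) -> Path:
    """Hypothetical stand-in for the PR's helper; assumed to return
    the location of a user config file under ~/.anemoi/training."""
    return Path.home() / ".anemoi" / "training" / name


# Add user config
user_config = config_path("training.yaml")

if user_config.exists():  # pathlib replaces os.path.exists(...)
    print(f"Loading user config from {user_config}")
```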

# Resolve the config
OmegaConf.resolve(cfg)

print(json.dumps(OmegaConf.to_container(cfg), indent=4))
Member

This should probably use the logger if we want this as an output.
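A stdlib-only sketch of the suggested change, with a plain dict standing in for OmegaConf.to_container(cfg) after OmegaConf.resolve(cfg) so the example stays self-contained:

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
LOGGER = logging.getLogger(__name__)

# Stand-in for OmegaConf.to_container(cfg) after OmegaConf.resolve(cfg)
resolved = {"training": {"max_epochs": 3}}

# Emit through the logger instead of print, so the output
# respects configured log levels and handlers
dumped = json.dumps(resolved, indent=4)
LOGGER.info("Resolved config:\n%s", dumped)
```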

@JesperDramsch (Member) commented Jul 9, 2024

I did a deep dive into hydra and its search paths. I have a feeling we may be able to avoid some complexity (and manual merging) by implementing SearchPathPlugins from hydra. They're criminally under-documented, so I dove a bit deep into what they do.

They have an example here:
https://github.com/facebookresearch/hydra/tree/1.3_branch/examples/plugins/example_searchpath_plugin

hydra searches through different config directories anyway (we just have one right now, which is the packaged one).

We can basically configure a search path plugin that, in addition to the package defaults, searches ~/.anemoi/training (or something similar) and the current working directory by default. Users then still have the standard hydra option --config-dir / -cd to add a single extra directory to the search path.

Additionally, we would be able to expose a search path plugin in anemoi-inference to cover some inference-specific bits if we wanted to.

Then we don't have to do any merging by hand, and we keep the full functionality from before. (And we can extend the configs with structured configs and pydantic a bit later this year.)

I think the code would look something like this:

from pathlib import Path

from hydra.core.config_search_path import ConfigSearchPath
from hydra.plugins.search_path_plugin import SearchPathPlugin

class AnemoiHomeSearchPathPlugin(SearchPathPlugin):
    def manipulate_search_path(self, search_path: ConfigSearchPath) -> None:
        # User-level config dir: ~/.anemoi/training/config
        anemoi_home_path = Path(Path.home(), ".anemoi", "training", "config")
        if anemoi_home_path.exists():
            search_path.append(
                provider="anemoi-home-searchpath-plugin",
                path=str(anemoi_home_path),  # hydra expects a string path
            )


class UserCWDSearchPathPlugin(SearchPathPlugin):
    def manipulate_search_path(self, search_path: ConfigSearchPath) -> None:
        # Project-local config dir: ./config in the current working directory
        cwd_path = Path(Path.cwd(), "config")
        if cwd_path.exists():
            search_path.append(
                provider="anemoi-env-searchpath-plugin",
                path=str(cwd_path),
            )

We may also need the anemoi-training default (maybe?):

class AnemoiTrainingSearchPathPlugin(SearchPathPlugin):
    def manipulate_search_path(self, search_path: ConfigSearchPath) -> None:
        search_path.append(
            provider="anemoi-training-searchpath-plugin",
            path="pkg://anemoi.training.config",  # assuming the importable package is anemoi.training
        )

By using the hydra defaults we also ensure that we can later expand the code to structured configs easily, and the neat part is that other packages can technically implement additional SearchPathPlugins, such as anemoi-inference providing an inference config (or something similar).

Could this be something we implement in a second iteration?
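The first-match idea behind the search path can be sketched stdlib-only. This only illustrates the precedence, it is not Hydra's API, and find_config is an invented name:

```python
from pathlib import Path
from typing import Optional


def find_config(name: str, search_dirs: list) -> Optional[Path]:
    """Return the first existing copy of `name` among `search_dirs`.

    A stdlib sketch of first-match search-path precedence; whichever
    directory comes first in the list wins.
    """
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.exists():
            return candidate
    return None
```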

@FussyDuck

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@mchantry (Member)

Implement in version 0.1

@mchantry closed this on Aug 19, 2024