Replies: 6 comments 7 replies
-
Hi :) There is no native integration, but people do ask about it from time to time; see #4883. There are no plans to implement anything on our side in that direction in the near future, but please feel free to leave a comment in the issue, so we know there is interest.
-
Hi @aksg87, as @efiop answered, there is no native integration. Please take a look at this question and answer on the forums, where I tried to give a high-level overview of some things to consider: https://discuss.dvc.org/t/dvc-and-hydra-integration/868/2 It would be great to hear more about which Hydra features you are currently using.
-
Another interesting Hydra feature is multi-run, which sweeps over different parameter configurations: https://hydra.cc/docs/tutorials/basic/running_your_app/multi-run/
-
I ran across the following issue combining DVC with Hydra. This is (roughly) my
with My
But DVC seems to internally convert
I think converting null to None is unexpected behaviour. Can it be disabled?
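For context, here is a minimal sketch of how a `null` can end up as `None`. It assumes (the thread does not confirm this) that the value is parsed into a Python object and then written back with plain string formatting. The standard-library `json` module stands in for a YAML parser here, since both map `null` to Python's `None`; the `param` name is hypothetical.

```python
import json

# Stand-in for parsing a config value; a YAML loader also maps null to None.
value = json.loads("null")

# Formatting the Python object directly leaks Python's spelling of null.
as_python_text = f"param={value}"

# Serializing through the config format keeps the original spelling.
as_config_text = f"param={json.dumps(value)}"

print(as_python_text)
print(as_config_text)
```

If something like this is the cause, the fix would belong at the point where the value is turned back into text, not in the parser.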
-
I noticed a couple more issues with Hydra and YAML parsing, one serious and one less so. To set the stage, we realised that even if you use a Hydra config YAML, you can still use DVC to write into it using
becomes
and
You can also specify parameters inside the smaller Hydra configs, e.g. with
you can run
turns into
Unfortunately you lose the comment at the top, which is required by Hydra to correctly nest the contents of
The second issue concerns indentation (and is again prompted by my
to
i.e. without the 2 spaces of indentation under a list. I don't think this hurts decoding though, so it's probably merely an aesthetic issue.
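Until the leading comment survives a rewrite, one possible workaround is to re-attach it afterwards. The sketch below is not DVC functionality; `leading_comments` and `restore_header` are hypothetical helpers, and `# @package _global_` is used only as an example of a Hydra package directive, since the thread does not show which comment was lost.

```python
def leading_comments(text: str) -> list[str]:
    """Return the comment lines that open a file, stopping at the first other line."""
    header = []
    for line in text.splitlines():
        if line.startswith("#"):
            header.append(line)
        else:
            break
    return header


def restore_header(original: str, rewritten: str) -> str:
    """Re-attach the original leading comments if the rewrite dropped them."""
    header = leading_comments(original)
    if not header or rewritten.startswith(header[0]):
        return rewritten
    return "\n".join(header) + "\n" + rewritten


# Hypothetical file contents before and after a comment-dropping rewrite.
original = "# @package _global_\nmodel:\n  lr: 0.1\n"
rewritten = "model:\n  lr: 0.2\n"
fixed = restore_header(original, rewritten)
print(fixed)
```

This only handles comments at the very top of the file; comments elsewhere would need a round-trip YAML parser that preserves them.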
-
We have been working on a proposal for how to better integrate DVC and Hydra and would be happy to get feedback from @d-miketa and others. Instead of using DVC to inject values into the Hydra config, we think it will be more useful to do the opposite and use the Hydra output config to inject values into the DVC params. This should provide nearly full flexibility to compose complex configurations with Hydra while making it easy to track the final config in DVC.
Basic workflow
Hydra can replace or add parameters from the command line (see https://hydra.cc/docs/intro/#basic-example). DVC can use Hydra to enhance the existing
Start with the following
stages:
  train:
    cmd: python src/train.py data/features model.pkl
    deps:
      - data/features
      - src/train.py
    params:
      - train
    outs:
      - model.pkl
And the following
train:
  seed: 20170428
  n_est: 50
  min_split: 0.01
Run a new experiment with:
$ dvc exp run --set-param train.min_split=0.05 --set-param +train.max_depth=5
This will run an experiment with the following modified
train:
  seed: 20170428
  n_est: 50
  min_split: 0.05 # Modified this parameter (same as current behavior)
  max_depth: 5 # Added this parameter (new behavior)
Advanced workflow
Hydra can also be used to compose configuration from multiple files (see https://hydra.cc/docs/intro/#composition-example for an example). DVC can use Hydra to update
Building off the previous example, assume you have multiple model types you want to train. You can specify parameters for each under
$ tree
.
└── conf
    └── train
        ├── gradient_boosting.yaml
        └── random_forest.yaml
algo: GradientBoostingClassifier # Specify model algorithm
seed: 20170428
n_est: 50
learning_rate: 0.1 # New parameter for gradient boosting
Now you can run a new gradient boosting experiment using that configuration:
$ dvc exp run --set-param +train=gradient_boosting
This will update the
train:
  algo: GradientBoostingClassifier
  seed: 20170428
  n_est: 50
  learning_rate: 0.1
Using Hydra enables even more advanced templating and composition, including:
Hyperparameter sweeps
Hydra also allows for running hyperparameter sweeps. Using Hydra, the same could be supported in DVC (see https://hydra.cc/docs/intro/#multirun):
$ dvc exp run --set-param train=random_forest,gradient_boosting
Benefits for Hydra users
Note: There are no new features in this section, just more info for those who are interested. Existing Hydra users get:
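To make the basic workflow's override semantics concrete, here is a minimal pure-Python sketch: a plain `key=value` modifies an existing parameter, while `+key=value` adds a new one. It is a simplified imitation for illustration, not Hydra's parser or the proposed DVC implementation; `parse_value` and `apply_overrides` are hypothetical names, and it does not handle config-group overrides such as `+train=gradient_boosting`.

```python
import ast
import copy


def parse_value(raw: str):
    """Interpret Python-style literals such as numbers; fall back to the raw string."""
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return raw


def apply_overrides(params: dict, overrides: list[str]) -> dict:
    """Apply 'a.b=value' (modify) and '+a.b=value' (add) overrides to a nested dict."""
    result = copy.deepcopy(params)  # leave the caller's params untouched
    for override in overrides:
        key, _, raw = override.partition("=")
        adding = key.startswith("+")
        *parents, leaf = key.lstrip("+").split(".")
        node = result
        for name in parents:
            # Adding may create missing parents; modifying requires them to exist.
            node = node.setdefault(name, {}) if adding else node[name]
        if adding and leaf in node:
            raise KeyError(f"{key}: key already exists, drop the '+'")
        if not adding and leaf not in node:
            raise KeyError(f"{key}: key not found, use '+' to add it")
        node[leaf] = parse_value(raw)
    return result


# Values taken from the basic workflow example in this comment.
params = {"train": {"seed": 20170428, "n_est": 50, "min_split": 0.01}}
updated = apply_overrides(params, ["train.min_split=0.05", "+train.max_depth=5"])
print(updated)
```

Rejecting a plain override for a missing key, and a `+` override for an existing one, mirrors how Hydra distinguishes modifying from appending, which helps catch typos in parameter names.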
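The sweep command in this proposal can be read as "one experiment per combination of comma-separated values". A minimal sketch of that expansion follows, assuming a simple `key=v1,v2` syntax with no quoting or nested commas; `expand_sweep` is a hypothetical helper and the `train.n_est=50,100` sweep is an added example, not part of the proposal.

```python
from itertools import product


def expand_sweep(overrides: list[str]) -> list[list[str]]:
    """Expand comma-separated override values into one override list per run."""
    choices = []
    for override in overrides:
        key, _, values = override.partition("=")
        choices.append([f"{key}={value}" for value in values.split(",")])
    # The Cartesian product yields every combination of the swept values.
    return [list(combo) for combo in product(*choices)]


runs = expand_sweep(["train=random_forest,gradient_boosting", "train.n_est=50,100"])
for run in runs:
    print(run)
```

Each resulting list would correspond to one queued experiment, so two swept parameters with two values each give four runs.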
-
Hello all,
I am very impressed with DVC and also CML. I have created a machine learning pipeline using Hydra.cc which sets up experiments from YAML files and outputs all experiment parameters, TensorFlow runs, and model weights into clean outputs. I then store the larger artifacts (model weights) in Git LFS and the lighter-weight files (YAML) in Git.
I saw a few people use both Hydra and DVC. Are there any suggestions for this integration? Lots of DVC's features seem really powerful, so I would love to blend the best of both worlds. I'm just starting to learn about DVC via some of the great YouTube docs 🙂