[Config] Enable component overrides #456

RdoubleA · 2024-03-06T05:33:40Z

Context

After the config system was updated in #406 with _component_ fields, the CLI override experience for specifying TorchTune objects became clunky. For example, to change datasets, we now have to spell out the _component_ field on the CLI:

tune full_finetune --config alpaca_llama2_full_finetune.yaml --override dataset._component_=torchtune.datasets.SlimOrcaDataset

Instead, we update the parse utility so that a component path can be specified without writing _component_, and the overrides are merged properly. The above command now becomes:

tune full_finetune --config alpaca_llama2_full_finetune.yaml dataset=torchtune.datasets.SlimOrcaDataset
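The key rewrite described above can be sketched roughly as follows. This is a minimal illustration on plain dicts, not the merge_yaml_and_cli_args implementation from this PR: the function name rewrite_component_overrides, the placeholder path my_pkg.datasets.OldDataset, and the restriction to top-level keys are all assumptions.

```python
from typing import Any, Dict, List

COMPONENT_KEY = "_component_"


def rewrite_component_overrides(
    yaml_cfg: Dict[str, Any], cli_args: List[str]
) -> List[str]:
    """Rewrite `key=value` to `key._component_=value` when `key` is a component.

    Hypothetical helper: only top-level keys of `yaml_cfg` are checked.
    """
    rewritten = []
    for arg in cli_args:
        key, sep, value = arg.partition("=")
        if not sep:
            raise ValueError(f"Override must look like key=value, got: {arg!r}")
        node = yaml_cfg.get(key)
        # A component is a mapping that carries a _component_ entry.
        if isinstance(node, dict) and COMPONENT_KEY in node:
            key = f"{key}.{COMPONENT_KEY}"
        rewritten.append(f"{key}={value}")
    return rewritten


# Placeholder config; the original dataset path is made up for the example.
yaml_cfg = {
    "dataset": {"_component_": "my_pkg.datasets.OldDataset", "train_on_input": True},
    "batch_size": 2,
}

rewritten = rewrite_component_overrides(
    yaml_cfg,
    [
        "dataset=torchtune.datasets.SlimOrcaDataset",
        "dataset.train_on_input=False",
        "batch_size=4",
    ],
)
```

Only the bare dataset=... override is rewritten; the dotted field override and the flat parameter pass through unchanged.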

Changelog

Update parsing to recognize _component_ and enable component overrides by adding a merge utility, merge_yaml_and_cli_args
Remove the --override flag by popular demand
Update tutorials
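With the --override flag gone, overrides arrive as bare key=value tokens after --config. One way to collect them, sketched here with the standard library's argparse rather than TorchTune's actual parser, is parse_known_args, which returns the unrecognized arguments as a second value. The function name split_config_and_overrides and the validation rule are assumptions made for illustration.

```python
import argparse
from typing import List, Tuple


def split_config_and_overrides(argv: List[str]) -> Tuple[str, List[str]]:
    """Return the config path and the leftover key=value overrides.

    Hypothetical helper: rejects leftovers that are flags or lack an '='.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", required=True)
    # Anything argparse does not recognize ends up in `leftover`.
    known, leftover = parser.parse_known_args(argv)
    bad = [arg for arg in leftover if arg.startswith("-") or "=" not in arg]
    if bad:
        raise ValueError(f"Expected key=value overrides, got: {bad}")
    return known.config, leftover


config_path, overrides = split_config_and_overrides(
    ["--config", "test.yaml", "b.c=4", "d=6"]
)
```

In this sketch a stray --override would land in the leftovers and be rejected, which makes stale invocations fail loudly instead of being silently ignored.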

Test plan

Added unit test and ran pytest tests
tune --nnodes 1 --nproc_per_node 1 full_finetune --config alpaca_llama2_full_finetune dataset=torchtune.datasets.SlimOrcaDataset dataset.train_on_input=False

Running recipe_main with parameters {'tokenizer': {'_component_': 'torchtune.models.llama2.llama2_tokenizer', 'path': '/tmp/llama2/tokenizer.model'}, 'dataset': {'_component_': 'torchtune.datasets.SlimOrcaDataset', 'train_on_input': False}, 'seed': None, 'shuffle': True, 'model': {'_component_': 'torchtune.models.llama2.llama2_7b'}, 'model_checkpoint': '/tmp/llama2_native', 'batch_size': 2, 'epochs': 3, 'optimizer': {'_component_': 'torch.optim.SGD', 'lr': 2e-05}, 'loss': {'_component_': 'torch.nn.CrossEntropyLoss'}, 'max_steps_per_epoch': None, 'gradient_accumulation_steps': 1, 'log_every_n_steps': None, 'run_generation': None, 'resume_from_checkpoint': False, 'device': 'cuda', 'dtype': 'fp32', 'enable_fsdp': True, 'enable_activation_checkpointing': True, 'cpu_offload': False, 'metric_logger': {'_component_': 'torchtune.utils.metric_logging.DiskLogger', 'log_dir': '${output_dir}'}, 'output_dir': '/tmp/alpaca-llama2-finetune'}

netlify · 2024-03-06T05:34:03Z

✅ Deploy Preview for torchtune-preview ready!

Name	Link
🔨 Latest commit	`91fec7a`
🔍 Latest deploy log	https://app.netlify.com/sites/torchtune-preview/deploys/65f3579c407d7100087ebe1d
😎 Deploy Preview	https://deploy-preview-456--torchtune-preview.netlify.app

ebsmothers · 2024-03-07T05:05:13Z

tests/torchtune/config/test_utils.py

+        yaml_args, cli_args = parser.parse_known_args(
+            [
+                "--config",
+                "test.yaml",
+                "b.c=4",  # Test overriding a flat param in a component
+                "b=5",  # Test overriding component path
+                "b.b.c=6",  # Test nested dotpath
+                "d=6",  # Test overriding a flat param
+                "e=7",  # Test adding a new param
+            ]
+        )


One case that's not covered here is when we only override some of the fields for a given DictConfig. E.g. with _Config as you've defined it above I would like to see the case of just b=5 tested (check that the final value of b.c=3), and the case of just b.c=4 (check that the final value of b._component_=2). I think (?) these did not work before and we actually needed to override every single field, but if I understand these changes correctly that will no longer be the case. Either way, would be good to explicitly test for it.

It shouldn't require overriding every single field, but yes, it would be good to test for this explicitly.
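The two partial-override cases requested above could be checked along these lines. This is a sketch on plain dicts, not the OmegaConf-based code in the PR: apply_dotlist is a hypothetical helper, the keys a and d in base are invented, override values are kept as strings, and b=5 is assumed to have already been rewritten to b._component_=5.

```python
from copy import deepcopy
from typing import Any, Dict, List


def apply_dotlist(cfg: Dict[str, Any], dotlist: List[str]) -> Dict[str, Any]:
    """Merge dotted key=value overrides into a copy of a nested dict.

    Hypothetical helper: values stay strings, siblings are left untouched.
    """
    merged = deepcopy(cfg)
    for item in dotlist:
        dotted_key, _, value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = merged
        for part in parents:
            child = node.get(part)
            # Create intermediate mappings for keys missing from the base config.
            if not isinstance(child, dict):
                child = {}
                node[part] = child
            node = child
        node[leaf] = value
    return merged


# Mirrors the shape discussed in the review: b has _component_=2 and c=3.
base = {"a": 1, "b": {"_component_": 2, "c": 3}, "d": 4}

only_component = apply_dotlist(base, ["b._component_=5"])  # b.c must stay 3
only_field = apply_dotlist(base, ["b.c=4"])  # b._component_ must stay 2
```

Because each override touches only its own leaf, overriding one field of b leaves the other field at its YAML value.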

ebsmothers · 2024-03-07T05:10:49Z

docs/source/examples/configs.rst

+
+Overriding components
+^^^^^^^^^^^^^^^^^^^^^
+If you would like to override a parameter in the config that has a :code:`_component_`


Nitpicking

Suggested change

-If you would like to override a parameter in the config that has a :code:`_component_`
+If you would like to override a class or function in the config that is instantiated via the :code:`_component_`

ebsmothers · 2024-03-07T05:21:11Z

torchtune/config/_utils.py

+        # If a cli arg overrides a yaml arg with a _component_ field, update the
+        # key string to reflect this
+        if (
+            k in yaml_kwargs


So the assumption here is that anything in the CLI overrides was already in the YAML file, right? (I think this is fine and don't see a way to avoid it, just wanna confirm that we will not support the cases in Hydra like +my_appended_config_field=value.)

This is only for components: you cannot append a new component (well, you technically could, but it would not be pretty and you would have to use _component_ explicitly). The component has to exist in the YAML file. Appending new config values that aren't in the YAML file will still work without needing the +.

ebsmothers

Overall this looks good to me and will improve the UX a lot. Really only one main question from me on the testing: just wanna confirm that we can override individual fields of a DictConfig without overriding all of them -- I think we can but it's not immediately clear from the command in the test plan.

RdoubleA · 2024-03-08T00:41:51Z

tests/recipes/test_lora_finetune.py


        if enable_fsdp:
-            cmd.append("--enable-fsdp")
+            cmd.append("enable_fsdp=True")


There is an issue with this test: it passes on main only because the flag was not parsed correctly before this update. After the change, the test fails because we don't call init_distributed when enable_fsdp is true. Any advice on how I can quickly patch this? @rohan-varma @ebsmothers

Oops I actually missed this until now, but I think you, Rohan, and I all discovered this more or less independently. #472 should fix this

ebsmothers · 2024-03-08T20:50:21Z

One more comment here: I think #454 is gonna land before this, so please just take a pass and make sure all instances of --override are gone in the final version of your PR.

pytorch-bot · 2024-03-14T20:01:32Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/456

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 91fec7a with merge base e570803:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the CLA Signed label Mar 6, 2024

ebsmothers reviewed Mar 7, 2024

ebsmothers approved these changes Mar 7, 2024

ebsmothers mentioned this pull request Mar 7, 2024

Set pytest import mode to importlib #460

Merged

RdoubleA force-pushed the rafiayub/override_update branch from 08918f5 to 4ca0a48 Compare March 7, 2024 18:24

ebsmothers mentioned this pull request Mar 7, 2024

Separate LoRA recipe into single and multi GPU, LoRA finetune < 16GB GPU #454

Merged

RdoubleA commented Mar 8, 2024

RdoubleA force-pushed the rafiayub/override_update branch 2 times, most recently from 3fee2ce to ec9ce97 Compare March 13, 2024 19:24

RdoubleA added 11 commits March 14, 2024 12:57

add merge utility and unit test

7b991bb

fix docs

70c9887

reuse has_component

6f80047

remove extra file

f410d3d

update test

0800b5c

undo some changes

3f5cbe1

fix tests

f7b0422

rebase

27dca95

rebase again

9df1940

rebase

0e2282e

rebase

91fec7a

RdoubleA force-pushed the rafiayub/override_update branch from ec9ce97 to 91fec7a Compare March 14, 2024 20:01

RdoubleA merged commit 9c75d48 into main Mar 14, 2024
16 checks passed

RdoubleA deleted the rafiayub/override_update branch March 14, 2024 20:16
