Conversation


@hokuyama0106 hokuyama0106 commented Sep 12, 2025

What does this PR do?

Add config_init_kwargs option in GRPOConfig
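
A minimal sketch of the forwarding pattern this option enables. The `config_init_kwargs` field name comes from this PR's description; the dataclass and the `load_config` loader below are hypothetical stand-ins (not TRL's actual classes), used only to illustrate how extra keyword arguments would be passed through to the config loader:

```python
from dataclasses import dataclass, field


@dataclass
class GRPOConfig:
    # Extra keyword arguments forwarded to the config loader
    # (e.g. {"trust_remote_code": True}); field name from this PR.
    config_init_kwargs: dict = field(default_factory=dict)


def load_config(model_id: str, args: GRPOConfig) -> dict:
    # Stand-in for AutoConfig.from_pretrained(model_id, **args.config_init_kwargs):
    # it merely records what it received.
    return {"model_id": model_id, **args.config_init_kwargs}


args = GRPOConfig(config_init_kwargs={"trust_remote_code": True})
cfg = load_config("my-org/custom-model", args)
print(cfg["trust_remote_code"])  # → True
```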

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@qgallouedec
Member

Thanks for the PR, can you explain why you need this option?

@hokuyama0106
Author

hokuyama0106 commented Sep 12, 2025

Thank you for the comment.
When I use models with custom code, I hit the following error: I need to pass the trust_remote_code option to AutoConfig. The target model can be supplied as a model instance, but the reference model cannot, so there is no way to reach AutoConfig with this option.

[rank1]: During handling of the above exception, another exception occurred:
[rank1]: Traceback (most recent call last):
...
[rank1]:     trainer = GRPOTrainer(
[rank1]:               ^^^^^^^^^^^^
[rank1]:   File "/usr/local/lib/python3.12/dist-packages/trl/trainer/grpo_trainer.py", line 531, in __init__
[rank1]:     config = AutoConfig.from_pretrained(model_id)
[rank1]:              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/usr/local/lib/python3.12/dist-packages/transformers/models/auto/configuration_auto.py", line 1259, in from_pretrained
[rank1]:     trust_remote_code = resolve_trust_remote_code(
[rank1]:                         ^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/usr/local/lib/python3.12/dist-packages/transformers/dynamic_module_utils.py", line 735, in resolve_trust_remote_code
[rank1]:     raise ValueError(
[rank1]: ValueError: The repository <masked> contains custom code which must be executed to correctly load the model. You can inspect the repository content at <masked>.
[rank1]: Please pass the argument `trust_remote_code=True` to allow custom code to be run.
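
The failure above reduces to the guard transformers applies when a repository ships custom code. The function below is a simplified stand-in for that check (not the library's actual implementation), showing why forwarding `trust_remote_code=True` through something like `config_init_kwargs` resolves the error:

```python
def resolve_trust_remote_code(trust_remote_code, repo_id, has_remote_code):
    # Simplified stand-in for the check in transformers' dynamic_module_utils:
    # repos with custom code require an explicit trust_remote_code=True.
    if has_remote_code and not trust_remote_code:
        raise ValueError(
            f"The repository {repo_id} contains custom code which must be "
            "executed to correctly load the model. Please pass the argument "
            "`trust_remote_code=True` to allow custom code to be run."
        )
    return bool(trust_remote_code)


# Without the flag, loading a custom-code repo fails, mirroring the traceback:
try:
    resolve_trust_remote_code(None, "my-org/custom-model", has_remote_code=True)
except ValueError as err:
    print("raises:", err)

# With the flag forwarded, the check passes:
print(resolve_trust_remote_code(True, "my-org/custom-model", has_remote_code=True))  # → True
```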

@qgallouedec
Member

Out of curiosity, when passing the trust remote code arg, does it allow training with custom code?

@hokuyama0106
Author

Out of curiosity, when passing the trust remote code arg, does it allow training with custom code?

My current experiment is going well after downgrading transformers and modifying the code in my model.

@qgallouedec
Member

why do you need to downgrade transformers?

@hokuyama0106
Author

hokuyama0106 commented Sep 14, 2025

The custom model I am using does not follow the latest transformers API.
I can't share details of the model since it's confidential.
With this PR and the downgraded transformers, I can train successfully.
