
Set diff params groups #55

Open · wants to merge 8 commits into base: magma
Conversation

@floatingbigcat (Collaborator) commented Sep 24, 2023

Divides all parameters into 4 param groups, depending on whether they use weight_decay and whether they are finetuned or pretrained.

Adds two extra args:

  • "finetune_keywords": list of strings; a parameter is put into the finetune groups if its name contains one of these keywords. "image_prefix" is the only keyword for now.

  • "finetune_factor": float; controls the learning rate of the finetuned groups, whose real lr = pretrained_lr * finetune_factor.

This leaves enough room for further changes, such as adding more modality encoders, while avoiding too many hyperparameters for adjusting the lr of the finetune groups.
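To make the grouping concrete, here is a minimal sketch of how such a split could be implemented. Only the four-group layout, finetune_keywords, finetune_factor, and the "image_prefix" example come from this PR description; the helper name build_param_groups and the bias/norm no-decay heuristic are assumptions for illustration, not the actual implementation.

```python
def build_param_groups(model, lr, weight_decay, finetune_keywords, finetune_factor):
    """Split parameters into 4 groups: (pretrained | finetuned) x (decay | no-decay)."""
    groups = {
        "pretrained_decay": [], "pretrained_no_decay": [],
        "finetune_decay": [], "finetune_no_decay": [],
    }
    for name, p in model.named_parameters():
        if not p.requires_grad:
            continue
        # A parameter goes into the finetune groups if its name contains
        # any of the keywords, e.g. "image_prefix".
        is_finetune = any(k in name for k in finetune_keywords)
        # Assumed heuristic: biases and norm weights get no weight decay.
        no_decay = name.endswith(".bias") or "norm" in name.lower()
        prefix = "finetune" if is_finetune else "pretrained"
        groups[prefix + ("_no_decay" if no_decay else "_decay")].append(p)

    # Finetuned groups run at real_lr = pretrained_lr * finetune_factor.
    return [
        {"params": groups["pretrained_decay"], "lr": lr, "weight_decay": weight_decay},
        {"params": groups["pretrained_no_decay"], "lr": lr, "weight_decay": 0.0},
        {"params": groups["finetune_decay"], "lr": lr * finetune_factor, "weight_decay": weight_decay},
        {"params": groups["finetune_no_decay"], "lr": lr * finetune_factor, "weight_decay": 0.0},
    ]
```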

@kshitijkg (Member)
Looks good overall; I was wondering if we can make it more general?

  • Right now the finetune_factor is used for all parameters that are in the finetune_groups_key_words. Can we instead pass a list of dictionaries called finetune_group_lr_info = {key_word, finetune_factor}?

  • And right now the only option supported is real_lr = lr * finetune_factor. Can we instead have a new class called GroupedAnnealingLR that supports passing in different annealing lr parameters for each group?

So the idea is to instead do this:

finetune_group_lr_info = {key_word, annealing_lr_params}
where annealing_lr_params: {start_lr, warmup_iter, total_iters, decay_style, last_iter, min_lr=0.0}
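A possible shape for that config, as a purely illustrative Python sketch: the field names mirror the annealing parameters listed above, while the keyword and numeric values are placeholders, not values proposed in this PR.

```python
# Hypothetical config shape for the GroupedAnnealingLR proposal above.
# "image_prefix" is just an example keyword; values are placeholders.
finetune_group_lr_info = {
    "image_prefix": {
        "start_lr": 1.0e-5,
        "warmup_iter": 1000,
        "total_iters": 100_000,
        "decay_style": "cosine",
        "last_iter": 0,
        "min_lr": 0.0,
    },
    # Additional keywords (e.g. for new modality encoders) would get
    # their own annealing parameters here.
}
```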

@kshitijkg self-requested a review September 24, 2023 07:55
@kshitijkg (Member)
Could you also add a small dummy test case?

@floatingbigcat (Collaborator, Author)
> Looks good overall; I was wondering if we can make it more general?
>
>   • Right now the finetune_factor is used for all parameters that are in the finetune_groups_key_words. Can we instead pass a list of dictionaries called finetune_group_lr_info = {key_word, finetune_factor}?
>   • And right now the only option supported is real_lr = lr * finetune_factor. Can we instead have a new class called GroupedAnnealingLR that supports passing in different annealing lr parameters for each group?
>
> So the idea is to instead do this:
>
> finetune_group_lr_info = {key_word, annealing_lr_params} where annealing_lr_params: {start_lr, warmup_iter, total_iters, decay_style, last_iter, min_lr=0.0}

As we discussed, we can make another PR for further enhancements.

@floatingbigcat (Collaborator, Author) commented Sep 24, 2023

> Could you also add a small dummy test case?

Use the deep.py wrapper to run this test file. (7682ef4)
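For reference, a small dummy test along these lines might look like the sketch below. It reuses the hypothetical build_param_groups helper from the earlier sketch and is not the actual test added in 7682ef4; it only checks that every parameter lands in exactly one group and that only the pretrained and scaled finetune learning rates appear.

```python
import torch

def test_param_groups():
    # Tiny model with biases so both the decay and no-decay groups are
    # exercised; no parameter name contains "image_prefix", so the
    # finetune groups stay empty here.
    model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.LayerNorm(4))
    lr, factor = 1e-4, 0.1
    groups = build_param_groups(model, lr, weight_decay=0.01,
                                finetune_keywords=["image_prefix"],
                                finetune_factor=factor)
    # Every trainable parameter is assigned to exactly one group.
    assert sum(len(g["params"]) for g in groups) == len(list(model.parameters()))
    # Only the pretrained lr and the scaled finetune lr appear.
    assert all(g["lr"] in (lr, lr * factor) for g in groups)
```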
