Replies: 4 comments
-
I used RMSPropTF or SGD combined with clip-grad=0.002 and clip-mode=agc, and the models work rather well on my own dataset.
I believe it should also work with Adam. My guess is that you haven't set clip-grad and clip-mode.
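If you're training with timm's train.py script, those settings map onto its command-line flags. A hypothetical invocation (the dataset path and model name below are placeholders, not from this thread):

```shell
# Hypothetical example: substitute your own dataset path and model.
python train.py /path/to/dataset \
    --model dm_nfnet_f0 \
    --opt rmsproptf --lr 0.1 \
    --clip-grad 0.002 --clip-mode agc
```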
Good luck
Linh
-
@choibigo this isn't a bug, so moving to the discussion forums. I would not advise deviating far from the paper hparams, or from the rmsproptf w/ stronger clipping mentioned by @linhduongtuan that I've found to work decently (it can also work with rmsproptf or sgd/momentum + global norm clipping of 1.0). If you want to use Adam, you're going to need to do some parameter sweeps.
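For the "global norm clipping of 1.0" option, PyTorch has a built-in utility. A minimal sketch of where it fits in a training loop (the tiny model and random data here are placeholders, not nfnet specifics):

```python
import torch

# Placeholder model/data just to show where clipping goes in the loop.
model = torch.nn.Linear(4, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
x, y = torch.randn(8, 4), torch.randn(8, 2)

opt.zero_grad()
loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()
# Rescale all gradients together so their combined L2 norm is <= 1.0.
# This must happen after backward() and before step().
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
opt.step()
```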
-
And if you aren't using any grad clipping, it won't work so well. The NFRegNets train okay(ish) without grad clipping, but the NFResNets and NFNets are pretty unstable without it. Again, you'll need to do some hparam sweeps to find the best combo of learning rate, opt epsilons, and grad clipping for your task and optimizer choice.
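To make the AGC variant concrete, here is a pure-Python sketch of the core rule from the NFNets paper: unlike global-norm clipping, each parameter's gradient is clipped relative to the norm of that parameter itself, and the clip-grad=0.002 mentioned above plays the role of the clipping ratio. The function name below is illustrative, not timm's actual API.

```python
def agc_scale(weight_norm, grad_norm, clipping=0.01, eps=1e-3):
    """Return the factor to multiply a gradient by so that
    ||grad|| <= clipping * max(||weight||, eps).

    `eps` keeps near-zero-norm weights from forcing the gradient to zero.
    """
    max_norm = clipping * max(weight_norm, eps)
    if grad_norm <= max_norm:
        return 1.0  # gradient already small enough relative to the weight
    return max_norm / (grad_norm + 1e-6)

# A gradient 100x larger than the allowed ratio gets scaled down hard:
print(agc_scale(weight_norm=1.0, grad_norm=1.0, clipping=0.01))  # ~0.01
```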
-
Thank you rwightman.
I'm studying PyTorch and networks, and I don't know much about grad clipping yet.
Do you have any code I could refer to for grad clipping, or a link to a reference? It would be very helpful if you could share a reference link or some code.
-
Hi,
I'm trying to train nfnet. I used the PyTorch built-in optimizer (Adam), but the loss value increased past a billion. I want to use an optimizer from PyTorch itself. How can we train nfnet normally using the PyTorch built-in optimizers?
I also trained nfnet using PyTorch's built-in SGD. With SGD the loss values converge, but the accuracy was very low (I used CIFAR-10; test accuracy was 69%). How can we improve accuracy using the PyTorch built-in optimizers? I want to use the nfnet network.
I will wait for your reply.
Thank you