Dear authors,
Thanks for your great work!
I am currently trying to fine-tune the Gemma-7B model using PiSSA, but the initial loss and grad norm are extremely high.
This doesn't seem to be caused by the PiSSA algorithm itself, since fine-tuning Gemma-7B with LoRA shows a similar problem.
Have you encountered this issue, or do you have any ideas on how to solve it? Thanks a lot!
+1, I have also encountered the same issue when fine-tuning the Qwen-2.5-7B model. The initial loss is approximately an order of magnitude higher than that of the standard LoRA method. For instance, while the standard LoRA achieves an initial loss of around 0.6, PiSSA exhibits an initial loss of approximately 6 or higher.
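One check that may help narrow this down: PiSSA initializes the adapter from the top singular components of each weight and keeps the remainder as a frozen residual, so at step 0 the adapted layer should reproduce the base layer's output, and the initial loss should be close to the base model's loss. Below is a minimal NumPy sketch of that invariant on a random matrix. The function name `pissa_init`, the matrix shapes, and the rank are illustrative assumptions, not code from the PiSSA repository.

```python
import numpy as np

def pissa_init(W, r):
    """Split W into a rank-r adapter (A @ B) and a residual, PiSSA-style.

    Hypothetical helper for illustration only.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    sqrt_S = np.sqrt(S[:r])
    A = U[:, :r] * sqrt_S            # shape (out_features, r)
    B = sqrt_S[:, None] * Vt[:r, :]  # shape (r, in_features)
    W_res = W - A @ B                # frozen residual weight
    return A, B, W_res

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))    # stand-in for one linear layer's weight
A, B, W_res = pissa_init(W, r=8)

x = rng.standard_normal((4, 32))     # a small batch of inputs
base_out = x @ W.T                   # output of the original layer
pissa_out = x @ (W_res + A @ B).T    # output of residual + adapter at step 0

max_diff = float(np.abs(base_out - pissa_out).max())
print(f"max |base - pissa| at init: {max_diff:.2e}")
```

If the same comparison on a real layer shows a large difference, the residual and adapter may not match each other (for example, if the residual was quantized or loaded from a different checkpoint than the adapter). If the difference is negligible but the loss is still high, the cause is more likely outside the PiSSA initialization, which would be consistent with the original report that plain LoRA on Gemma-7B behaves similarly.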