Error! Train with the pretrained model #134

Open
lamnt-nd994 opened this issue May 28, 2024 · 6 comments

@lamnt-nd994

Hello, I'm a beginner. I'm getting an error when loading the pretrained weights glint360k_r18 to continue training on my new dataset. The error message is: "ValueError: Layer count mismatch when loading weights from file. Model expected 63 layers, found 62 saved layers."
Screenshot 2024-05-29 002602

Thanks

@leondgarse
Owner

Those shared weights contain the basic model only. Try building the basic_model without the header and loading the weights into it directly, instead of passing the weights file to tt:

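import models  # the project's own models module

# Build the basic model only (no header), matching the layout of the shared weights.
mm = models.buildin_models(
    "r18",  # Or "r34" / "r100"
    dropout=0,
    emb_shape=512,
    output_layer='E',
    bn_momentum=0.9,
    bn_epsilon=1e-5,
    use_bias=True,
    scale=True,
    activation='PReLU'
)
# Load the pretrained weights into the basic model.
mm.load_weights('glint360k_cosface_r18_fp16_0.1.h5')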
...
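
One way to see where a layer-count mismatch such as 63 versus 62 comes from is to compare the layer names the model expects with the names stored in the weights file. The sketch below is a minimal, library-free illustration of that comparison. The helper layer_name_diff and every layer name in it are hypothetical; in practice the two lists would have to be collected from the built model and from the weights file.

```python
def layer_name_diff(model_layers, saved_layers):
    """Report which layer names appear on only one side, preserving order."""
    model_set = set(model_layers)
    saved_set = set(saved_layers)
    return {
        "only_in_model": [name for name in model_layers if name not in saved_set],
        "only_in_file": [name for name in saved_layers if name not in model_set],
    }


# Hypothetical names: a model with a header versus a file holding only the basic model.
model_layers = ["stem_conv", "stack1_block1_conv", "embedding", "header"]
saved_layers = ["stem_conv", "stack1_block1_conv", "embedding"]

diff = layer_name_diff(model_layers, saved_layers)
print(diff["only_in_model"])  # layers the model expects but the file lacks
print(diff["only_in_file"])   # layers stored in the file but absent from the model
```

If the weights file holds only the basic model, the single extra name on the model side would be the header layer, which is consistent with the advice to load the weights into the basic model alone.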

@lamnt-nd994
Author

lamnt-nd994 commented May 29, 2024

Thank you for your response. I have successfully run it. Could you please explain why the accuracy is very low when training with lr=0.1 on the dataset at https://github.com/X-zhangyang/Asian-Face-Image-Dataset-AFD-dataset and testing on LFW, CFP_FP, and AgeDB_30?
Screenshot 2024-05-29 002602

@leondgarse
Owner

Try freezing the backbone and training only the header first:

tt.train(
    [
        {"loss": losses.ArcfaceLoss(scale=64), "epoch": 2, "bottleneckOnly": True},
        {"loss": losses.ArcfaceLoss(scale=64), "epoch": 17},
    ]
)

Also use a smaller learning rate, for example lr_base=0.025.
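
To make the two-stage schedule concrete, the sketch below expands it into a per-epoch plan. It assumes that "epoch" is the number of epochs run in each stage and that "bottleneckOnly": True means the backbone is frozen for that stage. The helper expand_schedule is hypothetical, and plain strings stand in for the loss objects.

```python
# Schedule mirroring the snippet above; strings stand in for the real loss objects.
schedule = [
    {"loss": "ArcfaceLoss(scale=64)", "epoch": 2, "bottleneckOnly": True},
    {"loss": "ArcfaceLoss(scale=64)", "epoch": 17},
]


def expand_schedule(schedule):
    """Return one (epoch_number, backbone_frozen) pair per training epoch."""
    plan = []
    for stage in schedule:
        frozen = stage.get("bottleneckOnly", False)  # assumed default: not frozen
        for _ in range(stage["epoch"]):
            plan.append((len(plan) + 1, frozen))
    return plan


plan = expand_schedule(schedule)
frozen_epochs = [number for number, frozen in plan if frozen]
print(len(plan), frozen_epochs)  # 19 [1, 2]
```

Under these assumptions the header trains alone for the first 2 epochs, and the whole model trains for the remaining 17.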

@lamnt-nd994
Author

I tried freezing the backbone and training only the header first, but the accuracy on "cfp_fp" was only 0.93, while the r18 entry under Ported Models reports 0.977143.

@leondgarse
Owner

It may just be a difference in image encoding quality when those bin files were saved; refer to "Reproduce the results" #110. While the backbone is frozen, the accuracy shouldn't change, so test the basic_model accuracy first.
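
The reasoning behind "the accuracy shouldn't change" can be illustrated with a toy example: if verification uses only the embeddings produced by the backbone, then updating the header alone leaves those embeddings untouched. The sketch below is an assumption-laden illustration, not the project's code. Small hand-written matrices stand in for the backbone and header, and all names are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


# Toy "backbone": maps a 4-dim input to a 3-dim embedding. Frozen, never modified.
backbone = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
# Toy "header": maps the embedding to 2 class logits. Trainable.
header = [
    [1.0, 2.0, 3.0],
    [0.0, 1.0, 0.0],
]
x = [0.5, -0.2, 0.1, 0.9]

emb_before = matvec(backbone, x)
logits_before = matvec(header, emb_before)

# Simulate a header-only update: every header weight changes, the backbone does not.
header = [[w + 0.1 for w in row] for row in header]

emb_after = matvec(backbone, x)
logits_after = matvec(header, emb_after)

print(emb_before == emb_after)        # embeddings are unaffected by the header update
print(logits_before != logits_after)  # only the header output has moved
```

If the measured accuracy of the frozen-backbone model differs from the published figure, this suggests looking at the evaluation data or the loaded weights rather than at the header training.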

@lamnt-nd994
Author

I got it. Thanks a lot.
