
ValueError: The hidden size is not divisble by the number of attention heads! Make sure to update them! #110

Open · cclough opened this issue Aug 25, 2024 · 0 comments · Labels: bug (Something isn't working)

cclough commented Aug 25, 2024

When I try to load the 25B-parameter model through Hugging Face, the `AutoConfig.from_pretrained` call fails with:

`ValueError: The hidden size is not divisble by the number of attention heads! Make sure to update them!`

Could this be caused by an update to Hugging Face transformers?

  File "/home/ubuntu/test-filesystem/software-christian/riboseq-ml/translation-modelling/experiments/cambray_finetune-genslm-large/../../scripts/eval.py", line 181, in <module>
    main()
  File "/home/ubuntu/test-filesystem/software-christian/riboseq-ml/translation-modelling/experiments/cambray_finetune-genslm-large/../../scripts/eval.py", line 164, in main
    model, tokenizer = train.finetune_hf_model(args.model, X_train, y_train, X_test, y_test, device, args.num_epochs, genslm_weights_dir=args.genslm_weights_dir)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/test-filesystem/software-christian/riboseq-ml/translation-modelling/experiments/cambray_finetune-genslm-large/../../src/train.py", line 98, in finetune_hf_model
    model, tokenizer = models.load_model(model_name, device, regression=True, genslm_weights_dir=genslm_weights_dir)#, max_seq_len=50)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/test-filesystem/software-christian/riboseq-ml/translation-modelling/experiments/cambray_finetune-genslm-large/../../src/models.py", line 454, in load_model
    model, tokenizer = get_model_genslm(model_name, device, genslm_weights_dir, regression=regression)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/test-filesystem/software-christian/riboseq-ml/translation-modelling/experiments/cambray_finetune-genslm-large/../../src/models.py", line 101, in get_model_genslm
    base_config = AutoConfig.from_pretrained(weights_dir + "/neox_25,076,188,032.json")
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/envs/translation/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 1000, in from_pretrained
    return config_class.from_dict(config_dict, **unused_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/envs/translation/lib/python3.11/site-packages/transformers/configuration_utils.py", line 772, in from_dict
    config = cls(**config_dict)
             ^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/miniconda3/envs/translation/lib/python3.11/site-packages/transformers/models/gpt_neox/configuration_gpt_neox.py", line 153, in __init__
    raise ValueError(
ValueError: The hidden size is not divisble by the number of attention heads! Make sure to update them!
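For anyone debugging this, a minimal sanity check I'd try (the filename comes from the traceback above; I'm assuming the GenSLM JSON uses the standard `GPTNeoXConfig` field names):

```python
import json

import transformers

print("transformers version:", transformers.__version__)

# Filename taken from the traceback above; adjust the path to your weights dir.
with open("neox_25,076,188,032.json") as f:
    cfg = json.load(f)

# Standard GPTNeoXConfig field names; assuming the GenSLM JSON uses them.
hidden_size = cfg["hidden_size"]
num_heads = cfg["num_attention_heads"]
print(f"hidden_size={hidden_size}, num_attention_heads={num_heads}, "
      f"remainder={hidden_size % num_heads}")
# A non-zero remainder means recent transformers releases raise this
# ValueError as soon as the GPTNeoXConfig object is constructed.
```

If the remainder is non-zero, either the config values need updating or the error was introduced by a newer transformers release that added this check, in which case pinning an older version may be a workaround.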
cclough added the bug label Aug 25, 2024