
Issue running BERT #19

Open
facundosuenzo opened this issue Mar 7, 2022 · 4 comments

@facundosuenzo

Hi,

I'm having issues running the RoBERTa script (for the US dataset).

I ran this line:

!python run_language_modeling.py --output_dir=output_roberta_US --model_type=roberta --model_name_or_path=roberta-base --do_train --train_data_file=us_blog_train --do_eval --eval_data_file=us_blog_test --mlm

And I got the following error; there seems to be an issue with one of the arguments:

(cut output)

Traceback (most recent call last):
  File "run_language_modeling.py", line 545, in <module>
    main()
  File "run_language_modeling.py", line 497, in main
    global_step, tr_loss = train(args, train_dataset, model, tokenizer)
  File "run_language_modeling.py", line 228, in train
    outputs = model(inputs, masked_lm_labels=labels) if args.mlm else model(inputs, labels=labels)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
TypeError: forward() got an unexpected keyword argument 'masked_lm_labels'

Then, after exercise 3, when visualizations are introduced, the function word_vector does not seem to be defined (it is used within visualise_diffs).

Thanks in advance for your help!

@JunsolKim

Hi @facundosuenzo, could you share your entire code (notebook) through GitHub or email? Also, what version of torch (run torch.__version__) and environment (e.g., Colab) are you using?

@jacyanthis
Member

jacyanthis commented Mar 8, 2022

> Hi @facundosuenzo, could you share your entire code (notebook) through GitHub or email? Also, what version of torch (run torch.__version__) and environment (e.g., Colab) are you using?

Yeah, an "unexpected keyword argument" error is usually a version issue if you're using someone else's code. These codebases change rapidly, and people deprecate and replace argument names (unfortunately) very frequently. Our aim is to make all the notebooks run on the latest stable version of each package, though we don't always succeed!
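That's what happened here: in transformers 4.x (the version Colab now pulls in), the masked_lm_labels keyword was removed and masked-LM labels are passed as labels instead. A minimal sketch of the newer calling convention (the checkpoint and example sentence here are just illustrative, not from the notebook):

import torch
from transformers import RobertaForMaskedLM, RobertaTokenizer

# Illustrative setup; any masked-LM checkpoint works the same way.
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForMaskedLM.from_pretrained("roberta-base")

inputs = tokenizer("The capital of France is <mask>.", return_tensors="pt")
labels = inputs["input_ids"].clone()  # real MLM training sets non-masked positions to -100

# transformers 4.x: pass `labels=`; `masked_lm_labels=` raises the TypeError above.
outputs = model(**inputs, labels=labels)
print(outputs.loss)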

> Then, after exercise 3, when visualizations are introduced, the function word_vector does not seem to be defined (it is used within visualise_diffs).

Whoops! That function is defined in Homework 7, but I forgot to copy it into Homework 8 during recent edits. I've added it now, and here is the code. I will test it asap.

import torch

def word_vector(text, word_id, model, tokenizer):
    # Add BERT's special tokens, tokenize, and map tokens to vocabulary ids
    marked_text = "[CLS] " + text + " [SEP]"
    tokenized_text = tokenizer.tokenize(marked_text)
    indexed_tokens = tokenizer.convert_tokens_to_ids(tokenized_text)
    tokens_tensor = torch.tensor([indexed_tokens])
    # Forward pass; the first output holds the per-token hidden states
    word_embeddings = model(tokens_tensor)[0]
    # Return the embedding of the token at position word_id as a NumPy array
    vector = word_embeddings[0][word_id].detach().numpy()
    return vector
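Here's a quick example of how it might be called (the checkpoint, sentence, and index are illustrative, not from the homework):

from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

# Token positions after [CLS]: 1="the", 2="river", 3="bank", ...
vec = word_vector("the river bank was muddy", 3, model, tokenizer)
print(vec.shape)  # (768,) for bert-base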

@facundosuenzo
Author

Thank you both @jacyanthis and @JunsolKim!

I'm using Colab with torch version 1.10.0+cu111, and I'm sending my code by email too.

So if it's deprecated, does that mean I won't be able to run BERT, or is there a workaround?

Re: word_vector, thank you!!

@jacyanthis
Member

For others reading this, the notebook should now work with the latest torch version (which Colab loads automatically), because we have split run_language_modeling.py into two files: one for GPT-2 (run_language_modeling_gpt.py) and one for RoBERTa (run_language_modeling_roberta.py).
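So the command from the original post would presumably become (same flags, new script name):

!python run_language_modeling_roberta.py --output_dir=output_roberta_US --model_type=roberta --model_name_or_path=roberta-base --do_train --train_data_file=us_blog_train --do_eval --eval_data_file=us_blog_test --mlm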
