
Aligning tokens with supersenses? #4

Open
victoryhb opened this issue Oct 16, 2020 · 3 comments

Comments

@victoryhb

Thank you very much for sharing the code for your excellent paper.
Pardon me for asking this newbie question: how do I align the tokens in the input sentence with the supersenses output by the model?
For example, the words in the sentence "I went to the store to buy some groceries." do not appear to be aligned with the following senses

['noun.person']
['verb.communication']
['verb.social']
['verb.communication']
['noun.artifact']
['noun.artifact']
['verb.communication']
['verb.cognition']
['noun.artifact']
['noun.artifact']
['adv.all']
['adv.all']

as printed using the following code:

import numpy as np

for i, id_ in enumerate(input_ids[0]):
    print(sensebert_model.tokenizer.convert_ids_to_senses([np.argmax(supersense_logits[0][i])]))

Could you please provide some example code for how to do this properly? Thanks a lot in advance!

@MeMartijn

@victoryhb This might be a long shot, but I was wondering whether you figured this out in the end. I also can't seem to figure out how to align the tokens.

@MeMartijn

@oriram Do you have any hints on how to align the predicted senses to words in sentences?

@oriram
Contributor

oriram commented Jul 22, 2021

Hi @MeMartijn,
There is no clear "alignment", as out-of-vocabulary words are split into multiple tokens (and can therefore have multiple supersenses).
However, you can do one of the following (a rough sketch of both is below):

  • Enumerate over input_ids and the predicted supersenses - this gives you the supersense for each token.
  • Change the tokenizer code so that it returns the index of the first token of each "word", and take that token's supersense as the word's.
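
For anyone looking for a concrete starting point, here is a rough (untested) sketch of both options. It assumes that input_ids and supersense_logits were obtained as in the question above, and that the tokenizer exposes a convert_ids_to_tokens helper alongside convert_ids_to_senses; if it doesn't, that name is an assumption you will need to adapt. Note that the special [CLS] and [SEP] tokens receive supersense predictions too, which is presumably why the snippet above prints more senses than the sentence has words.

import numpy as np

# Assumes input_ids and supersense_logits come from sensebert_model.tokenize()
# and sensebert_model.run(), as in the original question.
tokens = sensebert_model.tokenizer.convert_ids_to_tokens(input_ids[0])  # assumed helper
sense_ids = np.argmax(supersense_logits[0], axis=-1)
senses = sensebert_model.tokenizer.convert_ids_to_senses(list(sense_ids))

# Option 1: one supersense per token (word pieces and special tokens included)
for token, sense in zip(tokens, senses):
    print(token, sense)

# Option 2: merge BERT word pieces ("##...") back into whole words,
# keeping the supersense predicted for the first piece of each word
words, word_senses = [], []
for token, sense in zip(tokens, senses):
    if token.startswith("##") and words:
        words[-1] += token[2:]  # continuation piece: extend the current word
    else:
        words.append(token)
        word_senses.append(sense)
for word, sense in zip(words, word_senses):
    print(word, sense)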

Hope this helps,
Ori
