Problem with skip_special_token in attribute method #293

rafikg · 2024-11-08T19:38:43Z

Question

MRE:

import inseq
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer
from inseq.data.aggregator import SequenceAttributionAggregator, SubwordAggregator
import torch
import numpy as np
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
tokenizer.src_lang = 'en'
tokenizer.tgt_lang = 'fr'
model = inseq.load_model(model=model,
                         tokenizer=tokenizer,
                         attribution_method="attention")
out = model.attribute(
    input_texts="Life is like a box of chocolates.",
    generated_texts="La vie est comme une boite de chocolats.",
    generation_args={"forced_bos_token_id": tokenizer.get_lang_id("fr")},
    attribute_target=False,
    show_progress=True,
    skip_special_tokens=True,  # varied across the cases described below
)

**With skip_special_tokens=True and generated_texts provided**
Works as expected: the table contains no special tokens. However, a warning is emitted saying that skip_special_tokens is not used.

**With skip_special_tokens=True and generated_texts=None**
The table is empty.

**With skip_special_tokens=False and generated_texts=None**
Works, but the table contains special tokens.

agg = SubwordAggregator()
agg_out = agg.aggregate(attr=out.sequence_attributions[0])
agg_out.show(do_aggregation=True)
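
To make the expected result concrete, here is a minimal sketch of what "skipping special tokens" would mean for the attribution table. It does not use inseq's API: `drop_special_tokens` is a hypothetical helper, plain nested lists stand in for the attribution scores, and the token strings and special-token set are illustrative assumptions rather than output copied from the model.

```python
from typing import List, Sequence, Set, Tuple


def drop_special_tokens(
    source_tokens: Sequence[str],
    target_tokens: Sequence[str],
    scores: Sequence[Sequence[float]],
    special_tokens: Set[str],
) -> Tuple[List[str], List[str], List[List[float]]]:
    """Hypothetical helper: remove rows/columns that belong to special tokens.

    scores[i][j] is the score of source token i for target token j.
    """
    # Indices of the tokens we want to keep on each side of the table.
    keep_src = [i for i, tok in enumerate(source_tokens) if tok not in special_tokens]
    keep_tgt = [j for j, tok in enumerate(target_tokens) if tok not in special_tokens]
    # Rebuild the table using only the kept rows and columns.
    filtered = [[scores[i][j] for j in keep_tgt] for i in keep_src]
    return (
        [source_tokens[i] for i in keep_src],
        [target_tokens[j] for j in keep_tgt],
        filtered,
    )


# Toy data (assumed for illustration, not real model output).
SPECIAL = {"</s>", "__en__", "__fr__"}
src = ["__en__", "▁Life", "▁is", "</s>"]
tgt = ["__fr__", "▁La", "▁vie", "</s>"]
scores = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
    [1.3, 1.4, 1.5, 1.6],
]

kept_src, kept_tgt, kept_scores = drop_special_tokens(src, tgt, scores, SPECIAL)
```

With this toy input, only the rows and columns for the ordinary subword tokens remain, which is the table I would expect to see in the second case above instead of an empty one.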

I've searched the project's issues.

rafikg added the question Further information is requested label Nov 8, 2024
