question about multi content reference #88

256785 · 2024-08-01T09:51:25Z

When training the generation model, each training example contains one question, zero or one retrieved passage, an answer, and some special tokens.
During inference, if a question needs information from several passages, how are the answers aggregated?

256785 · 2024-08-01T09:56:31Z

For example, suppose the question is about the UK. Passage one is about UK food, and passage two is about UK history. During Self-RAG inference, I would get only one answer, about either UK food or UK history, but not one that combines information from both passages. Is that right?

fate-ubw · 2024-08-25T00:01:14Z

During Self-RAG inference, the first stage generates top-k candidate answers, and Self-RAG then ranks all candidates by a score computed from the special tokens. As a result, Self-RAG outputs only one final answer. The situation is different in long-form inference because of the beam search mechanism.
I have rewritten the Self-RAG algorithm with clearer and more concise code, which has been integrated into the RAGLAB library. For your question, you can refer to this part of the code: https://github.com/fate-ubw/RAGLAB/blob/main/raglab/rag/infer_alg/self_rag_reproduction/selfrag_reproduction.py#L137
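
To illustrate the ranking step described above, here is a minimal sketch and not the actual RAGLAB or Self-RAG code. It assumes one candidate answer per retrieved passage and assumes each candidate already carries scores derived from the special-token probabilities. The `Candidate` class, its field names, the weights, and the numbers are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    passage: str
    answer: str
    # Hypothetical scores, assumed to come from special-token probabilities.
    relevance: float
    support: float
    utility: float


def score(c: Candidate, w_rel: float = 1.0, w_sup: float = 1.0, w_use: float = 0.5) -> float:
    """Weighted sum of the per-candidate scores (weights are assumptions)."""
    return w_rel * c.relevance + w_sup * c.support + w_use * c.utility


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Return candidates sorted from highest to lowest score."""
    return sorted(candidates, key=score, reverse=True)


# Made-up example matching the UK question in this thread.
candidates = [
    Candidate("Passage about UK food", "Answer about UK food", 0.9, 0.8, 0.6),
    Candidate("Passage about UK history", "Answer about UK history", 0.6, 0.7, 0.8),
]

ranked = rank(candidates)
best = ranked[0]  # only the top-ranked candidate is returned as the final answer
print(best.answer)
```

In this sketch the answers from the two passages are never merged: the lower-ranked candidate is simply discarded, which matches the single-answer behavior described above.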
