question about multi content reference #88

Open
256785 opened this issue Aug 1, 2024 · 2 comments

Comments

256785 commented Aug 1, 2024

When training the generator model, each training example contains one question, zero or one retrieved passage, the answer, and some special tokens.
During inference, if a question needs information from several passages, how are the answers aggregated?

256785 commented Aug 1, 2024

For example, suppose the question is about the UK. Passage one is about UK food and passage two is about UK history. During Self-RAG inference, would I get only one answer, about either UK food or UK history, rather than an answer that combines both passages?

@fate-ubw

During Self-RAG inference, the first stage generates top-k candidate answers. Self-RAG then ranks all candidate answers by a score calculated from the special tokens. As a result, Self-RAG outputs only one final answer. The situation is different in long-form inference because of the beam search mechanism.
I have rewritten the Self-RAG algorithm with clearer and more concise code, which has been integrated into the RAGLAB library. For your question, you can refer to this part of the code: https://github.com/fate-ubw/RAGLAB/blob/main/raglab/rag/infer_alg/self_rag_reproduction/selfrag_reproduction.py#L137
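
To illustrate the ranking step described above, here is a minimal sketch in Python. It is not the RAGLAB or Self-RAG implementation: the `Candidate` class, the `score` and `pick_best` functions, the three score fields, the weights, and all numbers are hypothetical and chosen only for this example. It assumes one candidate answer per retrieved passage and a score formed as a weighted sum of values derived from the special tokens.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    passage: str      # retrieved passage the answer was conditioned on
    answer: str       # answer generated from that passage
    relevance: float  # assumed score from a "passage is relevant" special token
    support: float    # assumed score from an "answer is supported" special token
    utility: float    # assumed usefulness score, scaled to [0, 1]


def score(c: Candidate, w_rel: float = 1.0, w_sup: float = 1.0,
          w_use: float = 0.5) -> float:
    # Weighted sum of the special-token scores (weights are illustrative).
    return w_rel * c.relevance + w_sup * c.support + w_use * c.utility


def pick_best(candidates: list[Candidate]) -> Candidate:
    # Rank all candidates and keep only the single highest-scoring one.
    return max(candidates, key=score)


# Made-up candidates mirroring the UK example from the question.
candidates = [
    Candidate("UK food passage", "Answer about UK food", 0.9, 0.8, 0.7),
    Candidate("UK history passage", "Answer about UK history", 0.8, 0.6, 0.9),
]

best = pick_best(candidates)
print(best.answer)
```

With these made-up numbers the food candidate wins, and the history candidate is dropped rather than merged. That is why short-form inference returns a single answer instead of an aggregate over passages.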
