
FactScore Inference Fails with KeyError: 'original_splitted_sentences' #79

hideaki-j opened this issue May 24, 2024 · 2 comments

hideaki-j commented May 24, 2024

Hello, thanks for your amazing work!

I would like to ask about a KeyError: 'original_splitted_sentences' error that I encountered while trying to generate results for FactScore.

Error

When I run run_long_form_static.py for FactScore following the command shown in "Run inference using pre-retrieved passages" in README.md, I encounter:

KeyError: 'original_splitted_sentences'

The error originates from the following line:

"cat": item["cat"], "intermediate": intermediate["original_splitted_sentences"][0]})

This error appears to be the same as the one in issue #76. However, since that issue was retracted, I am reposting it here.

Culprit?

The error occurs when do_retrieve == False, and the culprit seems to be:

    if do_retrieve is False:
        ...
        prediction_tree = {}
        return preds[0], prediction_tree

here, since it always returns prediction_tree = {}, which results in the KeyError when intermediate["original_splitted_sentences"] is later accessed.
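
To illustrate, here is a minimal, self-contained sketch of why the access fails on the no-retrieval path and a possible guard (the variable names around the quoted line are placeholders I am assuming, not the repository's exact code):

    # Minimal illustration of the failure and a possible guard.
    # On the do_retrieve == False path an empty prediction tree is returned,
    # so indexing it directly raises the KeyError shown above.
    intermediate = {}                      # what the no-retrieval branch returns
    item = {"cat": ["example-category"]}   # placeholder item; field names assumed
    pred = "some generated answer"

    # intermediate["original_splitted_sentences"][0]  # -> KeyError: 'original_splitted_sentences'

    # Possible workaround: fall back to treating the whole prediction as one sentence.
    splitted = intermediate.get("original_splitted_sentences", [[pred]])
    record = {"cat": item["cat"], "intermediate": splitted[0]}
    print(record)  # {'cat': ['example-category'], 'intermediate': ['some generated answer']}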

Another issue: retrieval never happens

Upon investigating this error, I also found that no retrieval occurs unless --mode always_retrieve is used (i.e., do_retrieve is always False even with adaptive_retrieval or default). Therefore, when I run run_long_form_static.py with the flags specified in README.md, it always takes the if do_retrieve is False path, causing the above error.

Adding the --mode always_retrieve flag resolves the error, but I'm not sure whether it was accidentally omitted from the command in the instructions.

Also, I am not sure whether do_retrieve always being False is the expected behaviour here; it seems not to be.
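
For context on why this does not look intentional: based on my reading of the Self-RAG paper, adaptive retrieval should trigger retrieval whenever the [Retrieval] token's probability passes a threshold, so do_retrieve should sometimes be True. The sketch below is only an illustration of that expected decision (the function name, threshold value, and token-probability structure are my assumptions, not the repository's actual code):

    import math

    def decide_retrieval(token_logprobs, threshold=0.2):
        """Illustrative adaptive-retrieval decision (assumed, not the repo's code):
        retrieve when the [Retrieval] token's probability, normalised against
        [No Retrieval], exceeds the threshold."""
        p_ret = math.exp(token_logprobs.get("[Retrieval]", float("-inf")))
        p_no = math.exp(token_logprobs.get("[No Retrieval]", float("-inf")))
        total = p_ret + p_no
        if total == 0.0:
            return False
        return (p_ret / total) > threshold

    # With these made-up log-probabilities the check fires, so do_retrieve
    # would be True here rather than unconditionally False.
    print(decide_retrieval({"[Retrieval]": -0.3, "[No Retrieval]": -1.5}))  # True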

Questions

Q1. Is the --mode always_retrieve flag missing from the command instructions for FactScore, or is the command correct and does the cause of the error lie elsewhere?

Q2. With mode == "adaptive_retrieval" and mode == "default", execution appears to always take the do_retrieve == False path; is this the expected behavior?

Thanks!

@aiden-leong

> Adding the --mode always_retrieve flag resolves the error, but I'm not sure whether it was accidentally omitted from the command in the instructions.

I believe this is the case. 🍻

@fate-ubw

Answer 1:
I also encountered numerous difficulties when evaluating FactScore. The Self-RAG repository only provides scripts for the always-retrieve mode. I evaluated always retrieval, adaptive retrieval, and no retrieval myself, and found that adaptive retrieval and no retrieval produce the same errors you describe. I spent some time resolving this issue; you can refer to my FactScore evaluation scripts:
https://github.com/fate-ubw/RAGLAB/blob/main/run/Factscore/2-eval_fact-raglab-selfrag-selfrag_8B-adaptive_retrieval-GPT.sh
Answer 2:
I ran into the same problem as you: the long-form logic diverges from the Self-RAG paper, and the code around the construction of prediction_tree = {} is difficult to understand. I have rewritten the three Self-RAG long-form modes (always retrieval, adaptive retrieval, no retrieval) in a clearer way, which you can refer to for understanding the Self-RAG long-form reasoning process:
https://github.com/fate-ubw/RAGLAB/blob/main/raglab/rag/infer_alg/self_rag_reproduction/selfrag_reproduction.py#L216
