Possible Bug in calculate_score.py, empty responses or extractions results in non-empty normalized extraction due to get_most_similar #19

Open
mattmazzola opened this issue Feb 14, 2024 · 0 comments · May be fixed by #20

@mattmazzola
Contributor

I was debugging an issue where our model output empty responses for every question, and noticed the accuracy score was still 22% when I expected 0%.

I debugged further and found that for multi_choice questions there is a code path that computes the Levenshtein distance without guarding against empty inputs, so it returns a valid choice regardless.
(Likely the choice with the fewest characters, since the edit distance from an empty string equals the length of the choice, or the first such choice when several tie.)

if extraction in options:
    # convert option letter to text, e.g. "A" -> "text"
    ind = options.index(extraction)
    extraction = choices[ind]
else:
    # select the most similar option
    extraction = get_most_similar(extraction, choices)

def get_most_similar(prediction, choices):
    """
    Use the Levenshtein distance (or edit distance) to determine which of the choices is most similar to the given prediction
    """
    distances = [distance(prediction, choice) for choice in choices]
    ind = distances.index(min(distances))
    return choices[ind]
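
To make the failure mode concrete, here is a minimal sketch that reproduces it and shows one possible guard. It is not the code from calculate_score.py: `edit_distance` is a plain-Python stand-in for the `distance` call above, and `get_most_similar_guarded` and the sample `choices` are hypothetical names used only for illustration.

```python
def edit_distance(a, b):
    """Dynamic-programming Levenshtein distance (stand-in for the library's distance())."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def get_most_similar(prediction, choices):
    """Same logic as the function quoted above, using the stand-in distance."""
    distances = [edit_distance(prediction, choice) for choice in choices]
    return choices[distances.index(min(distances))]


def get_most_similar_guarded(prediction, choices):
    """Hypothetical variant: refuse to pick a choice for an empty prediction."""
    if not prediction or not prediction.strip():
        return None
    return get_most_similar(prediction, choices)


choices = ["triangle", "square", "circle"]  # made-up options
unguarded = get_most_similar("", choices)        # picks the first shortest choice
guarded = get_most_similar_guarded("", choices)  # returns None instead
print(unguarded, guarded)
```

With an empty prediction, every distance equals the length of the choice, so the unguarded version always returns some option and can be scored as correct by chance. Returning None is one way to let the caller count the answer as wrong.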

I also saw questionable exception handling when coercing the input value to a string.
It assigns an empty string and continues. I think it should exit early and return None.
Assigning an empty string could further contribute to the issue above for multiple-choice problems where the extraction is not a string.

if isinstance(extraction, str):
    extraction = extraction.strip()
else:
    try:
        extraction = str(extraction)
    except:
        extraction = ""
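
A rough sketch of the early-exit behavior suggested above, assuming the caller treats None as "no usable extraction". The helper name `coerce_extraction` and the `Unprintable` class are hypothetical and exist only to exercise the failure path; they are not part of the repository.

```python
def coerce_extraction(extraction):
    """Hypothetical helper: return the extraction as a string, or None if it cannot be coerced."""
    if isinstance(extraction, str):
        return extraction.strip()
    try:
        return str(extraction)
    except Exception:
        # Signal failure explicitly instead of substituting an empty string,
        # so later steps cannot mistake it for a real (empty) answer.
        return None


class Unprintable:
    """Made-up object whose __str__ raises, to trigger the except branch."""
    def __str__(self):
        raise ValueError("cannot stringify")


print(coerce_extraction("  B "))
print(coerce_extraction(3.5))
print(coerce_extraction(Unprintable()))
```

Catching `Exception` rather than using a bare `except:` also avoids swallowing `KeyboardInterrupt` and `SystemExit`.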

Video Demonstration

https://youtu.be/vj07WRvcLDw

mattmazzola added a commit to mattmazzola/MathVista that referenced this issue Feb 15, 2024
@mattmazzola mattmazzola linked a pull request Feb 15, 2024 that will close this issue