Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix SingleStream and MultiStream reciprocal score comparison for TEST04 #1491

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

sahelib25
Copy link
Contributor

@sahelib25 sahelib25 commented Sep 11, 2023

Fixes #1433

@sahelib25 sahelib25 requested a review from a team as a code owner September 11, 2023 11:26
@github-actions
Copy link

github-actions bot commented Sep 11, 2023

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@sahelib25 sahelib25 changed the title Fix SingleStream and MultiStream reciprocal score comparison for TEST04, Fixes #1433 Fix SingleStream and MultiStream reciprocal score comparison for TEST04 Sep 11, 2023
@@ -52,13 +52,11 @@ def main():
if ref_mode == "SingleStream":
if re.match(".*Early stopping 90th percentile estimate", line):
ref_score = line.split(": ",1)[1].strip()
ref_score = 1e9 / float(ref_score)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change can potentially break the below check as the check assumes throughput metric for all the scenarios.
https://github.com/mlcommons/inference/pull/1491/files#diff-5c101fd75c9062a7dec72722d5f4aafe66e7d55fbc32f97378b74c08034c272cR141

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially, but why a similar approach (of not using reciprocals) does not cause any issue for TEST01 and TEST05?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TEST01 is doing comparison like this - both upper and lower bounds are checked and so we don't need reciprocal but the check is stricter than required.

TEST05 also had a similar check but was removed in a PR before 3.1 and so is currently broken for the offline scenario. This PR should fix that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the TEST05-related PR has been merged, should we do TEST04 in the same way?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks fine to me. But I don't know the full reasoning behind taking this reciprocal. @nv-ananjappa any suggestions here?

@psyhtest
Copy link
Contributor

Ping.

@arjunsuresh
Copy link
Contributor

@psyhtest can you please fix the merge conflict? Since there's no opposition we can discuss and merge this in next WG meeting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TEST04 takes reciprocal of scores for SingleStream/MultiStream
4 participants