Make the T5 model tests use cosine similarity #895

sogartar · 2025-02-01T01:19:05Z

Several tests were marked xfail because they used a poor metric. Cosine similarity is a better metric for language embeddings.

The comparison between bf16 and f32 exhibits a small fraction of outlier tokens whose per-token numerical error is higher than that of the majority of tokens. To account for that, the testing metric is expanded to check separate absolute tolerances for inliers and outliers.
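As a rough illustration of the idea described above, here is a minimal pure-Python sketch of a per-token cosine-similarity check with separate inlier and outlier tolerances. It is not the helper used in `t5_test.py`; the function names, the `max_outlier_fraction` parameter, and the threshold values in the usage note are all hypothetical and chosen only for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def check_per_token_cosine(
    actual, expected, inlier_atol, outlier_atol, max_outlier_fraction
):
    """Hypothetical check: compare per-token embeddings by cosine similarity.

    `actual` and `expected` are sequences of per-token embedding vectors.
    The per-token error is |1 - cosine_similarity|. The check passes when
      * every token's error is within `outlier_atol`, and
      * at most `max_outlier_fraction` of tokens exceed `inlier_atol`.
    """
    errors = [
        abs(1.0 - cosine_similarity(a, e)) for a, e in zip(actual, expected)
    ]
    # Tokens whose error exceeds the tight (inlier) tolerance.
    outliers = [err for err in errors if err > inlier_atol]
    # Even outliers must stay within the loose (outlier) tolerance.
    if any(err > outlier_atol for err in outliers):
        return False
    return len(outliers) <= max_outlier_fraction * len(errors)
```

With made-up data, a call such as `check_per_token_cosine(actual, expected, inlier_atol=1e-5, outlier_atol=1e-2, max_outlier_fraction=0.25)` tolerates up to a quarter of the tokens falling outside the tight tolerance, provided none of them exceeds the loose one. A single global tolerance would have to be loosened for all tokens just to accommodate the few outliers.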

sharktank/tests/models/t5/t5_test.py

ScottTodd · 2025-02-07T00:46:36Z

https://github.com/nod-ai/shark-ai/actions/runs/13164598080/job/36809434362?pr=895 stuck? I'm going to cancel it @sogartar

sogartar marked this pull request as ready for review February 3, 2025 14:44

sogartar requested review from dan-garvey, rsuderman, archana-ramalingam and KyleHerndon February 3, 2025 14:45

archana-ramalingam reviewed Feb 3, 2025

sharktank/tests/models/t5/t5_test.py (outdated, resolved)

archana-ramalingam reviewed Feb 3, 2025

sharktank/tests/models/t5/t5_test.py (outdated, resolved)

sogartar force-pushed the t5-fix-xfails branch from 7e6b46a to 46f76df Compare February 5, 2025 16:52

sogartar requested a review from archana-ramalingam February 5, 2025 16:52

sogartar added 4 commits February 10, 2025 16:56

Paraphrase some test comments

af36007

Fix wrong atol from 1e-10 to 1e-1 and mark HF test as skipped

f9c1d9a

Really skip the test

7366cc2

sogartar force-pushed the t5-fix-xfails branch from ea9358a to 7366cc2 Compare February 10, 2025 16:57

archana-ramalingam approved these changes Feb 10, 2025

sogartar merged commit 7a8f360 into nod-ai:main Feb 10, 2025
34 checks passed
