[sharktank] Evaluation - Add Perplexity test (#233)
Add Perplexity test for LLM evaluation
1 parent 1430182, commit e30d0af
Showing 9 changed files with 1,279 additions and 13 deletions.
@@ -0,0 +1,60 @@
name: Evaluation Tests

on:
  workflow_dispatch:
  schedule:
    # Weekdays nightly at 07:00 UTC = 23:00 PST / 00:00 PDT.
    - cron: "0 7 * * 1-5"

concurrency:
  # A PR number if a pull request and otherwise the commit hash. This cancels
  # queued and in-progress runs for the same PR (presubmit) or commit
  # (postsubmit). The workflow name is prepended to avoid conflicts between
  # different workflows.
  group: ${{ github.workflow }}-${{ github.event.number || github.sha }}
  cancel-in-progress: true

jobs:
  test_perplexity:
    name: "Evaluation Tests - perplexity"
    strategy:
      matrix:
        version: [3.11]
        os: [ubuntu-latest, windows-latest]
      fail-fast: false
    runs-on: ${{matrix.os}}
    defaults:
      run:
        shell: bash
    env:
      PIP_CACHE_DIR: "${{ github.workspace }}/.pip-cache"
    steps:
      - name: "Setting up Python"
        id: setup_python
        uses: actions/setup-python@v3
        with:
          python-version: ${{matrix.version}}

      - name: "Checkout Code"
        uses: actions/checkout@v3

      - name: Cache Pip Packages
        uses: actions/cache@v4
        id: cache-pip
        with:
          path: ${{ env.PIP_CACHE_DIR }}
          key: pip-${{ steps.setup_python.outputs.python-version }}-${{ hashFiles('*requirements.txt') }}

      - name: Install pip deps
        run: |
          python -m pip install --no-compile --upgrade pip
          # Note: We install in three steps in order to satisfy requirements
          # from non default locations first. Installing the PyTorch CPU
          # wheels saves multiple minutes and a lot of bandwidth on runner setup.
          pip install --no-compile -r pytorch-cpu-requirements.txt
          pip install --no-compile -f https://iree.dev/pip-release-links.html --src deps \
            -e "git+https://github.com/iree-org/iree-turbine.git#egg=iree-turbine"
          pip install --no-compile -r requirements.txt -e sharktank/ shortfin/
      - name: Run perplexity test
        run: pytest sharktank/tests/evaluate/perplexity_test.py
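The schedule comment in the workflow above equates 07:00 UTC with 23:00 PST and 00:00 PDT. The following is a small sketch that checks that arithmetic with fixed UTC offsets. The helper name `local_fire_time` and the example date are illustrative assumptions, not part of the commit.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for Pacific time; the real zone switches between them seasonally.
PST = timezone(timedelta(hours=-8), "PST")
PDT = timezone(timedelta(hours=-7), "PDT")


def local_fire_time(utc_hour: int, tz: timezone, day: datetime) -> datetime:
    """Return the local wall-clock time for a job that fires at utc_hour:00 UTC."""
    fired = day.replace(
        hour=utc_hour, minute=0, second=0, microsecond=0, tzinfo=timezone.utc
    )
    return fired.astimezone(tz)


# 2024-01-08 is a Monday; the date is an arbitrary example.
monday_utc = datetime(2024, 1, 8)
winter = local_fire_time(7, PST, monday_utc)
summer = local_fire_time(7, PDT, monday_utc)

print(winter.isoformat())  # 2024-01-07T23:00:00-08:00
print(summer.isoformat())  # 2024-01-08T00:00:00-07:00
```

Because GitHub evaluates the cron day-of-week field in UTC, a `1-5` schedule at 07:00 UTC lands on Sunday through Thursday nights when read in PST.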
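The `concurrency.group` expression in the workflow relies on `||` returning its first truthy operand. A minimal sketch of how the key would resolve, assuming the event payload carries no PR number for `schedule` and `workflow_dispatch` triggers; the function `concurrency_group` is a hypothetical stand-in for the expression, not a GitHub API.

```python
from typing import Optional


def concurrency_group(workflow: str, sha: str, pr_number: Optional[int] = None) -> str:
    """Mimic `${{ github.workflow }}-${{ github.event.number || github.sha }}`."""
    # A missing (or zero) PR number is falsy, so the commit SHA is used instead.
    suffix = str(pr_number) if pr_number else sha
    return f"{workflow}-{suffix}"


# Scheduled or manually dispatched run: no PR number, so the SHA is the suffix.
scheduled = concurrency_group("Evaluation Tests", "e30d0af")
# Hypothetical pull_request run, shown only for contrast.
presubmit = concurrency_group("Evaluation Tests", "e30d0af", pr_number=233)

print(scheduled)
print(presubmit)
```

Under that assumption, every run of this workflow is grouped by commit, so a new run on the same commit cancels an earlier one that is still queued or in progress.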
@@ -0,0 +1,12 @@
Robert Boulter is an English film, television and theatre actor.
Robert Boulter had a guest-starring role on the television series "The Bill" in 2000.
Du Fu was a prominent Chinese poet of the Tang dynasty.
Along with Li Bai (Li Po), Du Fu is frequently called the greatest of the Chinese poets.
The Ise-class battleships were a pair of dreadnought battleships built for the Imperial Japanese Navy (IJN) during World War I.
Originally intended to be repeats of the preceding Fusō class, the Ise-class battleships were redesigned before construction began. Both ships carried supplies for the survivors of the Great Kantō earthquake in 1923.
They were modernized in 1934-37 with improvements to their armour and machinery and a rebuilt superstructure in the pagoda mast style. Afterwards they played a minor role in the Second Sino-Japanese War.
Richard Gale "Dick" Rifenburg (August 21, 1926-December 5, 1994) was an American football player and a pioneering television broadcaster for the forerunner to WIVB-TV in Buffalo.
Rifenburg played college football for the University of Michigan Wolverines in 1944 and from 1946 to 1948. He was a consensus selection at end on the 1948 College Football All-America Team.
Rifenburg played professionally in the National Football League (NFL) with the Detroit Lions for one season in 1950. After retiring from football he settled in Buffalo and became a sports broadcaster.
An oxaziridine is an organic molecule that features a three-membered heterocycle containing oxygen, nitrogen, and carbon. In their largest application, oxaziridines are intermediates in the industrial production of hydrazine.
Oxaziridine derivatives are also used as specialized reagents in organic chemistry for a variety of oxidations, including alpha hydroxylation of enolates, epoxidation and aziridination of olefins, and other heteroatom transfer reactions.
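For context on what a perplexity test measures over prompts like these: perplexity is the exponential of the mean negative log-likelihood a model assigns to the tokens of a text. The sketch below shows only that standard definition. It is not the sharktank implementation, which is not visible in this excerpt, and the function name and inputs are assumptions.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Return exp of the mean negative log-likelihood (natural-log inputs)."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# A model that gives every token probability 1/4 behaves like a uniform
# choice among four options, so its perplexity is 4 (up to float rounding).
uniform_quarter = [math.log(0.25)] * 3
print(perplexity(uniform_quarter))
```

Lower values mean the model found the text more predictable. In a real evaluation the log-probabilities would come from the model's output logits for each prompt.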