Enabling multitenancy tests with FSDP #233

Draft: wants to merge 32 commits into base: main

32 commits
e773c8e
Enabling multitenacy tests with FSDP
jyothisambolu Sep 3, 2024
bd28da5
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 3, 2024
3010287
Merge branch 'main' into multitenacy_fix_fsdp
jyothisambolu Sep 6, 2024
2ddcd24
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 6, 2024
e33a415
Update hpu-tests.yml
jerome-habana Sep 6, 2024
120a456
Update hpu-tests.yml
jyothisambolu Sep 6, 2024
7554e53
Update hpu-tests.yml
jyothisambolu Sep 6, 2024
45545f8
Update hpu-tests.yml
jyothisambolu Sep 6, 2024
1f2a127
Update hpu-tests.yml
jyothisambolu Sep 6, 2024
1552da9
Update test_fsdp.py
jyothisambolu Sep 6, 2024
01dcc2a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 6, 2024
fc57273
Updating standalone tests execution
jyothisambolu Sep 6, 2024
81349cc
Update hpu-tests.yml remove extra space
jyothisambolu Sep 6, 2024
5ea38d9
Update hpu-tests.yml
jyothisambolu Sep 6, 2024
97896ae
Update hpu-tests.yml with debug flags
jyothisambolu Sep 6, 2024
3c3f5a3
Update hpu-tests.yml moving fsdp
jyothisambolu Sep 6, 2024
576f24d
Update test_fsdp.py with failed test
jyothisambolu Sep 6, 2024
4b819af
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 6, 2024
3b0e3c6
Update hpu-tests.yml
jyothisambolu Sep 6, 2024
0929cf7
Update run_standalone_tests.sh with file prefix
jyothisambolu Sep 6, 2024
8dd99cd
Update test_fsdp.py
jyothisambolu Sep 6, 2024
7f7163b
Update test_fsdp.py
jyothisambolu Sep 6, 2024
90ce2f2
Update run_standalone_tests.sh
jyothisambolu Sep 6, 2024
6ec67ae
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 6, 2024
6d3afc4
Update test_fsdp.py
jyothisambolu Sep 6, 2024
90905a4
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 6, 2024
513991d
Merge branch 'main' into multitenacy_fix_fsdp
jerome-habana Sep 9, 2024
8263a71
Merge branch 'main' into multitenacy_fix_fsdp
jyothisambolu Oct 11, 2024
dba6c3b
Merge branch 'main' into multitenacy_fix_fsdp
jyothisambolu Oct 17, 2024
bb8f4e8
Merge branch 'main' into multitenacy_fix_fsdp
jerome-habana Oct 17, 2024
7ecbfd8
Update hpu-tests.yml
jyothisambolu Oct 17, 2024
d89ca10
Merge branch 'main' into multitenacy_fix_fsdp
jyothisambolu Nov 5, 2024
44 changes: 22 additions & 22 deletions .azure/hpu-tests.yml
@@ -99,6 +99,13 @@ jobs:

 # todo: add sanity check that needed cards are visible and accessible

+- bash: |
+    bash tests/run_standalone_tests.sh --hpus 2 -f \
+      tests/test_pytorch/strategies/test_fsdp.py
+  env:
+    PT_HPU_LAZY_MODE: 0
+  displayName: 'FSDP PT Multi card(2) HPU test'
+
 - bash: |
     set -ex
     pip list
@@ -128,14 +135,14 @@ jobs:
     PT_HPU_LAZY_MODE: 0
   displayName: 'HPU torch compile tests'

-# - bash: |
-# python -m pytest -sv tests/test_fabric/test_fsdp.py \
-# --hpus 1 --junitxml=hpu_test_fsdp-fabric-results.xml
-# python -m pytest -sv tests/test_pytorch/strategies/test_fsdp.py \
-# --hpus 1 --junitxml=hpu_test_fsdp-results.xml
-# env:
-# PT_HPU_LAZY_MODE: 0
-# displayName: 'HPU FSDP tests'
+- bash: |
+    python -m pytest -sv tests/test_fabric/test_fsdp.py \
+      --hpus 1 --junitxml=hpu_test_fsdp-fabric-results.xml
+    python -m pytest -sv tests/test_pytorch/strategies/test_fsdp.py \
+      --hpus 1 --junitxml=hpu_test_fsdp-results.xml
+  env:
+    PT_HPU_LAZY_MODE: 0
+  displayName: 'HPU FSDP tests'

 - bash: |
     python -m pytest -sv tests/test_pytorch/strategies/test_deepspeed.py \
@@ -165,20 +172,13 @@ jobs:
       tests/test_pytorch/test_profiler.py
   displayName: 'Multi card(2) HPU test'

-# - bash: |
-# bash tests/run_standalone_tests.sh --hpus 2 -f \
-# tests/test_pytorch/test_fsdp.py
-# env:
-# PT_HPU_LAZY_MODE: 0
-# displayName: 'FSDP PT Multi card(2) HPU test'
-
-# - bash: |
-# bash tests/run_standalone_tests.sh --hpus 2 -f \
-# tests/test_fabric/test_fsdp.py
-# env:
-# PT_HPU_LAZY_MODE: 0
-# condition: or(eq(variables['HABANA_VISIBLE_MODULES'], '4,5'), eq(variables['HABANA_VISIBLE_MODULES'], '6,7'))
-# displayName: 'FSDP Fabric Multi card(2) HPU test'
+- bash: |
+    bash tests/run_standalone_tests.sh --hpus 2 -f \
+      tests/test_fabric/test_fsdp.py
+  env:
+    PT_HPU_LAZY_MODE: 0
+  condition: or(eq(variables['HABANA_VISIBLE_MODULES'], '4,5'), eq(variables['HABANA_VISIBLE_MODULES'], '6,7'))
+  displayName: 'FSDP Fabric Multi card(2) HPU test'

 - bash: pip install ".[examples]"
   condition: or(eq(variables['HABANA_VISIBLE_MODULES'], '4,5'), eq(variables['HABANA_VISIBLE_MODULES'], '6,7'))
1 change: 1 addition & 0 deletions src/lightning_habana/fabric/strategies/fsdp.py
@@ -95,6 +95,7 @@ def __init__(
     ) -> None:
         if not _LIGHTNING_GREATER_EQUAL_2_3_0:
             raise OSError("HPUFSDPStrategy requires `lightning>=2.3.0 or pytorch-lightning >= 2.3.0`.")
+
         super().__init__(
             accelerator=accelerator,
             parallel_devices=parallel_devices,
7 changes: 6 additions & 1 deletion tests/run_standalone_tests.sh
@@ -62,7 +62,12 @@ fi
 echo "Test files: $test_files"

 # Get all the tests marked with standalone marker
+file_prefix="${HABANA_VISIBLE_MODULES//,/}"
+echo "$file_prefix"
+
 TEST_FILE="standalone_tests.txt"
+TEST_FILE="$file_prefix$TEST_FILE"
+echo "Test file $TEST_FILE"
 test_command="python -um pytest ${test_files} -q --collect-only --pythonwarnings ignore -m \"${marker}\""

 if [[ -n "$filter" ]]; then
@@ -102,5 +107,5 @@ for result in "${results[@]}"; do
 done
 echo "===== STANDALONE TEST STATUS END ====="

-mv tests/**/*.xml .
+mv tests/**/**/*.xml .
 rm $TEST_FILE
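
Note on the file prefix in tests/run_standalone_tests.sh: the change above builds the collection file name from HABANA_VISIBLE_MODULES with the bash expansion `${VAR//,/}`, which deletes every comma. The following is a minimal sketch of that naming step in isolation, assuming bash. The helper name `collection_file` is hypothetical and does not exist in the script.

```shell
#!/usr/bin/env bash
# Sketch only: collection_file is a hypothetical helper used for illustration;
# run_standalone_tests.sh performs the same steps inline.
collection_file() {
  local modules="${1:-}"          # e.g. "4,5"; empty when no modules are set
  local prefix="${modules//,/}"   # remove every comma: "4,5" becomes "45"
  printf '%s\n' "${prefix}standalone_tests.txt"
}

collection_file "4,5"   # prints 45standalone_tests.txt
collection_file "6,7"   # prints 67standalone_tests.txt
collection_file ""      # prints standalone_tests.txt
```

With the two module sets named in the pipeline conditions ('4,5' and '6,7'), each job writes and removes its own file. Because the commas are dropped rather than replaced, values such as "1,23" and "12,3" would both produce the prefix "123".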