Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows E2E #689

Merged
merged 28 commits into from
Aug 29, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
28 commits
Select commit Hold shift + click to select a range
ed32041
[WIP] windows e2e
makslevental Aug 19, 2024
ce8be14
50% of the way there
makslevental Aug 20, 2024
c4a3933
works with the closed source shim
makslevental Aug 21, 2024
ce332c3
fork XRT
makslevental Aug 21, 2024
aae1aa4
install xrt_coreutil on windows right next run-module
makslevental Aug 21, 2024
85e2c79
clean up
makslevental Aug 21, 2024
26320cc
add E2E Test windows
makslevental Aug 22, 2024
7d48f79
disable some tests to check linux
makslevental Aug 22, 2024
b9bf3b3
refactor run_test and restore ci-linux
makslevental Aug 22, 2024
b4f0838
undo temp peano hack
makslevental Aug 22, 2024
a78deab
undo more hacks
makslevental Aug 22, 2024
aca8816
undo even more hacks
makslevental Aug 22, 2024
9fb6823
copy stuff in build_test_cpp.sh
makslevental Aug 22, 2024
2980893
use latest driver
makslevental Aug 22, 2024
4113c91
Update run_matmul_test.sh
makslevental Aug 23, 2024
766d429
hardcode npu1 for now (different pcie id)
makslevental Aug 23, 2024
4220c51
try disable some tests
makslevental Aug 23, 2024
22a53cd
cleanup
makslevental Aug 24, 2024
8317bf9
fix mistake (hopefully)
makslevental Aug 24, 2024
9fffbc0
incorporate comments
makslevental Aug 24, 2024
489fec7
xfail 1536 medium test on windows
makslevental Aug 24, 2024
612d5d0
remove dead code
makslevental Aug 24, 2024
f371629
Update ci-linux.yml
makslevental Aug 24, 2024
f60f9f0
address my own comments
makslevental Aug 26, 2024
6dd3e50
fix darwin cxx
makslevental Aug 26, 2024
5c7704e
rename runner
makslevental Aug 28, 2024
627f7e1
fix after rebase
makslevental Aug 29, 2024
cac317d
disable aie-rt warnings
makslevental Aug 29, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
34 changes: 12 additions & 22 deletions .github/workflows/ci-linux.yml
Original file line number Diff line number Diff line change
Expand Up @@ -70,9 +70,7 @@ jobs:
- name: Create artifacts
if: ${{ !cancelled() }}
run: |
rm -f iree-install/bin/clang*
rm -f iree-install/bin/llvm-link*
tar cf iree-dist-linux.tar -C iree-install . -C ../iree-build tools/testing/e2e/iree-e2e-matmul-test
tar cf iree-dist-linux.tar -C iree-install .

- name: Upload artifacts
uses: actions/upload-artifact@v4
Expand All @@ -95,7 +93,7 @@ jobs:
strategy:
fail-fast: false
matrix:
runs-on: [linux-phoenix-20240606, linux-phoenix-20240819]
runs-on: [linux-phoenix]
runs-on: ${{ matrix.runs-on }}
env:
XILINXD_LICENSE_FILE: /opt/xilinx/Xilinx.lic
Expand Down Expand Up @@ -152,6 +150,7 @@ jobs:
# on the guthub CI machine.
sudo prlimit -lunlimited --pid $$

source /opt/xilinx/xrt/setup.sh
bash build_tools/ci/run_matmul_test.sh \
test_matmuls \
iree-install \
Expand All @@ -163,20 +162,14 @@ jobs:
- name : Smoke E2E comparison flag test
run: |
source .venv/bin/activate
# install requirements
# TODO(newling) make requirements.txt file
pip install numpy
source /opt/xilinx/xrt/setup.sh
python3 build_tools/ci/cpu_comparison/run_test.py \
test_aie_vs_cpu \
iree-install \
$PWD/llvm-aie \
/opt/xilinx/xrt \
/opt/Xilinx/Vitis/2024.2 \
--reset_npu_between_runs=0 \
--test_set='Smoke' \
--do_not_run_aie=1 \
--verbose=0
--xrt-dir /opt/xilinx/xrt \
--test-set='Smoke' \
--do-not-run-aie

# Assert that output.log is empty (because verbose=0)
if [ -s output.log ]; then
Expand All @@ -190,23 +183,20 @@ jobs:
- name : E2E comparison of AIE to llvm-cpu
run: |
source .venv/bin/activate
# install requirements
# TODO(newling) make requirements.txt file
pip install numpy
source /opt/xilinx/xrt/setup.sh
python3 build_tools/ci/cpu_comparison/run_test.py \
test_aie_vs_cpu \
iree-install \
$PWD/iree-install \
$PWD/llvm-aie \
/opt/xilinx/xrt \
/opt/Xilinx/Vitis/2024.2
makslevental marked this conversation as resolved.
Show resolved Hide resolved
--xrt-dir /opt/xilinx/xrt \
--vitis-dir /opt/Xilinx/Vitis/2024.2 \
--reset-npu-between-runs -v

- name: Printing IR from aie2xclbin
run: |
source .venv/bin/activate
source /opt/xilinx/xrt/setup.sh
bash build_tools/ci/print_ir_aie2xclbin/print_ir_aie2xclbin.sh \
iree-install \
print_ir_aie2xclbin_results \
$PWD/llvm-aie \
/opt/xilinx/xrt \
/opt/Xilinx/Vitis/2024.2
$PWD/llvm-aie
62 changes: 58 additions & 4 deletions .github/workflows/ci-windows.yml
Original file line number Diff line number Diff line change
Expand Up @@ -51,7 +51,7 @@ jobs:
- name: Setup Cpp
uses: aminya/setup-cpp@v1
with:
compiler: msvc
compiler: llvm
vcvarsall: true
cmake: true
ninja: true
Expand Down Expand Up @@ -86,9 +86,7 @@ jobs:
- name: Create artifacts
if: ${{ !cancelled() }}
run: |
rm -f iree-install/bin/clang*
rm -f iree-install/bin/llvm-link*
tar cf iree-dist-windows.tar -C iree-install . -C ../iree-build tools/testing/e2e/iree-e2e-matmul-test.exe
tar cf iree-dist-windows.tar -C iree-install .

- name: Upload artifacts
uses: actions/upload-artifact@v4
Expand All @@ -104,3 +102,59 @@ jobs:
with:
path: ${{ env.CACHE_DIR }}
key: windows-build-test-cpp-asserts-v1-${{ github.sha }}-${{ github.event.repository.updated_at }}

test_windows:
name: E2E Test windows
runs-on: windows-phoenix
needs: build_and_ctest
strategy:
fail-fast: true
steps:
- name: "Checking out repository" # for test scripts
uses: actions/checkout@8f4b7f84864484a7bf31766abe9204da3cbe65b3 # v3.5.0
with:
submodules: false # not required for testbench

- name: Download artifacts
uses: actions/download-artifact@v4
with:
name: windows_x86_64_release_packages

- name: Extract artifact
run: |
mkdir iree-install
tar -xf iree-dist-windows.tar -C iree-install
bash build_tools/download_peano.sh

- name: Create venv and install dependencies
run: |
python -m venv .venv
source .venv/Scripts/activate
pip install -r tests/matmul/requirements.txt

- name: E2E correctness matmul test
run: |
source .venv/Scripts/activate
export XILINX_XRT=/c/Xilinx/XRT

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On windows OS, all applications are supposed to uses xrt_coreutil.dll from c:\windows\system32. But this line make the CI computer use the xrt_coreutil.dll from /c/Xilinx/XRT.
This may cause all our tests get a different test result from a system deployed to general public

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with Dezhi's statement. On Windows XILINX_XRT should not be set when running tests.

Copy link
Collaborator Author

@makslevental makslevental Nov 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Windows XILINX_XRT should not be set when running tests.

As far as I am aware, XRT is open source and thus we are free to use it according to how it best suits our needs. So we have our usecase and implementation and you have yours. If you feel so strongly, you are free to refactor your code so that it doesn't expose these env variables (and I am free to fork).

Anyway XRT is deprecated in iree-amd-aie so this entire discussion is moot.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@makslevental ,
What version of NPU KMD driver is installed on the windows computer used for CI?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Windows XILINX_XRT should not be set when running tests.

As far as I am aware, XRT is open source and thus we are free to use it according to how it best suits our needs. So we have our usecase and implementation and you have yours. If you feel so strongly, you are free to refactor your code so that it doesn't expose these env variables (and I am free to fork).

Anyway XRT is deprecated in iree-amd-aie so this entire discussion is moot.

Based on what I see, removing XILINX_XRT works perfectly fine on my local system. There is a possible wrong setting on the CI computer

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please stop commenting on this PR. It is now 3 months since it was closed. If you have some current issue then create a new issue about it.

bash build_tools/ci/run_matmul_test.sh \
/c/test_matmuls \
$PWD/iree-install \
$PWD/llvm-aie

- name : E2E comparison of AIE to llvm-cpu
run: |
source .venv/Scripts/activate
export XILINX_XRT=/c/Xilinx/XRT
python build_tools/ci/cpu_comparison/run_test.py \
/c/test_aie_vs_cpu \
$PWD/iree-install \
$PWD/llvm-aie -v

- name: Printing IR from aie2xclbin
run: |
source .venv/Scripts/activate
export XILINX_XRT=/c/Xilinx/XRT
bash build_tools/ci/print_ir_aie2xclbin/print_ir_aie2xclbin.sh \
$PWD/iree-install \
/c/print_ir_aie2xclbin_results \
$PWD/llvm-aie
3 changes: 2 additions & 1 deletion .gitmodules
Original file line number Diff line number Diff line change
@@ -1,7 +1,8 @@
[submodule "third_party/XRT"]
path = third_party/XRT
url = https://github.com/Xilinx/XRT.git
url = https://github.com/nod-ai/XRT.git
shallow = true
branch = iree-amd-aie-patches
[submodule "third_party/mlir-air"]
path = third_party/mlir-air
url = https://github.com/nod-ai/mlir-air.git
Expand Down
31 changes: 23 additions & 8 deletions build_tools/ci/build_test_cpp.sh
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
#!/bin/bash

set -eu -o errtrace
set -eux -o errtrace

this_dir="$(cd $(dirname $0) && pwd)"
repo_root="$(cd $this_dir/../.. && pwd)"
Expand Down Expand Up @@ -32,6 +32,9 @@ if [[ "$OSTYPE" == "linux-gnu"* ]]; then
export CMAKE_TOOLCHAIN_FILE="$this_dir/linux_default_toolchain.cmake"
export CC=clang
export CXX=clang++
elif [[ "$OSTYPE" == "msys"* ]]; then
export CC=clang-cl.exe
export CXX=clang-cl.exe
fi
export CCACHE_DIR="${cache_dir}/ccache"
export CCACHE_MAXSIZE="700M"
Expand All @@ -57,8 +60,6 @@ echo '{

cd $iree_dir
CMAKE_ARGS="\
-S $iree_dir \
-B $build_dir \
-GNinja \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=$install_dir \
Expand All @@ -76,14 +77,24 @@ CMAKE_ARGS="\
-DIREE_INPUT_STABLEHLO=OFF \
-DIREE_INPUT_TORCH=OFF \
-DCMAKE_OBJECT_PATH_MAX=4096 \
-DIREE_CMAKE_PLUGIN_PATHS=$PWD/../iree-amd-aie"
-DIREE_CMAKE_PLUGIN_PATHS=$repo_root"

if [[ "$OSTYPE" != "darwin"* ]]; then
CMAKE_ARGS="$CMAKE_ARGS -DIREE_EXTERNAL_HAL_DRIVERS=xrt"
cmake $CMAKE_ARGS \
-DCMAKE_EXE_LINKER_FLAGS_INIT="-fuse-ld=lld" \
-DCMAKE_SHARED_LINKER_FLAGS_INIT="-fuse-ld=lld" \
-DCMAKE_MODULE_LINKER_FLAGS_INIT="-fuse-ld=lld" \
-DCMAKE_C_COMPILER="${CC}" \
-DCMAKE_CXX_COMPILER="${CXX}" \
-DLLVM_TARGET_ARCH=X86 \
-DLLVM_TARGETS_TO_BUILD=X86 \
-DIREE_EXTERNAL_HAL_DRIVERS=xrt \
-S $iree_dir -B $build_dir
else
cmake $CMAKE_ARGS \
-S $iree_dir -B $build_dir
fi

cmake $CMAKE_ARGS

echo "Building all"
echo "------------"
cmake --build "$build_dir" -- -k 0
Expand All @@ -99,10 +110,14 @@ if [[ "$OSTYPE" == "linux-gnu"* ]]; then
ctest --test-dir "$build_dir" -R amd-aie --output-on-failure -j
elif [[ "$OSTYPE" == "darwin"* ]]; then
ctest --test-dir "$build_dir" -R amd-aie -E "pack_peel_pipeline_matmul|conv_fill_spec_pad" --output-on-failure -j --repeat until-pass:5
else
elif [[ "$OSTYPE" == "msys"* ]]; then
# hack while windows is flaky to get past failing tests
ctest --test-dir "$build_dir" -R amd-aie --output-on-failure -j --repeat until-pass:5
fi

# Show ccache stats.
ccache --show-stats

rm -f "$install_dir"/bin/clang*
rm -f "$install_dir"/bin/llvm-link*
cp "$build_dir"/tools/testing/e2e/iree-e2e-matmul-test "$install_dir"/bin
Loading
Loading