forked from NVIDIA/cutlass
Intel gpu backend gemm pipeline #89
Merged: aacostadiaz merged 36 commits into codeplaysoftware:sycl-develop from Jiaxingla:intel_gpu_backend_pipeline on Aug 2, 2024
Changes from 15 commits
Commits
07f36e5  apply patch of gemm pipeline (Jiaxingla)
d4cf3eb  fix format of copyright (Jiaxingla)
665f9be  replace the macro of cache flush and idx (Jiaxingla)
59c0ce4  auto format (Jiaxingla)
bdadf1e  auto format (Jiaxingla)
9e23cd6  fix comments about prefetch (Jiaxingla)
60adb24  fix comments of enum and sycl macro (Jiaxingla)
1e3f855  update from tensor library repo (Jiaxingla)
8e951d1  fix format (Jiaxingla)
c92adb3  rm redundancy code (Jiaxingla)
6bdda75  resolve conflict (Jiaxingla)
3496593  revert the change of nv hpp (Jiaxingla)
69d5c2a  Restore invalid changes (Jiaxingla)
962766b  refine gemm interface will codeplay epilogue (Jiaxingla)
7739df6  fix the issue of batch gemm (Jiaxingla)
5b1f514  rm epilogue and revert gemm example (Jiaxingla)
f5e23e8  only keep code changes of gemm (Jiaxingla)
1c57c36  comments clean (Jiaxingla)
13ae1a1  rebase other examples (Jiaxingla)
fdb7244  rm vnni_matrix func (Jiaxingla)
d09da29  code clean (Jiaxingla)
5a3d227  define N-major tensor (Jiaxingla)
b50574a  delete useless header (Jiaxingla)
2c6d1ba  more comments (Jiaxingla)
c97ccd8  modify comments (Jiaxingla)
ede5c03  Update pvc_gemm (Jiaxingla)
f9aae6f  Update mma_xe (Jiaxingla)
7878a7c  more comments (Jiaxingla)
4c42645  code clean (Jiaxingla)
abbbe4f  fix typo (Jiaxingla)
8e9a84f  revert the change of copy_atom (Jiaxingla)
ea30c83  rename enum of LSC_LDCC (Jiaxingla)
043fbea  fix typo (Jiaxingla)
abf38bd  scope enums (Jiaxingla)
5193329  modify commment of copy (Jiaxingla)
b854995  remove useless copy (Jiaxingla)
@@ -0,0 +1,40 @@
    script_dir=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
    cp ${script_dir}/tools/clang-format/clang-format.hook ${script_dir}/.git/hooks/pre-commit
    chmod +x ${script_dir}/.git/hooks/pre-commit

    # https://github.com/intel/llvm/releases/tag/nightly-2024-07-03
    sycl_compiler_path=/opt/cutlass/compiler/0703/

    # https://ubit-gfx.intel.com/build/19168301/artifacts
    gpu_driver_path=/opt/cutlass/gpu_driver/gfx-driver-ci-comp_igc-25012/extract/

    # AOT compile
    output=intel_gpu_pvc
    # jit compile
    #output=spir64

    unset epilogue

    # epilogue relu
    # epilogue+=" -DEPILOGUE_RELU "

    # epilogue softmax
    # epilogue+=" -DEPILOGUE_SOFTMAX "

    export ZE_AFFINITY_MASK=0
    export CPATH=$sycl_compiler_path:$sycl_compiler_path/include/:$sycl_compiler_path/include/sycl/
    export LIBRARY_PATH=$gpu_driver_path/usr/lib/x86_64-linux-gnu/:$sycl_compiler_path/lib/
    export LD_LIBRARY_PATH=$LIBRARY_PATH
    export IGC_EnableVISANoSchedule=1
    export IGC_ShaderDumpEnable=1
    export IGC_DumpToCustomDir=./mm_dumps
    export IGC_VATemp=1
    export ONEAPI_DEVICE_SELECTOR=level_zero:gpu

    target=./examples/sycl/pvc/pvc_bfloat_dpas_gemm_cute
    rm -rf *

    cmake .. -G Ninja -DCMAKE_CUDA_HOST_COMPILER=${sycl_compiler_path}/bin/clang++ \
    -DCUTLASS_ENABLE_SYCL=ON -DDPCPP_SYCL_TARGET=$output -DCMAKE_CXX_COMPILER=${sycl_compiler_path}/bin/clang++ \
    -DCMAKE_CXX_FLAGS=" -DPREFETCH_DEFAULT -DSYCL_INTEL_TARGET ${epilogue} " \
    && ninja -v $target && $target
Could you please delete this `build.sh` file?
For context, the changes on this branch are meant for the general public and should not include code or configurations specific to an internal setup or internal development.
If, for whatever reason, the example needs a particular configuration, it should be done via CMake rather than `.sh` files.
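
To illustrate the suggestion above, here is a minimal sketch of how the compile definitions that `build.sh` passes through `CMAKE_CXX_FLAGS` might instead be set in the example's own CMake code. It is not taken from the PR. The option names prefixed `PVC_GEMM_` are hypothetical. The target name `pvc_bfloat_dpas_gemm_cute` is assumed from the script's `$target` path. The macro names (`SYCL_INTEL_TARGET`, `PREFETCH_DEFAULT`, `EPILOGUE_RELU`, `EPILOGUE_SOFTMAX`) come from the script.

```cmake
# Sketch only: hypothetical options replacing the hand-edited flags in build.sh.
# The PVC_GEMM_* names are invented for illustration and do not exist in the repository.
option(PVC_GEMM_PREFETCH_DEFAULT "Define PREFETCH_DEFAULT for the PVC GEMM example" ON)
option(PVC_GEMM_EPILOGUE_RELU    "Define EPILOGUE_RELU for the PVC GEMM example"    OFF)
option(PVC_GEMM_EPILOGUE_SOFTMAX "Define EPILOGUE_SOFTMAX for the PVC GEMM example" OFF)

# Assumption: the example is built as a target with this name.
set(_pvc_gemm_target pvc_bfloat_dpas_gemm_cute)

if(TARGET ${_pvc_gemm_target})
  # build.sh always passes -DSYCL_INTEL_TARGET, so it is set unconditionally here.
  target_compile_definitions(${_pvc_gemm_target} PRIVATE SYCL_INTEL_TARGET)

  if(PVC_GEMM_PREFETCH_DEFAULT)
    target_compile_definitions(${_pvc_gemm_target} PRIVATE PREFETCH_DEFAULT)
  endif()
  if(PVC_GEMM_EPILOGUE_RELU)
    target_compile_definitions(${_pvc_gemm_target} PRIVATE EPILOGUE_RELU)
  endif()
  if(PVC_GEMM_EPILOGUE_SOFTMAX)
    target_compile_definitions(${_pvc_gemm_target} PRIVATE EPILOGUE_SOFTMAX)
  endif()
endif()
```

With something like this in place, a variant could be selected at configure time (for example `-DPVC_GEMM_EPILOGUE_RELU=ON`) without a machine-specific script. Compiler and driver locations would stay in the user's own environment rather than in the repository.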