Implement SplitK and StreamK algorithm for Intel PVC (#132)
* WIP: Introduce StreamK for PVC

* Fixed starting index calculation

* Fixed barrier count update

* Fixed compilation for normal GEMM

* Perform fixup using threadid instead of subgroup_id

* Fixed the k_idx offset for MMA atom and corrected the reduction offset calculation

* Use log2 for available_xecores

* SplitK working

* Minor cleanup
* Need to fix splitK for batch > 1

* Fixed splitK for batch > 1

* Re-enabled GEMM Universal Adapter specialization

* Update split barrier arguments

* Minor cleanup

* Changed initialization to workspace only

* Fix CI failure

* Added support for scheduling non-uniform tiles

* Only include split barrier flags for PVC

* Test

* Code cleanup

* Add separate example for StreamK

* Address feedback for split barrier

* Fix address space for atomicAdd

* Instantiate new accumulator registers per iteration

* Renamed the pipeline file

* Renamed files to xe_*

* Removed l2 workspace alignment

* Update the example to reduce caching effects

* Refactor pipeline code

* Add the option to invoke data parallel decomposition

* Fixing bugs post merge

* Address PR feedback

* Fix tile size

* Fix performance for streamk

* Match the number of workgroups to the available XeCores

* Fix performance for pvc_gemm example

* Address comments

---------

Co-authored-by: Mehdi Goli <[email protected]>
Co-authored-by: Alejandro Acosta <[email protected]>
3 people authored Nov 22, 2024
1 parent 0ca52b6 commit b4a9835
Showing 15 changed files with 2,228 additions and 37 deletions.
4 changes: 4 additions & 0 deletions cmake/FindDPCPP.cmake
@@ -57,6 +57,10 @@ if(NOT "${DPCPP_SYCL_ARCH}" STREQUAL "")
endif()
endif()

if("${DPCPP_SYCL_TARGET}" STREQUAL "intel_gpu_pvc")
list(APPEND DPCPP_FLAGS "-Xspirv-translator;--spirv-ext=+SPV_INTEL_split_barrier")
endif()

if(UNIX)
set_target_properties(DPCPP::DPCPP PROPERTIES
INTERFACE_COMPILE_OPTIONS "${DPCPP_FLAGS};${DPCPP_COMPILE_ONLY_FLAGS}"
5 changes: 5 additions & 0 deletions examples/sycl/pvc/CMakeLists.txt
@@ -41,3 +41,8 @@ cutlass_example_add_executable(
pvc_collective_builder
pvc_collective_builder.cpp
)

cutlass_example_add_executable(
pvc_gemm_streamk
pvc_gemm_streamk.cpp
)
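
For readers unfamiliar with the StreamK idea named in the commit title, the following is a minimal host-side sketch of the general technique, not the code added by this commit. It assumes the usual Stream-K formulation: the K-loop iterations of all output tiles are laid end to end and divided as evenly as possible among a fixed number of workgroups (for example, one per available XeCore), so a workgroup's share may start or end in the middle of a tile. The names `IterRange`, `TileSegment`, `streamk_partition`, and `split_by_tile` are hypothetical and do not appear in the repository.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper types for illustration only; not part of CUTLASS.
struct IterRange {
  std::int64_t begin;  // first global K-iteration owned by the workgroup
  std::int64_t end;    // one past the last global K-iteration
};

struct TileSegment {
  std::int64_t tile;     // output tile index
  std::int64_t k_begin;  // first K-iteration within that tile
  std::int64_t k_end;    // one past the last K-iteration within that tile
};

// Divide num_tiles * k_iters_per_tile iterations into num_workgroups
// contiguous ranges whose lengths differ by at most one iteration.
std::vector<IterRange> streamk_partition(std::int64_t num_tiles,
                                         std::int64_t k_iters_per_tile,
                                         std::int64_t num_workgroups) {
  assert(num_workgroups > 0);
  const std::int64_t total = num_tiles * k_iters_per_tile;
  const std::int64_t base = total / num_workgroups;
  const std::int64_t extra = total % num_workgroups;

  std::vector<IterRange> ranges;
  ranges.reserve(static_cast<std::size_t>(num_workgroups));
  std::int64_t begin = 0;
  for (std::int64_t wg = 0; wg < num_workgroups; ++wg) {
    // The first `extra` workgroups take one additional iteration.
    const std::int64_t len = base + (wg < extra ? 1 : 0);
    ranges.push_back({begin, begin + len});
    begin += len;
  }
  return ranges;
}

// Break one workgroup's range into per-tile segments. A segment that does
// not cover a whole tile is a partial result that needs a later reduction
// (the "fixup" step) with the other workgroups sharing that tile.
std::vector<TileSegment> split_by_tile(IterRange range,
                                       std::int64_t k_iters_per_tile) {
  assert(k_iters_per_tile > 0);
  std::vector<TileSegment> segments;
  for (std::int64_t it = range.begin; it < range.end;) {
    const std::int64_t tile = it / k_iters_per_tile;
    const std::int64_t k_begin = it % k_iters_per_tile;
    const std::int64_t k_end =
        std::min(k_iters_per_tile, k_begin + (range.end - it));
    segments.push_back({tile, k_begin, k_end});
    it += k_end - k_begin;
  }
  return segments;
}
```

With 3 tiles of 4 K-iterations each and 2 workgroups, each workgroup receives 6 iterations: the first covers all of tile 0 and the first half of tile 1, the second covers the rest of tile 1 and all of tile 2, so only tile 1 needs a cross-workgroup reduction.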