[SYCLomatic] Block Store headers core #1819

abhilash1910 · 2024-03-25T07:31:15Z

PR adding the Store header functions for the Block API (related to #1305).
Linked with Load: #1640
cc @yihanwg @danhoeflinger @mmichel11

clang/runtime/dpct-rt/include/dpct/group_utils.hpp

abhilash1910 · 2024-05-30T06:59:52Z

@danhoeflinger @mmichel11 requesting review when available. Thanks.

yihanwg · 2024-05-30T08:53:12Z

clang/runtime/dpct-rt/include/dpct/group_utils.hpp

+
+};
+
+/// Stores a blocked arrangement of work items linear segment of items.


I'd like to have more detailed comments, like:

SYCLomatic/clang/runtime/dpct-rt/include/dpct/sparse_utils.hpp

Lines 278 to 306 in b2e5588

/// Computes a CSR format sparse matrix-dense matrix product.
/// C = alpha * op(A) * B + beta * C
/// \param [in] queue The queue where the routine should be executed. It must
/// have the in_order property when using the USM mode.
/// \param [in] trans The operation applied to the matrix A.
/// \param [in] sparse_rows Number of rows of the matrix A.
/// \param [in] dense_cols Number of columns of the matrix op(B) or C.
/// \param [in] sparse_cols Number of columns of the matrix A.
/// \param [in] alpha Scaling factor for the matrix A.
/// \param [in] info Matrix info of the matrix A.
/// \param [in] val An array containing the non-zero elements of the matrix A.
/// \param [in] row_ptr An array of length \p num_rows + 1.
/// \param [in] col_ind An array containing the column indices in index-based
/// numbering.
/// \param [in] b Data of the matrix B.
/// \param [in] ldb Leading dimension of the matrix B.
/// \param [in] beta Scaling factor for the matrix B.
/// \param [in, out] c Data of the matrix C.
/// \param [in] ldc Leading dimension of the matrix C.
template <typename T>
void csrmm(sycl::queue &queue, oneapi::mkl::transpose trans, int sparse_rows,
           int dense_cols, int sparse_cols, const T *alpha,
           const std::shared_ptr<matrix_info> info, const T *val,
           const int *row_ptr, const int *col_ind, const T *b, int ldb,
           const T *beta, T *c, int ldc) {
  csrmm<T>(queue, trans, oneapi::mkl::transpose::nontrans, sparse_rows,
           dense_cols, sparse_cols, alpha, info, val, row_ptr, col_ind, b, ldb,
           beta, c, ldc);
}
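For illustration only, a comment in that style for a blocked store might look like the sketch below. This is a host-side model, not the dpct-rt function: the name store_blocked_sketch, the linear_tid parameter (standing in for item.get_local_linear_id()), and the exact parameter list are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>

/// Stores a blocked arrangement of items held by one work-item into a linear
/// segment of memory. (Host-side illustration; not the dpct-rt signature.)
/// Work-item \p linear_tid writes its items to the consecutive positions
/// [linear_tid * ITEMS_PER_WORK_ITEM, (linear_tid + 1) * ITEMS_PER_WORK_ITEM).
/// \tparam ITEMS_PER_WORK_ITEM Number of items held by each work-item.
/// \tparam InputT Type of the items held by the work-item.
/// \tparam OutputIteratorT Random-access iterator type of the destination.
/// \param [in] linear_tid Linear id of the work-item within its work-group.
/// \param [out] block_itr Iterator to the start of the work-group's segment.
/// \param [in] items The items owned by this work-item.
template <std::size_t ITEMS_PER_WORK_ITEM, typename InputT,
          typename OutputIteratorT>
void store_blocked_sketch(std::size_t linear_tid, OutputIteratorT block_itr,
                          const InputT (&items)[ITEMS_PER_WORK_ITEM]) {
  // Each work-item owns a contiguous run of ITEMS_PER_WORK_ITEM outputs.
  OutputIteratorT workitem_itr = block_itr + linear_tid * ITEMS_PER_WORK_ITEM;
  for (std::size_t i = 0; i < ITEMS_PER_WORK_ITEM; ++i)
    workitem_itr[i] = items[i];
}
```

With two items per work-item, the work-item with linear id 1 would write to positions 2 and 3 of the segment and leave the rest untouched.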

danhoeflinger · 2024-05-30T12:00:06Z

Please create a testing PR and link it in the description. Doing this earlier in the process as compared to load will help make this review progress more smoothly.

abhilash1910 · 2024-05-31T07:33:06Z

Linking test PR: oneapi-src/SYCLomatic-test#680 (WIP).

mmichel11 · 2024-06-05T15:40:49Z

clang/runtime/dpct-rt/include/dpct/group_utils.hpp

+  // storage
+  size_t linear_tid = item.get_local_linear_id();
+  OutputIteratorT workitem_itr = block_itr + linear_tid;
+  size_t GROUP_WORK_ITEMS = item.get_global_range().size();


get_global_range returns the global dimensions of the kernel, so when launching more than a single work-group this will be incorrect.

We can switch this to what we did in group load: size_t group_work_items = item.get_local_range().size();

I'd recommend making sure we have testing coverage for such a case as well.

Yes, the local range should be used; all the tests currently run in a single work-group. I will extend the tests to other work-group sizes in a separate PR. Thanks.
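The effect of the stride choice discussed above can be modelled on the host without a SYCL device. The sketch below is not the dpct-rt code: striped_store and run_group are hypothetical helpers, and the stride argument stands in for GROUP_WORK_ITEMS. It assumes each work-group owns a segment of local_size * ItemsPerWorkItem outputs and reports whether any write would leave that segment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model of a striped store for one work-item (not the dpct API).
// `stride` plays the role of GROUP_WORK_ITEMS in the reviewed snippet.
// Returns false if any write would land outside the group's own segment.
template <std::size_t ItemsPerWorkItem>
bool striped_store(std::vector<int> &group_segment, std::size_t local_linear_id,
                   std::size_t stride, const int (&items)[ItemsPerWorkItem]) {
  for (std::size_t i = 0; i < ItemsPerWorkItem; ++i) {
    std::size_t idx = local_linear_id + i * stride;
    if (idx >= group_segment.size())
      return false; // a real kernel would write out of bounds here
    group_segment[idx] = items[i];
  }
  return true;
}

// Runs every work-item of one work-group and reports whether all writes
// stayed inside the segment of local_size * ItemsPerWorkItem elements.
template <std::size_t ItemsPerWorkItem>
bool run_group(std::size_t local_size, std::size_t stride,
               std::vector<int> &out) {
  out.assign(local_size * ItemsPerWorkItem, -1);
  bool ok = true;
  for (std::size_t tid = 0; tid < local_size; ++tid) {
    int items[ItemsPerWorkItem];
    for (std::size_t i = 0; i < ItemsPerWorkItem; ++i)
      items[i] = static_cast<int>(tid * ItemsPerWorkItem + i);
    ok = striped_store(out, tid, stride, items) && ok;
  }
  return ok;
}
```

With a work-group of 4 items and 2 items per work-item, a stride of 4 (the local range size) keeps every write inside the 8-element segment. A stride of 8 (the global size when two such work-groups are launched) pushes the second item of every work-item past the end, which is the failure a single-work-group test cannot expose.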

clang/runtime/dpct-rt/include/dpct/group_utils.hpp

danhoeflinger · 2024-08-21T20:59:11Z

clang/runtime/dpct-rt/include/dpct/group_utils.hpp

+store_subgroup_striped(const Item &item, OutputIteratorT block_itr,
+                       InputT (&items)[ITEMS_PER_WORK_ITEM]) {
+
+  // This implementation does not take in account range loading across


Can you describe what "range loading" means in this context?

Range loading/storing refers to loading/storing within bounded intervals across the warps, not over the full scope.
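One possible reading of the exchange above, sketched on the host: in a sub-group-striped store each sub-group owns a contiguous chunk of the output and consecutive lanes write consecutive addresses, while a "range" variant would additionally skip writes beyond a bound. The function below is an assumption-laden model, not the PR's implementation: store_sub_group_striped, its parameters, and the valid_items guard are all hypothetical, and the snippet under review states that it does not handle the ranged case.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model of a sub-group-striped store (illustrative names only).
// Each sub-group owns a contiguous chunk of sub_group_size * ItemsPerWorkItem
// outputs; inside that chunk, consecutive lanes write consecutive addresses.
// `valid_items` is a hypothetical guard: writes at or beyond it are skipped.
template <std::size_t ItemsPerWorkItem>
void store_sub_group_striped(std::vector<int> &out,
                             std::size_t local_linear_id,
                             std::size_t sub_group_size,
                             const int (&items)[ItemsPerWorkItem],
                             std::size_t valid_items) {
  assert(sub_group_size > 0);
  std::size_t sub_group_id = local_linear_id / sub_group_size;
  std::size_t lane = local_linear_id % sub_group_size;
  // Start of the chunk owned by this work-item's sub-group.
  std::size_t base = sub_group_id * sub_group_size * ItemsPerWorkItem;
  for (std::size_t i = 0; i < ItemsPerWorkItem; ++i) {
    std::size_t idx = base + lane + i * sub_group_size;
    if (idx < valid_items && idx < out.size())
      out[idx] = items[i];
  }
}
```

For a sub-group size of 4 and 2 items per work-item, the work-item with local linear id 5 (sub-group 1, lane 1) would write to positions 9 and 13. With valid_items set to 12, only position 9 is written.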

clang/runtime/dpct-rt/include/dpct/group_utils.hpp

Co-authored-by: Dan Hoeflinger <[email protected]>

danhoeflinger · 2024-08-22T13:12:29Z

clang/runtime/dpct-rt/include/dpct/group_utils.hpp

+/// Stores a subgroup-striped arrangement of work items linear segment of items.
+// Created as free function until exchange mechanism is
+// implemented.
+// To-do: inline this function with BLOCK_STORE_WARP_TRANSPOSE mechanism


I'm not sure what this comment means exactly. Also, let's use our own terminology here.

clang/runtime/dpct-rt/include/dpct/group_utils.hpp

Co-authored-by: Dan Hoeflinger <[email protected]>

abhilash1910 · 2024-12-06T02:45:02Z

Closed as implemented.

block store

13d8b67

abhilash1910 requested a review from a team as a code owner March 25, 2024 07:31

abhilash1910 requested review from the-slow-one and tomflinda March 25, 2024 07:31

abhilash1910 and others added 7 commits March 25, 2024 14:54

fix bug

7517519

update code

6b7fd09

fix template param

454c453

Merge branch 'SYCLomatic' into block_store

9e75c62

fix error

ffbd181

Merge branch 'SYCLomatic' into block_store

a0007e1

add in group_utils

49147b8

yihanwg reviewed May 30, 2024

use class

18f826a

yihanwg reviewed May 30, 2024

review commit

7149372

format

431d4a4

yihanwg reviewed May 30, 2024

This was referenced Jun 5, 2024

[SYCLomatic-test] test load and store headers oneapi-src/SYCLomatic-test#680

Open

[SYCLomatic-test] Block store test oneapi-src/SYCLomatic-test#725

Open

mmichel11 reviewed Jun 5, 2024

abhilash1910 added 4 commits June 6, 2024 09:51

review commit

8cc73f1

Merge branch 'oneapi-src:SYCLomatic' into block_store

a677eb2

clang-format

98d0193

Merge branch 'oneapi-src:SYCLomatic' into block_store

79295f8

abhilash1910 mentioned this pull request Jul 11, 2024

[SYCLomatic] Support migration for cub::{StoreDirectBlocked, StoreDirectStriped} API #1305

Closed

abhilash1910 added 2 commits July 11, 2024 19:53

reorder template args for better visibility in parsing

c4fe035

revert template alignment

76ec684

fix temps pointer

41b1c8a

mmichel11 reviewed Aug 21, 2024

rectify comment

b046dcc

danhoeflinger reviewed Aug 21, 2024

danhoeflinger reviewed Aug 21, 2024

danhoeflinger reviewed Aug 21, 2024

danhoeflinger reviewed Aug 21, 2024

abhilash1910 and others added 4 commits August 22, 2024 13:30

Merge branch 'SYCLomatic' into block_store

f86801d

Update clang/runtime/dpct-rt/include/dpct/group_utils.hpp

273d098

Co-authored-by: Dan Hoeflinger <[email protected]>

Update group_utils.hpp

3185ceb

fix review comments

cc00403

danhoeflinger reviewed Aug 22, 2024

danhoeflinger reviewed Aug 22, 2024

danhoeflinger reviewed Aug 22, 2024

mmichel11 reviewed Aug 22, 2024

mmichel11 reviewed Aug 22, 2024

abhilash1910 and others added 4 commits August 26, 2024 11:42

Merge branch 'SYCLomatic' into block_store

56c07e1

fix

28ff868

Update clang/runtime/dpct-rt/include/dpct/group_utils.hpp

e87c0a6

Co-authored-by: Dan Hoeflinger <[email protected]>

update correct variables

1802fbe

abhilash1910 closed this Dec 6, 2024

abhilash1910 commented Mar 25, 2024

abhilash1910 commented May 30, 2024

yihanwg May 30, 2024

danhoeflinger commented May 30, 2024 (edited)

abhilash1910 commented May 31, 2024

mmichel11 Jun 5, 2024

danhoeflinger Jun 5, 2024

abhilash1910 Jun 6, 2024

danhoeflinger Aug 21, 2024

abhilash1910 Aug 22, 2024

danhoeflinger Aug 22, 2024

abhilash1910 commented Dec 6, 2024

