Re-implement SYCL backend parallel_for to improve bandwidth utilization #1976
Open
mmichel11 wants to merge 65 commits into main from dev/mmichel11/parallel_for_vectorize
Commits
9764a57 Optimize memory transactions in SYCL backend parallel for (mmichel11)
c836b1d clang-format (mmichel11)
55f33a4 Correct comment and error handling. (mmichel11)
adadd56 __num_groups bugfix (mmichel11)
71d7bcc Introduce stride recommender for different targets and better distrib… (mmichel11)
ebb3d56 Cleanup (mmichel11)
2c4ecd0 Unroll loop if possible (mmichel11)
dc6bd0c Revert "Unroll loop if possible" (mmichel11)
d5126b2 Use a small and large kernel in parallel for (mmichel11)
6433a50 Improve __iters_per_work_item heuristic. (mmichel11)
d376124 Code cleanup (mmichel11)
a7c7606 Clang format (mmichel11)
b8aa15c Update comments (mmichel11)
b45a7c2 Bugfix in comment (mmichel11)
4f9a360 More cleanup and better handle non-full case (mmichel11)
7bb1d2b Rename __ndi to __item for consistency with codebase (mmichel11)
a2ad920 Update all comments on kernel naming trick (mmichel11)
47fe214 Handle non-full case in a cleaner way (mmichel11)
79a18e9 Switch min tuple type utility to return size of type (mmichel11)
3ab8c75 Remove unnecessary template parameter (mmichel11)
4a70fe2 Make non-template function inline for ODR compliance (mmichel11)
5530209 If the iters per work item is 1, then only compile the basic pfor kernel (mmichel11)
90f19d4 Address several PR comments (mmichel11)
1ac65b9 Remove free function __stride_recommender (mmichel11)
6a5a562 Accept ranges as forwarding references in __parallel_for_large_submitter (mmichel11)
357032f Address reviewer comments (mmichel11)
ca9e594 Introduce vectorized for-path for small types and parallel_backend_sy… (mmichel11)
e4060f5 Improve testing and cleanup of code (mmichel11)
283b053 clang format (mmichel11)
75e4beb Miscellaneous fixes identified during testing (mmichel11)
7990bc1 clang-format (mmichel11)
4aaa81f Fix ordering to __vector_load call (mmichel11)
65e4a68 Add support for vectorization with C++20 parallel range APIs (mmichel11)
b4657a6 Add device copyable specializations for new walk patterns (mmichel11)
3086dd3 Align vector_walk implementation with other vector functors (mmichel11)
df17673 Add back non-spirv path (mmichel11)
fd4e2c3 Further improve test coverage (mmichel11)
58fd466 Restore original shift_left due to implicit implementation requiremen… (mmichel11)
094124f Fix issues in vectorized rotate (mmichel11)
82135f6 Fix fpga parallel for compilation issues (mmichel11)
e979118 Restore initial shift_left_right.pass.cpp (mmichel11)
4bfaada Fix test side issue when unnamed lambdas are disabled (mmichel11)
8ae18db Add a vector path specialization for std::swap_ranges (mmichel11)
6cb11c7 General code cleanup (mmichel11)
505bdf3 Bugfix with __pattern_swap using nanoranges (mmichel11)
114924d clang-format (mmichel11)
845de21 Address applicable comments from PR #1870 (mmichel11)
71678d0 Refactor __lazy_ctor_storage deleter (mmichel11)
8b0b18b Address review comments (mmichel11)
f7d9753 Remove intrusive test macro and adjust input sizes in test framework (mmichel11)
83c5ca4 Make walk_scalar_base and walk_vector_or_scalar_base structs (mmichel11)
fedd5de Add missing max_n (mmichel11)
08aa260 Add constructors for for-based bricks (mmichel11)
a5eca96 Remove extraneous {} and add constructor to custom_brick (mmichel11)
32612a1 Limit recursive searching of __min_nested_type_size to tuples (mmichel11)
1336735 Work around compiler vectorization issue (mmichel11)
c5e7d61 Add missing decays (mmichel11)
0c6ca75 Add compile time check to ensure we do not get buffer pointer on host (mmichel11)
be8aeda Revert "Work around compiler vectorization issue" (mmichel11)
86b9c89 Remove all begin() calls on views in vectorization paths (mmichel11)
ffd95cc Remove unused __is_passed_directly_range utility (mmichel11)
537a6f0 Rename __scalar_path / __vector_path to __scalar_path_impl / __vector… (mmichel11)
50a60ea Correct __vector_walk deleters and a type in __reverse_copy (mmichel11)
1081ab8 Set upper limit of 10,000,000 for get_pattern_for_max_n (mmichel11)
9513edb General cleanup and renaming for consistency (mmichel11)

Let's fix the naming of this while we're touching all its instances:
__custom_brick

It seems that the historical convention within the internal/ directory is to not use any leading underscores, although it has changed a bit over time. I do not have a strong preference whether we make this change or leave it as is, but maybe it fits into a broader discussion about the remaining implementations in this directory.

I'm not sure there is a compelling reason to have a different convention here, other than resistance to making purely cosmetic changes in the changelog. This is why I suggest adjusting it while we are already touching all (or most) instances of it. Perhaps someone with longer historical knowledge of this code could chime in if there is a reason to keep the different conventions.
Not super important to me, so optional nitpick.

I will wait a bit longer to see if anyone has objections. If not, then I will add this suggestion.