Optimize merge algorithm for data sizes equal to or greater than 4M items with SLM cache usage #1937
Draft
SergeyKopienko wants to merge 82 commits into main from dev/skopienko/optimize_merge_to_main_V21_final
…re-implement __find_start_point function Signed-off-by: Sergey Kopienko <[email protected]>
…rename template params in __parallel_merge_submitter Signed-off-by: Sergey Kopienko <[email protected]>
…implementation of __parallel_merge_submitter_large Signed-off-by: Sergey Kopienko <[email protected]>
…using __parallel_merge_submitter_large in the __parallel_merge Signed-off-by: Sergey Kopienko <[email protected]>
…removed redundant comment Signed-off-by: Sergey Kopienko <[email protected]>
…small data types should be acceptable too Signed-off-by: Sergey Kopienko <[email protected]>
…define __base_diagonals_sp_global_ptr outside of parallel_for Signed-off-by: Sergey Kopienko <[email protected]>
…calculate and use cached data-size for work-group Signed-off-by: Sergey Kopienko <[email protected]>
…rename some local variables Signed-off-by: Sergey Kopienko <[email protected]>
…h - debug code Signed-off-by: Sergey Kopienko <[email protected]>
…fix review comment: let's use __parallel_merge_submitter with std::uint32_t data type only Signed-off-by: Sergey Kopienko <[email protected]>
…load source data into SLM by all available work-items in the group Signed-off-by: Sergey Kopienko <[email protected]>
…remove debug code Signed-off-by: Sergey Kopienko <[email protected]>
…rename some variables Signed-off-by: Sergey Kopienko <[email protected]>
…removed redundant comment Signed-off-by: Sergey Kopienko <[email protected]>
…removed redundant assert Signed-off-by: Sergey Kopienko <[email protected]>
…fix unused variable Signed-off-by: Sergey Kopienko <[email protected]>
…rename some variables Signed-off-by: Sergey Kopienko <[email protected]>
…declare load_data_into_slm as inline Signed-off-by: Sergey Kopienko <[email protected]>
…removed redundant assert Signed-off-by: Sergey Kopienko <[email protected]>
…additional comments for load_data_into_slm Signed-off-by: Sergey Kopienko <[email protected]>
…rename some local variables and params Signed-off-by: Sergey Kopienko <[email protected]>
…rewrite the data loading into SLM cache #1 Signed-off-by: Sergey Kopienko <[email protected]>
…h - always use two separate SLM cache Signed-off-by: Sergey Kopienko <[email protected]>
…use large submitter after 16M items Signed-off-by: Sergey Kopienko <[email protected]>
…h - using __parallel_merge_submitter_large for all data sizes Signed-off-by: Sergey Kopienko <[email protected]>
…avoid barrier if we have more than one work-item in each work-group Signed-off-by: Sergey Kopienko <[email protected]>
…avoid any action in the __parallel_merge_submitter_large::operator() if we don't have any data to process Signed-off-by: Sergey Kopienko <[email protected]>
…remove inline on load_data_into_slm_impl Signed-off-by: Sergey Kopienko <[email protected]>
…fix types in __serial_merge Signed-off-by: Sergey Kopienko <[email protected]>
…remove extra local variable Signed-off-by: Sergey Kopienko <[email protected]>
SergeyKopienko force-pushed the dev/skopienko/optimize_merge_to_main_V21_final branch from 1fbe771 to 253ca8d on November 20, 2024 14:48
…h - debug code under DUMP_DATA_LOADING Signed-off-by: Sergey Kopienko <[email protected]>
…fix an error in data loading Signed-off-by: Sergey Kopienko <[email protected]>
…fix chunk size on GPU Signed-off-by: Sergey Kopienko <[email protected]>
…fix calculation of available SLM memory amount Signed-off-by: Sergey Kopienko <[email protected]>
…l_merge.h - debug code under DUMP_DATA_LOADING" This reverts commit 952871e.
SergeyKopienko force-pushed the dev/skopienko/optimize_merge_to_main_V21_final branch from f604a72 to 39b68e4 on November 20, 2024 15:35
…another approach to calculate the amount of work-groups and work-items Signed-off-by: Sergey Kopienko <[email protected]>
SergeyKopienko force-pushed the dev/skopienko/optimize_merge_to_main_V21_final branch 2 times, most recently from 0aa0ca3 to 56060d0 on November 21, 2024 08:59
…do not use SLM bank size Signed-off-by: Sergey Kopienko <[email protected]>
SergeyKopienko force-pushed the dev/skopienko/optimize_merge_to_main_V21_final branch from 56060d0 to b04b25e on November 21, 2024 09:17
…use std::size_t instead of _IdType Signed-off-by: Sergey Kopienko <[email protected]>
….h - fix compile errors Signed-off-by: Sergey Kopienko <[email protected]>
…fix compile errors Signed-off-by: Sergey Kopienko <[email protected]>
…using oneapi::dpl::__internal::__value_t to detect range's value types Signed-off-by: Sergey Kopienko <[email protected]>
SergeyKopienko changed the title from "Optimize merge algorithm for data sizes equal to or greater than 4M items with SLM cache usage" to "Optimize merge algorithm for data sizes equal to or greater than 4M items" on Dec 16, 2024
SergeyKopienko changed the title from "Optimize merge algorithm for data sizes equal to or greater than 4M items" to "Optimize merge algorithm for data sizes equal to or greater than 4M items with SLM cache usage" on Dec 16, 2024
One more approach for #1933.
Unfortunately, this approach did not give us a performance gain compared with #1933.
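For readers unfamiliar with the technique behind the commits above, here is a minimal host-side sketch of a diagonal ("merge path") split. It is not the code from this PR: `find_start_point` and `chunked_merge` are hypothetical names chosen for illustration, and the sequential loop over chunks only stands in for independent work-items. It assumes both inputs are sorted and that ties are resolved in favour of the first range, as `std::merge` does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not oneDPL's __find_start_point).
// For sorted ranges a and b, returns (i, j) with i + j == diag such that
// a[0, i) and b[0, j) together form the first `diag` items of the stable merge.
template <typename T>
std::pair<std::size_t, std::size_t>
find_start_point(const std::vector<T>& a, const std::vector<T>& b, std::size_t diag)
{
    // i is the number of items taken from a; it is bounded by the diagonal.
    std::size_t lo = diag > b.size() ? diag - b.size() : 0;
    std::size_t hi = std::min(diag, a.size());
    while (lo < hi)
    {
        const std::size_t i = lo + (hi - lo) / 2;
        const std::size_t j = diag - i; // j >= 1 here because i < hi <= diag
        if (!(b[j - 1] < a[i]))
            lo = i + 1; // a[i] is not greater than b[j - 1], so a[i] belongs to the prefix
        else
            hi = i;     // b[j - 1] precedes a[i], so the split is at i or to its left
    }
    return {lo, diag - lo};
}

// Merges a and b in independent chunks of `chunk` output items each.
// Each loop iteration touches only its own slice of the output, which is
// what lets a parallel implementation give one chunk to each work-item.
template <typename T>
std::vector<T>
chunked_merge(const std::vector<T>& a, const std::vector<T>& b, std::size_t chunk)
{
    assert(chunk > 0);
    const std::size_t n = a.size() + b.size();
    std::vector<T> out(n);
    for (std::size_t start = 0; start < n; start += chunk)
    {
        const std::size_t end = std::min(start + chunk, n);
        const auto [i0, j0] = find_start_point(a, b, start);
        const auto [i1, j1] = find_start_point(a, b, end);
        std::merge(a.begin() + i0, a.begin() + i1,
                   b.begin() + j0, b.begin() + j1,
                   out.begin() + start);
    }
    return out;
}
```

In a device implementation the per-chunk search would typically run inside the kernel, and the commit messages suggest this PR additionally stages each work-group's input slice in SLM before merging; that part is not modelled here.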