Optimize merge algorithm for data sizes equal or greater then 4M items with SLM cache usage #1937

Draft
wants to merge 82 commits into main


SergeyKopienko
Contributor

@SergeyKopienko SergeyKopienko commented Nov 18, 2024

One more approach for #1933

Unfortunately, this approach did not yield a performance gain compared with #1933.

…re-implement __find_start_point function

Signed-off-by: Sergey Kopienko <[email protected]>
…rename template params in __parallel_merge_submitter

Signed-off-by: Sergey Kopienko <[email protected]>
…implementation of __parallel_merge_submitter_large

Signed-off-by: Sergey Kopienko <[email protected]>
…using __parallel_merge_submitter_large in the __parallel_merge

Signed-off-by: Sergey Kopienko <[email protected]>
…removed redundand comment

Signed-off-by: Sergey Kopienko <[email protected]>
…small data types should be acceptable too

Signed-off-by: Sergey Kopienko <[email protected]>
…define __base_diagonals_sp_global_ptr outside of parallel_for

Signed-off-by: Sergey Kopienko <[email protected]>
…calculate and use cached data-size for work-group

Signed-off-by: Sergey Kopienko <[email protected]>
…rename some local variables

Signed-off-by: Sergey Kopienko <[email protected]>
…fix review comment: let's use __parallel_merge_submitter with std::uint32_t data type only

Signed-off-by: Sergey Kopienko <[email protected]>
…load source data into SLM by all available work-items in the group

Signed-off-by: Sergey Kopienko <[email protected]>
…removed redundand comment

Signed-off-by: Sergey Kopienko <[email protected]>
…removed redundand assert

Signed-off-by: Sergey Kopienko <[email protected]>
…declare load_data_into_slm as inline

Signed-off-by: Sergey Kopienko <[email protected]>
…removed redundand assert

Signed-off-by: Sergey Kopienko <[email protected]>
…additional comments for load_data_into_slm

Signed-off-by: Sergey Kopienko <[email protected]>
…rename some local variables and params

Signed-off-by: Sergey Kopienko <[email protected]>
…rewrite the data loading into SLM cache #1

Signed-off-by: Sergey Kopienko <[email protected]>
…h - always use two separate SLM cache

Signed-off-by: Sergey Kopienko <[email protected]>
…use large submitter after 16M items

Signed-off-by: Sergey Kopienko <[email protected]>
…h - using __parallel_merge_submitter_large for all data sizes

Signed-off-by: Sergey Kopienko <[email protected]>
…avoid barrier if we have more then one work-item in each work-group

Signed-off-by: Sergey Kopienko <[email protected]>
…avoid any action in the __parallel_merge_submitter_large::operator() if we haven't any data to process

Signed-off-by: Sergey Kopienko <[email protected]>
…remove inline on load_data_into_slm_impl

Signed-off-by: Sergey Kopienko <[email protected]>
…fix types in __serial_merge

Signed-off-by: Sergey Kopienko <[email protected]>
…remove extra local variable

Signed-off-by: Sergey Kopienko <[email protected]>
@SergeyKopienko SergeyKopienko force-pushed the dev/skopienko/optimize_merge_to_main_V21_final branch from 1fbe771 to 253ca8d Compare November 20, 2024 14:48
…h - debug code under DUMP_DATA_LOADING

Signed-off-by: Sergey Kopienko <[email protected]>
…fix an error in data loading

Signed-off-by: Sergey Kopienko <[email protected]>
…fix calculation of available SLM memory amount

Signed-off-by: Sergey Kopienko <[email protected]>
…l_merge.h - debug code under DUMP_DATA_LOADING"

This reverts commit 952871e.
@SergeyKopienko SergeyKopienko force-pushed the dev/skopienko/optimize_merge_to_main_V21_final branch from f604a72 to 39b68e4 Compare November 20, 2024 15:35
…another approach to calculate the amount of work-groups and work-items

Signed-off-by: Sergey Kopienko <[email protected]>
@SergeyKopienko SergeyKopienko force-pushed the dev/skopienko/optimize_merge_to_main_V21_final branch 2 times, most recently from 0aa0ca3 to 56060d0 Compare November 21, 2024 08:59
…do not use SLM bank size

Signed-off-by: Sergey Kopienko <[email protected]>
@SergeyKopienko SergeyKopienko force-pushed the dev/skopienko/optimize_merge_to_main_V21_final branch from 56060d0 to b04b25e Compare November 21, 2024 09:17
@SergeyKopienko SergeyKopienko marked this pull request as ready for review November 28, 2024 17:15
@SergeyKopienko SergeyKopienko added this to the 2022.8.0 milestone Nov 29, 2024
@SergeyKopienko SergeyKopienko removed the request for review from MikeDvorskiy November 29, 2024 08:17
@SergeyKopienko SergeyKopienko removed this from the 2022.8.0 milestone Nov 29, 2024
@SergeyKopienko SergeyKopienko marked this pull request as draft November 29, 2024 08:17
@SergeyKopienko SergeyKopienko changed the title Optimize merge algorithm for data sizes equal or greater then 4M items with SLM cache usage Optimize merge algorithm for data sizes equal or greater then 4M items Dec 16, 2024
@SergeyKopienko SergeyKopienko changed the title Optimize merge algorithm for data sizes equal or greater then 4M items Optimize merge algorithm for data sizes equal or greater then 4M items with SLM cache usage Dec 16, 2024
3 participants