30 commits
fe0ba63
enabled max_Ls from python benchmark to host kernel and applied chang…
kudomcho Jan 24, 2025
2c6eb78
deleted hip files
kudomcho Jan 24, 2025
e739d27
deleted another .hip file
kudomcho Jan 24, 2025
c0f916d
removed typo on packed_d_idx line
kudomcho Jan 27, 2025
abee234
indexed through pack_bag_idx_L and D
kudomcho Jan 27, 2025
2b5f0f6
added 6 different types of mx_ls
kudomcho Jan 29, 2025
10f83d8
added 6 different types of max_ls
kudomcho Jan 29, 2025
ec59908
added 6 different types of max_ls
kudomcho Jan 29, 2025
e227433
added 6 different types of max_ls
kudomcho Jan 29, 2025
c965a72
changed indexing on load_idx_valid and bag size offset condition.
kudomcho Jan 30, 2025
d850c0f
fixed typo error
kudomcho Jan 30, 2025
6805e28
fixed typo error
kudomcho Jan 30, 2025
797b690
fixed typo error
kudomcho Jan 30, 2025
90d1a97
fixed typo error
kudomcho Jan 30, 2025
392f0db
changed if else condition on pack_bag_L to one liner
kudomcho Jan 31, 2025
b24c98d
Update embedding_forward_quantized_split_nbit_kernel_template.cu
kudomcho Feb 3, 2025
900bcd7
Update embedding_forward_quantized_split_nbit_kernel_template.cu
kudomcho Feb 4, 2025
c3a9ae3
Update embedding_forward_quantized_split_nbit_kernel_template.cu
kudomcho Feb 4, 2025
5a673ab
Update embedding_forward_quantized_split_nbit_kernel_template.cu
kudomcho Feb 5, 2025
d7f5b83
Update embedding_forward_quantized_split_nbit_kernel_template.cu
kudomcho Feb 7, 2025
0abb648
Update embedding_forward_quantized_split_nbit_kernel_template.cu
kudomcho Feb 7, 2025
0f169f8
added packed mode condition
kudomcho Feb 14, 2025
1205720
added packed mode template argument
kudomcho Feb 14, 2025
d4349b7
changed return of find_max_ls function to tensor type
kudomcho Feb 14, 2025
385123a
reduced accumulator size to prevent register spilling /scratch alloca…
kudomcho Feb 20, 2025
b14b0ca
reset accumulator back to 0 after each packed_bag_L (k) iteration
kudomcho Feb 23, 2025
4b89641
removed packedMode argument for GPU_no_cache testing
kudomcho Feb 24, 2025
62ac114
reverted back the return value of find_max_ls function to int type
kudomcho Feb 26, 2025
def71cb
refactored host code based on packedmode to support bag packing L
kudomcho Mar 13, 2025
4a74789
added packedmode_L for packing L and packedmode for packing D and nai…
kudomcho Mar 13, 2025