Commit
GPU implementation of hamming distance (#541)
* take static methods out of tcrdist
* made _tcrdist_mat a normal class method
* parent method NumbaDistanceCalculator extracted
* numba version of hamming distance implemented
* hamming numba tests passed and reference test added
* hamming numba distance calculator implemented and tested
* n_jobs parameter handling done in NumbaDistanceCalculator superclass
* documentation adapted
* removed unnecessary import
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* hamming distance with numba parallelization implemented
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* imports fixed
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* implemented parallelization with n_jobs and n_blocks for hamming and tcrdist distance metrics
* performance optimization for hamming and tcrdist
* more documentation added
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* documentation adapted
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* documentation adapted
* signature of _calc_dist_mat_block changed
* the alphabet for the hamming distance is now the unique characters occurring in all sequences
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* normalized hamming distance added
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* renaming test
* histogram creation for hamming distance added
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* refactored
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* hamming histogram adjustments
* reference test cases added for normalized hamming and hamming histogram
* Update src/scirpy/ir_dist/metrics.py
  Co-authored-by: Gregor Sturm <[email protected]>
* test cases for normalized hamming and hamming histogram adapted
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* docstring for normalized hamming distance and tcrdist distance added
* adapted default parameters and tests for n_jobs and n_blocks
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* test_sequence_dist_all_metrics adaptions
* n_jobs default value set to -1
* docstring of ir_dist for n_jobs adapted
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* docstring change to test cicd pipeline
* docstring for n_jobs of _ir_dist changed
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* docstring for n_jobs of _ir_dist changed
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* moved histogram creation to parent class of hamming distance calculator
* histogram computation adaptions
* test case test_tcrdist_histogram_not_implemented added
* documentation for histogram adapted
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* reformatted doc string
* handling of symmetric matrices with respect to histogram variable changed
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* retrieval of usable cpus for numba adapted
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* more documentation for histogram and (hamming) normalize added
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* added GPUHammingDistanceCalculator
* added test case for gpu hamming distance
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* documentation for GPUHammingDistanceCalculator adapted
* adapted documentation of _tcrdist_mat
* cuda numba experiments
* cupy experiments
* cupy experiments
* scaled cupy to 1 million cells
* sorted sequences by length
* textures used for seqs_mat1 and seqs_mat2
* textures with up to 100k cells
* sorted seqs with multiple blocks
* scaled textures to 1 million cells
* use char for sequences
* shared memory used
* experiments, run 1 million cells with global memory
* run 1 million cells with only global memory
* refactoring and time measurements
* optimized seqs2mat
* increased result matrix stacking speed
* changed data dtype to int8
* scaled to 1 million cells
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* sort indices of result csr matrix
* refactoring
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* remove test from ci
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* move cupy import to func
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add GPU test
* rename
* Rename .cirun.yaml to .cirun.yml
* print library versions for debugging
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* data types updated
* refactored
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Update meta.yaml
* Update meta.yaml
* Update meta.yaml
* removed time measurement prints and added progress bar
* removed cuda synchronize statements used for performance testing
* added parameters gpu_n_blocks and gpu_block_width
* added parameter documentation
* cleaned up hamming kernel
* catch sequence conversion error by retrying with non ascii implementation
* use parameters gpu_n_blocks and gpu_block_width for testing
* adapted documentation about n_blocks
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* adapted error handling for ascii UnicodeError
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* removed reference to :class:`~scirpy.ir_dist.metrics.GPUHammingDistanceCalculator` in documentation
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* WIP: update docs
* Update changelog
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* added an explanation for choosing parameters to GPU hamming class documentation.
* logging.info used instead of print
* Add tutorial for large datasets
* Bump version
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Added checks for int32 overflow
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* adapted error text of int32 overflow check.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Gregor Sturm <[email protected]>
Co-authored-by: Intron7 <[email protected]>
Co-authored-by: Severin Dicks <[email protected]>
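
For readers unfamiliar with the metric this commit accelerates, the following is a minimal pure-Python sketch of the Hamming distance between two equal-length sequences, with an optional normalization. It is not the scirpy implementation: the function name `hamming_distance`, its `normalize` flag, and the choice to normalize as a fraction of the sequence length are assumptions made for illustration only. The Numba and GPU calculators named above work on whole sequence matrices rather than on single pairs.

```python
def hamming_distance(seq1: str, seq2: str, normalize: bool = False) -> float:
    """Count the positions at which two equal-length sequences differ.

    Hypothetical helper for illustration; not part of scirpy's API.
    """
    if len(seq1) != len(seq2):
        # The Hamming distance is only defined for sequences of equal length.
        raise ValueError("sequences must have the same length")

    # Compare the sequences position by position and count the mismatches.
    mismatches = sum(c1 != c2 for c1, c2 in zip(seq1, seq2))

    if normalize:
        # Assumption: normalize as a fraction of the sequence length.
        # Two empty sequences are treated as identical (distance 0.0).
        return mismatches / len(seq1) if seq1 else 0.0
    return float(mismatches)


if __name__ == "__main__":
    print(hamming_distance("CASSL", "CASSF"))
    print(hamming_distance("CASSL", "CASSF", normalize=True))
```

Sequences of different lengths raise a `ValueError` in this sketch; how the actual calculators treat such pairs is not stated in the commit message.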