Releases: ROCm/rocBLAS
rocBLAS-0.12.1.0 for ROCm 1.7.1
Changelist:
- fix dependency installation
rocBLAS-0.12.0.0 release for ROCM 1.7.1
Same source as the rocBLAS-0.12.0.0 release for ROCM 1.7.0, but targeting ROCM 1.7.1.
rocBLAS-0.12.0.0 release for ROCM 1.7.0
Changelist:
- add hgemm
- additional fix for multi-process and multi-threading
- new solution selection logic
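For readers unfamiliar with the naming, hgemm is the half-precision (IEEE binary16) variant of GEMM, C = alpha*A*B + beta*C. The following is a minimal reference sketch in Python, not the rocBLAS implementation. The helper names `to_half` and `hgemm_ref`, the nested-list matrix layout, and the choice to accumulate in higher precision and round only the final result are all assumptions made for illustration.

```python
import struct


def to_half(value):
    """Round a Python float to the nearest IEEE binary16 value (hypothetical helper)."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def hgemm_ref(alpha, a, b, beta, c):
    """Reference C = alpha*A*B + beta*C with half-precision inputs and outputs.

    Assumption: products are accumulated in Python floats and only the
    final value is rounded to half; the library may accumulate differently.
    """
    m, k, n = len(a), len(b), len(b[0])
    result = []
    for i in range(m):
        row = []
        for j in range(n):
            acc = sum(to_half(a[i][p]) * to_half(b[p][j]) for p in range(k))
            row.append(to_half(alpha * acc + beta * to_half(c[i][j])))
        result.append(row)
    return result
```

Values such as 0.1 are not exactly representable in half precision, so results can differ visibly from sgemm or dgemm on the same inputs.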
rocBLAS-0.10.4.0 release for ROCM 1.7.0
Changelist:
- fix race condition for multi-process and multi-thread
- hipLaunchKernelGGL replaces hipLaunchKernel
- add logging
rocBLAS-0.10.3.0 release for ROCM 1.6.4
Changelist:
- add dgemm assembly from Tensile v3.4.0
- fix packaging install path
- integrate clang-format
rocBLAS-0.10.2.0 release for ROCM 1.6.4
Changelist:
- ported to CentOS
- updated to use Tensile v3.3.7 with v_add_i32->u32 fix and fix for M<4
- refactored code and tests for rocblas_pointer_mode
rocBLAS-0.10.1.0 release for ROCM 1.6.4
Changelist:
- add MI25 tuning for Tensile 3.3.4
- fix sgemm assembly kernels for thread safety
- correct iXamax to 1-based indexing
- refactor tests
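The iXamax correction above matches the reference BLAS convention, in which the returned index counts from 1 rather than 0. A minimal Python sketch of that convention follows. `iamax_ref` is a hypothetical helper for illustration and is not part of the rocBLAS API; returning 0 for an empty or invalid input follows reference BLAS and is an assumption as far as rocBLAS is concerned.

```python
def iamax_ref(n, x, incx=1):
    """Return the 1-based index of the first element with the largest |x[i]|.

    Hypothetical reference helper; returns 0 when n < 1 or incx <= 0,
    as reference BLAS does.
    """
    if n < 1 or incx <= 0:
        return 0
    best_index = 1
    best_value = abs(x[0])
    for i in range(1, n):
        value = abs(x[i * incx])
        if value > best_value:  # strict comparison keeps the first maximum
            best_value = value
            best_index = i + 1  # convert 0-based loop counter to 1-based result
    return best_index
```

Callers that use the result to index a C or Python array need to subtract 1 first.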
Release for ROCM 1.6.4
NOTE: API breaking changes introduced in this release related to: rocblas_iXamax, rocblas_iXamin, complex functions, and half functions.
Changelist:
- correct API: rocblas_samax -> rocblas_isamax, rocblas_damax -> rocblas_idamax
- remove unimplemented complex and half functions from the API
- update to Tensile v3.2.0. This uses sgemm assembly kernels for gfx803 and gfx900
- add rocblas_sgeam and rocblas_dgeam functions
- improve repeatability of rocblas_Xgemm performance tests
- update perf script
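The geam functions added above compute a scaled matrix sum, C = alpha*op(A) + beta*op(B), where op is either the identity or a transpose. A minimal Python sketch of that operation follows. `geam_ref`, its boolean transpose flags, and the nested-list matrix layout are hypothetical stand-ins for illustration, not the rocBLAS signature.

```python
def geam_ref(alpha, a, beta, b, trans_a=False, trans_b=False):
    """Reference C = alpha*op(A) + beta*op(B) on nested-list matrices (hypothetical helper)."""

    def op(matrix, transpose):
        # Transpose by regrouping columns into rows when requested.
        return [list(col) for col in zip(*matrix)] if transpose else matrix

    op_a, op_b = op(a, trans_a), op(b, trans_b)
    if len(op_a) != len(op_b) or any(len(ra) != len(rb) for ra, rb in zip(op_a, op_b)):
        raise ValueError("op(A) and op(B) must have the same shape")
    return [
        [alpha * x + beta * y for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(op_a, op_b)
    ]
```

With beta set to 0 and `trans_a=True`, the same call acts as a scaled out-of-place transpose, which is a common use of geam.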
Release for ROCM 1.6.3
NOTE: API breaking changes introduced in this release, primarily related to library NAME and SONAME.
Changelist:
- Library name no longer carries the platform suffix (i.e. now librocblas.so)
- SONAME link renamed to reflect the MAJOR version number (currently 0, changed from 1)
- Build system entirely rewritten to simplify the build/install process. A convenience bash script, install.sh, added to the repository root to automate builds on Ubuntu
- Tensile updated to v3.0.4, which includes fixes for NaN propagating on GEMM calls with beta == 0
- 2 new samples added in samples directory (gemm & strided gemm)
- haxpy implementation added
- extra unit tests added and benchmarking capabilities for axpy, dot, scal
- Improved stability of TRSM unit tests
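The beta == 0 item above relates to a BLAS convention: when beta is zero, C is treated as output only, so NaN or Inf values already in C must not reach the result. Multiplying naively does not give that behavior, because 0 * NaN is NaN. The Python sketch below illustrates the convention only. `gemm_ref` is a hypothetical reference helper, not Tensile's or rocBLAS's code.

```python
def gemm_ref(alpha, a, b, beta, c):
    """Reference C = alpha*A*B + beta*C that never reads C when beta == 0."""
    m, k, n = len(a), len(b), len(b[0])
    result = []
    for i in range(m):
        row = []
        for j in range(n):
            acc = sum(a[i][p] * b[p][j] for p in range(k))
            if beta == 0:
                # Skip the beta*C term so stale NaN/Inf in C cannot propagate.
                row.append(alpha * acc)
            else:
                row.append(alpha * acc + beta * c[i][j])
        result.append(row)
    return result
```

This matters in practice because callers often pass an uninitialized output buffer as C when beta is 0.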
rocBLAS-0.4.3.0 release for ROCM 1.6
Library release associated with ROCM v1.6 release.
Library tuned for Fiji family hardware.