Releases: ROCm/rocBLAS

rocBLAS-0.12.1.0 for ROCm 1.7.1

12 Mar 14:41

Changelist:

  • fix dependency installation

rocBLAS-0.12.0.0 release for ROCM 1.7.1

07 Mar 22:43

Same source as the rocBLAS-0.12.0.0 release for ROCM 1.7.0, but targeting ROCM 1.7.1.

rocBLAS-0.12.0.0 release for ROCM 1.7.0

05 Mar 21:40

Changelist:

  • add hgemm
  • additional fix for multi-process and multi-threading
  • new solution selection logic

rocBLAS-0.10.4.0 release for ROCM 1.7.0

08 Feb 16:30

Changelist:

  • fix race condition in multi-process and multi-threaded use
  • hipLaunchKernelGGL replaces hipLaunchKernel
  • add logging

rocBLAS-0.10.3.0 release for ROCM 1.6.4

05 Dec 15:35

Changelist:

  • add dgemm assembly from Tensile v3.4.0
  • fix packaging install path
  • integrate clang-format

rocBLAS-0.10.2.0 release for ROCM 1.6.4

30 Nov 23:39

Changelist:

  • ported to CentOS
  • updated to use Tensile v3.3.7 with v_add_i32->u32 fix and fix for M<4
  • refactored code and tests for rocblas_pointer_mode

rocBLAS-0.10.1.0 release for ROCM 1.6.4

15 Nov 00:18
Pre-release

Changelist:

  • add MI25 tuning for Tensile 3.3.4
  • fix sgemm assembly kernels for thread safety
  • correct iXamax to 1-based indexing
  • refactor tests
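
The 1-based indexing item above follows the reference BLAS convention for i?amax: the result is the position, counted from 1, of the first element with the largest absolute value, and 0 for an empty vector. A minimal sketch of that convention for real-valued input, using a hypothetical helper `iamax_one_based` that is not part of the rocBLAS API:

```python
def iamax_one_based(x):
    """Return the 1-based index of the first entry with the largest |value|.

    Hypothetical helper that mirrors the reference BLAS i?amax convention;
    it is not part of the rocBLAS API. Returns 0 for an empty vector.
    """
    best_index = 0
    best_value = -1.0
    for position, value in enumerate(x, start=1):
        if abs(value) > best_value:  # strict '>' keeps the first maximum on ties
            best_index = position
            best_value = abs(value)
    return best_index


x = [1.0, -7.0, 3.0]
one_based = iamax_one_based(x)   # 2: the entry -7.0
zero_based = one_based - 1       # index to use with 0-based arrays
```

Callers that index C or C++ arrays with the result would need to subtract 1, as in the last line.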

Release for ROCM 1.6.4

17 Oct 15:04
Pre-release

NOTE: API breaking changes introduced in this release related to: rocblas_iXamax, rocblas_iXamin, complex functions, and half functions.

Changelist:

  • correct API: rocblas_samax -> rocblas_isamax, rocblas_damax -> rocblas_idamax
  • remove unimplemented complex and half functions from the API
  • update to Tensile v3.2.0. This uses sgemm assembly kernels for gfx803 and gfx900
  • add rocblas_sgeam and rocblas_dgeam functions
  • improve repeatability of rocblas_Xgemm performance tests
  • update perf script

release for ROCM 1.6.3

16 Oct 22:09
Pre-release

NOTE: API breaking changes introduced in this release, primarily related to library NAME and SONAME.

Changelist:

  • Library name no longer carries the suffix that annotated the platform (i.e. it is now librocblas.so)
  • so-name link renamed to reflect the MAJOR version number (currently 0, changed from 1)
  • Build system entirely rewritten to simplify the build/install process. A convenience bash script, install.sh, added to the repository root to automate builds on Ubuntu
  • Tensile updated to v3.0.4, which includes fixes for NaN propagating on GEMM calls with beta == 0
  • 2 new samples added in samples directory (gemm & strided gemm)
  • haxpy implementation added
  • extra unit tests and benchmarking capabilities added for axpy, dot, scal
  • Improved stability of TRSM unit tests
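
The changelist does not describe the Tensile fix itself, but the beta == 0 item relates to a general BLAS convention: when beta is zero, C does not need to hold valid data on input, so it should not be read. Under IEEE arithmetic 0 * NaN is NaN, so a GEMM that still computes beta * C lets stale NaN values in C leak into the result. A minimal pure-Python sketch of the difference, where `gemm_reference` and its `skip_c_when_beta_zero` flag are hypothetical names used only for illustration:

```python
def gemm_reference(alpha, a, b, beta, c, skip_c_when_beta_zero=True):
    """Return alpha * (a @ b) + beta * c for row-major lists of lists.

    Hypothetical reference routine, not rocBLAS or Tensile code.
    """
    m, k, n = len(a), len(b), len(b[0])
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = sum(a[i][p] * b[p][j] for p in range(k))
            if beta == 0.0 and skip_c_when_beta_zero:
                # BLAS convention: C is not read when beta is zero.
                out[i][j] = alpha * acc
            else:
                # Naive form: 0.0 * NaN is NaN, so NaN in C propagates.
                out[i][j] = alpha * acc + beta * c[i][j]
    return out


a = [[1.0, 2.0]]
b = [[3.0], [4.0]]
c = [[float("nan")]]  # uninitialized output buffer, modeled as NaN

naive = gemm_reference(1.0, a, b, 0.0, c, skip_c_when_beta_zero=False)
fixed = gemm_reference(1.0, a, b, 0.0, c)
```

Here `naive[0][0]` is NaN while `fixed[0][0]` is 11.0, which is the behavior a caller passing beta == 0 with an uninitialized C would expect.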

rocBLAS-0.4.3.0 release for ROCM 1.6

25 Jul 21:34
Pre-release

Library release associated with the ROCM v1.6 release.

Library tuned for Fiji family hardware.