Release v1.1.0 for develop #370

Merged 10 commits on Oct 21, 2019
91 changes: 71 additions & 20 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,29 +1,80 @@
# Changelog

This file may not always be up to date for the unreleased commits. For a
comprehensive list, use the following commands:
This file may not always be up to date in particular for the unreleased
commits. For a comprehensive list, use the following command:
```bash
git log --first-parent
```

## Unreleased
### Added
+ [2d3f0318](https://github.com/ginkgo-project/ginkgo/commit/2d3f0318ed9412a3522d12a85b863efad12fd033), [5e744cad](https://github.com/ginkgo-project/ginkgo/commit/5e744cad1ac0a86b58e3a982dfd7fff4123a7ae3), [22e4b07d](https://github.com/ginkgo-project/ginkgo/commit/22e4b07db7642b54e89c026372c9aae7554ff385), [a5d60de9](https://github.com/ginkgo-project/ginkgo/commit/a5d60de994d0d2073d6dfd4b170fccb557ab6663): Code quality tools in the CI system such as IWYU, clang-tidy and sonarqube.
+ [e1ed14da](https://github.com/ginkgo-project/ginkgo/commit/e1ed14dae236cf4880e1aab418ba8b1784cc8c6e): Fully abide to the xSDK compatibility policies.
+ [0c60deec](https://github.com/ginkgo-project/ginkgo/commit/0c60deec4ce806394fb3287735fad4fb9e7e5c71), [de51ee9a](https://github.com/ginkgo-project/ginkgo/commit/de51ee9a4fbec45d4af99c877a3a49ab94c8cdb5): Two new examples, a 9pt and 27pt stencil.
+ [5e0ca656](https://github.com/ginkgo-project/ginkgo/commit/5e0ca656865f2fa8c35c3470bc6e531c7cf95b66): Benchmark support for cuSPARSE SpMVs.
+ [2f2f09eb](https://github.com/ginkgo-project/ginkgo/commit/2f2f09eb8e653b2552fe97c997e6729c6a3dbcdc), [ec7918f0](https://github.com/ginkgo-project/ginkgo/commit/ec7918f0a3ddb8084a7b2854d4f4d88dc86a1c11): Benchmark support for conversion between SpMV formats.
+ [c9be4445](https://github.com/ginkgo-project/ginkgo/commit/c9be444527fb985f9646c4ebb1b8fb7b9ef72615), [82e6da60](https://github.com/ginkgo-project/ginkgo/commit/82e6da6022a4a5405ad2b91f0f48ccc2490114cd): CSR conversions to and from Hybrid.
+ [fce8dad4](https://github.com/ginkgo-project/ginkgo/commit/fce8dad411603fa517e56073c47b0582910a0b1a), [a3307f07](https://github.com/ginkgo-project/ginkgo/commit/a3307f0760174f7f8b9d4edf20688fe5e2ff9d7a): New ParILU preconditioner.
+ [75a398fc](https://github.com/ginkgo-project/ginkgo/commit/75a398fc64aaa17e8ab343a84f4d8d8caa3ca662): Support for sorting CSR matrices. See also the ParILU commits.

### Changed
+ [fe58c940](https://github.com/ginkgo-project/ginkgo/commit/fe58c940aa365d1c7434836150c53fdb4832c3ef): Fix the CUDA conversion from CSR and Dense to Sell-P.
+ [75806c26](https://github.com/ginkgo-project/ginkgo/commit/75806c26ff6af86d2bb436c9b19a6df3d9be76ce), [c6229b80](https://github.com/ginkgo-project/ginkgo/commit/c6229b804e27c4adb02df17af46f925d48f312ff): General fixes to the CI system scripts.
+ [8bf33e0e](https://github.com/ginkgo-project/ginkgo/commit/8bf33e0e3386d0e6a6c41631444deeea627d1d94), [37dfe3b8](https://github.com/ginkgo-project/ginkgo/commit/37dfe3b865a5902a5e395aa424e13190d1bd2c65): Improve CSR->ELL,Hybrid conversions.
+ [c4f567eb](https://github.com/ginkgo-project/ginkgo/commit/c4f567ebc80b22252c5c5284a00e4d9f86d22e2c): Fix compilation with GCC 6.4.

### Removed
## Version 1.1.0

The Ginkgo team is proud to announce the new minor release of Ginkgo version
1.1.0. This release brings several performance improvements, adds Windows support,
adds support for factorizations inside Ginkgo and a new ILU preconditioner
based on the ParILU algorithm, among other things. For detailed information, check the respective issues.

Supported systems and requirements:
+ For all platforms, cmake 3.9+
+ Linux and MacOS
  + gcc: 5.3+, 6.3+, 7.3+, 8.1+
  + clang: 3.9+
  + Intel compiler: 2017+
  + Apple LLVM: 8.0+
  + CUDA module: CUDA 9.0+
+ Windows
  + MinGW and Cygwin: gcc 5.3+, 6.3+, 7.3+, 8.1+
  + Microsoft Visual Studio: VS 2017 15.7+
  + CUDA module: CUDA 9.0+, Microsoft Visual Studio
  + OpenMP module: MinGW or Cygwin.


The current known issues can be found in the [known issues
page](https://github.com/ginkgo-project/ginkgo/wiki/Known-Issues).


### Additions
+ Upper and lower triangular solvers ([#327](https://github.com/ginkgo-project/ginkgo/issues/327), [#336](https://github.com/ginkgo-project/ginkgo/issues/336), [#341](https://github.com/ginkgo-project/ginkgo/issues/341), [#342](https://github.com/ginkgo-project/ginkgo/issues/342))
+ New factorization support in Ginkgo, and addition of the ParILU
algorithm ([#305](https://github.com/ginkgo-project/ginkgo/issues/305), [#315](https://github.com/ginkgo-project/ginkgo/issues/315), [#319](https://github.com/ginkgo-project/ginkgo/issues/319), [#324](https://github.com/ginkgo-project/ginkgo/issues/324))
+ New ILU preconditioner ([#348](https://github.com/ginkgo-project/ginkgo/issues/348), [#353](https://github.com/ginkgo-project/ginkgo/issues/353))
+ Windows MinGW and Cygwin support ([#347](https://github.com/ginkgo-project/ginkgo/issues/347))
+ Windows Visual Studio support ([#351](https://github.com/ginkgo-project/ginkgo/issues/351))
+ New example showing how to use ParILU as a preconditioner ([#358](https://github.com/ginkgo-project/ginkgo/issues/358))
+ New example on using loggers for debugging ([#360](https://github.com/ginkgo-project/ginkgo/issues/360))
+ Add two new 9pt and 27pt stencil examples ([#300](https://github.com/ginkgo-project/ginkgo/issues/300), [#306](https://github.com/ginkgo-project/ginkgo/issues/306))
+ Allow benchmarking cuSPARSE SpMV formats through Ginkgo's benchmarks ([#303](https://github.com/ginkgo-project/ginkgo/issues/303))
+ New benchmark for sparse matrix format conversions ([#312](https://github.com/ginkgo-project/ginkgo/issues/312), [#317](https://github.com/ginkgo-project/ginkgo/issues/317))
+ Add conversions between CSR and Hybrid formats ([#302](https://github.com/ginkgo-project/ginkgo/issues/302), [#310](https://github.com/ginkgo-project/ginkgo/issues/310))
+ Support for sorting rows in the CSR format by column indices ([#322](https://github.com/ginkgo-project/ginkgo/issues/322))
+ Addition of a CUDA COO SpMM kernel for improved performance ([#345](https://github.com/ginkgo-project/ginkgo/issues/345))
+ Addition of a LinOp to handle perturbations of the form (identity + scalar *
basis * projector) ([#334](https://github.com/ginkgo-project/ginkgo/issues/334))
+ New sparsity matrix representation format with Reference and OpenMP
kernels ([#349](https://github.com/ginkgo-project/ginkgo/issues/349), [#350](https://github.com/ginkgo-project/ginkgo/issues/350))
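
The perturbation entry above describes an operator of the form (identity + scalar * basis * projector). A minimal C++ sketch of how such an operator could be applied to a vector without forming the product matrix is given here. The `dense` struct and the functions `multiply` and `apply_perturbation` are hypothetical names chosen for illustration and are not Ginkgo's API; dense row-major storage is an assumption made to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dense row-major matrix used only for this sketch.
struct dense {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> values;  // rows * cols entries, row-major
};

// Computes y = m * x for a dense row-major matrix.
std::vector<double> multiply(const dense &m, const std::vector<double> &x)
{
    assert(m.values.size() == m.rows * m.cols);
    assert(x.size() == m.cols);
    std::vector<double> y(m.rows, 0.0);
    for (std::size_t i = 0; i < m.rows; ++i) {
        for (std::size_t j = 0; j < m.cols; ++j) {
            y[i] += m.values[i * m.cols + j] * x[j];
        }
    }
    return y;
}

// Applies (identity + scalar * basis * projector) to x as
// y = x + scalar * basis * (projector * x).
// basis is n x k, projector is k x n, x has n entries.
std::vector<double> apply_perturbation(double scalar, const dense &basis,
                                       const dense &projector,
                                       const std::vector<double> &x)
{
    assert(basis.cols == projector.rows);
    assert(basis.rows == x.size());
    const auto projected = multiply(projector, x);       // k entries
    const auto correction = multiply(basis, projected);  // n entries
    std::vector<double> y(x);
    for (std::size_t i = 0; i < y.size(); ++i) {
        y[i] += scalar * correction[i];
    }
    return y;
}
```

Evaluating the product right to left keeps the cost at two thin matrix-vector products instead of building an n x n matrix, which is the usual motivation for keeping such an operator in factored form.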

### Fixes
+ Accelerate GMRES solver for CUDA executor ([#363](https://github.com/ginkgo-project/ginkgo/issues/363))
+ Fix BiCGSTAB solver convergence ([#359](https://github.com/ginkgo-project/ginkgo/issues/359))
+ Fix CGS logging by reporting the residual for every sub iteration ([#328](https://github.com/ginkgo-project/ginkgo/issues/328))
+ Fix CSR,Dense->Sellp conversion's memory access violation ([#295](https://github.com/ginkgo-project/ginkgo/issues/295))
+ Accelerate CSR->Ell,Hybrid conversions on CUDA ([#313](https://github.com/ginkgo-project/ginkgo/issues/313), [#318](https://github.com/ginkgo-project/ginkgo/issues/318))
+ Fix slowdown of COO SpMV on OpenMP ([#340](https://github.com/ginkgo-project/ginkgo/issues/340))
+ Fix gcc 6.4.0 internal compiler error ([#316](https://github.com/ginkgo-project/ginkgo/issues/316))
+ Fix compilation issue on Apple clang++ 10 ([#322](https://github.com/ginkgo-project/ginkgo/issues/322))
+ Make Ginkgo able to compile on Intel 2017 and above ([#337](https://github.com/ginkgo-project/ginkgo/issues/337))
+ Make the benchmarks spmv/solver use the same matrix formats ([#366](https://github.com/ginkgo-project/ginkgo/issues/366))
+ Fix self-written isfinite function ([#348](https://github.com/ginkgo-project/ginkgo/issues/348))
+ Fix Jacobi issues shown by cuda-memcheck

### Tools and ecosystem improvements
+ Multiple improvements to the CI system and tools ([#296](https://github.com/ginkgo-project/ginkgo/issues/296), [#311](https://github.com/ginkgo-project/ginkgo/issues/311), [#365](https://github.com/ginkgo-project/ginkgo/issues/365))
+ Multiple improvements to the Ginkgo containers ([#328](https://github.com/ginkgo-project/ginkgo/issues/328), [#361](https://github.com/ginkgo-project/ginkgo/issues/361))
+ Add sonarqube analysis to Ginkgo ([#304](https://github.com/ginkgo-project/ginkgo/issues/304), [#308](https://github.com/ginkgo-project/ginkgo/issues/308), [#309](https://github.com/ginkgo-project/ginkgo/issues/309))
+ Add clang-tidy and iwyu support to Ginkgo ([#298](https://github.com/ginkgo-project/ginkgo/issues/298))
+ Improve Ginkgo's support of xSDK M12 policy by adding the `TPL_` arguments
to CMake ([#300](https://github.com/ginkgo-project/ginkgo/issues/300))
+ Add support for the xSDK R7 policy ([#325](https://github.com/ginkgo-project/ginkgo/issues/325))
+ Fix examples in html documentation ([#367](https://github.com/ginkgo-project/ginkgo/issues/367))

## Version 1.0.0
The Ginkgo team is proud to announce the first release of Ginkgo, the next-generation high-performance on-node sparse linear algebra library. Ginkgo leverages the features of modern C++ to give you a tool for the iterative solution of linear systems that is:
2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -1,6 +1,6 @@
cmake_minimum_required(VERSION 3.9)

project(Ginkgo LANGUAGES C CXX VERSION 1.0.0 DESCRIPTION "A numerical linear algebra library targeting many-core architectures")
project(Ginkgo LANGUAGES C CXX VERSION 1.1.0 DESCRIPTION "A numerical linear algebra library targeting many-core architectures")
set(Ginkgo_VERSION_TAG "develop")
set(PROJECT_VERSION_TAG ${Ginkgo_VERSION_TAG})

21 changes: 11 additions & 10 deletions INSTALL.md
@@ -17,13 +17,14 @@ Ginkgo adds the following additional switches to control what is being built:

* `-DGINKGO_DEVEL_TOOLS={ON, OFF}` sets up the build system for development
(requires clang-format, will also download git-cmake-format),
default is `ON`
default is `ON`.
* `-DGINKGO_BUILD_TESTS={ON, OFF}` builds Ginkgo's tests
(will download googletest), default is `ON`
(will download googletest), default is `ON`.
* `-DGINKGO_BUILD_BENCHMARKS={ON, OFF}` builds Ginkgo's benchmarks
(will download gflags and rapidjson), default is `ON`
(will download gflags and rapidjson), default is `ON`.
* `-DGINKGO_BUILD_EXAMPLES={ON, OFF}` builds Ginkgo's examples, default is `ON`
* `-DGINKGO_BUILD_EXTLIB_EXAMPLE={ON, OFF}` builds the interfacing example with deal.II, default is `OFF`
* `-DGINKGO_BUILD_EXTLIB_EXAMPLE={ON, OFF}` builds the interfacing example
with deal.II, default is `OFF`.
* `-DGINKGO_BUILD_REFERENCE={ON, OFF}` build reference implementations of the
kernels, useful for testing, default is `ON`
* `-DGINKGO_BUILD_OMP={ON, OFF}` builds optimized OpenMP versions of the kernels,
@@ -42,21 +43,21 @@ Ginkgo adds the following additional switches to control what is being built:
CMake package registry. The default is `OFF`.
* `-DGINKGO_WITH_CLANG_TIDY={ON, OFF}` makes Ginkgo call `clang-tidy` to find
programming issues. The path can be manually controlled with the CMake
variable `-DGINKGO_CLANG_TIDY_PATH=<path>`.
variable `-DGINKGO_CLANG_TIDY_PATH=<path>`. The default is `OFF`.
* `-DGINKGO_WITH_IWYU={ON, OFF}` makes Ginkgo call `iwyu` to find include
issues. The path can be manually controlled with the CMake variable
`-DGINKGO_IWYU_PATH=<path>`.
`-DGINKGO_IWYU_PATH=<path>`. The default is `OFF`.
* `-DGINKGO_VERBOSE_LEVEL=integer` sets the verbosity of Ginkgo.
* `0` disables all output in the main libraries,
* `1` enables a few important messages related to unexpected behavior (default).
* `-DCMAKE_INSTALL_PREFIX=path` sets the installation path for `make install`.
The default value is usually something like `/usr/local`
The default value is usually something like `/usr/local`.
* `-DCMAKE_BUILD_TYPE=type` specifies which configuration will be used for
this build of Ginkgo. The default is `RELEASE`. Supported values are CMake's
standard build types such as `DEBUG` and `RELEASE` and the Ginkgo specific
`COVERAGE`, `ASAN` (AddressSanitizer) and `TSAN` (ThreadSanitizer) types.
* `-DBUILD_SHARED_LIBS={ON, OFF}` builds Ginkgo as static libraries (`OFF`)
or as dynamic libraries (`ON`), default is `ON`
or as dynamic libraries (`ON`), default is `ON`.
* `-DGINKGO_JACOBI_FULL_OPTIMIZATIONS={ON, OFF}` use all the optimizations
for the CUDA Jacobi algorithm. `OFF` by default. Setting this option to `ON`
may lead to very slow compile time (>20 minutes) for the
@@ -92,7 +93,7 @@ Ginkgo adds the following additional switches to control what is being built:
program, default is `windows_shared_library`.
* `-DGINKGO_CHECK_PATH={ON, OFF}` checks if the environment variable PATH is valid.
It is checked only when building shared libraries and executable program,
default is `ON`
default is `ON`.

For example, to build everything (in debug mode), use:

@@ -135,7 +136,7 @@ Information, see the [CMake documentation for
CMAKE_PREFIX_PATH](https://cmake.org/cmake/help/v3.9/variable/CMAKE_PREFIX_PATH.html)
for details.

To manually configure the paths Ginkgo relies on the [standard xSDK Installation
To manually configure the paths, Ginkgo relies on the [standard xSDK Installation
policies](https://xsdk.info/policies/) for all packages except `CAS` (as it is
neither a library nor a header, it cannot be expressed through the `TPL`
format):
4 changes: 2 additions & 2 deletions README.md
@@ -57,10 +57,10 @@ The prequirement needs to be verified
* _cmake 3.9+_
* C++11 compliant 64-bits compiler:
* _MinGW : gcc 5.3+, 6.3+, 7.3+, 8.1+_
* _CygWin : gcc 5.3+, 6.3+, 7.3+, 8.1+_
* _Cygwin : gcc 5.3+, 6.3+, 7.3+, 8.1+_
* _Microsoft Visual Studio : VS 2017 15.7+_

__NOTE:__ Need to add `--autocrlf=input` after `git clone` in _CygWin_.
__NOTE:__ Need to add `--autocrlf=input` after `git clone` in _Cygwin_.

The Ginkgo CUDA module has the following __additional__ requirements:

4 changes: 2 additions & 2 deletions core/device_hooks/cuda_hooks.cpp
@@ -43,9 +43,9 @@ namespace gko {

version version_info::get_cuda_version() noexcept
{
// We just return 1.0.0 with a special "not compiled" tag in placeholder
// We just return 1.1.0 with a special "not compiled" tag in placeholder
// modules.
return {1, 0, 0, "not compiled"};
return {1, 1, 0, "not compiled"};
}
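
The hook above returns a version whose tag marks the module as not compiled, so callers can tell a placeholder module from a real one. A minimal, self-contained C++ sketch of that idea follows. The `version_sketch` struct, its field names, and the `placeholder_version` and `is_compiled` functions are hypothetical stand-ins written for this example; they are not the `gko::version` type or any Ginkgo function.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for a library version record.
struct version_sketch {
    std::uint64_t major_version;
    std::uint64_t minor_version;
    std::uint64_t patch_version;
    std::string tag;  // free-form marker, e.g. a branch name
};

// Mirrors what a placeholder module would report: the release number
// together with a tag saying the module was not built.
version_sketch placeholder_version()
{
    return {1, 1, 0, "not compiled"};
}

// Assumption for this sketch: a module counts as compiled unless it
// carries the placeholder tag.
bool is_compiled(const version_sketch &v)
{
    return v.tag != "not compiled";
}
```

A caller could use a check like `is_compiled` to skip tests or benchmarks for modules that were disabled at configure time.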


4 changes: 2 additions & 2 deletions core/device_hooks/omp_hooks.cpp
@@ -38,9 +38,9 @@ namespace gko {

version version_info::get_omp_version() noexcept
{
// We just return 1.0.0 with a special "not compiled" tag in placeholder
// We just return 1.1.0 with a special "not compiled" tag in placeholder
// modules.
return {1, 0, 0, "not compiled"};
return {1, 1, 0, "not compiled"};
}


4 changes: 2 additions & 2 deletions core/device_hooks/reference_hooks.cpp
@@ -39,9 +39,9 @@ namespace gko {

version version_info::get_reference_version() noexcept
{
// We just return 1.0.0 with a special "not compiled" tag in placeholder
// We just return 1.1.0 with a special "not compiled" tag in placeholder
// modules.
return {1, 0, 0, "not compiled"};
return {1, 1, 0, "not compiled"};
}


18 changes: 9 additions & 9 deletions cuda/CMakeLists.txt
@@ -6,16 +6,16 @@ if (NOT BUILD_SHARED_LIBS)
set(CMAKE_CUDA_DEVICE_LINK_EXECUTABLE ${CMAKE_CUDA_DEVICE_LINK_EXECUTABLE} PARENT_SCOPE)
endif()

# MSVC can not find CUDA automatically
# Use CUDA_COMPILER PATH to define the CUDA TOOLKIT ROOT DIR
if ("${CMAKE_CUDA_TOOLKIT_INCLUDE_DIRECTORIES}" STREQUAL "")
string(REPLACE "/bin/nvcc.exe" "" CMAKE_CUDA_ROOT_DIR ${CMAKE_CUDA_COMPILER})
set(CMAKE_CUDA_TOOLKIT_INCLUDE_DIRECTORIES "${CMAKE_CUDA_ROOT_DIR}/include")
set(CMAKE_CUDA_IMPLICIT_LINK_DIRECTORIES "${CMAKE_CUDA_ROOT_DIR}/lib/x64")
endif()

# This is modified from https://gitlab.kitware.com/cmake/community/wikis/FAQ#dynamic-replace
if(MSVC)
# MSVC can not find CUDA automatically
# Use CUDA_COMPILER PATH to define the CUDA TOOLKIT ROOT DIR
if("${CMAKE_CUDA_TOOLKIT_INCLUDE_DIRECTORIES}" STREQUAL "")
string(REPLACE "/bin/nvcc.exe" "" CMAKE_CUDA_ROOT_DIR ${CMAKE_CUDA_COMPILER})
set(CMAKE_CUDA_TOOLKIT_INCLUDE_DIRECTORIES "${CMAKE_CUDA_ROOT_DIR}/include")
set(CMAKE_CUDA_IMPLICIT_LINK_DIRECTORIES "${CMAKE_CUDA_ROOT_DIR}/lib/x64")
endif()

# This is modified from https://gitlab.kitware.com/cmake/community/wikis/FAQ#dynamic-replace
if(BUILD_SHARED_LIBS)
ginkgo_switch_to_windows_dynamic("CUDA")
else()
4 changes: 3 additions & 1 deletion cuda/components/diagonal_block_manipulation.cuh
@@ -67,7 +67,7 @@ __device__ __forceinline__ void extract_transposed_diag_blocks(
auto bid = static_cast<size_type>(blockIdx.x) * warps_per_block *
processed_blocks +
threadIdx.z * processed_blocks;
auto bstart = block_ptrs[bid];
auto bstart = (bid < num_blocks) ? block_ptrs[bid] : zero<IndexType>();
IndexType bsize = 0;
#pragma unroll
for (int b = 0; b < processed_blocks; ++b, ++bid) {
@@ -84,6 +84,7 @@ __device__ __forceinline__ void extract_transposed_diag_blocks(
if (threadIdx.y == b && threadIdx.x < max_block_size) {
workspace[threadIdx.x] = zero<ValueType>();
}
warp.sync();
const auto row = bstart + i;
const auto rstart = row_ptrs[row] + tid;
const auto rend = row_ptrs[row + 1];
@@ -101,6 +102,7 @@ __device__ __forceinline__ void extract_transposed_diag_blocks(
if (threadIdx.y == b && threadIdx.x < bsize) {
block_row[i * increment] = workspace[threadIdx.x];
}
warp.sync();
}
}
}
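
The guarded read of `bstart` in this hunk avoids touching `block_ptrs` when `bid` does not address a real block. A minimal host-side C++ sketch of the same pattern follows. The function name `guarded_block_start` is hypothetical, and a `std::vector` stands in for the device array; this illustrates the bounds guard only and is not the kernel itself.

```cpp
#include <cstddef>
#include <vector>

// Returns the start row of block `bid`, or a zero index when `bid` is
// outside [0, num_blocks). block_ptrs is assumed to hold num_blocks + 1
// entries, so any bid below num_blocks is a valid position.
template <typename IndexType>
IndexType guarded_block_start(const std::vector<IndexType> &block_ptrs,
                              std::size_t num_blocks, std::size_t bid)
{
    return (bid < num_blocks) ? block_ptrs[bid] : IndexType{};
}
```

Returning a zero index for out-of-range blocks keeps every thread on the same control path, which matters when later steps synchronize the whole warp.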
2 changes: 2 additions & 0 deletions cuda/preconditioner/jacobi_generate_kernel.cu
@@ -47,6 +47,7 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#include "cuda/components/thread_ids.cuh"
#include "cuda/components/uninitialized_array.hpp"
#include "cuda/components/warp_blas.cuh"
#include "cuda/components/zero_array.hpp"
#include "cuda/preconditioner/jacobi_common.hpp"


@@ -296,6 +297,7 @@ void generate(std::shared_ptr<const CudaExecutor> exec,
Array<precision_reduction> &block_precisions,
const Array<IndexType> &block_pointers, Array<ValueType> &blocks)
{
zero_array(blocks.get_num_elems(), blocks.get_data());
select_generate(compiled_kernels(),
[&](int compiled_block_size) {
return max_block_size <= compiled_block_size;
4 changes: 2 additions & 2 deletions cuda/preconditioner/jacobi_kernels.cu
@@ -134,7 +134,7 @@ __global__ void generate_natural_block_pointer(
}
size_type num_blocks = 1;
int32 current_block_size = 1;
for (size_type i = 1; i < num_rows; ++i) {
for (size_type i = 0; i < num_rows - 1; ++i) {
if ((matching_next_row[i]) && (current_block_size < max_block_size)) {
++current_block_size;
} else {
@@ -157,7 +157,7 @@ size_type find_natural_blocks(std::shared_ptr<const CudaExecutor> exec,
{
Array<size_type> nums(exec, 1);

Array<bool> matching_next_row(exec, mtx->get_size()[0]);
Array<bool> matching_next_row(exec, mtx->get_size()[0] - 1);

const dim3 block_size(default_block_size, 1, 1);
const dim3 grid_size(
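
The two changes in `jacobi_kernels.cu` belong together: a matrix with `num_rows` rows has only `num_rows - 1` pairs of neighbouring rows, so `matching_next_row` needs `num_rows - 1` entries and the loop must visit indices `0` to `num_rows - 2`. A host-side C++ sketch of that indexing is shown here. The function `natural_block_pointers` is a hypothetical name, and the parts of the loop body that the diff does not show (closing a block and appending its end pointer) are assumptions about how such a routine could work, not a copy of the kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds block pointers from per-row "matches the next row" flags.
// matching_next_row[i] says whether row i and row i + 1 may share a block,
// so it has num_rows - 1 entries for num_rows > 0.
std::vector<std::size_t> natural_block_pointers(
    const std::vector<bool> &matching_next_row, std::size_t num_rows,
    std::size_t max_block_size)
{
    std::vector<std::size_t> block_ptrs{0};
    if (num_rows == 0) {
        return block_ptrs;
    }
    assert(matching_next_row.size() == num_rows - 1);
    assert(max_block_size >= 1);
    std::size_t current_block_size = 1;
    // Visit every adjacent pair (i, i + 1); the last row has no successor.
    for (std::size_t i = 0; i + 1 < num_rows; ++i) {
        if (matching_next_row[i] && current_block_size < max_block_size) {
            ++current_block_size;  // row i + 1 joins the current block
        } else {
            // Close the current block and start a new one at row i + 1.
            block_ptrs.push_back(block_ptrs.back() + current_block_size);
            current_block_size = 1;
        }
    }
    // Close the final block.
    block_ptrs.push_back(block_ptrs.back() + current_block_size);
    return block_ptrs;
}
```

With five rows, flags `{true, true, false, true}` and a maximum block size of 2, the sketch yields the pointers `0, 2, 3, 5`: the second `true` cannot extend the first block because it is already full.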