Diocotron error with Ginkgo #86

PaulineVidal · 2025-02-07T16:52:52Z

A Ginkgo error appears while running the diocotron simulation:

terminate called after throwing an instance of 'std::runtime_error'
  what():  Ginkgo did not converge in MatrixBatchCsr
Aborted (core dumped)

It does not appear for:

Circular mapping with a grid of 256x512.
Czarny mapping with a grid of 128x256.

It does appear for:

Czarny mapping with a grid of 256x256.
Czarny mapping with a grid of 256x512.
Czarny mapping with parameters (1e-5, 1), which make it look like a circular mapping, with a grid of 256x512.

The simulation can be a bit non-smooth. Here is the density at t = 50 (Czarny mapping with a grid of 128x256).

tpadioleau · 2025-02-10T08:29:34Z

How many iterations does the linear solver perform?

PaulineVidal · 2025-02-10T08:37:22Z

I don't know Ginkgo very well. I guess it is a Ginkgo parameter; I only see m_gko_matrix->solve(m_x_init, b); in the code.
@AbdelhadiKara, maybe you know the answer better.

AbdelhadiKara · 2025-02-10T08:58:00Z

You can check the number of iterations by activating the logger: there is a boolean you can set to true when instantiating m_gko_matrix. You can have a look here.

PS: You can try to solve this issue by increasing the maximum number of iterations.

PaulineVidal · 2025-02-10T09:32:52Z

Yes, it no longer seems to fail when we set m_max_iter(max_iter.value_or(10000)) instead of m_max_iter(max_iter.value_or(1000)).
Ok, thank you 👍
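
For readers unfamiliar with the pattern, here is a minimal sketch of what that change does. The class name IterationCap and its accessor are hypothetical stand-ins, not the real wrapper; only the std::optional fallback is illustrated. The cap is used when the caller passes no value, so raising the fallback affects every call site that relies on the default.

```cpp
#include <optional>

// Hypothetical stand-in for the matrix wrapper: only the handling of the
// iteration cap is sketched here, the real class does much more.
class IterationCap
{
    int m_max_iter;

public:
    // If the caller gives no cap, value_or supplies the fallback.
    // The fallback is 10000 here, raised from the previous 1000.
    explicit IterationCap(std::optional<int> max_iter = std::nullopt)
        : m_max_iter(max_iter.value_or(10000))
    {
    }

    int max_iter() const
    {
        return m_max_iter;
    }
};
```

Constructing IterationCap with no argument yields a cap of 10000, while IterationCap(2500) keeps the caller's value of 2500. A larger cap only hides a slow convergence, it does not make the solver converge faster.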

tpadioleau · 2025-02-10T11:13:53Z

Doesn't it mean the condition number is high? Could we find a better preconditioner?

AbdelhadiKara · 2025-02-10T11:18:36Z

@tpadioleau the preconditioner used is Jacobi. It was the only one available. IMO it would be better to rely on the Kokkos Kernels solvers. They released a batched version of CG.

tpadioleau · 2025-02-10T12:03:42Z

> @tpadioleau the preconditioner used is Jacobi. It was the only one available. IMO it would be better to rely on the Kokkos Kernels solvers. They released a batched version of CG.

How do you think using Kokkos Kernels would help?

Regarding Ginkgo, do you know if version 1.9 has other preconditioners?

AbdelhadiKara · 2025-02-11T06:13:50Z

For several reasons:

To reduce dependencies.
This is a purely personal point of view, but the Kokkos Kernels implementation seems more stable (robust).
If we need something in the future, it will be easier to ask the CExA team.

tpadioleau · 2025-02-11T09:04:26Z

> For several reasons:
>
> To reduce dependencies.
> This is a purely personal point of view, but the Kokkos Kernels implementation seems more stable (robust).
> If we need something in the future, it will be easier to ask the CExA team.

Overall I think I agree, but I want to point out that it will not spare us the work of finding suitable preconditioners and stopping criteria.

AbdelhadiKara · 2025-02-11T09:18:21Z

Thank you for these clarifications.
I agree with you. I have just checked, and neither the 1.9 release nor the develop branch contains any preconditioner other than Jacobi for batch solvers.
As for the stopping criterion, I investigated this several months ago: the Ginkgo documentation distinguishes between the "true" stopping criterion and the "implicit" one, which is internal to the solvers.
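
To make the distinction concrete, here is a small hand-written sketch of a Jacobi-preconditioned conjugate gradient on a dense 3x3 system. It is not Ginkgo's implementation: all names (jacobi_pcg, CgReport, Mat, Vec) are hypothetical, and the stopping test on the recurrence residual relative to the norm of b is an assumption made for illustration. The "implicit" residual is the vector r updated by the CG recurrence, while the "true" residual is b - A x recomputed from the final iterate.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

constexpr std::size_t N = 3;
using Vec = std::array<double, N>;
using Mat = std::array<Vec, N>;

struct CgReport
{
    Vec x;                // approximate solution
    int iterations;       // iterations actually performed
    double implicit_norm; // norm of the residual updated by the recurrence
    double true_norm;     // norm of b - A x recomputed from scratch
};

double dot(Vec const& a, Vec const& b)
{
    double s = 0.0;
    for (std::size_t i = 0; i < N; ++i) {
        s += a[i] * b[i];
    }
    return s;
}

Vec matvec(Mat const& A, Vec const& v)
{
    Vec out {};
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            out[i] += A[i][j] * v[j];
        }
    }
    return out;
}

// A must be symmetric positive definite with a nonzero diagonal.
CgReport jacobi_pcg(Mat const& A, Vec const& b, double rel_tol, int max_iter)
{
    Vec x {};  // zero initial guess
    Vec r = b; // implicit residual, equal to b - A*0 at the start
    Vec z {};
    for (std::size_t i = 0; i < N; ++i) {
        z[i] = r[i] / A[i][i]; // Jacobi preconditioner: divide by the diagonal
    }
    Vec p = z;
    double rz = dot(r, z);
    double const b_norm = std::sqrt(dot(b, b));
    int it = 0;
    // Assumed stopping test: implicit residual relative to ||b||.
    while (it < max_iter && std::sqrt(dot(r, r)) > rel_tol * b_norm) {
        Vec const Ap = matvec(A, p);
        double const alpha = rz / dot(p, Ap);
        for (std::size_t i = 0; i < N; ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i]; // recurrence update, never recomputed from x
            z[i] = r[i] / A[i][i];
        }
        double const rz_new = dot(r, z);
        double const beta = rz_new / rz;
        for (std::size_t i = 0; i < N; ++i) {
            p[i] = z[i] + beta * p[i];
        }
        rz = rz_new;
        ++it;
    }
    // True residual, recomputed from the final iterate.
    Vec const Ax = matvec(A, x);
    Vec t {};
    for (std::size_t i = 0; i < N; ++i) {
        t[i] = b[i] - Ax[i];
    }
    return {x, it, std::sqrt(dot(r, r)), std::sqrt(dot(t, t))};
}
```

On a small, well-conditioned system the two norms agree closely. On an ill-conditioned system, rounding errors accumulate in the recurrence and the implicit residual can drop well below the true one.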

tpadioleau · 2025-02-11T10:34:51Z

Ok thanks!

PaulineVidal · 2025-02-17T08:08:59Z

For reference, here are the numbers of iterations needed for the simulation I mentioned:
Diocotron simulation on a Czarny mapping on a 256x512 grid.

csr_log.txt

(For the 700th time step, around 1900 iterations were needed in the Poisson solver.)

The modifications I made are on the branch pvidal_diocotron_on_czarny_map
(https://github.com/gyselax/gyselalibxx/blob/pvidal_diocotron_on_czarny_map/simulations/geometryRTheta/diocotron/diocotron.cpp)

tpadioleau · 2025-02-17T11:00:30Z

 System no. 0:
 Number of iterations = 1909
 Implicit residual norm = 2.07155e-18
 True (Ax-b) residual norm = 3.4037e-14
 Right-hand side (b) norm = 0.00214497
 --- System 0 did not converge! ---

Looking at the residual, it looks converged to me. It is possible that the last 1000 iterations are not reducing the residual much. It could be that the requested tolerance is too strict. A colleague once told me that a rule of thumb for selecting the tolerance is roughly the machine precision divided by the condition number of the matrix.
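
As a quick check of the log above, the residual norms can be compared with the right-hand-side norm. The helpers below are a hypothetical sketch, assuming the stopping test is relative to the norm of b. The tolerance actually used in the code, and which of the two residuals it is compared against, are not stated in this thread.

```cpp
#include <limits>

// Relative residual ||r|| / ||b||. rhs_norm must be nonzero.
double relative_residual(double residual_norm, double rhs_norm)
{
    return residual_norm / rhs_norm;
}

// Assumed form of a relative stopping test: ||r|| <= rel_tol * ||b||.
bool meets_tolerance(double residual_norm, double rhs_norm, double rel_tol)
{
    return residual_norm <= rel_tol * rhs_norm;
}

// Values copied from the log for system 0.
constexpr double implicit_residual_norm = 2.07155e-18;
constexpr double true_residual_norm = 3.4037e-14;
constexpr double rhs_norm = 0.00214497;

// Double-precision machine epsilon, about 2.2e-16.
constexpr double machine_epsilon = std::numeric_limits<double>::epsilon();
```

With these numbers the true relative residual is about 1.6e-11 and the implicit one about 9.7e-16. Under the assumed test, a relative tolerance of 1e-10 would be met by both, whereas a tolerance equal to machine epsilon would be met by neither.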

AbdelhadiKara · 2025-02-18T07:47:52Z

I have found an interesting option in Ginkgo's CMakeLists:
option(GINKGO_JACOBI_FULL_OPTIMIZATIONS "Use all the optimizations for the CUDA Jacobi algorithm" OFF)
Perhaps switching this to ON will bring some improvements?

tpadioleau · 2025-02-18T08:30:22Z

@AbdelhadiKara What kind of optimization can we expect from this option?

AbdelhadiKara · 2025-02-18T10:05:55Z

I have found at least two effects when this option is activated:

It increases the Jacobi block size.
It unrolls some loops.
