Yambo 4.5

@sangallidavide sangallidavide released this 03 Mar 15:18
· 13791 commits to master since this release

Yambo 4.5 introduces support for CUDA Fortran

  • Yambo structure modified to deal with GPU accelerator devices; porting done using CUDA Fortran (available with the PGI compiler);
  • DIPOLES, RESPONSE FUNCTION, HF, GW, BSE have been ported;
  • fully compatible with MPI and OpenMP; typically, 1 MPI task per GPU card, with OpenMP threads used to exploit the remaining computational capabilities of the host;
  • inclusion of dedicated headers (dev_defs.h) to handle CPU and GPU compilation simultaneously;
  • GPU allocations integrated in YAMBO_ALLOC/YAMBO_FREE and memory module.
  • DevXlib (developed jointly with the QE team and hosted as a separate repo on GitLab) imported and extensively used to provide wrappers for memcpy, sync, init, and simple data operations.
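
The "1 MPI task per card" layout mentioned above can be illustrated with a small, language-neutral sketch. This is not Yambo code: the function names `assign_devices` and `threads_per_rank` are hypothetical, and the round-robin mapping is only one common convention, assumed here for illustration.

```python
def assign_devices(n_ranks_on_node: int, n_gpus_on_node: int) -> dict:
    """Map each node-local MPI rank to a GPU index (round-robin, assumed convention)."""
    if n_gpus_on_node < 1:
        raise ValueError("at least one GPU is required")
    return {rank: rank % n_gpus_on_node for rank in range(n_ranks_on_node)}


def threads_per_rank(n_cores_on_node: int, n_ranks_on_node: int) -> int:
    """Split the host cores evenly among the MPI ranks for OpenMP threading."""
    return max(1, n_cores_on_node // n_ranks_on_node)


# Hypothetical node: 4 GPUs, 32 host cores, one MPI rank per GPU.
layout = assign_devices(n_ranks_on_node=4, n_gpus_on_node=4)
threads = threads_per_rank(n_cores_on_node=32, n_ranks_on_node=4)
print(layout, threads)
```

With one rank per card, each rank owns a distinct device and the leftover host cores are shared out as OpenMP threads.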

New DIPOLES_driver

As part of the modularization process of the code, within the MaX project, all the subroutines dealing with the calculation and the I/O of the dipoles have been moved under the folder src/dipoles and the "dipoles" runlevel has been created. The DIPOLES_driver is now called directly by the yambo_driver. This made it possible to create a dedicated parallel scheme for the dipoles and thus to distribute the calculation more efficiently (in both time-to-solution and memory footprint). Later runlevels just need to load the pre-computed DIPOLES from disk. This also avoids strong load imbalance, for example in the calculation of the response function, where the dipoles are needed only at q=0.
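
The "compute once, load later" pattern described above can be sketched as follows. This is a minimal illustration, not Yambo's implementation: the file name `dipoles.db`, the pickle format, and the functions `compute_dipoles` and `load_or_compute_dipoles` are all hypothetical, and the "dipoles" are placeholder numbers rather than physical matrix elements.

```python
import os
import pickle
import tempfile


def compute_dipoles(n_bands):
    """Placeholder for the expensive step (not real physics)."""
    return [[float(i - j) for j in range(n_bands)] for i in range(n_bands)]


def load_or_compute_dipoles(path, n_bands):
    """Return the dipoles, reading them from disk if a database already exists."""
    if os.path.exists(path):
        with open(path, "rb") as fh:
            return pickle.load(fh), "loaded"
    dipoles = compute_dipoles(n_bands)
    with open(path, "wb") as fh:
        pickle.dump(dipoles, fh)
    return dipoles, "computed"


with tempfile.TemporaryDirectory() as tmp:
    db = os.path.join(tmp, "dipoles.db")  # hypothetical database name
    first, how_first = load_or_compute_dipoles(db, n_bands=3)    # dedicated dipoles step
    second, how_second = load_or_compute_dipoles(db, n_bands=3)  # a later runlevel
    print(how_first, how_second)
```

The first call plays the role of the dedicated dipoles step; the second stands in for a later runlevel that only reads the stored result.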

Modularization of the BSE subroutines.

The files
K.o K_correlation_collisions.o K_exchange_collisions.o
have been split into
K.o K_correlation_collisions.o K_exchange_collisions.o K_correlation_kernel.o K_exchange_kernel.o K_screened_interaction.o
This reduces code duplication and makes it easier to handle CUDA and OpenMP directives.
Moreover, the code is ready for a finite-q BSE implementation, which will likely be made available in the next release.

More:

  • Reorganization of the main yambo_driver. The main subroutine of the code has been cleaned and reorganized to allow easier implementation of new features;
  • p2y can now also read the output of the projwfc.x post-processing code (QE suite);
  • improved configuration of external libraries;
  • new mapping of the k-points introduced. It can be useful for gamma-centered grids in hexagonal cells, where the standard mapping may fail;
  • subroutine G_index_energy_factor introduced;
  • IO of some tables moved from integer/real to character to reduce disk use in real-time calculations;
  • general improvements in coulomb_cutoff for reduced dimensionality systems;
  • Modularization of the subroutines dealing with the input file. Subroutine
    src/interface/INT.F
    split into
    INIT.o INIT_read_command_line.o INIT_check_databases.o INIT_activate.o ;
  • Several bug-fixes.