Isambard AI
These are preliminary notes for how to install Firedrake on Isambard AI. This is based on a short period of early access in October 2024 and the installation was not thoroughly tested. PETSc and Firedrake were installed directly in the user's home directory.
This guide uses the following system modules and environment variables:
module load PrgEnv-gnu
module load cray-mpich
module load cray-python
unset PYTHONPATH
module load cray-libsci
module load cray-hdf5-parallel
module load cray-netcdf-hdf5parallel
module load cray-parallel-netcdf
module load xpmem
# Set preferred executables
CC=/opt/cray/pe/craype/2.7.30/bin/cc
CXX=/opt/cray/pe/craype/2.7.30/bin/CC
FORT=/opt/cray/pe/craype/2.7.30/bin/ftn
PYTHON=/opt/cray/pe/python/3.11.5/bin/python
For reference, at the time these notes were written these commands loaded the following modules:
> module list
Currently Loaded Modules:
1) brics/userenv/2.4 5) craype-arm-grace 9) cray-mpich/8.1.28 13) cray-netcdf-hdf5parallel/4.9.0.9
2) brics/default/1.0 6) libfabric/1.15.2.0 10) cray-python/3.11.5 14) cray-parallel-netcdf/1.12.3.9
3) gcc-native/12.3 7) craype-network-ofi 11) cray-libsci/23.12.5 15) xpmem/2.8.2-1.0_3.7__g84a27a5.shasta
4) craype/2.7.30 8) PrgEnv-gnu/8.5.0 12) cray-hdf5-parallel/1.12.2.9
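The same modules and variables are needed again in each interactive session later in this guide. One option, not part of the original notes, is to save the block above verbatim to a small helper script (the name ~/isambard-env.sh is just a placeholder) and source it at the start of every new shell:
. ~/isambard-env.sh   # hypothetical file containing the module loads and variables above
module list           # confirm the expected modules are loaded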
For more rapid debugging of installation dependencies this guide uses a separate PETSc installation. We clone the Firedrake fork of the PETSc repository and install it in the usual manner. The following difficulties were encountered:
- The --with-mpi-dir PETSc configure option couldn't be used on Isambard AI since the Cray MPICH mpi{cc,cxx,ftn} wrappers are linked to an ancient GCC. Instead we specify the --with-{cc,cxx,fc} options (a quick check of the wrappers is sketched after this list).
- PASTIX won't build, so we don't install it!
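As a quick sanity check, not part of the original notes, you can confirm that the Cray compiler wrappers selected above really drive the native GCC 12.3 toolchain, while the MPI wrappers report the older compiler mentioned above:
# Should all report GCC 12.3 (gcc-native/12.3) under PrgEnv-gnu
$CC --version
$CXX --version
$FORT --version
# For comparison, the MPI compiler wrapper reports the GCC it is linked against
mpicc --version   # may need the full path under the cray-mpich install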
# Setup installation variables
BASE_INSTALL_DIR=$HOME
MAKE_NP=32 # Use up to 32 cores when building
# Clone the Firedrake fork of the PETSc repository
cd $BASE_INSTALL_DIR
git clone https://github.com/firedrakeproject/petsc.git
# Issues:
# + mpi-dir on Isambard AI has the mpi{cc,cxx,ftn} wrappers linked to an ancient GCC
# + PASTIX won't build
cd $BASE_INSTALL_DIR/petsc
$PYTHON ./configure \
--with-cc=$CC \
--with-cxx=$CXX \
--with-fc=$FORT \
--with-python-exec=$PYTHON \
--COPTFLAGS="-O3 -mcpu=neoverse-v2" \
--CXXOPTFLAGS="-O3 -mcpu=neoverse-v2" \
--FOPTFLAGS="-O3 -mcpu=neoverse-v2" \
--with-c2html=0 \
--with-debugging=0 \
--with-fortran-bindings=0 \
--with-make-np=$MAKE_NP \
--with-shared-libraries=1 \
--with-zlib \
--with-blaslapack-include=$CRAY_PE_LIBSCI_PREFIX_DIR/include \
--with-blaslapack-lib=$CRAY_PE_LIBSCI_PREFIX_DIR/lib/libsci_gnu.so \
--with-hdf5-dir=$HDF5_DIR \
--with-netcdf-dir=$NETCDF_DIR \
--with-pnetcdf-dir=$PNETCDF_DIR \
--download-cmake \
--download-hwloc \
--download-hypre \
--download-metis \
--download-mumps \
--download-ptscotch \
--download-scalapack \
--download-suitesparse \
--download-superlu_dist \
PETSC_ARCH=real
make PETSC_DIR=$BASE_INSTALL_DIR/petsc PETSC_ARCH=real all
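The Firedrake steps later in this guide need to locate this PETSc build via PETSC_DIR and PETSC_ARCH, so it can be convenient to export them now (the same values are set again further down):
export PETSC_DIR=$BASE_INSTALL_DIR/petsc
export PETSC_ARCH=real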
This is not a requirement to complete the installation, but it is thoroughly recommended! The code snippet below starts an interactive session and tries running some of the PETSc examples.
Note that make check won't work correctly by itself: this isn't documented in the Isambard AI docs, but the --overlap --oversubscribe flags must be passed to srun for anything to work.
srun --time=00:05:00 --pty /bin/bash --login
module load PrgEnv-gnu
module load cray-mpich
module load cray-python
unset PYTHONPATH
module load cray-libsci
module load cray-hdf5-parallel
module load cray-netcdf-hdf5parallel
module load cray-parallel-netcdf
module load xpmem
# Set preferred executables
CC=/opt/cray/pe/craype/2.7.30/bin/cc
CXX=/opt/cray/pe/craype/2.7.30/bin/CC
FORT=/opt/cray/pe/craype/2.7.30/bin/ftn
PYTHON=/opt/cray/pe/python/3.11.5/bin/python
cd $HOME/petsc
make PETSC_DIR=$HOME/petsc PETSC_ARCH=real MPIEXEC_TAIL="--overlap --oversubscribe" check
exit
We are currently seeing errors from Hypre that look like:
xpmem_attach error: : Invalid argument
but the residual looks okay.
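These messages appear to come from the XPMEM single-copy path in Cray MPICH. If they become disruptive, one untested workaround (an assumption based on the Cray MPICH documentation, not verified on Isambard AI) may be to switch on-node transfers away from XPMEM:
# Untested: use CMA instead of XPMEM for on-node single-copy transfers
export MPICH_SMP_SINGLE_COPY_MODE=CMA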
More concerning, there are errors from HDF5:
> 0 ADIOI_CRAY_Calc_aggregator_pfl() ../../../../src/mpi/romio/adio/ad_cray/ad_cray_aggregate.c +77 should not get here : num_comp=1 off=-7926205430001303552
> 0 ADIOI_CRAY_Calc_aggregator_pfl() comps[0]= 1*-7926205430001303552 0--1
> MPICH ERROR [Rank 0] [job id 25331.10] [Fri Oct 18 10:42:04 2024] [nid001001] - Abort(86) (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 86) - process 0
>
> HDF5: infinite loop closing library
> L,G_top,S_top,T_top,T_top,T_top,... (the T_top token repeats for many more entries)
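The trace points at the Cray MPI-IO (ROMIO) collective-buffering aggregator. As a purely diagnostic experiment, again not verified on Isambard AI, disabling collective buffering for writes via Cray MPICH's MPI-IO hints may help establish whether that layer is at fault:
# Untested: disable ROMIO collective buffering for writes on all files
export MPICH_MPIIO_HINTS="*:romio_cb_write=disable"
# Ask Cray MPICH to print the hints it actually applies
export MPICH_MPIIO_HINTS_DISPLAY=1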
Once PETSc is installed, Firedrake can be installed honouring the PETSc directory.
It is not possible to install with VTK support as Kitware do not currently release wheels for Linux on ARM.
Frustratingly, this build will probably fail to build libsupermesh, as the system CMake is incapable of detecting the correct flags to pass to the build process! This would be fixed by addressing #3806.
export PETSC_DIR=$HOME/petsc
export PETSC_ARCH=real
cd $HOME
# Fetch and install Firedrake
curl -O https://raw.githubusercontent.com/firedrakeproject/firedrake/master/scripts/firedrake-install
$PYTHON firedrake-install \
--no-package-manager \
--honour-petsc-dir \
--mpicc=$CC \
--mpicxx=$CXX \
--mpif90=$FORT \
--mpiexec=srun \
--no-vtk \
--venv-name=firedrake-real
The above will fail, so we must then manually install libsupermesh:
. $HOME/firedrake-real/bin/activate
cd $VIRTUAL_ENV/src/libsupermesh/build
$HOME/petsc/real/bin/cmake .. \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=$HOME/firedrake-real \
-DMPI_C_COMPILER=$CC \
-DMPI_CXX_COMPILER=$CXX \
-DMPI_Fortran_COMPILER=$FORT \
-DCMAKE_Fortran_COMPILER=$FORT \
-DMPIEXEC_EXECUTABLE=srun \
-DCMAKE_Fortran_FLAGS=-fallow-argument-mismatch
make
make install
firedrake-update
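Before moving on it may be worth confirming, though this is not in the original notes, that libsupermesh really did land in the virtualenv prefix used above:
# Check the installed library (the exact lib/ vs lib64/ layout may vary)
ls $HOME/firedrake-real/lib*/libsupermesh*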
This is not a requirement to complete the installation, but it is thoroughly recommended! The code snippet below starts an interactive session and runs some serial tests and one of the demos. The full Firedrake test suite won't run due to longstanding issues with system MPI distributions that are not vanilla MPICH.
srun --time=00:15:00 --pty /bin/bash --login
module load PrgEnv-gnu
module load cray-mpich
module load cray-python
unset PYTHONPATH
module load cray-libsci
module load cray-hdf5-parallel
module load cray-netcdf-hdf5parallel
module load cray-parallel-netcdf
module load xpmem
# Set preferred executables
CC=/opt/cray/pe/craype/2.7.30/bin/cc
CXX=/opt/cray/pe/craype/2.7.30/bin/CC
FORT=/opt/cray/pe/craype/2.7.30/bin/ftn
PYTHON=/opt/cray/pe/python/3.11.5/bin/python
export PETSC_DIR=$HOME/petsc
export PETSC_ARCH=real
. $HOME/firedrake-real/bin/activate
cd $VIRTUAL_ENV/src/firedrake
pytest -v tests/regression/ -m "not parallel" -k "poisson_strong or stokes_mini or dg_advection"
cd $VIRTUAL_ENV/src/firedrake/demos
make
cd helmholtz
# Need to remove the line that creates a VTKFile
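# One untested way to do this is to delete every line mentioning VTKFile
# (assumes the demo's output lines contain the token VTKFile; adjust if not):
sed -i '/VTKFile/d' helmholtz.py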
python helmholtz.py
srun --overlap -n 2 python helmholtz.py
srun --overlap -n 4 python helmholtz.py
exit
The command:
srun --overlap --oversubscribe -n 3 pytest -v tests/regression/ -m "parallel[3]" -k "poisson_strong or stokes_mini or dg_advection"
should run but does not. This requires further investigation.