Toolchain 2023.5 (#3196)
* Update install_elpa.sh

update elpa version to 2023.05.001

* Update cmake version to 3.27.6

* Update gcc version to 13.2.0

* Update install_openblas.sh

- update OpenBLAS version to 0.3.24
- update NUM_THREADS to 128 in compiling

* Set default intel compiler to `icpx`

* update build.sh scripts

* Update README.md

* Update README.md

* modified toolchain*.sh

* Update toolchain_intel-mpich.sh

* remove error message in build*.sh when do install

* Update README.md

* minor update

* Update README.md

* version 2023.5 tag

* Update README for GPU version

* update optional message in README
QuantumMisaka authored Nov 11, 2023
1 parent 35bd3a7 commit d62ca6f
Showing 13 changed files with 133 additions and 67 deletions.
78 changes: 69 additions & 9 deletions toolchain/README.md
@@ -1,5 +1,5 @@
# The ABACUS Toolchain
Version 2023.4
Version 2023.5

## Author
[QuantumMisaka](https://github.com/QuantumMisaka)
@@ -31,14 +31,16 @@ and give setup files that you can use to compile ABACUS.
- [ ] Better compilation method for ABACUS-DEEPMD and ABACUS-DEEPKS.
- [ ] A better `setup` and toolchain code structure.
- [ ] Modulefile generation scripts.
- [ ] Support for `acml` toolchain (scripts are partly in toolchain now) or other AMD compiler and math lib like `AOCL` and `AOCC`
- [ ] Support for AMD compiler and math lib like `AOCL` and `AOCC`


## Usage Online & Offline
The main script is `install_abacus_toolchain.sh`,
which uses the scripts in the `scripts` directory
to compile and install the dependencies of ABACUS.

Run `./install_abacus_toolchain.sh -h` to see the help message.

**Notice: You SHOULD `source` or `module load` the related environments before using the toolchain for installation, especially for `gcc` or `intel-oneAPI`! For example, `module load mkl mpi icc compiler`.**

**Notice: You SHOULD keep your environments consistent. For example, you CANNOT load `intel-OneAPI` environments while using the gcc toolchain!**
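
One way to confirm that the environment is actually loaded before starting the toolchain is to check that the expected compilers are on `PATH`. The following is a minimal sketch, not part of the toolchain: the helper name `missing_tools` is hypothetical, and the list of tools to check (`icpx`, `mpiicpc`) is an assumption taken from the Intel builder script; adjust it to the toolchain you use.

```shell
# Hypothetical helper: print the names of the given commands that are
# not found on PATH (empty output means everything was found).
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" > /dev/null 2>&1; then
            missing="${missing:+$missing }$tool"
        fi
    done
    echo "$missing"
}

# Assumed tool list for an Intel-oneAPI based build; change as needed.
result=$(missing_tools icpx mpiicpc)
if [ -n "$result" ]; then
    echo "not found on PATH: $result (did you 'source' or 'module load' the environment?)"
fi
```

The check only reports what is missing; it does not load anything itself, so it is safe to run in any shell.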
@@ -51,7 +53,7 @@ to compile install dependencies of ABACUS.

All packages are downloaded from [cp2k-static/download](https://www.cp2k.org/static/downloads) by `wget`, then compiled and installed in the `install` directory by the toolchain scripts, except for:
- `CEREAL` which will be downloaded from [CEREAL](https://github.com/USCiLab/cereal)
- `LibNPY` which will be downloaded from [LIBNPY](https://github.com/llohse/libnpy)
- `Libnpy` which will be downloaded from [LIBNPY](https://github.com/llohse/libnpy)
- `LibRI` which will be downloaded from [LibRI](https://github.com/abacusmodeling/LibRI)
- `LibCOMM` which will be downloaded from [LibComm](https://github.com/abacusmodeling/LibComm)
Notice: These packages are downloaded by `wget` from `github.com`, which can be hard to reach from networks in China. You may need to use the offline installation method.
@@ -78,7 +80,28 @@ just by using this toolchain
> cp ***.tar.gz build/
```

Notice: for `CEREAL`, `LibNPY`, `LibRI` and `LibCOMM`,
Default versions of the required dependencies:
- `cmake` 3.27.6
- `gcc` 13.2.0 (never installed by default; the system `gcc` is used instead)
- `OpenMPI` 4.1.5
- `MPICH` 4.1.2
- `OpenBLAS` 0.3.24 (the Intel toolchain needs the `get_vars.sh` tool from it)
- `ScaLAPACK` 2.2.1
- `FFTW` 3.3.10
- `LibXC` 6.2.2
- `ELPA` 2023.05.001
- `CEREAL` 1.3.2
Intel-oneAPI must be installed manually from Intel by the user or the server administrator:
[Intel-oneAPI](https://www.intel.cn/content/www/cn/zh/developer/tools/oneapi/toolkits.html)

The dependencies below are optional and are NOT installed by default:
- `LibTorch` 2.0.1
- `Libnpy` 0.1.0
- `LibRI` 0.1.0
- `LibComm` 0.1.0
Users can install them by passing `--with-*=install` in toolchain*.sh; the default is `no`.

Notice: for `CEREAL`, `Libnpy`, `LibRI` and `LibComm`,
you need to download them from github.com,
rename them to the expected format, and put them in the `build` directory as well,
e.g.:
@@ -118,7 +141,7 @@ If compilation is successful, a message will be shown like this:
> ./build_abacus_intel.sh
> or you can modify the builder scripts to suit your needs.
```
You can run build_abacus_gnu.sh or build_abacus_intel.sh to build ABACUS
You can run `build_abacus_gnu.sh` or `build_abacus_intel.sh` to build ABACUS
with the gnu-toolchain or intel-toolchain respectively; the builder scripts will
automatically locate the environment and compile ABACUS.
You can manually change the builder scripts to suit your needs.
@@ -139,13 +162,20 @@ or you can also do it in a more completely way:
> rm -rf install build/*/* build/OpenBLAS*/ build/setup_*
```

Users can get help messages by simply:
## Common Problem and Solution
### GPU version of ABACUS
Add the following options in build*.sh:
```shell
> ./install_abacus_toolchain.sh -h # or --help
cmake -B $BUILD_DIR -DCMAKE_INSTALL_PREFIX=$PREFIX \
-DCMAKE_CXX_COMPILER=icpx \
-DMPI_CXX_COMPILER=mpiicpc \
......
-DUSE_CUDA=1 \
-DCMAKE_CUDA_COMPILER=${path to cuda toolkit}/bin/nvcc \
......
```


## Common Problem and Solution
### shell problem
If you encounter a problem like:
```shell
/bin/bash^M: bad interpreter: No such file or directory
@@ -157,6 +187,36 @@ or `permission denied` problem, you can simply run:
You can also fix the `permission denied` problem via `chmod +x`
if `pre_set.sh` lacks execute permission.
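
The `^M` in the error message is a carriage return left over from Windows-style line endings. As a generic illustration (this is not the toolchain's own fix, and the helper name `strip_cr` is hypothetical), the carriage returns can be removed with standard tools; the sketch below demonstrates it on a throwaway file rather than on a real toolchain script.

```shell
# Hypothetical helper: remove every carriage return from a file in place.
strip_cr() {
    tmp=$(mktemp) || return 1
    # tr deletes all CR bytes; writing back with cat keeps the file's permissions.
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}

# Demonstration on a temporary file with CRLF line endings.
demo=$(mktemp)
printf '#!/bin/bash\r\necho ok\r\n' > "$demo"
strip_cr "$demo"
cr_left=$(tr -cd '\r' < "$demo" | wc -c)
echo "carriage returns left: $cr_left"
```

Note that `tr -d '\r'` removes carriage returns anywhere in the file, which is what you want for shell scripts but not necessarily for binary files.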

### libtorch and deepks problem
If the deepks feature has problems, you can manually change the libtorch version
from 2.0.1 to 1.12.0 in `toolchain/scripts/stage4/install_libtorch.sh`.
You can also install ABACUS without deepks by removing all deepks-related options.

NOTICE: if you want the deepks feature, your intel-mkl environment must be accessible during the build. You can check this in `build_abacus_gnu.sh`.

### deepmd feature problem
If you encounter a problem like `GLIBCXX_3.4.29 not found`, your `gcc` version is lower than `libdeepmd` requires.

In the author's tests, `gcc` > 11.3.1 is needed to enable the deepmd feature in ABACUS.
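
A quick way to check the installed `gcc` against this threshold is a version comparison with `sort -V`. This is a minimal sketch, assuming GNU coreutils and a `gcc` that supports `-dumpfullversion` (GCC 7 or newer); the helper name `version_ge` is hypothetical. The threshold is taken from the note above and is empirical; the builder-script comments in this commit state `>= 11.3.0`, so treat the exact value as adjustable.

```shell
# Hypothetical helper: succeed when version $1 is greater than or equal to $2.
version_ge() {
    # sort -V orders version strings; if $2 comes first (or both are equal), $1 >= $2.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

required="11.3.1"   # empirical threshold from the note above; adjust if needed
current=$(gcc -dumpfullversion 2>/dev/null || echo "0")
if version_ge "$current" "$required"; then
    echo "gcc $current looks new enough for deepmd"
else
    echo "gcc $current is older than $required; expect GLIBCXX errors with deepmd"
fi
```

If `gcc` is missing or too old to report its full version, the sketch falls back to `0` and prints the warning branch.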

### ELPA problem via Intel-oneAPI toolchain in AMD server
The default compilers for Intel-oneAPI are `icpx` and `icx`, which cause problems when compiling ELPA on AMD servers.

The best way is to change `icpx` to `icpc` and `icx` to `icc`. Users can switch manually in toolchain*.sh via `--with-intel-classic=yes`.
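
The mapping described here can be written down as a small lookup. This is only an illustrative sketch of the `yes`/`no` behaviour stated in this commit (classic means `icc`/`icpc`, otherwise `icx`/`icpx`); the function name `intel_compilers` is hypothetical and is not how the toolchain scripts implement the switch.

```shell
# Hypothetical helper mirroring the mapping described above:
# classic = yes -> icc/icpc, anything else -> icx/icpx.
intel_compilers() {
    if [ "$1" = "yes" ]; then
        echo "icc icpc"
    else
        echo "icx icpx"
    fi
}

intel_classic="yes"   # the value that --with-intel-classic=yes is described as selecting
compilers=$(intel_compilers "$intel_classic")
echo "C and C++ compilers: $compilers"
```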


### LibRI and LibComm problem
(Problems sometimes occur when compiling with LibRI and LibComm; more detailed information is needed.)


### Intel-oneAPI problem
Sometimes Intel-oneAPI has problems linking `mpirun`,
which consistently shows up in the 2023.2.0 version of MPI in Intel-oneAPI.
Running `source /path/to/setvars.sh` or installing another version of IntelMPI may help.

More problems and possible solutions can be found in [#2928](https://github.com/deepmodeling/abacus-develop/issues/2928)



## Advanced Installation Usage

15 changes: 8 additions & 7 deletions toolchain/build_abacus_gnu.sh
@@ -18,13 +18,13 @@ source $INSTALL_DIR/setup
cd $ABACUS_DIR
ABACUS_DIR=$(pwd)

BUILD_DIR=build_abacus
BUILD_DIR=build_abacus_gnu
rm -rf $BUILD_DIR

PREFIX=$ABACUS_DIR
LAPACK=$INSTALL_DIR/openblas-0.3.23/lib
LAPACK=$INSTALL_DIR/openblas-0.3.24/lib
SCALAPACK=$INSTALL_DIR/scalapalack-2.2.1/lib
ELPA=$INSTALL_DIR/elpa-2021.11.002/cpu
ELPA=$INSTALL_DIR/elpa-2023.05.001/cpu
FFTW3=$INSTALL_DIR/fftw-3.3.10
CEREAL=$INSTALL_DIR/cereal-1.3.2/include/cereal
LIBXC=$INSTALL_DIR/libxc-6.2.2
@@ -46,7 +46,6 @@ cmake -B $BUILD_DIR -DCMAKE_INSTALL_PREFIX=$PREFIX \
-DENABLE_LCAO=ON \
-DENABLE_LIBXC=ON \
-DUSE_OPENMP=ON \
-DENABLE_ASAN=OFF \
-DUSE_ELPA=ON \
# -DENABLE_DEEPKS=1 \
# -DTorch_DIR=$LIBTORCH \
@@ -58,12 +57,14 @@ cmake -B $BUILD_DIR -DCMAKE_INSTALL_PREFIX=$PREFIX \
# -DTensorFlow_DIR=$DEEPMD \

# # add mkl env for libtorch to link
# # gnu-toolchain will lack of -lmkl when load libtorch
# # need to fix -- zhaoqing in 2023-09-02
# if you want to install libtorch, mkl should be loaded during the build
# to provide -lmkl when linking libtorch
# module load mkl

# if you want to include deepmd, your gcc version should be >= 11.3.0

cmake --build $BUILD_DIR -j `nproc`
cmake --install $BUILD_DIR
cmake --install $BUILD_DIR 2>/dev/null

# generate abacus_env.sh
cat << EOF > "${TOOL}/abacus_env.sh"
23 changes: 12 additions & 11 deletions toolchain/build_abacus_intel-mpich.sh
@@ -4,7 +4,7 @@
#SBATCH -n 16
#SBATCH -o install.log
#SBATCH -e install.err
# install ABACUS with libxc and deepks
# build and install ABACUS with libxc; deepks and deepmd can optionally be enabled
# JamesMisaka in 2023.08.31

# Build ABACUS by intel-toolchain with mpich
@@ -19,21 +19,21 @@ source $INSTALL_DIR/setup
cd $ABACUS_DIR
ABACUS_DIR=$(pwd)

BUILD_DIR=build_abacus
BUILD_DIR=build_abacus_intel-mpich
rm -rf $BUILD_DIR

PREFIX=$ABACUS_DIR
ELPA=$INSTALL_DIR/elpa-2021.11.002/cpu
ELPA=$INSTALL_DIR/elpa-2023.05.001/cpu
CEREAL=$INSTALL_DIR/cereal-1.3.2/include/cereal
LIBXC=$INSTALL_DIR/libxc-6.2.2
LIBTORCH=$INSTALL_DIR/libtorch-2.0.1/share/cmake/Torch
LIBNPY=$INSTALL_DIR/libnpy-0.1.0/include
# LIBTORCH=$INSTALL_DIR/libtorch-2.0.1/share/cmake/Torch
# LIBNPY=$INSTALL_DIR/libnpy-0.1.0/include
# LIBRI=$INSTALL_DIR/LibRI-0.1.0
# LIBCOMM=$INSTALL_DIR/LibComm-0.1.0
# DEEPMD=$HOME/apps/anaconda3/envs/deepmd

cmake -B $BUILD_DIR -DCMAKE_INSTALL_PREFIX=$PREFIX \
-DCMAKE_CXX_COMPILER=icpc \
-DCMAKE_CXX_COMPILER=icpx \
-DMPI_CXX_COMPILER=mpicxx \
-DMKLROOT=$MKLROOT \
-DELPA_DIR=$ELPA \
@@ -42,19 +42,20 @@ cmake -B $BUILD_DIR -DCMAKE_INSTALL_PREFIX=$PREFIX \
-DENABLE_LCAO=ON \
-DENABLE_LIBXC=ON \
-DUSE_OPENMP=ON \
-DENABLE_ASAN=OFF \
-DUSE_ELPA=ON \
-DENABLE_DEEPKS=1 \
-DTorch_DIR=$LIBTORCH \
-Dlibnpy_INCLUDE_DIR=$LIBNPY \
# -DENABLE_DEEPKS=1 \
# -DTorch_DIR=$LIBTORCH \
# -Dlibnpy_INCLUDE_DIR=$LIBNPY \
# -DENABLE_LIBRI=ON \
# -DLIBRI_DIR=$LIBRI \
# -DLIBCOMM_DIR=$LIBCOMM \
# -DDeePMD_DIR=$DEEPMD \
# -DTensorFlow_DIR=$DEEPMD \

# if you want to include deepmd, your gcc version should be >= 11.3.0

cmake --build $BUILD_DIR -j `nproc`
cmake --install $BUILD_DIR
cmake --install $BUILD_DIR 2>/dev/null

# generate abacus_env.sh
cat << EOF > "${TOOL}/abacus_env.sh"
22 changes: 11 additions & 11 deletions toolchain/build_abacus_intel.sh
@@ -19,44 +19,44 @@ source $INSTALL_DIR/setup
cd $ABACUS_DIR
ABACUS_DIR=$(pwd)

BUILD_DIR=build_abacus
BUILD_DIR=build_abacus_intel
rm -rf $BUILD_DIR

PREFIX=$ABACUS_DIR
ELPA=$INSTALL_DIR/elpa-2021.11.002/cpu
ELPA=$INSTALL_DIR/elpa-2023.05.001/cpu
CEREAL=$INSTALL_DIR/cereal-1.3.2/include/cereal
LIBXC=$INSTALL_DIR/libxc-6.2.2
LIBTORCH=$INSTALL_DIR/libtorch-2.0.1/share/cmake/Torch
LIBNPY=$INSTALL_DIR/libnpy-0.1.0/include
# LIBTORCH=$INSTALL_DIR/libtorch-2.0.1/share/cmake/Torch
# LIBNPY=$INSTALL_DIR/libnpy-0.1.0/include
# LIBRI=$INSTALL_DIR/LibRI-0.1.0
# LIBCOMM=$INSTALL_DIR/LibComm-0.1.0
# DEEPMD=$HOME/apps/anaconda3/envs/deepmd

# if use deepks and deepmd
cmake -B $BUILD_DIR -DCMAKE_INSTALL_PREFIX=$PREFIX \
-DCMAKE_CXX_COMPILER=icpc \
-DCMAKE_CXX_COMPILER=icpx \
-DMPI_CXX_COMPILER=mpiicpc \
-DMKLROOT=$MKLROOT \
-DELPA_DIR=$ELPA \
-DCEREAL_INCLUDE_DIR=$CEREAL \
-DLibxc_DIR=$LIBXC \
-DENABLE_LCAO=ON \
-DENABLE_LIBXC=ON \
-DENABLE_LIBRI=OFF \
-DUSE_OPENMP=ON \
-DENABLE_ASAN=OFF \
-DUSE_ELPA=ON \
-DENABLE_DEEPKS=1 \
-DTorch_DIR=$LIBTORCH \
-Dlibnpy_INCLUDE_DIR=$LIBNPY \
# -DENABLE_DEEPKS=1 \
# -DTorch_DIR=$LIBTORCH \
# -Dlibnpy_INCLUDE_DIR=$LIBNPY \
# -DENABLE_LIBRI=ON \
# -DLIBRI_DIR=$LIBRI \
# -DLIBCOMM_DIR=$LIBCOMM \
# -DDeePMD_DIR=$DEEPMD \
# -DTensorFlow_DIR=$DEEPMD \

cmake --build $BUILD_DIR -j `nproc`
cmake --install $BUILD_DIR
cmake --install $BUILD_DIR 2>/dev/null

# if you want to include deepmd, your gcc version should be >= 11.3.0

# generate abacus_env.sh
cat << EOF > "${TOOL}/abacus_env.sh"
6 changes: 3 additions & 3 deletions toolchain/install_abacus_toolchain.sh
@@ -118,7 +118,7 @@ The --enable-FEATURE options follow the rules:
--enable-FEATURE=no Disable this particular feature
--enable-FEATURE The option keyword alone is equivalent to
--enable-FEATURE=yes
===== NOTICE: THESE FEATURES ARE NOT INCLUDED IN ABACUS =====
--enable-cuda Turn on GPU (CUDA) support (can be combined
with --enable-opencl).
Default = no
@@ -302,9 +302,9 @@ enable_tsan="__FALSE__"
enable_opencl="__FALSE__"
enable_cuda="__FALSE__"
enable_hip="__FALSE__"
export intel_classic="yes"
export intel_classic="no"
# no, then icc->icx, icpc->icpx,
# which cannot compile elpa-2021 and fftw.3.3.10 in some place
# which cannot compile elpa in AMD server
# due to some so-called cross-compile problem
# and will lead to problem in force calculation
# but icx is recommended by intel compiler
2 changes: 1 addition & 1 deletion toolchain/scripts/VERSION
@@ -1,2 +1,2 @@
# version file to force a rebuild of the entire toolchain
VERSION="2023.4"
VERSION="2023.5"
6 changes: 3 additions & 3 deletions toolchain/scripts/stage0/install_cmake.sh
@@ -20,13 +20,13 @@ cd "${BUILDDIR}"
case "${with_cmake}" in
__INSTALL__)
echo "==================== Installing CMake ===================="
cmake_ver="3.26.3"
cmake_ver="3.27.6"
if [ "${OPENBLAS_ARCH}" = "arm64" ]; then
cmake_arch="linux-aarch64"
cmake_sha256="b002c22b926aacd6fefe64bcf08620216088eb72f55ac532b7bcfd4d93443d50"
cmake_sha256="a83e01ed1cdf44c2e33e0726513b9a35a8c09e3b5a126fd720b3c8a9d5552368"
elif [ "${OPENBLAS_ARCH}" = "x86_64" ]; then
cmake_arch="linux-x86_64"
cmake_sha256="8ec0ef24375a1d0e78de2f790b4545d0718acc55fd7e2322ecb8e135696c77fe"
cmake_sha256="8c449dabb2b2563ec4e6d5e0fb0ae09e729680efab71527b59015131cea4a042"
else
report_error ${LINENO} \
"cmake installation for ARCH=${ARCH} is not supported. You can try to use the system installation using the flag --with-cmake=system instead."
4 changes: 2 additions & 2 deletions toolchain/scripts/stage0/install_gcc.sh
@@ -6,8 +6,8 @@
[ "${BASH_SOURCE[0]}" ] && SCRIPT_NAME="${BASH_SOURCE[0]}" || SCRIPT_NAME=$0
SCRIPT_DIR="$(cd "$(dirname "$SCRIPT_NAME")/.." && pwd -P)"

gcc_ver="13.1.0"
gcc_sha256="bacd4c614d8bd5983404585e53478d467a254249e0f1bb747c8bc6d787bd4fa2"
gcc_ver="13.2.0"
gcc_sha256="8cb4be3796651976f94b9356fa08d833524f62420d6292c5033a9a26af315078"

source "${SCRIPT_DIR}"/common_vars.sh
source "${SCRIPT_DIR}"/tool_kit.sh
10 changes: 5 additions & 5 deletions toolchain/scripts/stage2/install_openblas.sh
@@ -6,8 +6,8 @@
[ "${BASH_SOURCE[0]}" ] && SCRIPT_NAME="${BASH_SOURCE[0]}" || SCRIPT_NAME=$0
SCRIPT_DIR="$(cd "$(dirname "$SCRIPT_NAME")/.." && pwd -P)"

openblas_ver="0.3.23" # Keep in sync with get_openblas_arch.sh
openblas_sha256="5d9491d07168a5d00116cdc068a40022c3455bf9293c7cb86a65b1054d7e5114"
openblas_ver="0.3.24" # Keep in sync with get_openblas_arch.sh
openblas_sha256="ceadc5065da97bd92404cac7254da66cc6eb192679cf1002098688978d4d5132"
openblas_pkg="OpenBLAS-${openblas_ver}.tar.gz"

source "${SCRIPT_DIR}"/common_vars.sh
@@ -76,7 +76,7 @@ case "${with_openblas}" in
make -j $(get_nprocs) \
MAKE_NB_JOBS=0 \
TARGET=${TARGET} \
NUM_THREADS=64 \
NUM_THREADS=128 \
USE_THREAD=1 \
USE_OPENMP=1 \
NO_AFFINITY=1 \
@@ -88,7 +88,7 @@ case "${with_openblas}" in
make -j $(get_nprocs) \
MAKE_NB_JOBS=0 \
TARGET=NEHALEM \
NUM_THREADS=64 \
NUM_THREADS=128 \
USE_THREAD=1 \
USE_OPENMP=1 \
NO_AFFINITY=1 \
@@ -100,7 +100,7 @@ case "${with_openblas}" in
make -j $(get_nprocs) \
MAKE_NB_JOBS=0 \
TARGET=${TARGET} \
NUM_THREADS=64 \
NUM_THREADS=128 \
USE_THREAD=1 \
USE_OPENMP=1 \
NO_AFFINITY=1 \