Request to add toolchain for NEC #53
base: develop
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -60,29 +60,6 @@ Balthasar Reuter ([email protected]) | |
move parameter structures to constant memory. To enable this variant, | ||
a suitable CUDA installation is required and the `--with-cuda` flag | ||
needs to be passed at the build stage. | ||
- **dwarf-cloudsc-gpu-scc-cuf-k-caching**: GPU-enabled and further | ||
Review comment: I think this here was deleted by accident? Or was this intended?
Review comment: It'd also be great if you could add a small section (or bullet point) to the README to tell users about the availability of the NEC variant. Probably a small paragraph?
Reply: Sorry, the deleted part was not intentional. I thought I already added how to compile on NEC in the README file. I'll check that again.
||
optimized version of CLOUDSC that uses the SCC loop layout in | ||
combination with loop fusion and temporary local array demotion, implemented | ||
using CUDA-Fortran (CUF). To enable this variant, | ||
a suitable CUDA installation is required and the `--with-cuda` flag | ||
needs to be passed at the build stage. | ||
- **CUDA C prototypes**: To enable these variants, a suitable | ||
CUDA installation is required and the `--with-cuda` flag needs | ||
to be passed at the build stage. | ||
- **dwarf-cloudsc-cuda**: GPU-enabled, CUDA C version of CLOUDSC. | ||
- **dwarf-cloudsc-cuda-hoist**: GPU-enabled, optimized CUDA C version | ||
of CLOUDSC including host side hoisted temporary local variables. | ||
- **dwarf-cloudsc-cuda-k-caching**: GPU-enabled, further optimized CUDA | ||
C version of CLOUDSC including loop fusion and temporary local | ||
array demotion. | ||
- **dwarf-cloudsc-gpu-scc-field**: GPU-enabled and optimized version of | ||
CLOUDSC that uses the SCC loop layout, and a dedicated Fortran FIELD | ||
API to manage device offload and copyback. The intent is to demonstrate | ||
the explicit use of pinned host memory to speed up data transfers, as | ||
provided by the shipped prototype implementation, and investigate the | ||
effect of different data storage allocation layouts. To enable this | ||
variant, a suitable CUDA installation is required and the | ||
`--with-cuda` flag needs to be passed at the build stage. | ||
|
||
## Download and Installation | ||
|
||
|
@@ -231,6 +208,17 @@ cd build | |
./bin/dwarf-cloudsc-fortran 4 16384 32 # The cleaned-up Fortran | ||
./bin/dwarf-cloudsc-c 4 16384 32 # The standalone C version | ||
``` | ||
### Building on NEC SX-AURORA TSUBAS | ||
Review comment: Is the VE not called TSUBASA? Applies also to the line below.
Reply: No, VE is vector engine and NEC SX-AURORA TSUBAS is typically used as the architecture.
||
To build on NEC SX-AURORA TSUBAS system, run the following commands | ||
|
||
```sh | ||
./cloudsc-bundle create | ||
HDF5_ROOT=HDF5-installation-PATH ./cloudsc-bundle build --arch arch/ecmwf/aurora/nec/4.0.0/ [--single-precision] [--with-mpi] --hdf5 ON --cloudsc-fortran ON --cloudsc-prototype1 OFF --verbose --log DEBUG | ||
``` | ||
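The bracketed options in the build command above are optional. As a rough illustration only, the sketch below assembles that command line from two switches. The helper name `nec_build_cmd` and its arguments are assumptions made for this example and are not part of `cloudsc-bundle`; the `HDF5_ROOT` prefix and the logging flags are left out for brevity. It prints the command rather than running it, so the flags can be checked first.

```sh
# Hypothetical helper (not part of cloudsc-bundle): print the NEC build
# command from the README, adding the optional flags only when requested.
#   $1: "single" to add --single-precision, anything else for double
#   $2: "mpi" to add --with-mpi
nec_build_cmd() {
  cmd="./cloudsc-bundle build --arch arch/ecmwf/aurora/nec/4.0.0/"
  if [ "${1:-}" = "single" ]; then cmd="$cmd --single-precision"; fi
  if [ "${2:-}" = "mpi" ]; then cmd="$cmd --with-mpi"; fi
  cmd="$cmd --hdf5 ON --cloudsc-fortran ON --cloudsc-prototype1 OFF"
  printf '%s\n' "$cmd"
}

# Example: single precision with MPI
nec_build_cmd single mpi
```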
|
||
Currently available `NEC ompiler/version` selections are: | ||
Review comment: "compiler" is missing a "c"
Reply: Thanks
||
|
||
* `nec/4.0.0 (nfort, ncc, nc++)` | ||
Review comment: Could you also add an example invocation and the approximate performance that can be expected?
Reply: Yes, I'll add it
||
|
||
### Running on ECMWF's Atos BullSequana XH2000 | ||
|
||
|
@@ -272,27 +260,6 @@ srun bash -c "CUDA_VISIBLE_DEVICES=\$SLURM_LOCALID bin/dwarf-cloudsc-gpu-scc-hoi | |
|
||
In principle, the same should work for multi-node execution (`-N 2`, `-N 4` etc.) once interconnect issues are resolved. | ||
|
||
### GPU runs: Timing device kernels and data transfers | ||
Review comment: Please re-add this paragraph
||
|
||
For GPU-enabled runs two internal timer results are reported: | ||
|
||
* The isolated compute time of the main compute kernel on device (where `#BLKS == 1`) | ||
* The overall time of the execution loop including data offload and copyback | ||
|
||
It is important to note that due to the nature of the kernel, data | ||
transfer overheads will dominate timings, and that most supported GPU | ||
variants aim to optimise compute kernel timings only. However, a | ||
dedicated variant `dwarf-cloudsc-gpu-scc-field` has been added to | ||
explore host-side memory pinning, which improves data transfer times, | ||
and alternative data layout strategies. By default, this variant allocates | ||
each array variable individually in pinned memory. A runtime flag | ||
`CLOUDSC_PACKED_STORAGE=ON` can be used to enable "packed" storage, | ||
where multiple arrays are stored in a single base allocation, e.g. | ||
|
||
```sh | ||
NV_ACC_CUDA_HEAPSIZE=8G CLOUDSC_PACKED_STORAGE=ON ./bin/dwarf-cloudsc-gpu-scc-field 1 80000 128 | ||
``` | ||
|
||
## Loki transformations for CLOUDSC | ||
|
||
[Loki](https://github.com/ecmwf-ifs/loki) is an in-house developed | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
# Source me to get the correct configure/build/run environment | ||
|
||
# Store tracing and disable (module is *way* too verbose) | ||
{ tracing_=${-//[^x]/}; set +x; } 2>/dev/null | ||
|
||
module_load() { | ||
echo "+ module load $1" | ||
module load $1 | ||
} | ||
module_unload() { | ||
echo "+ module unload $1" | ||
module unload $1 | ||
} | ||
|
||
export FC=nfort | ||
export CC=ncc | ||
export CXX=nc++ | ||
|
||
set -x | ||
|
||
# Increase stack size to maximum | ||
ulimit -S -s unlimited | ||
|
||
# Enable floating point error trapping at run time | ||
export VE_FPE_ENABLE=DIV,INV,FOF,FUF,INE | ||
|
||
|
||
export PATH="/local/hdd/nabr/openmpi/nvhpc-nompi/20.9/bin:$PATH" | ||
Review comment: I don't think this absolute path setting should be necessary - also, it looks like it is carried over from volta?
Reply: No, I forgot to remove it
||
|
||
# Restore tracing to stored setting | ||
if [[ -n "$tracing_" ]]; then set -x; else set +x; fi | ||
|
||
export ECBUILD_TOOLCHAIN="./toolchain.cmake" |
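The tracing save/restore in this script uses `${-//[^x]/}` and `[[ ... ]]`, which plain POSIX `sh` does not provide. As a hedged aside, the same idea can be written portably; the function and variable names below are hypothetical and not part of the PR.

```sh
# Portable sketch of the save/restore idiom; names are illustrative only.
pause_tracing() {
  # $- lists the active option letters; "x" means xtrace is on.
  case $- in
    *x*) tracing_was_on=1 ;;
    *)   tracing_was_on=0 ;;
  esac
  set +x
}

resume_tracing() {
  # Put xtrace back into the state recorded by pause_tracing.
  if [ "${tracing_was_on:-0}" = 1 ]; then set -x; else set +x; fi
}

pause_tracing 2>/dev/null   # hide the trace output of the pause itself
# ... noisy commands such as `module load` would go here ...
resume_tracing
```

Because shell options are global to the sourcing shell, calling the two functions around a noisy section leaves the caller's tracing state as it was before.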
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,89 @@ | ||
|
||
#################################################################### | ||
# COMPILER | ||
#################################################################### | ||
|
||
include( /opt/nec/ve/share/cmake/toolchainVE.cmake ) | ||
Review comment: Traditionally, the low-level toolchains (per compiler version) are symlink up to the
||
|
||
|
||
set( ECBUILD_FIND_MPI ON ) | ||
|
||
#################################################################### | ||
# Environment Variables
#################################################################### | ||
set(NMPI_ROOT /opt/nec/ve/mpi/2.23.0) | ||
Review comment: Is this the same on all NEC VE machines, or should this maybe be exported as an environment variable in env.sh and then picked up in the toolchain file?
Reply: No, it's not. I think environment variable is the better idea for it.
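One way to act on the suggestion above, sketched under assumptions: `env.sh` could export `NMPI_ROOT` with a site default, and the toolchain file could then read it through CMake's `$ENV{NMPI_ROOT}` syntax. The helper name `resolve_nmpi_root` is hypothetical, and the default simply reuses the path hard-coded in this PR.

```sh
# Hypothetical env.sh fragment: keep a caller-provided NMPI_ROOT,
# otherwise fall back to the path currently hard-coded in the toolchain.
resolve_nmpi_root() {
  printf '%s\n' "${NMPI_ROOT:-/opt/nec/ve/mpi/2.23.0}"
}

NMPI_ROOT="$(resolve_nmpi_root)"
export NMPI_ROOT
```

With this in place, a different NEC MPI installation could be selected by setting `NMPI_ROOT` before sourcing `env.sh`, without editing the toolchain file.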
||
|
||
#################################################################### | ||
# OpenMP FLAGS | ||
#################################################################### | ||
|
||
set( OpenMP_C_FLAGS "-fopenmp " ) | ||
set( OpenMP_CXX_FLAGS "-fopenmp " ) | ||
set( OpenMP_Fortran_FLAGS "-fopenmp " ) | ||
|
||
#################################################################### | ||
# OpenAcc FLAGS | ||
#################################################################### | ||
|
||
set( OpenACC_Fortran_FLAGS "-acc -ta=tesla:lineinfo,deepcopy,maxregcount:100,fastmath" ) | ||
Review comment: These look suspiciously like Nvidia GPU flags; does NEC really support OpenACC, or are these development/debug leftovers?
Reply: Sorry, they are leftovers that I forgot to remove them.
Review comment: These look suspiciously like Nvidia GPU flags; does NEC really support OpenACC, or are these development/debug leftovers?
Reply: No, it doesn't. Leftovers as well!
||
set( OpenACC_Fortran_FLAGS "${OpenACC_Fortran_FLAGS} -Mvect=levels:6" ) | ||
set( OpenACC_Fortran_FLAGS "${OpenACC_Fortran_FLAGS} -Mconcur=levels:6" ) | ||
set( OpenACC_Fortran_FLAGS "${OpenACC_Fortran_FLAGS} -Minfo" ) | ||
|
||
#################################################################### | ||
# NEC MPI Compiler | ||
#################################################################### | ||
|
||
set(MPI_C_COMPILER ${NMPI_ROOT}/bin/mpincc CACHE FILEPATH "") | ||
set(MPI_C_INCLUDE_PATH ${NMPI_ROOT}/include CACHE FILEPATH "") | ||
set(MPI_C_LIBRARIES ${NMPI_ROOT}/lib64/ve/libmpi.a CACHE FILEPATH "") | ||
set(MPI_C_COMPILE_FLAGS "-D_MPIPP_INCLUDE" CACHE STRING "") | ||
|
||
set(MPI_CXX_COMPILER ${NMPI_ROOT}/bin/mpinc++ CACHE FILEPATH "") | ||
set(MPI_CXX_INCLUDE_PATH ${NMPI_ROOT}/include CACHE FILEPATH "") | ||
set(MPI_CXX_LIBRARIES ${NMPI_ROOT}/lib64/ve/libmpi++.a CACHE FILEPATH "") | ||
|
||
set(MPI_Fortran_COMPILER ${NMPI_ROOT}/bin/mpifort CACHE FILEPATH "") | ||
set(MPI_Fortran_INCLUDE_PATH ${NMPI_ROOT}/include CACHE FILEPATH "") | ||
set(MPI_Fortran_ADDITIONAL_INCLUDE_DIR ${NMPI_ROOT}/lib/ve/module CACHE FILEPATH "") | ||
set(MPI_Fortran_LIBRARIES ${NMPI_ROOT}/lib64/ve/libmpi.a CACHE FILEPATH "") | ||
set(MPI_Fortran_COMPILE_FLAGS "-D_MPIPP_INCLUDE" CACHE STRING "") | ||
#################################################################### | ||
# COMMON FLAGS | ||
#################################################################### | ||
|
||
set(ECBUILD_Fortran_FLAGS "-fpic") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -mstack-arrays") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -fdiag-vector=3") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -fcse-after-vectorization") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -floop-collapse ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -floop-fusion ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -floop-interchange ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -floop-unroll-complete=200 ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -ftrace") | ||
###set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -fmove-loop-invariants-if ") | ||
###set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -freplace-loop-equation ") | ||
Review comment: I don't mind having commented out options like this lying around, but could we maybe group them, and or annotate them with comments that provide a bit of context as to why they are enabled/disabled?
Reply: Yes, we can comment them because -ftrace enables profiling feature and -floop-unroll-complete=200 are used for optimisation purpose here.
||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -msched-interblock ") | ||
###set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -mvector-floating-divide-instruction ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -mvector-power-to-explog ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -mvector-sqrt-instruction ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -mvector-threshold=3 ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -finline-functions ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -finline-max-depth=5 ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -finline-max-function-size=200 ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -mvector-merge-conditional ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -fivdep ") | ||
###set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -floop-strip-mine ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -muse-mmap ") | ||
##set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -mvector-packed") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -report-all") | ||
|
||
set( ECBUILD_Fortran_FLAGS_BIT "-O4 -mvector-fma" ) | ||
|
||
set( ECBUILD_C_FLAGS "-O2 " ) | ||
|
||
set( ECBUILD_CXX_FLAGS "-O2" ) | ||
|
||
# Fix for C++ template headers needed for Serialbox | ||
set( GNU_HEADER_INCLUDE "-I/usr/local/apps/gcc/7.3.0/lib/gcc/x86_64-linux-gnu/7.3.0/include-fixed" ) | ||
set( ECBUILD_CXX_FLAGS "${ECBUILD_CXX_FLAGS} ${GNU_HEADER_INCLUDE}" ) |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -37,14 +37,6 @@ set( OpenACC_Fortran_FLAGS "-acc=gpu -mp=gpu -gpu=cc80,lineinfo,fastmath" CACHE | |
# Enable this to get more detailed compiler output | ||
# set( OpenACC_Fortran_FLAGS "${OpenACC_Fortran_FLAGS} -Minfo" ) | ||
|
||
#################################################################### | ||
Review comment: Again, not sure the NEC changes require us to change Nvidia configurations as such?
Reply: No, defining the OpenACC flags does not cause error on NEC but they are unnecessary for this toolchain.
||
# CUDA FLAGS | ||
#################################################################### | ||
|
||
if(NOT DEFINED CMAKE_CUDA_ARCHITECTURES) | ||
set(CMAKE_CUDA_ARCHITECTURES 80) | ||
endif() | ||
|
||
#################################################################### | ||
# COMMON FLAGS | ||
#################################################################### | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,89 @@ | ||
|
||
#################################################################### | ||
# COMPILER | ||
#################################################################### | ||
|
||
include( /opt/nec/ve/share/cmake/toolchainVE.cmake ) | ||
|
||
|
||
set( ECBUILD_FIND_MPI ON ) | ||
|
||
#################################################################### | ||
# Environment Variables
#################################################################### | ||
set(NMPI_ROOT /opt/nec/ve/mpi/2.23.0) | ||
|
||
#################################################################### | ||
# OpenMP FLAGS | ||
#################################################################### | ||
|
||
set( OpenMP_C_FLAGS "-fopenmp " ) | ||
set( OpenMP_CXX_FLAGS "-fopenmp " ) | ||
set( OpenMP_Fortran_FLAGS "-fopenmp " ) | ||
|
||
#################################################################### | ||
# OpenAcc FLAGS | ||
#################################################################### | ||
|
||
set( OpenACC_Fortran_FLAGS "-acc -ta=tesla:lineinfo,deepcopy,maxregcount:100,fastmath" ) | ||
set( OpenACC_Fortran_FLAGS "${OpenACC_Fortran_FLAGS} -Mvect=levels:6" ) | ||
set( OpenACC_Fortran_FLAGS "${OpenACC_Fortran_FLAGS} -Mconcur=levels:6" ) | ||
set( OpenACC_Fortran_FLAGS "${OpenACC_Fortran_FLAGS} -Minfo" ) | ||
|
||
#################################################################### | ||
# NEC MPI Compiler | ||
#################################################################### | ||
|
||
set(MPI_C_COMPILER ${NMPI_ROOT}/bin/mpincc CACHE FILEPATH "") | ||
set(MPI_C_INCLUDE_PATH ${NMPI_ROOT}/include CACHE FILEPATH "") | ||
set(MPI_C_LIBRARIES ${NMPI_ROOT}/lib64/ve/libmpi.a CACHE FILEPATH "") | ||
set(MPI_C_COMPILE_FLAGS "-D_MPIPP_INCLUDE" CACHE STRING "") | ||
|
||
set(MPI_CXX_COMPILER ${NMPI_ROOT}/bin/mpinc++ CACHE FILEPATH "") | ||
set(MPI_CXX_INCLUDE_PATH ${NMPI_ROOT}/include CACHE FILEPATH "") | ||
set(MPI_CXX_LIBRARIES ${NMPI_ROOT}/lib64/ve/libmpi++.a CACHE FILEPATH "") | ||
|
||
set(MPI_Fortran_COMPILER ${NMPI_ROOT}/bin/mpifort CACHE FILEPATH "") | ||
set(MPI_Fortran_INCLUDE_PATH ${NMPI_ROOT}/include CACHE FILEPATH "") | ||
set(MPI_Fortran_ADDITIONAL_INCLUDE_DIR ${NMPI_ROOT}/lib/ve/module CACHE FILEPATH "") | ||
set(MPI_Fortran_LIBRARIES ${NMPI_ROOT}/lib64/ve/libmpi.a CACHE FILEPATH "") | ||
set(MPI_Fortran_COMPILE_FLAGS "-D_MPIPP_INCLUDE" CACHE STRING "") | ||
#################################################################### | ||
# COMMON FLAGS | ||
#################################################################### | ||
|
||
set(ECBUILD_Fortran_FLAGS "-fpic") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -mstack-arrays") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -fdiag-vector=3") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -fcse-after-vectorization") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -floop-collapse ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -floop-fusion ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -floop-interchange ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -floop-unroll-complete=200 ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -ftrace") | ||
###set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -fmove-loop-invariants-if ") | ||
###set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -freplace-loop-equation ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -msched-interblock ") | ||
###set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -mvector-floating-divide-instruction ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -mvector-power-to-explog ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -mvector-sqrt-instruction ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -mvector-threshold=3 ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -finline-functions ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -finline-max-depth=5 ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -finline-max-function-size=200 ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -mvector-merge-conditional ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -fivdep ") | ||
###set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -floop-strip-mine ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -muse-mmap ") | ||
##set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -mvector-packed") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -report-all") | ||
|
||
set( ECBUILD_Fortran_FLAGS_BIT "-O4 -mvector-fma" ) | ||
|
||
set( ECBUILD_C_FLAGS "-O2 " ) | ||
|
||
set( ECBUILD_CXX_FLAGS "-O2" ) | ||
|
||
# Fix for C++ template headers needed for Serialbox | ||
set( GNU_HEADER_INCLUDE "-I/usr/local/apps/gcc/7.3.0/lib/gcc/x86_64-linux-gnu/7.3.0/include-fixed" ) | ||
set( ECBUILD_CXX_FLAGS "${ECBUILD_CXX_FLAGS} ${GNU_HEADER_INCLUDE}" ) |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -21,18 +21,20 @@ if( HAVE_CLOUDSC_FORTRAN ) | |
dwarf_cloudsc.F90 | ||
cloudsc_driver_mod.F90 | ||
cloudsc.F90 | ||
LIBS | ||
cloudsc-common-lib | ||
DEFINITIONS ${CLOUDSC_DEFINITIONS} | ||
) | ||
|
||
target_link_libraries( dwarf-cloudsc-fortran PRIVATE cloudsc-common-lib ) | ||
Review comment: Is there any reason for adding
Reply: I need to check this part.
||
|
||
# Create symlink for the input data | ||
if( HAVE_SERIALBOX ) | ||
execute_process(COMMAND ${CMAKE_COMMAND} -E create_symlink | ||
${CMAKE_CURRENT_SOURCE_DIR}/../../data ${CMAKE_CURRENT_BINARY_DIR}/../../../data ) | ||
endif() | ||
|
||
if( HAVE_HDF5 ) | ||
target_include_directories( dwarf-cloudsc-fortran PRIVATE ${HDF5_Fortran_INCLUDE_DIRS} ) | ||
target_link_libraries( dwarf-cloudsc-fortran PRIVATE ${HDF5_LIBRARIES} ) | ||
execute_process(COMMAND ${CMAKE_COMMAND} -E create_symlink | ||
${CMAKE_CURRENT_SOURCE_DIR}/../../config-files/input.h5 ${CMAKE_CURRENT_BINARY_DIR}/../../../input.h5 ) | ||
execute_process(COMMAND ${CMAKE_COMMAND} -E create_symlink | ||
|
Review comment: This should only affect the GNU compiler, does this really have to be removed for NEC? If it makes problems with the NEC compiler, could this be suitably guarded so it still takes effect for GNU?
Reply: Yes, it makes compiling error on NEC.