-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Request to add toolchain for NEC #53
base: develop
Are you sure you want to change the base?
Conversation
Fatemeh Pouyan seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
Hi, first of all, many thanks for this contribution; this looks like a very valuable addition to our collection of architectures, and the numbers seem very impressive. We have no facilities right now to automatically test this, but we would very happily accept the environment as a reference implementation. For the contribution itself, I've flagged a few inconsistencies, that simply seem like debug/dev leftovers; but it looks like a small clean-up effort. I've also noted that the base and target branch are @PouyanLaleh You'll also need to sing the CLA (fussyduck). I'll put this into "changes requested" for now, but once the above is addressed and the license is signed, I'll enable the CI runners to get this merged quickly. And thanks again for a fantastic first contribution. 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, many thanks. I've flagged a few things that need cleaning up before merging, and we will unfortunately be able to auto-test this, but once the clean-ups are addressed we should be able to merge this.
# OpenAcc FLAGS | ||
#################################################################### | ||
|
||
set( OpenACC_Fortran_FLAGS "-acc -ta=tesla:lineinfo,deepcopy,maxregcount:100,fastmath" ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These look suspiciously like Nvidia GPU flags; does NEC really support OpenACC, or are these development/debug leftovers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, they are leftovers that I forgot to remove them.
# OpenAcc FLAGS | ||
#################################################################### | ||
|
||
set( OpenACC_Fortran_FLAGS "-acc -ta=tesla:lineinfo,deepcopy,maxregcount:100,fastmath" ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These look suspiciously like Nvidia GPU flags; does NEC really support OpenACC, or are these development/debug leftovers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it doesn't. Leftovers as well!
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -floop-unroll-complete=200 ") | ||
set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -ftrace") | ||
###set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -fmove-loop-invariants-if ") | ||
###set(ECBUILD_Fortran_FLAGS "${ECBUILD_Fortran_FLAGS} -freplace-loop-equation ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind having commented out options like this lying around, but could we maybe group them, and or annotate them with comments that provide a bit of context as to why they are enabled/disabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can comment them because -ftrace enables profiling feature and -floop-unroll-complete=200 are used for optimisation purpose here.
@@ -37,14 +37,6 @@ set( OpenACC_Fortran_FLAGS "-acc=gpu -mp=gpu -gpu=cc80,lineinfo,fastmath" CACHE | |||
# Enable this to get more detailed compiler output | |||
# set( OpenACC_Fortran_FLAGS "${OpenACC_Fortran_FLAGS} -Minfo" ) | |||
|
|||
#################################################################### |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, not sure the NEC changes require us to change Nvidia configurations as such?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, defining the OpenACC flags does not cause error on NEC but they are unnecessary for this toolchain.
# COMPILER | ||
#################################################################### | ||
|
||
include( /opt/nec/ve/share/cmake/toolchainVE.cmake ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Traditionally, the low-level toolchains (per compiler version) are symlink up to the arch/toolchains/<arch-toolchain>-.cmake
files higher up, for re-use?
@@ -60,29 +60,6 @@ Balthasar Reuter ([email protected]) | |||
move parameter structures to constant memory. To enable this variant, | |||
a suitable CUDA installation is required and the `--with-cuda` flag | |||
needs to be passed at the build stage. | |||
- **dwarf-cloudsc-gpu-scc-cuf-k-caching**: GPU-enabled and further |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this here was deleted by accident? Or was this intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd also be great, if you could add a small section (or bullet point) to the README to tell users about the availability of the NEC variant. Probably a small paragraph?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, the deleted part was not intentional.
I thought I already added how to compile on NEC in the README file. I'll check that again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many thanks, it's great to have the NEC toolchain with the promising performance numbers in the repo. Can only parrot what Michael has said already, but left a few extra questions.
Just a remark on the CLA: You may have to add the email address that you used for the commits to your Github account to be able to sign the CLA.
@@ -19,7 +19,7 @@ project( dwarf-p-cloudsc LANGUAGES C Fortran ) | |||
|
|||
include( cmake/compat.cmake ) | |||
if( CMAKE_Fortran_COMPILER_ID MATCHES "GNU") | |||
ecbuild_add_fortran_flags("-ffree-line-length-none") | |||
# ecbuild_add_fortran_flags("-ffree-line-length-none") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should only affect the GNU compiler, does this really have to be removed for NEC? If it makes problems with the NEC compiler, could this be suitably guarded so it still takes effect for GNU?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it makes compiling error on NEC.
@@ -231,6 +208,17 @@ cd build | |||
./bin/dwarf-cloudsc-fortran 4 16384 32 # The cleaned-up Fortran | |||
./bin/dwarf-cloudsc-c 4 16384 32 # The standalone C version | |||
``` | |||
### Building on NEC SX-AURORA TSUBAS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the VE not called TSUBAS A ? Applies also to the line below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, VE is vector engine and NEC SX-AURORA TSUBAS is typically used as the architecture.
HDF5_ROOT=HDF5-installation-PATH ./cloudsc-bundle build --arch arch/ecmwf/aurora/nec/4.0.0/ [--single-precision] [--with-mpi] --hdf5 ON --cloudsc-fortran ON --cloudsc-prototype1 OFF --verbose --log DEBUG | ||
``` | ||
|
||
Currently available `NEC ompiler/version` selections are: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"compiler" is missing a "c"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
@@ -272,27 +260,6 @@ srun bash -c "CUDA_VISIBLE_DEVICES=\$SLURM_LOCALID bin/dwarf-cloudsc-gpu-scc-hoi | |||
|
|||
In principle, the same should work for multi-node execution (`-N 2`, `-N 4` etc.) once interconnect issues are resolved. | |||
|
|||
### GPU runs: Timing device kernels and data transfers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please re-add this paragraph
|
||
Currently available `NEC ompiler/version` selections are: | ||
|
||
* `nec/4.0.0 (nfort, ncc, nc++)` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add also an example invocation and approximate performance that can be expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'll add it
export VE_FPE_ENABLE=DIV,INV,FOF,FUF,INE | ||
|
||
|
||
export PATH="/local/hdd/nabr/openmpi/nvhpc-nompi/20.9/bin:$PATH" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this absolute path setting should be necessary - also, it looks like it is carried over from volta?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I forgot to remove it
#################################################################### | ||
# Enviroment Variables | ||
#################################################################### | ||
set(NMPI_ROOT /opt/nec/ve/mpi/2.23.0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the same on all NEC VE machines, or should this maybe be exported as an environment variable in env.sh and then picked up in the toolchain file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's not. I think environment variable is the better idea for it.
DEFINITIONS ${CLOUDSC_DEFINITIONS} | ||
) | ||
|
||
target_link_libraries( dwarf-cloudsc-fortran PRIVATE cloudsc-common-lib ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason for adding target_link_libraries
, target_include_directories
separately instead of pointing here to the cloudsc-common-lib
in LIBS
, as it was done before? That way, include and link settings propagate automatically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to check this part.
Hello,
The toolchain for cross-compiling on the NEC system is added. The READM file is modified to instruct compiling for NEC as well. We applied minor changes in 2 cmake files to avoid compilation errors on NEC. With this pull request, nothing is modified in the code.
Thanks,
Laleh