Conference call notes 20180228
Kenneth Hoste edited this page Feb 28, 2018
Notes on the 96th EasyBuild conference call, Wednesday February 28th 2018 (5pm - 6pm CET)
Alphabetical list of attendees (9):
- Damian Alvarez (JSC, Germany)
- Fotis Georgatos (Illumina, UK)
- Balazs Hajgato (Free University of Brussels)
- Victor Holanda (CSCS)
- Kenneth Hoste (HPC-UGent)
- Adam Huffman (Big Data Institute, University of Oxford)
- Alan O'Cais (JSC, Germany)
- Åke Sandgren (Umeå University, Sweden)
- Davide Vanzo (Vanderbilt University)
Agenda:
- update on upcoming EasyBuild v3.5.2 release
- (very) early outlook to EasyBuild v3.6.0
- best practices on Intel Skylake systems
- Q&A
Update on upcoming EasyBuild v3.5.2 release:
- TensorFlow easyblock
  - building from source is very important on CPU-only systems
  - virtually no performance gain on GPUs compared to the binary release (K80, P100)
    - may be a different story on Volta GPUs (Adam can test this?)
- ETA: end of this week
- Victor: enable -ftree-vectorize by default? (https://github.com/easybuilders/easybuild-framework/pull/2388)
  - not in v3.5.2, maybe in v3.6.0
(Very) early outlook to EasyBuild v3.6.0:
- major feature: Singularity integration (https://github.com/easybuilders/easybuild-framework/pull/2332)
  - will be under --experimental
- ETA: mid April
- -ftree-vectorize: https://github.com/easybuilders/easybuild-framework/pull/2388
Best practices on Intel Skylake systems:
- Balazs
  - building software on Skylake with different versions of toolchain (except for intel/2018a)
    - intel/2016b, intel/2017a, intel/2017b
    - heterogeneous setup
    - problems
      - sometimes the compiler gets stuck in an infinite loop when building on Skylake (cfr. https://github.com/easybuilders/easybuild-easyconfigs/pull/5915)
        - can be fixed by forcing AVX2 or using -O1
      - Internal Compiler Error (ICE) in the Intel compiler when building with -O2 on Skylake
        - can be fixed by forcing AVX2 or using -O1
      - sometimes compilation works, but the resulting build produces NaN values (both with foss & intel)
        - occurs with TensorFlow (bug in TF, not in compiler), also occurs with intel/2018a (Åke)
        - can be fixed with -march=native -mno-avx512f (when using foss/2017b + TF 1.4/1.5)
        - forcing AVX2 resulted in TF complaining about not using AVX512
        - cfr. https://github.com/easybuilders/easybuild-easyconfigs/issues/5936
      - compilation issues may be fixed with intel/2018a
  - recommendation is to use */2018a on Skylake if possible
    - Victor: GCC 5.4 in foss/2016b may not support AVX-512 yet (same for icc in intel/2016b?)
      - Damian: GCC 5.3/5.4 supports AVX-512 already, but maybe the binutils is the problem?
  - Molpro & VASP: can't get a correct build even with AVX2, produces bogus results
    - Åke: VASP works fine even with AVX2 & AVX512
      - ScaLAPACK provided by MKL is a problem, should use netlib ScaLAPACK or even OpenBLAS in some cases
      - only way to pick an installation that produces right results is to test...
      - code usually reports problems when it notices something 'off', sometimes OK if it doesn't complain
      - main regression suite used is from Peter Larsson (see https://github.com/egplar/vasptest)
        - good starting point to test, results are scientifically correct
    - Victor (CSCS): also fine with AVX2/AVX512, in-house regtest doesn't show problems
      - or actually no :)
      - similar issues with related software: official binary produces wrong results on Skylake
- can be starting point for a "Best practices" document for building software on Skylake systems...
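
The advice above that the only way to pick a correct installation is to test it can be sketched in Python: compare the numbers a build produces against trusted reference values, and treat any NaN as a failure regardless of tolerance. This is a minimal sketch, not part of EasyBuild or of the vasptest suite; the helper name `check_results`, the tolerance, and all sample values are hypothetical.

```python
import math


def check_results(computed, reference, rel_tol=1e-6):
    """Return a list of problems found when comparing build output to reference values.

    Hypothetical helper: `computed` and `reference` are sequences of floats,
    e.g. energies parsed from the output of a regression test run.
    """
    if len(computed) != len(reference):
        return ["expected %d values, got %d" % (len(reference), len(computed))]

    problems = []
    for idx, (got, expected) in enumerate(zip(computed, reference)):
        if math.isnan(got) or math.isinf(got):
            # NaN/inf never counts as a pass, whatever the tolerance
            problems.append("value %d is not finite: %r" % (idx, got))
        elif not math.isclose(got, expected, rel_tol=rel_tol):
            problems.append("value %d differs: got %r, expected %r" % (idx, got, expected))
    return problems


if __name__ == "__main__":
    reference = [-10.5, 3.25, 0.125]             # made-up reference values
    builds = {
        "AVX2 build": [-10.5, 3.25, 0.125],      # made-up output of one build
        "AVX512 build": [-10.5, float("nan"), 0.125],  # made-up output with a NaN
    }
    for name, values in builds.items():
        problems = check_results(values, reference)
        print(name, "OK" if not problems else problems)
```

Running the same check against each candidate installation (for example one built with AVX2 and one with AVX512) gives a quick pass/fail signal before a build is handed to users; a real regression suite would of course cover far more than a handful of values.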
Q&A:
- Davide: fosscuda based on goolfc
  - fosscuda support: https://github.com/easybuilders/easybuild-framework/pull/2253
  - goolfc/2016* and later match the corresponding foss versions (foss/2016* onwards)
    - see e.g. goolfc/2017b (https://github.com/easybuilders/easybuild-easyconfigs/pull/5768)
  - goolfc/2018a could be renamed to fosscuda/2018a?
  - keeping fosscuda & foss aligned may become an issue when GCC in foss is bumped to 7.x
    - an easy way to deal with this could be to just skip a fosscuda 'common' version