Fixes in xxxxx for IEEE_DIVIDE_BY_ZERO FPE; separate cpu/gpu namespaces and fix runtest segfault #723
Merged
valassi merged 155 commits into madgraph5:master from valassi:fpe on Jul 21, 2023
+95,591 -60,273
Commits
Commits on Jul 17, 2023
[fpe] in ggttsa cudacpp.mk, try to debug madgraph5#701 IEEE_DIVIDE_BY_ZERO (see firemodels/fds/issues/5638 on gh) with -ffpe flags
[fpe] in ggtt.sa testxxx.cc, enable FPE floating point exception signals to debug madgraph5#701 (see https://stackoverflow.com/a/17473528)
[fpe] in ggtt.sa testxxx.cc, add some context information to the FPE signal handler for madgraph5#701
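The three commits above turn floating point exceptions into signals so that the IEEE_DIVIDE_BY_ZERO can be located. As a rough illustration of that general technique, here is a minimal sketch, not the code from testxxx.cc. It assumes Linux with glibc, where `feenableexcept` is available as a GNU extension. The names `raisesDivByZero`, `fpeHandler` and `enableFpeTraps` are hypothetical.

```cpp
#include <cassert>
#include <cfenv>
#include <csignal>
#include <cstdlib>
#include <unistd.h>

// Hypothetical helper: returns true if evaluating num/den sets the
// IEEE divide-by-zero status flag (no trap is taken, flags only).
inline bool raisesDivByZero( double num, double den )
{
  volatile double n = num, d = den; // volatile: keep the division at run time
  std::feclearexcept( FE_ALL_EXCEPT );
  volatile double r = n / d;
  (void)r;
  return std::fetestexcept( FE_DIVBYZERO ) != 0;
}

// Hypothetical SIGFPE handler: uses only async-signal-safe calls.
extern "C" void fpeHandler( int )
{
  static const char msg[] = "Floating point exception (SIGFPE)\n";
  ssize_t ignored = write( STDERR_FILENO, msg, sizeof( msg ) - 1 );
  (void)ignored;
  _exit( 1 );
}

// Hypothetical helper: install the handler and ask the FPU to trap.
// Returns false where feenableexcept is not available (assumption: glibc only).
inline bool enableFpeTraps()
{
  std::signal( SIGFPE, fpeHandler );
#if defined( __GLIBC__ )
  return feenableexcept( FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW ) != -1;
#else
  return false;
#endif
}
```

Calling `enableFpeTraps()` at the start of a test makes the first offending operation abort the process at the faulting instruction, which a debugger or a build with line information can then map back to source. `raisesDivByZero` only inspects the status flags and is safe to call when traps are not enabled.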
[fpe] in ggtt.sa testxxx.cc, use the same trick as for ipz/opzxxx also for 4 other special functions
[fpe] in ggtt.sa HelAmps_sm.h, DISABLE AUTO-VECTORIZATION on the whole ixxxxx function... otherwise my fix for madgraph5#701 is ignored!
[fpe] in ggtt.sa HelAmps_sm.h, add a first fix for my fix of madgraph5#701 (emp_sv could give a FPE)
Commits on Jul 19, 2023
[fpe] in ggtt.sa cudacpp makefiles, remove -DDEBUG2 and add back -lineinfo to 'debug' flags (investigate madgraph5#725)
[fpe] in ggtt.sa cudacpp_src makefile, add -march=x86-64 to AVX=none flags as in SubProcesses (investigate madgraph5#725)
[fpe] in ggtt.sa HelAmps_sm.h, REENABLE AUTO-VECTORIZATION on the whole oxxxxx/ixxxxx/vxxxxx functions as the performance hit was too high (madgraph5#727)
[fpe] == COMPLETE MAJOR CHANGE OF STRATEGY!!! == in ggtt.sa HelAmps_sm.h, implement the latest changes to avoid FPEs madgraph5#701 without losing SIMD performance madgraph5#727
[fpe] rerun tput for ggtt.sa and copy the log: recover the previous performance! madgraph5#727 is fixed
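The commit messages above do not spell out the new strategy. One common way to avoid a divide-by-zero FPE while keeping a loop vectorizable is to replace a zero denominator with a harmless value before dividing, then select the final result with the same mask. The sketch below illustrates that idea only and is an assumption, not the code in HelAmps_sm.h. The function name `safeDivide` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: out[i] = num[i] / den[i] where den[i] != 0,
// and out[i] = fallback elsewhere. In vectorized code every lane executes
// the division, so the denominator itself must be made safe: guarding
// only the result would still divide by zero in the discarded lanes.
inline void safeDivide( const double* num,
                        const double* den,
                        double* out,
                        std::size_t n,
                        double fallback )
{
  for( std::size_t i = 0; i < n; ++i )
  {
    const bool ok = ( den[i] != 0. );
    const double safeDen = ok ? den[i] : 1.; // dummy denominator where den is zero
    const double q = num[i] / safeDen;       // never divides by zero
    out[i] = ok ? q : fallback;              // discard the dummy quotient
  }
}
```

Both selections are simple ternaries without control flow that depends on the division, so a compiler can typically turn them into blend instructions rather than branches.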
Commits on Jul 20, 2023
Commits on Jul 21, 2023
[namespace/fpe] in ggtt.sa makefiles, add 'export CUFLAGS' in SubProcesses towards src - this fixes HRDCOD=1 builds on non-SM processes madgraph5#731
[namespace/fpe] backport fix for madgraph5#731 (HRDCOD=1 builds in cuda of non-SM) to CODEGEN from heft_gg_h.sa
[fpe] regenerate gg_tt and heft_gg_h sa - all ok, differences as expected from madgraph5#730 and madgraph5#731
[fpe] ** COMPLETE FPE ** regenerate all 7 processes mad with fixes for madgraph5#730 and madgraph5#731