Val june24 fixed ci #981
Conversation
…6 builds #904 (disable OMP only for clang16; add -no-pie for fcheck_cpp.exe)
…7 builds #904 (disable OMP also for clang17)
…move link-time -no-pie, add compile-time -fPIC to fortran
…nd.h (BSD license) to detect when running on valgrind #906 This is needed as part of the fixes for runTest.exe #903, preliminary to #896 Note: the header as-is is copied from /cvmfs/sft.cern.ch/lcg/releases/valgrind/3.23.0-24262/x86_64-el9-gcc11-opt/ (except for the inclusion of "clang-format off" directives) See https://valgrind.org/docs/manual/manual-core-adv.html#manual-core-adv.clientreq
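For reference, a minimal sketch (not the project's actual code) of how the valgrind.h client-request header can be used: RUNNING_ON_VALGRIND is the macro documented at the link above, while everything else here is illustrative only.

```cpp
// Sketch: detect at runtime whether the executable is running under valgrind.
#include <cstdio>
#include "valgrind.h" // BSD-licensed client-request header shipped by valgrind

int main()
{
  // RUNNING_ON_VALGRIND expands to 0 when running natively and to a
  // non-zero value when running under valgrind.
  if( RUNNING_ON_VALGRIND )
    std::printf( "Running under valgrind\n" );
  else
    std::printf( "Running natively\n" );
  return 0;
}
```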
…rehensive fixes and debug printouts for bug #903 (recursive iteration, stack overflow, segfault etc)
…target (address sanitizer #207), but keep it commented out
… (I had forgotten it enabled)
…st.cc, testxxx.cc: simplify gtest templates, remove cudaDeviceReset to fix #907, complete preparation of two-test infrastructure #896.
In more detail:
- move to the simplest "TEST(" use case of Google tests in MadgraphTest.h and runTest.cc (remove unnecessary levels of templating)
- move gpuDeviceReset() to an atexit function of main in testxxx and comment it out anyway, to fix the segfaults #907 (eventually it may be necessary to remove all CUDA API calls from destructors, if we ever need to put this back in)
- in runTest.cc, complete a proof of concept for adding two separate tests (without/with multichannel #896)
Also fix some clang formatting issues with respect to the last gg_tt.mad.
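A minimal sketch (with hypothetical test and function names, not the real MadgraphTest.h/runTest.cc) of the plain TEST() plus atexit pattern described in this commit:

```cpp
#include <cstdlib>
#include <gtest/gtest.h>

// Hypothetical stand-in for the project's gpuDeviceReset() wrapper
// (on CUDA builds this would call cudaDeviceReset()).
void gpuDeviceResetSketch() {}

TEST( runTestSketch, noMultiChannel ) { SUCCEED(); }
//TEST( runTestSketch, multiChannel ) { SUCCEED(); } // second test, see #896

int main( int argc, char** argv )
{
  ::testing::InitGoogleTest( &argc, argv );
  //std::atexit( gpuDeviceResetSketch ); // kept commented out: late CUDA API calls caused the #907 segfaults
  return RUN_ALL_TESTS();
}
```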
…ng PR #905, constexpr_math.h PR #908 and runTest/cudaDeviceReset PR #909.
Add valgrind.h and its symlink in the repo for gg_tt.mad.
The new runTest.cc template now has a (commented out) proof of concept for including two tests (with/without multichannel) #896; I will resume from there.
After building bldall, the following succeeds:
for bck in none sse4 avx2 512y 512z cuda; do echo $bck; ./build.${bck}_d_inl0_hrd0/runTest_*.exe; done
This instead is crashing (again?) for some AVX values:
for bck in none sse4 avx2 512y 512z cuda; do echo $bck; valgrind ./build.${bck}_d_inl0_hrd0/runTest_*.exe; done
On closer inspection, this is because valgrind does not support AVX512, so this is ok.
…th/without multichannel #896 into the latest regenerated with fixes.
Revert "[june24] in gg_tt.mad, temporarily go back to the last code regeneration, removing the attempts to add two tests #896"
This reverts commit 7ef597f.
Fix conflicts: epochX/cudacpp/gg_tt.mad/SubProcesses/runTest.cc
OK! Now the test runs, but nomultichannel succeeds, while multichannel fails as the reference ME is wrong!
This is now back on track: I must create a second reference file, then add the actual channelid filling of warps...
…channel and <file.txt2> as ref with multichannel
…unTest (use cuda/double as the reference platform):
CUDACPP_RUNTEST_DUMPEVENTS=1 ./runTest_cuda.exe
Rerunning all tests then succeeds (but the channelid array is constant in all values for the moment...):
for bck in none sse4 avx2 512y 512z cuda; do echo $bck; ./build.${bck}_d_inl0_hrd0/runTest_*.exe; done
…th/without multichannel #896; use <file.txt> as ref without multichannel and <file.txt2> as ref with multichannel
…ate txt ref for runTest (use cuda/double as the reference platform):
CUDACPP_RUNTEST_DUMPEVENTS=1 ./runTest_cuda.exe
\cp ../../test/ref/dump* ../../../CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/test/ref/
Rerunning all tests then succeeds (but the channelid array is constant in all values for the moment...)
…channelid debugging)
…onstexpr FIXME? #910: this is a third different expression for the number of diagrams; we should add sanity checks for internal consistency...
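A minimal sketch (hypothetical constexpr names, not the real CPPProcess code) of the kind of internal-consistency sanity check suggested by this FIXME:

```cpp
// Sketch: fail the build if two independent constexpr expressions for the
// number of diagrams ever disagree.
constexpr int ndiagramsFromProcess = 3;  // e.g. the hardcoded number of diagrams
constexpr int ndiagramsFromChannels = 3; // e.g. derived from the channel-to-diagram mapping
static_assert( ndiagramsFromProcess == ndiagramsFromChannels,
               "inconsistent number of diagrams between the two expressions" );
```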
….cc, move the dumpSignallingFPEs() call to the base class dtor, add debug printouts (commented out)
…tations from .h to .cc, move the dumpSignallingFPEs() call to the base class dtor, add debug printouts (commented out)
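A minimal sketch (hypothetical class name; the real dumpSignallingFPEs implementation may differ) of calling the FPE dump from a base class dtor, using the standard &lt;cfenv&gt; API:

```cpp
#include <cfenv>
#include <cstdio>

struct TestDriverBaseSketch
{
  // Dump any pending FP exceptions when a derived object is destroyed.
  virtual ~TestDriverBaseSketch() { dumpSignallingFPEs(); }
  static void dumpSignallingFPEs()
  {
    if( std::fetestexcept( FE_DIVBYZERO ) ) std::printf( "INFO: FE_DIVBYZERO was raised\n" );
    if( std::fetestexcept( FE_INVALID ) ) std::printf( "INFO: FE_INVALID was raised\n" );
    if( std::fetestexcept( FE_OVERFLOW ) ) std::printf( "INFO: FE_OVERFLOW was raised\n" );
  }
};
```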
…ebug printouts if the code is compiled with 'make MG5AMC_CHANNELID_DEBUG=1' FIXME? Note that MEKDevice takes a device channelid array, it would be easier if this was always a host array and MEKD managed the copy?
…add channelid debug printouts if the code is compiled with 'make MG5AMC_CHANNELID_DEBUG=1' FIXME? Note that MEKDevice takes a device channelid array, it would be easier if this was always a host array and MEKD managed the copy?
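A minimal sketch (hypothetical function name, not the real MEKDevice API) of the alternative suggested by this FIXME, i.e. taking a host channelid array and letting the wrapper own the host-to-device copy:

```cpp
#include <cuda_runtime.h>
#include <vector>

void computeWithChannelIds( const std::vector<unsigned int>& hstChannelIds )
{
  unsigned int* devChannelIds = nullptr;
  const size_t nbytes = hstChannelIds.size() * sizeof( unsigned int );
  cudaMalloc( (void**)&devChannelIds, nbytes );
  cudaMemcpy( devChannelIds, hstChannelIds.data(), nbytes, cudaMemcpyHostToDevice );
  // ... launch the matrix element kernel here, passing devChannelIds ...
  cudaFree( devChannelIds );
}
```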
…els 1,2,3,1,2,3... for different events (previously it was 1 for all events) NB1: the cuda test now fails, the reference file must be recreated NB2: I expect the SIMD tests to fail using the CUDA reference, due to the different bugs in the current channelId implementation NB3: eventually #898 the implementation should enforce that all events in a warp use the same channelid
…channel test #896 to use channels 1,2,3,1,2,3... for different events (previously it was 1 for all events) NB1: the cuda test now fails, the reference file must be recreated NB2: I expect the SIMD tests to fail using the CUDA reference, due to the different bugs in the current channelId implementation NB3: eventually #898 the implementation should enforce that all events in a warp use the same channelid
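A minimal sketch (hypothetical helper, not the actual test code) of the channel assignment described here, i.e. channels 1,2,3,1,2,3,... across events:

```cpp
#include <cstddef>
#include <vector>

std::vector<unsigned int> makeChannelIds( std::size_t nevt, unsigned int nchannels = 3 )
{
  std::vector<unsigned int> channelIds( nevt );
  for( std::size_t ievt = 0; ievt < nevt; ++ievt )
    channelIds[ievt] = 1 + ievt % nchannels; // channelid values are 1-based
  // NB: eventually (#898) the assignment should be per warp rather than per
  // event, so that all events in a warp share the same channelid.
  return channelIds;
}
```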
…ate txt ref for runTest (use cuda/double as the reference platform):
CUDACPP_RUNTEST_DUMPEVENTS=1 ./runTest_cuda.exe
\cp ../../test/ref/dump* ../../../CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/test/ref/
NB: the CUDA test succeeds with the new reference files, but the C++ multichannel test #896 fails due to bugs #894 and #899
…ization of m_hstChannelIds
…ds (introduced in 55b3e74): I prefer that users get and report an error if there is something wrong here...
…the previous patch
…asier merging git checkout upstream/master $(git ls-tree --name-only HEAD tmad/logs* tput/logs*)
…ier merging git checkout upstream/master $(git ls-tree --name-only upstream/master */CODEGEN*txt)
… gg_tt.mad, to ease merging and conflict resolution (From the cudacpp directory) git checkout upstream/master $(git ls-tree --name-only upstream/master *.mad *.sa | grep -v ^gg_tt.mad)
…, nvcc #966) into june24
Fix conflicts:
  epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.P1
  epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/counters.cc
  epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/gpu/fbridge.cc
  epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f
  epochX/cudacpp/gg_tt.mad/SubProcesses/counters.cc
  epochX/cudacpp/gg_tt.mad/SubProcesses/fbridge.cc
NB: here I essentially fixed gg_tt.mad, not CODEGEN, which will need to be adjusted a posteriori with a backport. In particular:
- Note1: patch.P1 is now taken from june24, but will need to be recomputed:
  git checkout HEAD CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.P1
- Note2: I need to manually port some upstream/master changes in auto_dsig1.f to smatrix_multi.f, which did not yet exist
…sig1.f changes in the latest upstream/master merge
…'call counters_' to uppercase 'CALL COUNTERS_'...
… double space before '!' comments in fortran to please the MG formatter...
…to_dsig1.f after merging upstream/master
… upstream/master
Only patch.P1 changes: in practice, the only three changes are the removal of counters_smatrix1_start/stop calls. Note that auto_dsig1.f can still be kept out of patching.
The only files that still need to be patched are
- 3 in patch.common: Source/makefile, Source/genps.inc, SubProcesses/makefile
- 2 in patch.P1: driver.f, matrix1.f
./CODEGEN/generateAndCompare.sh gg_tt --mad --nopatch
git diff --no-ext-diff -R gg_tt.mad/Source/makefile gg_tt.mad/Source/genps.inc gg_tt.mad/SubProcesses/makefile > CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.common
git diff --no-ext-diff -R gg_tt.mad/SubProcesses/P1_gg_ttx/driver.f gg_tt.mad/SubProcesses/P1_gg_ttx/matrix1.f > CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.P1
git checkout gg_tt.mad
STARTED AT Wed Aug 21 08:07:41 PM CEST 2024
./tput/teeThroughputX.sh -mix -hrd -makej -eemumu -ggtt -ggttg -ggttgg -gqttq -ggttggg -makeclean
ENDED(1) AT Wed Aug 21 08:45:12 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -flt -hrd -makej -eemumu -ggtt -ggttgg -inlonly -makeclean
ENDED(2) AT Wed Aug 21 08:55:06 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -makej -eemumu -ggtt -ggttg -gqttq -ggttgg -ggttggg -flt -bridge -makeclean
ENDED(3) AT Wed Aug 21 09:04:04 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -rmbhst
ENDED(4) AT Wed Aug 21 09:06:49 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -curhst
ENDED(5) AT Wed Aug 21 09:09:32 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -common
ENDED(6) AT Wed Aug 21 09:12:19 PM CEST 2024 [Status=0]
./tput/teeThroughputX.sh -mix -hrd -makej -susyggtt -susyggt1t1 -smeftggtttt -heftggbb -makeclean
ENDED(7) AT Wed Aug 21 09:32:51 PM CEST 2024 [Status=0]
No errors found in logs
eemumu MEK (channelid array) processed 512 events across 2 channels { 1 : 256, 2 : 256 }
eemumu MEK (no multichannel) processed 512 events across 2 channels { no-multichannel : 512 }
ggttggg MEK (channelid array) processed 512 events across 1240 channels { 1 : 32, 2 : 32, 4 : 32, 5 : 32, 7 : 32, 8 : 32, 14 : 32, 15 : 32, 16 : 32, 18 : 32, 19 : 32, 20 : 32, 22 : 32, 23 : 32, 24 : 32, 26 : 32 }
ggttggg MEK (no multichannel) processed 512 events across 1240 channels { no-multichannel : 512 }
ggttgg MEK (channelid array) processed 512 events across 123 channels { 2 : 32, 3 : 32, 4 : 32, 5 : 32, 6 : 32, 7 : 32, 8 : 32, 9 : 32, 10 : 32, 11 : 32, 12 : 32, 13 : 32, 14 : 32, 15 : 32, 16 : 32, 17 : 32 }
ggttgg MEK (no multichannel) processed 512 events across 123 channels { no-multichannel : 512 }
ggttg MEK (channelid array) processed 512 events across 16 channels { 1 : 64, 2 : 32, 3 : 32, 4 : 32, 5 : 32, 6 : 32, 7 : 32, 8 : 32, 9 : 32, 10 : 32, 11 : 32, 12 : 32, 13 : 32, 14 : 32, 15 : 32 }
ggttg MEK (no multichannel) processed 512 events across 16 channels { no-multichannel : 512 }
ggtt MEK (channelid array) processed 512 events across 3 channels { 1 : 192, 2 : 160, 3 : 160 }
ggtt MEK (no multichannel) processed 512 events across 3 channels { no-multichannel : 512 }
gqttq MEK (channelid array) processed 512 events across 5 channels { 1 : 128, 2 : 96, 3 : 96, 4 : 96, 5 : 96 }
gqttq MEK (no multichannel) processed 512 events across 5 channels { no-multichannel : 512 }
heftggbb MEK (channelid array) processed 512 events across 4 channels { 1 : 128, 2 : 128, 3 : 128, 4 : 128 }
heftggbb MEK (no multichannel) processed 512 events across 4 channels { no-multichannel : 512 }
smeftggtttt MEK (channelid array) processed 512 events across 72 channels { 1 : 32, 2 : 32, 3 : 32, 4 : 32, 5 : 32, 6 : 32, 7 : 32, 8 : 32, 9 : 32, 10 : 32, 11 : 32, 12 : 32, 13 : 32, 14 : 32, 15 : 32, 16 : 32 }
smeftggtttt MEK (no multichannel) processed 512 events across 72 channels { no-multichannel : 512 }
susyggt1t1 MEK (channelid array) processed 512 events across 6 channels { 2 : 128, 3 : 96, 4 : 96, 5 : 96, 6 : 96 }
susyggt1t1 MEK (no multichannel) processed 512 events across 6 channels { no-multichannel : 512 }
susyggtt MEK (channelid array) processed 512 events across 3 channels { 1 : 192, 2 : 160, 3 : 160 }
susyggtt MEK (no multichannel) processed 512 events across 3 channels { no-multichannel : 512 }
…e24 branch - everything ok
STARTED AT Wed Aug 21 11:17:50 PM CEST 2024
(SM tests) ENDED(1) AT Thu Aug 22 03:22:15 AM CEST 2024 [Status=0]
(BSM tests) ENDED(1) AT Thu Aug 22 03:33:50 AM CEST 2024 [Status=0]
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_d_inl0_hrd0.txt
1 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_heftggbb_mad/log_heftggbb_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_smeftggtttt_mad/log_smeftggtttt_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggt1t1_mad/log_susyggt1t1_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_susyggtt_mad/log_susyggtt_mad_m_inl0_hrd0.txt
eemumu MEK processed 8192 events across 2 channels { 1 : 8192 }
eemumu MEK processed 90112 events across 2 channels { 1 : 90112 }
ggttggg MEK processed 8192 events across 1240 channels { 1 : 8192 }
ggttggg MEK processed 90112 events across 1240 channels { 1 : 90112 }
ggttgg MEK processed 8192 events across 123 channels { 112 : 8192 }
ggttgg MEK processed 90112 events across 123 channels { 112 : 90112 }
ggttg MEK processed 8192 events across 16 channels { 1 : 8192 }
ggttg MEK processed 90112 events across 16 channels { 1 : 90112 }
ggtt MEK processed 8192 events across 3 channels { 1 : 8192 }
ggtt MEK processed 90112 events across 3 channels { 1 : 90112 }
gqttq MEK processed 8192 events across 5 channels { 1 : 8192 }
gqttq MEK processed 90112 events across 5 channels { 1 : 90112 }
heftggbb MEK processed 8192 events across 4 channels { 1 : 8192 }
heftggbb MEK processed 90112 events across 4 channels { 1 : 90112 }
smeftggtttt MEK processed 8192 events across 72 channels { 1 : 8192 }
smeftggtttt MEK processed 90112 events across 72 channels { 1 : 90112 }
susyggt1t1 MEK processed 8192 events across 6 channels { 3 : 8192 }
susyggt1t1 MEK processed 90112 events across 6 channels { 3 : 90112 }
susyggtt MEK processed 8192 events across 3 channels { 1 : 8192 }
susyggtt MEK processed 90112 events across 3 channels { 1 : 90112 }
… and put ee_mumua to the sde=1 cross-section
@Andrea, we can decide what to do with this PR/branch. But this includes the CI (which is fixed) and "your" version of june24. So now I'm finally in business to work on the warp_used part.
Hi @oliviermattelaer, I have instead fixed my june24 branch in #882. I propose to close this #981 and focus instead on #882. Tomorrow morning, after I have run some tests, we should be able to merge that. Can I close this?
I propose to close this and focus on #882
Hi @oliviermattelaer, as discussed: this is a duplicate of #882 (and it is not up to date), so I am closing this. I am about to merge #882 into master_june24 instead, and will then merge master_june24 into master in #985. CLOSING.
This is in order to have the full diff when combining Andrea's version of june24 with the proper CI.