Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fbridgesequence_nomultichannel #796

Merged
merged 27 commits into from
Feb 8, 2024
Merged

Add fbridgesequence_nomultichannel #796

merged 27 commits into from
Feb 8, 2024

Conversation

valassi
Copy link
Member

@valassi valassi commented Nov 23, 2023

Hi @roiser @oliviermattelaer as discussed earlier on.

This is a MR that adds fortran call fbridgesequence_nomultichannel. It should be useful in Stefan's MR for trasnsforming channelid into an array.

As noted by @roiser, the issue is that fbridgesequence is presently called also with channelid=0 from the fortran (the integer is 0, not the pointer to the integer).

        IF ( .NOT. MULTI_CHANNEL ) THEN
          CALL FBRIDGESEQUENCE(FBRIDGE_PBRIDGE, P_MULTI, ALL_G,
     &      HEL_RAND, COL_RAND, 0, OUT2,
     &      SELECTED_HEL2, SELECTED_COL2 ) ! 0: multi channel disabled

Technically it is possible from fortran to use a pointer (this is what we do for FBRIDGE_PBRIDGE via CppObjectInFortran struct), but it seems cumbersome here.

As noted by Stefan, better use a different function for the bridge sequence, this is what I have done here.

Note, initially I had misunderstood that the issue was eleiminating the #ifdef branch where multichannel is not supported. I tried this in the initial commits of this PR, but eventually this failed because it would break pure standalone code with no madevent, as there is no vcoloramps.h in that case and the code does not build. So I have also kept that anyway. Just to clarify, there are three use cases and they are all still supported

  1. multichannel code, with channelid>0 (presently a scalar, eventuallay an array via a pointer)
  2. multichannel code, with channelid=0 (presently a scalar, eventually a pointer=0 in cudacpp and a different function in fortran, i.e. this PR)
  3. no-multichannel code (no coloramps.h), is pure standalone generation

Stefan I ask you as reviewer. Up to you if you want to include this now before your PR, or cherry-pick inside your PR, or do your PR first and then we merge this on top (I can fix teh conflicts in tht case).

Thanks Andrea

…ode as "wih multichannel" - the build fails

ccache g++  -O3  -std=c++17 -I. -I../../src -Wall -Wshadow -Wextra -ffast-math  -fopenmp  -march=skylake-avx512 -mprefer-vector-width=256  -DMGONGPU_FPTYPE_DOUBLE -DMGONGPU_FPTYPE2_DOUBLE -fPIC -c CPPProcess.cc -o CPPProcess.o
CPPProcess.cc:30:10: fatal error: coloramps.h: No such file or directory
   30 | #include "coloramps.h"
      |          ^~~~~~~~~~~~~
compilation terminated.
Revert "[mch] regenerate gg_tt.sa, "without multichannel" but with the same code as "wih multichannel" - the build fails"
This reverts commit f42c4bb.

Revert "[mch] regenerate gg_tt.mad, all ok"
This reverts commit 23f750f.

Revert "[mch] backport to codegen from gg_tt.mad the removal of #ifdef MGONGPU_SUPPORTS_MULTICHANNEL"
This reverts commit 71b6a74.

Revert "[mch] in gg_tt.mad, remove #ifdef MGONGPU_SUPPORTS_MULTICHANNEL and assume multichannel is always supported"
This reverts commit ce57dda.
./CODEGEN/generateAndCompare.sh gg_tt --mad --nopatch
git diff --no-ext-diff -R gg_tt.mad/Source/makefile gg_tt.mad/Source/dsample.f gg_tt.mad/Source/genps.inc gg_tt.mad/SubProcesses/makefile > CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.common
git diff --no-ext-diff -R gg_tt.mad/bin/internal/banner.py gg_tt.mad/bin/internal/gen_ximprove.py gg_tt.mad/bin/internal/madevent_interface.py >> CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.common
git diff --no-ext-diff -R gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f gg_tt.mad/SubProcesses/P1_gg_ttx/driver.f gg_tt.mad/SubProcesses/P1_gg_ttx/matrix1.f > CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.P1
git checkout gg_tt.mad

NB: I have now removed make_opts from patch.common
… (will revert)

./tput/teeThroughputX.sh -ggtt -makeclean -makej
./tmad/teeMadX.sh -ggtt +10x
Revert "[mch] rerun tput and tmad tests for ggtt.mad as a cross check, all ok (will revert)"
This reverts commit a395efb.
@valassi
Copy link
Member Author

valassi commented Nov 23, 2023

For an easier review, just look at this
d5c4c1b

The rest is code generation changes, and process regeneration

valassi added a commit to valassi/madgraph4gpu that referenced this pull request Dec 9, 2023
…patch.P1 and patch.common from scratch

./CODEGEN/generateAndCompare.sh gg_tt --mad --nopatch
sed -i 's/DEFAULT_F2PY_COMPILER=f2py3.*/DEFAULT_F2PY_COMPILER=f2py3/' gg_tt.mad/Source/make_opts
git diff --no-ext-diff -R gg_tt.mad/Source/make_opts gg_tt.mad/Source/makefile gg_tt.mad/Source/dsample.f gg_tt.mad/Source/genps.inc gg_tt.mad/SubProcesses/makefile > CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.common
git diff --no-ext-diff -R gg_tt.mad/bin/internal/banner.py gg_tt.mad/bin/internal/gen_ximprove.py gg_tt.mad/bin/internal/madevent_interface.py >> CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.common
git diff --no-ext-diff -R gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f gg_tt.mad/SubProcesses/P1_gg_ttx/driver.f gg_tt.mad/SubProcesses/P1_gg_ttx/matrix1.f > CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.P1
git checkout gg_tt.mad

NB1: In the instructions above, I changed the order of files in the patches (I moved make_opts to the beginning), to minimise differences in patch.common with respect to Stephan's.
NB2: In another branch mch (PR madgraph5#796) I had decided to remove make_opts from patch.common: this is clearly no longer possible now

Before this commit, i.e. using Stephan's patch.P1 and patch.common, this was happening:
- code generation succeeds on itscrd90, fails on itscrd80 (patchMad.sh fails on the latter)
- executing './CODEGEN/generateAndCompare.sh gg_tt  --mad --nopatch' shows that the only difference between itscrd80 and itscrd90 is f2py (f2py3.9 on the former, f2py on the latter)

After this commit, however:
- code generation fails on itscrd90 (patchMad.sh fails, probably due to f2py issues)
valassi added a commit to valassi/madgraph4gpu that referenced this pull request Dec 14, 2023
…patch.P1 and patch.common from scratch

(This is an old commit from 9 Dec but the description is still valid on 14 Dec)

./CODEGEN/generateAndCompare.sh gg_tt --mad --nopatch
sed -i 's/DEFAULT_F2PY_COMPILER=f2py3.*/DEFAULT_F2PY_COMPILER=f2py3/' gg_tt.mad/Source/make_opts
git diff --no-ext-diff -R gg_tt.mad/Source/make_opts gg_tt.mad/Source/makefile gg_tt.mad/Source/dsample.f gg_tt.mad/Source/genps.inc gg_tt.mad/SubProcesses/makefile > CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.common
git diff --no-ext-diff -R gg_tt.mad/bin/internal/banner.py gg_tt.mad/bin/internal/gen_ximprove.py gg_tt.mad/bin/internal/madevent_interface.py >> CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.common
git diff --no-ext-diff -R gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f gg_tt.mad/SubProcesses/P1_gg_ttx/driver.f gg_tt.mad/SubProcesses/P1_gg_ttx/matrix1.f > CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.P1
git checkout gg_tt.mad

NB1: In the instructions above, I changed the order of files in the patches (I moved make_opts to the beginning), to minimise differences in patch.common with respect to Stephan's.
NB2: In another branch mch (PR madgraph5#796) I had decided to remove make_opts from patch.common: this is clearly no longer possible now

Before this commit, i.e. using Stephan's patch.P1 and patch.common, this was happening:
- code generation succeeds on itscrd90, fails on itscrd80 (patchMad.sh fails on the latter)
- executing './CODEGEN/generateAndCompare.sh gg_tt  --mad --nopatch' shows that the only difference between itscrd80 and itscrd90 is f2py (f2py3.9 on the former, f2py on the latter)

After this commit, however:
- code generation fails on itscrd90 (patchMad.sh fails, probably due to f2py issues)
@valassi
Copy link
Member Author

valassi commented Dec 16, 2023

I have just merged #706, I will modify this PR to merge in the new master

…adgraph5#706) into mch

Fix conflicts in CODEGEN logs by checking them out from upstream/master
git checkout upstream/master $(git ls-tree --name-only HEAD */CODEGEN*txt)
…pstream/master including PR madgraph5#706) - ok, changes are only in codegen logs
… common ancestor

git merge-base --fork-point upstream/master
80ff716

git checkout 80ff716 $(git ls-tree --name-only HEAD */CODEGEN*txt)
STARTED  AT Sat Feb  3 06:00:52 PM CET 2024
./tput/teeThroughputX.sh -mix -hrd -makej -eemumu -ggtt -ggttg -ggttgg -gqttq -ggttggg -makeclean
ENDED(1) AT Sat Feb  3 06:31:47 PM CET 2024 [Status=0]
./tput/teeThroughputX.sh -flt -hrd -makej -eemumu -ggtt -ggttgg -inlonly -makeclean
ENDED(2) AT Sat Feb  3 06:41:48 PM CET 2024 [Status=0]
./tput/teeThroughputX.sh -makej -eemumu -ggtt -ggttg -gqttq -ggttgg -ggttggg -flt -bridge -makeclean
ENDED(3) AT Sat Feb  3 06:51:54 PM CET 2024 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -rmbhst
ENDED(4) AT Sat Feb  3 06:55:15 PM CET 2024 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -curhst
ENDED(5) AT Sat Feb  3 06:58:37 PM CET 2024 [Status=0]
…nd maybe more) ** rerun 18 tmad tests on itscrd90, all ok

STARTED AT Sat Feb  3 07:02:02 PM CET 2024
ENDED   AT Sat Feb  3 11:20:09 PM CET 2024

Status=0

24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_eemumu_mad/log_eemumu_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttggg_mad/log_ggttggg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttgg_mad/log_ggttgg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggttg_mad/log_ggttg_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_ggtt_mad/log_ggtt_mad_m_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_d_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_f_inl0_hrd0.txt
24 /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp/tmad/logs_gqttq_mad/log_gqttq_mad_m_inl0_hrd0.txt
@valassi
Copy link
Member Author

valassi commented Feb 5, 2024

Hi @oliviermattelaer this PR is also complete. It is one small ingredient of the work you are doing with @roiser to have many channels on the same grid. I would suggest putting this in independently of Stefan's work and doing it now. Otherwise every time there is a need to update to the latest master, retest etc.

Can you please @oliviermattelaer review and approve for merge?

As mentioned above the main change is just this one d5c4c1b which is really simple

Thanks Andrea

@roiser
Copy link
Member

roiser commented Feb 5, 2024

Actually with the latest changes that we need to make (no channel ids for CPU) I would propose to postpone this PR a bit if you don't mind. I haven't thought it through yet what the implications are.

@valassi
Copy link
Member Author

valassi commented Feb 5, 2024

Thanks @roiser. No I would very much prefer to merge this now. Having all these PRs in flight forces me to spend a large effort fixing conflicts and retesting. The sooner we get rid of open PRs so that we only have a more manageable number, the better.

Concerning this specific PR: it is so simple that it should not be a problem in the work you are doing now (and actually you asked for this initially so it cannot be bad!). It just adds a new function for the no-multichannel case and uses it where appropriate, in helicity filtering.

I would really merge this now. @oliviermattelaer what's your take?

Thanks Andrea

@valassi
Copy link
Member Author

valassi commented Feb 5, 2024

PS The fact that this now appears as not-mergeable is because I just merged PR #368. Probably easy conflicts to fix, but again I would rather not have to do this all the time. The sooner we merge this the better.

@roiser
Copy link
Member

roiser commented Feb 5, 2024

I'm just asking for postponing it by one or two days, the latest incompatibilities only came up on Friday, I can also incorporate it into the bigger channelID work if wanted.

@valassi
Copy link
Member Author

valassi commented Feb 5, 2024

I'm just asking for postponing it by one or two days, the latest incompatibilities only came up on Friday,

Ok for one or two days then...

I can also incorporate it into the bigger channelID work if wanted.

I would still find it cleaner to have separate PRs for separate issues - in general. If you want I can reduce this to essentially one cherry-pick in the CODEGEN part, and hide away all changes in generated code and test logs, so that it's easier to merge. It becomes equivalent to a cherry-pick on your side.

@@ -56,6 +56,31 @@ C
END SUBROUTINE FBRIDGESEQUENCE
END INTERFACE

C
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot leave a comment in code that was not changed, but I would propose in the line 52 above to change to

INTEGER*4 CHANID(*)

I have done this change in the new_interface_wrap branch, see

and this seems to work correctly.

Copy link
Member Author

@valassi valassi Feb 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @roiser thanks for having a look.

I think these are two different issues:

  • in my PR here, channelid is still a scalar, and so this must be CHANID and not CHANID(*)
  • in your upcoming PR, you will indeed need to change CHANID to CHANID(*)

Specifically, if I simply change myself CHANID to CHANID(*). then this fails the build

ccache /cvmfs/sft.cern.ch/lcg/releases/gcc/12.1.0-57c96/x86_64-centos9/bin/gfortran -w -fPIC -O3 -ffast-math -fbounds-check -ffixed-line-length-132 -w -cpp -c -DMG5AMC_MEEXPORTER_CUDACPP auto_dsig1.f -I../../Source/ -fopenmp -o auto_dsig1_cudacpp.o
auto_dsig1.f:573:32:

  573 |      &      HEL_RAND, COL_RAND, CHANNEL, OUT2,
      |                                1
Error: Rank mismatch in argument 'chanid' at (1) (rank-1 and scalar)

I would suggest to keep this in two different PRs

  • here I just add the new method and keep CHANID
  • in your PR you will then change CHANID to CHANID(*)

Does this sound ok? Please let me know if you think this can be merged @roiser thanks

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this PR emerged out of the work for channel id arrays. If I do not remember correctly and it was done for another purpose please merge this here and I would adapt then with the later channel id PR if needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it was indeed done for that purpose. But it does not harm to include it anyway (it does simplify one function call without multichannel support). I would include it anyway now so that there is no need for further conflict fixes etc. It makes completely sense by itself and was fully tested by itself.

@roiser
Copy link
Member

roiser commented Feb 6, 2024

I'm just asking for postponing it by one or two days, the latest incompatibilities only came up on Friday,

Ok for one or two days then...

I looked through the code and found one possibly minor issue, please have a look at the comment above. Otherwise I imagine this should work. @oliviermattelaer you would need to pass me an INTEGER*4 (*) in cases the channel id is used. Following our latest discussion IIUC for CPU this would be only of length 1, is this possible for you to distinguish?

…se merging of upstream/master here (and later to ease the merge of master into Stefan's work)

git checkout upstream/master tput/logs* tmad/logs*
git checkout upstream/master $(git ls-tree --name-only HEAD */CODEGEN*txt)
@roiser
Copy link
Member

roiser commented Feb 6, 2024

please go ahead then, thanks Stefan

…ion of FP vector by a boolean mask (for channel array madgraph5#765)
…nstrate multiplication of FP vector by a boolean mask (for channel array madgraph5#765)
… testmisc.cc

Checked that code generation of gg_tt.mad is now stable
for f in $(git ls-tree --name-only HEAD */SubProcesses/testmisc.cc); do \cp gg_tt.mad/SubProcesses/testmisc.cc $f; done
… ** improve a comment in CODEGEN and generated code for testmisc.cc (boolean vector times float vector, not integer vector times float vector)
@valassi
Copy link
Member Author

valassi commented Feb 8, 2024

please go ahead then, thanks Stefan

Thanks I will merge then

@valassi valassi merged commit 57c1ba6 into madgraph5:master Feb 8, 2024
57 checks passed
valassi added a commit to valassi/madgraph4gpu that referenced this pull request Feb 9, 2024
…am/master' (including fbridgesequence_nomultichannel PR madgraph5#796) into rocrand

Checked that ggtt.mad can be regenerated correctly an dthat its tput/tmad tests succeed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants