
New interface #764

Merged
merged 15 commits into from
Sep 15, 2023

Conversation

oliviermattelaer
Member

This includes various bug fixes but also a starting point for the new interface, including

  • the possibility to use the launch command within MG5aMC
  • the possibility to control the backend via the run_card
  • the possibility to control vector_size via the run_card (recompilation is correctly handled)

@oliviermattelaer
Member Author

Looks like the update of the master branch by @hageboeck creates an issue with this PR.
I'm now getting this crash:

	The compilation fails with the following output message:
	    make: *** No rule to make target `madevent_cpp_link'.  Stop.

I'm investigating it (sorry for having opened the PR too early).

@oliviermattelaer
Member Author

oliviermattelaer commented Sep 12, 2023

With this latest fix and on my Mac:

  • tt~(0-2)j is working
    • from the launch command
    • from ./bin/generate_events (but it is important to set the run_card correctly, in particular vector_size, and to do so the first time!)
  • DY 0+1j is working
    • from the launch command
    • from ./bin/generate_events
  • DY 0+1+2j:
    • issue with "Note: The following floating-point exceptions are signalling: IEEE_INVALID_FLAG IEEE_DIVIDE_BY_ZERO"

Improvements required (but let's not wait for those):

  • improve the handling of vector_size in the run_card so that the code does not complain if the run_card does not define it
  • check recompilation if the backend is changed
  • check recompilation if compilation flags are updated
  • the cudacpp backend does not appear in the default run_card (need to think about how this should be handled)
  • define a way to have shortcuts for the complex options, and better default values from the run_card (especially if parameters are not defined), like for SDE_strategy

So note that it is currently MANDATORY to add the definition of vector_size within the run_card (otherwise it is set back to one and the code crashes).
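For reference, run_card entries use the `value = name ! comment` layout; a minimal sketch of the kind of line that must be added (the value and the comment text here are illustrative, not the project's defaults):

```
  16 = vector_size ! illustrative: size of the SIMD/GPU event batch
```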

@oliviermattelaer
Member Author

A nicer user interface is now available for users.
Note that the previous command STILL works (so this should not break anything), but here is the new command:

generate p p > t t~
output madevent_simd MYPROC
launch

which is equivalent to:

generate p p > t t~
output MYPROC --hel_recycling=False --vector_size=16 --me_exporter=standalone_cudacpp
launch

and for GPU

generate p p > t t~
output madevent_gpu MYPROC
launch

which has the same meaning (but with a default vector_size=2048).
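The relation between the shortcuts and the explicit options quoted above can be sketched as follows; this is a hypothetical illustration (the `SHORTCUTS` table and `expand` helper are mine, with values taken from this thread), not the actual MG5aMC implementation:

```python
# Hypothetical sketch: how the new output-mode shortcuts expand into
# the explicit options shown earlier in this thread (not real MG5aMC code).
SHORTCUTS = {
    "madevent_simd": {
        "hel_recycling": False,
        "vector_size": 16,
        "me_exporter": "standalone_cudacpp",
    },
    "madevent_gpu": {
        "hel_recycling": False,
        "vector_size": 2048,  # GPU default quoted above (later raised to 16k)
        "me_exporter": "standalone_cudacpp",
    },
}

def expand(mode: str) -> str:
    """Render 'output <mode> MYPROC' as the equivalent explicit command."""
    opts = SHORTCUTS[mode]
    flags = " ".join(f"--{k}={v}" for k, v in opts.items())
    return f"output MYPROC {flags}"

print(expand("madevent_simd"))
# output MYPROC --hel_recycling=False --vector_size=16 --me_exporter=standalone_cudacpp
```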

@roiser
Member

roiser commented Sep 13, 2023

This is really great!!! A huge simplification, thanks a lot.

I was just wondering about this:

which has the same meaning (but a default vector_size=2048)

Is this vector_size enough? Didn't we run with 16384 in the past, in the sense that we need to fill the GPU with events? The open question is maybe whether there is a "physics case" for doing this.

@hageboeck
Member

which has the same meaning (but a default vector_size=2048)

this vector_size is enough? Didn't we run with 16384 in the past, in the sense that we need to fill the GPU with events. The open question is maybe if there is a "physics case" for doing this?

It sounds a bit small to me. The sweet spot for gg → ttgg(g) was at 8k and 16k, but I don't know this number for other processes.
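The sizes being debated are about keeping the GPU busy: with one event per GPU thread, the batch should at least cover the device's resident threads. A back-of-the-envelope sketch (the device figures below are illustrative assumptions, not measurements from this PR):

```python
# Back-of-the-envelope check (illustrative device numbers, my assumptions)
# of why a 2048-event batch may underfill a GPU that maps one event per thread.
SM_COUNT = 64            # assumption: number of streaming multiprocessors
THREADS_PER_BLOCK = 128  # assumption: launch configuration
BLOCKS_PER_SM = 1        # assumption: modest occupancy

resident_threads = SM_COUNT * THREADS_PER_BLOCK * BLOCKS_PER_SM  # 8192 here

for vector_size in (2048, 8192, 16384):
    status = "fills" if vector_size >= resident_threads else "underfills"
    print(f"vector_size={vector_size}: {status} this hypothetical device")
```

Under these assumptions 2048 underfills the device while 8k and 16k fill it, which is consistent with the sweet spot quoted above.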

@oliviermattelaer
Member Author

OK, I will change it to 16k.
But this worries me, since that is too large to avoid bias in the phase-space.

I guess this means that we have to change the bridge so that the channel can be set either event by event or wrap by wrap. Would you be able to do that?
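The two channel-assignment strategies being contrasted can be sketched as follows; this is a hypothetical Python illustration (the helper functions and parameter names other than nb_wrap are mine), not the actual bridge API:

```python
# Hypothetical sketch of the two strategies discussed: one channel per event,
# versus one channel shared by each "wrap" (sub-batch) of the event vector.
import random

def channels_event_by_event(n_events: int, n_channels: int) -> list:
    """One independently sampled channel per event (no batch-level bias)."""
    return [random.randrange(n_channels) for _ in range(n_events)]

def channels_wrap_by_wrap(n_events: int, n_channels: int, nb_wrap: int) -> list:
    """One channel per wrap; with few wraps, few distinct channels are
    sampled per iteration, which is the bias concern for large batches."""
    wrap_size = n_events // nb_wrap
    out = []
    for _ in range(nb_wrap):
        channel = random.randrange(n_channels)
        out.extend([channel] * wrap_size)  # whole wrap shares this channel
    return out

per_event = channels_event_by_event(16384, 100)
per_wrap = channels_wrap_by_wrap(16384, 100, nb_wrap=512)  # wraps of 32 events
```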

@oliviermattelaer
Member Author

oliviermattelaer commented Sep 14, 2023

So I have updated it to 16k,

and started to work on a branch which will fix my worry about bias
(where vector_size will be "small" (< 100) and a second number (nb_wrap) could be large (= 512 for GPU)):
https://github.com/madgraph5/madgraph4gpu/tree/new_interface_wrap
But that would be another merge request/discussion.
(And for me, I'm using #765 as a notepad for the changes that I want to make, since I will likely need a break on those.)

@roiser roiser merged commit fbdacbd into master Sep 15, 2023
21 checks passed
valassi added a commit to valassi/madgraph4gpu that referenced this pull request Oct 28, 2023
This adds the following commits by Olivier (final part of PR madgraph5#764, I had already merged individually previous commits)
fbdacbd       Stefan Roiser   Fri Sep 15 09:36:36 2023 +0200  Merge pull request madgraph5#764 from madgraph5/new_interface
87afdc2       Olivier Mattelaer       Thu Sep 14 09:13:25 2023 +0200  change default gpu size to 16k
f27f44d       Olivier Mattelaer       Tue Sep 12 23:33:24 2023 +0200  adding shortcut for the SIMD/GPU mode
22fd349       Olivier Mattelaer       Tue Sep 12 16:53:39 2023 +0200  allowing run_card to not include the vector_size (then use the value defined at output time)
5b594c7       Olivier Mattelaer       Tue Sep 12 16:08:42 2023 +0200  fix compilation issue
0dd8542       oliviermattelaer        Tue Sep 12 15:29:00 2023 +0200  Merge branch 'master' into new_interface
c839c47       Olivier Mattelaer       Tue Sep 12 15:07:03 2023 +0200  update to version supporting vector_size to be setup via the run_card

Fix conflicts:
epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/output.py
epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/patchMad.sh
valassi added a commit to valassi/madgraph4gpu that referenced this pull request Oct 28, 2023
…rd generation, cleanup) from patchMad.sh to generateAndCompare.sh

This is meant to fix various code generation issues introduced in generateAndCompare.sh by PR madgraph5#764
valassi added a commit to valassi/madgraph4gpu that referenced this pull request Oct 28, 2023
…atching Olivier's PR madgraph5#764

Code changes are only in .mad directories and mainly in the python infrastructure.
There are several matrix1.f that have changed too however (including ggttg/gg/ggg. but not ggtt).

I checked that ggtt and also ggttg tput/tmad tests succeed (logs not kept).
@@ -405,8 +405,7 @@ namespace mg5amcCpu
{
// nprocesses>1 was last observed for "mirror processes" in uux_ttx in the 270 branch (see issue #343 and PRs #360 and #396)
constexpr int nprocesses = %(nproc)i;
static_assert( nprocesses == 1, "Assume nprocesses == 1" );
// process_id corresponds to the index of DSIG1 Fortran functions (must be 1 because cudacpp is unable to handle DSIG2)
static_assert( nprocesses == 1 || nprocesses == 2, "Assume nprocesses == 1 or 2" );
@valassi
Member

valassi commented Jul 24, 2024
This is probably related to #872.

Note that nprocesses=2 was introduced back into cudacpp by PR #754.

I will propose to undo both of these changes (from #754 and this #764) to fix #872 in PR #935.

valassi added a commit to valassi/madgraph4gpu that referenced this pull request Jul 24, 2024
…ting nprocesses == 1 (part of the fix for madgraph5#872 in PR madgraph5#935; undo changes from PR madgraph5#754 and madgraph5#764)

NB: nprocesses is unused in cudacpp, because cudacpp is unable to handle mirror processes!
This means that the results of madevent with cudacpp MEs are the same whether nprocesses is 1 or 2.
However, nprocesses=2 signals that Fortran is using mirror processes.
If mirror processes are used in Fortran MEs but not in cudacpp MEs, this leads to a xsec mismatch madgraph5#872.

Implementing mirror processes in cudacpp would be extremely complicated.
This is an assumption that was made years ago.
To be cross-checked with Olivier, but I think that the code generated with cudacpp without mirror processes
also ensures that the mirror processes are _separately_ generated as separate subprocesses in Fortran as well,
so the whole calculation is internally consistent if mirrorprocesses is false in both Fortran and cudacpp.
valassi added a commit to valassi/madgraph4gpu that referenced this pull request Jul 25, 2024
…ck to upstream/master (i.e. do not undo PR madgraph5#754, this is needed, keep also madgraph5#764)

** NB bug madgraph5#872 xsec mismatch will show up again in the CI and in manual tests **

Revert "[pptt] in CODEGEN for CPPProcess.cc, add back the static_assert expecting nprocesses == 1 (part of the fix for madgraph5#872 in PR madgraph5#935; undo changes from PR madgraph5#754 and madgraph5#764)"
This reverts commit b5576ad.

Revert "[pptt] in CODEGEN model_handling.py, disable mirror processing again (undo PR madgraph5#754) to fix madgraph5#872"
This reverts commit b28e04b.
Successfully merging this pull request may close these issues.

wrong type argument to unary minus (FFV functions should not use "-COUPs" as argument)