Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

master_june24: nomultichannel should be an array nullptr, not an array full of 0s #892

Closed
valassi opened this issue Jul 8, 2024 · 5 comments · Fixed by #882
Closed

master_june24: nomultichannel should be an array nullptr, not an array full of 0s #892

valassi opened this issue Jul 8, 2024 · 5 comments · Fixed by #882
Assignees
Milestone

Comments

@valassi
Copy link
Member

valassi commented Jul 8, 2024

Another issue introduced in #830 and being reviewed in #882.

Before channelid array #830, the same channel was used for all events. The convention was to use channelid=0 to indicate nomultichannel in some fringe cases (like helicity filtering, or runTest.exe, of check.exe).

To prepare for channelid array #830, we had agreed that nomultichannel would be a nullptr to a channelid array. I was asked to prepare a specific function to handle one of these cases, fbridgesequence_nomultichannel (PR #796). This was already used in the code in some of the relevant places.

In channelid array #830 for master_june24, however, this has changed

  • the fbridgesequence_nomultichannel that I had been asked to create is no longer used
  • some bits of the code now seem to go back using channelid=0 (event by event) to indicate no multichannel
  • other bits of the code (e.g. BridgeKernels.cc) still use pchannelid=nullptr to indicate nonmultichannel

This is confusing, internally inconsistent, and differs from what we had agreed.

More fundamentally, when code is generated in multichannel mode, but a function is called in nomultichannel mode (e.g. because helicity filtering does not need single diagram enhancement, or because so far runTest.exe and check.exe use no multichannel), this is a global parameter of the function for all events: there is no use case where we use multichannel for some events, and nomultichannel for other events, in the same batch.

We should implement nomultichannel as nullptr to channelid.

@valassi valassi self-assigned this Jul 8, 2024
@valassi
Copy link
Member Author

valassi commented Jul 8, 2024

Example in gg_tt.mad (regenerated with master_june24 as is).

This is completely internally inconsistent

https://github.com/valassi/madgraph4gpu/blob/8e312bc02d9d072615fcb1052b5db54754498517/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/CPPProcess.cc#L342

      // Amplitude(s) for diagram number 1
      FFV1_0<W_ACCESS, A_ACCESS, CD_ACCESS>( w_fp[3], w_fp[2], w_fp[4], COUPs[1], 1.0, &amp_fp[0] );
#ifdef MGONGPU_SUPPORTS_MULTICHANNEL
      if( channelIds != 0 )
      {
        channelids_sv = CID_ACCESS::kernelAccessConst( channelIds );
#if defined __CUDACC__ or !defined MGONGPU_CPPSIMD
        if( channelids_sv == 1 ) numerators_sv += cxabs2( amp_sv[0] );
        denominators_sv += cxabs2( amp_sv[0] );
#else
        for( int i = 0; i < neppV; ++i )
        {
          if( channelids_sv[i] == 1 ) numerators_sv += cxabs2( amp_sv[0] );
          denominators_sv += cxabs2( amp_sv[0] );
        }
#endif
      }
#endif

https://github.com/valassi/madgraph4gpu/blob/8e312bc02d9d072615fcb1052b5db54754498517/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/CPPProcess.cc#L1033

#ifdef MGONGPU_SUPPORTS_MULTICHANNEL
    // Event-by-event random choice of color #402
    if( channelIds[0] != 0 ) // no event-by-event choice of color if channelId == 0 (fix FPE #783)
    {
      const unsigned int iconfigC = mgOnGpu::channelId_to_iconfigC[CID_ACCESS::kernelAccessConst( channelIds )]; // coloramps.h uses a channel ordering not the diagram id
      fptype targetamp[ncolor] = { 0 };
      for( int icolC = 0; icolC < ncolor; icolC++ )
      {
        if( icolC == 0 )
          targetamp[icolC] = 0;
        else
          targetamp[icolC] = targetamp[icolC - 1];
        if( mgOnGpu::icolamp[iconfigC][icolC] ) targetamp[icolC] += jamp2_sv[icolC];
      }
      //printf( "sigmaKin: ievt=%4d rndcol=%f\n", ievt, allrndcol[ievt] );
      for( int icolC = 0; icolC < ncolor; icolC++ )
      {
        if( allrndcol[ievt] < ( targetamp[icolC] / targetamp[ncolor - 1] ) )
        {
          allselcol[ievt] = icolC + 1; // NB Fortran [1,ncolor], cudacpp [0,ncolor-1]
          break;
        }
      }
    }
#endif
    // *** END OF PART 

The first code section correctly uses nullptr as a check, the second section uses individual channelids (the channelid of the first event...)

@roiser
Copy link
Member

roiser commented Jul 8, 2024

Hi, there are two places in CPPProcess where the channelID needs to be checked, one in wavefunctions (the above case) where this is all working fine and once in sigmaKin (the below case) where this was not working because our testing infrastructure was not supporting this case. I can try to dig out the case where this has happened. The way out was to check in sigmaKin the first value of the array, please note that I tried to make sure that the array in all cases contains 0s upon initialisation also for GPUs which I could not find a guarantuee for in the documentation.

@valassi
Copy link
Member Author

valassi commented Jul 8, 2024

The point is to have a uniform API. No multi channel with channelids is nullptr. I will fix this, faster

@valassi
Copy link
Member Author

valassi commented Jul 9, 2024

Hi, there are two places in CPPProcess where the channelID needs to be checked, one in wavefunctions (the above case) where this is all working fine and once in sigmaKin (the below case) where this was not working because our testing infrastructure was not supporting this case.

Thanks, I kind of see what you mean now. Indeed calculate_wavefunction and sigmaKin may handle this differently.

Or even more, I just realised that calculate_wavefunction and sigmaKin MUST hadnle this differently (to make it faster). I am modifying as follows (half way done and pending tests)

  • In calculate_wavefunction, I am going back to the old API. Essentially this function can be as-is what it is in upstream/master, more or less. The point is that calculate_)wavefunction (as specified in the comments above the signature) computes a single event (cuda, and c++ no simd), or a single event page/vector (c++ simd). Because of our assumptions (see discussion above and master_june24: missing sanity checks that all channelids are the same in a "warp" (or at least inside each SIMD vector) #898), a single scalar channelId is enough to be passed to calculate_wavefunctions.
  • In sigmakin instead there is the loop over SIMD vectors. This is where the decoding of the channelid for each specific vector must be done. And this is also where I am putting the sanity check that all events in a SIMD vector have the same channelid.

PS With respect to #894, ie the SIMD implementation, under the scanario above this becomes much simpler and extremely simple (again, calculate_wavefunctions goes back to upstream/master essentially)

valassi added a commit to valassi/madgraph4gpu that referenced this issue Jul 17, 2024
…omputeMatrixElements in madgraph5#892 (now bool useChannelIds, was uint channelId in master)
valassi added a commit to valassi/madgraph4gpu that referenced this issue Jul 17, 2024
…nt in computeMatrixElements in madgraph5#892 - bldall ok but runTest segfaults
valassi added a commit to valassi/madgraph4gpu that referenced this issue Jul 17, 2024
…ol useChannelIds, was uint channelId in master) to MEK::computeMatrixElements in madgraph5#892

In CODEGEN, adapt Bridge*/runTest/check accordingly to add the useChannelIds argument in computeMatrixElements
valassi added a commit to valassi/madgraph4gpu that referenced this issue Jul 17, 2024
…'channelIds != nullptr' (fix segfault madgraph5#892)

The SIMD tests pass without crashing, the CUDA test fails returning ME=0
valassi added a commit to valassi/madgraph4gpu that referenced this issue Jul 17, 2024
…nelIds[0] != 0' by 'channelIds != nullptr' (fix segfault madgraph5#892)

Also add the missing fix for computeMatrixElements (I just added -Wunused-parameter otherwise this is not spotted)

Also replace 0 by nullptr for clarity on the channelIDs array

Now both the SIMD and cuda runTest.exe pass without crashing and with success results
@oliviermattelaer oliviermattelaer added this to the warp milestone Jul 18, 2024
@valassi
Copy link
Member Author

valassi commented Jul 19, 2024

This is completed in #882.

This issue was so pervasive that it is impossible to point a single commit which fixed this. Amongst others these are relevant

git log --oneline | grep \#892 
229efdf44 [june24] in CODEGEN (backport gg_tt.mad) CPPProcess.cc, replace 'channelIds[0] != 0' by 'channelIds != nullptr' (fix segfault #892)
3bdd88d1b [june24] in gg_tt.mad CPPProcess.cc, replace 'channelIds[0] != 0' by 'channelIds != nullptr' (fix segfault #892)
8ca4fb612 [june24] in CODEGEN (backport gg_tt.mad) add back an argument (now bool useChannelIds, was uint channelId in master) to MEK::computeMatrixElements in #892
416276891 [june24] in gg_tt.mad Bridge*/runTest/check, add useChannelIds argument in computeMatrixElements in #892 - bldall ok but runTest segfaults
a11f1d201 [june24] in gg_tt.mad MatrixElementKernels, add back an argument to computeMatrixElements in #892 (now bool useChannelIds, was uint channelId in master)

i.e. 229efdf 3bdd88d 8ca4fb6 4162768 a11f1d2

In addition, the two issues #914 and #916 are related to this, so also the commits for those two issues are relevant.

Fixed in 882. Linking it there and closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants