Compiler crash in physics/aerinterp.F90 with Intel oneAPI compilers (ifx) #1003

Closed
climbfuji opened this issue Mar 3, 2023 · 3 comments · Fixed by #1096

@climbfuji
Collaborator

Description

@DusanJovic-NOAA reported this in an email to the Hera sysadmins and myself when trying to build the UFS Weather Model using the new Intel oneAPI compilers (icx, icpx, ifx):

I am trying to compile the model on Hera using the ifx compiler (the new Intel Fortran compiler) and I'm seeing a compiler crash (internal compiler error) while compiling the aerinterp module (aerinterp.F90 file):

/tmp/ifxO1ROzQ.i90: error #5633: **Internal compiler error: segmentation violation signal raised** Please report this error along with the circumstances in which it occurred in a Software Problem Report.  Note: File and line given may not be explicit cause of this error.
compilation aborted for /scratch2/NCEPDEV/fv3-cam/Dusan.Jovic/ufs/ifx/ufs-weather-model/FV3/ccpp/physics/physics/aerinterp.F90 (code 3)
make[2]: *** [FV3/ccpp/physics/CMakeFiles/ccpp_physics.dir/physics/aerinterp.F90.o] Error 3
make[1]: *** [FV3/ccpp/physics/CMakeFiles/ccpp_physics.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....

I traced the problem to the OMP directives before this loop:

#ifndef __GFORTRAN__
!$OMP parallel num_threads(nthrds) default(none)             &
!$OMP          shared(npts,ntrcaer,aerin,aer_pres,prsl)      &
!$OMP          shared(ddx,ddy,jindx1,jindx2,iindx1,iindx2)   &
!$OMP          shared(aerpm,aerpres,aerout,lev,nthrds) &
!$OMP          shared(temij,temiy,temjx,ddxy)                &
!$OMP          private(l,j,k,ii,i1,i2,j1,j2,tem)             &
!$OMP          copyin(tx1,tx2) firstprivate(tx1,tx2)

!$OMP do
#endif
      DO L=1,levsaer
        DO J=1,npts

I see that GNU Fortran also has trouble compiling this loop: the OpenMP directives are disabled for it entirely with #ifndef __GFORTRAN__. This OpenMP parallel region contains two do loops. Instead, I tried specifying a separate parallel do for each of the loops, like this:

!$OMP parallel do num_threads(nthrds) default(none)     &
!$OMP          shared(npts,ntrcaer,aerin,aer_pres)      &
!$OMP          shared(jindx1,jindx2,iindx1,iindx2)      &
!$OMP          shared(aerpm,aerpres,lev,nthrds)         &
!$OMP          shared(temij,temiy,temjx,ddxy,tx1,tx2)   &
!$OMP          private(l,j,k,ii,i1,i2,j1,j2)
      DO L=1,levsaer
        DO J=1,npts
          J1    = JINDX1(J)
          J2    = JINDX2(J)
          I1    = IINDX1(J)
          I2    = IINDX2(J)
          DO ii=1,ntrcaer
           aerpm(j,L,ii) =                                                  &
           tx1*(TEMIJ(j)*aerin(I1,J1,L,ii,1)+DDXY(j)*aerin(I2,J2,L,ii,1)  &
               +TEMIY(j)*aerin(I1,J2,L,ii,1)+temjx(j)*aerin(I2,J1,L,ii,1))&
          +tx2*(TEMIJ(j)*aerin(I1,J1,L,ii,2)+DDXY(j)*aerin(I2,J2,L,ii,2)  &
               +TEMIY(j)*aerin(I1,J2,L,ii,2)+temjx(j)*aerin(I2,J1,L,ii,2))
          ENDDO

          aerpres(j,L) =                                                    &
           tx1*(TEMIJ(j)*aer_pres(I1,J1,L,1)+DDXY(j)*aer_pres(I2,J2,L,1)  &
               +TEMIY(j)*aer_pres(I1,J2,L,1)+temjx(j)*aer_pres(I2,J1,L,1))&
          +tx2*(TEMIJ(j)*aer_pres(I1,J1,L,2)+DDXY(j)*aer_pres(I2,J2,L,2)  &
               +TEMIY(j)*aer_pres(I1,J2,L,2)+temjx(j)*aer_pres(I2,J1,L,2))
        ENDDO
      ENDDO
!$OMP end parallel do

! don't flip, input is the same direction as GFS  (bottom-up)

!$OMP parallel do num_threads(nthrds) default(none)     &
!$OMP          shared(npts,ntrcaer,prsl)                &
!$OMP          shared(aerpm,aerpres,aerout,lev,nthrds)  &
!$OMP          private(l,j,k,ii,i1,i2,j1,j2,tx1,tx2,tem)
      DO J=1,npts
        DO L=1,lev
           if(prsl(j,L) >= aerpres(j,1)) then
              DO ii=1, ntrcaer
               aerout(j,L,ii) = aerpm(j,1,ii)        !! sfc level
              ENDDO
           else if(prsl(j,L) <= aerpres(j,levsaer)) then
              DO ii=1, ntrcaer
               aerout(j,L,ii) = aerpm(j,levsaer,ii)  !! toa top
              ENDDO
           else
             DO  k=1, levsaer-1      !! from sfc to toa
              IF(prsl(j,L) < aerpres(j,k) .and. prsl(j,L)>aerpres(j,k+1)) then
                 i1 = k
                 i2 = min(k+1,levsaer)
                 exit
              ENDIF
             ENDDO
             tem  = 1.0 / (aerpres(j,i1) - aerpres(j,i2))
             tx1  = (prsl(j,L) - aerpres(j,i2)) * tem
             tx2  = (aerpres(j,i1) - prsl(j,L)) * tem
             DO ii = 1, ntrcaer
               aerout(j,L,ii) = aerpm(j,i1,ii)*tx1 + aerpm(j,i2,ii)*tx2
             ENDDO
           endif
        ENDDO   !L-loop
      ENDDO     !J-loop
!$OMP end parallel do

After I made this change, the code compiled successfully with ifx.
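
For anyone who wants to try the pattern outside the model, here is a minimal standalone sketch of the same restructuring: two loop nests, each with its own combined parallel do, and the interpolation weights passed as plain shared variables. It is not taken from aerinterp.F90. The program name, array names, sizes, and weights are all hypothetical, and it has not been checked against the ifx version that crashed, so it should not be read as a reproducer of the internal compiler error.

```fortran
! Hypothetical standalone sketch; none of these names come from aerinterp.F90.
program split_parallel_do
  implicit none
  integer, parameter :: n = 64, m = 8   ! named constants need no data-sharing clause
  real :: a(n, m), b(n, m), ref(n, m)
  real :: w1, w2
  integer :: i, j

  ! Illustrative weights, chosen so that all arithmetic below is exact.
  w1 = 0.25
  w2 = 0.75

  ! First loop nest: its own combined "parallel do". The weights are only
  ! read here, so they are simply listed as shared.
!$OMP parallel do default(none) shared(a, w1, w2) private(i, j)
  do j = 1, m
    do i = 1, n
      a(i, j) = w1 * real(i) + w2 * real(j)
    end do
  end do
!$OMP end parallel do

  ! Second loop nest: a separate combined "parallel do" that starts only
  ! after the first one has finished, so it can safely read a.
!$OMP parallel do default(none) shared(a, b) private(i, j)
  do i = 1, n
    do j = 1, m
      b(i, j) = 2.0 * a(i, j)
    end do
  end do
!$OMP end parallel do

  ! Serial reference for a self-check.
  do j = 1, m
    do i = 1, n
      ref(i, j) = 2.0 * (w1 * real(i) + w2 * real(j))
    end do
  end do

  if (any(b /= ref)) error stop 'parallel and serial results differ'
  print *, 'ok'
end program split_parallel_do
```

Building it with OpenMP enabled (for example -qopenmp for ifx or -fopenmp for gfortran) exercises the directives; without that flag the directive lines are treated as comments and the program runs serially.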

Steps to Reproduce

See above

Additional Context

This issue is not urgent; we want to have it here so that we can work with the Intel compiler team to find out what is going on and fix it.

@climbfuji climbfuji added the bug label Mar 3, 2023
@grantfirl
Collaborator

@climbfuji The fix you described seems easy enough to implement. I'm going through our issues list to determine which ones we can address for the upcoming release, and this looks like an easy target. You say that you want to keep the issue here, but should we implement your fix for the release and just leave the issue open?

@climbfuji
Collaborator Author

We should probably confirm that this is still an issue with the latest oneAPI compilers?

@DusanJovic-NOAA
Collaborator

As a temporary fix that lets me build the current version of ccpp-physics with the currently available compilers on Hercules (and Hera), I made this change:

DusanJovic-NOAA@a7c0a13#diff-06bd6e693dddc2d74e323ec3908037c1537961e2a987af0489446bf4169dc37b
