Add Enzyme functionality for both the GPU and CPU #6

jlk9 · 2024-08-21T18:00:32Z

This PR builds on the enhancements for CPU Enzyme, shown here:
#5 (comment)

In addition to the changes in that PR, we have added a few more:

New kernels have been added to enable Enzyme compatibility with GPU code.
Prognostic variables have been changed into Vector{Array} objects, to avoid excessive array slicing.
Kernels are now statically declared with a fixed threadsperblock - this can improve performance, and also helps with GPU AD.
Some refactoring of files has been done, including adding forward/run_loop.jl, which has the main loop for the forward model.
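
As a rough illustration of the fixed threadsperblock point above, here is a minimal KernelAbstractions sketch. It is not taken from this PR: the kernel `scale_kernel!`, the array names, and the value of `threadsperblock` are all hypothetical, and the CPU backend is used only so the snippet runs without a GPU.

```julia
using KernelAbstractions

# Hypothetical toy kernel: out[i] = s * a[i]
@kernel function scale_kernel!(out, @Const(a), s)
    i = @index(Global, Linear)
    @inbounds out[i] = s * a[i]
end

backend = CPU()
a   = collect(1.0:8.0)
out = similar(a)

threadsperblock = 4   # assumed value, chosen only for illustration

# Passing the workgroup size when the kernel is instantiated fixes it
# statically; only ndrange is supplied at launch time.
kernel! = scale_kernel!(backend, threadsperblock)
kernel!(out, a, 2.0; ndrange = length(a))
KernelAbstractions.synchronize(backend)
```

Swapping `CPU()` for a GPU backend (with device arrays) would be the only change needed to launch the same kernel on a device.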

We have also created a new test file, test/enzyme/test_Enzyme_end2end.jl, which tests an AD run of the entire model and compares with FD approximations. A separate test environment has been established with its own Project and Manifest files.
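
The AD-versus-FD comparison described above can be sketched on a toy function. This is only an illustration under stated assumptions, not the contents of test/enzyme/test_Enzyme_end2end.jl: `toy_cost`, `fd_component`, the step size `h`, and the inputs are hypothetical, and the only Enzyme API used is `autodiff` with `Reverse`, `Active`, and `Duplicated`.

```julia
using Enzyme

# Hypothetical stand-in for the model: a scalar cost of a state vector.
toy_cost(x) = sum(abs2, x) / 2

x  = [1.0, 2.0, 3.0]
dx = zero(x)   # shadow array that receives the gradient

# Reverse-mode AD: accumulates d(toy_cost)/dx into dx.
Enzyme.autodiff(Reverse, toy_cost, Active, Duplicated(x, dx))

# Central finite difference with respect to input component i.
function fd_component(f, x, i; h = 1e-6)
    xp = copy(x); xp[i] += h
    xm = copy(x); xm[i] -= h
    return (f(xp) - f(xm)) / (2h)
end

fd = [fd_component(toy_cost, x, i) for i in eachindex(x)]

# The two gradients should agree to within the FD truncation/rounding error.
println("Enzyme computed ", dx)
println("Finite differences computed ", fd)
```

For this quadratic cost the exact gradient is `x` itself, so both results can also be checked against a known answer.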

…a loop helper function of ocn_run. Had to temporarily comment out kernels, will re-add them

…quire @allowscalar macros. Still one summation that isn't working with AD

…rward run otherwise

…owscalar or broadcasting in part that it differentiated

andrewdnolan · 2024-08-22T21:17:32Z

show_output.py

Do we need this file? Between the hard coded path and it being python, I'd opt to leave it out if possible.

Sure, we can remove this. I just had it in for convenience.

andrewdnolan · 2024-08-22T21:21:35Z

src/forward/init.jl

+    d_Prog = PrognosticVars(zeros(Float64, size(Prog.ssh)),
+                            zeros(Float64, size(Prog.normalVelocity)),
+                            zeros(Float64, size(Prog.layerThickness)),


Just want to make sure the shadow arrays don't need to be/can't be on GPU? If that's the case, then maybe this function doesn't need the backend kwarg.

Oops, this function isn't actually used in the Enzyme tests. It's a holdout from before, we can remove it.

@test

Add @test, forward, and CPU backend

…r test since forward mode on GPU no longer errors but still produces incorrect result

In order to avoid redefinitions of variables/function warnings.

andrewdnolan · 2024-09-16T23:04:17Z

Hey @jlk9,

Those most recent commits fixed the Enzyme test failures we talked about a couple of weeks ago. Unfortunately, something is now broken in test_Operators.jl, related to integer division when I calculate the norm of the error. The command I used to test the code on perlmutter was:

module load julia/1.10.4
julia -O0 --color=yes --debug-info=2 --project=@. -e 'using Pkg; Pkg.test()'

which results in:

Error Log

anolan@perlmutter:login20:~/MPAS-Ocean.jl> julia -O0 --color=yes --debug-info=2 --project=@. -e 'using Pkg; Pkg.test()'
     Testing MOKA
      Status `/tmp/jl_dlAHXi/Project.toml`
  [79e6a3ab] Adapt v4.0.4
  [052768ef] CUDA v5.4.3
  [7da242da] Enzyme v0.12.36
  [63c18a36] KernelAbstractions v0.9.25
  [4c5738b9] MOKA v0.1.0 `/global/u2/a/anolan/MPAS-Ocean.jl`
  [3a884ed6] UnPack v1.0.2
  [ddb6d928] YAML v0.4.12
  [ade2ca70] Dates
  [f43a241f] Downloads v1.6.0
  [4af54fe1] LazyArtifacts
  [37e2e46d] LinearAlgebra
  [44cfe95a] Pkg v1.10.0
  [8dfed614] Test
      Status `/tmp/jl_dlAHXi/Manifest.toml`
  [21141c5a] AMDGPU v1.0.1
  [621f4979] AbstractFFTs v1.5.0
  [7d9f7c33] Accessors v0.1.38
  [79e6a3ab] Adapt v4.0.4
  [a9b6321e] Atomix v0.1.0
  [ab4f0b2a] BFloat16s v0.5.0
  [6e4b80f9] BenchmarkTools v1.5.0
  [d1d4a3ce] BitFlags v0.1.9
  [fa961155] CEnum v0.5.0
  [179af706] CFTime v0.1.3
  [052768ef] CUDA v5.4.3
  [1af6417a] CUDA_Runtime_Discovery v0.3.5
  [da1fd8a2] CodeTracking v1.3.6
  [944b1d66] CodecZlib v0.7.6
  [35d6a980] ColorSchemes v3.26.0
  [3da002f7] ColorTypes v0.11.5
  [c3611d14] ColorVectorSpace v0.10.0
  [5ae59095] Colors v0.12.11
  [1fbeeb36] CommonDataModel v0.3.6
  [34da2185] Compat v4.16.0
  [a33af91c] CompositionsBase v0.1.2
  [f0e56b4a] ConcurrentUtilities v2.4.2
  [187b0558] ConstructionBase v1.5.8
  [d38c429a] Contour v0.6.3
  [a8cc5b0e] Crayons v4.1.1
  [9a962f9c] DataAPI v1.16.0
  [a93c6f00] DataFrames v1.6.1
  [864edb3b] DataStructures v0.18.20
  [e2d170a0] DataValueInterfaces v1.0.0
  [8bb1440f] DelimitedFiles v1.9.1
⌅ [3c3547ce] DiskArrays v0.3.23
  [ffbed154] DocStringExtensions v0.9.3
  [7da242da] Enzyme v0.12.36
  [f151be2c] EnzymeCore v0.7.8
  [460bff9d] ExceptionUnwrapping v0.1.10
  [e2ba6199] ExprTools v0.1.10
  [c87230d0] FFMPEG v0.4.1
  [53c48c17] FixedPointNumbers v0.8.5
  [1fa38f19] Format v1.3.7
  [0c68f7d7] GPUArrays v10.3.1
  [46192b85] GPUArraysCore v0.1.6
⌅ [61eb1bfa] GPUCompiler v0.26.7
  [28b8d3ca] GR v0.73.7
  [42e2da0e] Grisu v1.0.2
  [cd3eb016] HTTP v1.10.8
  [842dd82b] InlineStrings v1.4.2
  [3587e190] InverseFunctions v0.1.16
  [41ab1584] InvertedIndices v1.3.0
  [92d709cd] IrrationalConstants v0.2.2
  [82899510] IteratorInterfaceExtensions v1.0.0
  [1019f520] JLFzf v0.1.8
  [692b3bcd] JLLWrappers v1.6.0
  [682c06a0] JSON v0.21.4
  [aa1ae85d] JuliaInterpreter v0.9.36
  [63c18a36] KernelAbstractions v0.9.25
⌅ [929cbde3] LLVM v8.1.0
  [8b046642] LLVMLoopInfo v1.0.0
  [8ac3fa9e] LRUCache v1.6.1
  [b964fa9f] LaTeXStrings v1.3.1
  [23fbe1c1] Latexify v0.16.5
  [2ab3a3ac] LogExpFunctions v0.3.28
  [e6f89c97] LoggingExtras v1.0.3
  [6f1432cf] LoweredCodeUtils v3.0.2
  [4c5738b9] MOKA v0.1.0 `/global/u2/a/anolan/MPAS-Ocean.jl`
  [da04e1cc] MPI v0.20.21
  [3da0fdf6] MPIPreferences v0.1.11
  [1914dd2f] MacroTools v0.5.13
  [739be429] MbedTLS v1.1.9
  [442fdcdd] Measures v0.3.2
  [e1d29d7a] Missings v1.2.0
  [85f8d34a] NCDatasets v0.14.5
  [5da4648a] NVTX v0.3.4
  [77ba4419] NaNMath v1.0.2
  [d8793406] ObjectFile v0.4.2
  [6fe1bfb0] OffsetArrays v1.14.1
  [4d8831e6] OpenSSL v1.4.3
  [bac558e1] OrderedCollections v1.6.3
  [69de0a69] Parsers v2.8.1
  [b98c9c47] Pipe v1.3.0
  [eebad327] PkgVersion v0.3.3
  [ccf2f8ad] PlotThemes v3.2.0
  [995b91a9] PlotUtils v1.4.1
  [91a5bcdd] Plots v1.40.8
  [2dfb63ee] PooledArrays v1.4.3
  [aea7be01] PrecompileTools v1.2.1
  [21216c6a] Preferences v1.4.3
  [08abe8d2] PrettyTables v2.3.2
  [74087812] Random123 v1.7.0
  [e6cf234a] RandomNumbers v1.6.0
  [3cdcf5f2] RecipesBase v1.3.4
  [01d81517] RecipesPipeline v0.6.12
  [189a3867] Reexport v1.2.2
  [05181044] RelocatableFolders v1.0.1
  [ae029012] Requires v1.3.0
  [295af30f] Revise v3.5.18
  [6c6a2e73] Scratch v1.2.1
  [91c51154] SentinelArrays v1.4.5
  [992d4aef] Showoff v1.0.3
  [777ac1f9] SimpleBufferStream v1.1.0
  [a2af1166] SortingAlgorithms v1.2.1
  [276daf66] SpecialFunctions v2.4.0
  [90137ffa] StaticArrays v1.9.7
  [1e83bf80] StaticArraysCore v1.4.3
  [82ae8749] StatsAPI v1.7.0
  [2913bbd2] StatsBase v0.34.3
  [69024149] StringEncodings v0.3.7
  [892a3eda] StringManipulation v0.3.4
  [09ab397b] StructArrays v0.6.18
  [53d494c1] StructIO v0.3.1
  [3783bdb8] TableTraits v1.0.1
  [bd369af6] Tables v1.12.0
  [62fd8b95] TensorCore v0.1.1
  [a759f4b9] TimerOutputs v0.5.24
  [3bb67fe8] TranscodingStreams v0.11.2
  [5c2747f8] URIs v1.5.1
  [3a884ed6] UnPack v1.0.2
  [1cfade01] UnicodeFun v0.4.1
  [1986cc42] Unitful v1.21.0
  [45397f5d] UnitfulLatexify v1.6.4
  [013be700] UnsafeAtomics v0.2.1
  [d80eeb9a] UnsafeAtomicsLLVM v0.2.1
  [41fe7b60] Unzip v0.2.0
  [ddb6d928] YAML v0.4.12
  [0b7ba130] Blosc_jll v1.21.5+0
  [6e34b625] Bzip2_jll v1.0.8+1
⌅ [4ee394cb] CUDA_Driver_jll v0.9.2+0
⌅ [76a88914] CUDA_Runtime_jll v0.14.1+0
  [83423d85] Cairo_jll v1.18.0+2
  [ee1fde0b] Dbus_jll v1.14.10+0
  [ab5a07f8] Elfutils_jll v0.189.0+1
  [7cc45869] Enzyme_jll v0.0.148+0
  [2702e6a9] EpollShim_jll v0.0.20230411+0
  [2e619515] Expat_jll v2.6.2+0
⌅ [b22a6f82] FFMPEG_jll v4.4.4+1
  [a3f928ae] Fontconfig_jll v2.13.96+0
  [d7e528f0] FreeType2_jll v2.13.2+0
  [559328eb] FriBidi_jll v1.0.14+0
  [0656b61e] GLFW_jll v3.4.0+1
  [d2c73de3] GR_jll v0.73.7+0
  [78b55507] Gettext_jll v0.21.0+0
  [7746bdde] Glib_jll v2.80.2+0
  [0951126a] GnuTLS_jll v3.8.4+0
  [3b182d85] Graphite2_jll v1.3.14+0
  [0234f1f7] HDF5_jll v1.14.3+3
  [2696aab5] HIP_jll v5.4.4+0
  [2e76f6c2] HarfBuzz_jll v8.3.1+0
  [e33a78d0] Hwloc_jll v2.11.1+0
  [aacddb02] JpegTurbo_jll v3.0.3+0
  [9c1d0b0a] JuliaNVTXCallbacks_jll v0.2.1+0
  [c1c5ebd0] LAME_jll v3.100.2+0
⌅ [88015f11] LERC_jll v3.0.0+1
⌅ [dad2f222] LLVMExtra_jll v0.0.31+0
  [1d63c593] LLVMOpenMP_jll v18.1.7+0
⌅ [86de99a1] LLVM_jll v15.0.7+10
  [dd4b983a] LZO_jll v2.10.2+0
⌅ [e9f186c6] Libffi_jll v3.2.2+1
  [d4300ac3] Libgcrypt_jll v1.8.11+0
  [7e76a0d4] Libglvnd_jll v1.6.0+0
  [7add5ba3] Libgpg_error_jll v1.49.0+0
  [94ce4f54] Libiconv_jll v1.17.0+0
  [4b2f31a3] Libmount_jll v2.40.1+0
⌅ [89763e89] Libtiff_jll v4.5.1+1
  [38a345b3] Libuuid_jll v2.40.1+0
  [5ced341a] Lz4_jll v1.10.0+0
  [7cb0a576] MPICH_jll v4.2.2+0
  [f1f71cc9] MPItrampoline_jll v5.4.0+0
  [9237b28f] MicrosoftMPI_jll v10.1.4+2
  [7f51dc2b] NUMA_jll v2.0.18+0
  [e98f9f5b] NVTX_jll v3.1.0+2
  [7243133f] NetCDF_jll v400.902.211+1
⌅ [4c82536e] Nettle_jll v3.7.2+0
  [e7412a2a] Ogg_jll v1.3.5+1
⌅ [fe0851c0] OpenMPI_jll v4.1.6+0
  [458c3c95] OpenSSL_jll v3.0.15+0
  [efe28fd5] OpenSpecFun_jll v0.5.5+0
  [91d4177d] Opus_jll v1.3.3+0
  [c2071276] P11Kit_jll v0.24.1+0
  [36c8627f] Pango_jll v1.54.1+0
  [30392449] Pixman_jll v0.43.4+0
  [c0090381] Qt6Base_jll v6.7.1+1
  [629bc702] Qt6Declarative_jll v6.7.1+2
  [ce943373] Qt6ShaderTools_jll v6.7.1+1
  [e99dba38] Qt6Wayland_jll v6.7.1+1
  [8fbdd1d2] ROCmCompilerSupport_jll v5.4.4+0
  [873c0968] ROCmDeviceLibs_jll v5.6.1+1
  [10ae2a08] ROCmOpenCLRuntime_jll v5.4.4+0
  [a44049a8] Vulkan_Loader_jll v1.3.243+0
  [a2964d1f] Wayland_jll v1.21.0+1
  [2381bf8a] Wayland_protocols_jll v1.31.0+0
  [02c8fc9c] XML2_jll v2.13.3+0
  [aed1982a] XSLT_jll v1.1.41+0
  [ffd25f8a] XZ_jll v5.4.6+0
  [f67eecfb] Xorg_libICE_jll v1.1.1+0
  [c834827a] Xorg_libSM_jll v1.2.4+0
  [4f6342f7] Xorg_libX11_jll v1.8.6+0
  [0c0b7dd1] Xorg_libXau_jll v1.0.11+0
  [935fb764] Xorg_libXcursor_jll v1.2.0+4
  [a3789734] Xorg_libXdmcp_jll v1.1.4+0
  [1082639a] Xorg_libXext_jll v1.3.6+0
  [d091e8ba] Xorg_libXfixes_jll v5.0.3+4
  [a51aa0fd] Xorg_libXi_jll v1.7.10+4
  [d1454406] Xorg_libXinerama_jll v1.1.4+4
  [ec84b674] Xorg_libXrandr_jll v1.5.2+4
  [ea2f1a96] Xorg_libXrender_jll v0.9.11+0
  [a65dc6b1] Xorg_libpciaccess_jll v0.16.0+1
  [14d82f49] Xorg_libpthread_stubs_jll v0.1.1+0
  [c7cfdc94] Xorg_libxcb_jll v1.17.0+0
  [cc61e674] Xorg_libxkbfile_jll v1.1.2+0
  [e920d4aa] Xorg_xcb_util_cursor_jll v0.1.4+0
  [12413925] Xorg_xcb_util_image_jll v0.4.0+1
  [2def613f] Xorg_xcb_util_jll v0.4.0+1
  [975044d2] Xorg_xcb_util_keysyms_jll v0.4.0+1
  [0d47668e] Xorg_xcb_util_renderutil_jll v0.3.9+1
  [c22f9ab0] Xorg_xcb_util_wm_jll v0.4.1+1
  [35661453] Xorg_xkbcomp_jll v1.4.6+0
  [33bec58e] Xorg_xkeyboard_config_jll v2.39.0+0
  [c4d99508] Xorg_xorgproto_jll v2019.2.0+2
  [c5fb5394] Xorg_xtrans_jll v1.5.0+0
  [3161d3a3] Zstd_jll v1.5.6+0
  [c53206cc] argp_standalone_jll v1.3.1+0
  [35ca27e7] eudev_jll v3.2.9+0
  [d65627f6] fts_jll v1.2.8+0
  [214eeab7] fzf_jll v0.53.0+0
  [1a1c6b14] gperf_jll v3.1.1+0
  [dd59ff1a] hsa_rocr_jll v5.4.4+0
  [1cecccd7] hsakmt_roct_jll v5.5.1+0
  [477f73a3] libaec_jll v1.1.2+0
  [a4ae2306] libaom_jll v3.9.0+0
  [0ac62f75] libass_jll v0.15.2+0
  [1183f4f0] libdecor_jll v0.2.2+0
  [8e53e030] libdrm_jll v2.4.110+0
  [2db6ffa8] libevdev_jll v1.11.0+0
  [f638f0a6] libfdk_aac_jll v2.0.3+0
  [36db933b] libinput_jll v1.18.0+0
  [b53b4c65] libpng_jll v1.6.43+1
  [f27f6e37] libvorbis_jll v1.3.7+2
  [337d8026] libzip_jll v1.10.1+0
  [009596ad] mtdev_jll v1.1.6+0
  [c88a4935] obstack_jll v1.2.3+0
  [5a766526] rocminfo_jll v5.4.4+0
⌅ [1270edf5] x264_jll v2021.5.5+0
⌅ [dfaa095f] x265_jll v3.5.0+0
  [d8fb68d0] xkbcommon_jll v1.4.1+1
  [0dad84c5] ArgTools v1.1.1
  [56f22d72] Artifacts
  [2a0f44e3] Base64
  [ade2ca70] Dates
  [8ba89e20] Distributed
  [f43a241f] Downloads v1.6.0
  [7b1f6079] FileWatching
  [9fa8497b] Future
  [b77e0a4c] InteractiveUtils
  [4af54fe1] LazyArtifacts
  [b27032c2] LibCURL v0.6.4
  [76f85450] LibGit2
  [8f399da3] Libdl
  [37e2e46d] LinearAlgebra
  [56ddb016] Logging
  [d6f4376e] Markdown
  [a63ad114] Mmap
  [ca575930] NetworkOptions v1.2.0
  [44cfe95a] Pkg v1.10.0
  [de0858da] Printf
  [9abbd945] Profile
  [3fa0cd96] REPL
  [9a3f8284] Random
  [ea8e919c] SHA v0.7.0
  [9e88b42a] Serialization
  [6462fe0b] Sockets
  [2f01184e] SparseArrays v1.10.0
  [10745b16] Statistics v1.10.0
  [fa267f1f] TOML v1.0.3
  [a4e569a6] Tar v1.10.0
  [8dfed614] Test
  [cf7118a7] UUIDs
  [4ec0a83e] Unicode
  [e66e0078] CompilerSupportLibraries_jll v1.1.1+0
  [781609d7] GMP_jll v6.2.1+6
  [d55e3150] LLD_jll v15.0.7+10
  [deac9b47] LibCURL_jll v8.4.0+0
  [e37daf67] LibGit2_jll v1.6.4+0
  [29816b5a] LibSSH2_jll v1.11.0+1
  [c8ffd9c3] MbedTLS_jll v2.28.2+1
  [14a3606d] MozillaCACerts_jll v2023.1.10
  [4536629a] OpenBLAS_jll v0.3.23+4
  [05823500] OpenLibm_jll v0.8.1+2
  [efcefdf7] PCRE2_jll v10.42.0+1
  [bea87d4a] SuiteSparse_jll v7.2.1+1
  [83775a58] Zlib_jll v1.2.13+1
  [8f36deef] libLLVM_jll v15.0.7+10
  [8e850b90] libblastrampoline_jll v5.8.0+1
  [8e850ede] nghttp2_jll v1.52.0+1
  [3f19e933] p7zip_jll v17.4.0+2
        Info Packages marked with ⌅ have new versions available but compatibility constraints restrict them from upgrading.
Precompiling project...
  2 dependencies successfully precompiled in 5 seconds. 297 already precompiled.
     Testing Running tests...
┌ Warning: CUDA runtime library `libcublasLt.so.12` was loaded from a system path, `/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/math_libs/12.2/lib64/libcublasLt.so.12`.
│
│ This may cause errors. Ensure that you have not set the LD_LIBRARY_PATH
│ environment variable, or that it does not contain paths to CUDA libraries.
│
│ In any other case, please file an issue.
└ @ CUDA ~/.julia/packages/CUDA/Tl08O/src/initialization.jl:219
┌ Warning: CUDA runtime library `libnvJitLink.so.12` was loaded from a system path, `/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/lib64/libnvJitLink.so.12`.
│
│ This may cause errors. Ensure that you have not set the LD_LIBRARY_PATH
│ environment variable, or that it does not contain paths to CUDA libraries.
│
│ In any other case, please file an issue.
└ @ CUDA ~/.julia/packages/CUDA/Tl08O/src/initialization.jl:219
┌ Warning: CUDA runtime library `libcusparse.so.12` was loaded from a system path, `/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/math_libs/12.2/lib64/libcusparse.so.12`.
│
│ This may cause errors. Ensure that you have not set the LD_LIBRARY_PATH
│ environment variable, or that it does not contain paths to CUDA libraries.
│
│ In any other case, please file an issue.
└ @ CUDA ~/.julia/packages/CUDA/Tl08O/src/initialization.jl:219
Operator/Kernel Tests: Error During Test at /global/u2/a/anolan/MPAS-Ocean.jl/test/runtests.jl:12
  Got exception outside of a @test
  LoadError: DivideError: integer division error
  Stacktrace:
    [1] macro expansion
      @ ~/.julia/packages/CUDA/Tl08O/lib/utils/call.jl:218 [inlined]
    [2] macro expansion
      @ ~/.julia/packages/CUDA/Tl08O/lib/cublas/libcublas.jl:1536 [inlined]
    [3] #332
      @ ~/.julia/packages/CUDA/Tl08O/lib/utils/call.jl:35 [inlined]
    [4] retry_reclaim
      @ ~/.julia/packages/CUDA/Tl08O/src/memory.jl:434 [inlined]
    [5] check
      @ ~/.julia/packages/CUDA/Tl08O/lib/cublas/libcublas.jl:24 [inlined]
    [6] cublasDnrm2_v2_64
      @ ~/.julia/packages/CUDA/Tl08O/lib/utils/call.jl:34 [inlined]
    [7] nrm2(n::Int64, X::CuArray{Float64, 2, CUDA.DeviceMemory})
      @ CUDA.CUBLAS ~/.julia/packages/CUDA/Tl08O/lib/cublas/wrappers.jl:193
    [8] nrm2
      @ ~/.julia/packages/CUDA/Tl08O/lib/cublas/wrappers.jl:201 [inlined]
    [9] norm
      @ ~/.julia/packages/CUDA/Tl08O/lib/cublas/linalg.jl:135 [inlined]
   [10] ErrorMeasures(Numeric::CuArray{Float64, 2, CUDA.DeviceMemory}, Analytic::CuArray{Float64, 2, CUDA.DeviceMemory}, mesh::MOKA.MPASMesh.HorzMesh{MOKA.MPASMesh.PrimaryCells{Int64, CuArray{Float64, 1, CUDA.DeviceMemory}, CuArray{Int32, 1, CUDA.DeviceMemory}, CuArray{Int32, 2, CUDA.DeviceMemory}}, MOKA.MPASMesh.DualCells{Int64, CuArray{Float64, 1, CUDA.DeviceMemory}, CuArray{Int32, 2, CUDA.DeviceMemory}}, MOKA.MPASMesh.Edges{Int64, CuArray{Float64, 1, CUDA.DeviceMemory}, CuArray{Int32, 1, CUDA.DeviceMemory}, CuArray{Float64, 2, CUDA.DeviceMemory}, CuArray{Int32, 2, CUDA.DeviceMemory}}}, node_location::Type)
      @ Main /global/u2/a/anolan/MPAS-Ocean.jl/test/utilities.jl:25
   [11] top-level scope
      @ /global/u2/a/anolan/MPAS-Ocean.jl/test/ocn/test_Operators.jl:49
   [12] include(fname::String)
      @ Base.MainInclude ./client.jl:489
   [13] macro expansion
      @ /global/u2/a/anolan/MPAS-Ocean.jl/test/runtests.jl:13 [inlined]
   [14] macro expansion
      @ /global/common/software/nersc/n9/julia/1.10.4/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
   [15] macro expansion
      @ /global/u2/a/anolan/MPAS-Ocean.jl/test/runtests.jl:13 [inlined]
   [16] macro expansion
      @ /global/common/software/nersc/n9/julia/1.10.4/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
   [17] top-level scope
      @ /global/u2/a/anolan/MPAS-Ocean.jl/test/runtests.jl:7
   [18] include(fname::String)
      @ Base.MainInclude ./client.jl:489
   [19] top-level scope
      @ none:6
   [20] eval
      @ ./boot.jl:385 [inlined]
   [21] exec_options(opts::Base.JLOptions)
      @ Base ./client.jl:291
   [22] _start()
      @ Base ./client.jl:552
  in expression starting at /global/u2/a/anolan/MPAS-Ocean.jl/test/ocn/test_Operators.jl:49
backend = KernelAbstractions.CPU(false)
┌ Info:  (gradients)
│
│ For edge global input 1, output 1
│ Enzyme computed 48.00000000000077
└ Finite differences computed 48.00000032512775
(nEdges, nCells) = (6912, 2304)
┌ Info:  (divergence)
│
│ For cell global input 2, output 1
│ Enzyme computed -32.00000000000052
└ Finite differences computed -31.999999391116123
backend = CUDABackend(false, false)
┌ Info:  (gradients)
│
│ For edge global input 1, output 1
│ Enzyme computed 48.00000000000077
└ Finite differences computed 48.00000032512775
(nEdges, nCells) = (6912, 2304)
┌ Info:  (divergence)
│
│ For cell global input 2, output 1
│ Enzyme computed -32.00000000000052
└ Finite differences computed -31.999999391116123
Test Summary:           |  Pass  Error  Broken  Total     Time
Moka                    | 73873      1       1  73875  2m40.1s
  Infrastructre Test    | 73862                 73862     2.3s
  Operator/Kernel Tests |            1              1    18.7s
  Enzyme Tests          |    11              1     12  2m19.1s
ERROR: LoadError: Some tests did not pass: 73873 passed, 0 failed, 1 errored, 1 broken.
in expression starting at /global/u2/a/anolan/MPAS-Ocean.jl/test/runtests.jl:5
ERROR: Package MOKA errored during testing
Stacktrace:
 [1] pkgerror(msg::String)
   @ Pkg.Types /global/common/software/nersc/n9/julia/1.10.4/share/julia/stdlib/v1.10/Pkg/src/Types.jl:70
 [2] test(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; coverage::Bool, julia_args::Cmd, test_args::Cmd, test_fn::Nothing, force_latest_compatible_version::Bool, allow_earlier_backwards_compatible_versions::Bool, allow_reresolve::Bool)
   @ Pkg.Operations /global/common/software/nersc/n9/julia/1.10.4/share/julia/stdlib/v1.10/Pkg/src/Operations.jl:2019
 [3] test
   @ /global/common/software/nersc/n9/julia/1.10.4/share/julia/stdlib/v1.10/Pkg/src/Operations.jl:1900 [inlined]
 [4] test(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; coverage::Bool, test_fn::Nothing, julia_args::Cmd, test_args::Cmd, force_latest_compatible_version::Bool, allow_earlier_backwards_compatible_versions::Bool, allow_reresolve::Bool, kwargs::@Kwargs{io::Base.TTY})
   @ Pkg.API /global/common/software/nersc/n9/julia/1.10.4/share/julia/stdlib/v1.10/Pkg/src/API.jl:444
 [5] test(pkgs::Vector{Pkg.Types.PackageSpec}; io::Base.TTY, kwargs::@Kwargs{})
   @ Pkg.API /global/common/software/nersc/n9/julia/1.10.4/share/julia/stdlib/v1.10/Pkg/src/API.jl:159
 [6] test(pkgs::Vector{Pkg.Types.PackageSpec})
   @ Pkg.API /global/common/software/nersc/n9/julia/1.10.4/share/julia/stdlib/v1.10/Pkg/src/API.jl:148
 [7] test(; name::Nothing, uuid::Nothing, version::Nothing, url::Nothing, rev::Nothing, path::Nothing, mode::Pkg.Types.PackageMode, subdir::Nothing, kwargs::@Kwargs{})
   @ Pkg.API /global/common/software/nersc/n9/julia/1.10.4/share/julia/stdlib/v1.10/Pkg/src/API.jl:174
 [8] test()
   @ Pkg.API /global/common/software/nersc/n9/julia/1.10.4/share/julia/stdlib/v1.10/Pkg/src/API.jl:165
 [9] top-level scope
   @ none:1

which is fairly confusing and annoying, because if I run:

module load julia/1.10.4
julia -O0 --color=yes --debug-info=2 --project=@. -e 'include("test/utilities.jl"); include("test/ocn/test_Operators.jl")'

I don't get any errors... The output looks as expected:

┌ Info:  (Operators on GPU)
│
│ Gradient
│ --------
│ L∞ norm of error : 0.0012502607187855211
│ L₂ norm of error : 0.0013435461111726199
│
│ Divergence
│ ----------
│ L∞ norm of error: 0.0012488688659444015
│ L₂ norm of error: 0.0012488688659097421
│
│ Curl
│ ----
│ L∞ norm of error: 0.16136566356967616
└ L₂ norm of error: 0.16134801689713477

Because I can't reproduce the errors outside of Pkg.test(), it's a little hard to understand what's going on. I'm guessing these have something to do with the CUDA.jl warnings during the dependency install:

┌ Warning: CUDA runtime library `libcublasLt.so.12` was loaded from a system path, `/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/math_libs/12.2/lib64/libcublasLt.so.12`.
│
│ This may cause errors. Ensure that you have not set the LD_LIBRARY_PATH
│ environment variable, or that it does not contain paths to CUDA libraries.
│
│ In any other case, please file an issue.
└ @ CUDA ~/.julia/packages/CUDA/Tl08O/src/initialization.jl:219
┌ Warning: CUDA runtime library `libnvJitLink.so.12` was loaded from a system path, `/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/lib64/libnvJitLink.so.12`.
│
│ This may cause errors. Ensure that you have not set the LD_LIBRARY_PATH
│ environment variable, or that it does not contain paths to CUDA libraries.
│
│ In any other case, please file an issue.
└ @ CUDA ~/.julia/packages/CUDA/Tl08O/src/initialization.jl:219
┌ Warning: CUDA runtime library `libcusparse.so.12` was loaded from a system path, `/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/math_libs/12.2/lib64/libcusparse.so.12`.
│
│ This may cause errors. Ensure that you have not set the LD_LIBRARY_PATH
│ environment variable, or that it does not contain paths to CUDA libraries.
│
│ In any other case, please file an issue.
└ @ CUDA ~/.julia/packages/CUDA/Tl08O/src/initialization.jl:219

From checking the warnings,

echo $LD_LIBRARY_PATH

results in

/usr/local/cuda-12.2/compat:/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/math_libs/12.2/lib64:/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/extras/CUPTI/lib64:/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/extras/Debugger/lib64:/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/nvvm/lib64:/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/lib64:/opt/cray/pe/papi/7.0.1.2/lib64:/opt/cray/libfabric/1.15.2.0/lib64

so there are CUDA libraries in there, which the warning message says there shouldn't be. Not sure if this is a perlmutter issue or a Project.toml issue. I'm going to look into this more tomorrow.
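
The same check can be done from inside Julia, which makes it easy to compare what Pkg.test() sees with what a plain include sees. A minimal sketch: the helper name `cuda_entries` and the substring patterns are assumptions based on the paths printed above, not part of CUDA.jl.

```julia
# Return the entries of a colon-separated library path that mention CUDA
# or the NVIDIA HPC SDK. The patterns are a guess based on the paths above.
function cuda_entries(ld_path::AbstractString)
    entries = split(ld_path, ':'; keepempty = false)
    return [String(e) for e in entries if occursin(r"cuda|nvidia"i, e)]
end

# Print the suspicious entries for the current process, if any.
for entry in cuda_entries(get(ENV, "LD_LIBRARY_PATH", ""))
    println(entry)
end
```

An empty result means no entry matches these patterns; it does not by itself prove that CUDA.jl will pick up its own artifact libraries.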

michel2323 · 2024-09-17T02:27:21Z

@JBlaschke can you maybe help us out here? I usually prefer not to use the system CUDA when using CUDA.jl. At least not at first. @andrewdnolan, if possible, open a ticket at NERSC as described here. I had such issues on other systems. Can you also look at the CUDA.jl docs and paste the output of CUDA.versioninfo()? A Manifest.toml would also be great. Or create a debug branch with a Manifest.toml so I can try on another machine with the exact same versions.

Edit: It might also help to add CUDA.versioninfo() to the runtests.jl output. You should see a difference there between ] test and just running the tests from the project itself.
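
A minimal sketch of what that addition to runtests.jl might look like. Its placement at the top of the file and the fallback message are assumptions; `CUDA.functional()` and `CUDA.versioninfo()` are existing CUDA.jl functions.

```julia
using CUDA

# Print toolkit, driver, and library details only when a usable GPU stack
# is present, so CPU-only runs of the test suite are not interrupted.
if CUDA.functional()
    CUDA.versioninfo()
else
    @info "CUDA is not functional in this environment; skipping versioninfo()"
end
```

Comparing this output between a Pkg.test() run and a direct include of the test files should show whether the two are picking up different CUDA runtimes.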

michel2323 · 2024-09-17T02:31:58Z

Also, can you use Julia 1.10.5? There were important bugfixes in that release that impact Enzyme. I recommend your own Julia installation. But maybe @JBlaschke knows more.

JBlaschke · 2024-09-17T04:19:04Z

@michel2323 I can take a look when I come back from vacation. RE the CUDA version: using the artifact can have two problems:

1. If the artifact CUDA version doesn't use an ABI that's compatible with HPE GTL, then MPI may crash.
2. In the past the CUDA installed on the system contained patches from HPE to make working with GTL possible.

Problem 1 is easy enough to fix by pinning the artifact version to a CUDA version that's compatible with GTL. Problem 2 can be fixed by disabling CUDA-aware MPI (which entails a hit to performance).

When I get back I can bump the CUDA module versions also.

The Julia modules do two things:

They provide Julia and Juliaup
They point the JULIA_LOAD_PATH to a system-wide LocalPreferences.toml.

You can try your own local install and with only the MPI part of the LocalPreferences.toml (i.e. removing the CUDA part). The MPI preferences are important to make MPI work at all.

Anyway, that was just advice to help you be productive while I'm away. @andrewdnolan and @michel2323 when I get back what exactly would you like me to look at?

Also: why do you prefer the artifact install in lieu of the system version? Our high-performance file systems have limited space so I'm wary of thousands of users installing the same system software over and over.

andrewdnolan · 2024-09-17T16:34:00Z

@michel2323 Unfortunately, I can't use Julia 1.10.5 yet. I have a ticket open with NERSC about some permission issues I'm having with juliaup. Until that's resolved, 1.10.4 is the most recent version I have access to.

I fear the issue may have just been something silly on my end, mainly related to the cudatoolkit module. If I run:

cd $MOKA_DIR

module purge 
source /opt/cray/pe/cpe/23.12/restore_lmod_system_defaults.sh

module load PrgEnv-gnu/8.5.0
module load julia/1.10.4
module load cpe-cuda/23.12

julia -O0 --color=yes --debug-info=2 --project=@. -e 'using Pkg; Pkg.test()'

everything works as expected (and all tests pass).

With only those modules loaded, echo $LD_LIBRARY_PATH results in:

/opt/cray/libfabric/1.15.2.0/lib64

and module list results in:


Currently Loaded Modules:
  1) libfabric/1.15.2.0   3) julia/1.10.4             5) cray-dsmml/0.2.2             7) cray-mpich/8.1.28 (mpi)   9) gcc-native/12.3
  2) craype-network-ofi   4) PrgEnv-gnu/8.5.0 (cpe)   6) cray-libsci/23.12.5 (math)   8) craype/2.7.30     (c)    10) cpe-cuda/23.12  (cpe)

Whereas with the default modules upon sshing into perlmutter, echo $LD_LIBRARY_PATH results in:

/usr/local/cuda-12.2/compat:/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/math_libs/12.2/lib64:/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/extras/CUPTI/lib64:/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/extras/Debugger/lib64:/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/nvvm/lib64:/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/lib64:/opt/cray/pe/papi/7.0.1.2/lib64:/opt/cray/libfabric/1.15.2.0/lib64

and module list results in:

Currently Loaded Modules:
  1) craype-x86-milan     4) xpmem/2.6.2-2.5_2.38__gd067c3f.shasta   7) cray-libsci/23.12.5  10) gcc-native/12.3         13) cudatoolkit/12.2
  2) libfabric/1.15.2.0   5) PrgEnv-gnu/8.5.0                        8) cray-mpich/8.1.28    11) perftools-base/23.12.0  14) craype-accel-nvidia80
  3) craype-network-ofi   6) cray-dsmml/0.2.2                        9) craype/2.7.30        12) cpe/23.12               15) gpu/1.0

So, long story short I think this may just be an issue with the default modules on perlmutter. Not really sure when this changed, since everything was working ok up until this PR. But, anyway I'll add a config_pm.sh script that does the necessary module loading for perlmutter to the repo.

@JBlaschke Thanks for your guidance! Sorry to bother you on vacation!

JBlaschke · 2024-09-17T17:03:44Z

@andrewdnolan

I have a ticket open with NERSC about some permission issues I'm having with juliaup. Until that's resolved, 1.10.4 is the most recent version I have access to.

What's the INC number? I can take a look (should have some time between now and Thursday). Permission issues are probably a bug.

module purge
source /opt/cray/pe/cpe/23.12/restore_lmod_system_defaults.sh

Oh my! (Clutches pearls) Can you tell me why you're doing this? Not judging, this has a story which I would love to hear.

So, long story short I think this may just be an issue with the default modules on perlmutter. Not really sure when this changed, since everything was working ok up until this PR. But, anyway I'll add a config_pm.sh script that does the necessary module loading for perlmutter to the repo.

That makes sense. It's worth the effort though to figure out which aspects of the default modules break this (and fix the default modules) rather than doing extreme surgery on the NERSC software environment every time the defaults get in the way.

@JBlaschke Thanks for your guidance! Sorry to bother you on vacation!

No worries -- this is interesting, so I will look at it whenever I get a moment

andrewdnolan · 2024-09-17T17:26:06Z

@JBlaschke

What's the INC number? I can take a look (should have some time between now and Thursday). Permission issues are probably a bug.

The juliaup issue is: INC0221587. I lost track of that and it seems to have been closed by the system. Are you able to reopen that on your end? I can open a new issue if that's easier. (P.S. per the last comment on the ticket, I did try deleting the .julia directory and I'm still dealing with the original issue.)

Oh my! (Clutches pearls) Can you tell me why you're doing this? Not judging, this has a story which I would love to hear.

Unfortunately, I don't have a very good explanation here. Given the above warnings about the LD_LIBRARY_PATH variable, I thought it might make sense to try module purge, and when you run that command it issues the warning:

Unloading the cpe module is insufficient to restore the system defaults.
Please run 'source /opt/cray/pe/cpe/23.12/restore_lmod_system_defaults.[csh|sh]'.

So I added that to the list of commands. Is the module purge or the source /opt/cray/pe/cpe/23.12/restore_lmod_system_defaults.sh the more problematic line? Or are both bad practice?

That makes sense. It's worth the effort though to figure out which aspects of the default modules break this (and fix the default modules) rather than doing extreme surgery on the NERSC software environment every time the defaults get in the way.

That sounds good to me! Thanks for taking the time to look into this!

michel2323 · 2024-09-17T21:54:55Z

I'll apply for an account, and hopefully, I won't create more work for you two but remove some.

Thanks so much @JBlaschke ! Happy vacation!

andrewdnolan · 2024-09-18T20:43:29Z

Testing

With the environment/CUDA workaround tested above, I'm now able to run the unit tests and everything passes for me!

To run the tests on perlmutter, I did:

module purge 
source /opt/cray/pe/cpe/23.12/restore_lmod_system_defaults.sh

module load PrgEnv-gnu/8.5.0
module load julia/1.10.4
module load cpe-cuda/23.12

julia -O0 --color=yes --debug-info=2 --project=@. -e 'using Pkg; Pkg.test()'

which resulted in:

Unit Test Logs

     Testing MOKA
      Status `/tmp/jl_EpUHmt/Project.toml`
  [79e6a3ab] Adapt v4.0.4
⌃ [052768ef] CUDA v5.4.3
  [7da242da] Enzyme v0.12.36
  [63c18a36] KernelAbstractions v0.9.26
  [4c5738b9] MOKA v0.1.0 `/global/u2/a/anolan/MPAS-Ocean.jl`
  [3a884ed6] UnPack v1.0.2
  [ddb6d928] YAML v0.4.12
  [ade2ca70] Dates
  [f43a241f] Downloads v1.6.0
  [4af54fe1] LazyArtifacts
  [37e2e46d] LinearAlgebra
  [44cfe95a] Pkg v1.10.0
  [8dfed614] Test
      Status `/tmp/jl_EpUHmt/Manifest.toml`
  [21141c5a] AMDGPU v1.0.1
  [621f4979] AbstractFFTs v1.5.0
  [7d9f7c33] Accessors v0.1.38
  [79e6a3ab] Adapt v4.0.4
  [a9b6321e] Atomix v0.1.0
  [ab4f0b2a] BFloat16s v0.5.0
  [6e4b80f9] BenchmarkTools v1.5.0
  [d1d4a3ce] BitFlags v0.1.9
  [fa961155] CEnum v0.5.0
  [179af706] CFTime v0.1.3
⌃ [052768ef] CUDA v5.4.3
  [1af6417a] CUDA_Runtime_Discovery v0.3.5
  [da1fd8a2] CodeTracking v1.3.6
  [944b1d66] CodecZlib v0.7.6
  [35d6a980] ColorSchemes v3.26.0
  [3da002f7] ColorTypes v0.11.5
  [c3611d14] ColorVectorSpace v0.10.0
  [5ae59095] Colors v0.12.11
  [1fbeeb36] CommonDataModel v0.3.6
  [34da2185] Compat v4.16.0
  [a33af91c] CompositionsBase v0.1.2
  [f0e56b4a] ConcurrentUtilities v2.4.2
  [187b0558] ConstructionBase v1.5.8
  [d38c429a] Contour v0.6.3
  [a8cc5b0e] Crayons v4.1.1
  [9a962f9c] DataAPI v1.16.0
  [a93c6f00] DataFrames v1.6.1
  [864edb3b] DataStructures v0.18.20
  [e2d170a0] DataValueInterfaces v1.0.0
  [8bb1440f] DelimitedFiles v1.9.1
⌅ [3c3547ce] DiskArrays v0.3.23
  [ffbed154] DocStringExtensions v0.9.3
  [7da242da] Enzyme v0.12.36
  [f151be2c] EnzymeCore v0.7.8
  [460bff9d] ExceptionUnwrapping v0.1.10
  [e2ba6199] ExprTools v0.1.10
  [c87230d0] FFMPEG v0.4.1
  [53c48c17] FixedPointNumbers v0.8.5
  [1fa38f19] Format v1.3.7
  [0c68f7d7] GPUArrays v10.3.1
  [46192b85] GPUArraysCore v0.1.6
⌅ [61eb1bfa] GPUCompiler v0.26.7
  [28b8d3ca] GR v0.73.7
  [42e2da0e] Grisu v1.0.2
  [cd3eb016] HTTP v1.10.8
  [842dd82b] InlineStrings v1.4.2
  [3587e190] InverseFunctions v0.1.16
  [41ab1584] InvertedIndices v1.3.0
  [92d709cd] IrrationalConstants v0.2.2
  [82899510] IteratorInterfaceExtensions v1.0.0
  [1019f520] JLFzf v0.1.8
  [692b3bcd] JLLWrappers v1.6.0
  [682c06a0] JSON v0.21.4
  [aa1ae85d] JuliaInterpreter v0.9.36
  [63c18a36] KernelAbstractions v0.9.26
⌅ [929cbde3] LLVM v8.1.0
  [8b046642] LLVMLoopInfo v1.0.0
  [8ac3fa9e] LRUCache v1.6.1
  [b964fa9f] LaTeXStrings v1.3.1
  [23fbe1c1] Latexify v0.16.5
  [2ab3a3ac] LogExpFunctions v0.3.28
  [e6f89c97] LoggingExtras v1.0.3
  [6f1432cf] LoweredCodeUtils v3.0.2
  [4c5738b9] MOKA v0.1.0 `/global/u2/a/anolan/MPAS-Ocean.jl`
  [da04e1cc] MPI v0.20.21
  [3da0fdf6] MPIPreferences v0.1.11
  [1914dd2f] MacroTools v0.5.13
  [739be429] MbedTLS v1.1.9
  [442fdcdd] Measures v0.3.2
  [e1d29d7a] Missings v1.2.0
  [85f8d34a] NCDatasets v0.14.5
  [5da4648a] NVTX v0.3.4
  [77ba4419] NaNMath v1.0.2
  [d8793406] ObjectFile v0.4.2
  [6fe1bfb0] OffsetArrays v1.14.1
  [4d8831e6] OpenSSL v1.4.3
  [bac558e1] OrderedCollections v1.6.3
  [69de0a69] Parsers v2.8.1
  [b98c9c47] Pipe v1.3.0
  [eebad327] PkgVersion v0.3.3
  [ccf2f8ad] PlotThemes v3.2.0
  [995b91a9] PlotUtils v1.4.1
  [91a5bcdd] Plots v1.40.8
  [2dfb63ee] PooledArrays v1.4.3
  [aea7be01] PrecompileTools v1.2.1
  [21216c6a] Preferences v1.4.3
  [08abe8d2] PrettyTables v2.3.2
  [74087812] Random123 v1.7.0
  [e6cf234a] RandomNumbers v1.6.0
  [3cdcf5f2] RecipesBase v1.3.4
  [01d81517] RecipesPipeline v0.6.12
  [189a3867] Reexport v1.2.2
  [05181044] RelocatableFolders v1.0.1
  [ae029012] Requires v1.3.0
  [295af30f] Revise v3.5.18
  [6c6a2e73] Scratch v1.2.1
  [91c51154] SentinelArrays v1.4.5
  [992d4aef] Showoff v1.0.3
  [777ac1f9] SimpleBufferStream v1.1.0
  [a2af1166] SortingAlgorithms v1.2.1
  [276daf66] SpecialFunctions v2.4.0
  [90137ffa] StaticArrays v1.9.7
  [1e83bf80] StaticArraysCore v1.4.3
  [82ae8749] StatsAPI v1.7.0
  [2913bbd2] StatsBase v0.34.3
  [69024149] StringEncodings v0.3.7
  [892a3eda] StringManipulation v0.3.4
  [09ab397b] StructArrays v0.6.18
  [53d494c1] StructIO v0.3.1
  [3783bdb8] TableTraits v1.0.1
  [bd369af6] Tables v1.12.0
  [62fd8b95] TensorCore v0.1.1
  [a759f4b9] TimerOutputs v0.5.24
  [3bb67fe8] TranscodingStreams v0.11.2
  [5c2747f8] URIs v1.5.1
  [3a884ed6] UnPack v1.0.2
  [1cfade01] UnicodeFun v0.4.1
  [1986cc42] Unitful v1.21.0
  [45397f5d] UnitfulLatexify v1.6.4
  [013be700] UnsafeAtomics v0.2.1
  [d80eeb9a] UnsafeAtomicsLLVM v0.2.1
  [41fe7b60] Unzip v0.2.0
  [ddb6d928] YAML v0.4.12
  [0b7ba130] Blosc_jll v1.21.5+0
  [6e34b625] Bzip2_jll v1.0.8+1
⌅ [4ee394cb] CUDA_Driver_jll v0.9.2+0
⌅ [76a88914] CUDA_Runtime_jll v0.14.1+0
  [83423d85] Cairo_jll v1.18.0+2
  [ee1fde0b] Dbus_jll v1.14.10+0
  [ab5a07f8] Elfutils_jll v0.189.0+1
⌅ [7cc45869] Enzyme_jll v0.0.148+0
  [2702e6a9] EpollShim_jll v0.0.20230411+0
  [2e619515] Expat_jll v2.6.2+0
⌅ [b22a6f82] FFMPEG_jll v4.4.4+1
  [a3f928ae] Fontconfig_jll v2.13.96+0
  [d7e528f0] FreeType2_jll v2.13.2+0
  [559328eb] FriBidi_jll v1.0.14+0
  [0656b61e] GLFW_jll v3.4.0+1
  [d2c73de3] GR_jll v0.73.7+0
  [78b55507] Gettext_jll v0.21.0+0
  [7746bdde] Glib_jll v2.80.2+0
  [0951126a] GnuTLS_jll v3.8.4+0
  [3b182d85] Graphite2_jll v1.3.14+0
  [0234f1f7] HDF5_jll v1.14.3+3
  [2696aab5] HIP_jll v5.4.4+0
  [2e76f6c2] HarfBuzz_jll v8.3.1+0
  [e33a78d0] Hwloc_jll v2.11.1+0
  [aacddb02] JpegTurbo_jll v3.0.3+0
  [9c1d0b0a] JuliaNVTXCallbacks_jll v0.2.1+0
  [c1c5ebd0] LAME_jll v3.100.2+0
⌅ [88015f11] LERC_jll v3.0.0+1
⌅ [dad2f222] LLVMExtra_jll v0.0.31+0
  [1d63c593] LLVMOpenMP_jll v18.1.7+0
⌅ [86de99a1] LLVM_jll v15.0.7+10
  [dd4b983a] LZO_jll v2.10.2+0
⌅ [e9f186c6] Libffi_jll v3.2.2+1
  [d4300ac3] Libgcrypt_jll v1.8.11+0
  [7e76a0d4] Libglvnd_jll v1.6.0+0
  [7add5ba3] Libgpg_error_jll v1.49.0+0
  [94ce4f54] Libiconv_jll v1.17.0+0
  [4b2f31a3] Libmount_jll v2.40.1+0
⌅ [89763e89] Libtiff_jll v4.5.1+1
  [38a345b3] Libuuid_jll v2.40.1+0
  [5ced341a] Lz4_jll v1.10.0+0
  [7cb0a576] MPICH_jll v4.2.2+0
  [f1f71cc9] MPItrampoline_jll v5.4.0+0
  [9237b28f] MicrosoftMPI_jll v10.1.4+2
  [7f51dc2b] NUMA_jll v2.0.18+0
  [e98f9f5b] NVTX_jll v3.1.0+2
  [7243133f] NetCDF_jll v400.902.211+1
⌅ [4c82536e] Nettle_jll v3.7.2+0
  [e7412a2a] Ogg_jll v1.3.5+1
⌅ [fe0851c0] OpenMPI_jll v4.1.6+0
  [458c3c95] OpenSSL_jll v3.0.15+0
  [efe28fd5] OpenSpecFun_jll v0.5.5+0
  [91d4177d] Opus_jll v1.3.3+0
  [c2071276] P11Kit_jll v0.24.1+0
  [36c8627f] Pango_jll v1.54.1+0
  [30392449] Pixman_jll v0.43.4+0
  [c0090381] Qt6Base_jll v6.7.1+1
  [629bc702] Qt6Declarative_jll v6.7.1+2
  [ce943373] Qt6ShaderTools_jll v6.7.1+1
  [e99dba38] Qt6Wayland_jll v6.7.1+1
  [8fbdd1d2] ROCmCompilerSupport_jll v5.4.4+0
  [873c0968] ROCmDeviceLibs_jll v5.6.1+1
  [10ae2a08] ROCmOpenCLRuntime_jll v5.4.4+0
  [a44049a8] Vulkan_Loader_jll v1.3.243+0
  [a2964d1f] Wayland_jll v1.21.0+1
  [2381bf8a] Wayland_protocols_jll v1.31.0+0
  [02c8fc9c] XML2_jll v2.13.3+0
  [aed1982a] XSLT_jll v1.1.41+0
  [ffd25f8a] XZ_jll v5.4.6+0
  [f67eecfb] Xorg_libICE_jll v1.1.1+0
  [c834827a] Xorg_libSM_jll v1.2.4+0
  [4f6342f7] Xorg_libX11_jll v1.8.6+0
  [0c0b7dd1] Xorg_libXau_jll v1.0.11+0
  [935fb764] Xorg_libXcursor_jll v1.2.0+4
  [a3789734] Xorg_libXdmcp_jll v1.1.4+0
  [1082639a] Xorg_libXext_jll v1.3.6+0
  [d091e8ba] Xorg_libXfixes_jll v5.0.3+4
  [a51aa0fd] Xorg_libXi_jll v1.7.10+4
  [d1454406] Xorg_libXinerama_jll v1.1.4+4
  [ec84b674] Xorg_libXrandr_jll v1.5.2+4
  [ea2f1a96] Xorg_libXrender_jll v0.9.11+0
  [a65dc6b1] Xorg_libpciaccess_jll v0.16.0+1
  [14d82f49] Xorg_libpthread_stubs_jll v0.1.1+0
  [c7cfdc94] Xorg_libxcb_jll v1.17.0+0
  [cc61e674] Xorg_libxkbfile_jll v1.1.2+0
  [e920d4aa] Xorg_xcb_util_cursor_jll v0.1.4+0
  [12413925] Xorg_xcb_util_image_jll v0.4.0+1
  [2def613f] Xorg_xcb_util_jll v0.4.0+1
  [975044d2] Xorg_xcb_util_keysyms_jll v0.4.0+1
  [0d47668e] Xorg_xcb_util_renderutil_jll v0.3.9+1
  [c22f9ab0] Xorg_xcb_util_wm_jll v0.4.1+1
  [35661453] Xorg_xkbcomp_jll v1.4.6+0
  [33bec58e] Xorg_xkeyboard_config_jll v2.39.0+0
  [c4d99508] Xorg_xorgproto_jll v2019.2.0+2
  [c5fb5394] Xorg_xtrans_jll v1.5.0+0
  [3161d3a3] Zstd_jll v1.5.6+0
  [c53206cc] argp_standalone_jll v1.3.1+0
  [35ca27e7] eudev_jll v3.2.9+0
  [d65627f6] fts_jll v1.2.8+0
  [214eeab7] fzf_jll v0.53.0+0
  [1a1c6b14] gperf_jll v3.1.1+0
  [dd59ff1a] hsa_rocr_jll v5.4.4+0
  [1cecccd7] hsakmt_roct_jll v5.5.1+0
  [477f73a3] libaec_jll v1.1.2+0
  [a4ae2306] libaom_jll v3.9.0+0
  [0ac62f75] libass_jll v0.15.2+0
  [1183f4f0] libdecor_jll v0.2.2+0
  [8e53e030] libdrm_jll v2.4.110+0
  [2db6ffa8] libevdev_jll v1.11.0+0
  [f638f0a6] libfdk_aac_jll v2.0.3+0
  [36db933b] libinput_jll v1.18.0+0
  [b53b4c65] libpng_jll v1.6.43+1
  [f27f6e37] libvorbis_jll v1.3.7+2
  [337d8026] libzip_jll v1.10.1+0
  [009596ad] mtdev_jll v1.1.6+0
  [c88a4935] obstack_jll v1.2.3+0
  [5a766526] rocminfo_jll v5.4.4+0
⌅ [1270edf5] x264_jll v2021.5.5+0
⌅ [dfaa095f] x265_jll v3.5.0+0
  [d8fb68d0] xkbcommon_jll v1.4.1+1
  [0dad84c5] ArgTools v1.1.1
  [56f22d72] Artifacts
  [2a0f44e3] Base64
  [ade2ca70] Dates
  [8ba89e20] Distributed
  [f43a241f] Downloads v1.6.0
  [7b1f6079] FileWatching
  [9fa8497b] Future
  [b77e0a4c] InteractiveUtils
  [4af54fe1] LazyArtifacts
  [b27032c2] LibCURL v0.6.4
  [76f85450] LibGit2
  [8f399da3] Libdl
  [37e2e46d] LinearAlgebra
  [56ddb016] Logging
  [d6f4376e] Markdown
  [a63ad114] Mmap
  [ca575930] NetworkOptions v1.2.0
  [44cfe95a] Pkg v1.10.0
  [de0858da] Printf
  [9abbd945] Profile
  [3fa0cd96] REPL
  [9a3f8284] Random
  [ea8e919c] SHA v0.7.0
  [9e88b42a] Serialization
  [6462fe0b] Sockets
  [2f01184e] SparseArrays v1.10.0
  [10745b16] Statistics v1.10.0
  [fa267f1f] TOML v1.0.3
  [a4e569a6] Tar v1.10.0
  [8dfed614] Test
  [cf7118a7] UUIDs
  [4ec0a83e] Unicode
  [e66e0078] CompilerSupportLibraries_jll v1.1.1+0
  [781609d7] GMP_jll v6.2.1+6
  [d55e3150] LLD_jll v15.0.7+10
  [deac9b47] LibCURL_jll v8.4.0+0
  [e37daf67] LibGit2_jll v1.6.4+0
  [29816b5a] LibSSH2_jll v1.11.0+1
  [c8ffd9c3] MbedTLS_jll v2.28.2+1
  [14a3606d] MozillaCACerts_jll v2023.1.10
  [4536629a] OpenBLAS_jll v0.3.23+4
  [05823500] OpenLibm_jll v0.8.1+2
  [efcefdf7] PCRE2_jll v10.42.0+1
  [bea87d4a] SuiteSparse_jll v7.2.1+1
  [83775a58] Zlib_jll v1.2.13+1
  [8f36deef] libLLVM_jll v15.0.7+10
  [8e850b90] libblastrampoline_jll v5.8.0+1
  [8e850ede] nghttp2_jll v1.52.0+1
  [3f19e933] p7zip_jll v17.4.0+2
        Info Packages marked with ⌃ and ⌅ have new versions available. Those with ⌃ may be upgradable, but those with ⌅ are restricted by compatibility constraints from upgrading.
     Testing Running tests...
WARNING: Method definition (::Type{Main.ErrorMeasures{FT}})(Any, Any) in module Main at /global/u2/a/anolan/MPAS-Ocean.jl/test/utilities.jl:14 overwritten on the same line (check for duplicate calls to `include`).
WARNING: Method definition (::Type{Main.ErrorMeasures{FT} where FT})(FT, FT) where {FT} in module Main at /global/u2/a/anolan/MPAS-Ocean.jl/test/utilities.jl:14 overwritten on the same line (check for duplicate calls to `include`).
WARNING: Method definition (::Type{Main.ErrorMeasures{FT} where FT})(Any, Any, Any, Any) in module Main at /global/u2/a/anolan/MPAS-Ocean.jl/test/utilities.jl:18 overwritten on the same line (check for duplicate calls to `include`).
WARNING: Method definition compute_area(Any, Type{MOKA.MPASMesh.Cell}) in module Main at /global/u2/a/anolan/MPAS-Ocean.jl/test/utilities.jl:30 overwritten on the same line (check for duplicate calls to `include`).
WARNING: Method definition compute_area(Any, Type{MOKA.MPASMesh.Vertex}) in module Main at /global/u2/a/anolan/MPAS-Ocean.jl/test/utilities.jl:31 overwritten on the same line (check for duplicate calls to `include`).
WARNING: Method definition compute_area(Any, Type{MOKA.MPASMesh.Edge}) in module Main at /global/u2/a/anolan/MPAS-Ocean.jl/test/utilities.jl:32 overwritten on the same line (check for duplicate calls to `include`).
WARNING: Method definition (::Type{Main.TestSetup{FT, IT, AT}})(Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any) in module Main at /global/u2/a/anolan/MPAS-Ocean.jl/test/utilities.jl:36 overwritten on the same line (check for duplicate calls to `include`).
WARNING: Method definition (::Type{Main.TestSetup{FT, IT, AT} where AT where IT where FT})(KernelAbstractions.Backend, AT, AT, AT, AT, AT, AT, FT, FT, AT, AT, IT) where {FT, IT, AT} in module Main at /global/u2/a/anolan/MPAS-Ocean.jl/test/utilities.jl:36 overwritten on the same line (check for duplicate calls to `include`).
WARNING: Method definition (::Type{Main.TestSetup{FT, IT, AT} where AT where IT where FT})(MOKA.MPASMesh.Mesh{HM, VM} where VM where HM, Type{Main.PlanarTest}) in module Main at /global/u2/a/anolan/MPAS-Ocean.jl/test/utilities.jl:56 overwritten on the same line (check for duplicate calls to `include`).
WARNING: Method definition kwcall(NamedTuple{names, T} where T<:Tuple where names, Type{Main.TestSetup{FT, IT, AT} where AT where IT where FT}, MOKA.MPASMesh.Mesh{HM, VM} where VM where HM, Type{Main.PlanarTest}) in module Main at /global/u2/a/anolan/MPAS-Ocean.jl/test/utilities.jl:56 overwritten on the same line (check for duplicate calls to `include`).
WARNING: Method definition h(Main.TestSetup{FT, IT, AT} where AT where IT where FT, Type{Main.PlanarTest}) in module Main at /global/u2/a/anolan/MPAS-Ocean.jl/test/utilities.jl:93 overwritten on the same line (check for duplicate calls to `include`).
WARNING: Method definition 𝐅ˣ(Main.TestSetup{FT, IT, AT} where AT where IT where FT, Type{Main.PlanarTest}) in module Main at /global/u2/a/anolan/MPAS-Ocean.jl/test/utilities.jl:106 overwritten on the same line (check for duplicate calls to `include`).
WARNING: Method definition 𝐅ʸ(Main.TestSetup{FT, IT, AT} where AT where IT where FT, Type{Main.PlanarTest}) in module Main at /global/u2/a/anolan/MPAS-Ocean.jl/test/utilities.jl:114 overwritten on the same line (check for duplicate calls to `include`).
WARNING: Method definition ∂h∂x(Main.TestSetup{FT, IT, AT} where AT where IT where FT, Type{Main.PlanarTest}) in module Main at /global/u2/a/anolan/MPAS-Ocean.jl/test/utilities.jl:120 overwritten on the same line (check for duplicate calls to `include`).
WARNING: Method definition ∂h∂y(Main.TestSetup{FT, IT, AT} where AT where IT where FT, Type{Main.PlanarTest}) in module Main at /global/u2/a/anolan/MPAS-Ocean.jl/test/utilities.jl:126 overwritten on the same line (check for duplicate calls to `include`).
WARNING: Method definition div𝐅(Main.TestSetup{FT, IT, AT} where AT where IT where FT, Type{Main.PlanarTest}) in module Main at /global/u2/a/anolan/MPAS-Ocean.jl/test/utilities.jl:135 overwritten on the same line (check for duplicate calls to `include`).
WARNING: Method definition curl𝐅(Main.TestSetup{FT, IT, AT} where AT where IT where FT, Type{Main.PlanarTest}) in module Main at /global/u2/a/anolan/MPAS-Ocean.jl/test/utilities.jl:148 overwritten on the same line (check for duplicate calls to `include`).
WARNING: Method definition 𝐅ₑ(Main.TestSetup{FT, IT, AT} where AT where IT where FT, Type{TC}) where {TC<:Main.TestCase} in module Main at /global/u2/a/anolan/MPAS-Ocean.jl/test/utilities.jl:161 overwritten on the same line (check for duplicate calls to `include`).
WARNING: Method definition ∇hₑ(Main.TestSetup{FT, IT, AT} where AT where IT where FT, Type{TC}) where {TC<:Main.TestCase} in module Main at /global/u2/a/anolan/MPAS-Ocean.jl/test/utilities.jl:178 overwritten on the same line (check for duplicate calls to `include`).
┌ Info:  (Operators on GPU) 
│ 
│ Gradient
│ --------
│ L∞ norm of error : 0.0012502607187856627
│ L₂ norm of error : 0.0013435461111726162
│ 
│ Divergence
│ ----------
│ L∞ norm of error: 0.0012488688659444015
│ L₂ norm of error: 0.0012488688659097393
│ 
│ Curl
│ ----
│ L∞ norm of error: 0.16136566356967616
└ L₂ norm of error: 0.16134801689713474
┌ Warning: Assignment to `HorzMesh` in soft scope is ambiguous because a global variable by the same name exists: `HorzMesh` will be treated as a new local. Disambiguate by using `local HorzMesh` to suppress this warning or `global HorzMesh` to assign to the existing global variable.
└ @ /global/u2/a/anolan/MPAS-Ocean.jl/test/enzyme/test_Enzyme_Operators.jl:27
┌ Warning: Assignment to `VertMesh` in soft scope is ambiguous because a global variable by the same name exists: `VertMesh` will be treated as a new local. Disambiguate by using `local VertMesh` to suppress this warning or `global VertMesh` to assign to the existing global variable.
└ @ /global/u2/a/anolan/MPAS-Ocean.jl/test/enzyme/test_Enzyme_Operators.jl:29
┌ Warning: Assignment to `MPASMesh` in soft scope is ambiguous because a global variable by the same name exists: `MPASMesh` will be treated as a new local. Disambiguate by using `local MPASMesh` to suppress this warning or `global MPASMesh` to assign to the existing global variable.
└ @ /global/u2/a/anolan/MPAS-Ocean.jl/test/enzyme/test_Enzyme_Operators.jl:31
┌ Warning: Assignment to `setup` in soft scope is ambiguous because a global variable by the same name exists: `setup` will be treated as a new local. Disambiguate by using `local setup` to suppress this warning or `global setup` to assign to the existing global variable.
└ @ /global/u2/a/anolan/MPAS-Ocean.jl/test/enzyme/test_Enzyme_Operators.jl:33
┌ Warning: Assignment to `nEdges` in soft scope is ambiguous because a global variable by the same name exists: `nEdges` will be treated as a new local. Disambiguate by using `local nEdges` to suppress this warning or `global nEdges` to assign to the existing global variable.
└ @ /global/u2/a/anolan/MPAS-Ocean.jl/test/enzyme/test_Enzyme_Operators.jl:35
┌ Warning: Assignment to `nCells` in soft scope is ambiguous because a global variable by the same name exists: `nCells` will be treated as a new local. Disambiguate by using `local nCells` to suppress this warning or `global nCells` to assign to the existing global variable.
└ @ /global/u2/a/anolan/MPAS-Ocean.jl/test/enzyme/test_Enzyme_Operators.jl:36
┌ Warning: Assignment to `nVertLevels` in soft scope is ambiguous because a global variable by the same name exists: `nVertLevels` will be treated as a new local. Disambiguate by using `local nVertLevels` to suppress this warning or `global nVertLevels` to assign to the existing global variable.
└ @ /global/u2/a/anolan/MPAS-Ocean.jl/test/enzyme/test_Enzyme_Operators.jl:37
┌ Warning: Assignment to `gradNum` in soft scope is ambiguous because a global variable by the same name exists: `gradNum` will be treated as a new local. Disambiguate by using `local gradNum` to suppress this warning or `global gradNum` to assign to the existing global variable.
└ @ /global/u2/a/anolan/MPAS-Ocean.jl/test/enzyme/test_Enzyme_Operators.jl:49
┌ Warning: Assignment to `Scalar` in soft scope is ambiguous because a global variable by the same name exists: `Scalar` will be treated as a new local. Disambiguate by using `local Scalar` to suppress this warning or `global Scalar` to assign to the existing global variable.
└ @ /global/u2/a/anolan/MPAS-Ocean.jl/test/enzyme/test_Enzyme_Operators.jl:50
┌ Warning: Assignment to `divNum` in soft scope is ambiguous because a global variable by the same name exists: `divNum` will be treated as a new local. Disambiguate by using `local divNum` to suppress this warning or `global divNum` to assign to the existing global variable.
└ @ /global/u2/a/anolan/MPAS-Ocean.jl/test/enzyme/test_Enzyme_Operators.jl:145
┌ Warning: Assignment to `VecEdge` in soft scope is ambiguous because a global variable by the same name exists: `VecEdge` will be treated as a new local. Disambiguate by using `local VecEdge` to suppress this warning or `global VecEdge` to assign to the existing global variable.
└ @ /global/u2/a/anolan/MPAS-Ocean.jl/test/enzyme/test_Enzyme_Operators.jl:146
┌ Warning: Assignment to `temp` in soft scope is ambiguous because a global variable by the same name exists: `temp` will be treated as a new local. Disambiguate by using `local temp` to suppress this warning or `global temp` to assign to the existing global variable.
└ @ /global/u2/a/anolan/MPAS-Ocean.jl/test/enzyme/test_Enzyme_Operators.jl:147
┌ Info:  (gradients)
│ 
│ For edge global input 1, output 1
│ Enzyme computed 48.00000000000077
└ Finite differences computed 48.00000032512775
┌ Info:  (divergence)
│ 
│ For cell global input 2, output 1
│ Enzyme computed -32.00000000000052
└ Finite differences computed -31.999999391116123
┌ Info:  (gradients)
│ 
│ For edge global input 1, output 1
│ Enzyme computed 48.00000000000077
└ Finite differences computed 48.00000032512775
┌ Info:  (divergence)
│ 
│ For cell global input 2, output 1
│ Enzyme computed -32.00000000000052
└ Finite differences computed -31.999999391116123
backend = KernelAbstractions.CPU(false)
(nEdges, nCells) = (6912, 2304)
backend = CUDABackend(false, false)
(nEdges, nCells) = (6912, 2304)
Test Summary: |  Pass  Broken  Total     Time
Moka          | 73879       1  73880  2m31.5s
     Testing MOKA tests passed

The tests all pass, but there are many warnings about duplicate include calls and soft-scope variable assignments.

I just added two commits that deal with those warnings, and a third that removes the unneeded Python file. I'll rerun the tests and post the results here.
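For context, here is a minimal sketch of one way both warning classes can be avoided. It is not necessarily what the two commits do. The "Method definition ... overwritten" warnings appear when a file that defines methods (test/utilities.jl in the log) is included more than once into the same module, and the soft-scope warnings appear when a top-level for, while, or try block in a file assigns to a name that already exists as a global. The names compute_area_demo and run_operator_checks are hypothetical and exist only for this illustration.

```julia
using Test

# Hypothetical stand-in for a helper shared between test files. Defining it
# once (for example by including the shared file a single time from
# test/runtests.jl) avoids the "Method definition ... overwritten" warnings.
compute_area_demo(n) = 2.0 * n

# Wrapping the test body in a function gives its assignments a hard local
# scope, so they cannot be confused with globals left behind by other files.
function run_operator_checks()
    nCells = 4
    total = 0.0
    for i in 1:nCells
        total += compute_area_demo(i)  # `total` is a function local here
    end
    return total
end

@testset "scope demo" begin
    @test run_operator_checks() == 20.0
end
```

Writing `local nCells` (or `global nCells`) at the assignment site, as the warning text suggests, is the other option. Wrapping the body in a function just avoids annotating every variable.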

andrewdnolan · 2024-09-18T21:43:00Z

Testing (Cont.)

With those last couple of commits, the unit test log now looks like:

Unit Test Logs

     Testing MOKA
      Status `/tmp/jl_nGw76j/Project.toml`
  [79e6a3ab] Adapt v4.0.4
⌃ [052768ef] CUDA v5.4.3
  [7da242da] Enzyme v0.12.36
  [63c18a36] KernelAbstractions v0.9.26
  [4c5738b9] MOKA v0.1.0 `/global/u2/a/anolan/MPAS-Ocean.jl`
  [3a884ed6] UnPack v1.0.2
  [ddb6d928] YAML v0.4.12
  [ade2ca70] Dates
  [f43a241f] Downloads v1.6.0
  [4af54fe1] LazyArtifacts
  [37e2e46d] LinearAlgebra
  [44cfe95a] Pkg v1.10.0
  [8dfed614] Test
      Status `/tmp/jl_nGw76j/Manifest.toml`
  [21141c5a] AMDGPU v1.0.1
  [621f4979] AbstractFFTs v1.5.0
  [7d9f7c33] Accessors v0.1.38
  [79e6a3ab] Adapt v4.0.4
  [a9b6321e] Atomix v0.1.0
  [ab4f0b2a] BFloat16s v0.5.0
  [6e4b80f9] BenchmarkTools v1.5.0
  [d1d4a3ce] BitFlags v0.1.9
  [fa961155] CEnum v0.5.0
  [179af706] CFTime v0.1.3
⌃ [052768ef] CUDA v5.4.3
  [1af6417a] CUDA_Runtime_Discovery v0.3.5
  [da1fd8a2] CodeTracking v1.3.6
  [944b1d66] CodecZlib v0.7.6
  [35d6a980] ColorSchemes v3.26.0
  [3da002f7] ColorTypes v0.11.5
  [c3611d14] ColorVectorSpace v0.10.0
  [5ae59095] Colors v0.12.11
  [1fbeeb36] CommonDataModel v0.3.6
  [34da2185] Compat v4.16.0
  [a33af91c] CompositionsBase v0.1.2
  [f0e56b4a] ConcurrentUtilities v2.4.2
  [187b0558] ConstructionBase v1.5.8
  [d38c429a] Contour v0.6.3
  [a8cc5b0e] Crayons v4.1.1
  [9a962f9c] DataAPI v1.16.0
  [a93c6f00] DataFrames v1.6.1
  [864edb3b] DataStructures v0.18.20
  [e2d170a0] DataValueInterfaces v1.0.0
  [8bb1440f] DelimitedFiles v1.9.1
⌅ [3c3547ce] DiskArrays v0.3.23
  [ffbed154] DocStringExtensions v0.9.3
  [7da242da] Enzyme v0.12.36
  [f151be2c] EnzymeCore v0.7.8
  [460bff9d] ExceptionUnwrapping v0.1.10
  [e2ba6199] ExprTools v0.1.10
  [c87230d0] FFMPEG v0.4.1
  [53c48c17] FixedPointNumbers v0.8.5
  [1fa38f19] Format v1.3.7
  [0c68f7d7] GPUArrays v10.3.1
  [46192b85] GPUArraysCore v0.1.6
⌅ [61eb1bfa] GPUCompiler v0.26.7
  [28b8d3ca] GR v0.73.7
  [42e2da0e] Grisu v1.0.2
  [cd3eb016] HTTP v1.10.8
  [842dd82b] InlineStrings v1.4.2
  [3587e190] InverseFunctions v0.1.16
  [41ab1584] InvertedIndices v1.3.0
  [92d709cd] IrrationalConstants v0.2.2
  [82899510] IteratorInterfaceExtensions v1.0.0
  [1019f520] JLFzf v0.1.8
  [692b3bcd] JLLWrappers v1.6.0
  [682c06a0] JSON v0.21.4
  [aa1ae85d] JuliaInterpreter v0.9.36
  [63c18a36] KernelAbstractions v0.9.26
⌅ [929cbde3] LLVM v8.1.0
  [8b046642] LLVMLoopInfo v1.0.0
  [8ac3fa9e] LRUCache v1.6.1
  [b964fa9f] LaTeXStrings v1.3.1
  [23fbe1c1] Latexify v0.16.5
  [2ab3a3ac] LogExpFunctions v0.3.28
  [e6f89c97] LoggingExtras v1.0.3
  [6f1432cf] LoweredCodeUtils v3.0.2
  [4c5738b9] MOKA v0.1.0 `/global/u2/a/anolan/MPAS-Ocean.jl`
  [da04e1cc] MPI v0.20.21
  [3da0fdf6] MPIPreferences v0.1.11
  [1914dd2f] MacroTools v0.5.13
  [739be429] MbedTLS v1.1.9
  [442fdcdd] Measures v0.3.2
  [e1d29d7a] Missings v1.2.0
  [85f8d34a] NCDatasets v0.14.5
  [5da4648a] NVTX v0.3.4
  [77ba4419] NaNMath v1.0.2
  [d8793406] ObjectFile v0.4.2
  [6fe1bfb0] OffsetArrays v1.14.1
  [4d8831e6] OpenSSL v1.4.3
  [bac558e1] OrderedCollections v1.6.3
  [69de0a69] Parsers v2.8.1
  [b98c9c47] Pipe v1.3.0
  [eebad327] PkgVersion v0.3.3
  [ccf2f8ad] PlotThemes v3.2.0
  [995b91a9] PlotUtils v1.4.1
  [91a5bcdd] Plots v1.40.8
  [2dfb63ee] PooledArrays v1.4.3
  [aea7be01] PrecompileTools v1.2.1
  [21216c6a] Preferences v1.4.3
  [08abe8d2] PrettyTables v2.3.2
  [74087812] Random123 v1.7.0
  [e6cf234a] RandomNumbers v1.6.0
  [3cdcf5f2] RecipesBase v1.3.4
  [01d81517] RecipesPipeline v0.6.12
  [189a3867] Reexport v1.2.2
  [05181044] RelocatableFolders v1.0.1
  [ae029012] Requires v1.3.0
  [295af30f] Revise v3.5.18
  [6c6a2e73] Scratch v1.2.1
  [91c51154] SentinelArrays v1.4.5
  [992d4aef] Showoff v1.0.3
  [777ac1f9] SimpleBufferStream v1.1.0
  [a2af1166] SortingAlgorithms v1.2.1
  [276daf66] SpecialFunctions v2.4.0
  [90137ffa] StaticArrays v1.9.7
  [1e83bf80] StaticArraysCore v1.4.3
  [82ae8749] StatsAPI v1.7.0
  [2913bbd2] StatsBase v0.34.3
  [69024149] StringEncodings v0.3.7
  [892a3eda] StringManipulation v0.3.4
  [09ab397b] StructArrays v0.6.18
  [53d494c1] StructIO v0.3.1
  [3783bdb8] TableTraits v1.0.1
  [bd369af6] Tables v1.12.0
  [62fd8b95] TensorCore v0.1.1
  [a759f4b9] TimerOutputs v0.5.24
  [3bb67fe8] TranscodingStreams v0.11.2
  [5c2747f8] URIs v1.5.1
  [3a884ed6] UnPack v1.0.2
  [1cfade01] UnicodeFun v0.4.1
  [1986cc42] Unitful v1.21.0
  [45397f5d] UnitfulLatexify v1.6.4
  [013be700] UnsafeAtomics v0.2.1
  [d80eeb9a] UnsafeAtomicsLLVM v0.2.1
  [41fe7b60] Unzip v0.2.0
  [ddb6d928] YAML v0.4.12
  [0b7ba130] Blosc_jll v1.21.5+0
  [6e34b625] Bzip2_jll v1.0.8+1
⌅ [4ee394cb] CUDA_Driver_jll v0.9.2+0
⌅ [76a88914] CUDA_Runtime_jll v0.14.1+0
  [83423d85] Cairo_jll v1.18.0+2
  [ee1fde0b] Dbus_jll v1.14.10+0
  [ab5a07f8] Elfutils_jll v0.189.0+1
⌅ [7cc45869] Enzyme_jll v0.0.148+0
  [2702e6a9] EpollShim_jll v0.0.20230411+0
  [2e619515] Expat_jll v2.6.2+0
⌅ [b22a6f82] FFMPEG_jll v4.4.4+1
  [a3f928ae] Fontconfig_jll v2.13.96+0
  [d7e528f0] FreeType2_jll v2.13.2+0
  [559328eb] FriBidi_jll v1.0.14+0
  [0656b61e] GLFW_jll v3.4.0+1
  [d2c73de3] GR_jll v0.73.7+0
  [78b55507] Gettext_jll v0.21.0+0
  [7746bdde] Glib_jll v2.80.2+0
  [0951126a] GnuTLS_jll v3.8.4+0
  [3b182d85] Graphite2_jll v1.3.14+0
  [0234f1f7] HDF5_jll v1.14.3+3
  [2696aab5] HIP_jll v5.4.4+0
  [2e76f6c2] HarfBuzz_jll v8.3.1+0
  [e33a78d0] Hwloc_jll v2.11.1+0
  [aacddb02] JpegTurbo_jll v3.0.3+0
  [9c1d0b0a] JuliaNVTXCallbacks_jll v0.2.1+0
  [c1c5ebd0] LAME_jll v3.100.2+0
⌅ [88015f11] LERC_jll v3.0.0+1
⌅ [dad2f222] LLVMExtra_jll v0.0.31+0
  [1d63c593] LLVMOpenMP_jll v18.1.7+0
⌅ [86de99a1] LLVM_jll v15.0.7+10
  [dd4b983a] LZO_jll v2.10.2+0
⌅ [e9f186c6] Libffi_jll v3.2.2+1
  [d4300ac3] Libgcrypt_jll v1.8.11+0
  [7e76a0d4] Libglvnd_jll v1.6.0+0
  [7add5ba3] Libgpg_error_jll v1.49.0+0
  [94ce4f54] Libiconv_jll v1.17.0+0
  [4b2f31a3] Libmount_jll v2.40.1+0
⌅ [89763e89] Libtiff_jll v4.5.1+1
  [38a345b3] Libuuid_jll v2.40.1+0
  [5ced341a] Lz4_jll v1.10.0+0
  [7cb0a576] MPICH_jll v4.2.2+0
  [f1f71cc9] MPItrampoline_jll v5.4.0+0
  [9237b28f] MicrosoftMPI_jll v10.1.4+2
  [7f51dc2b] NUMA_jll v2.0.18+0
  [e98f9f5b] NVTX_jll v3.1.0+2
  [7243133f] NetCDF_jll v400.902.211+1
⌅ [4c82536e] Nettle_jll v3.7.2+0
  [e7412a2a] Ogg_jll v1.3.5+1
⌅ [fe0851c0] OpenMPI_jll v4.1.6+0
  [458c3c95] OpenSSL_jll v3.0.15+0
  [efe28fd5] OpenSpecFun_jll v0.5.5+0
  [91d4177d] Opus_jll v1.3.3+0
  [c2071276] P11Kit_jll v0.24.1+0
  [36c8627f] Pango_jll v1.54.1+0
  [30392449] Pixman_jll v0.43.4+0
  [c0090381] Qt6Base_jll v6.7.1+1
  [629bc702] Qt6Declarative_jll v6.7.1+2
  [ce943373] Qt6ShaderTools_jll v6.7.1+1
  [e99dba38] Qt6Wayland_jll v6.7.1+1
  [8fbdd1d2] ROCmCompilerSupport_jll v5.4.4+0
  [873c0968] ROCmDeviceLibs_jll v5.6.1+1
  [10ae2a08] ROCmOpenCLRuntime_jll v5.4.4+0
  [a44049a8] Vulkan_Loader_jll v1.3.243+0
  [a2964d1f] Wayland_jll v1.21.0+1
  [2381bf8a] Wayland_protocols_jll v1.31.0+0
  [02c8fc9c] XML2_jll v2.13.3+0
  [aed1982a] XSLT_jll v1.1.41+0
  [ffd25f8a] XZ_jll v5.4.6+0
  [f67eecfb] Xorg_libICE_jll v1.1.1+0
  [c834827a] Xorg_libSM_jll v1.2.4+0
  [4f6342f7] Xorg_libX11_jll v1.8.6+0
  [0c0b7dd1] Xorg_libXau_jll v1.0.11+0
  [935fb764] Xorg_libXcursor_jll v1.2.0+4
  [a3789734] Xorg_libXdmcp_jll v1.1.4+0
  [1082639a] Xorg_libXext_jll v1.3.6+0
  [d091e8ba] Xorg_libXfixes_jll v5.0.3+4
  [a51aa0fd] Xorg_libXi_jll v1.7.10+4
  [d1454406] Xorg_libXinerama_jll v1.1.4+4
  [ec84b674] Xorg_libXrandr_jll v1.5.2+4
  [ea2f1a96] Xorg_libXrender_jll v0.9.11+0
  [a65dc6b1] Xorg_libpciaccess_jll v0.16.0+1
  [14d82f49] Xorg_libpthread_stubs_jll v0.1.1+0
  [c7cfdc94] Xorg_libxcb_jll v1.17.0+0
  [cc61e674] Xorg_libxkbfile_jll v1.1.2+0
  [e920d4aa] Xorg_xcb_util_cursor_jll v0.1.4+0
  [12413925] Xorg_xcb_util_image_jll v0.4.0+1
  [2def613f] Xorg_xcb_util_jll v0.4.0+1
  [975044d2] Xorg_xcb_util_keysyms_jll v0.4.0+1
  [0d47668e] Xorg_xcb_util_renderutil_jll v0.3.9+1
  [c22f9ab0] Xorg_xcb_util_wm_jll v0.4.1+1
  [35661453] Xorg_xkbcomp_jll v1.4.6+0
  [33bec58e] Xorg_xkeyboard_config_jll v2.39.0+0
  [c4d99508] Xorg_xorgproto_jll v2019.2.0+2
  [c5fb5394] Xorg_xtrans_jll v1.5.0+0
  [3161d3a3] Zstd_jll v1.5.6+0
  [c53206cc] argp_standalone_jll v1.3.1+0
  [35ca27e7] eudev_jll v3.2.9+0
  [d65627f6] fts_jll v1.2.8+0
  [214eeab7] fzf_jll v0.53.0+0
  [1a1c6b14] gperf_jll v3.1.1+0
  [dd59ff1a] hsa_rocr_jll v5.4.4+0
  [1cecccd7] hsakmt_roct_jll v5.5.1+0
  [477f73a3] libaec_jll v1.1.2+0
  [a4ae2306] libaom_jll v3.9.0+0
  [0ac62f75] libass_jll v0.15.2+0
  [1183f4f0] libdecor_jll v0.2.2+0
  [8e53e030] libdrm_jll v2.4.110+0
  [2db6ffa8] libevdev_jll v1.11.0+0
  [f638f0a6] libfdk_aac_jll v2.0.3+0
  [36db933b] libinput_jll v1.18.0+0
  [b53b4c65] libpng_jll v1.6.43+1
  [f27f6e37] libvorbis_jll v1.3.7+2
  [337d8026] libzip_jll v1.10.1+0
  [009596ad] mtdev_jll v1.1.6+0
  [c88a4935] obstack_jll v1.2.3+0
  [5a766526] rocminfo_jll v5.4.4+0
⌅ [1270edf5] x264_jll v2021.5.5+0
⌅ [dfaa095f] x265_jll v3.5.0+0
  [d8fb68d0] xkbcommon_jll v1.4.1+1
  [0dad84c5] ArgTools v1.1.1
  [56f22d72] Artifacts
  [2a0f44e3] Base64
  [ade2ca70] Dates
  [8ba89e20] Distributed
  [f43a241f] Downloads v1.6.0
  [7b1f6079] FileWatching
  [9fa8497b] Future
  [b77e0a4c] InteractiveUtils
  [4af54fe1] LazyArtifacts
  [b27032c2] LibCURL v0.6.4
  [76f85450] LibGit2
  [8f399da3] Libdl
  [37e2e46d] LinearAlgebra
  [56ddb016] Logging
  [d6f4376e] Markdown
  [a63ad114] Mmap
  [ca575930] NetworkOptions v1.2.0
  [44cfe95a] Pkg v1.10.0
  [de0858da] Printf
  [9abbd945] Profile
  [3fa0cd96] REPL
  [9a3f8284] Random
  [ea8e919c] SHA v0.7.0
  [9e88b42a] Serialization
  [6462fe0b] Sockets
  [2f01184e] SparseArrays v1.10.0
  [10745b16] Statistics v1.10.0
  [fa267f1f] TOML v1.0.3
  [a4e569a6] Tar v1.10.0
  [8dfed614] Test
  [cf7118a7] UUIDs
  [4ec0a83e] Unicode
  [e66e0078] CompilerSupportLibraries_jll v1.1.1+0
  [781609d7] GMP_jll v6.2.1+6
  [d55e3150] LLD_jll v15.0.7+10
  [deac9b47] LibCURL_jll v8.4.0+0
  [e37daf67] LibGit2_jll v1.6.4+0
  [29816b5a] LibSSH2_jll v1.11.0+1
  [c8ffd9c3] MbedTLS_jll v2.28.2+1
  [14a3606d] MozillaCACerts_jll v2023.1.10
  [4536629a] OpenBLAS_jll v0.3.23+4
  [05823500] OpenLibm_jll v0.8.1+2
  [efcefdf7] PCRE2_jll v10.42.0+1
  [bea87d4a] SuiteSparse_jll v7.2.1+1
  [83775a58] Zlib_jll v1.2.13+1
  [8f36deef] libLLVM_jll v15.0.7+10
  [8e850b90] libblastrampoline_jll v5.8.0+1
  [8e850ede] nghttp2_jll v1.52.0+1
  [3f19e933] p7zip_jll v17.4.0+2
        Info Packages marked with ⌃ and ⌅ have new versions available. Those with ⌃ may be upgradable, but those with ⌅ are restricted by compatibility constraints from upgrading.
     Testing Running tests...
┌ Info:  (Operators on GPU) 
│ 
│ Gradient
│ --------
│ L∞ norm of error : 0.0012502607187856627
│ L₂ norm of error : 0.0013435461111726162
│ 
│ Divergence
│ ----------
│ L∞ norm of error: 0.0012488688659444015
│ L₂ norm of error: 0.0012488688659097393
│ 
│ Curl
│ ----
│ L∞ norm of error: 0.16136566356967616
└ L₂ norm of error: 0.16134801689713474
┌ Info:  (gradients)
│ 
│ For edge global input 1, output 1
│ Enzyme computed 48.00000000000077
└ Finite differences computed 48.00000032512775
┌ Info:  (divergence)
│ 
│ For cell global input 2, output 1
│ Enzyme computed -32.00000000000052
└ Finite differences computed -31.999999391116123
┌ Info:  (gradients)
│ 
│ For edge global input 1, output 1
│ Enzyme computed 48.00000000000077
└ Finite differences computed 48.00000032512775
┌ Info:  (divergence)
│ 
│ For cell global input 2, output 1
│ Enzyme computed -32.00000000000052
└ Finite differences computed -31.999999391116123
backend = KernelAbstractions.CPU(false)
(nEdges, nCells) = (6912, 2304)
backend = CUDABackend(false, false)
(nEdges, nCells) = (6912, 2304)
Test Summary: |  Pass  Broken  Total     Time
Moka          | 73879       1  73880  2m36.0s
     Testing MOKA tests passed

which no longer has the warnings from before.
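
The `(gradients)` and `(divergence)` entries in the log above pair each Enzyme derivative with a finite-difference estimate. As a rough illustration of how close the two are, the sketch below computes their relative difference in plain Julia. The helper name `reldiff` and the tolerance `rtol` are assumptions made for this example and are not taken from the test suite; only the four numbers are copied from the log.

```julia
# Relative difference between an AD derivative and a finite-difference estimate.
# `reldiff` is a hypothetical helper, not part of the package.
reldiff(ad, fd) = abs(ad - fd) / max(abs(ad), abs(fd))

# Values copied from the test log above.
grad_ad, grad_fd = 48.00000000000077, 48.00000032512775
div_ad, div_fd = -32.00000000000052, -31.999999391116123

# Assumed tolerance for illustration only; the real tests may use another value.
rtol = 1e-6

println("gradient relative difference:   ", reldiff(grad_ad, grad_fd))
println("divergence relative difference: ", reldiff(div_ad, div_fd))
println("gradient within rtol:   ", isapprox(grad_ad, grad_fd; rtol = rtol))
println("divergence within rtol: ", isapprox(div_ad, div_fd; rtol = rtol))
```

Some disagreement is expected, because a finite-difference estimate carries truncation and round-off error that depends on the step size.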

So, using that version of the code, I also ran the inertial gravity wave convergence tests on both CPU and GPU on Perlmutter. I've attached a plot of the convergence for a GPU test, which shows the expected results.

Tests on CPU produce the same results.
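
For readers who want to reproduce this kind of check, the observed convergence order is usually read off as the slope of error against resolution on log-log axes. Below is a minimal sketch in plain Julia. The function name `observed_order` is hypothetical, and the data are synthetic (an exact second-order error model), not the Perlmutter results or the model's documented order.

```julia
# Observed order between successive resolutions:
#   p_i = log(err_i / err_{i+1}) / log(h_i / h_{i+1})
# `observed_order` is a hypothetical helper for illustration.
function observed_order(h::AbstractVector, err::AbstractVector)
    @assert length(h) == length(err) "need one error value per resolution"
    return [log(err[i] / err[i + 1]) / log(h[i] / h[i + 1]) for i in 1:length(h) - 1]
end

# Synthetic data, NOT from this PR: err = C * h^2 with C = 0.5.
h = [200.0, 100.0, 50.0, 25.0]
err = 0.5 .* h .^ 2

orders = observed_order(h, err)
println("observed orders: ", orders)
```

With measured error norms in place of the synthetic `err`, the entries of `orders` should settle near the scheme's expected order as the mesh is refined.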

So, with that I'll go ahead and finally merge this. @jlk9 sorry about the hold up here! Thanks for all this work!

jlk9 added 30 commits August 22, 2024 14:22

Applied autodiff to end-to-end run, isolated model computations into …

fdd4aa9

…a loop helper function of ocn_run. Had to temporarily comment out kernels, will re-add them

Modified kernels so they cooperate with AD and (I think) no longer re…

45424fc

…quire @allowscalar macros. Still one summation that isn't working with AD

Successful (non-erroring) end-to-end AD of shallow water model

208a0f2

Added measure of ssh

7f94574

minor edits

5d1de5c

Removing uses of UnPack.jl due to incompatibility with Enzyme

9adc0b9

Removed circshift call

905470a

Tests show end-to-end AD aligns with FD results

c75234b

Expanded test across range of step sizes

c5366e7

Restored ocen_timestep to resemble previous function handle

0f179b7

Split regular forward run and AD run into different functions

07b17ea

Added command line option for specifying AD run, preserves default fo…

88400ba

…rward run otherwise

Minor edits

6544fb5

Re-added enz-rev branch of KA, set up GPU gradient test to avoid @all…

45c9154

…owscalar or broadcasting in part that it differentiated

Updates

69d1cda

Issue with e2e GPU run without AD

23ad15e

Making GPU-friendly modifications

684b1ce

Set up current manifest without Enzyme

8c2113c

Fixed compilation issue, but now results are slightly incorrect

93e86d4

Working and accurate GPU code

5bee022

Back to GPU kernels working with AD

fea6761

Progressing through GPU AD

0801259

Progressing through GPU AD

d9a461b

Temporary fix for blocksize, running into bug with diagnostic variables

9f5791e

Made prognostic variables vectors of arrays instead of arrays

f0bfd94

End to end GPU AD for diagnostic variable computations

746dfd9

Working on tendency computations

195cc0f

Facing KA issue

e94e753

Enabled Normal Velocity Tendency on GPU

41e1dd6

Making dt a length-1 array to avoid known limitations of KA AD

6ad9396

jlk9 and others added 8 commits August 22, 2024 14:24

Rename EnzymeExt.jl to MPASEnzymeExt.jl

b686f22

Renamed EnzymeExt to MPASEnzymeExt

5909fe9

Added file connecting to mesh data and config's

61e22c6

Added data input to end2end test using Artifacts

bcbba73

Removed Downloads dependency

3c43bf1

Minor edits

9a6eede

Adjusted kernels to work for multiple vertical layers again

680984f

Adding CI

d0ecebf

jlk9 force-pushed the gpu-enzyme branch from e8e47e3 to d0ecebf Compare August 22, 2024 19:31

Editing config file

7c53a64

andrewdnolan reviewed Aug 22, 2024


michel2323 and others added 7 commits August 23, 2024 12:09

Add @test, forward, and CPU backend

b9fe989

Merge pull request #8 from jlk9/tests

270f6a4

Add @test, forward, and CPU backend

Updated dependency versions and Julia version, modified Enzyme operato…

36ebbb3

…r test since forward mode on GPU no longer errors but still produces incorrect result

Updated dependencies in test project...

bab2332

vec edge forward mode test now produces correct results

c62f21c

Only do include once in runtests.jl file.

6e7ffc5

In order to avoid redefinitions of variables/function warnings.

Add statement to deal with variable scope

f7d43e1

Remove unneeded python file

18f7910

andrewdnolan merged commit ec4681b into andrewdnolan:develop Sep 19, 2024
1 check failed
