Hi,

I've encountered an issue where evaluating a model on the GPU with CUDA appears to leave memory allocations that are never cleared by garbage collection. Here is an MWE that reproduces the behavior:
using CUDA
using Lux, LuxCUDA

const gdev = Lux.gpu_device()

model = Chain(Dense(2, 128), [Dense(128, 128) for _ in 1:10]..., Dense(128, 2))
ps, st = Lux.setup(Lux.Random.default_rng(), model)
ps = ps |> gdev
st = st |> gdev

x = CUDA.rand(Float32, 2)

function eval_model(model, x, ps, st)
    for _ in 1:500000
        model(x, ps, st)
    end
end

eval_model(model, x, ps, st)
After the call to eval_model finishes, around 3.4 GB of GPU memory remain allocated and are not freed by garbage collection.
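One way to inspect the leftover allocations after the run, as a minimal sketch using CUDA.jl's standard pool utilities:

# Run after eval_model returns: force a full collection, ask CUDA.jl
# to hand cached pool memory back to the driver, then print usage.
GC.gc(true)           # full (non-incremental) garbage collection
CUDA.reclaim()        # release cached memory from CUDA.jl's pool
CUDA.memory_status()  # in my runs this still reports the ~3.4 GB as allocated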
The model in the MWE has a fairly large number of parameters so that the leak becomes visible quickly, but it should also occur with networks of any size.
This was tested on Windows and Ubuntu with Julia 1.10.8 and the following package versions:
[052768ef] CUDA v5.6.1
[b2108857] Lux v1.6.0
[d0bbae9a] LuxCUDA v0.3.3
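In case it helps with reproducing the setup, the list above can be regenerated with Pkg; something like:

using Pkg
# Print the installed versions of the three packages involved in the MWE.
Pkg.status("CUDA", "Lux", "LuxCUDA")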
Thanks in advance!