Skip to content

Remove DeviceIntervalTopology #2343

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 7 commits into
base: main
Choose a base branch
from
Draft

Remove DeviceIntervalTopology #2343

wants to merge 7 commits into from

Conversation

charleskawczynski
Copy link
Member

This PR removes the DeviceIntervalTopology, and instead adapts IntervalTopology to the GPU. This allows us to compute things like z_min / z_max on the device (which needs the vertical mesh, which the DeviceIntervalTopology does not have).

Consequently, I've updated the columnwise test to include rayleigh_sponge_tendency_uₕ, which includes that computation.

@charleskawczynski
Copy link
Member Author

Trying on ClimaAtmos here

@charleskawczynski charleskawczynski force-pushed the ck/fixes branch 3 times, most recently from fcd312c to 765bb1a Compare June 3, 2025 21:45
@charleskawczynski
Copy link
Member Author

Ok, this is now working in ClimaCore CI, and ClimaAtmos.

As this PR developed, I came to better understand the trade-offs. On the one hand, this will allow users to write things like

function rayleigh_sponge_tendency_uₕ(ᶜuₕ, s)
    s isa Nothing && return NullBroadcasted()
    ᶜz = Fields.coordinate_field(axes(ᶜuₕ)).z
    ᶠz = Fields.coordinate_field(Spaces.face_space(axes(ᶜuₕ))).z
    zmax = Spaces.z_max(axes(ᶠz))
    return @. lazy(-β_rayleigh_uₕ(s, ᶜz, zmax) * ᶜuₕ)
end

On the other hand, more operations on vertical meshes need to be gpu compatible if we make the FiniteDifferenceSpace constructor adapt to the given device, which is a kind of annoying set of requirements to satisfy (it's much easier to write code that operates on CPU data structures compared to GPU ones). In addition, the way I handled this at the moment is to adapt intermediate things to the CPU, which is not ideal, and normally I wouldn't do this, but the context was relevant: the data is already sent to the host with the explicit Array call (look for ArrayType(vertical_interpolation_weights(cpu_space, target_zcoords))). For InputsOutputs, there might be a way to do this somewhat efficiently, but that too showed that there is an explicit use of Array, so an explicit sync doesn't seem too bad there.

The alternative is that we tell users: you must put z_min / z_max in the cache, and instead write this function as:

function rayleigh_sponge_tendency_uₕ(ᶜuₕ, s, p)
    s isa Nothing && return NullBroadcasted()
    (; zmax) = p
    ᶜz = Fields.coordinate_field(axes(ᶜuₕ)).z
    return @. lazy(-β_rayleigh_uₕ(s, ᶜz, zmax) * ᶜuₕ)
end

(the same will go for any variables in the vertical mesh). I'm torn. I almost think that the latter is just easier, and handle this on a case-by-case basis. The most I've seen so far is needing z_min / z_max. @imreddyTeja, what do you think?

@imreddyTeja
Copy link
Member

It seems like from a user POV, making FiniteDifferenceSpace adapt to the device would be simpler (I'm assuming any operations on vertical meshes are defined in ClimaCore). Moving z_min and z_max to the cache means the user must know that the vertical mesh if available on host but not device.

From a developer POV, moving what's needed on device to the cache seems much easier.

I'm not sure how many operations need to be made gpu compatible, but maybe this would be easier with Denis's ClimaComms.@threaded macro PR

@charleskawczynski charleskawczynski marked this pull request as draft June 18, 2025 11:20
@charleskawczynski
Copy link
Member Author

I think storing zmin/zmax in the cache is just much simpler. I’ll leave this PR open in a draft in case someone later finds that this is useful to revive.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants