-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Spike: Investigate client aggregation methods #225
Comments
I've thought about this quite a a bit and it's a problem that I think is best handled on the model simulation side. Most model grids (especially the ones within the climate domain) do not necessarily lend themselves to a general solution and require knowledge of the specific grid topology, e.g. the tri-polar, cubed-sphere, and finite element-like grids. Even if you knew that a grid fell under one of these general types, you still likely need information that is unique to a given implementation. Lastly, I think in most cases it will likely be more performant to do it on the model side since most models which some kind of parallel decomposition already have optimized routines to do so. One case that might be useful to handle is that of a rectangular topology, which are often used in idealized models and LES. That maps directly onto a normal n-dimensional tensor and we could setup a specific dataset structure to ensure that all the needed metadata exists for a reconstruction. Perhaps more useful than a client aggregation of the raw model data is a way to generally treat the data as a scattered data and interpolate to a given grid (e.g. |
Just to parachute an item of relevance from #228 I agree with @Spartee that it's a good idea to have a base method that implements a good parallel strategy for retrieving the same field(s) that have been distributed.
That question might then be left to the SmartRedis model-specific extension |
Thanks. I 100% agree with the motivation of this issue, but I agree it might be tricky to find a naming scheme "rank_*" that works for all use-cases. Maybe exposing some lower-level redis data structures would leave the door open for folks to choose their own strategy. Is there any fortran redis client supporting the basic commands like GET/SET/LPUSH/LPOP etc? |
That's a great point and definitely something we are considering. Utilizing something like a redis list to store a collection of keys to be aggregated might be easier than relying on a naming convention. We currently don't have direct support for other redis data types (e.g. LPUSH/LPOP/etc), but that's something we can easily build into existing client functionality or expose to the user. |
Description
In some cases, a global domain of a simulation or workload is needed for analysis or training. When using the SmartRedis clients in an MPI application, each rank sends it's own data to the
Orchestrator
(redis/keydb) database. a method,Client.aggregate
could be very useful in combining data back into a global field.Justification
Online analysis is a primary use case here. For example, let's say we are running a weather simulation and would like to view a global snapshot of the domain every 50 timesteps. currently, one must create a helper function to retrieve data from all the ranks that sent data to the database and recombine it into a global field before it can be analyized.
Also, future methods like
to_xarray
become much more useful if a domain can be aggregated first.Implementation Strategy
This ticket is meant to provide a place for feedback on this issue. The result of this ticket is a design document that will be shared with the community to collect feedback.
I've been ideating on this however, and I'm thinking something like the following method
This example would collect all the tensors in the database the start with
hello_world_rank_
. Essentially this is like glob syntax for specifying which tensors to retrieve.Open questions:
DataSet
object to make this easier for users?It's important that when we design this method, we keep in mind that the idea is to be able to quickly convert to something like an xarray dataset.
for example:
Acceptance Criteria
Client.aggregate
Tagging some people who I think would provide good feedback on this issue.
@ashao @rabernat @mellis13 @nbren12
The text was updated successfully, but these errors were encountered: