Simulation Model-specific extensions #228
Comments
I like the idea of having model-by-model extensions, but I have a few worries.
Say, for instance, we made an extension for SmartRedis in MOM6 or NEMO: would we have to find some way to add those models to a CI? Would they live in the MOM6 source code? I would like to keep what we can in a place where we have nightly integration and unit tests that don't require building a simulation code. We have a similar deal with the smartsim-lammps repo, where we implemented a "dump" (I/O) style with SmartRedis. We've been wanting to contribute that back upstream for a while now, but we wanted to make sure we had the ability to test it regularly. One thought was to build it into a container for ease of testing. I'm already thinking that we could use the
Would there be a need to implement such features in languages other than Python? I was imagining that this feature would live on the Python side, as that's where we expect users to do analysis. Would there need to be helpers in the compiled languages as well? I tend to think that the work in #225 should be general enough to serve as a base for implementing some of these more "tightly integrated" solutions. The biggest reason is that we can write performant code to pull the data with multiprocessing, or possibly even MPI. I don't want to expect users to be able to write performant multiprocessing extensions that use shared memory or locking primitives on shared data structures. It's my thought that the
Another reason I think a base method to build on is important: we can provide documentation and examples for this method where we use it to create something simulation/workload-specific.
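For what it's worth, here is a rough sketch of what that kind of base gather could look like on the Python side, pulling one tensor from each producer's key prefix in a `multiprocessing` pool instead of a sequential loop. The `gather`/`_fetch` names and the `Client` constructor arguments are assumptions for illustration, not an existing SmartRedis API:

```python
# Hypothetical sketch (not part of SmartRedis): fetch a tensor from every
# producer key prefix in a worker pool so the network reads happen in parallel.
from multiprocessing import Pool

from smartredis import Client


def _fetch(args):
    """Worker: open a client, point it at one producer's keys, pull the tensor."""
    prefix, tensor_name = args
    client = Client(cluster=False)   # constructor args assumed; adjust to your deployment
    client.set_data_source(prefix)   # read keys written under this producer's prefix
    return prefix, client.get_tensor(tensor_name)


def gather(prefixes, tensor_name, n_workers=8):
    """Base 'gather' that a model-specific extension could build on.

    Returns {prefix: numpy_array} for every requested producer prefix.
    """
    with Pool(processes=n_workers) as pool:
        results = pool.map(_fetch, [(p, tensor_name) for p in prefixes])
    return dict(results)
```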
100% agree that this is a tricky case, since the CI strategy might be different for projects which are distributed across a number of centers. If we held the client extensions within CrayLabs rather than in the model code, we could probably get away with a simple unit test: store the grid for a given model and make sure we can always reconstruct it. Since the definition of the grid is such a fundamental part of most models, it doesn't change THAT often, so we'd just have to keep track of when they might do a major refactor.
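A minimal sketch of what that grid round-trip test could look like, assuming a hypothetical `reconstruct_grid` helper in the model-side extension and a reference grid file stored alongside the tests:

```python
# Hypothetical unit-test sketch: the module, helper, and reference file names
# below are placeholders, not an existing API.
import numpy as np

from mom6_smartredis_extension import reconstruct_grid  # hypothetical extension module


def test_grid_roundtrip():
    # Reference grid captured once from the model and stored with the tests;
    # it only needs updating when the model does a major grid refactor.
    expected = np.load("tests/data/mom6_reference_grid.npz")["lat_lon"]

    # The model-specific extension should always be able to rebuild the same
    # grid from the metadata it posts to the database.
    reconstructed = reconstruct_grid("tests/data/mom6_reference_grid.npz")

    np.testing.assert_allclose(reconstructed, expected)
```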
I agree. I think that base method could just return a list/dictionary of all the requested objects that might have been posted for a given subset of clients (be it a decomposition within a single simulation or across the entire ensemble). The model-side dev would be responsible for constructing something useful from that, but I agree that we're best situated to make that gather performant on our side. We already know that doing the naive thing of looping over all the possible key prefixes is VERY slow, and I think we can expose some basic functionality to help parallelize the gather.
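As a hypothetical example of what a model-side dev might do with that returned mapping, stitching per-rank sub-domains back into a global field (the tile layout and prefix ordering here are purely illustrative):

```python
# Hypothetical model-side consumer of the base gather's {prefix: array} result.
import numpy as np


def assemble_global_field(gathered, ranks_per_row):
    """Concatenate per-rank tiles (keyed by client prefix) into one 2D array."""
    tiles = [gathered[p] for p in sorted(gathered)]   # assumes prefixes sort by rank
    rows = [np.hstack(tiles[i:i + ranks_per_row])
            for i in range(0, len(tiles), ranks_per_row)]
    return np.vstack(rows)
```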
Description
For many simulation models with a large user base, we could think about organizing a set of SmartRedis extensions that aggregate the operations that are both necessary and common for moving data into and out of the model. This is a slightly more model-by-model approach than the more general #225 solution proposed by @Spartee, i.e. we could leave `aggregate` as an abstract method intended to be populated by the extension of `Client`. This might be a more realistic approach than trying to derive a general solution. Instead, model developers could implement these methods from their better-informed point of view.
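As a rough illustration of that shape, not a committed design: `ModelClient`, `MOM6Client`, and the recombination logic below are hypothetical, and the underlying `set_data_source`/`get_tensor` calls are just the existing SmartRedis client methods.

```python
# Hypothetical sketch of an abstract `aggregate` on a thin Client extension,
# with a model-specific subclass filling it in.
import numpy as np
from smartredis import Client


class ModelClient(Client):
    """Generic extension point; the SmartRedis side would own this layer."""

    def aggregate(self, name, prefixes):
        """Collect `name` from every producer prefix and combine it."""
        raise NotImplementedError("model extensions must implement aggregate()")


class MOM6Client(ModelClient):
    """Model-specific extension, maintained close to the model developers."""

    def aggregate(self, name, prefixes):
        pieces = []
        for prefix in prefixes:
            self.set_data_source(prefix)
            pieces.append(self.get_tensor(name))
        # Model-specific recombination, e.g. respecting MOM6's domain decomposition
        return np.concatenate(pieces)
```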
Justification
Users of a simulation model will need to perform common tasks that may require model-specific solutions.
Implementation Strategy
This is more of a high-level design question whose solution should answer the following: