Controlling the Migration Assistant Control Plane #1060
Conversation
Signed-off-by: Peter Nied <[email protected]>
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files

@@             Coverage Diff              @@
##              main    #1060       +/-  ##
============================================
+ Coverage    80.16%   87.67%    +7.51%
============================================
  Files          367       11      -356
  Lines        13743      284    -13459
  Branches       949        0      -949
============================================
- Hits         11017      249    -10768
+ Misses        2149       35     -2114
+ Partials       577        0      -577

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
}
# Discussion needed: Container will have a minimum number of
🧵 For Discussion
# Secrets from ARNs so this seems like it aligns well to do this
# at the same time.

# Discussion needed: Directly cleanup was being done by script based
🧵 For Discussion
.parse(args);

// Then override with settings
var config = getConfig(args);
🧵 For Discussion
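To make the layering in the snippet above concrete, here is a minimal sketch of one way "parse the args, then fold in settings" could work, with CLI flags taking precedence over file-sourced values; the actual precedence in the PR may differ, and MigrateArgs, MigrationConfig, and getConfig are hypothetical names used only for illustration.

```java
// Illustrative only: CLI flags win, config values fill the gaps.
// MigrateArgs, MigrationConfig, and getConfig are hypothetical names,
// not classes from this PR.
public class ArgsThenConfigSketch {
    static class MigrateArgs {
        String snapshotName;   // null unless a CLI flag set it
        Integer concurrency;   // null unless a CLI flag set it
    }

    static class MigrationConfig {
        String snapshotName = "default-snapshot"; // e.g. read from services.yaml
        int concurrency = 4;
    }

    static MigrationConfig getConfig(MigrateArgs args) {
        MigrationConfig config = new MigrationConfig();
        if (args.snapshotName != null) {
            config.snapshotName = args.snapshotName;
        }
        if (args.concurrency != null) {
            config.concurrency = args.concurrency;
        }
        return config;
    }
}
```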
else:
    logger.warning(f"Ignoring extra value {arg}, there was no command name before it")
i += 1
# Discussion Needed: Verification of the config should be done by |
🧵 For Discussion
SCHEMA = {
    "from_snapshot": FROM_SNAPSHOT_SCHEMA,
I think we might need to reorganize the layout of the existing services.yaml, e.g. the snapshot definition seems like it should be separate from the snapshot tool
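As a hedged illustration of that split, the snapshot definition (what the snapshot is) could be modeled apart from the snapshot tool's own settings (how the tool runs). All class and field names below are hypothetical, not the PR's actual schema.

```java
// Hypothetical sketch of keeping the snapshot definition separate from the
// snapshot tool's settings in the parsed model.
record SnapshotDefinition(String snapshotName, String repositoryUri) { }

record SnapshotToolSettings(int maxConcurrency, String otelEndpoint) { }

record ServicesModel(SnapshotDefinition snapshot, SnapshotToolSettings snapshotTool) { }
```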
Please don't merge this.
I'm worried that this creates a strong dependency between the migration console's global (union'ed) schema and every single one of our projects, forcing anyone embedding just one program to know about all of the others. For example, a formatting change within the migration console would require a recompile of the application layers. As they are today, the command line parameters are self-documenting. That creates a contract that's easy to understand.
The tax of maintaining those parameters is worth it because it keeps the tools much more accessible (think about the governance proxies that have been forked). This project has more interesting tasks ahead of it if we think about independent components rather than bundling/coupling them together.
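For contrast, here is a minimal JCommander sketch of the "self-documenting contract" point: each tool declares only the flags it needs, and the help output is generated from the annotations. The flag names are illustrative, and the import package may differ between JCommander versions (com.beust.jcommander in classic releases).

```java
import com.beust.jcommander.JCommander;
import com.beust.jcommander.Parameter;

// Illustrative sketch of a self-documenting, per-tool CLI contract.
public class MetadataArgsExample {
    @Parameter(names = {"--snapshot-name"}, description = "Name of the snapshot to read from", required = true)
    String snapshotName;

    @Parameter(names = {"--target-host"}, description = "Target cluster endpoint")
    String targetHost = "http://localhost:9200";

    public static void main(String[] argv) {
        MetadataArgsExample args = new MetadataArgsExample();
        JCommander jc = JCommander.newBuilder().addObject(args).build();
        jc.parse(argv);
        System.out.println("snapshot=" + args.snapshotName + " target=" + args.targetHost);
    }
}
```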
Signed-off-by: Peter Nied <[email protected]>
@@ -41,6 +41,7 @@ dependencies {

    // JCommander
    compileOnly group: 'org.jcommander', name: 'jcommander'
    implementation group: 'org.yaml', name: 'snakeyaml'
why do you need snakeyaml over jackson? They both have a healthy stream of CVEs associated with them, but AFAIK we don't currently have snakeyaml as a dependency, and we do have a dependency on jackson in pretty much every project. Can we keep our maintenance costs down and stick to only jackson for yaml needs too?
This is a prototype of what it would take to move in this direction; I quickly jumped to snakeyaml, but it makes sense to use jackson when we implement the change.
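For reference, a rough sketch of what reading a services.yaml fragment with Jackson's YAML module (jackson-dataformat-yaml) could look like; the ServicesYaml shape and field names below are hypothetical, not the actual schema.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.yaml.YAMLFactory;

import java.io.File;
import java.io.IOException;

// Hypothetical example of parsing a YAML config into a POJO with Jackson
// instead of snakeyaml's direct API.
public class JacksonYamlExample {
    public static class ServicesYaml {
        public SourceCluster source_cluster;
    }

    public static class SourceCluster {
        public String endpoint;
        public boolean allow_insecure;
    }

    public static ServicesYaml load(File servicesFile) throws IOException {
        ObjectMapper yamlMapper = new ObjectMapper(new YAMLFactory());
        return yamlMapper.readValue(servicesFile, ServicesYaml.class);
    }
}
```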
@@ -0,0 +1,7 @@
package org.opensearch.migrations.config; |
When I did the instrumentation work, I wanted to make sure that a change needed for one project wouldn't impact, in any way, any other project that didn't need to consume the change. At the day-to-day level, this means builds and build graphs that are as tight as possible. It also self-documents the extent to which data structures and patterns are consumed. If I see Backfill and Cluster, I'm going to presume that every single project that uses parameters will need them. In fact, though, some significant services don't - the proxy and cluster have no use for many of these and never will.
We have a single repo for this project, but I would pretend that a different person were the sole maintainer of each project. If that were the case, would this aggregation be worth the overhead?
I agree that any change in coreUtilities requires a rebuild. As I understand its purpose, though, it is where we put functionality shared between all components of our migration ecosystem. If the services.yaml is used by all the projects in the solution, including the java projects, the logical place to put this shared parsing code would be the coreUtilities project.
I believe the relationship is worthwhile to be sure that all components can read the same services.yaml definition in the project. How else would you have a shared definition, and where else should it go?
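One hedged way to keep that shared piece small would be a thin, schema-agnostic reader in coreUtilities that each tool calls with its own model class and that ignores keys it doesn't know about. The class name and placement below are illustrative assumptions, not what this PR implements.

```java
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.yaml.YAMLFactory;

import java.io.IOException;
import java.nio.file.Path;

// Hypothetical coreUtilities helper: it knows how to parse services.yaml
// but nothing about any individual tool's schema.
public final class ServicesYamlReader {
    private static final ObjectMapper YAML = new ObjectMapper(new YAMLFactory())
        .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);

    private ServicesYamlReader() { }

    public static <T> T read(Path servicesYaml, Class<T> modelClass) throws IOException {
        // Each tool deserializes only the slice it models; unknown keys are ignored.
        return YAML.readValue(servicesYaml.toFile(), modelClass);
    }
}
```

A tool would then call something like ServicesYamlReader.read(path, MetadataConfig.class) without ever linking against other tools' models.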
Signed-off-by: Peter Nied <[email protected]>
I'm continuing to try to wrap my head around this. From the diagrams, it looks like there are two concerns. The first is the separation of deployment vs runtime parameters. The second is how applications are configured at runtime.

For the first case, as part of the K8s work, we're looking to redesign the deployment/runtime experience and split deployment parameters (vpcIds, service names) from runtime parameters (scale values, transforms to use, etc.). We can also make it so that there are two tiers of deployment parameters, so that a change in the target endpoint, etc. doesn't force the rest of the resources (console, replayer, etc.) to be redeployed if not applicable. We can also determine what's best for the user: what should be deferred until runtime and be mutable, what is set at deployment time and immutable, and maybe what's set at deployment time but mutable. I see the migration console (the orchestrator from our architecture docs) managing all of this. Each application takes what it needs from that orchestrator, which understands every interface and how to stitch them together.

The second issue here is how to manage the datamodel for the migration console and the consistency between the orchestrator and its worker nodes. The proposal here is unappealing to me because it doesn't scale: every application needs to be aware of every other application's interfaces. Today, we have a yaml database (services.yaml) keeping the union of all the configurations, and the orchestrator uses that. If we want to make it so that that file can be passed around and unknown arguments are ignored, that's probably fine. But I wouldn't want to see that functionality within the application depend upon specific infrastructure, like requiring a shared fs or access to secrets manager, etc. That should go into the container definition for the application or outside of it.

If we want to make sure that the models are synchronized between the layers, we should build up model classes (like many web-services do) for each application and have the migration console take dependencies on all of those and concatenate them into one services.yaml spec. That eliminates the need for everybody knowing about everything and still gives the migration console a full view. I'd be happy to see us continue to use some shared libraries for things like transformation configuration, like what we're doing here. We can continue to extend the functionality of our parameter passing logic to be easier to share and to have more features - like being able to read from a file or command line parameters (there's an @file syntax in JCommander).
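To illustrate the per-application model idea in the last paragraph, each tool would own its own small config model and only the migration console would depend on all of them to assemble the full services.yaml view. All class names below are hypothetical.

```java
// Hypothetical sketch; in practice each class would live in its own project.

// Owned by the metadata-migration project:
class MetadataConfig {
    public String snapshotName;
}

// Owned by the replayer project:
class ReplayerConfig {
    public double speedupFactor = 1.0;
}

// Only the migration console / orchestrator depends on all of the above
// to compose the full services.yaml view:
class ConsoleServicesModel {
    public MetadataConfig metadataMigration;
    public ReplayerConfig replay;
}
```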
I think there's good intention and I think there's a path forward, but I also doubt that this is a top priority over cleaning up the overall deployment story (K8s), scaling our services, or putting extensible transformations in place so that users can do more. For the first point, I don't see how we can derive enough short-term benefit to warrant an investment here right now. For the second point, I haven't noticed a huge cost yet, and it feels like it's mostly where it needs to be right now.
This prototype served its purpose; we had a discussion with the team, and notes are captured in the JIRA: https://opensearch.atlassian.net/browse/MIGRATIONS-2075. Will follow up on this when it is on the critical path for future efforts.
Description
I've taken a stab at writing a design, making changes to get metadata migration pulling from the services.yaml, updating how the MA console handles tool arguments, and looking for edge cases such as RFS, where we do considerable work with parameters before the java process.
This was done quickly in the spirit of creating discussion around interesting areas.
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.