Setup docker-compose like k8s deployment #1054
Conversation
Signed-off-by: Greg Schohn <[email protected]>
There's a testObservability chart that will deploy jaeger, prometheus, and grafana - testing values are within the helmValues/localTesting/testObservability.yaml file. Signed-off-by: Greg Schohn <[email protected]>
… to be deployed hierarchically via helm packages Signed-off-by: Greg Schohn <[email protected]>
…to work as expected. Jaeger & Prometheus have been deployed, but I don't have a collector and haven't confirmed that they're behaving as expected. I still need to 1) set up kafka and flip the proxy to write to kafka; 2) set up the replayer; 3) set up otel-collectors; 4) set up RFS. Signed-off-by: Greg Schohn <[email protected]>
… package. Signed-off-by: Greg Schohn <[email protected]>
Signed-off-by: Greg Schohn <[email protected]>
Signed-off-by: Greg Schohn <[email protected]>
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@ Coverage Diff @@
##               main    #1054      +/-   ##
============================================
- Coverage     80.49%   80.48%    -0.02%
+ Complexity     3080     3077        -3
  Files           421      421
  Lines         15682    15683        +1
  Branches       1062     1062
============================================
- Hits          12624    12623        -1
+ Misses         2411     2410        -1
- Partials        647      650        +3

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
Signed-off-by: Tanner Lewis <[email protected]>
Signed-off-by: Tanner Lewis <[email protected]>
Signed-off-by: Tanner Lewis <[email protected]>
Signed-off-by: Tanner Lewis <[email protected]>
Signed-off-by: Tanner Lewis <[email protected]>
Signed-off-by: Tanner Lewis <[email protected]>
Signed-off-by: Tanner Lewis <[email protected]>
…ared library WIP Signed-off-by: Tanner Lewis <[email protected]>
Signed-off-by: Tanner Lewis <[email protected]>
Signed-off-by: Greg Schohn <[email protected]>
Signed-off-by: Tanner Lewis <[email protected]>
…nsearch-migrations into KubernetesExperimentation
…refactoring Signed-off-by: Tanner Lewis <[email protected]>
Signed-off-by: Tanner Lewis <[email protected]>
Signed-off-by: Tanner Lewis <[email protected]>
Signed-off-by: Greg Schohn <[email protected]>
Signed-off-by: Tanner Lewis <[email protected]>
Force-pushed from 763de17 to ad8be18
…agement for resources. Signed-off-by: Greg Schohn <[email protected]>
…ith the k8s config infrastructure Signed-off-by: Greg Schohn <[email protected]>
…onAssistant and see the replayer and console come up. I'm no longer setting the namespaces manually, but relying upon helm to do that w/ `helm install -n ma ...`.

The buildArgumentsBuilderScript can now handle positional arguments. It takes in a list of keys that should be used positionally instead of via flag. Those values can still be lists or scalar values. It's expected that the positional mappings are specified by the chart templates rather than the values. Environment variables are still set up and bound to snake-case variants of the parameter keys pulled in as values.

The MA chart now includes all of the charts that could be required, but installs them conditionally. Kafka charts have been moved into shared (for the base kafka chart) and components (for the traffic capture one) - I'll have more to do on integrating the kafka chart and operator, but this is just enough to hold progress on kafka work before really taking it on. The kafka topic chart is probably gone permanently - or for quite a while - since the proxy will auto-create the topic anyway.

Install the bulk loader w/ 0 replicas so that it won't run before a snapshot has been created. This puts it roughly where we are w/ a CDK deployment. The capture proxy also starts in a state w/ 0 replicas. The replayer DOES have replicas, but that will likely change down to 0 eventually too (and kafka will be loaded on-demand too). Some of those lazy features will require some more support in the migration console though, so we'll continue along w/ a simpler, if costlier, deployment until then. I'm setting the replica count to 0 for a number of services.

Signed-off-by: Greg Schohn <[email protected]>
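To make the positional-argument handling concrete, here is a rough template-side sketch; only `buildArgumentsBuilderScript` is named above, while the dict key and the value key used here are invented for illustration, so treat this as the shape of the call rather than its exact contract:

```yaml
# Hypothetical sketch: a component chart's template hands the shared helper the
# list of value keys to render positionally; every other key in
# .Values.parameters still becomes a "--kebab-case value" flag and is also
# bound to a snake-case environment variable.
{{- include "buildArgumentsBuilderScript" (merge . (dict
      "PositionalArguments" (list "snapshot-name"))) | nindent 12 }}
```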
Signed-off-by: Greg Schohn <[email protected]>
…nsearch-migrations into KubernetesExperimentation
Signed-off-by: Greg Schohn <[email protected]>
# Conflicts:
#   deployment/k8s/charts/aggregates/migrationAssistant/Chart.yaml
#   deployment/k8s/charts/aggregates/migrationAssistant/values.yaml
…c kafka broker, and console install cleanly. Run `helm install -n ma ma charts/aggregates/migrationAssistant` and wait for the kafka cluster to come up. Once it does, the replayer logs show that the replayer has joined the group. I still need to verify that the proxy can be instantiated and that e2e traffic can flow from a source to a target. Signed-off-by: Greg Schohn <[email protected]>
I assume this testObservability directory will be deleted and its functionality merged into MA
Done. Thanks. I've moved the one dashboard for grafana that was in it into the MA chart (and made it conditional)
Sounds like we could remove this entire localTesting directory since we are using default values from values.yaml for the given charts
Thanks, good point. I've pulled the remaining ones into the MA chart. I've been less and less concerned with deploying just one pod as I've been developing, and conditional installs on the top-level chart to deploy specific things might be easier for users and for us anyway.
@@ -105,19 +105,19 @@ public static class Parameters {

        @Parameter(
            required = false,
-           names = {REMOVE_AUTH_HEADER_VALUE_ARG },
+           names = {REMOVE_AUTH_HEADER_VALUE_ARG, "--removeAuthHeader" },
nit: A bit weird having static field variables as well as hardcoded values for the same pattern in these files
Yes - the static field is used to print out some error messages. It seemed like it would be even more complicated though to get the language agreements right if both aliases were included
Reminder from chat: Would be nice to have a README explaining the purpose/usage of each of these helper templates
spec:
  initContainers:
    {{- include "generic.setupEnvLoadInitContainer" (merge . (dict
        "MountName" $mountName
        "include" .Template.Include)) | nindent 8 }}
    - name: wait-for-kafka
      image: bitnami/kubectl:latest # or any image with curl/kubectl
      command: [ 'sh', '-c',
        'until kubectl get configmap kafka-config; do echo waiting for kafka config; sleep 1; done' ]
Reminder to think through what happens if the loaded environment changes while Kafka is still spinning up and our wait condition hasn't been met
I assume we will remove this
Thanks - this was just to support my development (contents are also in the console_link/README.md).
What are your intentions with the RBAC setup here? I was gonna hold on this for a bit
This was required because the default, which is "deny", meant those containers couldn't get access to the config maps.
deployment/k8/.gitignore
I assume this k8 directory will get dropped from this PR when it's ready. The README.md is likely worth copying over with some minor modification. The minikube stuff I have is maybe a more useful reference as we figure out how we want to test with minikube.
I remember that I need to change something in the readme or build scripts, but I don't remember what. Can you help remind me?
I believe I had at least one section in the README.md that was referencing the minikubeLocal.sh script that I had. Since we have kept that script for now, I think we are fine.
…nd in some cases more consistent. Signed-off-by: Greg Schohn <[email protected]>
Choose helm builtins for snake/kebab casing and put a condition on the grafana dashboard for the aggregate package. Signed-off-by: Greg Schohn <[email protected]>
Fixed a couple other issues along the way and found an issue that helm's merge function mutates the left dictionary, which was surprising and will require re-evaluation of every application. See helm/helm#13308 Signed-off-by: Greg Schohn <[email protected]>
Signed-off-by: Greg Schohn <[email protected]>
# Conflicts:
#   deployment/k8/migration-assistant/Chart.yaml
We also need a Pipfile.lock in this directory
:-). Yeah, the tests just told me that!
We should probably revert changes to this file and have our own services.yaml we feed in for now as I'm not sure what impact this will cause
WOW - thank you. I clearly didn't look at a merge that I took in carefully enough. I didn't even realize that this had happened.
…yet. Signed-off-by: Greg Schohn <[email protected]>
Force-pushed from 1653442 to 98d4fa3
Signed-off-by: Greg Schohn <[email protected]>
- name: wait-for-kafka
  image: bitnami/kubectl:latest # or any image with curl/kubectl
  command: [ 'sh', '-c',
    'until kubectl wait --for=condition=Ready kafka/captured-traffic -n {{.Release.Namespace }} --timeout=10s; do echo waiting for kafka cluster is ready; sleep 1; done' ]
If kafka is deployed in a different context won't this wait check always fail?
define different context? Generally, yes - Kafka will need to be installed in the same namespace and it will need to have that specific name. We'll want to improve this to support a bring-your-own-kafka at some point, but I don't know how big of a concern that is. I'd like to have somebody ask for that.
My main point here was that we probably need some way to provide the namespace that kafka is in; otherwise it seems like this will fail if it keeps checking its own namespace while kafka is in a different one.
"MountName" "merged-config" | ||
"include" .Template.Include) .) | nindent 8 }} | ||
containers: | ||
- name: console |
Can we rename to test-console?
done!
Thanks for getting this out for review @gregschohn, great work. Still digesting the k8 directory.
DocumentsFromSnapshotMigration/src/main/java/org/opensearch/migrations/RfsMigrateDocuments.java
TrafficCapture/dockerSolution/src/main/docker/migrationConsole/lib/console_link/services.yaml
TrafficCapture/dockerSolution/src/main/docker/k8sConfigMapUtilScripts/Dockerfile
While I've got outstanding comments, this is a great step forward. Thanks @gregschohn and @lewijacn!
TrafficCapture/dockerSolution/src/main/docker/k8sConfigMapUtilScripts/README.md
Let's add tests for this tool.
config_watcher will at the very least change significantly once we start to extend the migration console to work directly with k8s. At this point, config_watcher isn't really being tested (because we don't update config values at runtime). Once we do - if we're still in need of services yaml, we'll want to put some integ tests together - and unit tests on the specific contract that we're converging to will also make sense. At this point, the whole k8sConfigMapUtilScripts image should be considered an experimental bridge.
I'm OK if we don't have a good test story so long as we have a JIRA task that includes this testing as an AC. Can you link to one here?
...pture/dockerSolution/src/main/docker/k8sConfigMapUtilScripts/configmap2yaml/template.yaml.j2
deployment/k8s/README.md
The project structure is throwing me a little bit here. I'm not sure why we need .../k8s/charts/...; it seems like we could get away with a structure like the one below. Is there something I'm missing about what we put where?
deployment/
  -> charts/
       -> README.md
       -> {chart deployment scripts}
       -> aggregates/...
       -> components/...
       -> sharedResources/...
Charts are a helm construct. We'll also want to deploy EKS and AWS specific things for an EKS related deployment. We also might want tests, etc that are specific to k8s or helm. Everything w/in charts today is specific to helm. I can see it having siblings that are for k8s, but not specific to helm charts. Does that help?
I certainly agree we might have AWS / non-AWS specific functionality. I'm not sure I see the difference between k8s and helm, or to be more specific, I don't see k8s without helm.
When talking about support for other platforms my preference would be a single declaration for the canonical Migration Assistant. Similarly there would be one component that represents the Migration Console. Does this align with what you are thinking?
…/sprig Signed-off-by: Greg Schohn <[email protected]>
Description
Longer term, Kubernetes (K8s) will allow us
For this first cut...
Installing
The k8s directory contains helm charts that can be installed with `helm install charts/aggregate/migrationAssistant`. Optionally, `-n ma` can be added to put the helm installation and MA resources into the "ma" namespace (or whatever is specified). aggregate/mockCustomerClusters is a helm chart for testing purposes that brings up the resources external to the migration assistant that are required to facilitate testing all of the migration assistant functionality.

Those aggregate charts pull other helm charts into them, either charts local to this repo (charts/components/(captureProxy|migrationConsole...)) or 3rd-party charts (opensearch, elasticsearch, prometheus, etc.). Each chart defines default values in the values.yaml contained within the chart's directory, alongside the Chart.yaml manifest file. Those default values can be used to install a working version of the MA solution. Today's functionality is limited and buggy, but doing a reasonable demo install w/out providing any value overrides will be an ongoing requirement.
Before running helm install, you'll likely need to run `helm dependency build CHART_DIR`, which downloads or copies charts into the CHART_DIR/charts directory. There's a script (linkSubChartsToDependencies.sh), which still has some bugs, that attempts to do some of this manually w/ symlinks to local directories when possible. Symlinks are preferable to tarballs because you don't need to keep rebuilding the dependencies, which can be time consuming.
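For context on what `helm dependency build` consumes, an aggregate chart lists its sub-charts under `dependencies` in its Chart.yaml. The manifest below is only an illustrative sketch; the entries, versions, and repositories are examples rather than the repo's actual migrationAssistant manifest:

```yaml
# Illustrative Chart.yaml for an aggregate chart; names, versions, and
# repositories are examples, not the actual migrationAssistant manifest.
apiVersion: v2
name: migrationAssistant
version: 0.1.0
dependencies:
  - name: migrationConsole
    version: 0.1.0
    repository: file://../../components/migrationConsole   # local component chart
  - name: prometheus
    version: 25.x.x
    repository: https://prometheus-community.github.io/helm-charts
# "helm dependency build" downloads the remote charts (and packages or links
# the local ones) into this chart's charts/ directory before "helm install".
```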
Configurations
Command line parameters are configured via helm through the values (such as values.yaml). Here's the general shape of the contents for the bulkLoad deployment.
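The fragment below is an illustrative stand-in: the parameter names are invented for the example rather than taken from the actual chart, and the zero replica count mirrors the bulk loader starting scaled down until a snapshot has been created.

```yaml
# Hypothetical values fragment for the bulk loader; parameter names are
# invented to illustrate the pattern, not the chart's actual keys.
replicaCount: 0                  # stays at 0 until a snapshot has been created
parameters:
  snapshot-name: "rfs-snapshot"
  target-host: "https://opensearch-target:9200"
  target-insecure: true
# Each key under "parameters" is rendered as a "--kebab-case value" command
# line argument and is also bound to a snake-case environment variable.
```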
All of Migration Assistant's custom processes use the same paradigm to translate those yaml parameters into command line parameters. Shared helm charts (helmCommon) provide helper macros that allow consistent handling across our fleet of helm packages. To build the command lines, the helm charts create deployments, mostly via common code, that do the following:
1) Create config map stores for each of the parameters specified in the values.yaml file.
2) Load the values from those config maps as environment variables into an init container.
3) Have that init container run shell script commands to construct the arguments that will be passed into the main program; those arguments are written to a file (vars.sh) within a shared mount.
4) Have the main container load those variables before running the program.
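A condensed sketch of that flow follows, assuming simplified names: only vars.sh comes from the description above, while the config map, image, mount, and entrypoint names are placeholders.

```yaml
# Simplified pod sketch of steps 1-4 above; names marked "placeholder" are
# illustrative, not the charts' actual definitions.
spec:
  volumes:
    - name: shared-vars
      emptyDir: {}               # shared mount that carries vars.sh
  initContainers:
    - name: build-args
      image: busybox             # placeholder
      envFrom:
        - configMapRef:
            name: bulk-load-parameters   # placeholder; step 1 creates these
      # Step 3: turn the environment variables (step 2) into the program's
      # argument string and write it into the shared mount.
      command: ["sh", "-c", "echo \"ARGS='--snapshot-name $SNAPSHOT_NAME'\" > /shared/vars.sh"]
      volumeMounts:
        - name: shared-vars
          mountPath: /shared
  containers:
    - name: bulk-load
      image: migrations/reindex_from_snapshot     # placeholder
      # Step 4: source vars.sh and launch the program with those arguments.
      command: ["sh", "-c", ". /shared/vars.sh && exec ./runBulkLoad $ARGS"]   # entrypoint is a placeholder
      volumeMounts:
        - name: shared-vars
          mountPath: /shared
```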
The migration console has an extra init container that constructs the migration_services.yaml, which, like vars.sh, is written into a shared mount. However, since pods can't refresh environment variables when config maps update, and because we're bundling those values into a single services yaml file, we have a separate init container to maintain that file. That container uses a custom shell script that uses the k8s client to watch config maps and formats the configured values themselves as yaml.
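A simplified sketch of that extra container, assuming a sidecar-style init container (restartPolicy: Always) so it can keep watching after the main container starts; the image tag and the exact command invocation are placeholders, while config_watcher and the merged-config mount name are taken from elsewhere in this PR:

```yaml
# Simplified sketch; the image tag and the command invocation are placeholders.
initContainers:
  - name: services-yaml-watcher
    image: migrations/k8s_config_map_util_scripts   # placeholder tag for the utility image
    restartPolicy: Always        # sidecar-style: keeps running alongside the console
    # Watches the relevant config maps through the Kubernetes client and
    # rewrites the merged migration_services.yaml in the shared mount
    # whenever their values change.
    command: ["config_watcher", "--outfile", "/shared/migration_services.yaml"]
    volumeMounts:
      - name: merged-config
        mountPath: /shared
```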
Issues Resolved
https://opensearch.atlassian.net/browse/MIGRATIONS-2287
Testing
Manual testing only w/ minikube on my mac. See localTesting.sh for some rough examples.
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.