Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Argo CD support needs some thought:- ec_service, enabled and removed flags are confusing and don't quite work #11

Open
gilesknap opened this issue Sep 6, 2024 · 11 comments

Comments

@gilesknap
Copy link
Member

gilesknap commented Sep 6, 2024

We have a number of issues with the current state of our deployment repo.

  • ec_service is a flag that tells us if ec will show this service - this means not all services are managed by ec but it is currently required to flag services that support reading global.enabled to scale down to 0.
  • as pointed out above enabled only works in services that support it.
  • enabled overrides are scattered around in the individual apps so that there is not one place to look for them
  • enabled overrides in sub apps have been demonstrated to disappear under some root app refresh scenarios
  • removed is a little confusing, appears to overlap with enabled and gives us no way to monitor what stopped services there are (without looking in the repo)

I propose the following for review:-

  • we drop ec_service and removed and only support enabled
  • we support enabled as a flag to each service in the apps.yaml file which defaults to true
  • we contrive to make enabled show up in the list of parameters to the root app so you may if you wish override them with the GUI
  • when enabled=false the app is not rendered in the apps helm template so the whole app disappears from the cluster
  • we make the ec tool aware of the git repo p47-deployment and have it parse the values.yaml
  • this means ec can show ALL services and show them as STOPPED when enable=false
    • it means that the argo cd gui does not show stopped services but I think that is OK
    • in exchange, ec is restored to being useful for managing all of the services in the services repo

We would still use the argo cd cli for the rest of the commands - just add the git repo for the list of stopped services.

BUT: now ec has access to that repo I see no reason not to add write capability to that repo so that we can restore the deploy and delete functions.

@coretl @marcelldls @DiamondJoseph @ZohebShaikh Please have a look at this and let me know your thoughts. Thanks.

@gilesknap
Copy link
Member Author

gilesknap commented Sep 6, 2024

@marcelldls regarding the appearance and override of enabled flag in the root app. I believe I had it working at some point as described in the 1st 3 'proposed' bullets above so happy to try and reproduce that.

@DiamondJoseph
Copy link
Contributor

I think DAQ's use case is to only make changes to ixx-deployment when adding a new service. Our services should always point to ixx-services main, and that repository alone can represent the state of the services. Each of our charts will have an enabled in their values.yaml and we manage it there. Just trying to have as little bespokeness as possible.

@marcelldls
Copy link

@gilesknap Your proposal is that any services that use our repos must implement a global.enabled?

@gilesknap
Copy link
Member Author

@gilesknap Your proposal is that any services that use our repos must implement a global.enabled?

No. The proposal is the enabled is understood by apps/template to mean don't render that app into the sub-apps manifest, so argocd will remove it. However because ec looks at the repo it can see and show the stopped service.

@gilesknap
Copy link
Member Author

So this means any service can be stopped (removed from the cluster) by setting enabled=false and control of stopping / starting services is entirely controlled centrally.
@DiamondJoseph if you want the deployment repo to represent the state of the cluster why not let it do the starting and stopping - your approach is only needed if you really want to have the other resources deployed but scale to zero, I believe.

@marcelldls
Copy link

@gilesknap Your proposal is that any services that use our repos must implement a global.enabled?

No. The proposal is the enabled is understood by apps/template to mean don't render that app into the sub-apps manifest, so argocd will remove it. However because ec looks at the repo it can see and show the stopped service.

I see. This is already what removed does? Your proposal is to get rid of enabled, ec-service? In short, we drop scaling?

@gilesknap
Copy link
Member Author

Yes we drop scaling as its a chart specific feature and I'd like to support any chart.

@marcelldls
Copy link

Yes we drop scaling as its a chart specific feature and I'd like to support any chart.

Im happy with the proposed change - the iteration before what we have now the replica count would be set by Override root app > Override app > app values > app chart default > service values > root service values > service chart default which was even more "scattered" then now.

The impact is pvcs must be outside of your service chart? And start up performance?

Can we revert the global in ec-services helm chart for a fixed replica count?

@coretl
Copy link

coretl commented Sep 6, 2024

  • we make the ec tool aware of the git repo p47-deployment and have it parse the values.yaml

Does it need to know about git? Couldn't it get this info from the argo cli? If it knew from argo cli what the enabled value is, and if it is overridden, then this would tell it the difference between:

  • Enabled
  • Temporarily Enabled
  • Disabled
  • Temporarily Disabled

@gilesknap
Copy link
Member Author

gilesknap commented Sep 6, 2024

This seems like a fine idea. But.

  • I can't find any way to get the value of a parameter from the CLI only to set an override for one
  • There is still the question of enabling 'deploy' from ec

@marcelldls you know the CLI better than I, can you get a parameter value?

re the second point. I'm struggling with the sense that ec is crippled and the workflow is more laborious when you switch to argocd. Particularly noticeable when I'm intending to write a helm based tutorial first and then layer an argo cd one on top. It feels like you have hit the sweet spot with helm and then gone too far with argo cd.

If we give ec / argo cd parity of function with ec / helm, then:

  • we do require access to the git repo (including write access for people with permissions)
  • we have a quick and less error prone way to update the deployment repo
  • we still retain the features of argo cd brings to the mix
    • complete record of the state of the beamline
    • version control managed explicitly by a FOSS tool instead of inside our ec helm backend.
    • nice UI which can be used for overrides as well as ec
    • completely in control using just the deployment repo if you prefer not to use ec

@coretl
Copy link

coretl commented Sep 9, 2024

re the second point. I'm struggling with the sense that ec is crippled and the workflow is more laborious when you switch to argocd. Particularly noticeable when I'm intending to write a helm based tutorial first and then layer an argo cd one on top. It feels like you have hit the sweet spot with helm and then gone too far with argo cd.

Good point, we like the synergy with configure-ioc, which would mean needing ec to be able to write to the p47-deployment git repo to upgrade/downgrade versions from the commanline

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants