Currently, in the compose setup, if Galaxy fails to be started by supervisor, supervisor keeps retrying and the container never fails (as you would expect it to in a container orchestrator setup). So the only way for the administrator to notice the problem from the outside is to go and look at the logs, after realising that Galaxy has not shown up on the expected URL for several minutes.

What you expect in an orchestration environment is that when the main process of a container fails, the container goes down and is restarted again and again by the orchestrator until the offending conditions (e.g. a disk not being available, a networking problem, etc.) are resolved. But at least you see the failure from the orchestration environment.

This is probably partly due to the fact that we have supervisor in the middle; in a purely orchestrated solution, the orchestrator (Swarm, k8s, Mesos, etc.) becomes "your process manager". I presume that moving into 18.01 we will have the chance of detaching web processes and handlers from the same container, providing independent scaling ability for both components, and that should reduce the need for supervisor in the middle.
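One possible stopgap while supervisor stays in the middle is a supervisor event listener that asks supervisord to shut down once a watched program reaches the FATAL state, so the container exits and the orchestrator can see the failure. The following is only a minimal sketch of that idea, not something taken from the current setup. The program name `galaxy_web` and the function names are hypothetical. It assumes supervisord is the listener's parent process and that event payloads are plain ASCII.

```python
import os
import signal
import sys

# Hypothetical program name; replace with the name used in your supervisor config.
WATCHED = {"galaxy_web"}


def parse_tokens(line):
    """Parse a supervisor 'key:value key:value' line into a dict."""
    return dict(token.split(":", 1) for token in line.split())


def stop_supervisord():
    """Ask the parent process (assumed to be supervisord) to shut down."""
    os.kill(os.getppid(), signal.SIGTERM)


def run(stdin=sys.stdin, stdout=sys.stdout, on_fatal=stop_supervisord):
    """Acknowledge events until a watched program turns FATAL.

    Returns the name of the failed program, or None if stdin is closed first.
    """
    while True:
        # Tell supervisord we are ready for the next event.
        stdout.write("READY\n")
        stdout.flush()
        header_line = stdin.readline()
        if not header_line:
            return None  # supervisord closed our stdin
        header = parse_tokens(header_line)
        # 'len' is a byte count; this equals the character count for ASCII payloads.
        payload = parse_tokens(stdin.read(int(header["len"])))
        # Acknowledge the event so supervisord does not resend it.
        stdout.write("RESULT 2\nOK")
        stdout.flush()
        if (header.get("eventname") == "PROCESS_STATE_FATAL"
                and payload.get("processname") in WATCHED):
            on_fatal()
            return payload["processname"]


# supervisord sets SUPERVISOR_ENABLED for its children, so the loop only
# starts when the script actually runs under supervisor.
if os.environ.get("SUPERVISOR_ENABLED"):
    run()
```

The script would be registered in the supervisor configuration through an `[eventlistener:...]` section whose `command` points at it and whose `events` is set to `PROCESS_STATE_FATAL`. Whether the orchestrator then treats the exit as a failure depends on supervisord's exit code and on the restart policy in use, so this would need checking in the actual deployment.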
> I presume that moving into 18.01 we will have the chance of detaching web processes and handlers from the same container, providing independent scaling ability for both components, and that should reduce the need for supervisor in the middle.
Oh yes! 18.01 currently has other problems, but this is really planned for 18.05, if I get to it!