Summary
Currently, the PPND logic performs this sequence:
Update pipeline keeping it paused and wait for that to be reconciled
Run pipeline
For Step 2, I'd decided to be cautious and keep it paused because I didn't want the Numaflow Controller to scale up before the Vertices were updated; however, I can see that Numaflow itself follows this sequence:
Update, create, and delete all Vertex specs (if an error occurs, stop and return error)
Patch Vertex spec to scale up (setting replicas > 0)
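The ordering above can be sketched roughly as follows. This is only an illustration, not Numaflow's actual code: `Vertex`, `applySpecs`, `reconcile`, and `failOn` are hypothetical names, and the real controller works against the Kubernetes API rather than an in-memory map.

```go
package main

import "fmt"

// Vertex is a hypothetical, simplified stand-in for a Numaflow Vertex object.
type Vertex struct {
	Name     string
	Replicas int
}

// applySpecs brings the current set of vertices in line with the desired names:
// it updates or creates each desired vertex, then deletes the ones that are no
// longer wanted. It stops and returns on the first error.
func applySpecs(current map[string]*Vertex, desired []string, apply func(name string) error) error {
	want := make(map[string]bool, len(desired))
	for _, name := range desired {
		want[name] = true
		if err := apply(name); err != nil {
			return fmt.Errorf("apply vertex %q: %w", name, err)
		}
		if _, ok := current[name]; !ok {
			current[name] = &Vertex{Name: name} // created with zero replicas
		}
	}
	for name := range current {
		if !want[name] {
			delete(current, name)
		}
	}
	return nil
}

// reconcile applies all spec changes first and only then scales up.
func reconcile(current map[string]*Vertex, desired []string, apply func(name string) error) error {
	if err := applySpecs(current, desired, apply); err != nil {
		return err // nothing has been scaled up at this point
	}
	for _, v := range current {
		v.Replicas = 1 // stands in for the patch that sets replicas > 0
	}
	return nil
}

// failOn returns a fake apply function that fails for one vertex name.
// It exists only to exercise the error path in this sketch.
func failOn(bad string) func(string) error {
	return func(name string) error {
		if name == bad {
			return fmt.Errorf("spec for %q rejected", name)
		}
		return nil
	}
}
```

If that ordering holds, an error while applying specs leaves every vertex at zero replicas, which is the guarantee the extra paused step was meant to provide.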
Motivation
When we perform a topology change on the Pipeline, the Pipeline Controller creates a Job that creates the new buffers and buckets. The daemon and the vertex Pods then restart with an init container that waits for those buffers and buckets to be created.
When we issue desiredPhase=Paused (even if the Pipeline is already paused), Numaflow tries to connect to the daemon to determine the number of pending messages. If this happens while the Job is being created and the daemon is still waiting for the buffers to exist, Numaflow ends up waiting for the daemon. Theoretically, everything should work in due time, but there seem to be some issues.
Message from the maintainers:
If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.
Just talked to Derek and Sidhant about this issue. Instead of doing that, and since there is a real use case regardless (a user may actually want desiredPhase=Paused while updating the topology), Sidhant will update the Numaflow Controller to not try to contact the daemon while the Pipeline is paused, to see if that alleviates some of the issues.
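A minimal sketch of that guard, assuming a simplified phase model. `Phase`, `reconcilePause`, and the `pending` callback are hypothetical names for illustration; the real controller change may look quite different.

```go
package main

// Phase is a simplified, hypothetical model of the Pipeline phase.
type Phase string

const (
	PhaseRunning Phase = "Running"
	PhasePausing Phase = "Pausing"
	PhasePaused  Phase = "Paused"
)

// reconcilePause decides the next phase when desiredPhase is evaluated.
// pending stands in for the daemon call that reports pending messages.
func reconcilePause(current, desired Phase, pending func() (int64, error)) (Phase, error) {
	if desired != PhasePaused {
		return current, nil
	}
	if current == PhasePaused {
		// Already paused: skip the daemon entirely, so a daemon that is
		// still waiting on buffers cannot block this reconcile.
		return PhasePaused, nil
	}
	n, err := pending()
	if err != nil {
		return current, err
	}
	if n > 0 {
		return PhasePausing, nil // still draining
	}
	return PhasePaused, nil
}
```

With this shape, re-issuing desiredPhase=Paused during a topology change never touches the daemon, while the first pause from Running still checks for pending messages.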