Implement rollback for script run as pre defined stage #4743

Merged
41 changes: 15 additions & 26 deletions docs/rfcs/0011-script-run-stage.md
@@ -90,34 +90,23 @@ spec:
- "curl -X POST -H 'Content-type: application/json' --data '{"text":"failed to deploy: rollback"}' $SLACK_WEBHOOK_URL"
```

**SCRIPT_SYNC stage also rollbacks** when the deployment status is `DeploymentStatus_DEPLOYMENT_CANCELLED` or `DeploymentStatus_DEPLOYMENT_FAILURE` even though other rollback stage is also executed.
**SCRIPT_RUN stages are also rolled back.** The rollback commands are executed for the SCRIPT_RUN stages that had run up to the point where the deployment was canceled or failed.
When there are multiple SCRIPT_RUN stages to be rolled back, they are rolled back in the same order as the SCRIPT_RUN stages appear in the pipeline.

For example, here is a deploy pipeline combined with other k8s stages.
When the result status of the pipeline is FAIL or CANCELED, piped rolls back the stages `K8S_CANARY_ROLLOUT`, `K8S_PRIMARY_ROLLOUT`, and `SCRIPT_RUN`.
For example, consider when deployment proceeds in the following order from 1 to 7.

1. K8S_CANARY_ROLLOUT
2. WAIT
3. SCRIPT_RUN
4. K8S_PRIMARY_ROLLOUT
5. SCRIPT_RUN
6. K8S_CANARY_CLEAN
7. SCRIPT_RUN

Then
- If 4 is canceled or fails while running, only SCRIPT_RUN of 3 will be rolled back.
- If 6 is canceled or fails while running, only SCRIPT_RUNs 3 and 5 will be rolled back.
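
The selection rule in the example above can be sketched in a few lines of Go. This is only an illustration of the rule as the RFC text states it, not the piped implementation: the `stage` type, its `Started` field, and both helper functions are hypothetical names introduced here.

```go
package main

import "fmt"

// stage is a hypothetical, simplified view of a pipeline stage.
type stage struct {
	Name    string
	Started bool // true if the stage ran, or began running, before the deployment stopped
}

// scriptRunsToRollback returns the 1-based positions of the SCRIPT_RUN stages
// that had already started, in pipeline order.
func scriptRunsToRollback(stages []stage) []int {
	var out []int
	for i, s := range stages {
		if s.Name == "SCRIPT_RUN" && s.Started {
			out = append(out, i+1)
		}
	}
	return out
}

// pipelineStoppedAt builds the seven-stage example pipeline and marks
// stages 1..stoppedAt as started.
func pipelineStoppedAt(stoppedAt int) []stage {
	names := []string{
		"K8S_CANARY_ROLLOUT", "WAIT", "SCRIPT_RUN", "K8S_PRIMARY_ROLLOUT",
		"SCRIPT_RUN", "K8S_CANARY_CLEAN", "SCRIPT_RUN",
	}
	stages := make([]stage, len(names))
	for i, n := range names {
		stages[i] = stage{Name: n, Started: i+1 <= stoppedAt}
	}
	return stages
}

func main() {
	fmt.Println(scriptRunsToRollback(pipelineStoppedAt(4))) // [3]
	fmt.Println(scriptRunsToRollback(pipelineStoppedAt(6))) // [3 5]
}
```

The SCRIPT_RUN at position 7 never appears in either result because it had not started when the deployment stopped.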

```yaml
apiVersion: pipecd.dev/v1beta1
kind: KubernetesApp
spec:
  pipeline:
    stages:
      - name: K8S_CANARY_ROLLOUT
        with:
          replicas: 10%
      - name: WAIT_APPROVAL
        with:
          timeout: 30m
      - name: K8S_PRIMARY_ROLLOUT
      - name: K8S_CANARY_CLEAN
      - name: SCRIPT_RUN
        with:
          env:
            SLACK_WEBHOOK_URL: ""
          runs:
            - "curl -X POST -H 'Content-type: application/json' --data '{"text":"successfully deployed!!"}' $SLACK_WEBHOOK_URL"
          onRollback:
            - "curl -X POST -H 'Content-type: application/json' --data '{"text":"failed to deploy: rollback"}' $SLACK_WEBHOOK_URL"
```

## prepare environment for execution

67 changes: 44 additions & 23 deletions pkg/app/piped/controller/scheduler.go
@@ -369,34 +369,37 @@ func (s *scheduler) Run(ctx context.Context) error {
// we start rollback stage if the auto-rollback option is true.
if deploymentStatus == model.DeploymentStatus_DEPLOYMENT_CANCELLED ||
deploymentStatus == model.DeploymentStatus_DEPLOYMENT_FAILURE {
if stage, ok := s.deployment.FindRollbackStage(); ok {

if rollbackStages, ok := s.deployment.FindRollbackStages(); ok {
// Update to change deployment status to ROLLING_BACK.
if err := s.reportDeploymentStatusChanged(ctx, model.DeploymentStatus_DEPLOYMENT_ROLLING_BACK, statusReason); err != nil {
return err
}

// Start running rollback stage.
var (
sig, handler = executor.NewStopSignal()
doneCh = make(chan struct{})
)
go func() {
rbs := *stage
rbs.Requires = []string{lastStage.Id}
s.executeStage(sig, rbs, func(in executor.Input) (executor.Executor, bool) {
return s.executorRegistry.RollbackExecutor(s.deployment.Kind, in)
})
close(doneCh)
}()

select {
case <-ctx.Done():
handler.Terminate()
<-doneCh
return nil

case <-doneCh:
break
for _, stage := range rollbackStages {
// Start running rollback stage.
var (
sig, handler = executor.NewStopSignal()
doneCh = make(chan struct{})
)
go func() {
rbs := *stage
rbs.Requires = []string{lastStage.Id}
s.executeStage(sig, rbs, func(in executor.Input) (executor.Executor, bool) {
return s.executorRegistry.RollbackExecutor(s.deployment.Kind, in)
})
close(doneCh)
}()

select {
case <-ctx.Done():
handler.Terminate()
<-doneCh
return nil

case <-doneCh:
break
}
Comment on lines +372 to +402
Member Author

📝 Changed the scheduler to execute multiple kinds of rollback stages.
Each rollback stage runs with the same logic as before.

}
}
}
@@ -433,6 +436,24 @@ func (s *scheduler) executeStage(sig executor.StopSignal, ps model.PipelineStage
lp.Complete(time.Minute)
}()

// Check whether to execute the script rollback stage or not.
// If the base stage is executed, the script rollback stage will be executed.
if ps.Name == model.StageScriptRunRollback.String() {
baseStageID := ps.Metadata["baseStageID"]
if baseStageID == "" {
return
}

baseStageStatus, ok := s.stageStatuses[baseStageID]
if !ok {
return
}

if baseStageStatus == model.StageStatus_STAGE_NOT_STARTED_YET || baseStageStatus == model.StageStatus_STAGE_SKIPPED {
return
}
}
Comment on lines +441 to +455
Member Author

📝 Check the status of the base script run stage to decide whether to roll it back.
Only script run stages that have already been executed should be rolled back.
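
The check described in this comment can be sketched in isolation as follows. It mirrors the three early returns in the diff, but the `stageStatus` type, its constants, and `shouldRunScriptRollback` are hypothetical stand-ins for the `model` types, introduced only for illustration.

```go
package main

import "fmt"

// stageStatus is a hypothetical stand-in for model.StageStatus.
type stageStatus int

const (
	notStartedYet stageStatus = iota
	skipped
	running
	success
	failure
)

// shouldRunScriptRollback reports whether a script run rollback stage should
// execute, given its metadata and the statuses of the stages seen so far.
func shouldRunScriptRollback(metadata map[string]string, statuses map[string]stageStatus) bool {
	baseStageID := metadata["baseStageID"]
	if baseStageID == "" {
		return false // no base stage recorded, nothing to roll back
	}
	status, ok := statuses[baseStageID]
	if !ok {
		return false // the base stage is unknown to the scheduler
	}
	// Roll back only when the base stage actually ran.
	return status != notStartedYet && status != skipped
}

func main() {
	statuses := map[string]stageStatus{"stage-3": success, "stage-5": notStartedYet}
	fmt.Println(shouldRunScriptRollback(map[string]string{"baseStageID": "stage-3"}, statuses)) // true
	fmt.Println(shouldRunScriptRollback(map[string]string{"baseStageID": "stage-5"}, statuses)) // false
	fmt.Println(shouldRunScriptRollback(map[string]string{}, statuses))                         // false
}
```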


// Update stage status to RUNNING if needed.
if model.CanUpdateStageStatus(ps.Status, model.StageStatus_STAGE_RUNNING) {
if err := s.reportStageStatus(ctx, ps.Id, model.StageStatus_STAGE_RUNNING, ps.Requires); err != nil {
52 changes: 51 additions & 1 deletion pkg/app/piped/executor/kubernetes/rollback.go
@@ -16,6 +16,9 @@

import (
"context"
"encoding/json"
"os"
"os/exec"
"strings"

"go.uber.org/zap"
@@ -27,6 +30,8 @@

type rollbackExecutor struct {
executor.Input

appDir string
}

func (e *rollbackExecutor) Execute(sig executor.StopSignal) model.StageStatus {
@@ -39,7 +44,8 @@
switch model.Stage(e.Stage.Name) {
case model.StageRollback:
status = e.ensureRollback(ctx)

case model.StageScriptRunRollback:
status = e.ensureScriptRunRollback(ctx)

Codecov / codecov/patch: added lines #L47 - L48 were not covered by tests.

Comment on lines +47 to +48
Member Author
@ffjlabo, Dec 28, 2023

📝 Added the script run rollback logic to the rollback executor of each application kind (starting with the Kubernetes one), because rollback stages are executed per application kind, not per stage.

I have not found a way to separate the rollback for the application from the rollback for the stage.

default:
e.LogPersister.Errorf("Unsupported stage %s for kubernetes application", e.Stage.Name)
return model.StageStatus_STAGE_FAILURE
@@ -74,6 +80,8 @@
}
}

e.appDir = ds.AppDir

Codecov / codecov/patch: added lines #L83 - L84 were not covered by tests.
loader := provider.NewLoader(e.Deployment.ApplicationName, ds.AppDir, ds.RepoDir, e.Deployment.GitPath.ConfigFilename, appCfg.Input, e.GitClient, e.Logger)
e.Logger.Info("start executing kubernetes stage",
zap.String("stage-name", e.Stage.Name),
@@ -171,3 +179,45 @@
}
return model.StageStatus_STAGE_SUCCESS
}

func (e *rollbackExecutor) ensureScriptRunRollback(ctx context.Context) model.StageStatus {
e.LogPersister.Info("Running commands for rollback...")

onRollback, ok := e.Stage.Metadata["onRollback"]
if !ok {
e.LogPersister.Error("onRollback metadata is missing")
return model.StageStatus_STAGE_FAILURE
}

Codecov / codecov/patch: added lines #L183 - L190 were not covered by tests.

if onRollback == "" {
e.LogPersister.Info("No commands to run")
return model.StageStatus_STAGE_SUCCESS
}

Codecov / codecov/patch: added lines #L192 - L195 were not covered by tests.

Comment on lines +192 to +195
Member

I just want to make sure I understand this correctly: if there is no user-defined onRollback script, the script run rollback does nothing, right?

Member

@ffjlabo What I mean is that we could do the following:

  1. Add rollbackable (type bool) to specify whether the current script run can be rolled back, defaulting to false 🤔
  2. For a rollbackable stage, if onRollback is not set, either re-run the script or return an error saying the rollback script is missing

wdyt?

Member

On second thought, I get your idea.

  • If users did not define an onRollback script, there is nothing to run
  • If users defined an onRollback script, run that script

Let's keep this idea, thanks 👍
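
The behavior agreed on in this thread can be sketched as a small standalone program. It is a simplified illustration, not the executor from the diff: `runOnRollback` and `toEnvList` are hypothetical helpers, output goes to the process's stdout and stderr instead of a log persister, and the shell is invoked with `-c` only (the diff also passes `-l`).

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"sort"
)

// toEnvList converts an env map into KEY=VALUE entries, sorted so the
// result is deterministic.
func toEnvList(env map[string]string) []string {
	out := make([]string, 0, len(env))
	for k, v := range env {
		out = append(out, k+"="+v)
	}
	sort.Strings(out)
	return out
}

// runOnRollback runs the user-defined rollback script through /bin/sh.
// It reports whether a command was run; an empty script is a successful no-op.
func runOnRollback(script, dir string, env map[string]string) (bool, error) {
	if script == "" {
		return false, nil // nothing defined by the user, so nothing to run
	}
	cmd := exec.Command("/bin/sh", "-c", script)
	cmd.Dir = dir
	cmd.Env = append(os.Environ(), toEnvList(env)...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return true, cmd.Run()
}

func main() {
	ran, err := runOnRollback("", ".", nil)
	fmt.Println(ran, err) // false <nil>

	ran, err = runOnRollback(`echo "rollback: $MESSAGE"`, ".", map[string]string{"MESSAGE": "failed to deploy"})
	fmt.Println(ran, err)
}
```

Returning the `ran` flag separately from the error keeps "no script defined" distinguishable from "script ran and succeeded", which is the distinction the reviewers discuss above.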


envStr, ok := e.Stage.Metadata["env"]
env := make(map[string]string, 0)
if ok {
_ = json.Unmarshal([]byte(envStr), &env)
}

Codecov / codecov/patch: added lines #L197 - L201 were not covered by tests.

for _, v := range strings.Split(onRollback, "\n") {
if v != "" {
e.LogPersister.Infof(" %s", v)
}

Codecov / codecov/patch: added lines #L203 - L206 were not covered by tests.
}

envs := make([]string, 0, len(env))
for key, value := range env {
envs = append(envs, key+"="+value)
}

Codecov / codecov/patch: added lines #L209 - L212 were not covered by tests.

cmd := exec.Command("/bin/sh", "-l", "-c", onRollback)
cmd.Dir = e.appDir
cmd.Env = append(os.Environ(), envs...)
cmd.Stdout = e.LogPersister
cmd.Stderr = e.LogPersister
if err := cmd.Run(); err != nil {
return model.StageStatus_STAGE_FAILURE
}
return model.StageStatus_STAGE_SUCCESS

Codecov / codecov/patch: added lines #L214 - L222 were not covered by tests.
}
26 changes: 26 additions & 0 deletions pkg/app/piped/planner/kubernetes/pipeline.go
@@ -15,6 +15,7 @@
package kubernetes

import (
"encoding/json"
"fmt"
"time"

@@ -114,6 +115,31 @@
CreatedAt: now.Unix(),
UpdatedAt: now.Unix(),
})

// Add a stage for rolling back script run stages.
for i, s := range pp.Stages {
if s.Name == model.StageScriptRun {
// Use metadata as a way to pass parameters to the stage.
envStr, _ := json.Marshal(s.ScriptRunStageOptions.Env)
metadata := map[string]string{
"baseStageID": out[i].Id,
"onRollback": s.ScriptRunStageOptions.OnRollback,
"env": string(envStr),
}
Comment on lines +121 to +128
Member Author

📝 I use the stage metadata to pass the following to the rollback stage when it is executed:

  • The commands to run in the script run rollback stage (onRollback, env).
  • The ID of the corresponding script run stage, used to decide whether to roll it back (baseStageID).
    • This is needed to roll back only the SCRIPT_RUN stages up to the point where the deployment was canceled or failed.
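
As a rough sketch of this metadata hand-off, the following program encodes the three keys on the planner side and decodes the env on the executor side. The key names come from the diff; `encodeRollbackMetadata`, `decodeEnv`, and the example values are hypothetical. Unlike the diff, this sketch returns the JSON errors instead of discarding them.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodeRollbackMetadata packs the rollback parameters into a flat
// map[string]string, serializing the env map as JSON.
func encodeRollbackMetadata(baseStageID, onRollback string, env map[string]string) (map[string]string, error) {
	envJSON, err := json.Marshal(env)
	if err != nil {
		return nil, err
	}
	return map[string]string{
		"baseStageID": baseStageID,
		"onRollback":  onRollback,
		"env":         string(envJSON),
	}, nil
}

// decodeEnv restores the env map from the metadata. A missing or empty
// "env" entry yields an empty map.
func decodeEnv(metadata map[string]string) (map[string]string, error) {
	env := map[string]string{}
	raw, ok := metadata["env"]
	if !ok || raw == "" {
		return env, nil
	}
	if err := json.Unmarshal([]byte(raw), &env); err != nil {
		return nil, err
	}
	return env, nil
}

func main() {
	md, err := encodeRollbackMetadata("stage-3", "echo rollback", map[string]string{"TARGET": "canary"})
	if err != nil {
		panic(err)
	}
	fmt.Println(md["env"]) // {"TARGET":"canary"}

	env, err := decodeEnv(md)
	if err != nil {
		panic(err)
	}
	fmt.Println(env["TARGET"]) // canary
}
```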

ss, _ := planner.GetPredefinedStage(planner.PredefinedStageScriptRunRollback)
out = append(out, &model.PipelineStage{
Id: ss.ID,
Name: ss.Name.String(),
Desc: ss.Desc,
Predefined: true,
Visible: false,
Status: model.StageStatus_STAGE_NOT_STARTED_YET,
Metadata: metadata,
CreatedAt: now.Unix(),
UpdatedAt: now.Unix(),
})
}

Codecov / codecov/patch: added lines #L121 - L141 were not covered by tests.
}
}

return out
6 changes: 6 additions & 0 deletions pkg/app/piped/planner/predefined_stages.go
@@ -27,6 +27,7 @@ const (
PredefinedStageECSSync = "ECSSync"
PredefinedStageRollback = "Rollback"
PredefinedStageCustomSyncRollback = "CustomSyncRollback"
PredefinedStageScriptRunRollback = "ScriptRunRollback"
)

var predefinedStages = map[string]config.PipelineStage{
@@ -65,6 +66,11 @@ var predefinedStages = map[string]config.PipelineStage{
Name: model.StageCustomSyncRollback,
Desc: "Rollback the custom stages",
},
PredefinedStageScriptRunRollback: {
ID: PredefinedStageScriptRunRollback,
Name: model.StageScriptRunRollback,
Desc: "Rollback the script run stage",
},
}

// GetPredefinedStage finds and returns the predefined stage for the given id.
10 changes: 10 additions & 0 deletions pkg/model/deployment.go
@@ -154,6 +154,16 @@ func (d *Deployment) FindRollbackStage() (*PipelineStage, bool) {
return nil, false
}

// FindRollbackStages returns all rollback stages of the deployment
// (StageRollback and StageScriptRunRollback) in the order they appear
// in d.Stages, and reports whether any was found.
func (d *Deployment) FindRollbackStages() ([]*PipelineStage, bool) {
	rollbackStages := make([]*PipelineStage, 0, len(d.Stages))
	for _, stage := range d.Stages {
		if stage.Name == StageRollback.String() || stage.Name == StageScriptRunRollback.String() {
			rollbackStages = append(rollbackStages, stage)
		}
	}
	return rollbackStages, len(rollbackStages) > 0
}

// DeploymentStatusesFromStrings converts a list of strings to list of DeploymentStatus.
func DeploymentStatusesFromStrings(statuses []string) ([]DeploymentStatus, error) {
out := make([]DeploymentStatus, 0, len(statuses))
44 changes: 44 additions & 0 deletions pkg/model/deployment_test.go
@@ -587,6 +587,7 @@ func TestFindRollbackStage(t *testing.T) {
}

for _, tt := range tests {
tt := tt
t.Run(tt.name, func(t *testing.T) {
d := &Deployment{
Stages: tt.stages,
@@ -597,3 +598,46 @@ func TestFindRollbackStage(t *testing.T) {
})
}
}

func TestFindRollbackStages(t *testing.T) {
tests := []struct {
name string
stages []*PipelineStage
wantStages []*PipelineStage
wantStageFound bool
}{
{
name: "found",
stages: []*PipelineStage{
{Name: StageK8sSync.String()},
{Name: StageRollback.String()},
{Name: StageScriptRunRollback.String()},
},
wantStages: []*PipelineStage{
{Name: StageRollback.String()},
{Name: StageScriptRunRollback.String()},
},
wantStageFound: true,
},
{
name: "not found",
stages: []*PipelineStage{
{Name: StageK8sSync.String()},
},
wantStages: []*PipelineStage{},
wantStageFound: false,
},
}

for _, tt := range tests {
tt := tt
t.Run(tt.name, func(t *testing.T) {
d := &Deployment{
Stages: tt.stages,
}
stages, found := d.FindRollbackStages()
assert.Equal(t, tt.wantStages, stages)
assert.Equal(t, tt.wantStageFound, found)
})
}
}
4 changes: 4 additions & 0 deletions pkg/model/stage.go
@@ -108,6 +108,10 @@ const (
// all changes made by the CUSTOM_SYNC stage will be reverted to
// bring back the pre-deploy stage.
StageCustomSyncRollback Stage = "CUSTOM_SYNC_ROLLBACK"
// StageScriptRunRollback represents a state where
// all changes made by the SCRIPT_RUN stage will be reverted to
// bring back the pre-deploy stage.
StageScriptRunRollback Stage = "SCRIPT_RUN_ROLLBACK"
)

func (s Stage) String() string {