-
-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Show failed command after atmos workflow failure #767
base: main
Are you sure you want to change the base?
Conversation
📝 Walkthrough📝 Walkthrough📝 WalkthroughWalkthroughThe changes in this pull request enhance error handling in the workflow execution process. The Changes
Assessment against linked issues
Possibly related PRs
Suggested labels
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (5)
internal/exec/workflow_utils.go (1)
109-124
: Consider extracting error message formatting to a dedicated function.While the implementation is solid, the error message construction logic could be moved to a separate function for better maintainability and reusability.
Consider this approach:
+func formatWorkflowError(step schema.WorkflowStep, command, workflowPath, workflow string) string { + workflowFileName := filepath.Base(workflowPath) + workflowFileName = strings.TrimSuffix(workflowFileName, filepath.Ext(workflowFileName)) + + failedMsg := color.New(color.FgRed).Sprintf("\nStep '%s' failed!", step.Name) + cmdMsg := color.New(color.FgYellow).Sprintf("Failed command: %s", command) + resumeMsg := color.New(color.FgGreen).Sprintf( + "\nTo resume the workflow from this step, run:\ncd %s && atmos workflow %s -f %s --from-step %s", + filepath.Dir(workflowPath), + workflow, + workflowFileName, + step.Name, + ) + + return fmt.Sprintf("%s\n%s\n%s", failedMsg, cmdMsg, resumeMsg) +} if err != nil { - workflowFileName := filepath.Base(workflowPath) - workflowFileName = strings.TrimSuffix(workflowFileName, filepath.Ext(workflowFileName)) - - failedMsg := color.New(color.FgRed).Sprintf("\nStep '%s' failed!", step.Name) - cmdMsg := color.New(color.FgYellow).Sprintf("Failed command: %s", command) - resumeMsg := color.New(color.FgGreen).Sprintf( - "\nTo resume the workflow from this step, run:\ncd %s && atmos workflow %s -f %s --from-step %s", - filepath.Dir(workflowPath), - workflow, - workflowFileName, - step.Name, - ) - - return fmt.Errorf("%s\n%s\n%s\nError: %v", failedMsg, cmdMsg, resumeMsg, err) + errMsg := formatWorkflowError(step, command, workflowPath, workflow) + return fmt.Errorf("%s\nError: %v", errMsg, err) }internal/exec/help.go (1)
107-118
: Enhance the help message clarity and specificity.The help message effectively communicates the workflow failure handling, but could be more helpful with these improvements:
- Use a more descriptive workflow name in the example (e.g., "production-deploy" instead of "workflow1")
- Provide a more realistic error message example that includes specific details
- Consider adding a note about any timeout or retry behavior
Here's a suggested improvement:
u.PrintMessage("\nAtmos workflow commands support failure handling and resume functionality:\n\n" + "When a workflow step fails:\n" + " - The failed step name and command will be displayed\n" + " - A resume command will be provided to restart from the failed step\n\n" + "Example:\n" + "Step 'deploy-vpc' failed!\n" + "Error: Error applying plan:\n" + - "1 error occurred: AWS API call failed\n\n" + + "1 error occurred: AWS API call failed - InvalidVPCID.NotFound: The vpc-id 'vpc-12345' does not exist\n\n" + "Failed command: terraform apply vpc -auto-approve\n\n" + "To resume the workflow from this step, run:\n" + - "atmos workflow deploy-infra -f workflow1 --from-step deploy-vpc\n\n" + + "atmos workflow deploy-infra -f production-deploy --from-step deploy-vpc\n\n" + + "Note: Failed steps will timeout after 30 minutes. No automatic retries are performed.\n\n" + "For more details refer to https://atmos.tools/cli/commands/workflow/resume\n")website/docs/core-concepts/workflows/workflows.mdx (3)
374-380
: Consider adding context about the--from-step
flag in the introduction.The section effectively explains what happens during a failure, but it would be helpful to mention upfront that this feature works in conjunction with the
--from-step
flag to enable workflow resumption. This addition would better align with the PR objectives of highlighting the new resumption capability.## Workflow Failure Handling and Resume When a workflow step fails, Atmos will: 1. Display which step failed 2. Show the exact command that failed 3. Provide a ready-to-use command to resume the workflow from the failed step + +This functionality works in conjunction with the `--from-step` flag, allowing users to resume workflows from any specific step after addressing the failure.
381-404
: Enhance the example with a successful resume scenario.The example effectively demonstrates the failure case, but it would be even more helpful to show what happens when a user successfully resumes the workflow. This would complete the user's mental model of the feature.
To resume the workflow from this step, run: atmos workflow test-1 -f workflow1 --from-step step-2 + +After fixing the issue and running the resume command, you'll see: +```console +Resuming workflow 'test-1' from step 'step-2'... + +Step 'step-2' completed successfully +Step 'step-3' completed successfully + +Workflow completed successfully +```
374-405
: Add a note about error handling behavior.The section would benefit from clarifying whether Atmos preserves error information between sessions. This would help users understand if they need to copy the resume command immediately or if they can retrieve it later.
## Workflow Failure Handling and Resume When a workflow step fails, Atmos will: 1. Display which step failed 2. Show the exact command that failed 3. Provide a ready-to-use command to resume the workflow from the failed step + +:::note +The error information and resume command are displayed only when the failure occurs. Make sure to note down the resume command if you plan to fix the issue and continue later, as this information is not persisted between Atmos sessions. +:::
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (3)
internal/exec/help.go
(1 hunks)internal/exec/workflow_utils.go
(3 hunks)website/docs/core-concepts/workflows/workflows.mdx
(1 hunks)
🔇 Additional comments (3)
internal/exec/workflow_utils.go (2)
7-7
: LGTM! Clean import addition.
The filepath package is appropriately added and utilized for path manipulations in the error message formatting.
109-124
: Excellent error handling improvements! 💪
The enhanced error messages now provide:
- Clear indication of the failed step
- The exact command that failed
- Color-coded output for better readability
- Precise instructions for resuming the workflow
This aligns perfectly with the PR objectives and significantly improves the user experience.
internal/exec/help.go (1)
106-106
: LGTM! Clean conditional structure.
The new condition for workflow help follows the established pattern used for other component types, maintaining code consistency.
Can you show a screenshot so I can see what the coloring looks like? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Cerebrovinny thanks, the PR looks good, please address the comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
internal/exec/help.go (1)
106-118
: STRONG WORK, WARRIOR! Let's refine this battle plan with a few adjustments.
- The URL should point to the specific resume documentation page:
-For more details refer to https://atmos.tools/cli/commands/workflow/ +For more details refer to https://atmos.tools/cli/commands/workflow/resume
- The
-f
flag in the example command needs explanation. Consider adding:To resume the workflow from this step, run: -atmos workflow deploy-infra -f workflow1 --from-step deploy-vpc +atmos workflow deploy-infra -f workflow1 --from-step deploy-vpc + +Where: + -f, --file Workflow file to execute (e.g., workflow1) + --from-step Resume workflow execution from the specified step
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (3)
internal/exec/help.go
(1 hunks)internal/exec/workflow_utils.go
(3 hunks)website/docs/core-concepts/workflows/workflows.mdx
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- internal/exec/workflow_utils.go
- website/docs/core-concepts/workflows/workflows.mdx
@Cerebrovinny please make the following changes: "Failed command: ..." to should sent to Debug (and renamed to "Command failed:") Also, "Error: exit status 1" should appear before the tip, as it realtes to the "Failed command" This is to reduce the amount of technical information on the screen. |
@Cerebrovinny please resolve conversations that are addressed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
cmd/workflow.go (1)
Line range hint
52-52
: Enhance the --from-step flag description.Consider providing a more detailed description that explains the flag's purpose in workflow recovery.
-workflowCmd.PersistentFlags().String("from-step", "", "atmos workflow <name> -f <file> --from-step <step-name>") +workflowCmd.PersistentFlags().String("from-step", "", "Resume workflow execution from a specific step (e.g., after a failure)")
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (1)
cmd/workflow.go
(1 hunks)
🔇 Additional comments (2)
cmd/workflow.go (2)
17-22
: Strong documentation of failure handling capabilities!
The new documentation clearly explains what users can expect when a workflow fails. This directly addresses the feature request and improves user experience.
32-33
: Documentation link is well-placed and formatted correctly!
The reference to detailed documentation provides users with a clear path to more information.
Please share updated screenshots |
@Cerebrovinny I really don't understand this screenshot. Why are we printing all this information? At first, I thought I was looking at the output We need to get clear that output of a command should:
Output of any command other than Usage != Help. |
|
What
--from-step
flag to resume workflows from specific stepsWhy
Example workflow demonstrating the feature:
Tests
Last test after requested changes @osterman @aknysh
When step-2 fails, users will see:
references
DEV-2717
Summary by CodeRabbit
New Features
Documentation