Add support to make determine/process reboot-cause services restartable #86
Signed-off-by: anamehra <[email protected]>
@anamehra, the PR test is failing. Please fix.
Signed-off-by: anamehra <[email protected]>
@anamehra, my understanding is that this PR needs to go in before the PR from buildimage, correct? Or is it safe to have both PRs merged independently? Please clarify, as we don't want to merge in the wrong order and cause regressions. Thanks!
This PR can go in independently. The PR in sonic-buildimage will need this PR.
Thanks for the clarification.
@@ -218,6 +220,10 @@ def main():
sonic_logger.log_error("User {} does not have permission to execute".format(pwd.getpwuid(os.getuid()).pw_name))
sys.exit("This utility must be run as root")

if os.path.exists(REBOOT_PROCESSED_FILE):
@anamehra can you add RemainAfterExit=true in the determine-reboot-cause.service? That should ensure systemd starts this service only once (unless someone manually starts the service, which is not the point of this PR).
@StormLiangMS, @yxieca, MSFT ADO: 25892864. Please help review/approve for the 202305 and 202205 branches. Thanks!
202205 will require a manual PR, as sonic-host-service is not a submodule there but part of the sonic-buildimage repo. I will raise a PR for 202205 once sonic-net/sonic-buildimage#17220 is merged.
@prgeor, process-reboot-cause has a dependency on this service. If we add a conditional check, that dependency won't be met and process-reboot-cause won't run. Let me check what modifications are needed in the process-reboot-cause unit file to handle this.
…estartable (sonic-net#86)" This reverts commit 5dcd1e5.
Signed-off-by: anamehra [email protected]
Why I did it
Fixes sonic-net/sonic-buildimage#16990
This PR can be merged independently. The PR (sonic-net/sonic-buildimage#17220) will need this host-services PR to be merged and released.
MSFT ADO: 25892864
1. The determine-reboot-cause and process-reboot-cause services do not start if the database service fails to start on the first attempt. Even if the database service succeeds on the next attempt, these reboot-cause services do not start.
2. The process-reboot-cause service does not restart if the docker or database service restarts, which leads to an empty reboot-cause history.
3. deploy-mg from sonic-mgmt also triggers a docker service restart, which causes issue 2 above. The docker restart also triggers a restart of determine-reboot-cause, which creates an additional reboot-cause file in the history and modifies the last reboot-cause.
This PR, along with the sonic-buildimage PR (17220), fixes these issues by making both services start once their dependency is met after an earlier dependency failure, making both services restart when the database service restarts, and preventing duplicate processing of the last reboot reason.
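The duplicate-processing guard can be pictured as a marker-file check, in the spirit of the `os.path.exists(REBOOT_PROCESSED_FILE)` test in the reviewed diff. This is only a minimal sketch: the helper name `run_once`, the marker path, and what the real script does when the marker exists are assumptions, not taken from the PR.

```python
import os
import tempfile


def run_once(marker_path, action):
    """Run action() unless marker_path already exists; return True if it ran."""
    if os.path.exists(marker_path):
        # A previous start already handled the reboot cause, so skip it.
        return False
    action()
    # Record completion so that a later service restart becomes a no-op.
    with open(marker_path, "w") as marker:
        marker.write("processed\n")
    return True


if __name__ == "__main__":
    # Hypothetical marker location, used only for this demo.
    marker = os.path.join(tempfile.mkdtemp(), "reboot-cause-processed")
    print(run_once(marker, lambda: print("determining reboot cause")))
    print(run_once(marker, lambda: print("determining reboot cause")))
```

For a guard like this to behave correctly, the marker presumably has to live somewhere that is cleared on every reboot, so that the first start after a real reboot still does the work.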
How I did it
How to verify it
On single asic pizza box:
On Chassis:
Let the database service on the LC fail the first time. determine-reboot-cause and process-reboot-cause will fail to start due to the dependency failure.
Start database-chassis on the Supervisor. The database service on the LC should now start successfully.
Verify that determine-reboot-cause and process-reboot-cause also start.
Verify the show reboot-cause history output.
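The service checks above could be scripted roughly as follows. This is a sketch, not part of the PR: it assumes the unit names determine-reboot-cause.service and process-reboot-cause.service, and the helper names `unit_properties` and `report` are made up for illustration. It reads `ActiveState` and `Result` through `systemctl show`, because a service that runs to completion may report as inactive even though it succeeded.

```python
import subprocess

# Assumed unit names, based on the service names discussed in this PR.
UNITS = ("determine-reboot-cause.service", "process-reboot-cause.service")


def unit_properties(unit, run=subprocess.run):
    """Return the ActiveState and Result properties of a systemd unit."""
    completed = run(
        ["systemctl", "show", unit, "--property=ActiveState,Result"],
        capture_output=True, text=True, check=False,
    )
    props = {}
    # systemctl show prints one KEY=VALUE pair per line.
    for line in completed.stdout.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key] = value
    return props


def report(units=UNITS, run=subprocess.run):
    """Collect the properties of every unit into a dict keyed by unit name."""
    return {unit: unit_properties(unit, run=run) for unit in units}
```

Calling `report()` on the device after the database service recovers gives one entry per unit; the `run` parameter exists only so the command runner can be replaced in tests.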