[rgw]fix TFA issue by adding sleep of 20 seconds after rgw restart to avoid sync status failures #4187

Open. Wants to merge 1 commit into base: master.
2 changes: 2 additions & 0 deletions utility/utils.py
@@ -594,6 +594,8 @@ def set_config_param(node):
# restart osd service
node.exec_command(cmd=f"ceph orch restart {osd_process_name}")
node.exec_command(cmd="ceph config dump")
log.info("sleeping for 20 seconds")
time.sleep(20)
Comment on lines +597 to +598
Contributor

A blind sleep may work today, but it could fail later or in the next build.

IMHO, we could instead check that the rgw service is up and running using

ceph orch ls --service_type rgw --service_name <rgw_service_name> --format json |
jq '.[0].status | select(.size != 0) | .size == .running'

Contributor

I might be wrong about the need to check rgw, which I inferred from the PR title; however, the code is restarting osd_process_name.

It is also possible to check the running status of that particular daemon instead.
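
For illustration, the polling approach suggested in this thread could be sketched roughly as below. This is only a sketch under assumptions, not the helper used in this repository: `wait_for_service_running` and its `run_cmd` parameter are hypothetical names, and `run_cmd` is assumed to be any callable that takes a command string and returns its stdout as text (for example, a thin wrapper around `node.exec_command`). It relies on the `status.size` and `status.running` fields that the `jq` filter above also reads.

```python
import json
import time


def wait_for_service_running(run_cmd, service_type, service_name,
                             timeout=120, interval=5,
                             sleep=time.sleep, clock=time.monotonic):
    """Poll `ceph orch ls` until all daemons of a service report running.

    run_cmd is a hypothetical callable: command string in, stdout text out.
    Returns True once size != 0 and size == running, False on timeout.
    """
    cmd = (
        f"ceph orch ls --service_type {service_type} "
        f"--service_name {service_name} --format json"
    )
    deadline = clock() + timeout
    while True:
        out = run_cmd(cmd)
        # An empty reply or an empty list means the service is not listed yet.
        services = json.loads(out) if out.strip() else []
        if services:
            status = services[0].get("status", {})
            size = status.get("size", 0)
            running = status.get("running", 0)
            if size != 0 and size == running:
                return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Because the service type is a parameter, the same check would apply whether the restarted service turns out to be rgw or osd. A bounded timeout keeps the worst case explicit instead of relying on a fixed 20-second sleep.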



def kernel_mount(mounting_dir, mon_node_ip, kernel_clients):