-
Notifications
You must be signed in to change notification settings - Fork 2
Add cold migration to osism manage compute migrate #1287
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Automatically cold migrate instances in SHUTOFF state when issueing `osism manage compute migrate` command. Cold migration will be automatically confirmed if the command is waiting for the migration to finish. A warning has been added to the description of the `--no-wait` parameter. A `--no-cold-migration` parameter is added to allow to skip cold migration and thus allow to fallback to the previous behaviour. Part of osism/issues#1248 Signed-off-by: Jan Horstmann <[email protected]>
if not no_wait: | ||
inner_wait = True | ||
while inner_wait: | ||
while True: | ||
time.sleep(2) | ||
s = conn.compute.get_server(server[0]) | ||
if s.status in ["MIGRATING"]: | ||
logger.info( | ||
f"Live migration of {server[0]} ({server[1]}) is still in progress" | ||
f"{migration_type.capitalize()} migration of {server[0]} ({server[1]}) is still in progress" | ||
) | ||
inner_wait = True | ||
else: | ||
inner_wait = False | ||
if migration_type == "cold" and s.status in [ | ||
"VERIFY_RESIZE" | ||
]: | ||
conn.compute.confirm_server_resize(s) | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The loop used for checking the migration status of servers (lines 431-444) can potentially lead to performance issues due to the continuous polling with a fixed sleep interval. This approach can cause unnecessary delays and resource usage, especially if the migration operation is lengthy.
Recommendation:
Consider implementing a more efficient waiting mechanism, possibly using event-driven or callback mechanisms provided by the openstack
SDK, if available. This would reduce the CPU cycles wasted on polling and could provide a more responsive and resource-efficient solution.
for server in result: | ||
if server[2] not in ["ACTIVE", "PAUSED"]: | ||
if server[2] in ["ACTIVE", "PAUSED"]: | ||
migration_type = "live" | ||
elif server[2] in ["SHUTOFF"] and not no_cold_migration: | ||
migration_type = "cold" | ||
else: | ||
logger.info( | ||
f"{server[0]} ({server[1]}) in status {server[2]} cannot be live migrated" | ||
f"{server[0]} ({server[1]}) in status {server[2]} cannot be migrated" | ||
) | ||
continue | ||
|
||
if yes: | ||
answer = "yes" | ||
else: | ||
answer = prompt( | ||
f"Live migrate server {server[0]} ({server[1]}) [yes/no]: " | ||
f"{migration_type.capitalize()} migrate server {server[0]} ({server[1]}) [yes/no]: " | ||
) | ||
|
||
if answer in ["yes", "y"]: | ||
logger.info(f"Live migrating server {server[0]}") | ||
conn.compute.live_migrate_server( | ||
server[0], host=target, block_migration="auto", force=force | ||
logger.info( | ||
f"{migration_type.capitalize()} migrating server {server[0]}" | ||
) | ||
if migration_type == "live": | ||
conn.compute.live_migrate_server( | ||
server[0], host=target, block_migration="auto", force=force | ||
) | ||
elif migration_type == "cold": | ||
conn.compute.migrate_server(server[0], host=target) | ||
|
||
if not no_wait: | ||
inner_wait = True | ||
while inner_wait: | ||
while True: | ||
time.sleep(2) | ||
s = conn.compute.get_server(server[0]) | ||
if s.status in ["MIGRATING"]: | ||
logger.info( | ||
f"Live migration of {server[0]} ({server[1]}) is still in progress" | ||
f"{migration_type.capitalize()} migration of {server[0]} ({server[1]}) is still in progress" | ||
) | ||
inner_wait = True | ||
else: | ||
inner_wait = False | ||
if migration_type == "cold" and s.status in [ | ||
"VERIFY_RESIZE" | ||
]: | ||
conn.compute.confirm_server_resize(s) | ||
break | ||
|
||
|
||
class ComputeStart(Command): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error handling in the migration logic (lines 402-447) appears minimal, which could lead to scenarios where exceptions or unexpected states are not properly managed. For instance, network issues or API failures during the live_migrate_server
or migrate_server
calls are not explicitly handled, which could result in partial migrations or unclear application states.
Recommendation:
Enhance the robustness of the migration operations by implementing comprehensive exception handling around the API calls. Use try-except blocks to catch potential exceptions from the openstack
SDK methods, and provide clear error messages or recovery options. This would improve the reliability and user experience of the migration commands.
Automatically cold migrate instances in SHUTOFF state when issueing
osism manage compute migrate
command.Cold migration will be automatically confirmed if the command is waiting for the migration to finish. A warning has been added to the description of the
--no-wait
parameter.A
--no-cold-migration
parameter is added to allow to skip cold migration and thus allow to fallback to the previous behaviour.Part of osism/issues#1248