Merge pull request xapi-project#5192 from robhoes/add-docs
Add some more documents from xapi-project.github.io
robhoes committed Oct 12, 2023
2 parents 56d31a6 + 34db67e commit 41773b4
Showing 50 changed files with 3,151 additions and 0 deletions.
Binary file added doc/content/toolstack/features/DR/dr.png
29 changes: 29 additions & 0 deletions doc/content/toolstack/features/DR/index.md
@@ -0,0 +1,29 @@
+++
title = "Disaster Recovery"
+++

The [HA](../HA/HA.html) feature will restart VMs after hosts have failed, but what
happens if a whole site (e.g. datacenter) is lost? A disaster recovery
configuration is shown in the following diagram:

![Disaster recovery maintaining a secondary site](dr.png)

We rely on the storage array's built-in mirroring to replicate data (synchronously
or asynchronously: the admin's choice) between the primary and the secondary
site. When DR is enabled, the VM disk data and VM metadata are written to the
storage server and mirrored. The secondary site contains the other side
of the data mirror and a set of hosts, which may be powered off.

In normal operation, the DR feature allows a "dry-run" recovery where a host
on the secondary site checks that it can indeed see all the VM disk data
and metadata. This should be done regularly, so that admins are familiar with
the process.

After a disaster, the admin breaks the mirror on the secondary site and triggers
a remote power-on of the offline hosts (either using an out-of-band tool or
the built-in host power-on feature of xapi). The pool master on the secondary
site can then connect to the storage and extract all the VM metadata. Finally, the
VMs can all be restarted.

When the primary site is fully recovered, the mirror can be re-synchronised
and the VMs can be moved back.
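
The failover steps described above can be read as an ordered checklist. The following is a minimal sketch of that ordering only: the `SecondarySite` class, its methods, and `fail_over` are hypothetical names invented for illustration and do not correspond to xapi API calls or CLI commands.

```python
from dataclasses import dataclass, field


@dataclass
class SecondarySite:
    """Hypothetical model of the secondary site; none of this is xapi API."""
    mirror_broken: bool = False
    hosts_on: bool = False
    metadata_loaded: bool = False
    log: list = field(default_factory=list)

    def break_mirror(self):
        self.mirror_broken = True
        self.log.append("break mirror")

    def power_on_hosts(self):
        self.hosts_on = True
        self.log.append("power on hosts")

    def extract_vm_metadata(self):
        # The pool master can only read the metadata once the mirror is
        # broken and the hosts are running.
        if not (self.mirror_broken and self.hosts_on):
            raise RuntimeError("mirror must be broken and hosts powered on first")
        self.metadata_loaded = True
        self.log.append("extract VM metadata")

    def restart_vms(self):
        if not self.metadata_loaded:
            raise RuntimeError("VM metadata has not been extracted")
        self.log.append("restart VMs")


def fail_over(site):
    """Run the recovery steps in the order given in the text."""
    site.break_mirror()
    site.power_on_hosts()
    site.extract_vm_metadata()
    site.restart_vms()
    return site.log
```

Calling `fail_over(SecondarySite())` returns the four steps in order, while calling `restart_vms()` on a fresh site raises an error because the metadata has not been extracted yet.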
11 changes: 11 additions & 0 deletions doc/content/toolstack/features/HA/HA.configure.msc
@@ -0,0 +1,11 @@
participant slave1
Note over master: Host.enable_ha\nchoose an SR\nfind or create VDIs\nattach VDIs\nwrite xhad.conf\nha_set_pool_state init
master->slave1: Host.preconfigure_ha
Note over slave1: attach VDIs\nwrite xhad.conf\n
master->slave2: Host.preconfigure_ha
Note over slave2: attach VDIs\nwrite xhad.conf\n
master->slave1: Host.ha_join_liveset
master->slave2: Host.ha_join_liveset
Note over master: ha_propose_master
slave1-->master: wait for master
slave2-->master: wait for master
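
The diagram above suggests two phases on the slaves: every slave is preconfigured before any slave is asked to join the liveset. A rough sketch of that ordering follows, assuming a hypothetical helper `enable_ha_steps` that only lists the steps; it does not call xapi, and the host names are plain strings.

```python
def enable_ha_steps(master, slaves):
    """Return the (host, action) sequence drawn in the diagram.

    Hypothetical helper for illustration; the action strings are copied
    from the diagram and nothing here talks to xapi or xhad.
    """
    steps = [
        (master, action)
        for action in (
            "choose an SR",
            "find or create VDIs",
            "attach VDIs",
            "write xhad.conf",
            "ha_set_pool_state init",
        )
    ]
    # Phase 1: preconfigure every slave (attach VDIs, write xhad.conf).
    for slave in slaves:
        steps.append((slave, "Host.preconfigure_ha"))
    # Phase 2: only then ask each slave to join the liveset.
    for slave in slaves:
        steps.append((slave, "Host.ha_join_liveset"))
    # Finally the master proposes itself while the slaves wait for it.
    steps.append((master, "ha_propose_master"))
    return steps
```

For example, `enable_ha_steps("master", ["slave1", "slave2"])` lists both `Host.preconfigure_ha` calls before the first `Host.ha_join_liveset`.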
11 changes: 11 additions & 0 deletions doc/content/toolstack/features/HA/HA.configure.svg
6 changes: 6 additions & 0 deletions doc/content/toolstack/features/HA/HA.disable.clean.msc
@@ -0,0 +1,6 @@
Master Xapi->Master Xhad: ha_set_pool_state Invalid
Master Xhad->Master Xapi: OK
Note over Slave Xhad: heartbeat thread notices\ninvalid state and disarms
Slave Xapi-->Slave Xhad: ha_query_liveset
Slave Xhad-->Slave Xapi: Invalid
Note over Slave Xapi: disable HA, cleanup
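
In the clean-disable flow above, the slave's xapi is not told directly that HA is off; it finds out by polling its local xhad. A toy model of that interaction is sketched below. `FakeXhad` and `slave_xapi_poll` are hypothetical names, and the `"Online"` status returned while the daemon is armed is an assumption made for illustration.

```python
class FakeXhad:
    """Toy stand-in for a slave's xhad daemon (not the real interface)."""

    def __init__(self):
        self.armed = True

    def heartbeat(self, pool_state):
        # The heartbeat thread notices the invalid pool state and disarms.
        if pool_state == "Invalid":
            self.armed = False

    def ha_query_liveset(self):
        # "Online" is an assumed healthy status for this sketch.
        return "Online" if self.armed else "Invalid"


def slave_xapi_poll(xhad):
    """One polling step of the slave's xapi, as drawn in the diagram."""
    if xhad.ha_query_liveset() == "Invalid":
        return "disable HA, cleanup"
    return "keep running"
```

Before the master invalidates the pool state, `slave_xapi_poll` keeps returning `"keep running"`; after `heartbeat("Invalid")` the next poll triggers the cleanup branch.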
7 changes: 7 additions & 0 deletions doc/content/toolstack/features/HA/HA.disable.clean.svg
5 changes: 5 additions & 0 deletions doc/content/toolstack/features/HA/HA.disable.unclean.msc
@@ -0,0 +1,5 @@
Note over Xapi: disable HA recovery\nlogic; user has manual\ncontrol
Xapi->Xhad: ha_disarm_fencing
Xhad->Xapi: OK
Xapi->Xhad: ha_stop_daemon
Xhad->Xapi: OK
6 changes: 6 additions & 0 deletions doc/content/toolstack/features/HA/HA.disable.unclean.svg
8 changes: 8 additions & 0 deletions doc/content/toolstack/features/HA/HA.shutdown.msc
@@ -0,0 +1,8 @@
Note over Xapi: all VMs shutdown\nall VDIs unlocked
Xapi->Xhad: ha_disarm_fencing
Xhad->Xapi: OK
Xapi->Xhad: ha_stop_daemon
Xhad->Xapi: OK
Note over Xhad: daemon exits
Xapi->Statefile: ha_set_excluded
Note over Statefile: host will not be included\nin liveset calculations until\nafter reboot
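
The shutdown diagram above fixes both a precondition (all VMs shut down, all VDIs unlocked) and a call order. A small sketch that enforces the order as drawn is given below; `HostShutdown` is a hypothetical tracker written for illustration, not xapi code, and it only records call names.

```python
class HostShutdown:
    """Hypothetical tracker for the shutdown ordering drawn in the diagram."""

    ORDER = ["ha_disarm_fencing", "ha_stop_daemon", "ha_set_excluded"]

    def __init__(self, vms_running, vdis_locked):
        # Precondition taken from the first note in the diagram.
        if vms_running or vdis_locked:
            raise RuntimeError("shut down all VMs and unlock all VDIs first")
        self.done = []

    def call(self, name):
        if len(self.done) >= len(self.ORDER):
            raise RuntimeError("shutdown sequence already complete")
        expected = self.ORDER[len(self.done)]
        if name != expected:
            raise RuntimeError(f"expected {expected}, got {name}")
        self.done.append(name)
```

Calling the three steps in the listed order succeeds; attempting `ha_stop_daemon` before `ha_disarm_fencing` raises an error in this sketch.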
8 changes: 8 additions & 0 deletions doc/content/toolstack/features/HA/HA.shutdown.svg
12 changes: 12 additions & 0 deletions doc/content/toolstack/features/HA/HA.start.msc
@@ -0,0 +1,12 @@
Xapi->Xhad: ha_start_daemon
Note over Xhad: Starts talking to other hosts\nto form or join the liveset
Xapi-->Xhad: ha_query_liveset
Xhad-->Xapi: Starting
Note over Xhad: liveset joined and\nexcluded flag cleared
Xhad->Xapi: OK
Xapi-->Xhad: ha_query_liveset
Xhad-->Xapi: Online
Note over Xapi: If starting HA and already a master,\nor if responding to a failure\nwhere the master may have failed.
Xapi->Xhad: ha_propose_master
Note over Xhad: at most one host can be a master
Xhad->Xapi: TRUE/FALSE
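
The start-up flow above amounts to polling `ha_query_liveset` until it reports `Online`, then optionally calling `ha_propose_master`. A minimal sketch under stated assumptions: `start_ha` is a hypothetical function, the two callables stand in for the real calls to xhad, and a real implementation would presumably sleep between polls.

```python
def start_ha(query_liveset, propose_master, want_master, max_polls=5):
    """Poll until xhad is Online, then optionally propose mastership.

    query_liveset and propose_master are caller-supplied stand-ins for
    ha_query_liveset and ha_propose_master; this is not xapi code.
    """
    for _ in range(max_polls):
        if query_liveset() == "Online":
            break
        # A real caller would wait here before polling again.
    else:
        raise TimeoutError("xhad did not report Online")
    if want_master:
        # xhad answers TRUE or FALSE; at most one host can be a master.
        return propose_master()
    return False
```

With a query function that yields `Starting`, `Starting`, `Online`, the function proposes mastership after the third poll and returns whatever `propose_master` answers.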
13 changes: 13 additions & 0 deletions doc/content/toolstack/features/HA/HA.start.svg
Binary file added doc/content/toolstack/features/HA/ha.png