| Failure scenario | Active-active cluster processing methods and impacts |
| Failure of a single storage node (for example, node 3) in IDC A |
- Virtual machines on the failed node:
- If other nodes in IDC A have sufficient compute resources (CPU and memory), the virtual machines on the failed node 3 trigger HA and migrate to other nodes within IDC A.
- If the nodes in IDC A do not have sufficient idle compute resources, and IDC B has enough resources, the virtual machines will migrate to IDC B when HA is triggered.
- If no node in either IDC has sufficient compute resources, the virtual machines will not be able to perform HA.
- Virtual machines on other nodes:
- If a virtual machine accesses Extent data on the failed node 3, there will be a brief increase in I/O latency (usually at the millisecond level, with a maximum of no more than 7 seconds) before it returns to normal.
- If the virtual machine does not access the data on the failed node 3, there will be no observable impact.
- Data recovery:
- If the data corresponding to the replica stored on the failed node 3 was written during the failure, data recovery will be triggered within 1 minute after the update.
- If the data corresponding to the replica stored on the failed node 3 has not been updated, data recovery will be triggered after 10 minutes.
- If healthy nodes in IDC A have sufficient space, data will be recovered to other healthy nodes in IDC A from the healthy replicas within IDC A.
- If healthy nodes in IDC A do not have sufficient space, data will be recovered to other healthy nodes in IDC B from the healthy replicas within IDC B.
|
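The HA placement order described above (same-IDC first, peer IDC second, otherwise no HA) can be written as a minimal sketch. All names (`Node`, `pick_ha_target`, the resource fields) are illustrative assumptions, not the product's actual scheduler API:

```python
# Sketch of the HA target-selection order: prefer the VM's own IDC,
# fall back to the peer IDC, fail when nothing fits.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    name: str
    idc: str
    free_cpu: int       # free vCPUs
    free_mem_gb: int    # free memory in GiB

def pick_ha_target(vm_cpu: int, vm_mem_gb: int, home_idc: str,
                   nodes: list[Node]) -> Optional[Node]:
    """Return a node that can host the VM, or None when HA cannot proceed."""
    def fits(n: Node) -> bool:
        return n.free_cpu >= vm_cpu and n.free_mem_gb >= vm_mem_gb
    # 1) Other nodes in the VM's own IDC with sufficient resources.
    same_idc = [n for n in nodes if n.idc == home_idc and fits(n)]
    if same_idc:
        return same_idc[0]
    # 2) Nodes in the peer IDC with sufficient resources.
    other_idc = [n for n in nodes if n.idc != home_idc and fits(n)]
    if other_idc:
        return other_idc[0]
    # 3) No capacity anywhere: HA fails.
    return None
```

The real scheduler also weighs anti-affinity and placement-group rules; this sketch captures only the IDC preference order.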
| Simultaneous failure of multiple storage nodes (for example, nodes 3 and 4) in IDC A |
- Virtual machines on the failed nodes:
- If other nodes in IDC A have sufficient compute resources (CPU and memory), the virtual machines on failed nodes 3 and 4 will trigger HA and be migrated to other nodes in IDC A. If two replicas of some virtual machine data are located on nodes 3 and 4 in IDC A, after HA, services can continue operating normally by accessing the replicas in IDC B. However, storage performance will degrade noticeably because all requests must cross IDCs. The degree of degradation depends on inter-DC latency and the virtual machine's I/O model.
- If available compute resources in IDC A are insufficient while IDC B has sufficient compute resources, the virtual machines will trigger HA and be migrated to IDC B.
- If no node has sufficient available compute resources, the virtual machines will not be able to perform HA.
- Virtual machines on other nodes:
- If a virtual machine accesses Extent data on the failed nodes 3 and 4, there will be a brief increase in I/O latency (usually at the millisecond level, with a maximum of no more than 7 seconds) before it returns to normal.
- If the virtual machine does not access the data on the failed nodes 3 and 4, there will be no observable impact.
- Data recovery:
- If the data corresponding to the replica stored on the failed nodes 3 and 4 was written during the failure, data recovery will be triggered within 1 minute after the update.
- If the data corresponding to the replicas stored on the failed nodes 3 and 4 was not updated during the failure, data recovery will be triggered after 10 minutes.
- If healthy nodes in IDC A have sufficient storage space, data will be recovered to other healthy nodes in IDC A from healthy replicas within IDC A.
- If both replicas of the data in IDC A were respectively located on the failed nodes 3 and 4, the failure results in only one healthy replica remaining in IDC B. In this case:
- If other nodes in IDC A have sufficient storage space, data will first be recovered from the healthy replica in IDC B to other healthy nodes in IDC A, restoring a two-replica state (one replica in IDC A and one in IDC B).
- Then, if IDC A still has nodes that do not hold replicas and have sufficient space, data will be recovered again from the healthy replica in IDC A to other healthy nodes in IDC A, eventually restoring a three-replica state (two replicas in IDC A and one in IDC B).
- If IDC A has no nodes that both have sufficient storage space and do not already hold replicas, but IDC B has nodes with sufficient space that do not hold replicas, data will be recovered from the healthy replica in IDC B to other healthy nodes in IDC B, eventually restoring a three-replica state (one replica in IDC A and two replicas in IDC B).
- If nodes in IDC A do not have sufficient storage space, data will be recovered from the healthy replica in IDC B to other healthy nodes in IDC B, restoring a two-replica state (both replicas in IDC B).
- If it is not possible to restore the data to a three-replica state due to insufficient storage space, the cluster will continuously report high space usage alerts, and persistent data recovery failures will be observed.
|
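The recovery cascade for the one-healthy-replica case above can be summarized as a small decision sketch. The function name and the free-node counts are assumptions for illustration; the real recovery engine is internal to the storage layer:

```python
# Decision sketch: starting from one healthy replica in IDC B, plan the
# copies needed to get back toward three replicas.
def recovery_plan(free_nodes_a: int, free_nodes_b: int) -> list[str]:
    """free_nodes_x: nodes in that IDC with enough space that do not
    already hold a replica of this extent."""
    plan = []
    if free_nodes_a >= 1:
        # First restore cross-IDC redundancy.
        plan.append("copy B -> A (two replicas: 1 in A, 1 in B)")
        if free_nodes_a >= 2:
            # A second free node in A restores the normal layout.
            plan.append("copy A -> A (three replicas: 2 in A, 1 in B)")
        elif free_nodes_b >= 1:
            # Otherwise place the third replica in B.
            plan.append("copy B -> B (three replicas: 1 in A, 2 in B)")
    elif free_nodes_b >= 1:
        plan.append("copy B -> B (two replicas: both in B)")
    # A plan that stops short of three replicas corresponds to the
    # space-alert / persistent-recovery-failure state described above.
    return plan
```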
| Failure of nodes in IDC A hosting the Meta Leader or ZooKeeper Leader role |
- If the failed master nodes include the ZooKeeper Leader or the Meta Leader, all virtual machines performing I/O will observe a single period of high storage latency (≤ 9 s), but no I/O errors will occur.
- Virtual machine HA behavior and data recovery determination are the same as those observed in storage node failure scenarios.
- If all master nodes in IDC A fail, operations such as creating or editing cluster virtual machines and virtual volumes will experience slight delays (typically at the millisecond level).
|
| Failure of all nodes in IDC A |
- If sufficient available resources exist on nodes in IDC B, all virtual machines will fail over (HA) to IDC B. If resources in IDC B are insufficient, the virtual machines to be recovered first are determined by their configured high-availability priority. The recovery order among virtual machines with the same priority is not fixed.
- All data will be restored to a two-replica state within IDC B whenever possible. If the reserved space in IDC B is insufficient to hold all data, the capacity-tier space on each node may be filled up to 100% during recovery, while the performance-tier space is filled only to approximately 90% so that normal business I/O writes can continue.
- When the capacity tier space within a single availability zone cannot accommodate all data, data in the performance tier cannot be demoted to the capacity tier. If the availability-zone-wide failure persists for an extended period and no manual intervention is taken to delete excess data, virtual machines that do not use thick provisioning may encounter exceptions due to insufficient writable space.
- It is therefore recommended to plan the capacity of a single IDC so that it can hold all cluster data at two replicas.
|
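The sizing recommendation above amounts to a simple check: a surviving IDC must be able to hold all logical data at two replicas. The function name and the default usable ratio are illustrative assumptions; adjust the ratio to your hardware and reservation policy:

```python
# Rough sizing check: can one IDC absorb all cluster data at two replicas?
def idc_can_absorb_all_data(logical_data_tb: float,
                            idc_raw_capacity_tb: float,
                            usable_ratio: float = 0.9) -> bool:
    """usable_ratio models reserved/overhead space (assumed 90% usable)."""
    required = logical_data_tb * 2           # two replicas in one IDC
    usable = idc_raw_capacity_tb * usable_ratio
    return usable >= required
```

For example, 100 TB of logical data needs at least ~223 TB of raw capacity per IDC at a 90% usable ratio.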
| Node failure in IDC B |
- Virtual machines continue to run stably on nodes in IDC A and will not trigger HA.
- If a virtual machine is currently writing data and one replica resides on a failed node, a single instance of I/O latency may be observed (typically at the millisecond level, with a maximum of no more than 7 seconds).
- Since IDC B usually holds only one data replica, for data identified as abnormal (data updated during the failure triggers recovery within 1 minute; data not updated triggers recovery after 10 minutes), the system reads from the healthy replicas in IDC A and recovers to other healthy nodes in IDC B, restoring the normal three-replica layout (two replicas in IDC A and one in IDC B).
|
| Failure of witness node C |
- If the cluster was in a stable, healthy state before the witness node failure (the cluster ensures that the witness node does not serve as the ZooKeeper Leader), business operations will not be affected.
- If the cluster had experienced other failures and had only just recovered before the current fault, there is a small chance the witness node holds the ZooKeeper Leader role. In this case, a brief I/O latency may be observed, typically at the millisecond level and not exceeding 6 seconds.
|
| Multiple node failures across availability zones, with no more than 50% of critical nodes (masters plus the witness) failed |
- The HA behavior of the virtual machine is determined by its configured primary availability zone and placement group rules, and is handled in the same manner as failures on ordinary nodes.
- There is a small chance that all replicas of certain data reside on failed nodes. In this case, the associated virtual machines cannot start normally due to inaccessible data, and data recovery cannot be triggered because no healthy replicas exist. Normal startup and the data recovery process can only proceed once the failure is resolved and at least one data replica is restored to a healthy state.
|
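The two conditions in this row can be expressed as a minimal sketch. Both function names, and the literal reading of the ≤ 50% threshold, are assumptions for illustration (with an odd number of critical nodes this matches the usual majority rule):

```python
# Survivability sketch: the cluster stays manageable while no more than
# half of the critical nodes (masters plus the witness) have failed.
def cluster_has_quorum(total_critical: int, failed_critical: int) -> bool:
    return failed_critical * 2 <= total_critical

# Data availability sketch: an extent stays readable while at least one
# of its replicas sits on a healthy node; if all replicas are on failed
# nodes, the VM cannot start and recovery has no source to copy from.
def data_available(replica_nodes: set[str], failed_nodes: set[str]) -> bool:
    return not replica_nodes <= failed_nodes
```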