Because key services such as ZooKeeper and Meta run on the master nodes, master node failures can affect the availability of the ACOS cluster.
A failure of a single master node generally does not affect the availability of the ACOS cluster. Details are as follows:
ZooKeeper Leader and Meta Leader failures
A failure of the ZooKeeper Leader or Meta Leader triggers a re-election of that leader. During the re-election, virtual machines may experience minor I/O anomalies.
ZooKeeper Follower and Meta Follower failures
ZooKeeper Follower and Meta Follower nodes do not actively provide metadata services. If they fail, no re-election is triggered for ZooKeeper or Meta, so virtual machines will not experience I/O interruptions.
Additionally, because the master node also runs the ABS Chunk service, a master node failure also affects ACOS cluster storage. For the effects of this type of failure, see Storage node failures.
Failures of multiple master nodes can affect the availability of the ACOS cluster. If more than half of the master nodes in the ACOS cluster become unavailable, metadata services are disrupted, the cluster cannot function normally, and virtual machine I/O stops.
For example, in 3-node and 5-node ACOS clusters, the impact of failed master nodes on the cluster is shown in the table below:
| Master node count | Failed master node count | Failure impact |
| --- | --- | --- |
| 3 | 1 | Cluster can operate normally |
| 3 | ≥ 2 | Cluster is completely unavailable |
| 5 | ≤ 2 | Cluster can operate normally |
| 5 | ≥ 3 | Cluster is completely unavailable |
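The "more than half" rule behind the table can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not an ACOS tool or API: the function names `cluster_available` and `max_tolerated_failures` are hypothetical. The sketch follows the rule literally as stated above, and it makes no claim about how ACOS behaves when exactly half of an even number of master nodes fail, since the text does not cover that case.

```python
def cluster_available(master_count: int, failed_count: int) -> bool:
    """Return True if the cluster can still operate normally.

    Hypothetical helper: it applies the documented rule that the cluster
    becomes unavailable once MORE than half of the master nodes have failed.
    """
    if master_count <= 0:
        raise ValueError("master_count must be positive")
    if not 0 <= failed_count <= master_count:
        raise ValueError("failed_count must be between 0 and master_count")
    # "More than half failed" is equivalent to failed_count * 2 > master_count.
    return failed_count * 2 <= master_count


def max_tolerated_failures(master_count: int) -> int:
    """Largest number of failed master nodes that still satisfies the rule."""
    if master_count <= 0:
        raise ValueError("master_count must be positive")
    return master_count // 2


if __name__ == "__main__":
    # Reproduce the rows of the table for 3 and 5 master nodes.
    for masters in (3, 5):
        for failed in range(1, masters + 1):
            state = "normal" if cluster_available(masters, failed) else "unavailable"
            print(f"masters={masters} failed={failed} -> {state}")
```

For odd master counts such as 3 and 5, the result matches the table: one failure is tolerated with 3 master nodes, and two failures are tolerated with 5.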