Meta is the distributed metadata service of ABS. It provides and manages all metadata required by the cluster: the state of cluster members, the descriptions of all volumes, the mapping between volumes and data blocks, and the storage location of each data block within the cluster. Meta's core function is to keep this metadata accurate and consistent across a distributed architecture, so that normal cluster operation and data management can rely on it.
Multiple Meta instances form a Meta cluster. Each Meta stores a complete replica of the cluster metadata in its local LevelDB database, which ensures the reliability of the metadata.
The Meta nodes use ZooKeeper for leader election: exactly one Meta in the cluster is elected leader, and the others act as Meta followers. As the cluster manager, the Meta leader communicates with all access and LSM modules and collects their status information to make decisions on data recovery, data migration, and other tasks. Each follower serves as a backup node for the leader and does not perform data management operations itself. If the Meta leader node fails, the follower nodes elect a new Meta leader.
Meta builds a distributed MetaDB on top of LevelDB and ZooKeeper to manage all metadata. When MetaDB receives an update request, each metadata update operation is recorded as an operation log entry in the ZooKeeper cluster. Each Meta follower retrieves the latest log entries from ZooKeeper and replays them against its local LevelDB, which keeps the data difference between the Meta follower and the Meta leader small. Before a newly elected Meta leader starts serving requests, it brings its data fully up to date through the data comparison and synchronization mechanisms within MetaDB.
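The log-and-replay flow described above can be illustrated with a small in-memory sketch. It is not the MetaDB implementation: `OpLog` stands in for the operation log kept in ZooKeeper, the `store` dictionary stands in for a local LevelDB, and all class, method, and key names are hypothetical. As a simplification, the leader here also applies updates by replaying the log.

```python
from dataclasses import dataclass, field


@dataclass
class OpLog:
    """Hypothetical stand-in for the ordered operation log kept in ZooKeeper."""
    entries: list = field(default_factory=list)

    def append(self, op, key, value=None):
        self.entries.append((op, key, value))
        return len(self.entries)  # sequence number of the new entry


@dataclass
class MetaReplica:
    """Hypothetical stand-in for one Meta node; `store` plays the local LevelDB."""
    store: dict = field(default_factory=dict)
    applied: int = 0  # number of log entries already replayed locally

    def catch_up(self, log):
        """Replay every log entry this replica has not applied yet."""
        for op, key, value in log.entries[self.applied:]:
            if op == "put":
                self.store[key] = value
            elif op == "delete":
                self.store.pop(key, None)
            self.applied += 1
        return self.applied


log = OpLog()
leader, follower = MetaReplica(), MetaReplica()

# Updates are logged first, then applied by replaying the log.
log.append("put", "volume/v1", "size=10GiB")
log.append("put", "volume/v2", "size=20GiB")
log.append("delete", "volume/v1")
leader.catch_up(log)

# The follower lags until it pulls and replays the missing entries.
lag_before = leader.applied - follower.applied
follower.catch_up(log)
lag_after = leader.applied - follower.applied
```

Tracking an `applied` position per replica is what makes catch-up incremental: a follower that is promoted to leader only needs to replay the entries after its own position before serving requests.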
ZooKeeper is a widely used open-source component in distributed systems, designed to ensure consistency and avoid split-brain issues caused by network faults.
A split-brain condition occurs when a network fault splits a cluster that normally operates as a single coherent unit into two independent groups. Each node keeps running normally, but because the groups can no longer communicate, each one mistakenly concludes that the nodes in the other group have failed. Multiple nodes then attempt to read and write the same data simultaneously, which eventually corrupts the data.
ZooKeeper prevents split-brain issues through a strict leader election mechanism: one node in the cluster must be elected leader, and at any given time there can be only one leader in the entire cluster. When a network fault splits a cluster of N nodes into two sub-clusters, only a sub-cluster with more than N/2 nodes can continue to operate. If the original leader is not in that sub-cluster, the sub-cluster holds a new election to choose a leader; if the original leader is still in it, no re-election is necessary. A sub-cluster with N/2 or fewer nodes cannot form a majority, so it cannot elect a leader of its own.
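The majority rule can be made concrete with a short sketch. It only models the decision described in the text and is not ZooKeeper code; the function names and the node names `zk1` to `zk5` are hypothetical.

```python
def has_quorum(partition_size, cluster_size):
    """True if the partition holds a strict majority (more than N/2 nodes)."""
    return 2 * partition_size > cluster_size


def partition_outcome(partition, old_leader, cluster_size):
    """Describe what one sub-cluster does after a network split."""
    if not has_quorum(len(partition), cluster_size):
        return "no quorum: cannot elect a leader"
    if old_leader in partition:
        return "keeps existing leader"
    return "elects new leader"


nodes = {"zk1", "zk2", "zk3", "zk4", "zk5"}
side_a = {"zk1", "zk2"}   # minority side; contains the original leader zk1
side_b = nodes - side_a   # majority side with three nodes

outcome_a = partition_outcome(side_a, "zk1", len(nodes))
outcome_b = partition_outcome(side_b, "zk1", len(nodes))
```

Because two disjoint sub-clusters can never both hold more than N/2 nodes, at most one side ever passes `has_quorum`, which is why two leaders cannot be active at the same time.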
Nodes that run the ZooKeeper service are called master nodes, and each master node runs one ZooKeeper instance. Depending on the scale of the cluster, a deployment can use either three or five master nodes. When the cluster has no more than four nodes, you can deploy three master nodes; any one master node can then fail without affecting the cluster, because the remaining two still form a majority. When the cluster has five or more nodes, you can deploy five master nodes; any two master nodes can then fail, because the remaining three still form a majority.
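The fault-tolerance figures follow directly from the majority requirement, as the sketch below shows. The function names are hypothetical, and the lower bound of three nodes is an assumption (three master nodes need at least three hosts); the text itself states only the four-node and five-node thresholds.

```python
def tolerated_failures(masters):
    """Largest number of master nodes that can fail while a majority survives."""
    return (masters - 1) // 2


def recommended_masters(cluster_size):
    """Sizing rule from the text: 3 masters up to four nodes, 5 masters beyond."""
    if cluster_size < 3:
        # Assumption: fewer than three nodes cannot host three master nodes.
        raise ValueError("cluster needs at least three nodes")
    return 3 if cluster_size <= 4 else 5


# Master count and tolerated failures for a few cluster sizes.
sizing = {
    size: (recommended_masters(size), tolerated_failures(recommended_masters(size)))
    for size in (3, 4, 5, 8)
}
```

An even master count adds no tolerance: `tolerated_failures(4)` equals `tolerated_failures(3)`, which is consistent with the text recommending only odd counts.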
LevelDB is an open-source embedded key-value (KV) storage library that Meta uses to persist metadata locally.