During long-term cluster operation, data must be recovered after hardware failures (such as physical disk or memory faults) and after maintenance operations (such as service node restarts). The ABS data recovery mechanism aims to keep data safe while making full use of hardware capabilities to speed up recovery. This shortens the time data spends in a degraded state, minimizes the impact on business I/O, and avoids the critical business disruptions that cluster-wide recovery storms can cause.
When an exception occurs, ABS prioritizes recovering the following data to improve data safety:
Data with a higher degree of loss. For example, data that expects 2 replicas and has lost one has a higher recovery priority than data that expects 3 replicas and has lost one.
High-priority data specified by the user, such as data designated for volume pinning.
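The two priority rules above can be sketched as a sort key. This is only an illustrative sketch, not ABS's actual scheduler: the `DegradedExtent` type, its field names, and the choice to rank by loss ratio first and use pinning as a tie-breaker are all assumptions, since the text does not say how the two rules are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DegradedExtent:
    """Hypothetical record for a piece of data that has lost replicas."""
    name: str
    expected_replicas: int
    healthy_replicas: int
    pinned: bool = False  # assumed flag for user-specified high priority


def recovery_key(extent):
    # Fraction of expected replicas that are lost:
    # 1 of 2 lost -> 0.5, 1 of 3 lost -> about 0.33.
    lost = extent.expected_replicas - extent.healthy_replicas
    loss_ratio = lost / extent.expected_replicas
    # Higher loss sorts first; among equal loss, pinned data sorts first.
    return (-loss_ratio, not extent.pinned)


def order_for_recovery(extents):
    """Return extents in the order they would be recovered."""
    return sorted(extents, key=recovery_key)
```

With one extent at 2 of 3 replicas, one at 1 of 2 replicas, and one pinned extent at 2 of 3 replicas, this ordering recovers the 1-of-2 extent first, then the pinned extent, then the remaining one.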
For temporary, recoverable failures (such as interruptions caused by network fluctuations, or data shards becoming inaccessible because a node or service restarts), ABS records the shard's data state at the moment it excludes the abnormal shard. After the node or Chunk service becomes accessible again, the system first replays the merged data increments generated during the exclusion period onto the removed shard. These increments are usually much smaller than the full data set, which significantly reduces the time required for data recovery.
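The incremental approach described above can be illustrated with a small in-memory model. It is a sketch under stated assumptions, not ABS's on-disk format: the `ReplicaTracker` class, its block-level granularity, and the dictionary used to merge increments are all hypothetical.

```python
class ReplicaTracker:
    """Toy model: one healthy replica plus one shard that may be excluded."""

    def __init__(self, num_blocks):
        self.healthy = [b""] * num_blocks
        self.removed = None  # state of the excluded shard, frozen at removal
        self.delta = {}      # block index -> latest data written while excluded

    def exclude_shard(self):
        # Record the shard's data state at the moment it is removed.
        self.removed = list(self.healthy)
        self.delta = {}

    def write(self, block, data):
        self.healthy[block] = data
        if self.removed is not None:
            # Repeated writes to one block merge into a single increment.
            self.delta[block] = data

    def restore_shard(self):
        """Replay only the merged increments; return (shard, blocks_replayed)."""
        if self.removed is None:
            raise RuntimeError("no shard is currently excluded")
        for block, data in self.delta.items():
            self.removed[block] = data
        replayed = len(self.delta)
        shard, self.removed, self.delta = self.removed, None, {}
        return shard, replayed
```

In this model, three writes that touch two distinct blocks during the exclusion period result in two replayed blocks, rather than a copy of every block in the shard.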
Each ABS node monitors the following two statuses in real time:
Hardware capabilities: This includes the capacities of physical disks and the network, which determine the upper and lower limits for handling recovery traffic.
User business I/O traffic.
Based on the above statuses, ABS provides two flow control strategies to meet the needs of different scenarios:
AUTO (Intelligent adjustment): Default mode. To protect business I/O, the system adaptively adjusts data recovery speed based on the current business load. When the workload is heavy, the system automatically reduces the recovery speed to maintain I/O stability.
STATIC (Static adjustment): Allows users to manually set the upper limit of the data recovery speed. To protect normal I/O operations, users can set a smaller value (for example, the default recovery speed limit). For faster recovery, they can set a larger value (such as 500 MB/s). This setting applies to all Chunks under Meta; it cannot be configured for an individual Chunk.
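The difference between the two strategies can be sketched as a function that returns the current recovery speed limit. The function name, its parameters, and the "spare bandwidth" formula used for AUTO are assumptions made for illustration; the text only states that AUTO lowers recovery speed as business load rises and that STATIC uses a user-supplied cap.

```python
def recovery_limit_mbps(mode, hw_max_mbps, hw_min_mbps, business_mbps,
                        static_limit_mbps=None):
    """Return a recovery speed limit in MB/s (hypothetical model).

    hw_max_mbps / hw_min_mbps stand in for the upper and lower limits
    derived from disk and network capability.
    """
    if mode == "STATIC":
        if static_limit_mbps is None:
            raise ValueError("STATIC mode needs a user-supplied limit")
        # The user's cap applies, but cannot exceed what the hardware allows.
        return min(static_limit_mbps, hw_max_mbps)
    if mode == "AUTO":
        # Assumed policy: give recovery whatever business I/O leaves free,
        # clamped between the hardware-derived lower and upper limits.
        spare = hw_max_mbps - business_mbps
        return max(hw_min_mbps, min(hw_max_mbps, spare))
    raise ValueError(f"unknown mode: {mode}")
```

Under this model, a heavier business load directly shrinks the AUTO limit down to the lower bound, while the STATIC limit stays fixed regardless of load.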
During routine cluster operation, it may be necessary to temporarily stop a node or a storage service for tasks such as upgrading a service, restarting the kernel, or replacing hardware. ABS provides a node maintenance mode for these operations. After a node enters maintenance mode, cold data on it does not trigger data recovery, even if it is not reported as normal for a relatively long period. For hot data, if the data remains in a relatively safe state after the node's shards are excluded (for example, under a 3-replica policy, excluding one replica leaves 2 healthy replicas), data recovery is not triggered immediately. Instead, ABS waits until the maintenance operation completes, the node returns to normal, and the node exits maintenance mode. The data is then restored efficiently through incremental recovery, which minimizes unnecessary recovery operations.
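The hot-data decision in maintenance mode can be sketched as a single predicate. This is a simplified illustration: the function name and the `min_safe_replicas` threshold of 2 are assumptions drawn from the 3-replica example above, and the sketch does not model cold data or the incremental recovery that follows maintenance.

```python
def should_recover_now(in_maintenance, expected_replicas, healthy_replicas,
                       min_safe_replicas=2):
    """Decide whether hot data needs immediate recovery (hypothetical rule)."""
    if healthy_replicas >= expected_replicas:
        return False  # nothing is degraded
    if not in_maintenance:
        return True   # degraded outside maintenance: recover
    # In maintenance mode, defer recovery while the data is still
    # "relatively safe", assumed here to mean at least min_safe_replicas.
    return healthy_replicas < min_safe_replicas
```

For example, with a node in maintenance mode, 3-replica data left with 2 healthy replicas is deferred, whereas 2-replica data left with 1 healthy replica is recovered right away under this assumed threshold.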