    ACOS 6.2.0

Arcfra system disk failures

The Arcfra system disk is the hardware RAID 1 group built from the two physical disks that you select during ACOS image installation when you choose to install ACOS on an independent hardware RAID 1. These two physical disks are the member disks on which the operating system is installed.

This section provides guidance only on identifying and handling failures of an Arcfra system disk. If you choose to install ACOS on a software RAID 1 when installing the ACOS image file, refer to Physical disk failures to handle any failure of cache disks with metadata partitions used for installing the operating system.

Failure types

| Main type | Subtype and description |
| --- | --- |
| Hardware RAID failure | The member physical disk in the hardware RAID group experiences read or write errors or high latency, and has been marked as a failed component. |
| S.M.A.R.T. test failed | The member physical disk in the hardware RAID group failed the S.M.A.R.T. test. |
| Short lifespan | The member physical disk in the hardware RAID group shows no signs of read or write timeout, high I/O latency, or damage, but is determined by the system to have an insufficient lifespan and may soon pose a risk. |

If any of the following alerts appears on the main AOC Alert page, a hardware RAID failure has occurred on the cluster node. In the alert messages, each placeholder in braces, such as { host name }, stands for the actual value displayed by the system. Follow the recommended operation listed for the alert.

| Alert message | Default alert level | Recommended operation |
| --- | --- | --- |
| The hardware RAID virtual disk { virtual disk name } on the host { host name } is abnormal: { abnormal status }. | Critical | Correct the RAID group configuration under the controller, or replace the problematic disk that caused the issue with a new one. |
| The physical disk { disk serial number } in hardware RAID on the host { host name } failed the S.M.A.R.T. test. | Critical | Replace the corresponding physical disk. |
| The hardware RAID virtual disk { virtual disk name } on the host { host name } has insufficient redundancy. | Notice | Install a new disk in an empty slot, or replace the unrecognized disk with a new one. |
| The physical disk { disk serial number } in hardware RAID on the host { host name } has a remaining lifespan below { alert threshold }. | Notice | Replace the corresponding physical disk. |
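
For teams that triage alerts from exported text, the alert table above can be sketched as a small lookup. This is a minimal illustration, not part of ACOS or AOC: the names `AlertRule`, `RULES`, and `recommend` are hypothetical, and the regular expressions assume that the exported messages follow the templates in the table word for word, which may not hold in your environment.

```python
import re
from typing import NamedTuple, Optional


class AlertRule(NamedTuple):
    """One row of the alert table: message pattern, level, operation."""
    pattern: re.Pattern
    level: str
    operation: str


# Assumption: these patterns mirror the alert templates in the table above.
# The exact text emitted by AOC may differ, so adjust them before relying on them.
RULES = [
    AlertRule(
        re.compile(r"hardware RAID virtual disk .+ on the host .+ is abnormal"),
        "Critical",
        "Correct the RAID group configuration under the controller, "
        "or replace the problematic disk with a new one.",
    ),
    AlertRule(
        re.compile(
            r"physical disk .+ in hardware RAID on the host .+ "
            r"failed the S\.M\.A\.R\.T\. test"
        ),
        "Critical",
        "Replace the corresponding physical disk.",
    ),
    AlertRule(
        re.compile(
            r"hardware RAID virtual disk .+ on the host .+ "
            r"has insufficient redundancy"
        ),
        "Notice",
        "Install a new disk in an empty slot, "
        "or replace the unrecognized disk with a new one.",
    ),
    AlertRule(
        re.compile(
            r"physical disk .+ in hardware RAID on the host .+ "
            r"has a remaining lifespan below"
        ),
        "Notice",
        "Replace the corresponding physical disk.",
    ),
]


def recommend(message: str) -> Optional[AlertRule]:
    """Return the first rule whose pattern matches the message, else None."""
    for rule in RULES:
        if rule.pattern.search(message):
            return rule
    return None


if __name__ == "__main__":
    # Hypothetical sample message; the serial number and host name are made up.
    sample = (
        "The physical disk S123 in hardware RAID on the host node-1 "
        "failed the S.M.A.R.T. test."
    )
    rule = recommend(sample)
    if rule is not None:
        print(rule.level, "-", rule.operation)
```

Keeping the rules in a flat list makes it easy to compare them against the table row by row when alert wording changes between ACOS versions.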

When a related failure occurs:

  • In most cases, you only need to replace the problematic physical disk, and the RAID controller automatically handles related configurations.

  • In certain special cases, you may need to reconfigure the RAID settings via the BMC or BIOS interface of the server.

You can check the status of the Arcfra system disk and its member physical disks via AOC or the BMC system to determine whether a hardware RAID failure has occurred.

  • AOC

    You can check the status of the Arcfra system disk and its member physical disks on the host overview page or the system disk details page of AOC.

  • BMC system

    You can use the BMC interface of the server to check the model of the storage controller, as well as the status and details of the RAID group and physical disks under the storage controller.

    The Arcfra Hardware Compatibility Checker helps you check which storage controllers are supported by ACOS. The following table lists some common storage controllers used for the Arcfra system disk and their corresponding BMC systems. Refer to the vendor documentation for detailed instructions since the operation interfaces vary with different BMC systems.

    | Vendor | BMC system | RAID controller |
    | --- | --- | --- |
    | Dell PowerEdge | iDRAC | Marvell SATA M.2 (Dell BOSS-S1/S2); Marvell NVMe M.2 (Dell BOSS-N1) |
    | Lenovo ThinkSystem | XClarity | Marvell SATA M.2 (ThinkSystem M.2 SATA 2-Bay RAID); Marvell NVMe M.2 (ThinkSystem M.2 NVMe 2-Bay RAID) |
    | xFusion | iBMC | Broadcom MegaRAID (LSI SAS3XXX) |