Replacing an SSD (VMware ESXi)

Replacing a SAS or SATA SSD

Refer to Replacing an SSD (AVE).

Replacing an NVMe SSD

Applicable scenario

The steps described in this section apply only to replacing an NVMe SSD on ACOS (VMware ESXi) cluster hosts that meet all the following conditions:

Procedure

Preparing the vmpctl file

  1. Download the compatible vmpctl file based on the CPU architecture and operating system of your local host.

  2. Rename the vmpctl file and grant executable permissions.

    • On Linux or macOS, rename the file to vmpctl, go to the directory that contains it, and run chmod a+x vmpctl to grant executable permissions.
    • On Windows, rename the file to vmpctl.exe, then either disable your antivirus software or add vmpctl.exe to its allowlist so that the file is not mistakenly deleted.
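
On Linux or macOS, the rename and permission steps above can be scripted. The following is a minimal sketch, not part of the product: the function name prepare_vmpctl and the download file name in the usage comment are hypothetical, since the actual file name depends on the package you downloaded.

```shell
#!/bin/sh
# prepare_vmpctl FILE
# Renames the downloaded file to "vmpctl" in the same directory and
# makes it executable for all users (chmod a+x).
prepare_vmpctl() {
    src=$1
    dir=$(dirname -- "$src")
    # Skip the rename when the file already has the expected name.
    if [ "$(basename -- "$src")" != "vmpctl" ]; then
        mv -- "$src" "$dir/vmpctl" || return 1
    fi
    chmod a+x "$dir/vmpctl"
}

# Usage (hypothetical download name):
#   prepare_vmpctl ./vmpctl-download
```

Calling the function again on a file that is already named vmpctl only reapplies the permission, so it is safe to rerun.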

Hot removing an NVMe SSD

  1. In a terminal on your local host, go to the directory that contains the vmpctl file and run the following command to hot remove the NVMe SSD from the SCVM via the vSphere API.

    The ESXi PCI ID of the NVMe SSD can be obtained based on the SSD name, as described in Obtaining the ESXi PCI ID corresponding to the NVMe passthrough disk name.

    ./vmpctl remove --address=<ESXI_IP> --username=<ESXI_USER> --password=<ESXI_PASSWORD> --vm=<SCVM_NAME> --device=<ESXI_PCI_ID>

    Parameter        Description
    ESXI_IP          IP address of the ESXi host
    ESXI_USER        Username of the ESXi host
    ESXI_PASSWORD    Password of the ESXi host
    SCVM_NAME        Name of the SCVM
    ESXI_PCI_ID      ESXi PCI ID of the NVMe SSD

    Note:

    • When the local host is running macOS, a pop-up message may appear indicating that the vmpctl file cannot be opened. This is likely due to the file being automatically blocked by the system. In this case, click the ? icon in the upper-right corner of the pop-up window and follow the instructions to open the file.
    • If the username or password of the ESXi host contains special characters and the local host is running Linux or macOS, you need to enclose the username or password in single quotes (') when running the command.
    • If the command output displays remove device success, it indicates that the device has been successfully removed.
    • If the output displays remove device failed, it indicates that the removal failed. You can run the command dmesg -T | tail on the ESXi host and check the kernel logs to determine whether hardware incompatibility exists. Resolve this issue and try again.
  2. Optional: Run the lsblk, lspci, or mdadm command within the SCVM to verify whether the NVMe SSD has been successfully removed.

    • Run lsblk -bld to view all disk information. If the output does not contain the removed disk, it indicates that the disk has been successfully removed.

    • Run lspci to view all PCI devices. If the output does not contain the removed disk, it indicates that the disk has been successfully removed.

    • Run mdadm -D on the RAID array that contains the NVMe SSD to view the array information. For example, if the array is named md127, run mdadm -D /dev/md127. If the output shows the device state as removed, the disk has been successfully removed.

  3. Physically remove the NVMe SSD from the server.
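
The quoting and result-checking notes in step 1 can be illustrated with two small helpers for Linux or macOS. This is a sketch under stated assumptions: shell_quote and vmpctl_result are hypothetical names, not vmpctl features, and the only output strings relied on are the success and failure messages quoted in the notes above. A value that itself contains a single quote cannot be wrapped in plain single quotes, so shell_quote rewrites each embedded quote as '\''.

```shell
#!/bin/sh
# shell_quote STRING
# Prints STRING wrapped in single quotes; embedded single quotes are
# rewritten as '\'' so the result is one literal shell word.
shell_quote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# vmpctl_result ACTION OUTPUT
# Maps captured vmpctl output to success, failed, or unknown, based on
# the "<action> device success" / "<action> device failed" messages.
vmpctl_result() {
    case $2 in
        *"$1 device success"*) echo success ;;
        *"$1 device failed"*)  echo failed ;;
        *)                     echo unknown ;;
    esac
}

# Usage (all values are placeholders, as in the command above):
#   out=$(./vmpctl remove --address=<ESXI_IP> --username='<ESXI_USER>' \
#         --password='<ESXI_PASSWORD>' --vm=<SCVM_NAME> --device=<ESXI_PCI_ID> 2>&1)
#   [ "$(vmpctl_result remove "$out")" = success ] || echo "$out" >&2
```

Capturing both standard output and standard error (2>&1) is a precaution, since the source does not state which stream vmpctl writes its messages to.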

Hot adding an NVMe SSD

  1. Physically install the NVMe SSD into the server.

  2. Use SSH to log in to the ESXi host where the NVMe SSD is located, and run the following command to obtain all NVMe SSD information. Identify the corresponding disk based on Physical Slot and Slot Description, and obtain its ESXi PCI ID from the ID shown in the Address field.

    esxcli hardware pci list --class=0x0108

  3. In a terminal on your local host, go to the directory that contains the vmpctl file and run the following command to hot add the NVMe SSD to the SCVM.

    ./vmpctl add --address=<ESXI_IP> --username=<ESXI_USER> --password=<ESXI_PASSWORD> --vm=<SCVM_NAME> --device=<ESXI_PCI_ID>

    Parameter        Description
    ESXI_IP          IP address of the ESXi host
    ESXI_USER        Username of the ESXi host
    ESXI_PASSWORD    Password of the ESXi host
    SCVM_NAME        Name of the SCVM
    ESXI_PCI_ID      ESXi PCI ID of the NVMe SSD

    Note:

    • When the local host is running macOS, a pop-up message may appear indicating that the vmpctl file cannot be opened. This is likely due to the file being automatically blocked by the system. In this case, click the ? icon in the upper-right corner of the pop-up window and follow the instructions to open the file.
    • If the username or password of the ESXi host contains special characters and the local host is running Linux or macOS, you need to enclose the username or password in single quotes (') when running the command.

    • If the command output displays add device success, it indicates that the device has been successfully added.
    • If the output displays add device failed, it indicates that the addition failed. You can run the command dmesg -T | tail on the ESXi host and check the kernel logs to determine whether hardware incompatibility exists. Resolve this issue and try again.
  4. Follow the steps below to mount the disk.

    • If the replaced NVMe SSD is a cache disk containing a metadata partition (tiered storage mode), or a data disk containing a metadata partition (non-tiered storage mode), run the following command in the SCVM, where nvme1n1 refers to the name of the NVMe SSD to be mounted.

      zbs-deploy-manage mount-disk /dev/nvme1n1 smtx_system

      If the output displays Mount disk: /dev/nvme1n1 success!, it indicates that the disk is successfully mounted.

    • If the replaced NVMe SSD is a cache disk containing no metadata partitions (tiered storage mode), run the following command in the SCVM, where nvme2n1 refers to the name of the NVMe SSD to be mounted.

      zbs-deploy-manage mount-disk /dev/nvme2n1 cache

      If the output shows Mount disk: /dev/nvme2n1 success!, it indicates that the disk is successfully mounted.

  5. Run the lsblk command in the SCVM to verify that the newly installed NVMe SSD is successfully partitioned.
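
The lookup and mount steps above can be sketched as a few shell helpers. Everything named here is an assumption for illustration: the function names and the role labels metadata and cache are hypothetical, and nvme_pci_summary assumes that esxcli prints each field as "Name: value" on its own line. Only the field names, the zbs-deploy-manage arguments, and the success message come from the procedure itself.

```shell
#!/bin/sh
# nvme_pci_summary
# Reads `esxcli hardware pci list --class=0x0108` output on stdin and keeps
# only the Address, Physical Slot, and Slot Description lines.
nvme_pci_summary() {
    grep -E '^[[:space:]]*(Address|Physical Slot|Slot Description):'
}

# mount_disk_cmd DEVICE ROLE
# Prints the mount command for the disk role (labels are hypothetical):
#   metadata -> disk that contains a metadata partition
#   cache    -> cache disk that contains no metadata partitions
mount_disk_cmd() {
    case $2 in
        metadata) echo "zbs-deploy-manage mount-disk $1 smtx_system" ;;
        cache)    echo "zbs-deploy-manage mount-disk $1 cache" ;;
        *)        echo "unknown role: $2" >&2; return 1 ;;
    esac
}

# mount_succeeded DEVICE OUTPUT
# Succeeds when OUTPUT contains "Mount disk: DEVICE success!".
mount_succeeded() {
    case $2 in
        *"Mount disk: $1 success!"*) return 0 ;;
        *)                           return 1 ;;
    esac
}

# Usage:
#   On the ESXi host:  esxcli hardware pci list --class=0x0108 | nvme_pci_summary
#   In the SCVM:       out=$($(mount_disk_cmd /dev/nvme1n1 metadata) 2>&1)
#                      mount_succeeded /dev/nvme1n1 "$out" && lsblk
```

Printing the command rather than running it directly lets you review the device name and partition type before anything is mounted.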