    ACOS 6.2.0

Upgrading physical disk firmware

The procedure for upgrading the firmware of a physical disk depends on two things: whether the upgrade solution provided by the hardware vendor requires a node restart, and whether the disk to be upgraded is a system disk (including the Arcfra system disk). Follow the procedure below that matches your case.
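
The choice among the three procedures below can be sketched as a small shell helper. The function name choose_procedure and its yes/no arguments are hypothetical. The mapping of a required node restart to the offline procedure is inferred from the section titles and is not stated explicitly in this document.

```shell
# Print the title of the procedure that appears to match the situation.
# Usage: choose_procedure <restart_required: yes|no> <system_disk: yes|no>
# Assumption: a required node restart always means the offline procedure.
choose_procedure() {
    case "$1:$2" in
        yes:yes|yes:no) echo "Upgrading physical disk firmware offline" ;;
        no:yes)         echo "Upgrading system disk firmware online" ;;
        no:no)          echo "Upgrading non-system disk firmware online" ;;
        *)
            echo "usage: choose_procedure <yes|no> <yes|no>" >&2
            return 2
            ;;
    esac
}
```

For example, choose_procedure no yes prints the title of the online procedure for system disks.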

Upgrading physical disk firmware offline

  1. Log in to AOC and set the node whose disk firmware is to be upgraded to maintenance mode.

  2. Shut down the node in AOC.

  3. Upgrade the firmware. After confirming that the firmware version has been updated to the target version, start the server.

  4. Log in to AOC and exit maintenance mode for this node.

  5. Log in to this node using SSH and run the following command to check whether all services are running properly.

    sudo /usr/share/tuna/script/control_all_services.sh --action=status --group=role

  6. On any node in the cluster, run the following command to check whether data recovery is complete:

    sudo zbs-meta pextent find need_recover

    If the message No PExtents found. is returned, the data recovery is complete. Otherwise, wait for a while and check again.
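
The "wait for a while and check again" part of the last step can be automated with a polling loop. The following is a minimal sketch, not an official tool: the function names, the check interval, and the retry limit are illustrative assumptions. Only the command and the No PExtents found. message come from the procedure above.

```shell
# Succeed when the captured output contains the completion message.
recovery_complete() {
    case "$1" in
        *"No PExtents found."*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: wait_for_recovery <interval_seconds> <max_checks> [command ...]
# If no command is given, the command from the procedure is run.
# The optional command argument exists only so the loop can be tried
# without a cluster (hypothetical convenience, not part of the product).
wait_for_recovery() {
    wr_interval="$1"
    wr_max="$2"
    shift 2
    [ "$#" -gt 0 ] || set -- sudo zbs-meta pextent find need_recover

    wr_i=1
    while [ "$wr_i" -le "$wr_max" ]; do
        # Capture the output; a failing command is treated as "not complete yet".
        wr_out="$("$@" 2>&1)" || true
        if recovery_complete "$wr_out"; then
            echo "Data recovery is complete."
            return 0
        fi
        echo "Recovery still in progress (check $wr_i of $wr_max)."
        sleep "$wr_interval"
        wr_i=$((wr_i + 1))
    done
    echo "Gave up waiting for data recovery." >&2
    return 1
}
```

Calling wait_for_recovery 60 30 would check once a minute for up to 30 minutes. Adjust both values to the amount of data the cluster has to recover.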

Upgrading system disk firmware online

  1. Log in to AOC and set the node whose disk firmware is to be upgraded to maintenance mode.

  2. On the command-line terminal of the node, run the following command to stop the Chunk service:

    sudo systemctl stop zbs-chunkd

  3. Then run the following command to verify that the Chunk service has been stopped:

    sudo systemctl status zbs-chunkd

    The output below indicates that the Chunk service is stopped. Ensure that the value of Active is inactive (dead):

     zbs-chunkd.service - Chunk service
         Loaded: loaded (/usr/lib/systemd/system/zbs-chunkd.service; enabled; vendor preset: disabled)
        Drop-In: /etc/systemd/system/zbs-chunkd.service.d
                 └─cgroup.conf, delegate.conf
         Active: inactive (dead) since Fri 2024-12-20 12:08:24 CST; 42s ago
        Process: 180462 ExecStart=/usr/share/zbs/bin/zbs_run_service.sh zbs/others /usr/sbin/zbs-chunkd --foreground (code=exited, status=0/SUCCESS)
       Main PID: 180462 (code=exited, status=0/SUCCESS)
         Status: "Starting event loop..."

    Information:

    At this point, AOC displays an alert indicating that the storage service health status is abnormal. This is expected behavior.

  4. Upgrade the firmware. After the upgrade is complete, confirm that the firmware version has been updated to the target version.

  5. On the command-line terminal of the node, run the following command to start the Chunk service:

    sudo systemctl start zbs-chunkd

  6. Then run the following command to verify that the Chunk service has been started as expected:

    sudo systemctl status zbs-chunkd

    The output below indicates that the Chunk service has started as expected. Ensure that the value of Active is active (running):

     zbs-chunkd.service - Chunk service
           Loaded: loaded (/usr/lib/systemd/system/zbs-chunkd.service; enabled; vendor preset: disabled)
          Drop-In: /etc/systemd/system/zbs-chunkd.service.d
                   └─cgroup.conf, delegate.conf
           Active: active (running) since Fri 2024-12-20 12:45:12 CST; 3s ago
          Process: 132708 ExecStartPre=/usr/share/zbs/bin/zbs_config_rdma_qos.sh dscp $CHUNK_SERVER_ACCESS_QOS_MODE (code=exited, status=0/SUCCESS)
         Main PID: 132712 (zbs-chunkd)
           Status: "Starting event loop..."
            Tasks: 21
           Memory: 112.0M
           CGroup: /smtx.slice/smtx-zbs.slice/smtx-zbs-chunkd.slice/zbs-chunkd.service
                   └─132712 /usr/sbin/zbs-chunkd --foreground

  7. Wait until the AOC alert indicating that the storage service health status is abnormal has cleared.

  8. Exit maintenance mode for this node in AOC.

  9. On any node in the cluster, run the following command to check whether data recovery is complete:

    sudo zbs-meta pextent find need_recover

    If the message No PExtents found. is returned, the data recovery is complete. Otherwise, wait for a while and check again.
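
The two verification steps above (3 and 6) both come down to reading the Active field of the systemctl status output. The following sketch shows one way to check that field in a script. The function names are hypothetical, and the pattern assumes the output format shown in the samples above.

```shell
# Read `systemctl status` output on stdin and print the Active state,
# for example "inactive (dead)" or "active (running)".
chunkd_active_state() {
    sed -n 's/^ *Active: \([a-z-]* ([a-z-]*)\).*/\1/p'
}

# Succeed only if the state read from stdin equals the expected one.
# Usage: sudo systemctl status zbs-chunkd | expect_chunkd_state "inactive (dead)"
expect_chunkd_state() {
    ecs_actual="$(chunkd_active_state)"
    if [ "$ecs_actual" = "$1" ]; then
        echo "zbs-chunkd is $ecs_actual"
    else
        echo "expected '$1' but found '${ecs_actual:-unknown}'" >&2
        return 1
    fi
}
```

After stopping the service, pipe the status output into expect_chunkd_state "inactive (dead)". After starting it again, use "active (running)". If only the high-level state is needed, systemctl is-active zbs-chunkd prints it directly.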

Upgrading non-system disk firmware online

  1. On the command-line terminal of the node where the physical disk firmware needs to be upgraded, run the following command to put the node into storage maintenance mode:

    zbs-meta chunk set_maintenance <cid> true [--expire_duration_s <EXPIRE_DURATION_S>]

    • <cid>: Replace it with the actual Chunk ID.
    • [--expire_duration_s <EXPIRE_DURATION_S>]: Optional. The time, in seconds, after which storage maintenance mode expires. If omitted, it defaults to the maximum of 43200 seconds (12 hours).

    If the output shows that both the ID and the IP match the host whose disk firmware is to be upgraded, and the value of Maintenance Mode is True, the host has entered storage maintenance mode. Wait 10 seconds after storage maintenance mode is enabled before continuing.

  2. On the command-line terminal of the node, run the following command to stop the Chunk service:

    sudo systemctl stop zbs-chunkd

  3. Then run the following command to verify that the Chunk service has been stopped:

    sudo systemctl status zbs-chunkd

    The output below indicates that the Chunk service is stopped. Ensure that the value of Active is inactive (dead):

     zbs-chunkd.service - Chunk service
         Loaded: loaded (/usr/lib/systemd/system/zbs-chunkd.service; enabled; vendor preset: disabled)
        Drop-In: /etc/systemd/system/zbs-chunkd.service.d
                 └─cgroup.conf, delegate.conf
         Active: inactive (dead) since Fri 2024-12-20 12:08:24 CST; 42s ago
        Process: 180462 ExecStart=/usr/share/zbs/bin/zbs_run_service.sh zbs/others /usr/sbin/zbs-chunkd --foreground (code=exited, status=0/SUCCESS)
       Main PID: 180462 (code=exited, status=0/SUCCESS)
         Status: "Starting event loop..."

    Information:

    At this point, AOC displays an alert indicating that the storage service health status is abnormal. This is expected behavior.

  4. Upgrade the firmware. After the upgrade is complete, confirm that the firmware version has been updated to the target version.

  5. On the command-line terminal of the node, run the following command to start the Chunk service:

    sudo systemctl start zbs-chunkd

  6. Then run the following command to verify that the Chunk service has been started as expected:

    sudo systemctl status zbs-chunkd

    The output below indicates that the Chunk service has started as expected. Ensure that the value of Active is active (running):

     zbs-chunkd.service - Chunk service
           Loaded: loaded (/usr/lib/systemd/system/zbs-chunkd.service; enabled; vendor preset: disabled)
          Drop-In: /etc/systemd/system/zbs-chunkd.service.d
                   └─cgroup.conf, delegate.conf
           Active: active (running) since Fri 2024-12-20 12:45:12 CST; 3s ago
          Process: 132708 ExecStartPre=/usr/share/zbs/bin/zbs_config_rdma_qos.sh dscp $CHUNK_SERVER_ACCESS_QOS_MODE (code=exited, status=0/SUCCESS)
         Main PID: 132712 (zbs-chunkd)
           Status: "Starting event loop..."
            Tasks: 21
           Memory: 112.0M
           CGroup: /smtx.slice/smtx-zbs.slice/smtx-zbs-chunkd.slice/zbs-chunkd.service
                   └─132712 /usr/sbin/zbs-chunkd --foreground

  7. Wait until the AOC alert indicating that the storage service health status is abnormal has cleared.

  8. On the command-line terminal of the node, run the following command to exit storage maintenance mode:

    zbs-meta chunk set_maintenance <cid> false

    Replace <cid> with the actual Chunk ID.

  9. On any node in the cluster, run the following command to check whether data recovery is complete:

    sudo zbs-meta pextent find need_recover

    If the message No PExtents found. is returned, the data recovery is complete. Otherwise, wait for a while and check again.
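
Steps 1 and 8 use the same zbs-meta chunk set_maintenance command with different arguments. The sketch below builds that command line and checks the arguments first. The function name maintenance_cmd is hypothetical. Treating 43200 seconds as an upper bound and rejecting an expiry when leaving maintenance mode are assumptions based on the wording of step 1, not documented behavior of the CLI.

```shell
# Maximum expiry mentioned in step 1: 43200 seconds (12 hours).
MAX_EXPIRE_S=43200

# Print the command for entering or leaving storage maintenance mode.
# Usage: maintenance_cmd <cid> <true|false> [expire_duration_s]
maintenance_cmd() {
    mc_cid="${1:-}"
    mc_enable="${2:-}"
    mc_expire="${3:-}"

    if [ -z "$mc_cid" ]; then
        echo "error: a Chunk ID is required" >&2
        return 1
    fi
    case "$mc_enable" in
        true|false) ;;
        *) echo "error: second argument must be true or false" >&2; return 1 ;;
    esac
    # Without an expiry, the plain form of the command is enough.
    if [ -z "$mc_expire" ]; then
        echo "zbs-meta chunk set_maintenance $mc_cid $mc_enable"
        return 0
    fi
    # Assumption: an expiry is only meaningful when enabling the mode.
    if [ "$mc_enable" != "true" ]; then
        echo "error: an expiry only applies when enabling maintenance mode" >&2
        return 1
    fi
    case "$mc_expire" in
        *[!0-9]*) echo "error: expiry must be a whole number of seconds" >&2; return 1 ;;
    esac
    if [ "$mc_expire" -lt 1 ] || [ "$mc_expire" -gt "$MAX_EXPIRE_S" ]; then
        echo "error: expiry must be between 1 and $MAX_EXPIRE_S seconds" >&2
        return 1
    fi
    echo "zbs-meta chunk set_maintenance $mc_cid $mc_enable --expire_duration_s $mc_expire"
}
```

The function only prints the command, so the result can be reviewed before it is run on the node. For example, maintenance_cmd 3 true 3600 prints the step 1 command with a one-hour expiry for a Chunk ID of 3, where 3 is an illustrative value.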