In Upgrade Center, select the cluster to upgrade from the cluster list to open the cluster details panel. In the upgrade file row, click Upgrade next to the upgrade file for the target version.
Select the upgrade scope based on your needs. For guidance, refer to Upgrade policies.
Upgrade cluster: If the cluster has never been upgraded before, this option is selected by default and cannot be unchecked. If the cluster has already been upgraded to the latest version, it cannot be selected.
Upgrade kernel: Follow the prompt to decide whether to upgrade the kernel. If you select this option, the kernel upgrade is performed automatically after the cluster upgrade is completed.
Upgrade the kernel for all hosts: If all nodes in the cluster meet the following requirements, this option is selected by default:
The current kernel version is earlier than the target kernel version;
The management IP of the node is not used for the association with AOC;
If the cluster is an Arcfra Cloud Operating System (ACOS) cluster, the node has no GPU devices used for vGPU. If a node does have devices used for vGPU, refer to the corresponding version of Arcfra Cloud Operating System Upgrade Guide to upgrade the kernel version and update the vGPU driver.
Upgrade the kernel of the selected hosts: If some nodes in the cluster do not meet the above requirements, this option is selected by default, and you can choose the nodes whose kernel is to be upgraded.
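The default kernel upgrade option described above can be sketched as a small check. This is an illustrative sketch only, not the product's implementation: the `Node` fields, the function names, and the representation of kernel versions as integer tuples are all hypothetical, and returning the list of eligible nodes is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    # All field names are hypothetical, chosen only for this sketch.
    name: str
    kernel_version: Tuple[int, ...]  # simplified: version compared as an integer tuple
    mgmt_ip_used_for_aoc: bool       # management IP is used for the association with AOC
    has_vgpu_devices: bool           # node has GPU devices used for vGPU


def meets_requirements(node: Node, target_kernel: Tuple[int, ...], is_acos_cluster: bool) -> bool:
    """Apply the three requirements listed above to a single node."""
    if node.kernel_version >= target_kernel:
        return False  # current kernel is not earlier than the target
    if node.mgmt_ip_used_for_aoc:
        return False
    if is_acos_cluster and node.has_vgpu_devices:
        return False  # the vGPU restriction applies to ACOS clusters only
    return True


def default_kernel_scope(nodes: List[Node], target_kernel: Tuple[int, ...], is_acos_cluster: bool):
    """Return the option selected by default and the names of the nodes that qualify."""
    eligible = [n.name for n in nodes if meets_requirements(n, target_kernel, is_acos_cluster)]
    if len(eligible) == len(nodes):
        return "all_hosts", eligible
    return "selected_hosts", eligible
```

For example, in a two-node cluster where one node already runs the target kernel, this sketch returns `"selected_hosts"` together with the name of the other node.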
Choose whether to specify the data recovery rate during the upgrade.
If this option is not checked, the cluster's own data recovery mode setting applies, and the system automatically adjusts the data recovery rate during the upgrade based on the current node's hardware configuration and I/O load.
If this option is checked, you can specify the data recovery rate during the upgrade, which defaults to 400 MiB/s and can be adjusted between 100 and 500 MiB/s.
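The two cases above can be summarized in a short validation sketch. The function and constant names are hypothetical, and treating the 100 to 500 MiB/s range as inclusive is an assumption; the default and the limits come from the text above.

```python
from typing import Optional

DEFAULT_RATE_MIB_S = 400  # default when the option is checked
MIN_RATE_MIB_S = 100      # assumed inclusive lower bound
MAX_RATE_MIB_S = 500      # assumed inclusive upper bound


def resolve_recovery_rate(specify: bool, rate_mib_s: Optional[int] = None) -> Optional[int]:
    """Return the fixed rate in MiB/s, or None when the cluster's own setting applies."""
    if not specify:
        # Option not checked: the cluster's data recovery mode setting is used
        # and the system adjusts the rate automatically.
        return None
    if rate_mib_s is None:
        return DEFAULT_RATE_MIB_S
    if not MIN_RATE_MIB_S <= rate_mib_s <= MAX_RATE_MIB_S:
        raise ValueError(
            f"rate must be between {MIN_RATE_MIB_S} and {MAX_RATE_MIB_S} MiB/s, got {rate_mib_s}"
        )
    return rate_mib_s
```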
(Optional) If you chose to upgrade the kernel, you also need to choose the execution method for migrating virtual machines back.
During the kernel upgrade process, the host enters maintenance mode, and the virtual machines on the node are migrated or shut down. Therefore, you need to choose how these virtual machines are started or migrated back after the host exits maintenance mode. ACOS (AVE) clusters support two methods: automatically and upon your confirmation.
automatically: After the nodes in the cluster exit host maintenance mode, the system will automatically power on or migrate back the virtual machines that were shut down or migrated during the maintenance mode entry process.
upon your confirmation: After the nodes in the cluster exit host maintenance mode, your confirmation is required before the virtual machines that were shut down or migrated during the maintenance mode entry process are started or migrated back. Choose this method if you want to manually migrate back some virtual machines first to verify whether workloads run normally after exiting maintenance mode. After confirming that there are no issues, click Execute to automatically migrate back the remaining virtual machines.
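The difference between the two methods can be sketched as follows. This is a simplified model under stated assumptions, not the product's logic: the enum, the function, and its parameters are hypothetical, and `confirmed` stands in for clicking Execute.

```python
from enum import Enum
from typing import Iterable, List


class MigrateBackMode(Enum):
    AUTOMATICALLY = "automatically"
    UPON_CONFIRMATION = "upon your confirmation"


def vms_to_restore_now(
    displaced_vms: Iterable[str],
    mode: MigrateBackMode,
    manually_restored: Iterable[str] = (),
    confirmed: bool = False,
) -> List[str]:
    """Return the VMs the system would power on or migrate back at this point.

    displaced_vms: VMs shut down or migrated when the host entered maintenance mode.
    manually_restored: VMs the user already migrated back by hand for verification.
    confirmed: whether the user has confirmed (hypothetical stand-in for Execute).
    """
    already_done = set(manually_restored)
    remaining = [vm for vm in displaced_vms if vm not in already_done]
    if mode is MigrateBackMode.AUTOMATICALLY or confirmed:
        return remaining
    return []  # waiting for the user's confirmation
```

With the confirmation method, nothing is restored by the system until `confirmed` is set, which leaves room to migrate back a few virtual machines by hand and check their workloads first.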
Click Upgrade to enter the automatic upgrade process. You can view the specific upgrade progress on the upgrade page.
If the cluster upgrade or kernel upgrade fails, click View log to see the reason for the failure, or click Details next to the upgrade record in the Upgrade record tab of the cluster details panel. After resolving the issue, click Retry to continue the upgrade from the point of failure. When viewing the log of the most recent failed upgrade record, you can click Download log to download the log file to your local machine for problem analysis.
Note:
- To stop the upgrade during the kernel upgrade process, click Stop upgrade. The system stops the upgrade immediately. To continue the upgrade later, click Retry. The upgrade continues from the node where it was stopped.
- When kernel upgrade is selected, some virtual machines may have been migrated or shut down when the node enters maintenance mode. After the upgrade is complete, the system will automatically check if these virtual machines have been restored to their pre-upgrade state and host. If there are virtual machines that have not been powered on or migrated back, you can view the prompt and click Details to compare the virtual machine's status and host information.
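The post-upgrade check in the note above amounts to comparing each virtual machine's state before and after the upgrade. A minimal sketch, assuming a hypothetical `VmState` record with a power state and a host name (the product's actual data model is not described here):

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class VmState:
    # Hypothetical record used only for this sketch.
    power_state: str  # for example "running" or "stopped"
    host: str         # name of the host the VM resides on


def find_unrestored_vms(
    before: Dict[str, VmState], after: Dict[str, VmState]
) -> Dict[str, Tuple[VmState, Optional[VmState]]]:
    """Return VMs whose power state or host differs from the pre-upgrade snapshot."""
    differences = {}
    for vm_name, old_state in before.items():
        new_state = after.get(vm_name)  # None if the VM is missing from the new snapshot
        if new_state != old_state:
            differences[vm_name] = (old_state, new_state)
    return differences
```

Each entry in the result pairs the pre-upgrade state with the current one, which is the kind of side-by-side comparison the Details view is described as offering.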