Prerequisites
You have downloaded the ACOS upgrade ISO image and the corresponding metadata JSON file.
You have obtained root privileges. For details about privileges, see Local accounts on ACOS nodes.
Precautions
During the cluster upgrade, the elf-vm-monitor and elf-vm-watchdog services are temporarily suspended, which triggers corresponding alerts. These alerts can be ignored: both services resume automatically once the upgrade is complete.
If the upgrade process is interrupted, the elf-vm-monitor and elf-vm-watchdog services remain stopped. In this case, do not start the elf-vm-monitor service on any node until the cause of the interruption has been identified. After you re-execute the upgrade command and the upgrade completes successfully, the system automatically resumes both services.
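If you want to confirm the state of these services after an interruption, one option is to query them directly. This is a minimal sketch that assumes the services are managed as systemd units, which may differ on your platform:
# Check the current state of the suspended services (assumes systemd-managed units)
systemctl status elf-vm-monitor
systemctl status elf-vm-watchdog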
Procedure
You only need to perform the following command-line operations on one node of the cluster to upgrade the entire cluster.
Log in to the node system of the corresponding virtualization platform as the root user, and use a file transfer tool to upload the target version's upgrade ISO file and the corresponding metadata JSON file to a node in the cluster.
For example, suppose the management IP address of a cluster node is 192.168.75.101. You can use Xshell or a similar tool to upload the target version's upgrade ISO file and the corresponding metadata JSON file to the root directory of the 192.168.75.101 node before upgrading, as shown in the sketch below.
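If you prefer a command-line transfer, an scp invocation along these lines also works, assuming the root user's home directory /root as the destination; the file names here are placeholders for your actual ISO and metadata files:
# Upload the upgrade ISO and its metadata JSON to the node (placeholder file names)
scp acos-upgrade.iso acos-upgrade-metadata.json root@192.168.75.101:/root/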
Run the following commands to mount the image file and configure the yum repository in upgrade.repo.
# Create the /mnt/iso directory
mkdir /mnt/iso
# Mount the image to the /mnt/iso directory
mount -o loop new_iso_file_name.iso /mnt/iso
# Create the /etc/yum.repos.d/bk directory
mkdir /etc/yum.repos.d/bk
# Move all repository files under the /etc/yum.repos.d/ directory to the /etc/yum.repos.d/bk/ directory
mv /etc/yum.repos.d/*.repo /etc/yum.repos.d/bk/
# Create the upgrade.repo file under the /etc/yum.repos.d/ directory
touch /etc/yum.repos.d/upgrade.repo
# Use the vi /etc/yum.repos.d/upgrade.repo command to open the file and write the following content:
[upgrade-local-iso]
name=acos
baseurl=file:///mnt/iso
gpgcheck=0
enabled=1
After running the commands, check the /etc/yum.repos.d directory to ensure that it contains only the upgrade.repo file and that the file's content matches the content above.
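For example, the check can be done as follows:
# Confirm that upgrade.repo is the only repository file left
ls /etc/yum.repos.d/
# Confirm the repository definition matches the content above
cat /etc/yum.repos.d/upgrade.repo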
To install or update the cluster-upgrade rpm package, run the following commands.
yum clean all
yum install -y cluster-upgrade
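To confirm that the package was installed, you can query rpm:
# Verify that the cluster-upgrade package is now installed
rpm -q cluster-upgrade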
Enter the /usr/share/upgrade/runner/ directory and execute the following script, where <iso path> is the absolute path of the upgrade ISO image for the target version and <metadata path> is the absolute path of its corresponding metadata file.
nohup python cluster_upgrader.py --iso_path <iso path> --metadata_path <metadata path> &
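For example, assuming the files were uploaded to /root (the file names below are placeholders for your actual ISO and metadata files):
# Change into the runner directory and launch the upgrader in the background
cd /usr/share/upgrade/runner/
nohup python cluster_upgrader.py --iso_path /root/acos-upgrade.iso --metadata_path /root/acos-upgrade-metadata.json &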
While the script is running, run the tail -f nohup.out command to view the upgrade logs in real time.
If the cluster upgrade is successful, the log will display Cluster upgrade successful.
If the cluster upgrade fails, first identify and resolve the cause of the failure based on the relevant logs, then run the upgrade again. The new attempt continues from where the last failed upgrade left off.
Information:
- During the upgrade of the ACOS (VMware ESXi) cluster, the I/O routing does not switch automatically.
- A prompt like "There is another tool upgrade process." or "There are other cluster upgrade processes in progress." might appear in the upgrade logs.
- If this is caused by another simultaneous upgrade attempt, abort the current upgrade. A quick way to check for a running upgrade process is sketched after this list.
- If not, log in to the node system corresponding to your virtualization platform and execute the cluster-upgrade clear_upgrade_event command to clear the upgrade failure information, and then try upgrading the cluster again.
- If the upgrade fails, you may continue upgrading after resolving the related issues. The system will automatically skip the completed steps and resume from the last failed step.
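To check whether another upgrade process is actually running on the node, one option is to look for the upgrader script in the process list; the bracketed pattern keeps grep from matching itself:
# List any running cluster_upgrader.py processes on this node
ps -ef | grep '[c]luster_upgrader.py'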