The VM-based workload cluster contains a control plane node group and at least one worker node group; nodes in the same group share the same configuration.
If the VM-based workload cluster being created requires a worker node group whose node count type is Autoscaling, enable Node group autoscaling so that you can set the autoscaling range for that worker node group in subsequent steps.
Note:
You can also enable or disable Node group autoscaling after the VM-based workload cluster is created.
Set the node group name.
Specify the number of nodes for the node group. You can choose 1, 3, or 5. To ensure control plane high availability for the VM-based workload cluster, it is recommended to select 3 or 5.
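The 3-or-5 recommendation follows from quorum arithmetic: a control plane of n nodes stays available as long as a majority (n // 2 + 1) of nodes survive. A minimal sketch of that arithmetic only (not the product's internal logic):

```python
def tolerated_failures(n: int) -> int:
    """Number of control plane node failures a group of n nodes can
    survive while keeping a quorum (majority) of n // 2 + 1 nodes."""
    return (n - 1) // 2

for n in (1, 3, 5):
    print(f"{n} node(s) -> tolerates {tolerated_failures(n)} failure(s)")
# 1 node tolerates 0 failures; 3 tolerate 1; 5 tolerate 2.
```

This is why a single-node control plane offers no high availability, while 3 or 5 nodes do.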
Configure the resources for each node within the node group.
| Parameter | Description |
|---|---|
| CPU | The number of vCPUs allocated per node in the group. The default is 4 vCPUs, with a minimum allocation of 2 vCPUs. |
| Memory | The amount of memory allocated per node in the group. The default is 8 GiB, with a minimum allocation of 6 GiB. |
| Storage | The amount of storage allocated per node in the group, which is the disk capacity for the corresponding virtual machine. The default is 200 GiB and cannot be modified. |
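The limits in the table above can be expressed as a simple pre-check. The function below is illustrative only (its name and shape are not a product API); the limits mirror the table: CPU >= 2 vCPUs, memory >= 6 GiB, storage fixed at 200 GiB.

```python
def check_control_plane_node(cpu_vcpus: int, memory_gib: int,
                             storage_gib: int) -> list[str]:
    """Validate per-node resources against the documented limits.
    Returns a list of violations; an empty list means the config is valid."""
    errors = []
    if cpu_vcpus < 2:
        errors.append("CPU must be at least 2 vCPUs")
    if memory_gib < 6:
        errors.append("Memory must be at least 6 GiB")
    if storage_gib != 200:
        errors.append("Storage is fixed at 200 GiB")
    return errors

print(check_control_plane_node(4, 8, 200))  # the defaults -> []
```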
(Optional) If the node group contains more than one node, you can enable Faulty node auto replacement. Once enabled, as long as the proportion of faulty nodes in the node group does not exceed the maximum faulty node percentage, the system automatically removes the faulty nodes and creates new ones. Enabling this feature requires configuring the following parameters.
| Parameter | Description |
|---|---|
| Faulty node detection | Criteria for determining faulty nodes. Tick the required conditions and set the duration threshold. |
| Maximum percentage of faulty nodes | If the percentage of faulty nodes in the group is within this maximum, the system automatically replaces them. The value cannot exceed 40%. For example, with 5 nodes in the group and a setting of 40%, up to 2 faulty nodes are replaced automatically; if more than 2 nodes are faulty, none are replaced. |
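The worked example above (5 nodes at 40% → at most 2 nodes replaced) is floor arithmetic. A minimal sketch, assuming the cutoff is the floor of total nodes × percentage:

```python
import math

def max_replaceable(total_nodes: int, max_faulty_pct: int) -> int:
    """Largest number of faulty nodes still auto-replaced, given the
    configured maximum percentage of faulty nodes (e.g. 40)."""
    return math.floor(total_nodes * max_faulty_pct / 100)

print(max_replaceable(5, 40))  # -> 2, matching the example in the table
```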
You need to create at least one worker node group.
Set the node group name.
Specify the number of nodes for the node group.
| Mechanism | Description |
|---|---|
| Automatic addition | |
| Automatic reduction | The triggering of this behavior depends on whether GPU devices are configured within the node group. |
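Whichever mechanism triggers a change, the node count stays within the autoscaling range configured for the node group. The clamp itself is simple; this is an illustrative sketch, not the autoscaler's actual code:

```python
def clamp_node_count(desired: int, min_nodes: int, max_nodes: int) -> int:
    """Keep the autoscaler's desired node count inside the configured
    autoscaling range [min_nodes, max_nodes]."""
    return max(min_nodes, min(desired, max_nodes))

print(clamp_node_count(7, 2, 5))  # scale-up capped at the range maximum -> 5
print(clamp_node_count(1, 2, 5))  # scale-down floored at the range minimum -> 2
```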
Configure the resources for each node within the node group.
| Parameter | Description |
|---|---|
| CPU | The number of vCPUs allocated per node in the group. The default is 4 vCPUs, with a minimum of 4 vCPUs. |
| Memory | The amount of memory allocated per node in the group. The default is 8 GiB, with a minimum of 8 GiB. |
| GPU (shown only when the hosts in the ACOS cluster where the workload cluster resides have available GPU devices) | Per-node GPU configuration; not configured by default. If you need to configure GPU devices, choose Passthrough or vGPU based on the plan made when confirming the use of GPU devices in Requirements for using GPU devices, then set the model and quantity of passthrough GPU devices or vGPUs for each node. |
| Storage | The amount of storage allocated per node in the group, which is the disk allocation capacity for the corresponding virtual machine. Default is 200 GiB and cannot be modified. |
(Optional) Enable Faulty node auto replacement. Once enabled, when the number of faulty nodes in the node group satisfies the faulty node count limit, the system automatically deletes the faulty nodes and creates new ones. Enabling this feature requires setting the following parameters:
| Parameter | Description |
|---|---|
| Faulty node detection | Conditions for determining node failure. Select the required conditions and set the duration threshold. |
| Faulty node count limit | Limit on the number of faulty nodes that can trigger automatic node replacement. You need to choose one of the available dimensions for the limit. |
You need to configure the default account password or SSH public key for accessing nodes in the VM-based workload cluster.
The password for the default account admin; leave it blank if you are not configuring a password. You need to enter the password twice to confirm it.