The VM-based workload cluster contains a control plane node group and at least one worker node group; all nodes in the same group share the same configuration.
If the VM-based workload cluster being created requires a worker node group with the Autoscaling node count type, enable Node group autoscaling here so that you can set the autoscaling range for that worker node group in a later step.
Note:
You can also enable or disable Node group autoscaling after the VM-based workload cluster is created.
Set the node group name.
Specify the number of nodes for the node group. You can choose 1, 3, or 5. To ensure control plane high availability for the VM-based workload cluster, it is recommended to select 3 or 5.
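The recommendation of 3 or 5 control plane nodes follows the standard quorum reasoning for replicated control planes: the cluster stays available while a majority of control plane nodes is healthy, so odd sizes maximize the failures tolerated per node added. A minimal illustrative sketch (not part of the product):

```python
# Illustrative sketch: why 3 or 5 control plane nodes are recommended.
# A replicated control plane stays available while a majority (quorum)
# of its nodes is healthy.

def quorum(nodes):
    """Minimum number of healthy nodes needed for a majority."""
    return nodes // 2 + 1

def tolerated_failures(nodes):
    """Node failures the control plane can survive."""
    return nodes - quorum(nodes)

# 1 node tolerates 0 failures; 3 nodes tolerate 1; 5 nodes tolerate 2.
for n in (1, 3, 5):
    print(f"{n} nodes -> quorum {quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

This is why a single-node control plane offers no high availability: losing that one node loses the majority.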
Configure the resources for each node within the node group.
| Parameter | Description |
|---|---|
| CPU | The number of vCPUs allocated per node in the group. The default is 4 vCPUs, with a minimum allocation of 2 vCPUs. |
| Memory | The amount of memory allocated per node in the group. The default is 8 GiB, with a minimum allocation of 6 GiB. |
| Storage | The amount of storage allocated per node in the group, which is the disk capacity for the corresponding virtual machine. The default is 200 GiB and cannot be modified. |
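The constraints in the table above can be summarized as a small validation sketch (the function name and shape are hypothetical, not a product API):

```python
# Illustrative validation of control plane node resources, based on the
# table above: CPU >= 2 vCPUs (default 4), memory >= 6 GiB (default 8),
# storage fixed at 200 GiB. Hypothetical helper, not a product API.

def validate_control_plane_node(cpu=4, memory_gib=8, storage_gib=200):
    errors = []
    if cpu < 2:
        errors.append("CPU must be at least 2 vCPUs")
    if memory_gib < 6:
        errors.append("Memory must be at least 6 GiB")
    if storage_gib != 200:
        errors.append("Storage is fixed at 200 GiB and cannot be modified")
    return errors

assert validate_control_plane_node() == []      # defaults are valid
assert validate_control_plane_node(cpu=1) != [] # below the 2 vCPU minimum
```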
(Optional) If the node group contains more than 1 node, you can enable Faulty node auto replacement. Once enabled, as long as the proportion of faulty nodes in the node group does not exceed the maximum faulty node threshold, the system automatically removes the faulty nodes and creates new ones to replace them. Enabling this feature requires configuring the following parameters:
| Parameter | Description |
|---|---|
| Faulty node detection | Conditions for determining faulty nodes. You should tick the required conditions and set the duration threshold. |
| Maximum percentage of faulty nodes | The maximum proportion of faulty nodes in the node group that can be automatically replaced. If the proportion of faulty nodes exceeds this percentage, automatic replacement is not performed. The value cannot be greater than 40%. For example, if the group has 5 nodes and this setting is 40%, then up to 2 faulty nodes are automatically replaced; if there are more than 2 faulty nodes, none are replaced. |
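The replacement threshold above amounts to a floor of the percentage applied to the group size. A short sketch of the arithmetic (hypothetical helper, not a product API):

```python
import math

# Illustrative sketch of the "Maximum percentage of faulty nodes"
# threshold: faulty nodes are replaced automatically only while their
# count stays within max_percent of the group (capped at 40%).
# Hypothetical helper, not a product API.

def max_replaceable(total_nodes, max_percent=40):
    if not 0 < max_percent <= 40:
        raise ValueError("maximum percentage cannot exceed 40%")
    return math.floor(total_nodes * max_percent / 100)

# The worked example from the table: 5 nodes at 40% -> at most 2 faulty
# nodes are replaced automatically; 3 or more faulty nodes are not.
assert max_replaceable(5, 40) == 2
```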
You need to create at least one worker node group.
Set the node group name.
Specify the number of nodes for the node group.
Fixed count: Enter the number of nodes, with a minimum of 1.
Autoscaling (supported only when Node group autoscaling is enabled): Enter the minimum and maximum numbers of nodes to define the node count range. The minimum number of nodes cannot be less than 3. After the cluster is created, the node group starts with the minimum number of nodes, and the node count is then adjusted automatically within the set range. The adjustment mechanisms are as follows:
| Mechanism | Description |
|---|---|
| Automatic addition | |
| Automatic reduction | The triggering of this behavior depends on whether GPU devices are configured within the node group. |
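The range behavior described above (start at the minimum, then only ever adjust within the configured bounds) can be sketched as a simple clamp (hypothetical helper, not a product API):

```python
# Illustrative sketch of the autoscaling node count range: the node
# group starts at the minimum size, and any automatic adjustment is
# clamped to [minimum, maximum]. Hypothetical helper, not a product API.

def clamp_node_count(desired, minimum, maximum):
    if minimum < 3:
        raise ValueError("minimum number of nodes cannot be less than 3")
    if maximum < minimum:
        raise ValueError("maximum cannot be less than minimum")
    return max(minimum, min(desired, maximum))

initial = clamp_node_count(0, minimum=3, maximum=10)  # group starts at the minimum
assert initial == 3
assert clamp_node_count(12, 3, 10) == 10              # scale-out is capped at the maximum
```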
Configure the resources for each node within the node group.
| Parameter | Description |
|---|---|
| CPU | The number of vCPUs allocated per node in the group. The default is 4 vCPUs, with a minimum of 4 vCPUs. |
| Memory | The amount of memory allocated per node in the group. The default is 8 GiB, with a minimum of 8 GiB. |
| GPU (displayed only when the ACOS cluster where the VM-based workload cluster resides has hosts with mounted GPU devices) | GPU configuration per node in the group. Not configured by default. If you need to configure GPU devices, choose "Passthrough" or "vGPU" based on the plan made when confirming the GPU device requirements for the workload cluster, and then set the model and quantity of passthrough GPU devices or vGPUs for each node. Note: In the GPU model drop-down list, hover over the information icon "i" next to the available quantity to view the number of available GPUs on each host. |
| Storage | The amount of storage allocated per node in the group, which is the disk allocation capacity for the corresponding virtual machine. The default is 200 GiB, with a minimum of 200 GiB. |
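As with the control plane table, the worker node constraints above can be summarized as a validation sketch (the function and its parameters are hypothetical, not a product API):

```python
# Illustrative validation of worker node resources, based on the table
# above: CPU >= 4 vCPUs, memory >= 8 GiB, storage >= 200 GiB; GPU is
# optional and, when set, must be "Passthrough" or "vGPU".
# Hypothetical helper, not a product API.

def validate_worker_node(cpu=4, memory_gib=8, storage_gib=200, gpu_mode=None):
    errors = []
    if cpu < 4:
        errors.append("CPU must be at least 4 vCPUs")
    if memory_gib < 8:
        errors.append("Memory must be at least 8 GiB")
    if storage_gib < 200:
        errors.append("Storage must be at least 200 GiB")
    if gpu_mode is not None and gpu_mode not in ("Passthrough", "vGPU"):
        errors.append('GPU mode must be "Passthrough" or "vGPU"')
    return errors

assert validate_worker_node() == []               # defaults are valid
assert validate_worker_node(gpu_mode="vGPU") == []
```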
(Optional) Enable Faulty node auto replacement. Once enabled, when the number of faulty nodes in the node group is within the configured faulty node quantity limit, the system automatically deletes the faulty nodes and creates new ones to replace them. Enabling this feature requires configuring the following parameters:
| Parameter | Description |
|---|---|
| Faulty node detection | Conditions for determining faulty nodes. You should tick the required conditions and set the duration threshold. |
| Faulty node quantity limit | The limit on the number of faulty nodes that can trigger the system to perform node replacement. You need to choose one of the available dimensions for the limit. |
You need to configure the default account password or SSH public key for accessing nodes in the VM-based workload cluster.
Password: the password for the default account admin. Leave blank if not configuring a password. Note that you need to enter the password twice to confirm it.
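The credential rules above can be sketched as a small check: at least one of the password or the SSH public key must be provided, and the two password entries must match (hypothetical helper, not a product API):

```python
# Illustrative sketch of the node access credential rules: configure a
# password for the default account "admin" and/or an SSH public key; a
# password must be entered twice and both entries must match.
# Hypothetical helper, not a product API.

def validate_access(password="", confirm="", ssh_public_key=""):
    errors = []
    if password and password != confirm:
        errors.append("the two password entries do not match")
    if not password and not ssh_public_key:
        errors.append("configure a password or an SSH public key")
    return errors

assert validate_access("s3cret", "s3cret") == []
assert validate_access(ssh_public_key="ssh-ed25519 AAAA... user@host") == []
assert validate_access("s3cret", "typo") != []
```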