
Basic concepts

Before deployment and maintenance, familiarize yourself with the following Arcfra Kubernetes Engine (AKE) concepts, which will help you better understand AKE deployment processes and functions.

Kubernetes

Kubernetes is a portable, extensible open-source platform for managing containerized workloads and services that facilitates both declarative configuration and automation. Kubernetes has a large, rapidly growing ecosystem, with a wide range of services, support, and tools.

Workloads

Workloads are applications running on a Kubernetes cluster. Kubernetes provides several types of built-in workload resources, including Deployment, StatefulSet, DaemonSet, Job, and CronJob.

In AKE, user workloads should be deployed in the workload cluster.
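
For example, a minimal Deployment manifest looks like the following sketch; the application name, image, and replica count are illustrative, not values required by AKE:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: demo-app
    spec:
      replicas: 2                    # run two Pod replicas of the workload
      selector:
        matchLabels:
          app: demo-app
      template:
        metadata:
          labels:
            app: demo-app
        spec:
          containers:
          - name: demo-app
            image: nginx:1.25        # illustrative container image
            ports:
            - containerPort: 80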

Node

Kubernetes runs workloads by placing containers into Pods that run on nodes. A node can be either a virtual machine or a physical machine.

A Kubernetes cluster contains two types of nodes: control plane nodes and worker nodes.

  • Control plane node

    Control plane nodes run the control plane components of the Kubernetes cluster and a small number of user workloads. A Kubernetes cluster typically has 1, 3, or 5 control plane nodes.

  • Worker node

    Worker nodes run the node components of the Kubernetes cluster and containerized workloads.

In AKE, each node of the management cluster and each control plane node of a workload cluster is a virtual machine, while worker nodes of a workload cluster can be either virtual machines or physical machines.

Container image

A container image carries binary data that encapsulates an application and all of its software dependencies. It is a lightweight, standalone, executable software package with well-defined assumptions about the runtime environment in which it operates. Developers usually build an application's container image, push it to a container registry, and then reference it in a Pod.
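
For example, a Pod references an image by its registry address, repository, and tag; the registry domain and image name below are illustrative placeholders:

    apiVersion: v1
    kind: Pod
    metadata:
      name: demo-pod
    spec:
      containers:
      - name: app
        # <registry domain>/<repository>:<tag>; the image is pulled from the registry
        image: registry.example.com/team/demo-app:1.0.0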

Container registry

A container registry is a central storage system for storing and managing container images. It provides a centralized location where developers and operations personnel can access, share, and store container images.

In AKE, container registries are divided into the following types according to their purpose:

  • AKE-system-used container registry

    This is the only AKE container registry. It is created on Arcfra Operation Center (AOC) during AKE deployment and stores and manages the container images used by the system.

    When deploying AKE and creating workload clusters, the system pulls the corresponding system images from this registry to build the management cluster and the workload clusters.

  • User-built container registry for Kubernetes workloads

    This type of container registry is used to store and manage container images used by workloads. AKE supports the configuration of multiple trusted container registries for the workload cluster, enabling the workload cluster to pull the required container image from the corresponding container registry.

    Depending on the container registry source, it can be divided into the following types:

    • Internal container registry

      Container registries created in the Container registry interface of AOC can be directly configured for use by workload clusters in AKE to pull corresponding container images.

      In addition, this type of container registry can also be used by other standard Kubernetes clusters. The terms container registry and internal container registry in the following text refer to this type of registry.

    • External container registry

      A container registry deployed by users. It can be configured for workload clusters via its domain name or IP address so that they can pull the corresponding container images.

Control plane virtual IP

The control plane VIP (virtual IP) is an IP address configured for the control plane nodes of the management cluster or a workload cluster. It serves as the entry point for external access to the cluster and automatically forwards external requests to a control plane node, providing high availability for the control plane.

IP pool

Nodes are created dynamically at many points in the lifecycle of management clusters and workload clusters. In AKE, it is recommended to configure static IPs for nodes rather than use DHCP. To avoid configuring a static IP for each node manually, AKE provides the IP pool feature. An IP pool automatically manages a range of IPs and primarily provides IP allocation and reclamation.

  • Management cluster IP pool: Allocates static IPs for management cluster nodes, as well as the control plane VIP.

  • Workload cluster IP pool: Allocates static IPs for workload cluster nodes, as well as the control plane VIP.

Namespace

Namespaces provide a mechanism to divide resources within the same cluster into isolated groups, allowing for the creation and separate management of groups within a cluster as needed.
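
As a minimal sketch, a namespace is declared with a short manifest, and namespaced objects are then created inside it; the names below are illustrative:

    apiVersion: v1
    kind: Namespace
    metadata:
      name: team-a
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: demo-pod
      namespace: team-a              # the Pod is created in the team-a namespace
    spec:
      containers:
      - name: app
        image: nginx:1.25            # illustrative image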

CSI

CSI (Container Storage Interface) defines standard interfaces for storage systems to expose to containers.

Storage class

StorageClass provides a way for administrators to describe a class of storage. It includes fields such as provisioner, parameters, and reclaimPolicy, which are used when PersistentVolumes are dynamically provisioned from the StorageClass; a StorageClass can therefore be understood as a template for dynamic PV provisioning. The concept of a class is sometimes called a profile in other storage systems.
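
For example, a StorageClass manifest declares these fields as follows. The provisioner name and parameters below are placeholders; the actual values depend on the CSI driver installed in the cluster:

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: fast
    provisioner: csi.example.com     # placeholder CSI driver name
    parameters:
      type: ssd                      # driver-specific parameter (illustrative)
    reclaimPolicy: Delete            # delete the backing volume when the PV is released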

PersistentVolume

A PersistentVolume (PV) is a piece of storage within a cluster that functions as a resource at the cluster level, similar to a node. PersistentVolumes can be pre-provisioned by administrators or dynamically provisioned using StorageClass. In AKE, only dynamic provisioning is supported.

PersistentVolumeClaim

A PersistentVolumeClaim (PVC) is a request for storage by a user. It is conceptually similar to a Pod: Pods consume node resources, while PVCs consume PV resources. Just as a Pod can request specific amounts of resources (CPU and memory), a PVC can request a PV of a specific size and access mode.
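
For example, the following PVC requests a 10 GiB volume with the ReadWriteOnce access mode from a StorageClass; the claim and class names are illustrative:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: data-claim
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi              # requested volume size
      storageClassName: fast         # StorageClass used for dynamic provisioning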

Volume mode

The volume mode specifies how a volume is exposed to a Pod. Two modes are available (see the sketch after this list):

  • Filesystem

    Volumes with volumeMode set to Filesystem are mounted into a directory in the Pod.

  • Block

    Setting the volumeMode attribute to block allows the volume to be used as a raw block device.
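
As a sketch, the two modes differ in the volumeMode field of the PVC and in how the Pod consumes the volume: volumeMounts for Filesystem, volumeDevices for Block. The names and device path below are illustrative:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: raw-claim
    spec:
      volumeMode: Block              # Filesystem (the default) mounts a directory instead
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: raw-consumer
    spec:
      containers:
      - name: app
        image: nginx:1.25            # illustrative image
        volumeDevices:               # raw block device instead of volumeMounts
        - name: data
          devicePath: /dev/xvda
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: raw-claim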

Access mode

The access mode specifies how a persistent volume can be accessed. A persistent volume can be mounted on a host in any way supported by the resource provider, so the supported access modes differ depending on the provider's capabilities. The following access modes are available (an example claim is sketched below):

  • ReadWriteOnce

    The volume can be mounted by a single node in read-write mode. The ReadWriteOnce access mode also allows multiple Pods running on the same node to access the volume.

  • ReadOnlyMany

    The volume can be mounted by multiple nodes in read-only mode.

  • ReadWriteMany

    The volume can be mounted by multiple nodes in read-write mode.

  • ReadWriteOncePod

    The volume can be mounted by a single Pod in read-write mode. If you need to ensure that only one Pod in the entire cluster can read or write to that PVC, use the ReadWriteOncePod access mode. This mode only supports CSI volumes and requires Kubernetes version 1.22 or later.

AKE supports three access modes: ReadWriteOnce, ReadOnlyMany, and ReadWriteMany.
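
For example, a claim intended to be mounted read-write by Pods on multiple nodes declares the ReadWriteMany access mode; this requires a StorageClass whose CSI driver supports that mode, and the names below are illustrative:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: shared-claim
    spec:
      accessModes:
      - ReadWriteMany                # mountable read-write by multiple nodes
      resources:
        requests:
          storage: 20Gi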

CNI

CNI (Container Network Interface) defines the standard interface for configuring networks for containers.

External load balancer

The external load balancer provides an externally accessible IP address for Kubernetes Service objects and forwards traffic to the correct port on the cluster nodes.
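
For example, a Service of type LoadBalancer asks the external load balancer for an externally reachable IP address; the Service name, selector, and ports below are illustrative:

    apiVersion: v1
    kind: Service
    metadata:
      name: demo-app
    spec:
      type: LoadBalancer             # request an external IP from the load balancer
      selector:
        app: demo-app                # forward traffic to Pods with this label
      ports:
      - port: 80                     # port exposed on the external IP
        targetPort: 80               # container port receiving the traffic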

Ingress

Ingress is an API object that manages external access to services in a cluster, typically HTTP. Ingress can provide load balancing, SSL termination, and name-based virtual hosting.
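
For example, an Ingress can route HTTP requests for a host name to a backend Service; the host, ingress class, and Service names below are illustrative:

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: demo-ingress
    spec:
      ingressClassName: nginx        # illustrative ingress controller class
      rules:
      - host: demo.example.com
        http:
          paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: demo-app       # backend Service receiving the traffic
                port:
                  number: 80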

Kubeconfig file

A kubeconfig file is a configuration file used by Kubernetes that stores information to access Kubernetes clusters. It includes details such as clusters, users, namespaces, and authentication mechanisms. For example, the kubectl command-line tool can use the kubeconfig file to find and select the required information for the cluster and communicate with the cluster's API server.
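
As a sketch, a kubeconfig file ties together a cluster endpoint (for an AKE cluster, typically the control plane VIP), a user, and a context. The names, address, and credential fields below are placeholders:

    apiVersion: v1
    kind: Config
    clusters:
    - name: demo-cluster
      cluster:
        server: https://<control-plane-vip>:6443
        certificate-authority-data: <base64-encoded-ca>
    users:
    - name: demo-admin
      user:
        client-certificate-data: <base64-encoded-certificate>
        client-key-data: <base64-encoded-key>
    contexts:
    - name: demo-admin@demo-cluster
      context:
        cluster: demo-cluster
        user: demo-admin
    current-context: demo-admin@demo-cluster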

Content library

The content library is an AOC feature that allows the user to store and manage VM templates and ISO images. If the user has more than one Arcfra Cloud Operating System (ACOS) cluster to manage, VM templates and ISO images can be shared across clusters through the content library, provided that the virtualization platform is Arcfra Virtualization Engine (AVE). In AKE, the content library is required to manage and distribute Kubernetes node templates.

GPU driver

The vGPU software package officially released by NVIDIA contains the NVIDIA Virtual GPU Manager and the NVIDIA Graphics Driver. The NVIDIA Graphics Driver is installed in virtual machines that use GPUs so that they can access GPU devices; it is referred to as the "GPU driver" in the following text.

vGPU driver

The vGPU software package officially released by NVIDIA contains NVIDIA Virtual GPU Manager and NVIDIA Graphics Driver. The NVIDIA Virtual GPU Manager is a driver deployed on the virtualization platform to provide vGPU functionality for virtual machines. Arcfra has modified and adapted the NVIDIA Virtual GPU Manager, which is released with ACOS and will be referred to as "vGPU driver" in the following text.

Rolling update cluster

A cluster rolling update refers to the process of updating or upgrading an AKE cluster by gradually creating new cluster nodes to replace the old ones and migrating workloads from the old nodes to the new ones, ensuring that workloads can continue running uninterrupted. If an error occurs during the update or upgrade of the cluster, it can be easily rolled back to the previous state.