Before installing and deploying Arcfra Cloud Operating System (ACOS), you should first familiarize yourself with the relevant concepts. This will help you better understand the installation and deployment process, as well as the functionalities of ACOS.
SCVM
A Storage Controller Virtual Machine (SCVM) is a virtual machine used to install ACOS when ACOS is deployed with VMware ESXi in a hyperconverged architecture. It manages the physical disks on its host, including SSDs and HDDs, which are then centrally managed by Arcfra Block Storage (ABS). SCVMs can be organized into a distributed storage cluster over the network, providing storage services to virtual machines on VMware ESXi hosts.
Hyperconvergence
Hyperconvergence is an IT infrastructure approach that streamlines the deployment, management, and maintenance of resources by integrating compute virtualization, distributed storage, and networking.
VMware ESXi
VMware ESXi is a bare-metal hypervisor and a core component of VMware vSphere. It is installed and runs directly on physical servers. In this document, VMware ESXi refers to the virtualization platform that runs the ESXi hypervisor.
Active-active cluster
An active-active cluster in ACOS uses a stretched architecture with two availability zones and a witness node. These availability zones are typically located in separate sites within the same city: one as the primary zone and the other as the secondary. Each zone has at least three nodes that independently handle compute and storage services. Under normal conditions, workloads run mainly in the primary zone.
To manage leader elections and enable automatic failover between availability zones, an additional witness node is required. This witness node, which does not store data, can be a physical or virtual machine and must be in a different location from the availability zones.
The availability zones and the witness node communicate over networks. If one zone fails, the other takes over to provide services, ensuring disaster recovery at the zone level.
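The role of the witness node can be illustrated with a simple majority-quorum check. The sketch below is a hypothetical illustration, not ACOS's actual election protocol: it treats the two availability zones and the witness as three voters, and a zone may continue serving only while it can reach a majority of them.

```python
# Hypothetical quorum sketch (not ACOS's actual election algorithm): three
# voters, and a zone needs a majority (2 of 3) to hold leadership.
VOTERS = {"primary", "secondary", "witness"}

def can_serve(zone: str, reachable: set) -> bool:
    """Return True if `zone` sees a majority of voters (itself included)."""
    visible = (reachable | {zone}) & VOTERS
    return len(visible) >= 2

# Normal operation: the primary zone reaches everyone.
print(can_serve("primary", {"secondary", "witness"}))   # True
# Primary zone fails: the secondary still reaches the witness and takes over.
print(can_serve("secondary", {"witness"}))              # True
# A partition isolates the secondary: it cannot form a majority, so it stops.
print(can_serve("secondary", set()))                    # False
```

This is why the witness must sit in a third location: if it shared a site with one zone, losing that site would leave the surviving zone unable to form a majority.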
Master node
In an ACOS cluster, a master node is a node that runs metadata services.
Storage node
In an ACOS cluster, a storage node is a node that does not run metadata services.
Replication factor
The replication factor defines how many replicas of your data are stored. This redundancy policy helps improve data reliability and security by distributing replicas across multiple hosts.
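The capacity trade-off is straightforward: with a replication factor of N, every logical byte occupies N physical bytes. A minimal illustration follows; the function name is ours for illustration, not an ACOS API, and the arithmetic ignores metadata and other overhead.

```python
def usable_capacity(raw_tb: float, replication_factor: int) -> float:
    """With N replicas, each logical byte occupies N physical bytes,
    so usable capacity is roughly raw capacity divided by N."""
    return raw_tb / replication_factor

# A 300 TB raw pool:
print(usable_capacity(300, 2))  # 150.0 TB usable with 2 replicas
print(usable_capacity(300, 3))  # 100.0 TB usable with 3 replicas
```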
Erasure coding
Erasure coding is a method that creates M parity blocks from K original data blocks using specific algorithms. This technique allows data to be rebuilt from the surviving data and parity blocks if a node or disk fails, offering fault tolerance. In ACOS, it is available only when tiered storage is selected during cluster deployment.
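As a minimal, hypothetical illustration of the idea (not the coding scheme ACOS actually uses), the K = 3, M = 1 case can be built with simple XOR parity, which can rebuild any single lost block:

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

data = [b"AAAA", b"BBBB", b"CCCC"]   # K = 3 data blocks
parity = xor_blocks(data)            # M = 1 parity block

# Simulate losing one data block, then rebuild it from the survivors + parity.
lost_index = 1
survivors = [blk for i, blk in enumerate(data) if i != lost_index]
rebuilt = xor_blocks(survivors + [parity])
assert rebuilt == data[lost_index]
```

Production systems use stronger codes (e.g. Reed-Solomon) so that M > 1 failures can be tolerated, but the storage efficiency is the same idea: K/(K+M) of raw capacity holds data, versus 1/N under N-way replication.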
Capacity specification
ACOS offers two capacity specifications based on the physical disk capacity limit per host. The normal capacity specification supports up to 128 TB of physical disk capacity per host, while the large capacity specification supports up to 256 TB per host.
Tiered storage
Tiered storage separates storage into two layers: cache and capacity. The capacity layer holds all data across nodes, storing cold data using replicas or erasure codes based on redundancy settings.
In hybrid-flash or all-flash configurations with different SSD types, cache and data disks are separate. The cache layer is split into a write cache for new data and a read cache for frequently accessed data, with an 8:2 ratio between them. Data initially goes into the write cache and is moved to the capacity layer as it cools.
In all-flash configurations with a single SSD type, cache and data share physical disks: each disk is split into a cache partition and a data partition at a 1:9 ratio, and the cache is used only for newly written data. Replicated volumes use only the capacity layer.
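The partition ratios above reduce to simple arithmetic. The helper names below are illustrative, not part of ACOS:

```python
def hybrid_cache_split(cache_tb: float):
    """Dedicated cache disks: 8:2 split into write cache and read cache."""
    return cache_tb * 0.8, cache_tb * 0.2

def shared_disk_split(disk_tb: float):
    """Shared SSDs (single SSD type): 1:9 split into cache and data partitions."""
    return disk_tb * 0.1, disk_tb * 0.9

# 10 TB of dedicated cache disks -> 8 TB write cache, 2 TB read cache.
print(hybrid_cache_split(10))
# A 10 TB shared SSD -> 1 TB cache partition, 9 TB data partition.
print(shared_disk_split(10))
```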
Non-tiered storage
In non-tiered storage, there is no cache layer. All disks, except those for the system partition, are used solely for data storage. This setup is available only in all-flash configurations.
Boost mode
Designed for higher performance, the Boost mode in ACOS enables memory sharing between the Guest OS, QEMU, and ABS through the vhost protocol, which can improve the I/O performance of virtual machines.
RDMA
Remote Direct Memory Access (RDMA) is a technology that enables direct memory access between nodes, bypassing the operating system kernels of both nodes. This allows data to be transferred directly over the network, saving CPU resources, boosting system throughput, and reducing network latency. ACOS supports RDMA to minimize write latency between remote nodes, enhancing both latency and throughput for optimal cluster performance.
SR-IOV
Single Root I/O Virtualization (SR-IOV) is a hardware-based virtualization solution that enhances performance and scalability.
SR-IOV allows efficient sharing of PCIe (Peripheral Component Interconnect Express) devices among virtual machines. Since it is implemented in hardware, it can achieve I/O performance comparable to that of a bare-metal host.
A PCIe device with SR-IOV enabled, along with appropriate hardware and OS support, can appear as multiple separate physical devices, allowing multiple virtual machines to share a single I/O resource.
In ACOS, the SR-IOV passthrough feature virtualizes a single SR-IOV-capable port into multiple virtual NICs, which can be assigned to virtual machines, thereby significantly reducing network latency.
Volume pinning
ACOS supports volume pinning, a feature that keeps data of virtual volumes (with AVE) or Network File System (NFS) files (with VMware ESXi) in the cache layer, delivering higher and more stable performance.
LACP dynamic link aggregation
Link Aggregation Control Protocol (LACP) is a protocol based on the IEEE 802.3ad standard that implements dynamic link aggregation and disaggregation. It is one of the common protocols for link aggregation. In a link aggregation group, LACP-enabled member ports interact by exchanging LACP PDUs (Protocol Data Units) to explicitly agree on which ports can send and receive packets, ultimately determining the links that will carry the application traffic. In addition, when the aggregation conditions change, such as when a link fails, the LACP mode will automatically adjust the links in the aggregation group, allowing other available member links in the group to take over the failed link, thereby maintaining load balancing. Therefore, LACP can also increase the logical bandwidth between devices and improve network reliability without the need for hardware upgrades.
VAAI-NAS plugin
VAAI-NAS is a plugin from Arcfra that allows for thick provisioning and fast cloning of files when ACOS is integrated with VMware ESXi in a hyperconverged architecture.
Deployment control node
When deploying an active-active cluster, you need to access the management IP address of a node in either the primary availability zone or the secondary availability zone through a browser to enter the cluster deployment page. This node serves as the control node during cluster deployment, and is therefore referred to as the deployment control node.