Viewing volume performance

Viewing performance information for all volumes

Procedure

Run the following command on a cluster node to view performance information for all volumes:

zbs-perf-tools volume list [--chunk-addr <ip>] [--sort_by <sort_by>] [-A]

Parameter            Description
--chunk-addr <ip>    The zbs-chunkd RPC server address. Default: 127.0.0.1:10200, the chunk on the node where the command is run.
--sort_by <sort_by>  Sorts all volume performance information in descending or ascending order based on this field. Only fields related to iops, bw, or latency can be specified. Default: total_iops.
-A, --ascending      Sorts all volume performance information in ascending order by the sort_by field. If not specified, results will be sorted in descending order by default.
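
The options above can be combined in a small wrapper. The following is a minimal sketch and not part of zbs-perf-tools: the function name volume_list_sorted is hypothetical, and it assumes that the valid sort fields are the output fields whose names end in _iops, _bw, or _latency.

```shell
# Hypothetical helper: list volumes sorted by one metric.
#   $1  sort field (default: total_iops)
#   $2  "asc" for ascending order; anything else means descending
# Assumption: sortable fields are recognized by their name suffix.
volume_list_sorted() {
    vls_field="${1:-total_iops}"
    vls_order="${2:-desc}"
    case "$vls_field" in
        *_iops|*_bw|*_latency) ;;
        *) echo "unsupported sort field: $vls_field" >&2; return 1 ;;
    esac
    if [ "$vls_order" = "asc" ]; then
        zbs-perf-tools volume list --sort_by "$vls_field" -A
    else
        zbs-perf-tools volume list --sort_by "$vls_field"
    fi
}
```

For example, volume_list_sorted read_latency asc would run zbs-perf-tools volume list --sort_by read_latency -A.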

Output example

$ zbs-perf-tools volume list
---------------------------------------------------------------------
  volume_id                    dc656bde-8095-4a58-938b-00018a951190
  read_iops                    601.00
  read_avgrq                   262.14 KB(256.00 KiB)
  read_bw                      157.55 MB/s(150.25 MiB/s)
  read_latency                 242.12 US
  splited_read_iops            601.00
  splited_read_latency         208.46 US
  splited_local_read_ratio     1.00 (601.00 / 601.00)
  splited_local_read_bw        157.55 MB/s(150.25 MiB/s)
  splited_local_read_latency   208.46 US
  write_iops                   0.00
  write_avgrq                  0.00 B(0.00 B)
  write_bw                     0.00 B/s(0.00 B/s)
  write_latency                0.00 NS
  splited_write_iops           0.00
  splited_write_latency        0.00 NS
  splited_local_write_ratio    0.00
  splited_local_write_bw       0.00 B/s(0.00 B/s)
  splited_local_write_latency  0.00 NS
  total_iops                   601.00
  total_avgrq                  262.14 KB(256.00 KiB)
  total_bw                     157.55 MB/s(150.25 MiB/s)
  total_latency                242.12 US
  total_iop30s                 18011.00
  unmap_iops                   0.00
  unmap_total                  0
  unmap_unaligned_iops         0.00
  unmap_unaligned_total        0
---------------------------------------------------------------------
---------------------------------------------------------------------
  volume_id                    ae12b673-8bca-4118-a9b3-45db8f60f945
  read_iops                    0.00
  read_avgrq                   0.00 B(0.00 B)
  read_bw                      0.00 B/s(0.00 B/s)
  read_latency                 0.00 NS
  splited_read_iops            0.00
  splited_read_latency         0.00 NS
  splited_local_read_ratio     0.00
  splited_local_read_bw        0.00 B/s(0.00 B/s)
  splited_local_read_latency   0.00 NS
  write_iops                   601.00
  write_avgrq                  262.14 KB(256.00 KiB)
  write_bw                     157.55 MB/s(150.25 MiB/s)
  write_latency                108.41 US
  splited_write_iops           601.00
  splited_write_latency        76.36 US
  splited_local_write_ratio    1.00 (601.00 / 601.00)
  splited_local_write_bw       157.55 MB/s(150.25 MiB/s)
  splited_local_write_latency  76.36 US
  total_iops                   601.00
  total_avgrq                  262.14 KB(256.00 KiB)
  total_bw                     157.55 MB/s(150.25 MiB/s)
  total_latency                108.41 US
  total_iop30s                 18008.00
  unmap_iops                   0.00
  unmap_total                  0
  unmap_unaligned_iops         0.00
  unmap_unaligned_total        0
---------------------------------------------------------------------
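
To compare a single metric across volumes, the records shown above can be reduced to one line per volume. The following is a minimal sketch, assuming each record lists volume_id before its metrics, as in the example; the helper name volume_metric is hypothetical.

```shell
# Print "<volume_id> <value>" for one metric, reading `volume list`
# output from stdin. Only the numeric value is printed; the unit that
# follows it in the tool output (for example "US") is dropped.
volume_metric() {
    awk -v metric="$1" '
        $1 == "volume_id" { id = $2 }          # remember the current record
        $1 == metric      { print id, $2 }     # emit the requested field
    '
}
```

A possible use is zbs-perf-tools volume list | volume_metric total_iops, which would print one ID and one IOPS value per volume.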

Output note

Parameter                    Description
read_iops                    The read IOPS in the last 1 second.
read_avgrq                   The average read request size in the last 1 second.
read_bw                      The read bandwidth in the last 1 second.
read_latency                 The average read request latency in the last 1 second.
splited_read_iops            The split read IOPS in the last 1 second. When the stripe size is 256 KiB, a 512 KiB read request will be split into two 256 KiB requests sent to access.
splited_read_latency         The average latency of split read requests in the last 1 second.
splited_local_read_ratio     The IOPS ratio of split read requests served by local access in the last 1 second.
splited_local_read_bw        The read bandwidth of split requests served by local access in the last 1 second.
splited_local_read_latency   The average latency of split read requests served by local access in the last 1 second.
write_iops                   The write IOPS in the last 1 second.
write_avgrq                  The average write request size in the last 1 second.
write_bw                     The write bandwidth in the last 1 second.
write_latency                The average write request latency in the last 1 second.
splited_write_iops           The split write IOPS in the last 1 second. When the stripe size is 256 KiB, a 512 KiB write request will be split into two 256 KiB requests sent to access.
splited_write_latency        The average latency of split write requests in the last 1 second.
splited_local_write_ratio    The IOPS ratio of split write requests served by local access in the last 1 second.
splited_local_write_bw       The write bandwidth of split requests served by local access in the last 1 second.
splited_local_write_latency  The average latency of split write requests served by local access in the last 1 second.
total_iops                   The total IOPS in the last 1 second.
total_avgrq                  The average request size in the last 1 second.
total_bw                     The total bandwidth in the last 1 second.
total_latency                The average latency in the last 1 second.
total_iop30s                 The total I/Os in the last 30 seconds.
unmap_iops                   The UNMAP I/Os in the last 1 second.
unmap_total                  The total UNMAP I/Os.
unmap_unaligned_iops         The unaligned UNMAP I/Os in the last 1 second.
unmap_unaligned_total        The total unaligned UNMAP I/Os.
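
The fields are related arithmetically. In the sample output, the bandwidth is consistent with IOPS multiplied by the average request size (601 × 256 KiB = 150.25 MiB/s), and a 512 KiB request with a 256 KiB stripe yields two split requests. The following sketch reproduces both calculations. The helper names are hypothetical, and split_count assumes that the request starts on a stripe boundary.

```shell
# Estimate bandwidth from IOPS and average request size in bytes.
# Prints decimal MB/s and binary MiB/s, mirroring the dual-unit display.
bw_from_iops() {
    awk -v iops="$1" -v avgrq="$2" 'BEGIN {
        bytes = iops * avgrq
        printf "%.2f MB/s(%.2f MiB/s)\n", bytes / 1000000, bytes / 1048576
    }'
}

# Number of split requests for a request of <size> bytes and a stripe of
# <stripe> bytes (rounded up; assumes a stripe-aligned starting offset).
split_count() {
    awk -v size="$1" -v stripe="$2" 'BEGIN {
        n = int(size / stripe)
        if (n * stripe < size) n++
        print n
    }'
}

bw_from_iops 601 262144     # prints: 157.55 MB/s(150.25 MiB/s)
split_count 524288 262144   # prints: 2
```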

Viewing performance information for a volume with the specified ID

Procedure

Run the following command on a cluster node to view performance information for the volume with the specified ID:

zbs-perf-tools volume show <volume id> [--chunk-addr <ip>] [-L] [-A]

Parameter          Description
volume_id          The volume ID.
--chunk-addr <ip>  The zbs-chunkd RPC server address. Default: 127.0.0.1:10200, the chunk on the node where the command is run.
-L                 Displays data only from the local chunk server.
-A                 Displays all properties of the chart.

Output example

$ zbs-perf-tools volume show e5a1d376-7d14-4c44-82b4-f2bc2a2334ee
Aggregated Data:
---------------------------------------------------------------------
  volume_id                    f618b4b1-c0c9-4b93-8dc1-9ffdcd086679
  read_iops                    0.00
  read_avgrq                   0.00 B(0.00 B)
  read_bw                      0.00 B/s(0.00 B/s)
  read_latency                 0.00 NS
  splited_read_iops            0.00
  splited_read_latency         0.00 NS
  splited_local_read_ratio     0.00
  splited_local_read_bw        0.00 B/s(0.00 B/s)
  splited_local_read_latency   0.00 NS
  write_iops                   0.00
  write_avgrq                  0.00 B(0.00 B)
  write_bw                     0.00 B/s(0.00 B/s)
  write_latency                0.00 NS
  splited_write_iops           0.00
  splited_write_latency        0.00 NS
  splited_local_write_ratio    0.00
  splited_local_write_bw       0.00 B/s(0.00 B/s)
  splited_local_write_latency  0.00 NS
  total_iops                   0.00
  total_avgrq                  0.00 B(0.00 B)
  total_bw                     0.00 B/s(0.00 B/s)
  total_latency                0.00 NS
  total_iop30s                 0.00
  unmap_iops                   0.00
  unmap_total                  0
  unmap_unaligned_iops         0.00
  unmap_unaligned_total        0
---------------------------------------------------------------------
chunk-Specific Data:
--------------------------------------------------------------------------------
  CHUNK IP       TOTAL IOPS  TOTAL AVGRQ     TOTAL BW            TOTAL LATENCY
--------------------------------------------------------------------------------
  10.213.141.86  0.00        0.00 B(0.00 B)  0.00 B/s(0.00 B/s)  0.00 NS
  10.213.141.88  0.00        0.00 B(0.00 B)  0.00 B/s(0.00 B/s)  0.00 NS
  10.213.141.87  0.00        0.00 B(0.00 B)  0.00 B/s(0.00 B/s)  0.00 NS
  10.213.141.89  0.00        0.00 B(0.00 B)  0.00 B/s(0.00 B/s)  0.00 NS
--------------------------------------------------------------------------------

Output note

chunk-Specific Data is the data at the chunk level, collected from each chunk service in the cluster.

Aggregated Data is the aggregated performance data collected from all chunk services in the cluster, representing overall metrics. Latency and average values are weighted averages; counts are summed.
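
The aggregation rule can be illustrated with a short calculation. This is only a sketch: the text above does not state which weight is used, so the example assumes that each chunk's latency is weighted by that chunk's IOPS. The function name and the sample numbers are hypothetical.

```shell
# Combine per-chunk "<iops> <latency_us>" lines from stdin into one
# aggregate: IOPS are summed, and latency is averaged with each chunk's
# IOPS as the weight (the weight is an assumption, see the note above).
aggregate_chunks() {
    awk '
        { iops += $1; weighted += $1 * $2 }
        END {
            lat = (iops > 0) ? weighted / iops : 0
            printf "total_iops=%.2f total_latency_us=%.2f\n", iops, lat
        }
    '
}

# Hypothetical samples: 300 IOPS at 200 us and 100 IOPS at 400 us.
printf '300 200\n100 400\n' | aggregate_chunks
# prints: total_iops=400.00 total_latency_us=250.00
```

A plain average of the two latencies would give 300 us; the weighted result is lower because the faster chunk served three times as many requests.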

Parameter                    Description
read_iops                    The read IOPS in the last 1 second.
read_avgrq                   The average read request size in the last 1 second.
read_bw                      The read bandwidth in the last 1 second.
read_latency                 The average read request latency in the last 1 second.
splited_read_iops            The split read IOPS in the last 1 second. When the stripe size is 256 KiB, a 512 KiB read request will be split into two 256 KiB requests sent to access.
splited_read_latency         The average latency of split read requests in the last 1 second.
splited_local_read_ratio     The IOPS ratio of split read requests served by local access in the last 1 second.
splited_local_read_bw        The read bandwidth of split requests served by local access in the last 1 second.
splited_local_read_latency   The average latency of split read requests served by local access in the last 1 second.
write_iops                   The write IOPS in the last 1 second.
write_avgrq                  The average write request size in the last 1 second.
write_bw                     The write bandwidth in the last 1 second.
write_latency                The average write request latency in the last 1 second.
splited_write_iops           The split write IOPS in the last 1 second. When the stripe size is 256 KiB, a 512 KiB write request will be split into two 256 KiB requests sent to access.
splited_write_latency        The average latency of split write requests in the last 1 second.
splited_local_write_ratio    The IOPS ratio of split write requests served by local access in the last 1 second.
splited_local_write_bw       The write bandwidth of split requests served by local access in the last 1 second.
splited_local_write_latency  The average latency of split write requests served by local access in the last 1 second.
total_iops                   The total IOPS in the last 1 second.
total_avgrq                  The average request size in the last 1 second.
total_bw                     The total bandwidth in the last 1 second.
total_latency                The average latency in the last 1 second.
total_iop30s                 The total I/Os in the last 30 seconds.
unmap_iops                   The UNMAP I/Os in the last 1 second.
unmap_total                  The total UNMAP I/Os.
unmap_unaligned_iops         The unaligned UNMAP I/Os in the last 1 second.
unmap_unaligned_total        The total unaligned UNMAP I/Os.

Probing I/O latency distribution, I/O request size distribution, and I/O access heatmap for a volume

By enabling the probe mode for a volume, you can periodically collect the volume's I/O latency distribution, I/O request size distribution, and I/O access heatmap. The results are displayed in histogram form.

Procedure

Run the following command on a cluster node to enable probe mode. Press Ctrl + C to exit. On exit, zbs-chunkd automatically clears the metrics for that probe session.

zbs-perf-tools volume probe <volume id> [--chunk-addr <ip>] [--meta-addr <ip>] [--distribution {lat | rqsz | logical_offset}] [--interval <int>] [--readwrite {read | write | readwrite}]

Parameter          Description
volume id          The ABS volume ID.
--chunk-addr <ip>  The zbs-chunkd RPC server address. Default: 127.0.0.1:10200, the chunk on the node where the command is run.
--meta-addr <ip>   The zbs-meta RPC server address. Optional. Default: 127.0.0.1:10206.
--distribution     The I/O distribution type. Default: lat. Options: lat (I/O latency distribution), rqsz (I/O request size distribution), logical_offset (logical region access heatmap).
--interval <int>   The probe interval (the time window for collecting distribution data), in seconds. Default: 1.
--readwrite        The I/O type to probe. Valid values: read, write, readwrite. Default: readwrite.

Output example

If a distribution bin contains no values, it will be omitted from the output. For example, if all latency values fall within the [0, 64.00 us) bin and all other bins are empty, only the [0, 64.00 us) bin will be shown.

  • Probe the latency distribution of read or write requests on a volume

    The output histogram has three columns: the first column is the latency bin (unit: us), the second column is the count for the bin, and the third column shows the distribution.

    $ zbs-perf-tools volume probe e5a1d376-7d14-4c44-82b4-f2bc2a2334ee --distribution lat
        readwrite lat(us)          : count     distribution
                [0,64.00)          : 57       |*                               |
            [64.00,128.00)         : 6172     |**********                      |
            [128.00,256.00)        : 15473    |*************************       |
            [256.00,512.00)        : 20443    |********************************|
            [512.00,1024.00)       : 10467    |*****************               |
            [1024.00,2048.00)      : 1361     |***                             |
            [2048.00,4096.00)      : 337      |*                               |
    
  • Probe the I/O request size distribution of read or write requests on a volume

    The output histogram has three columns: the first column is the I/O request size bin (unit: KB), the second column is the count for the bin, and the third column shows the distribution.

    $ zbs-perf-tools volume probe e5a1d376-7d14-4c44-82b4-f2bc2a2334ee --distribution rqsz
          readwrite size(KB)        : count     distribution
              [4.00,8.00)           : 71       |********************************|
              [16.00,32.00)         : 58       |***************************     |
              [256.00,512.00)       : 3        |**                              |

  • Probe the hot region distribution of read or write operations on a volume

    The output histogram has three columns: the first column is the logical offset bin of the read or write requests (1 GB per bin), the second column is the count for the bin, and the third column shows the distribution.

    $ zbs-perf-tools volume probe e5a1d376-7d14-4c44-82b4-f2bc2a2334ee --distribution logical_offset
       readwrite logical offset(GB)    : count     distribution
               [0,1.00)                : 29248    |********************************|
              [1.00,2.00)              : 16579    |*******************             |
              [2.00,3.00)              : 16763    |*******************             |
              [3.00,4.00)              : 16320    |******************              |
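
The histograms above can also be summarized numerically. The following is a minimal sketch, assuming a single histogram snapshot in the layout shown above ("[low,high) : count |bar|") has been saved to a file; the helper name probe_summary is hypothetical. It prints each bin's share, the cumulative share, and the bin that contains the median sample.

```shell
# Summarize one `volume probe` histogram read from stdin.
# Lines that do not start with a "[low,high)" bin are ignored.
probe_summary() {
    awk '
        $1 ~ /^\[/ && $2 == ":" { bin[++n] = $1; count[n] = $3; total += $3 }
        END {
            if (total == 0) exit
            for (i = 1; i <= n; i++) {
                cum += count[i]
                printf "%s %.1f%% cum=%.1f%%\n", bin[i], 100 * count[i] / total, 100 * cum / total
                # The median falls in the first bin whose cumulative count
                # reaches half of all samples.
                if (median == "" && cum * 2 >= total) median = bin[i]
            }
            print "median_bin=" median
        }
    '
}
```

Applied to the latency example above (54310 samples in total), the median falls in the [256.00,512.00) bin. The result is only as precise as the bin width, and bins omitted from the output count as zero.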

Collecting and analyzing I/O performance data for a specified volume

Procedure

Run the following command on any node in the cluster to create a session named after the volume ID and start collecting data:

zbs-perf-tools trace volume <volume_id> [--trace_time <time>]

Parameter            Description
volume_id            The target volume ID.
--trace_time <time>  The duration of data collection. Default: 10s. Maximum: 600s.

Data collection stops automatically when the trace time elapses. To stop it earlier, press Ctrl + C. The tool then automatically analyzes the collected data.

The analysis results include a statistics table and statistics charts. The statistics table is displayed directly in the console. The statistics charts are saved as static HTML files in the trace data directory on each node. You can download the files and view them in a Web browser.
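
To find the chart files on a node before downloading them, the trace data directory can be searched. The following is a sketch under assumptions: the base directory /root/zbs-trace is taken from the sample output below and may differ in your environment, the .html file extension is inferred from the description above, and the helper name list_trace_charts is hypothetical.

```shell
# List the HTML chart files of one trace session on the local node.
#   $1  session name (the volume ID)
#   $2  base directory (assumed default: /root/zbs-trace)
list_trace_charts() {
    find "${2:-/root/zbs-trace}/$1/trace-data" -type f -name '*.html' 2>/dev/null | sort
}
```

Because the charts are saved on each node, the search would need to be run on every node that holds trace data for the session.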

Output example

$ zbs-perf-tools trace volume 5be99682-cdc9-4516-b262-3bfe44379207
session started...
Trace time is over, sending interrupt signal...
stopping tracing...
/root/zbs-trace/5be99682-cdc9-4516-b262-3bfe44379207/trace-data/cid-1-10-0-18-31 parse succeed. wrote to /root/zbs-trace/5be99682-cdc9-4516-b262-3bfe44379207/trace-data/cid-1-10-0-18-31/2023-07-28-165212+0800/parsed_data
/root/zbs-trace/5be99682-cdc9-4516-b262-3bfe44379207/trace-data/cid-2-10-0-18-34 parse succeed. wrote to /root/zbs-trace/5be99682-cdc9-4516-b262-3bfe44379207/trace-data/cid-2-10-0-18-34/2023-07-28-165211+0800/parsed_data
/root/zbs-trace/5be99682-cdc9-4516-b262-3bfe44379207/trace-data/cid-4-10-0-18-32 parse succeed. wrote to /root/zbs-trace/5be99682-cdc9-4516-b262-3bfe44379207/trace-data/cid-4-10-0-18-32/2023-07-28-165212+0800/parsed_data
The report of directory: /root/zbs-trace/5be99682-cdc9-4516-b262-3bfe44379207/trace-data/cid-1-10-0-18-31/2023-07-28-165212+0800/parsed_data
ACCESS
------------------------------------------------------------------------
                        AVG      P50      P95      P99      MAX      N
------------------------------------------------------------------------
  read                  0.00 NS  0.00 NS  0.00 NS  0.00 NS  0.00 NS  0
  write                 0.00 NS  0.00 NS  0.00 NS  0.00 NS  0.00 NS  0
  readwrite             0.00 NS  0.00 NS  0.00 NS  0.00 NS  0.00 NS  0
  sync_gen              0.00 NS  0.00 NS  0.00 NS  0.00 NS  0.00 NS  0
  wait_recover          0.00 NS  0.00 NS  0.00 NS  0.00 NS  0.00 NS  0
  caw                   0.00 NS  0.00 NS  0.00 NS  0.00 NS  0.00 NS  0
  replica_io_read       0.00 NS  0.00 NS  0.00 NS  0.00 NS  0.00 NS  0
  replica_io_write      0.00 NS  0.00 NS  0.00 NS  0.00 NS  0.00 NS  0
  replica_io_readwrite  0.00 NS  0.00 NS  0.00 NS  0.00 NS  0.00 NS  0
------------------------------------------------------------------------
……
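
A single statistic can be extracted from a report table such as ACCESS for scripting or alerting. The following is a minimal sketch, assuming the row layout shown above: the operation name, five "<value> <unit>" pairs (AVG, P50, P95, P99, MAX), and the sample count N. The helper name trace_stat is hypothetical.

```shell
# Print one latency statistic for one operation, reading a report table
# from stdin. Example: trace_stat write P99
trace_stat() {
    awk -v op="$1" -v stat="$2" '
        BEGIN {
            # Map each statistic name to the field holding its value;
            # the unit is in the field that follows.
            split("AVG P50 P95 P99 MAX", names, " ")
            for (i = 1; i <= 5; i++) col[names[i]] = 2 * i
        }
        $1 == op && NF == 12 && (stat in col) { print $(col[stat]), $(col[stat] + 1) }
    '
}
```

The value is printed together with its unit (for example "0.00 NS"), because the unit can differ from row to row.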