Investigating Dynamic Rate Limiting for Cinder-Based Cloud Disks

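I have been busy developing a storage gateway lately and have not had time for Ceph and OpenStack, so I am digging up notes from some earlier projects to share with everyone.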

I. QoS Overview

OpenStack introduced QoS for block storage in the Havana (H) release. It is implemented mainly in the Cinder and Nova projects.

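OpenStack's QoS feature relies mainly on the back-end storage and the front-end hypervisor for enforcement. Cinder itself provides a QoS Spec framework: a user creates a QoS Spec that names its consumer (back end or front end) and a set of limit key-value pairs (for example total_iops_sec=1000). Each QoS Spec is associated with a Volume Type, and a user can assign a Volume Type when creating a volume, which indirectly ties that volume to a specific QoS Spec.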

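In other words, the volume acquires a set of QoS key-value pairs. When the QoS targets the back end, the create-volume call passes the QoS Spec's key-value pairs to Cinder's back-end storage for interpretation. When the QoS targets the front end, it takes effect only once the volume is attached to a virtual machine, for example through QEMU.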
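The relationship described above (volume, Volume Type, QoS Spec, consumer) can be sketched as a small data model. This is only an illustration: the class names and the `effective_qos` helper are hypothetical and are not Cinder's actual classes or API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class QosSpec:
    name: str
    consumer: str  # "front-end", "back-end", or "both"
    specs: Dict[str, int] = field(default_factory=dict)


@dataclass
class VolumeType:
    name: str
    qos: Optional[QosSpec] = None


@dataclass
class Volume:
    name: str
    volume_type: Optional[VolumeType] = None


def effective_qos(volume: Volume, side: str) -> Dict[str, int]:
    """Return the QoS keys that `side` ("front-end" or "back-end") should enforce."""
    vtype = volume.volume_type
    if vtype is None or vtype.qos is None:
        return {}  # no Volume Type or no QoS Spec: nothing to enforce
    if vtype.qos.consumer in (side, "both"):
        return dict(vtype.qos.specs)
    return {}


# Hypothetical objects mirroring the sata / sata-qos example used later.
sata_qos = QosSpec("sata-qos", "front-end", {"read_iops_sec": 300, "write_iops_sec": 300})
volume = Volume("volume-1", VolumeType("sata", sata_qos))
print(effective_qos(volume, "front-end"))
print(effective_qos(volume, "back-end"))
```

The volume never references the QoS Spec directly; it only reaches it through its Volume Type, which is why changing the association affects every volume of that type.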

II. Current Implementation

Disks: provided by Ceph RBD

Current rate limiting: implemented by creating a volume type in OpenStack Cinder and attaching a QoS spec to it. The steps are as follows:

1) Create a cinder volume type

usage: cinder type-create <name>
Example: cinder type-create sata

2) Create a cinder QoS spec

usage: cinder qos-create <name> <key=value> [<key=value> ...]
Example: cinder qos-create sata-qos consumer="front-end" read_iops_sec=300 write_iops_sec=300 read_bytes_sec=104857600 write_bytes_sec=62914560

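[Notes]
consumer: usually one of two values

  • front-end: the limit is enforced in the front-end hypervisor (for example QEMU)
  • back-end: the limit is enforced in the back-end storage system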

QoS keys configurable for Libvirt/QEMU:

  • total_bytes_sec: total throughput limit for the disk, in bytes per second
  • read_bytes_sec: read throughput limit, in bytes per second
  • write_bytes_sec: write throughput limit, in bytes per second
  • total_iops_sec: total I/O operations limit per second
  • read_iops_sec: read I/O operations limit per second
  • write_iops_sec: write I/O operations limit per second
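Before creating a front-end QoS spec it can help to sanity-check the keys. The sketch below is a hypothetical helper, not part of Cinder or Nova. It assumes values have already been converted to integers, and it encodes the libvirt rule that a total_* limit cannot be combined with the read_*/write_* limit of the same kind.

```python
FRONT_END_KEYS = frozenset({
    "total_bytes_sec", "read_bytes_sec", "write_bytes_sec",
    "total_iops_sec", "read_iops_sec", "write_iops_sec",
})


def validate_front_end_specs(specs):
    """Return a list of problems found in a dict of front-end QoS keys."""
    errors = []
    for key, value in specs.items():
        if key not in FRONT_END_KEYS:
            errors.append(f"unknown front-end key: {key}")
        elif not isinstance(value, int) or value < 0:
            errors.append(f"{key} must be a non-negative integer")
    # A total limit and a read/write limit of the same kind are mutually exclusive.
    for kind in ("bytes", "iops"):
        total = specs.get(f"total_{kind}_sec", 0)
        split = [k for k in (f"read_{kind}_sec", f"write_{kind}_sec") if specs.get(k, 0)]
        if total and split:
            errors.append(f"total_{kind}_sec cannot be combined with {', '.join(split)}")
    return errors


# The sata-qos example from step 2 passes the check.
sata_qos = {
    "read_iops_sec": 300,
    "write_iops_sec": 300,
    "read_bytes_sec": 104857600,
    "write_bytes_sec": 62914560,
}
print(validate_front_end_specs(sata_qos))
```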

QoS keys configurable for the back end:

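These depend on what the specific back-end storage supports. At the time of writing, Ceph does not support QoS configuration for block devices, although it is on the Ceph community's agenda.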

3) Associate the cinder volume type with the QoS spec

usage: cinder qos-associate <qos_specs> <volume_type_id>
Example: cinder qos-associate 2b3b35b8-e19c-442d-9218-8b72b369f0e4 c224099f-2a8e-4356-ac2a-78fe56ec9523

$ cinder qos-get-association 2b3b35b8-e19c-442d-9218-8b72b369f0e4
+------------------+------+--------------------------------------+
| Association_Type | Name | ID                                   |
+------------------+------+--------------------------------------+
| volume_type      | sata | c224099f-2a8e-4356-ac2a-78fe56ec9523 |
+------------------+------+--------------------------------------+

4) Attach the cinder volume to a nova instance

After the attach, the iotune settings are visible in the domain XML:

$ virsh dumpxml cde0ab13-ff3d-46d0-a7c6-b797f18d0465
...
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <auth username='volumes'>
    <secret type='ceph' uuid='c22ced15-b23b-4b31-8329-e14a59273b81'/>
  </auth>
  <source protocol='rbd' name='volumes_2/volume-f9daf542-4ca7-4228-b6b4-d450c7c7dbb4'>
    <host name='172.16.0.6' port='6789'/>
    <host name='172.16.0.7' port='6789'/>
    <host name='172.16.0.17' port='6789'/>
  </source>
  <backingStore/>
  <target dev='vdb' bus='virtio'/>
  <iotune>
    <read_bytes_sec>104857600</read_bytes_sec>
    <write_bytes_sec>62914560</write_bytes_sec>
    <read_iops_sec>300</read_iops_sec>
    <write_iops_sec>300</write_iops_sec>
  </iotune>
  <serial>f9daf542-4ca7-4228-b6b4-d450c7c7dbb4</serial>
  <alias name='virtio-disk1'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</disk>
...
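To check the limits without reading the XML by eye, the dump can be parsed with Python's standard library. This is a minimal sketch: `iotune_by_device` is a hypothetical helper, and `SAMPLE_XML` is a trimmed, made-up domain document that only keeps the elements relevant here.

```python
import xml.etree.ElementTree as ET

# Trimmed sample shaped like `virsh dumpxml` output (illustrative only).
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <disk type='network' device='disk'>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='network' device='disk'>
      <target dev='vdb' bus='virtio'/>
      <iotune>
        <read_bytes_sec>104857600</read_bytes_sec>
        <write_bytes_sec>62914560</write_bytes_sec>
        <read_iops_sec>300</read_iops_sec>
        <write_iops_sec>300</write_iops_sec>
      </iotune>
    </disk>
  </devices>
</domain>
"""


def iotune_by_device(domain_xml):
    """Map each disk's target device name to its numeric <iotune> limits."""
    root = ET.fromstring(domain_xml)
    result = {}
    for disk in root.iter("disk"):
        target = disk.find("target")
        if target is None:
            continue
        limits = {}
        tune = disk.find("iotune")
        if tune is not None:
            for child in tune:
                text = (child.text or "").strip()
                if text.isdigit():  # skip any non-numeric children
                    limits[child.tag] = int(text)
        result[target.get("dev")] = limits
    return result


print(iotune_by_device(SAMPLE_XML))
```

A disk with no `<iotune>` element maps to an empty dict, which makes unthrottled disks easy to spot.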

III. How Can Disks Be Rate Limited Dynamically?

1) Adjust dynamically according to disk capacity

Difficulty: moderate

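From the analysis above, the QoS settings of a cinder volume are enforced in the hypervisor (QEMU), so the logic that reads these parameters from cinder should live in Nova. We only need to modify that logic and add code that makes the QoS grow with disk capacity.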
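One possible shape for a capacity-based policy is a linear rule with a floor and a ceiling. The sketch below is purely illustrative: the function name and every constant (IOPS per GB, minimum, maximum, bytes per I/O) are assumptions chosen for the example, not values from Nova, Cinder, or this project.

```python
def qos_for_capacity(size_gb, iops_per_gb=30, min_iops=300, max_iops=5000,
                     bytes_per_io=16 * 1024):
    """Derive front-end QoS keys from a volume size in GB (hypothetical policy)."""
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")
    # Scale IOPS linearly with capacity, clamped to [min_iops, max_iops].
    iops = max(min_iops, min(max_iops, size_gb * iops_per_gb))
    # Derive a bandwidth cap from the IOPS cap and an assumed average I/O size.
    bandwidth = iops * bytes_per_io
    return {
        "read_iops_sec": iops,
        "write_iops_sec": iops,
        "read_bytes_sec": bandwidth,
        "write_bytes_sec": bandwidth,
    }


for size in (5, 100, 1000):
    print(size, qos_for_capacity(size))
```

Small volumes get the floor, large volumes hit the ceiling, and everything in between scales with size. The resulting dict uses the same key names as the iotune element, so it could feed whatever code path ends up applying the limits.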

2) Adjust dynamically according to system load

Difficulty: high

Ways to adjust a virtual machine's iotune dynamically:

1. The nova change-disk-io-tune command

The nova CLI in this deployment has a change-disk-io-tune subcommand, with the following usage:

$ nova help change-disk-io-tune
usage: nova change-disk-io-tune [--read_bytes_sec <read_bytes_sec>]
                                [--write_bytes_sec <write_bytes_sec>]
                                [--read_iops_sec <read_iops_sec>]
                                [--write_iops_sec <write_iops_sec>]
                                <server>

change the root user password for a server

Positional arguments:
  <server>              Name or ID of server

Optional arguments:
  --read_bytes_sec <read_bytes_sec>
                        read_bytes_sec bytes/s.
  --write_bytes_sec <write_bytes_sec>
                        write_bytes_sec bytes/s.
  --read_iops_sec <read_iops_sec>
                        read_iops_sec.
  --write_iops_sec <write_iops_sec>
                        write_iops_sec.

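This command changes the disk iotune limits of a nova instance on the fly, but it cannot target an individual disk inside the instance: it sets the iotune of every disk to the given values. For example: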

$ nova change-disk-io-tune --read_iops_sec 200 --write_iops_sec 200 --read_bytes_sec 102400 --write_bytes_sec 102400 5ceaacab-b021-4f23-ba45-579345b843a9
$ virsh dumpxml 5ceaacab-b021-4f23-ba45-579345b843a9 > withdisk.xml
$ vim withdisk.xml
...
<target dev='vda' bus='virtio'/>
<iotune>
  <read_bytes_sec>102400</read_bytes_sec>
  <write_bytes_sec>102400</write_bytes_sec>
  <read_iops_sec>200</read_iops_sec>
  <write_iops_sec>200</write_iops_sec>
</iotune>
...
<target dev='vdb' bus='virtio'/>
<iotune>
  <read_bytes_sec>102400</read_bytes_sec>
  <write_bytes_sec>102400</write_bytes_sec>
  <read_iops_sec>200</read_iops_sec>
  <write_iops_sec>200</write_iops_sec>
</iotune>
...
<target dev='vdc' bus='virtio'/>
<iotune>
  <read_bytes_sec>102400</read_bytes_sec>
  <write_bytes_sec>102400</write_bytes_sec>
  <read_iops_sec>200</read_iops_sec>
  <write_iops_sec>200</write_iops_sec>
</iotune>

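nova change-disk-io-tune adjusts the iotune of all block devices on a virtual machine in real time, and the change takes effect immediately. Because it cannot target a specific block device, it does not meet our requirements.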

2. The virsh blkdeviotune command

virsh has a subcommand for disk iotune, shown below:

$ virsh blkdeviotune --help
  NAME
    blkdeviotune - Set or query a block device I/O tuning parameters.

  SYNOPSIS
    blkdeviotune <domain> <device> [--total-bytes-sec <number>] [--read-bytes-sec <number>] [--write-bytes-sec <number>] [--total-iops-sec <number>] [--read-iops-sec <number>] [--write-iops-sec <number>] [--total-bytes-sec-max <number>] [--read-bytes-sec-max <number>] [--write-bytes-sec-max <number>] [--total-iops-sec-max <number>] [--read-iops-sec-max <number>] [--write-iops-sec-max <number>] [--size-iops-sec <number>] [--config] [--live] [--current]

  DESCRIPTION
    Set or query disk I/O parameters such as block throttling.

  OPTIONS
    [--domain] <string>             domain name, id or uuid
    [--device] <string>             block device
    --total-bytes-sec <number>      total throughput limit in bytes per second
    --read-bytes-sec <number>       read throughput limit in bytes per second
    --write-bytes-sec <number>      write throughput limit in bytes per second
    --total-iops-sec <number>       total I/O operations limit per second
    --read-iops-sec <number>        read I/O operations limit per second
    --write-iops-sec <number>       write I/O operations limit per second
    --total-bytes-sec-max <number>  total max in bytes
    --read-bytes-sec-max <number>   read max in bytes
    --write-bytes-sec-max <number>  write max in bytes
    --total-iops-sec-max <number>   total I/O operations max
    --read-iops-sec-max <number>    read I/O operations max
    --write-iops-sec-max <number>   write I/O operations max
    --size-iops-sec <number>        I/O size in bytes
    --config                        affect next boot
    --live                          affect running domain
    --current                       affect current domain

This command adjusts the iotune of a specific block device in real time, but it must be run on the physical host where the virtual machine resides.

  • Query the iotune of a given device:
$ virsh blkdeviotune 5ceaacab-b021-4f23-ba45-579345b843a9 vdc
total_bytes_sec: 0
read_bytes_sec : 104857600
write_bytes_sec: 62914560
total_iops_sec : 0
read_iops_sec : 1500
write_iops_sec : 1000
total_bytes_sec_max: 0
read_bytes_sec_max: 10485760
write_bytes_sec_max: 6291456
total_iops_sec_max: 0
read_iops_sec_max: 150
write_iops_sec_max: 100
size_iops_sec : 0
  • Change the iotune values of a given device:
$ virsh blkdeviotune 5ceaacab-b021-4f23-ba45-579345b843a9 vdc --read-iops-sec 5000 --write-iops-sec 3000
$ virsh blkdeviotune 5ceaacab-b021-4f23-ba45-579345b843a9 vdc
total_bytes_sec: 0
read_bytes_sec : 104857600
write_bytes_sec: 62914560
total_iops_sec : 0
read_iops_sec : 5000
write_iops_sec : 3000
total_bytes_sec_max: 0
read_bytes_sec_max: 10485760
write_bytes_sec_max: 6291456
total_iops_sec_max: 0
read_iops_sec_max: 500
write_iops_sec_max: 300
size_iops_sec : 0
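For scripting, the query output shown above can be turned into a dict. This is a small sketch: `parse_blkdeviotune` is a hypothetical helper, and it assumes the `key: value` layout shown in this post, which may vary between libvirt versions.

```python
def parse_blkdeviotune(output):
    """Parse `virsh blkdeviotune <domain> <device>` query output into a dict of ints."""
    limits = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key: value" line
        value = value.strip()
        if value.isdigit():  # ignore any non-numeric fields
            limits[key.strip()] = int(value)
    return limits


# Excerpt of the output shown above, after the limits were changed.
SAMPLE_OUTPUT = """\
total_bytes_sec: 0
read_bytes_sec : 104857600
write_bytes_sec: 62914560
total_iops_sec : 0
read_iops_sec : 5000
write_iops_sec : 3000
"""

print(parse_blkdeviotune(SAMPLE_OUTPUT))
```

Comparing the parsed values before and after a change is a simple way to confirm that a new limit really took effect on the intended device.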

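virsh can adjust the iotune of a specific block device on a virtual machine in real time, and the change takes effect immediately. How to drive these adjustments dynamically according to our requirements still needs further investigation.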
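As a starting point for that investigation, here is one hypothetical way to tie the limits to load: shrink a disk's baseline limits linearly as host load rises, then build the matching virsh command for that one device. The thresholds, the function names, and the notion of "load percent" are all assumptions for illustration. Only the virsh option names and `--live` come from the help output above.

```python
ALLOWED_KEYS = frozenset({
    "total_bytes_sec", "read_bytes_sec", "write_bytes_sec",
    "total_iops_sec", "read_iops_sec", "write_iops_sec",
})


def scaled_limits(base, load_pct, low=50, high=90, floor_pct=25):
    """Scale baseline limits by load (0-100): full below `low`, floor above `high`."""
    if load_pct <= low:
        factor_pct = 100
    elif load_pct >= high:
        factor_pct = floor_pct
    else:
        # Linear interpolation between 100% and floor_pct, in integer arithmetic.
        factor_pct = 100 - (100 - floor_pct) * (load_pct - low) // (high - low)
    # Never drop to 0: libvirt treats 0 as "no limit".
    return {key: max(1, value * factor_pct // 100) for key, value in base.items()}


def blkdeviotune_argv(domain, device, limits, live=True):
    """Build the argv list for `virsh blkdeviotune` on a single block device."""
    argv = ["virsh", "blkdeviotune", domain, device]
    for key in sorted(limits):
        if key not in ALLOWED_KEYS:
            raise ValueError(f"unsupported key: {key}")
        argv += ["--" + key.replace("_", "-"), str(limits[key])]
    if live:
        argv.append("--live")
    return argv


base = {"read_iops_sec": 1000, "write_iops_sec": 600}
limits = scaled_limits(base, load_pct=70)
print(blkdeviotune_argv("5ceaacab-b021-4f23-ba45-579345b843a9", "vdc", limits))
```

The argv list could be passed to `subprocess.run(argv, check=True)` on the compute host that runs the VM. How load is measured, how often limits are recomputed, and how to avoid oscillation are left open here.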
