Aggregate affinity
This series of patches adds the ability to create aggregates of Ironic nodes in Nova. The work is based on Jay Pipes' series, backported to Ocata with some additional fixes.
After applying these patches to the Ocata trees of nova and novaclient, it becomes possible to create aggregates that contain Ironic nodes, together with a server group using one of two new policies:
aggregate-affinity
aggregate-anti-affinity
Note that if openstackclient is used, the OS_COMPUTE_API_VERSION environment variable needs to be set to 2.43.
Assume we are working on DevStack and have four Ironic nodes available; the node count can be changed in DevStack's local.conf by setting the IRONIC_VM_COUNT variable, for example:
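[[local|localrc]]
IRONIC_VM_COUNT=4
With the four nodes available, the basic flow to test the feature is as follows: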
$ export OS_COMPUTE_API_VERSION=2.43
$ openstack aggregate create rack1
$ openstack aggregate create rack2
$ openstack aggregate add host rack1 $(openstack baremetal node list|grep node-0|awk '{print $2}')
$ openstack aggregate add host rack1 $(openstack baremetal node list|grep node-1|awk '{print $2}')
$ openstack aggregate add host rack2 $(openstack baremetal node list|grep node-2|awk '{print $2}')
$ openstack aggregate add host rack2 $(openstack baremetal node list|grep node-3|awk '{print $2}')
$ openstack server group create --policy aggregate-anti-affinity group1
$ openstack server create \
--image=$(openstack image list|grep x86_64-disk| awk '{print $2}') \
--flavor=1 \
--nic net-id=$(openstack network list |grep private | awk '{print $2}') \
--hint group=$(openstack server group list | grep group1 | awk '{print $2}') \
instance1
$ openstack server create \
--image=$(openstack image list|grep x86_64-disk| awk '{print $2}') \
--flavor=1 \
--nic net-id=$(openstack network list |grep private | awk '{print $2}') \
--hint group=$(openstack server group list | grep group1 | awk '{print $2}') \
instance2
This should place the two Ironic instances on two different rack aggregates. A group with the aggregate-affinity policy can be created in a similar fashion.
Soft aggregate affinity
This is a feature similar to the soft (anti-)affinity feature that was done for compute hosts. Two new weighers are introduced:
aggregate-soft-affinity
aggregate-soft-anti-affinity
They can be used to distribute instances across aggregates within an instance group under two policies: keeping instances within one aggregate (affinity) or spreading them across different aggregates (anti-affinity). If it is not possible to place an instance on the same aggregate (in the affinity case) or on a different one (in the anti-affinity case), it will be placed in the specified group anyway.
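As a rough illustration (a hypothetical sketch in plain Python, not the actual weigher code from the patches), the scoring boils down to counting aggregates shared between a candidate host and the hosts that already run group members:

# Illustrative sketch only -- the function and argument names here are
# hypothetical, not the actual Nova weigher implementation.
def soft_aggregate_weight(host_aggs, group_aggs, anti_affinity=False):
    """Score a candidate host for soft aggregate (anti-)affinity.

    host_aggs:  set of aggregate IDs the candidate host belongs to
    group_aggs: set of aggregate IDs used by existing group members
    """
    overlap = len(host_aggs & group_aggs)
    # A weigher only ranks hosts, it never removes them, so even a
    # zero-scoring host can still be selected: the instance lands
    # somewhere in the group anyway, which is the "soft" part.
    return -overlap if anti_affinity else overlap

# Group members already live in rack1 (aggregate id 1):
print(soft_aggregate_weight({1}, {1}))                      # 1 -> favoured by soft affinity
print(soft_aggregate_weight({2}, {1}))                      # 0
print(soft_aggregate_weight({2}, {1}, anti_affinity=True))  # 0 -> favoured by soft anti-affinity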
Basic usage is as follows, using the environment described above for the aggregate-affinity feature:
$ export OS_COMPUTE_API_VERSION=2.43
$ openstack aggregate create rack1
$ openstack aggregate create rack2
$ openstack aggregate add host rack1 $(openstack baremetal node list|grep node-0|awk '{print $2}')
$ openstack aggregate add host rack1 $(openstack baremetal node list|grep node-1|awk '{print $2}')
$ openstack aggregate add host rack2 $(openstack baremetal node list|grep node-2|awk '{print $2}')
$ openstack aggregate add host rack2 $(openstack baremetal node list|grep node-3|awk '{print $2}')
$ openstack server group create --policy aggregate-soft-anti-affinity group1
$ openstack server create \
--image=$(openstack image list|grep x86_64-disk| awk '{print $2}') \
--flavor=1 \
--nic net-id=$(openstack network list |grep private | awk '{print $2}') \
--hint group=$(openstack server group list | grep group1 | awk '{print $2}') \
instance1
$ openstack server create \
--image=$(openstack image list|grep x86_64-disk| awk '{print $2}') \
--flavor=1 \
--nic net-id=$(openstack network list |grep private | awk '{print $2}') \
--hint group=$(openstack server group list | grep group1 | awk '{print $2}') \
instance2
$ openstack server create \
--image=$(openstack image list|grep x86_64-disk| awk '{print $2}') \
--flavor=1 \
--nic net-id=$(openstack network list |grep private | awk '{print $2}') \
--hint group=$(openstack server group list | grep group1 | awk '{print $2}') \
instance3
Unlike with the aggregate-anti-affinity policy, creating instance3 will succeed: even though no aggregate free of group members is available, the instance is still placed in the group on one of the available hosts within it.
Configuration
Soft aggregate (anti-)affinity has another limitation, which comes from how weights currently work in Nova. Because of this, a commit changing how the scheduler selects hosts was introduced. It concerns all affinity/anti-affinity weights, not only the ones newly added for aggregates.
That change introduced a blind selection of the host from a group of weighed hosts, which are originally sorted best-fitting first. An affinity weigher will always return the full list of hosts (since weighers are not filters), ordered from best to worst. There is a high chance that nova.conf will need the filter scheduler option host_subset_size set to 1, like:
[filter_scheduler]
host_subset_size = 1
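To see why, here is a minimal sketch of the selection step (not the actual Nova code): the scheduler picks randomly from the first host_subset_size entries of the best-first weighed list, so any subset larger than 1 can throw away the weighers' ordering.

import random

def select_host(weighed_hosts, host_subset_size):
    # weighed_hosts is sorted best-first by the weighers; affinity
    # weighers return the full host list, only reordered, so a "bad"
    # host survives into the subset and can be picked blindly.
    subset = weighed_hosts[:host_subset_size]
    return random.choice(subset)

hosts = ["best", "good", "worse", "worst"]
print(select_host(hosts, 3))  # may print "worse" -- the ordering is lost
print(select_host(hosts, 1))  # always prints "best"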
Creating instances in bulk
Unfortunately, creating instances in bulk isn't possible. Here is a full explanation.
Currently, when we schedule a bulk creation of Ironic instances (or any bulk creation of instances), the filter scheduler runs the filtering over all available hosts once per requested instance.
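Schematically, the per-instance loop looks like this (a simplified, hypothetical rendition of FilterScheduler._schedule; the real method also weighs hosts and consumes their resources after each pick):

def schedule(hosts, passes_filters, num_instances):
    """Pick one host per requested instance, re-filtering each time."""
    selected = []
    for _ in range(num_instances):
        candidates = [h for h in hosts if passes_filters(h, selected)]
        if not candidates:
            raise RuntimeError("Filtering removed all hosts")
        selected.append(candidates[0])  # best weighed candidate
    return selected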
Let's take an example where we have four available Ironic hosts, divided between two aggregates, and a server group with the aggregate-affinity policy:
ubuntu@ubuntu ~/devstack ◆ (stable/ocata) $ openstack baremetal node list
+--------------------------------------+--------+---------------+-------------+--------------------+-------------+
| UUID | Name | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+--------+---------------+-------------+--------------------+-------------+
| 959734ed-8dda-4878-9d5c-ddd9a95b65ec | node-0 | None | power off | available | False |
| c105d862-2eca-4845-901e-cd8194a39248 | node-1 | None | power off | available | False |
| a204e33f-6803-4d92-ad47-5b6928e3cede | node-2 | None | power off | available | False |
| 6ee27372-884d-4db4-af27-f697fffcb7c0 | node-3 | None | power off | available | False |
+--------------------------------------+--------+---------------+-------------+--------------------+-------------+
ubuntu@ubuntu ~/devstack ◆ (stable/ocata) $ openstack server group list
+--------------------------------------+--------+--------------------+
| ID | Name | Policies |
+--------------------------------------+--------+--------------------+
| 0b96ffc0-8e96-4613-b9a8-ea4e6c7ff0e8 | group1 | aggregate-affinity |
+--------------------------------------+--------+--------------------+
ubuntu@ubuntu ~/devstack ◆ (stable/ocata) $ openstack aggregate list
+----+-------+-------------------+
| ID | Name | Availability Zone |
+----+-------+-------------------+
| 1 | rack1 | None |
| 2 | rack2 | None |
+----+-------+-------------------+
ubuntu@ubuntu ~/devstack ◆ (stable/ocata) $ openstack aggregate show rack1
+-------------------+------------------------------------------------------------------------------------+
| Field | Value |
+-------------------+------------------------------------------------------------------------------------+
| availability_zone | None |
| created_at | 2018-02-21T08:10:35.000000 |
| deleted | False |
| deleted_at | None |
| hosts | [u'959734ed-8dda-4878-9d5c-ddd9a95b65ec', u'c105d862-2eca-4845-901e-cd8194a39248'] |
| id | 1 |
| name | rack1 |
| properties | |
| updated_at | None |
| uuid | bf7a251a-edff-4688-81d7-d6cf8b201847 |
+-------------------+------------------------------------------------------------------------------------+
ubuntu@ubuntu ~/devstack ◆ (stable/ocata) $ openstack aggregate show rack2
+-------------------+------------------------------------------------------------------------------------+
| Field | Value |
+-------------------+------------------------------------------------------------------------------------+
| availability_zone | None |
| created_at | 2018-02-21T08:10:37.000000 |
| deleted | False |
| deleted_at | None |
| hosts | [u'a204e33f-6803-4d92-ad47-5b6928e3cede', u'6ee27372-884d-4db4-af27-f697fffcb7c0'] |
| id | 2 |
| name | rack2 |
| properties | |
| updated_at | None |
| uuid | 7ca81b0e-2a87-4d41-af1b-b688aedc7b25 |
+-------------------+------------------------------------------------------------------------------------+
Next, given that we can have only two nodes in each aggregate, let's create two instances in bulk:
ubuntu@ubuntu ~/devstack ◆ (stable/ocata) $ openstack server create \
--image=$(openstack image list|grep x86_64-disk|awk '{print $2}') \
--flavor=1 \
--nic net-id=$(openstack network list|grep private|awk '{print $2}') \
--hint group=$(openstack server group list|grep group1|awk '{print $2}') \
--min 2 --max 2 instance
which results in the filters being run, as these scheduler logs show:
 1  2018-02-21 09:16:53.303 DEBUG nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filter RetryFilter returned 4 host(s) from (pid=11395) get_filtered_objects /opt/stack/nova/nova/filters.py:104
 2  2018-02-21 09:16:53.304 DEBUG nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filter AvailabilityZoneFilter returned 4 host(s) from (pid=11395) get_filtered_objects /opt/stack/nova/nova/filters.py:104
 3  2018-02-21 09:16:53.304 DEBUG nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filter RamFilter returned 4 host(s) from (pid=11395) get_filtered_objects /opt/stack/nova/nova/filters.py:104
 4  2018-02-21 09:16:53.304 DEBUG nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filter DiskFilter returned 4 host(s) from (pid=11395) get_filtered_objects /opt/stack/nova/nova/filters.py:104
 5  2018-02-21 09:16:53.305 DEBUG nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filter ComputeFilter returned 4 host(s) from (pid=11395) get_filtered_objects /opt/stack/nova/nova/filters.py:104
 6  2018-02-21 09:16:53.305 DEBUG nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filter ComputeCapabilitiesFilter returned 4 host(s) from (pid=11395) get_filtered_objects /opt/stack/nova/nova/filters.py:104
 7  2018-02-21 09:16:53.305 DEBUG nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filter ImagePropertiesFilter returned 4 host(s) from (pid=11395) get_filtered_objects /opt/stack/nova/nova/filters.py:104
 8  2018-02-21 09:16:53.305 DEBUG nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filter ServerGroupAntiAffinityFilter returned 4 host(s) from (pid=11395) get_filtered_objects /opt/stack/nova/nova/filters.py:104
 9  2018-02-21 09:16:53.306 DEBUG nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filter ServerGroupAffinityFilter returned 4 host(s) from (pid=11395) get_filtered_objects /opt/stack/nova/nova/filters.py:104
10  2018-02-21 09:16:53.306 DEBUG nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filter SameHostFilter returned 4 host(s) from (pid=11395) get_filtered_objects /opt/stack/nova/nova/filters.py:104
11  2018-02-21 09:16:53.306 DEBUG nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filter DifferentHostFilter returned 4 host(s) from (pid=11395) get_filtered_objects /opt/stack/nova/nova/filters.py:104
12  2018-02-21 09:16:53.306 DEBUG nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filter ServerGroupAggregateAffinityFilter returned 4 host(s) from (pid=11395) get_filtered_objects /opt/stack/nova/nova/filters.py:104
13  2018-02-21 09:16:53.307 DEBUG nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filter ServerGroupAggregateAntiAffinityFilter returned 4 host(s) from (pid=11395) get_filtered_objects /opt/stack/nova/nova/filters.py:104
14  2018-02-21 09:16:53.307 DEBUG nova.scheduler.filter_scheduler [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filtered [(ubuntu, c105d862-2eca-4845-901e-cd8194a39248) ram: 1280MB disk: 10240MB io_ops: 0 instances: 0, (ubuntu, a204e33f-6803-4d92-ad47-5b6928e3cede) ram: 1280MB disk: 10240MB io_ops: 0 instances: 0, (ubuntu, 6ee27372-884d-4db4-af27-f697fffcb7c0) ram: 1280MB disk: 10240MB io_ops: 0 instances: 0, (ubuntu, 959734ed-8dda-4878-9d5c-ddd9a95b65ec) ram: 1280MB disk: 10240MB io_ops: 0 instances: 0] from (pid=11395) _schedule /opt/stack/nova/nova/scheduler/filter_scheduler.py:115
15  2018-02-21 09:16:53.307 DEBUG nova.scheduler.filter_scheduler [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Weighed [WeighedHost [host: (ubuntu, c105d862-2eca-4845-901e-cd8194a39248) ram: 1280MB disk: 10240MB io_ops: 0 instances: 0, weight: 2.0], WeighedHost [host: (ubuntu, a204e33f-6803-4d92-ad47-5b6928e3cede) ram: 1280MB disk: 10240MB io_ops: 0 instances: 0, weight: 2.0], WeighedHost [host: (ubuntu, 6ee27372-884d-4db4-af27-f697fffcb7c0) ram: 1280MB disk: 10240MB io_ops: 0 instances: 0, weight: 2.0], WeighedHost [host: (ubuntu, 959734ed-8dda-4878-9d5c-ddd9a95b65ec) ram: 1280MB disk: 10240MB io_ops: 0 instances: 0, weight: 2.0]] from (pid=11395) _schedule /opt/stack/nova/nova/scheduler/filter_scheduler.py:120
16  2018-02-21 09:16:53.308 DEBUG nova.scheduler.filter_scheduler [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Selected host: WeighedHost [host: (ubuntu, a204e33f-6803-4d92-ad47-5b6928e3cede) ram: 1280MB disk: 10240MB io_ops: 0 instances: 0, weight: 2.0] from (pid=11395) _schedule /opt/stack/nova/nova/scheduler/filter_scheduler.py:127
17  2018-02-21 09:16:53.308 DEBUG oslo_concurrency.lockutils [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Lock "(u'ubuntu', u'a204e33f-6803-4d92-ad47-5b6928e3cede')" acquired by "nova.scheduler.host_manager._locked" :: waited 0.000s from (pid=11395) inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:270
18  2018-02-21 09:16:53.308 DEBUG oslo_concurrency.lockutils [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Lock "(u'ubuntu', u'a204e33f-6803-4d92-ad47-5b6928e3cede')" released by "nova.scheduler.host_manager._locked" :: held 0.000s from (pid=11395) inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:282
19  2018-02-21 09:16:53.308 DEBUG nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Starting with 4 host(s) from (pid=11395) get_filtered_objects /opt/stack/nova/nova/filters.py:70
So, in the first iteration the filters return all four nodes (the new aggregate filters are on lines 12 and 13), any of which can be used to fulfil the request. Next, the second iteration is done:
 1  2018-02-21 09:16:53.310 DEBUG nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filter RetryFilter returned 4 host(s) from (pid=11395) get_filtered_objects /opt/stack/nova/nova/filters.py:104
 2  2018-02-21 09:16:53.310 DEBUG nova.scheduler.filters.ram_filter [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] (ubuntu, a204e33f-6803-4d92-ad47-5b6928e3cede) ram: 0MB disk: 0MB io_ops: 0 instances: 0 does not have 512 MB usable ram, it only has 0.0 MB usable ram. from (pid=11395) host_passes /opt/stack/nova/nova/scheduler/filters/ram_filter.py:61
 3  2018-02-21 09:16:53.310 DEBUG nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filter RamFilter returned 3 host(s) from (pid=11395) get_filtered_objects /opt/stack/nova/nova/filters.py:104
 4  2018-02-21 09:16:53.310 DEBUG nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filter DiskFilter returned 3 host(s) from (pid=11395) get_filtered_objects /opt/stack/nova/nova/filters.py:104
 5  2018-02-21 09:16:53.310 DEBUG nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filter ServerGroupAntiAffinityFilter returned 3 host(s) from (pid=11395) get_filtered_objects /opt/stack/nova/nova/filters.py:104
 6  2018-02-21 09:16:53.311 DEBUG nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filter ServerGroupAffinityFilter returned 3 host(s) from (pid=11395) get_filtered_objects /opt/stack/nova/nova/filters.py:104
 7  2018-02-21 09:16:53.311 DEBUG nova.scheduler.filters.affinity_filter [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] aggregate-affinity: check if set([1]) is a subset of set([]),host nodes: set([u'ubuntu']) from (pid=11395) host_passes /opt/stack/nova/nova/scheduler/filters/affinity_filter.py:213
 8  2018-02-21 09:16:53.311 DEBUG nova.scheduler.filters.affinity_filter [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] aggregate-affinity: check if set([2]) is a subset of set([]),host nodes: set([u'ubuntu']) from (pid=11395) host_passes /opt/stack/nova/nova/scheduler/filters/affinity_filter.py:213
 9  2018-02-21 09:16:53.311 DEBUG nova.scheduler.filters.affinity_filter [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] aggregate-affinity: check if set([1]) is a subset of set([]),host nodes: set([u'ubuntu']) from (pid=11395) host_passes /opt/stack/nova/nova/scheduler/filters/affinity_filter.py:213
10  2018-02-21 09:16:53.312 INFO nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filter ServerGroupAggregateAffinityFilter returned 0 hosts
11  2018-02-21 09:16:53.312 DEBUG nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filtering removed all hosts for the request with instance ID '9a7f787c-5074-4af3-80a2-38eaecf882a2'. Filter results: [('RetryFilter', [(u'ubuntu', u'c105d862-2eca-4845-901e-cd8194a39248'), (u'ubuntu', u'a204e33f-6803-4d92-ad47-5b6928e3cede'), (u'ubuntu', u'6ee27372-884d-4db4-af27-f697fffcb7c0'), (u'ubuntu', u'959734ed-8dda-4878-9d5c-ddd9a95b65ec')]), ('RamFilter', [(u'ubuntu', u'c105d862-2eca-4845-901e-cd8194a39248'), (u'ubuntu', u'6ee27372-884d-4db4-af27-f697fffcb7c0'), (u'ubuntu', u'959734ed-8dda-4878-9d5c-ddd9a95b65ec')]), ('DiskFilter', [(u'ubuntu', u'c105d862-2eca-4845-901e-cd8194a39248'), (u'ubuntu', u'6ee27372-884d-4db4-af27-f697fffcb7c0'), (u'ubuntu', u'959734ed-8dda-4878-9d5c-ddd9a95b65ec')]), ('ServerGroupAntiAffinityFilter', [(u'ubuntu', u'c105d862-2eca-4845-901e-cd8194a39248'), (u'ubuntu', u'6ee27372-884d-4db4-af27-f697fffcb7c0'), (u'ubuntu', u'959734ed-8dda-4878-9d5c-ddd9a95b65ec')]), ('ServerGroupAffinityFilter', [(u'ubuntu', u'c105d862-2eca-4845-901e-cd8194a39248'), (u'ubuntu', u'6ee27372-884d-4db4-af27-f697fffcb7c0'), (u'ubuntu', u'959734ed-8dda-4878-9d5c-ddd9a95b65ec')]), ('ServerGroupAggregateAffinityFilter', None)] from (pid=11395) get_filtered_objects /opt/stack/nova/nova/filters.py:129
12  2018-02-21 09:16:53.312 INFO nova.filters [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Filtering removed all hosts for the request with instance ID '9a7f787c-5074-4af3-80a2-38eaecf882a2'. Filter results: ['RetryFilter: (start: 4, end: 4)', 'RamFilter: (start: 4, end: 3)', 'DiskFilter: (start: 3, end: 3)', 'ServerGroupAntiAffinityFilter: (start: 3, end: 3)', 'ServerGroupAffinityFilter: (start: 3, end: 3)', 'ServerGroupAggregateAffinityFilter: (start: 3, end: 0)']
13  2018-02-21 09:16:53.312 DEBUG nova.scheduler.filter_scheduler [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] There are 1 hosts available but 2 instances requested to build. from (pid=11395) select_destinations /opt/stack/nova/nova/scheduler/filter_scheduler.py:76
14  2018-02-21 09:16:53.312 DEBUG oslo_messaging.rpc.server [req-6b671371-ea58-4b1d-8657-a6376d2d1d88 admin admin] Expected exception during message handling () from (pid=11395) _process_incoming /usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py:158
This time, as we can see on line 10, ServerGroupAggregateAffinityFilter returned 0 hosts. Log lines 7-9 give us a hint that none of the candidates fulfils the requirement. The filter looks like this (I've removed some comments and uninteresting parts for readability):
 1  def host_passes(self, host_state, spec_obj):
 2      # ...
 3      host_aggs = set(agg.id for agg in host_state.aggregates)
 4
 5      if not host_aggs:
 6          return self.REVERSE_CHECK
 7
 8      # Take all hypervisor nodenames and hostnames
 9      host_nodes = set(spec_obj.instance_group.nodes +
10                       spec_obj.instance_group.hosts)
11
12      if not host_nodes:
13          # There are no members of the server group yet
14          return True
15
16      # Grab all aggregates for all hosts in the server group and ensure we
17      # have an intersection with this host's aggregates
18      group_aggs = set()
19      for node in host_nodes:
20          group_aggs |= self.host_manager.host_aggregates_map[node]
21
22      LOG.debug(...)
23
24      if self.REVERSE_CHECK:
25          return host_aggs.isdisjoint(group_aggs)
26      return host_aggs.issubset(group_aggs)
In this filter we first collect the IDs of the aggregates the host belongs to into a set. If that set is empty, the node either cannot satisfy the constraint (in the aggregate-affinity case) or trivially satisfies it (in the aggregate-anti-affinity case).
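Concretely (plain Python, using the rack aggregate IDs from the example above), the two policies reduce to the set operations on lines 24-26:

group_aggs = {1}  # aggregates used by existing group members (rack1)

# aggregate-affinity: the candidate's aggregates must all be in use
print({1}.issubset(group_aggs))    # True  -> rack1 host passes
print({2}.issubset(group_aggs))    # False -> rack2 host is filtered out

# aggregate-anti-affinity (REVERSE_CHECK): must share no aggregate
print({2}.isdisjoint(group_aggs))  # True  -> rack2 host passes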
Next, there is a check on the instance_group hosts and nodes (the nodes field is added for the Ironic case; otherwise the only name we have for an Ironic node is… the hostname, which originates from the compute service). If no instance has been created yet, the current host can pass, since there are no hosts in the group yet.
If there are some nodenames/hostnames in the set, we try to match the host's aggregates against each of them (line 20). And here is the issue: the instance_group provided by the request spec object (spec_obj) has its hosts field filled out during scheduling, but its nodes field stays empty until an instance has actually been created. This is the reason why we can create instances one by one, but not in bulk.
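The failing second iteration from the logs can be reproduced stand-alone (a hypothetical rendition of lines 18-20 of the filter; node UUIDs are shortened to n0..n3, and a missing map entry is treated as an empty set):

# Aggregate membership is keyed by Ironic node UUIDs; the compute
# service hostname 'ubuntu' has no entry of its own.
host_aggregates_map = {"n0": {1}, "n1": {1}, "n2": {2}, "n3": {2}}

group_hosts = {"ubuntu"}  # hosts field: filled in during scheduling
group_nodes = set()       # nodes field: empty until an instance exists

group_aggs = set()
for member in group_nodes | group_hosts:
    group_aggs |= host_aggregates_map.get(member, set())

print(group_aggs)                # set() -- matches "a subset of set([])" in the logs
print({1}.issubset(group_aggs))  # False -> every candidate host is filtered out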