OpenStack: Glance and Keystone HA
The purpose of this article is to achieve high-availability for some OpenStack components.
I. Prerequisites
The purpose of this article is not to set up Glance or Keystone, so I leave their configuration to you. I assume that you have a working installation of Glance and Keystone on each node. Perform some tests from the command line and make sure that you're able to reach the content of your database, etc. After this, you need to prevent those services from starting at boot. If you run Ubuntu Server it's really easy: you can use the .override file facility for this. On other Linux distros, look at tools like update-rc.d, insserv or chkconfig.
$ sudo echo "manual" > /etc/init/glance-api.override |
If you want to re-enable them, simply delete the .override files. From now on, Pacemaker will be the only one managing them. The OpenStack resource agents are not part of the official ClusterLabs repository, so you need to download them:
$ sudo mkdir /usr/lib/ocf/resource.d/openstack
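Then drop the glance-api, glance-registry and keystone agents into that directory and make them executable. The URLs below are placeholders; point them at the raw files of the repository (or fork) you want to use:

$ cd /usr/lib/ocf/resource.d/openstack
$ sudo wget <raw-url>/glance-api
$ sudo wget <raw-url>/glance-registry
$ sudo wget <raw-url>/keystone
$ sudo chmod +x /usr/lib/ocf/resource.d/openstack/*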
You will notice that the keystone RA points to my personal Github. Indeed, there is an indentation error in the keystone RA script; I tried multiple times with the default RA and with proper indentation. You can try the one provided by Hastexo and, if it doesn't work, try mine :). I have a couple of pending pull requests and I'm waiting for approval from the hastexo guys :).
Pull 1. The if ! check_binary test seems badly indented. It's weird because the indentation is similar to the one proposed by the OCF Resource Agent Developer's Guide. In practice this test is never performed and the function keystone_monitor returns $OCF_SUCCESS anyway. Thus, Pacemaker believes that the resource is running: the keystone daemon is indeed up, but the database is not reachable from the keystone client binary. After this change, the RA works like a charm :)
The exact changes are visible in the pull request on my Github.
Pull 2. The function keystone_monitor is called too quickly, while the server is not up yet. This generates false errors in the logs because the server is still coming up. Putting the sleep at the beginning of the loop makes sure that the server is up (most of the time).
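To make the idea concrete, here is a simplified sketch of that start-up loop, not the literal RA code; the only change is moving the sleep ahead of the first monitor call:

# Spin waiting for the server to come up (simplified sketch).
while true; do
    # Sleeping first gives the freshly spawned keystone daemon a chance
    # to come up before we probe it, avoiding spurious errors in the logs.
    sleep 1
    keystone_monitor
    rc=$?
    [ $rc -eq $OCF_SUCCESS ] && break
    if [ $rc -ne $OCF_NOT_RUNNING ]; then
        ocf_log err "keystone start failed"
        exit $OCF_ERR_GENERIC
    fi
done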
II. Set up the resources
I will skip the Pacemaker installation steps and assume that you have 2 nodes configured. You can easily check the membership:
$ sudo corosync-objctl | grep member
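On a healthy two-node cluster, both node IPs should show up as joined members, along these lines (addresses and member IDs are illustrative):

runtime.totem.pg.mrp.srp.members.1.ip=r(0) ip(192.168.22.34)
runtime.totem.pg.mrp.srp.members.1.status=joined
runtime.totem.pg.mrp.srp.members.2.ip=r(0) ip(192.168.22.35)
runtime.totem.pg.mrp.srp.members.2.status=joined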
Both nodes should also appear online in the output of the crm_mon command:
$ sudo crm_mon -1
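The output should look roughly like this (hostnames are only examples):

============
2 Nodes configured, 2 expected votes
0 Resources configured.
============

Online: [ node-01 node-02 ]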
A two-node cluster comes with a few prerequisites:
- Disable STONITH
- Ignore the quorum policy
- Set the resource stickiness to prevent resource failbacks. It's not mandatory but recommended. Don't forget that stickiness is additive in groups: every active member of the group contributes its stickiness value to the group's total. Here we have 4 resource agents, each with a stickiness of 100, so the group as a whole will prefer its current location with a score of 400.
I truly advise you to use the crm shell by typing sudo crm, because it offers auto-completion of the commands, which is really great. Or you can use the binary from the shell like this:
$ sudo crm configure property stonith-enabled=false
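The two remaining settings from the list above follow the same pattern (a stickiness of 100 per resource, as discussed):

$ sudo crm configure property no-quorum-policy=ignore
$ sudo crm configure rsc_defaults resource-stickiness=100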
We will start by creating a floating IP address for the cluster:
$ sudo crm configure primitive p_vip ocf:heartbeat:IPaddr \
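The full command might look like this; the address matches the floating IP used later in this article, but the netmask, NIC and monitor interval are just examples:

$ sudo crm configure primitive p_vip ocf:heartbeat:IPaddr \
    params ip="172.17.1.80" cidr_netmask="24" nic="eth0" \
    op monitor interval="30s"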
Adapt the different values to your setup, and check the status of your cluster:
$ sudo crm_mon -1
Now we are about to add the glance and keystone primitives:
$ sudo crm configure primitive p_glance_api ocf:openstack:glance-api \
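For reference, a full set of the three primitives could look like this. The parameter values (credentials, users, auth URL, intervals) are placeholders; run crm ra info ocf:openstack:keystone (and the same for the Glance agents) to see the exact parameters supported by the RAs you downloaded:

$ sudo crm configure primitive p_keystone ocf:openstack:keystone \
    params os_auth_url="http://172.17.1.80:5000/v2.0/" \
    os_username="admin" os_password="secret" user="keystone" \
    op monitor interval="30s" timeout="30s"
$ sudo crm configure primitive p_glance_api ocf:openstack:glance-api \
    params os_auth_url="http://172.17.1.80:5000/v2.0/" \
    os_username="admin" os_password="secret" user="glance" \
    op monitor interval="30s" timeout="30s"
$ sudo crm configure primitive p_glance_registry ocf:openstack:glance-registry \
    params user="glance" \
    op monitor interval="30s" timeout="30s"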
You must modify these values according to your setup; the os_auth_url must point to the IP address of the p_vip primitive:
- os_auth_url
- os_password
- os_username
- user
Finally, create a group to host all the resources:
$ sudo crm configure group g_ha_glance_keystone p_vip p_keystone p_glance_api p_glance_registry
And check the result:
$ sudo crm_mon -1
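If everything went well, the whole group should be started on one node, along these lines (hostnames are examples):

Online: [ node-01 node-02 ]

 Resource Group: g_ha_glance_keystone
     p_vip              (ocf::heartbeat:IPaddr):           Started node-01
     p_keystone         (ocf::openstack:keystone):         Started node-01
     p_glance_api       (ocf::openstack:glance-api):       Started node-01
     p_glance_registry  (ocf::openstack:glance-registry):  Started node-01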
III. Special note about the IPaddr RA
When the IPaddr RA adds a new IP address, it automatically adds a route for that address. In my current setup, this installation is hosted inside OpenStack instances and I don't need this extra route. For example, the network of my VMs is 192.168.22.32/27; when I added the floating IP address I chose 172.17.1.80/24, which is part of my physical network (floating IPs). The NIC of each VM is bridged to the physical NIC of my compute node and nova-network takes care of the rest. Long story short, I simply modified the resource agent to remove the route every time the RA is called.
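Conceptually, the tweak boils down to dropping the route that IPaddr just created, something like the line below; the subnet comes from the example above and the interface name is an assumption:

# Remove the route added for the floating IP's network
# (adjust the subnet and interface to your own environment).
ip route del 172.17.1.0/24 dev eth0 || true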
This custom RA, along with the changes I made, is available on my Github.
You may need to set up finer constraints for your cluster, but this basic configuration will be enough most of the time. Achieving high availability for Glance and Keystone wouldn't have been possible without the tremendous work of the hastexo guys. Many thanks to them :)