OpenStack High Availability 1/??
First article of a long series on building a highly available OpenStack platform. This one is more of a state of the art of OpenStack HA.
The main idea is to build a clustered cloud. A clustered cloud? It can be really useful, particularly if you don’t have a lot of servers at your disposal. It’s really important to stick to the KISS principle. A clustered cloud is just an infrastructure based on a cloud operating system (like OpenStack) which uses cluster management software such as Pacemaker and a replication layer like DRBD. Pacemaker and Corosync are 2 amazing solutions for building clusters: Pacemaker is the cluster resource manager and Corosync handles the communication layer.
#I. Nova components
Bringing high availability to the nova components is not an easy task, especially because some of them are really critical.
- nova-api
- nova-scheduler
- nova-consoleauth
- nova-cert
Since there is currently no resource agent available for these services, I thought about starting with LSB agents (the services’ init scripts) and maybe writing a proper resource agent later.
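To make the LSB idea concrete, here is a minimal sketch that generates one crm shell `primitive` line per nova service, assuming each service ships an init script with the same name under /etc/init.d. The `p_` resource prefix and the 30s monitor interval are my own illustrative choices, not tested values.

```python
# Services to be managed through their LSB init scripts.
NOVA_SERVICES = ["nova-api", "nova-scheduler", "nova-consoleauth", "nova-cert"]


def lsb_primitive(service, monitor_interval="30s"):
    """Return a crm shell command declaring `service` as an LSB resource.

    The resource id (p_<service>) and the monitor interval are
    illustrative assumptions.
    """
    return (
        f"crm configure primitive p_{service} lsb:{service} "
        f"op monitor interval={monitor_interval}"
    )


def nova_primitives(services=NOVA_SERVICES):
    """Build one primitive declaration per nova service."""
    return [lsb_primitive(service) for service in services]


if __name__ == "__main__":
    for line in nova_primitives():
        print(line)
```

Keep in mind that Pacemaker only handles an LSB resource properly if the init script is LSB-compliant, in particular the exit codes of its `status` action, so it is worth checking each script before relying on it.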
There are three remaining components:
- nova-compute: at the moment the main idea is to set up at least 2 compute nodes and use live migration. It’s not high availability strictly speaking. Since the cloud is designed for failure, simply trust this mechanism.
- nova-network: ideally hosted on the same node as the nova-compute service.
- nova-volume: see the table below
nova-volume
| Backend | Object storage | Block storage | POSIX filesystem | HA | Scale-in | Scale-out | OpenStack driver | Production ready |
| Local LVM | | | | | | | | |
| Nexenta | | | | | | | | |
| NFS | | | | | | | | |
| SAN | | | | | | | | |
| Sheepdog | | | | | | | | |
| Swift | | | | | | | | |
| Ceph | | | | | | | | |
| GlusterFS | | | | | | | | |
#II. Identity service: Keystone
The company hastexo provides a Pacemaker-compatible resource agent for Keystone.
#III. Dashboard
The Horizon dashboard is based on the Django framework and natively hosted on Apache. A resource agent for Apache (ocf:heartbeat:apache) is available for Pacemaker.
#IV. Glance
The first recommendation here is to set up a 2-node active/passive Pacemaker cluster with the available resource agents. Those RAs are provided by hastexo, many thanks.
I haven’t tried them yet, but soon enough!
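As a rough idea of what the active/passive layout could look like, the sketch below generates crm shell lines for a virtual IP and the two Glance services, grouped so they always run together on the active node. The agent names `ocf:openstack:glance-api` and `ocf:openstack:glance-registry` are an assumption about how the hastexo agents are installed, and the resource ids and the example IP are made up. As said above, none of this has been tested.

```python
def glance_cluster_config(vip, cidr_netmask=24, monitor_interval="30s"):
    """Return crm shell lines for an active/passive Glance setup.

    Assumptions: the Glance agents are installed under the 'openstack'
    OCF provider; resource ids (p_*, g_glance) are illustrative.
    """
    op = f"op monitor interval={monitor_interval}"
    return [
        # Virtual IP that clients use to reach whichever node is active.
        f"primitive p_glance_vip ocf:heartbeat:IPaddr2 "
        f"params ip={vip} cidr_netmask={cidr_netmask} {op}",
        f"primitive p_glance-registry ocf:openstack:glance-registry {op}",
        f"primitive p_glance-api ocf:openstack:glance-api {op}",
        # A group keeps its members on one node and starts them in order.
        "group g_glance p_glance_vip p_glance-registry p_glance-api",
    ]


if __name__ == "__main__":
    for line in glance_cluster_config("192.168.0.10"):
        print(line)
```

Using a group is the simplest way to express "everything on one node": if the active node fails, Pacemaker moves the whole group, VIP included, to the passive node.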
#V. Queues
RabbitMQ offers native, built-in active/active clustering which is really easy to set up. For more information, take a look at the RabbitMQ article. I will release an article about RabbitMQ HA soon. I have already tested it on bare metal.
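To give an idea of how little is involved, here is a small sketch of the commands a second node would run to join an existing one. It assumes a RabbitMQ release that provides `rabbitmqctl join_cluster` and that both nodes already share the same Erlang cookie. The hostname `node1` is hypothetical.

```python
def join_cluster_commands(seed_node):
    """Return the rabbitmqctl commands to run on a node joining a cluster.

    `seed_node` is the short hostname of a node already in the cluster
    (hypothetical here). Assumes `rabbitmqctl join_cluster` is available.
    """
    return [
        ["rabbitmqctl", "stop_app"],  # stop the broker, keep the Erlang VM up
        ["rabbitmqctl", "join_cluster", f"rabbit@{seed_node}"],
        ["rabbitmqctl", "start_app"],  # restart the broker as a cluster member
    ]


if __name__ == "__main__":
    for command in join_cluster_commands("node1"):
        print(" ".join(command))
```

The function only builds the command lists. Running them (for example with `subprocess.run`) has to happen on the joining node itself.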
#VI. Databases
I’m a pretty big fan of the Galera replicator. It’s also supported by Percona, and I’m using it for most of my setups. I think Galera is currently the best master-master replication solution. Check my previous article about it.
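For illustration, the sketch below builds the Galera-related part of a MySQL configuration from a list of node addresses. The node IPs and the cluster name are made up, and `wsrep_provider` is left out on purpose since the library path depends on the distribution, so treat this as a starting point rather than a complete configuration.

```python
def galera_settings(nodes, cluster_name="openstack_db"):
    """Return a [mysqld] fragment with the core Galera settings.

    `nodes` is the list of cluster member addresses. The default
    cluster name is illustrative. wsrep_provider is deliberately
    omitted because its path varies between distributions.
    """
    if not nodes:
        raise ValueError("at least one node address is required")
    lines = [
        "[mysqld]",
        "binlog_format=ROW",              # Galera requires row-based replication
        "default_storage_engine=InnoDB",  # only InnoDB tables are replicated
        "innodb_autoinc_lock_mode=2",     # interleaved auto-increment locking
        f"wsrep_cluster_name={cluster_name}",
        f"wsrep_cluster_address=gcomm://{','.join(nodes)}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(galera_settings(["10.0.0.1", "10.0.0.2", "10.0.0.3"]))
```

Every node gets the same `wsrep_cluster_address`, which is what makes the setup symmetric: any node can accept writes.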
I think it’s a good way to start!