How I barely got my first Ceph monitor running in Docker
Docker is definitely the new trend, so I quickly wanted to try putting a Ceph monitor inside a Docker container. Story of a tough journey…
First, let’s start with the Dockerfile, which makes the setup easy and repeatable by anybody:
FROM ubuntu:latest
MAINTAINER Sebastien Han <[email protected]>
# Hack for initctl not being available in Ubuntu
RUN dpkg-divert --local --rename --add /sbin/initctl
RUN ln -s /bin/true /sbin/initctl
# Repo and packages
RUN echo deb http://archive.ubuntu.com/ubuntu precise main | tee /etc/apt/sources.list
RUN echo deb http://archive.ubuntu.com/ubuntu precise-updates main | tee -a /etc/apt/sources.list
RUN echo deb http://archive.ubuntu.com/ubuntu precise universe | tee -a /etc/apt/sources.list
RUN echo deb http://archive.ubuntu.com/ubuntu precise-updates universe | tee -a /etc/apt/sources.list
RUN apt-get update
RUN apt-get install -y --force-yes wget lsb-release sudo
# Fake a fuse install otherwise ceph won't get installed
RUN apt-get install -y libfuse2
RUN cd /tmp ; apt-get download fuse
RUN cd /tmp ; dpkg-deb -x fuse_* .
RUN cd /tmp ; dpkg-deb -e fuse_*
RUN cd /tmp ; rm fuse_*.deb
# RUN uses /bin/sh, whose echo may not support -en, so use printf
RUN cd /tmp ; printf '#!/bin/bash\nexit 0\n' > DEBIAN/postinst
RUN cd /tmp ; dpkg-deb -b . /fuse.deb
RUN cd /tmp ; dpkg -i /fuse.deb
# Install Ceph
RUN wget -q -O- 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc' | apt-key add -
RUN echo deb http://ceph.com/debian-dumpling/ $(lsb_release -sc) main | tee /etc/apt/sources.list.d/ceph-dumpling.list
RUN apt-get update
RUN apt-get install -y --force-yes ceph ceph-deploy
# Avoid host resolution error from ceph-deploy
RUN echo ::1 ceph-mon | tee /etc/hosts
# Deploy the monitor
RUN ceph-deploy new ceph-mon
EXPOSE 6789
Then build the image:
$ sudo docker build -t leseb/ceph-mon .
Now we almost have the full image; we just need to instruct Docker to install the monitor. For this, we simply run the image that we just created and pass the command that creates the monitor:
$ docker run -d -h="ceph-mon" leseb/ceph-mon ceph-deploy --overwrite-conf mon create ceph-mon
Check if it works properly:
$ docker logs e2f48f3cca26
Then commit the last version of your image to save the latest change:
$ docker commit e2f48f3cca26 leseb/ceph-mon
Finally run the monitor in a new container:
$ docker run -d -p 6789 -h="ceph-mon" leseb/ceph-mon ceph-mon --conf /ceph.conf --cluster=ceph -i ceph-mon -f
Now the tough part: because of the use of ceph-deploy, the monitor listens on the IPv6 local address (::1). Under normal circumstances this is not a problem, since we can reach the monitor either through its local IP (lo) or its private address (eth0 or something else).
However, with Docker things are a little bit different: the monitor is only accessible from within its namespace, so even exposing a port won’t work.
Basically, exposing a port creates an iptables DNAT rule that says: everything that goes from anywhere to the host IP address on a specific port is redirected to the IP address within the container namespace.
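To make the DNAT idea concrete, here is a minimal sketch that only prints a rule of the kind Docker sets up when a port is published. The DOCKER chain name and the host port 49153 are assumptions for illustration; 172.17.0.8 is the container address used later in this post. Nothing is applied to the host.

```shell
#!/bin/sh
# Print (do not run) an iptables DNAT rule similar to the one created
# when a container port is published.
# Assumptions: the DOCKER chain name and the host port are illustrative.
dnat_rule() {
    host_port=$1
    container_ip=$2
    container_port=$3
    printf 'iptables -t nat -A DOCKER -p tcp --dport %s -j DNAT --to-destination %s:%s\n' \
        "$host_port" "$container_ip" "$container_port"
}

# Hypothetical host port 49153 forwarded to the monitor port 6789
dnat_rule 49153 172.17.0.8 6789
```

On a real host, `sudo iptables -t nat -L -n` lists the NAT rules that are actually in place.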
In the end, if you try to access the monitor using the IP address of the host plus the exposed port, you will get something like this:
.connect claims to be [::1]:6804/1031425 not [::1]:6804/31537 - wrong node!
There is a way to access the monitor, though! We need to access it from the host, directly through the namespace.
First grab your container’s ID:
$ docker ps
Use this script, stolen and adapted from Jérôme Petazzoni here. This script creates the entry point on the host to access the namespace of the container.
Execute it:
$ ./pipework.sh 9cfa541f6be9
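For readers who want a feel for what such a script does, here is a minimal sketch of the general technique. It is an assumption-laden illustration, not the script used in this post: it looks up the container's main PID with `docker inspect` and symlinks that process's network namespace under /var/run/netns, where `ip netns` looks for named namespaces. Naming the namespace after the PID and the helper names `netns_path` and `expose_netns` are choices made for this example.

```shell
#!/bin/sh
# Sketch: make a container's network namespace visible to `ip netns`.
# Assumption: this is an illustration, not the pipework-derived script
# from the post. It needs root and a running Docker daemon to do anything.

# Path under which `ip netns` expects a named namespace.
netns_path() {
    printf '/var/run/netns/%s\n' "$1"
}

# Link the namespace of the container's main process and print its PID,
# which then serves as the namespace name for `ip netns exec`.
expose_netns() {
    container_id=$1
    pid=$(docker inspect --format '{{.State.Pid}}' "$container_id") || return 1
    mkdir -p /var/run/netns
    ln -sf "/proc/$pid/ns/net" "$(netns_path "$pid")"
    echo "$pid"
}

# Only act when a container ID is passed on the command line.
if [ "$#" -ge 1 ]; then
    expose_netns "$1"
fi
```

Under these assumptions, the printed PID is the name to pass to `ip netns exec`.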
Now, get the monitor’s key:
$ cp /var/lib/docker/containers/9cfa541f6be97821131355b4005bc24b509baf3028759f0f871bf43840399f96/rootfs/ceph.mon.keyring ceph.mon.docker.keyring
Ouahh YEAH!
$ sudo ip netns exec 10660 ceph -k ceph.mon.docker.keyring -n mon. -m 172.17.0.8 -s
III. Issues and caveats
I’m not really convinced by this first shot. The biggest issue here is that the monitor needs to be known, and reachable, by everything that talks to it.
Wow, that was a hell of a job to get this working. In the end, the effort is quite useless since nothing can reach the monitor except the host itself. Thus, other Ceph components will only work if they share the same network namespace as the monitor. Sharing one namespace across all the containers could be quite difficult as well. But what’s the point of having a Ceph cluster stuck within some namespaces, without any clients accessing it?
I have to admit that this was pretty fun to hack, although in practice it’s not usable at all. So you can consider this an experiment and a way to get into Docker ;-).