Ceph: maintenance mode, use case and common operations
Quick tips on how to manage a production environment. A simple use case here: putting your Ceph journal on an SSD in a production cluster while clients are writing.
This is the current state of the cluster:
$ ceph osd tree
Let’s say that you just bought an SSD and you want to put your Ceph journal on it. Your cluster is in production and clients are writing data, so you can’t simply stop your OSD and replace everything: the cluster would detect a failure and start to recover. Assuming you want to perform a maintenance action on OSD 2, you should first set the noout flag, which tells the cluster not to mark any OSD as out (and thus not to rebalance) while it is down.
$ ceph osd set noout
You will immediately notice that the status has changed. After this you can safely bring down your OSD. PGs will enter a degraded state because the noout
option prevents the OSD from being marked out of the cluster. Because of this, the PG replica count can’t be properly honored anymore. See the example below:
health HEALTH_WARN 54 pgs degraded; 54 pgs stuck unclean; 1/3 in osds are down; noout flag(s) set
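Before stopping the OSD, it can be worth double-checking programmatically that the flag really is set. A minimal sketch: check_noout is a hypothetical helper that simply looks for the same "noout flag(s) set" marker shown in the health output above.

```shell
# Hypothetical helper: succeed only if the noout flag is set.
# It inspects a `ceph health` line for the "noout flag(s) set" marker.
check_noout() {
  case "$1" in
    *"noout flag(s) set"*) return 0 ;;
    *) return 1 ;;
  esac
}

# On a live cluster you would call it with the real health output:
#   check_noout "$(ceph health)" || { echo "noout not set, aborting"; exit 1; }
```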
$ sudo service ceph stop osd.2
Check the CRUSH tree:
$ ceph osd tree
The cluster will show a degraded state, which is normal since one OSD is now down (but still in, thanks to the noout flag). However, the cluster won’t attempt any recovery. After that, flush the content of your journal to commit pending transactions to the backend filesystem:
$ ceph-osd -i 2 --flush-journal
Do whatever you want with the previous journal… Mount your SSD:
$ sudo mount /dev/sdc /journal
Then create a new journal; if you don’t specify any path, Ceph will use the path from your configuration file:
$ ceph-osd -i 2 --mkjournal
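If you want the new location picked up from the configuration file, the journal path can be set per OSD in ceph.conf via the osd journal option. A sketch, assuming the SSD is mounted on /journal as above; the file layout under /journal is an assumption, organize it however you like:

```ini
[osd.2]
; Assumed layout: one journal file per OSD under the SSD mount point.
osd journal = /journal/osd.2/journal
```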
Restart your OSD and watch its status change:
$ sudo service ceph start osd.2
Finally, unset the noout flag:
$ ceph osd unset noout
Everything should be normal:
$ ceph osd tree
Enjoy the speed of your journal on its SSD! Note that these manipulations are also valid for updates/upgrades and other common maintenance tasks.
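As a recap, the whole procedure can be sketched as a small dry-run script: print_steps only prints each command for a given OSD id instead of executing it (pipe its output to sh, with root privileges, to actually run it). The OSD id, device and mount point are the ones from the example above.

```shell
#!/bin/sh
# Dry-run recap of the journal migration: list every step for one OSD.
print_steps() {
  osd=$1
  echo "ceph osd set noout"               # stop the cluster from rebalancing
  echo "service ceph stop osd.$osd"       # bring the OSD down
  echo "ceph-osd -i $osd --flush-journal" # commit pending transactions
  echo "mount /dev/sdc /journal"          # mount the SSD (device from the example)
  echo "ceph-osd -i $osd --mkjournal"     # create the new journal
  echo "service ceph start osd.$osd"      # bring the OSD back up
  echo "ceph osd unset noout"             # re-enable normal recovery
}

print_steps 2
```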