Issue type: Pacemaker NFS/DRBD cluster.

I found 6 posts about this all around the web: 2 deviated from my configuration, the rest got no answer. Seems legit.
## My Pacemaker setup

Architecture:

- 2 nodes
- Pacemaker active/passive
```
# crm configure show
node ha-node01 \
        attributes standby="off"
node ha-node02 \
        attributes standby="on"
primitive drbd_nfs ocf:linbit:drbd \
        params drbd_resource="r0" \
        op monitor interval="15s"
primitive fs_nfs ocf:heartbeat:Filesystem \
        params device="/dev/drbd0" directory="/mnt/data/magento" fstype="ext3" \
        op start interval="0" timeout="60" \
        op stop interval="0" timeout="120"
primitive nfs lsb:nfs-kernel-server \
        op monitor interval="5s" \
        meta target-role="Started"
primitive vip1 ocf:heartbeat:IPaddr2 \
        params ip="172.18.34.63" nic="bond1.1034" \
        op monitor interval="5s"
group HAServices vip1 fs_nfs nfs \
        meta target-role="Started"
ms ms_drbd_nfs drbd_nfs \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
colocation ms-drbd-nfs-with-haservices inf: ms_drbd_nfs:Master HAServices
order fs-nfs-before-nfs inf: fs_nfs:start nfs:start
order ip-before-ms-drbd-nfs inf: vip1:start ms_drbd_nfs:promote
order ms-drbd-nfs-before-fs-nfs inf: ms_drbd_nfs:promote fs_nfs:start
property $id="cib-bootstrap-options" \
        dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"
```
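The drbd_nfs primitive references a DRBD resource named r0. For context, a minimal r0 definition could look like the sketch below; only the resource name and /dev/drbd0 come from my setup, while the backing disk and replication addresses are assumptions for illustration:

```
# /etc/drbd.d/r0.res -- illustrative sketch, not my actual file
resource r0 {
    on ha-node01 {
        device    /dev/drbd0;     # matches the fs_nfs primitive above
        disk      /dev/sdb1;      # assumption: backing block device
        address   10.0.0.1:7788;  # assumption: replication link IP
        meta-disk internal;
    }
    on ha-node02 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.2:7788;  # assumption: replication link IP
        meta-disk internal;
    }
}
```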
## General cluster status

The resource is running on the second node:
```
============
Last updated: Wed Apr 25 11:40:11 2012
Stack: openais
Current DC: ha-node02 - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Node ha-node02: standby
Online: [ ha-node01 ]

 Resource Group: HAServices
     vip1       (ocf::heartbeat:IPaddr2):       Started ha-node02
     fs_nfs     (ocf::heartbeat:Filesystem):    Started ha-node02
     nfs        (lsb:nfs-kernel-server):        Started ha-node02
 Master/Slave Set: ms_drbd_nfs
     Masters: [ ha-node02 ]
     Slaves: [ ha-node01 ]
```
When I tried to put the node holding the resources into standby mode, things went wrong:
```
# crm node standby ha-node02
============
Last updated: Wed Apr 25 11:40:12 2012
Stack: openais
Current DC: ha-node02 - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Node ha-node02: standby
Online: [ ha-node01 ]

 Resource Group: HAServices
     vip1       (ocf::heartbeat:IPaddr2):       Started ha-node01
     fs_nfs     (ocf::heartbeat:Filesystem):    Stopped
     nfs        (lsb:nfs-kernel-server):        Stopped
 Master/Slave Set: ms_drbd_nfs
     Slaves: [ ha-node01 ]
     Stopped: [ drbd_nfs:0 ]

Failed actions:
    fs_nfs_start_0 (node=ha-node01, call=14, rc=1, status=complete): unknown error
```
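When fs_nfs fails like this, it is worth asking DRBD itself what state the device is in: the Filesystem resource can only mount /dev/drbd0 on a node where r0 has been promoted to Primary. Standard DRBD 8.x commands to check:

```
# Role of r0 as seen from this node (e.g. "Secondary/Primary")
drbdadm role r0

# Full replication status: connection state, roles, disk states
cat /proc/drbd
```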
## What did the logs say?
Corosync log:
```
ha-node02 lrmd: [23026]: WARN: For LSB init script, no additional parameters are needed
```
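To pull everything related to the failed resource out of the log, a grep along these lines works; the log path is an assumption and depends on how corosync logging is configured on your system:

```
# Show recent log lines mentioning the failed resource or warnings/errors
grep -E 'fs_nfs|Filesystem|WARN|ERROR' /var/log/corosync/corosync.log | tail -n 50
```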
## Solution: clean up your resource!
Simply run this command and your cluster will come back to life. In my setup the incriminated resource is fs_nfs, so I ran:

```
crm resource cleanup fs_nfs
```
Extract from the Cluster Labs documentation:

> Cleanup resource status. Typically done after the resource has temporarily failed. If a node is omitted, cleanup on all nodes. If there are many nodes, the command may take a while.
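As the documentation says, the cleanup can also be restricted to a single node. For example, to reset the failed state of fs_nfs only where it actually failed:

```
crm resource cleanup fs_nfs ha-node01
```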
Never give up!