Debian Heartbeat: Commands, Tips and Tricks

How to move the services manually to the secondary node (candy):

Keep in mind that this also prevents the resource from failing back automatically once the primary node is online again. That is handy when you plan several reboots of the primary node and don't want the services to move each time.
crm resource migrate nfs-group candy
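
Since the migrate command pins the group to the target node, you will usually want to release it once the maintenance is over. Here is a minimal sketch of both steps. It assumes your crm shell accepts "resource unmigrate" (newer versions may prefer other names for the same operation), and the CRM variable is a hypothetical override I added so the functions can be dry-run with CRM=echo:

```shell
#!/bin/sh
# CRM is a hypothetical override (not part of crm itself): set CRM=echo
# to print the commands instead of running them.
CRM="${CRM:-crm}"

# Move a resource group to a node. This leaves a location constraint
# behind, which is what blocks the automatic fail-back.
move_group() {
    $CRM resource migrate "$1" "$2"
}

# Remove the constraint created by migrate so the cluster is free to
# place the group again.
release_group() {
    $CRM resource unmigrate "$1"
}
```

For example, run move_group nfs-group candy before the reboots and release_group nfs-group afterwards. Whether the group actually moves back after the release depends on your stickiness settings.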

How to stop/start a resource group:

crm resource stop nfs-group
crm resource start nfs-group

One way to modify the cluster crm configuration:

crm
configure
edit
At this point you are dropped into the editor (vi by default) and can edit anything you want. Do not forget to save and exit afterwards. Once done, continue with:
commit
exit
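
Before editing the live configuration it can be useful to keep a copy of the current one. This is only a sketch, assuming "crm configure show" prints the whole configuration; the backup_config name, the default /root directory and the CRM override variable are my own illustrative choices:

```shell
#!/bin/sh
# CRM is a hypothetical override so the function can be tested with a stub.
CRM="${CRM:-crm}"

# Save the current cluster configuration to a timestamped file
# and print the path of that file.
backup_config() {
    dir="${1:-/root}"
    file="$dir/crm-config-$(date +%Y%m%d-%H%M%S).txt"
    $CRM configure show > "$file" || return 1
    echo "$file"
}
```

Calling backup_config /root right before "crm configure edit" gives you something to compare against (or reload from) if the edit goes wrong.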

How to add another DRBD volume in the existing cluster configuration:

1. Add new resource:
primitive drbd_squid ocf:heartbeat:Filesystem \
        params fstype="reiserfs" directory="/nfs/squid" device="/dev/drbd1" \
        meta target-role="Started"
2. Teach the cluster to mount the filesystems in the right order (drbd_fs is mounting the /nfs partition so it must be started first):
order drbd_fs-before-drbd_squid mandatory: drbd_fs:promote drbd_squid:start
colocation drbd_squid-on-drbd_fs inf: drbd_squid drbd_fs:Master
3. Edit drbd_main and add nfsquid to drbd_resource:
primitive drbd_main ocf:heartbeat:drbd \
        params drbd_resource="nfs nfsquid" \
        op monitor interval="59s" role="Master" timeout="30s" \
        op monitor interval="60s" role="Slave" timeout="30s"
4. Add the new resource to the existing nfs group:

group nfs-group drbd_fs drbd_squid nfs_server nfs_common nfs_ip
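
After committing a change like this, it is worth checking that the configuration still verifies and that the new resource really came up. A rough sketch, assuming "crm configure verify" is available in your crm shell and that crm_mon -1 prints one "Started" line per running resource; check_resource, CRM and CRM_MON are hypothetical names used only here:

```shell
#!/bin/sh
# CRM and CRM_MON are hypothetical overrides so the check can be tested
# without a running cluster.
CRM="${CRM:-crm}"
CRM_MON="${CRM_MON:-crm_mon}"

# Return 0 only if the configuration verifies and the given resource
# is reported as Started in the one-shot cluster status.
check_resource() {
    $CRM configure verify || return 1
    $CRM_MON -1 | grep -- "$1" | grep -q "Started"
}
```

For example: check_resource drbd_squid && echo "drbd_squid is up".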

How to restart a service within the cluster:

Once a service is part of the cluster configuration, it is advisable to stop and start it only through the cluster.
So, if you want to restart (let's say) the nginx server, you have to do it with the cluster commands:
# crm resource restart <service>

If you only want to reload, you can do it the classic way, but make sure the configuration is OK first. Otherwise the cluster will detect the failure and fail the service over to the other node, where it will fail too because that node has the same "wrong" configuration.

Example:
root@eave:~# crm resource restart nginx_srv
INFO: ordering nginx_srv to stop
INFO: ordering nginx_srv to start
root@eave:~# crm_mon -1 | grep nginx_srv
     nginx_srv	(lsb:nginx):	Started eave
root@eave:~# /etc/init.d/nginx configtest
Testing nginx configuration: nginx.
root@eave:~# /etc/init.d/nginx reload
Reloading nginx configuration: nginx.
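
The two manual steps above (configtest, then reload) can be chained so the reload never happens with a broken configuration. A small sketch, assuming the init script supports both "configtest" and "reload" like the nginx one does; safe_reload is just a name I picked:

```shell
#!/bin/sh
# Reload a service only when its own configuration test passes.
# $1 is the init script, for example /etc/init.d/nginx.
safe_reload() {
    init="$1"
    if "$init" configtest; then
        "$init" reload
    else
        echo "configtest failed, not reloading" >&2
        return 1
    fi
}
```

Usage: safe_reload /etc/init.d/nginx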

How to solve error heartbeat: [5686]: ERROR: should_drop_message: attempted replay attack [eave]? [gen = 1417088478, curgen = 1417088491]:

In my situation, I had migrated the VMs from VirtualBox to ESXi by converting the disks associated with the VMs. When I started the 2nd node again, I got this error continuously on the primary node:

20:43:43 root@candy:~# grep should_drop_message /var/log/heartbeat.log | tail
Jan 23 20:39:08 candy heartbeat: [5686]: ERROR: should_drop_message: attempted replay attack [eave]? [gen = 1417088478, curgen = 1417088491]
Jan 23 20:39:08 candy heartbeat: [5686]: ERROR: should_drop_message: attempted replay attack [eave]? [gen = 1417088478, curgen = 1417088491]
Jan 23 20:39:09 candy heartbeat: [5686]: ERROR: should_drop_message: attempted replay attack [eave]? [gen = 1417088478, curgen = 1417088491]
Jan 23 20:39:09 candy heartbeat: [5686]: ERROR: should_drop_message: attempted replay attack [eave]? [gen = 1417088478, curgen = 1417088491]
Jan 23 20:39:09 candy heartbeat: [5686]: ERROR: should_drop_message: attempted replay attack [eave]? [gen = 1417088478, curgen = 1417088491]
Jan 23 20:39:09 candy heartbeat: [5686]: ERROR: should_drop_message: attempted replay attack [eave]? [gen = 1417088478, curgen = 1417088491]
Jan 23 20:39:10 candy heartbeat: [5686]: ERROR: should_drop_message: attempted replay attack [eave]? [gen = 1417088478, curgen = 1417088491]
Jan 23 20:39:10 candy heartbeat: [5686]: ERROR: should_drop_message: attempted replay attack [eave]? [gen = 1417088478, curgen = 1417088491]
Jan 23 20:39:11 candy heartbeat: [5686]: ERROR: should_drop_message: attempted replay attack [eave]? [gen = 1417088478, curgen = 1417088491]
Jan 23 20:39:11 candy heartbeat: [5686]: ERROR: should_drop_message: attempted replay attack [eave]? [gen = 1417088478, curgen = 1417088491]

You can solve this by deleting the file /var/lib/heartbeat/hb_generation from the 2nd node:
1. Stop the cluster services on the 2nd node. If the stop script hangs, press CTRL+C and kill the main heartbeat process.
2. Delete the mentioned file (the listing below only locates it; remove it with rm hb_generation):

20:38:24 root@eave:heartbeat# cd /var/lib/heartbeat/
20:38:32 root@eave:heartbeat# ls -la hb_generation
-rw-r--r--  1 root      root        16 Jan 23 20:27 hb_generation

3. Start the cluster services again on the 2nd node.

You should be OK now.
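
The three steps can be put together in one small function. Treat it as a sketch: it assumes the Debian init script /etc/init.d/heartbeat and the file path mentioned above, and the HB_INIT and HB_GEN variables are hypothetical overrides I added for testing. It gives up if the stop fails, because in that case you have to kill the heartbeat process by hand first:

```shell
#!/bin/sh
# Overridable for testing; the defaults match the paths used in this post.
HB_INIT="${HB_INIT:-/etc/init.d/heartbeat}"
HB_GEN="${HB_GEN:-/var/lib/heartbeat/hb_generation}"

# Stop heartbeat, remove the stale generation file, start heartbeat again.
# Run this on the 2nd node only.
reset_hb_generation() {
    "$HB_INIT" stop || return 1   # do not touch the file while heartbeat runs
    rm -f "$HB_GEN"
    "$HB_INIT" start
}
```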
