Heartbeat-2 on Dom0

From brainsik

These instructions will help you set up heartbeat-2 on a Debian sarge system. Although this guide uses the CRM to configure heartbeat-2, no OCF scripts are used, so many of the advanced heartbeat-2 features, such as monitoring, are unavailable. Documentation for writing OCF resource scripts is mostly non-existent, and right now I don't have the time to experiment. In the future I'd like to come back to this problem, because monitoring would let heartbeat know the health of each domain and take appropriate action.

Install

Grab the latest Ultra Monkey heartbeat-2 backport for Debian Sarge.

wget [1]

Install it. If it complains about unsatisfied dependencies, that's okay.

dpkg -i heartbeat-2_2.0.2-4bpo1_i386.deb

Run aptitude and install heartbeat-2's unsatisfied dependencies.

Configure

There are three files to create in /etc/ha.d:

  • authkeys -- network authentication
  • ha.cf -- heartbeat configuration
  • haresources -- resources being served

I'm not going into a full explanation of everything. Heartbeat is sophisticated software and you should take some time to read the docs at Linux-HA.

# /etc/ha.d/authkeys
#
auth 1
1 sha1 SecretPassphraseStuff
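Heartbeat refuses to start unless authkeys is readable only by root (mode 600), and every node needs an identical copy of the file. The following is a minimal sketch of one way to generate it with a random passphrase instead of a hand-typed one. The helper name write_authkeys is hypothetical and not part of heartbeat.

```shell
#!/bin/sh
# write_authkeys DIR: create DIR/authkeys with a random sha1 key, mode 600.
# (Hypothetical helper for illustration; not shipped with heartbeat.)
write_authkeys() {
    dir=$1
    # 16 random bytes from the kernel, printed as 32 hex characters.
    key=$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')
    (
        umask 077    # a newly created file is never group/world readable
        printf 'auth 1\n1 sha1 %s\n' "$key" > "$dir/authkeys"
    )
    # Also tighten the mode in case the file already existed.
    chmod 600 "$dir/authkeys"
}
```

Run it once as root, for example `write_authkeys /etc/ha.d`, and then copy the resulting file to the other node rather than generating a second key there.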

# /etc/ha.d/ha.cf
#
logfacility daemon         # Log to syslog as facility "daemon"
crm yes                    # use the new Cluster Resource Manager
keepalive 1                # Send one heartbeat each second
warntime 3                 # How long before issuing "late heartbeat" warning?
deadtime 10                # Declare nodes dead after 10 seconds
initdead 90                # Dead time for when things first come up
bcast eth0 eth1            # Broadcast heartbeats on eth0 and eth1 interfaces
auto_failback no           # Don't fail back to the preferred node automatically
node xena xeno             # List our cluster members
#ping 1.2.3.254             # Ping our router to monitor ethernet connectivity
#respawn hacluster /usr/lib/heartbeat/ipfail  # Failover on network failures
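The timing directives in ha.cf only make sense in a particular order: keepalive below warntime, warntime below deadtime, and (as the sample ha.cf shipped with heartbeat suggests) initdead at least twice deadtime. The sketch below checks a file for that ordering. The function name check_timings is hypothetical, and it assumes all values are plain seconds with no unit suffix.

```shell
#!/bin/sh
# check_timings FILE: sanity-check heartbeat timing directives in FILE.
# (Hypothetical helper; assumes values are whole seconds.)
check_timings() {
    awk '
        # Strip trailing comments, then record the numeric directives.
        { sub(/#.*/, "") }
        $1 == "keepalive" || $1 == "warntime" ||
        $1 == "deadtime"  || $1 == "initdead" { t[$1] = $2 + 0 }
        END {
            ok = 1
            if (!(t["keepalive"] < t["warntime"])) {
                print "keepalive should be below warntime"; ok = 0
            }
            if (!(t["warntime"] < t["deadtime"])) {
                print "warntime should be below deadtime"; ok = 0
            }
            if (!(t["initdead"] >= 2 * t["deadtime"])) {
                print "initdead should be at least 2x deadtime"; ok = 0
            }
            if (ok) print "timings look consistent"
            exit !ok
        }
    ' "$1"
}
```

With the values used in this guide (1, 3, 10 and 90), `check_timings /etc/ha.d/ha.cf` prints "timings look consistent" and exits 0.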

We are running heartbeat-2 because we want to use some of its new resource monitoring features. This means heartbeat does not read /etc/ha.d/haresources to configure resources. Instead, the new cluster resource manager (CRM) uses XML that is difficult to read and write, and our resource scripts have to conform to the OCF specification (which is poorly documented). Heartbeat-2 ships with a script that converts the haresources file for use by the CRM. However, the configuration it generates uses the old-style heartbeat resource scripts, and so doesn't support the monitor operation. It's my hope that going through this process brings me at least a little closer to the heartbeat-2 system I envision.

# /etc/ha.d/haresources
# 
#  !! NOTE: This file is not used directly by heartbeat-2. It must be
#           converted to XML for the CRM. Use the command below to
#           install the required XML file.
#
#  # python /usr/lib/heartbeat/cts/haresources2cib.py > /var/lib/heartbeat/crm/cib.xml
#
node1 drbddisk::test xendomains::drbdtest
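A haresources line names the preferred node first (it has to match a name on the node line in ha.cf), followed by resources written as script::argument. Resources are started left to right and stopped right to left. As a rough illustration of what such a line means, the sketch below expands it into the resource.d commands it implies. The function name expand_haresources is hypothetical, and the sketch simplifies by assuming every script lives in /etc/ha.d/resource.d.

```shell
#!/bin/sh
# expand_haresources LINE: print the start commands a haresources line implies.
# (Hypothetical helper; assumes every script is in /etc/ha.d/resource.d.)
expand_haresources() {
    set -- $1            # split on whitespace: node first, then resources
    node=$1; shift
    echo "# preferred node: $node"
    for res in "$@"; do
        script=${res%%::*}                 # text before the first "::"
        case $res in
            *::*) args=$(printf '%s' "${res#*::}" | sed 's/::/ /g') ;;
            *)    args= ;;
        esac
        echo "/etc/ha.d/resource.d/$script${args:+ $args} start"
    done
}
```

For the line above, `expand_haresources "node1 drbddisk::test xendomains::drbdtest"` prints the drbddisk start command before the xendomains one, which is the order needed so the disk is primary before the domain boots.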

Create a resource.d script for heartbeat to control your Xen domains. Below is the script I'm using. It is a modified version of the one found in "Setup guide: Active/Passive Redundancy using Xen, DRBD and Heartbeat", posted to the Xen-users list. Since this is not an OCF script, it does not support the monitor function. I've created it as a starting point for getting the heartbeat system running.

#! /bin/bash
#
#   /etc/ha.d/resource.d/xendomains
#
#   heartbeat resource script to control Xen domains
#
PATH='/usr/local/sbin:/bin:/usr/sbin:/usr/bin'

RES="$1"
CMD="$2"

case "$CMD" in
    start)
        # Report failure to heartbeat if the domain cannot be created
        xm create "$RES" || exit 1
        ;;
    stop)
        exec xm destroy "$RES"
        ;;
    status)
        # Match the whole first column so "test" does not match "drbdtest"
        if xm list | awk '{print $1}' | grep -qx -- "$RES" ; then
            echo 'running'
        else
            echo 'stopped'
        fi
        ;;
    *)
        echo "Usage: xendomains <domain> {start|stop|status}"
        exit 1
        ;;
esac

exit 0
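For comparison, the monitor action that an OCF script would add could be sketched roughly as follows. This is not a complete OCF resource agent, only the check itself. The function name domain_monitor is hypothetical. The exit codes 0 and 7 are the OCF codes for "running" and "not running". The sketch assumes xm list prints one header line followed by one line per domain with the name in the first column.

```shell
#!/bin/sh
# OCF exit codes used by monitor actions.
OCF_SUCCESS=0
OCF_NOT_RUNNING=7

# domain_monitor NAME: read `xm list`-style output on stdin and report
# whether a domain called exactly NAME appears in the first column.
# (Hypothetical helper; not part of heartbeat or Xen.)
domain_monitor() {
    if awk -v name="$1" '
            NR > 1 && $1 == name { found = 1 }   # skip the header line
            END { exit !found }
        '; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}
```

On a dom0 it would be called as `xm list | domain_monitor drbdtest`. Reading from stdin keeps the check easy to try out with saved xm list output on a machine without Xen.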

Test

Make sure you've placed all the configuration files and the xendomains script on all the heartbeat nodes.

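To test the resource scripts by hand, run the following commands, substituting your own DRBD resource and Xen domain names. After each step, check the output of cat /proc/drbd or xm list to confirm the change took effect.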

 /etc/ha.d/resource.d/drbddisk testdisk start
 cat /proc/drbd
 /etc/ha.d/resource.d/xendomains testdom start
 xm list
 /etc/ha.d/resource.d/xendomains testdom stop
 xm list
 /etc/ha.d/resource.d/drbddisk testdisk stop
 cat /proc/drbd

If everything looks good, start heartbeat on both machines. Check that the primary node has started the VM, then restart heartbeat on that node so the resources move over to the other machine, and confirm that the VM is now running there.