Introduction

This page describes several procedures that AlexisHuxley uses to configure and test cluster services on his network. The actual installation of cluster software, etc., is covered by MDI.

Procedure: configuring a VM to access multiple bridges

On VM servers, pdi (see MDI) can configure three bridges, each connected to a different VLAN, and make them available to VMs. The VM configuration must still be updated to make use of them.

  1. Run:

    virsh shutdown <this-vm>
    virsh dumpxml <this-vm> > <this-vm>.xml 
  2. Edit the XML file and clone the NIC stanza twice. In each clone, increment the MAC address, bridge name and PCI slot. Leave the original NIC stanza unchanged and make sure that the new PCI slots do not clash with any already present! E.g. if the original stanza was this:

    <interface type='bridge'>
      <mac address='00:16:3e:dd:54:cf'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface> 

    and PCI slot numbers 0x03 and 0x04 were already used by other stanzas, then you would add this:

    <interface type='bridge'>
      <mac address='00:16:3e:dd:54:d0'/>
      <source bridge='br1'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </interface>
    <interface type='bridge'>
      <mac address='00:16:3e:dd:54:d1'/>
      <source bridge='br2'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </interface> 
  3. Run:

    virsh undefine <this-vm>
    virsh define <this-vm>.xml 
  4. Run:

    virsh start <this-vm> 
  5. Preserve the XML files used above. libvirt or libvirt-tools has a bug whereby the XML configuration data for multiple NICs overwrites the XML configuration data for the first NIC: after the first edit it looks as if there is only one NIC, and after the second edit there really is only one NIC.
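
The MAC address and bridge increments in step 2 can be scripted. The following is a minimal sketch, not part of the original procedure: the function names next_mac and nic_stanza are hypothetical, only the last octet of the MAC address is incremented (the helper refuses to roll over past ff), and the PCI slots must still be chosen by hand after checking which ones are already in use.

```shell
#!/bin/sh
# Hypothetical helpers for generating cloned NIC stanzas; nothing here calls virsh.

# next_mac MAC: print MAC with its last octet incremented by one.
next_mac() {
    prefix=${1%:*}          # everything before the last colon
    last=${1##*:}           # last octet, in hex
    next=$(( 0x$last + 1 ))
    if [ "$next" -gt 255 ]; then
        echo "next_mac: last octet of $1 would overflow" >&2
        return 1
    fi
    printf '%s:%02x\n' "$prefix" "$next"
}

# nic_stanza MAC BRIDGE SLOT: print an <interface> stanza in the style used above.
nic_stanza() {
    cat <<EOF
<interface type='bridge'>
  <mac address='$1'/>
  <source bridge='$2'/>
  <model type='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='$3' function='0x0'/>
</interface>
EOF
}

# Example: the two clones from step 2 (slots 0x05 and 0x06 assumed to be free).
mac1=$(next_mac 00:16:3e:dd:54:cf)
mac2=$(next_mac "$mac1")
nic_stanza "$mac1" br1 0x05
nic_stanza "$mac2" br2 0x06
```

The printed stanzas still have to be pasted into the dumped XML file by hand, next to the original NIC stanza.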

Procedure: tweaking basic cluster settings

This section lists various steps which may be needed; review them carefully to decide whether they are appropriate.

  1. Set the Unix password for the 'hacluster' account (this will be needed when using hb_gui).
  2. Disable STONITH (taken from http://www.clusterlabs.org/wiki/Debian_Lenny_HowTo), fix two-node quorum issues (taken from http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf) and make sure that resources do not migrate back by running:

    noodle# crm
    crm(live)# cib new configtmp
    INFO: building help index
    INFO: configtmp shadow CIB created
    crm(configtmp)# configure
    crm(configtmp)configure# property stonith-enabled=false
    crm(configtmp)configure# property no-quorum-policy=ignore
    crm(configtmp)configure# rsc_defaults resource-stickiness=100
    crm(configtmp)configure# verify
    crm(configtmp)configure# end
    There are changes pending. Do you want to commit them? y
    crm(configtmp)# cib use live
    crm(live)# cib commit configtmp
    INFO: commited 'configtmp' shadow CIB to the cluster
    crm(live)# cib delete configtmp
    crm(live)# quit 
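
The same three settings can probably also be applied without an interactive session, since crm accepts a command on its command line. This is a sketch under assumptions rather than the procedure actually used: the run wrapper and the DRY_RUN variable are hypothetical, and unlike the transcript above these commands change the live CIB directly instead of going through a shadow CIB.

```shell
#!/bin/sh
# Sketch: apply the three settings above non-interactively.
# These commands act on the live CIB, not on a shadow CIB.

# DRY_RUN defaults to 1 (an assumption of this sketch): commands are only
# printed. Set DRY_RUN=0 to really run them on a cluster node.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

apply_basic_settings() {
    run crm configure property stonith-enabled=false
    run crm configure property no-quorum-policy=ignore
    run crm configure rsc_defaults resource-stickiness=100
}

apply_basic_settings
```

Going through a shadow CIB, as in the transcript, has the advantage that all changes can be verified together before any of them reaches the cluster.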

Procedure: testing using a dummy resource

  1. Set up a dummy resource by running:

    noodle# crm
    crm(live)# cib new configtmp
    INFO: building help index
    INFO: configtmp shadow CIB created
    crm(configtmp)# configure
    crm(configtmp)configure# primitive dummy ocf:pacemaker:Dummy op monitor interval=10s
    WARNING: dummy: default timeout 20s for start is smaller than the advised 90
    WARNING: dummy: default timeout 20s for stop is smaller than the advised 100
    crm(configtmp)configure# verify
    WARNING: dummy: default timeout 20s for start is smaller than the advised 90
    WARNING: dummy: default timeout 20s for stop is smaller than the advised 100
    crm(configtmp)configure# end
    There are changes pending. Do you want to commit them? y
    crm(configtmp)# cib use live
    crm(live)# cib commit configtmp
    INFO: commited 'configtmp' shadow CIB to the cluster
    crm(live)# cib delete configtmp
    INFO: configtmp shadow CIB deleted
    crm(live)# quit
    bye
    noodle# 
  2. Test by running the following commands (based on http://www.clusterlabs.org/wiki/Debian_Lenny_HowTo):

    root# crm
    crm(live)# configure show
    node doodle \
            attributes standby="off"
    node noodle \
            attributes standby="off"
    primitive dummy ocf:pacemaker:Dummy \
            op monitor interval="10s"
    property $id="cib-bootstrap-options" \
            dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \
            cluster-infrastructure="openais" \
            expected-quorum-votes="2" \
            stonith-enabled="false" \
            no-quorum-policy="ignore" \
            maintenance-mode="false" \
            last-lrm-refresh="1291797358"
    rsc_defaults $id="rsc-options" \
            resource-stickiness="100"
    op_defaults $id="op_defaults-options" \
            record-pending="false"
    crm(live)# node show
    doodle: normal
            standby: off
    noodle: normal
            standby: off
    crm(live)# resource show
     dummy  (ocf::pacemaker:Dummy) Started 
    crm(live)# node standby <node-name>            #  verify resource is migrated to other node with "crm_mon -1"
    crm(live)# node online  <node-name>            #  verify resource is not migrated back to other node with "crm_mon -1"
    crm(live)# resource migrate dummy <node-name>  #  verify resource is migrated with "crm_mon -1"
    crm(live)# resource stop dummy                 #  verify resource is stopped with "crm_mon -1"
    crm(live)# resource start dummy                #  verify resource is started with "crm_mon -1" 
    crm(live)# quit
    bye
    noodle# 
  3. Remove the dummy resource by running:

    noodle# crm
    crm(live)# cib new configtmp
    INFO: building help index
    INFO: configtmp shadow CIB created
    crm(configtmp)# configure
    crm(configtmp)configure# delete dummy
    INFO: hanging location:cli-prefer-dummy deleted
    crm(configtmp)configure# verify
    crm(configtmp)configure# end
    There are changes pending. Do you want to commit them? y
    crm(configtmp)# cib use live
    crm(live)# cib commit configtmp
    INFO: commited 'configtmp' shadow CIB to the cluster
    crm(live)# cib delete configtmp
    INFO: configtmp shadow CIB deleted
    crm(live)# quit
    bye
    noodle# 
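
The "verify ... with crm_mon -1" checks in step 2 can be made less error-prone with a small filter that reports where a resource is running. This is a hedged sketch: resource_node is a hypothetical name, and the sample line only approximates the crm_mon output format, which may differ between Pacemaker versions. The filter relies only on the resource name being the first field and the node name following the word Started.

```shell
#!/bin/sh
# resource_node NAME: read "crm_mon -1" style output on stdin and print the
# node on which resource NAME is reported as Started (prints nothing if none).
resource_node() {
    awk -v rsc="$1" '
        $1 == rsc {
            for (i = 2; i < NF; i++)
                if ($i == "Started") { print $(i + 1); exit }
        }'
}

# Sample line in the format this sketch assumes; on a real node use:
#   crm_mon -1 | resource_node dummy
printf ' dummy\t(ocf::pacemaker:Dummy):\tStarted noodle\n' | resource_node dummy
```

Running the filter before and after each "node standby", "node online" or "resource migrate" command shows whether the resource moved.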

Procedure: clustering Apache

  1. On all nodes install apache2
  2. On all nodes prevent automatic startup:

    service apache2 stop
    update-rc.d apache2 remove 
  3. On all nodes configure Apache to listen on an as-yet-unconfigured virtual interface:

    perl -pi -e 's/^Listen.*/Listen 192.168.1.13:80/' /etc/apache2/ports.conf 
  4. On NFS shared storage (e.g. a NAS) allocate storage that is accessible to both nodes.
  5. On one node manually start resources to test understanding of what is required and in what order. E.g.:

    mount storage.pasta.net:/vol/webpages /var/www
    ifconfig eth0:1 192.168.1.13 up
    service apache2 start 
    and check web access on the virtual interface.
  6. Manually stop resources.
  7. Add a resource group containing 3 resources for this service (vNIC, mount, apache). The resulting resources looked like this:

    noodle# cibadmin -Q -o resources > resources.xml
    noodle# cat resources.xml
    <resources>
      <group id="webservices">
        <meta_attributes id="webservices-meta_attributes">
          <nvpair id="webservices-meta_attributes-target-role" name="target-role" value="started"/>
        </meta_attributes>
        <primitive class="ocf" id="vnic" provider="heartbeat" type="IPaddr2">
          <operations id="vnic-operations">
            <op id="vnic-op-monitor-10s" interval="10s" name="monitor" timeout="20s"/>
          </operations>
          <instance_attributes id="vnic-instance_attributes">
            <nvpair id="vnic-instance_attributes-ip" name="ip" value="192.168.1.13"/>
            <nvpair id="vnic-instance_attributes-nic" name="nic" value="eth0:1"/>
          </instance_attributes>
          <meta_attributes id="vnic-meta_attributes">
            <nvpair id="vnic-meta_attributes-target-role" name="target-role" value="started"/>
          </meta_attributes>
        </primitive>
        <primitive class="ocf" id="mount" provider="heartbeat" type="Filesystem">
          <operations id="mount-operations">
            <op id="mount-op-monitor-20" interval="20" name="monitor" timeout="40"/>
          </operations>
          <instance_attributes id="mount-instance_attributes">
            <nvpair id="mount-instance_attributes-device" name="device" value="storage.pasta.net:/vol/www"/>
            <nvpair id="mount-instance_attributes-directory" name="directory" value="/var/www"/>
          </instance_attributes>
          <meta_attributes id="mount-meta_attributes">
            <nvpair id="mount-meta_attributes-target-role" name="target-role" value="started"/>
          </meta_attributes>
        </primitive>
        <primitive class="lsb" id="apache2" type="apache2">
          <operations id="apache2-operations">
            <op id="apache2-op-monitor-15" interval="15" name="monitor" start-delay="15" timeout="15"/>
          </operations>
        </primitive>
      </group>
    </resources>
    noodle# 

    This could be reloaded with:

    cibadmin --replace --scope resources --xml-file resources.xml 
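
The manual test in steps 5 and 6 depends on starting the resources in one order and stopping them in the reverse order. The sketch below makes that explicit. It reuses the device, mount point, interface and address from step 5. The function names, the run wrapper and the DRY_RUN variable are hypothetical, and the stop commands are an assumption, since step 6 does not list them.

```shell
#!/bin/sh
# Sketch of steps 5 and 6: bring the three resources up in order and take
# them down in reverse order.

# DRY_RUN defaults to 1 (an assumption of this sketch): commands are only
# printed. Set DRY_RUN=0 to really run them on one node.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

web_start() {
    run mount storage.pasta.net:/vol/webpages /var/www
    run ifconfig eth0:1 192.168.1.13 up
    run service apache2 start
}

web_stop() {
    # Assumed inverse of web_start, in reverse order.
    run service apache2 stop
    run ifconfig eth0:1 down
    run umount /var/www
}

web_start
web_stop
```

A resource group gives the same guarantee inside the cluster: its members are started in the order listed and stopped in the reverse order.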

Procedure: clustering Icinga

  1. On all nodes install icinga
  2. Work around BTS#599555 by creating XXXX containing the following (with hostname adjusted):

    <VirtualHost *:80>
    
        ServerName icinga.pasta.net
        ServerAlias www.icinga.pasta.net
    
        DocumentRoot /usr/share/icinga/htdocs
    
        ScriptAlias /cgi-bin/icinga /usr/lib/cgi-bin/icinga
    
        # Where the stylesheets (config files) reside
        Alias /stylesheets /etc/icinga/stylesheets
    
        <Directory /usr/share/icinga/htdocs>
            Options     FollowSymLinks
            Order       allow,deny
            Allow       from all
        </Directory>
    
        ErrorLog ${APACHE_LOG_DIR}/icinga.error.log
        CustomLog ${APACHE_LOG_DIR}/icinga.access.log combined
    
    </VirtualHost> 
  3. Run:

    /etc/init.d/apache2 reload 
  4. In /etc/apache2/conf.d/icinga, locate the specification of the htpasswd.users file.
  5. Use htpasswd to add an entry to that file.

  6. At this point, I could access the Icinga tactical interface, but found it similar enough to Nagios that I did not want to continue.
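
Steps 4 and 5 can be combined by reading the password file location from the Apache configuration. This is a minimal sketch under assumptions: auth_user_file is a hypothetical helper, it assumes the file is named by an unquoted AuthUserFile directive, and the path in the demonstration file is made up.

```shell
#!/bin/sh
# auth_user_file CONF: print the path given by the first AuthUserFile
# directive in the Apache configuration file CONF.
auth_user_file() {
    awk 'tolower($1) == "authuserfile" { print $2; exit }' "$1"
}

# Demonstration on a throwaway file (the path inside it is made up).
conf=$(mktemp)
printf '<Directory /tmp/example>\n    AuthUserFile /tmp/example/htpasswd.users\n</Directory>\n' > "$conf"
auth_user_file "$conf"
rm -f "$conf"

# On a real node (steps 4 and 5); <user> is a placeholder:
#   htpasswd "$(auth_user_file /etc/apache2/conf.d/icinga)" <user>
```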

CategoryProcedure

ConfiguringClusterServices (last edited 2011-10-03 14:31:47 by AlexisHuxley)