We have a pretty standard server configuration: a load balancer (apache) in front of two machines which serve the actual content. While trying to upgrade one of those machines I realized there was no easy way to bring an individual node up or down on its own. With commercial servers (Websphere and WebLogic, for instance) you just go to the admin console and take a node out of the pool. Not so with our rig. Here is the solution I came up with. I think it is pretty elegant (though I am sure it is just reinventing the wheel).

  1. Directory structure – I created 2 directories in the apache config directory (/etc/apache2 in this case): nodes-available and nodes-enabled
  2. Creating the nodes – We had all our load balanced ports in a single file. Those got broken into separate files depending on which machine they were on. (Let's call them zfp1 and zfp2 for this example.) These files get placed in the nodes-available directory.
     $ cat nodes-available/zfp1 
     BalancerMember http://x.y.z.a:8000
     ...
     BalancerMember http://x.y.z.a:8019
    
  3. Enabling the nodes – At this point both nodes are disabled, which is not really an ideal situation. To enable a node we create a symlink in the nodes-enabled directory pointing at its file in nodes-available. For example, nodes-enabled/zfp1 links to nodes-available/zfp1.
     $ ls -l nodes-enabled/
     total 0
     lrwxrwxrwx 1 root root 33 Sep 16 14:12 zfp1 -> /etc/apache2/nodes-available/zfp1
     lrwxrwxrwx 1 root root 33 Sep 16 14:36 zfp2 -> /etc/apache2/nodes-available/zfp2
    
  4. Use the new system – Until this point we were just playing with files in a way that was completely transparent to apache. To start using this you need to make use of the Include functionality of apache’s configuration. I changed the reference to the single file (that we split in step 2) to be Include nodes-enabled/zfp*. By using a wildcard we can add more nodes without having to do anything to the actual server configuration. It also means we can use the load balancer for multiple clusters by just having differently named files.
     <Proxy balancer://mongrel_cluster>
       Include nodes-enabled/zfp*
     </Proxy>
    
  5. Reload the config – Apache will happily run forever without re-reading its config file, so once you are comfortable with your new configuration you have to remember to tell it to reload.
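The manual flow above can be sketched as a short shell session; a scratch directory stands in for /etc/apache2 here so the sketch does not touch a live server:

```shell
# Scratch directory standing in for /etc/apache2 (an assumption for this sketch)
APACHE_ROOT="${TMPDIR:-/tmp}/nodes-demo"
rm -rf "$APACHE_ROOT"

# Step 1: the two directories
mkdir -p "$APACHE_ROOT/nodes-available" "$APACHE_ROOT/nodes-enabled"

# Step 2: one file per machine, holding its BalancerMember lines
cat > "$APACHE_ROOT/nodes-available/zfp1" <<'EOF'
BalancerMember http://x.y.z.a:8000
BalancerMember http://x.y.z.a:8019
EOF

# Step 3: enabling a node is just a symlink into nodes-enabled
ln -s "$APACHE_ROOT/nodes-available/zfp1" "$APACHE_ROOT/nodes-enabled/zfp1"

# Disabling it again is removing the link; nodes-available is untouched
# rm "$APACHE_ROOT/nodes-enabled/zfp1"

# Step 5: on a real load balancer you would now reload apache, e.g.
# sudo /etc/init.d/apache2 reload
```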

All the above is still really only half the solution though. Actually, it is what makes controlling individual nodes possible, but it is a pretty manual process still. For this particular property we are using Vlad the Deployer to manage things, though I suspect you could use Capistrano or Puppet just as easily.

desc 'add a node to the server'
remote_task :add_node, :roles => :load_balancer do
  if ENV['node']
    run %{[ -f #{ apache_root }/nodes-available/#{ ENV['node'] } ] && [ ! -f #{ apache_root }/nodes-enabled/#{ ENV['node'] } ] && sudo ln -s #{ apache_root }/nodes-available/#{ ENV['node'] } #{ apache_root }/nodes-enabled/#{ ENV['node'] } && sudo /etc/init.d/apache2 reload || echo "Node does not exist"}
  else
    p 'You need to specify a node to be able to add it. node=foo'
  end
end

desc 'remove a node from the server'
remote_task :remove_node, :roles => :load_balancer do
  if ENV['node']
    run %{[ -f #{ apache_root }/nodes-enabled/#{ ENV['node'] } ] && sudo rm #{ apache_root }/nodes-enabled/#{ ENV['node'] } && sudo /etc/init.d/apache2 reload || echo "Node does not exist"}
  else
    p 'You need to specify a node to be able to remove it. node=foo'
  end
end

(Yes, I know that the error messages are not very nice, or even accurate in a lot of situations, but it does the job, which is all I require of it right now.)
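If the one-liners ever grow annoying, the same checks can be pulled apart into a small shell helper whose messages match what actually failed. This is only a sketch: the APACHE_ROOT default and the commented-out reload line are assumptions to adjust for your setup.

```shell
# Hypothetical helpers mirroring the add_node / remove_node tasks,
# with an error message per failure case instead of one catch-all.
enable_node() {
  apache_root="${APACHE_ROOT:-/etc/apache2}"   # assumed default location
  node="$1"
  if [ -z "$node" ]; then
    echo "usage: enable_node NODE" >&2; return 1
  fi
  if [ ! -f "$apache_root/nodes-available/$node" ]; then
    echo "error: $node not found in nodes-available" >&2; return 1
  fi
  if [ -e "$apache_root/nodes-enabled/$node" ]; then
    echo "error: $node is already enabled" >&2; return 1
  fi
  ln -s "$apache_root/nodes-available/$node" "$apache_root/nodes-enabled/$node"
  # sudo /etc/init.d/apache2 reload   # uncomment on a real load balancer
}

disable_node() {
  apache_root="${APACHE_ROOT:-/etc/apache2}"
  node="$1"
  if [ ! -L "$apache_root/nodes-enabled/$node" ]; then
    echo "error: $node is not enabled" >&2; return 1
  fi
  rm "$apache_root/nodes-enabled/$node"
  # sudo /etc/init.d/apache2 reload
}
```

The Vlad tasks could then shell out to these in place of the inline command strings.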

To enable a node you simply do a ‘rake lb:add_node node=zfpN’, where lb is the namespace you created for your load balancer machine and N is the node number. In this example there are only zfp1 and zfp2, but it should scale linearly.
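And because the Include in step 4 is wildcarded, a second cluster only needs its own file prefix in nodes-enabled; here abc is a hypothetical name for a second set of nodes:

```apache
<Proxy balancer://mongrel_cluster>
  Include nodes-enabled/zfp*
</Proxy>

# A hypothetical second cluster, managed exactly the same way
<Proxy balancer://abc_cluster>
  Include nodes-enabled/abc*
</Proxy>
```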