[clug] Distributed Administration

Andrew Janke a.janke at gmail.com
Thu Sep 24 20:17:21 MDT 2009


On Fri, Sep 25, 2009 at 11:18, Daniel Pittman <daniel at rimspace.net> wrote:
>> What do people who run large clusters/GRID computing do??
>
> At least a few hang about here and can probably answer better than I can, but
> my experience is that they usually use a "bare metal reinstall" model, and
> have a good automatic-installation system.

A flexible bare metal install is really the only way to go.

In my case I use both FAI and cfengine2, and there are good reasons for both.

   http://www.informatik.uni-koeln.de/fai/

FAI is primarily Debian-based but works on Ubuntu too (I use it for
both desktop and cluster installs). FAI can build an install CD/DVD
for you, including a stripped-down mirror archive to install from,
but I use PXE. The process goes roughly like this:

1. Set up the various config files (bit of a learning curve here)

2. Use debootstrap to make an "nfsroot", plus the PXE boot entries for
   tftpboot

3. Set up DHCP + tftpboot

4. Turn on the new machine, watch and grin (takes about 5 minutes to go).
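
As a rough illustration of the PXE side of steps 2 and 3: PXELINUX
looks under pxelinux.cfg/ for a per-host file named after the client's
IPv4 address in uppercase hex. The sketch below only computes that
file name and renders a minimal entry. The kernel, initrd and append
values are placeholders chosen for the example, not what FAI actually
writes (FAI has its own tooling for this).

```python
import ipaddress


def pxelinux_config_name(ip):
    """Return the per-host file name PXELINUX looks for (hex of the IPv4)."""
    return format(int(ipaddress.IPv4Address(ip)), "08X")


def render_entry(kernel, initrd, append):
    """Render a minimal PXELINUX entry. All three values are placeholders."""
    return (
        "default install\n"
        "label install\n"
        "  kernel {}\n"
        "  append initrd={} {}\n"
    ).format(kernel, initrd, append)


if __name__ == "__main__":
    # Hypothetical client address and boot file names, for illustration only.
    print(pxelinux_config_name("192.168.0.1"))
    print(render_entry("vmlinuz-install", "initrd.img-install", "ip=dhcp"))
```

The file name is the only part that depends on the client, so one
template can serve a whole rack of machines.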

The install via PXE then goes like this: PXE -> boot from the nfsroot
mini system on the master via NFS -> mount remote config files + mirror
-> install system from the NFS mirror -> install extra packages ->
config system -> reboot. Depending on how you do things, there is often
a bit more fiddling to do on the first boot via boot-once scripts, for
nvidia modules and the like.
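
A boot-once script boils down to "run this, then remember that it
ran". A minimal sketch of that idea, assuming a marker file is an
acceptable way to remember (the marker path and the command are
hypothetical, and this is not how FAI itself implements it):

```python
import os
import subprocess
import sys
import tempfile


def run_once(marker, command):
    """Run command unless marker exists. Return True if it ran.

    The marker is written only after the command succeeds, so a failed
    run is retried on the next boot.
    """
    if os.path.exists(marker):
        return False
    subprocess.run(command, check=True)
    with open(marker, "w") as fh:
        fh.write("done\n")
    return True


if __name__ == "__main__":
    # Stand-in for something like a kernel module build on first boot.
    marker = os.path.join(tempfile.mkdtemp(), "first-boot.done")
    command = [sys.executable, "-c", "pass"]
    print(run_once(marker, command))
    print(run_once(marker, command))
```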

For me the config consists of installing packages + installing the
local root SSH key. From here cfengine2 takes over: it is called via
cfagent on the first boot (and from then on via cron). cfengine does
all the user management (cfpasswd), day-to-day package management,
config, etc.
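
Because cfagent runs again and again from cron, whatever it does has
to be safe to repeat. A small sketch of that property for the SSH key
case, in Python rather than cfengine syntax (the function name and the
key string are made up for the example):

```python
import os
import tempfile


def ensure_line(path, line):
    """Append line to path if it is not already there.

    Returns True if the file was changed, False if it was already fine,
    so repeated runs converge instead of piling up duplicates.
    """
    content = ""
    if os.path.exists(path):
        with open(path) as fh:
            content = fh.read()
    if line in content.splitlines():
        return False
    with open(path, "a") as fh:
        # Do not glue the new line onto an unterminated last line.
        if content and not content.endswith("\n"):
            fh.write("\n")
        fh.write(line + "\n")
    return True


if __name__ == "__main__":
    keys = os.path.join(tempfile.mkdtemp(), "authorized_keys")
    key = "ssh-rsa AAAAEXAMPLE root@master"  # placeholder, not a real key
    print(ensure_line(keys, key))
    print(ensure_line(keys, key))
```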

There are mechanisms within FAI for doing a "softupdate", so that you
just update a system rather than reinstall it. Still, for what I often
need to do, this is not enough. I used to have nearly all the config
file fiddling in FAI but have found it more useful to have it all in
cfengine2.
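
Updating in place instead of reinstalling comes down to comparing what
a machine should have with what it has. A toy sketch of that
comparison for packages (the package lists are invented, and neither
FAI nor cfengine2 is claimed to work exactly this way):

```python
def package_delta(wanted, installed):
    """Return (to_install, to_remove) as sorted lists of package names."""
    wanted, installed = set(wanted), set(installed)
    return sorted(wanted - installed), sorted(installed - wanted)


if __name__ == "__main__":
    to_install, to_remove = package_delta(
        ["openssh-server", "cfengine2"],
        ["openssh-server", "nano"],
    )
    print(to_install, to_remove)
```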

Both of these tools have a bit of a learning curve (especially the
rather complex self-healing setup of cfengine2), but I can provide
example configs to those who might like them, even if only an example
cfagent.conf file.

If I could get to CLUG talks I would be happy to give a demo on this,
but time, time... :)


a

