Tutorial for cluster building?

induktio
induktio
Joined: 1 Oct 10
Posts: 15
Credit: 10144774
RAC: 0
Topic 197195

For quite some time now, I've been maintaining a small "cluster" of Linux machines, each booting via PXE, mounting its (shared) operating system disk over NFS from the host computer, and then loading its own instance of BOINC (also from NFS) for crunching.

Sure, it's not a true cluster in the sense that the machines don't share any memory space, but the real benefit is the simplicity of maintenance. A new computer can be added to the cluster just by changing a couple of configuration files; no installation of a new OS is required. Now I'd like to ask: how many of you would like to read a tutorial on how to build this? The tutorial is bound to be quite long because of all the configuration required.
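For readers unfamiliar with the pieces involved, the boot chain is roughly: the node's firmware PXE-boots, fetches a kernel and initramfs over TFTP, and the initramfs then mounts its root filesystem over NFS. A minimal sketch of the server side — all paths, addresses, and the subnet here are hypothetical placeholders, not the actual configuration from the tutorial:

```
# /etc/dnsmasq.conf -- DHCP + TFTP for PXE booting (hypothetical subnet)
dhcp-range=192.168.1.100,192.168.1.200,12h
dhcp-boot=pxelinux.0
enable-tftp
tftp-root=/srv/tftp

# /etc/exports -- share the root filesystem over NFS (hypothetical paths)
/srv/nfsroot  192.168.1.0/24(rw,no_root_squash,async)

# pxelinux.cfg/default -- kernel command line telling the node to mount
# its root over NFS instead of a local disk
APPEND root=/dev/nfs nfsroot=192.168.1.1:/srv/nfsroot ip=dhcp rw
```

The details (initramfs NFS support, per-node writable directories, etc.) are exactly the kind of configuration that makes the full tutorial long.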

Stranger7777
Stranger7777
Joined: 17 Mar 05
Posts: 436
Credit: 418292883
RAC: 35668

Tutorial for cluster building?

Ten years ago I built some similar "clusters", but on Novell NetWare 3.12. Nowadays I'm far from all this, simply because I'm not familiar with Linux. But this tutorial would be an excellent guide for such things and might help me build a cluster of my own under its guidance. So I say: yes, I would like to read it.

Gary Roberts
Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5845
Credit: 109965745833
RAC: 30670544

I would also be very

I would also be very interested in reading about your experiences. I use Linux exclusively these days on all my own hosts - about 70-80 of them. There are another 15 or so belonging to my daughter's business which share my account. Most of those are iMacs or Mac minis running OS X, but there are a couple of Windows machines in there as well. All these machines are on two LANs at two separate locations, and they are all stand-alone from the point of view of OS and BOINC installation.

On each LAN there is a host which has the additional duty of maintaining a few file shares using samba and rsync and some fairly simple shell scripts to automate things. One of those shares is for OS installation and maintenance. The distro I use has a 'rolling release' paradigm so I keep a local repository from which I can upgrade machines whenever I want without wasting bandwidth. Keeping the repository fully updated only consumes perhaps 2-3GB per month. I also keep an external USB drive with a bootable image of the latest release. So, for a new build, it's less than two hours from a collection of parts to crunching, since the initial OS install and update from local repo are both lightning fast, each taking little more than a few minutes.

I never actually 'install' BOINC these days. On a second file share, I maintain templates that contain the complete file structure of an installation, fully populated with everything needed. I don't 'attach' to projects and I don't allow new host IDs to be allocated by a project; I always allocate a previous ID to a new machine. At one point a few years ago, when I acquired a very large number of (mainly Tualatin PIII) working machines, I had over 200 machines on my account. They were mostly running Windows and, after decommissioning, I've been reusing their IDs on replacement machines.

I have suitable state file templates into which I can insert the previous ID in order to reuse it. So, to 'install' BOINC, I simply copy the appropriate template from the file share and edit the state file template to allocate the desired ID(s) for the projects the host will join. The account files are part of the template, so a new host is immediately recognised as being attached.
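The template deployment step could look something like this in shell. The directory layout, file names, and the `<hostid>0</hostid>` placeholder are all invented for illustration, since the real templates aren't shown:

```shell
#!/bin/sh
# Deploy a BOINC directory from a pre-populated template and reuse a
# previously allocated host ID. Layout and placeholder are hypothetical.
deploy_boinc() {
    template="$1"   # e.g. /srv/templates/boinc-einstein
    target="$2"     # e.g. /home/boinc/BOINC
    hostid="$3"     # retired host ID being reused

    # Copy the full file structure, account files included.
    cp -a "$template" "$target"
    # Insert the reused ID into the state file template, so the project
    # recognises the new machine as an existing host.
    sed -i "s|<hostid>0</hostid>|<hostid>${hostid}</hostid>|" \
        "$target/client_state.xml"
}
```

Because the account files travel with the template, no interactive 'attach' step is needed on the new machine.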

On a third file share, I keep a managed collection of any reusable data files and executables that would otherwise need to be downloaded time and again. Obviously, such files will be downloaded once at some point by some host but they get synced to the file share and then deployed to all other hosts that might need the file. After deploying a BOINC template to a new host, the final step is to populate the project directories with the latest data collections appropriate for the sub-projects that will run. This is useful for GW and FGRP sub-projects but not for BRP4/5 where each task needs unique data. BRP5 isn't much of a problem since the data payload per task is quite small but this isn't true for BRP4G. Recently, I allowed a subset of hosts to do BRP4G instead of BRP5 and found my full monthly bandwidth limit consumed in about 5 days. I have no such problems with a mix of BRP5 and FGRP2 tasks over the entire fleet.

By caching and deploying FGRP data files, both to new hosts and existing hosts, a lot of bandwidth is saved. I pay for one of the internet connection points and piggyback on the other so I need to be a bit careful about how I abuse the second one. I suppose I could just buy bigger pipes but I quite enjoy the challenge of being minimalistic :-). Being an Australian, I want to crunch the Parkes data anyway so I'm quite happy to stay with the much smaller BRP5 data footprint.

So the upshot of all this is that I don't really regard the initial installation of the OS, or the 'installation' of BOINC, or attaching to projects, or the initial large set of downloads, as much of a hassle. Consequently, I haven't been tempted into trying to develop something along the lines of your 'cluster' setup. However, I certainly would like to read of your experiences as I'm certain there would be better ways of doing things than the way I've stumbled into.

One final point I'd like to add. Even though all my BOINC 'installations' are preloaded with every possible data file they might need, the disk requirements are quite modest. When I acquired the bulk of my PIII hosts in 2006/2007, they came with a particular model of Seagate 20GB IDE drive, very few of which have failed. The bulk of my fleet still uses this same drive, and a full standard installation (OS + KDE GUI + BOINC + all data) still uses less than half of it.

It's just about impossible to buy new mobos with IDE interfaces any more, so I'll have a problem with future upgrades. For the moment, I've bought mobos with PCI slots (also dying out) and I'm using PCI IDE interface cards I've acquired over the years. I've also found a small supply of SATA-to-IDE converters that plug directly onto an IDE drive connector, and they seem to work OK. I got desperate recently and started using some HP MegaRAID cards out of old HP servers. I had 18.2GB and 36.4GB SCSI drives lying around and was pleasantly surprised to find that I could easily configure a new RAID 0 array on a drive using the on-card utility, and that the Linux installer would recognise the array and quite happily install. So my very latest hosts are happily using these old cards/disks.

Cheers,
Gary.

induktio
induktio
Joined: 1 Oct 10
Posts: 15
Credit: 10144774
RAC: 0

Hehe, assuming you have some

Hehe, assuming you have some 15+ machines, a cluster setup is tailor-made for them maintenance-wise. I for one would not bother watching over that many computers individually. My setup is very scalable, and since BOINC's storage requirements are very modest, just one NFS server could easily serve dozens of 'cluster nodes'. I'm not even sure what the limit is, since Einstein@Home (among others) uses very little NFS bandwidth while crunching, because the clients just cache the files in their local RAM.

Some time ago there was the Unofficial BOINC Wiki; however, it seems it's been down for a while. They had an article on creating a diskless cluster, and I might reuse some ideas from there. Stay tuned..

ExtraTerrestrial Apes
ExtraTerrestria...
Joined: 10 Nov 04
Posts: 770
Credit: 539750847
RAC: 150739

Wow Gary, that sounds

Wow Gary, that sounds impressive! Some thoughts on storage, assuming you can't switch straight to a diskless system overnight:

- There are still combined IDE/SATA-to-USB adapters around. If your Linux can boot from these, you could use them to prolong the life of your IDE drives.
- Does the RAID 0 with the old SCSI drives contain several disks? If so, there's unnecessary additional power consumption, since you don't gain any benefit from the RAID 0 anyway.
- SCSI drives tend to consume more energy than IDE drives, which themselves use more than modern drives. It takes a looong time for new ones to pay for themselves, though.
- 16 GB USB thumb drives might also be a good idea for local storage on pure crunchers (basically no power consumption).
- Old laptop drives could also be good - barely usable otherwise, and lower power consumption than desktop drives.

Since you're running this many machines, it doesn't sound like you're too worried about consumption. Still, the less the better, right? In Germany, 16 GB no-name USB 2 sticks start from ~7.5€. An old SCSI HDD typically consumes about 10 W, which would cost me 20€/year. An old IDE drive draws 7 W (-> 14€/a), a modern one 5 W (-> 10€/a), and an old laptop drive about 2 W (-> 4€/a). Assuming the drives run 24/7, those USB sticks would pay for themselves quickly! But this depends on your electricity cost, of course.
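Those figures can be reproduced with a one-liner: watts x 24 x 365 / 1000 gives kWh per year, times the tariff. The 0.23 EUR/kWh tariff below is my assumption, chosen because it matches the numbers quoted above:

```shell
#!/bin/sh
# Annual electricity cost of a device drawing W watts around the clock.
# The default tariff of 0.23 EUR/kWh is an assumed German rate.
yearly_cost() {
    awk -v w="$1" -v p="${2:-0.23}" \
        'BEGIN { printf "%.2f\n", w * 24 * 365 / 1000 * p }'
}

yearly_cost 10   # old SCSI drive   -> 20.15 EUR/year
yearly_cost 7    # old IDE drive    -> 14.10 EUR/year
yearly_cost 2    # old laptop drive -> 4.03 EUR/year
```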

Edit: or to view it the other way around.. running 80 old IDE drives at 7 W each would cost me 1120€/year. That would be enough motivation for me to go diskless.. but then I'm not even running a 2nd discrete GPU because my main home system is expensive enough as it is ;)

MrS

Scanning for our furry friends since Jan 2002

robl
robl
Joined: 2 Jan 13
Posts: 1709
Credit: 1454553658
RAC: 3649

Here is a link to an

Here is a link to an interesting "cluster" for the Raspberry Pi:

pi cluster

edit
and this link: lego pi cluster and this one
I think this link may have been one of the originals.

induktio
induktio
Joined: 1 Oct 10
Posts: 15
Credit: 10144774
RAC: 0

Ok, now the draft is

Ok, now the draft is ready!

http://induktio.github.io/2013/09/21/creating-boinc-clusters/

It is quite long, and if I forgot anything essential, please post here!

Matt Giwer
Matt Giwer
Joined: 12 Dec 05
Posts: 144
Credit: 6891649
RAC: 0

RE: Ok, now the draft is

Quote:

Ok, now the draft is ready!

http://induktio.github.io/2013/09/21/creating-boinc-clusters/

It is quite long, and if I forgot anything essential, please post here!

Just whipped that out did you? ;)

I'm going to read it but ... I got into multiple computers simply by not putting the old ones in the closet when I upgraded. They take care of themselves quite nicely. Other than bragging rights I can't see much point in it. Not that I mind bragging of course. Maintenance is no more than shelling in occasionally to see if everything is going alright and it almost always is.

induktio
induktio
Joined: 1 Oct 10
Posts: 15
Credit: 10144774
RAC: 0

Well at first I kept thinking

Well, at first I kept thinking about writing it, but for a while I didn't actually get around to writing anything.

Sure, the usefulness of diskless booting depends on the use case. If you just install BOINC once and *never* change anything, separate installs might be easier. But if you need to perform some maintenance on all the computers, the diskless setup is very useful. Still, I'd be very interested if you have any ideas for improving the tutorial.
