We like variety : how about converting E@H to multiple points of failure ?

kami4ligo
Joined: 15 Mar 05
Posts: 48
Credit: 16105651
RAC: 0
Topic 192275

Hi, crunchers !

I do not know how E@H will assess the impact of the recent outages for the project - there are so many things that can go rrrong, and did, in your environment. The IT shop is only one of the many action places in E@H...

In a commercial setting, however, people would go over the design and try to spread the data across several servers, to minimize the impact of such failures and speed up the backup / reload processes as well ...

I suppose E@H had good reasons to design things that way - financial aspects aside, what were they ?

Kind regards

-rg-

Back into crunching mode - my shallow queue is exhausted pretty rapidly, and my RAC plummets every time Murphy hits you - I have to think about making my own backups, that's for sure !

DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3562358667
RAC: 241

Until the stability problems are resolved I'd advise either a multiday queue, or a backup project.

clownius
Joined: 16 Jun 06
Posts: 42
Credit: 2164665
RAC: 0

When all else fails try the backup server...and when that fails. PANIC!!!!!!!

tullio
Joined: 22 Jan 05
Posts: 2118
Credit: 61407735
RAC: 0

Message 58961 in response to message 58960

Quote:
When all else fails try the backup server...and when that fails. PANIC!!!!!!!


It was "double panic" for old UNIX users like me.
Tullio

Annika
Joined: 8 Aug 06
Posts: 720
Credit: 494410
RAC: 0

This sounds bad... I'm only just starting to learn about Unix but I think I can figure out what "double panic" signifies. Glad I always have at least one backup project up my sleeve, though this longer outage wreaked havoc with my RAC and debts.

Lt. Cmdr. Daze
Joined: 19 Apr 06
Posts: 756
Credit: 82361
RAC: 0

Message 58963 in response to message 58962

Quote:
This sounds bad... I'm only just starting to learn about Unix but I think I can figure out what "double panic" signifies. Glad I always have at least one backup project up my sleeve, though this longer outage wreaked havoc with my RAC and debts.


It is indeed. That's why I joined another project as well (very unwillingly, actually). But it didn't solve the problem completely. The previous outage resulted in prolonged crunching of my backup project. In order to repay the debts, the computer had to run a lot of Einstein WUs. But during that period, another outage occurred. Somehow, BOINC did not switch to the backup. So, my computer went idle anyway... Only after I reset that project did it resume its duties.

So, a backup project is effective for a second outage only if the debts are already repaid?

Regards,
Bert

Somnio ergo sum

Annika
Joined: 8 Aug 06
Posts: 720
Credit: 494410
RAC: 0

It shouldn't be, really... in theory I think BOINC should build up debts rather than leaving the box idle. But I've never tried it with really different resource shares, the resource shares on my two boxes are 50/50 and 50/25/25 atm so I can't build up THAT much debt over a couple of days...
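
To put rough numbers on the debt build-up discussed above, here is a toy model in Python. It is only a sketch under loud assumptions, not BOINC's actual scheduler: it assumes one CPU running 24 hours a day, that a project which is down receives no CPU time, and that the remaining projects split the time in proportion to their resource shares. The function name `accumulate_debt` and the project names are made up for illustration.

```python
def accumulate_debt(shares, down, days, hours_per_day=24.0):
    """Toy model: CPU-hours each project is 'owed' minus what it received.

    shares: dict of project name -> resource share (any positive numbers)
    down:   set of project names that are unreachable for the whole period
    days:   length of the outage in days
    NOTE: a simplified illustration, not the real BOINC debt calculation.
    """
    total = sum(shares.values())
    # Combined share of the projects that can still supply work.
    up_total = sum(s for p, s in shares.items() if p not in down)
    debt = {p: 0.0 for p in shares}
    for _ in range(days):
        for p, s in shares.items():
            owed = hours_per_day * s / total
            if p in down or up_total == 0:
                got = 0.0
            else:
                got = hours_per_day * s / up_total
            debt[p] += owed - got
    return debt


# A 50/50 box with Einstein down for three days.
two = accumulate_debt({"einstein": 50, "backup": 50}, {"einstein"}, days=3)
# A 50/25/25 box under the same outage.
three = accumulate_debt(
    {"einstein": 50, "backup_a": 25, "backup_b": 25}, {"einstein"}, days=3
)
print(two)
print(three)
```

In this simplified model a three-day outage leaves Einstein "owed" 36 CPU-hours on both boxes, so the size of the debt depends on the length of the outage and on Einstein's share, not on how many backup projects absorb the slack.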

tullio
Joined: 22 Jan 05
Posts: 2118
Credit: 61407735
RAC: 0

I have 3 projects running on my Pentium II: Einstein, SETI and QMC. So, even if 2 fail, at least one is running.
Tullio

clownius
Joined: 16 Jun 06
Posts: 42
Credit: 2164665
RAC: 0

I'm running 4 projects on one box and 3 on my others. If all else fails, I always have CPDN on my main machine.

history
Joined: 22 Jan 05
Posts: 127
Credit: 7573923
RAC: 0

During the "December to Remember", and the current "January of Failure", I discovered a little distributed computing boutique called "World Community Grid". I am not a tout for this project, but they do give you a choice of running BOINC or their own organic app. The organic app lets you avoid the tedious allocation of project shares. Subscribe with their app and run until "Thunderthud", the E@H server, comes back up. Then simply exit their app and start E@H. Works for me, and I keep my house warm regardless of "Thunderthud's" health. Of course some of the WCG WUs will expire as long as "Tthud" is up, but the way things are going they won't lose many.

Regards-tweakster
