Hi, crunchers !
I do not know how E@H will assess the impact of the recent outages for the project - there are so many things that can go rrrong, and did, in your environment. The IT shop is only one of the many action places in E@H...
In a commercial setting, however, people would go over the design and try to spread the data across several servers, to minimize the impact of such failures and speed up the backup / reload processes as well ...
I suppose E@H had good reasons to design things that way - financial aspects aside, what were they ?
Kind regards
-rg-
Back into crunching mode - my shallow queue is exhausted pretty rapidly, and my RAC plummets every time Murphy hits you - I have to think about making my own backups, that's for sure !
We like variety : how about converting E@H to multiple points of
Until the stability problems are resolved I'd advise either a multiday queue, or a backup project.
When all else fails try the backup server...and when that fails. PANIC!!!!!!!
It was "double panic" for old UNIX users like me.
Tullio
This sounds bad... I'm only just starting to learn about Unix but I think I can figure out what "double panic" signifies. Glad I always have at least one backup project up my sleeve, though this longer outage wreaked havoc with my RAC and debts.
It is indeed. That's why I joined another project as well (very unwillingly, actually). But it didn't solve the whole problem. The previous outage resulted in prolonged crunching of my backup project. In order to repay the debts, the computer had to run a lot of Einstein WUs. But during that period, another outage occurred. Somehow, BOINC did not switch to the backup. So my computer went idle anyway... Only after I reset that project did it resume its duties.
So, a backup project is effective for a second outage only if the debts are already repaid?
Regards,
Bert
Somnio ergo sum
It shouldn't be, really... in theory I think BOINC should build up debts rather than leaving the box idle. But I've never tried it with really different resource shares, the resource shares on my two boxes are 50/50 and 50/25/25 atm so I can't build up THAT much debt over a couple of days...
I have 3 projects running on my Pentium II: Einstein, SETI and QMC. So, even if 2 fail, at least one is running.
Tullio
I'm running 4 projects on one box and 3 on my others. If all else fails, I always have CPDN on my main machine.
During the "December to Remember", and the current "January of Failure", I discovered a little distributed computing boutique called "World Community Grid". I am not a tout for this project but they do give you a choice of running BOINC, or their own organic app. The organic app lets you avoid the tedious allocation of project shares. Subscribe with their app and run until "Thunderthud" the E@H server comes back up. Simply exit their app and start E@H. Works for me, and I keep my house warm regardless of "Thunderthud's" health. Of course some of the WCG WUs will expire as long as "Tthud" is up, but the way things are going they won't lose many.
Regards-tweakster