Thanks Bruce. Your E@H
Thanks Bruce.
Your E@H team would be better advised to screen all these S@H refugees more closely in future. I suspect an unnoticed foreign body was brought onboard and got into the air conditioning system, causing the failure.
Maybe we should quarantine all the S@H refugees until further notice... ;-)
I also wish to state that I
I also wish to state that I am highly impressed at the speed of recovery after that long of an outage. I saw very little lag before uploads were happening, reports were going through, and new work was being downloaded. I know that this wasn't the case for everyone, because of location, and I understand that getting all the servers across the world in sync takes time, but still, WOW!
This project really is on its toes, and is really well set up. Of course they have had help by watching other projects, and then doing it several times better.
Kudos! Keep up the impressive work.
Thanks to all that got
Thanks to all that got Einstein back up again!
And a special thanks to Bruce, brilliant guy... ;)
Human Stupidity Is Infinite...
RE: I also wish to state
This is actually the most impressive part of the whole saga. Based on past experience with the SETI@home servers after a long outage, one would expect to see some difficulties getting results uploaded and reported, and new work downloaded. In my personal experience, I had 80+ machines with several thousand results to upload and report, all hungry for new work.
Virtually all of these boxes needed to be "kick started" because they were all out of work and had communications deferred for intervals of up to 300 hours!!! There was no way I was going to let "nature take its course" :). So, one by one in rapid succession, I made sure each machine's stuck results were uploaded and then triggered a project update. It took several hours to do them all. I was probably helped by my time zone, as the servers had been up for an hour or two before I started. However, I can't say I ever saw an operation that needed to be retried. Every server contact was handled with little if any abnormal delay. The servers seemed to be able to cope with whatever was being thrown at them!!!
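For anyone wanting to script that sort of mass kick start rather than doing it host by host, here is a minimal sketch. It assumes the BOINC command-line tool (boinccmd, or boinc_cmd on older clients) is available and that each host allows remote GUI RPC with a known password; the host names and password shown are placeholders, not real values:

    # kick_start.py - ask a list of BOINC hosts to contact Einstein@Home now.
    # Illustrative only: host names and RPC password below are placeholders.
    import subprocess

    HOSTS = ["cruncher01", "cruncher02", "cruncher03"]  # hypothetical host names
    RPC_PASSWORD = "changeme"                           # hypothetical GUI RPC password
    PROJECT_URL = "http://einstein.phys.uwm.edu/"

    for host in HOSTS:
        # "update" clears the project's communication deferral and makes the
        # client contact the scheduler, reporting finished results and
        # requesting new work.
        subprocess.run(
            ["boinccmd", "--host", host, "--passwd", RPC_PASSWORD,
             "--project", PROJECT_URL, "update"],
            check=False,  # keep going even if one host is unreachable
        )

Depending on the client version, a stuck upload may still need a manual retry, but the forced update at least clears the long communication deferral and gets a backed-off host moving again.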
The servers are obviously well designed for the job with plenty of spare capacity for situations like this. Congratulations to all involved!!
Cheers,
Gary.
Thank you very much. I
Thank you very much.
I spent most of a day babysitting our servers after restarting the project. At one point we had about 300 machines simultaneously uploading results and downloading new work. The only real bottleneck was validation, and I was able to fix that by running five copies of the validator at the same time.
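To give a rough idea of what running several validator instances side by side involves, here is a purely illustrative sketch: the binary and application names are made up, and the --app/--mod sharding arguments follow the usual BOINC daemon convention (instance i of N), which may not match the exact Einstein@Home configuration or server version.

    # run_validators.py - illustrative only: launch N validator instances that
    # each handle a disjoint slice of the workunits so they never collide.
    import subprocess

    N = 5
    VALIDATOR = "./einstein_validator"   # hypothetical binary name
    APP = "einstein"                     # hypothetical application name

    procs = [
        # "--mod N i": this instance only touches workunits with id % N == i.
        subprocess.Popen([VALIDATOR, "--app", APP, "--mod", str(N), str(i)])
        for i in range(N)
    ]

    for p in procs:
        p.wait()  # the daemons normally run until stopped; wait keeps the launcher alive

Sharding by workunit id is what lets the copies run in parallel without two of them trying to validate the same workunit.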
We really try hard to keep the project up and running 100% of the time. Unfortunately we have still not received any project funding, although I am quite hopeful that the US National Science Foundation will provide funding for us in the future. If this happens we can hire a couple of professionals to help take care of our hardware and software, which should greatly improve our reliability and our capability to deal with unexpected problems.
Cheers,
Bruce
Director, Einstein@Home
RE: We really try hard to
Wow! Bruce - you have performed above and beyond the call of duty!
We salute you!
John, I agree with your
John, I agree with your comments regarding Kudos for Bruce...
I was unaware of the funding situation. Cash flow is always important!!