What to do about the current Seti Server Overload

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,570
Credit: 81,819,758,496
RAC: 67,189,369
Topic 190314

If you crunch for both EAH and Seti on your computer(s), you cannot help but notice the problems at Seti. Due to the impending closure of Classic Seti, all BOINC projects are experiencing an upsurge in new user activity. Take a look at the EAH situation here and the Seti situation here. In both cases scroll down and look at the graphs for the last 60 days for things like total users, new users per day and hosts per day. Pretty scary stuff and in the case of Seti, little wonder that their servers are unable to cope.

I really think that there is potential for things to get worse for Seti in the short term. EAH seems to be coping very well although it should be noted that the percentage increase here is a lot less than at Seti.

What do ordinary users need to do about this? Well, I guess it depends ...

If you support both projects, here is a list of actions you could consider, both for your own peace of mind and to do something proactive to help Seti cope with the crisis. These are just personal suggestions and should not be construed as having any "official" sanction. They are made in particular to give direction to any participant concerned about their inability to get/return work to Seti and the impact this might be having (including all the red messages) on their own crunching activities for other projects.

* Select the Seti project and hit "No new work". This will allow your client to finish current work and then stop it hammering the Seti servers for more.

* When your Seti work is completed and uploaded, you can "suspend" the Seti project to stop all communication with the Seti servers. If you still have work stuck in upload, "suspending" seems to still allow those upload attempts to continue, and hopefully your finished work will gradually clear. At this point you will simply need to monitor the Seti project and "unsuspend" when things return to something more akin to normal. You will also have to be mindful of deadlines so that you "report" any uploaded work before the deadline.

* The above two steps will have other benefits for your peace of mind. At the moment, if you support both EAH and Seti, the lack of work from Seti is probably building up a big debt in favour of Seti. The scheduler, mindful of this, will be reluctant to get EAH work because Seti needs to run and not EAH. The scheduler can't know the real strife at Seti. All it knows is that Seti may have work next second so it had better not overstock on EAH in case that happens. I've even heard of cases where the machine is totally out of work before more EAH work was allowed.

New EAH work will be downloaded the minute the cpu becomes idle so it isn't really a problem but people love to worry about it. However, as noted in the edit below, this may not be true if you have a "stuck" download from Seti still in progress. If you have suspended the Seti project, then the scheduler will know it has to get work from any other available project (even ones with large negative LTDs) and EAH work will more regularly flow to your work tab.

* If you like the thought of having a backup project and if Seti was your backup for EAH (or vice versa :)), you might now be concerned that you don't have a backup. Well, now would be a golden opportunity to start investigating other BOINC projects. If you're really into physics, then LHC might be appropriate. They have a fresh batch of work just today, but their work is intermittent. You can check out a number of BOINC projects here.

If you have any thoughts to add or questions to ask, please contribute them to this thread.

EDIT: 08 December 05

In the notes above, a comment was made that if the CPU became idle, work would be immediately downloaded from another project. This is indeed the way things are supposed to happen according to John McLeod VII, who wrote the code. The problem is that most people will have work stuck "in transit" to Seti under the current abnormal conditions, so BOINC thinks there is still Seti work available. Even though your CPU has become idle, these stuck uploads and downloads, which are going nowhere, seem to be able to prevent another project from grabbing more work. The solution is very simple: just "suspend" Seti and then BOINC knows there is going to be no work from Seti. BOINC will then allow other projects, even those with large negative LTDs, to get immediate work for that idle CPU.
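For anyone who finds it easier to follow as logic, here is a rough Python sketch of the behaviour described in this edit. It is only a toy model built from what people have observed, not the actual BOINC client code, and every name in it (`Project`, `stuck_transfers`, `will_fetch_new_work`) is made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Project:
    """Hypothetical stand-in for one attached project (not a BOINC type)."""
    name: str
    suspended: bool = False
    runnable_tasks: int = 0    # results on disk that are ready to crunch
    stuck_transfers: int = 0   # uploads/downloads that keep retrying

    def counts_as_having_work(self) -> bool:
        # Assumption taken from the observations above: a transfer that is
        # still "in transit" is treated as work on hand, unless the whole
        # project has been suspended.
        if self.suspended:
            return False
        return self.runnable_tasks > 0 or self.stuck_transfers > 0


def cpu_is_idle(projects: List[Project]) -> bool:
    """The CPU is idle when no active project has a task ready to run."""
    return not any(p.runnable_tasks > 0 and not p.suspended for p in projects)


def will_fetch_new_work(projects: List[Project]) -> bool:
    """True if this toy client would go and ask a project for more work."""
    if not cpu_is_idle(projects):
        return False
    # If any project still appears to have work coming, hold off.
    return not any(p.counts_as_having_work() for p in projects)


# Seti has one stuck download; EAH has just finished its last result.
seti = Project("Seti", stuck_transfers=1)
eah = Project("EAH")
print(will_fetch_new_work([seti, eah]))

# Suspending Seti tells the client that no Seti work is coming.
seti.suspended = True
print(will_fetch_new_work([seti, eah]))
```

In this model the idle CPU stays idle for as long as the stuck Seti transfer is counted as work, and suspending Seti is what frees the client to fetch from EAH. The real scheduler also weighs debts and resource shares, which are left out here.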

Cheers,
Gary.

Tern
Joined: 27 Jul 05
Posts: 309
Credit: 71,180,862
RAC: 29

As one who started out in SETI, like most of us, then added Einstein as a "backup" during one of the _other_ big SETI outages, I agree with all that Gary has said. Realistically, the few hundred of us who read this and "lighten up" our load on SETI won't make much difference except to our own peace of mind. While everyone else is hammering the SETI servers and worrying about getting another WU, we'll be busy doing other things. :-)

I personally have decided that my CPU time should go to projects that have a high possibility of "accomplishing something". While the effect of a _successful_ SETI search can't be overestimated, and SETI will remain in my resource share, the realistic probability that SETI will "find something" in the near future, whether I contribute or not, is near zero. On the other hand, the probability that Einstein will make new discoveries, that Protein Predictor or Rosetta will develop tools to fight disease, or that Climate Prediction will improve long-term meteorological forecasting, is much higher. (I'd mention other projects, but I'm not familiar with very many; I just don't have enough computers. Send donations of Quad G5's and I'll change that.)

This good news/bad news situation for the SETI folks is looking to be purely good news for other Distributed Computing projects. To anyone who has just joined - welcome!

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,570
Credit: 81,819,758,496
RAC: 67,189,369

To anyone who thinks they have an EAH problem "similar" to the Seti problem

Please realise that the current Seti problem, which may very well get worse, is being caused by drastic server overload that is pretty much unique to Seti. It's not a BOINC thing really, it's a Seti thing. If you are concerned about Seti problems and want some suggestions then please read the ideas in the opening message of this thread.

As far as I'm aware, EAH is NOT having any similar issues at the moment. Personally, I have quite a number of boxes that are showing zero problems in uploading/downloading results. EAH seems to be coping rather well and it makes a good safe alternative for those having problems with Seti.

If the Seti problem does degenerate, we could expect to see many more users joining here and at some point overloading might occur. There certainly doesn't seem to be a problem at the moment and there is no reason to think that one might be imminent.

If you believe that you are seeing a problem, the smartest thing you can do is DON'T PANIC. Please DO NOT start changing preferences blindly, or start clicking random functions like manual update, reset or detach, as these sorts of actions could easily make things worse for you.

The smartest action you can take is to do NOTHING. Just copy the messages (plus a decent amount of context) that are concerning you, from your BOINC Manager messages window and post them in the "Problems" forum with any other supporting information as to why you think there is a problem. Someone will look at the messages and be able to give you advice about any further action you need to take.

Cheers,
Gary.

Stick
Joined: 24 Feb 05
Posts: 790
Credit: 15,273,629
RAC: 40,910

Quote:

* Select the Seti project and hit "No new work". This will allow your client to finish current work and then stop it hammering the Seti servers for more.

* When your Seti work is completed, to stop hammering the Seti servers while trying to upload it, "suspend" the Seti project. At this point you will simply need to monitor the Seti project and "unsuspend" when things return to something more akin to normal. You will also have to be mindful of deadlines so that you allow the work to be uploaded before the deadline.

Gary,

I have done as you suggested but I have noted that BOINC is still trying to upload finished WU's even though I have suspended SETI (under the Project tab). BTW: I am using 5.2.13.

Stick

Jord
Joined: 26 Jan 05
Posts: 2,952
Credit: 5,779,100
RAC: 0

Message 20747 in response to message 20746

That's because the NNW setting only halts incoming work. It has no say over outgoing work.

If you want to stop BOINC from trying to upload to Seti, you can only do so by suspending the network access. But that disconnects the whole of BOINC from the internet.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,570
Credit: 81,819,758,496
RAC: 67,189,369

Stick,

I've just looked back through the message records for one of my boxes that had both EAH and Seti running. I had set "No new work" on December 01 and then "Suspended" Seti after all work was returned on December 02. There is not a single further Seti contact message in the log after that time, and I assumed that "suspending" did stop all further contact. Looks like I'm wrong: transfers that are "in the pipeline" will be allowed to complete, and new ones won't be started. So thanks for the "heads up". I'll change the message accordingly.

Actually, in a way, this is a good thing for users, as the backlog of "in the pipeline" uploads might get cleared progressively if some of the repeat attempts make it through. The important thing is to stop adding to that pipeline by continuing to get new work that doesn't seem to have much chance of making it back. I've got a couple of very slow old clunkers that had been doing Seti only. They are not able to upload but they do get some downloads, although this is becoming impossible too. I'm just switching them to EAH even though an EAH result will take nearly 3 days :).

Cheers,
Gary.

Cruncher
Joined: 6 Dec 05
Posts: 3
Credit: 692
RAC: 0

Quote:
New EAH work will be downloaded the minute the cpu becomes idle


Unfortunately, this is not correct. After finishing my Einstein package, no new WU was downloaded (having one still unsuccessful SETI WU download in the queue). Only after actually suspending SETI, Einstein started downloading a new WU.
This is a non-optimal behavior and could be considered a bug, since the CPU is idle if both SETI and Einstein are active :-(

Tern
Joined: 27 Jul 05
Posts: 309
Credit: 71,180,862
RAC: 29

"Downloading" work does count as being present. If you have no SETI work on hand, best to suspend SETI until the problems are solved. Which requires keeping an eye on it if you have it at No New Work...

Paul D. Buck
Joined: 17 Jan 05
Posts: 754
Credit: 5,385,205
RAC: 0

Just one note. Some work is getting through. I got to "report" a number of SETI@Home results this morning. Granted I have many more queued up ... :)

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,570
Credit: 81,819,758,496
RAC: 67,189,369

Message 20752 in response to message 20749

Quote:
Unfortunately, this is not correct. After finishing my Einstein package, no new WU was downloaded (having one still unsuccessful SETI WU download in the queue). Only after actually suspending SETI, Einstein started downloading a new WU.

You are perfectly correct and I've added a comment about it to the original post. JM7, who wrote the code, made this comment about a similar situation at the time of Seti's last big outage:-

Quote:
There is a bug (fixed in 4.72) that caused the work download scheduler to count completed work as in progress until it is reported. If a CPU is out of work, the work download scheduler is supposed to try to contact any project for work (even if it has a negative LT debt).

That quote comes from this thread.

I took that comment to mean that the bug was fixed and it probably was for stuck uploads that can't complete the upload and reporting operations. It's probably the stuck downloads that are now causing the same behaviour. I'd assumed that all stuck traffic in both directions would have been taken care of in the fix that John refers to. Obviously not ...

Anyway, many thanks for your comment about this. As it happens, I've just noticed this behaviour on one of my own boxes. It had a stuck Seti download and I just watched it finish its last EAH result. I was interested to see exactly what would happen. I expected to see a momentary CPU idle state, followed by the fairly immediate download of new EAH work (EAH had large negative LTD). Ten minutes later when nothing was happening, I updated EAH and was told that work was not required :). So, I suspended Seti and got immediate work.

Cheers,
Gary.

Skyflash
Joined: 7 Dec 05
Posts: 2
Credit: 413,805
RAC: 0

I have 1 WU queued for downloading, I stopped requesting new work, suspended Seti, and wanted to cancel the to be downloaded WU. But don't I have to connect to Seti for that request? In messages it says:
request reschedule cpus: result op
and
request reschedule cpus: project op

But in Work the WU is still trying to be downloaded, is this normal?
