Forcing BOINC action?

ADDMP
Joined: 25 Feb 05
Posts: 104
Credit: 7332049
RAC: 0
Topic 188879

I have a computer that was running disconnected from the network for a number of hours. When I reconnected it, there seemed to be one finished WU under the "transfer" tab, & two more finished WUs under the "work" tab. The computer was in a state of "communication suspended for 8 hours". I was able to "retry" the transfer operation & the WU under the transfer tab was then immediately uploaded. But the state of suspension is still in effect for the two finished units under the work tab. I cannot find a way to force it to try to transfer these two WUs.

Is there any way to force BOINC 4.25 to leave its state of suspension now? Simply stopping & restarting it has no effect because it remembers its suspension time.

Thanks,

ADDMP

Blank Reg
Joined: 18 Jan 05
Posts: 228
Credit: 40599
RAC: 0

Update project

ADDMP
Joined: 25 Feb 05
Posts: 104
Credit: 7332049
RAC: 0

Message 10127 in response to message 10126

> Update project
>
Thanks, that did it.

ADDMP

gravywavy
Joined: 22 Jan 05
Posts: 392
Credit: 68962
RAC: 0

> I have a computer that was running disconnected from the network for a number
> of hours. When I reconnected it, there seemed to be one finished WU under the
> "transfer" tab, & two more finished WUs under the "work" tab. The computer
> was in a state of "communication suspended for 8 hours". I was able to "retry"
> the transfer operation & the WU under the transfer tab was then
> immediately uploaded. But the state of suspension is still in effect for the
> two finished units under the work tab. I cannot find a way to force it to try
> to transfer these two WUs.

hi,

You've got the practical solution now, but you (or other readers) may wonder what was going on. In fact there were two different things going on, which is why one WU behaved differently to the other(s).

When a WU completes there is a two-stage process of communication with the project. First, the results file is uploaded to the project file server; this usually starts within a few minutes of the WU being completed. This is the process known as uploading. Then, some time later (usually not until your client decides it needs more work), the completion is reported to the project database server.

So the normal cycle of the status in the work tab is

Downloading
Ready to run
Running
Ready to upload
Uploading [at this point the WU is also seen on the transfer tab]
Ready to report
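
For anyone who prefers to see that cycle as code, here is a minimal sketch of it as a simple state list. The names `TaskState`, `next_state` and `shown_on_transfer_tab` are made up for illustration and are not taken from the BOINC client source; the only thing modelled is the order of the statuses listed above.

```python
from enum import Enum


class TaskState(Enum):
    """Work-tab statuses in the order described in the post (illustrative names)."""
    DOWNLOADING = "Downloading"
    READY_TO_RUN = "Ready to run"
    RUNNING = "Running"
    READY_TO_UPLOAD = "Ready to upload"
    UPLOADING = "Uploading"
    READY_TO_REPORT = "Ready to report"


# Enum members iterate in definition order, which is the normal cycle.
ORDER = list(TaskState)


def next_state(state):
    """Return the status that normally follows `state`, or None after the last one."""
    i = ORDER.index(state)
    return ORDER[i + 1] if i + 1 < len(ORDER) else None


def shown_on_transfer_tab(state):
    """Per the list above, a WU that is uploading also appears on the transfer tab."""
    return state is TaskState.UPLOADING


if __name__ == "__main__":
    state = TaskState.DOWNLOADING
    while state is not None:
        note = " (also on transfer tab)" if shown_on_transfer_tab(state) else ""
        print(state.value + note)
        state = next_state(state)
```

The point of the sketch is that "Ready to report" is a separate final step: a WU can be fully uploaded and still be waiting for the client's next contact with the project database server.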

My best guess is that when you forced the upload of the one WU the other one was already at "Ready to report". After you forced the upload they both were.

At that point you were not waiting for the rest of the 8 hours to transfer, but for the next time your client would ask for more work. One of two things could have been going on:

either the client had enough work to keep it happy and you would have waited until the appropriate amount of crunching had been done on the current WU,

or, it had already run out of work and there was another set of backing off messages in the messages tab to reflect the way it was backing off on that request.

Whichever was the case, everything went back on track when you did the update.

~~gravywavy

gravywavy
Joined: 22 Jan 05
Posts: 392
Credit: 68962
RAC: 0

Message 10129 in response to message 10126

> Update project
>

Yes, update project is appropriate here.

However, I feel it is important to mention the one time *not* to use it.

**Please** do **not** use update project, **nor** force a WU upload, when the central servers have been down, or might have been.

Reason:

When the central servers come back after downtime, there will be a great many clients all wanting to check in. If all of them tried at once it would pull down the network connection close to the project server (perhaps crash their connection to their ISP, or perhaps part of their LAN). The backoff process is designed to spread that load out over a more manageable time. That is why the backoff starts small and gets progressively bigger: the longer the central downtime, the more clients will be wanting to send data, and therefore the more time each client is asked to wait between tries.
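
To make "starts small and gets progressively bigger" concrete, here is a generic randomized exponential backoff sketch. It is not BOINC's actual algorithm: the function name `backoff_delay`, the 60-second base, the 8-hour cap and the doubling rule are all assumptions chosen for illustration.

```python
import random


def backoff_delay(failures, base=60.0, cap=8 * 3600.0, rng=random):
    """Seconds to wait after `failures` consecutive failed attempts.

    Assumed rule (illustrative only): the ceiling doubles with each failure
    up to `cap`, and the actual wait is picked at random from the upper half
    of that range so that clients do not all retry at the same moment.
    """
    exponent = min(failures, 16)  # keeps the power of two small
    ceiling = min(cap, base * 2 ** exponent)
    return rng.uniform(0.5 * ceiling, ceiling)


if __name__ == "__main__":
    for n in range(8):
        print(f"after {n} failed tries: wait about {backoff_delay(n):.0f} s")
```

The random element matters as much as the growth: two clients that failed at the same instant end up retrying at different times.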

If too many people tried to speed up their client's backoff process they would crash the network, and from the outside that would look just like a second central failure. Then the same might happen again when the network recovered, ...
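
A toy simulation shows the difference in load. Everything here is assumed for illustration (the function name `peak_requests`, 1000 clients, a one-hour window, one-second buckets); it only compares everybody retrying at once with retries spread at random over a window.

```python
import random
from collections import Counter


def peak_requests(n_clients, window_s, forced, seed=0):
    """Largest number of requests landing in any single second.

    forced=True  -> every user presses update at the same moment (t = 0).
    forced=False -> each client retries at a random second in the window,
                    which is roughly what a randomized backoff achieves.
    """
    rng = random.Random(seed)
    if forced:
        times = [0] * n_clients
    else:
        times = [rng.randrange(window_s) for _ in range(n_clients)]
    return max(Counter(times).values())


if __name__ == "__main__":
    print("all at once:", peak_requests(1000, 3600, forced=True))
    print("spread out :", peak_requests(1000, 3600, forced=False))
```

When everyone forces a retry, the server sees all the clients in the same second; when the retries are spread out, the worst second carries only a small fraction of them.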

If you know in advance you are going to have a local net outage then it can be handy to force any uploads and reporting that are outstanding. Likewise if the connection is frequently down -- like mine :-( -- you might want to tell the client to seize the chance while the connection is live.

If you have a net outage and know it really was a local problem (your machine, your LAN, etc) feel free to get your client back to normal status as soon as you like, though you don't actually have to: the backoff will still work OK. Being human, though, you might like the reassurance and there is no harm in that after a local problem is resolved.

But whenever you do not know for sure that the interruption is/was local, please trust the backoff process to come right in the end and don't try to accelerate things: you might make them worse for everyone else!

E@H has not had much central downtime, but in the long term it is likely that there will be some -- all computer systems are fallible -- and it would be a pity if the users unwittingly prolonged the agony.

~~gravywavy
