Daily Quota

MAGIC Quantum M...

Joined: 18 Jan 05

Posts: 1945

Credit: 1483606102

RAC: 1288172

17 Jan 2015 7:53:11 UTC

Topic 197932

http://einsteinathome.org/host/11673771/tasks

For some reason, when I happened to check that host it had 112 tasks sitting there with *Error while computing*.

So I manually sent them in and of course I got that *Daily Quota* message in the BOINC Event Log, and it is telling me right now I have to wait 17 hours.

I'd rather not do that. I know the 660Ti works, and I hate to EVER have it sitting there with no Einstein tasks (just 2 vLHC tasks).

It has been years since I changed that file, so I forgot how to edit it or where it is... I tried looking around and did see it on a page but couldn't edit that.

It's midnight, so I will probably be thinking about this until I fall asleep... and when I get up it will still want me to wait.

Claggy

Joined: 29 Dec 06

Posts: 560

Credit: 2798290

RAC: 2801

Daily Quota

17 Jan 2015 12:02:05 UTC

Message 129509

They errored because:

Quote:

7.4.36

app_version download error: couldn't get input files:

cufft_xp32_32_16.dll
-120 (RSA key check failed for file)
signature verification failed

Claggy

MAGIC Quantum M...

Joined: 18 Jan 05

Posts: 1945

Credit: 1483606102

RAC: 1288172

Yeah I saw that all 112

17 Jan 2015 12:12:48 UTC

Message 129510 in response to message 129509

Yeah I saw that all 112 times.

Makes no sense because it is no different than it was before.

I'd rather not wait another 12 hrs and 45 mins to try getting tasks again and get this one back to work... and yeah, it is 4am here and I am still awake.

Claggy

Joined: 29 Dec 06

Posts: 560

Credit: 2798290

RAC: 2801

RE: Yeah I saw that all 112

17 Jan 2015 12:46:03 UTC

Message 129511 in response to message 129510

Quote:

Yeah I saw that all 112 times.

Makes no sense because it is no different than it was before.

I'd rather not wait another 12 hrs and 45 mins to try getting tasks again and get this one back to work... and yeah, it is 4am here and I am still awake.

Try downloading the file manually and overwrite the original:

http://einstein2.aei.uni-hannover.de/download/cufft_xp32_32_16.dll

Claggy

MAGIC Quantum M...

Joined: 18 Jan 05

Posts: 1945

Credit: 1483606102

RAC: 1288172

Thanks Claggy

17 Jan 2015 12:48:50 UTC

Message 129512 in response to message 129511

Thanks Claggy

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5883

Credit: 119040088184

RAC: 24791185

RE: http://einstein.phys.uw

18 Jan 2015 4:20:04 UTC

Message 129513

Quote:

http://einsteinathome.org/host/11673771/tasks

For some reason, when I happened to check that host it had 112 tasks sitting there with *Error while computing*.

So I manually sent them in ...

Big mistake right there ... :-).

Here's a tip for you. If you ever again find a host with a large slab (or all) of its work cache trashed by something silly like a failed checksum or a missing file, do NOT 'manually' send them in by clicking 'update'. They will be counting down the remaining part of some timeout that was assigned at the time the 'error' was discovered. This is your 'window of opportunity' so just say a little 'thank you' to BOINC for not having returned them all at this point. Then quickly shut down BOINC - after noting the names of just the trashed tasks if there are 'good' ones left as well. Also note the names of any completed tasks so you can be sure to leave them untouched.

Essentially what you are going to do is remove (or 'correct' if you're a masochist) the blocks for all trashed tasks from the state file (client_state.xml). You would also do as Claggy suggested and download a fresh copy of the file that caused the problem in the first place. When your client is restarted and talks to the server, the server will notice all the missing results, declare them to be 'lost', and will very obligingly send you brand new copies to replace them, thus 'fixing' your state file for you. If you really are a masochist, you will do the 'fixing' yourself, perhaps to save yourself the ignominy of having the server know that you've been fiddling with your state file, naughty boy :-). The 'fixing' is relatively simple but rather tedious if you have to visit 112 individual blocks to remove the error details that have been inserted there and to fix other things like status flags, etc. It's quite doable, just tedious. Much quicker to completely remove the entire blocks and have the server resend them to you.
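For anyone who would rather script the 'remove the entire blocks' step than do it by hand, here is a minimal sketch. It assumes each task lives in a `<result>...</result>` block containing a `<name>` tag; that layout is an assumption, so check it against your own client_state.xml first. The function name and the sample task names are made up for illustration. It only transforms text, so run it on a copy of the state file with BOINC shut down.

```python
import re


def remove_result_blocks(state_text, trashed_names):
    """Return state_text with the <result> blocks of the named tasks removed.

    Assumption: each task is a <result>...</result> block with a <name> tag.
    """
    trashed = set(trashed_names)
    # Match one whole block, including its indentation and trailing newline.
    block = re.compile(r"[ \t]*<result>.*?</result>[ \t]*\n?", re.DOTALL)

    def drop_if_trashed(match):
        name = re.search(r"<name>(.*?)</name>", match.group(0))
        if name and name.group(1).strip() in trashed:
            return ""  # remove the whole block
        return match.group(0)  # keep good and completed tasks untouched

    return block.sub(drop_if_trashed, state_text)


# Hypothetical two-task state file fragment.
sample = """<client_state>
    <result>
        <name>task_a_0</name>
        <exit_status>-120</exit_status>
    </result>
    <result>
        <name>task_b_1</name>
        <exit_status>0</exit_status>
    </result>
</client_state>
"""

cleaned = remove_result_blocks(sample, ["task_a_0"])
print(cleaned)
```

Only the names you pass in are touched, which is why it matters to note the names of the trashed tasks before shutting down.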

That's the simple overview - the devil is in the details :-). After you fix or remove all the blocks for trashed tasks, you need to search through the rest of the state file for other 'damage'. This is quite easy to find as it always seems to show up as a status flag set to a negative number denoting an error condition. I always just search for the string '>-'. There are not many flag values that start with a minus so you will easily find things like '-119' or '-161', which are two of the values I remember seeing in past episodes of state file fixing. The status needs to be changed back to '1' for files (like your example) that have been sent out to you. For finished data to be uploaded, the status is '0' - if I remember this all correctly :-). I probably do a couple of these 'fixes' per year and I always work it out from the context, rather than writing it down in intricate detail :-). After all, who bothers reading the manual if you can just wing it :-).
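The '>-' search can be done in any text editor, but a small sketch like the one below lists every hit with its line number so nothing gets missed. The function name and the sample fragment are hypothetical, and the `<status>` tag in the sample is only an assumed example of what a damaged flag might look like.

```python
def find_negative_flags(state_text):
    """Return (line_number, stripped_line) for every line containing '>-'."""
    hits = []
    for number, line in enumerate(state_text.splitlines(), start=1):
        if ">-" in line:
            hits.append((number, line.strip()))
    return hits


# Hypothetical fragment: one damaged flag, one healthy flag.
sample = """<file_info>
    <status>-119</status>
</file_info>
<file_info>
    <status>1</status>
</file_info>
"""

for number, line in find_negative_flags(sample):
    print(number, line)
```

It deliberately reports rather than rewrites, since the right replacement value has to be worked out from the context of each hit.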

The other thing I do is preserve all downloaded data files associated with onboard tasks. For example, for BRP5, these will be the files starting with 'PB...' and ending in '.bin4' or '.zap'. For 100+ tasks, this represents a lot of data. You need to take a backup copy of it all just after shutting down BOINC. When BOINC restarts, it is likely to delete the data for tasks you have removed from the state file. Anything deleted by BOINC because of this can easily be restored just before you initiate the contact with the server that will discover the lost results. Then when the server sends batches of lost results, you will not need the downloads. You will see messages saying "... file exists, skipping download."
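The backup and restore of the data files could be sketched roughly as below. The function name is made up, and the directories in the usage lines are placeholders for wherever your project folder and backup folder really are. The same function covers both directions because it skips any file already present at the destination.

```python
import shutil
from pathlib import Path


def copy_data_files(source_dir, dest_dir, prefix="PB", suffixes=(".bin4", ".zap")):
    """Copy matching data files from source_dir to dest_dir.

    Files already present in dest_dir are left alone. Returns the names copied.
    """
    source = Path(source_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source.iterdir()):
        matches = path.name.startswith(prefix) and path.name.endswith(suffixes)
        if path.is_file() and matches:
            target = dest / path.name
            if not target.exists():
                shutil.copy2(path, target)  # copy2 also keeps timestamps
                copied.append(path.name)
    return copied


# Usage (placeholder paths, adjust to your own setup):
# copy_data_files("path/to/project_dir", "path/to/backup")   # right after shutdown
# copy_data_files("path/to/backup", "path/to/project_dir")   # before contacting the server
```

Take the backup with BOINC stopped, and do the restore before the update that lets the server discover the lost results.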

I've used this procedure successfully many times over the years. For a case like yours, it would have taken about 15 mins or so and would have avoided the daily quota issues.

Another point, what happened to the host you linked to? It no longer has any tasks at all and the hostID isn't listed as one of yours any more? Did you create a new hostID to get around the backoff and then merge in the old one? Ah yes, it seems you did!!

Cheers,
Gary.
