Information about the new S5 workunits

ROG
Joined: 20 Feb 05
Posts: 19
Credit: 217692
RAC: 0

RE: RE: Got my first S5

Message 37537 in response to message 37536

Quote:
Quote:
Got my first S5 unit today. Should I replace akosf's (thanks dude!) albert_4.37_windows_intelx86.exe with the original before the new S5 unit starts processing?

not necessary - the S5R1 App will be a new file.

BM

Cool. Thanks BM!

Dusty33
Joined: 20 Feb 05
Posts: 2
Credit: 182522
RAC: 0

Just to help you all keep track of things I had this happen today on one box.

15/06/2006 22:48|Einstein@Home|MD5 check failed for h1_1281.0_S5R1
15/06/2006 22:48|Einstein@Home|expected baf19a5f22981870f40e152338bf5f20, got 9593892335534bd1caf100e2e3069bc7
15/06/2006 22:48|Einstein@Home|Checksum or signature error for h1_1281.0_S5R1
15/06/2006 22:48|Einstein@Home|h1_1281.0_S5R1

WU download error: couldn't get input files:

h1_1281.0_S5R1
-119
MD5 check failed

Shortly after that I had this happen as well on the same box.

15/06/2006 22:50|Einstein@Home|Got server request to delete file h1_0386.0_S5R1
15/06/2006 22:50|Einstein@Home|Got server request to delete file h1_1233.5_S5R1
15/06/2006 22:50|Einstein@Home|Got server request to delete file h1_0196.5_S5R1
15/06/2006 22:50|Einstein@Home|Got server request to delete file h1_0244.5_S5R1
15/06/2006 22:50|Einstein@Home|Got server request to delete file h1_1281.0_S5R1
15/06/2006 22:50|Einstein@Home|Got server request to delete file h1_0389.0_S5R1

However these WUs are still in the queue to be processed. Don't know what's going to happen when that time comes. Should get to them in the next day or two.

Dusty
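The "MD5 check failed" lines above mean the digest of the downloaded file did not match the one the server announced. A minimal sketch of the same comparison done by hand, assuming you have the file on disk and the expected digest from the log; the function names are made up for illustration and this is not the BOINC client's own code:

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large data files are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_md5):
    """Compare a file's digest with the 'expected' value from the log."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For the file in the log, one would call `matches_expected("h1_1281.0_S5R1", "baf19a5f22981870f40e152338bf5f20")` from the project directory; a `False` result suggests the local copy is damaged and needs to be downloaded again.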

Mahbubur
Joined: 31 Mar 06
Posts: 46
Credit: 258468
RAC: 0

RE: Just to help you all

Message 37539 in response to message 37538

Quote:

Just to help you all keep track of things I had this happen today on one box.

15/06/2006 22:48|Einstein@Home|MD5 check failed for h1_1281.0_S5R1
15/06/2006 22:48|Einstein@Home|expected baf19a5f22981870f40e152338bf5f20, got 9593892335534bd1caf100e2e3069bc7
15/06/2006 22:48|Einstein@Home|Checksum or signature error for h1_1281.0_S5R1
15/06/2006 22:48|Einstein@Home|h1_1281.0_S5R1

WU download error: couldn't get input files:

h1_1281.0_S5R1
-119
MD5 check failed

Shortly after that I had this happen as well on the same box.

15/06/2006 22:50|Einstein@Home|Got server request to delete file h1_0386.0_S5R1
15/06/2006 22:50|Einstein@Home|Got server request to delete file h1_1233.5_S5R1
15/06/2006 22:50|Einstein@Home|Got server request to delete file h1_0196.5_S5R1
15/06/2006 22:50|Einstein@Home|Got server request to delete file h1_0244.5_S5R1
15/06/2006 22:50|Einstein@Home|Got server request to delete file h1_1281.0_S5R1
15/06/2006 22:50|Einstein@Home|Got server request to delete file h1_0389.0_S5R1

However these WUs are still in the queue to be processed. Don't know what's going to happen when that time comes. Should get to them in the next day or two.

Dusty

I'm also getting all my downloads failing. I'm running truxoft-ccc with the u41.05 app. It also failed to download the new application. All S4 work units are downloading fine, so it's not a connection problem.

I have 8 of these:

16/06/2006 00:02:16|Einstein@Home|Started download of grid_0260_h_T21_S5R1.dat
16/06/2006 00:02:18|Einstein@Home|Finished download of grid_0260_h_T21_S5R1.dat
16/06/2006 00:02:18|Einstein@Home|Throughput 69485 bytes/sec
16/06/2006 00:02:18|Einstein@Home|Started download of grid_0540_h_T21_S5R1.dat
16/06/2006 00:02:18|Einstein@Home|MD5 check failed for grid_0260_h_T21_S5R1.dat
16/06/2006 00:02:18|Einstein@Home|expected 3ceb4719f736f42138d7293b87654a14, got 42fc402431d153528bd140409241fcbe
16/06/2006 00:02:18|Einstein@Home|Checksum or signature error for grid_0260_h_T21_S5R1.dat
16/06/2006 00:02:20|Einstein@Home|Unrecoverable error for result h1_0254.0_S5R1__11862_S5R1a_0 (WU download error: couldn't get input files: grid_0260_h_T21_S5R1.dat -119 MD5 check failed)
16/06/2006 00:02:20|Einstein@Home|Temporarily failed download of grid_0540_h_T21_S5R1.dat: error 404
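The message lines pasted in this thread are pipe-separated (timestamp, project, message), so the failing files can be pulled out of a long log mechanically. A rough sketch, assuming the log has been saved as text in exactly the format shown above; the helper names are hypothetical:

```python
import re

# Matches the "expected <md5>, got <md5>" message that follows a failed check.
MD5_PAIR = re.compile(r"expected ([0-9a-f]{32}), got ([0-9a-f]{32})")
FAILED_PREFIX = "MD5 check failed for "


def parse_line(line):
    """Split one log line into (timestamp, project, message)."""
    timestamp, project, message = line.rstrip("\n").split("|", 2)
    return timestamp, project, message


def md5_failures(lines):
    """Return (filename, expected, got) for each MD5 failure in the log."""
    failures = []
    pending_file = None
    for line in lines:
        if line.count("|") < 2:
            continue  # not a client message line (e.g. pasted error text)
        _, _, message = parse_line(line)
        if message.startswith(FAILED_PREFIX):
            pending_file = message[len(FAILED_PREFIX):]
            continue
        match = MD5_PAIR.search(message)
        if match and pending_file is not None:
            failures.append((pending_file, match.group(1), match.group(2)))
            pending_file = None
    return failures
```

Run over the log above, this would list `grid_0260_h_T21_S5R1.dat` together with its expected and actual digests, which makes it easier to report exactly which files arrived corrupted.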

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4273
Credit: 245184601
RAC: 13895

Some hours ago we had some problems with at least one download mirror, which should, however, be solved by now. The problem may have been on our end.

BM

DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3562358667
RAC: 241

I restarted BOINC Manager using File > Exit and got the following errors on a pair of new S5 WUs when it came back up. I'm using the BoincStudio versions of BOINC and BOINC Manager, so the problem might come from that rather than from the science app.

6/15/2006 7:17:25 PM||Starting BOINC client version 5.4.9 for windows_intelx86
6/15/2006 7:17:25 PM||BoincStudio mod 0.5c
6/15/2006 7:17:25 PM||libcurl/7.14.0 OpenSSL/0.9.8 zlib/1.2.3
6/15/2006 7:17:25 PM||Data directory: D:\Program Files\BOINC
6/15/2006 7:17:25 PM|Einstein@Home|BoincStudio: Setting ThierryH credit correction to true
6/15/2006 7:17:25 PM|Einstein@Home|BoincStudio: Faking number of cpus for work units claims: 4
6/15/2006 7:17:25 PM||Processor: 2 AuthenticAMD AMD Athlon(tm) 64 X2 Dual Core Processor 3800+
6/15/2006 7:17:25 PM||Memory: 3.00 GB physical, 4.34 GB virtual
6/15/2006 7:17:25 PM||Disk: 100.00 GB total, 84.88 GB free
6/15/2006 7:17:25 PM|boincsimap|URL: http://boinc.bio.wzw.tum.de/boincsimap/; Computer ID: 28821; location: home; project prefs: default
6/15/2006 7:17:25 PM|Einstein@Home|URL: http://einstein.phys.uwm.edu/; Computer ID: 667908; location: home; project prefs: default
6/15/2006 7:17:25 PM||General prefs: from Einstein@Home (last modified 2006-06-14 19:15:47)
6/15/2006 7:17:25 PM||General prefs: no separate prefs for home; using your defaults
6/15/2006 7:17:25 PM||Local control only allowed
6/15/2006 7:17:25 PM||Listening on port 31416
6/15/2006 7:17:25 PM|Einstein@Home|Resuming task z1_1468.0__736_S4R2a_3 using albert version 437
6/15/2006 7:17:25 PM|Einstein@Home|Resuming task z1_1468.0__735_S4R2a_3 using albert version 437
6/15/2006 7:17:25 PM|Einstein@Home|File h1_0609.5_S5R1 exists already, skipping download
6/15/2006 7:17:25 PM|Einstein@Home|File h1_1041.0_S5R1 exists already, skipping download
6/15/2006 7:17:25 PM|Einstein@Home|File grid_1050_h_T21_S5R1.dat exists already, skipping download
6/15/2006 7:17:25 PM|Einstein@Home|File h1_0852.0_S5R1 exists already, skipping download
6/15/2006 7:17:25 PM|Einstein@Home|File grid_0860_h_T21_S5R1.dat exists already, skipping download
6/15/2006 7:17:26 PM|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
6/15/2006 7:17:26 PM|Einstein@Home|Reason: To fetch work
6/15/2006 7:17:26 PM|Einstein@Home|Requesting 1666852 seconds of new work
6/15/2006 7:17:27 PM|Einstein@Home|Unrecoverable error for result z1_1468.0__736_S4R2a_3 (The semaphore cannot be set again. (0x67) - exit code 103 (0x67))
6/15/2006 7:17:27 PM|Einstein@Home|Deferring scheduler requests for 1 minutes and 0 seconds
6/15/2006 7:17:27 PM||Rescheduling CPU: application exited
6/15/2006 7:17:27 PM|Einstein@Home|Unrecoverable error for result z1_1468.0__735_S4R2a_3 (The semaphore cannot be set again. (0x67) - exit code 103 (0x67))
6/15/2006 7:17:27 PM||Rescheduling CPU: application exited

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4273
Credit: 245184601
RAC: 13895

I think you'd better reset the project - the error messages from your S5R1 results look like the files have been corrupted, and your client apparently didn't try to get them again.

BM

DarkStar
Joined: 2 Jan 06
Posts: 13
Credit: 73738
RAC: 0

One thing I'm curious about is checkpoint frequency. With the general increase in completion times, will results in progress be checkpointed often enough to avoid large losses when the application (and BOINC client) is stopped and restarted?

Oh, and personal opinion only - standardized credit = good; longer completion times = not so good. Completing 8 "work units" in 16 hours just "feels better" than completing 2 work units in 16 hours, credit issues aside. Of course, that's purely subjective and mostly irrelevant.

.

Odysseus
Joined: 17 Dec 05
Posts: 372
Credit: 19610020
RAC: 3464

RE: RE: The intention is

Message 37544 in response to message 37518

Quote:
Quote:

The intention is that the average Einstein@Home participant will get the same credit per hour of "work" as he gets on other BOINC projects.

BM


I respect that this is what the project should do.

It does, however, mean that those of us who have been enjoying a credit feast with the akosf S4 apps will see our credit production per hour nosedive with S5 (since we have been getting far higher than BOINC-standard credit per hour).

Personally, I think this is just fine. Akosf's apps provided a wonderful speedup. Those of us who adopted them early constituted an extremely large-scale "beta test" which was arguably useful to him and to the project in validating that the science results were useful and did not come hitched to high data-dependent application fault rates of one kind or another. We got a temporary adventurer's reward of credit, but as the more efficient techniques are folded into the project it is back to parity. Should yet further independent improvements arise, it seems likely your current credit scheme would again allow them a decent interval of extra credit/hour reflecting their higher science/hour.

@Bernd, you were asking about the controversy about SETI@home Enhanced: AFAICT it was mostly generated by people expressing an attitude diametrically opposed to archae86’s here, mutatis mutandis.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4273
Credit: 245184601
RAC: 13895

RE: One think I'm curious

Message 37545 in response to message 37543

Quote:
One thing I'm curious about is checkpoint frequency. With the general increase in completion times, will results in progress be checkpointed often enough to avoid large losses when the application (and BOINC client) is stopped and restarted?


The checkpointing frequency is determined by the "write to disk at most every" setting in your general preferences. There is a limit in the App, of course, but I doubt that this is above the 60s default even on slow machines. The time between checkpoints is the maximum time that gets lost when the App is interrupted, plus the few seconds it takes to read and process the checkpointed state when resuming.

BM
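The checkpointing rule described above (write at most once per configured interval, so at most one interval of work is lost on restart) can be illustrated with a small sketch. This is not the Einstein@Home App's code; the class name, the JSON state file, and the 60-second default are assumptions chosen to mirror the "write to disk at most every" preference:

```python
import json
import os
import time


class Checkpointer:
    """Write state to disk no more often than every min_interval_s seconds."""

    def __init__(self, path, min_interval_s=60.0, clock=time.monotonic):
        self.path = path
        self.min_interval_s = min_interval_s
        self.clock = clock
        self.last_write = clock()

    def maybe_write(self, state):
        """Save state if the interval has elapsed; return True if written."""
        now = self.clock()
        if now - self.last_write < self.min_interval_s:
            return False
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w") as handle:
            json.dump(state, handle)
        # Rename over the old file so an interruption never leaves a
        # half-written checkpoint behind.
        os.replace(tmp_path, self.path)
        self.last_write = now
        return True

    def load(self, default):
        """Return the last saved state, or default if none exists."""
        try:
            with open(self.path) as handle:
                return json.load(handle)
        except FileNotFoundError:
            return default
```

A work loop would call `maybe_write` after each unit of progress and `load` once at startup. Work done since the last successful write is what gets repeated after an interruption, which is why a shorter interval means smaller losses at the cost of more disk writes.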

Odysseus
Joined: 17 Dec 05
Posts: 372
Credit: 19610020
RAC: 3464

I’m already getting a “no work from project” message on a system running the beta Mac/PPC v4.56. Should I delete the app_info file immediately, or will S4 work (replacing lost or erroneous results) still be available for a while?
