S5R5 plans

samuel7
Joined: 16 Feb 05
Posts: 34
Credit: 1,579,363
RAC: 0

RE: Yes, the S5R4 workunits

Message 87076 in response to message 87074

Quote:

Yes, the S5R4 workunits will remain the majority for the next few days. We won't cancel them so people will get credit for them; this will also help our servers to cope with the transition.

BM

Indeed it helps spread out the new application downloads and is of course the best way to handle the transition. I'm just eager to start the new run on my machines but will do my part to complete the S5R4 run. Looks like about 40 results to be done for my laptop with no partners in sight. Now have to check the quad...

Thanks Bernd and here's hoping for a smooth changeover!

Lee Venters
Joined: 21 Oct 08
Posts: 2
Credit: 105,865
RAC: 0

I have 2 E@H S5R4 tasks left

I have 2 E@H S5R4 tasks left to do... one of them is almost done (another 29h+) and the second one will take about 52h+.
Maybe one of these or both will match up with Samuel7's.
I do crunching for Einstein, Rosetta and SETI. It's interesting (and intriguing), but I'm not doing it for the credits to be rewarded.
Maybe the next ones I get from E@H will be the S5R5's, since they are ready.
The 14 day or 18 day deadline is fine with me. I do prefer longer deadlines, because SETI does send out some long WU's with a shorter deadline, so BOINC picks that one to run at high priority. (Or Rosetta, depending on the expected deadline... but Rosetta's WU's are usually between 2h and 4h.)
I'm just happy that I can be of service to the scientific community.
Thank you for all of the updates.

Lee

archae86
Joined: 6 Dec 05
Posts: 3,103
Credit: 6,108,216,787
RAC: 1,219,762

I just transitioned three

I just transitioned three hosts from app_info running 6.05 to stock (thus accepting S5R4 on 6.10 plus S5R5 on 3.01).

Here are a couple of observations from my experience, in case they may reduce surprise to others.

work availability
Of the three hosts, the Duo got two S5R4's and the first Quad 4 S5R4's on first fetch. But the second Quad got two S5R4's and two S5R5's. I then opened up my requested queue size a little, and both Quads got several more sequential S5R5's. The S5R5s came to the same frequency range as the S5R4.

Task Duration
Even though my work queue is small and has sequential work, the predicted completion times for unstarted work vary appreciably.

On the Q6600
730.05 1102 predicted 7:50:43
730.05 1098 predicted 7:40:36

On the Q9550
748.6 1133 predicted 4:23:12
748.6 1123 predicted 4:09:38

I can't yet give any observation on prediction vs. reality, save that the one S5R5 currently executing is indeed clearly running faster than S5R4, though nowhere near twice as fast (perhaps I happen to be near a peak, however).

what to delete and what is downloaded
On my first host to change over, I thought to be clever and delete the 6.05 app as well as the app_info. The result was that an entry in one of the config files triggered an attempt to re-download it (even though not needed), and of course it had no place to get it, giving a one-minute retry loop. A project reset fixed that.

For my other two I followed directions and only deleted the app_info. Both started right up. Of course the first thing to do was considerable downloading (_0, _1, and _2 app, and graphics files for S5R4, a skygrid or two and several of the 4 Mbyte frequency-specific files, plus more files for S5R5). So the total download in the first few minutes for my Q6600 was just over 80 Mbytes. The servers supplied at splendid rates, however, and no retries were required.

Deadlines

As noted, the S5R5 work is coming with 14 day deadlines, while newly downloaded S5R4 remains at an 18 day deadline. So if you are looking at the web page representation of "Tasks for Computer" at the moment, newly downloaded work with a January 27 deadline is R5, while work with a January 31 deadline is R4 (in less than two hours these will become January 28 and February 1).
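For anyone who wants to check the dates, here is a minimal sketch of the deadline arithmetic. The 14 and 18 day figures are from the post; the download date of 13 January 2009 is an assumption inferred from the deadlines quoted, and the function name is made up for illustration.

```python
from datetime import date, timedelta

# Deadline lengths in days, as stated in the post.
DEADLINE_DAYS = {"S5R4": 18, "S5R5": 14}

def deadline(downloaded: date, run: str) -> date:
    """Return the report deadline for a task of the given run."""
    return downloaded + timedelta(days=DEADLINE_DAYS[run])

# Assumed download date (not stated explicitly in the thread).
downloaded = date(2009, 1, 13)
for run in ("S5R5", "S5R4"):
    print(run, deadline(downloaded, run).strftime("%B %d"))
```

With that assumed date the S5R5 task lands on January 27 and the S5R4 task on January 31, so the deadline alone tells the two runs apart for work fetched on the same day.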

John Clark
Joined: 4 May 07
Posts: 1,087
Credit: 3,143,193
RAC: 0

I have swapped part of one of

I have swapped part of one of my quads to the new WU.

Deleted the app_info file and D/Led the new 6.01 client and a mixture of S5R4s and S5R5s (6.10s and 3.01s). I have 6 of each, but they will not start crunching until a couple of MW WUs are completed in about 15 minutes.

ATM I will ignore the predicted completion times of 3hrs 25m for the 3.01s and 8hrs 20mins for the 6.10s. These predicted times for the 6.10s are longer than when I ran the 6.05 client.

Shih-Tzu are clever, cuddly, playful and rule!! Jack Russell are feisty!

John Clark
Joined: 4 May 07
Posts: 1,087
Credit: 3,143,193
RAC: 0

Now crunching 3 3.01 WUs and

Now crunching 3 3.01 WUs and projections suggest these will complete (for my older quad) in about 6 hours. The 6.10 WUs using the 6.05 client, with the app_info file, took 7 hours 22 minutes.

A good reduction, but not by 50%.

I know I am not comparing like with like. This is just to a very rough first approximation.
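As a rough sketch of that arithmetic (the two times are the ones quoted above; the helper names are made up for illustration, and it is still not a like-for-like comparison):

```python
def to_minutes(hours: int, minutes: int = 0) -> int:
    """Convert an h:mm duration to whole minutes."""
    return hours * 60 + minutes

def reduction(old_minutes: int, new_minutes: int) -> float:
    """Fractional runtime reduction going from old to new."""
    return (old_minutes - new_minutes) / old_minutes

old = to_minutes(7, 22)  # S5R4 task time quoted above
new = to_minutes(6)      # projected time for the 3.01 (S5R5) tasks
print(f"{reduction(old, new):.1%}")
```

That works out to a bit under a fifth, well short of 50%.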

Shih-Tzu are clever, cuddly, playful and rule!! Jack Russell are feisty!

Svenie25
Joined: 21 Mar 05
Posts: 139
Credit: 2,436,862
RAC: 0

My Desktop got his first R5

My desktop got its first R5 WUs. Looks fine so far. Waiting for the first validations.
Now I hope to stay a long time on one frequency file, to have a look at the new runtime curve.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,516
Credit: 492,071,074
RAC: 117,064

BTW, as the command line

BTW, as the command line arguments to the app are now printed into the debugging output of the results, it's much easier to check after a WU has finished whether its runtime is near the expected minimum or maximum.

Just look at the output in the result, and find the argument:

--numSkyPartitions=xxx

e.g. "--numSkyPartitions=339"

Now look up the sequence number in the name of the result following the double underscore, e.g. for WU h1_0709.40_S5R4__677_S5R5a that number would be 677.

Now divide the second number by the first, so here:

677 / 339 ≈ 1.997

If the fractional part of that quotient is close to 0 or 1, you are near a runtime maximum. If it's close to 0.5, you are near a runtime minimum.

This will help to put the first runtime results into perspective a bit.
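For those who prefer a script, here is a minimal sketch of the check described above. The function names and the 0.1 tolerance are assumptions made up for illustration, not something the app or the project defines; the numSkyPartitions value still has to be read from the result's debugging output by hand.

```python
import re

def runtime_phase(result_name: str, num_sky_partitions: int) -> float:
    """Fractional part of sequence_number / numSkyPartitions."""
    # The sequence number follows the double underscore, e.g. "__677_".
    match = re.search(r"__(\d+)_", result_name)
    if match is None:
        raise ValueError(f"no sequence number found in {result_name!r}")
    sequence = int(match.group(1))
    return (sequence / num_sky_partitions) % 1.0

def describe(phase: float, tolerance: float = 0.1) -> str:
    """Classify the phase; the tolerance is an arbitrary illustrative cut-off."""
    if phase <= tolerance or phase >= 1.0 - tolerance:
        return "near a runtime maximum"
    if abs(phase - 0.5) <= tolerance:
        return "near a runtime minimum"
    return "in between"

phase = runtime_phase("h1_0709.40_S5R4__677_S5R5a", 339)
print(f"{phase:.3f}: {describe(phase)}")
```

For the example WU the fractional part comes out very close to 1, so that task should sit near a runtime maximum.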

CU

Bikeman

Jord
Joined: 26 Jan 05
Posts: 2,952
Credit: 5,779,100
RAC: 0

RE: RE: The S5R4 workunit

Message 87083 in response to message 87071

Quote:
Quote:
The S5R4 workunit generator (WUG) has been stopped and the S5R5 one been started instead. S5R5 has officially been launched.

And so I allowed for some extra minutes between its launch and me re-allowing work. Got an S5R4 of course. ;-)


Hmm, something strange happened here. My internet just dropped off.
Because of that I had some network problems (router doing strange things). The next thing I know is that the old S5R4 task was gone. It exited with an error as the graphics application was not found...

Here's the log at the exact time my internet went off:

15-Jan-2009 18:57:33 [---] file projects/einstein.phys.uwm.edu/einstein_S5R4_6.09_graphics_windows_intelx86.exe not found
15-Jan-2009 18:57:33 [---] Suspending network activity - user request
15-Jan-2009 18:57:33 [Einstein@Home] [error] Application file einstein_S5R4_6.09_windows_intelx86.exe missing signature
15-Jan-2009 18:57:33 [Einstein@Home] [error] BOINC cannot accept this file
15-Jan-2009 18:57:33 [Einstein@Home] [sched_op_debug] Deferring communication for 1 min 0 sec
15-Jan-2009 18:57:33 [Einstein@Home] [sched_op_debug] Reason: Unrecoverable error for result h1_1103.40_S5R4__791_S5R4a_1 (Input file einstein_S5R4_6.09_windows_intelx86.exe missing or invalid: -123)
15-Jan-2009 18:57:33 [Einstein@Home] [task_debug] result state=COMPUTE_ERROR for h1_1103.40_S5R4__791_S5R4a_1 from CS::report_result_error
15-Jan-2009 18:57:33 [Einstein@Home] [task_debug] task_state=COULDNT_START for h1_1103.40_S5R4__791_S5R4a_1 from start
15-Jan-2009 18:57:33 [Einstein@Home] [task_debug] task_state=COULDNT_START for h1_1103.40_S5R4__791_S5R4a_1 from resume_or_start1
15-Jan-2009 18:57:35 [Einstein@Home] Computation for task h1_1103.40_S5R4__791_S5R4a_1 finished
15-Jan-2009 18:57:35 [Einstein@Home] Output file h1_1103.40_S5R4__791_S5R4a_1_0 for task h1_1103.40_S5R4__791_S5R4a_1 absent
15-Jan-2009 18:57:35 [Einstein@Home] [task_debug] result state=COMPUTE_ERROR for h1_1103.40_S5R4__791_S5R4a_1 from CS::app_finished

It has now downloaded a new S5R5 task, but is still trying to download the 6.09 graphical application every minute.

15-Jan-09 20:15:21|Einstein@Home|Backing off 1 min 0 sec on download of einstein_S5R4_6.09_graphics_windows_intelx86.exe

{scratch, scratch} was 6.09 a power app then?

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,593
Credit: 85,625,117,243
RAC: 67,070,268

RE: ... The next thing I

Message 87084 in response to message 87083

Quote:
... The next thing I know is that the old S5R4 task was gone. It exited with an error as the graphics application was not found...

Actually, the task exited because it was looking for the 6.09 version of the science app and there was no file sig for that version. The fact that the 6.09 graphics app also couldn't be found was just collateral damage :-).

I usually keep all the beta/power app versions on my server and I've just checked. I have versions (for Windows) 6.04, 6.05, 6.06, 6.07 and 6.10. AFAIK there was a Windows 6.09 version but it was to fix checkpointing issues under Win98 and ME if I recall correctly.

Since the current beta and official version is 6.10 (and I'm guessing this would have been the version you were using) the reason for your problem is that for some unknown reason the version number associated with your task suddenly got changed from 610 to 609 in your state file and then BOINC suddenly realised that you didn't have the 6.09 app package with which to continue crunching it. The fact that BOINC tries to get the 609 app shows that you weren't using the AP mechanism and somehow BOINC thinks that 609 is official. I don't remember if 609 was ever official at any point.

There are probably other variations on this but it seems that something in your state file that was 6.10 somehow got changed to 6.09 in some way. It's hard to see how this might be due to a loss of network connectivity.

Another funny point is that BOINC complains about a missing signature for a 6.09 file. This seems to imply that you had such a file in your project folder and had run it under AP at some point so that there was a block for it (with no file sig) in your state file. Surely BOINC wouldn't say that it can't accept the file if the file didn't actually exist??

So what version of the science app were you actually running??

Cheers,
Gary.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2,112
Credit: 1,538,367,287
RAC: 4,613,624

RE: RE: ... The next

Message 87085 in response to message 87084

Quote:
Quote:
... The next thing I know is that the old S5R4 task was gone. It exited with an error as the graphics application was not found...

Actually, the task exited because it was looking for the 6.09 version of the science app and there was no file sig for that version. The fact that the 6.09 graphics app also couldn't be found was just collateral damage :-).

I usually keep all the beta/power app versions on my server and I've just checked. I have versions (for Windows) 6.04, 6.05, 6.06, 6.07 and 6.10. AFAIK there was a Windows 6.09 version but it was to fix checkpointing issues under Win98 and ME if I recall correctly.

Since the current beta and official version is 6.10 (and I'm guessing this would have been the version you were using) the reason for your problem is that for some unknown reason the version number associated with your task suddenly got changed from 610 to 609 in your state file and then BOINC suddenly realised that you didn't have the 6.09 app package with which to continue crunching it. The fact that BOINC tries to get the 609 app shows that you weren't using the AP mechanism and somehow BOINC thinks that 609 is official. I don't remember if 609 was ever official at any point.

There are probably other variations on this but it seems that something in your state file that was 6.10 somehow got changed to 6.09 in some way. It's hard to see how this might be due to a loss of network connectivity.

Another funny point is that BOINC complains about a missing signature for a 6.09 file. This seems to imply that you had such a file in your project folder and had run it under AP at some point so that there was a block for it (with no file sig) in your state file. Surely BOINC wouldn't say that it can't accept the file if the file didn't actually exist??

So what version of the science app were you actually running??


Gary,

For the first (and probably only) time I'm going to disagree with you - you're probably the most technically astute (and courteous) moderator I've come across in my limited range of BOINC projects - and yet.....

There was a Windows v6.09 package, and Bernd made his usual announcement thread for it. As a Beta, it would have come with an app_info.xml specifying all the filenames.

And that's exactly the point. The anonymous platform mechanism requires that every file is named, explicitly. BOINC doesn't make up filenames by combining version numbers with filename root components. [It does make up 'friendly names' that way for display in BOINC Manager]. That does suggest that at some point Jord downloaded and tested Beta v6.09 - it must have been in a relatively short interval between 4 Dec 2008 (v6.08) and 1 January 2009 (v6.10). I was an active participant in the Windows 98 phase of that test, and those are the download datestamps of my preserved archives.
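To illustrate what "named, explicitly" means, here is a minimal sketch that lists the files an app_info.xml declares. The XML below is a hypothetical fragment written for illustration, not the file Bernd actually shipped: the two executable names are taken from Jord's log, while the app name, the version block and the helper function names are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical app_info.xml content, for illustration only.
APP_INFO = """
<app_info>
  <app>
    <name>einstein_S5R4</name>
  </app>
  <file_info>
    <name>einstein_S5R4_6.09_windows_intelx86.exe</name>
    <executable/>
  </file_info>
  <file_info>
    <name>einstein_S5R4_6.09_graphics_windows_intelx86.exe</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>einstein_S5R4</app_name>
    <version_num>609</version_num>
    <file_ref>
      <file_name>einstein_S5R4_6.09_windows_intelx86.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
"""

def named_files(xml_text: str) -> list:
    """Return every file name declared in a <file_info> block."""
    root = ET.fromstring(xml_text.strip())
    return [info.findtext("name") for info in root.iter("file_info")]

def version_numbers(xml_text: str) -> list:
    """Return the integer version_num of each <app_version> block."""
    root = ET.fromstring(xml_text.strip())
    return [int(v.findtext("version_num")) for v in root.iter("app_version")]

print(named_files(APP_INFO))
print(version_numbers(APP_INFO))
```

The point of the sketch is only that the 6.09 filenames in the log have to have been spelled out in such a file at some stage; BOINC would not have derived them from a version number.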
