S5R1b behind

Klimax
Joined: 27 Apr 07
Posts: 87
Credit: 1370205
RAC: 0

Message 80016 in response to message 80015

Quote:

* Post-processing (or final analysis) of the E@H S4 run is finished, but the results are still under internal review.

* Unfortunately, the lost S5R1 results span the whole frequency range, so no post-processing of the S5R1 results would currently work, even in a limited frequency band, until we have those results (again).

* Yep, there is a result template for S5R3c, but unless we find something going wrong in the current S5R3b I don't think it is of any other use than for internal tests. The next public run will most likely be named S5R4.

BM


BTW: I must say that it is interesting to follow the changes to the code (if one has time and does not have to study for a mathematical analysis exam :-| ).

jowr
Joined: 19 Feb 05
Posts: 55
Credit: 1947636
RAC: 0

Message 80017 in response to message 80015

Quote:
* Post-processing (or final analysis) of the E@H S4 run is finished, but the results are still under internal review.

Do I have anything to look forward to, maybe? [hint,hint]

Would it be possible to get a time frame for release/publishing?

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4267
Credit: 244933393
RAC: 16304

We found an error in the calculation of the computing power needed to re-do the lost tasks. So we'll now do this on our own (LSC) clusters and won't bother Einstein@home with it.

Sorry for the confusion.

(Actually, another application whose tasks have two more run times different from S5R3 might have been even more confusing.)

BM

astro-marwil
Joined: 28 May 05
Posts: 511
Credit: 402987492
RAC: 1030439

Quote:

In the near future (1-2 weeks) we'll begin to re-issue the workunits (~150,000) of the lost results with the label S5R3b. They should run with the Apps that we used at the end of S5R1.


A more administrative question:
If I understand correctly, these 150,000 files are in addition to the 7,369,434 files of the original S5R3 set of WUs. So the relative file count would have to be modified or halted while they are in the pipeline, which could be difficult.
At the current crunching speed, 150,000 WUs would be done within about 6 days.
MW
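
For anyone who wants to check the arithmetic in this post, here is a minimal sketch that uses only the three figures quoted above (150,000 re-issued WUs, 7,369,434 S5R3 WUs, about 6 days). The variable names are made up for illustration, and the result is only as good as the 6-day estimate it starts from.

```python
# Figures quoted in the post above; nothing here is measured data.
reissued_wus = 150_000      # lost results to be re-issued
s5r3_wus = 7_369_434        # size of the original S5R3 set
days_to_finish = 6          # poster's estimate, "about 6 days"

# Throughput implied by the estimate, in workunits per day.
implied_rate = reissued_wus / days_to_finish

# Size of the re-issued batch relative to the S5R3 set.
share = reissued_wus / s5r3_wus

print(f"Implied throughput: {implied_rate:,.0f} WU/day")
print(f"Re-issued batch relative to S5R3: {share:.2%}")
```

In other words, the estimate assumes roughly 25,000 WUs per day, and the re-issued batch would be about 2% of the size of the S5R3 set.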

Sabroe_SMC
Joined: 9 Oct 06
Posts: 27
Credit: 357301992
RAC: 142652

Message 80020 in response to message 80019

Quote:
Quote:

In the near future (1-2 weeks) we'll begin to re-issue the workunits (~150,000) of the lost results with the label S5R3b. They should run with the Apps that we used at the end of S5R1.

Hi to all
is it possible that this is one of the lost Results?
It is not validated yet!
CU
Sabroe

Astro
Joined: 18 Jan 05
Posts: 257
Credit: 1000560
RAC: 0

Message 80021 in response to message 80020

Quote:

Hi to all
is it possible that this is one of the lost Results?
It is not validated yet!
CU
Sabroe


If you look at the "wu id" wuid=37754779, you'll see that both of your wingmen got credited and yours was judged (by the validation process) to be "invalid", so no credit was given. Basically, the answers returned by the other two hosts matched each other, but yours didn't match theirs.

I doubt you'll ever get credit for that one. Validation errors can occur when your host has a problem like overheating, excess overclocking, memory glitches, computer crashes, video problems, etc. If you see this happening frequently, then I'd worry about it. It happens to everyone once in a great while. I define "frequently" as roughly one in every 30 results, or more often.

Another way to put how I would look at it:

If it happens 1 out of every 100 - ignore it
If it happens 1 out of every 30 - keep an eye on it.
If it happens 1 out of every 15-20 then start looking for the cause as something is degrading and needs attention.

hope this helps
tony
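
The rule of thumb above can be sketched as a small function. This is only one reading of the post: the cut-offs of 1/30 and 1/20 are assumptions (the post says "15-20" and does not say what to do between 1 in 100 and 1 in 30, so anything below 1 in 30 is treated as "ignore" here), and `assess_invalid_rate` is a hypothetical name, not part of BOINC.

```python
def assess_invalid_rate(invalid: int, total: int) -> str:
    """Map a host's invalid-result rate onto the rule of thumb above.

    The thresholds are an assumed reading of the post, not project policy.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    rate = invalid / total
    if rate >= 1 / 20:      # about 1 in 15-20 or worse
        return "look for the cause"
    if rate >= 1 / 30:      # about 1 in 30
        return "keep an eye on it"
    return "ignore it"      # e.g. 1 in 100


print(assess_invalid_rate(1, 100))
print(assess_invalid_rate(1, 30))
print(assess_invalid_rate(1, 15))
```

With only a handful of returned results the rate is very noisy, so it probably makes sense to apply this over a few dozen results at least.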

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5842
Credit: 109410027856
RAC: 35089717

Message 80022 in response to message 80019

Quote:
Quote:

In the near future (1-2 weeks) we'll begin to re-issue the workunits (~150,000) of the lost results with the label S5R3b. They should run with the Apps that we used at the end of S5R1.

A more administrative question:
If I understand correctly, these 150,000 files are in addition to the 7,369,434 files of the original S5R3 set of WUs. So the relative file count would have to be modified or halted while they are in the pipeline, which could be difficult.
At the current crunching speed, 150,000 WUs would be done within about 6 days.
MW

Please read Bernd's original message (and the follow-up one that appeared immediately before your message) more carefully.

The missing 150,000 results belonged to the S5R1 run, which has nothing to do with the current S5R3 run, which uses a quite different set of science apps. There is absolutely no issue anymore, since the project is going to redo the missing results "in-house".

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5842
Credit: 109410027856
RAC: 35089717

Message 80023 in response to message 80021

Quote:
Quote:

Hi to all
is it possible that this is one of the lost Results?
It is not validated yet!
CU
Sabroe

If you look at the "wu id" wuid=37754779, you'll see that both of your wingmen got credited and yours was judged (by the validation process) to be "invalid", so no credit was given. Basically, the answers returned by the other two hosts matched each other, but yours didn't match theirs.

I doubt you'll ever get credit for that one. Validation errors can occur when your host has a problem like overheating, excess overclocking, memory glitches, computer crashes, video problems, etc. ...

@Sabroe,
No, it is not possible that your validation errors have anything to do with the 150,000 missing results that Bernd is talking about. Look at the date and look at the two totally different science runs. I've actually just given my assessment of the cause of your validation errors, as a response in your other thread.

@Tony,
Please don't confuse invalid results with validation errors. What you are describing are some of the reasons why a result may be declared invalid. However, in this case the results are still listed as "no consensus yet", i.e. they haven't actually been declared invalid. The state "validation error" is typically used for results that cannot be found when the validator tries to check their validity. The common cause is a server screwup.

Cheers,
Gary.
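
To make the three states in the reply to Tony concrete, here is a toy model of the distinction. It is a sketch under stated assumptions, not the project's validator: results are reduced to a string fingerprint compared for exact equality, a missing result file is represented by `None`, the quorum of 2 is assumed, and `classify` is a hypothetical helper name. Results that agree with the consensus are labelled "valid" here for lack of a better word.

```python
from collections import Counter
from typing import Dict, Optional


def classify(results: Dict[str, Optional[str]], quorum: int = 2) -> Dict[str, str]:
    """Label each host's result in a toy workunit.

    `results` maps a host name to a fingerprint of its output,
    or to None if the result file cannot be found.
    """
    present = [r for r in results.values() if r is not None]
    agreed = None
    if present:
        value, count = Counter(present).most_common(1)[0]
        if count >= quorum:
            agreed = value          # enough matching results: consensus

    status = {}
    for host, r in results.items():
        if r is None:
            status[host] = "validation error"   # nothing to check
        elif agreed is None:
            status[host] = "no consensus yet"   # not declared invalid
        elif r == agreed:
            status[host] = "valid"
        else:
            status[host] = "invalid"            # disagrees with consensus
    return status


print(classify({"host_a": "x", "host_b": "x", "host_c": "y"}))
print(classify({"host_a": "x", "host_b": "y"}))
print(classify({"host_a": "x", "host_b": None}))
```

The point of the sketch is that "invalid" needs a consensus to disagree with, whereas "no consensus yet" and "validation error" can both occur without any host having computed a wrong answer.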

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 686169189
RAC: 549961

What a coincidence, one of my hosts was one of the wingmen in the non-validating results.

All wingmen were using apps from the same code base, though on different OSes, but I guess it's most likely that this is a one-time thing.

CU
Bikeman

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4267
Credit: 244933393
RAC: 16304

Message 80025 in response to message 80018

Quote:
We found an error in the calculation of the computing power needed to re-do the lost tasks. So we'll now do this on our own (LSC) clusters and won't bother Einstein@home with it.


Update for the curious: That's done by now. Took ATLAS a bit more than two days of actual crunching.

BM
