Why did I get zero credit on this WU?

Eric W
Joined: 12 May 07
Posts: 3
Credit: 252,690
RAC: 0
Topic 192876

http://einsteinathome.org/workunit/33970193

Any ideas? I have not seen this before.

Alinator
Joined: 8 May 05
Posts: 927
Credit: 9,352,143
RAC: 0

Because you had hosts running Linux as wingmen on the WU.

This is a known issue with the S5R2 app at this point, and the team has been working hard for about a month to nail down the exact cause and fix it.

Keep in mind that S5R2 is essentially a Beta run for the next full scale science runs to follow.

Alinator

Eric W
Joined: 12 May 07
Posts: 3
Credit: 252,690
RAC: 0

Message 68406 in response to message 68405

Quote:

Because you had hosts running Linux as wingmen on the WU.

This is a known issue with the S5R2 app at this point, and the team has been working hard for about a month to nail down the exact cause and fix it.

Keep in mind that S5R2 is essentially a Beta run for the next full scale science runs to follow.

Alinator

Thanks for the quick reply. Does this mean that I am not going to get any credit for that WU? I hope not, as it takes about a day to process these WUs. Is there anything I can do?

Alinator
Joined: 8 May 05
Posts: 927
Credit: 9,352,143
RAC: 0

Unfortunately, that's what it means at this point, and most of us have gotten burned by the cross-platform validation issue at least once. OTOH, since you are on Windows, the odds are in your favor that your wingman will be another Winbox.

As I said, the team is working on it, but this app is an all-new approach to the way we analyze the data, and getting all the 'pieces/parts' to work together smoothly is proving a little tough. Also, there really isn't anything you can do at your end except keep an eye on the NC forum for any news from Bernd or the rest of the team regarding the status.

I know it's a little discouraging to get 'zipped' with the runtimes such as they are right at the moment, but the payback will be a much better app when we get to the next full-length science run later on.

Alinator

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,035
Credit: 34,659,383,350
RAC: 34,738,925

Quote:

http://einsteinathome.org/workunit/33970193

Any ideas? I have not seen this before.

As Alinator explained, Windows and Linux sometimes produce slightly different answers, just different enough for the result to be sent to a third computer to get another opinion. The law of averages says that the third opinion is likely to come from another Windows box so it is usually the Linux box that gets shafted.
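For anyone wondering how a result ends up with zero credit, here is a minimal sketch of the quorum idea described above. It is only an illustration: the function name `validate`, the tolerance `rel_tol`, the host labels and the numbers are all made up for this example, and the real validator compares whole result files rather than a single number.

```python
import math


def validate(results, rel_tol=1e-5, quorum=2):
    """Toy quorum check (hypothetical, not the project's validator).

    results maps a host label to a single float standing in for its answer.
    Returns a dict host -> True/False (credited or not) once `quorum` hosts
    agree within `rel_tol`, or None while there is no consensus yet.
    """
    hosts = list(results)
    for candidate in hosts:
        # Collect every host whose answer is close enough to the candidate's.
        agreeing = [
            h for h in hosts
            if math.isclose(results[h], results[candidate], rel_tol=rel_tol)
        ]
        if len(agreeing) >= quorum:
            return {h: h in agreeing for h in hosts}
    return None  # "checked but no consensus yet": another host is needed


# Two hosts that differ slightly too much: no consensus, a third copy is sent.
first_pass = validate({"windows_a": 1.0, "linux_b": 1.0001})

# The third result agrees with the first Windows host, so the Linux host loses.
second_pass = validate({"windows_a": 1.0, "linux_b": 1.0001, "windows_c": 1.0000001})

print(first_pass)
print(second_pass)
```

The point of the sketch is that the odd one out is not necessarily wrong; it is simply outvoted by whichever platform supplies the third result.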

If you look in detail at other "pendings" in your results list you will see that this is going to happen again but you are unlikely to miss out again. This time the third box is another Windows box so you should be safe and the Linux guy will be shafted - as usual :).

EDIT: To help you find it quickly, here is a link to the WU where the Linux guy is likely to miss out. If you actually look at his full results list, you will find he has 16 completed results showing. There is one where he has already received zero credit and there are three further pendings which are "checked but no consensus yet". It is quite likely that he will miss out on 25% of the credit in his current list because of problems with the validator. As a significant Linux user myself, I'm very frustrated with this ongoing waste of resources.

Cheers,
Gary.

tullio
Joined: 22 Jan 05
Posts: 1,994
Credit: 32,282,599
RAC: 521

Message 68409 in response to message 68408

Quote:


EDIT: To help you find it quickly, here is a link to the WU where the Linux guy is likely to miss out. If you actually look at his full results list, you will find he has 16 completed results showing. There is one where he has already received zero credit and there are three further pendings which are "checked but no consensus yet". It is quite likely that he will miss out on 25% of the credit in his current list because of problems with the validator. As a significant Linux user myself, I'm very frustrated with this ongoing waste of resources.


As a Linux user myself, I would suggest that this user upgrade his 2.14.xx kernel to a 2.16.xx kernel. I have never had any validation problems with Windows users on my PII Deschutes running SuSE Linux 10.1.
Tullio

Alphali
Joined: 8 Mar 06
Posts: 6
Credit: 708,538
RAC: 0

Besides, it's not really a platform validation problem.

http://einsteinathome.org/workunit/33732455

In this workunit, all three computers are Windows/Intel boxes (the first two are Pentium 4s running Windows XP, the third is a Celeron running W2k).
Even the version (4.17) of the app was the same between the boxes.

Maybe they are focusing on the wrong issue...

Akos Fekete
Joined: 13 Nov 05
Posts: 560
Credit: 4,510,638
RAC: 0

Message 68411 in response to message 68410

Nice hit!

1,
Intel Pentium 4 CPU 2.80GHz [x86 Family 15 Model 2 Stepping 9]
Microsoft Windows XP Professional Edition, Service Pack 2, (05.01.2600.00)
core client version: 5.8.15
time: 245,290.30
granted credit: 0.00

2,
Intel Pentium 4 CPU 3.00GHz [x86 Family 15 Model 6 Stepping 2]
Microsoft Windows XP Professional Edition, Service Pack 2, (05.01.2600.00)
core client version: 5.9.10
time: 254,179.84
granted credit: 502.45

3,
Intel Celeron CPU 2.60GHz [x86 Family 15 Model 2 Stepping 9]
Microsoft Windows 2000 Professional Edition, Service Pack 4, (05.00.2195.00)
core client version: 5.10.1
time: 222,296.13
granted credit: 502.45

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,515
Credit: 446,321,776
RAC: 52,565

Message 68412 in response to message 68410

Quote:


Even the version (4.17) of the app was the same between the boxes.

Maybe they are focusing on the wrong issue...

IIRC, Bernd mentioned that there are two known issues wrt validation:

- One is the x-platform validation problem. This is about Windows boxes computing different results than Linux or Darwin boxes in some cases. It's a fact and can be reproduced. This issue is not solved yet and is under investigation.

- Another issue has to do with the way the app writes and reads checkpoint files. AFAIK, you can get slightly different results depending on the number of times your app was terminated and restarted during the course of a workunit. This can prevent hosts of the same platform from validating against each other. Hosts that are configured not to retain the application in memory while suspended should be affected more frequently. This particular issue was fixed in the beta, IIRC.

Right??
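As an illustration of how a restart alone could change a result, here is a toy sketch. It assumes, purely for the sake of the example, that a checkpoint stores the running sum as text with only six significant digits; the actual S5R2 checkpoint format and the actual cause are not described in this thread, and the names `run` and `checkpoint_at` are made up.

```python
def run(values, checkpoint_at=None):
    """Sum `values`; optionally simulate one stop/restart.

    If `checkpoint_at` is given, the running total is written to a
    (hypothetical) text checkpoint with six significant digits just before
    that index is processed, then read back, as if the app had been
    terminated and restarted at that point.
    """
    total = 0.0
    for i, v in enumerate(values):
        if checkpoint_at is not None and i == checkpoint_at:
            saved = f"{total:.6g}"   # assumed lossy checkpoint format
            total = float(saved)     # state restored after the restart
        total += v
    return total


values = [1.0 / k for k in range(1, 1001)]

uninterrupted = run(values)
restarted = run(values, checkpoint_at=500)

print(uninterrupted, restarted, uninterrupted == restarted)
```

Both runs do exactly the same arithmetic on the same machine, yet the restarted one ends with a slightly different total, which is the kind of mismatch a strict validator would flag.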

Apart from that, you will always see a few same-platform validation errors from boxes that were overclocked just a bit too aggressively.

CU
BRM

Eric W
Joined: 12 May 07
Posts: 3
Credit: 252,690
RAC: 0

Message 68413 in response to message 68412

Quote:
Quote:


Even the version (4.17) of the app was the same between the boxes.

Maybe they are focusing on the wrong issue...

IIRC, Bernd mentioned that there are two known issues wrt validation:

- One is the x-platform validation problem. This is about Windows boxes computing different results than Linux or Darwin boxes in some cases. It's a fact and can be reproduced. This issue is not solved yet and is under investigation.

- Another issue has to do with the way the app writes and reads checkpoint files. AFAIK, you can get slightly different results depending on the number of times your app was terminated and restarted during the course of a workunit. This can prevent hosts of the same platform from validating against each other. Hosts that are configured not to retain the application in memory while suspended should be affected more frequently. This particular issue was fixed in the beta, IIRC.

Right??

Apart from that, you will always see a few same-platform validation errors from boxes that were overclocked just a bit too aggressively.

CU
BRM

No overclocking here :(

arcturus
Joined: 11 Feb 05
Posts: 44
Credit: 1,008,160
RAC: 0

Message 68414 in response to message 68409

Quote:
As a Linux user myself, I would suggest that this user upgrade his 2.14.xx kernel to a 2.16.xx kernel. I have never had any validation problems with Windows users on my PII Deschutes running SuSE Linux 10.1.
Tullio

That won't help. I just got 0 credit running the 2.6.20 kernel. Besides, the OP is running Windows, not Linux.

http://einsteinathome.org/workunit/34017383
