Why does BRP4 produce many more errors than S6LV1?

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 802589562
RAC: 1237785

Quote:
So two more failed to validate, both against CUDA PCs.
Normally my main system is very reliable, nothing overclocked.
It's interesting: until now all failing WUs came from the newer HD6950, not from the older HD5850. Are there known issues?

Yes, we saw hints of this trend (weaker validation with 69xx cards than with 5xxx) during tests on Albert@Home but needed more data. I'm confident that even the somewhat reduced precision from the 6900s is sufficient to make scientifically valid results, so it might be enough to adjust the validator. Still we need to understand the cause of this trend.
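
To illustrate what "adjusting the validator" could mean in principle, here is a minimal sketch of a tolerance-based comparison between two hosts' results. The function name `results_agree`, the tolerance values, and the sample numbers are all hypothetical; the thread does not show how the actual Einstein@Home validator compares results.

```python
import math

def results_agree(a, b, rel_tol):
    """Return True if two result lists match element by element within rel_tol.

    Hypothetical helper for illustration only, not the project's validator.
    """
    if len(a) != len(b):
        return False
    return all(math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b))

# Made-up values standing in for the same quantities computed on two hosts.
host_a = [12.3456789, 0.0012345]
host_b = [12.3457001, 0.0012345]

strict = results_agree(host_a, host_b, rel_tol=1e-6)   # tight tolerance
relaxed = results_agree(host_a, host_b, rel_tol=1e-5)  # slightly looser tolerance
print(strict, relaxed)
```

With these made-up numbers the pair fails the tighter check and passes the looser one, which is the kind of trade-off a validator adjustment would have to weigh.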

CU

HB

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 521116812
RAC: 247647

Quote:
Still we need to understand the cause of this trend.

http://einsteinathome.org/workunit/123571937
http://einsteinathome.org/workunit/123557679
http://einsteinathome.org/workunit/123477212

29 still waiting for validation; I'll keep you informed.

Alexander

Alex

Quote:
I'm confident that even the somewhat reduced precision from the 6900s is sufficient to make scientifically valid results

Back to the slide rule?
This corrupts my understanding of computation. I've seen comments about FPUs producing different results and about problems with GPU programming. But precision? OK, there is single and double, but my understanding was that single-precision results should all be equal. Is there a discussion thread or a deeper explanation of this anywhere? That would be very interesting!
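
One general effect that bears on this question: floating-point addition is not associative, so the same single-precision inputs can give different sums when they are added in a different order, as happens when hardware or drivers schedule a parallel computation differently. The sketch below emulates single precision with the standard `struct` module; the helper names `f32` and `sum_f32` are made up for this example, and nothing here claims to be the cause of the HD 6900 behavior, which the thread leaves open.

```python
import struct

def f32(x):
    """Round a Python float (a double) to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sum_f32(values):
    """Add values left to right, rounding to single precision after every step."""
    total = f32(0.0)
    for v in values:
        total = f32(total + f32(v))
    return total

# The same four numbers, summed in two different orders.
forward = sum_f32([1.0e8, 1.0, -1.0e8, 1.0])
reordered = sum_f32([1.0e8, -1.0e8, 1.0, 1.0])
print(forward, reordered)  # 1.0 2.0
```

In the first order, the 1.0 added to 1.0e8 is lost to rounding because neighboring single-precision values near 1.0e8 are 8 apart; in the second order nothing is lost.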

Alexander

Alex

Quote:
Quote:
Still we need to understand the cause of this trend.

http://einsteinathome.org/workunit/123571937
http://einsteinathome.org/workunit/123557679
http://einsteinathome.org/workunit/123477212

29 still waiting for validation; I'll keep you informed.

Alexander


Add this one: http://einsteinathome.org/workunit/123475517

Alex

So far, 14 WUs from the HD6950 have been marked as invalid, and some more are pending.
I think I'll stop crunching here until I see a message that this issue is solved.

Bikeman (Heinz-Bernd Eggenstein)

Hi all!

We are currently investigating this, and we think we have an idea of what's going on, but this needs further tests and some work on a fix.

What we can say confidently now is this:

  * all HD 6900 series cards should be affected by this
  * the validation rate to expect for this type of card in the long run is roughly 50%
  * no other ATI/AMD card has yet been found to show this behavior
  * improving the validation rate for the HD 6900 will require a new app version and will involve a performance penalty (just how severe remains to be seen)

Stay tuned, we'll let you know when we have news. Those HD 6900 owners who prefer to stop crunching ATI/AMD apps for BRP4 for now can do so in a couple of ways:

  * deselect ATI apps (for all projects!!!) altogether in the global preferences (for a certain "venue"), or
  * deselect the BRP4 app altogether in the project-specific settings (also affecting any CUDA cards for hosts in that venue), or
  * deselect GPU processing on a particular host in the BOINC Manager local settings (also affecting projects other than E@H), or
  * if you are familiar with editing the client configuration file cc_config.xml (see http://boinc.berkeley.edu/wiki/Client_configuration), there is a setting (search for "") which probably does exactly what is convenient for this problem, on a host-by-host basis.

Cheers
HB

Alex

Quote:

Hi all!

We are currently investigating this, and we think we have an idea of what's going on, but this needs further tests and some work on a fix.

If you need a tester, contact me.

Bikeman (Heinz-Bernd Eggenstein)

Hi!

Thanks! Actually we have just now put a new app version on the test project at albert.phys.uwm.edu (requires BOINC 7.0.27) that should improve the HD 6900 validation rate, although at a performance cost. Our initial tests indicate that the penalty on HD 6900s is well below the ca. 50% validation failure rate (on average) seen with the current version, so we thought it was a good idea to test this quick fix for the HD 6900 problem. We will probably come up with something that has a smaller performance penalty later.

So anyone with an HD 6900 who wants to help test the fix is invited to join the "Albert" test project at http://albert.phys.uwm.edu

Cheers
HB

Alex

Quote:

Actually we have just now put a new app version on the test project at albert.phys.uwm.edu (requires BOINC 7.0.27) that should improve the HD 6900 validation rate, although at a performance cost.

... crunching @ Albert

Alex

Speed loss:
1:45 @ Einstein : 2:09 @ Albert
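
For a rough sense of what these two timings mean next to the roughly 50% validation rate mentioned earlier in the thread, here is a small back-of-the-envelope sketch. It assumes both readings are per-task runtimes in the same base-60 units and that the test app validates nearly every result; the second assumption is not confirmed anywhere in this thread, and the helper names are made up.

```python
def to_units(clock):
    """Convert a 'major:minor' reading such as '1:45' into minor units (base 60)."""
    major, minor = clock.split(":")
    return int(major) * 60 + int(minor)

def cost_per_valid_result(runtime, validation_rate):
    """Average runtime spent per result that actually validates."""
    return runtime / validation_rate

einstein = to_units("1:45")  # current app on Einstein
albert = to_units("2:09")    # test app on Albert

slowdown = albert / einstein - 1.0                 # relative increase in runtime
old_cost = cost_per_valid_result(einstein, 0.5)    # about half the results validate
new_cost = cost_per_valid_result(albert, 1.0)      # ASSUMPTION: all results validate

print(f"slowdown: {slowdown:.1%}")
print(f"runtime per valid result: {old_cost:.0f} before, {new_cost:.0f} after")
```

Under those assumptions the test app takes about 23% longer per task but needs noticeably less runtime per validated result.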
