Problem with an R9 390X when running 2 or more WUs at a time.

hotze33
Joined: 10 Nov 04
Posts: 100
Credit: 363,440,607
RAC: 76,667

I've been running two BOINC clients on my machine since yesterday with Catalyst 15.7 (OpenCL 1800.8). So far some WUs have already validated, which is better than the 100% invalids I got when running 2 WUs in one client.
So there may be a silver lining, but it is still a little too early to tell.

For anyone trying this method, I had to change the instructions from http://vyper.kafit.se/wp/index.php/2011/02/04/running-different-nvidia-architectures-most-optimal-at-setihome/ a little. First, I had to use "--" instead of "-" for the arguments on the command line. Second, when connecting to the computer in the second client I had to use 127.0.0.1:9999 instead of localhost:9999. Credit to the user salvorhardin at the AnandTech forum (http://forums.anandtech.com/showthread.php?t=2254389) for this.
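
As a rough illustration of the two changes described above (double-dash arguments and the 127.0.0.1 address), here is a minimal Python sketch that only assembles the command line for a second client. The flag names --allow_multiple_clients, --dir and --gui_rpc_port are the BOINC client options as I understand them; the executable name and the data directory are hypothetical placeholders. None of this is copied from the linked guide, so check it against your own BOINC version.

```python
import shlex


def second_client_command(exe="boinc", data_dir="/path/to/second_data_dir", rpc_port=9999):
    """Build the argument list for a second BOINC client instance.

    exe and data_dir are hypothetical placeholders; the flags are assumed
    BOINC client options and use the double-dash form.
    """
    return [
        exe,
        "--allow_multiple_clients",       # let a second client run next to the first
        "--dir", data_dir,                # separate data directory for the second client
        "--gui_rpc_port", str(rpc_port),  # port the manager connects to
    ]


def manager_address(rpc_port=9999):
    """Address to enter in the manager: the numeric loopback IP, not 'localhost'."""
    return f"127.0.0.1:{rpc_port}"


if __name__ == "__main__":
    print(shlex.join(second_client_command()))
    print(manager_address())
```

The sketch does not start anything; it just prints the command and the address so they can be checked before use.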

Sutaru Tsureku
Joined: 26 Oct 09
Posts: 24
Credit: 101,737,268
RAC: 0

I looked at your PC with the two BOINC clients...
It looks like '50/50' so far: 50% OK and 50% invalid.

It doesn't look so good.

It's really disappointing and annoying that AMD 'added' this BUG to the new drivers.
We have very powerful VGA cards, but can't enjoy their full performance.

hotze33
Joined: 10 Nov 04
Posts: 100
Credit: 363,440,607
RAC: 76,667

It is kind of strange. When running two WUs in one client, they fail 100% of the time. Running two clients with one WU each results in both valids and invalids for both clients, and sometimes even valid results from tasks that were calculated at the same time...
However, most of the failing WUs of my second client (http://einsteinathome.org/host/12094224/tasks) are inconclusive, as are the results of my wingman.

Chooka
Joined: 11 Feb 13
Posts: 102
Credit: 1,313,595,033
RAC: 870,072

Quote:
I have 4 AMD Radeon R9 Fury X VGA cards in one PC.

Jezuss... doesn't come much more powerful than that!


Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,516
Credit: 463,004,198
RAC: 36,327

Quote:


What we need is to pique the interest of the project devs into looking at this from the project's app code point of view, to narrow down the list of potential culprits causing tasks to fail when run concurrently.

Sadly, I doubt there are enough Hawaii and, to a certain degree, Tonga users here at Einstein (wanting to run multiple tasks) to make it worth the developers' effort.

We are aware of the problem (thanks to those who reported it here). Currently we can't think of any way we could influence this from the app side. It is the driver's and OpenCL's responsibility to isolate the different processes from each other, and if there is a failure to do so, what can we do (other than force people to run just one E@H GPU task at a time)?

I'm also not entirely convinced this is restricted to Hawaii and Tonga cards, I'm pretty sure I also see this on a 7790 'Bonaire'.

So we know running 2 x BRP or 2 x SETI GPU concurrently with the affected drivers can cause problems. What about 1 x BRP GPU + 1 x SETI GPU? Any experience with that?

Cheers
HB

hotze33
Joined: 10 Nov 04
Posts: 100
Credit: 363,440,607
RAC: 76,667

So just a quick thought: If the Radeon 7790 (also the R9 285?) has a similar problem, then this looks like a problem with the GCN architecture? All the affected cards seem to have GCN >= 1.1, which means an increased number of asynchronous compute engines. Those should increase the parallel processing power.
On a second note: I haven't been able to track the changes in the OpenCL driver from 1445.5 (Catalyst 14.4, 2 WUs work but very slowly) to 1526.3 (Catalyst 14.8, faster but two WUs will fail). Maybe someone knows what kind of performance improvements were made between these two versions.

ReaDy
Joined: 23 Aug 13
Posts: 4
Credit: 107,983,632
RAC: 0

I run Einstein on a Radeon 7790 with 2 tasks at a time. With the AMD Catalyst 14.2 Beta Download V1.3 driver (OpenCL driver 1411.4) I get no calculation errors. I don't use newer drivers because with them I can't lower the video card frequency; only overclocking is available, which doesn't suit me. I think that all the problems on new video cards are connected with OpenCL 2.0.

Keith Myers
Joined: 11 Feb 11
Posts: 1,062
Credit: 1,172,657,181
RAC: 2,755,349

Quote:


So we know running 2 x BRP or 2 x SETI GPU concurrently with the affected drivers can cause problems. What about 1 x BRP GPU + 1 x SETI GPU? Any experience with that?

Cheers
HB


Back in July when SETI ran out of work, my cards fell back to running both Einstein and MilkyWay at the same time. I never had any issues when running SETI and Einstein or SETI and MW at the same time on the same card. Evidently Einstein and MW contend for the OpenCL resources when run concurrently on the same card, since I then had errors and invalids on both projects. Things returned to normal when SETI came back online and my normal mix of work resumed. I posted the errors over at MW in Message 63838.

 

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,388
Credit: 51,662,767,865
RAC: 69,887,828

Quote:
I'm also not entirely convinced this is restricted to Hawaii and Tonga cards, I'm pretty sure I also see this on a 7790 'Bonaire'.


I set up a 'Bonaire' card recently. It's more recent than the 7790 - an R7 260X 2GB. My understanding (and it could be wrong) is that it's the same chip with more memory and somewhat higher clocks. It also has a factory overclock. A local computer store had them on special so I decided to test one out.

I ran it at 2x, 3x and 4x but had some validate errors at 4x with no gain in performance. After I decided on 3x, and after the weather warmed significantly, it started producing some invalid results (~10%), so I downclocked both core and mem from 1175/1625 MHz to 1150/1600 MHz. That seems to have got rid of the invalids for the moment. With even warmer weather on the way, I might reduce these a bit more.

So, I'm not seeing this 'concurrent tasks' problem with my particular 'Bonaire'.

Cheers,
Gary.

Chooka
Joined: 11 Feb 13
Posts: 102
Credit: 1,313,595,033
RAC: 870,072

Since getting my R9 390 (card installed 09/10/15), I've had it pointed out to me that I've got quite a number of invalids. That was running 2 WUs at a time (setting 0.5 in preferences).

I'm trying 1 at a time now but that's going to be painfully slow.
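
For anyone unsure how the 0.5 setting relates to "2 WUs at a time", here is a small Python sketch of the arithmetic. It assumes the preference is a per-task GPU share and that the client fits as many tasks as add up to one whole GPU, i.e. floor(1 / factor). The function name is made up for illustration and is not part of BOINC.

```python
import math


def tasks_per_gpu(utilization_factor):
    """Estimate how many tasks run at once on one GPU for a given factor.

    Assumption: each task claims `utilization_factor` of a GPU and tasks are
    packed until the shares would exceed 1.0.
    """
    if not 0 < utilization_factor <= 1:
        raise ValueError("utilization factor must be in (0, 1]")
    # Small epsilon guards against floating-point results like 2.9999999
    return math.floor(1 / utilization_factor + 1e-9)


if __name__ == "__main__":
    for factor in (1.0, 0.5, 0.33, 0.25):
        print(factor, "->", tasks_per_gpu(factor), "task(s) per GPU")
```

Under that assumption, 0.5 gives two tasks per GPU and 1.0 gives one at a time.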

