Problem with an R9 390X when running 2 or more WUs at a time.

juan BFP
Joined: 18 Nov 11
Posts: 839
Credit: 421443712
RAC: 0
Topic 198213

A friend asked me to test the performance of an R9 390X in E@H.

I started one host with this GPU and it works fine with 1 WU at a time: fast, with no errors or invalids.

When I start to run 2 WUs at a time, the invalids start.

Does anyone have a fix for that?


Keith Myers
Joined: 11 Feb 11
Posts: 4968
Credit: 18760608419
RAC: 7160594


I haven't read of one yet. Look at this thread.
It seems nobody is having any luck running more than one BRP6 task at a time on AMD 290/390 cards without producing invalids.

juan BFP
Joined: 18 Nov 11
Posts: 839
Credit: 421443712
RAC: 0

Thanks, but the problem apparently happens even when you run only BRP4 tasks.

At least, all the latest invalids are from this type of task.


chase1902
Joined: 13 Aug 11
Posts: 37
Credit: 1264094642
RAC: 0

I never tried any BRP4 tasks, as they are not always available.
I certainly couldn't get any of the BRP6 tasks to work with more than one running at a time, even after trying various drivers etc.

I've noticed the later AMD drivers seem to be quicker overall, but the CPU time is considerably longer: twice as long as expected for the R290x and over 3 times longer on my 7970 (just reinstalled with the latest version). I'd be interested to know why.
I thought the longer CPU times might have something to do with the invalid tasks, but as they still work fine on the 7970, I guess not.

Sasa Jovicic
Joined: 17 Feb 09
Posts: 75
Credit: 90454180
RAC: 221209

CPU time is OK. Try underclocking the GPU memory and lowering the memory voltage.

Jeroen
Joined: 25 Nov 05
Posts: 379
Credit: 740030628
RAC: 0

I ran into the same issue on my R9-290x card some time back. I have not been able to find a solution. One task is the maximum that the card can handle for successful task validation.

Jeroen

juan BFP
Joined: 18 Nov 11
Posts: 839
Credit: 421443712
RAC: 0

Quote:

I ran into the same issue on my R9-290x card some time back. I have not been able to find a solution. One task is the maximum that the card can handle for successful task validation.

Jeroen


Seriously? Only 1 WU at a time on this top-end GPU is something very weird...

Seems like my plans for building a top 4xGPU AMD cruncher are sinking.

Thanks all for the answers. Back to the drawing board.


Stef
Joined: 8 Mar 05
Posts: 206
Credit: 110568193
RAC: 0

I didn't find any newer high-end AMD GPUs in the top 100 hosts list, so it definitely is a problem.

Gavin
Joined: 21 Sep 10
Posts: 191
Credit: 40644337738
RAC: 1

Stef wrote:

Quote:
I didn't find any newer high-end AMD GPUs in the top 100 hosts list, so it definitely is a problem.


One page further and you would have found this host of mine (103rd at the time of writing): an MSI R9 390 running x1 tasks via app_config. The projected average RAC from this card is approx. 164K per day, running just a single task at once under Windows 7 Pro. To put this into perspective, I have several Win 7 machines with either an HD7970 or an R9 280X running tasks at x3, and their daily RAC is in the region of 168K... (For those who take the time to look, my hosts are not yet where they ought to be after suffering a prolonged outage at my end!)
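
For readers unfamiliar with app_config, here is a minimal sketch of what an x1 setup might look like. It is not the poster's actual file: the application name `einsteinbinary_BRP6` and the `cpu_usage` value are assumptions, so check the app names listed for your own host before using it.

```xml
<!-- app_config.xml, placed in the Einstein@Home project directory (sketch only) -->
<app_config>
  <app>
    <!-- Assumed app name for the BRP6 search; verify it on your host -->
    <name>einsteinbinary_BRP6</name>
    <gpu_versions>
      <!-- 1.0 = each task claims a whole GPU, so one task runs at a time -->
      <gpu_usage>1.0</gpu_usage>
      <!-- Assumed CPU reservation per GPU task; tune to taste -->
      <cpu_usage>0.5</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

Setting `gpu_usage` to 0.5 instead would let two tasks share each GPU, which is the x2 mode that produces the validate errors discussed in this thread. The client has to re-read its config files (or be restarted) before a change takes effect.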

With the 390 I have tried all the tricks and tweaks that have been suggested both here and elsewhere but, as others have found, none of them work when trying to crunch multiple BRP6 tasks, which is a shame because this card could be truly fast. My testing with tasks running x2 could have given a RAC of ~202K if it wasn't for the validate error rate of 50% or more!

Typical GPU loads on the 390 (as read by GPU-Z) for a single task swing between 68% and 96%, or about 86% on average. Running x2, IIRC, the load was more or less constant at 100% with the occasional drop to around 79%, which by my rusty maths should give a broadly similar average load, or am I wrong?
Assuming I'm correct and the average load is nearly the same, I have come to the conclusion that either there is something amiss with AMD's architecture/implementation of the Hawaii chip or there is some obscure incompatibility with the Einstein app and/or the OpenCL version/capability. Who knows? I'm clutching at straws for the answer, but at the same time I'm not disappointed with the card's current performance: it's cooler, quieter and cheaper to run than the 280s.

I have also had the good fortune to acquire a number of R9 380 'Tonga' cards, possibly for a limited period. x2 is possible on these cards and pushes the load to a near-constant 99%, but they too suffer from the dreaded validate error if they are asked to do anything else whilst crunching. Streaming live video etc. seems to push the limits and cause problems...

Maybe Heinz-Bernd and the other developers will read this and the other recent AMD 290/Fury threads and delve a little deeper into the issue :-)

Gav.

chase1902
Joined: 13 Aug 11
Posts: 37
Credit: 1264094642
RAC: 0

That's a pretty impressive RAC, Gavin.

I have not managed to get that out of my 7970 or R290X. I'm going to have to have another play and see if I can up the flow a bit, especially now that the cooler weather is on the way and I won't have to worry so much about the heat.

Can I ask, do you overclock the cards much or run them at factory settings?

John

Keith Myers
Joined: 11 Feb 11
Posts: 4968
Credit: 18760608419
RAC: 7160594

After looking at your invalid tasks, I would guess that there is some sort of incompatibility/integration problem between the Einstein applications, the ATI drivers and the OpenCL implementation. It looks like a lack of card resources when running 2X immediately throws an exception at task startup. I think the Einstein developers need to look closely at this. I don't believe it has anything to do with one specific host.

The first thing I would do is update to the latest BOINC Manager, 7.6.6, since it includes some specific fixes made for the SETI and MW projects to prevent invalids. It might help with Einstein. The second would be to set some of the debug flags in the cc_config file using the BM interface. I would set coproc_debug, mem_usage_debug, checkpoint_debug, statefile_debug and task_debug. Then post the log results for an invalidated task so we can figure out just what the application or BOINC is complaining about that causes the invalid.
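
If you prefer to edit the file by hand rather than use the Manager dialog, the flags above would look roughly like this in cc_config.xml. This is a sketch under one assumption: that the co-processor debug flag mentioned above corresponds to BOINC's `coproc_debug` log flag.

```xml
<!-- cc_config.xml, placed in the BOINC data directory (sketch only) -->
<cc_config>
  <log_flags>
    <!-- Assumed to be the co-processor debug flag referred to above -->
    <coproc_debug>1</coproc_debug>
    <mem_usage_debug>1</mem_usage_debug>
    <checkpoint_debug>1</checkpoint_debug>
    <statefile_debug>1</statefile_debug>
    <task_debug>1</task_debug>
  </log_flags>
</cc_config>
```

These flags make the event log very verbose, so it is probably worth setting them back to 0 once the log for an invalid task has been captured.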

 
