RX 480 running slow, why?

fastbunny
Joined: 20 Apr 06
Posts: 22
Credit: 91424422
RAC: 0
Topic 204989

Yesterday I upgraded from an old AMD 7870 (2GB) to new AMD RX 480 (8GB), and I expected it to be at least 50% faster. Looking at the rated GFLOPS, it should even be around 100% faster. However, runtimes for WUs have actually more than doubled (see valid tasks for this machine here, from 1900 s/WU with the 7870 to 4800 with the RX 480). Can anybody tell me why this is?

 

I'm running Windows 10 x64, with the latest stable drivers and BIOS for the GPU. When checking with GPU-Z, I can see that the 'GPU only power draw' is only around 40W, and that the memory controller load is at 0% for most of the time. GPU load is reported at 100%, but I don't really believe this, looking at the other two measurements. On the plus side, the card is running very quietly...

 

I have tried running four instead of two tasks in parallel, thinking that maybe it could do four in the same time as two, but unfortunately it doesn't seem to work that way. Furthermore I tried assigning a full CPU core to each WU instead of 0.5, but this had no effect either.

I have also tested with a cryptomining program, and that does indeed run 50% faster on the new RX 480.

 

Why is Einstein not fully using my RX 480? Any help is much appreciated.

Mumak
Joined: 26 Feb 13
Posts: 325
Credit: 3527577613
RAC: 1468371

Have you checked other parameters of the GPU, especially GPU clock and GPU Memory clock ?


Bikermatt
Joined: 9 May 10
Posts: 4
Credit: 1188994085
RAC: 0

Was it a 7870 XT? If so, that card has 2x the double-precision GFLOPS of the RX 480, and double precision is what the new app depends on. If it was just a 7870 GHz edition then I don't have a good answer for you.

Gavin
Joined: 21 Sep 10
Posts: 191
Credit: 40644337738
RAC: 1

Speaking with the voice of experience, having had the same issue, there are a couple of things to do:

1. Download and install the Intel Chipset Driver Update utility here and update if required. Doing this sorted the memory controller issue for me.

2. Try running just a single task ;-)

 

Gav.

fastbunny
Joined: 20 Apr 06
Posts: 22
Credit: 91424422
RAC: 0

It was a regular 7870 GHz edition.

The GPU clock and memory clock are both in their highest states.

fastbunny
Joined: 20 Apr 06
Posts: 22
Credit: 91424422
RAC: 0

Thank you Gavin, it chews through a single task in around 10 minutes. Now I feel a bit stupid for not trying this before... I figured it should do at least two like my old card, maybe more since it's an 8GB model. Anyway, I'm glad everything's running smoothly again.

Chipset drivers were up-to-date apparently.

 

Thank you all for the quick responses!

Gavin
Joined: 21 Sep 10
Posts: 191
Credit: 40644337738
RAC: 1

You're welcome and I'm glad you are making progress :-)

I suggested the Intel chipset update because, like you, I was trying to run the GPU at x2 tasks and above under Win 10, and after a time the GPU's memory controller would drop to 0% activity and stay there, even whilst running at x1. The update (needed in my case) cured that for me, but x2 tasks and above remained very slow: x2 times were in line with your observed ~4500 seconds. Dropping to a single task gave me runtimes around 550 seconds!

Enjoy!!

Gav.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117758625407
RAC: 34817211

Gavin_14 wrote:
... but x2 tasks and above remained very slow. x2 times were in line with your observed ~4500 seconds. Dropping to a single task gave me runtimes around 550 seconds!

Hi Gavin,
Thanks for solving the OP's problem and thanks for reporting that you had the same behaviour.  Must be something to do with Windows because there are examples of hosts with dual RX 480s running concurrent tasks under Linux, presumably at the full speed with no slowdown of the type you two observe.

If you look through the validated tasks for that linked host, the current ones are taking about 940s and a couple of days ago they were taking about 1360s - both for the 1.18 app version.  My calculations suggest that the slower time was for x3 and that more recently the host has been switched to x2.  This seems to accord with your single task time of 550s, dropping to 480s per task at x2 and 450-460s per task at x3.  My figures are just from eyeballing a page of validated results so are very approximate only.  However the trend seems quite believable.
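The arithmetic behind these estimates is just elapsed time divided by the number of tasks running concurrently. A minimal Python sketch, using the approximate figures quoted in this thread (the function name is illustrative, and assigning 940 s to x2 and 1360 s to x3 is the inference made above, not a confirmed setting):

```python
def effective_seconds_per_task(elapsed_s: float, concurrency: int) -> float:
    """Wall-clock seconds each task effectively costs when `concurrency`
    tasks share one GPU and each takes `elapsed_s` seconds to finish."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    return elapsed_s / concurrency


# Approximate elapsed times in seconds, keyed by tasks per GPU.
# The x2 and x3 assignments are assumptions inferred from the runtimes.
observed = {1: 550, 2: 940, 3: 1360}

for n, elapsed in sorted(observed.items()):
    per_task = effective_seconds_per_task(elapsed, n)
    print(f"x{n}: {elapsed} s elapsed -> {per_task:.0f} s per task")
```

Higher concurrency only pays off while this per-task figure keeps falling; on the Windows hosts described earlier it rose sharply at x2 instead.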

I'm interested in RX 4xx performance as I'm considering that series for a future upgrade or two.  I've ordered a 460 to play with so I can get the driver sorted out for my preferred Linux distro.  If that all works out, I'll probably get a 470 and 480 as well so I can do detailed comparisons.  Quite a lot of fun (and distraction) playing with new kit :-).

 

Cheers,
Gary.

Gavin
Joined: 21 Sep 10
Posts: 191
Credit: 40644337738
RAC: 1

Gary Roberts wrote:
 Must be something to do with Windows...

I suspect so. On the back of this thread I decided to play around with this host of mine: 2x RX 480s running 1 task per GPU with no CPU work to cloud the issue. Per-task runtimes in this config are about 520 seconds.
Changing my setup to allow two tasks per GPU produced some strange behaviour and I quickly reverted, but what I observed in the brief time (and we're talking minutes here) that I ran at x2 concurrency per GPU was that there is no true concurrency (at least on this machine). What I saw was more of an alternating strategy whereby one GPU runs its 2 tasks whilst the other GPU stops running, then after a few seconds the stopped GPU would run and the other would stop! Each time this happened there was a period of ~2 seconds of zero activity on either card; maybe some sort of unload/load operation happening?
Whatever the cause, it could add minutes to runtimes if one is not wary and assumes multiple tasks equate to better throughput on any given host!
Conversely, I also have a machine with 2 R9 380s using the same driver and Windows 10 that is more than happy to run each GPU at x2 concurrency.

The other thing of note was that memory controller load as reported by GPU-z dropped like a stone for both cards when running x2... I had previously fixed this with the Intel chipset driver update I recommended earlier. However that fix was applied when running just a single 480...

Perhaps Stoneageman is around and is prepared to comment on his experience with his Linux machines and dual RX 480s. AgentB may be able to talk about his single RX 480 machine config too... and with a little help I could switch another dual 480 host to Kubuntu and explore the benefits of Linux and the AMDGPU-PRO drivers in the hope of unleashing these cards' true potential!

Gav.

 

Mumak
Joined: 26 Feb 13
Posts: 325
Credit: 3527577613
RAC: 1468371

I had a similar experience with the Fury X. Running 1 WU takes ~440s, while 2 WUs take 1500-1800s.
The Tesla K20c also runs extremely poorly, about 2900s for 1 WU. Not sure what is causing this, perhaps poor OpenCL driver support for these series. On the other hand, running Milkyway or BRP4/6 yielded good/expected performance there.

It's also very strange that installing the Intel INF drivers would make any difference, since AFAIK these are just plain INF files without any real drivers. They just replace the unknown devices in Device Manager with dummy devices.


AgentB
Joined: 17 Mar 12
Posts: 915
Credit: 513211304
RAC: 0

Gavin_14 wrote:
Perhaps Stoneageman is around and is prepared to comment on his experience with his Linux machines and dual RX480's. AgentB may be able to talk of his single RX480 machine config.

Not really much to say other than it was pretty easy and iirc the only thing I needed to do was add boinc to the video group. Compared with fglrx (long overdue for the dustbin) amdgpu-pro is a lot better, and hopefully it will lead to a fully open source driver at some point.

Currently running at 0.33 (aka x3) and seem to recall even successfully (albeit slower) running at x6 and x8, now with amdgpu-pro-16.50-362463.

Typically at x3 the CPU usage per task is ~80s and elapsed time ~1440s which averages 24*3600*3/1440=180 tasks per day, running two CPU tasks to keep the CPU warm leaving one core free.  It should top out at RAC 620-630K.
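The tasks-per-day figure above follows from seconds per day times concurrency, divided by elapsed time per task. A small Python sketch of that calculation (the function and constant names are illustrative only):

```python
SECONDS_PER_DAY = 24 * 3600


def tasks_per_day(elapsed_s: float, concurrency: int) -> float:
    """Tasks completed per day by one GPU running `concurrency` tasks
    at once, each taking `elapsed_s` seconds of wall-clock time."""
    if elapsed_s <= 0 or concurrency < 1:
        raise ValueError("elapsed_s must be positive and concurrency >= 1")
    return SECONDS_PER_DAY * concurrency / elapsed_s


# Figures quoted above: ~1440 s elapsed per task at x3.
print(tasks_per_day(1440, 3))  # 180.0
```

The same function can be used to compare settings on one host, for example a measured x1 runtime against a measured x2 runtime, before committing to a higher concurrency.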

My app_config.xml looks like

    <app_config>
        <app>
            <name>hsgamma_FGRPB1G</name>
            <gpu_versions>
                <gpu_usage>0.33</gpu_usage>
                <cpu_usage>0.5</cpu_usage>
            </gpu_versions>
        </app>
    </app_config>

I probably could drop the CPU down to 0.25 without a problem. 
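For anyone experimenting with different concurrency settings, a file like this can also be generated rather than edited by hand. The following is only a sketch using Python's standard library: the helper name `build_app_config` is made up for illustration, and it assumes the usual BOINC convention that `gpu_usage` is the fraction of a GPU each task claims (so 1 / tasks per GPU).

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, tasks_per_gpu: int, cpu_usage: float) -> str:
    """Return app_config.xml text that runs `tasks_per_gpu` tasks per GPU.

    Hypothetical helper: gpu_usage is written as 1 / tasks_per_gpu,
    rounded to two decimals (e.g. 3 tasks -> 0.33).
    """
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    versions = ET.SubElement(app, "gpu_versions")
    ET.SubElement(versions, "gpu_usage").text = f"{1 / tasks_per_gpu:.2f}"
    ET.SubElement(versions, "cpu_usage").text = f"{cpu_usage:g}"
    if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Same values as the hand-written file above: x3 with half a CPU core per task.
print(build_app_config("hsgamma_FGRPB1G", 3, 0.5))
```

Calling it with `tasks_per_gpu=1` gives the single-task setting that fixed the slow runtimes on the Windows hosts earlier in the thread.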

No over-clocking etc.

AMD have put a lot of effort into OpenCL, whereas nVidia pays only lip service. So we should always expect OpenCL based apps to run better on (newer) AMD hardware, drivers and libraries. AMD has also traditionally been strong on DP. If I had capable nVidia hardware (I have 768 MB GTX 460s, so they are short on memory) I might try running the AMD OpenCL libraries (libOpenCL.so) against the nVidia drivers. I expect that would be a headache, but it would be interesting if you could get it to work.

There are some stability issues with the video drivers, which seem mainly power related: displays freeze after a long idle period. I get around that by manually turning the monitor off (and rebooting or restarting X maybe once a week, I guess). The tasks continue to crunch even if the video is frozen.

Gavin_14 wrote:

also... and with a little help I could switch another dual 480 host to Kubuntu and explore the benefits of Linux and AMD GPU PRO drivers in the hope of unleashing these cards' true potential!

You know you want to.... penguins are friendly.
