A walk to the AMD side

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7226058262
RAC: 1064729
Topic 218274

Years ago I ran some ATI cards, before AMD bought them.  I've been firmly in the Nvidia habit for years now and believed that for a generation or two Nvidia had a big power efficiency advantage with the Maxwell and Pascal cards, which matters to me.  Also, I was just in the habit and thought I knew how to play in the Nvidia sandbox.

But folks here have persistently reported for some time now that Einstein currently works better on AMD than on Nvidia, and my Turing experience probably jostled me into a more open mind.  The recent Vega VII thread, and a new post by Gary Roberts on the Einstein virtues of AMD 570 cards, finally tripped me to try something.

My choice was my wife's primary use PC, which a couple of years ago was running a GTX 970 plus a GTX 750, but more recently has slumbered along running just a GTX 1050, with modest power consumption, and even more modest Einstein throughput.

The case has only a few small poorly positioned fans and lives in a piece of furniture which somewhat hurts cooling.  This is a personal use machine, and we don't like fan noise, so in choosing a specific 570 card I paid attention to a review which tagged the XFX model RX-570P8DFD6 as having particularly good cooling among the 570 cards.

I'll list some of my little difficulties in getting going, just in case some other Nvidia to AMD migrant might be helped.

When I opened the box to check for fit, I noticed that my (older) case positioned hard drive cables in the "tail" area of the new graphics card.  My XFX card is not especially long for a 570, but barely fit without making me move a hard drive.

The power supply in my case is somewhat older and had only a single 6-pin PCIe power connector.  This 570 wants an 8-pin PCIe.  My particular XFX 570 arrived with a siamese cable to plug two 6-pins into a single 8-pin, and my middle PC desk had a leftover siamese to convert two classic 4-pin "Molex" power connectors to a 6-pin, so, I had a MacGyver solution to power.

But when I plugged in the new card and powered up, I got zero pixels on the screen.  No beep either.  It turned out I had failed to seat the 8-pin PCIe connector into the card.  I have seen cards wake up enough to complain when you do that, but this one just kept mum until I finally figured it out.

The AMD driver install went pretty smoothly, though I was puzzled at the way it asked me a question late in the game--I needed to click something to say "no, I don't want that".

I had placed my queue of work on suspend and disabled work fetch before I removed the Nvidia card.  But I was caught out by a few things I should have known:

1. The downloaded tasks are "branded" as to GPU manufacturer.  All of my suspended Nvidia branded tasks needed to be aborted.

2. There is an explicit "allow Nvidia" and "allow AMD" option in Einstein project preferences.  I needed to enable AMD and to click the project update button so that BOINC on my PC would get the word that it was allowed to request work for the card installed.

3. I've been running TThrottle on my PCs for years, but mostly use it for temperature reporting, with the limits set high enough that throttling does not occur.  I had the GPU limit on this box set to 70C.  For the 1050 card, that was no limit at all at reasonable room temperatures unless a fan failed, but for the new 570, it meant that after high-stepping through the first couple of minutes of work processing, the rate of progress slowed, as TThrottle worked as intended.  I moved the limit up to 80C, and finally got completed work in an elapsed time of about 660 seconds, compared to 2215 for the GTX 1050.

I've gotten the first validation, so despite my stumbling, it appears things are working.

I'll post more details later, but box power is up a lot (about 194 W vs. about 100 W) though still reasonable, and the box-level power productivity is clearly very greatly improved.  Still to come are an attempt to further raise productivity by running 2X, and an attempt to lower power through power limitation in either AMD Wattman or MSI Afterburner (which indirectly would mean undervolting the GPU).
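A minimal Python sketch of that box-level comparison, for anyone who wants to plug in their own numbers. It uses only the figures quoted above (660 s and 2215 s elapsed, about 194 W and about 100 W at the box) and assumes one task at a time running around the clock. The function name is illustrative, not anything from BOINC.

```python
SECONDS_PER_DAY = 86_400

def tasks_per_day_per_watt(elapsed_s, box_watts, concurrent=1):
    """Tasks completed per day for each watt drawn by the whole box."""
    tasks_per_day = concurrent * SECONDS_PER_DAY / elapsed_s
    return tasks_per_day / box_watts

# Figures from the post; box power includes CPU, drives and so on.
gtx_1050 = tasks_per_day_per_watt(2215, 100)
rx_570 = tasks_per_day_per_watt(660, 194)

print(f"GTX 1050: {gtx_1050:.3f} tasks/day/W")
print(f"RX 570:   {rx_570:.3f} tasks/day/W")
print(f"ratio:    {rx_570 / gtx_1050:.2f}x")
```

On these numbers the 570 box does roughly 1.7 times as much work per watt, even though its total draw nearly doubled.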

This is by no means a full treatment of AMD vs. Nvidia issues, even narrowly at Einstein.  But I thought I'd post quickly while some of my little glitches were fresh in my mind, just in case they might help someone. 

Oh, price: I paid US$170 for my 570, which is currently completing the current flavor of Einstein GRP in 660 elapsed seconds.  The 2080 set me back $800 and is currently delivering elapsed times of 537 seconds.  The 1070 in the same box cost me about $700, and is currently delivering 820 second elapsed times.
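The price comparison can be put on a common footing as dollars paid per task-per-day of throughput. This is only a sketch: it assumes the quoted elapsed times are for one task at a time, which the post does not state for the 2080 and 1070, and it ignores power cost entirely.

```python
SECONDS_PER_DAY = 86_400

# (purchase price in US$, elapsed seconds per task), as quoted in the post
cards = {
    "RX 570": (170, 660),
    "RTX 2080": (800, 537),
    "GTX 1070": (700, 820),
}

def dollars_per_daily_task(price_usd, elapsed_s):
    """Purchase price divided by tasks completed per day."""
    return price_usd / (SECONDS_PER_DAY / elapsed_s)

cost = {name: dollars_per_daily_task(*spec) for name, spec in cards.items()}

# Cheapest throughput first.
for name, value in sorted(cost.items(), key=lambda item: item[1]):
    print(f"{name:9s} ${value:.2f} per task/day")
```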

It is not just early days, but an early hour, but at the moment I am quite grateful to Gary Roberts for a recent post rehearsing the virtues of the 570 cards for current Einstein GRP work.


We all hope for an Einstein Gravity-wave GPU application.  I imagine card comparisons might be quite different on that.  So the low purchase price of 570 cards is a real plus--just in case they don't stay the right answer for too long.

cecht
Joined: 7 Mar 18
Posts: 1535
Credit: 2909342077
RAC: 2118157

ARCHAE86 wrote:
...which is currently completing the current flavor of Einstein GRP in 660 elapsed seconds.

My XFX RX 570 had a similar average task time of 667 s when running default power settings under Windows. At 2x tasks, individual task time dropped to 624 s.  (Under Linux that dropped to 592 s, in line with the 590 s reported by Gary Roberts, https://einsteinathome.org/content/all-things-vega-vii#comment-169608)

I've played around with GPU power and clock settings under Windows and Linux, and settled on just using power limit (power caps), which can automatically throttle top clock speeds. I find that power capping is easier than manually fiddling with clock speeds. Also, from my experience, when you set clock speeds, you lock in that set speed (or power state) when the card is running at full capacity. When you only limit power, however, the drivers can adjust clock speed to whatever is needed for the calculations at hand, so clock speeds can drift a bit.  To me, that seems more efficient. When running E@H tasks with a -23% power limit (96 W, down from 124 W), I see clock speeds bounce among 1143 MHz, 1208 MHz and 1250 MHz (power states 4, 5 & 6). I'm guessing that different generations of AMD cards may do better with different power or clocking strategies.
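As a worked example of the percentage arithmetic, here is a small Python sketch that turns a relative power limit into watts. The conversion to microwatts is included on the assumption that the Linux amdgpu driver's hwmon `power1_cap` file, which takes microwatts, is where such a value would be written; the function names are illustrative only.

```python
def capped_watts(default_watts, percent_change):
    """Power cap in watts after a relative change such as -23 (percent)."""
    return default_watts * (1 + percent_change / 100)

def to_microwatts(watts):
    """Whole microwatts, the unit assumed for amdgpu's hwmon power1_cap."""
    return round(watts * 1_000_000)

cap = capped_watts(124, -23)
print(f"-23% of 124 W -> {cap:.2f} W")
print(f"96 W as microwatts: {to_microwatts(96)}")
```

The computed cap is about 95.5 W, which agrees with the 96 W figure above once the driver or tool rounds it.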

ARCHAE86 wrote:
My XFX card is not especially long for a 570, but barely fit without making me move a hard drive.

LOL, same here!

 

Ideas are not fixed, nor should they be; we live in model-dependent reality.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117691859178
RAC: 35064147

archae86 wrote:

I had placed my queue of work on suspend and disabled work fetch before I removed the Nvidia card.  But I was caught out by a few things I should have known:

1. The downloaded tasks are "branded" as to GPU manufacturer.  All of my suspended Nvidia branded tasks needed to be aborted.
....

Funny you should mention that! :-)

One of the old clunkers I decided to upgrade with one of the new RX 570s had previously been running a 650Ti.  It fired up fine but, being a bit cautious, I decided to allow some work and check that the nvidia card could still get validated results.  I ended up with about 15 tasks and after the first few were returned and a validation was achieved, I decided everything was OK for the upgrade.  I suspended the remaining tasks and returned the current one when it finished.  I then copied the entire BOINC tree to an external USB drive.

Rather than fiddle around with the old OS install, I wiped the disk and did a fresh install of everything needed for an RX 570.  I have a couple of scripts that automate the process so it's quicker and less error prone to do that.  I restored the BOINC tree and fired it up to be greeted with the "... missing NVIDIA GPU ..." message.  So, I too had overlooked the different plan classes that the tasks belong to.  I had installed a copy of the ati opencl app but didn't remember the plan class info in the state file.  Rather than abort them, I decided to experiment with something that I suspected would work.

My thinking was that an update would allow the scheduler to see the GPU change.  So if the scheduler also saw that the existing tasks had become "lost", it should send me new copies but with the updated plan class.  To test this, I deleted just the <result> ... </result> blocks for each remaining task and then did the update.  The scheduler obliged with fresh copies, all with the correct plan class.  Crunching then resumed without further complaint about the missing GPU.
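The hand edit described above could be sketched in Python roughly as follows. This is an illustration under assumptions, not the procedure actually used: the sample text is a made-up fragment rather than a real state file, the task names and plan class value are hypothetical, and it assumes each `<result>` block carries a `<name>` element. It would only make sense with the BOINC client stopped and a backup of the BOINC tree in hand.

```python
import re

# One whole <result> ... </result> block, plus its line ending.
# "<result>" is matched exactly, so other tags that merely start
# with "result" are left alone.
RESULT_BLOCK = re.compile(r"[ \t]*<result>.*?</result>[ \t]*\n?", re.DOTALL)

def drop_result_blocks(state_xml, names_to_drop):
    """Return state_xml without the <result> blocks whose <name> is listed."""
    def keep_or_drop(match):
        block = match.group(0)
        name = re.search(r"<name>(.*?)</name>", block)
        if name and name.group(1).strip() in names_to_drop:
            return ""      # remove this task's result block
        return block       # leave every other block untouched
    return RESULT_BLOCK.sub(keep_or_drop, state_xml)

# Hypothetical fragment, for illustration only.
sample = """<client_state>
<workunit>
    <name>wu_a</name>
</workunit>
<result>
    <name>task_a_0</name>
    <plan_class>hypothetical-nvidia-class</plan_class>
</result>
<result>
    <name>task_b_0</name>
    <plan_class>hypothetical-nvidia-class</plan_class>
</result>
</client_state>
"""

cleaned = drop_result_blocks(sample, {"task_a_0"})
print(cleaned)
```

Only the listed task's block is removed; the workunit entry and the other result stay in place, which matches the idea of letting the scheduler treat the deleted tasks as lost and resend them.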

 

Cheers,
Gary.

kb9skw
Joined: 25 Feb 05
Posts: 21
Credit: 376409819
RAC: 8215

https://einsteinathome.org/host/12765822

I have the same RX 570 in this box along with an MSI Armor. The XFX's core clock is 1286 MHz vs. 1268 MHz on the MSI, and its RAM is clocked at 1950 MHz instead of the MSI's 1750 MHz. It's easy to see the difference in the times returned: the XFX takes ~690 seconds and the MSI ~720 seconds, or ~1220 and ~1275 seconds respectively with two tasks per GPU going.
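To compare 1x and 2x figures like these, divide the elapsed time by the number of tasks running at once to get an effective time per task. A minimal Python sketch using the approximate times quoted above (the function name is illustrative):

```python
def effective_seconds_per_task(elapsed_s, concurrent):
    """Wall-clock seconds per finished task when several run side by side."""
    return elapsed_s / concurrent

# (seconds at 1x, seconds at 2x) for each card, as quoted above
timings = {"XFX": (690, 1220), "MSI": (720, 1275)}

for card, (one_up, two_up) in timings.items():
    eff = effective_seconds_per_task(two_up, 2)
    print(f"{card}: {one_up} s at 1x vs {eff:.1f} s per task at 2x "
          f"({one_up / eff:.2f}x throughput)")
```

On these numbers both cards gain about 13% throughput from running two tasks at a time.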

You should have no issue clocking the GPU at 1300 MHz and the RAM at 2000 MHz. Mine ran at that, but I seem to have an issue setting clock speeds on both cards to different frequencies. The MSI's RAM is not going to do 2000 MHz.

I'm still enjoying the ~380 seconds (~700 seconds 2x) with the R9 Fury, though at a bit more power usage than the 120 watts the RX 570 uses. 

mmonnin
Joined: 29 May 16
Posts: 291
Credit: 3412756540
RAC: 3456599

The RX series (as well as VEGA really) have been great E@H cards for the upfront cost and power consumption. I'm ignoring the inflated GPU cost during the BTC mining period.

ARCHAE86, how many concurrent tasks are now running on Computer 10659288 with the 570? I see run times of around 2.2k seconds. Is that at 4x? The CPU time is also nearly identical to the run time, which seems abnormal for an OpenCL app on an AMD card in Windows. CPU time on KB9SKW's tasks is around 18% of run time, and on mine it is a bit over 10% on an XFX 580 in Win7.

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7226058262
RAC: 1064729

MMONNIN, the 2200 second run times were logged on GTX 1050 tasks returned a day or more ago.  I am currently running 2X on the RX 570.  That is currently yielding elapsed times of about 20:32.  You'll see these as 1232 second elapsed times with 166 second CPU times.  So if you move away from the Nvidia-processed tasks you'll see something less surprising to you.

When I was running at default conditions (and my wife was not playing solitaire), the 2X elapsed times were 20:15.  I've now specified a -40% power limitation using MSI Afterburner, which has given a box-level power reduction from 197.9 to 178.1 watts, or 10%, while suffering an Einstein output reduction of only 1.5%.  That power reduction, plus use of my old fan curve in MSI Afterburner, got the reported GPU temperature down from 82C to 69C, with moderately low fan noise.
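The trade described here, a little output given up for a larger power saving, can be expressed as a change in output per watt. A short Python sketch using the quoted box-level figures (the function name is illustrative):

```python
def power_limit_tradeoff(watts_before, watts_after, output_loss):
    """Return (fraction of box power saved, output-per-watt ratio vs. default)."""
    power_ratio = watts_after / watts_before
    power_saving = 1 - power_ratio
    efficiency_gain = (1 - output_loss) / power_ratio
    return power_saving, efficiency_gain

# 197.9 W -> 178.1 W at the box, with a 1.5% loss of Einstein output
saving, gain = power_limit_tradeoff(197.9, 178.1, 0.015)
print(f"box power saved: {saving:.1%}")
print(f"output per watt: {gain:.3f}x the default setting")
```

With these inputs the box delivers roughly 9% more output per watt than at default settings.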

On a credit/day/watt basis (at the box level), my RX 570 box now scores 2725, second only to my RTX 2080 + GTX 1070 box (also operating under aggressive power limitation) which scores 3136, and has the advantage of amortizing the non-GPU power consumption over a considerably higher total output.

On this same power productivity measure, my revised system scored an abysmal 1326 when it was running the GTX 1050 alone.  This does not really mean the 1050 itself is terribly power-inefficient, but partly that it did not have enough output to amortize the base box consumption (of about 50 watts) effectively.

My third box still runs a GTX 1050 + GTX 1060 3GB.  With my newly reached power limitation results on the 570 box, that box burns slightly more power (183 vs. 178) and has considerably lower indicated Einstein credit rate for the current work type (360k vs. 485k).

My trigger finger is itchy to buy and install a second 570 card, ditching the 1050 + 1060 3GB.  My caution suggests waiting a bit for the other shoe to drop.  My gratitude to Gary Roberts grows.

 

mmonnin
Joined: 29 May 16
Posts: 291
Credit: 3412756540
RAC: 3456599

Gotcha, I flipped through a couple of tabs but I guess not far enough through the pages to get to the new tasks.

Edit: Is this tempting? XFX 570 for $140 after $20 MIR.

https://www.newegg.com/Product/Product.aspx?Item=N82E16814150815

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117691859178
RAC: 35064147

archae86 wrote:
My trigger finger is itchy to buy and install a second 570 card, ditching the 1050 + 1060 3GB.

I'm really pleased (and very relieved) that this is working out for you.  I always worry that what I find works for me may not work well for others - particularly when my preference for AMD is not supported by 'hands on' experience with competing nvidia products.

I don't know what the 2nd hand market is like in your neck of the woods, but would it be possible to get enough for the 1060 + 1050 to pay for a RX 570 - particularly if you could get one for $140?

Cheers,
Gary.

kb9skw
Joined: 25 Feb 05
Posts: 21
Credit: 376409819
RAC: 8215

I bought the MSI Armor RX 570 on eBay for $89.50 shipped. If you are not worried about a warranty, keep an eye out there; they are cheap if you hit on the right day.

cecht
Joined: 7 Mar 18
Posts: 1535
Credit: 2909342077
RAC: 2118157

Thanks to MMONNIN's link to that XFX RX 570 sale on Newegg, with its fancy description of features on the Overview tab, I just discovered what the dual BIOS switch does on my XFX RX 570 card. I knew about the dual BIOS, but didn't realize that the card came with a mining BIOS already loaded; I assumed that the switch just gave cryptominers the option to safely flash their own BIOSes.

I've been running that card with the BIOS switch in the default gaming position (toward rear of card), which gives a top core clock of 1286 MHz, a top memory clock of 1750 MHz, and a power limit of 125 W. I switched to the mining BIOS (shut down, *flip*, reboot), and now see a top core clock of 1100 MHz, a top memory clock of 1850 MHz, and a power limit of 120 W.  Without tweaking any amdgpu settings, while running E@H, GPU power hasn't gone above 88 W and temps are 74 C.  And task times? While running at 2X tasks, individual task times are ~582 s with the mining BIOS vs. ~606 s with the gaming BIOS. (For those gaming BIOS times, I had the card power capped at 97 W, or -22%; when paired with an RX 460 in the same host, using the power cap paradoxically gave me faster task times, and cooler temps, than if the 570 were run at full speed.)

Faster crunch times at lower power without using a utility to tweak card settings is a sweet deal for a simple flip of the switch. Do all RX 570 cards come with dual BIOS loaded, or is that an XFX thing?

I was a bit surprised to see both BIOSs using amdgpu's 0-3D_FULL_SCREEN performance mode. Performance mode is set to AUTO, so I guess running E@H doesn't trigger the COMPUTE performance mode.

Ideas are not fixed, nor should they be; we live in model-dependent reality.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117691859178
RAC: 35064147

cecht wrote:
Faster crunch times at lower power without using a utility to tweak card settings is a sweet deal for a simple flip of the switch. Do all RX 570 cards come with dual BIOS loaded, or is that an XFX thing?

Wow, that sure is a nice feature with a nice result! :-).  I have 570s from MSI, Sapphire, Gigabyte, and 580s from Asus and none of those have any mention of a switchable BIOS feature.  Maybe it is just an XFX thing.  Maybe with the slump in the mining boom, they need to get rid of excess inventory having that feature, while they can :-).

cecht wrote:
I was a bit surprised to see both BIOSs using amdgpu's 0-3D_FULL_SCREEN performance mode. Performance mode is set to AUTO, so I guess running E@H doesn't trigger the COMPUTE performance mode.

Maybe, with AUTO, the Einstein app would need to offer some sort of signal to trigger a mode switch.  I don't imagine the Einstein app is fancy enough to do that.  Can you set that COMPUTE mode and, if so, does it make any difference?

 

Cheers,
Gary.
