NVIDIA/AMD question

Anonymous
Topic 197667

I have 3 boxes crunching E@H. Two employ Nvidia cards and the other an AMD. The cards are all pretty much equal, but it seems to me that the NVIDIA cards run a lot hotter than the AMD does. Is this what others experience?

Phil
Phil
Joined: 8 Jun 14
Posts: 726
Credit: 266026010
RAC: 2026187

NVIDIA/AMD question

I've run both Nvidia and AMD. I chose to stick with Nvidia for several reasons, but not heat. My Nvidia cards always seemed to run cooler. My hottest card is currently running 58C, and other cards are running around 50C.

Phil

Edit: Cards vary widely in their cooling systems, regardless of chipset. Two electronically similar cards may have very different cooling. This is something I've been researching for future purchases. The chips may come from Nvidia/AMD, but the card layout and cooling system are up to the card manufacturer.

I thought I was wrong once, but I was mistaken.

Jeroen
Jeroen
Joined: 25 Nov 05
Posts: 379
Credit: 740030628
RAC: 0

One of the main factors is

One of the main factors is the factory fan profile setup in the firmware of the card. For example, my Gigabyte 7970 cards have an aggressive fan profile which will ramp the fan speed up to 100% if necessary to keep the load temperatures down. The Gigabyte card I have running currently has a fan speed of 81%, GPU load temperature of 65C, and ambient temperature of 31C. This is with the fan speed set to auto.

The MSI 7970 cards have a fan profile similar to NVIDIA cards that will allow the GPU temperature to exceed 80C under load. I have worked with many different NVIDIA cards and all of the cards I have worked with will exceed 80C under load with the factory fan profile. For these cards, I manually set the fan speed between 70 and 90% depending on ambient temperature.

The AMD R9-290x card will reach as high as 95C with the factory fan profile. This GPU will run much cooler under load with a fan speed set manually at 70% or greater.

Another factor is the TIM application under the heatsink. In some cases, the TIM application is poor and this will cause the GPU to run excessively hot under load.

Bill592
Bill592
Joined: 25 Feb 05
Posts: 786
Credit: 70825065
RAC: 0

Thanks Jeroen ! That is good

Thanks Jeroen! That is good info. I recently set my fan speed to Manual as it was running hotter than I would have liked on Auto.

Bill

MAGIC Quantum Mechanic
MAGIC Quantum M...
Joined: 18 Jan 05
Posts: 1909
Credit: 1440289478
RAC: 1225575

I use Precision X on 6 of

I use Precision X on 6 of my nVidia GeForce cards to check the temps and set the fan speeds, keeping them running in the upper 50s C on a hot summer day. At night, when the room temp drops, they run closer to 50C, and I leave the fan speeds the same (approx. 52%).

I can't use that program on my laptop's GeForce 610M, so its fan is set on auto. It gets up towards 90C on hot summer days and cools down to 84C at night, and it has been running 24/7 for the last 26 months (of course the battery didn't last that long, and I didn't waste money on a new one).

Anonymous

RE: One of the main factors

Quote:
One of the main factors is the factory fan profile setup in the firmware of the card. For example, my Gigabyte 7970 cards have an aggressive fan profile which will ramp the fan speed up to 100% if necessary to keep the load temperatures down. The Gigabyte card I have running currently has a fan speed of 81%, GPU load temperature of 65C, and ambient temperature of 31C. This is with the fan speed set to auto.

I run Linux on all of my crunchers and have added a fan slider to the nvidia GUI that allows me to adjust the fan speed. The slider can be enabled like this:
1. sudo nvidia-xconfig --cool-bits=4
2. Reboot.
3. Launch nvidia-settings from the command line or from the "Dash Home".
4. Click on "Thermal Settings" for your GPU.
5. Note the "Enable GPU Fan Settings" checkbox and the fan slider.
6. You will have to acknowledge that changing the fan speed manually may affect your warranty.

I currently run the fans on both a GeForce 760 and a 770 at about 73% to hold the temp around 65C. This setting, however, is not maintained across system reboots; you have to reset it by entering the GUI. The 73% setting is required for this time of the year and will be lowered during the fall/winter period.
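If you want to avoid re-entering the GUI after every reboot, nvidia-settings can also apply the fan setting from the command line. Below is a minimal sketch, assuming coolbits is already enabled and a single GPU; the exact attribute name varies by driver version (GPUTargetFanSpeed on newer drivers, GPUCurrentFanSpeed on older ones), so treat it as a starting point rather than a drop-in script:

#!/bin/sh
# Reapply a manual fan speed after reboot.
# Requires a running X session; point DISPLAY at it.
export DISPLAY=:0
# Switch GPU 0 from the automatic fan profile to manual control.
nvidia-settings -a "[gpu:0]/GPUFanControlState=1"
# Set fan 0 to 73% (use GPUCurrentFanSpeed instead on older drivers).
nvidia-settings -a "[fan:0]/GPUTargetFanSpeed=73"

Calling a script like this from a startup entry (or a cron @reboot job) would reapply the 73% setting automatically instead of having to revisit the GUI.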

[Edit] Jeroen,

I was looking at your Tahiti computer and noted that it is running 3 Tahiti cards and generating around 330,000 credits per day, or roughly 100,000 per card. I have a Pitcairn generating around 76,000/day. It is crunching 4 GPU WUs simultaneously. Do you think it should/could generate more credits? If so, what adjustments might I make?

Jeroen
Jeroen
Joined: 25 Nov 05
Posts: 379
Credit: 740030628
RAC: 0

RE: Thanks Jeroen ! That

Quote:

Thanks Jeroen! That is good info. I recently set my fan speed to Manual as it was running hotter than I would have liked on Auto.

Bill

No problem. In my experience, the higher fan speed and resulting lower GPU temperature helps with long term stability of the cards used for crunching. I prefer keeping the GPU temperature below 80C for 24/365 load.

Jeroen

Jeroen
Jeroen
Joined: 25 Nov 05
Posts: 379
Credit: 740030628
RAC: 0

RE: I run Linux on all of

Quote:

I run Linux on all of my crunchers and have added a fan slider to the nvidia GUI that allows me to adjust the fan speed. The slider can be enabled like this:

Thanks for sharing your experience and installation steps for NVIDIA fan control.

On the AMD side, I very much like the aticonfig tool. It makes fan speed adjustment and OC control simple, and I can also run it remotely through SSH, which is handy.
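For reference, these are the kinds of aticonfig invocations typically used for that; a sketch only, since exact option support depends on the fglrx driver version, and the adapter index 0 is an assumption:

# Read and set the fan speed (percent) on adapter 0.
DISPLAY=:0 aticonfig --pplib-cmd "get fanspeed 0"
DISPLAY=:0 aticonfig --pplib-cmd "set fanspeed 0 70"
# Show current core/memory clocks and the GPU temperature.
DISPLAY=:0 aticonfig --adapter=0 --odgc --odgt

The DISPLAY=:0 prefix is what makes this usable over SSH, since aticonfig talks to the running X server.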

Quote:

I was looking at your Tahiti computer and noted that it is running 3 Tahiti cards and generating around 330,000 credits per day, or roughly 100,000 per card. I have a Pitcairn generating around 76,000/day. It is crunching 4 GPU WUs simultaneously. Do you think it should/could generate more credits? If so, what adjustments might I make?

PCI-E bandwidth has a significant impact on runtime with the BRP5 Perseus Arm Survey Search. With the setup that I am running, I have my cards running at x16/x16/x8 3.0, and I also increased the bus frequency to 103 MHz so that each card has as much bandwidth as possible. Running three tasks per GPU, the one card connected at x8 has a runtime of approximately 8350 seconds, compared to approximately 6900 seconds for the cards connected at x16. This is a fairly significant difference between the link speeds. Increasing the PCI-E frequency works in some but not all cases. The 7970 cards handle an increase of a few MHz well, but the R9-290x does not.

For a single-GPU system, ideally the card will connect at x16 3.0. You can confirm this with lspci in Linux; the tool should report 8.0 GT/s for the slot the card is installed in. NVIDIA has a tool called bandwidthTest and AMD a tool called BufferBandwidth, both of which can measure PCI-E bandwidth in GB/s. At x16 3.0, you should see between 12 and 14 GB/s with these tools.
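
A rough sketch of how those two tools are usually invoked; the paths are assumptions, since bandwidthTest ships with the CUDA samples and BufferBandwidth with the AMD APP SDK samples, so you may need to build them first:

# NVIDIA: host<->device copy bandwidth; pinned memory gives the best-case figure.
./bandwidthTest --memory=pinned
# AMD: PCI-E read/write bandwidth for buffer transfers.
./BufferBandwidth

Compare the reported host-to-device and device-to-host numbers against the 12-14 GB/s expected at x16 3.0.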

Here is one way to pull the link speed by device ID with the GPU under load:

lspci -vv -s 01:00.0 | grep LnkSta:
Output: LnkSta: Speed 8GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
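The 01:00.0 in the example is the card's PCI bus ID; it varies per system and can usually be found with lspci | grep -i vga.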

I have also seen some performance differences from one kernel version to another. The best-performing and most stable configuration I have found to date is kernel 3.12 patched with the BFS scheduler and with a patch that allows building for newer CPUs. I run this kernel version with AMD driver 14.6.

Jeroen

Anonymous

RE: and I also increased

Quote:

and I also increased the bus frequency to 103 MHz so that each card has as much bandwidth as possible.

Thanks Jeroen. Most informative. I will have to look into increasing the bus frequency. I am assuming this is a BIOS change.

Titan
Titan
Joined: 29 Aug 13
Posts: 19
Credit: 25868802
RAC: 0

Robl, I found a minor

Robl, I found a minor improvement running a 102.8 MHz bclk with a x46 multiplier versus 100 MHz with a x47 multiplier on my 3570K and two R9 280Xs while working on the Perseus Arm tasks. (The bclk speed on Ivy Bridge also sets the clock speed of the PCI-E bus.) It was on the order of about a 1.5% reduction in average task time with 3 tasks per GPU. Simply raising the clock speed of the CPU to 4.7 GHz versus stock speeds also yielded improvements in task completion time.
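As a rough check on those numbers: 102.8 MHz x 46 is about 4.73 GHz core clock versus 100 MHz x 47 = 4.70 GHz, so the CPU speeds are nearly identical and the ~1.5% gain comes mostly from the PCI-E bus running at 102.8 MHz instead of 100 MHz.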

Sid
Sid
Joined: 17 Oct 10
Posts: 164
Credit: 980210578
RAC: 356861

RE: PCI-E bandwidth has a

Quote:

PCI-E bandwidth has a significant impact on runtime with the BRP5 Perseus Arm Survey Search. With the setup that I am running, I have my cards running at x16/x16/x8 3.0, and I also increased the bus frequency to 103 MHz so that each card has as much bandwidth as possible. Running three tasks per GPU, the one card connected at x8 has a runtime of approximately 8350 seconds, compared to approximately 6900 seconds for the cards connected at x16. This is a fairly significant difference between the link speeds. Increasing the PCI-E frequency works in some but not all cases. The 7970 cards handle an increase of a few MHz well, but the R9-290x does not.

Jeroen

Jeroen,

Could you please provide a bit more info regarding your setup?
You have mentioned 8350 seconds for PCI-E x8 and 6900 seconds for x16. However, it is not clear to me: are those times for the 7970 (280X) or for the 290X?
Thank you.
