NVIDIA/AMD question

Anonymous
Topic 197667

I have 3 boxes crunching E@H. Two employ Nvidia cards and the other an AMD. The cards are all pretty much equal, but it seems to me that the NVIDIA cards run a lot hotter than the AMD does. Is this what others experience?

Phil
Phil
Joined: 8 Jun 14
Posts: 726
Credit: 266026010
RAC: 2026187

NVIDIA/AMD question

I've run both Nvidia and AMD. I chose to stick with Nvidia for several reasons, but not heat. My Nvidia cards always seemed to run cooler. My hottest card is currently running 58C, and other cards are running around 50C.

Phil

Edit: Cards vary widely in their cooling systems, regardless of chipset. Two electronically similar cards may have very different cooling. This is something I've been researching for future purchases. The chips may come from Nvidia/AMD, but the card layout and cooling system are up to the card manufacturer.

I thought I was wrong once, but I was mistaken.

Jeroen
Jeroen
Joined: 25 Nov 05
Posts: 379
Credit: 740030628
RAC: 0

One of the main factors is

One of the main factors is the factory fan profile setup in the firmware of the card. For example, my Gigabyte 7970 cards have an aggressive fan profile which will ramp the fan speed up to 100% if necessary to keep the load temperatures down. The Gigabyte card I have running currently has a fan speed of 81%, GPU load temperature of 65C, and ambient temperature of 31C. This is with the fan speed set to auto.

The MSI 7970 cards have a fan profile similar to NVIDIA cards that will allow the GPU temperature to exceed 80C under load. I have worked with many different NVIDIA cards and all of the cards I have worked with will exceed 80C under load with the factory fan profile. For these cards, I manually set the fan speed between 70 and 90% depending on ambient temperature.

The AMD R9-290x card will reach as high as 95C with the factory fan profile. This GPU will run much cooler under load with a fan speed set manually at 70% or greater.

Another factor is the TIM application under the heatsink. In some cases, the TIM application is poor and this will cause the GPU to run excessively hot under load.

Bill592
Bill592
Joined: 25 Feb 05
Posts: 786
Credit: 70825065
RAC: 0

Thanks Jeroen ! That is good

Thanks Jeroen! That is good info. I recently set my fan speed to Manual as it was running hotter than I would have liked on Auto.

Bill

MAGIC Quantum Mechanic
MAGIC Quantum M...
Joined: 18 Jan 05
Posts: 1909
Credit: 1440289478
RAC: 1225575

I use Precision X on 6 of

I use Precision X on 6 of my nVidia GeForce cards to check the temps and set the fan speeds, keeping them running in the upper 50s C on a hot summer day. At night, when the room temp drops, they run closer to 50C, and I leave the fan speeds the same (approx. 52%).

I can't use that program on my laptop's GeForce 610M, so its fan is set on auto. It gets up towards 90C on hot summer days and cools down to 84C at night, and it has been running 24/7 for the last 26 months (of course the battery didn't last that long, and I didn't waste money on a new one).

Anonymous

RE: One of the main factors

Quote:
One of the main factors is the factory fan profile setup in the firmware of the card. For example, my Gigabyte 7970 cards have an aggressive fan profile which will ramp the fan speed up to 100% if necessary to keep the load temperatures down. The Gigabyte card I have running currently has a fan speed of 81%, GPU load temperature of 65C, and ambient temperature of 31C. This is with the fan speed set to auto.

I run Linux on all of my crunchers and have added a fan slider to the nvidia GUI that allows me to adjust the fan speed. The slider can be enabled like this:
1. sudo nvidia-xconfig --cool-bits=4
2. Reboot.
3. Launch nvidia-settings from the command line or from the "Dash Home".
4. Click on "Thermal Settings" for your GPU.
5. Note the "Enable GPU Fan Settings" checkbox and the fan slider.
6. You will have to acknowledge that changing the fan speed manually may affect your warranty.

I currently run the fans on both a GeForce 760 and a 770 at about 73% to hold the temp around 65C. This setting, however, is not maintained across system reboots; you have to reset it by entering the GUI. The 73% setting is required for this time of the year and will be lowered during the fall/winter period.
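If you want to avoid re-entering the GUI after every reboot, nvidia-settings can also apply the fan setting from the command line. Below is a minimal sketch, assuming coolbits is already enabled and a single GPU; the exact attribute name varies by driver version (GPUTargetFanSpeed on newer drivers, GPUCurrentFanSpeed on older ones), so treat it as a starting point rather than a drop-in script:

#!/bin/sh
# Reapply a manual fan speed after reboot.
# Requires a running X session; point DISPLAY at it.
export DISPLAY=:0
# Switch GPU 0 from the automatic fan profile to manual control.
nvidia-settings -a "[gpu:0]/GPUFanControlState=1"
# Set fan 0 to 73% (use GPUCurrentFanSpeed instead on older drivers).
nvidia-settings -a "[fan:0]/GPUTargetFanSpeed=73"

Calling a script like this from a startup entry (or a cron @reboot job) would reapply the 73% setting automatically instead of having to revisit the GUI.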

[Edit] Jeroen,

I was looking at your Tahiti computer and noted that it is running 3 Tahiti cards and generating around 330,000 credits per day, or roughly 100,000 per card. I have a Pitcairn generating around 76,000/day. It is crunching 4 GPU WUs simultaneously. Do you think it should/could generate more credits? If so, what adjustments might I make?

Jeroen
Jeroen
Joined: 25 Nov 05
Posts: 379
Credit: 740030628
RAC: 0

RE: Thanks Jeroen ! That

Quote:

Thanks Jeroen! That is good info. I recently set my fan speed to Manual as it was running hotter than I would have liked on Auto.

Bill

No problem. In my experience, the higher fan speed and resulting lower GPU temperature helps with long term stability of the cards used for crunching. I prefer keeping the GPU temperature below 80C for 24/365 load.

Jeroen

Jeroen
Jeroen
Joined: 25 Nov 05
Posts: 379
Credit: 740030628
RAC: 0

RE: I run Linux on all of

Quote:

I run Linux on all of my crunchers and have added a fan slider to the nvidia GUI that allows me to adjust the fan speed. The slider can be enabled like this:

Thanks for sharing your experience and installation steps for NVIDIA fan control.

On the AMD side, I very much like the aticonfig tool. It makes fan speed adjustment and OC control simple, and I can also run it remotely through SSH, which is handy.
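For reference, these are the kinds of aticonfig invocations typically used for that; a sketch only, since exact option support depends on the fglrx driver version, and the adapter index 0 is an assumption:

# Read and set the fan speed (percent) on adapter 0.
DISPLAY=:0 aticonfig --pplib-cmd "get fanspeed 0"
DISPLAY=:0 aticonfig --pplib-cmd "set fanspeed 0 70"
# Show current core/memory clocks and the GPU temperature.
DISPLAY=:0 aticonfig --adapter=0 --odgc --odgt

The DISPLAY=:0 prefix is what makes this usable over SSH, since aticonfig talks to the running X server.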

Quote:

I was looking at your Tahiti computer and noted that it is running 3 Tahiti cards and generating around 330,000 credits per day, or roughly 100,000 per card. I have a Pitcairn generating around 76,000/day. It is crunching 4 GPU WUs simultaneously. Do you think it should/could generate more credits? If so, what adjustments might I make?

PCI-E bandwidth has a significant impact on runtime with the BRP5 Perseus Arm Survey Search. With the setup that I am running, I have my cards running at x16/x16/x8 3.0, and I also increased the bus frequency to 103 MHz so that each card has as much bandwidth as possible. Running three tasks per GPU, the one card connected at x8 has a runtime of approximately 8350 seconds, compared to approximately 6900 seconds for the cards connected at x16. This is a fairly significant difference between the link speeds. Increasing the PCI-E frequency works in some but not all cases. The 7970 cards handle an increase of a few MHz well, but the R9-290x does not.

For a single-GPU system, ideally the card will connect at x16 3.0. You can confirm this with lspci in Linux; the tool should report 8.0 GT/s for the slot the card is installed in. NVIDIA has a tool called bandwidthTest and AMD a tool called BufferBandwidth, both of which can measure PCI-E bandwidth in GB/s. At x16 3.0, you should see between 12 and 14 GB/s with these tools.
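
A rough sketch of how those two tools are usually invoked; the paths are assumptions, since bandwidthTest ships with the CUDA samples and BufferBandwidth with the AMD APP SDK samples, so you may need to build them first:

# NVIDIA: host<->device copy bandwidth; pinned memory gives the best-case figure.
./bandwidthTest --memory=pinned
# AMD: PCI-E read/write bandwidth for buffer transfers.
./BufferBandwidth

Compare the reported host-to-device and device-to-host numbers against the 12-14 GB/s expected at x16 3.0.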

Here is one way to pull the link speed by device ID with the GPU under load:

lspci -vv -s 01:00.0 | grep LnkSta:
Output: LnkSta: Speed 8GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
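The 01:00.0 in the example is the card's PCI bus ID; it varies per system and can usually be found with lspci | grep -i vga.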

I have also seen some performance differences from one kernel version to another. The best-performing and most stable configuration I have found to date is kernel 3.12 patched with the BFS scheduler and with a patch that allows building for newer CPUs. I run this kernel version with AMD driver 14.6.

Jeroen

Anonymous

RE: and I also increased

Quote:

and I also increased the bus frequency to 103 MHz so that each card has as much bandwidth as possible.

Thanks Jeroen. Most informative. I will have to look into increasing the bus frequency. I am assuming this is a BIOS change.

Titan
Titan
Joined: 29 Aug 13
Posts: 19
Credit: 25868802
RAC: 0

Robl, I found a minor

Robl, I found a minor improvement running a 102.8 MHz bclk with a x46 multiplier versus 100 MHz with a x47 multiplier on my 3570K and two R9 280Xs while working on the Perseus Arm tasks. (The bclk speed on Ivy Bridge also sets the clock speed of the PCI-E bus.) It was on the order of about a 1.5% reduction in average task time with 3 tasks per GPU. Simply raising the clock speed of the CPU to 4.7 GHz versus stock speeds also yielded improvements in task completion time.
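As a rough check on those numbers: 102.8 MHz x 46 is about 4.73 GHz core clock versus 100 MHz x 47 = 4.70 GHz, so the CPU speeds are nearly identical and the ~1.5% gain comes mostly from the PCI-E bus running at 102.8 MHz instead of 100 MHz.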

Sid
Sid
Joined: 17 Oct 10
Posts: 164
Credit: 980210578
RAC: 356861

RE: PCI-E bandwidth has a

Quote:

PCI-E bandwidth has a significant impact on runtime with the BRP5 Perseus Arm Survey Search. With the setup that I am running, I have my cards running at x16/x16/x8 3.0, and I also increased the bus frequency to 103 MHz so that each card has as much bandwidth as possible. Running three tasks per GPU, the one card connected at x8 has a runtime of approximately 8350 seconds, compared to approximately 6900 seconds for the cards connected at x16. This is a fairly significant difference between the link speeds. Increasing the PCI-E frequency works in some but not all cases. The 7970 cards handle an increase of a few MHz well, but the R9-290x does not.

Jeroen

Jeroen,

Could you please provide a bit more info regarding your setup?
You have mentioned 8350 seconds for PCI-E x8 and 6900 seconds for x16. However, it is not clear to me: are those times for the 7970 (280X) or for the 290X?
Thank you.
