PCIe 40x, 16x, what next?

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 539727516
RAC: 150375

RE: The benefit of lower

Quote:
The benefit of lower gpu utilization is lower power/heat.


That's a very inefficient way to throttle power usage, because you're still running the GPU in "full throttle" mode (it doesn't know you want it throttled, but assumes you want the task finished as quickly as possible). And the calculation breaks down when the pauses in which the single WU has to wait for something are too short for the GPU to power down.

If you load your GPUs higher with multiple WUs, but compensate by reducing the power target (this should also be possible for AMDs), they will reduce clock speed and voltage. The latter gives you better energy efficiency.
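A rough sketch of why the voltage drop matters, using the usual first-order approximation that CMOS dynamic power scales with V² · f. The scale factors below are made-up illustration values, not measurements from any card in this thread, and static (leakage) power is ignored.

```python
def relative_dynamic_power(freq_scale: float, volt_scale: float) -> float:
    """First-order CMOS model: dynamic power is proportional to V^2 * f."""
    return volt_scale ** 2 * freq_scale


def relative_efficiency(freq_scale: float, volt_scale: float) -> float:
    """Work per joule relative to stock, assuming throughput scales with clock."""
    return freq_scale / relative_dynamic_power(freq_scale, volt_scale)


# Hypothetical operating point: clock and voltage both 10% below stock.
power = relative_dynamic_power(0.90, 0.90)
efficiency = relative_efficiency(0.90, 0.90)
print(f"power: {power:.3f}x stock, efficiency: {efficiency:.3f}x stock")

# Lowering the clock alone (voltage unchanged) saves power but not energy per task.
clock_only = relative_efficiency(0.90, 1.00)
print(f"clock-only efficiency: {clock_only:.3f}x stock")
```

In this simple model the efficiency gain comes entirely from the voltage term, which matches the point above that the voltage reduction is what improves energy efficiency.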

This is independent of "single vs. triple GPU". And I wrote it because I think your config with just 1 WU per GPU exaggerates the PCIe differences (due to the worse load balancing / micro pauses already mentioned).

MrS

Scanning for our furry friends since Jan 2002

disturber
Joined: 26 Oct 14
Posts: 30
Credit: 57155818
RAC: 0

RE: BTW: the top hosts

Quote:

BTW: the top hosts with 2 Tahitis (smaller than your Hawaiis) achieve about 120k RAC per GPU, using i7 4770K as hosts, i.e. with 8x PCIe 3 or with a PLX.

MrS

How does one achieve those outputs for the AMD cards? I just bought a Gigabyte 7970 OC; the GPU clock is 1000 MHz. The host is a 2500K running at 4.3 GHz. The slot is 16x PCIe 2. I am running 2 Perseus Arm WUs and each finishes in 90 minutes. Is there anything I can do to bring up the output from 87k (calculated based on elapsed time and credit)? I ran Arecibo tasks before and I was getting even lower RAC (78k). Is PCIe 2 limiting me, or running Windows? I rolled back the latest driver to an earlier version.

http://einsteinathome.org/host/11685226
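For anyone wanting to repeat the "calculated based on elapsed time and credit" estimate, here is a minimal sketch. The function names are hypothetical, and the credit per task is not stated in this thread: the value below is only what the 87k figure implies, not an official number. RAC itself is a decaying average, so a host only approaches this steady-state figure after running unchanged for a while.

```python
SECONDS_PER_DAY = 86_400


def tasks_per_day(concurrent_tasks: int, elapsed_minutes: float) -> float:
    """Tasks finished per day when `concurrent_tasks` run side by side on one GPU."""
    return concurrent_tasks * SECONDS_PER_DAY / (elapsed_minutes * 60)


def credit_per_day(concurrent_tasks: int, elapsed_minutes: float,
                   credit_per_task: float) -> float:
    """Steady-state daily credit for one GPU."""
    return tasks_per_day(concurrent_tasks, elapsed_minutes) * credit_per_task


# Figures from the post: 2 tasks at a time, 90 minutes each, about 87k per day.
daily_tasks = tasks_per_day(2, 90)
implied_credit_per_task = 87_000 / daily_tasks  # back-calculated, an assumption

print(f"{daily_tasks:.0f} tasks/day, about {implied_credit_per_task:.0f} credits each")
print(f"check: {credit_per_day(2, 90, implied_credit_per_task):.0f} credits/day")
```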

woohoo
Joined: 28 Jul 14
Posts: 20
Credit: 352552543
RAC: 0

If the goal was efficiency, I

If the goal was efficiency, I don't think I would have got the most power hungry gpus around. Heat is somewhat important as too much causes failures. But let's look at some numbers:

with one wu my system is using 540w, the 295x2 runs at 52c and the 290x runs at 68c

with two wu the system uses 585w, the 295x2 runs at 53c and the 290x runs at 70c. the power increase isn't very large, and some of that would be due to the cpu working about 20% harder to feed double the gpu wus.
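A quick sketch of what those two wattage readings mean for efficiency. Throughput figures are not given here, so this only computes the break-even point: how much more work the second wu has to deliver before credit per watt improves. The function name is just for illustration.

```python
def break_even_throughput_gain(power_before_w: float, power_after_w: float) -> float:
    """Fractional throughput gain needed so that work per watt stays the same."""
    return power_after_w / power_before_w - 1.0


# Wall power from the post: 540 W with one wu, 585 W with two wu.
gain_needed = break_even_throughput_gain(540, 585)
print(f"two wu pay off per watt if throughput rises by more than {gain_needed:.1%}")
```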

so i'm not one to tell anyone how to run their stuff, but this is my first time running a video card cooled by a fish pump and reading horror stories about pump failures, leaks, and automatic throttling down due to overheating has led me to be just a little bit conservative considering the coin i just dropped. none of the top computers run more than two gpus on a 16 lane platform so let's just say i was trying to see how much blood i could squeeze from a rock. my 295x2 only has 85% gpu utilization with two wus so it could be pushed harder as 53c is not a lot of heat.

woohoo
Joined: 28 Jul 14
Posts: 20
Credit: 352552543
RAC: 0

my lazy math says you should

my lazy math says you should be getting more than 100k on that. pcie2 is slower than pcie3 but you could try going to 3 or 4 wu

disturber
Joined: 26 Oct 14
Posts: 30
Credit: 57155818
RAC: 0

RE: my lazy math says you

Quote:
my lazy math says you should be getting more than 100k on that. pcie2 is slower than pcie3 but you could try going to 3 or 4 wu

I will try that now. My version of the card is a terrible overclocker. After raising the mem clock to 1025 and the gpu clock to 1050, it spat out nothing but 0-second workunits that failed with a compute error. My Nvidia cards behave differently: they crunch the whole wu and then come up with errors.

woohoo
Joined: 28 Jul 14
Posts: 20
Credit: 352552543
RAC: 0

If it were me I wouldn't

If it were me I wouldn't worry so much about trying to overclock. Some will even downclock to stock if it reduces errors. I would just use gpu-z to check average temperature and gpu usage.

so on my 290x it was 81% gpu usage on one wu and 97% on two wu so going to three wu might not help much more

but on my 295x2 which is bottlenecked more on the pcie it's 75% on one wu and 85% on two wu so going to three or four wu might help, but at the same time i only have four cpu cores so i have to keep an eye on that too

disturber
Joined: 26 Oct 14
Posts: 30
Credit: 57155818
RAC: 0

I am going to go with 3

I am going to go with 3 Perseus Arm wu at a time. Because of the performance penalty of PCIe 2, the gpu was only loaded to 84% with 2 wu. With 3 wu this increased to 93% and gave me 9% more calculated RAC. This is not a lot, but it does put me right at the 100k RAC. I have two cpu processes running; reducing to 1 did not seem to have any effect on the time.
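A small sketch of how the per-task time relates to a throughput gain when changing the number of concurrent wu. It assumes the earlier baseline of 2 wu at 90 minutes each still applies; the function name is hypothetical.

```python
def elapsed_for_gain(base_concurrent: int, base_elapsed_min: float,
                     new_concurrent: int, throughput_gain: float) -> float:
    """Per-task elapsed time that yields `throughput_gain` at the new concurrency."""
    base_rate = base_concurrent / base_elapsed_min   # tasks per minute before
    new_rate = base_rate * (1.0 + throughput_gain)   # tasks per minute after
    return new_concurrent / new_rate


# Assumed baseline: 2 wu at 90 minutes each. New setup: 3 wu at a time.
break_even_min = elapsed_for_gain(2, 90, 3, 0.0)    # same throughput as before
with_9_percent = elapsed_for_gain(2, 90, 3, 0.09)   # the 9% gain mentioned above

print(f"break-even: {break_even_min:.0f} min per task")
print(f"9% gain: about {with_9_percent:.0f} min per task")
```

So with 3 wu, anything under roughly 135 minutes per task is a net gain over the assumed baseline.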

woohoo
Joined: 28 Jul 14
Posts: 20
Credit: 352552543
RAC: 0

I'm going to run just one wu

I'm going to run just one wu at a time because for some reason I have a lot of invalids and I never had that problem before. Or maybe the driver is the problem.

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7056664931
RAC: 1605069

woohoo wrote:I'm going to run

woohoo wrote:
I'm going to run just one wu at a time because for some reason I have a lot of invalids

Have you tried turning the clock rate down? (core or memory or both?)

I noticed that on both your hosts the majority of the Perseus jobs listed in the task list with "validate error" show this outcome on the task page: Validate error (58:00111010)

That specific outcome first showed up on my GTX 970 during core clock overclocking experiments today, never having shown up at all in months of operation of five cards on three hosts. Of course the similarity could be a coincidence, but turning down the clock(s) would be quickly diagnostic.

Even if you think yourself not overclocked this might be worth a try. I currently have two of my five cards (both are GTX 660s, as it happens) slightly underclocked.

woohoo
Joined: 28 Jul 14
Posts: 20
Credit: 352552543
RAC: 0

using Nvidia cards in the

Using Nvidia cards in the past I've always been able to use Precision X to change clocks, but Catalyst isn't allowing my changes to stick, so I will stay with one wu per gpu to see if the invalids go away.
