GPU Problem

Pavel Hanak
Joined: 27 Jul 06
Posts: 9
Credit: 58,214,482
RAC: 265,544

@Alinator

I think you take this too seriously. I simply posted some of my experiences with GPU apps; that's all. If it helps some crunchers (or even the Einstein programmers), good. If it doesn't, what the hell.

Quote:
I suggest you turn off Hyperthreading and re-run your experiments again.

I tried disabling Hyper-threading in BIOS, but W7 wouldn't boot afterwards (good thing I make OS recovery points when I do wild experiments like these).

Quote:
My money is on that at least one of them is going to be brought to a virtual standstill when using all cores, regardless of which way you run the CPU.

No standstills happen. I lock the affinity only for GPU apps (of which I always run only one at a time); normal CPU apps are free to jump between CPU threads as they please. And yes, that includes the threads which are otherwise "reserved" for GPU apps.
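For anyone wanting to try the same thing: an affinity setting is just a bitmask over logical CPUs. A minimal sketch (pure Python; the core numbering is illustrative, since the HT sibling layout varies by CPU and OS):

```python
# Sketch: building a CPU affinity bitmask for a hyper-threaded CPU.
# Logical CPUs are numbered 0..N-1; with HT, two logical CPUs usually
# share one physical core (exact layout varies - illustrative only).

def affinity_mask(logical_cpus):
    """Build a bitmask selecting the given logical CPUs."""
    mask = 0
    for cpu in logical_cpus:
        mask |= 1 << cpu
    return mask

# Pin a GPU feeder process to logical CPU 7 (e.g. the last thread of a
# 4-core/8-thread i7), leaving the other seven threads to CPU tasks:
gpu_mask = affinity_mask([7])          # 0b10000000 == 0x80
cpu_mask = affinity_mask(range(0, 7))  # 0b01111111 == 0x7F

print(hex(gpu_mask), hex(cpu_mask))
```

On Windows such a mask can be applied with Task Manager's "Set affinity" dialog or `start /affinity 0x80 app.exe`; on Linux, `os.sched_setaffinity` takes the set of CPU numbers directly.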

Quote:
The other bone I have to pick is about your conversation with Jake over at MW regarding the BOINC scheduler. ...

I won't dispute that, because I don't know. I'm a fan of BOINC, but not that big a fan. I really don't care what is done by BOINC and what by the OS or whatever. The only time I'm willing to spend effort and delve deeper is when something doesn't work properly (or when I think it doesn't). Because in the end, BOINC is about getting as much science done as fast and reliably as possible. Which is exactly why I bothered to write here and on the Milkyway forums in the first place.

Quote:
As far as 'low' GPU utilization for EAH apps goes, who are you to say the app has a problem? ...

Like I said, I judged that from my previous experience. Can I be wrong in the case of the Einstein GPU app? Of course I can. Then again, other people here have also noticed that GPU utilization suddenly dropped after the new Einstein app was released. That is all.

Chris
Joined: 9 Apr 12
Posts: 61
Credit: 45,056,670
RAC: 0

Ok, a question on all of this.

Running Windows with 2 cores, the GPU task uses ~60% of the CPU power. I've never seen a GPU app use more than 10% in Task Manager.

Why would I be bottlenecked going to 3 CPU cores? That settles in at around 85%; I'd think you'd be able to keep the GPU fed if you have that much power left over.
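One thing worth remembering when reading those numbers: Task Manager reports a process's CPU% relative to *all* logical CPUs combined, so a single feeder thread can never show much. A quick sanity check (the thread-load figures are made-up examples):

```python
# Why a GPU app never shows a high CPU% in Task Manager:
# the percentage is measured against ALL logical CPUs combined.

def task_manager_percent(busy_threads, thread_load, logical_cpus):
    """CPU% as Task Manager would report it for one process."""
    return 100.0 * busy_threads * thread_load / logical_cpus

# One feeder thread that is ~40% busy on a 4-thread (2C/4T) CPU:
print(task_manager_percent(1, 0.4, 4))   # -> 10.0
# Even a fully busy feeder thread caps out at 25% on that machine:
print(task_manager_percent(1, 1.0, 4))   # -> 25.0
```

So a reading of 10% can still mean the feeder thread is working fairly hard on one logical CPU.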

Chris
Joined: 9 Apr 12
Posts: 61
Credit: 45,056,670
RAC: 0

And, as I said that, I accidentally ran all 4 cores (apparently 0 = all!).

4 cores, 1x GPU.

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6,537
Credit: 286,312,022
RAC: 103,610

FWIW: bear in mind that the bus from CPU to/from GPU can be limiting to ....
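To put a rough number on that: you can estimate how much of each crunch cycle goes to bus transfers from the data volume and link speed. A back-of-the-envelope sketch (the bandwidth and data sizes are illustrative assumptions, not measured values):

```python
# Rough estimate of the share of each crunch cycle spent on PCIe transfers.
# All numbers are illustrative assumptions, not measurements.

def transfer_fraction(bytes_per_cycle, compute_s, bus_gb_per_s):
    """Fraction of a cycle spent moving data over the CPU<->GPU bus."""
    transfer_s = bytes_per_cycle / (bus_gb_per_s * 1e9)
    return transfer_s / (transfer_s + compute_s)

# Say 256 MB shuttled per cycle, 2 s of GPU compute, and a PCIe 2.0
# x16 link at roughly 8 GB/s:
frac = transfer_fraction(256e6, 2.0, 8.0)
print(f"{frac:.1%} of the cycle is bus transfer")
```

With numbers like these the bus is a minor cost, but on a slow slot (x4, or PCIe 1.x) or with larger transfers the fraction grows quickly.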

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

wal
Joined: 31 Mar 11
Posts: 5
Credit: 19,077,368
RAC: 0

Hello all,
I've been crunching e@h for over 2 years now. For this I have a dedicated computer built with
- i7 processor
- 8 GByte memory
- GTX590 (3 GByte memory)
- Win7 64-bit

All this time, BRP4 tasks needed about 1 hour to complete on the GPU; now a task needs 9-15 hours.

Einstein@Home 1.33 Binary Radio Pulsar Search (Perseus Arm Survey) (BRP4cuda32nv301) PA0077_00151_48_4 15:26:04 (03:54:02) 0.2C+0.33NV (d1) 25.27 Reported: OK * wal-seti1

Nothing has changed on the machine.
I don't know what's going wrong.
Other GPU tasks (seti@home etc.) have normal/known runtimes.

Does somebody have an idea?

Thanks, juergen

Chris
Joined: 9 Apr 12
Posts: 61
Credit: 45,056,670
RAC: 0

Quote:

Hello all,
I've been crunching e@h for over 2 years now. For this I have a dedicated computer built with
- i7 processor
- 8 GByte memory
- GTX590 (3 GByte memory)
- Win7 64-bit

All this time, BRP4 tasks needed about 1 hour to complete on the GPU; now a task needs 9-15 hours.

Einstein@Home 1.33 Binary Radio Pulsar Search (Perseus Arm Survey) (BRP4cuda32nv301) PA0077_00151_48_4 15:26:04 (03:54:02) 0.2C+0.33NV (d1) 25.27 Reported: OK * wal-seti1

Nothing has changed on the machine.
I don't know what's going wrong.
Other GPU tasks (seti@home etc.) have normal/known runtimes.

Does somebody have an idea?

Thanks, juergen

Well, you've seen that BRP4 tasks are now CPU-only, and GPUs are now getting the much longer BRP5 tasks. They were 10x longer, but are now about 6x longer than the old BRP4.

FalconFly
Joined: 16 Feb 05
Posts: 191
Credit: 15,650,710
RAC: 0

I see some folks are concerned that their GPUs aren't working at or near 100% load all the time.

Please note that this is technically near impossible: even highly optimized GPU code (on BOINC projects) can't load the GPU to anywhere near 100% all the time, if at all.

It has to do with the need to prepare the data in chunks for the GPU, and even during GPU crunch phases the code will likely never load all of your GPU to anywhere near 100% (GPGPU code uses only a fraction of your GPU's functions, and the level of optimization has natural limits). Plus it needs data on- and offload phases.

So seeing something like 70-80% peak load is perfectly normal (and from a GPU-code perspective actually pretty good). Lower-load periods are typical for the times when results are transferred from the GPU back to RAM/HD and fresh chunks of data are being sent to the GPU; therefore GPU crunching is always something like a burst mode, with more or less idle time in between.
Running more than 1 task should reduce most of that intermittent idle time.
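That burst pattern can be sketched as a toy model: with one task the GPU idles during every transfer phase, while a second task's compute can overlap the first task's transfers (the timings below are made-up examples, not measurements):

```python
# Toy model of the burst pattern described above.
# transfer_s: CPU-side prep + data up/download per chunk
# compute_s : GPU kernel time per chunk

def utilization(n_tasks, transfer_s, compute_s):
    """Approximate GPU busy fraction with n staggered tasks.

    With n tasks, up to n chunks of compute are available to hide
    each transfer gap; the busy time can't exceed the whole cycle.
    """
    cycle = transfer_s + compute_s
    busy = min(n_tasks * compute_s, cycle)
    return busy / cycle

print(f"1 task : {utilization(1, 1.0, 3.0):.0%}")  # -> 75%
print(f"2 tasks: {utilization(2, 1.0, 3.0):.0%}")  # -> 100%
```

Reality is messier (transfers and kernels contend for the bus and scheduler), but it shows why a second task soaks up most of the idle gaps.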

Aside from that, your video card wouldn't last long under 100% maximum load. Remember these are consumer devices designed for typical use, not permanent full performance. As it stands, they require very good cooling if you expect them to last longer than 6-12 months under this stress (the GPU itself will likely live without problems; the problematic parts are mainly the capacitors and voltage regulator modules on the card, which have temperature-dependent lifespans).
Cooling is your video card's best friend here, and that means the entire card with all its components (don't be lulled into false safety by judging from the GPU temperature alone; the GPU itself is the component least likely to fail on a video card).

--------------
I had the same issue: GPU tasks took forever (>24 hrs) with extremely low GPU loads.
I had to set the GPU utilization factor of BRP apps from 1.0 (default) to 0.5. After the finished GPU tasks were reported (apparently manually updating project preferences is not sufficient to make BOINC adopt this setting), the GPU started picking up 2 tasks and now reserves 2 x 0.5 = 1 CPU core for the duty.

It has worked perfectly ever since, and the CPU core not used for CPU tasks is kept pretty busy feeding the 2 GPU tasks. At least that was the solution for me. I haven't tried setting the factor down to 0.25 to get 4 GPU tasks running.
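For reference, the same two-tasks-per-GPU setup can also be configured locally with BOINC's `app_config.xml` in the project's data folder, which takes effect on a client restart or "re-read config files" rather than waiting for reported tasks. A sketch; the app name is an assumption, so take the exact name from your client's event log or `client_state.xml`:

```xml
<!-- app_config.xml in the Einstein@Home project folder.
     gpu_usage 0.5 = two tasks share one GPU; cpu_usage 0.5 each
     reserves 2 x 0.5 = 1 CPU core for feeding them. -->
<app_config>
  <app>
    <name>einsteinbinary_BRP5</name>  <!-- assumed app name; verify locally -->
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.5</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```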

When it comes to crunch times, I can only imagine that different workunits may create a certain spread of runtimes (at least that's what I'm observing, although part of it likely comes from my using the computer while GPU tasks keep running).

This is peak load with my HD7970 (2 GPU tasks running):

TJ
Joined: 11 Feb 05
Posts: 178
Credit: 21,041,858
RAC: 0

Good post, but I have two remarks.
At GPUGRID (nVidia only) I have seen GPU loads up to 94.7%, so it depends on the programmer.
Secondly, high-end cards, especially the factory-OC ones, have high-grade components; they are better than ordinary consumer cards and will last longer. But indeed not at 24/7 full load.

Greetings from
TJ

FalconFly
Joined: 16 Feb 05
Posts: 191
Credit: 15,650,710
RAC: 0

Agreed on the code part; 94.7% GPU load is extremely good (assuming it's also efficient, which is a whole different story).

About those factory-OC cards, I'd be careful. Although the advertising definitely suggests otherwise, they're still built by the lowest bidder in China ;)
Plus, those by default run closer to redline limits than normal cards. I wouldn't bet the house on overclocked components... that's all I'm sayin'.
