GPU utilization benchmark

Damaraland
Damaraland
Joined: 9 Feb 05
Posts: 24
Credit: 5691134
RAC: 0
Topic 196241

I tested GPU utilization factors of 1, 0.5, and 0.33 for the BRP apps: Binary Radio Pulsar Search (Arecibo) v1.23 (BRP4cuda32nv270).
My system:
i7-2600K CPU @ 3.40GHz, 2 GPUs, Linux 3.0.0-16-generic.
BOINC preference: "On multiprocessors, use at most 60% of the processors."
I'm running both Einstein@Home & POEM@home
NVIDIA GeForce GTX 560 Ti
NVIDIA GeForce GTX 260

Here are the results:
where s = standard deviation (very small, so the mean is a very good measure)
where secs/unit is the mean over all units processed
where equivalent secs/unit = measured secs/unit divided by the number of tasks per GPU
GPU utilization factor of BRP apps = 1 (1 task per GPU)

GPU          secs/unit    s   units tested
GTX 560 Ti        2136    3             11
GTX 260           2440    4             10

GPU utilization factor of BRP apps = 0.5 (2 tasks per GPU; GPU memory usage about 60%)

GPU          secs/unit    s   units tested   equivalent secs/unit
GTX 560 Ti        3830   15             15                   1915
GTX 260           4759    4             11                   2380

GPU utilization factor of BRP apps = 0.333 (3 tasks per GPU; GPU memory usage about 98%)

GPU          secs/unit    s   units tested   equivalent secs/unit
GTX 560 Ti        5163   27             28                   1721
GTX 260           6789    6             17                   2263

Conclusions:
GPU utilization factor of BRP apps = 0.5
Performance increase of 10.3% on the GTX 560 Ti
Performance increase of 2.5% on the GTX 260

GPU utilization factor of BRP apps = 0.333
Performance increase of 19.4% on the GTX 560 Ti
Performance increase of 7.2% on the GTX 260
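
For anyone who wants to check the arithmetic: the equivalent average is the measured secs/unit divided by the number of concurrent tasks per GPU, and the performance increase is measured against the single-task baseline. A minimal sketch in plain Python, using only the numbers from the tables above:

    # Equivalent per-unit time and gain vs. running 1 task per GPU.
    # The secs/unit figures are the measured means from the tables above.

    def equivalent(secs_per_unit, tasks_per_gpu):
        # per-unit time as if the concurrent tasks had finished one at a time
        return secs_per_unit / tasks_per_gpu

    def gain(baseline, equiv):
        # fractional speedup relative to the 1-task-per-GPU baseline
        return (baseline - equiv) / baseline

    # GTX 560 Ti: baseline 2136 secs/unit at factor 1
    print(gain(2136, equivalent(3830, 2)))   # factor 0.5   -> ~0.103 (10.3%)
    print(gain(2136, equivalent(5163, 3)))   # factor 0.333 -> ~0.194 (19.4%)

    # GTX 260: baseline 2440 secs/unit at factor 1
    print(gain(2440, equivalent(4759, 2)))   # factor 0.5   -> ~0.025 (2.5%)
    print(gain(2440, equivalent(6789, 3)))   # factor 0.333 -> ~0.0725 (7.2%)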

GPU temperatures are fine.

Sparrow
Sparrow
Joined: 4 Jul 11
Posts: 29
Credit: 10701417
RAC: 0

GPU utilization benchmark

My results with a GTX 560Ti are a bit different:

GPU utilization factor of BRP apps=1
secs/unit: 1709 (average from 22 WUs)

GPU utilization factor of BRP apps=0.5
secs/unit: 2786 (average from 3 WUs)
= 1393 secs/unit equivalent average
-> 18.5% performance increase

Interesting side note:
The variation in time needed per WU drops significantly when I'm running 2 WUs at a time. The three WUs took 2784, 2785, and 2789 seconds, and the WUs running right now will also take almost exactly this long.
The WUs I ran with only 1 WU at a time took between 1600 and 1758 seconds.

My system: i7-920 @ 3.31GHz, Linux 2.6.32-39-generic, 1 GPU

archae86
archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7271985066
RAC: 1812309

I had been running three GPU

I had been running three GPU jobs using app_info. After a long wait running down my queue of older work, I removed app_info and got a good start using the new mechanism with a count of 0.33, giving three GPU jobs.

But now I am trying to check out two-GPU-job behavior, and it does not seem to be listening. I've checked the location (a.k.a. "venue"), changed the location-specific "GPU utilization factor of BRP apps" from 0.33 to 0.5, and pressed the "update" button for Einstein. When the next units started up as a triple, I tried shutting down boincmgr and restarting. No joy. Shut down the computer, counted to thirty, and rebooted. Still three jobs. Did another user-requested update. Still three jobs. Although I believed that this is not a local preference item and that I had no local preference set, I nevertheless went into the boincmgr preferences dialog and commanded "clear". Still no joy (i.e. three GPU Einstein jobs).

Anyone else see some problem in changing the specified number of GPU jobs? Any secret sauce? Any idea what I most likely did wrong?

Logforme
Logforme
Joined: 13 Aug 10
Posts: 332
Credit: 1714373961
RAC: 0

RE: Anyone else see some

Quote:
Anyone else see some problem in changing the specified number of GPU jobs? Any secret sauce? Any idea what I most likely did wrong?


Had the exact same problem. If I understood the answer I got here correctly, only new workunits downloaded from the server are flagged with the new parameters.
So patience does reward you :)

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2980414040
RAC: 764983

RE: RE: Anyone else see

Quote:
Quote:
Anyone else see some problem in changing the specified number of GPU jobs? Any secret sauce? Any idea what I most likely did wrong?

Had the exact same problem. If I understood the answer I got here correctly, only new workunits downloaded from the server are flagged with the new parameters.
So patience does reward you :)


Not quite, but related. The value lives in the <app_version> section in client_state.xml, and that section is only re-transmitted from the server to the client when new work is allocated. But once transmitted, it applies to all work in the cache, old and new.
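
For reference, a sketch of where that value sits (element names from my reading of a typical client_state.xml; the app name here is the BRP4 one, and the exact layout on any given host may differ):

    <app_version>
        <app_name>einsteinbinary_BRP4</app_name>
        <version_num>123</version_num>
        ...
        <coproc>
            <type>CUDA</type>
            <count>0.5</count>
        </coproc>
    </app_version>

A count of 0.5 means each task claims half a GPU, i.e. two run per card.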

@ archae86 - which host are you talking about, and when did you try this? 3409259 last received work at 19 Mar 2012 0:46:24 UTC, 4234243 at 19 Mar 2012 12:20:35 UTC (unless there are later allocations lurking on later pages - I didn't check).

archae86
archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7271985066
RAC: 1812309

Richard Haselgrove wrote:Not

Richard Haselgrove wrote:

Not quite, but related. The value lives in the <app_version> section in client_state.xml, and that section is only re-transmitted from the server to the client when new work is allocated. But once transmitted, it applies to all work in the cache, old and new.

@ archae86 - which host are you talking about, and when did you try this? 3409259 last received work at 19 Mar 2012 0:46:24 UTC, 4234243 at 19 Mar 2012 12:20:35 UTC (unless there are later allocations lurking on later pages - I didn't check).

Thank you, yet again, Richard; now that answer sounds familiar--I must have failed to remember it from these pages.

The host I mean to experiment with is my newest, and it has recently been banging into the daily maximum (as has my other GPU-equipped host) as I seek to rebuild inventory after taking it to zero at the changeover. I think midnight UTC is just a little time in the future--I'll force an update then and expect to see compliance with my 2-GPU-job request.

As to experiments, the host under review had a spontaneous restart problem about a week ago. I suspected over-temperature, and reduced the maximum number of CPU jobs from four to two. I saw a larger than expected decrease in GPU job completion time--enough to wonder whether the 2 CPU/3 GPU case might actually be more productive than the 4/3 case, in addition to being lower power, cooler, quieter, and safer.

Maybe I'll know something about it in a couple of weeks.

Horacio
Horacio
Joined: 3 Oct 11
Posts: 205
Credit: 80557243
RAC: 0

RE: Not quite, but related.

Quote:
Not quite, but related. The value lives in the <app_version> section in client_state.xml, and that section is only re-transmitted from the server to the client when new work is allocated. But once transmitted, it applies to all work in the cache, old and new.

I have the same issue with one computer that I want to run only one WU per GPU... Since I changed the venue it has not received any new BRP WUs, but it has got some other tasks from Einstein, and client_state is still showing 0.5 instead of 1 in the CUDA count.

Do I have to wait until a new BRP is downloaded to get the count changed in client_state?

Richard Haselgrove
Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2980414040
RAC: 764983

RE: RE: Not quite, but

Quote:
Quote:
Not quite, but related. The value lives in the <app_version> section in client_state.xml, and that section is only re-transmitted from the server to the client when new work is allocated. But once transmitted, it applies to all work in the cache, old and new.

I have the same issue with one computer that I want to run only one WU per GPU... Since I changed the venue it has not received any new BRP WUs, but it has got some other tasks from Einstein, and client_state is still showing 0.5 instead of 1 in the CUDA count.

Do I have to wait until a new BRP is downloaded to get the count changed in client_state?


Provided you're comfortable with the general principles of editing client_state (shut BOINC completely, plain text editor, extreme care, at your own risk), I'd see nothing wrong in helping it along a bit.
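
For what it's worth, a minimal sketch of doing that edit with a script instead of by hand (the state-file path is the usual Linux package default and an assumption; the replace is deliberately blunt, so inspect the result before restarting BOINC):

    # Sketch: change the CUDA <count> in client_state.xml from 0.5 to 1.
    # Stop the BOINC client completely first; keep the backup this writes.
    import shutil

    STATE = "/var/lib/boinc-client/client_state.xml"  # assumed default path
    shutil.copy(STATE, STATE + ".bak")                # back up before editing

    with open(STATE) as f:
        text = f.read()

    # Blunt textual replace: touches every matching count in the file, and
    # some clients write the value as 0.500000, so check your file first.
    text = text.replace("<count>0.5</count>", "<count>1</count>")

    with open(STATE, "w") as f:
        f.write(text)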

archae86
archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7271985066
RAC: 1812309

RE: Should I have to wait

Quote:
Do I have to wait until a new BRP is downloaded to get the count changed in client_state?

To get the change automatically, yes.

In my case I was wrong about my quota exhaustion forcing me to wait until midnight UTC. While I distinctly recall that in the past at least some quota limits were released at that time, in this case it appears that the 80-per-day limit set a 24-hour timer from the moment it went into force. Only this morning was I able to get new work, and even after I'd been allocated new BRP units and they had started downloading, three GPU tasks continued executing, apparently switching to the desired two when the first (of a couple of dozen) BRP tasks finished downloading to my host. So now the experiment in varying the number of GPU and CPU tasks can continue.

While I'm mentioning the 80 limit, I think I saw another awkward consequence. As BRP work was recently much more routinely available than GW work for my CPU slots, and my host was pretty hungry after going to zero, there was a tendency to fill nearly all of the 80-unit quota with BRP, leaving me with a very small amount of pending CPU work when the quota gate closed, sometimes leaving my CPU idle later. I'd have guessed there to be separate maximum daily download quotas for CPU and GPU, but in this instance it appeared not.

Horacio
Horacio
Joined: 3 Oct 11
Posts: 205
Credit: 80557243
RAC: 0

RE: RE: Should I have to

Quote:
Quote:
Do I have to wait until a new BRP is downloaded to get the count changed in client_state?
To get the change automatically, yes.

Quote:
Provided you're comfortable with the general principles of editing client_state (shut BOINC completely, plain text editor, extreme care, at your own risk), I'd see nothing wrong in helping it along a bit.

Thanks, I've changed the count in the client state and it is running fine.
I guess the scheduler was a bit stressed trying to start the second task, which can't run as those GPUs don't have enough RAM to handle 2 BRPs...

There is something else there too... Now that BOINC knows it can run only one BRP per card, it has forced SETI into panic mode... I guess the scheduler (at least in 6.10.60) is still not fully tuned to deal with wrong settings for multiple GPU tasks...

archae86
archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7271985066
RAC: 1812309

RE: As now that BOINC

Quote:
Now that BOINC knows it can run only one BRP per card, it has forced SETI into panic mode... I guess the scheduler (at least in 6.10.60) is still not fully tuned to deal with wrong settings for multiple GPU tasks...

I think you could get a lot of us to agree that the scheduler finds many common situations puzzling.

One trick I've used to avert undesired High Priority execution is to place most of the queued tasks on suspend. It appears that, when estimating schedule risk, it does not count the suspended tasks. This does, however, stop new downloads for that project until reversed.
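
If you want to script that rather than click through boincmgr, boinccmd can do it (a sketch; the project URL is Einstein's master URL of the time, and the task name below is a made-up placeholder - real names come from boincmgr or from boinccmd --get_tasks):

    # Sketch: suspend or resume a queued task via boinccmd so the
    # scheduler's risk estimate ignores it while it is suspended.
    import subprocess

    PROJECT = "http://einstein.phys.uwm.edu/"  # Einstein@Home master URL

    def set_task(name, action):
        # action is "suspend" or "resume"
        subprocess.check_call(["boinccmd", "--task", PROJECT, name, action])

    # hypothetical task name, for illustration only
    set_task("p2030.20120318.G45.00+01.00.C.b0s0g0.00000_0", "suspend")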

Or, you can just let it work itself out of trouble. But in a High Priority panic it often seems to start job after job, run each just a little, then skip on to another--often not even the one with the soonest deadline. So I'm not happy with the "just let it work out" solution in the case of priority panics caused by sudden changes in estimated performance.
