GPU utilization benchmark

Damaraland
Joined: 9 Feb 05
Posts: 24
Credit: 5691134
RAC: 0
Topic 196241

I tested GPU utilization factors of 1, 0.5 and 0.33 for the BRP app: Binary Radio Pulsar Search (Arecibo) v1.23 (BRP4cuda32nv270).
My system:
i7-2600K CPU @ 3.40GHz, 2 GPUs, Linux 3.0.0-16-generic.
BOINC preference "On multiprocessors, use at most 60% of the processors".
I'm running both Einstein and POEM@home.
NVIDIA GeForce GTX 560 Ti
NVIDIA GeForce GTX 260

Here are the results:
where s = standard deviation (very small, so the mean is a very good measure)
where secs/unit is the mean over all units processed

Running GPU utilization factor of BRP apps=1
GPU           secs/unit   s     units tested
GTX 560 Ti    2136        3     11
GTX 260       2440        4     10

Running GPU utilization factor of BRP apps=0.5
2 tasks on each GPU. Memory usage is about 60%.
GPU           secs/unit   s     units tested   equivalent average (secs/unit ÷ 2)
GTX 560 Ti    3830        15    15             1915
GTX 260       4759        4     11             2380

Running GPU utilization factor of BRP apps=0.333
3 tasks on each GPU. Memory usage is about 98%.
GPU           secs/unit   s     units tested   equivalent average (secs/unit ÷ 3)
GTX 560 Ti    5163        27    28             1721
GTX 260       6789        6     17             2263

Conclusions:
Running GPU utilization factor of BRP apps=0.5
Performance increase of 10.3% on the GTX 560 Ti
Performance increase of 2.5% on the GTX 260

Running GPU utilization factor of BRP apps=0.333
Performance increase of 19.4% on the GTX 560 Ti
Performance increase of 7.3% on the GTX 260

GPU temperatures are fine.
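
For anyone who wants to redo the arithmetic, here is a minimal Python sketch. It assumes that the "equivalent average" is the mean run time divided by the number of tasks sharing the GPU, and that the performance figures are the reduction in seconds per unit relative to running one task at a time; under those assumptions it agrees with the percentages in the conclusions to within rounding. The function names are made up for this illustration.

```python
# Mean seconds per unit, copied from the benchmark tables.
# Inner keys are the number of tasks running concurrently on each GPU.
measured = {
    "GTX 560 Ti": {1: 2136, 2: 3830, 3: 5163},
    "GTX 260": {1: 2440, 2: 4759, 3: 6789},
}


def equivalent_secs(mean_secs, concurrent):
    """Effective seconds per unit when `concurrent` tasks share one GPU."""
    return mean_secs / concurrent


def gain_percent(baseline_secs, equivalent):
    """Reduction in seconds per unit relative to the one-task baseline."""
    return 100.0 * (baseline_secs - equivalent) / baseline_secs


def summarize(data):
    """Return (gpu, concurrent tasks, equivalent secs, gain %) rows."""
    rows = []
    for gpu, runs in data.items():
        baseline = runs[1]
        for concurrent, mean_secs in sorted(runs.items()):
            if concurrent == 1:
                continue  # the baseline has no gain to report
            eq = equivalent_secs(mean_secs, concurrent)
            rows.append((gpu, concurrent, eq, gain_percent(baseline, eq)))
    return rows


for gpu, concurrent, eq, gain in summarize(measured):
    print(f"{gpu}: {concurrent} tasks, {eq:.0f} secs/unit equivalent, {gain:.2f}% gain")
```

Expressed as extra throughput instead (baseline divided by equivalent, minus one), the same measurements would give slightly larger percentages.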

Sparrow
Joined: 4 Jul 11
Posts: 29
Credit: 10701417
RAC: 0

My results with a GTX 560Ti are a bit different:

GPU utilization factor of BRP apps=1
secs/unit: 1709 (average from 22 WUs)

GPU utilization factor of BRP apps=0.5
secs/unit: 2786 (average from 3 WUs)
= 1393 secs/unit equivalent average
-> 18.5% increase of performance

Interesting side note:
The variation in the time needed for each WU goes down significantly when I'm running 2 WUs at a time. The three WUs took 2784, 2785, and 2789 seconds. The WUs that are running right now will also take exactly this time.
The WUs I was running with only 1 WU at a time took between 1600 and 1758 seconds.
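
The side note about variation can be checked with a short Python sketch. It uses only the figures quoted in this post; since the individual one-at-a-time run times are not listed, only their reported range is compared. The variable names are illustrative.

```python
from statistics import mean, stdev

# Seconds per WU with two WUs sharing the GPU (from the post).
two_at_once = [2784, 2785, 2789]
# One WU at a time: reported mean and reported min/max, in seconds.
single_mean = 1709
single_range = (1600, 1758)

pair_mean = mean(two_at_once)
equivalent = pair_mean / 2          # effective seconds per WU
gain = 100 * (single_mean - equivalent) / single_mean

pair_spread = max(two_at_once) - min(two_at_once)
single_spread = single_range[1] - single_range[0]

print(f"mean with 2 WUs: {pair_mean} s, equivalent {equivalent:.0f} s/WU")
print(f"reduction in time per WU: {gain:.1f}%")
print(f"sample standard deviation with 2 WUs: {stdev(two_at_once):.2f} s")
print(f"min-max spread: {pair_spread} s (2 WUs) vs {single_spread} s (1 WU)")
```

With only three samples the standard deviation is a rough indication, not a firm estimate.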

My system: i920 @ 3.31GHz, Linux 2.6.32-39-generic, 1 GPU

archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7315971688
RAC: 2300903

I had been running three GPU jobs using app_info. After a long wait running down my queue of older work, I removed app_info and got a good start using the new mechanism with count of 0.33 giving three GPU jobs.

But now I am trying to check out two GPU job behavior, and it does not seem to be listening. I've checked the location (a.k.a. "venue"), changed the location-specific "GPU utilization factor of BRP apps" from 0.33 to 0.5, and pressed the "update" button for Einstein. When the next units to start up started as a triple, I tried shutting down boincmgr and restarting. No joy. Shut down the computer, counted to thirty, and rebooted. Still three jobs. Did another user-requested Update. Still three jobs. Although I believed that this is not a local preference item and that I had no local preference set, I nevertheless went into the boincmgr preference dialog and commanded "clear". Still no joy (i.e. three GPU Einstein jobs).

Anyone else see some problem in changing the specified number of GPU jobs? Any secret sauce? Any idea what I most likely did wrong?

Logforme
Joined: 13 Aug 10
Posts: 332
Credit: 1714373961
RAC: 0

Quote:
Anyone else see some problem in changing the specified number of GPU jobs? Any secret sauce? Any idea what I most likely did wrong?


Had the exact same problem. If I understood the answer I got here correctly, only new workunits downloaded from the server will be flagged with the new parameters.
So patience does reward you :)

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2993079440
RAC: 711451

Quote:
Quote:
Anyone else see some problem in changing the specified number of GPU jobs? Any secret sauce? Any idea what I most likely did wrong?

Had the exact same problem. If I understood the answer I got here correctly, only new workunits downloaded from the server will be flagged with the new parameters.
So patience does reward you :)


Not quite, but related. The value lives in the section in client_state.xml, and that section is only re-transmitted from the server to the client when new work is allocated. But once transmitted, it applies to all work in the cache, old and new.
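
As an illustration of how one might check the value the client currently holds, here is a minimal Python sketch that lists the GPU count stored for each application version. It rests on assumptions not stated in this thread: that the count sits in a <count> element inside a <coproc> element of each <app_version> section, and that the text parses as well-formed XML. The SAMPLE fragment and the app name in it are illustrative, not copied from a real host; only the plan class is taken from the first post.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment shaped like the assumed layout of client_state.xml.
SAMPLE = """<client_state>
<app_version>
    <app_name>einsteinbinary_BRP4</app_name>
    <plan_class>BRP4cuda32nv270</plan_class>
    <coproc>
        <type>CUDA</type>
        <count>0.500000</count>
    </coproc>
</app_version>
<app_version>
    <app_name>some_cpu_app</app_name>
</app_version>
</client_state>"""


def gpu_counts(state_xml):
    """Return (app_name, plan_class, count) for every coproc entry found."""
    root = ET.fromstring(state_xml)
    found = []
    for app_version in root.iter("app_version"):
        name = app_version.findtext("app_name", default="?")
        plan_class = app_version.findtext("plan_class", default="")
        for coproc in app_version.iter("coproc"):
            count = coproc.findtext("count")
            if count is not None:
                found.append((name, plan_class, float(count)))
    return found


for name, plan_class, count in gpu_counts(SAMPLE):
    print(f"{name} ({plan_class}): {count} GPU per task")
```

To try it on a real host, read a copy of client_state.xml into a string and pass that to gpu_counts(); the sketch only reads, it never writes.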

@ archae86 - which host are you talking about, and when did you try this? 3409259 last received work at 19 Mar 2012 0:46:24 UTC, 4234243 at 19 Mar 2012 12:20:35 UTC (unless there are later allocations lurking on later pages - I didn't check).

archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7315971688
RAC: 2300903

Richard Haselgrove wrote:

Not quite, but related. The value lives in the section in client_state.xml, and that section is only re-transmitted from the server to the client when new work is allocated. But once transmitted, it applies to all work in the cache, old and new.

@ archae86 - which host are you talking about, and when did you try this? 3409259 last received work at 19 Mar 2012 0:46:24 UTC, 4234243 at 19 Mar 2012 12:20:35 UTC (unless there are later allocations lurking on later pages - I didn't check).

Thank you, yet again, Richard, and now that answer sounds familiar--must have failed to remember it from these pages.

The host I mean to experiment with is my newest, and has recently been banging into the daily maximum (as has been my other GPU-equipped host) as I seek to rebuild inventory from taking it to zero at the changeover. I think midnight UTC is just a little time in the future--I'll force an update then and expect to see compliance with my 2 GPU job request.

As to experiments, the host under review had a spontaneous restart problem about a week ago. I suspected over-temperature, and reduced the maximum number of CPU jobs to two (from four). I saw a larger than expected decrease in GPU job completion time--enough to wonder if the 2 CPU/3 GPU case might actually be more productive than the 4/3 case, in addition to being lower power, cooler, lower fan noise, and safer.

Maybe I'll know something of it in a couple weeks.

Horacio
Joined: 3 Oct 11
Posts: 205
Credit: 80557243
RAC: 0

Quote:
Not quite, but related. The value lives in the section in client_state.xml, and that section is only re-transmitted from the server to the client when new work is allocated. But once transmitted, it applies to all work in the cache, old and new.

I have the same issue with one computer on which I want to run only one WU per GPU... Since I've changed the venue, it has not received any new BRP WUs, but it got some other tasks from Einstein, and the client_state is still showing 0.5 instead of 1 in the CUDA count.

Should I have to wait until a new BRP is downloaded to get the count changed in the client_state?

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2993079440
RAC: 711451

Quote:
Quote:
Not quite, but related. The value lives in the section in client_state.xml, and that section is only re-transmitted from the server to the client when new work is allocated. But once transmitted, it applies to all work in the cache, old and new.

I have the same issue with one computer on which I want to run only one WU per GPU... Since I've changed the venue, it has not received any new BRP WUs, but it got some other tasks from Einstein, and the client_state is still showing 0.5 instead of 1 in the CUDA count.

Should I have to wait until a new BRP is downloaded to get the count changed in the client_state?


Provided you're comfortable with the general principles of editing client_state (shut BOINC completely, plain text editor, extreme care, at your own risk), I'd see nothing wrong in helping it along a bit.
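
A hedged sketch of what "helping it along" could look like in Python follows. It works on text rather than re-serializing the XML, so everything other than the count is left byte-for-byte unchanged. The layout it expects (a <count> inside <coproc> inside <app_version>, with the app identified by <app_name>) is an assumption, and set_coproc_count and the app names are hypothetical. As the post says: shut BOINC down completely first, work on a backup copy, and proceed at your own risk.

```python
import re

# Assumed layout: <app_version> ... <app_name>X</app_name> ...
#                 <coproc> ... <count>N</count> ... </coproc> ... </app_version>
BLOCK_RE = re.compile(r"<app_version>.*?</app_version>", re.DOTALL)
COUNT_RE = re.compile(r"(<coproc>.*?<count>)[^<]*(</count>)", re.DOTALL)


def set_coproc_count(state_text, app_name, new_count):
    """Return state_text with the coproc count rewritten for one app.

    Only <app_version> blocks whose <app_name> equals `app_name` are
    touched; all other text is returned unchanged.
    """
    wanted = f"<app_name>{app_name}</app_name>"

    def fix_block(match):
        block = match.group(0)
        if wanted not in block:
            return block
        return COUNT_RE.sub(
            lambda m: f"{m.group(1)}{new_count:.6f}{m.group(2)}", block
        )

    return BLOCK_RE.sub(fix_block, state_text)


# Hypothetical two-app fragment to show that only the named app changes.
example = (
    "<app_version>\n<app_name>brp_example</app_name>\n"
    "<coproc>\n<type>CUDA</type>\n<count>0.500000</count>\n</coproc>\n"
    "</app_version>\n"
    "<app_version>\n<app_name>other_app</app_name>\n"
    "<coproc>\n<type>CUDA</type>\n<count>0.500000</count>\n</coproc>\n"
    "</app_version>\n"
)
print(set_coproc_count(example, "brp_example", 1.0))
```

Comparing the edited copy against the original with a diff tool before swapping it in is a cheap way to confirm that only the intended line changed.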

archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7315971688
RAC: 2300903

Quote:
Should I have to wait until a new BRP is downloaded to get the count changed in the client_state?

To get the change automatically, yes.

In my case I was wrong about my quota exhaustion forcing me to wait until midnight UTC. While I distinctly recall that in the past at least some quota limits were released at that time, in this case it appears that the 80 per day limit set a 24-hour timer from the moment it went into force. Only this morning was I able to get new work, and even after I'd been allocated new BRP units and they had started downloading, three GPU tasks continued executing, apparently switching to the desired two when the first (of a couple of dozen) BRP tasks finished downloading to my host. So now the experiment in varying number of GPU and CPU tasks can continue.

While I'm mentioning the 80 limit, I think I saw another awkward consequence. As BRP work was recently much more routinely available than the GW work for my CPU slots and my host was pretty hungry after going to zero, there was a tendency to fulfill nearly all the 80 quota with BRP, leaving me with a very small amount of pending CPU work when the quota gate closed, sometimes taking my CPU to idle later. I'd have guessed there to be separate maximum daily download quotas for CPU and GPU, but in this instance it appeared not.

Horacio
Joined: 3 Oct 11
Posts: 205
Credit: 80557243
RAC: 0

Quote:
Quote:
Should I have to wait until a new BRP is downloaded to get the count changed in the client_state?
To get the change automatically, yes.

Quote:
Provided you're comfortable with the general principles of editing client_state (shut BOINC completely, plain text editor, extreme care, at your own risk), I'd see nothing wrong in helping it along a bit.

Thanks, I've changed the count in the client_state and it is running fine.
I guess the scheduler was a bit stressed trying to start the second task, which can't run as those GPUs don't have enough RAM to play with 2 BRPs...

There is something else there too... Now that BOINC knows that it can run only one BRP per card, it has forced SETI into panic mode... I guess the scheduler (at least in 6.10.60) is still not fully tuned to deal with wrong settings in multiple GPU tasks...

archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7315971688
RAC: 2300903

Quote:
Now that BOINC knows that it can run only one BRP per card, it has forced SETI into panic mode... I guess the scheduler (at least in 6.10.60) is still not fully tuned to deal with wrong settings in multiple GPU tasks...

I think you could get a lot of us to agree that the scheduler finds many common situations puzzling.

One trick I've used to avert undesired High Priority execution is to place most of the queue tasks on suspend. It appears that in estimating schedule risk it does not count the suspended stuff. This does, however, stop new downloads on that project until reversed.

Or, you can just let it work itself out of trouble. But in High Priority panic, often it seems to start job after job, run them just a little, then skip on to another--often not even the soonest deadline. So I'm not happy with the "just let it work out" solution in the case of priority panics caused by sudden estimated performance changes.
