GTS 450: Does running two CUDA tasks at a time make sense?

alintope
Joined: 27 Jan 12
Posts: 52
Credit: 295551180
RAC: 0
Topic 196178

In my computer, CUDA work is done by a GeForce GTS 450 graphics card. Does it make sense to have it work on more than one CUDA task (BRPS) at a time? Are 192 CUDA cores enough? Actual GPU usage with a single BRPS task is ~70%.

Regards,
Heinrich

Gundolf Jahn
Joined: 1 Mar 05
Posts: 1079
Credit: 341280
RAC: 0

It's not the number of cores that counts but the amount of memory on the card.

Regards,
Gundolf

Computers aren't everything in life. (Just kidding)

alintope
Joined: 27 Jan 12
Posts: 52
Credit: 295551180
RAC: 0

Memory is 1024 MB. As others say, that's enough for up to 3 tasks. But if the GPU doesn't have enough computing capacity, I would expect the computing time of a single task to increase in proportion to the number of tasks. So what would be the use of running several GPU tasks in parallel?

Regards,
Heinrich

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 582578142
RAC: 142687

At the current 70% utilization you should see some speedup from running several WUs. I can't say how much, though, except that it shouldn't be higher than 1.43 (that is, 1/0.70) ;)
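The 1.43 ceiling follows from assuming that the best case is driving the GPU from its current load up to 100%. A minimal sketch of that arithmetic (the function name is just for illustration, and real gains will be lower because of scheduling and memory overhead):

```python
def max_speedup(utilization: float) -> float:
    """Upper bound on throughput gain if GPU load could be raised to 100%.

    Assumes throughput scales linearly with GPU load, which is an idealization.
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return 1.0 / utilization


if __name__ == "__main__":
    # 70% load with a single task, as reported in the first post
    print(f"{max_speedup(0.70):.2f}")
```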

Why not just try and report the results here?

MrS

Scanning for our furry friends since Jan 2002

alintope
Joined: 27 Jan 12
Posts: 52
Credit: 295551180
RAC: 0

I'm inclined to give it a try. On the other hand, I'm cautious about the file app_info.xml. More than one person in this forum has written about various troubles when using this (necessary) file. Archae86, for example, wrote, "one highly likely result of an error is loss of all current work in queue or in progress".

What I'm looking for is a faultless app_info file along with explanations or a reference guide that enables the user to adapt this file to their own needs (according to hardware, OS, and crunching parameters). Does that exist?

I don't like to do things blindly. At the least, I'd like to have a rough idea of the consequences of what I'm doing.

I tried to learn from what I found in this forum. Looking at the different suggested app_info files, I find them rather different in what they list (especially executable files). Which files are essential? The GPU RAM value differs too. One file says 334572800.000000, the next one 220200960.000000. What is the right value, or what does it depend on? Do I have to divide the amount of memory available by the number of tasks I want the GPU to run? I have more questions like these, but I won't bother you with them.

As I already wrote, I'd greatly appreciate a reference guide for app_info files.

Regards,
Heinrich

transient
Joined: 3 Jun 05
Posts: 62
Credit: 115835369
RAC: 0

Before experimenting, you could set Einstein to No New Tasks and let the existing tasks crunch and report.

astro-marwil
Joined: 28 May 05
Posts: 534
Credit: 665716543
RAC: 561787

Hello!
I'm running an i7 2100 @ 3.2 GHz with a GTX550Ti (1 GB, 192 shaders) in slot 1 of the motherboard and a GT440 (1 GB, 96 shaders) in slot 2. The PCI at slot 2 is also 16 bit wide. All BRP4 tasks are forced to run at Normal priority by Process Tamer, without disturbing normal use such as, just now, evaluating and writing this.
For the crunching times I get the following measurements:

GTX550Ti
single task: 0.7016 +/- 0.0297 [h], rel. error 4.23% - from 197 tasks
3 tasks parallel: 1.707 +/- 0.069 [h], rel. error 4.04% - from 475 tasks

On average you get a finished file every (1.707 +/- 0.069 [h])/3 = 0.569 +/- 0.023 [h].

The increase is 0.7016/0.569 - 1 = 23.3 +/- 7.21 [%]

GT440
single task: 1.728 +/- 0.0455 [h], rel. error 2.63% - from 80 tasks
3 tasks parallel: 4.551 +/- 0.090 [h], rel. error 1.98% - from 180 tasks

On average you get a finished file every (4.551 +/- 0.090 [h])/3 = 1.517 +/- 0.030 [h].

The increase is 1.728/1.517 - 1 = 13.9 +/- 3.75 [%]
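The two increases can be recomputed from the quoted means. The sketch below assumes the uncertainties were combined by adding the relative errors in quadrature; the post does not state the method, but this assumption reproduces the quoted +/- values to within rounding. The function name is illustrative only.

```python
import math


def throughput_gain(t_single, dt_single, t_batch, dt_batch, n_tasks):
    """Return (gain, uncertainty), both in percent.

    t_single, dt_single: mean and error of the single-task time [h]
    t_batch, dt_batch:   mean and error of the time for n_tasks run in parallel [h]
    Assumption: relative errors are independent and add in quadrature.
    """
    t_eff = t_batch / n_tasks          # effective time per finished task
    dt_eff = dt_batch / n_tasks
    ratio = t_single / t_eff
    rel_err = math.hypot(dt_single / t_single, dt_eff / t_eff)
    return (ratio - 1.0) * 100.0, ratio * rel_err * 100.0


if __name__ == "__main__":
    for name, args in {
        "GTX550Ti": (0.7016, 0.0297, 1.707, 0.069, 3),
        "GT440": (1.728, 0.0455, 4.551, 0.090, 3),
    }.items():
        gain, err = throughput_gain(*args)
        print(f"{name}: {gain:.1f} +/- {err:.1f} %")
```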

GPU       GPU load                  Memory load
          single task   3 tasks     single task   3 tasks
550       ~83%          ~96%        ~280 MB       ~820 MB
440       ~60%          ~90%        ~330 MB       ~800 MB

It seems that GPUs with more shaders benefit more from crunching files in parallel, but by less than the ratio of the number of shaders.

At startup, the Event Log of the BOINC Manager lists the peak crunching power of each GPU. In my case:
GTX550Ti 486 [GFLOPS]
GT440 207 [GFLOPS]
The ratio of these is 2.3478.
This ratio is equivalent to the ratio of (shaders * shader clock) for the two GPUs. In this case (192*1900)/(96*1620) = 2.3457.
If one takes the ratio of the mean crunching times for 3 tasks in parallel, one gets 1.517/0.569 = 2.67, which is just 14% higher than the ratio above. In this case too, the GPU with the higher number of shaders has an advantage. This is very likely due to the higher GPU and memory clocks and memory bandwidth of the GTX550Ti.
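The three ratios compared in this post can be checked with a few lines. This is only a sketch using the figures quoted here, with the 96 shaders stated for the GT440 at the start of the post; the dictionary and variable names are illustrative.

```python
# Figures as quoted in the post (peak GFLOPS from the BOINC Event Log,
# shader counts, shader clocks in MHz, and mean hours per task at 3 in parallel).
cards = {
    "GTX550Ti": {"gflops": 486.0, "shaders": 192, "clock_mhz": 1900, "hours_per_task": 0.569},
    "GT440": {"gflops": 207.0, "shaders": 96, "clock_mhz": 1620, "hours_per_task": 1.517},
}

fast, slow = cards["GTX550Ti"], cards["GT440"]

gflops_ratio = fast["gflops"] / slow["gflops"]
hardware_ratio = (fast["shaders"] * fast["clock_mhz"]) / (slow["shaders"] * slow["clock_mhz"])
# A shorter time per task means a faster card, so the time ratio is inverted.
observed_ratio = slow["hours_per_task"] / fast["hours_per_task"]
excess_percent = (observed_ratio / gflops_ratio - 1.0) * 100.0

if __name__ == "__main__":
    print(f"peak GFLOPS ratio:      {gflops_ratio:.4f}")
    print(f"shaders * clock ratio:  {hardware_ratio:.4f}")
    print(f"observed speed ratio:   {observed_ratio:.2f}")
    print(f"observed vs peak ratio: {excess_percent:.0f}% higher")
```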

The relative scatter of the crunching time is significantly higher on the GPU with more shaders.

Kind regards
Martin

astro-marwil
Joined: 28 May 05
Posts: 534
Credit: 665716543
RAC: 561787

Sorry!
In my previous post, the sentence

Quote:
It seems that GPUs with more shaders benefit more from crunching files in parallel, but by less than the ratio of the number of shaders.

is obsolete. I was unable to delete it, as the 1 h editing window had just expired.

Kind regards
Martin

DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3562358667
RAC: 0

Quote:
Before experimenting, you could set Einstein to No New Tasks and let the existing tasks crunch and report.

It's easier just to copy your data directory and pull your network connection. If you slip up and wreck your work, you can just close BOINC, restore the backup, and try again.

alintope
Joined: 27 Jan 12
Posts: 52
Credit: 295551180
RAC: 0

Thank you all for your valuable information. Since you have encouraged me, I'll now start my first experiments with a multitasking GPU.

Regards,
Heinrich

alintope
Joined: 27 Jan 12
Posts: 52
Credit: 295551180
RAC: 0

Crunching time results from running CUDA BRPS tasks in parallel

As already said, I'm running a GeForce GTS 450 GPU with 192 shaders. The board has 1024 MB of GDDR5 RAM. Crunching times are mean values, each taken from a batch of more than 5 tasks. Each GPU task was fed with data by a separate CPU core. (The Windows Task Manager allows a task to be mapped onto a particular CPU core.)

1 task : 51.4 min ----------- 1 task: 51.4 min
2 tasks: 85.4 min ---- equals 1 task: 42.7 min
3 tasks: 121 min ----- equals 1 task: 40.3 min

A simple calculation shows a speed increase of about 20% for double tasking and about 27% for triple tasking. In the latter case GPU load rose from 70% to values close to 90%. GPU core temperature rose only slightly, by about 2°C. In other words, the additional electrical power needed may be considered negligible.
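The "simple calculation" can be spelled out as below, taking the speed increase as the ratio of the single-task time to the effective time per task. That definition is an assumption about what was meant, and the function name is illustrative. With the times listed above it gives roughly 20.4% for two tasks and 27.4% for three.

```python
def speed_increase(single_min: float, batch_min: float, n_tasks: int):
    """Return (effective minutes per task, throughput increase in percent)."""
    per_task = batch_min / n_tasks
    return per_task, (single_min / per_task - 1.0) * 100.0


if __name__ == "__main__":
    single = 51.4  # minutes for one task running alone
    for n, batch in [(2, 85.4), (3, 121.0)]:
        per_task, gain = speed_increase(single, batch, n)
        print(f"{n} tasks: {per_task:.1f} min per task, {gain:.1f}% faster")
```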

Regards,
Heinrich
