CUDA and OpenCL Benchmarks

ggesmundo
Joined: 3 Jun 12
Posts: 31
Credit: 18,699,116
RAC: 0

I am running an Intel i7 at 3.4 GHz with one core free for the GPU and other threads. I was surprised when I first installed this card, an Asus ENGTX560 DCII, to find it running twice as fast as the 550 Ti I have in a Q6600 2.4 GHz quad-core machine, where 1 WU takes 3900 s and 2 WUs take approx. 6720 s.

frei74
Joined: 19 Nov 11
Posts: 1
Credit: 801,625
RAC: 0

Overclocked GT240 with GDDR5 (core clock 750 MHz / shader clock 1500 MHz), best result: 4,035.16 s.
Without overclocking: ~4500 s.

LiborA
Joined: 8 Dec 05
Posts: 74
Credit: 337,135
RAC: 0

FirePro 3D V4800 (one core free for GPU): avg. time from 4 WU: 10620 sec. for 1 WU (Win 7)

Amauri
Joined: 12 Jul 11
Posts: 7
Credit: 22,809,307
RAC: 15,310

GT 640, Linux, 1 WU =~ 5700 sec.
GT 520, Linux, 1 WU =~ 9600 sec. (retired)

MAGIC Quantum Mechanic
Joined: 18 Jan 05
Posts: 1,305
Credit: 421,284,802
RAC: 104,022

3,021.93 - 3,065.20 NVIDIA GeForce GTX 550 Ti

Average 50 minutes

ggesmundo
Joined: 3 Jun 12
Posts: 31
Credit: 18,699,116
RAC: 0

I have run the following tests to see what impact freeing up CPUs has on GPU performance. I have a GTX560 running two BPR4 tasks, with all remaining active CPUs running gamma-ray pulsar searches for all but the last test. Following is the link for the test machine:

http://einsteinathome.org/host/5460148

After each change in the number of free CPUs, the 2 WUs running at the time were allowed to finish, and the following 4 WUs were averaged to get the run time.

1 free CPU: 3500 s
2 free CPUs: 2898 s
3 free CPUs: 2842 s
4 free CPUs: 2774 s

With 4 Gravitational Wave S6 Line Veto tasks on the CPUs instead:
4 free CPUs: 2664 s

This suggests that, to achieve the best GPU performance, there should be a free CPU for each GPU task. It makes sense: a BPR4 process has five threads running, and gamma-ray and S6LV processes have three threads each. The CPUs are already busy running threads for their respective processes, so everybody waits.
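
As a rough sketch of how the gamma-ray test rows above compare, the snippet below computes the gain relative to 1 free CPU and the gain of each step over the previous one. The variable names are illustrative only, and it assumes the four averaged run times are directly comparable.

```python
# Averaged BPR4 run times in seconds, keyed by number of free CPUs
# (gamma-ray pulsar search rows from the post above).
run_times = {1: 3500, 2: 2898, 3: 2842, 4: 2774}

baseline = run_times[1]
previous = baseline
gains = {}  # free CPUs -> (percent gain vs. 1 free CPU, percent gain vs. previous row)

for free_cpus, seconds in sorted(run_times.items()):
    total_gain = 100 * (baseline - seconds) / baseline
    step_gain = 100 * (previous - seconds) / previous
    gains[free_cpus] = (round(total_gain, 1), round(step_gain, 1))
    print(f"{free_cpus} free: {seconds} s, "
          f"{total_gain:.1f}% vs. 1 free, {step_gain:.1f}% vs. previous")
    previous = seconds
```

Most of the gain comes from the step from 1 to 2 free CPUs (about 17%); the third and fourth free CPU each add roughly 2%.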

A side effect of freeing CPUs is that the CPU processes speed up, run times become very consistent, and the ratio of run time to CPU time goes down significantly. At the 4/4 ratio, a gamma-ray pulsar search went from approx. 9.5 to 6 hours and S6LV went from approx. 5.5 to 3.5 hours.
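
For a quick sense of scale, the approximate CPU task times in the paragraph above can be turned into speedup factors. This is only a back-of-the-envelope sketch using the rounded hours as posted; the helper name `speedup` is made up for illustration.

```python
# Approximate CPU task run times in hours (before, after freeing CPUs),
# taken from the rounded figures in the post above.
cpu_tasks = {
    "gamma-ray pulsar search": (9.5, 6.0),
    "S6LV": (5.5, 3.5),
}

def speedup(before, after):
    """How many times faster a task finishes after the change."""
    return before / after

for name, (before, after) in cpu_tasks.items():
    saved = 100 * (before - after) / before  # percent of run time saved
    print(f"{name}: {speedup(before, after):.2f}x faster, {saved:.0f}% shorter")
```

Both task types come out at a little under 1.6x faster per task.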

It would be interesting to see if this holds true with a gpu capable of running 3 more tasks, mine is memory constrained to 2 tasks.

Gundolf Jahn
Joined: 1 Mar 05
Posts: 1,079
Credit: 341,280
RAC: 0

Quote:
It would be interesting to see if this holds true with a gpu capable of running 3 more tasks, mine is memory constrained to 2 tasks.


3 or more tasks? (just guessing)

Regards,
Gundolf

Computers aren't everything in life. (Just a little joke)

ggesmundo
Joined: 3 Jun 12
Posts: 31
Credit: 18,699,116
RAC: 0

Yes, re-read it several times and still missed it.

MAGIC Quantum Mechanic
Joined: 18 Jan 05
Posts: 1,305
Credit: 421,284,802
RAC: 104,022

Quote:

3,021.93 - 3,065.20 NVIDIA GeForce GTX 550 Ti

Average 50 minutes

In case anyone wondered, along with the CUDA tasks I also run an LHC and a 2-core T4T 24/7 on this quad-core with the NVIDIA GeForce GTX 550 Ti.

The laptop I am on right now is also a quad-core and runs the same tasks. It has the NVIDIA GeForce 610M (2048MB), driver: 28564, and runs the CUDA tasks in an average of 7,800.00 seconds (2 hrs and 10 mins) each.

Also runs 24/7

dskagcommunity
Joined: 16 Mar 11
Posts: 87
Credit: 717,416,417
RAC: 0

hi!!

I have better results for your list with a stock-clocked standard 560 Ti from Zotac (fed by a not-so-slow C2D E8400 @ 3.6 GHz):

With 1 WU, runtime is ~1900 s
With 2 WUs, runtime is 3094 s (up to 90% GPU load peak, ~35% CPU load)
With 3 WUs, runtime is 3961 s (up to 97% GPU load peak, up to 51% CPU load)
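
To compare these runs on equal terms, a small sketch like the following divides each runtime by the number of concurrent WUs and converts that to WUs per day. It assumes all concurrent WUs finish in about the posted runtime and that the "~1900" figure is exact enough; the names are illustrative.

```python
# Posted runtimes in seconds for 1, 2 and 3 concurrent WUs on the 560 Ti.
runtime_by_concurrency = {1: 1900, 2: 3094, 3: 3961}

SECONDS_PER_DAY = 24 * 60 * 60

def effective_seconds_per_wu(concurrent, runtime):
    """Wall-clock seconds per finished WU when `concurrent` WUs share the GPU."""
    return runtime / concurrent

for n, runtime in sorted(runtime_by_concurrency.items()):
    per_wu = effective_seconds_per_wu(n, runtime)
    per_day = SECONDS_PER_DAY / per_wu
    print(f"{n} concurrent: {per_wu:.0f} s per WU, about {per_day:.1f} WUs/day")
```

On these numbers the effective time per WU drops from 1900 s to about 1547 s with 2 WUs and about 1320 s with 3 WUs, so running several tasks at once pays off on this card even though each individual task takes longer.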

Also, I think your list only means that the 8800GT is not usable for OpenCL in THIS project (OpenCL 1.1 is needed here, isn't it?), because I'm running 8xxx and 9xxx cards (and an HD4850) in OpenCL 1.0 projects (like POEM) and they do fine.

DSKAG Austria Research Team: http://www.research.dskag.at
