CUDA and OpenCL Benchmarks

Sunny129
Joined: 5 Dec 05
Posts: 162
Credit: 160,342,159
RAC: 0

RE: RE: Could you please

Quote:
Quote:
Could you please try 6 WUs at a time ?

Don't think this card will like 6 at a time, but I plan on increasing the number of parallel tasks over the next few days and will report back in due time.


Sure you can. Your GTX 660 Ti is a 2GB card, so you can run 6 WUs in parallel and not exceed the GPU's memory capacity. I can only run up to 3 tasks in parallel on my particular GTX 560 Ti, but Sid can run up to 6 in parallel because he has the 2GB version of my card.
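For anyone wanting to try this: the usual way to run N BRP4 tasks per GPU is an app_config.xml in the Einstein@Home project directory (BOINC 7.x clients). This is a minimal sketch, not a tested config; the app name einsteinbinary_BRP4 and the cpu_usage value are assumptions, so check your own task names for the exact app name:

```xml
<!-- app_config.xml: run 3 BRP4 tasks per GPU (gpu_usage = 1/3).
     App name and cpu_usage are assumptions; adjust to your setup. -->
<app_config>
  <app>
    <name>einsteinbinary_BRP4</name>
    <gpu_versions>
      <gpu_usage>0.33</gpu_usage>
      <cpu_usage>0.2</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

For 6 at a time, set gpu_usage to 0.16; the client picks the file up after a "read config files" or a restart.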

Sunny129
Joined: 5 Dec 05
Posts: 162
Credit: 160,342,159
RAC: 0

RE: RE: that's

Quote:
Quote:

that's interesting to say the least ...

i wonder if that's fairly indicative of the performance increase expected when going from a GTX 560 Ti to a GTX 660 Ti...


It might be. Some of the reviews I read before purchasing this card claimed that the 192-bit-wide memory bus would slow it down a bit. And if I were to guess, the BRP4 app does a fair bit of memory transfers when running, both on the card and to and from the main system.


On second thought, we should probably take into consideration that the only way these documented BRP4 run-time comparisons could be a true apples-to-apples comparison is if they were all tested on the same hardware bed and OS. Not only do the hardware beds and OSes vary wildly across the above chart of documented run times, but so do the DC loads put on those CPUs and GPUs.

For example, the first documented run times to show up in this thread for a GTX 560 Ti were very much in line with my own, particularly for 3 tasks at a time (which takes right around ~5,000 s on my machine). But in Petrion's most recent update of the chart, that figure dropped down to ~4,000 s. At first I couldn't figure out how that was possible. Then I thought about the fact that when running Einstein@Home, I typically run 6 BRP4 CUDA tasks in parallel (3 per GPU) in conjunction with Test4Theory@Home (the multithreaded version, which consumes just under 2 CPU cores), and allocate the remaining CPU resources to either LHC@Home Classic (SixTrack) CPU tasks or Einstein@Home Gravitational Wave CPU tasks.

Perhaps the original documented run time of approx. 5,000 s for 3 concurrent BRP4 tasks on a GTX 560 Ti is similar to mine because that host was also allocating the remainder of its CPU resources to other projects, like me... and perhaps the other documented run time of approx. 4,000 s for 3 concurrent BRP4 tasks on a GTX 560 Ti was done on a host loaded only with Einstein@Home BRP4 CUDA work, and not crunching work from any other projects at the time. Perhaps the user who provided those GTX 560 Ti run times to Petrion could speak up so we could compare other hardware (CPU, memory quantity, etc.) and maybe make more sense of the difference between our run times...

...and with all that said, I'm thinking that we shouldn't deem the difference in run times between my GTX 560 Ti and your GTX 660 Ti definitive just yet... especially if you did your BRP4 CUDA testing with no other projects crunching in the background.

Sid
Joined: 17 Oct 10
Posts: 146
Credit: 493,034,595
RAC: 310,357

RE: ...and with all that

Quote:
...and with all that said, I'm thinking that we shouldn't deem the difference in run times between my GTX 560 Ti and your GTX 660 Ti definitive just yet... especially if you did your BRP4 CUDA testing with no other projects crunching in the background.


If I'm running some CPU tasks in parallel with BRP4 tasks, I can clearly see that the GPU load is no more than 75%, compared to 95% otherwise.
So the times can be really different.

Maciek
Joined: 21 Mar 05
Posts: 1
Credit: 319,915
RAC: 0

Hello I have a problem. Which

Hello
I have a problem. Which of these cards, a GT 430 or an HD 6670 (96 CUDA cores vs. 480 cores, 268.8 vs. 768 GFLOPS, all data from the wiki comparison table), should give more computed WUs/day? I don't know exactly what to expect from the Radeon with the latest OpenCL app.
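The peak-GFLOPS figures quoted above are just cores × shader clock × 2 FLOPs/cycle, so they can be sanity-checked directly. A small sketch, assuming the reference shader clocks of roughly 1.40 GHz (GT 430) and 0.80 GHz (HD 6670):

```python
# Reconstruct the quoted peak-GFLOPS figures from cores and shader clock.
# These are theoretical peaks, not real-world E@H throughput.
def peak_gflops(cores, shader_clock_ghz):
    # 2 single-precision FLOPs per core per cycle (multiply-add)
    return cores * shader_clock_ghz * 2

gt430 = peak_gflops(96, 1.40)
hd6670 = peak_gflops(480, 0.80)
print(f"GT 430:  {gt430:.1f} GFLOPS")   # matches the quoted 268.8
print(f"HD 6670: {hd6670:.1f} GFLOPS")  # matches the quoted 768
print(f"ratio:   {hd6670 / gt430:.2f}x")
```

Keep in mind that peak GFLOPS won't translate directly into WUs/day here, since the maturity of the CUDA and OpenCL apps differs.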

Holmis
Joined: 4 Jan 05
Posts: 1,118
Credit: 804,829,787
RAC: 173,874

RE: ...and with all that

Quote:
...and with all that said, i'm thinking that we shouldn't deem the difference in run times between my GTX 560 Ti and your GTX 660 Ti definitive just yet...especially if you did your BRP4 CUDA testing with no other projects crunching in the background.


I've done some more testing and here are my setup and the run times:

CPU: Core i7 GHz
GPU: GTX660Ti PCI-E 3.0 x16, MHz as reported by GPU-Z
RAM: 16 GB of Corsair PC3-12800 (800 MHz, dual channel)
OS: Win7 x64

[pre]All times in seconds and the CPU fully loaded with Einstein CPU-tasks.
# Mean Median Range # of tasks completed
x1 ~1700 1697 1685 - 1728 5
x2 ~2900 2824 2493 - 3494 35
x3 ~4360 4491 3393 - 4999 35
x4 ~6030 6105 4403 - 6802 20
x5 ~8660 8867 5920 - 9741 7
x6 ~12760 13198 11448 - 14066 5[/pre]
The times vary quite a lot, probably because this is my only computer and I use it a fair bit. It seems that 2 at a time is the most efficient without further tweaking and testing.
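The table above can be reduced to throughput: with n tasks in flight each taking roughly the mean time listed, the card completes n × 3600 / mean tasks per hour. A quick sketch using the mean values from the table:

```python
# Mean seconds per task at each level of parallelism, from the table above.
means = {1: 1700, 2: 2900, 3: 4360, 4: 6030, 5: 8660, 6: 12760}

# tasks/hour = n tasks completed every `mean` seconds
throughput = {n: n * 3600 / t for n, t in means.items()}
for n, th in throughput.items():
    print(f"x{n}: {th:.2f} tasks/hour")

best = max(throughput, key=throughput.get)
print("best:", f"x{best}")
```

By this measure x2 and x3 are nearly tied (~2.48 vs. ~2.48 tasks/hour), with x2 a hair ahead, which matches the conclusion that 2 at a time is most efficient on this setup.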

Edit: I used Process Lasso to raise the priority of the CPU part of the BRP app to "above normal" to improve the GPU load.
Furthermore, when running 3 or more parallel tasks I observed some minor lags while using the computer, especially while watching video.

Vladimir Lukovic
Joined: 10 Sep 08
Posts: 10
Credit: 681,501
RAC: 0

Could you run a single

Could you run a single workunit of Albert@Home with the new improved 1.28 CUDA app?
My GTX 560 does 1 WU in about 1700-1800 s.
It would be awesome to see how much of an improvement the 660 Ti makes.

Holmis
Joined: 4 Jan 05
Posts: 1,118
Credit: 804,829,787
RAC: 173,874

Already done that, check this

Already done that, check this post.

Times were about 1180s.

joe areeda
Joined: 13 Dec 10
Posts: 285
Credit: 319,525,177
RAC: 71,328

This thread is very

This thread is very interesting, thank you all for contributing.

Here's some info on my GPUs

AMD Phenom II X6, nVidia 560 Ti: 6 CPU + 2 GPU
2654 s/WU, Ubuntu 11.04 x86_64, 26,209 avg credit

Intel i7-2600K, nVidia 560: 8 CPU + 2 GPU
2300 s/WU, Scientific Linux 6.3 x86_64, 34,968 avg credit

Intel i7-3770K, nVidia 550 Ti: 8 CPU + 2 GPU
2961 s/WU, SL 6.3 x86_64, 33,793 avg credit
I'm not sure the average credit is stable yet (it's been a month)
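To put those s/WU figures on a common scale, they can be converted to daily output. A sketch under the assumption that the quoted times are wall-clock per task with 2 GPU tasks running concurrently on each host (as the "+ 2 GPU" lines suggest):

```python
# Per-task wall times (seconds) quoted above for each host.
hosts = {
    "560 Ti / Phenom II X6": 2654,
    "560 / i7-2600K": 2300,
    "550 Ti / i7-3770K": 2961,
}

# Assumption: 2 tasks in flight, each finishing every s_per_wu seconds.
wu_per_day = {name: 2 * 86400 / s for name, s in hosts.items()}
for name, w in wu_per_day.items():
    print(f"{name}: {w:.1f} WU/day")
```

Interestingly, by this estimate the plain 560 host out-produces the 560 Ti host, which supports the point earlier in the thread that CPU load and platform matter as much as the card itself.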

I'm thinking of swapping these cards around so I can get a better feel for what exactly the card is doing and how the different CPU and motherboard affect it.

Now my dilemma is that I'm building another system and am ready to order another GPU but what the heck do I buy?

I'm going to stick with CUDA capable (nVidia) for now. Maybe I will have to double my options with OpenCL cards next year.

The 600 series doesn't seem to outperform the 500 series by very much. I'm pretty much down to a 560 Ti (2GB) for $270, a 660 Ti (2GB) for $300, or a 670 (2GB) for $400.

These machines are not dedicated to E@H but will be running jobs that are somewhat similar in nature and E@H will be the backfill for any idle time.

Anybody have any opinions or comments that can break the 3-way tie I seem to have? Do those of you who bought 600 series feel they are worth the extra money?

thanks
Joe

Horacio
Joined: 3 Oct 11
Posts: 205
Credit: 80,557,243
RAC: 0

RE: Anybody have any

Quote:

Anybody have any opinions or comments that can break the 3-way tie I seem to have? Do those of you who bought 600 series feel they are worth the extra money?

thanks
Joe


I'd choose something in the Kepler series (warning: not all 600-series GPUs are Kepler; anyway, those you named are).
If not for performance, then for the lower power requirements, which will save you more money on electricity than the extra paid for the cards... and also for the lower noise levels...

archae86
Joined: 6 Dec 05
Posts: 2,893
Credit: 3,554,822,880
RAC: 3,184,334

It seems the question of

It seems the question of whether the GTX 660ti is a great buy or an honorable mention for Einstein use gets down to whether the application here (I hope plural applications in the future) is more nearly limited by shader performance or memory. Apparently it should have the shader count and performance of more expensive cousins, but a significant memory performance deficit.

Another question is whether the rather oddly asymmetric 660 Ti memory implementation might have some additional harmful effect.

Were I buying right now, I, personally, would be strongly tempted to get a Gigabyte 660 ti, and try to compare it to other Einstein hosts with Tesla and Fermi chips to help others decide whether it is a good choice.
