CUDA and openCL Benchmarks

MAGIC Quantum M...

Joined: 18 Jan 05

Posts: 1702

Credit: 1067702915

RAC: 1272806

And here is another

21 Aug 2012 0:29:22 UTC

Message 110099

And here is another review/test I read late last night.

EVGA GTX 660 Ti Superclocked

Petrion

Joined: 30 Apr 08

Posts: 53

Credit: 1243186

RAC: 0

For those who don't know,

23 Aug 2012 5:06:26 UTC

Message 110100

For those who don't know, they released the new improved OpenCL app version for Einstein after testing on Albert http://albert.phys.uwm.edu/forum_thread.php?id=8912&nowrap=true#112191

From my own experience, run times dropped by roughly 45% (from 3,800 to 2,100 seconds) compared with the previous version. I'll be updating the table with the new times as they come in, and will look at listing the old times alongside the new for comparison's sake.
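The change from 3,800 to 2,100 seconds can be read either as a run-time reduction or as a throughput gain, and the two give quite different percentages. A minimal Python sketch of both calculations follows; the function name `runtime_change` is made up for illustration and is not part of any BOINC tool.

```python
def runtime_change(old_secs, new_secs):
    """Return (percent reduction in run time, throughput speedup factor)."""
    reduction_pct = (old_secs - new_secs) / old_secs * 100.0
    speedup = old_secs / new_secs
    return reduction_pct, speedup


# Times quoted in the post: previous app version vs. new OpenCL app version.
reduction_pct, speedup = runtime_change(3800, 2100)
print(f"run time reduced by {reduction_pct:.1f}%, throughput x{speedup:.2f}")
# prints: run time reduced by 44.7%, throughput x1.81
```

So the same pair of numbers is about a 45% shorter run time, or about 81% more work units per hour.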

Let me know your new times!!

Alex

Joined: 1 Mar 05

Posts: 451

Credit: 500394558

RAC: 38624

RE: Let me know your new

23 Aug 2012 15:49:41 UTC

Message 110101 in response to message 110100

Quote:

Let me know your new times!!

HD6950 2wu's ~3,500 790MHz GPU 1250MHz Mem
HD5850 2wu's ~6,085 (pcie x8 slot) 765MHz GPU 1125MHz Mem
HD5830 1wu ~2,916
AMD A8 3870 APU: 1wu 6,489.60

Fred J. Verster

Joined: 27 Apr 08

Posts: 118

Credit: 22451438

RAC: 0

RE: RE: Let me know your

24 Aug 2012 13:47:13 UTC

Message 110102 in response to message 110101

Quote:

Quote:
Let me know your new times!!

HD6950 2wu's ~3,500 790MHz GPU 1250MHz Mem
HD5850 2wu's ~6,085 (pcie x8 slot) 765MHz GPU 1125MHz Mem
HD5830 1wu ~2,916
AMD A8 3870 APU: 1wu 6,489.60

I just started to crunch on 2 HD5870 GPUs.

CPU (I7-2600) is doing Docking@home now and SETI MB/AstroPulse work.

Doing 1 per_device (GPU), I'll post runtimes+CPU-times.

MAGIC Quantum M...

Joined: 18 Jan 05

Posts: 1702

Credit: 1067702915

RAC: 1272806

Well the UPS truck just

24 Aug 2012 22:09:56 UTC

Message 110103

Well, the UPS truck just delivered my EVGA GTX 660 Ti Superclocked and the 8GB of RAM, but it looks like I have to wait until Monday for the power supply I need before I can install the 660 Ti.

I put the RAM in, and having 12GB instead of just 4GB already makes a difference.

So I hope to have tests and some numbers by Monday night.

JHMarshall

Joined: 24 Jul 12

Posts: 17

Credit: 1018018169

RAC: 0

Run time info for HD7950 -

25 Aug 2012 11:14:37 UTC

Message 110104 in response to message 110101

Run time info for HD7950 - both machines 8GB RAM, v1.28, Cat 12.6, BOINC 7.0.28, Win7 Pro 64bit

System 1: E5300, G41 chipset, 1 wu ~ 1840 secs, CPU ~ 620 Secs
System 2: i5-2500K, H67 Chipset, 1 wu ~ 1145-1160 secs, CPU ~ 255 secs

Both systems are not overclocked. v1.28 is much improved over v1.24! Looks like the CPU/chipset/bandwidth of System 1 severely limits the HD7950. I will probably move that card to another faster system.
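One rough way to put numbers on the gap between the two hosts is to compare the run-time ratio and the share of each task's run time that is CPU time. The sketch below uses only the figures quoted above; taking the midpoint of the 1145-1160 second range for System 2 is an assumption made for illustration, and the dictionary keys and the function name `cpu_share_pct` are hypothetical.

```python
# Figures from the post. System 2's run time is the midpoint of the
# quoted 1145-1160 s range (an assumption for illustration).
systems = {
    "System 1 (E5300, G41)": {"run_secs": 1840.0, "cpu_secs": 620.0},
    "System 2 (i5-2500K, H67)": {"run_secs": (1145.0 + 1160.0) / 2, "cpu_secs": 255.0},
}


def cpu_share_pct(run_secs, cpu_secs):
    """CPU time as a percentage of the task's wall-clock run time."""
    return cpu_secs / run_secs * 100.0


for name, times in systems.items():
    share = cpu_share_pct(times["run_secs"], times["cpu_secs"])
    print(f"{name}: {times['run_secs']:.0f} s per task, CPU share {share:.1f}%")

# How much longer the same card takes on the slower host.
ratio = (systems["System 1 (E5300, G41)"]["run_secs"]
         / systems["System 2 (i5-2500K, H67)"]["run_secs"])
print(f"System 1 takes {ratio:.2f}x as long per task")
```

The ratio alone does not say whether the CPU, the chipset, or the bus is the limiting factor; it only quantifies the difference between the two reported hosts.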

Right now System 1 is dedicated to Einstein and System 2 is dedicated to Milkyway. Looks like I'm shortchanging Einstein, but I'll fix that.

I'm really impressed with the HD7950s. The double precision performance in Milkyway is incredible.

Holmis

Joined: 4 Jan 05

Posts: 1118

Credit: 1055935564

RAC: 0

Yesterday I upgraded my GPU

25 Aug 2012 15:51:24 UTC

Message 110105

Yesterday I upgraded my GPU from a GeForce 9600GT to an EVGA GTX 660Ti; to be precise, the model number is GV-N66TOC-2GD-EU. This is a factory-overclocked card where the core runs at 1033 MHz (boost to 1111 MHz), while the memory clock is left at the default speed of 6008 MHz.
This is paired with a i7 GHz on a PCI-E 3.0 x16 bus.

Link to the host

Here are some early results running BRP4 v1.25 and 8 Einstein CPU-units:

9600 GT one WU at a time ~4430 secs (average over 50 units)
GTX 660Ti one WU at a time ~1700 secs (average over 5 units)
GTX 660Ti two WU at a time ~2900 secs (average over 35 units)
GTX 660Ti three WU at a time ~4500 secs (average over 6 units)

Checking GPU-Z while crunching, the sensor page reports a core clock of 1201.9 MHz and an average load of ~81% when running 2 at a time. I've used Process Lasso to raise the priority of the BRP app to above normal to get a bit more performance; without the raised priority, the load running 2 was ~70%.
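Run times for tasks that share a GPU are easier to compare once they are divided by the number of tasks running at once. A minimal Python sketch using the GTX 660 Ti figures above follows; the helper names are made up for illustration, and the calculation assumes every concurrent task takes about the reported average time.

```python
SECONDS_PER_DAY = 86400

# (tasks running at once, average run time per task in seconds)
gtx_660ti_results = [(1, 1700), (2, 2900), (3, 4500)]


def effective_secs_per_task(concurrent, run_secs):
    """Wall-clock seconds the GPU spends per finished task."""
    return run_secs / concurrent


def tasks_per_day(concurrent, run_secs):
    """Finished tasks per day at the given concurrency."""
    return SECONDS_PER_DAY * concurrent / run_secs


for concurrent, run_secs in gtx_660ti_results:
    eff = effective_secs_per_task(concurrent, run_secs)
    per_day = tasks_per_day(concurrent, run_secs)
    print(f"{concurrent} at a time: {eff:.0f} s per task, {per_day:.1f} tasks/day")

# Concurrency level with the lowest effective time per task.
best = min(gtx_660ti_results, key=lambda r: effective_secs_per_task(*r))
print("best concurrency:", best[0])
```

On these numbers, two at a time comes out ahead (1450 s per task, against 1700 s for one and 1500 s for three), though the three-at-a-time average covers only 6 units.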

Sunny129

Joined: 5 Dec 05

Posts: 162

Credit: 160342159

RAC: 0

RE: 9600 GT one WU at a

25 Aug 2012 18:09:15 UTC

Message 110106 in response to message 110105

Quote:

9600 GT one WU at a time ~4430 secs (average over 50 units)
GTX 660Ti one WU at a time ~1700 secs (average over 5 units)
GTX 660Ti two WU at a time ~2900 secs (average over 35 units)
GTX 660Ti three WU at a time ~4500 secs (average over 6 units)

Checking GPU-Z while crunching, the sensor page reports a core clock of 1201.9 MHz and an average load of ~81% when running 2 at a time. I've used Process Lasso to raise the priority of the BRP app to above normal to get a bit more performance; without the raised priority, the load running 2 was ~70%.

that's interesting to say the least - i have a dual GTX 560 Ti machine that crunches 6 BRP4 tasks in parallel (3 per GPU). this is a Win7 x64 platform w/ a Phenom II X6 1090T CPU, 8GB of DDR3-1600, and a PCIe 2.0 bus. averaged over hundreds of units (again, 3 at a time), the run times are ~5200s. your run times are only ~13.5% shorter than mine. i wonder if that's fairly indicative of the performance increase expected when going from a GTX 560 Ti to a GTX 660 Ti...

Sid

Joined: 17 Oct 10

Posts: 160

Credit: 926329162

RAC: 283657

RE: GTX 660Ti one WU at a

25 Aug 2012 18:45:21 UTC

Message 110107 in response to message 110105

Quote:

GTX 660Ti one WU at a time ~1700 secs (average over 5 units)
GTX 660Ti two WU at a time ~2900 secs (average over 35 units)
GTX 660Ti three WU at a time ~4500 secs (average over 6 units)

Could you please try 6 WUs at a time ?
My time for 560 Ti is ~7750 secs.
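Because the reports in this thread use different numbers of concurrent tasks, a per-task figure makes them easier to line up. The Python sketch below does this for three of the reported results. It is only a rough comparison: the hosts, app versions and priority settings differ or are not stated, and the labels and function names are made up for illustration.

```python
# (label, tasks at a time per GPU, reported run time per task in seconds)
reports = [
    ("GTX 660 Ti, 3 at a time (Holmis)", 3, 4500),
    ("GTX 560 Ti, 3 at a time per GPU (Sunny129)", 3, 5200),
    ("GTX 560 Ti, 6 at a time (Sid)", 6, 7750),
]


def effective_secs_per_task(concurrent, run_secs):
    """Wall-clock seconds one GPU spends per finished task."""
    return run_secs / concurrent


def pct_shorter(reference_secs, other_secs):
    """How much shorter other_secs is than reference_secs, in percent."""
    return (reference_secs - other_secs) / reference_secs * 100.0


for label, concurrent, run_secs in reports:
    eff = effective_secs_per_task(concurrent, run_secs)
    print(f"{label}: {eff:.0f} s per task")

# The ~13.5% figure quoted earlier in the thread: 4500 s vs. 5200 s.
print(f"{pct_shorter(5200, 4500):.1f}% shorter")
```

Taken at face value, the 6-at-a-time 560 Ti report works out to about 1292 s per task, lower than any of the 660 Ti figures reported so far.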

Holmis

Joined: 4 Jan 05

Posts: 1118

Credit: 1055935564

RAC: 0

RE: that's interesting to

25 Aug 2012 19:32:03 UTC

Message 110108 in response to message 110106

Quote:

that's interesting to say the least ...

i wonder if that's fairly indicative of the performance increase expected when going from a GTX 560 Ti to a GTX 660 Ti...

Might be. In some of the reviews I read before purchasing this card, they claimed that the 192-bit memory bus would slow this card down a bit. And if I were to guess, the BRP4 app does a fair bit of memory transfers, both on the card and to and from the main system.

I've run some tasks over at Albert@home, where a new CUDA app is being tested, and that shaved off about 700 secs from the run time and more than halved the CPU time. GPU load on the beta app was over 95% when running 2 at a time. Here's hoping that app gets released over here soon!

I've begun running three at a time for a while and will see how it goes.

Quote:

Could you please try 6 WUs at a time ?

Don't think this card will like 6 at a time, but I plan on increasing the number of parallel tasks over the next few days and will report back in due time.
