CUDA and OpenCL Benchmarks

MAGIC Quantum Mechanic
Joined: 18 Jan 05
Posts: 1,305
Credit: 421,284,802
RAC: 104,022

And here is another review/test I read late last night.

EVGA GTX 660 Ti Superclocked

Petrion
Joined: 30 Apr 08
Posts: 53
Credit: 1,243,186
RAC: 0

For those who don't know, they released the new improved OpenCL app version for Einstein after testing on Albert http://albert.phys.uwm.edu/forum_thread.php?id=8912&nowrap=true#112191

From my own experience, run times dropped by almost 45% (from 3,800 to 2,100 seconds) compared with the previous version. I'll be updating the table with the new times as they come in, and will look at listing the old times alongside the new for comparison's sake.

Let me know your new times!!
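
The 3,800 to 2,100 second drop quoted above can be expressed either as a run time reduction or as a throughput gain, and the two numbers differ noticeably. A minimal sketch of the arithmetic follows; the function names are made up for illustration and are not part of any BOINC tool.

```python
def runtime_reduction(old_secs: float, new_secs: float) -> float:
    """Fraction by which the run time per task dropped."""
    return (old_secs - new_secs) / old_secs


def throughput_gain(old_secs: float, new_secs: float) -> float:
    """Fractional increase in tasks completed per unit of time."""
    return old_secs / new_secs - 1.0


# Figures taken from the post above.
old, new = 3800, 2100
print(f"run time reduction: {runtime_reduction(old, new):.1%}")
print(f"throughput gain:    {throughput_gain(old, new):.1%}")
```

When posting new times, it helps to say which of the two figures is meant.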

Alex
Joined: 1 Mar 05
Posts: 449
Credit: 339,174,036
RAC: 10,788

Quote:
Let me know your new times!!

HD6950, 2 WUs: ~3,500 secs (790 MHz GPU, 1250 MHz mem)
HD5850, 2 WUs: ~6,085 secs (PCIe x8 slot; 765 MHz GPU, 1125 MHz mem)
HD5830, 1 WU: ~2,916 secs
AMD A8 3870 APU, 1 WU: 6,489.60 secs

Fred J. Verster
Joined: 27 Apr 08
Posts: 118
Credit: 22,451,438
RAC: 0

Quote:
Quote:
Let me know your new times!!

HD6950, 2 WUs: ~3,500 secs (790 MHz GPU, 1250 MHz mem)
HD5850, 2 WUs: ~6,085 secs (PCIe x8 slot; 765 MHz GPU, 1125 MHz mem)
HD5830, 1 WU: ~2,916 secs
AMD A8 3870 APU, 1 WU: 6,489.60 secs

I just started to crunch on 2 HD5870 GPUs.

The CPU (i7-2600) is doing Docking@home now, plus SETI MB/AstroPulse work.

I'm running 1 task per device (GPU) and will post run times and CPU times.

MAGIC Quantum Mechanic
Joined: 18 Jan 05
Posts: 1,305
Credit: 421,284,802
RAC: 104,022

Well, the UPS truck just delivered my EVGA GTX 660 Ti Superclocked and the 8GB of RAM, but it looks like I have to wait until Monday for the power supply I need before I can install the 660 Ti.

I put the RAM in, and having 12GB instead of just 4GB already makes a difference.

So I hope to have tests and some numbers by Monday night.

JHMarshall
Joined: 24 Jul 12
Posts: 14
Credit: 884,150,859
RAC: 5,183,092

Run time info for HD7950 - both machines 8GB RAM, v1.28, Cat 12.6, BOINC 7.0.28, Win7 Pro 64bit

System 1: E5300, G41 chipset, 1 wu ~ 1840 secs, CPU ~ 620 Secs
System 2: i5-2500K, H67 Chipset, 1 wu ~ 1145-1160 secs, CPU ~ 255 secs

Both systems are not overclocked. v1.28 is much improved over v1.24! Looks like the CPU/chipset/bandwidth of System 1 severely limits the HD7950. I will probably move that card to another faster system.

Right now System 1 is dedicated to Einstein and System 2 is dedicated to Milkyway. Looks like I'm shortchanging Einstein, but I'll fix that.

I'm really impressed with the HD7950s. The double precision performance in Milkyway is incredible.
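
For anyone wanting to put a number on how much System 1 holds the HD7950 back, here is a rough sketch using only the figures from this post. The midpoint of the 1145-1160 sec range is an assumption made for illustration, and the helper names are hypothetical.

```python
# Run time and CPU time per work unit, as reported above (seconds).
systems = {
    "System 1 (E5300, G41)": {"run_secs": 1840.0, "cpu_secs": 620.0},
    # Assumption: use the midpoint of the reported 1145-1160 sec range.
    "System 2 (i5-2500K, H67)": {"run_secs": (1145 + 1160) / 2, "cpu_secs": 255.0},
}


def cpu_share(run_secs: float, cpu_secs: float) -> float:
    """Fraction of the wall-clock run time that was also CPU time."""
    return cpu_secs / run_secs


def slowdown(slow_secs: float, fast_secs: float) -> float:
    """How many times longer the slower host takes per work unit."""
    return slow_secs / fast_secs


for name, t in systems.items():
    print(f"{name}: CPU share {cpu_share(t['run_secs'], t['cpu_secs']):.0%}")

s1 = systems["System 1 (E5300, G41)"]["run_secs"]
s2 = systems["System 2 (i5-2500K, H67)"]["run_secs"]
print(f"System 1 takes {slowdown(s1, s2):.2f}x as long per work unit")
```

The same card takes roughly 1.6x as long on the older host, and a larger share of that time is CPU time, which fits the CPU/chipset/bandwidth explanation but does not prove which of the three is the limit.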

Holmis
Joined: 4 Jan 05
Posts: 1,118
Credit: 795,829,813
RAC: 161,023

Yesterday I upgraded my GPU from a GeForce 9600GT to an EVGA GTX 660Ti; to be precise, the model number is GV-N66TOC-2GD-EU. This is a factory-overclocked card where the core runs at 1033 MHz (boost to 1111 MHz), while the memory clock is left at the default speed of 6008 MHz.
This is paired with an i7 CPU on a PCI-E 3.0 x16 bus.

Link to the host

Here are some early results running BRP4 v1.25 and 8 Einstein CPU-units:

9600 GT one WU at a time ~4430 secs (average over 50 units)
GTX 660Ti one WU at a time ~1700 secs (average over 5 units)
GTX 660Ti two WU at a time ~2900 secs (average over 35 units)
GTX 660Ti three WU at a time ~4500 secs (average over 6 units)

Checking GPU-Z while crunching, the sensor page claims that the core clock is 1201.9 MHz and that the average load running 2 at a time is ~81%. I've used Process Lasso to raise the priority of the BRP app to above normal to get a bit more performance; without raising the priority, the load running 2 was ~70%.
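
Since the run times above are per batch, it may help to convert them into an effective time per task and a daily throughput. This is a small sketch using only the GTX 660 Ti numbers from this post; the helper names are hypothetical.

```python
# Batch run times reported above for the GTX 660 Ti (seconds),
# keyed by the number of BRP4 tasks run in parallel.
reported = {1: 1700, 2: 2900, 3: 4500}

SECS_PER_DAY = 86_400


def effective_secs_per_task(concurrent: int, batch_secs: float) -> float:
    """Wall-clock seconds per finished task when running several at once."""
    return batch_secs / concurrent


def tasks_per_day(concurrent: int, batch_secs: float) -> float:
    """Tasks finished per day at the given concurrency."""
    return concurrent * SECS_PER_DAY / batch_secs


for n, secs in reported.items():
    print(f"{n} at a time: {effective_secs_per_task(n, secs):.0f} s/task, "
          f"{tasks_per_day(n, secs):.1f} tasks/day")

best = min(reported, key=lambda n: effective_secs_per_task(n, reported[n]))
print(f"lowest effective time per task: {best} at a time")
```

On these averages, two at a time comes out slightly ahead of three, but the sample sizes differ (5, 35 and 6 units), so the gap between two and three could easily be noise.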

Sunny129
Joined: 5 Dec 05
Posts: 162
Credit: 160,342,159
RAC: 0

Quote:

9600 GT one WU at a time ~4430 secs (average over 50 units)
GTX 660Ti one WU at a time ~1700 secs (average over 5 units)
GTX 660Ti two WU at a time ~2900 secs (average over 35 units)
GTX 660Ti three WU at a time ~4500 secs (average over 6 units)

Checking GPU-Z while crunching, the sensor page claims that the core clock is 1201.9 MHz and that the average load running 2 at a time is ~81%. I've used Process Lasso to raise the priority of the BRP app to above normal to get a bit more performance; without raising the priority, the load running 2 was ~70%.


that's interesting to say the least - i have a dual GTX 560 Ti machine that crunches 6 BRP4 tasks in parallel (3 per GPU). this is a Win7 x64 platform w/ a Phenom II X6 1090T CPU, 8GB of DDR3-1600, and a PCIe 2.0 bus. averaged over hundreds of units (again, 3 at a time per GPU), the run times are ~5200s. your run times at 3 at a time are only ~13.5% shorter than mine. i wonder if that's fairly indicative of the performance increase expected when going from a GTX 560 Ti to a GTX 660 Ti...

Sid
Joined: 17 Oct 10
Posts: 145
Credit: 477,584,215
RAC: 303,710

Quote:


GTX 660Ti one WU at a time ~1700 secs (average over 5 units)
GTX 660Ti two WU at a time ~2900 secs (average over 35 units)
GTX 660Ti three WU at a time ~4500 secs (average over 6 units)


Could you please try 6 WUs at a time?
My time for 560 Ti is ~7750 secs.
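
To put the figures posted so far on a common footing, here is a rough sketch that divides each batch time by the number of tasks running on one GPU. It assumes the ~7750 secs is for 6 tasks on a single 560 Ti. The hosts, app versions and CPU loads all differ, so treat the ranking as indicative only.

```python
# (label, tasks in parallel per GPU, batch run time in seconds)
# Values are taken from posts in this thread; the "6 per GPU" reading
# of the ~7750 sec figure is an assumption.
reports = [
    ("GTX 560 Ti, 3 at a time (sunny129)", 3, 5200),
    ("GTX 560 Ti, 6 at a time (Sid)", 6, 7750),
    ("GTX 660 Ti, 3 at a time (Holmis)", 3, 4500),
]


def secs_per_task(concurrent: int, batch_secs: float) -> float:
    """Effective wall-clock seconds per finished task on one GPU."""
    return batch_secs / concurrent


ranked = sorted(reports, key=lambda r: secs_per_task(r[1], r[2]))
for label, n, secs in ranked:
    print(f"{label}: {secs_per_task(n, secs):.0f} s/task")
```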

Holmis
Joined: 4 Jan 05
Posts: 1,118
Credit: 795,829,813
RAC: 161,023

Quote:

that's interesting to say the least ...

i wonder if that's fairly indicative of the performance increase expected when going from a GTX 560 Ti to a GTX 660 Ti...


Might be. Some of the reviews I read before purchasing this card claimed that the 192-bit memory bus would slow it down a bit. And if I were to guess, the BRP4 app does a fair amount of memory transfer, both on the card and to and from main system memory.

I've run some tasks over at Albert@home, where a new CUDA app is being tested; it shaved about 700 secs off the run time and more than halved the CPU time. GPU load with the beta app was over 95% when running 2 at a time. Here's hoping that app gets released over here soon!

I've begun running three at a time for a while and will see how it goes.

Quote:
Could you please try 6 WUs at a time?


I don't think this card will like 6 at a time, but I plan on increasing the number of parallel tasks over the next few days and will report back in due time.
