Binary Radio Pulsar Search (Perseus Arm Survey) "BRP5"

Sunny129
Joined: 5 Dec 05
Posts: 162
Credit: 160342159
RAC: 0

RE: I have replaced my GTX

Quote:
I have replaced my GTX 560Ti with a GTX 760 now. Unfortunately there are almost no improvements for BRP5 WUs. The run time is about the same :-(


Tell me about it... I made the same upgrade (GTX 560 Ti --> GTX 670) and noticed only a minimal improvement in run times as well.

Now take into account that a GTX 560 Ti is capable of 1263.4 GFLOPS, while the GTX 670 is capable of 2460 GFLOPS - almost double the FP32 (single-precision) throughput of the former. Granted, I understand these are maximum theoretical throughput figures and not representative of the everyday performance we'll see from our GPUs. But even if our GPUs come nowhere near their theoretical peaks, a GTX 670 should still be substantially more powerful than a GTX 560 Ti. So not only is the Kepler architecture (GTX 6xx series) no better than the Fermi architecture (GTX 5xx series) at Einstein@Home, Kepler is actually substantially worse clock for clock than Fermi here. Of course, you can't really blame the GPU architecture... ultimately it comes down to the Einstein@Home code not being as well optimized for Kepler as it is for Fermi.
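
For what it's worth, here is roughly where those GFLOPS figures come from - a minimal sketch, assuming the usual peak-throughput estimate of 2 FLOPs (one fused multiply-add) per CUDA core per shader clock, using reference (non-overclocked) clocks:

# Theoretical peak FP32 throughput: 2 FLOPs (one FMA) per CUDA core per shader clock.
# Reference specs; factory-overclocked cards will differ.
cards = {
    "GTX 560 Ti": {"cores": 384,  "shader_clock_ghz": 1.645},  # Fermi: shaders at 2x core clock
    "GTX 670":    {"cores": 1344, "shader_clock_ghz": 0.915},  # Kepler: shaders at core clock
    "GTX 580":    {"cores": 512,  "shader_clock_ghz": 1.544},
}
for name, spec in cards.items():
    gflops = 2 * spec["cores"] * spec["shader_clock_ghz"]
    print(f"{name}: ~{gflops:.0f} GFLOPS peak FP32")
# Prints roughly 1263, 2460 and 1581 GFLOPS respectively - so on paper the GTX 670
# should be nearly twice the card the GTX 560 Ti is.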

I ended up selling the GTX 670 and "downgrading" to a GTX 580 (I actually have three of them now), which blows the GTX 670 away in terms of Einstein@Home performance... even if it consumes slightly more power than a GTX 670.

Khangollo
Joined: 17 Feb 11
Posts: 42
Credit: 928047659
RAC: 0

RE: GTX560Ti: 9900

Quote:
GTX560Ti: 9900 sec
GTX760: 9500 sec
(both with 1 WU at a time)


For better performance with the current BRP applications on GPUs with many CUDA cores, you really need to run more than one WU at a time (2 or 3).
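
In case anyone is unsure how to set that up: one way (assuming BOINC 7.0.40 or newer) is an app_config.xml in the Einstein@Home project folder. This is only a minimal sketch - the app name and project folder below are my assumptions, so verify the exact <name> against your client_state.xml or the task list before relying on it:

<!-- app_config.xml - goes in the BOINC data directory under projects/einstein.phys.uwm.edu/ -->
<!-- gpu_usage of 0.5 runs 2 WUs per GPU; use 0.33 for 3. -->
<!-- The app name here is an assumption - check client_state.xml for the real one. -->
<app_config>
  <app>
    <name>einsteinbinary_BRP5</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.2</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

After saving it, re-read the config files from the BOINC Manager's Advanced menu (or just restart the client) for it to take effect.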


Sunny129
Joined: 5 Dec 05
Posts: 162
Credit: 160342159
RAC: 0

Just to be clear, I was

Just to be clear, I was running 3 simultaneously on each of my GTX 560 Ti's when I had them, and I was running 4 simultaneously on the GTX 670 when I had it. The GTX 670 still only performed marginally better than the GTX 560 Ti's did.

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 507044931
RAC: 111931

If you compare the data

If you compare the data sheets you will find that the GTX 5xx series runs its shaders at twice the core clock (the internal "hot clock"), while the GTX 6xx series does not double the shader clock. So as a very basic calculation: a 6xx card needs twice the cores to match the per-clock performance of a 5xx card.
The GTX 7xx is a fine-tuned GTX 6xx series.

The advantage of the new cards is lower power consumption.
Since CUDA development continues, you cannot be sure that the older cards will be usable forever. It's much like the situation with the AMD cards: the AMD HD 4xxx series is no longer supported here.

There are threads here where the devs explain why they are still using an older CUDA version, but newer versions will come for sure.

So keep the new cards and you are on the safe side.

Alex

Sunny129
Joined: 5 Dec 05
Posts: 162
Credit: 160342159
RAC: 0

RE: If you compare the data

Quote:
If you compare the data sheets you will find that the GTX 5xx series runs its shaders at twice the core clock (the internal "hot clock"), while the GTX 6xx series does not double the shader clock. So as a very basic calculation: a 6xx card needs twice the cores to match the per-clock performance of a 5xx card.


This is exactly why I spoke in terms of GFLOPS only... because you can't compare the Kepler and Fermi architectures on core count or shader clock alone - both must be taken into account for an apples-to-apples comparison (GFLOPS vs. GFLOPS), where GFLOPS = 2 x shader clock x core count (the factor of 2 counting a fused multiply-add as two operations).

My point is that the GTX 670's shader clock (or internal core clock... whatever you want to call it) is more than half the GTX 560 Ti's, AND it has 3.5 times as many cores as the GTX 560 Ti - hence the GTX 670 having roughly twice the theoretical FP32 (single-precision) throughput of the GTX 560 Ti. So the reason the GTX 670 isn't substantially faster at Einstein@Home comes down to a lack of optimization for the Kepler architecture compared with the older Fermi architecture.

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 507044931
RAC: 111931

RE: this is exactly the

Quote:


This is exactly why I spoke in terms of GFLOPS only... because you can't compare the Kepler and Fermi architectures on core count or shader clock alone - both must be taken into account for an apples-to-apples comparison (GFLOPS vs. GFLOPS), where GFLOPS = 2 x shader clock x core count (the factor of 2 counting a fused multiply-add as two operations).

My point is that the GTX 670's shader clock (or internal core clock... whatever you want to call it) is more than half the GTX 560 Ti's, AND it has 3.5 times as many cores as the GTX 560 Ti - hence the GTX 670 having roughly twice the theoretical FP32 (single-precision) throughput of the GTX 560 Ti. So the reason the GTX 670 isn't substantially faster at Einstein@Home comes down to a lack of optimization for the Kepler architecture compared with the older Fermi architecture.

You are right, but as long as we only have a CUDA 3.2 app, my calculation is much closer to reality.
It might be very different, and closer to your calculation, once the CUDA 5.x apps become available.

Beyond
Joined: 28 Feb 05
Posts: 121
Credit: 2335296212
RAC: 5281262

RE: RE: I have replaced

Quote:
Quote:
I have replaced my GTX 560Ti with a GTX 760 now. Unfortunately there are almost no improvements for BRP5 WUs. The run time is about the same :-(

Tell me about it... I made the same upgrade (GTX 560 Ti --> GTX 670) and noticed only a minimal improvement in run times as well.

Now take into account that a GTX 560 Ti is capable of 1263.4 GFLOPS, while the GTX 670 is capable of 2460 GFLOPS - almost double the FP32 (single-precision) throughput of the former.


If you did the same comparison at GPUGrid, you'd find the GTX 670 to be almost twice as fast as the 560 Ti, but that's with CUDA 4.2.

Sunny129
Joined: 5 Dec 05
Posts: 162
Credit: 160342159
RAC: 0

RE: If you did the same

Quote:
If you did the same comparison at GPUGrid, you'd find the GTX 670 to be almost twice as fast as the 560 Ti, but that's with CUDA 4.2.


Exactly - it's an optimization issue. And your point lends further credence to what I said earlier - Einstein@Home just isn't optimized as well for Kepler as it is for Fermi. If it were (i.e. if Einstein@Home were taking advantage of CUDA 4.2, and not just CUDA 3.2), then we would actually see a substantial difference in performance between Kepler and Fermi GPUs on this project. But until then...

Mumak
Joined: 26 Feb 13
Posts: 325
Credit: 3517801924
RAC: 1642004

This has already been

This has already been discussed a few posts above:
http://einsteinathome.org/node/196873&nowrap=true#123837
It seems we're out of luck...


MAGIC Quantum Mechanic
Joined: 18 Jan 05
Posts: 1886
Credit: 1405501285
RAC: 1168971

I have several different

I have several different GeForce cards, but the 660 Ti runs these tasks the fastest for me, and the 550 Ti is faster than the 650 Ti.

But I've OC'd all of them as much as I can while running 2X (and all of my processors are a couple of years old, so none of them are OC'd).
