GTX 750 ti

archae86
Joined: 6 Dec 05
Posts: 3,150
Credit: 7,115,674,931
RAC: 562,162

While it will be best to see some more repetitions, there are indications that the completion times under this particular test condition are going to be very consistent, so a preliminary comparison may not be premature. I've used eyeball averaging, not formal averaging.
[pre]
Parameter          750 Ti    750
BRP4G seconds      5553      5863
CPU seconds        870       1236
GPU load %         95        98
Mem cont load %    84        90
GPU core clock     1228      1176
GPU mem clock      1375      1253
GPU temp C         44        45
Fan speed %        31        41
Mem Ded max        477       470
Mem Dy Max         77        77
Power avg %TDP     57.4      53.4
PerfCap reason     VOp       VOp
VDDC               1.15      1.137[/pre]

The nGPU Arecibo results for my host may be seen here. For the time being, any result with a "sent" time on March 11 has run under this test condition.

Comments:

1. I was very surprised to see my host generating 98% GPU loading at this condition. I have run a variety of mixed loads (iGPU, CPU, and the three available nGPU applications) and never seen sustained average GPU load anywhere near this high. So apparently the iGPU and CPU work appreciably distracted this modest 2-core host CPU from keeping the nGPU busy at my previous test conditions.
2. It is possible that my 2-core Haswell on a cheapish Asrock Z87 Extreme3 motherboard is nevertheless having better success at keeping the nGPU busy than Alex's one-generation-back but higher-grade Ivy Bridge. In particular, my GTX 750 is sitting in a 16x PCIe 3 slot, with nothing else in any PCIe slot. If the motherboard/CPU is conferring an advantage, then the comparison in this case is too kind to the 750 vs. the Ti.
3. Alternatively, it may be that the extra hardware on the Ti combines with the code for BRP4G to make it more difficult to keep the GPU loaded.
4. I don't know quite what interpretation to give to the higher reported CPU times for my GTX 750 jobs than for Alex's. Both of us are running hyperthreaded CPUs of the same clock rate, with very little other activity besides the BRP4G CPU support job loading the CPU. It seems odd that mine consumes more of this resource.
5. Even among cards with OC badges, I think the different card vendors do not set their clock speeds uniformly, so the results should be taken with some hesitancy as indicating overall 750 vs. 750 Ti performance.

I have ten more BRP4G tasks ready to go, and intend to leave the configuration running this way for most of today. My guess as of now is that the results will be very consistent, in which case nothing should change here save perhaps for typo correction.

I did turn on averaging on some parameters in GPU-Z, so it is possible that among GPU load, Memory controller load, and Power Consumption the comparison between 750 and 750 ti is somewhat off for this reason--but not very much, I think.

On other matters:
Average power consumption at the wall for my host at this test condition is 97.9 watts. Computed daily credit production at this condition is 29,483, for a credit/day/watt score of 301. While this is materially less credit/day than the 35,600 or so which I have seen at the best mixed load conditions, it is actually a slightly better power efficiency of credit production than my best previous condition.
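
The credit/day/watt figure above can be re-derived with a few lines of Python. This is only a sketch of the arithmetic: the function names are made up for illustration, and the break-even wattage is an inference from the numbers quoted, not a measurement (the wall power at the mixed-load condition is not given here).

```python
def credit_per_day_per_watt(credit_per_day: float, wall_watts: float) -> float:
    """Power efficiency of credit production: daily credit divided by wall power."""
    if wall_watts <= 0:
        raise ValueError("wall power must be positive")
    return credit_per_day / wall_watts


def breakeven_watts(credit_per_day: float, efficiency: float) -> float:
    """Wall power at which a given daily credit would match a given efficiency."""
    if efficiency <= 0:
        raise ValueError("efficiency must be positive")
    return credit_per_day / efficiency


# Figures quoted in the post: 29,483 credit/day at 97.9 W at the wall.
score = credit_per_day_per_watt(29483, 97.9)
print(round(score))  # 301

# Hypothetical follow-on: a 35,600 credit/day mixed load is less efficient
# than this condition only if it draws more than this many watts.
print(round(breakeven_watts(35600, score), 1))
```

Dividing by wall power rather than card TDP keeps the score comparable across whole hosts, which is what the post is measuring.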

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 502,961,644
RAC: 3,750

Quote:


Comments:

1. I was very surprised to see my host generating 98% GPU loading at this condition. I have run a variety of mixed loads (iGPU, CPU, and the three available nGPU applications) and never seen sustained average GPU load anywhere near this high. So apparently the iGPU and CPU work appreciably distracted this modest 2-core host CPU from keeping the nGPU busy at my previous test conditions.
2. It is possible that my 2-core Haswell on a cheapish Asrock Z87 Extreme3 motherboard is nevertheless having better success at keeping the nGPU busy than Alex's one-generation-back but higher-grade Ivy Bridge. In particular, my GTX 750 is sitting in a 16x PCIe 3 slot, with nothing else in any PCIe slot. If the motherboard/CPU is conferring an advantage, then the comparison in this case is too kind to the 750 vs. the Ti.
3. Alternatively, it may be that the extra hardware on the Ti combines with the code for BRP4G to make it more difficult to keep the GPU loaded.
4. I don't know quite what interpretation to give to the higher reported CPU times for my GTX 750 jobs than for Alex's. Both of us are running hyperthreaded CPUs of the same clock rate, with very little other activity besides the BRP4G CPU support job loading the CPU. It seems odd that mine consumes more of this resource.
5. Even among cards with OC badges, I think the different card vendors do not set their clock speeds uniformly, so the results should be taken with some hesitancy as indicating overall 750 vs. 750 Ti performance.

I did turn on averaging on some parameters in GPU-Z, so it is possible that among GPU load, Memory controller load, and Power Consumption the comparison between 750 and 750 ti is somewhat off for this reason--but not very much, I think.

On other matters:
Average power consumption at the wall for my host at this test condition is 97.9 watts. Computed daily credit production at this condition is 29,483, for a credit/day/watt score of 301. While this is materially less credit/day than the 35,600 or so which I have seen at the best mixed load conditions, it is actually a slightly better power efficiency of credit production than my best previous condition.

Looks like you're quite right: the performance difference between these cards is ~5%, but the price difference is ~30€.
Given the power consumption, it makes sense to replace the older HD5xxx and eventually the HD6xxx cards with these types.
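
For reference, the ~5% figure can be checked against the BRP4G elapsed times in the first-draft table (5553 s for the 750 Ti, 5863 s for the 750). A minimal sketch, assuming both hosts ran the same number of concurrent tasks; the function names are illustrative only:

```python
def throughput_advantage_percent(slower_seconds: float, faster_seconds: float) -> float:
    """How much more work per unit time the faster card completes, in percent."""
    return (slower_seconds / faster_seconds - 1.0) * 100.0


def time_saved_percent(slower_seconds: float, faster_seconds: float) -> float:
    """How much shorter the faster card's task time is, in percent."""
    return (slower_seconds - faster_seconds) / slower_seconds * 100.0


gtx_750_seconds = 5863     # BRP4G elapsed time, GTX 750
gtx_750_ti_seconds = 5553  # BRP4G elapsed time, GTX 750 Ti

print(f"{throughput_advantage_percent(gtx_750_seconds, gtx_750_ti_seconds):.1f}%")  # 5.6%
print(f"{time_saved_percent(gtx_750_seconds, gtx_750_ti_seconds):.1f}%")            # 5.3%
```

Either way of expressing it lands in the 5 to 6% range.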

What we did not compare is the memory speed, but I'm out of the office and will be back tomorrow evening; I can report it then.

Regarding the higher GPU load: maybe the 128-bit memory interface limits the loading of the extra shaders in some way, which could mean the non-Ti version is the better choice here, since we know Einstein is a memory-bandwidth-intensive project.

I've turned the avg feature on as you did, the results can be seen here:
https://dl.dropboxusercontent.com/u/50246791/gpu-z%202%20avg.PNG

For comparison purposes, to see how closely the detected GFLOPS represent the real crunching power: can you please post the speed detected by BOINC (although we know it is false, but by a constant known factor)?

Let's see how the coming 20nm cards perform ...

Cheers
Alexander

archae86
Joined: 6 Dec 05
Posts: 3,150
Credit: 7,115,674,931
RAC: 562,162

Quote:
I've turned the avg feature on as you did, the results can be seen here:
https://dl.dropboxusercontent.com/u/50246791/gpu-z%202%20avg.PNG


That changes the comparison a little in some parameters, but nothing dramatic. Compared to the values I put for your system in my first-draft comparison table, your averaged data show:
- slightly lower GPU temperature: 42.9 vs. 44
- slightly higher power consumption: 59.0 vs. 57.4

So it does not change the picture appreciably.

Quote:
For comparison purposes, to see how closely the detected GFLOPS represent the real crunching power: can you please post the speed detected by BOINC (although we know it is false, but by a constant known factor)?

I assume you mean the number posted in the event log during BOINC startup. This is currently inaccessible to me as the buffer size has been exceeded since last startup. In the interest of generating a consistent set of results, I'll defer responding to that request for a few hours, but shall do so then.

It is entirely possible that the ti advantage may differ on the Perseus (BRP5) work, and yet again on the Gamma ray pulsar work.

Also, enthusiasts may well find quite a bit of overclocking headroom, which may differ between samples or models systematically, in addition to random variation.

David S
Joined: 6 Dec 05
Posts: 2,473
Credit: 22,936,222
RAC: 0

Alex and archae, thanks for doing this comparison.

If I'm understanding you correctly, the early results appear to show that the extra cash outlay for the Ti version is NOT a good value, if your primary concern is crunching economically.

Heads up: I just found that tigerdirect.com is offering the non-Ti 1 GB superclocked version (EVGA part #...2753...) with a $10 rebate that brings it down to the same price as the non-superclocked (...2751...) card. The rebate is offered until March 31.

David

Miserable old git
Patiently waiting for the asteroid with my name on it.

archae86
Joined: 6 Dec 05
Posts: 3,150
Credit: 7,115,674,931
RAC: 562,162

N9JFE wrote:
...the early results appear to show that the extra cash outlay for the Ti version is NOT a good value, if your primary concern is crunching economically.

I'd say that goes well beyond the demonstrated outcome.

I do continue to think that for the neophyte who has never done GPU crunching and is toying with the idea of putting a toe in, or for the old hand whose existing card is generations behind and would like to move ahead on a low budget, especially if either is paying for their own electric power, or is minded to consider power cost no matter who pays, it is a live and pretty attractive option.

More serious crunchers seem likely to want to wait a bit for higher-end Maxwell parts, which may not deliver more crunch per graphics card dollar, but are quite likely to deliver more crunch per system dollar (unless you are as talented as Gary Roberts at acquiring cheap systems). Of course, there is a bit of worry as to whether memory or I/O bottlenecks may starve them when running current Einstein code...

Also there is the worry of whether something one cares about needs more than the 1 Gbyte of RAM on the base cards. But today, for the Perseus and Arecibo work of main interest here, that is not a problem at all.

I'm still quite happy with my choice, but not triumphalist at all.

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 502,961,644
RAC: 3,750

I've reset the configuration to normal, so the incoming WUs are no longer usable for comparison purposes.
I also updated BM to 7.3.11. The card is now reported as:
11.03.2014 22:47:42 | | CUDA: NVIDIA GPU 0: GeForce GTX 750 Ti (driver version 335.23, CUDA version 6.0, compute capability 5.0, 2048MB, 1948MB available, 1489 GFLOPS peak)

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2,142
Credit: 2,823,479,672
RAC: 1,011,851

Quote:

Alex and archae, thanks for doing this comparison.

If I'm understanding you correctly, the early results appear to show that the extra cash outlay for the Ti version is NOT a good value, if your primary concern is crunching economically.

Heads up: I just found that tigerdirect.com is offering the non-Ti 1 GB superclocked version (EVGA part #...2753...) with a $10 rebate that brings it down to the same price as the non-superclocked (...2751...) card. The rebate is offered until March 31.


Remember that your extra outlay is paying, not just for speed, but for an extra GB of memory too.

Over the 3-5 year expected lifetime of a card like this, my betting is that more projects will expand to fill the 2 GB offered - GPUGrid are already doing this. Since I doubt the VRAM is field-upgradeable, you may come to feel that the smaller card was a false economy.

archae86
Joined: 6 Dec 05
Posts: 3,150
Credit: 7,115,674,931
RAC: 562,162

Alex wrote:
For comparison purposes, to see how closely the detected GFLOPS represent the real crunching power: can you please post the speed detected by BOINC (although we know it is false, but by a constant known factor)?

I stopped the last pair--so my test set ends now (as predicted, there was very little variation observed at this test condition).

Here from the startup messages:

[pre]CUDA: NVIDIA GPU 0: GeForce GTX 750 (driver version 334.89, CUDA version 6.0, compute capability 5.0, 1024MB, 931MB available, 1666 GFLOPS peak)
OpenCL: NVIDIA GPU 0: GeForce GTX 750 (driver version 334.89, device version OpenCL 1.1 CUDA, 1024MB, 931MB available, 1666 GFLOPS peak)
OpenCL: Intel GPU 0: Intel(R) HD Graphics 4400 (driver version 10.18.10.3412, device version OpenCL 1.2, 1496MB, 1496MB available, 110 GFLOPS peak)
OpenCL CPU: Intel(R) Core(TM) i3-4130 CPU @ 3.40GHz (OpenCL driver vendor: Intel(R) Corporation, driver version 3.0.1.10878, device version OpenCL 1.2 (Build 76413))
Processor: 4 GenuineIntel Intel(R) Core(TM) i3-4130 CPU @ 3.40GHz [Family 6 Model 60 Stepping 3]
Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes syscall nx lm vmx tm2 pbe
[/pre]

A.M.
Joined: 14 Jun 06
Posts: 15
Credit: 66,121,829
RAC: 0

Quote:
Quote:
For comparison purposes, to see how closely the detected GFLOPS represent the real crunching power: can you please post the speed detected by BOINC (although we know it is false, but by a constant known factor)?
I assume you mean the number posted in the event log during BOINC startup. This is currently inaccessible to me as the buffer size has been exceeded since last startup.

For future reference, the entire Event Log can be found in the file BOINC/stdoutdae.txt.
Also, stdoutdae.old contains the previous edition of the log.
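
If you would rather not scroll through the file by hand, something like the following could pull the "GFLOPS peak" figures out of it. This is a rough sketch, not a BOINC tool: the regular expression is written against the startup lines posted in this thread, the function names are made up, and the location of the BOINC data directory on your machine is an assumption you will need to fill in.

```python
import re
from pathlib import Path

# Matches startup lines of the form seen in this thread, e.g.
#   CUDA: NVIDIA GPU 0: GeForce GTX 750 (driver version ..., 1666 GFLOPS peak)
GPU_LINE = re.compile(
    r"(?P<api>CUDA|OpenCL): (?P<vendor>\w+) GPU (?P<index>\d+): "
    r"(?P<name>.+?) \(.*?(?P<gflops>\d+) GFLOPS peak\)"
)


def find_peak_gflops(log_text: str) -> list[tuple[str, str, int]]:
    """Return (api, device name, GFLOPS peak) for each GPU startup line found."""
    found = []
    for line in log_text.splitlines():
        m = GPU_LINE.search(line)
        if m:
            found.append((m["api"], m["name"], int(m["gflops"])))
    return found


def read_event_log(path: str) -> str:
    """Read a log file such as stdoutdae.txt; the path is supplied by the caller."""
    return Path(path).read_text(errors="replace")


# Sample lines copied from the posts above.
sample = """\
11.03.2014 22:47:42 | | CUDA: NVIDIA GPU 0: GeForce GTX 750 Ti (driver version 335.23, CUDA version 6.0, compute capability 5.0, 2048MB, 1948MB available, 1489 GFLOPS peak)
CUDA: NVIDIA GPU 0: GeForce GTX 750 (driver version 334.89, CUDA version 6.0, compute capability 5.0, 1024MB, 931MB available, 1666 GFLOPS peak)
OpenCL: Intel GPU 0: Intel(R) HD Graphics 4400 (driver version 10.18.10.3412, device version OpenCL 1.2, 1496MB, 1496MB available, 110 GFLOPS peak)
"""
for api, name, gflops in find_peak_gflops(sample):
    print(f"{api:6} {name:28} {gflops} GFLOPS peak")
```

To run it against a real log, pass the full path of stdoutdae.txt in your own BOINC data directory to `read_event_log` and feed the result to `find_peak_gflops`.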

Anonymous

GeForce Driver 335.23 available: enhanced GPU clock offset options for GTX 750Ti and GTX 750.

[EDIT] had not noticed Alex's post above.
