GTX 750 Ti

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 508870244
RAC: 75806
Topic 197438

Hi,
I just installed my GTX 750 Ti.

GPU-Z 0.7.7 reports:
GPU clock 1228 MHz
Mem clock 1375 MHz
temp 43 deg C
fan speed 31%
GPU load 86 - 93 % running 1 Arecibo + 1 Perseus
mem contr. load 68 - 75 %
mem usage dedicated 389 MB
mem usage dynamic 77 MB
Power consumption 42 - 60 % TDP

Manufacturer: Gainward, left at stock clocks.
It is this PC: http://einsteinathome.org/host/6801076/tasks
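
For anyone who wants to reproduce the 1 Arecibo + 1 Perseus mix: the usual way to run two concurrent tasks on one GPU is an app_config.xml file in the Einstein@Home project directory. A minimal sketch, assuming the application names einsteinbinary_BRP4G (Arecibo) and einsteinbinary_BRP5 (Perseus) -- check the names your client actually reports before copying:

    <app_config>
      <app>
        <name>einsteinbinary_BRP4G</name>
        <gpu_versions>
          <gpu_usage>0.5</gpu_usage> <!-- each task claims half the GPU, so two run at once -->
          <cpu_usage>0.2</cpu_usage> <!-- matches the 0.2 CPUs per GPU task reported later -->
        </gpu_versions>
      </app>
      <app>
        <name>einsteinbinary_BRP5</name>
        <gpu_versions>
          <gpu_usage>0.5</gpu_usage>
          <cpu_usage>0.2</cpu_usage>
        </gpu_versions>
      </app>
    </app_config>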

Mumak
Joined: 26 Feb 13
Posts: 335
Credit: 3552501787
RAC: 1250155

GTX 750 Ti

Thanks for letting us know. Can you please try to run only one BRP4 or BRP5 task and post how long it takes?
Some users have reported that when running BRP4+BRP5 tasks concurrently, the resulting times were unexpectedly higher.

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 508870244
RAC: 75806

RE: Can you please try to

Quote:
Can you please try to run only one BRP4 or BRP5 task and post how long it takes?
Some users have reported that when running BRP4+BRP5 tasks concurrently, the resulting times were unexpectedly higher.

I've set up the PC to run a single GPU task only, but it might take until tomorrow afternoon to clear the work buffer.

I'll let you know the results.

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 508870244
RAC: 75806

RE: Can you please try to

Quote:
Can you please try to run only one BRP4 or BRP5 task and post how long it takes?

A single Perseus WU takes 10422 to 10547 sec;
a single Arecibo WU takes 3438 to 3895 sec.

All in all, 6.7 CPUs of the i7 are in use: 6 plain CPU WUs, one Arecibo-Nvidia task which takes 0.2 CPUs, and one Arecibo-Intel task with 0.5 CPUs.

HTH.
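
A quick back-of-the-envelope on those figures -- a Python sketch using the midpoints of the reported ranges:

    # Daily single-task throughput from the run times reported above.
    perseus_s = (10422 + 10547) / 2   # ~10485 s per Perseus (BRP5) task
    arecibo_s = (3438 + 3895) / 2     # ~3667 s per Arecibo task

    day = 24 * 3600
    print(f"Perseus: {day / perseus_s:.1f} tasks/day")   # ~8.2
    print(f"Arecibo: {day / arecibo_s:.1f} tasks/day")   # ~23.6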

MarkJ
Joined: 28 Feb 08
Posts: 437
Credit: 139002861
RAC: 0

Some of the other users have

Some of the other users have reported slower run times for AMD/Nvidia GPU work when they also use the Intel GPU. The suggested approach seems to be to disable Intel GPU processing. You may want to try with and without the Intel GPU to see if/how it affects your host.
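
For anyone wanting to test this, one way to take the iGPU out of the equation is the client configuration file -- a sketch, assuming the Intel GPU shows up as device 0 in the client's startup messages (cc_config.xml in the BOINC data directory; restart the client afterwards):

    <cc_config>
      <options>
        <!-- Don't use Intel GPU device 0 for any project. -->
        <ignore_intel_dev>0</ignore_intel_dev>
      </options>
    </cc_config>

The project's web preferences also offer a checkbox to disable Intel GPU use, which is the simpler route if the iGPU isn't needed for other projects.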

Mumak
Joined: 26 Feb 13
Posts: 335
Credit: 3552501787
RAC: 1250155

Thanks for the results. The

Thanks for the results. The times are impressive for such a small beast!
My GF 660 Ti does a BRP4G in ~2800 s.
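
To put "small beast" into numbers, here is a rough performance-per-watt comparison as a Python sketch; the 60 W and 150 W figures are the cards' spec TDPs rather than measured draw, and both run times are treated as single-task Arecibo/BRP4G times:

    # Seconds per task (from this thread) and spec TDPs for both cards.
    t_750ti, tdp_750ti = 3667.0, 60.0    # midpoint of the 3438-3895 s range above
    t_660ti, tdp_660ti = 2800.0, 150.0   # Mumak's ~2800 s

    def tasks_per_day_per_watt(t_s, tdp_w):
        return 86400 / t_s / tdp_w

    ratio = tasks_per_day_per_watt(t_750ti, tdp_750ti) / tasks_per_day_per_watt(t_660ti, tdp_660ti)
    print(f"750 Ti: ~{ratio:.1f}x the tasks/day/W of the 660 Ti")   # ~1.9x

So the 660 Ti is still faster per task, but the 750 Ti does roughly twice the work per watt.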

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 581798152
RAC: 139027

RE: Thanks for the results.

Quote:
Thanks for the results. The times are impressive for such a small beast!
My GF 660 Ti does a BRP4G in ~2800 s.


Indeed - this bodes very well for the higher-end Maxwells! Memory controller utilization is pretty high... but then, so is performance. This means you should see solid gains from a memory OC (as well as from the obvious core OC), and that configurations with relatively low memory bandwidth (like the ones OEMs like to sell) should be avoided. This is already true today, but it probably becomes even more severe with Maxwell.

MrS

Scanning for our furry friends since Jan 2002
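
MrS's bandwidth point can be quantified from the GPU-Z readings at the top of the thread. A sketch; it assumes GDDR5's four transfers per command clock and the 750 Ti's 128-bit memory bus:

    # Peak memory bandwidth from the GPU-Z-reported memory clock.
    mem_clock_mhz = 1375                   # command clock from GPU-Z
    effective_mt_s = mem_clock_mhz * 4     # GDDR5 is quad-pumped -> 5500 MT/s
    bus_bytes = 128 // 8                   # 128-bit bus moves 16 bytes per transfer
    bandwidth_gb_s = effective_mt_s * 1e6 * bus_bytes / 1e9
    print(f"~{bandwidth_gb_s:.0f} GB/s peak")   # ~88 GB/s

With the memory controller already at 68-75% load, a memory overclock raises that ceiling linearly, which is why it should pay off here.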

archae86
Joined: 6 Dec 05
Posts: 3160
Credit: 7261075247
RAC: 1544382

Alex. As I have posted in

Alex, as I have posted in another thread, I just brought up a new host with a base-model GTX 750 card.

I think some people here are interested in a comparison of the base GTX 750 against the Ti model.

Aside from the added hardware, yours is running at somewhat faster clocks than mine (GPU clock 1228 vs. 1176 MHz, memory clock 1375 vs. 1253 MHz).

We are running different flavors of Windows 7, though that is possibly unimportant. More significantly, your CPU is an i7 Ivy Bridge quad stated to be 3.4 GHz, while mine is an i3 Haswell dual also stated to be 3.4 GHz, with chipset differences as well.

If you are interested, I'd be willing to devote some run time to agreed trial conditions intended to focus on the card differences while attempting to make host differences less important.

As a guess, perhaps this might best be done by stopping pure CPU BOINC work, and stopping on-chip Intel GPU work as well. In the interest of reproducibility in a short test, I also think we should avoid mixed work, and the dreaded GRP3 jobs.

Restricting things to a single GPU job running at a time would keep things at their simplest--but it would be unrepresentative of how real users would likely run things, and given some of the contention and memory access issues, it might misrepresent the likely relative merits in real service.

The BRP4G work has the advantage of shorter run times, so less time is lost in a non-optimal configuration. However, I have only very recently standardized on Perseus work.

So, if you are interested and agree, my suggested comparison configuration is:

1. zero CPU jobs
2. zero iGPU jobs
3. two-up BRP4G Nvidia GPU jobs
4. no intervention in the standard settings of GPU, CPU, or RAM (i.e. no overclocking)
5. no adjustment to Windows default behavior regarding CPU affinity or priority
6. normal operating configuration (i.e. connected to the internet), but avoiding interactive user activity during the test
7. report the same-version GPU-Z average GPU load and memory controller load, averaged over a substantial portion of the test run, plus the other major GPU-Z parameters

This is meant as an opening bid, not an ultimatum--it is subject to your being interested, and to you possibly suggesting different test configuration details.

I'm assuming your Ti is appreciably better--the question is "how much--for this particular application?"

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 508870244
RAC: 75806

RE: So, if you are

Quote:

So, if you are interested and agree, my suggested comparison configuration is:

1. zero CPU jobs
2. zero iGPU jobs
3. two-up BRP4G Nvidia GPU jobs
4. no intervention in the standard settings of GPU, CPU, or RAM (i.e. no overclocking)
5. no adjustment to Windows default behavior regarding CPU affinity or priority
6. normal operating configuration (i.e. connected to the internet), but avoiding interactive user activity during the test
7. report the same-version GPU-Z average GPU load and memory controller load, averaged over a substantial portion of the test run, plus the other major GPU-Z parameters

This is meant as an opening bid, not an ultimatum--it is subject to your being interested, and to you possibly suggesting different test configuration details.

I'm assuming your Ti is appreciably better--the question is "how much--for this particular application?"

Why not?

I've changed my setup to run two Arecibo GPU jobs only and will leave it like that for a day or so, so everyone who is interested can follow the results and see the average crunching time.
For better comparison, I copied the startup message from BOINC, which shows the compute capability:
11.03.2014 10:46:53 | | CUDA: NVIDIA GPU 0: GeForce GTX 750 Ti (driver version 335.23, CUDA version 6.0, compute capability 5.0, 2048MB, 1948MB available, 2233 GFLOPS peak)
I did not stop all other programs; this PC acts as a local FTP server, but with 3 unused CPU cores (7 unused threads) that should not make a big difference (no 'heavy load'). All clock settings are standard, the internet connection is available 100%, and there are no known limitations or unusual priorities. And, of course, a VNC server is running; sorry, the PC has no monitor, keyboard or mouse attached. Standard security settings (firewall, antivirus) apply.
At startup GPU-Z gave this picture:
https://dl.dropboxusercontent.com/u/50246791/gpuz-1.PNG
Short link to this PC: http://einsteinathome.org/host/6801076
Card info is here:
https://dl.dropboxusercontent.com/u/50246791/GTX%20750ti%20basics.PNG

Let's wait for results.

Alexander

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2976727742
RAC: 785010

RE: 11.03.2014 10:46:53 |

Quote:
11.03.2014 10:46:53 | | CUDA: NVIDIA GPU 0: GeForce GTX 750 Ti (driver version 335.23, CUDA version 6.0, compute capability 5.0, 2048MB, 1948MB available, 2233 GFLOPS peak)


Just a small bit of information, before anyone gets too excited.

The GM107 chip used in the GTX 750 range has - as BOINC has detected - 'compute capability 5.0'.

This architecture has 128 shaders per multiprocessor, whereas the previous Kepler chips had 192 shaders per SM. This means that the rather complicated derivation BOINC uses to calculate GFLOPS peak will overstate the estimated speed by 50%. This will be fixed in the v7.3/v7.4 clients, but it looks like Maxwell was released too late for the change to be back-ported to the v7.2.42 client that Alex is using.

Doesn't affect the validity of this test, of course, and we're all waiting to see how well this rather promising newcomer performs in the real world - but I thought I'd jump in before somebody reads that number and spends money on a card that might not be quite as good as they think it is.
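
Richard's 50% figure is easy to verify; a sketch of the correction, using the shader-per-SM counts he quotes:

    # BOINC 7.2.x assumed 192 shaders/SM for compute capability 5.0,
    # but Maxwell's GM107 actually has 128 shaders/SM.
    boinc_peak_gflops = 2233                   # from the client startup message
    corrected = boinc_peak_gflops * 128 / 192  # undo the 1.5x overstatement
    print(f"~{corrected:.0f} GFLOPS")          # ~1489 GFLOPS

That corrected figure lands in the same ballpark as the 1538 GFLOPS value Alex quotes in the next post.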

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 508870244
RAC: 75806

RE: RE: 11.03.2014

Quote:
Quote:
11.03.2014 10:46:53 | | CUDA: NVIDIA GPU 0: GeForce GTX 750 Ti (driver version 335.23, CUDA version 6.0, compute capability 5.0, 2048MB, 1948MB available, 2233 GFLOPS peak)

Just a small bit of information, before anyone gets too excited.

The GM107 chip used in the GTX 750 range has - as BOINC has detected - 'compute capability 5.0'.

This architecture has 128 shaders per multiprocessor, whereas the previous Kepler chips had 192 shaders per SM. This means that the rather complicated derivation BOINC uses to calculate GFLOPS peak will overstate the estimated speed by 50%. This will be fixed in the v7.3/v7.4 clients, but it looks like Maxwell was released too late for the change to be back-ported to the v7.2.42 client that Alex is using.

Thanks for this information. I checked it against a price-comparison platform; the GFLOPS value listed there for this card is 1538. No idea how relevant this is.

Edit: the WUs which were sent after 9:40 UTC are the relevant ones.
