CUDA and OpenCL Benchmarks

mountkidd
Joined: 14 Jun 12
Posts: 127
Credit: 4,112,850,429
RAC: 2,240,154

Quote:
and there is no advantage to have 3 parallel tasks instead of 2 ones on this card.

This is only true if you are running cpu tasks as well. There is a whole new world out there once you stop the cpu from interfering with the gpu...

Gord

Sunny129
Joined: 5 Dec 05
Posts: 162
Credit: 160,342,159
RAC: 0

well there's some truth in everyone's statements here...

i suppose i should rephrase my claim as follows: on a Win7 x64 platform, 3 simultaneous E@H tasks run more efficiently on a GTX 560 Ti 1GB than 2 simultaneous E@H tasks do, so long as you also have sufficient CPU resources and a full 16 lanes of PCIe 2.0 bandwidth for the GPU. since PCIe 3.0 offers roughly twice the per-lane bandwidth of PCIe 2.0, the same holds so long as you have sufficient CPU resources and at least 8 lanes of PCIe 3.0 bandwidth. i can also confirm the same results when limited to only 8 lanes of PCIe 2.0 bandwidth (from when i had dual GTX 560 Ti's installed and the x16 slots were limited to PCIe 2.0 x8 bandwidth), and so the same must be true when limited to 4 lanes of PCIe 3.0 bandwidth.
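
The lane equivalences in the post above can be checked with a little arithmetic. This is only a rough sketch: the per-lane transfer rates and line encodings are the published PCIe 2.0 and 3.0 figures, the function and table names are made up for illustration, and real-world throughput will be lower because of protocol overhead.

```python
# Per-lane figures: (transfer rate in GT/s, line-encoding efficiency).
# PCIe 2.0 uses 8b/10b encoding, PCIe 3.0 uses 128b/130b.
PCIE_GENERATIONS = {
    "2.0": (5.0, 8 / 10),
    "3.0": (8.0, 128 / 130),
}


def link_bandwidth_gb_s(generation: str, lanes: int) -> float:
    """Approximate usable bandwidth in one direction, in GB/s (hypothetical helper)."""
    rate_gt_s, efficiency = PCIE_GENERATIONS[generation]
    # GT/s * efficiency gives usable Gbit/s per lane; divide by 8 for GB/s.
    return rate_gt_s * efficiency / 8 * lanes


# The four slot configurations mentioned in the post.
for generation, lanes in [("2.0", 16), ("3.0", 8), ("2.0", 8), ("3.0", 4)]:
    bandwidth = link_bandwidth_gb_s(generation, lanes)
    print(f"PCIe {generation} x{lanes}: {bandwidth:.2f} GB/s")
```

By this estimate PCIe 3.0 x8 comes out slightly below PCIe 2.0 x16 (and 3.0 x4 slightly below 2.0 x8), so "equivalent" here means approximately equal, not identical.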

Shafa
Joined: 31 May 05
Posts: 53
Credit: 627,005,014
RAC: 0

Quote:


This is only true if you are running cpu tasks as well. There is a whole new world out there once you stop the cpu from interfering with the gpu...

Gord

Not exactly.
On 2 of the 3 PCs no CPU tasks are running, and the Wheezy machine has a 4-core CPU running only 2 CPU tasks in parallel.
Moreover, both Linux PCs run GPU tasks at increased priority.
The WinXP PC has an integrated AMD video card, and the 560 Ti is dedicated to E@H only, not to desktop operations.

Good for Sunny to have a fast PC, probably based on Intel (?).
My 3 PCs are based on obsolete 45W-65W AMDs (2x AM2, 1x AM3; 2x DDR2, 1x DDR3), slow craps ;-)

On the other hand, I do not think the difference between 2 and 3 parallel tasks on the 560 Ti is all that significant.

Sunny129
Joined: 5 Dec 05
Posts: 162
Credit: 160,342,159
RAC: 0

Quote:

The WinXP PC has an integrated AMD video card, and the 560 Ti is dedicated to E@H only, not to desktop operations.

Good for Sunny to have a fast PC, probably based on Intel (?).
My 3 PCs are based on obsolete 45W-65W AMDs (2x AM2, 1x AM3; 2x DDR2, 1x DDR3), slow craps ;-)


that's what i do as well - IGP as dedicated display GPU, allowing the discrete GPUs to be completely dedicated to crunching. one of my home machines has a 6-core 1090T CPU and utilizes the HD 4290 IGP from the 890GX chipset to run the display. my other two home crunchers have i7 3770K CPUs and utilize the CPU's integrated HD 4000 GPU...and yes, having several available CPU cores/threads certainly helps on the GPU end ;-)

Quote:
On the other hand, I do not think the difference between 2 and 3 parallel tasks on the 560 Ti is all that significant.


it depends - on your slightly outdated hardware that may very well be the case, but i can assure you the difference on my machines is substantial (i can't recall the exact difference in terms of RAC or PPD b/c i experimented with and optimized my platform a long time ago...but i do remember that the difference was substantial enough for me to continue running 3 simultaneous tasks). with the necessary additional VRAM and compute power, i'm able to run 4 simultaneous tasks on my GTX 580s with substantially better results than running only 3 simultaneous tasks. but like any other GPU (or combination of GPUs), production efficiency will depend on a slew of other factors, and running the same number of simultaneous tasks on the same exact video card w/ the same exact core and memory clocks may not work for someone else due to those other factors (like PCIe slot bandwidth and/or speed, mobo chipset, CPU type, etc)...

mountkidd
Joined: 14 Jun 12
Posts: 127
Credit: 4,112,850,429
RAC: 2,240,154

Quote:
Quote:
and there is no advantage to have 3 parallel tasks instead of 2 ones on this card.

This is only true if you are running cpu tasks as well. There is a whole new world out there once you stop the cpu from interfering with the gpu...


I am amending the first part of my statement to read:

This is only true if your cpu doesn't have the horsepower to feed the gpu to its full potential.

Both of my 560Ti's run optimally at x4.

Gord

Jeroen
Joined: 25 Nov 05
Posts: 379
Credit: 738,423,020
RAC: 0

I updated my Linux AMD drivers from version 13.4 to 13.10 Beta 2 last week on both of my AMD hosts. On my dual card host, I have seen a significant reduction in run time compared to the older driver.

3x BRP5 tasks per GPU - 20 tasks averaged per version

Driver - Run time in seconds per task
13.4 - 7694.10
13.10 Beta 2 - 6406.51
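
For reference, the size of that reduction can be worked out from the two averages above. This is a small sketch using only the figures in the table; the variable and function names are invented for illustration.

```python
# Average run time per task in seconds, taken from the table above
# (3x BRP5 tasks per GPU, 20 tasks averaged per driver version).
runtime_13_4_s = 7694.10
runtime_13_10_beta2_s = 6406.51


def percent_reduction(old_s: float, new_s: float) -> float:
    """Relative run-time reduction when going from old_s to new_s, in percent."""
    return (old_s - new_s) / old_s * 100


reduction = percent_reduction(runtime_13_4_s, runtime_13_10_beta2_s)
speedup = runtime_13_4_s / runtime_13_10_beta2_s

print(f"Run time reduction: {reduction:.1f}%")
print(f"Throughput ratio (new vs old driver): {speedup:.2f}x")
```

That is a run-time reduction of roughly one sixth, or about 20% more tasks completed in the same wall-clock time.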

The 3x card host also saw a reduction, but not as significant as on my dual card host. I think this may be because this particular host has the extra card running in an x8 slot, which I have previously seen reduce performance a bit compared with cards running in x16 slots. There may be another reason as well.

The groups of tasks that I averaged were from last week. I wanted to get a few days of run time to confirm driver stability before posting.

Alex
Joined: 1 Mar 05
Posts: 449
Credit: 341,373,374
RAC: 91,954

Hi,
does anyone have experience with the new nVidia GT 630 with the GK208-301-A1 "Kepler" chip?
The card is rated at 25W maximum and, according to TomsHardware, has (nearly) the same performance as the GT 640. But the benchmarks are mostly based on games, not on OpenCL crunching.

Alex

mmstick
Joined: 6 Jun 12
Posts: 14
Credit: 2,066,411
RAC: 0

If you run Linux, AMD GPUs will perform faster, clock higher, and use lower voltages than in Windows when it comes to OpenCL. Plus, Linux uses far fewer resources than Windows XP.

mmstick
Joined: 6 Jun 12
Posts: 14
Credit: 2,066,411
RAC: 0

Quote:
If you run Linux, AMD GPUs will perform faster, clock higher, and use lower voltages than in Windows when it comes to OpenCL. Plus, Linux uses far fewer resources than Windows XP.


After doing some testing with my Radeon HD 7950 running Catalyst 13.11 @ 1200MHz with no voltage increase in Ubuntu, at a GPU load of 93%, runtime per task is 800 seconds with three concurrent BRP4G-opencl-ati work units (as in, run three work units, then divide their time by three for the actual throughput rate). I'll soon have numbers for the longer BRP5 tasks.
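
The "divide by three" bookkeeping described above can be sketched as follows. The 800-second figure is from the post; the 2400-second batch time is simply 800 x 3 back-calculated from it, not a separately measured number, and the helper names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def effective_seconds_per_task(batch_wall_time_s: float, concurrent_tasks: int) -> float:
    """Wall-clock time for a batch of concurrent tasks, divided by the batch size."""
    return batch_wall_time_s / concurrent_tasks


def tasks_per_day(effective_s: float) -> float:
    """Tasks one GPU completes per day at the given effective per-task time."""
    return SECONDS_PER_DAY / effective_s


# Assumption: three BRP4G tasks running side by side finish in about 2400 s,
# which is what an effective 800 s per task implies.
effective = effective_seconds_per_task(batch_wall_time_s=2400.0, concurrent_tasks=3)

print(f"Effective time per task: {effective:.0f} s")
print(f"Tasks per day per GPU:   {tasks_per_day(effective):.0f}")
```

Comparing effective per-task times like this is only fair between hosts running the same application and work unit type, since the raw per-task run time rises as more tasks share a GPU.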

Sunny129
Joined: 5 Dec 05
Posts: 162
Credit: 160,342,159
RAC: 0

Quote:
Quote:
If you run Linux, AMD GPUs will perform faster, clock higher, and use lower voltages than in Windows when it comes to OpenCL. Plus, Linux uses far fewer resources than Windows XP.

After doing some testing with my Radeon HD 7950 running Catalyst 13.11 @ 1200MHz with no voltage increase in Ubuntu, at a GPU load of 93%, runtime per task is 800 seconds with three concurrent BRP4G-opencl-ati work units (as in, run three work units, then divide their time by three for the actual throughput rate). I'll soon have numbers for the longer BRP5 tasks.


yeah, that beats the pants off my 1083 seconds per task in Windows 7, and i'm running dual 7970s @ 1000MHz each!
