CUDA and OpenCL Benchmarks

MAGIC Quantum Mechanic
Joined: 18 Jan 05
Posts: 1,305
Credit: 421,284,802
RAC: 104,022

Quote:
My times are two-at-a-time and I think I remember that Magic is running three at a time or maybe even four.

Yes tbret, I always run BRP PAS tasks 4 at a time on the 660Ti SC (I did test it with 2 and 3 when I first got it).

It is in my not-so-new quad-core, which also always runs a vLHC -VB 24hr task at the same time. Once in a while I add an Atlas task just to see how they will run, since that host has more RAM than the rest of mine.

.......and Mikey, you have mail

 

tbret
Joined: 12 Mar 05
Posts: 2,115
Credit: 4,172,694,955
RAC: 1,104,255

Quote:

Yes, I am running 2 at a time to get those numbers. And yes, my 760 here is in an AMD 6-core machine with 16 GB of RAM. That is as good as any of my machines get, though I do have a couple of others just like it or very similar, and then I too go down from there.

Well, I knew when I chose the BRP4 tasks to do there was a chance that I would get fewer "credits" than with BRP5, but I "calculated" I'd be getting slightly more "credits" doing the BRP4 tasks.

My calculation is that I'm getting about 7,000 credits out of my R9 270X at the same time you're getting about 6,666 doing BRP5. I'm not sure how fast it would do a BRP5.

I know that when I let the 470/560Ti machine switch from BRP4 to BRP5 the RAC fell, but I really don't remember by how much. What I do remember is that I used to let them do 3 tasks at a time, but it really didn't help. It didn't hurt, but it didn't help.

Oh, the stupid RAM is due Thursday, not Wednesday, so there will be a day's delay.

mikey
Joined: 22 Jan 05
Posts: 6,366
Credit: 556,198,684
RAC: 218,917

Quote:
Quote:

Yes, I am running 2 at a time to get those numbers. And yes, my 760 here is in an AMD 6-core machine with 16 GB of RAM. That is as good as any of my machines get, though I do have a couple of others just like it or very similar, and then I too go down from there.

Well, I knew when I chose the BRP4 tasks to do there was a chance that I would get fewer "credits" than with BRP5, but I "calculated" I'd be getting slightly more "credits" doing the BRP4 tasks.

My calculation is that I'm getting about 7,000 credits out of my R9 270X at the same time you're getting about 6,666 doing BRP5. I'm not sure how fast it would do a BRP5.

I know that when I let the 470/560Ti machine switch from BRP4 to BRP5 the RAC fell, but I really don't remember by how much. What I do remember is that I used to let them do 3 tasks at a time, but it really didn't help. It didn't hurt, but it didn't help.

Oh, the stupid RAM is due Thursday, not Wednesday, so there will be a day's delay.

I too ran three units at a time for a bit but didn't see any advantage, so I dropped back to 2 at a time on my 760s. I don't remember why I went from the BRP4 units to the BRP5 units; it has been a long time.
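For anyone wanting to check whether running more tasks at once actually helps, here is a rough Python sketch of the comparison. The idea is that N tasks at a time only pays off if throughput (N divided by elapsed time per task) goes up. The elapsed times below are made-up placeholders, not measurements from anyone in this thread, and `tasks_per_hour` is just an illustrative helper name.

```python
def tasks_per_hour(concurrent_tasks: int, elapsed_minutes: float) -> float:
    """Throughput of one GPU running `concurrent_tasks` tasks at once,
    where each task takes `elapsed_minutes` of wall-clock time."""
    return concurrent_tasks * 60.0 / elapsed_minutes

# HYPOTHETICAL elapsed minutes per task at each concurrency level.
# Replace these with your own measured times.
example_runs = {1: 20.0, 2: 32.0, 3: 48.0}

rates = {n: tasks_per_hour(n, minutes) for n, minutes in example_runs.items()}
for n, rate in rates.items():
    print(f"{n} at a time: {rate:.2f} tasks/hour")

# On a tie, max() keeps the first (lowest) concurrency level it saw.
best = max(rates, key=rates.get)
print(f"Lowest concurrency with the best throughput: {best} at a time")
```

With these placeholder numbers, 2 and 3 at a time come out identical, which is the "didn't hurt, but didn't help" situation: the third task just stretches everyone's run time by the same factor.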

...and Magic you have mail

Tex1954
Joined: 15 Mar 11
Posts: 28
Credit: 668,413,018
RAC: 100,881

FWIW, I have two (7970/R9 280X) cards running in two different setups. One motherboard is PCIe 3.0 and the other is PCIe 2.0. Both run the cards in x16 mode.

The 7970 is running 925/1375 and the R9 280X is running 1020/1500. Both are at stock speeds, not overclocked. Even with its higher clock speed, the R9 280X on PCIe 2.0 is measurably slower.

PCIe 3.0 7970 tasks run two at a time in any mix:

1.39 Binary Radio Pulsar Search (Arecibo, GPU) (BRP4G-opencl-ati) 00:33:18 (00:13:16) 0.5C + 0.5 AMD/ATI GPUs
1.39 Binary Radio Pulsar Search (Arecibo, GPU) (BRP4G-opencl-ati) 00:26:51 (00:13:19) 0.5C + 0.5 AMD/ATI GPUs
1.39 Binary Radio Pulsar Search (Perseus Arm Survey) (BRP5-opencl-ati) 01:33:08 (00:59:13) 0.5C + 0.5 AMD/ATI GPUs
1.39 Binary Radio Pulsar Search (Perseus Arm Survey) (BRP5-opencl-ati) 01:41:03 (00:58:31) 0.5C + 0.5 AMD/ATI GPUs

PCIe 2.0 R9 280X tasks run two at a time in any mix:

1.39 Binary Radio Pulsar Search (Arecibo, GPU) (BRP4G-opencl-ati) 00:35:11 (00:15:34) 0.5C + 0.5 AMD/ATI GPUs
1.39 Binary Radio Pulsar Search (Arecibo, GPU) (BRP4G-opencl-ati) 00:35:11 (00:15:36) 0.5C + 0.5 AMD/ATI GPUs
1.39 Binary Radio Pulsar Search (Perseus Arm Survey) (BRP5-opencl-ati) 01:50:54 (01:07:22) 0.5C + 0.5 AMD/ATI GPUs
1.39 Binary Radio Pulsar Search (Perseus Arm Survey) (BRP5-opencl-ati) 01:50:49 (01:07:29) 0.5C + 0.5 AMD/ATI GPUs

One can see the effect of the PCIe speed: 3.0 is definitely faster with higher-powered GPUs that use the full bandwidth.

So all these comparisons must also take PCIe version and lane width into account.
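To put a rough number on the difference, here is a small Python sketch that parses the task lines above and compares mean run times per application. It assumes the first hh:mm:ss on each line is elapsed wall-clock time (the value in parentheses is not used), and the lines are trimmed to the app name and the times. Bear in mind the two cards differ in model and clocks, so this is not a controlled PCIe-only comparison.

```python
import re
from statistics import mean

# Matches e.g. "(BRP4G-opencl-ati) 00:33:18" and captures the app and first time.
TASK = re.compile(r"\((BRP\w+)-opencl-ati\)\s+(\d+):(\d+):(\d+)")

def mean_elapsed(lines):
    """Return {app: mean of the first hh:mm:ss on each line, in seconds}."""
    times = {}
    for line in lines:
        match = TASK.search(line)
        if match:
            hours, minutes, seconds = (int(g) for g in match.groups()[1:])
            total = hours * 3600 + minutes * 60 + seconds
            times.setdefault(match.group(1), []).append(total)
    return {app: mean(values) for app, values in times.items()}

# Task lines from the post, trimmed to the app name and the two times.
pcie3_7970 = [
    "(BRP4G-opencl-ati) 00:33:18 (00:13:16)",
    "(BRP4G-opencl-ati) 00:26:51 (00:13:19)",
    "(BRP5-opencl-ati) 01:33:08 (00:59:13)",
    "(BRP5-opencl-ati) 01:41:03 (00:58:31)",
]
pcie2_280x = [
    "(BRP4G-opencl-ati) 00:35:11 (00:15:34)",
    "(BRP4G-opencl-ati) 00:35:11 (00:15:36)",
    "(BRP5-opencl-ati) 01:50:54 (01:07:22)",
    "(BRP5-opencl-ati) 01:50:49 (01:07:29)",
]

fast = mean_elapsed(pcie3_7970)
slow = mean_elapsed(pcie2_280x)
# Fractional slowdown of the PCIe 2.0 host relative to the PCIe 3.0 host.
slowdown = {app: slow[app] / fast[app] - 1 for app in fast}
for app, frac in slowdown.items():
    print(f"{app}: PCIe 2.0 host is {frac:.1%} slower on average")
```

With only two samples per app, and one BRP4G run on the 7970 well below the other, the percentages are indicative at best.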

And just for another FWIW, my dual 560 Ti OC cards run 900/2106 and do no better running 2 tasks vs. 1 task at a time on a PCIe 2.0 @ x16 i7-950 system... so I only run one task at a time on them.

8-)

Tex1954
Joined: 15 Mar 11
Posts: 28
Credit: 668,413,018
RAC: 100,881

FWIW, more data... 925/1375 7970 in PCIe 2.0 x8 mode

1.39 Binary Radio Pulsar Search (Arecibo, GPU) (BRP4G-opencl-ati) 00:53:18 (00:18:20) 0.5C + 0.5ATI (d1)
1.39 Binary Radio Pulsar Search (Arecibo, GPU) (BRP4G-opencl-ati) 00:53:21 (00:18:20) 0.5C + 0.5ATI (d1)
1.39 Binary Radio Pulsar Search (Arecibo, GPU) (BRP4G-opencl-ati) 00:53:24 (00:18:18) 0.5C + 0.5ATI (d0)
1.39 Binary Radio Pulsar Search (Arecibo, GPU) (BRP4G-opencl-ati) 00:53:22 (00:18:12) 0.5C + 0.5ATI (d0)
1.39 Binary Radio Pulsar Search (Arecibo, GPU) (BRP4G-opencl-ati) 00:49:45 (00:17:28) 0.5C + 0.5ATI (d1)
1.39 Binary Radio Pulsar Search (Arecibo, GPU) (BRP4G-opencl-ati) 00:49:45 (00:17:20) 0.5C + 0.5ATI (d1)

Run times are significantly longer due to the lower bandwidth.
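As a rough sketch of what that means for output, the Python below converts the BRP4G times from the three setups reported in this thread into tasks per day per GPU. It assumes the first time on each task line is elapsed wall-clock time and that "0.5" GPUs per task means two tasks share each GPU; the dictionary labels are just descriptive names, not anything BOINC reports.

```python
def to_seconds(hms: str) -> int:
    """Convert 'hh:mm:ss' to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

# BRP4G times (first column after the app name) as posted in this thread.
elapsed = {
    "7970, PCIe 3.0 x16": ["00:33:18", "00:26:51"],
    "R9 280X, PCIe 2.0 x16": ["00:35:11", "00:35:11"],
    "7970, PCIe 2.0 x8": ["00:53:18", "00:53:21", "00:53:24",
                          "00:53:22", "00:49:45", "00:49:45"],
}

TASKS_PER_GPU = 2  # assumption: 0.5 GPU per task means two tasks per GPU

def tasks_per_day(times):
    """Tasks completed per GPU per day at the mean elapsed time."""
    average = sum(to_seconds(t) for t in times) / len(times)
    return TASKS_PER_GPU * 86400 / average

per_day = {setup: tasks_per_day(times) for setup, times in elapsed.items()}
for setup, rate in per_day.items():
    print(f"{setup}: about {rate:.0f} BRP4G tasks/day per GPU")
```

The sample sizes are tiny and the hosts differ in more than the slot, so treat the result as a ballpark figure rather than a benchmark.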

8-)
