Observations on FGRPB1 1.16 for Windows

Keith Myers
Joined: 11 Feb 11
Posts: 4700
Credit: 17544134146
RAC: 6399027

Oh yes, the computers' thermal output certainly helps keep the furnace from cycling so often. The only real issue is with the 970 card machine. It evidently is so heavily loaded with Einstein work that it prevents any new MilkyWay tasks from downloading, and I am almost out. I couldn't figure out why the onboard task count was dropping from the default 160 tasks for two cards while the other 1070 machines were keeping their full load of MilkyWay tasks. I finally found a log entry saying the reason for not downloading any new work was something along the lines of "not getting work - not highest GPU priority project". My resource allocation is 80% SETI, 10% MilkyWay and 10% Einstein. Evidently the combination of SETI and Einstein GPU work has maxed out the GPU workload on the 970-based machine. I hope that after I work off some of those 600+ Gamma-ray tasks the machine will begin to pick up MilkyWay work again. The downside will be the same situation I ran into with the new Einstein work: the machine will go into exclusive MilkyWay running to equalize the project time deficit.

I sure wish BOINC were a "set and forget" application. I am getting tired of having to babysit it all the time to get it to do what I want.
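For reference, the arithmetic behind those resource shares is simple; here is a minimal sketch (not BOINC's actual code, and the helper name is purely illustrative) of how an 80/10/10 allocation becomes fractions of total computing time:

    # Minimal sketch (not BOINC's actual code): how resource shares
    # like 80/10/10 become fractions of total computing time.
    def share_fractions(shares):
        total = sum(shares.values())
        return {name: value / total for name, value in shares.items()}

    print(share_fractions({"SETI": 80, "MilkyWay": 10, "Einstein": 10}))
    # -> {'SETI': 0.8, 'MilkyWay': 0.1, 'Einstein': 0.1}

Over time the scheduler steers each project toward its fraction, which is what produces the catch-up running described above.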

 

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

Zalster wrote:
I have over 900 each on 2 different machines.  Not that I'm complaining.

I was downloading more tasks just a minute ago and saw a limiting message I've never seen before:

"Not requesting tasks: too many runnable tasks"

That host had 1009 tasks "in progress" at that point. Looks like that message overrides the daily quota.


Here's a fresh result for the R9 270X:

AMD R9 270X (16.12.1), Intel X56xx @ 4 GHz, running 2 tasks: 12 min (~710 s)

(AMD R9 270X (16.12.1), Intel X56xx @ 4 GHz, running 1 task: 9 min (~550 s))

Nice benefit from running 2x with this AMD as well!
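For a rough sense of the gain, here is a back-of-the-envelope throughput check on the timings above (a sketch, not a benchmarking tool):

    # Back-of-the-envelope throughput for the R9 270X timings above.
    def tasks_per_hour(concurrent_tasks, seconds_per_task):
        return concurrent_tasks * 3600 / seconds_per_task

    print(f"1x: {tasks_per_hour(1, 550):.1f} tasks/hour")  # ~6.5
    print(f"2x: {tasks_per_hour(2, 710):.1f} tasks/hour")  # ~10.1

That works out to roughly 55% more throughput at 2x.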

Betreger
Joined: 25 Feb 05
Posts: 987
Credit: 1421587072
RAC: 802761

1.16 on Widoz is a great improvement over 1.15, but much slower than the Linux machines I've been paired with.

I've been doing distributed computing since '99 and this is the first instance I've seen where the OS has had a significant effect on throughput. My GTX 660 was processing about 80k per day on BRP6 beta, then 60k with the switch over to BRP4; the way it is now, I doubt it will produce much more than 25k. I don't understand.

Jesse Viviano
Joined: 8 Jun 05
Posts: 33
Credit: 133045917
RAC: 0

That is not much of a surprise. AMD's GPUs are weighted much more towards shader hardware, which is good for compute jobs, than towards fixed-function graphics hardware. The reasons Nvidia still gets a high market share in GPU compute are that legacy CUDA programs that have not been converted to OpenCL still exist, that Nvidia guarantees tighter accuracy tolerances than AMD (which only guarantees the looser OpenCL accuracy tolerances), and that AMD's drivers have for a long time been crashing messes. (It seems that AMD is now making a much-needed effort to clean up its drivers, so this hopefully has changed or will change.) The last consumer Nvidia GPUs with significant double-precision hardware are the original GTX Titan and the GTX Titan Black, which let the user enable full double-precision support at the cost of limiting the GPU's clock speed, due to the additional power required to drive all of that double-precision hardware.

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

Now this was interesting to find out. I've been running 'Multi-Directed Continuous Gravitational Wave search CV 1.00 (AVX)' tasks in parallel with these FGRPB1G 1.16 GPU tasks on my Windows host with an AMD card (host 12467404).

Running even a few of these CV 1.00 (AVX) tasks causes a significant slow-down of the GPU tasks on this machine! I've used Options > Computing preferences... "use at most xx % of the CPUs" to limit the total number of tasks running.

Running 1.16 (FGRPopencl-Beta-ati) 2x +

4 CV's = 715 s

3 CV's = 675 s

1 CV = 648 s

0 CV's = 720 s ... wtf ??

edit: Now I see. When there is nothing but 2 GPU tasks running, with the current BIOS settings this system thinks there's no hurry... the load is so low that the system starts down-clocking the CPU to 2.3 GHz and the CPU speed keeps jumping around wildly. I can see that in CPU-Z.

When I resume 1 CPU task and the total load is then 3/12 threads, this system stops relaxing the CPU clock speed and it stays constantly at the upper limit. So... with the current BIOS settings this system needs at least 1 CPU task alongside those 2 GPU tasks to operate at full speed!

I wonder if the trend in those completion times would be similar if the CPU tasks were something other than 'Multi-Directed Continuous Gravitational Wave search CV'. I'm going to test with 'Gamma-ray pulsar binary search #1 1.05 (FGRPSSE)' tasks:

Running 1.16 (FGRPopencl-Beta-ati) 2x +

1 FGRPSSE = 646 s ... (and system load was sufficient; CPU clock stayed constantly at the high limit)

2 FGRPSSE's = 650 s

3 FGRPSSE's = 652 s

4 FGRPSSE's = 665 s

5 FGRPSSE's = 678 s

6 FGRPSSE's = 711 s

Those times are not absolutely accurate, but the trend was clearly there.

My conclusions:

These two CPU applications put notably different amounts of pressure on a system running these GPU tasks. Running 'CV 1.00 (AVX)' CPU tasks on this machine slows down the FGRPB1G 1.16 tasks on quite a steep curve with every additional CPU task. Running a few FGRPSSE tasks results in a much gentler rise in completion times.
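The trend is easy to put into numbers; a small sketch comparing each FGRPSSE count against the 646 s single-CPU-task baseline measured above:

    # Slowdown of the 2x GPU tasks as FGRPSSE CPU tasks are added,
    # relative to the 646 s baseline with 1 CPU task measured above.
    baseline = 646
    fgrpsse_times = {1: 646, 2: 650, 3: 652, 4: 665, 5: 678, 6: 711}
    for n, t in fgrpsse_times.items():
        print(f"{n} FGRPSSE: {t} s ({100 * (t - baseline) / baseline:+.1f}%)")

Six FGRPSSE tasks cost about 10% on the GPU times, whereas four CV tasks already cost about that much against their own fastest point (715 s vs 648 s).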

Jesse Viviano
Joined: 8 Jun 05
Posts: 33
Credit: 133045917
RAC: 0

First, you can disable the underclocking due to lack of work by going into the Power Options control panel in Windows and selecting the high performance power profile. Second, SIMD instructions like AVX, while very power-efficient for the amount of work they do when programmed well, are power hogs in absolute terms because they do so much work per instruction. Your CPU will hit its power or thermal limit when loaded with these instructions and will have to slow down to stay within those limits.
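If you'd rather script that than click through the Control Panel, something along these lines should work; SCHEME_MIN is Windows' built-in powercfg alias for the High performance plan (shown from Python purely as a sketch):

    # Switch Windows to the High performance power plan.
    # SCHEME_MIN is the built-in powercfg alias for that plan.
    import subprocess

    subprocess.run(["powercfg", "/setactive", "SCHEME_MIN"], check=True)

Running "powercfg /list" shows the installed plans and their GUIDs if the alias isn't available.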

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

Jesse, thank you for that information. The high performance power profile sounds like a nice alternative (quicker than fiddling with the BIOS).

SupeRNovA
Joined: 12 Apr 05
Posts: 2
Credit: 68031945
RAC: 0

I run 3 tasks at the same time on my Nvidia 1080 and average 8 and a half minutes. Now I'm trying 4 tasks and am getting around 11 min 10 sec. Really happy with the performance.
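Applying the same back-of-the-envelope check as earlier in the thread: 3 × 3600 / 510 ≈ 21.2 tasks/hour at 3x versus 4 × 3600 / 670 ≈ 21.5 tasks/hour at 4x, so the fourth concurrent task is only a marginal gain here.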

 
