Observations on FGRBP1 1.17 for Windows

AgentB

Joined: 17 Mar 12

Posts: 915

Credit: 513211304

RAC: 0

The performance differences

29 Dec 2016 13:18:08 UTC

Message 153493

(moderation:

)

The performance differences may not be the application's to fix.

There is often quite large differences between different the two OS's (look around the rendering community who use OpenCL)

There are OS / library / compiler / driver differences the application has no chance of "improving".

We also seem to be setting GPU expectations based on the CUDA experience in BRP4 and BRP6. We have not run OpenCL on nVidia here in the past, so perhaps that is also something to consider as well.

Betreger

Joined: 25 Feb 05

Posts: 992

Credit: 1591812357

RAC: 774872

That does not seem to be the

29 Dec 2016 20:33:24 UTC

Message 153503 in response to message 153493

(moderation:

)

That does not seem to be the case over at Seti.

I asked the question:

SOG is an open cl app as I understand. Is there a appreciable difference in run time between running open cl on Windoz vs Linux? The reason I ask is Einstein recently came out with an open cl app and it runs 5 to 10 times faster on Linux than Windoz. Is that something inherent in Linux or just the way the app was written?

This is the answer I got:

When I moved one of my crunchers from Windows to Linux it briefly ran SoG applications. I found very little difference in run times, but there was a reduction in the demands on the CPU.

MAGIC Quantum M...

Joined: 18 Jan 05

Posts: 1887

Credit: 1411684505

RAC: 1182452

I ran a quick 8 FGRP v1.17

29 Dec 2016 21:10:22 UTC

Message 153509

(moderation:

)

I ran a quick 8 FGRP v1.17 X2 with a Win 10 OS and GeForce 660Ti in 53 mins each X2 running.

I have started and stopped the next X2 a couple times and they restarted fine (busy running some VB tasks)

Plan on running the rest I have tonight and when they finally put the vLHC to sleep I can get all my GPU cards back to work here before I start my 13th year here in a couple weeks.

archae86

Joined: 6 Dec 05

Posts: 3157

Credit: 7225651597

RAC: 1049294

Betreger wrote:Einstein

30 Dec 2016 0:17:32 UTC

Message 153518 in response to message 153503

(moderation:

)

Betreger wrote:

Einstein recently came out with an open cl app and it runs 5 to 10 times faster on Linux than Windoz.

While the very first Windows application released for FGRBP1 was rushed out with a very limited subset of computation moved to the GPU (FFT only, I think) and thus was almost a pure CPU application with speed-up supplied by GPU for roughly half the original pure CPU work, that was followed very quickly by a release which moved much more work to the GPU.

I doubt the current released application runs 5 to 10 times faster on Linux than the 1.17 Windows application. I think it would be well to compare elapsed times on comparable hardware comparably configured. Because of the high CPU requirement which inhibits many configurations from running high multiplicity, and because of the intermittent GPU use which renders 1X running unusually ineffiicient, I suggest we compare systems which are running 2X, and which have a light enough CPU load to avoid crippling the FGRBP1 application by CPU starvation.

Here are numbers averaged over several days running. They come from a single system, which is my most modern and productive. A single i5-4690K CPU, running stock, is supporting a total of 4 GPU tasks, as the system has both a GTX 1070 and a 6 GB GTX 1060, both running at the fastest overclocks I believe to be long-term safe. No BOINC CPU tasks are running, and although I use the system as my daily driver, I believe the Einstein productivity is very little reduced by that. While recent Einstein applications have suffered little performance loss in sharing a capable host among two GPUs, this application is much more demanding of host resources, and I suspect my times are degraded by this sharing appreciably.

I've removed from the averages a few remaining WUs of the much shorter type which get 693 credits and have names generally starting LATeah2003 instead of LATeah0010.

I hope some people reading this will report GTX 1070 or 1060 elapsed time averages on Windows and Linux machines running 2X. Perhaps we can put a sounder comparison in place for the current applications.

Card    CPU     mult ET(h:mm:ss)
1070    i5-4690K 2X     0:52:21
6GB1060 i5-4690K 2X     1:48:40

Keith Myers

Joined: 11 Feb 11

Posts: 4964

Credit: 18749079003

RAC: 7076917

I also don't think that there

30 Dec 2016 2:15:54 UTC

Message 153523 in response to message 153518

(moderation:

)

I also don't think that there is that much difference between similar platforms running either Windows or Linux.

Card    CPU     mult ET(h:mm:ss)
1070    FX-8350  2X     0:42:56

1070    FX-8370  2X     0:43:40

970     FX-8300  2X     0:51:13

Each GPU task supported by 1 full CPU core.

archae86

Joined: 6 Dec 05

Posts: 3157

Credit: 7225651597

RAC: 1049294

Keith Myers wrote:Card CPU

30 Dec 2016 4:10:26 UTC

Message 153531 in response to message 153523

(moderation:

)

Keith Myers wrote:

Card CPU mult ET(h:mm:ss)

1070    FX-8350  2X     0:42:56

1070    FX-8370  2X     0:43:40

970     FX-8300  2X     0:51:13

Each GPU task supported by 1 full CPU core.

As the computers visible from your ID are all Windows, I assume these are Windows timings. As they also list two GPUs for every host, I assume you are also suffering some harm from host sharing. But your timings are appreciably better than mine for the same model of card (1070). Perhaps your motherboard/CPU combination is doing a better job of supporting the GPU than is mine for some reason.

Also, are you overclocking your GPU cards?

Chris

Joined: 9 Apr 12

Posts: 61

Credit: 45056670

RAC: 0

I am running one AVX and one

30 Dec 2016 5:23:03 UTC

Message 153532

(moderation:

)

I am running one AVX and one 1.17 at a time on a quad core. Getting pretty severe lag on any inputs. This goes away as soon as I snooze the gpu. I am sure t here is some non boinc mucking things up too, but I wonder if the swapping between the GPU and CPU is just too much. Its only a 1 gb card.

Keith Myers

Joined: 11 Feb 11

Posts: 4964

Credit: 18749079003

RAC: 7076917

Yes, all Windows. Two

30 Dec 2016 6:07:12 UTC

Message 153534 in response to message 153531

(moderation:

)

Yes, all Windows. Two Windows7 64 bit and one Windows10 64 bit. Yes, I am overclocking the cards in their P-2 states to the original P0 state timings for GPU core and memory frequencies via NVI. Other than that, just letting the cards do their normal GPU Boost thing. If anything, I am running handicapped on the CPU front since I run AMD FX chips. They are slightly overclocked to 4.0 Ghz for the 8300 and 4.4 Ghz for the 8350/8370.

DF1DX

Joined: 14 Aug 10

Posts: 105

Credit: 3871286854

RAC: 4936313

Dualboot Dell T20, ECC

30 Dec 2016 13:27:56 UTC

Message 153546

(moderation:

)

Dualboot Dell T20, ECC Memory, NVidia 750Ti, no overclocking, no CPU Tasks, 2x FGRP 1.17:

Card    CPU      mult  ET(h:mm:ss) Driver  OS               Hostid

750Ti   E3-1225v3  2X  ~2:18:45(!) 367.57  Linux Mint 17.3  12241921

750Ti   E3-1225v3  2X  ~1:57:30    376.57  Win 7/64 Bit     12247194

and my other pc (Windows only) with no CPU Tasks, 2x FGRP 1.17:

1060/6GB i5-3570K  2x  ~0:43:00    372.70  Win 7/64 Bit      6742381

n12365

Joined: 4 Mar 16

Posts: 26

Credit: 6491436572

RAC: 0

archae86 wrote:I hope some

30 Dec 2016 17:55:58 UTC

Message 153561 in response to message 153518

(moderation:

)

archae86 wrote:

I hope some people reading this will report GTX 1070 or 1060 elapsed time averages on Windows and Linux machines running 2X. Perhaps we can put a sounder comparison in place for the current applications.
Card    CPU     mult ET(h:mm:ss)
1070    i5-4690K 2X     0:52:21
6GB1060 i5-4690K 2X     1:48:40

All of my computers are running Windows 10 and these are the run times I am getting for FGRP 1.17:

Card CPU Multi Run Time

1070 i5-6600 2X 0:29:06

1070 Q9450 2X 0:38:32

6GB1060 i5-4690 2X 0:41:46

3GB1060 i7-4790K 2X 0:43:34

My times are significantly different from the times you are getting. I am using the median run time over the last 7 days to compare the relative performance of my computers. Is the elapsed time you are using the same as the run time I am using? I am pulling the run time from the task webpage. This is an example from my fast 1070 machine: https://einsteinathome.org/task/599111630

Observations on FGRBP1 1.17 for Windows

Forums › Cruncher's Corner

Comment viewing options

Forums › Cruncher's Corner