Defender_2 wrote: Since E@H is very bandwidth-hungry, I guess it's caused by the different PCIe lanes.
Could someone please confirm this? I would have thought as much data as possible would be transferred to GPU memory initially, where it would then be crunched... This would mean PCIe bandwidth wouldn't have such a large effect on performance.
Tia,
Kailee.
I don't know the BRP code off the top of my head, but I believe the FGRP GPU app transfers less data between CPU and GPU memory than BRP does. I wouldn't expect PCIe bandwidth to be an issue here.
There is some indication, however, that the clFinish() calls in the current code cause a lot of CPU load via the driver. I'm really not sure how much data transfer this involves; it depends on the implementation in the driver. We'll work on getting rid of these clFinish() calls as much as possible, which should also reduce the CPU utilization.
BM
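To illustrate what's being discussed above: some OpenCL drivers implement clFinish() as a busy-wait on the CPU, so replacing queue-wide finishes with per-kernel event waits can reduce host load. A minimal sketch, assuming a queue and kernel set up elsewhere; this is not the actual FGRP/BRP code, and run_kernel_step is a made-up helper name:

/* Sketch only: wait on a single kernel's event instead of draining the
 * whole queue with clFinish(), which some drivers turn into a CPU spin loop. */
#include <CL/cl.h>

cl_int run_kernel_step(cl_command_queue queue, cl_kernel kernel, size_t global_size)
{
    cl_event done;
    cl_int err;

    err = clEnqueueNDRangeKernel(queue, kernel, 1, NULL,
                                 &global_size, NULL, 0, NULL, &done);
    if (err != CL_SUCCESS)
        return err;

    /* Wait only for this kernel rather than the entire queue. */
    err = clWaitForEvents(1, &done);
    clReleaseEvent(done);
    return err;
}

Whether clWaitForEvents() actually sleeps rather than spins is still driver-dependent, so any real gain has to be measured on the hardware in question.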
OK, so I did some GPU shuffling between my machines to see what's going on. The machines are:
elmo: 2x L5630, PCIE x8
frazzle: 2x X5670, PCIE x16
kermit: 2x X5670, PCIE x8
Runtimes (FGRP 1.14 opencl-ati on R9 280x, in seconds):
elmo: 975, frazzle: 420, kermit: 437
Runtimes (FGRP 1.14 opencl-nvidia on GTX 580, in seconds):
elmo: 890, frazzle: 1100, kermit: 1130
Geekbench compute results using the R9s follow this trend too: elmo: 56129, frazzle: 113664, kermit: 113168.
So clearly the PCIE bandwidth is not the issue; frazzle and kermit are identical except for the x16/x8 slots, and kermit and elmo have literally identical motherboards. The only significant difference here, then, is the CPUs; can they really cause such large discrepancies?
TIA,
Kailee.
Are they all running the same memory at the same speed and channels?
A while back I was looking at the difference between several Xeons here, and there is an Intel PCM (Performance Counter Monitor) tool that helped identify the difference. HTH.
Kai Leibrandt wrote: The only significant difference here, then, is the CPUs; can they really cause such large discrepancies?
Both hosts have 24 GB RAM, but what is the RAM speed on them (MHz, single/dual/triple channel, CAS latencies)? I wonder if a difference there could be the reason... if one is perhaps running its RAM at a significantly different speed.
edit: AgentB had faster thoughts :)
http://ark.intel.com/compare/47920,47927 shows that the L5630 should be slower, all other things being equal.
I would like to take issue with Oliver Bock's comment on page 12 of this thread, where he states that utilising the GPU at 100% is their goal. BOINC states on its first page: "Use the idle time on your computer (Windows, Mac, Linux, or Android) to cure diseases, study global warming, discover pulsars, and do many other types of scientific research. It's safe, secure, and easy." Taking 100% of my computer's GPU and making it laggy and visually unusable is therefore not in the ethos of BOINC.

On the previous application I ran 2 simultaneous BRP4Gs and the GPU ran at about 95-98%. This left plenty for me to run web browsing and YouTube videos, and I never noticed Einstein crunching away in the background. I would be more than happy to continue in this way; with one of these 1.16 applications I cannot.

Someone else suggested running 'not as root user'; well, I don't know how to do this, since I think it already is a root user. Oliver also suggested TThrottle; well, I could only see how GPU use is throttled by temperature, not by % use, so that is a very blunt tool. I tried it and it seemed useless, as it hunted for long periods of time around the set temperature and still allowed 100% use of the GPU. The other option, using computing preferences, is poor for me, as Einstein would then hardly use the GPU at all for many hours of the day.

I also read that people with GTX 750s have this problem, so what of your concern for all the little people who contribute to your science project? My guess is that most will turn off the application or the GPU, as it makes their computer unusable.
Lastly, from an engineering point of view (and please correct me if I am wrong): I was happy that my GPU sat at 95-98% all day, as the temperature was stable from one hour to the next. Now the GPU goes at 100% for 4-5 seconds, then spikes to 0%, then back to 100% again, thermally cycling millions of electronic junctions by a degree or so. If I use computing preferences instead, the temperature will go through much bigger thermal cycles all day long. Is this good for the longevity of my card?
I agree with you that something like 98% could be a better goal than 100% (one of my hosts experienced the same kind of total lagging earlier with a test version... and I can see it's annoying to even try to reach the settings when the computer almost doesn't respond at all).
In the "Expert mode settings" on TThrottle... did you try the 'Laptop' mode for GPU throttling? Manual says: "Sometimes the GPU throttle is working too fast or too slow. Select "desktop" for slow regulating and "laptop" for fast regulation." I don't know if this would help get rid of the lagging, but what if you set even lower temp limit for the GPU... so that TThrottle would really start to limit the utilization?
Pete_28 wrote: Is this good for the longevity of my card?
I believe that if it's possible to run more than one task at a time, that would help to even out the thermal stress at least somewhat. If the tasks are running in different phases, there's a chance that GPU utilization doesn't keep jumping from 0% to something; instead, there will be some amount of load all the time.
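For anyone who wants to try running two tasks per GPU, the usual BOINC mechanism is an app_config.xml file in the Einstein@Home project directory; a minimal sketch is below. The <gpu_usage> value of 0.5 tells the client each task needs half a GPU, so two run concurrently, and <cpu_usage> reserves a CPU core per task. The app name shown is an assumption on my part; check the <name> entries in client_state.xml for the exact FGRPB1G app name on your host.

<app_config>
  <app>
    <name>hsgamma_FGRPB1G</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>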
@Pete: "Now the GPU goes at 100% for 4-5 seconds, then spikes to 0%, then back to 100% again."
That's the way the maths works: we need to go back to the CPU at some points, at least to have the BOINC status saved (checkpointing), so that you don't have to restart your job from the beginning.
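For context, this is the standard BOINC checkpoint pattern the comment above refers to; a minimal sketch assuming the stock BOINC API (boinc_time_to_checkpoint(), boinc_checkpoint_completed(), boinc_fraction_done()), with made-up helper names, not the actual FGRP source:

/* Sketch of BOINC checkpointing, not the actual FGRP code. */
#include "boinc_api.h"

/* hypothetical helpers, assumed to live elsewhere in the app */
void do_one_gpu_step(int step);
void read_back_partial_results(void);
void write_checkpoint(int step);

void main_loop(int total_steps)
{
    for (int step = 0; step < total_steps; step++) {
        do_one_gpu_step(step);

        if (boinc_time_to_checkpoint()) {
            /* Partial results must come back from the GPU before the state
               can be written to disk - one reason the app has to synchronize
               with the CPU periodically. */
            read_back_partial_results();
            write_checkpoint(step);
            boinc_checkpoint_completed();
        }
        boinc_fraction_done((double)step / (double)total_steps);
    }
}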
As for thermal cycling, do you see significant GPU temperature changes?
Mouse/scrolling lag looks like a more important issue to me. Is it only affecting Nvidia cards? From what I have read here, Nvidia cards that take more than 1000-2000 seconds per WU seem to be affected by the lagging.
Please report the inconvenience and your HW specs.
I can make a patch tomorrow to force a sleep in order to make the desktop more responsive, but I need to know on which hardware I have to do it (I don't want to slow down users who don't have any lag effects, like me).
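As an illustration of the kind of patch being described, and of how it could be limited to affected hardware only, here is a hedged sketch; the FGRP_GPU_SLEEP_US environment variable and the helper names are invented for this example and are not an existing option:

/* Sketch only: make a per-batch sleep opt-in, so unaffected users are
 * not slowed down. */
#include <stdlib.h>
#include <unistd.h>

static useconds_t batch_sleep_us(void)
{
    const char *s = getenv("FGRP_GPU_SLEEP_US");   /* hypothetical knob */
    return s ? (useconds_t)atoi(s) : 0;            /* 0 means no throttling */
}

static void yield_between_batches(void)
{
    useconds_t us = batch_sleep_us();
    if (us > 0)
        usleep(us);   /* brief pause so the desktop can use the GPU */
}

Calling yield_between_batches() after each kernel batch would give the display driver a window to schedule desktop rendering, at the cost of slightly longer task runtimes on hosts where the variable is set.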
Thanks,
Christophe
[Update:]
Mac/Hackintosh is FINALLY on the 1.14 FGRPB1G units. It has been crunching since 6 AM PST, and at some point in the last couple of hours the Mac has completed several 1.14 units.
Of the invalids that I have showing for the Mac, three are 1.12 units and two are 1.13 units. I still attribute these invalids to the same OpenCL bug that affects SETI OpenCL units on the Mac. I will keep monitoring and report any new invalids.
Both of my systems are still crunching TWO units at a time per GPU card. Three GPU cards crunching, for a total of 6 units at a time.
TL
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join SETI Refugees