GW O2 All-Sky v1.07 on GPU doesn't work on macOS/AMD

B.I.G
Joined: 26 Oct 07
Posts: 117
Credit: 1175017741
RAC: 936934
Topic 219396

Got GW tasks again today but this issue still isn't solved:

Continuous Gravitational Wave search O2 All-Sky v1.07 () x86_64-apple-darwin

macOS 10.13 with all updates installed

MacPro 5.1 / AMD RX 580

The task reaches ~45-50% within ~35 minutes, then falls back to 0.2%; after another 30 minutes it is at 0.9% with a predicted 5 days remaining. By comparison, FGRP tasks finish in 15 min.

I am going to abort now and deselect GW tasks again. If there is anything I can do to help find the problem, let me know.

geoffdot5
Joined: 22 May 19
Posts: 3
Credit: 25320176
RAC: 0

I too am experiencing the same issue as B.I.G. with GW O2 All-Sky v1.07.

I am on macOS 10.14.6 Mojave.

Mac Pro 5.1 running an AMD RX 580 8 GB.

3.46 GHz 6-core CPU, 32 GB of 1333 MHz ECC DDR3 memory.

512 GB NVMe PCIe storage.

Any help would be appreciated, as I am very new to this. Thanks in advance. Geoff.

B.I.G

Hello geoffdot5,

Until the GW tasks are fixed, you can set your preferences so that this type of task is not sent to you:

Go to Account -> Preferences -> Project.

Select the preference set your computer is using and scroll down to the "Applications" section.

There, deselect all Gravitational Wave applications and, a bit further down, set "Allow non-preferred apps:" to "NO".

Click "Save Changes" at the bottom. Then you can abort all GW tasks in your client and hit the "Update" button. That way you should only get FGRP apps, which run fine. Once the issues on macOS are solved, you can undo these settings again.

geoffdot5
Thanks for the help. All GW tasks are aborted and GW is unticked in preferences; I will now wait for a more stable version.

Rolf
Joined: 7 Aug 17
Posts: 27
Credit: 135377187
RAC: 0

Hi BIG and Geoff,

I don't have a Mac, but I have seen that if you run a GW GPU task concurrently with FGRP GPU tasks, the FGRP tasks hog most of the capacity and the GW task runs very slowly. GW tasks only get some speed when they run together with other GW tasks. They also take a lot of CPU capacity, almost one core per task.

Rolf

B.I.G

Not in my case. The GPU is crunching one task at a time and the GW tasks never finish (admittedly, I haven't let one run for 5 days yet). They also use only 50% of the GPU when running alone, whereas FGRP tasks use up to 90%.

On my machine the GW GPU tasks don't take much CPU time: the GW task uses about 20% of one core, the same as the FGRP apps.

I have disabled CPU work completely, by the way.

geoffdot5

Hi Rolf,

Nice to hear from you. I too was seeing only about 20-30% GPU usage and 1 core, and I got up to 2 days of number crunching before I decided to call time.

Most of my GPU tasks are set to a GPU utilization factor of 0.33 so that I can run 3 concurrently with no issues. I did feel I was pushing it with 0.25, but again it worked fine, just a little toasty, and it never really got above 80% utilisation. All other tasks complete in good time, and up to that point I had very few errors, if any, that I can remember.
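
For anyone wondering how the 0.33 and 0.25 settings map to the number of tasks running at once, here is a minimal Python sketch. It assumes the setting is the fraction of one GPU reserved per task and that tasks are started as long as the fractions add up to no more than one GPU. The function name `concurrent_tasks` is made up for illustration and is not part of BOINC or Einstein@Home.

```python
def concurrent_tasks(gpu_utilization_factor: float) -> int:
    """Return how many tasks fit on one GPU for a given per-task GPU fraction.

    Assumption: tasks are scheduled while the sum of their fractions
    does not exceed 1.0 (one whole GPU).
    """
    if not 0.0 < gpu_utilization_factor <= 1.0:
        raise ValueError("factor must be greater than 0 and at most 1")
    # Small tolerance so exact fractions such as 0.25 are not lost to rounding.
    return int((1.0 + 1e-9) // gpu_utilization_factor)


for factor in (1.0, 0.5, 0.33, 0.25):
    print(f"factor {factor}: {concurrent_tasks(factor)} task(s) at once")
```

With these assumptions, 0.33 gives 3 concurrent tasks and 0.25 gives 4, which matches the numbers described above.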

I will give it a week or two and then re-enable GW tasks to see whether the bug has been fixed or a workaround has been found.

Thanks in advance. Geoff

pzajdel
Joined: 23 Mar 11
Posts: 9
Credit: 17867366
RAC: 10805

Hello,

I have the same problem with AMD (8970M)/Win10
Application Continuous Gravitational Wave search O2 All-Sky 1.07 (GW-opencl-ati)
Progress rate 0.360% per hour
Executable einstein_O2AS20-500_1.07_windows_x86_64__GW-opencl-ati.exe

It is set up to run 2 tasks at once since that's a bit faster, but it gets stuck even with one. The GPU is barely used. It does not depend on what else is running on this GPU. Other apps (SETI@home) run OK.

I hadn't noticed it for the last 13 hours. I wonder how many people still run it.
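
To put the reported progress rate of 0.360% per hour into perspective, here is a rough Python sketch of a straight-line extrapolation. It assumes the rate stays constant for the whole task, which may not hold, and the helper name `hours_to_finish` is only illustrative.

```python
def hours_to_finish(progress_percent: float, rate_percent_per_hour: float) -> float:
    """Linearly extrapolate the hours left until a task reaches 100%.

    Assumption: the progress rate stays constant until the task ends.
    """
    if rate_percent_per_hour <= 0:
        raise ValueError("rate must be positive")
    return (100.0 - progress_percent) / rate_percent_per_hour


# Rate reported by the client for the GW-opencl-ati task, starting from 0%.
hours = hours_to_finish(0.0, 0.360)
print(f"{hours:.0f} hours, about {hours / 24:.1f} days")
```

Under that assumption a single task would need roughly 278 hours, or between 11 and 12 days.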

Best

Pawel

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117652786098
RAC: 35177980

pzajdel wrote:
... I have the same problem with AMD (8970M)/Win10 ...

I doubt it's the same problem.  The OP is running an RX 580 on a Mac.  The problem there is probably something to do with Apple's implementation of OpenCL.  Under Windows or Linux, the RX 580 can run a lot better than what the OP reports.

Your problem is probably that an 8970M is a GCN 1st gen device.  I've just spent some time testing a HD 7850 (also 1st gen) with a single task on the V1.07 app.  It runs exceedingly slowly.  At first it looked like it would work quite well - it got to about 27% progress after about 30 mins.  As it was continuing to crunch, the progress dropped back to zero and it then took about 5 hours to make it back to 8%.  At that point I pulled the plug and put it back to crunching FGRPB1G where it takes about 20 mins per task crunching 2 at a time.
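
As a rough illustration of how large that slowdown is, the Python sketch below compares the two phases using the approximate figures quoted above (27% in 30 minutes, then 8% in 5 hours). The function name is made up for this example and the inputs are only estimates, so the result is no more than an order of magnitude.

```python
def rate_percent_per_hour(progress_percent: float, elapsed_hours: float) -> float:
    """Average progress rate over an interval, in percent per hour."""
    if elapsed_hours <= 0:
        raise ValueError("elapsed time must be positive")
    return progress_percent / elapsed_hours


# Approximate figures from the HD 7850 test: before and after the reset to 0%.
fast_phase = rate_percent_per_hour(27.0, 0.5)
slow_phase = rate_percent_per_hour(8.0, 5.0)
slowdown = fast_phase / slow_phase

print(f"before reset: {fast_phase:.1f} %/h, after reset: {slow_phase:.1f} %/h")
print(f"slowdown factor: roughly {slowdown:.0f}x")
```

On these numbers the second phase runs about 30 to 35 times slower than the first.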

The GW app is really still under development and it may take quite a while to become more efficient than it currently is.  Until things improve, I suggest you return to crunching FGRPB1G.  You have some completed tasks of that type in your list.  One of them has an excessively long crunch time - I'm guessing perhaps it was paired up with a GW task.  There have been some comments that it's not a good idea to mix the two types.

Cheers,
Gary.

B.I.G

I'm not so sure about the OpenCL implementation being the problem. The FGRP apps, the programs I use for work and so on all work fine. I have also now tried my MacBook Pro with an NVIDIA GT 650M: there the tasks take as long as they used to with the CPU version, but at least they finish - validation pending.

On the MacPro, by contrast, they don't finish at all... well, let's see what the developers are able to do about it in the coming weeks.

Gary Roberts

B.I.G wrote:
... on the MacPro they don't finish at all...

I didn't respond to your initial message about the problem because I didn't have any experience or anything useful to add at that time.  You said that crunching was progressing so slowly that you decided to abort - the correct thing to do under the circumstances.  However, that doesn't seem to tie in with your current "don't finish at all" comment.

There were two reasons why I decided to respond to pzajdel's message.  Firstly, I wanted him to understand why his issue was NOT the same as yours.  Secondly, by chance, I had just completed a series of tests on a HD 7850 which is exactly the same GCN generation as his GPU.  Also, coincidentally, the behaviour of the HD 7850 seemed to be very similar to what you had described in your original post - crunching proceeds at a reasonable speed for a while and then the progress resets to 0% and continues at a crawl.  It was pointless to continue tying up a GPU when the task could be crunched faster on a single CPU core so, like you, I aborted.

As a matter of interest, I tried crunching GW tasks on the HD 7850 using both the proprietary fglrx/opencl driver that was deprecated in 2016 and the latest amdgpu/legacy opencl as provided by the AMDGPU-PRO package from AMD.  In both cases there is a similar behaviour - crunching seems to work well for a while and then the reset to 0% occurs, followed by the slow crawl.  The slow crawl was markedly slower for the current AMDGPU-PRO version of OpenCL as compared to the 2016 proprietary version that came with fglrx.  In both cases, there seemed to be no point in continuing.

For your problem, you are using a relatively modern GPU that does work in both Windows and Linux.  At this early stage, there still seem to be validation issues but the crunching itself on RX 570/580 seems to work as it should.  Unless Apple has done something to the device firmware, what do you think could be the cause of the different behaviour you see?  To me, it seems likely that it's down to how the app responds to the different OS/firmware/driver/OpenCL implementation.  I have no idea exactly what might be causing this particular behaviour.  However, if crunching seems 'normal' on Windows and Linux, the Apple ecosystem seems a likely candidate.

Cheers,
Gary.
