Native Apple Silicon / M1|M2|M3 Apps available

Marcin

Joined: 19 Jun 09

Posts: 20

Credit: 6749684

RAC: 1143

29 Mar 2024 17:03:17 UTC

Message 223724 in response to message 223665

Bernd Machenschalk wrote:

I set the GPU App to require only half a core. Feedback is welcome.

Regarding the overshoot of tasks, this is an issue with BOINC's runtime estimation, which is difficult to handle. If you're not running any other App of this project on the machine, it should adjust itself after running some dozens of these tasks. I'll see what I can do to mitigate that on the server side, too. Regarding server side management these processors' behavior is pretty new, and the app is also still under development.

The app terminated instantly on my M1 with an error:

Quote:

Name: p2030.1711377073.G60.20-03.65.C.b2s0g0.00000_3481_0
Workunit ID: 796977518
Created: 29 Mar 2024 5:44:57 UTC
Sent: 29 Mar 2024 16:58:34 UTC
Report deadline: 12 Apr 2024 16:58:34 UTC
Received: 29 Mar 2024 17:00:06 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 6 (0x00000006) Unknown error code
Computer: 12985378
Run time (sec): 8.21
CPU time (sec): 0.99
Peak working set size (MB): 27.28
Peak swap size (MB): 421378.08
Peak disk usage (MB): 0.02
Validation state: Invalid
Granted credit: 0
Application: Binary Radio Pulsar Search (Arecibo,GBT,A) v2.08 (Apple_M-opencl) arm64-apple-darwin

Quote:

Stderr output
<core_client_version>7.22.2</core_client_version>
<![CDATA[
<message>
process exited with code 6 (0x6, -250)</message>
<stderr_txt>
[17:58:40][13531][INFO ] Application startup - thank you for supporting Einstein@Home!
[17:58:40][13531][INFO ] Starting data processing...
[17:58:40][13531][INFO ] Using Metal device "Apple M1"
[17:58:40][13531][INFO ] Checkpoint file unavailable: p2030.1711377073.G60.20-03.65.C.b2s0g0.00000_3481.cpt (No such file or directory).
------> Starting from scratch...
[17:58:40][13531][INFO ] Header contents:
------> Original WAPP file: ./p2030.20160623.G60.20-03.65.C.b2s0g0.00000_DM734.10
------> Sample time in microseconds: 65.4762
------> Observation time in seconds: 274.62705
------> Time stamp (MJD): 57562.305947446097
------> Number of samples/record: 0
------> Center freq in MHz: 1214.289551
------> Channel band in MHz: 0.336182022
------> Number of channels/record: 960
------> Nifs: 1
------> RA (J2000): 195732.3456
------> DEC (J2000): 221420.4799
------> Galactic l: 0
------> Galactic b: 0
------> Name: G60.20-03.65.C
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 4194304
------> Trial dispersion measure: 734.1 cm^-3 pc
------> Scale factor: 0.000969215
[17:58:40][13531][INFO ] Seed for random number generator is 1161917594.
-[MTLComputePipelineDescriptorInternal setComputeFunction:withType:]:722: failed assertion `computeFunction must not be nil.'
[17:58:41][13531][ERROR] Application caught signal 6.
[17:58:41][13531][ERROR] Backtrace:
------> 0 einsteinbinary_BRP4A_2.08_arm64-app 0x0000000104f83cac _ZL10sighandleri + 148
------> 1 libsystem_platform.dylib 0x000000018a01b584 _sigtramp + 56
------> 2 libsystem_pthread.dylib 0x0000000189feac20 pthread_kill + 288
------> 3 libsystem_c.dylib 0x0000000189ef7a20 abort + 180
------> 4 libsystem_c.dylib 0x0000000189ef6d10 err + 0
------> 5 Metal 0x00000001944a11a4 _Z13MTLGetEnvCaseI16MTLErrorModeTypeEbPKcRT_RKNSt3__16vectorINS5_4pairIS2_S3_EENS5_9allocatorIS8_EEEE.cold.1 + 0
------> 6 Metal 0x000000019447ddc0 MTLReportFailure + 464
------> 7 Metal 0x000000019446cc04 -[MTLComputePipelineDescriptorInternal setComputeFunction:withType:] + 244
------> 8 Metal 0x0000000194351f7c -[_MTLDevice newComputePipelineStateWithFunction:error:] + 56
------> 9 einsteinbinary_BRP4A_2.08_arm64-app 0x0000000104f89474 _ZL15initializeVkFFTP16VkFFTApplication18VkFFTConfiguration + 452
------> 10 einsteinbinary_BRP4A_2.08_arm64-app 0x0000000104f8909c set_up_fft + 104
------> 11 einsteinbinary_BRP4A_2.08_arm64-app 0x0000000104f86af4 MAIN + 10184
------> 12 einsteinbinary_BRP4A_2.08_arm64-app 0x0000000104f83914 main + 2300
------> 13 dyld 0x0000000189c620e0 start + 2360
17:58:46 (13531): called boinc_finish(6)

</stderr_txt>
]]>

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4312

Credit: 250468812

RAC: 35303

29 Mar 2024 19:07:49 UTC

Message 223726 in response to message 223582

Oliver Behnke wrote:

Guys, we think we found the culprit causing the signal 6 errors on Sonoma, finally!

The details can be found here but it boils down to you having to install BOINC 7.24.3 to fix the permission issue. If a simple update doesn't help you might want to uninstall and fully reinstall BOINC to make sure.

The good thing is that investigating the issue also allowed us to improve the numerical stability of our app such that it should validate much better :)

We're going to release a new version (2.08) without the debugging stuff soon.

Thanks for your patience! This one took a while to figure out because we couldn't reproduce the error in our labs.

Cheers,
Oliver

That looks like the problem Oliver described in the post quoted above. If you haven't installed BOINC Client 7.24.3 yet, please do so.
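For anyone who wants to check which client version a failed task ran under: the stderr output of a task begins with a `<core_client_version>` tag (7.22.2 in the log above). The following is only a minimal Python sketch, assuming that tag is present and carries three numeric components. The function names `client_version` and `needs_update` are made up for illustration and are not part of BOINC.

```python
import re

# Version named in this thread as containing the fix for the signal 6 errors.
REQUIRED = (7, 24, 3)


def client_version(stderr_text):
    """Return the client version found in a task's stderr output as a tuple, or None."""
    match = re.search(
        r"<core_client_version>(\d+)\.(\d+)\.(\d+)</core_client_version>",
        stderr_text,
    )
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def needs_update(stderr_text, required=REQUIRED):
    """True if a version tag was found and it is older than the required version."""
    version = client_version(stderr_text)
    return version is not None and version < required


# First line of the stderr output quoted in this thread.
sample = "<core_client_version>7.22.2</core_client_version>"
print(client_version(sample), needs_update(sample))
```

Comparing tuples of integers avoids the pitfall of comparing version strings character by character. If no tag is found, `needs_update` returns False, so a missing tag should be checked separately via `client_version`.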

Marcin

Joined: 19 Jun 09

Posts: 20

Credit: 6749684

RAC: 1143

29 Mar 2024 21:49:44 UTC

Message 223730

Thanks for the prompt response!
I have run the update and will report whether a GPU task succeeds once I get some; Einstein@Home has not assigned any yet.

Marcin

Joined: 19 Jun 09

Posts: 20

Credit: 6749684

RAC: 1143

30 Mar 2024 17:16:55 UTC

Message 223744 in response to message 223730

Marcin wrote:

Thanks for the prompt response!
I have run the update and will report whether a GPU task succeeds once I get some; Einstein@Home has not assigned any yet.

Follow-up: after the update, GPU tasks now come in and seem to compute properly, at least for Binary Radio Pulsar Search (Arecibo,GBT,A).

booc0mtaco

Joined: 30 Nov 18

Posts: 4

Credit: 15612112

RAC: 76300

1 Apr 2024 17:08:22 UTC

Message 223782

Thanks for the solution. I'm seeing some tasks running without error so far :) Will report back once they reach validation.

EDIT: One thing I am curious about (possibly a tangent): I didn't see the usual prompt in the BOINC client saying that a newer 7.x version was available. I suspect others with this hardware might also not see an update and still be on a 7.22.x version, like I was. Not sure if anything can be done on the project side to help folks update to the latest version...

jd

Joined: 13 Mar 05

Posts: 36

Credit: 557054846

RAC: 83094

1 Apr 2024 17:47:01 UTC

Message 223785 in response to message 223782

Same with me. Using “check for update” returned no update. However, I checked the BOINC website anyway and saw there was an update. This was a couple of weeks ago. Unfortunately, I didn’t realize the implications; otherwise I could have helped people here. Disappointing.

zombie67 [MM]

Joined: 10 Oct 06

Posts: 121

Credit: 492689980

RAC: 1430099

3 Apr 2024 4:34:25 UTC

Message 223820

So now that everything works, can we get a readout on the performance of the Mac GPUs? I am running the original M1 Mac mini, and my tasks take about 1400 seconds. How would that compare, theoretically, to other GPUs? And what about other Apple GPUs, like the Pro/Max/Ultra variants? M1 vs. M2 vs. M3?

Reno, NV Team: SETI.USA

Oliver Behnke

Moderator

Administrator

Joined: 4 Sep 07

Posts: 984

Credit: 25171438

RAC: 39

3 Apr 2024 8:03:00 UTC

Message 223823 in response to message 223785

jd wrote:

Same with me. Using “check for update” returned no update. However, I checked the BOINC website anyway and saw there was an update. This was a couple of weeks ago. Unfortunately, I didn’t realize the implications; otherwise I could have helped people here. Disappointing.

There's a bug report upstream for this issue.

Einstein@Home Project

Oliver Behnke

Moderator

Administrator

Joined: 4 Sep 07

Posts: 984

Credit: 25171438

RAC: 39

3 Apr 2024 8:17:40 UTC

Message 223824 in response to message 223820

zombie67 wrote:

So now that everything works, can we get a readout on the performance of the Mac GPUs? I am running the original M1 Mac mini, and my tasks take about 1400 seconds. How would that compare, theoretically, to other GPUs? And what about other Apple GPUs, like the Pro/Max/Ultra variants? M1 vs. M2 vs. M3?

As noted above, the current GPU version isn't fully done yet. It's our first proof-of-concept native Metal app, and it still has the potential to become faster by a couple of factors. Feel free, though, to start a thread in the Cruncher's Corner to gather timings from your fellow volunteers. Please keep in mind that it's not straightforward to compare those figures meaningfully, as they depend not only on the hardware used but also on the app version and the runtime environment, e.g. the GPU and CPU load as well as the (unified!) memory consumption of everything else running on the system.

Oliver

Einstein@Home Project

zombie67 [MM]

Joined: 10 Oct 06

Posts: 121

Credit: 492689980

RAC: 1430099

27 Apr 2024 3:51:26 UTC

Message 224515 in response to message 223587

Oliver Behnke wrote:

If you look back closely the error didn't change. We just augmented the output which gave us the needed clue.

Please note that the app isn't yet at its full potential since we're not done yet with porting all of the data analysis code. So stay tuned for more :)

Oliver

Any news on this optimization effort?

Reno, NV Team: SETI.USA
