Native Apple Silicon / M1|M2|M3 Apps available

Marcin
Joined: 19 Jun 09
Posts: 20
Credit: 6749684
RAC: 1143

Bernd Machenschalk wrote:

I set the GPU App to require only half a core. Feedback is welcome.

Regarding the overshoot of tasks, this is an issue with BOINC's runtime estimation, which is difficult to handle. If you're not running any other App of this project on the machine, it should adjust itself after running some dozens of these tasks. I'll see what I can do to mitigate that on the server side, too. Regarding server side management these processors' behavior is pretty new, and the app is also still under development.
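
Editorial sketch: one plausible reading of the overshoot described above is that the client sizes its work requests from a per-task runtime estimate. An estimate that is too low then makes it fetch too many tasks, until completed tasks pull the estimate toward reality. The toy model below illustrates only that feedback loop. It is not BOINC's actual scheduling code: the function names, the buffer size, the initial estimate, and the smoothing weight are all assumptions, and the 1400-second figure is the M1 runtime reported further down in this thread.

```python
def tasks_to_fetch(buffer_seconds, estimated_runtime):
    """Tasks needed to fill the work buffer at the current runtime estimate (toy model)."""
    return max(1, int(buffer_seconds // estimated_runtime))


def update_estimate(estimate, observed_runtime, weight=0.1):
    """Move the estimate a fraction of the way toward an observed runtime.

    A plain exponential moving average, chosen for illustration only.
    """
    return estimate + weight * (observed_runtime - estimate)


def simulate(initial_estimate, true_runtime, completed_tasks, weight=0.1):
    """Return the runtime estimate after the given number of completed tasks."""
    estimate = initial_estimate
    for _ in range(completed_tasks):
        estimate = update_estimate(estimate, true_runtime, weight)
    return estimate


if __name__ == "__main__":
    buffer_seconds = 14000   # hypothetical work buffer
    true_runtime = 1400      # seconds per task, as reported for an M1 in this thread
    initial_estimate = 140   # hypothetical, deliberately too low by a factor of 10

    # With the low estimate the client asks for far more work than it can finish.
    print(tasks_to_fetch(buffer_seconds, initial_estimate))

    # After a few dozen completed tasks the estimate, and the request size, settle.
    settled = simulate(initial_estimate, true_runtime, completed_tasks=50)
    print(tasks_to_fetch(buffer_seconds, settled))
```

In this model the request size only corrects itself once enough tasks have actually finished, which would be consistent with the "some dozens of these tasks" mentioned above.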


The app terminated immediately on my M1 with an error:

Quote:

Name: p2030.1711377073.G60.20-03.65.C.b2s0g0.00000_3481_0
Workunit ID: 796977518
Created: 29 Mar 2024 5:44:57 UTC
Sent: 29 Mar 2024 16:58:34 UTC
Report deadline: 12 Apr 2024 16:58:34 UTC
Received: 29 Mar 2024 17:00:06 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 6 (0x00000006) Unknown error code
Computer: 12985378
Run time (sec): 8.21
CPU time (sec): 0.99
Peak working set size (MB): 27.28
Peak swap size (MB): 421378.08
Peak disk usage (MB): 0.02
Validation state: Invalid
Granted credit: 0
Application: Binary Radio Pulsar Search (Arecibo,GBT,A) v2.08 (Apple_M-opencl) arm64-apple-darwin


Quote:

Stderr output

<core_client_version>7.22.2</core_client_version>
<![CDATA[
<message>
process exited with code 6 (0x6, -250)</message>
<stderr_txt>
[17:58:40][13531][INFO ] Application startup - thank you for supporting Einstein@Home!
[17:58:40][13531][INFO ] Starting data processing...
[17:58:40][13531][INFO ] Using Metal device "Apple M1"
[17:58:40][13531][INFO ] Checkpoint file unavailable: p2030.1711377073.G60.20-03.65.C.b2s0g0.00000_3481.cpt (No such file or directory).
------> Starting from scratch...
[17:58:40][13531][INFO ] Header contents:
------> Original WAPP file: ./p2030.20160623.G60.20-03.65.C.b2s0g0.00000_DM734.10
------> Sample time in microseconds: 65.4762
------> Observation time in seconds: 274.62705
------> Time stamp (MJD): 57562.305947446097
------> Number of samples/record: 0
------> Center freq in MHz: 1214.289551
------> Channel band in MHz: 0.336182022
------> Number of channels/record: 960
------> Nifs: 1
------> RA (J2000): 195732.3456
------> DEC (J2000): 221420.4799
------> Galactic l: 0
------> Galactic b: 0
------> Name: G60.20-03.65.C
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 4194304
------> Trial dispersion measure: 734.1 cm^-3 pc
------> Scale factor: 0.000969215
[17:58:40][13531][INFO ] Seed for random number generator is 1161917594.
-[MTLComputePipelineDescriptorInternal setComputeFunction:withType:]:722: failed assertion `computeFunction must not be nil.'

[17:58:41][13531][ERROR] Application caught signal 6.
[17:58:41][13531][ERROR] Backtrace:
------> 0 einsteinbinary_BRP4A_2.08_arm64-app 0x0000000104f83cac _ZL10sighandleri + 148
------> 1 libsystem_platform.dylib 0x000000018a01b584 _sigtramp + 56
------> 2 libsystem_pthread.dylib 0x0000000189feac20 pthread_kill + 288
------> 3 libsystem_c.dylib 0x0000000189ef7a20 abort + 180
------> 4 libsystem_c.dylib 0x0000000189ef6d10 err + 0
------> 5 Metal 0x00000001944a11a4 _Z13MTLGetEnvCaseI16MTLErrorModeTypeEbPKcRT_RKNSt3__16vectorINS5_4pairIS2_S3_EENS5_9allocatorIS8_EEEE.cold.1 + 0
------> 6 Metal 0x000000019447ddc0 MTLReportFailure + 464
------> 7 Metal 0x000000019446cc04 -[MTLComputePipelineDescriptorInternal setComputeFunction:withType:] + 244
------> 8 Metal 0x0000000194351f7c -[_MTLDevice newComputePipelineStateWithFunction:error:] + 56
------> 9 einsteinbinary_BRP4A_2.08_arm64-app 0x0000000104f89474 _ZL15initializeVkFFTP16VkFFTApplication18VkFFTConfiguration + 452
------> 10 einsteinbinary_BRP4A_2.08_arm64-app 0x0000000104f8909c set_up_fft + 104
------> 11 einsteinbinary_BRP4A_2.08_arm64-app 0x0000000104f86af4 MAIN + 10184
------> 12 einsteinbinary_BRP4A_2.08_arm64-app 0x0000000104f83914 main + 2300
------> 13 dyld 0x0000000189c620e0 start + 2360
17:58:46 (13531): called boinc_finish(6)

</stderr_txt>
]]>
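
Editorial sketch: the backtrace above shows the abort being raised inside Metal (`MTLReportFailure`, after the `computeFunction must not be nil` assertion) and reached from the app's own `initializeVkFFT` via `set_up_fft`. For readers who want to pull that information out of similar stderr dumps, a small parser along these lines might help. The function names, the frame-line pattern, and the rule for skipping the signal handler frame are assumptions made for this example. They are not part of BOINC or of the Einstein@Home app.

```python
import re

# Matches backtrace lines of the form seen in the log above:
#   ------> <index> <image> <address> <symbol> + <offset>
FRAME_RE = re.compile(
    r"^-+>\s+(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(.+?)\s+\+\s+(\d+)\s*$"
)


def parse_frames(stderr_text):
    """Return the backtrace frames found in a stderr dump as a list of dicts."""
    frames = []
    for line in stderr_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            index, image, address, symbol, offset = match.groups()
            frames.append({
                "index": int(index),
                "image": image,
                "address": address,
                "symbol": symbol,
                "offset": int(offset),
            })
    return frames


def first_app_frame(frames, image_prefix="einsteinbinary"):
    """Return the first frame from the app binary that is not its signal handler."""
    for frame in frames:
        if frame["image"].startswith(image_prefix) and "sighandler" not in frame["symbol"]:
            return frame
    return None


# Three lines copied from the log above.
SAMPLE = """\
------> 0 einsteinbinary_BRP4A_2.08_arm64-app 0x0000000104f83cac _ZL10sighandleri + 148
------> 7 Metal 0x000000019446cc04 -[MTLComputePipelineDescriptorInternal setComputeFunction:withType:] + 244
------> 9 einsteinbinary_BRP4A_2.08_arm64-app 0x0000000104f89474 _ZL15initializeVkFFTP16VkFFTApplication18VkFFTConfiguration + 452
"""

if __name__ == "__main__":
    frame = first_app_frame(parse_frames(SAMPLE))
    print(frame["index"], frame["symbol"])
```

Header lines such as `------> Nifs: 1` share the arrow prefix but do not match the pattern, because the pattern requires a numeric frame index followed by an image name and a hexadecimal address.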


Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250468812
RAC: 35303

Oliver Behnke wrote:

Guys, we think we found the culprit causing the signal 6 errors on Sonoma, finally!

The details can be found here, but it boils down to this: you have to install BOINC 7.24.3 to fix the permission issue. If a simple update doesn't help, you might want to uninstall and fully reinstall BOINC to make sure.

The good thing is that investigating the issue also allowed us to improve the numerical stability of our app such that it should validate much better :)

We're going to release a new version (2.08) without the debugging stuff soon.

Thanks for your patience! This one took a while to figure out because we couldn't reproduce the error in our labs.

Cheers,
Oliver

That looks like the problem Oliver mentioned here. If you didn't install BOINC Client 7.24.3 recently, please do so.

BM
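
Editorial sketch: the failed task's stderr output above reports `<core_client_version>7.22.2</core_client_version>`, while the fix named in this thread is BOINC 7.24.3. A quick way to check a saved stderr dump against that version could look like the following. The function names and the choice to return `None` when no version tag is found are assumptions made for this example.

```python
import re

# Client version named in this thread as fixing the permission issue.
REQUIRED_VERSION = (7, 24, 3)

VERSION_RE = re.compile(
    r"<core_client_version>(\d+)\.(\d+)\.(\d+)</core_client_version>"
)


def client_version(stderr_output):
    """Extract the BOINC client version from a task's stderr output, if present."""
    match = VERSION_RE.search(stderr_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def needs_update(stderr_output, required=REQUIRED_VERSION):
    """Return True if the reported client is older than `required`.

    Returns None when the output carries no version tag.
    """
    version = client_version(stderr_output)
    if version is None:
        return None
    return version < required


if __name__ == "__main__":
    reported = "<core_client_version>7.22.2</core_client_version>"
    print(client_version(reported), needs_update(reported))
```

Comparing integer tuples rather than raw strings avoids misordering versions such as 7.9.x and 7.24.x.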

Marcin
Joined: 19 Jun 09
Posts: 20
Credit: 6749684
RAC: 1143

Thanks for the prompt response!
I have run the update and will report whether a GPU task succeeds once I get one; Einstein has not assigned any yet.

Marcin
Joined: 19 Jun 09
Posts: 20
Credit: 6749684
RAC: 1143

Marcin wrote:

Thanks for the prompt response!
I have run the update and will report whether a GPU task succeeds once I get one; Einstein has not assigned any yet.


Follow-up: after the update, tasks now come in and seem to compute properly, at least for the Binary Radio Pulsar Search (Arecibo,GBT,A) app.

booc0mtaco
Joined: 30 Nov 18
Posts: 4
Credit: 15612112
RAC: 76300

Thanks for the solution. I'm seeing some tasks running without error so far :) Will report back once they reach validation.

EDIT: One thing I am curious about (possibly a tangent): I didn't see the usual prompt in the BOINC client saying that a newer 7.x version was available. I suspect others with this hardware might also not see an update and still be on a 7.22.x version, as I was. Not sure if anything can be done on the project side to help folks update to the latest...

jd
Joined: 13 Mar 05
Posts: 36
Credit: 557054846
RAC: 83094

Same here. Using “check for update” returned no update. However, I checked the BOINC website anyway and saw that there was one. This was a couple of weeks ago. Unfortunately, I didn’t realize the implications; otherwise I could have helped people here. Disappointing.

zombie67 [MM]
Joined: 10 Oct 06
Posts: 121
Credit: 492689980
RAC: 1430099

So now that everything works, can we get a readout on the performance of the Mac GPUs? I am running the original M1 Mac mini, and my tasks take about 1400 seconds. How would that compare, theoretically, to other GPUs? And what about other Apple GPUs like the Max/Pro/Ultra? M1 vs. M2 vs. M3?

 

Reno, NV Team: SETI.USA

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 984
Credit: 25171438
RAC: 39

jd wrote:

Same here. Using “check for update” returned no update. However, I checked the BOINC website anyway and saw that there was one. This was a couple of weeks ago. Unfortunately, I didn’t realize the implications; otherwise I could have helped people here. Disappointing.

There's a bug report upstream for this issue.

Einstein@Home Project

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 984
Credit: 25171438
RAC: 39

zombie67 wrote:

So now that everything works, can we get a readout on the performance of the Mac GPUs? I am running the original M1 Mac mini, and my tasks take about 1400 seconds. How would that compare, theoretically, to other GPUs? And what about other Apple GPUs like the Max/Pro/Ultra? M1 vs. M2 vs. M3?

As noted above, the current GPU version isn't fully done yet. It's our first proof-of-concept native Metal app, and it still has the potential to become a couple of times faster. Feel free, though, to start a thread in the Cruncher's Corner to gather timings from your fellow volunteers. Please keep in mind that it's not straightforward to compare those figures meaningfully, as they depend not only on the hardware used but also on the app version and the runtime environment, e.g. the GPU and CPU load as well as the (unified!) memory consumption of everything else running on the system.

Oliver

Einstein@Home Project

zombie67 [MM]
Joined: 10 Oct 06
Posts: 121
Credit: 492689980
RAC: 1430099

Oliver Behnke wrote:

If you look back closely the error didn't change. We just augmented the output which gave us the needed clue.

Please note that the app isn't yet at its full potential since we're not done yet with porting all of the data analysis code. So stay tuned for more :)

Oliver

 

Any news on this optimization effort?

Reno, NV Team: SETI.USA
