We're testing new Apple Silicon app versions for BRP4, both CPU and CPU/GPU. As validation against other BRP4 apps is pretty poor, we are testing this as a separate BOINC application, "BRP4A". Although the plan class is named "Apple_M-opencl" (so that it works with current clients), the GPU version is actually our first Metal code.
Application: Binary Radio Pulsar Search (Arecibo,GBT,A) v2.00 (Apple_M-opencl)
arm64-apple-darwin
Stderr output
<core_client_version>7.22.2</core_client_version>
<![CDATA[
<message>
process exited with code 6 (0x6, -250)</message>
<stderr_txt>
[08:00:45][143][INFO ] Application startup - thank you for supporting Einstein@Home!
[08:00:45][143][INFO ] Starting data processing...
[08:00:45][143][INFO ] Using Metal device "Apple M1 Ultra"
[08:00:45][143][INFO ] Checkpoint file unavailable: p2030.20190214.G193.86-01.90.N.b0s0g0.00000_1420.cpt (No such file or directory).
------> Starting from scratch...
[08:00:45][143][INFO ] Header contents:
------> Original WAPP file: ./p2030.20190214.G193.86-01.90.N.b0s0g0.00000_DM142.00
------> Sample time in microseconds: 65.4762
------> Observation time in seconds: 274.62705
------> Time stamp (MJD): 58529.069560120333
------> Number of samples/record: 0
------> Center freq in MHz: 1214.289551
------> Channel band in MHz: 0.336182022
------> Number of channels/record: 960
------> Nifs: 1
------> RA (J2000): 60841.4006996
------> DEC (J2000): 160251.518299
------> Galactic l: 0
------> Galactic b: 0
------> Name: G193.86-01.90.N
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 4194304
------> Trial dispersion measure: 142 cm^-3 pc
------> Scale factor: 0.00134561
[08:00:45][143][INFO ] Seed for random number generator is 1161415202.
-[MTLComputePipelineDescriptorInternal setComputeFunction:withType:]:722: failed assertion `computeFunction must not be nil.'
[08:00:48][143][ERROR] Application caught signal 6.
08:00:53 (143): called boinc_finish(6)
Bernd Machenschalk
Great news Bernd!
Forgive me for asking but what is BRP4?
Just pulled down some new tasks today on my Mac (M1 Ultra on macOS 14.3.1). Looks like they hit a computation error after ~7 s. Sounds like this is expected with the current app, given the OS version?
Sorry, the full name as in the preferences is "Binary Radio Pulsar Search (Arecibo,GBT,arm64) (BRP4A)".
BM
Sorry, no. I meant at least macOS Ventura; newer versions should be OK. Could you share your HostID or the stderr of the failed tasks? Are these pure CPU tasks, or do they use the GPU/coprocessor?
BM
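To check the "at least macOS Ventura" requirement on a given host, a minimal Python sketch could compare the reported release against version 13 (Ventura). The helper names `parse_version` and `meets_minimum` are made up for illustration and are not part of BOINC or the Einstein@Home app; `platform.mac_ver()` is from the Python standard library and returns an empty release string on non-Mac systems.

```python
import platform

MIN_MACOS = (13, 0)  # macOS Ventura is version 13


def parse_version(release):
    """Turn a string like '14.3.1' into (14, 3, 1); stop at the first non-numeric part."""
    parts = []
    for piece in release.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(release, minimum=MIN_MACOS):
    """Return True if the release string is at least the minimum version."""
    version = parse_version(release)
    if not version:
        return False  # empty or unparsable release, e.g. not running on macOS
    # Pad short versions such as '13' so that (13,) compares like (13, 0).
    padded = version + (0,) * max(0, len(minimum) - len(version))
    return padded >= minimum


if __name__ == "__main__":
    release = platform.mac_ver()[0]  # '' on non-Mac systems
    print(f"macOS release: {release or 'n/a'}, meets minimum: {meets_minimum(release)}")
```

By this check, the macOS 14.3.1 hosts mentioned in this thread satisfy the stated minimum, so the OS version alone would not explain the errors.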
HostID is 12927206. Looking in stderrdae.txt, one example task (p2030.20190214.G193.86-01.90.N.b4s0g0.00000_3108_0) started, finished 7-8 seconds later, and logged the following lines:
06-Mar-2024 00:11:29 [Einstein@Home] Starting task p2030.20190214.G193.86-01.90.N.b4s0g0.00000_3108_0
06-Mar-2024 00:11:37 [Einstein@Home] Computation for task p2030.20190214.G193.86-01.90.N.b4s0g0.00000_3108_0 finished
06-Mar-2024 00:11:37 [Einstein@Home] Output file p2030.20190214.G193.86-01.90.N.b4s0g0.00000_3108_0_0 for task p2030.20190214.G193.86-01.90.N.b4s0g0.00000_3108_0 absent
06-Mar-2024 00:11:37 [Einstein@Home] Output file p2030.20190214.G193.86-01.90.N.b4s0g0.00000_3108_0_1 for task p2030.20190214.G193.86-01.90.N.b4s0g0.00000_3108_0 absent
06-Mar-2024 00:11:37 [Einstein@Home] Output file p2030.20190214.G193.86-01.90.N.b4s0g0.00000_3108_0_2 for task p2030.20190214.G193.86-01.90.N.b4s0g0.00000_3108_0 absent
06-Mar-2024 00:11:37 [Einstein@Home] Output file p2030.20190214.G193.86-01.90.N.b4s0g0.00000_3108_0_3 for task p2030.20190214.G193.86-01.90.N.b4s0g0.00000_3108_0 absent
I believe these are using the coprocessor; the application is Binary Radio Pulsar Search (Apple_M-opencl) and the status shown is "Computation error (1 CPU + 1 Apple M1 Ultra)".
Hope that helps
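For anyone who wants to pull the run time of failed tasks out of the client log automatically, here is a rough Python sketch. It assumes the line shape seen in the log lines quoted in this post ("DD-Mon-YYYY HH:MM:SS [Project] message"); the function name `task_durations` is hypothetical, and the `%b` month parsing assumes an English locale.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the quoted log lines:
# "DD-Mon-YYYY HH:MM:SS [Project] message"
LINE_RE = re.compile(r"^(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) \[([^\]]+)\] (.*)$")
START_RE = re.compile(r"^Starting task (\S+)$")
FINISH_RE = re.compile(r"^Computation for task (\S+) finished$")


def task_durations(lines):
    """Return {task_name: seconds} for tasks that have both a start and a finish line."""
    started = {}
    durations = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not look like event-log entries
        stamp = datetime.strptime(m.group(1), "%d-%b-%Y %H:%M:%S")
        message = m.group(3)
        s = START_RE.match(message)
        if s:
            started[s.group(1)] = stamp
            continue
        f = FINISH_RE.match(message)
        if f and f.group(1) in started:
            durations[f.group(1)] = (stamp - started[f.group(1)]).total_seconds()
    return durations


log = [
    "06-Mar-2024 00:11:29 [Einstein@Home] Starting task p2030.20190214.G193.86-01.90.N.b4s0g0.00000_3108_0",
    "06-Mar-2024 00:11:37 [Einstein@Home] Computation for task p2030.20190214.G193.86-01.90.N.b4s0g0.00000_3108_0 finished",
]
print(task_durations(log))
```

Tasks that finish within a few seconds of starting, as in the two lines used here, are the ones worth reporting along with their stderr.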
I also have a Mac Studio
Task 1587149214
Name: p2030.20190214.G193.86-01.90.N.b0s0g0.00000_1420_0
Workunit ID: 791866765
Created: 5 Mar 2024 11:16:46 UTC
Sent: 6 Mar 2024 16:00:37 UTC
Report deadline: 20 Mar 2024 16:00:37 UTC
Received: 6 Mar 2024 16:02:42 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 6 (0x00000006) Unknown error code
Computer: 12608439
Run time (sec): 10.46
CPU time (sec): 2.50
Peak working set size (MB): 203.42
Peak swap size (MB): 419507.98
Peak disk usage (MB): 0.02
Validation state: Invalid
Granted credit: 0
Application: Binary Radio Pulsar Search (Arecibo,GBT,A) v2.00 (Apple_M-opencl)
arm64-apple-darwin
Stderr output
[08:00:48][143][ERROR] Application caught signal 6.
08:00:53 (143): called boinc_finish(6)
</stderr_txt>
]]>
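When sharing failed tasks, it can help to pull out just the telling lines from the stderr. The Python sketch below assumes the entry shape seen in this task's output ("[HH:MM:SS][pid][LEVEL] message") and collects `ERROR` entries plus any line containing "failed assertion", such as the Metal `computeFunction must not be nil` message. The function name `summarize_stderr` is hypothetical, and the sketch only filters text; it does not diagnose the cause.

```python
import re

# Assumed entry shape from the app's stderr: "[HH:MM:SS][pid][LEVEL] message"
# (the level field may be padded with a trailing space, e.g. "[INFO ]").
ENTRY_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\[(\d+)\]\[(\w+)\s*\] (.*)$")


def summarize_stderr(text):
    """Collect ERROR entries and assertion-failure lines from a stderr dump."""
    errors = []
    assertions = []
    for raw in text.splitlines():
        line = raw.strip()
        m = ENTRY_RE.match(line)
        if m and m.group(3) == "ERROR":
            errors.append((m.group(1), m.group(4)))
        elif "failed assertion" in line:
            assertions.append(line)
    return {"errors": errors, "assertions": assertions}


sample = "\n".join([
    "[08:00:45][143][INFO ] Using Metal device \"Apple M1 Ultra\"",
    "[08:00:45][143][INFO ] Seed for random number generator is 1161415202.",
    "-[MTLComputePipelineDescriptorInternal setComputeFunction:withType:]:722: "
    "failed assertion `computeFunction must not be nil.'",
    "[08:00:48][143][ERROR] Application caught signal 6.",
])

summary = summarize_stderr(sample)
for stamp, message in summary["errors"]:
    print(stamp, message)
for line in summary["assertions"]:
    print(line)
```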
I'm running macOS 14.3.1 on an M1 MacBook Pro. As of yesterday (05 Mar 2024), or maybe the day before, I started seeing a lot of computation errors. These are for the "Binary Radio Pulsar Search (Arecibo,GBT,A) 2.0 (Apple_M-opencl)" application. The tasks show up with an expected run time of 06:15 and fail after 7 or 8 seconds. Here's a link to one of the tasks that resulted in an error:
https://einsteinathome.org/task/1587314763
The Gamma-ray pulsar search #5 1.14 continues to run just fine.
Looks like 2.05 is working nicely. I've gotten 81 validated vs. 5 invalid, which seems good! With version 2.01 I had about 26 valid and 56 invalid.
I'm having the same 'Error while computing' errors with all of the Binary Radio Pulsar Search workunits, including 2.05 and 2.06.
M1 Max Mac Studio on macOS 14.4 Sonoma
Here's one of the recent errors on Binary Radio Pulsar Search (Arecibo,GBT,A) v2.06 (Apple_M-opencl)
https://einsteinathome.org/task/1593432147
Let me know if there's any way I can assist, whether you need logs or I should detach/reattach to the project.
I just saw two v2.07 units go by, both with computation errors:
https://einsteinathome.org/task/1593534744
https://einsteinathome.org/task/1593534017