CUDA application for the O3ASHF search

hadron
Joined: 27 Jan 23
Posts: 62
Credit: 85972516
RAC: 590493

Ian&Steve C. wrote:

here's my host running it. Ubuntu 22.04 with 6.5 kernel.

https://einsteinathome.org/host/12830576

this is my app_info.xml file:

I'm using the user manual on the BOINC website, https://boinc.berkeley.edu/wiki/Client_configuration, and many of the fields you are using are not mentioned at all, for example:

<file>..</file>

<non_cpu_intensive>..</non_cpu_intensive>

<dont_throttle/>

Is there some more complete user manual somewhere that I haven't been able to find?

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3923
Credit: 45273372642
RAC: 63329499

documentation is scattered all over. you kind of have to dig through different sections. <file> and <file_info> are interchangeable as far as i can tell.

https://boinc.berkeley.edu/wiki/Anonymous_platform
https://boinc.berkeley.edu/trac/wiki/AppVersionNew

for the most part, I just check how the app is set up in the client_state.xml file, then copy/emulate that. it's pretty straightforward.
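A minimal sketch of that "look at client_state.xml and copy it" approach, in Python (this was not posted in the thread). It lists every <app_version> entry with its app name, version number, plan class and referenced files, which are the pieces an app_info.xml has to mirror. The element names follow the usual client_state.xml layout as an assumption, the sample XML is a made-up fragment, and the function name summarize_app_versions is hypothetical. It also assumes the file parses as well-formed XML.

```python
import xml.etree.ElementTree as ET


def summarize_app_versions(xml_text):
    """Return one dict per <app_version> element found in the XML text."""
    root = ET.fromstring(xml_text)
    summaries = []
    for av in root.iter("app_version"):
        summaries.append({
            "app_name": av.findtext("app_name", default=""),
            "version_num": av.findtext("version_num", default=""),
            "plan_class": av.findtext("plan_class", default=""),
            # Each <file_ref> names one file the app version depends on.
            "files": [ref.findtext("file_name", default="")
                      for ref in av.findall("file_ref")],
        })
    return summaries


# Made-up fragment for illustration only; a real client_state.xml is much larger.
SAMPLE = """
<client_state>
  <app_version>
    <app_name>einstein_O3AS</app_name>
    <version_num>109</version_num>
    <plan_class>GW-cuda</plan_class>
    <file_ref>
      <file_name>einstein_O3AS_1.09_x86_64-pc-linux-gnu__GW-cuda</file_name>
      <main_program/>
    </file_ref>
    <file_ref>
      <file_name>libcufft.so.10</file_name>
    </file_ref>
  </app_version>
</client_state>
"""

if __name__ == "__main__":
    for summary in summarize_app_versions(SAMPLE):
        print(summary)
```

To use it on a real host, read the client_state.xml from the BOINC data directory into a string and pass that in instead of SAMPLE.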

_________________________________________________________________________

DF1DX
Joined: 14 Aug 10
Posts: 105
Credit: 3732952003
RAC: 3179852

FYI:

- The first two WUs are valid, no error so far.

- The runtime with one WU is slightly shorter compared to 1.07 (GW-opencl-nvidia-2).

hadron
Joined: 27 Jan 23
Posts: 62
Credit: 85972516
RAC: 590493

Ian&Steve C. wrote:

documentation is scattered all over. you kind of have to dig through different sections. <file> and <file_info> are interchangeable as far as i can tell.

https://boinc.berkeley.edu/wiki/Anonymous_platform
https://boinc.berkeley.edu/trac/wiki/AppVersionNew

for the most part, I just check how the app is set up in the client_state.xml file, then copy/emulate that. it's pretty straightforward.

Thanks for this info. I'm not sure I need it just now, but it may well help diagnose any future problems which might arise.

IMO, all the documentation for the config .xml files should have been gathered in one place -- and the item descriptions aren't worth much either.

walton748
Joined: 1 Mar 10
Posts: 92
Credit: 1454462619
RAC: 2517882

Here is my host running it: https://einsteinathome.org/host/13163963

I am using Ian's app_info.xml, but run the app 2x staggered. The system is on Debian 12.4 with updates, but not yet upgraded to Debian 12.5. CUDA is in non-MPS mode. The system has been running the official app exactly this way for the last 2 weeks.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3923
Credit: 45273372642
RAC: 63329499

The CUDA app has been working great btw. A little faster than the OpenCL app. And no invalids.

the Linux version could be pushed to production as-is IMO.

_________________________________________________________________________

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4307
Credit: 249644124
RAC: 34386

The app was made an official app (Beta test to check validation on a larger scale), so you can remove the app_info.xml now.

BM

John
Joined: 17 Jan 18
Posts: 5
Credit: 2505070534
RAC: 15319751

Just started getting this error in one of my hosts (https://einsteinathome.org/host/13160863):

<core_client_version>7.18.1</core_client_version>
<![CDATA[
<message>
process exited with code 127 (0x7f, -129)</message>
<stderr_txt>
../../projects/einstein.phys.uwm.edu/einstein_O3AS_1.09_x86_64-pc-linux-gnu__GW-cuda: error while loading shared libraries: libcufft.so.10.0: cannot open shared object file: No such file or directory

</stderr_txt>
]]>

has anyone seen this before?
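
One way to confirm which shared libraries the loader cannot resolve is to run ldd on the app binary and look for "not found" entries. The following is a rough Python sketch of that check, assuming a Linux host with ldd on the PATH. The helper names missing_libraries and check_binary are hypothetical, and the sample ldd output is illustrative rather than captured from a real host.

```python
import subprocess


def missing_libraries(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # Lines look like: "\tlibfoo.so.1 => not found"
            missing.append(line.split("=>")[0].strip())
    return missing


def check_binary(path):
    """Run ldd on a binary and return its unresolved libraries."""
    result = subprocess.run(["ldd", path], capture_output=True,
                            text=True, check=False)
    return missing_libraries(result.stdout)


# Illustrative ldd output (addresses and paths are made up).
SAMPLE_LDD = (
    "\tlinux-vdso.so.1 (0x00007ffc00000000)\n"
    "\tlibcufft.so.10.0 => not found\n"
    "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0000000000)\n"
)

if __name__ == "__main__":
    print(missing_libraries(SAMPLE_LDD))
```

On a real host, call check_binary with the path to the app binary in the project directory. An empty list means every dependency was resolved.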

Ben Scott
Joined: 30 Mar 20
Posts: 53
Credit: 1482821992
RAC: 3817318

The new 1.09 CUDA app is failing 100% of the time on both my computers.

All-Sky Gravitational Wave search on O3 1.09 (GW-cuda)

Eugene Stemple
Joined: 9 Feb 11
Posts: 67
Credit: 361876849
RAC: 539952

Beta test 1.09 is trying to run, but work units failed because of a missing libcufft.so.10. However, that file was downloaded at 02:39 UTC today (3/6), so I think the devs are on it. Waiting now for new work.