New improved Gravitational Wave app & Happy New Year 2024 special

Dear crunchers!

We would like to share with you two things:

a) the GPU accelerated GW app is now much less memory hungry and less likely to produce errors. If you had previously opted out of the GW app, we invite you to reconsider and give the new version a chance.

b) to celebrate the new app, we have a holiday season special offer for our crunchers: you'll get twice the BOINC credits for the GW App results.

[see details in the forum thread]

Comments

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,324
Credit: 251,072,067
RAC: 41,570

We have been busy lately improving the crunching experience when running the O3 All Sky Gravitational Wave Search on O3 data (O3ASHF search).

We had noticed that a few things were not quite optimal: the app originally required almost 4GB of memory on your graphics card, which, admittedly, was too ambitious. Many results came back to us as computation errors caused by memory allocation failures. We also noticed that quite a few users have opted out of the gravitational wave app, and we suspect that the relatively high error rate and the high memory requirements are to blame. We are sorry for any inconvenience this might have caused.

So we changed the workunits and the app substantially: instead of crunching through a certain amount of parameter space in one go per workunit (and using almost 4GB of VRAM for this), the new app runs on workunits that search through the same space in two sequential steps, each covering half the previous search volume. The advantage is that the maximum VRAM used is now only around 2GB, and to create some safety headroom, we let BOINC assume a requirement of about 2.5GB.

The new app version was deployed some time ago and we are indeed seeing a substantial decrease in the number of work units failing with an error, so this works as intended.

Important: If you have previously opted out of the gravitational wave search in favor of the BRP7 GPU-accelerated search, we would like to invite you to re-enable the GW search in your preferences (under the "Project" settings; note that you might want to set this in all of the BOINC "venues").

Happy New Year 2024 Extra Credits

As an incentive to try the new app (especially if you have previously opted out), and as compensation for potential troubles in the past, we have increased the credit for each newly generated workunit to 10k, twice the previous amount.

Some additional technical notes

If you have a graphics card with 4GB VRAM or less, you probably want to make sure that you do not accidentally run two or more instances of the app at the same time. In this case the "GPU utilization factor" in your project preferences should be set to 1.0, which is the default. If you reduced it to 0.5 or lower in the past to run multiple units in parallel and you now see computation errors, set it back to 1.0. The errors occur because BOINC is fooled by the low VRAM usage at the very start of the app and starts two units whose combined usage later comes very close to 4GB.
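As a rough illustration of the note above, the following Python sketch estimates how many GW tasks fit on a card and which "GPU utilization factor" that corresponds to. It assumes the 2.5GB per-task figure mentioned in this post; the helper names are made up for illustration and are not part of BOINC, and actual VRAM usage on a given card may differ.

```python
import math

# Per-task VRAM requirement that BOINC is told to assume (figure from the post above).
ASSUMED_VRAM_PER_TASK_GB = 2.5


def max_concurrent_tasks(vram_gb: float,
                         per_task_gb: float = ASSUMED_VRAM_PER_TASK_GB) -> int:
    """Hypothetical helper: how many tasks fit into the given amount of VRAM."""
    return max(0, math.floor(vram_gb / per_task_gb))


def gpu_utilization_factor(vram_gb: float) -> float:
    """Smallest factor that still fits in VRAM: 1.0 = one task, 0.5 = two tasks, ..."""
    n = max_concurrent_tasks(vram_gb)
    if n == 0:
        raise ValueError("not enough VRAM for even one task")
    return 1.0 / n


if __name__ == "__main__":
    for vram in (4, 8, 16):
        print(f"{vram} GB: {max_concurrent_tasks(vram)} task(s), "
              f"factor {gpu_utilization_factor(vram):.2f}")
```

For a 4GB card this yields one task and a factor of 1.0, which matches the recommendation above.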

Happy crunching!

BM

Ian&Steve C.
Joined: 19 Jan 20
Posts: 4,001
Credit: 47,541,162,324
RAC: 46,533,863

sounds great Bernd :)

is the 10k points a "limited time offer" kind of thing? will it go back to 5k at some time? or will you leave it at 10k from here on out?

this might be a big ask, but would it be possible to compile the GW app for CUDA for nvidia devices? CUDA gives some special benefits for Linux users in being able to run the Multi-Process Service and gives the user some tweaking ability that can't be done on OpenCL. I use this a lot, and with running several other projects that are CUDA also (GPUGRID, Asteroids) it makes my life easier not having to stop and start MPS to switch between CUDA and OpenCL. Simply having the binary in CUDA with no other changes is sufficient.

_________________________________________________________________________

kotenok2000
Joined: 22 Feb 11
Posts: 10
Credit: 5,946,828
RAC: 0

Can you update build script for brp application here https://einsteinathome.org/brp-src-release.zip ?

It tries to download zlib 1.2.8 from http://zlib.net/zlib-1.2.8.tar.gz, but it was moved to https://www.zlib.net/fossils/zlib-1.2.8.tar.gz

There are probably more modifications needed.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 4,001
Credit: 47,541,162,324
RAC: 46,533,863

kotenok2000 wrote:

Can you update build script for brp application here https://einsteinathome.org/brp-src-release.zip ?

It tries to download zlib 1.2.8 from  http://zlib.net/zlib-1.2.8.tar.gz bit it was moved to https://www.zlib.net/fossils/zlib-1.2.8.tar.gz

There are probably more modifications needed.



you can update the version to pull the latest version in the build script

just change "ZLIB_VERSION=1.2.8" to "ZLIB_VERSION=1.3"

but this is all off-topic. this thread is talking about the Gravitational Wave tasks, and you're asking about the BRP app.

_________________________________________________________________________

WPrion
Joined: 15 Mar 16
Posts: 26
Credit: 813,255,420
RAC: 73

Bernd Machenschalk wrote:

We have been busy lately to improve the crunching experience when running the O3 All Sky Continuous Gravitational Wave Search (O3ASHF search).

 

I assume that's "All-Sky Gravitational Wave search on O3 (O3AS)" in the Project Preferences and Applications pages?  (I don't see a "Continuous" or an O3ASHF)

 

Why do these application names often not quite match up??

Edited to add:  OK - I see the task is listed as O3ASHF in BOINC's TASKS list.  But I can't see that until a task is downloaded.  If you're advising users to sign up for an application, it would be better to list that application by the name it appears in the Project Preferences page.

 

Thanks
 

Ian&Steve C.
Joined: 19 Jan 20
Posts: 4,001
Credit: 47,541,162,324
RAC: 46,533,863

WPrion wrote:

I assume that's "All-Sky Gravitational Wave search on O3 (O3AS)" in the Project Preferences and Applications pages?  (I don't see a "Continuous" or an O3ASHF)

yes. that is the one.

_________________________________________________________________________

WPrion
Joined: 15 Mar 16
Posts: 26
Credit: 813,255,420
RAC: 73

...and the task name does not say "Continuous".  Do I have the right one??

WPrion
Joined: 15 Mar 16
Posts: 26
Credit: 813,255,420
RAC: 73

Thanks, but my recommendation still stands:

 

If you're advising users to sign up for an application, it would be better to list that application by the name it appears in the Project Preferences page.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 4,001
Credit: 47,541,162,324
RAC: 46,533,863

there is only one Gravitational Wave O3AS selection. you can't get it wrong. it identifies as "All-Sky Gravitational Wave on O3" which is pretty self explanatory given there's only one GW search. it's the right one.

the O3MD* are previous completed searches. there is no work for those.

_________________________________________________________________________

WPrion
Joined: 15 Mar 16
Posts: 26
Credit: 813,255,420
RAC: 73

Got it.  Choose the Project Preferences application by best guess and process of elimination.  Accuracy is so overrated anyway.  It's not like we're dealing with computers here...oh, wait...

pututu
Joined: 6 Apr 17
Posts: 67
Credit: 653,417,392
RAC: 2

Thanks Bernd! Let me give my Radeon VII a shot. It seems well suited for this task.

Sabroe_SMC
Joined: 9 Oct 06
Posts: 28
Credit: 417,110,937
RAC: 841,729

I tried it. All I can say is that it is not even close to what I expected from a "new" app. CPU utilization was consistently between 3.6% and 6% (on my AMD Ryzen 9 7950X3D with SMT on). GPU utilization (RTX 4090) peaked at 52% with a power draw of 110W for one task, and reached 95%, also at 110W, with two tasks at the same time. With two tasks running simultaneously, the runtime of both WUs more than doubled, so I aborted them. Is there a way to prevent two WUs of type O3AS from being computed at the same time?

Also, a Meerkat WU earns 3333 credits but runs for only 160 to 190 seconds. The "new" O3AS needs more than 600 seconds and earns 10,000 credits. So there is a clear disproportion.
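The comparison in this post can be made concrete with a small Python sketch. It uses only the credit and runtime figures quoted above (single-task runtimes on this poster's RTX 4090); since the O3AS runtime is given as "more than 600 seconds", the O3AS rate computed here is an upper bound.

```python
def credits_per_second(credits: float, runtime_s: float) -> float:
    """Credit earned per second of runtime for a single task."""
    return credits / runtime_s


# Figures taken from the post above.
meerkat_fast = credits_per_second(3333, 160)   # shortest reported Meerkat runtime
meerkat_slow = credits_per_second(3333, 190)   # longest reported Meerkat runtime
o3as = credits_per_second(10_000, 600)         # O3AS runtime is "more than 600 s"

if __name__ == "__main__":
    print(f"Meerkat: {meerkat_slow:.2f} to {meerkat_fast:.2f} credits/s")
    print(f"O3AS:    at most {o3as:.2f} credits/s")
```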

[AF>Le_Pommier] Jerome_C2005
Joined: 1 May 10
Posts: 38
Credit: 111,591,916
RAC: 0

I'd like to help but O3 GPU does absolutely nothing on my AMD iMac GPU, less than 5% of GPU is used, so there's no point.

It was the same back in 2022 when I started that thread (and I remember I had also tested this before and it was the same), and unfortunately the recent updates of the app didn't change that problem.

Wade Tregaskis
Joined: 17 Mar 10
Posts: 6
Credit: 42,905,743
RAC: 6,938

I too wish E@H would use the GPU effectively.  I also see average utilisation of about 7% on a Vega64, for einstein_O3AS, and suspiciously that is irrespective of how many concurrent GPU tasks I run.  Each GPU task also uses about 7 GB of GPU RAM, even with the latest version that you say should use only 2.5 GB.

The other GPU apps - e.g. einstein_O2MDF & hsgamma_FGRPB1G - were much more effective at utilising the GPU properly and required much less RAM - in particular, they scaled nicely; I could run five or more such GPU tasks simultaneously with close to linear throughput gains.  It's unfortunate that E@H is no longer running those apps.

I know the credits are largely made up, but FWIW I used to earn an order of magnitude more from E@H than I do now, on this same computer.  Back when E@H was using the GPU properly.

Other BOINC projects are able to utilise my GPU fully (often with a single task, which is convenient), though their project objectives have less merit (IMO).  It's a shame that my preferred project performs so badly.

pututu
Joined: 6 Apr 17
Posts: 67
Credit: 653,417,392
RAC: 2

Got my Radeon VII to run some O3ASHF GPU tasks. Running 5 tasks per GPU, staggering them so that each task starts after the CPU portion of the previous one has completed. Total average run time for 5 tasks = 2008 secs, or 401.6 secs per task, with an average GPU board power of around 105W. Average actual clock ~1360MHz. Memory clock ~1800MHz. AMD Driver 23.5.2.

Link to host

GPU-Z screenshot:

Average PPD for O3ASHF is 2.15M at 105W GPU board power, assuming 0% invalid. Not sure what the current average invalid rate looks like. Maybe a few percent.

Seems like my Radeon VII still has some good life in it. Wondering how long the O3ASHF project will last and whether the new credit is permanent or temporary.
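The PPD (points per day) figure above can be reproduced with a short Python sketch. It assumes 10,000 credits per task, as announced in the news post, and uses the batch runtime quoted here; the function name and the example 3% invalid rate are illustrative assumptions only.

```python
SECONDS_PER_DAY = 86_400


def points_per_day(batch_runtime_s: float, tasks_per_batch: int,
                   credits_per_task: float, invalid_rate: float = 0.0) -> float:
    """Estimate daily credit when several tasks run concurrently on one GPU."""
    effective_s_per_task = batch_runtime_s / tasks_per_batch
    tasks_per_day = SECONDS_PER_DAY / effective_s_per_task
    return tasks_per_day * credits_per_task * (1.0 - invalid_rate)


if __name__ == "__main__":
    ppd = points_per_day(2008, 5, 10_000)
    print(f"PPD at 0% invalid: {ppd / 1e6:.2f}M")
    # Hypothetical 3% invalid rate, just to show the sensitivity.
    print(f"PPD at 3% invalid: {points_per_day(2008, 5, 10_000, 0.03) / 1e6:.2f}M")
```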

 

ace_quaker
Joined: 21 May 06
Posts: 6
Credit: 461,005,482
RAC: 650,889

pututu wrote:

Got my Radeon VII to run some O3ASHF gpu tasks. Running 5 tasks per gpu by staggering each task after the cpu portion of each task is completed. Total average run time for 5 tasks = 2008 secs or 401.6 secs per task with average gpu board power of around ~105W give or take. Average actual clock ~1360GHz. Memory clock ~1800MHz. AMD Driver 23.5.2.

Link to host

GPU-Z screenshot:

Average PPD for O3ASHF is 2.15M at 105W board gpu, assuming 0% invalid Not sure what's the average current invalid rate looks like. Maybe a few percent.

Seems like my Radeon VII still has some good life in it. Wondering how long will the O3ASHF project last and if ever the new credit granted is permanent or temporary.

 

 

Any recommendations on automating the staggering? I am seeing quite long task times on my VII (about 40 mins running 2x), going up to nearly 50 mins if the tasks run at the same progress percentage without manual staggering. Are my poor times due to CPU limitations? (It's on a Xeon E5 v4 host.) I run a slight automatic undervolt but don't think it's that.

I haven't seen the new credit bump yet either; I must still be running through old tasks.

wujj123456
Joined: 16 Sep 08
Posts: 20
Credit: 2,022,677,220
RAC: 2,165,328

Just one data point on opting out. I once had to opt out of the GW apps because the app crashed on the latest Nvidia driver, and I happened to need that driver for a game. In addition, the error blocked me from getting any EAH tasks, not just GW tasks, even though the other apps were working fine. So the only option was to opt out of the GW app. Other than checking the forums occasionally, I had no way of knowing if or when the issue was resolved and whether I needed to upgrade my driver again. Ultimately I forgot to retry for a long time.

It would be nice if the server side could quickly block only the problematic combinations of apps and drivers, though I know that's probably BOINC server code rather than EAH code and not necessarily easy to do. Assuming that's not worth the effort, it would be helpful to have a shout-out through a BOINC notice whenever new drivers or apps don't work for some specific versions, and when such issues are resolved. LHC can detect when VirtualBox is not installed and send a notice in BOINC Manager. I wonder if that means there is some capability to detect the platform configuration and target a similar message at hosts with problematic drivers?

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,324
Credit: 251,072,067
RAC: 41,570

The 10k credits will remain until the end of this "run", as long as it takes. The current prediction is 2 months. However, we hope that with this offer we'll finish it a little faster.

I'm working on compiling this app for CUDA (Win&Lin), but ran into some problems and ultimately had to postpone that for more urgent things. I'll pick it up again ASAP.

BM

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,324
Credit: 251,072,067
RAC: 41,570

I once had to opt-out GW apps because the app crashed on latest Nvidia driver

Thanks, that's an important piece of information. By far most of the errors we saw were memory allocation errors, so we worked on fixing the memory issue first. I'll take a look at the remaining errors now. Could you post the driver version that's creating the problem? Do others see this, too? Is this limited to a specific OS (Windows, Linux) or does this happen on both?

BM

Yeti
Joined: 17 Nov 04
Posts: 59
Credit: 1,371,192,349
RAC: 459,397

I have been running 2 WUs concurrently on my 2080 since yesterday, and now it has crashed with a blue screen: https://einsteinathome.org/de/host/12987429

The box is normally doing fine, no crashes, regardless of what I crunch.

Windows10, fully patched, NVIDIA-Driver 466.77

 

Supporting BOINC, a great concept !

pututu
Joined: 6 Apr 17
Posts: 67
Credit: 653,417,392
RAC: 2

 

 

Quote:

ace_quaker wrote:

Any recommendation on automating staggering? I am seeing quite large task times on my VII (@40 mins running 2x), going up near 50 mins if running at same percentage without staggering manually. Are my poor times due to CPU limitations? (It's on a xeon e5 v4 host).  I run a slight auto under volt but don't think its that.

 

I haven't seen the new credit bump either yet must be still running through old tasks.

When you first run the task, you will notice that its "progress" is pegged at 0.000% for a while. During this time, the CPU is doing all the work, so a faster CPU or a higher CPU clock speed helps to reduce the overall run time. Once the progress moves to x.xxx%, start the second task, and repeat up to the number of tasks that you plan to run concurrently on one GPU. That's how I did my stagger. My run times with 5 tasks running concurrently are quite consistent, but it is worth checking once in a while and re-staggering if needed.

Also, set the GPU memory clock as high as you can (while keeping it stable) as the O3AS task benefits from high memory bandwidth such as that of the Radeon VII.
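For anyone wanting to automate the stagger described above, the decision rule could be sketched in Python roughly as follows. This is only the logic, not a working BOINC controller: the `Task` class and `next_task_to_start` function are hypothetical, and reading task progress and actually resuming a task (for example through BOINC Manager) is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    name: str             # hypothetical task identifier
    fraction_done: float  # stays at 0.0 during the CPU-only start-up phase
    running: bool


def next_task_to_start(tasks: List[Task], max_concurrent: int) -> Optional[Task]:
    """Return the next waiting task to start, or None if it is too early."""
    running = [t for t in tasks if t.running]
    waiting = [t for t in tasks if not t.running]
    if not waiting or len(running) >= max_concurrent:
        return None
    # Hold back while any running task is still pegged at 0.000% progress.
    if any(t.fraction_done <= 0.0 for t in running):
        return None
    return waiting[0]


if __name__ == "__main__":
    queue = [Task("wu_a", 0.012, True), Task("wu_b", 0.0, False)]
    candidate = next_task_to_start(queue, max_concurrent=5)
    print("start:", candidate.name if candidate else "nothing yet")
```

Calling this periodically and starting whatever it returns would reproduce the manual procedure, under the assumption that nonzero progress reliably marks the end of the CPU phase.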

Gandolph1
Joined: 20 Feb 05
Posts: 180
Credit: 389,671,576
RAC: 1,484

pututu wrote:

 

 

Quote:

ace_quaker wrote:

Any recommendation on automating staggering? I am seeing quite large task times on my VII (@40 mins running 2x), going up near 50 mins if running at same percentage without staggering manually. Are my poor times due to CPU limitations? (It's on a xeon e5 v4 host).  I run a slight auto under volt but don't think its that.

 

I haven't seen the new credit bump either yet must be still running through old tasks.

When the first run the task, you will notice that the "progress" time for the task is pegged at 0.000% for a while. During this time, the cpu is doing all the work. So having faster or higher cpu clock speed helps to reduce the overall run time. Once the "progress" time moves to x.xxx%, run the second task and repeat up to the number of tasks that you are planning to run concurrently for one gpu. That's how I did my stagger. My run time with 5 tasks running concurrently are quite consistent but worth watching this once a while and reset it needed. 

Also, set the gpu memory clock as high as you can (but keep it stable from crashing) as the O3AS task benefits from higher memory bandwidth like Radeon VII. 

My primary complaint would be that I do not see any additional load on the CPU during the 3 GPU pauses that each task takes to complete.  That would indicate to me that the task is not fully utilizing the resources it has available.  This leads me to believe that something is off in the program...

 

Ian&Steve C.
Joined: 19 Jan 20
Posts: 4,001
Credit: 47,541,162,324
RAC: 46,533,863

Bernd Machenschalk wrote:

The 10k credits will remain until the end of this "run", as long as it takes. The current prediction is 2 months. However, we hope that with this offer we'll finish it a little faster.

I'm working on compiling this app for CUDA (Win&Lin), but ran into some problems and ultimately had to postpone that for more urgent things. I'll pick it up again ASAP.



thanks Bernd :). I look forward to a CUDA version.

_________________________________________________________________________

Tom M
Joined: 2 Feb 06
Posts: 6,497
Credit: 9,613,340,714
RAC: 3,643,284

I have it running up to 3 x per GPU.  And my impression is there is enough CPU processing going on that we could run up to our memory limit and still get a speed up?

Yes/No?

A Proud member of the O.F.A.  (Old Farts Association).  Be well, do good work, and keep in touch.® (Garrison Keillor)  I want some more patience. RIGHT NOW!

Guðni Már Gilbert
Joined: 30 Jun 20
Posts: 12
Credit: 442,238,570
RAC: 204,653

Thank you for the update :)

Tom M
Joined: 2 Feb 06
Posts: 6,497
Credit: 9,613,340,714
RAC: 3,643,284

Tom M wrote:

I have it running up to 3 x per GPU.  And my impression is there is enough CPU processing going on that we could run up to our memory limit and still get a speed up?

Yes/No?

I have a test rig running 4x per GPU. It looks like it is using 80% of the available user GPU memory.

The Epyc-7601 is showing 21% CPU usage with no other tasks processing.

I had it running a varying number of tasks per GPU, so only the "last" ones are 4x tasks.

It's on NNT (No New Tasks). I will shut it down after it runs out of tasks.

Tom M

 


Speedy
Joined: 11 Aug 05
Posts: 41
Credit: 23,806,889
RAC: 15,607

Ian&Steve C. wrote:

WPrion wrote:

I assume that's "All-Sky Gravitational Wave search on O3 (O3AS)" in the Project Preferences and Applications pages?  (I don't see a "Continuous" or an O3ASHF)

yes. that is the one.


I would love to help with this project; however, I have the above application selected in my preferences and I get the output below:
18/01/2024 9:52:00 AM | Einstein@Home | Project requested delay of 60 seconds
18/01/2024 9:53:01 AM | Einstein@Home | Sending scheduler request: To fetch work.
18/01/2024 9:53:01 AM | Einstein@Home | Requesting new tasks for NVIDIA GPU
18/01/2024 9:53:03 AM | Einstein@Home | Scheduler request completed: got 0 new tasks
18/01/2024 9:53:03 AM | Einstein@Home | No work sent
18/01/2024 9:53:03 AM | Einstein@Home | No work available for the applications you have selected.  Please check your preferences on the web site.
18/01/2024 9:53:03 AM | Einstein@Home | Project requested delay of 60 seconds

Tried a project reset to no avail. Not to move the focus away from this project, but I thought the search focus was on "BRP7"?

kotenok2000
Joined: 22 Feb 11
Posts: 10
Credit: 5,946,828
RAC: 0

It is All-Sky Gravitational Wave search on O3 (O3AS)

Ian&Steve C. wrote:

WPrion wrote:

I assume that's "All-Sky Gravitational Wave search on O3 (O3AS)" in the Project Preferences and Applications pages?  (I don't see a "Continuous" or an O3ASHF)

yes. that is the one.

Tom M
Joined: 2 Feb 06
Posts: 6,497
Credit: 9,613,340,714
RAC: 3,643,284

Speedy wrote:

Ian&Steve C. wrote:

WPrion wrote:

I assume that's "All-Sky Gravitational Wave search on O3 (O3AS)" in the Project Preferences and Applications pages?  (I don't see a "Continuous" or an O3ASHF)

yes. that is the one.


I would love to help with this project however I have the above application selected in my preferences and I get the below output
18/01/2024 9:52:00 AM | Einstein@Home | Project requested delay of 60 seconds
18/01/2024 9:53:01 AM | Einstein@Home | Sending scheduler request: To fetch work.
18/01/2024 9:53:01 AM | Einstein@Home | Requesting new tasks for NVIDIA GPU
18/01/2024 9:53:03 AM | Einstein@Home | Scheduler request completed: got 0 new tasks
18/01/2024 9:53:03 AM | Einstein@Home | No work sent
18/01/2024 9:53:03 AM | Einstein@Home | No work available for the applications you have selected.  Please check your preferences on the web site.
18/01/2024 9:53:03 AM | Einstein@Home | Project requested delay of 60 seconds

Tried a project reset to no avail. Not to move the focus away from this project I thought the search focus was on project "BRP7"?

So what tasks have you got enabled in your profile? If it is All-Sky GW, do you also have the "and other tasks if needed" option enabled? You might even turn on "beta test" for the profile too. I hope you have "run GPU tasks" and both the Nvidia and AMD GPUs selected in your profile too.

HTH,

Tom M


Speedy
Joined: 11 Aug 05
Posts: 41
Credit: 23,806,889
RAC: 15,607

Tom M wrote:

Speedy wrote:

Ian&Steve C. wrote:

WPrion wrote:

I assume that's "All-Sky Gravitational Wave search on O3 (O3AS)" in the Project Preferences and Applications pages?  (I don't see a "Continuous" or an O3ASHF)

yes. that is the one.


I would love to help with this project however I have the above application selected in my preferences and I get the below output
18/01/2024 9:52:00 AM | Einstein@Home | Project requested delay of 60 seconds
18/01/2024 9:53:01 AM | Einstein@Home | Sending scheduler request: To fetch work.
18/01/2024 9:53:01 AM | Einstein@Home | Requesting new tasks for NVIDIA GPU
18/01/2024 9:53:03 AM | Einstein@Home | Scheduler request completed: got 0 new tasks
18/01/2024 9:53:03 AM | Einstein@Home | No work sent
18/01/2024 9:53:03 AM | Einstein@Home | No work available for the applications you have selected.  Please check your preferences on the web site.
18/01/2024 9:53:03 AM | Einstein@Home | Project requested delay of 60 seconds

Tried a project reset to no avail. Not to move the focus away from this project I thought the search focus was on project "BRP7"?

So what tasks have you got enabled in your profile?  If it is All-Sky GW,  do you also have the "and other tasks if needed" enabled?  Might even turn on the "beta test" for the profile too. Hope you have run Gpu tasks and both the Nvidia and Amd gpus selected in your profile too.

HTH,

Tom M


I only have the project above and "run test applications" selected, plus Nvidia GPU, as I don't want anything running on my AMD GPU because it is combined with my CPU.

Tom M
Joined: 2 Feb 06
Posts: 6,497
Credit: 9,613,340,714
RAC: 3,643,284

Speedy wrote:

I only have the project above/run test application selected and Nvidia GPU as I don't want anything running on my AMD GPU as it is combined with my CPU

What happens when you set this to YES? 

"Allow non-preferred apps:yes/no"

Does it download ANY gpu tasks?

 


Tom M
Joined: 2 Feb 06
Posts: 6,497
Credit: 9,613,340,714
RAC: 3,643,284

Tom M wrote:

Tom M wrote:

I have it running up to 3 x per GPU.  And my impression is there is enough CPU processing going on that we could run up to our memory limit and still get a speed up?

Yes/No?

I have a test-rig running 4 x per Gpu.  It looks like it is running 80% of the available user gpu memory.

The Epyc-7601 is showing 21% cpu usage with no other tasks processing.

I had it running at varying # of tasks per Gpu.  So only the "last" ones are 4 x tasks.

Its on NNT.  Shutdown after it runs out of tasks.

My current guess is that if 4x really averages 47 minutes, then the RAC production is about equal to BRP7/MeerKAT.


Speedy
Joined: 11 Aug 05
Posts: 41
Credit: 23,806,889
RAC: 15,607

Tom M wrote:

Speedy wrote:

I only have the project above/run test application selected and Nvidia GPU as I don't want anything running on my AMD GPU as it is combined with my CPU

What happens when you set this to YES? 

"Allow non-preferred apps:yes/no"

Does it download ANY gpu tasks?

 


Not sure. I only want to run the project above. I may try again in the coming days. Thanks for the suggestions. I haven't had such issues with other subprojects.

pututu
Joined: 6 Apr 17
Posts: 67
Credit: 653,417,392
RAC: 2

Tom M wrote:

My current guess is if  4 x really is averaging 47 minutes then the Rac production is about equal to brp7/MeerKat.

 

I think the board power draw for O3ASHF will be much lower than running BRP7? So you could actually save some power when running O3AS vs BRP7 for the same RAC?

Tom M
Joined: 2 Feb 06
Posts: 6,497
Credit: 9,613,340,714
RAC: 3,643,284

Speedy wrote:

Tom M wrote:

Speedy wrote:

I only have the project above/run test application selected and Nvidia GPU as I don't want anything running on my AMD GPU as it is combined with my CPU

What happens when you set this to YES? 

"Allow non-preferred apps:yes/no"

Does it download ANY gpu tasks?

 


Not sure I only want to run the project above I may try again in the coming days. Thanks for the suggestions. I haven't had such issues with other subprojects

If it downloads any GPU tasks then it is likely not your setup that is the problem.


Janne
Joined: 2 Oct 06
Posts: 1
Credit: 6,674,578
RAC: 2,646

Thank you for the update!
Sounds very good, these are the exact reasons why I had to keep my GPU from crunching :)

fastbunny
Joined: 20 Apr 06
Posts: 22
Credit: 91,424,422
RAC: 0

ace_quaker wrote:

Are my poor times due to CPU limitations? (It's on a xeon e5 v4 host).  I run a slight auto under volt but don't think its that.

 

Probably yes. I have been getting these tasks for a while now, I think because I opted in to 'run test applications'.

I run 4 of these tasks simultaneously on my 7900 XT (seemed most efficient when I tested a long time ago), and my CPU is a 12-core 5900X. When I'm running 4 CPU tasks at the same time as the 4 GPU tasks, thus still using only 8 of 12 physical cores, this will add about 5 minutes of runtime per GPU task, going from ~27 to ~32 minutes. That's quite a lot, seeing that the GPU is not taxed more in any way by the CPU tasks, so it seems the GPU tasks are heavily limited by the CPU.

I have not tried staggering the GPU tasks. It sounds like a very good idea. I also wonder what's causing this extra runtime. Perhaps the GPU tasks are limited mostly by CPU cache. Unfortunately, when running 4 CPU + 4 GPU tasks, the Windows thread scheduler will first put all six cores on the first chiplet to work, and put only two tasks on the second chiplet. This is very inefficient from a cache point of view, since each chiplet has its own cache shared between six cores. I might try to spread them out evenly with Process Lasso sometime if I feel like it.

Tom M
Joined: 2 Feb 06
Posts: 6,497
Credit: 9,613,340,714
RAC: 3,643,284

I suspect that the moderators would prefer we stop cluttering up a "News" thread with ongoing discussion.

I have opened up a discussion thread in the Crunchers area here

You're invited.

Respectfully,

A Proud member of the O.F.A.  (Old Farts Association).  Be well, do good work, and keep in touch.® (Garrison Keillor)  I want some more patience. RIGHT NOW!

wujj123456
Joined: 16 Sep 08
Posts: 20
Credit: 2,022,677,220
RAC: 2,165,328

Bernd Machenschalk wrote:

I once had to opt-out GW apps because the app crashed on latest Nvidia driver

Thanks, that's an important piece of information. By far most of the errors we saw were memory allocation errors, so we worked on fixing the memory issue first. I'll take a look at the remaining errors now. Could you post the driver version that's creating the problem? Do others see this, too? Is this limited to a specific OS (Windows, Linux) or does this happen on both?

That was quite a while ago and has since been resolved. There was a thread on this forum back then, but I can't find it now. :-( IIRC, it was consistently crashing on Windows with one of the 525 driver updates.

Sorry if that caused confusion. I was only mentioning this to point out people could disable some app but never learn that it's fixed later. It's not a current problem.

Steven Gaber
Steven Gaber
Joined: 24 Oct 22
Posts: 9
Credit: 7,343,814
RAC: 25,976

I have tried to run the GW

I have tried to run the GW tasks, but they always indicate absurd time for completion. One estimated 330 days to complete. I abort these. Today I aborted one whose time to completion was 169 days. Even when I suspend all other tasks, I get the same result.

Can you suggest any solutions to this dilemma?

This computer has a 1TB SSD and 16 GB of RAM.

Total credit: 1,478,027
Average credit: 2,910.38
Cross project credit:
CPU type: AuthenticAMD AMD Ryzen 7 5700G with Radeon Graphics [Family 25 Model 80 Stepping 0]
Number of processors: 16
Coprocessors: AMD AMD Radeon(TM) Graphics (6227MB)
Operating system: Microsoft Windows 11 Core x64 Edition, (10.00.22621.00)
BOINC client version: 7.22.2
Memory: 15754.27 MiB
Cache: 512 KiB
Swap space: 18186.27 MiB
Total disk space: 930.81 GiB
Free disk space: 817.61 GiB
Measured floating point speed: 5727.01 million ops/sec
Measured integer speed: 25051.66 million ops/sec
Average upload rate: 46.22 KiB/sec
Average download rate: 7945.19 KiB/sec
Average turnaround time: 1 days
Tasks: 86
Number of times client has contacted server: 1437
Last time contacted server: 22 Jan 2024 19:44:00 UTC
% of time BOINC client is running: 93.9626 %
While BOINC running, % of time host has an Internet connection: 99.9991 %
While BOINC running, % of time work is allowed: 98.0415 %
Task duration correction factor: 0.84422

S. Gaber

Tom M
Tom M
Joined: 2 Feb 06
Posts: 6,497
Credit: 9,613,340,714
RAC: 3,643,284

Steven Gaber wrote:I have

Steven Gaber wrote:

I have tried to run the GW tasks, but they always indicate absurd time for completion. One estimated 330 days to complete. I abort these. Today I aborted one whose time to completion was 169 days. Even when I suspend all other tasks, I get the same result.

Can you suggest any solutions to this dilemma?

Sure. The time estimates for the first 11 or so tasks are completely meaningless. You don't begin to get a reliable estimate till later on.

I have run these tasks on my Windows 5700G and they don't take 330 days. Honest. They will likely take hours, not days. No, I don't regularly crunch on my Windows box, but I do experiment with it.

Try again. It will get better.

We can provide more help at: https://einsteinathome.org/content/new-improved-gravational-wave-app-discussion so we don't clutter up this news thread any further.

Tom M

A Proud member of the O.F.A.  (Old Farts Association).  Be well, do good work, and keep in touch.® (Garrison Keillor)  I want some more patience. RIGHT NOW!

KLiK
KLiK
Joined: 1 Apr 14
Posts: 67
Credit: 446,139,500
RAC: 1,050,022

A small digression, for a GW

A small digression about the GW tasks: they run up to 99.5% and then hang for a while, doing only CPU work while the GPU sits at minimal load. Do these tasks really need to do that CPU evaluation for so long, or is it some "glitch"?

Tom M
Tom M
Joined: 2 Feb 06
Posts: 6,497
Credit: 9,613,340,714
RAC: 3,643,284

KLiK wrote: A small

KLiK wrote:

A small digression about the GW tasks: they run up to 99.5% and then hang for a while, doing only CPU work while the GPU sits at minimal load. Do these tasks really need to do that CPU evaluation for so long, or is it some "glitch"?

It stops at about 50 percent and 99.5 percent on every task I have processed.

This was described as planned earlier in the thread. This is why processing time is sensitive to CPU speed and cache.

A Proud member of the O.F.A.  (Old Farts Association).  Be well, do good work, and keep in touch.® (Garrison Keillor)  I want some more patience. RIGHT NOW!

Bikeman (Heinz-Bernd Eggenstein)
Bikeman (Heinz-...
Moderator
Joined: 28 Aug 06
Posts: 3,522
Credit: 741,152,651
RAC: 937,399

KLiK wrote: A small

KLiK wrote:

A small digression about the GW tasks: they run up to 99.5% and then hang for a while, doing only CPU work while the GPU sits at minimal load. Do these tasks really need to do that CPU evaluation for so long, or is it some "glitch"?

Not a glitch, this is how the OpenCL app works, currently. The same happens around 49.5% .. 50% .

Here is why:  After some initial data parsing, validation and preparations which are done on the CPU only, the GPU-heavy part starts and the application computes many thousands of potential signal candidates, ranked by some detection statistic. This part of the computation is very well suited for GPU processing because (somewhat simplified), we can try many signal templates in a regularly spaced frequency grid and for identical sky points in parallel, and doing the same operations in parallel for different but "similar" data points in such a configuration is what GPUs are really good at. 

Now, once we have a list of candidates, we need to perform some operations on each of those candidates. Because those candidates are now a sparse subset of the original search grids, they are scattered all over the sky and are no longer arranged in a regular frequency grid, so this additional step is not so well suited for GPU computations anymore. Therefore, the current OpenCL app does these operations on the CPU.
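
As a rough illustration of the two stages described above, here is a toy Python sketch. It is not the actual E@H code: the statistic, the function names and all numeric values are made up for the example. Stage 1 applies the same operation to every point of a regular grid and keeps only the top candidates; stage 2 then does a separate follow-up for each candidate.

```python
import heapq
import math


def toy_statistic(freq, signal_freq=100.3):
    """Hypothetical detection statistic: peaks when freq is near signal_freq."""
    return math.exp(-((freq - signal_freq) ** 2) / 0.02)


def coarse_grid_search(f_start, f_step, n_points, keep=5):
    """Stage 1: same operation on every point of a regular grid (GPU-friendly)."""
    grid = [f_start + i * f_step for i in range(n_points)]
    scored = [(toy_statistic(f), f) for f in grid]
    # Keep only the strongest candidates; they form a sparse, irregular subset.
    return heapq.nlargest(keep, scored)


def refine_candidate(freq, half_width=0.05, steps=101):
    """Stage 2: per-candidate follow-up around one frequency (done on the CPU)."""
    local = [freq - half_width + 2 * half_width * i / (steps - 1) for i in range(steps)]
    return max(local, key=toy_statistic)


if __name__ == "__main__":
    candidates = coarse_grid_search(f_start=100.0, f_step=0.1, n_points=11)
    best_stat, best_freq = candidates[0]
    print(round(refine_candidate(best_freq), 3))
```

In stage 1 every grid point goes through identical arithmetic, which is the pattern that maps well onto a GPU. In stage 2 each candidate needs its own small, independent computation, which is the irregular part.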

The "new" workunits now contain a bundle of two sub-workunits, so this switching between GPU-intensive and CPU-only processing happens twice during the processing of a workunit: the first sub-workunit runs from 0 to 49.5% (GPU-intensive) and 49.5% to 50% (CPU-intensive), and then the second sub-workunit follows with 50% to 99.5% (GPU) and 99.5% to 100% (CPU).
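
For anyone who wants to check which phase a task is in from its progress percentage, here is a small Python sketch. The function name is made up, and the boundaries are assumptions taken from the percentages reported in this thread (pauses around 49.5% to 50% and from 99.5% onward); the real app may switch at slightly different points.

```python
def phase_for_progress(percent):
    """Map a task's progress percentage to the phase described in the post.

    Boundaries are assumptions taken from the percentages quoted in this
    thread (49.5, 50 and 99.5); the real app may switch at slightly
    different points.
    """
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    if percent < 49.5:
        return "sub-workunit 1: GPU-intensive"
    if percent < 50.0:
        return "sub-workunit 1: CPU-only"
    if percent < 99.5:
        return "sub-workunit 2: GPU-intensive"
    return "sub-workunit 2: CPU-only"


if __name__ == "__main__":
    for p in (10.0, 49.7, 75.0, 99.8):
        print(p, "->", phase_for_progress(p))
```

So a task that appears to "hang" at 99.5% with low GPU load is simply in the last CPU-only phase.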

It is possible that the next E@H version of the software will be able to do this second step on the GPU as well, BUT because the data is (for the reasons outlined above) less "regular" than in the first step of the (GPU) processing, it will  still not be that much faster than the CPU code that we use now, at least on most machines. It obviously depends a lot on the relative speed of your GPU and CPU.  It is not clear whether we can roll out the new code before the end of the current O3ASHF search.  

I hope this answers your question.

Tom M
Tom M
Joined: 2 Feb 06
Posts: 6,497
Credit: 9,613,340,714
RAC: 3,643,284

Thank you for a great recap!

Thank you for a great recap!

A Proud member of the O.F.A.  (Old Farts Association).  Be well, do good work, and keep in touch.® (Garrison Keillor)  I want some more patience. RIGHT NOW!

Holdolin
Holdolin
Joined: 28 Dec 20
Posts: 7
Credit: 374,245,629
RAC: 0

Bikeman (Heinz-Bernd

Bikeman (Heinz-Bernd Eggenstein) wrote:

Not a glitch, this is how the OpenCL app works, currently. The same happens around 49.5% .. 50% .

Here is why:  After some initial data parsing, validation and preparations which are done on the CPU only, the GPU-heavy part starts and the application computes many thousands of potential signal candidates, ranked by some detection statistic. This part of the computation is very well suited for GPU processing because (somewhat simplified), we can try many signal templates in a regularly spaced frequency grid and for identical sky points in parallel, and doing the same operations in parallel for different but "similar" data points in such a configuration is what GPUs are really good at. 

Now, once we have a list of candidates, we need to perform some operations on each of those candidates. Because those candidates are now a sparse subset of the original search grids, they are scattered all over the sky and are no longer arranged in a regular frequency grid, so this additional step is not so well suited for GPU computations anymore. Therefore, the current OpenCL app does these operations on the CPU.

The "new" workunits now contain a bundle of two sub-workunits, so this switching between GPU-intensive and CPU-only processing happens twice during the processing of a workunit: the first sub-workunit runs from 0 to 49.5% (GPU-intensive) and 49.5% to 50% (CPU-intensive), and then the second sub-workunit follows with 50% to 99.5% (GPU) and 99.5% to 100% (CPU).

It is possible that the next E@H version of the software will be able to do this second step on the GPU as well, BUT because the data is (for the reasons outlined above) less "regular" than in the first step of the (GPU) processing, it will  still not be that much faster than the CPU code that we use now, at least on most machines. It obviously depends a lot on the relative speed of your GPU and CPU.  It is not clear whether we can roll out the new code before the end of the current O3ASHF search.  

I hope this answers your question.

Hey, thanks for the explanation.  Sorry I'm late to the party.  Tossin a VII Pro on the pile to crunch this.  Here's to some good data :)

MAGIC Quantum Mechanic
MAGIC Quantum M...
Joined: 18 Jan 05
Posts: 1,898
Credit: 1,422,488,287
RAC: 961,538

All-Sky Gravitational Wave

All-Sky Gravitational Wave search on O3 v1.07 

....No Work....any news?

 

Chooka
Chooka
Joined: 11 Feb 13
Posts: 134
Credit: 3,769,357,759
RAC: 3,724,064

pututu wrote: Got my Radeon

pututu wrote:

Got my Radeon VII to run some O3ASHF GPU tasks. Running 5 tasks per GPU by staggering each task after the CPU portion of each task is completed. Total average run time for 5 tasks = 2008 secs, or 401.6 secs per task, with average GPU board power of around 105W, give or take. Average actual clock ~1360MHz. Memory clock ~1800MHz. AMD Driver 23.5.2.

Link to host

GPU-Z screenshot:

Average PPD for O3ASHF is 2.15M at 105W GPU board power, assuming 0% invalid. Not sure what the current average invalid rate looks like. Maybe a few percent.

Seems like my Radeon VII still has some good life in it. Wondering how long will the O3ASHF project last and if ever the new credit granted is permanent or temporary.
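
The figures quoted above can be cross-checked with a few lines of Python. This is only arithmetic on the numbers from the quoted post (2008 s per batch of 5 tasks, about 105 W, 2.15M PPD); the function name is made up, and the credit per task is merely what those rounded figures imply, not an official value.

```python
SECONDS_PER_DAY = 86_400


def throughput(batch_runtime_s, tasks_in_batch, board_power_w, ppd):
    """Derive per-task figures from the batch runtime, power and points per day."""
    seconds_per_task = batch_runtime_s / tasks_in_batch
    tasks_per_day = SECONDS_PER_DAY / seconds_per_task
    return {
        "seconds_per_task": seconds_per_task,
        "tasks_per_day": tasks_per_day,
        # Energy per task in watt-hours: power (W) * time (s) / 3600.
        "energy_per_task_wh": board_power_w * seconds_per_task / 3600,
        # Credit per task implied by the quoted PPD, assuming 0% invalid.
        "credit_per_task": ppd / tasks_per_day,
    }


if __name__ == "__main__":
    for key, value in throughput(2008, 5, 105, 2.15e6).items():
        print(f"{key}: {value:.1f}")
```

That works out to roughly 215 tasks per day and about 11.7 Wh of board energy per task.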

 

I'm running 4 tasks on my Radeon VII but can't achieve results as good as yours, pututu.

Low wattage which is great though 

https://einsteinathome.org/host/12602626/tasks/4/56

Maybe it's the memory overclock? 

Do you have a profile which you use? I used to have one for undervolting. 

 


KLiK
KLiK
Joined: 1 Apr 14
Posts: 67
Credit: 446,139,500
RAC: 1,050,022

Thanks, as maybe I should try

Thanks, maybe I should try running 4 WUs at once on my RTX 4000?

Sabroe_SMC
Sabroe_SMC
Joined: 9 Oct 06
Posts: 28
Credit: 417,110,937
RAC: 841,729

Bernd Machenschalk

Bernd Machenschalk wrote:

I once had to opt-out GW apps because the app crashed on latest Nvidia driver

Thanks, that's an important piece of information. By far most of the errors we saw were memory allocation errors, so we worked on fixing the memory issue first. I'll take a look at the remaining errors now. Could you post the driver version that's creating the problem? Do others see this, too? Is this limited to a specific OS (Windows, Linux) or does this happen on both?

 

And now? How far along are you with your work? Half a year is quite a long time.