Gravitational Wave search O2 Multi-Directional ("O2MD1")

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4312

Credit: 250457854

RAC: 35097

17 Apr 2020 12:02:47 UTC

Message 176698

Our workunits have varying memory requirements, depending much (but not only) on the analysis frequency - the higher the frequency, the more data (files) are needed to process them, and the more memory is needed to store that data. We developed a model for the memory usage of our apps depending on the input parameters, and coded it into the workunit generator. The workunit generator calculates the memory requirement for each particular workunit and writes it into the workunit record, so that the scheduler doesn't give tasks to clients that have too little (available) memory. For CPU apps, this works pretty well.
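
As a rough illustration of the mechanism described above (not the project's actual code), the following sketch has a generator compute a memory estimate from the analysis frequency and store it in the workunit record, and a scheduler-side check that skips hosts with too little available memory. The linear model, its coefficients, and all names (`estimate_memory_mb`, `Workunit`, `generate_workunit`, `can_send`) are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass

# Hypothetical model coefficients, for illustration only.
BASE_MB = 500.0    # assumed fixed overhead of the app
MB_PER_HZ = 1.0    # assumed growth of memory use with analysis frequency


def estimate_memory_mb(frequency_hz: float) -> float:
    """Toy memory model: higher frequency -> more data files -> more memory."""
    return BASE_MB + MB_PER_HZ * frequency_hz


@dataclass
class Workunit:
    name: str
    frequency_hz: float
    memory_mb: float  # written by the generator, read by the scheduler


def generate_workunit(name: str, frequency_hz: float) -> Workunit:
    """Compute the memory requirement once and store it in the record."""
    return Workunit(name, frequency_hz, estimate_memory_mb(frequency_hz))


def can_send(wu: Workunit, host_available_mb: float) -> bool:
    """Scheduler-side check: send only if the host has enough available memory."""
    return host_available_mb >= wu.memory_mb
```

With these made-up coefficients, a workunit at 100 Hz would be sent to a host with 1024 MB available, while one at 1500 Hz would not.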

However, BOINC's handling of GPU memory restrictions is completely separate from that. It only allows specifying a minimum RAM per GPU and a single fixed value for how much memory will actually be used (by this app), completely ignoring the memory requirement assigned to the workunit. Therefore I have now changed the scheduler so that it requires the GPU to have at least as much memory as recorded in the workunit. Tests show that this is a pretty good fit, at least for our applications. This might lead to a lot more "work requests" being rejected (or fulfilled by FGRP work), but it should avoid the memory allocation errors of the GPU apps.
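
The difference between the two checks can be sketched as follows. This is only an illustration under assumed names and numbers, not BOINC's or the project's real scheduler code: `fixed_value_check` stands for a single per-app minimum, and `per_workunit_check` for the changed behaviour that compares against the value recorded in the workunit.

```python
def fixed_value_check(gpu_ram_mb: float, app_min_gpu_ram_mb: float) -> bool:
    """Per-app check: one minimum applies to every workunit of the app."""
    return gpu_ram_mb >= app_min_gpu_ram_mb


def per_workunit_check(gpu_ram_mb: float, wu_memory_mb: float) -> bool:
    """Changed check: the GPU must have at least the memory recorded in the workunit."""
    return gpu_ram_mb >= wu_memory_mb


# Hypothetical numbers: a 2 GB GPU, a per-app minimum of 1 GB,
# and a workunit whose record says it needs 3000 MB.
gpu_ram_mb = 2048
passes_old = fixed_value_check(gpu_ram_mb, 1024)    # accepted by the per-app minimum
passes_new = per_workunit_check(gpu_ram_mb, 3000)   # rejected by the per-workunit value
```

The per-workunit check rejects more requests, which matches the expectation above that more work requests may go unfulfilled in exchange for fewer allocation errors.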

Zalster

Joined: 26 Nov 13

Posts: 3117

Credit: 4050672230

RAC: 0

17 Apr 2020 16:17:26 UTC

Message 176709

Thanks Bernd,

You should look at this thread, where it's discussed how BOINC determines the amount of RAM on Nvidia GPUs:

https://einsteinathome.org/goto/comment/176690

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4312

Credit: 250457854

RAC: 35097

17 Apr 2020 18:55:45 UTC

Message 176722

Thanks for the note. However, I don't think that any of our workunits currently requires more than 2 GB of RAM, or will in the foreseeable future.

Zalster

Joined: 26 Nov 13

Posts: 3117

Credit: 4050672230

RAC: 0

17 Apr 2020 19:05:36 UTC

Message 176723

No problem. Note that Nvidia only allows 27% of the RAM on a card to be used for OpenCL. This is creating issues with low-RAM Nvidia cards (2-3 GB). AMD and Intel use somewhere between 52% and 67% of a card's available RAM for scientific computations.

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3949

Credit: 46779802642

RAC: 64115474

6 May 2020 22:27:46 UTC

Message 177564 in response to message 176698

Bernd Machenschalk wrote:

Our workunits have varying memory requirements, depending much (but not only) on the analysis frequency - the higher the frequency, the more data (files) are needed to process them, and the more memory is needed to store that data. We developed a model for the memory usage of our apps depending on the input parameters, and coded it into the workunit generator. The workunit generator calculates the memory requirement for each particular workunit and writes it into the workunit record, so that the scheduler doesn't give tasks to clients that have too little (available) memory. For CPU apps, this works pretty well. However, BOINC's handling of GPU memory restrictions is completely separate from that. It only allows specifying a minimum RAM per GPU and a single fixed value for how much memory will actually be used (by this app), completely ignoring the memory requirement assigned to the workunit. Therefore I have now changed the scheduler so that it requires the GPU to have at least as much memory as recorded in the workunit. Tests show that this is a pretty good fit, at least for our applications. This might lead to a lot more "work requests" being rejected (or fulfilled by FGRP work), but it should avoid the memory allocation errors of the GPU apps.

Hi Bernd, please see my post here: https://einsteinathome.org/content/discussion-thread-continuous-gw-search-known-o2md1-now-o2mdf-gpus-only?page=28#comment-177563

I think there is a bug in your scheduling method for determining how much GPU memory a particular WU will use. You can see in my screenshot that the scheduler thinks the task only needs 1800 MB, but in practice it tries to allocate more than 3000 MB (about 3200 MB when run on a GPU with enough memory). This is why so many people with 3 GB GPUs keep getting sent GW tasks that error out for lack of memory.

I hope this information is helpful for you to track down the cause and be able to fix this error. :)
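
To make the mismatch concrete, here is a small sketch using the figures reported in this post (an 1800 MB scheduler estimate and roughly 3200 MB actually allocated) and treating a 3 GB card as 3072 MB. The function name `scheduler_accepts` is hypothetical, and the check is a simplified stand-in for whatever the real scheduler does.

```python
GPU_RAM_MB = 3072               # a 3 GB card
SCHEDULER_ESTIMATE_MB = 1800    # what the scheduler believes the task needs (from the post)
OBSERVED_ALLOCATION_MB = 3200   # approximate allocation seen on a large-memory GPU (from the post)


def scheduler_accepts(gpu_ram_mb: int, required_mb: int) -> bool:
    """Simplified check: the GPU must have at least the required memory."""
    return gpu_ram_mb >= required_mb


# The task is sent because the estimate fits on the card...
sent = scheduler_accepts(GPU_RAM_MB, SCHEDULER_ESTIMATE_MB)
# ...but it cannot run, because the real allocation does not fit.
fits = scheduler_accepts(GPU_RAM_MB, OBSERVED_ALLOCATION_MB)
# Size of the underestimate in MB.
underestimate_mb = OBSERVED_ALLOCATION_MB - SCHEDULER_ESTIMATE_MB
```

Under these numbers the task is sent (`sent` is true) yet cannot fit (`fits` is false), with the estimate about 1400 MB too low.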

Rob R

Joined: 22 May 14

Posts: 1

Credit: 34135745

RAC: 0

9 May 2020 18:11:28 UTC

Message 177643

Seems as though this is a known issue, but I thought I'd add my GPU for reference.

I have a GTX 1050 (2 GB). The work units run for about 1:30, then the GPU memory usage starts going up. At about 1:50 it hits 1.8 GB used, then the task fails with a computation error.

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3949

Credit: 46779802642

RAC: 64115474

9 May 2020 21:49:52 UTC

Message 177649 in response to message 177564

Also, the scheduler is checking global memory; it might be a little better to look at available memory instead. Running the desktop on a GUI-based OS will eat up a couple hundred MB of GPU RAM, which can be the deciding factor for a GPU that's on the line. A 2 GB GPU might be able to run a GW task needing 1800 MB, but not if it's also driving the display. The scheduler isn't taking that into account when it checks global memory.
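
A minimal sketch of the difference, assuming a display overhead of 300 MB (a made-up stand-in for "a couple hundred MB") and hypothetical function names:

```python
def check_global(global_mb: int, required_mb: int) -> bool:
    """Check against the card's total (global) memory."""
    return global_mb >= required_mb


def check_available(global_mb: int, display_overhead_mb: int, required_mb: int) -> bool:
    """Check against what is left after the desktop has taken its share."""
    return (global_mb - display_overhead_mb) >= required_mb


GLOBAL_MB = 2048            # a 2 GB card
DISPLAY_OVERHEAD_MB = 300   # assumed desktop/GUI usage, for illustration only
TASK_MB = 1800              # GW task requirement mentioned above

by_global = check_global(GLOBAL_MB, TASK_MB)                             # 2048 >= 1800
by_available = check_available(GLOBAL_MB, DISPLAY_OVERHEAD_MB, TASK_MB)  # 1748 >= 1800
```

The global check accepts the task, while the available-memory check rejects it, which is the borderline case described here.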

astro-marwil

Joined: 28 May 05

Posts: 532

Credit: 644996543

RAC: 1106785

21 May 2020 22:23:52 UTC

Message 177990

Hello!

It seems to me that O2MD1 is wrapping up and running out of tasks. I haven't received any new tasks for several days.

What is the next app in this regard, and when is it coming?

Kind regards and happy crunching

Martin

Keith Myers

Joined: 11 Feb 11

Posts: 4964

Credit: 18717964300

RAC: 6380281

22 May 2020 0:29:24 UTC

Message 177994

You and I would both like to know. I think that the O2MD1 app data is finished, since Bernd posted the notice that their analysis of the data was published.

He did hint that they needed to follow up, maybe with an O3 run, to look at the outlier candidates that look interesting.

But there have been no messages or posts from the admins saying that the next application and data run is forthcoming.

I haven't been able to get more than 1 or 2 O2MD1 tasks per week for several weeks now.

My CPU threads are mostly running FGRP5 tasks, since there are plenty of those.

But there are plenty of O2MDF GPU tasks to run.

Mike Hewson

Moderator

Joined: 1 Dec 05

Posts: 6588

Credit: 316253501

RAC: 338093

22 May 2020 1:44:12 UTC

Message 177999

It seems to me that the O3 run, even shortened due to COVID, was quite a bonanza of detections. That speaks to the sensitivity gain of the instruments over O1 and O2. This augurs well for a continuous wave detection in our follow-up of candidates using O3 data.

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal
