Gravity Wave search on GPUs: do we have a problem?

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2140
Credit: 2770157264
RAC: 933107
Topic 225265

The Server Status Page says that there are no tasks left for O2MDF, and that the O2MDFS3a work generator is 'disabled'. I'm getting a few resends only.

Betreger
Joined: 25 Feb 05
Posts: 987
Credit: 1433384814
RAC: 596418

One of my boxes ran out of GWs and has resorted to pulsars; the other one still has a few GWs.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2140
Credit: 2770157264
RAC: 933107

Two days since I noticed that the GW work generator had been turned off, but I was surprised to see that those 'few resends' kept me topped up all through the weekend. That's an awful lot of 'no reply' or 'not started by deadline' failures. And I was getting to the point where locality scheduling breaks down and practically every work fetch triggers a new set of data files to download.

But I've just got my first few 'O3ASE' engineering beta tasks through. I'll give them the hurry-up.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2140
Credit: 2770157264
RAC: 933107

First four back - from GTX 1660 under Linux Mint. Ran normally, no validation yet (still pending).

 

First comments:

They're slower than before, or at least run for longer (might be an effect of the low frequency band).

The clean-up pause at 99% lasts much longer.

The runtime estimate (and hence DCF) is still way out of alignment with FGRPB1G on the same machine.

As usual with OpenCL on NVidia, it actually clocks up 100% CPU while running, but is sent out with an estimate of 0.9 - which allows BOINC to start another CPU task. That's another one for my app_config.xml file.
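For anyone who wants to do the same, here is a minimal sketch that generates such an app_config.xml, reserving a full CPU core for each GPU task. The element names follow the documented BOINC app_config.xml layout, but the application name below is only a placeholder (an assumption on my part): check client_state.xml or the event log for the real short name before using it.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, cpu_usage: float = 1.0, gpu_usage: float = 1.0) -> str:
    """Return app_config.xml text that budgets cpu_usage CPUs per GPU task."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu, "cpu_usage").text = str(cpu_usage)
    ET.indent(root)  # pretty-print; needs Python 3.9 or later
    return ET.tostring(root, encoding="unicode")


# Placeholder name, NOT confirmed in this thread: replace with the real short app name.
APP_NAME = "einstein_O3AS"

config_xml = build_app_config(APP_NAME)
print(config_xml)
```

The output would be saved as app_config.xml in the Einstein@Home project directory and picked up with "Options > Read config files" in the BOINC Manager. A cpu_usage of 1.0 only changes what the client budgets for scheduling; it does not change how much CPU the task actually uses.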

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6537
Credit: 286460284
RAC: 93249

Richard Haselgrove wrote:

But I've just got my first few 'O3ASE' engineering beta tasks through. I'll give them the hurry-up.

O3 ? Hmmm . .. nice & shiny new tasks .... my precious .... ;-)

I think I may be the generator of more than a few 'not started by deadline' failures, alas; power down here has been ungraciously inconsistent.

 

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2140
Credit: 2770157264
RAC: 933107

There's another big batch of shiny new tasks, just starting. Still only getting them on Linux machines (might be my settings).

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3709
Credit: 34643113270
RAC: 41865191

Richard Haselgrove wrote:

First four back - from GTX 1660 under Linux Mint. Ran normally, no validation yet (still pending).

 

First comments:

They're slower than before, or at least run for longer (might be an effect of the low frequency band).

The clean-up pause at 99% lasts much longer.

The runtime estimate (and hence DCF) is still way out of alignment with FGRPB1G on the same machine.

As usual with OpenCL on NVidia, it actually clocks up 100% CPU while running, but is sent out with an estimate of 0.9 - which allows BOINC to start another CPU task. That's another one for my app_config.xml file.

how's GPU utilization?

what are the estimated flops set to for these new tasks?

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2140
Credit: 2770157264
RAC: 933107

Ian&Steve C. wrote:

how's GPU utilization?

what are the estimated flops set to for these new tasks?

1) I don't have a GPU monitor installed on those Linux machines. Sorry.

2) Starting with the just-finishing O2MDF tasks:

    <rsc_fpops_est>144000000000000.000000</rsc_fpops_est>
    <rsc_fpops_bound>2880000000000000.000000</rsc_fpops_bound>
 

New O3ASE tasks:

    <rsc_fpops_est>144000000000000.000000</rsc_fpops_est>
    <rsc_fpops_bound>2880000000000000.000000</rsc_fpops_bound>

(same)

FGRPB1G tasks for comparison:

    <rsc_fpops_est>525000000000000.000000</rsc_fpops_est>
    <rsc_fpops_bound>10500000000000000.000000</rsc_fpops_bound>


The running time for FGRPB1G tasks on this machine falls between the times for O2MDF and O3ASE tasks, but the FGRPB1G estimate is 3.6 times higher (fpops), or 6.1 times higher (runtime). I'll have to think about that difference.
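As a quick check on those numbers, here is a small sketch using the values copied from the workunit XML above. The 6.1 runtime ratio is taken from the post, not derived. The last step assumes the client's runtime estimate is proportional to rsc_fpops_est divided by the speed (flops) figure attached to the app version. That is my reading of how BOINC builds its estimate, so treat the implied speed ratio as a rough indication only.

```python
# Values copied from the <rsc_fpops_est> / <rsc_fpops_bound> lines quoted above.
WORKUNITS = {
    "O2MDF":   {"est": 144_000_000_000_000, "bound": 2_880_000_000_000_000},
    "O3ASE":   {"est": 144_000_000_000_000, "bound": 2_880_000_000_000_000},
    "FGRPB1G": {"est": 525_000_000_000_000, "bound": 10_500_000_000_000_000},
}


def fpops_ratio(a: str, b: str) -> float:
    """Ratio of the estimated floating-point operation counts of two task types."""
    return WORKUNITS[a]["est"] / WORKUNITS[b]["est"]


def bound_factor(name: str) -> float:
    """How many times larger the abort bound is than the estimate."""
    return WORKUNITS[name]["bound"] / WORKUNITS[name]["est"]


fpops = fpops_ratio("FGRPB1G", "O3ASE")

# Runtime-estimate ratio as quoted in the post (FGRPB1G vs GW), not derived here.
RUNTIME_RATIO = 6.1

# Assumption: estimated runtime is proportional to rsc_fpops_est / app speed,
# so the leftover factor is the speed the GW app is assumed to have relative
# to the FGRPB1G app.
implied_speed_ratio = RUNTIME_RATIO / fpops

print(f"fpops ratio FGRPB1G / O3ASE: {fpops:.2f}")
for name in WORKUNITS:
    print(f"{name}: bound is {bound_factor(name):.0f}x the estimate")
print(f"implied GW / FGRPB1G app speed ratio: {implied_speed_ratio:.2f}")
```

If that assumption holds, the gap between the two ratios would mean the GW app version is being credited with a noticeably higher speed than the FGRPB1G one, which would fit the runtime estimates being out of line.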

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3709
Credit: 34643113270
RAC: 41865191

if you have the nvidia driver package installed, just open Nvidia Settings; it's included with the driver and will give you all kinds of metrics. alternatively, you can run nvidia-smi from the terminal, which will give you a snapshot of GPU parameters (GPU utilization, memory use, etc.). special third-party software is not necessary.
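for logging rather than a one-off look, here is a minimal sketch that wraps nvidia-smi's CSV query mode. It assumes nvidia-smi is on the PATH; the function names are made up for illustration, and the field list is just one reasonable choice.

```python
import subprocess

# Fields requested from nvidia-smi. With "nounits", utilization is in percent
# and memory figures are in MiB.
QUERY_FIELDS = ["utilization.gpu", "memory.used", "memory.total"]


def _to_number(value: str):
    """Convert a CSV cell to float, or None for entries such as "[N/A]"."""
    try:
        return float(value)
    except ValueError:
        return None


def parse_gpu_csv(text: str):
    """Parse 'csv,noheader,nounits' output into one dict per GPU."""
    rows = []
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.split(",")]
        rows.append(dict(zip(QUERY_FIELDS, (_to_number(c) for c in cells))))
    return rows


def query_gpus():
    """Run nvidia-smi once and return the parsed snapshot (needs an NVIDIA driver)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)
```

calling query_gpus() every few seconds while a task is running should show whether the GPU stays busy or sits idle during the long pause at 99%.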

 

re flops: the estimate looks unchanged from O2MDF at 144,000 GFLOP (1.44e14 floating-point operations, a count rather than a rate). you're still getting only 1000 cred when it validates. so with the run times of these O3 tasks being roughly double (I checked your task history), you're getting half the credit per unit time.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2140
Credit: 2770157264
RAC: 933107

Ian&Steve C. wrote:
you're still getting only 1000 cred when it validates. 

If.

None of my O3 tasks has been validated yet.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2140
Credit: 2770157264
RAC: 933107
