Gravity Wave search on GPUs: do we have a problem?

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2956446437
RAC: 715583
Topic 225265

The Server Status Page says that there are no tasks left for O2MDF, and that the O2MDFS3a work generator is 'disabled'. I'm getting a few resends only.

Betreger
Joined: 25 Feb 05
Posts: 992
Credit: 1588792395
RAC: 757594

One of my boxes ran out of GWs and has resorted to pulsars; the other one still has a few GWs.

Richard Haselgrove

Two days since I noticed that the GW work generator had been turned off, but I was surprised to see that those 'few resends' kept me topped up all through the weekend. That's an awful lot of 'no reply' or 'not started by deadline' failures. And I was getting to the point where locality scheduling breaks down and practically every work fetch triggers a new set of data files to download.

But I've just got my first few 'O3ASE' engineering beta tasks through. I'll give them the hurry-up.

Richard Haselgrove

First four back - from GTX 1660 under Linux Mint. Ran normally, no validation yet (still pending).

First comments:

They're slower than before, or at least run for longer (might be an effect of the low frequency band).

The clean-up pause at 99% lasts much longer.

The runtime estimate (and hence DCF) is still way out of alignment with FGRPB1G on the same machine.

As usual with OpenCL on NVidia, it actually clocks up 100% CPU while running, but is sent out with an estimate of 0.9 - which allows BOINC to start another CPU task. That's another one for my app_config.xml file.
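The app_config.xml override mentioned here could look roughly like the output of the sketch below, which reserves a full CPU core for each GPU task instead of the 0.9 the task is sent out with. This is an illustration under assumptions: `example_gw_app` is a placeholder rather than the real application name (the actual name would have to be looked up in client_state.xml), and `build_app_config` is a hypothetical helper, not part of BOINC.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, cpu_usage: float = 1.0, gpu_usage: float = 1.0) -> str:
    """Return app_config.xml text overriding CPU/GPU usage for one application."""
    return (
        "<app_config>\n"
        "  <app>\n"
        f"    <name>{app_name}</name>\n"
        "    <gpu_versions>\n"
        f"      <gpu_usage>{gpu_usage}</gpu_usage>\n"
        f"      <cpu_usage>{cpu_usage}</cpu_usage>\n"
        "    </gpu_versions>\n"
        "  </app>\n"
        "</app_config>\n"
    )


# "example_gw_app" is a placeholder; substitute the project's actual app name.
config_text = build_app_config("example_gw_app", cpu_usage=1.0, gpu_usage=1.0)

# Sanity check: the generated text must parse as well-formed XML.
ET.fromstring(config_text)
print(config_text)
```

The resulting file would go in the project's folder inside the BOINC data directory; the client should pick it up after a restart or after re-reading its config files.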

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6588
Credit: 316161457
RAC: 334754

Richard Haselgrove wrote:

But I've just got my first few 'O3ASE' engineering beta tasks through. I'll give them the hurry-up.

O3? Hmmm ... nice & shiny new tasks ... my precious ... ;-)

I think I may be the generator of more than a few 'not started by deadline' failures, alas: power down here has been ungraciously inconsistent.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

Richard Haselgrove

There's another big batch of shiny new tasks, just starting. Still only getting them on Linux machines (might be my settings).

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3945
Credit: 46766922642
RAC: 64052889

Richard Haselgrove wrote:

First four back - from GTX 1660 under Linux Mint. Ran normally, no validation yet (still pending).

First comments:

They're slower than before, or at least run for longer (might be an effect of the low frequency band).

The clean-up pause at 99% lasts much longer.

The runtime estimate (and hence DCF) is still way out of alignment with FGRPB1G on the same machine.

As usual with OpenCL on NVidia, it actually clocks up 100% CPU while running, but is sent out with an estimate of 0.9 - which allows BOINC to start another CPU task. That's another one for my app_config.xml file.

How's GPU utilization?

What are the estimated flops set to for these new tasks?

Richard Haselgrove

Ian&Steve C. wrote:

How's GPU utilization?

What are the estimated flops set to for these new tasks?

1) I don't have a GPU monitor installed on those Linux machines. Sorry.

2) Starting with the just-finishing O2MDF tasks:

    <rsc_fpops_est>144000000000000.000000</rsc_fpops_est>
    <rsc_fpops_bound>2880000000000000.000000</rsc_fpops_bound>

New O3ASE tasks:

    <rsc_fpops_est>144000000000000.000000</rsc_fpops_est>
    <rsc_fpops_bound>2880000000000000.000000</rsc_fpops_bound>

(same)

FGRPB1G tasks for comparison:

    <rsc_fpops_est>525000000000000.000000</rsc_fpops_est>
    <rsc_fpops_bound>10500000000000000.000000</rsc_fpops_bound>


The running time for FGRPB1G tasks on this machine falls between the times for O2MDF and O3ASE tasks, but the FGRPB1G estimate is 3.6 times higher (fpops), or 6.1 times higher (runtime). I'll have to think about that difference.
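The fpops side of that comparison can be checked directly from the `<rsc_fpops_est>` and `<rsc_fpops_bound>` values quoted above. The short sketch below does so; the dictionary layout and function name are illustrative choices only. The 6.1 times runtime figure depends on client-side data not quoted in this thread, so it is not reproduced here.

```python
# rsc_fpops_est / rsc_fpops_bound values as quoted in this post.
TASKS = {
    "O2MDF": {"est": 144_000_000_000_000, "bound": 2_880_000_000_000_000},
    "O3ASE": {"est": 144_000_000_000_000, "bound": 2_880_000_000_000_000},
    "FGRPB1G": {"est": 525_000_000_000_000, "bound": 10_500_000_000_000_000},
}


def fpops_ratio(task_a: str, task_b: str) -> float:
    """Ratio of the estimated floating-point operations of two task types."""
    return TASKS[task_a]["est"] / TASKS[task_b]["est"]


# FGRPB1G estimate relative to the gravitational-wave tasks.
print(f"FGRPB1G / O3ASE fpops ratio: {fpops_ratio('FGRPB1G', 'O3ASE'):.2f}")

for name, values in TASKS.items():
    gflop = values["est"] / 1e9                  # estimate expressed in units of 1e9 operations
    headroom = values["bound"] / values["est"]   # how far the bound sits above the estimate
    print(f"{name}: {gflop:,.0f} GFLOP estimated, bound = {headroom:.0f}x estimate")
```

In all three cases the bound works out to 20 times the estimate, so only the estimate itself differs between the applications.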

Ian&Steve C.

If you have the NVIDIA driver package installed, just open Nvidia Settings; it's included with the driver and will give you all kinds of metrics. Alternatively, you can run nvidia-smi from the terminal, which will give you a snapshot of GPU parameters (GPU utilization, memory use, etc.). Special third-party software is not necessary.
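For anyone who wants that snapshot in a script rather than on screen, here is a minimal sketch that calls nvidia-smi through its CSV query interface and parses the result. It assumes nvidia-smi is on the PATH; the function names are hypothetical, and the three queried fields are just one reasonable selection.

```python
import subprocess

# Fields requested from nvidia-smi; utilization is in percent, memory in MiB.
QUERY_FIELDS = ("utilization.gpu", "memory.used", "memory.total")


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    gpus = []
    for line in text.strip().splitlines():
        values = [value.strip() for value in line.split(",")]
        row = {}
        for field, value in zip(QUERY_FIELDS, values):
            try:
                row[field] = float(value)
            except ValueError:
                row[field] = None  # non-numeric entry, e.g. "[N/A]"
        gpus.append(row)
    return gpus


def query_gpus():
    """Run nvidia-smi once and return a utilization snapshot for each GPU."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)


# Usage on a machine with the NVIDIA driver installed:
#     for index, gpu in enumerate(query_gpus()):
#         print(index, gpu)
```

Running `query_gpus()` repeatedly while a task is in progress would give a rough picture of how busy the GPU is over the course of the run.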

Re flops: the estimate looks to remain unchanged from O2MDF at 144,000 GFLOPs. You're still getting only 1000 credits when it validates. So with the run times of these O3 tasks being roughly double (I checked your task history), you're getting half the credits per unit time.
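The credit arithmetic can be written out as a quick sketch. The 1000 credits per task and the "roughly double" run time come from this post; the baseline run time of 1800 seconds is a made-up number used purely for illustration, since only the ratio matters.

```python
def credits_per_hour(credit: float, runtime_seconds: float) -> float:
    """Credit earned per hour of run time for a task with a fixed award."""
    return credit / (runtime_seconds / 3600.0)


CREDIT_PER_TASK = 1000.0        # fixed award per validated task, as stated above
o2_runtime = 1800.0             # hypothetical O2MDF run time in seconds (illustration only)
o3_runtime = 2 * o2_runtime     # O3 tasks run roughly twice as long

o2_rate = credits_per_hour(CREDIT_PER_TASK, o2_runtime)
o3_rate = credits_per_hour(CREDIT_PER_TASK, o3_runtime)

print(f"O3 rate relative to O2MDF: {o3_rate / o2_rate:.2f}")
```

Whatever baseline run time is substituted, a fixed award with a doubled run time gives half the credit rate.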

Richard Haselgrove

Ian&Steve C. wrote:
You're still getting only 1000 credits when it validates.

If.

None of my O3 tasks has been validated yet.

