Multi-Directional Gravitational Wave Search on O3 data (O3MD1/F)

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2939848082
RAC: 703545

Ian&Steve C. wrote:
since it only seems to be affecting Windows and maybe Apple/Mac, and not Linux, that would point to an application problem. 

Maybe it's a bit of both, then?

I keep a long stdoutdae.txt log file, so I've gone through it.

Between 14 Dec 2022 and 22 Dec 2022, I completed 1287 GW tasks with no errors. The frequencies went up slowly from 0310.80 to 0315.40.

On 22 Dec, I was switched from 0315.40 to 0454.20, and immediately started getting errors. Most tasks reported 'Unrecoverable error' in the log; others simply exited with a missing output file, but no error message.

I was excluded because I'd exceeded my error quota limit, and completed my last GW task at 08:20 UTC on 23 Dec (by then, I had a few at frequency 0454.40). I waited out my exclusion, and reset the project to delete that data sequence.

I completed my first task after the quota reset at 01:06 UTC 24 Dec, starting with a 0423.80 dataset. The very first task was a _7 replication (final before the workunit is retired for 'too many errors'). I'm going away for a few days, so I switched exclusively to Gamma ray tasks to avoid wasting electricity while I'm away.
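
A tally like the one described above (task outcomes grouped by the frequency in the task name) could be sketched roughly as follows. This is only an illustration under assumptions: the `tally_by_frequency` helper, the `(task_name, succeeded)` record format, and the sample task names are hypothetical, and it assumes the frequency appears in the task name as a field like `0454.20`. Extracting such records from stdoutdae.txt is not shown.

```python
import re
from collections import Counter

# Assumption: the frequency appears in the task name as a four-digit,
# two-decimal field such as "0454.20".
FREQ_RE = re.compile(r"(?<!\d)(\d{4}\.\d{2})(?!\d)")


def tally_by_frequency(records):
    """Count ok/error outcomes per frequency.

    records: iterable of (task_name, succeeded) pairs (hypothetical format).
    Returns a dict mapping frequency string -> Counter with "ok"/"error".
    """
    tally = {}
    for name, succeeded in records:
        match = FREQ_RE.search(name)
        if match is None:
            continue  # no recognisable frequency field in this name
        counts = tally.setdefault(match.group(1), Counter())
        counts["ok" if succeeded else "error"] += 1
    return tally


# Made-up sample records for illustration; not taken from any real log.
sample = [
    ("h1_0315.40_example_task_0", True),
    ("h1_0315.40_example_task_1", True),
    ("h1_0454.20_example_task_0", False),
    ("h1_0454.20_example_task_1", False),
    ("h1_0454.20_example_task_2", True),
]

for freq, counts in sorted(tally_by_frequency(sample).items()):
    print(freq, "ok:", counts["ok"], "error:", counts["error"])
```

Sorting by frequency makes it easy to see whether errors start at a particular data range rather than at a particular date.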

Greg_BE
Joined: 15 Aug 08
Posts: 90
Credit: 105469727
RAC: 27978

Is this floating-point error a GPU-only error, or does it also affect the CPU?

Are GW tasks affected on the O3 project?

Keith Myers
Joined: 11 Feb 11
Posts: 4944
Credit: 18575255362
RAC: 5662694

Greg_BE wrote:

Is this floating-point error a GPU-only error, or does it also affect the CPU?

Are GW tasks affected on the O3 project?

? ? ? ? ?

Since this thread is specifically about the O3MD1/F Gravitational Wave tasks ... then yes, the O3 project is affected.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3923
Credit: 45261672642
RAC: 63264375

I think he means whether the recent errors with the Windows GPU app are also affecting the CPU app.

I don't think they are. I haven't heard anyone say they were having that same issue. It seems to affect only the Windows GPU application, so O3MDF.

Greg_BE
Joined: 15 Aug 08
Posts: 90
Credit: 105469727
RAC: 27978

Thanks Ian and Steve.
I tend to not express myself perfectly at times. I thought that was pretty clear.

I had a look at my results page. Not very promising: on the 24th (the last day I ran O3 GPU), 18 good and over 70 bad. That's not a great average, so I'll just leave them off for now until this is fixed.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2939848082
RAC: 703545

One of my other Windows machines has just produced a couple of sporadic 'Float Invalid Operation' errors, in the middle of a perfectly normal production sequence. The error tasks (1402295040, 1403709211) are resends from the 0414.20 sequence, whereas the first-issue tasks it's working through are 0413.80.

That narrows it down a bit, and is getting a bit close for comfort - I'm going to have to keep an eye on that machine!

LumenDan
Joined: 30 Apr 05
Posts: 11
Credit: 202237411
RAC: 145219

I had been getting O3GW errors on a Windows 11 22H2 machine since early December, coinciding with Nvidia driver 527.56 (RTX 3060):
https://einsteinathome.org/host/12894743

My other Windows 11 machine was still on Windows 11 21H2 and had been producing clean results even after updating the Nvidia driver to 527.56 later in December (with a GTX 1660):
https://einsteinathome.org/host/12916464

I initially thought it might just be a bug in the recent driver affecting computation on the 30-series GPU. However, after updating the second machine to Windows 11 22H2 yesterday, that machine is also producing computation errors with the O3 gravitational wave search on the Nvidia GPU.
I am aware that contributors are finding issues with other versions of Windows as well, so this may not be particularly useful information, but having identified a specific change in system configuration that "broke" the application, I thought it was worth posting.

Regards, 

Lumendan

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2939848082
RAC: 703545

Your second machine has also just changed dataset. Yesterday morning, you were given tasks from 0377.80; today, you got 0445.20.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2939848082
RAC: 703545

Both my remaining Windows machines have now reached the dreaded 0414.20 range and started throwing errors.

whill44
Joined: 16 Jul 16
Posts: 2
Credit: 162300243
RAC: 0

I think I'm going to take a break from Einstein@Home for a week. Even though I indicated no more Multi-Directional work in my preferences, I'm still getting swamped by those tasks, and they're gumming up the good work. I put time and money into this because it's fun working with computers in max load conditions. I'll check back in a week.

See ya
