Am I getting longer running tasks ...

Bert Hyman
Joined: 5 Dec 05
Posts: 15
Credit: 6206746
RAC: 0
Topic 220540

... or has my computer sprung a leak?

Until recently, the "Gravitational Wave Search" tasks ran in about 10 hours on my old Intel i3-based Linux box. For the past week or so, I've found some running for over 15 hours. The machine has done nothing for more than a week but run 2 tasks at a time, each receiving 100% of a thread on a 2-core, 4-thread processor. Each process is shown as using 25% of the CPU.

The two running tasks show an estimated 15-hour running time. They are:

 h1_0186.00_O2C02Cl1In0__O2MD1C1_CasA_186.15Hz_13

 h1_0186.00_O2C02Cl1In0__O2MD1C1_CasA_186.15Hz_12

Those tasks and all those waiting to run have an "Estimated Computation Size" of 144,000 GFLOPs.

Is it my machine or the universe at fault here?
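
For what it's worth, the numbers in this post can be turned into an implied processing rate. The sketch below is only back-of-the-envelope arithmetic: it assumes the "Estimated Computation Size" is exact (it is an estimate, so the real work per task may differ), and the function name `implied_gflops` is made up for illustration.

```python
def implied_gflops(size_gflop: float, hours: float) -> float:
    """Effective rate in GFLOPS if a task of `size_gflop` GFLOPs takes `hours`.

    Assumes the estimated computation size is the true amount of work,
    which is an assumption, not something the project guarantees.
    """
    return size_gflop / (hours * 3600.0)


# Figures quoted in the post: 144,000 GFLOPs per task, ~10 h before, ~15 h now.
rate_before = implied_gflops(144_000, 10)
rate_now = implied_gflops(144_000, 15)
print(f"about {rate_before:.2f} GFLOPS at 10 h, {rate_now:.2f} GFLOPS at 15 h")
```

If the estimate is the same for both kinds of task, a longer run time means either the machine is delivering fewer operations per second or the estimate does not reflect the real work in the new tasks.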

Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

The focus of the "search" changed from O2MD1V1_VelaJr1 to O2MD1C1_CasA. That can be seen in the names of the tasks.

Different data... and these CasA tasks may take longer to crunch through. Personally, all six of my CasA tasks so far have ended up as 'validate error' (Linux host). I don't know yet if it's my computer or something else.
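
As a rough illustration of reading the search off a task name, here is a minimal sketch. It assumes the name layout seen in this thread (a data part, then `__`, then the search label followed by frequency and sequence number); that layout is inferred from the examples here, not from any project documentation, and `search_label` is a hypothetical helper.

```python
def search_label(task_name: str) -> str:
    """Return the search label (e.g. 'O2MD1C1_CasA') from a task name.

    Assumed layout, inferred only from names quoted in this thread:
    '<data part>__<run>_<target>_<frequency>Hz_<number>[_<copy>]'
    """
    _, _, tail = task_name.partition("__")
    if not tail:
        raise ValueError(f"no '__' separator in task name: {task_name!r}")
    parts = tail.split("_")
    if len(parts) < 2:
        raise ValueError(f"unexpected task name layout: {task_name!r}")
    # The first two fields after '__' are taken to be run and target.
    return "_".join(parts[:2])


print(search_label("h1_0186.00_O2C02Cl1In0__O2MD1C1_CasA_186.15Hz_13"))
```

Names that do not follow this layout (the Gamma Ray Pulsar task names, for example, contain no `__`) raise a `ValueError` rather than returning a guess.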

Bert Hyman
Joined: 5 Dec 05
Posts: 15
Credit: 6206746
RAC: 0

Thanks. Looking back, I now see that the earlier tasks that ran long were actually "Gamma Ray Pulsar something." One was LATeah1002F_1320.0_178994_0.0_0.

So none of the new ones have even completed on my machine. I have 2 running and 6 more in the queue. Looking at the elapsed + estimated remaining time for the 2 active tasks, it looks like 16 hours each now.

See you all (much) later, I guess.

Keith Myers
Joined: 11 Feb 11
Posts: 4699
Credit: 17541659602
RAC: 6359418

All my O2MD1C1_CasA tasks have gone straight to validate errors too. You are not alone. The tasks are probably bad.

ursmii
Joined: 15 Sep 19
Posts: 2
Credit: 20585157
RAC: 0

Why should I spend so much computing time if so many results are invalid?

bye bye einstein ...

Bert Hyman
Joined: 5 Dec 05
Posts: 15
Credit: 6206746
RAC: 0

The two tasks that finished both failed with validate errors.

h1_0186.00_O2C02Cl1In0__O2MD1C1_CasA_186.15Hz_13_0

h1_0186.00_O2C02Cl1In0__O2MD1C1_CasA_186.15Hz_12_0

Looks like something is wrong.

At least it's keeping that part of the basement warm.

San-Fernando-Valley
Joined: 16 Mar 16
Posts: 260
Credit: 6910071637
RAC: 21942229

... that's an excellent question ...

maybe somebody can explain that to you?

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

When things like this happen, the project staff usually grants credit manually for failed tasks (validate errors) after the problem is fixed. So you'll usually get credit for the work done, but the science is still lost.

Bert Hyman
Joined: 5 Dec 05
Posts: 15
Credit: 6206746
RAC: 0

San-Fernando-Valley wrote:

... that's an excellent question ...

maybe somebody can explain that to you?

Explain what, exactly?

Bert Hyman
Joined: 5 Dec 05
Posts: 15
Credit: 6206746
RAC: 0

Holmis wrote:
When things like this happen, the project staff usually grants credit manually for failed tasks (validate errors) after the problem is fixed. So you'll usually get credit for the work done, but the science is still lost.

Not worried so much about the credits, but about the repeated failures of the tasks.

If there's something wrong on my end, I'd like to fix it. If there's a problem with the software that's massaging the data, I'll stop taking new tasks until I see that it's been fixed. Or, if the error reports are actually spurious, I'll let things continue as they are.

 

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

Bert Hyman wrote:
Holmis wrote:
When things like this happen, the project staff usually grants credit manually for failed tasks (validate errors) after the problem is fixed. So you'll usually get credit for the work done, but the science is still lost.

Not worried so much about the credits, but about the repeated failures of the tasks.


I do share that worry.

Quote:
If there's something wrong on my end, I'd like to fix it.


If you look at your tasks and find that others also get validate errors, then it is most probably not your fault.
A validate error is declared when there is something obviously wrong with the returned result. The validator does a sanity check of each result when it is returned, before trying to compare it with your wingman's; if that check fails, the result is declared a validate error.
If you run your gear outside its default operating parameters and your wingmen return good results, that might be a good indication that your hardware is returning bad results.
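
To make that two-step flow concrete (sanity check first, comparison with the wingman second), here is a toy sketch. It is not the project's validator: the specific checks, the tolerance, and every status string other than "validate error" are assumptions chosen purely for illustration.

```python
import math
from typing import Optional, Sequence


def passes_sanity_check(values: Sequence[float]) -> bool:
    """Hypothetical sanity check: the result is non-empty and all finite."""
    return len(values) > 0 and all(math.isfinite(v) for v in values)


def validate(result: Sequence[float],
             wingman: Optional[Sequence[float]],
             rel_tol: float = 1e-6) -> str:
    """Toy model of the flow described above; statuses are illustrative."""
    # Step 1: the sanity check runs before any comparison takes place.
    if not passes_sanity_check(result):
        return "validate error"
    # Step 2: compare with the wingman's result once it is available.
    if wingman is None:
        return "pending"
    agrees = len(result) == len(wingman) and all(
        math.isclose(a, b, rel_tol=rel_tol) for a, b in zip(result, wingman)
    )
    return "valid" if agrees else "inconclusive"


print(validate([float("nan")], [1.0]))   # fails the sanity check
print(validate([1.0, 2.0], [1.0, 2.0]))  # agrees with the wingman
```

The point of the sketch is only the ordering: a result that fails the first step never reaches the comparison, which matches the pattern of everyone's results for the same tasks failing in the same way.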

Quote:
If there's a problem with the software that's massaging the data, I'll stop taking new tasks until I see that it's been fixed.


If you get numerous validate errors and your wingmen do too, then there is probably something wrong with the tasks. Feel free to stop running them until the staff gets a chance to examine the problem.

Quote:
Or, if the error reports are actually spurious, I'll let things continue as they are.


Nothing wrong with reporting errors! That's one way the staff gets informed of what's happening!
