Behavior that slows down the project (or not)?

Denis Puhar, dr. med.
Joined: 5 Nov 09
Posts: 36
Credit: 7006583
RAC: 0
Topic 197204

Hi!

Over the last few weeks I have had some (big!) problems with my PC (I also haven't been able to read relevant new posts; in other words, I've been 'out of the loop' for some time). Besides that, I have been forced to ABORT quite a few tasks (I did that AS SOON as I realized that I wouldn't be able to complete them).

As discussed previously, it is better to abort a task, so that another 'wingman' gets it right away, than to take a 'let it be' attitude and wait until the DEADLINE has passed, because ONLY AFTER that (14 days) does another wingman get the task.
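
To put rough numbers on that, here is a minimal sketch in Python. It assumes a simplified rule: an aborted task can be reissued as soon as the abort is reported, while an abandoned task is only reissued once the 14-day deadline has passed. The function name and the dates are hypothetical, and the real scheduler may well behave differently.

```python
from datetime import datetime, timedelta

DEADLINE = timedelta(days=14)  # deadline length mentioned above


def resend_time(sent, aborted_at=None):
    """Earliest moment the task can go to another wingman (simplified assumption)."""
    # An abort frees the task immediately; otherwise it waits for the deadline.
    return aborted_at if aborted_at is not None else sent + DEADLINE


sent = datetime(2013, 9, 1, 12, 0)  # hypothetical send date
early_abort = resend_time(sent, aborted_at=sent + timedelta(days=2))
timeout = resend_time(sent)

saved = timeout - early_abort
print(f"Aborting after 2 days frees the task {saved.days} days sooner")
```

Under these assumptions, the sooner the abort happens, the more of the 14 days the next wingman gets back.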

I admit that this can happen to ALL of us sometimes on a smaller scale, but by pure coincidence I 'stumbled' upon a host with a powerful, way above average PC (fast i7 processor, excellent GPU).

The 'owner' is ANONYMOUS, has 325 tasks listed under his computer info and a turnaround time of 14 days (!).

Of those tasks, ONLY 23 are in progress and 7 are valid or pending, but the VAST MAJORITY, that is 295 tasks, are listed under 'ERROR': around 230 of those tasks have 'TIMED OUT' and the rest were ABORTED just hours before they would have passed the DEADLINE. (ALL are BRP5 tasks, and the time I took the screenshots, for the purpose of a time frame, was around 13:40 on 25.9.2013, UTC + 2h.)

I do not know the reason for that and I do not wish to jump to ANY conclusions, but such 'behavior' seems to slow down the 'research' process of 'examining' the results, and the project itself.

Personally I think that, even though looking at the whole picture (350 000+ volunteers) this individual case will not really have much impact on the project itself, some kind of 'red line' should be drawn in such instances (unless there is a 'benign' explanation behind it: an accident of any kind, a move to a new PC, a software bug, ...).

Denis

“A little knowledge is a dangerous thing. So is a lot.” - Albert EINSTEIN

David S
Joined: 6 Dec 05
Posts: 2473
Credit: 22936222
RAC: 0

Quote:

by pure coincidence I 'stumbled' upon a host with a powerful, way above average PC (fast i7 processor, excellent GPU).

The 'owner' is ANONYMOUS, has 325 tasks listed under his computer info and a turnaround time of 14 days (!).

Of those tasks, ONLY 23 are in progress and 7 are valid or pending, but the VAST MAJORITY, that is 295 tasks, are listed under 'ERROR': around 230 of those tasks have 'TIMED OUT' and the rest were ABORTED just hours before they would have passed the DEADLINE. (ALL are BRP5 tasks, and the time I took the screenshots, for the purpose of a time frame, was around 13:40 on 25.9.2013, UTC + 2h.)

I do not know the reason for that and I do not wish to jump to ANY conclusions, but such 'behavior' seems to slow down the 'research' process of 'examining' the results, and the project itself.

Denis


If you give us the host ID, someone can make a stab at diagnosing its problem, maybe even try to contact the owner. This happens all the time over at Seti.

David

Miserable old git
Patiently waiting for the asteroid with my name on it.

Denis Puhar, dr. med.
Joined: 5 Nov 09
Posts: 36
Credit: 7006583
RAC: 0

Hi!

I deliberately did not post ANYTHING about the 'volunteer' in question, because I DID NOT WANT to 'point a finger' at ANYONE unless I was sure that such 'behavior' is unwanted, to say the least.

The Computer in question is:
7330906 (look under one of my PENDING BRP5 WUs: 173982157, where 'he' was my wingman)

My computer in this WU (where I have finished my task) is: 5372207

Denis

“A little knowledge is a dangerous thing. So is a lot.” - Albert EINSTEIN

Claggy
Joined: 29 Dec 06
Posts: 560
Credit: 2699403
RAC: 0

Quote:

Hi!

I deliberately did not post ANYTHING about the 'volunteer' in question, because I DID NOT WANT to 'point a finger' at ANYONE unless I was sure that such 'behavior' is unwanted, to say the least.

The Computer in question is:
7330906 (look under one of my PENDING BRP5 WUs: 173982157, where 'he' was my wingman)

My computer in this WU (where I have finished my task) is: 5372207

Denis


Since the computer is listed as anonymous, it could be anyone who has their computers hidden, so you can't point a finger at anyone.

Edit: made your host and workunit links live.

Claggy

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2950507031
RAC: 690114

At this project, nodes from big supercomputer clusters (not least, the Atlas cluster at MPI Hannover) are occasionally allocated BOINC/Einstein work if not needed for their primary purpose. Any unfinished tasks are unceremoniously dumped if the supercompute node is called back for its day job.

Because the nodes are on fast internal networks, that doesn't cause any bandwidth problems, and I think the administrators have concluded that the extra processing power is worth using, even if it delays some individual task completions.

Any node that applies to will be running Linux - if the one you found is running Windows, then forget everything I've just said....

Edit - Windows 7, Home version. It's not a cluster node...

An i7 with 256 CPU cores? That's just silly. The cluster nodes will be typically Core2 CPUs, probably Xeon branded, and with NVidia Tesla GPUs if at all.

Denis Puhar, dr. med.
Joined: 5 Nov 09
Posts: 36
Credit: 7006583
RAC: 0

So, an almost brand new POWERFUL Intel 3rd generation (I think) i7, with an excellent GPU and a WIN7 OS.

No question marks here.

But number of processors: 256 (!?!)

Some information here is clearly false and by itself raises a 'red flag' (if nothing else: a bug?).

Besides that, the question I asked remains (P.S.: and I'm not interested in pointing at anyone):

Is such behavior acceptable, regardless of how much impact it has on OVERALL project performance (especially because, as I understand it, we can rule out the 'supercomputer theory')?

D

“A little knowledge is a dangerous thing. So is a lot.” - Albert EINSTEIN

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2950507031
RAC: 690114

'Number of processors' can be quite legitimately adjusted by a configuration option, but that's intended to help developers debug applications by mimicking a larger processor than they can afford:

Quote:
<ncpus>N</ncpus>
Act as if there were N CPUs; e.g. to simulate 2 CPUs on a machine that has only 1. To use the number of available CPUs, set the value to -1 (was 0 which in newer clients really means zero).

(from Client configuration)
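
For anyone who wants to see what such a setting looks like, here is a minimal sketch in Python that builds the relevant fragment of a cc_config.xml file. The option is `<ncpus>` inside the `<options>` section; the helper name `build_cc_config` is made up for illustration, and where the file lives and how the client is told to re-read it depends on your installation.

```python
import xml.etree.ElementTree as ET


def build_cc_config(ncpus):
    """Return a minimal cc_config.xml body with only the <ncpus> option set.

    Hypothetical helper for illustration; a real cc_config.xml usually
    carries other options as well.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "ncpus").text = str(ncpus)
    return ET.tostring(root, encoding="unicode")


# -1 means "use the number of CPUs actually available", as in the quote above.
xml_text = build_cc_config(-1)
print(xml_text)
```

A host reporting 256 processors would be consistent with this value having been set to 256, whether for testing or by mistake, though nothing in the host listing confirms that.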

In my opinion, if the user is carrying out active testing for some reason, the tiny (in relative terms) number of tasks delayed doesn't really matter at all. But if they've made a mistake, then it would be polite to correct it.

Denis Puhar, dr. med.
Joined: 5 Nov 09
Posts: 36
Credit: 7006583
RAC: 0

Thank you for this explanation and also for your opinion on my 'original' question.

Quote:
In my opinion, if the user is carrying out active testing for some reason, the tiny (in relative terms) number of tasks delayed doesn't really matter at all. But if they've made a mistake, then it would be polite to correct it.

I have (had) the same thoughts about that, but I appreciate that someone else (maybe with slightly more 'experience' with the project) thinks similarly.

D

“A little knowledge is a dangerous thing. So is a lot.” - Albert EINSTEIN

Denis Puhar, dr. med.
Joined: 5 Nov 09
Posts: 36
Credit: 7006583
RAC: 0

Just out of curiosity, because whatever the answer, it'll have little or no OVERALL effect on the project (regarding the ANONYMOUS host we were discussing in this thread):

When I took another glance at THIS computer a few minutes ago, pretty much everything was still the same. BUT how is it possible that, within a day or a day and a half, the AVERAGE TURNAROUND TIME for this host changed from 14 days to 2.62 days?!

D

“A little knowledge is a dangerous thing. So is a lot.” - Albert EINSTEIN

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7213324931
RAC: 965210

Quote:
BUT how is it possible that, within a day or a day and a half, the AVERAGE TURNAROUND TIME for this host changed from 14 days to 2.62 days?!


With averaging, there is always the question of what the average is taken over, and which samples get counted as part of it. I'll hazard a guess that average turnaround time as represented here is per valid task, perhaps restricted to some moderate time frame. If you just click through to the task list for the host you are discussing here, you may notice in checking the Pending, Valid, Invalid, and Error lists that the host in question has returned a great deal of work very recently, and also cancelled a great deal of work.

It seems likely that the tasks returned late (and so listed as invalid) and the tasks cancelled by the user (listed on the error page) don't contribute to the average turnaround calculation.

As very few valid tasks had been returned previously, and the many recent ones included some with very short turnaround times, the shortening seems a matter of ordinary arithmetic.
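
That arithmetic can be sketched in a few lines of Python. This assumes, as guessed above, a plain mean over valid tasks only; the project's real calculation may differ (it could be weighted or time-limited), and every task record below is invented for illustration rather than taken from the host in question.

```python
from statistics import mean


def average_turnaround(tasks):
    """Mean turnaround in days over valid tasks only (assumed rule, not confirmed)."""
    valid = [days for outcome, days in tasks if outcome == "valid"]
    return mean(valid) if valid else None


# Hypothetical history: two slow valid results and many timed-out tasks.
tasks_before = [("valid", 14.0), ("valid", 13.9)] + [("timed_out", 14.0)] * 230

# A day later: ten quick valid results arrive, and more tasks are aborted.
tasks_after = tasks_before + [("valid", 0.5)] * 10 + [("aborted", 13.8)] * 65

before = average_turnaround(tasks_before)
after = average_turnaround(tasks_after)
print(f"before: {before:.2f} days, after: {after:.2f} days")
```

With only two slow valid results in the history, ten fast ones are enough to pull the average from about 14 days to under 3, while the hundreds of timed-out and aborted tasks never enter the calculation at all.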

Denis Puhar, dr. med.
Joined: 5 Nov 09
Posts: 36
Credit: 7006583
RAC: 0

Quote:
As very few valid tasks had been returned previously, and the many recent ones included some with very short turnaround times, the shortening seems a matter of ordinary arithmetic.

I agree completely with this explanation. But that raises (at least for me) a much more important issue:

If that is really the case, the info about avg. turnaround time can sometimes (how often?) be TOTALLY useless, as demonstrated here.

D

P.S.

I don't want to raise this question (or 'dust') again (or I do so very reluctantly), because it has been discussed quite a lot previously. Nevertheless, such (and/or similar) 'events' (e.g. a volunteer uses a TOP performance PC for a very short while, discovers a new pulsar in that time, and then what follows is 'mission accomplished', no need to participate anymore) can be (and ARE) VERY discouraging and do not do a very good job of motivating people to stay 'aboard' indefinitely (myself included, with my VERY VERY limited resources, time and money).

Sooner or later the question arises:

What about the rest of us (not a small number, scientists excluded), who have been volunteering for years (including myself, and what about those who have been doing this for many more years than I have, at least on this project), who haven't been 'lucky' enough to discover anything (too) valuable for science with our (below) average computers, and who are simply left 'empty-handed'?

Sure, people will disagree with me, saying that EVERY single volunteer matters in the name of science in general, but we are all human beings who need one kind of motivation or another. Period!

As primitive as it may seem, this 'urge' is 'rooted' in EVERY SINGLE ONE of us.

I'm perfectly aware that pure 'luck' also plays a part, but to me it seems a bit 'deceiving', or at least way out of proportion, that those who find a NEW pulsar (I suppose some achieved this in a matter of weeks or less) get a framed certificate and are mentioned in a renowned scientific journal, while the rest of us should be satisfied with being 'user of the day' from time to time.

And I'm not asking for any special rewards. Even something as BASIC as a system of badges (JUST an EXAMPLE) would be MORE THAN ENOUGH and work 'miracles' - at least for me.

D

“A little knowledge is a dangerous thing. So is a lot.” - Albert EINSTEIN
