Resumed Gamma-Ray Pulsar search

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117481210518
RAC: 35442848

Quote:

New FGRP2 tasks will run a bit longer (~ twice as long) now, and will have the FLOPs estimation reduced to 1/4. Flops estimation and Credit will be fine-tuned when we have more data (i.e. tasks returned), but possibly not this year anymore.

BM


Thanks very much for attending to this. I've added several hosts very recently and these have downloaded and completed tasks with the changed configs already. The estimated and actual times are much closer now so that is great to see.

Once again, thanks for fixing this promptly.

Cheers,
Gary.

Steve Applin
Joined: 19 Jul 10
Posts: 14
Credit: 20185964
RAC: 0

I've noticed on two of my machines that there has been a substantial increase in run time (30% more on host 4127571 and 5x more on host 4127568) for Gravitational Wave S6 LineVeto search v1.13 (SSE2) tasks.

An example of the massive increase in time is task 140002012 (http://einsteinathome.org/workunit/140002012) on machine 4127568.

Is the increase in time related to this issue, or do I have another problem?

astro-marwil
Joined: 28 May 05
Posts: 531
Credit: 642416543
RAC: 1107888

Hello!
From 125 tasks crunched on one computer within a bit more than 4 days, I get
Mean Crunching Time : 2.71 +/- 0.67 [h]
Mean Run Time : 2.96 +/- 0.79 [h]
Mean Relative Crunching Overhead : 9.4 +/- 8.1 [%]
The shortest Run Time was 1.4 [h], the longest 5.1 [h].
The smallest Relative Overhead was 1.7 [%], the biggest one 55.1 [%].
There is no correlation between relative overhead and crunching time.
I also didn't find a correlation between long running times and my activities on this computer, like writing this post, backups, or virus scans.
So, there is a very high variety in the behaviour of the tasks.

Kind regards and happy crunching
Martin
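
For anyone who wants to produce the same kind of summary for their own host, here is a minimal sketch. It assumes that "relative overhead" means the run time not spent on the CPU, as a percentage of the CPU time; the post does not spell out the definition. The function names and the sample numbers are made up for illustration and are not Martin's data.

```python
from statistics import mean, stdev

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def summarize(tasks):
    """tasks: list of (cpu_time_h, run_time_h) pairs, one per finished task."""
    cpu = [c for c, _ in tasks]
    run = [r for _, r in tasks]
    # Assumed definition: wall-clock time not spent on the CPU,
    # expressed as a percentage of the CPU time.
    overhead = [100.0 * (r - c) / c for c, r in tasks]
    return {
        "cpu_mean": mean(cpu), "cpu_sd": stdev(cpu),
        "run_mean": mean(run), "run_sd": stdev(run),
        "overhead_mean": mean(overhead), "overhead_sd": stdev(overhead),
        "corr_overhead_vs_cpu": pearson(cpu, overhead),
    }

# Made-up sample values, not the 125 tasks from the post.
sample = [(2.0, 2.1), (3.0, 3.3), (4.0, 4.2)]
stats = summarize(sample)
print(stats)
```

A correlation close to zero between CPU time and overhead, as in this toy sample, would match the "no correlation" observation above.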

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117481210518
RAC: 35442848

Quote:
Is the increase in time related to this issue, or do I have another problem?


No, the dramatic increase in actual run time shown by the task you referenced has nothing to do with any see-sawing of estimated run time to be expected when there is a wide variation in the accuracy of estimates of various science runs within the one project. The potential problem I was pointing to has now been averted (as explained by Bernd) by the actions taken to correct the estimates for the new FGRP2 run. The estimates still need further refinement but are certainly good enough so as not to cause violent swings in the DCF value. I've been watching things closely in several of my hosts and whilst there is still fluctuation in DCF, the swings are modest and shouldn't cause any real problems.
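
To illustrate the kind of see-sawing meant here, the following is a toy model only. It assumes that the DCF jumps up immediately when a task overruns its estimate and drifts down slowly (10% of the gap per task) when tasks finish early. That update rule and the function name `update_dcf` are assumptions for illustration, not code taken from the BOINC client.

```python
def update_dcf(dcf, estimated_h, actual_h):
    """One update step of a toy DCF model (an assumption, not BOINC source)."""
    # Ratio of the actual run time to the raw (uncorrected) estimate.
    ratio = actual_h / estimated_h
    if ratio > dcf:
        # Task ran longer than the corrected estimate: jump up at once.
        return ratio
    # Task ran shorter: move only 10% of the way down.
    return dcf + 0.1 * (ratio - dcf)

dcf = 1.0
history = []
# Hypothetical mix: one science run is estimated accurately (2 h vs 2 h),
# another is underestimated by a factor of 4 (1 h estimated, 4 h actual).
for est, act in [(2.0, 2.0), (1.0, 4.0), (2.0, 2.0), (2.0, 2.0)]:
    dcf = update_dcf(dcf, est, act)
    history.append(round(dcf, 3))

print(history)
```

In this model a single badly underestimated task pushes the DCF from 1.0 to 4.0, so every accurately estimated task is then shown as taking roughly four times too long until the factor slowly decays again.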

You certainly have another issue and it's one that I've seen from time to time in some of my hosts. However there are no guarantees that the causes in my cases are necessarily the same as for your case.

These days, I largely run Linux and I don't see the problem. A couple of years ago I was running a much greater proportion of WinXP hosts and I saw the problem (run times blowing out to 5x to 10x normal) quite regularly.

My habit was (and still is with Linux) to run crunching hosts with no keyboard, mouse, or monitor attached. WinXP (perhaps depending somewhat on the hardware it was running on) didn't like this and, after anywhere from a few days to a week or two, it would start delivering dramatically extended run times just like your example. The tasks would still validate but progress was woeful. I quickly found a workaround, which was to hook up a keyboard and mouse.

This wasn't a complete solution. What it really did was to simply extend the period before the dramatic slowdown started. The complete workaround was to actually toggle some keys on the keyboard or move the mouse once in a while. With a keyboard and mouse attached, it usually took several weeks for a slowdown to occur and I found I could prevent this from ever occurring by toggling the numlock key or moving the mouse every week or so. I never see this problem on any machines with Linux. They run for months and months (just the box, power cable and network cable) with no sign of a slowdown.

I don't know the exact cause of the slowdown but I'm guessing it was something to do with Windows consuming increasing amounts of CPU cycles trying to poll the detached hardware, or something like that. The problem resolved itself the instant I connected the devices and/or toggled the numlock key and/or moved the mouse. I wanted to change to Linux anyway so this was a pretty good excuse.

Apart from the above dramatic slowdowns, I also see what is usually a much less significant slowdown that is heat related. I assume it is some sort of thermal throttling of one (or more) core(s) in a multi-core CPU that happen to be running a bit hotter than some internal limit is happy with. On a quad, for example, there is usually not much variation from what is expected for the 4 simultaneous tasks that are running if all cores are sufficiently cool. If the ambient temperature is too high, or if the heat sink is starting to lose efficiency, or if the fan is starting to run dry, this can often be spotted by occasional tasks running slower than previously. It's not usually a huge slowdown like in your example, more like 10-50% slower than normal. It's a wakeup call to do some preventive maintenance (PM), after which the slowdown is usually resolved.

I don't know what might have caused the slowdown you reported but hopefully the above may give you some things to consider.

Cheers,
Gary.

Steve Applin
Joined: 19 Jul 10
Posts: 14
Credit: 20185964
RAC: 0

@ astro-marwil @ Gary

Thanks for your help with this, I appreciate your time.

The huge slowdown has happened on my work laptop and I suspect it's seen better days. It will probably soon be time to uninstall BOINC and return it to IT for a new one. The lesser of the two slowdowns happened on my home computer.

It struck me as a bit strange because the issue affected two computers at the same time, and the change was instant rather than a gradual buildup in completion times, yet a third laptop I'm running BOINC on was not affected.

When I get a bit more time, "later", I'll try some PM on my home PC.

Thanks again for your time.

Steve

Miklos M.
Joined: 3 Apr 05
Posts: 19
Credit: 1741294417
RAC: 312540

I wonder why I receive 70 credits for a Gamma wu on one computer and on the other one I get 377?

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250376692
RAC: 35061

See this post.

BM

Miklos M.
Joined: 3 Apr 05
Posts: 19
Credit: 1741294417
RAC: 312540

Sorry, can you be more specific? The credits seem to vary as of late.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250376692
RAC: 35061

The credit that will be granted is assigned to the workunit (WU) when it is generated. Tasks of FGRP2 WUs generated before Jan 4 will be granted the old FGRP1 value of 377 credits; tasks of FGRP2 WUs generated after Jan 4 will be granted 70 (as announced here).

BM
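
The rule described above can be sketched as a small function. This is an illustration only: the function name `granted_credit` is hypothetical, the year used for the cutoff is an arbitrary placeholder because the post gives only "Jan 4", and treating the cutoff day itself as "new" is an assumption.

```python
from datetime import date

OLD_CREDIT = 377  # FGRP1 value, from the post
NEW_CREDIT = 70   # value for WUs generated after the change, from the post

def granted_credit(wu_generated: date, cutoff: date) -> int:
    """Credit is fixed by the WU's generation date, not by when the task finishes."""
    # The post says "before" and "after" Jan 4; counting the cutoff day
    # itself as "new" is an assumption made here.
    return OLD_CREDIT if wu_generated < cutoff else NEW_CREDIT

# The year is not stated in the post; 2000 is an arbitrary placeholder.
EXAMPLE_YEAR = 2000
cutoff = date(EXAMPLE_YEAR, 1, 4)

print(granted_credit(date(EXAMPLE_YEAR, 1, 3), cutoff))
print(granted_credit(date(EXAMPLE_YEAR, 1, 5), cutoff))
```

This is why two hosts can report different credit for FGRP2 tasks at the same time: what matters is when each WU was generated, not which computer crunched it or when the result was returned.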

Miklos M.
Joined: 3 Apr 05
Posts: 19
Credit: 1741294417
RAC: 312540

Thank you, now it is clear.
