S5GCE, was: Beyond S5R6

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5842
Credit: 109393893397
RAC: 35828994

RE: I don't suppose it

Message 96833 in response to message 96832

Quote:
I don't suppose it would be an idea to allow S5R6 to enter an 'end of run' mode where it sends out additional copies to get results more quickly?


Just after S5GCE started, around 3 days ago, there were around 250K outstanding S5R6 workunits (if my memory is correct). Now, just a few days later, there are only 117K. If the IR had been increased by just 1 back then, there would be fewer than 117K now but probably not a whole lot fewer. To achieve that 'modest gain', 250K extra tasks would have been sent out, many of which would have been just a waste of donated resources. In the last hour, the number of outstandings has dropped close to 2K from 119K to 117K. With that good rate of progress, why do anything just yet?

In a week or two when the number is very much smaller and the rate of drop has slowed to a crawl, it would be much less profligate to consider bumping the IR to hurry things up a bit. I would imagine that might be done if there really was a need to clean things up quickly. Remember that in, say, 2 weeks' time there will have already been at least one 'resend' event for all non-returned (deadline miss) tasks in flight at the time S5GCE started. This should have added some short term stimulus to the rate of return. In the end however, it's always a slow process to get those last few quorums completed, as can be seen from the ABP1 numbers.
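
For anyone who wants to play with the figures in this post, here is a rough back-of-the-envelope sketch in Python. The function names are made up for illustration, and the drain estimate assumes the last hour's drop rate simply continues, which it won't once only the stragglers are left.

```python
def extra_tasks_for_ir_bump(outstanding_workunits, ir_increase=1):
    """Extra task copies sent if every outstanding workunit gets ir_increase more copies."""
    return outstanding_workunits * ir_increase


def naive_hours_to_drain(outstanding_workunits, drop_per_hour):
    """Hours to reach zero IF the current drop rate held (an optimistic assumption)."""
    return outstanding_workunits / drop_per_hour


# Figures quoted in the post: ~250K outstanding at S5GCE start,
# ~117K now, dropping by ~2K in the last hour.
print(extra_tasks_for_ir_bump(250_000))      # extra copies for IR + 1
print(naive_hours_to_drain(117_000, 2_000))  # hours at a constant rate
```

The first figure is the cost of bumping the IR at the start of S5GCE. The second is best read as an optimistic estimate, since the rate of return slows as the pool shrinks.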

Cheers,
Gary.

Ver Greeneyes
Joined: 26 Mar 09
Posts: 140
Credit: 9562235
RAC: 0

Fair enough on all counts :)

Message 96834 in response to message 96833

Fair enough on all counts :) By the way, any word on the odd progress bar behaviour for these new WUs? Is it just an aesthetic issue or is something in the code actually failing and backing up to try again? (by design or otherwise)

Stranger7777
Joined: 17 Mar 05
Posts: 436
Credit: 417507283
RAC: 33800

Thank you both for the

Thank you both for the answer. Now I understand why those 6 tasks are still on the status page. BTW, I saw some of those ABP1 tasks a month ago and found that they could not be completed: I had already reset the project (I had to, because of a hardware malfunction), and it had no way to download the executables for them, since there were no suitable files on the server. I've aborted those few tasks so as not to wait two weeks for the deadline and for reissuing to another host. Was that correct?

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5842
Credit: 109393893397
RAC: 35828994

RE: ... any word on the odd

Message 96836 in response to message 96834

Quote:
... any word on the odd progress bar behaviour for these new WUs?


All I know is what Bernd said earlier in this thread.

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5842
Credit: 109393893397
RAC: 35828994

RE: I've aborted those few

Message 96837 in response to message 96835

Quote:
I've aborted those few tasks so as not to wait two weeks for the deadline and for reissuing to another host. Was that correct?


If you are unable to crunch tasks for whatever reason it is very helpful to abort them promptly rather than let them time out. That way they can be reissued promptly.

Sometimes you just can't do that - for example, a hard disk crash where it's not possible to access what was on it. In those cases, you just have to let the tasks time out, unless you are prepared to reconstruct sufficient of the state file to allow the fresh installation on the replacement hard drive to pretend to have the former hostID. With a bit of effort it's actually possible to do that and take advantage of the 'resend lost results' feature to get all the trashed tasks back and so prevent them having to wait to time out. Not really worth the effort unless you are a bit of a nut case :-).

Cheers,
Gary.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4266
Credit: 244924956
RAC: 16565

I put out a new generation

I put out a new generation (x.04) of S5GCE Apps with a couple of smaller changes:

- progress counting should be fixed
- less verbose progress messages in stderr
- fixed minor memory leak (16 Bytes every checkpoint)
- Windows App should write a stackdump on a crash

This is a maintenance / bugfix release, I don't expect any change in processing speed.

BM


MarkJ
Joined: 28 Feb 08
Posts: 437
Credit: 137327847
RAC: 18476

Any news on what app we'll be

Any news on what app we'll be using for the next search? I ask because isn't the purpose of testing the S5GCE's to prove:

  • a) They produce the same result, and
  • b) They are faster

Seeing as they have failed at b), does that mean we'll do the next run using an All Sky 301-based app rather than an S5GCE-based one? Or are there moves afoot to try to optimize the S5GCE implementation?

DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3562358667
RAC: 1580

RE: a) They produce the

Message 96840 in response to message 96839

Quote:
  • a) They produce the same result, and

Actually, not getting the same result is a major objective of the new app. I've reposted a link to the paper the LIGO team has written about the GCE app. LIGO produces too much data to exhaustively search all of it. Instead, the search process is written to discard the data that's least likely to have a signal in it early on, in order to concentrate on the more promising part. This results in a chance that data containing a real signal will be discarded.

If you look at figure 2 in the paper, the main advantage of GCE is that it pushes the signal strength that will almost always result in the signal being detected to be about 10x weaker than in S5R6, and has much higher chances of detecting somewhat weaker signals. The old approach only almost guaranteed detection of signals 100x stronger than the weakest signal that could possibly be detected. The new approach almost guarantees that any signal 10x stronger than the weakest possible will be picked up.

http://arxiv.org/pdf/0906.0023.pdf

Since the only known pulsar close enough that it might be detected, the Crab Nebula pulsar, is sitting right on the edge of being detectable, this means that GCE will be much more likely to give a positive result. (I don't have a citation for this; it was from a few posters from a conference that Bernd (Bruce???) linked to a few years back.)

Stranger7777
Joined: 17 Mar 05
Posts: 436
Credit: 417507283
RAC: 33800

Is there any possibility not

Is there any possibility not to run S5GCE for a while and concentrate on the rest of S5R6 data?

Ver Greeneyes
Joined: 26 Mar 09
Posts: 140
Credit: 9562235
RAC: 0

RE: Is there any

Message 96842 in response to message 96841

Quote:
Is there any possibility not to run S5GCE for a while and concentrate on the rest of S5R6 data?


I'm pretty sure you can only do that if there are actual WUs available. Right now they've all been sent out, and we're waiting for hosts to return them, or for their deadlines to run out. They -could- send out additional copies to speed things up, but the number of WUs left is already decreasing fairly rapidly (down from 100000 to 40000 in 4 or 5 days). The chances of a WU being sent out to broken hosts twice in a row are very low, never mind 3 times or more in a row. If the chance of a WU not being returned is 1%, the chance of this happening twice in a row would be 1%*1%, or 0.01%. To put it another way, we would expect about 400 of the currently remaining 40000 to need to be sent out again when their deadline expires, which should be in ~6 days on average, after which we would expect to have 4 WUs without final results left in ~18 days (with most of the 400 returning much sooner), and 0 in the 12 days after that. It might make sense to speed things up for the last stragglers, as that wouldn't put too much extra load on the server.
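
The estimate in this post can be sketched in a few lines of Python. This is only an illustration: the 1% miss rate is the guess used in the post, not a measured figure, the function name is invented, and each resend round is assumed to fail independently at the same rate.

```python
def expected_outstanding(initial, miss_rate, rounds):
    """Expected workunits still without a result after each resend round.

    Assumes every round loses the same fraction (miss_rate) of what is
    left, independently of earlier rounds. That is a simplification.
    """
    remaining = float(initial)
    history = []
    for _ in range(rounds):
        remaining *= miss_rate
        history.append(remaining)
    return history


# 40000 WUs outstanding, assumed 1% chance that a given copy never comes back.
counts = expected_outstanding(40000, 0.01, 3)
for round_no, count in enumerate(counts, start=1):
    print(f"after deadline round {round_no}: ~{count:.2f} WUs expected to remain")
```

With those inputs the expected counts come out at roughly 400, 4 and 0.04, matching the figures in the post. A higher miss rate stretches the tail out considerably, which is why the last few quorums always take a while.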

Mind you, having said all that - I have been prioritizing any S5R6 WUs I've seen in my task list ;) (by suspending all other tasks for a moment to get them to start, then resuming the rest)
