Gamma-ray pulsar binary search #1 on GPUs

TimeLord04

Joined: 8 Sep 06

Posts: 1442

Credit: 72378840

RAC: 0

20 Dec 2016 22:03:36 UTC

Message 153169

[Update:]

On the MAC, I'm still crunching away on my 1.14 Units in queue... May be a few more days until I hit the new 1.17 Units. On a positive note, however, NO NEW Invalids have popped up. Only the 5 I mentioned in prior posts are showing on my Web Results Page. (Three 1.12 Units Invalid, and two 1.13 Units Invalid. Still suspect an OpenCL Bug.)

If the 1.17 Units are as dependable as the 1.14 Units seem to be, maybe I won't encounter any more issues with OpenCL at Einstein@Home. OpenCL CONTINUES to be an issue at SETI on MACs running Darwin 15.4.0 or newer.

On my XP Pro x64 system, I'm still crunching away at the 1.16 Units in queue, again - it will be two or three days before I hit the 1.17 Units in queue. Still MANY Pending Arecibo BRP4G Units awaiting Validation. ...AND, one Arecibo Unit with "Validate Error" due to "WU Cancelled"...

TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join SETI Refugees

Holmis

Joined: 4 Jan 05

Posts: 1118

Credit: 1055935564

RAC: 0

20 Dec 2016 22:14:16 UTC

Message 153171 in response to message 153169

TimeLord04 wrote:

[Update:]

On the MAC, I'm still crunching away on my 1.14 Units in queue... May be a few more days until I hit the new 1.17 Units.

Wouldn't this be one of the few times when aborting tasks is OK in order to start running a newer version of an app, regardless of what one thinks of this type of action?
If an app has a newly released update, wouldn't it be better to run the newer version instead of persisting with an older version that is now outdated? Especially if the older version has some sort of problem like you reported earlier, although that doesn't seem to be the case with this particular update. I would think that there is a good reason to release a new version, or the project wouldn't do it.

TimeLord04

Joined: 8 Sep 06

Posts: 1442

Credit: 72378840

RAC: 0

21 Dec 2016 1:48:35 UTC

Message 153183 in response to message 153171

Holmis wrote:

TimeLord04 wrote:
[Update:]

On the MAC, I'm still crunching away on my 1.14 Units in queue... May be a few more days until I hit the new 1.17 Units.

Wouldn't this be one of the few times when aborting tasks is OK in order to start running a newer version of an app, regardless of what one thinks of this type of action?
If an app has a newly released update, wouldn't it be better to run the newer version instead of persisting with an older version that is now outdated? Especially if the older version has some sort of problem like you reported earlier, although that doesn't seem to be the case with this particular update. I would think that there is a good reason to release a new version, or the project wouldn't do it.

In my opinion (YMMV), it is ESSENTIAL to crunch what one is given, out of due respect for the Wingmen assigned to the same Units. Until such time as Bernd or Oliver issues a Server Side Cancellation, or notifies us in these Threads to Cancel Units, I will continue crunching ALL work sent to my systems.

Example: Yesterday, I was sent a Resend of a BRP4G on my MAC. It was a First Reissue/Resend, ending in "_2". The Wingman on this Unit has been patiently waiting since it was issued to him/her on December 5th. His/her initial Wingman Timed Out! Is it fair to the Wingman who has been waiting since December 5th if I cancel the Resend on my computer??? I think NOT.

Looking at the remaining 1.14 Units, some of them are "_2" and "_3" Resends... I will gladly crunch these until notified by Bernd or Oliver to Abort them.

[EDIT:]

I did in fact Cancel ALL the 1.15 Units sent to my XP Pro x64 system (over 300 of them in total). These were cancelled because they were ALL taking over 6 Hrs to crunch (2 at a time) on my system, AND Bernd stated he was issuing the 1.16 Units to Replace the 1.15 Units. In this case, EVERYONE was complaining about the 4-6 Hours taken to finish those Units! At that pace, and with 300+ of them to go through, they WOULD NOT have finished by deadline. This is a GOOD case to cancel Units.

If my system is capable of making the deadline, and Wingmen are waiting for collaboration, then I believe it is ESSENTIAL that I complete the work assigned.
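For anyone who wants to check the 1.15 arithmetic, here is a rough sketch. It assumes the figures quoted above mean that each pair of Units took about 6 hours of wall-clock time and that the queue held exactly 300 Units (both are lower bounds). The 14-day deadline is a hypothetical placeholder, since the actual deadline is not stated in this thread, and the function name is made up for illustration.

```python
import math


def days_to_clear_queue(tasks, concurrent, hours_per_batch):
    """Wall-clock days needed to finish `tasks` when `concurrent` tasks
    run at once and each batch takes `hours_per_batch` hours."""
    batches = math.ceil(tasks / concurrent)
    return batches * hours_per_batch / 24


# Figures from the post: 300+ Units, 2 at a time, 6+ hours per pair.
queue_days = days_to_clear_queue(tasks=300, concurrent=2, hours_per_batch=6)

# ASSUMPTION: hypothetical deadline; the real value is not given in the thread.
ASSUMED_DEADLINE_DAYS = 14

print(f"{queue_days:.1f} days of crunching vs. an assumed "
      f"{ASSUMED_DEADLINE_DAYS}-day deadline")
```

Under these assumptions the queue needs well over a month of continuous crunching, so it would miss any deadline of a couple of weeks.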

TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join SETI Refugees

frederikhk

Joined: 28 Nov 15

Posts: 6

Credit: 4147280

RAC: 0

21 Dec 2016 6:44:58 UTC

Message 153193

Are the workunits for the GPU application test WUs or real data?

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4349

Credit: 253578960

RAC: 36166

21 Dec 2016 7:23:00 UTC

Message 153194

All FGRP tasks are "real" data. However, the ones currently issued for FGRPB1G are from a set that was already processed in FGRPB1 (CPU) earlier (just "resized") to allow for comparison & validation. We originally planned to issue "new" workunits for FGRPB1G on Monday, but found that preparing these takes longer than expected. It should be done today, though.

Christian Beer

Joined: 9 Feb 05

Posts: 595

Credit: 197663587

RAC: 18433

21 Dec 2016 11:08:27 UTC

Message 153199

We just switched over to a new dataset with an increased "payload" of science. These tasks are designed to run 5 times longer than the previous tasks. I'm going to monitor runtimes a bit over the next days and plan to refine the settings for flops_estimation and credit calculation a bit.

Holmis

Joined: 4 Jan 05

Posts: 1118

Credit: 1055935564

RAC: 0

21 Dec 2016 11:14:02 UTC

Message 153200 in response to message 153183

TimeLord04 wrote:

In my opinion (YMMV), it is ESSENTIAL to crunch what one is given, out of due respect for the Wingmen assigned to the same Units. Until such time as Bernd or Oliver issues a Server Side Cancellation, or notifies us in these Threads to Cancel Units, I will continue crunching ALL work sent to my systems.

Yes, and normally I agree with and live by the same motto. But as I said, this is a very new app that's still being tested and developed, so under these circumstances I feel that running the latest version has a higher priority. As always, everyone should act the way they feel is best when there is no clear statement from the project on how they would want us to act. I'm not trying to put blame on anyone, I just want to have a good discussion.

TimeLord04 wrote:

Example: Yesterday, I was sent a Resend of a BRP4G on my MAC. It was a First Reissue/Resend, ending in "_2". The Wingman on this Unit has been patiently waiting since it was issued to him/her on December 5th. His/her initial Wingman Timed Out! Is it fair to the Wingman who has been waiting since December 5th if I cancel the Resend on my computer??? I think NOT.

Looking at the remaining 1.14 Units, some of them are "_2" and "_3" Resends... I will gladly crunch these until notified by Bernd or Oliver to Abort them.

As I understand it, when one aborts a task and that abort is reported to the server, a resend is immediately created and put in the queue to be sent out. It should go out fairly quickly, and the wingman shouldn't have to wait that much longer. Sure, the luck of the draw might mean that the resend goes to a slower host or one that won't return it, but that's true of all tasks.

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4349

Credit: 253578960

RAC: 36166

21 Dec 2016 12:25:19 UTC

Message 153203

As I understand it, when one aborts a task and that abort is reported to the server, a resend is immediately created and put in the queue to be sent out.

This is correct. If you abort the task and report it immediately, there is no delay on the server side, except for the time between when a task is created and when it is sent. This is very short, however, given the high throughput of FGRPB1G.

frederikhk

Joined: 28 Nov 15

Posts: 6

Credit: 4147280

RAC: 0

21 Dec 2016 13:27:32 UTC

Message 153207

Great! Thanks for the fast answer :)

MarkHNC

Joined: 31 Aug 12

Posts: 37

Credit: 170965842

RAC: 0

21 Dec 2016 17:42:03 UTC

Message 153216 in response to message 153199

Christian Beer wrote:

We just switched over to a new dataset with an increased "payload" of science. These tasks are designed to run 5 times longer than the previous tasks. I'm going to monitor runtimes a bit over the next days and plan to refine the settings for flops_estimation and credit calculation a bit.

In case it helps: on a crunching-only Xeon E5-2670 (v1) 2.6GHz/32GB RAM with HT/16 threads and a GTX 960 running two GPU tasks at once, I was getting times between 1290 seconds and 1350 seconds, and getting 693 points for each validated work unit. Now, six completed work units have ranged between 3662 seconds and 4814 seconds. The three of those that have validated have been granted only 700 points each. (Oddly, that box is running Win10, but Einstein shows it as Win8.1.)
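To put those numbers side by side, here is a quick sketch of the credit rate before and after the switch, using only the runtimes and points quoted above. The function name is made up for illustration, and the rates are per task slot (two tasks were running at once), so they are not a measure of total host throughput.

```python
def credit_per_hour(credit, runtime_seconds):
    """Credit earned per hour of runtime for a single task."""
    return credit * 3600 / runtime_seconds


# Old tasks: 693 points, 1290 to 1350 seconds each.
old_rates = [credit_per_hour(693, t) for t in (1290, 1350)]

# New tasks: 700 points, 3662 to 4814 seconds each.
new_rates = [credit_per_hour(700, t) for t in (3662, 4814)]

# Smallest and largest possible runtime ratio (new / old) from the quoted ranges.
runtime_ratios = (3662 / 1350, 4814 / 1290)

print(f"old: {min(old_rates):.0f} to {max(old_rates):.0f} credits/hour per task")
print(f"new: {min(new_rates):.0f} to {max(new_rates):.0f} credits/hour per task")
print(f"runtime ratio: {runtime_ratios[0]:.2f}x to {runtime_ratios[1]:.2f}x")
```

On this host the new tasks run roughly three to four times longer rather than the designed five, while the credit per task is almost unchanged, so the credit rate per task drops by well over half.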
