Gamma-ray pulsar binary search #1 on GPUs

TimeLord04
Joined: 8 Sep 06
Posts: 1,442
Credit: 72,378,840
RAC: 0

[Update:]

On the MAC, I'm still crunching away on my 1.14 Units in queue...  May be a few more days until I hit the new 1.17 Units.  On a positive note, however, NO NEW Invalids have popped up.  Only the 5 I mentioned in prior posts are showing on my Web Results Page.  (Three 1.12 Units Invalid, and two 1.13 Units Invalid.  I still suspect an OpenCL Bug.)

If the 1.17 Units are as dependable as the 1.14 Units seem to be, maybe I won't encounter any more issues with OpenCL at Einstein@Home.  OpenCL CONTINUES to be an issue at SETI on MACs running Darwin 15.4.0 or newer.

On my XP Pro x64 system, I'm still crunching away at the 1.16 Units in queue; again, it will be two or three days before I hit the 1.17 Units.  Still MANY Pending Arecibo BRP4G Units awaiting Validation.  ...AND, one Arecibo Unit with "Validate Error" due to "WU Cancelled"...

TL

TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join SETI Refugees

Holmis
Joined: 4 Jan 05
Posts: 1,118
Credit: 811,677,463
RAC: 234,662

TimeLord04 wrote:

[Update:]

On the MAC, I'm still crunching away on my 1.14 Units in queue...  May be a few more days until I hit the new 1.17 Units.

Wouldn't this be one of the few times when aborting tasks is OK in order to start running a newer version of an app, regardless of what one thinks of this type of action?
If an app has just been updated, wouldn't it be better to run the newer version instead of persisting with an older version that is now outdated? Especially if the older version has some sort of problem like the one you reported earlier, although that doesn't seem to be the case with this particular update. I would think that there is a good reason to release a new version, or the project wouldn't do it.

TimeLord04
Joined: 8 Sep 06
Posts: 1,442
Credit: 72,378,840
RAC: 0

Holmis wrote:
TimeLord04 wrote:

[Update:]

On the MAC, I'm still crunching away on my 1.14 Units in queue...  May be a few more days until I hit the new 1.17 Units.

Wouldn't this be one of the few times when aborting tasks is OK in order to start running a newer version of an app, regardless of what one thinks of this type of action?
If an app has just been updated, wouldn't it be better to run the newer version instead of persisting with an older version that is now outdated? Especially if the older version has some sort of problem like the one you reported earlier, although that doesn't seem to be the case with this particular update. I would think that there is a good reason to release a new version, or the project wouldn't do it.

In my opinion, (YMMV), it is ESSENTIAL to crunch what one is given, out of respect for the Wingmen assigned to the same Units.  Until such time as Bernd or Oliver issues a Server Side Cancellation, or notifies us in these Threads to Cancel Units, I will continue crunching ALL work sent to my systems.

Example:  Was just sent, yesterday, a Resend of a BRP4G to my MAC.  It was a First Reissue/Resend; ending in "_2".  The Wingman on this Unit has been patiently waiting since it was issued to him/her on December 5th.  His initial Wingman Timed Out!  Is it fair to the Wingman who has been waiting since December 5th if I cancel the Resend on my computer???  I think NOT.

In looking at the remaining 1.14 Units, some of them are "_2" and "_3" Resends...  I will gladly crunch these until notified by Bernd or Oliver to Abort them.

[EDIT:]

I did in fact Cancel ALL the 1.15 Units sent to my XP Pro x64 system.  (Totaling over 300 of them.) These were cancelled in response to the fact that they were ALL taking over 6 Hrs to crunch, (2 at a time), on my system; AND, Bernd stated he was issuing the 1.16 Units to Replace the 1.15 Units.  In this example, EVERYONE was complaining about the 4-6 Hours taken to finish those Units!  At that pace, and with 300+ of them to go through, they WOULD NOT have finished by deadline.  This is a GOOD case to cancel Units.

If my system is capable of making the deadline, and Wingmen are waiting for collaboration, then I believe it is ESSENTIAL that I complete the work assigned.
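
As a rough sketch of the deadline arithmetic behind the 1.15 cancellation described above: 300 Units at about 6 hours each, running 2 at a time, take far longer than any short deadline allows. The function name and the 14-day deadline below are assumptions for illustration only; the post does not state the actual deadline.

```python
import math


def days_to_clear_queue(task_count, hours_per_task, concurrent_tasks):
    """Wall-clock days needed to finish a queue on one GPU.

    Assumes every task takes the same time and the GPU always runs
    `concurrent_tasks` tasks side by side.
    """
    batches = math.ceil(task_count / concurrent_tasks)
    return batches * hours_per_task / 24


# Figures from the post: 300+ tasks, about 6 hours each, 2 at a time.
days_needed = days_to_clear_queue(task_count=300, hours_per_task=6, concurrent_tasks=2)

# ASSUMPTION: a 14-day deadline, used here only as an illustrative value.
assumed_deadline_days = 14

print(f"Days needed: {days_needed:.1f}")
print("Finishes before assumed deadline:", days_needed <= assumed_deadline_days)
```

With these inputs the queue needs 37.5 days, and even at the faster 4-hour figure it would still need 25 days.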

TL

TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join SETI Refugees

frederikhk
Joined: 28 Nov 15
Posts: 6
Credit: 3,339,308
RAC: 0

Are the workunits for the GPU application test WUs or real data?

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 3,979
Credit: 205,063,671
RAC: 36,240

All FGRP tasks are "real" data. However, the ones currently issued for FGRPB1G are from a set that was already processed in FGRPB1 (CPU) earlier (just "resized") to allow for comparison & validation. We originally planned to issue "new" workunits for FGRPB1G on Monday, but found that preparing these takes longer than expected. It should be done today, though.

BM

Christian Beer
Moderator
Joined: 9 Feb 05
Posts: 595
Credit: 97,272,826
RAC: 12,809

We just switched over to a new dataset with an increased "payload" of science. These tasks are designed to run 5 times longer than the previous tasks. I'm going to monitor runtimes over the next few days and plan to refine the settings for flops_estimation and credit calculation a bit.

Holmis
Joined: 4 Jan 05
Posts: 1,118
Credit: 811,677,463
RAC: 234,662

TimeLord04 wrote:
In my opinion, (YMMV), it is ESSENTIAL to crunch what one is given, out of respect for the Wingmen assigned to the same Units.  Until such time as Bernd or Oliver issues a Server Side Cancellation, or notifies us in these Threads to Cancel Units, I will continue crunching ALL work sent to my systems.


Yes, and normally I agree with and live by the same motto. But as I said, this is a very new app that's still being tested and developed, so under these circumstances I feel that running the latest version has a higher priority. As always, everyone should act the way they feel is best when there is no clear statement from the project on how they want us to act. I'm not trying to put blame on anyone, I just want to have a good discussion.

TimeLord04 wrote:

Example:  Was just sent, yesterday, a Resend of a BRP4G to my MAC.  It was a First Reissue/Resend; ending in "_2".  The Wingman on this Unit has been patiently waiting since it was issued to him/her on December 5th.  His initial Wingman Timed Out!  Is it fair to the Wingman who has been waiting since December 5th if I cancel the Resend on my computer???  I think NOT.

In looking at the remaining 1.14 Units, some of them are "_2" and "_3" Resends...  I will gladly crunch these until notified by Bernd or Oliver to Abort them.


As I understand it, when one aborts a task and that abort is reported to the server, a resend is immediately created and put in the queue to be sent out. It should go out fairly quickly, so the wingman shouldn't have to wait much longer. Sure, the luck of the draw might mean that the resend goes to a slower host or one that won't return it, but that's true of all tasks.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 3,979
Credit: 205,063,671
RAC: 36,240

Holmis wrote:
As I understand it, when one aborts a task and that abort is reported to the server, a resend is immediately created and put in the queue to be sent out.

This is correct. If you abort the task and report it immediately, there is no delay on the server side, except for the time between when a task is created and when it is sent. This is very short, however, with the high throughput of FGRPB1G.

BM

frederikhk
Joined: 28 Nov 15
Posts: 6
Credit: 3,339,308
RAC: 0

Great! Thanks for the fast answer :)

MarkHNC
Joined: 31 Aug 12
Posts: 37
Credit: 170,965,842
RAC: 0

Christian Beer wrote:
We just switched over to a new dataset with an increased "payload" of science. These tasks are designed to run 5 times longer than the previous tasks. I'm going to monitor runtimes over the next few days and plan to refine the settings for flops_estimation and credit calculation a bit.

In case it helps, on a crunching-only Xeon E5-2670 (v1) 2.6GHz/32GB RAM with HT/16 threads and GTX 960 running two GPU tasks at once, I was getting times between 1290 seconds and 1350 seconds, and getting 693 points for each validated work unit.  Now, six completed work units have ranged between 3662 seconds and 4814 seconds.  Three of those that have validated have been granted only 700 points.  (Oddly, that box is running Win10, but Einstein shows it as Win8.1.)
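
To put the figures above side by side, here is a minimal sketch that converts runtime and granted credit into credit per wall-clock hour. The function name is hypothetical, and it assumes two tasks run concurrently both before and after the dataset switch, and that each quoted runtime is the elapsed time of a single task.

```python
def credit_per_hour(credit_per_task, runtime_seconds, concurrent_tasks=2):
    """Credit earned per wall-clock hour when tasks run side by side.

    ASSUMPTION: `runtime_seconds` is the elapsed time of one task while
    `concurrent_tasks` tasks share the GPU.
    """
    tasks_per_hour = concurrent_tasks * 3600 / runtime_seconds
    return credit_per_task * tasks_per_hour


# Old dataset: 693 credits per task, 1290 to 1350 seconds each.
old_rate = (credit_per_hour(693, 1350), credit_per_hour(693, 1290))

# New dataset: 700 credits per task, 3662 to 4814 seconds each.
new_rate = (credit_per_hour(700, 4814), credit_per_hour(700, 3662))

print(f"Old dataset: {old_rate[0]:.0f} to {old_rate[1]:.0f} credits/hour")
print(f"New dataset: {new_rate[0]:.0f} to {new_rate[1]:.0f} credits/hour")
```

Under these assumptions the hourly rate on the new tasks comes out at roughly a third of the old rate, which is the kind of gap the planned credit refinement would presumably address.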
