I don't seem to be able to opt out of S6Bucket, but I want to do so.
Hi Frank,
This is a deliberate policy decision on my part; let me explain why.
The search for gravitational waves is the fundamental reason for Einstein@Home, and I want to keep that at the core of our activities. The scientific impact of gravitational wave detections and observations is hard to overstate.
Detecting gravitational waves is hard -- it's not a walk in the park -- and I want to ensure that at least half of our computational resources go into that direction. I hope you understand!
Cheers,
Bruce
Director, Einstein@Home
Apparently the current app does checkpoint correctly, but it doesn't call boinc_checkpoint_completed(), which signals this to the Core Client.
The next app version will have this fixed, and more frequent progress updates, too. I already fixed this in the source code.
BM
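For anyone wondering what the missing call means in practice, here is a rough Python sketch of the handshake. The real BOINC API is C/C++ (boinc_checkpoint_completed() is the call named above, and boinc_fraction_done() is the progress call); the ToyClient class and run_app function are made-up stand-ins, not BOINC code, and the "one checkpoint per sky point" schedule is only an assumption based on the stderr lines quoted in this thread.

```python
class ToyClient:
    """Hypothetical stand-in for the core client's bookkeeping (not BOINC code)."""

    def __init__(self):
        self.checkpoints_seen = 0
        self.fraction_done = 0.0

    def checkpoint_completed(self):
        # Plays the role of boinc_checkpoint_completed() in the real C API.
        self.checkpoints_seen += 1

    def report_fraction_done(self, fraction):
        # Plays the role of boinc_fraction_done() in the real C API.
        self.fraction_done = fraction


def run_app(client, n_skypoints=50, notify=True):
    """Process sky points, saving a checkpoint after each one (assumed schedule)."""
    saved_state = None
    for skypoint in range(1, n_skypoints + 1):
        # ... the actual search for this sky point would happen here ...
        saved_state = skypoint              # the app writes its checkpoint
        if notify:
            client.checkpoint_completed()   # the step the current app skips
        client.report_fraction_done(skypoint / n_skypoints)
    return saved_state


fixed, current = ToyClient(), ToyClient()
run_app(fixed, notify=True)
run_app(current, notify=False)
print(fixed.checkpoints_seen, current.checkpoints_seen)  # 50 0
```

In both runs the app itself has a usable checkpoint; the only difference is whether the client ever hears about it.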
Task 236578787 completed and reported. Out of curiosity, I looked through the result lists for my wingmate host 4092109. The Tesla card is returning results just fine, but I couldn't find any CPU results - just some post-deadline 'aborted by [user]', as we've discussed before.
On the checkpoint issue: it won't be visible in the limited amount of debug data returned by the client, but I found these lines in stderr.txt on my P4:
19:29:15 (2376): [normal]: Start of BOINC application 'projects/einstein.phys.uwm.edu/hsgamma_FGRP1_0.16_windows_intelx86.exe'.
% checkpoint read: skypoint 5
% Starting barycentering for sky point 6 / 50
and it seems to be carrying on OK from there.
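If you want to check your own stderr.txt for restarts, something like the following would do it. This is only a sketch: it assumes the "% checkpoint read: skypoint N" line format shown above stays the same, and restart_points is a name I made up for illustration.

```python
import re

# Matches the restart marker as it appears in the stderr.txt excerpt above.
CHECKPOINT_RE = re.compile(r"^% checkpoint read: skypoint (\d+)")


def restart_points(stderr_text):
    """Return the sky point numbers at which the app resumed from a checkpoint."""
    points = []
    for line in stderr_text.splitlines():
        match = CHECKPOINT_RE.match(line.strip())
        if match:
            points.append(int(match.group(1)))
    return points


sample = """\
19:29:15 (2376): [normal]: Start of BOINC application 'projects/einstein.phys.uwm.edu/hsgamma_FGRP1_0.16_windows_intelx86.exe'.
% checkpoint read: skypoint 5
% Starting barycentering for sky point 6 / 50
"""
print(restart_points(sample))  # [5]
```

An empty list means the task ran through without resuming from a checkpoint, at least as far as the local file shows.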
My task from wu 100294686 completed and reported. My quorum partner runs an i7 and has little work in queue, but reported several errors on GW work on July 4 and almost nothing on July 5. So confirmation might take a while.
Reported CPU time on my host was 28,896.21. This host is currently typically taking somewhat over 19,000 seconds on GW work.
The stderr text visible in the task page is quite extensive.
It's detailed, but unfortunately not comprehensive. We only see the last (I think) 64 KiB of a file which will end up at over 750 KiB for a completed run. We join your result, for example, in the middle of the 46th out of 50 skypoints. My benchmark/checkpoint restart happened at skypoint 5, well before the segment we might expect to see displayed.
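As a back-of-envelope check on those numbers, here is a small Python sketch. It assumes the log grows by roughly the same amount for every skypoint (about 15 KiB each for 750 KiB over 50 skypoints), which is my simplification, and first_visible_skypoint is just an illustrative name.

```python
import math


def first_visible_skypoint(total_kib, visible_kib, n_skypoints):
    """Estimate which sky point the truncated stderr view starts in.

    Assumes every sky point writes the same amount of log output.
    """
    kib_per_skypoint = total_kib / n_skypoints
    hidden_kib = max(total_kib - visible_kib, 0)
    # Sky points that are completely hidden, plus the one the view starts in.
    first = math.floor(hidden_kib / kib_per_skypoint) + 1
    return min(first, n_skypoints)


print(first_visible_skypoint(750, 64, 50))  # 46
```

With 686 KiB hidden at about 15 KiB per skypoint, the visible tail starts part-way through skypoint 46, which matches what the task page shows, and a restart at skypoint 5 is far outside that window.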
We'll certainly reduce the verbosity in the future.
BM
I also got a WU here, finished, reported and validated. The runtime of roughly 4h on my machine could be about right but the 920s for my wingman seem odd.
mickydl*
Your wingmate has
% checkpoint read: skypoint 48
so it re-started with just two skypoints to go. The app is failing to tell BOINC about checkpoints, and - perhaps as a consequence? - BOINC isn't remembering time already spent before a break in computing. The 920s will be the time taken for those last two skypoints only.
Actually, it seems to have restarted after skypoint #49 too:
513 s + 409 s = 922 s
Task 236580621 is back from the P4, though again suffering (at least temporarily) from MIA wingman syndrome.
I'm still wondering why the ATLAS node that I'm waiting on for my first validation doesn't seem to be using any of its four CPUs for crunching - I see plenty of valid BRP3cuda32fullCPU results, but all the CPU tasks (which are still being allocated) end up past deadline and self-aborted. Seems a waste of good bandwidth, somehow.