Stuck in last gear

Anonymous

Here is the stderr.txt information:

 

putenv 'LAL_DEBUG_LEVEL=3'
2019-09-21 07:17:06.0394 (41792) [normal]: This program is published under the GNU General Public License, version 2
2019-09-21 07:17:06.0554 (41792) [normal]: For details see http://einstein.phys.uwm.edu/license.php
2019-09-21 07:17:06.2294 (41792) [normal]: This Einstein@home App was built at: Apr  5 2018 14:15:53


2019-09-21 07:17:06.2464 (41792) [normal]: Start of BOINC application 'projects/einstein.phys.uwm.edu/einstein_O2AS20-500_1.01_windows_x86_64.exe'.
Activated exception handling...
2019-09-21 07:17:06.2784 (41792) [debug]: BSGL output files
2019-09-21 07:17:06.2954 (41792) [debug]: Flags: LAL_DEBUG, OPTIMIZE, HS_OPTIMIZATION, GC_SSE2_OPT, X64, SSE, SSE2, GNUC X86 GNUX86
2019-09-21 07:17:06.3184 (41792) [debug]: Set up communication with graphics process.


DEPRECATION WARNING: program has invoked obsolete function XLALGetVersionString(). Please see XLALVCSInfoString() for information about a replacement.
Code-version: %% LAL: 6.18.0.1 (CLEAN f9f1c94b0a4ae84fd5e8c6992235dd36200ffd1b)
%% LALPulsar: 1.16.0.1 (CLEAN f9f1c94b0a4ae84fd5e8c6992235dd36200ffd1b)
%% LALApps: 6.21.0.1 (CLEAN f9f1c94b0a4ae84fd5e8c6992235dd36200ffd1b)


2019-09-21 07:17:09.4986 (41792) [normal]: Reading input data ... 2019-09-21 07:18:10.6433 (41792) [normal]: Search FstatMethod used: 'ResampGeneric'
2019-09-21 07:18:10.7323 (41792) [normal]: Recalc FstatMethod used: 'DemodSSE'
2019-09-21 07:18:28.7082 (41792) [normal]: Number of segments: 64, total number of SFTs in segments: 10190
 done.
% --- GPS reference time = 1177858472.0000 ,  GPS data mid time = 1177858472.0000
2019-09-21 07:18:28.8072 (41792) [normal]: dFreqStack = 3.340013e-006, df1dot = 1.637397e-010, df2dot = 0.000000e+000, df3dot = 0.000000e+000
% --- Setup, N = 64, T = 216000 s, Tobs = 19750204 s, gammaRefine = 500, gamma2Refine = 28226, gamma3Refine = 1


DEPRECATION WARNING: program has invoked obsolete function InitDopplerSkyScan(). Please see XLALInitDopplerSkyScan() for information about a replacement.
2019-09-21 07:18:38.0341 (41792) [normal]: INFO: No checkpoint checkpoint.cpt found - starting from scratch
% --- Cpt:0,  total:13110,  sky:1/690,  f1dot:1/19


0.% --- CG:989248 FG:14971  f1dotmin_fg:-2.724189077486e-009 df1dot_fg:3.268256487026e-013 f2dotmin_fg:0 df2dot_fg:0 f3dotmin_fg:0 df3dot_fg:1
INFO: Major Windows version: 6
c
.........c
.........
1..c
.............c
....
2........c
...........
3.c
..............c
....
4.........c
..........
5...c
..............c
..
6..........c
.........
7...c
..........c
......
8....c
............c
...
9........c
...........
10.c
............c
......
11.....c
...........c
...
12.........c
..........
13.c
.............c
.....
14.....c
............c
..
15...........c
........
16.....c
..............c


17.............c
......
18...c
..........c
......
19....c
............c
...
20........c
...........
21.c
.............c
.....
22......c
.............c


23...........c
........
24...c
............c
....
25........c
..........c
.
26..........c
.........
27.c
............c
......
28.....c
...........c
...
29.........c
..........
30.c
............c
......
31.....c
............c
..
32.........c
..........
33.c
............c
......
34....c
............c
...
35.......c
............c


36.............c
......
37......c
.............
38.c
............c
......
39......c
.............c


40............c
.......
41......c
.............c


42.............c
......
43.....c
..............c


44............c
.......
45......c
.............
46.c
.............c
.....
47........c
...........
48.c
..............c
....
49.........c
..........
50...c
..............c
..
51.......c
........c
....
52...c
........c
........c


53.......c
.........c
...
54......c
.........c
....
55....c
........c
.......
56.c
........c
........c
..
57.....c
........c
......
58.c
.........c
.........
59...c
............c
....
60.........c
..........
61...c
..............c
..
62..........c
.........
63....c
.........c
......
64.c
........c
........c
..
65.....c
........c
......
66.c
........c
.........c
.
67......c
........c
.....
68..c
........c
........c
.
69......c
........c
.....
70..c
........c
........c
.
71......c
........c
.....
72..c
..........c
.......
73...c
...........c
.....
74.....c
........c
......
75..c
.........c
........c


76.......c
........c
....
77....c
.........c
......
78...c
..........c
......
79...c
..........c
......
80..c
..........c
.......
81.c
..........c
........
82.c
.........c
........c
.
83......c
........c
.....
84..c
........c
........c
.
85......c
........c
.....
86..c
........c
........c
.
87......c
........c
.....
88...c
...........c
.....
89.....c
...........c
...
90......c
........c
.....
91..c
........c
........c
.
92......c
........c
.....
93..c
.........c
........c


94.......c
........c
....
95......c
............c
.
96.........c
..........
97.c
........c
........c
..
98.....c
........c
......
99.c
........c
........c
..
100.....c
........c
......
101.c
........c
.........c
.
102......c
........c
.....
103..c
........c
........c
.
104......c
.......c
......
105.c
.........c
........c
.
106......c
........c
.....
107..c
........c
........c
.
108......c
........c
.....
109...c
........c
........c


110.......c
.......c
.....
111..c
........c
.........
112.c
..........c
........
113.c
..........c
........
114.c
.........c
.........
115.c
............c
......
116......c
.............c


117...........c
........
118.....c
..............c


119.............c
......
120......c
.............c


121............c
.......
122.....c
.............c
.
123...
Anonymous

My bad.  I found the right file and am posting it below:

putenv 'LAL_DEBUG_LEVEL=3'
2019-09-15 23:15:51.7006 (45952) [normal]: This program is published under the GNU General Public License, version 2
2019-09-15 23:15:51.7066 (45952) [normal]: For details see http://einstein.phys.uwm.edu/license.php
2019-09-15 23:15:51.7086 (45952) [normal]: This Einstein@home App was built at: Apr  5 2018 14:15:53


2019-09-15 23:15:51.7126 (45952) [normal]: Start of BOINC application 'projects/einstein.phys.uwm.edu/einstein_O2AS20-500_1.01_windows_x86_64.exe'.
Activated exception handling...
2019-09-15 23:15:51.7216 (45952) [debug]: BSGL output files
2019-09-15 23:15:51.7366 (45952) [debug]: Flags: LAL_DEBUG, OPTIMIZE, HS_OPTIMIZATION, GC_SSE2_OPT, X64, SSE, SSE2, GNUC X86 GNUX86
2019-09-15 23:15:51.7496 (45952) [debug]: Set up communication with graphics process.


DEPRECATION WARNING: program has invoked obsolete function XLALGetVersionString(). Please see XLALVCSInfoString() for information about a replacement.
Code-version: %% LAL: 6.18.0.1 (CLEAN f9f1c94b0a4ae84fd5e8c6992235dd36200ffd1b)
%% LALPulsar: 1.16.0.1 (CLEAN f9f1c94b0a4ae84fd5e8c6992235dd36200ffd1b)
%% LALApps: 6.21.0.1 (CLEAN f9f1c94b0a4ae84fd5e8c6992235dd36200ffd1b)


2019-09-15 23:15:55.9306 (45952) [normal]: Reading input data ... 2019-09-15 23:18:13.3769 (45952) [normal]: Search FstatMethod used: 'ResampGeneric'
2019-09-15 23:18:13.3779 (45952) [normal]: Recalc FstatMethod used: 'DemodSSE'
2019-09-15 23:18:32.3088 (45952) [normal]: Number of segments: 64, total number of SFTs in segments: 10190
 done.
% --- GPS reference time = 1177858472.0000 ,  GPS data mid time = 1177858472.0000
2019-09-15 23:18:32.3988 (45952) [normal]: dFreqStack = 3.340013e-006, df1dot = 1.637397e-010, df2dot = 0.000000e+000, df3dot = 0.000000e+000
% --- Setup, N = 64, T = 216000 s, Tobs = 19750204 s, gammaRefine = 500, gamma2Refine = 28226, gamma3Refine = 1


DEPRECATION WARNING: program has invoked obsolete function InitDopplerSkyScan(). Please see XLALInitDopplerSkyScan() for information about a replacement.
putenv 'LAL_DEBUG_LEVEL=3'
2019-09-16 07:13:57.0566 (10348) [normal]: This program is published under the GNU General Public License, version 2
2019-09-16 07:13:57.0566 (10348) [normal]: For details see http://einstein.phys.uwm.edu/license.php
2019-09-16 07:13:57.0722 (10348) [normal]: This Einstein@home App was built at: Apr  5 2018 14:15:53


2019-09-16 07:13:57.0722 (10348) [normal]: Start of BOINC application 'projects/einstein.phys.uwm.edu/einstein_O2AS20-500_1.01_windows_x86_64.exe'.
Activated exception handling...
2019-09-16 07:13:57.0878 (10348) [debug]: BSGL output files
2019-09-16 07:13:59.5567 (10348) [debug]: Flags: LAL_DEBUG, OPTIMIZE, HS_OPTIMIZATION, GC_SSE2_OPT, X64, SSE, SSE2, GNUC X86 GNUX86
2019-09-16 07:13:59.5567 (10348) [debug]: Set up communication with graphics process.


DEPRECATION WARNING: program has invoked obsolete function XLALGetVersionString(). Please see XLALVCSInfoString() for information about a replacement.
Code-version: %% LAL: 6.18.0.1 (CLEAN f9f1c94b0a4ae84fd5e8c6992235dd36200ffd1b)
%% LALPulsar: 1.16.0.1 (CLEAN f9f1c94b0a4ae84fd5e8c6992235dd36200ffd1b)
%% LALApps: 6.21.0.1 (CLEAN f9f1c94b0a4ae84fd5e8c6992235dd36200ffd1b)


2019-09-16 07:14:01.8543 (10348) [normal]: Reading input data ... 2019-09-16 07:15:53.1511 (10348) [normal]: Search FstatMethod used: 'ResampGeneric'
2019-09-16 07:15:53.1511 (10348) [normal]: Recalc FstatMethod used: 'DemodSSE'
2019-09-16 07:16:07.0586 (10348) [normal]: Number of segments: 64, total number of SFTs in segments: 10190
 done.
% --- GPS reference time = 1177858472.0000 ,  GPS data mid time = 1177858472.0000
2019-09-16 07:16:07.0742 (10348) [normal]: dFreqStack = 3.340013e-006, df1dot = 1.637397e-010, df2dot = 0.000000e+000, df3dot = 0.000000e+000
% --- Setup, N = 64, T = 216000 s, Tobs = 19750204 s, gammaRefine = 500, gamma2Refine = 28226, gamma3Refine = 1


DEPRECATION WARNING: program has invoked obsolete function InitDopplerSkyScan(). Please see XLALInitDopplerSkyScan() for information about a replacement.
Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2956933055
RAC: 720994

That's actually very useful. The first text you posted showed the application reaching beyond the startup phase, and embarking on

INFO: Major Windows version: 6
c
.........c
.........
1..c
.............c
....
2........c

which I think is where checkpointing is likely to start.

But the second file you posted never reached that stage. The question for the devs is: why?

Edit - looking at the Gravitational wave results for your computer, roughly half have completed and half have stalled. Someone should look at those too, but it's beyond my knowledge.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117576323228
RAC: 35261047

Many thanks to Richard for showing you how to look at the task properties and for getting you to post the log information for the science app while it is running.

Bobby Conger wrote:
So how do I write the checkpoint that should have been written, or should I just restart the computer?

Checkpoints are created automatically by the running science application when it reaches particular stages of the computation.  It's up to the app to decide when the suitable time has arrived.  The user cannot do this manually.

It's quite fortuitous that you posted the two lots of output.  The first lot is for a task running normally (it would seem) with checkpoints being created.  The rows of dots (.....) show calculation loops being completed and the 'c' usually indicates a point where a checkpoint is created.  I'm not sure that every 'c' actually represents a written checkpoint but at least some of them must.  The second log you posted has no checkpoints.
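
As a rough illustration of reading those markers, the sketch below counts the dots and 'c' characters in the progress section of a stderr.txt. It assumes progress lines contain only an optional leading counter, dots, and an optional trailing 'c', as in the first log above. The function name `summarize_progress` is made up for this example, and a non-zero 'c' count only suggests checkpoint activity; it does not prove a checkpoint file was written.

```python
import re

# Assumption: a progress line is an optional counter, then dots, then an
# optional trailing 'c' (for example "12.........c", "....." or just "c").
PROGRESS_LINE = re.compile(r"\d*\.*c?")


def summarize_progress(stderr_text):
    """Count progress dots and 'c' markers in a science app stderr log."""
    dots = 0
    c_marks = 0
    for line in stderr_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and ordinary log messages.
        if not stripped or not PROGRESS_LINE.fullmatch(stripped):
            continue
        dots += stripped.count(".")
        c_marks += stripped.count("c")
    return {"dots": dots, "c_marks": c_marks}


if __name__ == "__main__":
    sample = "INFO: Major Windows version: 6\nc\n.........c\n.........\n1..c\n"
    print(summarize_progress(sample))
```

Run against the second log, a helper like this should report zero for both counts, which matches the observation that the task never got as far as its first checkpoint.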

If you open each log in separate windows of a text editor and put them side by side on the screen, it makes it very easy to see where the first difference occurs.  The two logs are very similar up to a certain point.  You should be able to see it very easily.  The difference occurs where the 'good' log shows

2019-09-21 07:18:38.0341 (41792) [normal]: INFO: No checkpoint checkpoint.cpt found - starting from scratch

and the 'bad' log shows

putenv 'LAL_DEBUG_LEVEL=3'

This line in the 'bad' log is the very same output text that both logs start with.  It seems like the app is re-launching itself.  The next line contains a time stamp - and that shows a time approximately 8 hours after the original launch time.  In other words, it seems like things got stuck for around 8 hours and then something caused the app to start right back at the beginning.  After relaunch, the app produced a repeat set of log output and then got stuck again at the very same point as last time.  At no stage was a first checkpoint ever written so for all the time this particular task was supposedly running, the 'progress' you were seeing wasn't 'real' - it was just BOINC's 'simulated' progress whilst waiting for the very first checkpoint.
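
The side-by-side comparison can also be sketched in a few lines of Python. This is only an illustration under some assumptions: the time stamp and process ID format is taken from the logs posted in this thread, the launch marker is the `putenv 'LAL_DEBUG_LEVEL=3'` line, and the helper names (`normalize`, `first_difference`, `find_launches`) are invented for the example.

```python
import re
from datetime import datetime

LAUNCH_MARKER = "putenv 'LAL_DEBUG_LEVEL=3'"
# Matches "2019-09-15 23:15:51.7006 (45952)": time stamp, fraction, process ID.
STAMP = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ \((\d+)\)")


def normalize(line):
    """Strip time stamps and process IDs so two runs can be compared."""
    return STAMP.sub("", line).strip()


def first_difference(good_text, bad_text):
    """Return (index, good_line, bad_line) at the first mismatch, or None."""
    good = [normalize(l) for l in good_text.splitlines() if l.strip()]
    bad = [normalize(l) for l in bad_text.splitlines() if l.strip()]
    for index, (g, b) in enumerate(zip(good, bad)):
        if g != b:
            return index, g, b
    return None


def find_launches(log_text):
    """Return (start time, process ID) for every launch found in one log."""
    launches = []
    awaiting_stamp = False
    for line in log_text.splitlines():
        if line.strip() == LAUNCH_MARKER:
            awaiting_stamp = True
            continue
        match = STAMP.match(line)
        if awaiting_stamp and match:
            started = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            launches.append((started, int(match.group(2))))
            awaiting_stamp = False
    return launches
```

Calling `find_launches` on the 'bad' log should return two entries with different process IDs, and subtracting the two start times gives the gap of roughly 8 hours described above.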

There are no other people that I've noticed complaining about this type of behaviour for this particular app.  It seems rather likely to be specific to your computer.  I suspect it will need someone with physical access to your machine to work out what might be causing it.

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117576323228
RAC: 35261047

It took me a while to research the logs and compose an answer so I hadn't noticed that Richard had also replied in the meantime.

He raises a point that I was also wondering about - some tasks do finish and some don't, maybe in a 50/50 ratio.  Your machine is a rather old dual core (Core 2 Duo).  I have no idea if this is possible but could there be a problem with one of the cores?  Maybe you need to look into testing the processor cores somehow.

Cheers,
Gary.

Matt White
Joined: 9 Jul 19
Posts: 120
Credit: 280798376
RAC: 0

Gary Roberts wrote:

It took me a while to research the logs and compose an answer so I hadn't noticed that Richard had also replied in the meantime.

He raises a point that I was also wondering about - some tasks do finish and some don't, maybe in a 50/50 ratio.  Your machine is a rather old dual core (Core 2 Duo).  I have no idea if this is possible but could there be a problem with one of the cores?  Maybe you need to look into testing the processor cores somehow.

It is possible to force the machine to run on a single core, either in the BIOS or, if I remember correctly, in the Windows environment settings (CPU = 1). Which core runs a task can be set using the affinity setting in Task Manager.

One of my boxes is a Core 2 Duo, which has run the CPU GW task with no issues, so there is something going on. The methods listed above might help you determine if one of the cores is causing the problem.

Clear skies,
Matt
archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7221554931
RAC: 967170

Matt White wrote:
It is possible to force the machine to run on a single core, either in the BIOS or, if I remember correctly, in the Windows environment settings (CPU = 1). Which core runs a task can be set using the affinity setting in Task Manager.

While Windows Process Explorer allows you to assign a running task to run on only a specified CPU (that is the term the interface uses, not core), the setting does not apply to the next task.  So if you want to try running on just one for a day, then the other for a day, you may wish to download and install Process Lasso, which several of us use to set CPU affinities (and also priorities) for various tasks to tune Einstein operation efficiency versus system impact.
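
For background, Windows stores a process's CPU affinity as a bitmask in which bit n stands for logical CPU n, so on a dual core machine "CPU 0 only" is mask 1 and "CPU 1 only" is mask 2. The small sketch below only converts between CPU lists and masks to show the idea; the function names are invented for this example and it does not change the affinity of any process.

```python
def affinity_mask(cpus):
    """Build an affinity bitmask from a list of logical CPU numbers."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must not be negative")
        mask |= 1 << cpu  # bit n represents logical CPU n
    return mask


def cpus_from_mask(mask):
    """List the logical CPU numbers enabled in an affinity bitmask."""
    return [n for n in range(mask.bit_length()) if (mask >> n) & 1]


if __name__ == "__main__":
    for cpus in ([0], [1], [0, 1]):
        print(cpus, "->", hex(affinity_mask(cpus)))
```

For the one-core-per-day test suggested here, the two masks of interest on a Core 2 Duo would be the ones for `[0]` and `[1]`.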

Matt White
Joined: 9 Jul 19
Posts: 120
Credit: 280798376
RAC: 0

archae86 wrote:
So if you want to try running on just one for a day, then the other for a day, you may wish to download and install Process Lasso, which several of us use to set CPU affinities (and also priorities) for various tasks to tune Einstein operation efficiency versus system impact.

Didn't know about that one, good tip!

Clear skies,
Matt
Anonymous

Thanks for the tips.  I'll check the cores and do some more plugging around.  Thanks again.

Anonymous

I decided to remove and re-install the software; however, I cannot get a project to work on.  The website shows me as logged in, but I cannot get a project to start over again or to pick up the same project.  Please help.
