One of my hosts is getting the following error message when trying to report a result.
7/21/2010 10:11:35 AM [error] Task h1_1105.75_S5R4__737_S5GC1a: bad command line
7/21/2010 10:11:35 AM [error] Can't parse workunit in scheduler reply: unexpected XML tag or syntax
Me too.
Thanks for the informative updates, Gary. I guess I'll just have to stop checking my status for a while and find something useful to do. ;-)
Cheers;
Peter.
Sorry for the late reply; I've been quite busy the last few days.
Could anyone affected please post or send me his file sched_reply_einstein.phys.uwm.edu.xml from the BOINC directory?
BM
I already posted the important bit a few messages earlier :-).
Here is a bigger snippet. It contains the workunit specification for a task that is failing to be sent to one of my hosts; the specification is truncated part way through one block, with the following block joined in at the point of truncation. Every example I've looked at so far looks pretty much the same - the garbage chars start at an identical point. Make sure you scroll to the right to see the garbage at the end of the string.
EDIT: I've highlighted the string in red - color tags don't work inside code blocks so I had to break the snippet into 3 code blocks to get the important one to show in red.
64775518345837.898438
1295510366916760.000000
251658240.000000
100000000.000000
h1_1140.15_S5R4__673_S5GC1a
einstein_S5GC1
earth_05_09
earth
sun_05_09
sun
skygrid_1150Hz_S5GC1.dat
skygrid_1150Hz_S5GC1.dat
h1_1140.15_S5R4
h1_1140.15_S5R4
h1_1140.15_S5R7
h1_1140.15_S5R7
l1_1140.15_S5R4
l1_1140.15_S5R4
l1_1140.15_S5R7
l1_1140.15_S5R7
h1_1140.20_S5R4
h1_1140.20_S5R4
h1_1140.20_S5R7
h1_1140.20_S5R7
l1_1140.20_S5R4
l1_1140.20_S5R4
l1_1140.20_S5R7
l1_1140.20_S5R7
h1_1140.25_S5R4
h1_1140.25_S5R4
h1_1140.25_S5R7
h1_1140.25_S5R7
l1_1140.25_S5R4
l1_1140.25_S5R4
l1_1140.25_S5R7
l1_1140.25_S5R7
h1_1140.30_S5R4
h1_1140.30_S5R4
h1_1140.30_S5R7
h1_1140.30_S5R7
l1_1140.30_S5R4
l1_1140.30_S5R4
l1_1140.30_S5R7
l1_1140.30_S5R7
h1_1140.35_S5R4
h1_1140.35_S5R4
h1_1140.35_S5R7
h1_1140.35_S5R7
l1_1140.35_S5R4
l1_1140.35_S5R4
l1_1140.35_S5R7
l1_1140.35_S5R7
h1_1140.40_S5R4
h1_1140.40_S5R4
h1_1140.40_S5R7
h1_1140.40_S5R7
l1_1140.40_S5R4
l1_1140.40_S5R4
l1_1140.40_S5R7
l1_1140.40_S5R7
h1_1140.45_S5R4
h1_1140.45_S5R4
h1_1140.45_S5R7
h1_1140.45_S5R7
l1_1140.45_S5R4
l1_1140.45_S5R4
l1_1140.45_S5R7
l1_1140.45_S5R7
h1_1140.50_S5R4
h1_1140.50_S5R4
h1_1140.50_S5R7
h1_1140.50_S5R7
l1_1140.50_S5R4
l1_1140.50_S5R4
l1_1140.50_S5R7
l1_1140.50_S5R7
h1_1140.55_S5R4
h1_1140.55_S5R4
h1_1140.55_S5R7
h1_1140.55_S5R7
l1_1140.55_S5R4
l1_1140.55_S5R4
l1_1140.55_S5R7
l1_1140.55_S5R7
h1_1140.60_S5R4
h1_1140.60_S5R4
h1_1140.60_S5R7
h1_1140.60_S5R7
l1_1140.60_S5R4
l1_1140.60_S5R4
l1_1140.60_S5R7
l1_1140.60_S5R7
h1_1140.65_S5R4
h1_1140.65_S5R4
h1_1140.65_S5R7
h1_1140.65_S5R7
l1_1140.65_S5R4
l1_1140.65_S5R4
l1_1140.65_S5R7
l1_1140.65_S5R7
635cac8b004a0da0ccb0ccd4cb503deb2dcd06fec4c0e217204fc4d505a3cf10
32d44c8611f7fe6000291638631b1e03553954e0f8dd80a608c5610b626ac6ce
e9e705d720d2a9e274fef0cfbafb297f4a0337ba46327c996555fa8d602c372a
ad5944dc403141251de16bbffe3d0881451788d35ab72de774df44b76c08e43d
Cheers,
Gary.
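A quick way to pin down where the garbage starts is to scan the saved reply for the first byte that is neither printable ASCII nor ordinary whitespace. Below is a minimal standalone sketch of such a check; it is not BOINC code, and it simply assumes you point it at the sched_reply_einstein.phys.uwm.edu.xml sitting in the BOINC directory.
[code]
// Standalone sketch (not BOINC code): report the offset of the first byte in a
// scheduler reply that is neither printable ASCII nor ordinary whitespace.
// Build: g++ -o scan_reply scan_reply.cpp
// Usage: ./scan_reply sched_reply_einstein.phys.uwm.edu.xml
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>

int main(int argc, char** argv) {
    const char* path = argc > 1 ? argv[1] : "sched_reply_einstein.phys.uwm.edu.xml";
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        std::cerr << "cannot open " << path << "\n";
        return 1;
    }
    std::string data((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
    for (std::string::size_type i = 0; i < data.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(data[i]);
        bool ok = (c == '\n' || c == '\r' || c == '\t' || (c >= 0x20 && c < 0x7f));
        if (!ok) {
            std::cout << "first suspicious byte 0x" << std::hex << int(c)
                      << " at offset " << std::dec << i << "\n";
            return 0;
        }
    }
    std::cout << "no suspicious bytes in " << data.size() << " bytes\n";
    return 0;
}
[/code]
If every affected host reports the junk starting at the same offset inside the same element, that points to a fixed-size limit somewhere upstream rather than random corruption in transit.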
Looking at the workunit h1_1105.75_S5R4__737_S5GC1a, I see that some host has already finished this workunit successfully, but with a pretty old Core Client (5.4.11). What Client versions does this error happen on?
Quote:
I googled around, and found that Primegrid had a similar problem. They ended up downgrading the scheduler to fix it. Did Einstein recently upgrade to a newer scheduler and get the same bug?
We didn't change the scheduler recently.
The problem could have been there for a while: the workunit generator of GC1 shifts from lower (analysis) frequencies to higher ones. The higher the frequency of a task, the more data is required, which means more data files and thus longer command lines. It's only now that our GC1 workunits have hit a buffer limit.
BM
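The failure pattern described above, a value cut off at a consistent point with the next block butted up against the cut, is what a fixed-size buffer produces once the string being assembled outgrows it. Here is a hypothetical sketch of that mechanism only; it is not the Einstein@Home workunit generator or scheduler code, and the buffer size, option names and file count are invented for illustration.
[code]
// Hypothetical sketch of the failure mode, not project code: assembling a
// command line for a growing list of data files into a fixed-size buffer
// silently drops the tail once the list no longer fits.
#include <algorithm>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

int main() {
    // Higher analysis frequency -> more data files (names modelled on the snippet above).
    std::vector<std::string> files;
    for (int i = 0; i <= 10; ++i) {
        char name[64];
        std::snprintf(name, sizeof(name), "h1_1140.%02d_S5R4", 15 + 5 * i);
        files.push_back(name);
        std::snprintf(name, sizeof(name), "l1_1140.%02d_S5R7", 15 + 5 * i);
        files.push_back(name);
    }

    char cmdline[256];  // fixed-size buffer; the real limit only needs to exist somewhere
    size_t used = std::snprintf(cmdline, sizeof(cmdline),
                                "--skyGrid skygrid_1150Hz_S5GC1.dat");
    for (const std::string& f : files) {
        size_t off = std::min(used, sizeof(cmdline) - 1);
        // snprintf never overruns the buffer, but it silently truncates: once
        // 'used' reaches the buffer size, every later argument simply vanishes.
        used += std::snprintf(cmdline + off, sizeof(cmdline) - off, " --data %s", f.c_str());
    }
    std::printf("wanted %zu characters, buffer kept only %zu:\n%s\n",
                used, std::strlen(cmdline), cmdline);
    return 0;
}
[/code]
Whether a buffer like this sits in the workunit generator, the scheduler, or the client is exactly what the downgrade experiments further down are trying to decide.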
Thanks, Gary.
What I was missing was the name (or ID) of the workunit this command line belonged to, so I could track it through the system.
We are surely hitting a buffer limit, but my suspicion is that this limit is actually in the Core Client, not on the server side.
Would anyone try a downgrade?
BM
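If the limit is on the client side, as suspected here, the sched_reply file on disk would look fine and the value would only get mangled once the client copies it into a buffer of its own while parsing. A hypothetical illustration of that case follows; it is not the real BOINC parsing code, just a sketch of why an oversized <command_line> could trip a fixed client-side buffer.
[code]
// Hypothetical illustration, not the real BOINC parser: copying an XML
// element's contents into a fixed-size buffer fails for any value longer than
// the buffer, even when the scheduler reply on disk is perfectly intact.
#include <cstring>
#include <iostream>
#include <string>

// Copy the text between <tag> and </tag> into out[0..out_len-1].
// Returns false if the element is missing or does not fit.
static bool copy_element(const std::string& xml, const std::string& tag,
                         char* out, size_t out_len) {
    const std::string open = "<" + tag + ">", close = "</" + tag + ">";
    std::string::size_type b = xml.find(open);
    if (b == std::string::npos) return false;
    b += open.size();
    std::string::size_type e = xml.find(close, b);
    if (e == std::string::npos) return false;
    std::string::size_type n = e - b;
    if (n + 1 > out_len) return false;  // value longer than the buffer
    std::memcpy(out, xml.data() + b, n);
    out[n] = '\0';
    return true;
}

int main() {
    // A command line as long as the new GC1 tasks produce.
    std::string reply = "<workunit><command_line>" + std::string(2000, 'x') +
                        "</command_line></workunit>";
    char cmdline[1024];  // fixed client-side buffer, too small for the value
    if (!copy_element(reply, "command_line", cmdline, sizeof(cmdline))) {
        std::cout << "bad command line\n";  // would mirror the symptom in the log above
    }
    return 0;
}
[/code]
Comparing the raw sched_reply file against what the client then complains about should separate the two cases: garbage already present in the file means the truncation happened on the server side, while a clean file plus a parse error would point at the client.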
The machines I'm seeing this on are 6.10.56 but I've just also seen it on a 6.2.15 linux host. That's pretty old anyway so how far back do we need to go?
I used to use 5.10.45 but that version (although supposedly stable) had a pretty serious bug that caused me to lose many hundreds of tasks during a couple of server outages. I'm reluctant to go back there - it was the end of version 5 as I recall. Maybe all version 6 BOINCs will have this so ... any suggestions on what version you want me to try?
Cheers,
Gary.
Yep, that's what I'm wondering, too.
No suggestions yet. I wrote to the BOINC developers, but they won't be up for another six hours or so.
If this is a problem to be fixed in the client, it will take some time to do so, and all 6.x Clients would probably have it.
BM
I downgraded a machine to 5.10.45 and the exact same problem still shows up on that version as well :-(.
EDIT:
I've been running tasks for frequencies between 1139.65 and 1141.30 on a substantial group of machines for around a month now. Thousands of tasks have been done without any issues until now.
Logic says that it can't be the BOINC client or else why have I been able to work at these high frequencies for so long?
Unfortunately, logic also says that it's likely that something has changed in how new tasks for these frequencies are being generated.
I've found a 5.8.13 BOINC and installed it. Still exactly the same problem with exactly the same sched_reply.
Cheers,
Gary.
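One way to test the point that something has changed in how the new tasks are generated would be to measure whether their command lines really are longer than those of the tasks that completed over the past month. The sketch below just prints the length of every <command_line> value in a saved file; pointing it at client_state.xml (or at a kept sched_reply) is an assumption about where such values can be found, not a statement about the file format beyond that one element name.
[code]
// Standalone sketch, not BOINC code: print the length of every <command_line>
// value found in a file such as client_state.xml or a saved sched_reply, to
// compare tasks that fail against tasks that used to run fine.
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>

int main(int argc, char** argv) {
    const char* path = argc > 1 ? argv[1] : "client_state.xml";
    std::ifstream in(path, std::ios::binary);
    if (!in) { std::cerr << "cannot open " << path << "\n"; return 1; }
    std::string xml((std::istreambuf_iterator<char>(in)),
                    std::istreambuf_iterator<char>());

    const std::string open = "<command_line>", close = "</command_line>";
    std::string::size_type pos = xml.find(open);
    while (pos != std::string::npos) {
        pos += open.size();
        std::string::size_type end = xml.find(close, pos);
        if (end == std::string::npos) break;
        std::cout << "command line of " << (end - pos) << " characters\n";
        pos = xml.find(open, end + close.size());
    }
    return 0;
}
[/code]
If the failing tasks show clearly longer command lines than the ones that worked, that supports the buffer-limit explanation over a client regression.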
There's one little thing we changed in the project configuration two days ago that I thought to be completely unrelated.
Please update the project (possibly needs a project reset to work).
BM
I've just reinstalled 6.10.58 on the machine I'd taken back to 5.8.13. After restarting BOINC and trying an update, the same error messages are reported.
I have a bunch of machines with work in progress and I'm not prepared to lose what has been done over the period this problem has been around. There'd be more than 100 tasks involved. Sure, most have probably already been reported even though each client thinks to the contrary. Still, there's the work in progress and I don't particularly want to junk my caches.
Is there a manual edit to the state file that could achieve the same result as a full reset?
Cheers,
Gary.