Unexpected XML tag or syntax

Hartmut Geissbauer
Joined: 5 Jan 06
Posts: 31
Credit: 152941307
RAC: 0


Message 98657 in response to message 98655

OK, I've done a reset on the project, but I still have the same problem.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117659089395
RAC: 35178894


Message 98658 in response to message 98657

Quote:
OK, I've done a reset on the project, but I still have the same problem.


I'm sorry, but I was just about to post that resetting is likely to be useless.

By chance, I've found one of my machines that has been on NNT (no new tasks) since before the problem started. It's been returning completed work quite happily and running down its cache completely oblivious to the bad sched_reply files and the associated error messages. As an experiment, I removed NNT and allowed it to request work - which of course failed dismally and the machine has now joined the others with this problem.

There is still something to be fixed on the servers so people should take no action like resetting, which will just trash what remains of your cache without fixing anything.

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117659089395
RAC: 35178894


OK, I've now found another machine that last requested work at 00:27 UTC on 21st July, which was before the problem started. It has returned completed tasks but hasn't requested new work, so the messages log contains no errors yet.

I'll keep this one in reserve (NNT is now set) until there's something further to test.

Cheers,
Gary.

rob (Bruiser)
Joined: 11 May 10
Posts: 3
Credit: 1750913
RAC: 0


Just throwing my voice in as I am also experiencing the same issue. My BOINC version is 6.10.18, which I have installed on two XP workstations. Off the top of my head I cannot remember which version I am running on my Linux laptop at home.

The problem is only occurring on one of my XP workstations. The Linux laptop and the 2nd XP workstation are working just fine.

Very bizarre.

MarkJ
Joined: 28 Feb 08
Posts: 437
Credit: 139002861
RAC: 0


I too am getting this on one machine. Scheduler requests for other projects on the same machine seem to work fine. BOINC version is 6.10.58 and OS is Win7 x64.

Log snippet

Quote:
22/07/2010 9:46:26 PM Einstein@Home Sending scheduler request: To report completed tasks.
22/07/2010 9:46:26 PM Einstein@Home Reporting 12 completed tasks, not requesting new tasks
22/07/2010 9:46:50 PM [error] Task h1_1087.80_S5R4__701_S5GC1a: bad command line
22/07/2010 9:46:50 PM Einstein@Home [error] Can't parse workunit in scheduler reply: unexpected XML tag or syntax
22/07/2010 9:46:50 PM Einstein@Home [error] No close tag in scheduler reply
22/07/2010 9:47:51 PM Einstein@Home Sending scheduler request: To report completed tasks.
22/07/2010 9:47:51 PM Einstein@Home Reporting 12 completed tasks, not requesting new tasks
22/07/2010 9:48:14 PM [error] Task h1_1087.80_S5R4__701_S5GC1a: bad command line
22/07/2010 9:48:14 PM Einstein@Home [error] Can't parse workunit in scheduler reply: unexpected XML tag or syntax
22/07/2010 9:48:14 PM Einstein@Home [error] No close tag in scheduler reply
22/07/2010 9:48:25 PM climateprediction.net Sending scheduler request: To send trickle-up message.
22/07/2010 9:48:25 PM climateprediction.net Not reporting or requesting tasks
22/07/2010 9:48:25 PM GPUGRID update requested by user
22/07/2010 9:48:27 PM climateprediction.net Scheduler request completed
22/07/2010 9:48:32 PM GPUGRID Sending scheduler request: Requested by user.
22/07/2010 9:48:32 PM GPUGRID Reporting 2 completed tasks, not requesting new tasks
22/07/2010 9:48:34 PM GPUGRID Scheduler request completed

Sched_Reply snippet

Quote:


l1_1088.30_S5R7
l1_1088.30_S5R7

--Freq=1088.0674671 --FreqBand=0.05 --dFreq=6.71056161393e-06 --f1dot=-2.64248266531e-09 --f1dotBand=2.90673093185e-09 --df1dot=5.77553186099e-10 --skyGridFile=skygrid_1090Hz_S5GC1.dat --numSkyPartitions=797 --partitionIndex=701 --tStack=90000 --nStacksMax=205 --gammaRefine=1399 --ephemE=earth --ephemJX¦ÃIDL



earth_05_09
http://einstein.ligo.caltech.edu/download/3e9/earth_05_09

As the error message says, the command_line tag seems to have lost its close tag.
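For anyone who wants to check their own sched_reply file for the same symptom, here is a minimal sketch that counts opening and closing tags of one name and reports the difference. The helper name `unclosed_tags` and the two sample strings are made up for illustration (they are not copied from a real reply), and the check is a plain text count, not a real XML parse.

```python
import re


def unclosed_tags(text, tag):
    """Return how many <tag> openings have no matching </tag> in text."""
    name = re.escape(tag)
    opens = len(re.findall(rf"<{name}>", text))
    closes = len(re.findall(rf"</{name}>", text))
    return opens - closes


# Hypothetical fragments for illustration only.
good = "<command_line>--Freq=1088.0 --ephemE=earth</command_line>"
bad = "<command_line>--Freq=1088.0 --ephemE=earth --ephemJ"

print(unclosed_tags(good, "command_line"))  # 0
print(unclosed_tags(bad, "command_line"))   # 1
```

To run it on a real file you would read the file contents into a string first and pass that in; a non-zero result for command_line would match the "No close tag" error in the log above.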

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117659089395
RAC: 35178894


Message 98662 in response to message 98660

Quote:

The problem is only occurring on one of my XP workstations. The Linux laptop and the 2nd XP workstation are working just fine.

Very bizarre.


Not really bizarre since the problem seems to be related to the size of the whole ... block that the scheduler is attempting to create and insert into the sched_reply response that is being sent to your client. The number of data files that need to be handled increases with increasing frequency so I bet if you look at the actual frequency in the task names being done on different hosts you will see a pattern.

Those that are failing will have frequencies in the task names probably around the 1000.xx to 1200.xx range. All of mine are around 1140.xx. Those hosts that don't seem to be having problems will probably be doing tasks at much lower frequencies - something like 500.xx to 800.xx for example. These numbers are only guesses at this stage. I don't know the real transition point.

As an example, take a look at MarkJ's log snippet (in the next message to yours) for a host showing the problem. The frequency visible there is high - 1087.80 Hz. Also take a look at the opening post in this thread. Also a high frequency of 1105.75.
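If you want to compare hosts quickly, the frequency can be pulled out of the task names with a few lines of Python. This is only a sketch: the regular expression assumes names shaped like the ones quoted in this thread (detector prefix, underscore, frequency, underscore), the function names are invented here, and the 1000.0 default is just the rough guess from this post, not a known transition point.

```python
import re

# Assumed name layout, based on the task names quoted in this thread:
# one letter, one digit, underscore, frequency, underscore, rest.
TASK_FREQ = re.compile(r"^[a-z]\d_(\d+\.\d+)_")


def task_frequency(name):
    """Return the frequency embedded in a task name, or None if absent."""
    match = TASK_FREQ.match(name)
    return float(match.group(1)) if match else None


def likely_affected(name, threshold=1000.0):
    """Guess whether a task is in the high-frequency range (threshold is a guess)."""
    freq = task_frequency(name)
    return freq is not None and freq >= threshold


print(task_frequency("h1_1087.80_S5R4__701_S5GC1a"))   # 1087.8
print(likely_affected("h1_1087.80_S5R4__701_S5GC1a"))  # True
```

Running it over the task names on a working host and a failing host should show whether the pattern holds.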

The really bizarre thing is why this suddenly started about 1.5 days ago when everything was fine with large frequencies before that. My hosts have done thousands of 'high frequency' tasks over the last month or so before this issue suddenly arose.

Cheers,
Gary.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250547862
RAC: 34425


One of the few things we changed in the last few days was adding new download mirrors, which probably lengthened the server reply quite a bit. Maybe we hit a limit there.

I have shortened the list on the server again by removing two mirrors. Please have another go at it.

BM


Grenadier
Joined: 9 Feb 05
Posts: 14
Credit: 2823344
RAC: 0


Success! I had 2 hosts with the problem, and both just cleared out their pending result reporting, with the expected message about them having already reported. One that needed work has also just downloaded work.

Thanks for the quick help on this.

Hartmut Geissbauer
Joined: 5 Jan 06
Posts: 31
Credit: 152941307
RAC: 0


Message 98665 in response to message 98663

That fixed it for me too.
Thank you very much.

rob (Bruiser)
Joined: 11 May 10
Posts: 3
Credit: 1750913
RAC: 0


Worked for me also. All is well now. Thanks.
