new units not downloading

kenlo
Joined: 1 Jun 05
Posts: 16
Credit: 28,206
RAC: 0
Topic 189428

new h1 units not downloading

kenlo

Thierry Van Driessche
Joined: 9 Feb 05
Posts: 210
Credit: 229,929
RAC: 0

Quote:
new h1 units not downloading


Please post any relevant messages from the Messages tab of BOINC.

Greetings from Belgium
Thierry

kenlo
Joined: 1 Jun 05
Posts: 16
Credit: 28,206
RAC: 0

Quote:
new h1 units not downloading


06/27/05 19:02:34||Starting BOINC client version 4.43 for windows_intelx86
06/27/05 19:02:34||Data directory: D:\Program Files\BOINC
06/27/05 19:02:35|Einstein@Home|Computer ID: 307979; location: home; project prefs: default
06/27/05 19:02:35|orbit@home|Computer ID: 682; location: home; project prefs: default
06/27/05 19:02:35||General prefs: from Einstein@Home (last modified 2005-06-13 13:31:31)
06/27/05 19:02:35||General prefs: no separate prefs for home; using your defaults
06/27/05 19:02:35||Remote control not allowed; using loopback address
06/27/05 19:02:35|Einstein@Home|Resuming computation for result H1_0326.5__0326.9_0.1_T21_Fin1_2 using einstein version 4.79
06/27/05 19:02:35|orbit@home|Deferring communication with project for 14 hours, 48 minutes, and 26 seconds
06/27/05 19:02:35|Einstein@Home|Started download of h1_0326.5
06/27/05 19:02:35||schedule_cpus: must schedule
06/27/05 19:02:49|Einstein@Home|Temporarily failed download of h1_0326.5: 416
06/27/05 19:02:52|Einstein@Home|Started download of h1_0326.5
06/27/05 19:03:03|Einstein@Home|Temporarily failed download of h1_0326.5: 416
06/27/05 19:03:06|Einstein@Home|Started download of h1_0326.5

kenlo

Ulrich Metzner
Joined: 22 Jan 05
Posts: 113
Credit: 963,370
RAC: 0

Here is an excerpt from the Proxomitron log:

+++GET 30654+++
GET /download/38/h1_0205.0 HTTP/1.0
User-Agent: BOINC client
Host: einstein.astro.gla.ac.uk:80
Range: bytes=14736000-
Accept: */*
Connection: keep-alive

+++RESP 30654+++
HTTP/1.0 416 Requested Range Not Satisfiable
Date: Mon, 27 Jun 2005 23:41:50 GMT
Server: Apache/2.0.54 (Debian GNU/Linux) DAV/2 SVN/1.1.4 mod_python/3.1.3 Python/2.3.5 PHP/4.3.10-15 mod_ssl/2.0.54 OpenSSL/0.9.7e mod_perl/1.999.21 Perl/v5.8.4
Keep-Alive: timeout=15, max=89
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=iso-8859-1
+++CLOSE 30654+++

+++GET 30655+++
GET /download/38/h1_0205.0 HTTP/1.0
User-Agent: BOINC client
Host: einstein.astro.gla.ac.uk:80
Range: bytes=14736000-
Accept: */*
Connection: keep-alive

+++RESP 30655+++
HTTP/1.0 416 Requested Range Not Satisfiable
Date: Mon, 27 Jun 2005 23:41:54 GMT
Server: Apache/2.0.54 (Debian GNU/Linux) DAV/2 SVN/1.1.4 mod_python/3.1.3 Python/2.3.5 PHP/4.3.10-15 mod_ssl/2.0.54 OpenSSL/0.9.7e mod_perl/1.999.21 Perl/v5.8.4
Keep-Alive: timeout=15, max=88
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=iso-8859-1
+++CLOSE 30655+++

Something is wrong with the file size!
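The 416 responses above are what an HTTP server returns when the starting offset in the Range header is at or past the end of the file it has. A minimal sketch of that check, assuming the client derives the offset from the size of its local partial file (the log suggests this but does not prove it); `resume_range_header` and `range_status` are hypothetical helper names, not BOINC or Apache functions:

```python
from typing import Optional


def resume_range_header(local_bytes: int) -> Optional[str]:
    """Build an open-ended Range header for resuming at local_bytes."""
    if local_bytes <= 0:
        return None  # nothing on disk yet, so ask for the whole file
    return f"bytes={local_bytes}-"


def range_status(server_size: int, range_header: Optional[str]) -> int:
    """Status a server would give for an open-ended byte range request."""
    if range_header is None:
        return 200  # plain GET, full body
    # "bytes=14736000-" -> 14736000
    first_byte = int(range_header.split("=", 1)[1].rstrip("-"))
    # The range is satisfiable only if it starts inside the file.
    return 206 if first_byte < server_size else 416


header = resume_range_header(14736000)
print(header, range_status(14736000, header))
```

So if the local copy is already as large as (or larger than) the file on the server, every resume attempt with that offset would get 416 until the local file is removed or the download is restarted from zero.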

Aloha, Uli

Ananas
Joined: 22 Jan 05
Posts: 272
Credit: 2,500,681
RAC: 0

I had the same problem just now and I had to reset the project on that PC.

The reason:

It had two download tasks running on exactly the same file. (h1_0400.0)

One was downloaded successfully with the expected file size and the other downloader "wondered where those bytes all came from" and reported a file size error too with a retry every few seconds.

BOINC 4.19, Dual CPU P3s

After the reset it downloaded everything successfully, but it still shows "download failed". Nothing is missing, but I guess I cannot let BOINC have two files with the same filename ;-)

There must be something damaged on the server/scheduler side or in the WU XML config.

Ananas
Joined: 22 Jan 05
Posts: 272
Credit: 2,500,681
RAC: 0

After a reset I got an H1_501.0.

Same problem at first, but then, after a successful(!) transfer of H1_501.0, BOINC got a request to delete H1_501.0 while it was still downloading H1_501.0 on the other download thread.

Of course the client didn't like that too much either. Now there's a checksum error, 2 tasks are crunching and a few are still in the "downloading" state.

Very weird!

Ananas
Joined: 22 Jan 05
Posts: 272
Credit: 2,500,681
RAC: 0

The story continues: after I manually contacted the scheduler to report the error, it tried to delete H1_501.0.

BOINC was very sad and told me it couldn't delete H1_501.0 ... but the work units are happy now and are not trying to download H1_501.0 again (as it's still there, of course).
___________

I guess it's the WU configuration that is wrong. The scheduler request, which I saved after the first problem, had this in it:


H1_0400.0



h1_0400.0

i.e. twice the same stuff
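A minimal sketch of how one might check a list of file names from a saved scheduler request for this kind of duplicate. It assumes the names have already been pulled out of the XML into a plain list, and it treats names that differ only in letter case as the same file, which is how default Windows file systems behave. `find_colliding_names` is a hypothetical helper, not part of BOINC:

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def find_colliding_names(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group names that are identical once letter case is ignored."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    # Keep only the groups with more than one entry.
    return {key: found for key, found in groups.items() if len(found) > 1}


print(find_colliding_names(["H1_0400.0", "h1_0400.0"]))
```

Two entries that land in the same group would end up as the same file on disk, which would fit two download tasks fighting over one file.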

I would rate this as a critical problem

Robert Nelson
Joined: 19 Mar 05
Posts: 5
Credit: 99,737,918
RAC: 88,339

Message 13512 in response to message 13511

Quote:

I would rate this as a critical problem


Same here. I just caught one machine in an endless loop; here is an excerpt:
6/27/2005 8:16:21 PM|Einstein@Home|Temporarily failed download of h1_0673.0: 416
6/27/2005 8:16:23 PM|Einstein@Home|Couldn't delete file projects/einstein.phys.uwm.edu/h1_0673.0
6/27/2005 8:16:24 PM|Einstein@Home|Started download of h1_0673.0
6/27/2005 8:16:25 PM|Einstein@Home|Temporarily failed download of h1_0673.0: 416
6/27/2005 8:16:27 PM|Einstein@Home|Couldn't delete file projects/einstein.phys.uwm.edu/h1_0673.0
6/27/2005 8:16:27 PM|Einstein@Home|Started download of h1_0673.0
6/27/2005 8:16:28 PM|Einstein@Home|Temporarily failed download of h1_0673.0: 416
6/27/2005 8:16:30 PM|Einstein@Home|Couldn't delete file projects/einstein.phys.uwm.edu/h1_0673.0
6/27/2005 8:16:30 PM|Einstein@Home|Started download of h1_0673.0
6/27/2005 8:16:31 PM|Einstein@Home|Temporarily failed download of h1_0673.0: 416
It went on until I aborted the transfer, which appears to have killed the issue. This machine is running the Einstein beta and BOINC 4.45 for Windows.
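A loop like the one in the excerpt above can be spotted by counting the "Temporarily failed download" messages per file. This is only a sketch: it assumes the messages were saved as text lines in the `time|project|message` layout shown in this thread, and `count_failed_downloads` is a hypothetical helper, not a BOINC tool:

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Matches e.g. "Temporarily failed download of h1_0673.0: 416"
FAILED = re.compile(r"Temporarily failed download of (\S+): (\d+)")


def count_failed_downloads(lines: Iterable[str]) -> "Counter[Tuple[str, int]]":
    """Count failed downloads per (file name, error code)."""
    counts: "Counter[Tuple[str, int]]" = Counter()
    for line in lines:
        parts = line.split("|", 2)  # time, project, message
        if len(parts) != 3:
            continue  # not a message-log line
        match = FAILED.search(parts[2])
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


sample = [
    "6/27/2005 8:16:21 PM|Einstein@Home|Temporarily failed download of h1_0673.0: 416",
    "6/27/2005 8:16:24 PM|Einstein@Home|Started download of h1_0673.0",
    "6/27/2005 8:16:25 PM|Einstein@Home|Temporarily failed download of h1_0673.0: 416",
]
for (name, code), n in count_failed_downloads(sample).items():
    print(name, code, n)
```

A file that fails many times within a minute with the same code is a good candidate for aborting the transfer or restarting BOINC.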

Walt Gribben
Joined: 20 Feb 05
Posts: 219
Credit: 1,645,393
RAC: 0

Message 13513 in response to message 13512

Quote:
Quote:

I would rate this as a critical problem


Same here. I just caught one machine in an endless loop; here is an excerpt:
6/27/2005 8:16:21 PM|Einstein@Home|Temporarily failed download of h1_0673.0: 416
6/27/2005 8:16:23 PM|Einstein@Home|Couldn't delete file projects/einstein.phys.uwm.edu/h1_0673.0
6/27/2005 8:16:24 PM|Einstein@Home|Started download of h1_0673.0
6/27/2005 8:16:25 PM|Einstein@Home|Temporarily failed download of h1_0673.0: 416
6/27/2005 8:16:27 PM|Einstein@Home|Couldn't delete file projects/einstein.phys.uwm.edu/h1_0673.0
6/27/2005 8:16:27 PM|Einstein@Home|Started download of h1_0673.0
6/27/2005 8:16:28 PM|Einstein@Home|Temporarily failed download of h1_0673.0: 416
6/27/2005 8:16:30 PM|Einstein@Home|Couldn't delete file projects/einstein.phys.uwm.edu/h1_0673.0
6/27/2005 8:16:30 PM|Einstein@Home|Started download of h1_0673.0
6/27/2005 8:16:31 PM|Einstein@Home|Temporarily failed download of h1_0673.0: 416
It went on until I aborted the transfer, which appears to have killed the issue. This machine is running the Einstein beta and BOINC 4.45 for Windows.

Shut down BOINC and restart it. Usually "Exit" in boincmgr will do it, but the BOINC process must end. If it doesn't, use Task Manager to "kill" it.

There's a bug in BOINC where temporarily failed downloads keep the file open, which can cause the problems you see. When BOINC ends, Windows will close all the files.

kenlo
Joined: 1 Jun 05
Posts: 16
Credit: 28,206
RAC: 0

Quote:
new h1 units not downloading


All I did after the bad download was abort it, and it seems to be running OK now.

kenlo

Walt Gribben
Joined: 20 Feb 05
Posts: 219
Credit: 1,645,393
RAC: 0

Message 13515 in response to message 13514

Quote:
Quote:
new h1 units not downloading

All I did after the bad download was abort it, and it seems to be running OK now.

That's good. But run Process Explorer, look at the handles for the BOINC process, and see if there are any for h1_0326.5, or for any other h1_* file.

It's fine for the Einstein application to use these, but BOINC shouldn't hold on to the file. It'll cause problems later, when BOINC has to delete it. That shouldn't be for a few weeks yet, when the scheduler decides it's time to work on a different set of data.

EDIT:

The "download looping" problem is in BOINC 4.43 and is fixed in 4.45. I don't remember whether 4.45 fixes the "open handle" one, though.

EDIT**2:

From Robert's post, I'd say the "open handle" bug isn't fixed in 4.45. That's what happens when downloads fail like that: if BOINC leaves the file open, it can't delete the file to download it again. That's a problem for Einstein@Home, where one file is downloaded for all the WUs to use. In that case, it's probably a good idea to restart BOINC.
