No Downloads

Davidinfl
Joined: 6 Dec 05
Posts: 2
Credit: 3003
RAC: 0

RE: I am getting no

Message 20477 in response to message 20476

Quote:

I am getting no downloads. These are the messages when I try:
12/7/2005 10:18:28 PM|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
12/7/2005 10:18:28 PM|SETI@home|Reason: Requested by user
12/7/2005 10:18:28 PM|SETI@home|Note: not requesting new work or reporting results
12/7/2005 10:18:33 PM|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded

The same thing happens when einstein tries.

12/7/2005 9:55:02 PM|Einstein@Home|Sending scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi
12/7/2005 9:55:02 PM|Einstein@Home|Reason: Requested by user
12/7/2005 9:55:02 PM|Einstein@Home|Note: not requesting new work or reporting results
12/7/2005 9:55:07 PM|Einstein@Home|Scheduler request to http://einstein.phys.uwm.edu/EinsteinAtHome_cgi/cgi succeeded

Tern
Joined: 27 Jul 05
Posts: 309
Credit: 99440614
RAC: 0

RE: 12/7/2005 9:55:02

Message 20478 in response to message 20477

Quote:
12/7/2005 9:55:02 PM|Einstein@Home|Note: not requesting new work or reporting results

This tells us that BOINC doesn't _want_ any work from Einstein (or SETI). If you are running another project, then that project has "fallen behind" on whatever resource share you have given it, and BOINC is trying to run it by itself long enough to catch up. When it is caught up, then work will be downloaded for the other projects.
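For anyone who wants to check a pasted log for this note without reading every line, here is a minimal Python sketch. It assumes the pipe-separated `date time|project|message` layout and the US-style timestamps seen in the log above; the function names are made up for illustration and are not part of BOINC.

```python
from datetime import datetime
from typing import Iterable, NamedTuple, Set


class LogEntry(NamedTuple):
    timestamp: datetime
    project: str
    message: str


def parse_line(line: str) -> LogEntry:
    """Split one message-tab line into timestamp, project and message."""
    # Split at most twice so a "|" inside the message text is preserved.
    stamp, project, message = line.strip().split("|", 2)
    # Assumed format: month/day/year with a 12-hour clock, e.g. "12/7/2005 9:55:02 PM".
    return LogEntry(datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p"), project, message)


def projects_not_requesting_work(lines: Iterable[str]) -> Set[str]:
    """Return the projects whose log contains the 'not requesting new work' note."""
    return {
        entry.project
        for entry in map(parse_line, lines)
        if "not requesting new work" in entry.message
    }
```

Logs that use a day/month/year, 24-hour format would need a different `strptime` pattern.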

If you are not running another project, or if you are but it is not doing any work, then there is something wrong and it needs to be fixed; in this case, let us know how many results from that project you have, and their statuses (i.e., uploading, downloading, aborted...)

If you don't want to honor your resource shares and you want to resume getting SETI and Einstein work "now", there are several ways to do it - there is a utility called BOINCDV that will reset your Long Term Debt figures "safely", and restart all resource shares from zero. That is what I would recommend, other than leaving it alone and letting it do its thing.

You can also hit "suspend" on the project that is working; this will force a download from either SETI or Einstein or both. Or you can increase your cache size on the website preferences page by a bit. Either of these runs the risk of causing you to miss a deadline however...

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5877
Credit: 118620150710
RAC: 18105078

This message log extract is

This message log extract is great for the helpers to see what is going on. Thanks very much for posting it.

The key bit is:-

Quote:
12/7/2005 9:55:02 PM|Einstein@Home|Note: not requesting new work or reporting results

This means (in BOINC's opinion) there is already enough work, most likely from other projects that have a higher priority to run, so BOINC doesn't want to get more work from EAH and Seti just for the moment. That decision could change at any time and you just have to trust that BOINC will get it right. In my opinion, BOINC always does get it "right", unless it is interfered with by an ill-advised but otherwise determined user :).

To comment more fully on any particular case, a user would need to supply additional information such as:-

    * The number of supported projects in total
    * Their particular resource shares
    * The number of work units for each project already stored on your machine
    * The value of the "Connect to network every x.xx days" preference setting
    * Any recent preference changes you have made
    * The current LTD (Long Term Debt) values for each project
    * The type of internet connection you have
    * Your policy for daily hours of use (i.e., is it 24/7, etc.)
    * Your policy for allowing BOINC to run when you are using your computer

plus other things I may have forgotten about :). Like duration correction factor for each project :). This actually could be an issue if it was way different from what is expected - like 0.2 or 5.0 for example :).

Some of these things we can see by looking up your computer (if you have it unhidden) on the website but it does save a lot of chasing around if you are kind enough to list it all in your problem report.

Cheers,
Gary.

Tern
Joined: 27 Jul 05
Posts: 309
Credit: 99440614
RAC: 0

Duration Correction Factor

Duration Correction Factor from each of the projects...

:-)

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5877
Credit: 118620150710
RAC: 18105078

RE: Duration Correction

Message 20481 in response to message 20480

Quote:

Duration Correction Factor from each of the projects...

:-)

Done :). I suspect you were probably teasing but I've actually (after an unexplained crash) seen some weird values that have taken a while to get corrected again. So I've stuck it in as an afterthought so that people might go and think about what it does :).

Cheers,
Gary.

Tern
Joined: 27 Jul 05
Posts: 309
Credit: 99440614
RAC: 0

RE: Done :). I suspect you

Message 20482 in response to message 20481

Quote:
Done :). I suspect you were probably teasing

Nope. It's part of the calculation of how much work to download and how much is on hand - it directly influences the estimated completion times, which are the primary factor... and it's usually not useful, because once you have the LTD's, cache size, and current load, you don't care. But.

For SETI and Einstein, with "fairly consistent" WU run lengths, it's a real good indicator of performance. For Rosetta (and others), it's only marginally useful, if at all, for its intended purpose, but can be useful for figuring why the user is in NWF. If the project that is actually still running is Rosetta, and the user's DCF is, say, 0.5, but the result that's currently running shows 8 hours to completion, then knowing that a CPU like the user's would normally take 2 hrs, you can see that the project estimated that one (before DCF) at 4 hours. It's been adjusted upward by BOINC Manager because it's taking significantly longer to run than expected. So, if 2 hrs of Rosetta work was requested, but 8 or more hours were received, BOINC Manager is likely (if it was already "on the edge") to go into NWF until 6 hours of that have been worked off. (If their DCF was over 1, you couldn't say this, as all you would know is that one of the last 5-6 WUs they had was _also_ large, and you wouldn't know how much was 'requested' this time; it's the fact that THIS is the result that is the exception, that helps.) It's not so much the DCF itself, as it is the difference between what the DCF tells you to expect and what the current result is doing, in order to see how long before more work is allowed.
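To make the arithmetic in that Rosetta example concrete, here is a small Python sketch. It assumes the simple model implied above, where the displayed estimate is the project's raw estimate multiplied by the DCF; the function names are hypothetical and this is not how the BOINC client is actually coded.

```python
def displayed_estimate_hours(raw_estimate_hours: float, dcf: float) -> float:
    """Estimate shown to the user: the project's raw estimate scaled by the DCF."""
    return raw_estimate_hours * dcf


def excess_hours(requested_hours: float, received_hours: float) -> float:
    """Hours of work on hand beyond what was requested (never negative)."""
    return max(0.0, received_hours - requested_hours)


# Figures from the example: raw estimate 4 h, DCF 0.5, result now showing 8 h.
expected = displayed_estimate_hours(4.0, 0.5)   # 2.0 h, the "normal" run time
surplus = excess_hours(expected, 8.0)           # 6.0 h to work off before new work
print(f"expected {expected} h, surplus {surplus} h")
```

The 6 hours of surplus is the amount that, in this reading, has to be worked off before more work would be fetched.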

It's a very indirect indicator, I'll admit - while it IS used in the calculations, because the numbers we see are already past that point, we don't have to have it ourselves to, for example, add up the "to completion" column. Before the DCF, adding that up was worthless as it could be 50% off in either direction.

You challenged me. :-) You listed a number of things that do indeed affect the situation, although some rarely "matter" (oh, speaking of that, along with the two policy figures you list, CPU efficiency is used in figuring the cache, I think...) and I certainly wouldn't normally think to ask for them. Usually "are you running another project" and "does it say 'not requesting new work'" are all it takes for the "not requesting work" ones, unless they want to know "when will it". Most of the rest of the figures are more likely to be needed in the "my cache size is 3 days and I'm only getting 9.3 hours of work" cases. And in that one, the DCF applies more directly.

EDIT:: The more I think on this, the less I make sense to myself. The LTD figures, which "decide" NWF, don't care about DCF... or most of the rest of those you listed. Only cache size and the LTD's themselves. ??? Now I'm confused. Where is JMVII when you need him! :-)

jimpas
Joined: 27 Nov 05
Posts: 8
Credit: 67581
RAC: 0

Gary, Reply to Msg. 22698

Gary,

Reply to Msg. 22698 (Jim)(I didn't want to include the entire msg. as it would be redundant.) ;)

Aarrgh! After installing some new software, I decided it was time to defrag my hard drive. I exited the programs I had running (including BOINC) before I did a full shut down. I rebooted, shut down the auto-start programs (BOINC is one) & defragged. I then restarted BOINC and checked the "Message" tab - only messages there were those indicating my computer details and that EAH was crunching. All previous messages were gone! I KNEW I should have just copied them all to a text file... but I didn't. Sorry. At least from my mistake I can proudly say "I am experienced!" ;0)

You wrote:
"Block 3 - Here is a problem. For some reason your dialup connection didn't seem to be working. I have no experience of dialup or how reliable it is. I haven't noticed any server problems at EAH over the last couple of days so I don't think it's a BOINC problem. From your log, it lasted about 1.5 hours and then suddenly was gone."

As the computer to use I had to list mine as "localhost" to get going. I also had BOINC use my default dial-up connection. However, each time BOINC dials up and after my username & password were verified, it would disconnect. A pop-up from the task bar would read, "BOINC has successfully disconnected from the Internet." SO, I make a lot of trips to my computer to manually dial-up so BOINC can do its thing. (Good thing I'm retired!) That 1.5 hour time frame was probably when BOINC couldn't dial-up, and therefore could not connect to the Einstein host.

I'll get the disconnecting thing worked out. I think it's my Network Connection Sharing Service that is no longer loaded. After the latest Windows update, the IE firewall was no longer running. When I tried to restart it, I received an error msg from Windows that the NCS was running, asked me if I wanted Windows to start it (to which I clicked 'yes'), and then told me Windows was unable to start the NCS, then exited. I just got an update that should have corrected that. (I'll check once I get off the Internet.)

"Block 4 (omitted 1st paragraph).
It's very hard to give you the full story here because you have left out all the seti messages from what you show. I presume that while you were trying to force BOINC to get more EAH work, Seti was actually crunching away, wasn't it?"

No, at the time I had 3 SETI WUs waiting to upload, 1 waiting to download, and none to work on.

"There are probably zillions of failed upload/download messages from Seti but that's OK. You really need to learn to trust BOINC."

There were! That's why I didn't include those in the post.

"The best way to allow EAH to have more work is not to thrash EAH by resetting it but simply to suspend the dodgy project (seti) and get it out of the equation. That's the best way to tell BOINC that Seti is down. Notice that when you reset a project you throw away all the project files and executables so you have to go and download them all again."

Thank you... I'm going to suspend SETI until their server problems are corrected.

"Block 5 - This is the bit that starts at 6:03pm and is repeated 19 times until success at 1:32am. You reproduce the first block which has the error code -106. Here is the error code listing in the Wiki and there is quite a bit of detail about -106. I have a lot of machines crunching EAH and I've seen zero problems connecting. ..."

Tonight while on the Internet and reading a news article, I heard the familiar "click" of the modem disconnecting, although the connection icon was still in the task bar. The monitor icon representing my computer would flash once in a while - which would normally indicate a packet being sent - but no return flash on the second monitor icon (representing Earthlink's server). This was probably the cause of the 106 errors the other day, not BOINC.

"Block 6 - Finally at 1:32am you are able to download all the needed stuff (programs) for EAH to start again. By 1:39am the downloads are complete and you've started crunching again. For this period between 6.03pm and 1:39am what was seti doing? Was it crunching?"

At that time I may have got on the Internet and had BOINC "retry communications." That's when I got the download.

"Block 7 - This is the bit where you say things are being removed from memory. All I can see is an exact duplicate of the previous block. Did you copy and paste the wrong stuff??
12/6/2005 1:39:39 AM|Einstein@Home|Starting result l1_0570.5__0570.5_0.1_T07_S4lD_3 using einstein version 479
12/6/2005 2:25:45 PM|Einstein@Home|Computation for result l1_0570.5__0570.5_0.1_T07_S4lD_3 finished"

Yes, I forgot to post the lines showing the results being removed from memory, and then EAH 'starting results' for the same file. The 'removed from memory' line is the same as you get after results are uploaded, only the upload never happened. I did triple check the file name and they were identical. I still have no idea why the same file was hashed over and over. Now I won't since those messages are gone.

QUESTION: If I don't want to lose the messages each time BOINC is exited (for one reason or another), is my only recourse to either manually copy them to a text file, or create a batch file or macro to do it?

"These two lines show a result starting and finishing. The elapsed time is less than 1 hour so this result was partly crunched at some earlier time. I can't find any evidence of this earlier crunching in the edited version of the log you have supplied. We really need the full logs, no matter how big they are, if we are going to work out what's happening."

Next time I'll post the FULL log beginning with any problems. Hopefully there won't be any.

"Also, another thing, your log starts with the large data file "l1_0570.5". It also finishes with the same large data file. Yet there was at least one "reset" in the middle. By definition, a project reset throws away all current data and downloads a new lot. There was no evidence of the downloading of a new large data file in your log so I'm very puzzled. Did you actually do a Project reset??"

Yes.

"Please don't take this as some sort of inquisition - that's certainly not my intention. I'd just like to find out what is causing your problems when you shouldn't really be having any. Not from EAH anyway."

No offense taken - I'm very easy going, don't get offended easily and have LOTS of patience. That's why I don't take married life seriously. Oops! That remark also got me some sore ribs from my wife's elbow! :0|
I copied & printed the details you stated from another post that a user should provide to assist with any problems. I'll keep that in mind.

Thanks Gary.
Jim

Lynette
Joined: 1 Dec 05
Posts: 13
Credit: 8501
RAC: 0

Hi again.... Is it normal

Message 20484 in response to message 20471

Hi again....

Is it normal for results to sit around for ages without being able to upload? I have had a couple sitting for ages and I just keep getting the same messages as copied below...

Lynette

08/12/2005 11:28:26|SETI@home|Backing off 3 hours, 6 minutes, and 12 seconds on upload of file 20mr05aa.6043.26833.579808.29_2_0
08/12/2005 11:28:35|SETI@home|Started upload of 20mr05aa.6043.28657.479830.247_3_0
08/12/2005 11:30:37|SETI@home|Temporarily failed upload of 20mr05aa.6043.28657.479830.247_3_0: error 500
08/12/2005 11:30:37|SETI@home|Backing off 1 hours, 23 minutes, and 23 seconds on upload of file 20mr05aa.6043.28657.479830.247_3_0

KSMarksPsych
Moderator
Joined: 15 Oct 05
Posts: 2702
Credit: 4090227
RAC: 0

RE: Hi again.... Is it

Message 20485 in response to message 20484

Quote:

Hi again....

Is it normal for results to sit around for ages without being able to upload? I have had a couple sitting for ages and I just keep getting the same messages as copied below...

Lynette

08/12/2005 11:28:26|SETI@home|Backing off 3 hours, 6 minutes, and 12 seconds on upload of file 20mr05aa.6043.26833.579808.29_2_0
08/12/2005 11:28:35|SETI@home|Started upload of 20mr05aa.6043.28657.479830.247_3_0
08/12/2005 11:30:37|SETI@home|Temporarily failed upload of 20mr05aa.6043.28657.479830.247_3_0: error 500
08/12/2005 11:30:37|SETI@home|Backing off 1 hours, 23 minutes, and 23 seconds on upload of file 20mr05aa.6043.28657.479830.247_3_0

Lynette. The SETI server is basically down at the moment due to the changeover from Classic to BOINC. When the server traffic slows down, you'll be able to upload the results. The backoff times use some sort of exponential function to keep from hammering the servers every two seconds.
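For the curious, a randomized exponential backoff can be sketched in Python as below. This is only an illustration of the general technique, not BOINC's actual algorithm: the one-minute base, the four-hour cap and the function name are all assumptions picked for the example.

```python
import random


def backoff_seconds(failures: int, base: float = 60.0,
                    cap: float = 4 * 3600.0, rng=random) -> float:
    """Pick a retry delay after `failures` consecutive failed attempts.

    The delay window doubles with each failure up to `cap` (assumed values),
    and the delay is drawn at random from the upper half of the window so
    that many clients do not all retry at the same moment.
    """
    window = min(cap, base * (2 ** failures))
    return rng.uniform(window / 2, window)


# Print an example delay for each of the first few consecutive failures.
for n in range(6):
    print(n, round(backoff_seconds(n)))
```

Because the delay is random within the window, two files that have failed a similar number of times can show quite different backoff periods.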

The best thing you might be able to do at the moment is suspend the project and let another project get its crunching time. No use hammering the servers when they can't handle it.

Good luck and happy crunching!

Kathryn

Kathryn :o)

Einstein@Home Moderator

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5893653
RAC: 1

RE: Lynette. The SETI

Message 20486 in response to message 20485

Quote:
Lynette. The SETI server is basically down at the moment due to the change over from Classic to BOINC.

I am not sure what you call down...

Please always read the front page news first:

Quote:
December 5, 2005
We are experiencing heavy traffic on our data server. This is preventing some result uploads/workunit downloads. We are working on the problem. More in Technical News.


See also Server Status.

A snippet of the News:
Meanwhile, we are still dropping connections on the upload server. But the good news is that we are successfully handling about 4 result uploads for every workunit download, which means the upload server is indeed catching up.

We're getting about 35 results a second and sending out about 8 workunits a second at the time of writing.

So 35 results at 10.5KB/result = 367.5KB + (8 times 354KB = 2,832KB) = 3,199.5KB per second... (that's 25.596 Mbit/s on the connection)

You call that down? I guess you're not satisfied easily. ;)
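The same sum as a short Python sketch, for anyone who wants to plug in newer figures from the Server Status page. It uses the numbers quoted above and the same convention as the post (1 KB = 8 kbit, 1 Mbit = 1000 kbit); the function name is made up for the example.

```python
def link_load_mbit(results_per_sec: float, kb_per_result: float,
                   workunits_per_sec: float, kb_per_workunit: float) -> float:
    """Combined upload plus download traffic in Mbit per second."""
    kb_per_sec = results_per_sec * kb_per_result + workunits_per_sec * kb_per_workunit
    # 1 KB = 8 kbit and 1 Mbit = 1000 kbit, matching the arithmetic in the post.
    return kb_per_sec * 8 / 1000


# 35 results/s at 10.5 KB each, 8 workunits/s at 354 KB each.
print(link_load_mbit(35, 10.5, 8, 354))
```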
