Why will it not share?

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5877
Credit: 118620004044
RAC: 18098112

OK, let's get serious.

You said earlier that Seti was suspended. Is this still true?

You said earlier that the reason why EAH results were delayed for 14 days was that all the Seti results were done first. My guess is that your Seti WUs would take about 3 to 4 hours or so each so if you were doing Seti continuously for many days, that's a helluva lot of Seti results. How could you have that many to do if you really have a 0.1 day cache? With a 0.1 day cache you should have only one or two Seti units on your machine at any time.
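To put rough numbers on that, here is a minimal sketch of how many Seti units a given cache setting would hold. It assumes 3 to 4 hours per WU, a machine crunching 24/7 and only one project attached. The function name and the round-up rule are my own for illustration; this is not how BOINC itself works out a work request.

```python
import math

def units_to_fill_cache(cache_days, hours_per_unit):
    """Rough count of work units needed to cover `cache_days` of crunching."""
    cache_hours = cache_days * 24.0
    # Assume at least one unit is always on hand and round partial units up.
    return max(1, math.ceil(cache_hours / hours_per_unit))

for cache in (0.1, 7.0):
    for hours in (3.0, 4.0):
        print(f"cache {cache} days, {hours} h per WU:",
              units_to_fill_cache(cache, hours), "unit(s)")
```

On those assumptions a 0.1 day cache holds a single unit, while a 7 day cache holds somewhere between 42 and 56.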

When did you set your cache to 0.1? What was it before that?

How many Climate work units do you have at the moment?

Which project is actually stuck running at the moment? Looks like EAH?? I've just seen a result uploaded not too long ago.

If EAH is running solidly at the moment, Climate must have had more than its fair share sometime in the past and BOINC is catching up. BOINC is paying back the debt to EAH because it's had less than its fair share. BOINC will continue to do this until the debt has been repaid.

This may take some time depending on how big the debt is. If you leave BOINC alone, it will recover the situation and normal round-robin scheduling will resume when BOINC is good and ready.
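If it helps to picture the debt idea, here is a toy model in Python. It is only an illustration of the principle (every project earns its share of each time slice, the one that runs pays for the slice) and is not BOINC's real scheduler code. The resource shares, the starting debts and the one-hour slice are all made-up numbers.

```python
def simulate(shares, debts, slice_hours=1.0, steps=12):
    """Toy debt-based scheduler: always run the project owed the most CPU time."""
    total_share = sum(shares.values())
    debts = dict(debts)  # work on a copy so the caller's dict is untouched
    history = []
    for _ in range(steps):
        running = max(debts, key=debts.get)
        history.append(running)
        for name in debts:
            # Every project earns its fair fraction of the slice...
            debts[name] += slice_hours * shares[name] / total_share
        # ...and the project that actually ran pays for the whole slice.
        debts[running] -= slice_hours
    return history

# Hypothetical start: EAH is owed 4 hours, Climate has had 4 hours too many.
history = simulate({"EAH": 100, "CPDN": 100}, {"EAH": 4.0, "CPDN": -4.0})
print(history)
```

In this toy run EAH hogs the CPU until the debt is gone and only then do the two projects start taking turns, which is the same pattern you are seeing.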

What can you do to speed this up? Well, the first thing you have to answer is why it got into a mess in the first place. Take a look at your results list for EAH. You will see 6 new results (your then daily limit) were downloaded on 10 Dec and, as soon as it went past midnight, a further 4 were downloaded. Getting this much work in a batch means that BOINC is filling up your cache. Sorry, but you must have a connect setting far larger than 0.1, as we've been trying to tell you for a while. So here are some more serious questions:-

What venue does your computer belong to for each project? Please go to each website and check this out carefully.

Do you have just the default venue prefs set on each website or have you defined extra prefs for "home" or "school" or "work"? Check this out carefully on each website.

Please find the file stdoutdae.txt in your BOINC folder. You need to open this with a suitable text editor and find the startup messages from the last time you started BOINC, whenever that was. Be warned that this file might be quite large and you will need to search through it until you find the most recent message saying something like:-

Quote:
2005-12-07 15:21:23 [---] Starting BOINC client version 5.2.13 for windows_intelx86

You need to copy and paste this line plus about the following 40 lines into a message here, thanks.
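If scrolling through a huge file by hand is a pain and you happen to have Python installed, something like the following sketch would pull out the right chunk. The function names are mine, and the marker text is taken from the example line quoted above; adjust it if your client words the message differently.

```python
def last_startup_snippet(lines, marker="Starting BOINC client version", extra=40):
    """Return the most recent startup line plus up to `extra` lines after it."""
    start = None
    for index, line in enumerate(lines):
        if marker in line:
            start = index  # keep overwriting so the last match wins
    if start is None:
        return []
    return lines[start:start + 1 + extra]

def snippet_from_file(path, encoding="utf-8"):
    # errors="replace" so an odd byte in an old log does not stop the search
    with open(path, encoding=encoding, errors="replace") as handle:
        return last_startup_snippet(handle.read().splitlines())

# Small in-memory demo with made-up log lines
sample = [
    "old [---] Starting BOINC client version 5.2.13 for windows_intelx86",
    "old [---] some earlier message",
    "new [---] Starting BOINC client version 5.2.13 for windows_intelx86",
    "new [---] first message after the restart",
]
print("\n".join(last_startup_snippet(sample)))
```

To use it on the real thing, call snippet_from_file() with the full path to stdoutdae.txt in your own BOINC folder and paste what it prints.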

Once you have answered all the previous questions and once you have the log snippet posted here, we should be able to tell you what's really going on.

Cheers,
Gary.

jason
Joined: 22 Nov 05
Posts: 9
Credit: 2932
RAC: 0

Quote:
You said earlier that Seti was suspended. Is this still true?

seti is not suspended but am unable to upload or download any new packets

Quote:
You said earlier that the reason why EAH results were delayed for 14 days was that all the Seti results were done first. My guess is that your Seti WUs would take about 3 to 4 hours or so each so if you were doing Seti continuously for many days, that's a helluva lot of Seti results. How could you have that many to do if you really have a 0.1 day cache? With a 0.1 day cache you should have only one or two Seti units on your machine at any time.

the whole thing for some reason WAS set to 0.1 days, no idea why but to get started my dad suggested setting it to 7 days to get some downloaded. but reverted back to 0.1 days when i had a few in queue.

Quote:
How many Climate work units do you have at the moment?

only the one report date 05/11/2006

Quote:
Which project is actually stuck running at the moment? Looks like EAH?? I've just seen a result uploaded not too long ago.

it was seti, now it's EAH

Quote:
If EAH is running solidly at the moment, Climate must have had more than its fair share sometime in the past and BOINC is catching up. BOINC is paying back the debt to EAH because it's had less than its fair share. BOINC will continue to do this until the debt has been repaid.

climate has had 4 hrs 16 mins since installing boinc on system

Quote:
What venue does your computer belong to for each project? Please go to each website and check this out carefully.

the venue for all projects is "HOME"

Quote:
You need to copy and paste this line plus about the following 40 lines in to a message here, thanks.

2005-12-11 08:26:46 [---] Starting BOINC client version 5.2.13 for windows_intelx86
2005-12-11 08:26:46 [---] libcurl/7.14.0 OpenSSL/0.9.8 zlib/1.2.3
2005-12-11 08:26:46 [---] Data directory: H:\Program Files\BOINC
2005-12-11 08:26:47 [---] Processor: 1 AuthenticAMD AMD Sempron(tm) 2600+
2005-12-11 08:26:47 [---] Memory: 1023.48 MB physical, 2.40 GB virtual
2005-12-11 08:26:47 [---] Disk: 76.33 GB total, 66.54 GB free
2005-12-11 08:26:47 [climateprediction.net] Computer ID: 258562; location: ; project prefs: default
2005-12-11 08:26:47 [Einstein@Home] Computer ID: 445648; location: ; project prefs: default
2005-12-11 08:26:47 [SETI@home] Computer ID: 1754311; location: home; project prefs: default
2005-12-11 08:26:47 [---] General prefs: from Einstein@Home (last modified 2005-11-26 08:15:26)
2005-12-11 08:26:47 [---] General prefs: using your defaults
2005-12-11 08:26:48 [---] Remote control not allowed; using loopback address
2005-12-11 08:26:48 [climateprediction.net] Deferring computation for result 1j8v_100092335_1
2005-12-11 08:26:48 [Einstein@Home] Resuming computation for result w1_0833.5__0833.5_0.1_T05_S4hD_2 using einstein version 479
2005-12-11 08:26:48 [---] Using earliest-deadline-first scheduling because computer is overcommitted.
2005-12-11 08:26:48 [SETI@home] Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
2005-12-11 08:26:48 [SETI@home] Reason: Requested by user
2005-12-11 08:26:48 [SETI@home] Reporting 2 results
2005-12-11 08:27:04 [SETI@home] Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed with a return value of 500
2005-12-11 08:27:04 [SETI@home] No schedulers responded
2005-12-11 08:28:04 [SETI@home] Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
2005-12-11 08:28:04 [SETI@home] Reason: Requested by user
2005-12-11 08:28:04 [SETI@home] Reporting 2 results
2005-12-11 08:28:26 [---] Couldn't connect to hostname [setiboinc.ssl.berkeley.edu]
2005-12-11 08:28:29 [SETI@home] Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed with a return value of -106
2005-12-11 08:28:29 [SETI@home] No schedulers responded
2005-12-11 08:29:29 [SETI@home] Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
2005-12-11 08:29:29 [SETI@home] Reason: Requested by user
2005-12-11 08:29:29 [SETI@home] Reporting 2 results
2005-12-11 08:29:39 [SETI@home] Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed with a return value of 500
2005-12-11 08:29:39 [SETI@home] No schedulers responded
2005-12-11 08:30:39 [SETI@home] Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
2005-12-11 08:30:39 [SETI@home] Reason: Requested by user
2005-12-11 08:30:39 [SETI@home] Reporting 2 results
2005-12-11 08:31:01 [---] Couldn't connect to hostname [setiboinc.ssl.berkeley.edu]
2005-12-11 08:31:04 [SETI@home] Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed with a return value of -106
2005-12-11 08:31:04 [SETI@home] No schedulers responded
2005-12-11 08:32:54 [SETI@home] Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
2005-12-11 08:32:54 [SETI@home] Reason: Requested by user
2005-12-11 08:32:54 [SETI@home] Reporting 2 results
2005-12-11 08:33:16 [---] Couldn't connect to hostname [setiboinc.ssl.berkeley.edu]
2005-12-11 08:33:19 [SETI@home] Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed with a return value of -106
2005-12-11 08:33:19 [SETI@home] No schedulers responded
2005-12-11 08:34:34 [SETI@home] Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
2005-12-11 08:34:34 [SETI@home] Reason: Requested by user
2005-12-11 08:34:34 [SETI@home] Reporting 2 results
2005-12-11 08:34:44 [SETI@home] Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed with a return value of 500
2005-12-11 08:34:44 [SETI@home] No schedulers responded
2005-12-11 08:51:40 [SETI@home] Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
2005-12-11 08:51:40 [SETI@home] Reason: Requested by user
2005-12-11 08:51:40 [SETI@home] Reporting 2 results
2005-12-11 08:51:50 [SETI@home] Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed with a return value of 500
2005-12-11 08:51:50 [SETI@home] No schedulers responded
2005-12-11 08:59:40 [SETI@home] Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
2005-12-11 08:59:40 [SETI@home] Reason: Requested by user
2005-12-11 08:59:40 [SETI@home] Reporting 2 results
2005-12-11 08:59:50 [SETI@home] Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi failed with a return value of 500

Tern
Joined: 27 Jul 05
Posts: 309
Credit: 99440614
RAC: 0

Just a quick note - if SETI has any work in "downloading" status, you should SUSPEND the project until they get their issues solved. You could be sitting there with no work at all, and the scheduler would still be happy as any "downloading" work counts as being present (since it could show up at any second). Regardless of any of that, you should Suspend SETI _anyway_, since you aren't currently doing any work for them - that will simplify the troubleshooting. "No new work", etc. will not do it - it must be Suspended in the Projects tab to be "out of the equation" for the scheduler.

The CPDN due 5/11/06 - is that May 11th or November 5th? If May 11, and this is a Sulphur WU with a one-year deadline, you've had it on your CPU an awfully long time to get only 4 hours of CPU time. If November 5, that's still over a month... you probably need to give a little more resource share to CPDN (or leave your computer on more hours running BOINC), or 6 months from now, you'll find yourself doing nothing but CPDN for weeks at a stretch.
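For what it's worth, here is a quick Python sketch of the two ways that report date can be read, using the 05/11/2006 Jason posted. The function name is mine, and the "issued" date simply assumes the one-year deadline mentioned above; I don't know what the real deadline on that WU is.

```python
from datetime import datetime

def possible_dates(text):
    """Parse a slash-separated date both as day/month and as month/day."""
    readings = {}
    for label, fmt in (("day/month", "%d/%m/%Y"), ("month/day", "%m/%d/%Y")):
        try:
            readings[label] = datetime.strptime(text, fmt).date()
        except ValueError:
            pass  # this reading is not a valid calendar date
    return readings

for label, deadline in possible_dates("05/11/2006").items():
    # Assumption: a one-year deadline, so the unit would have been issued
    # one year before the report date.
    issued = deadline.replace(year=deadline.year - 1)
    print(label, "deadline", deadline.isoformat(), "issued about", issued.isoformat())
```

Both readings are valid dates, so the number alone can't settle it; the download date in the results list on the CPDN website would.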

Gary's already done all the investigation so I'm not going to repeat all of that unless he can't get back to you, so this is just a guess... when you went to 7 days then back to 0.1 in two big jumps, with CPDN present, you bit off a huge hunk of work. The reasoning behind increasing the cache was sound - you wanted more work and weren't getting any - it's just that the problem was SETI, not the scheduler, so suspending SETI would have been the real solution. Your computer is still "overcommitted", trying to compensate for this overload, but once it is down to a single Einstein WU, it _should_ kick back into normal operation with no other intervention. Unless Gary's found something else that I haven't seen.

EDIT:: Thinking about it, even the 7 day cache wouldn't have been a major problem if you were running 24/7. (Although 4 days is the most I would recommend with your projects, and 3 would be better, smaller than that better yet - _I_ would set it in your case to 0.5.) Let me do more guessing; you set up BOINC, then before _it_ knew you weren't going to run 24/7, you told it to get 7 days work. The scheduler is pretty smart - if you normally run BOINC only 4 hours per day, for example, it will only get 28 hours worth of work for that 7 days. But if it hasn't got that pattern "down" yet... it gets 7x24 hours worth. Then if you only run 4 hours/day, you've got 42 "real days" worth, and you're in trouble. Under "view your computers", down near the bottom on that page are some figures for % time BOINC running, efficiency, etc... If any of those are significantly lower than 90%, can you tell us what those values all are?
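The arithmetic in that example, written out as a small Python sketch. The function names are mine and the 4 hours/day figure is just the example above, not a measurement from your machine.

```python
def work_fetched_hours(cache_days, assumed_hours_per_day):
    """Hours of work requested if BOINC believes it runs this many hours a day."""
    return cache_days * assumed_hours_per_day

def real_days_to_finish(work_hours, actual_hours_per_day):
    """Calendar days needed to crunch that work at the real running rate."""
    return work_hours / actual_hours_per_day

# BOINC has already learned the 4 hours/day pattern
learned = work_fetched_hours(7, 4)
print(learned, "hours fetched,", real_days_to_finish(learned, 4), "real days")

# BOINC still assumes 24/7 but the machine really runs 4 hours/day
naive = work_fetched_hours(7, 24)
print(naive, "hours fetched,", real_days_to_finish(naive, 4), "real days")
```

The first case gives 28 hours of work that clears in 7 real days; the second gives 168 hours that needs 42 real days.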

jason
Joined: 22 Nov 05
Posts: 9
Credit: 2932
RAC: 0

umm these numbers?????

i assume it's 5th november but it could be the US date system..possibly as it's a US program...which would make it 11th of may

Michael Roycraft
Joined: 10 Mar 05
Posts: 846
Credit: 157718
RAC: 0

(edit) @ Jason. I'm sorry for not including you yet, Jason, but I need to pass this by Gary and Bill first, in case it would screw up a strategy that they are working out. Thank you in advance for your patience. (end edit)

@ Gary & Bill,

Sorry for jumping in here, and if I've missed anything significant. I've read through the thread, but perhaps have not assimilated some things.
It seems to me from what Jason has said that this is a likely scenario: His dad helped him with initial setup, including 7-day "Connect to", probably with only Seti attached, so Seti stacked him up with a full plate and 7-day deadline. Sometime after crunching a few Seti WUs, climate was added and probably even Einstein, and Boinc calculated there was now room to add a little of each into the mix. Then the "Connect to" was reduced and Boinc finally realized that 24/7 was not going to be the pattern, so everything was over-committed.
By now he's well into the 7-day deadline for Seti, absolutely no possibility of delivering their work on schedule, but Boinc still has him with a queue of d/l Seti work. The way Seti is jammed up, it is likely that none of the Seti will be returned on time, so it's all expendable. I would conclude that Jason should:
1) disconnect from the internet
2) unsuspend Seti
3) abort all Seti work in-process
4) delete all Seti parent datafiles in-folder
5) if possible, abort any enqueued d/l
6) set Seti to "no new work"
7) suspend Seti
8) reconnect to the internet

Let me know how this idea flies, and what adjustments might be needed, before we go as far as recommending it to Jason.

microcraft
"The arc of history is long, but it bends toward justice" - MLK

Tern
Joined: 27 Jul 05
Posts: 309
Credit: 99440614
RAC: 0

Message 21335 in response to message 21334

Quote:
The way Seti is jammed up, it is likely that none of the Seti will be returned on time, so it's all expendable.

@Michael: Gary must not be available, he hasn't come back... but SETI is extending deadlines, so there's really no need to trash the work that's already been done there. I think the rest of what you describe is possible, from what I know, but in this case, I'd just suspend SETI and pretend it isn't there, resume it when they're up again and the "uploadings" change to "ready to report". It'll keep trying to upload even when suspended.

@Jason: That 80% figure throws my guess out. 0.89 efficiency isn't bad, either. 1.08 DCF is a little high, but you've not done enough WUs for that to be accurate yet...

I show that you have 9 Einstein WUs in your cache, at least three days worth, probably four. If your cache setting of 0.1 has made it to your computer, it BETTER not ask for any more work for a few days! I think you're just going to have to let it work that backlog down a bit - when it's down to one WU for Einstein, it should start alternating between that and CPDN again. I don't know why it got into the "overcommitted" state to begin with, based on what we're seeing, even with the temporary 7-day bump, but once it's there, there really isn't a good option other than just letting it work its own way out.

If it doesn't start giving equal time when you're down to one Einstein WU, please post back here and we'll dig deeper, get you to download BOINCView so we can see the LTDs, etc. Or Gary may have found something in the interim; he's the resident guru! :-)

Michael Roycraft
Joined: 10 Mar 05
Posts: 846
Credit: 157718
RAC: 0

Bill,

The fact that it will keep trying to upload is why I suggested trashing all. If you think it has a chance of returning within the extended deadline, then maybe keep the finished work. One problem I see is that it does still continue to try to upload, which is just contributing to the Denial of Service (DoS) problem over at Seti.

microcraft
"The arc of history is long, but it bends toward justice" - MLK

jason
Joined: 22 Nov 05
Posts: 9
Credit: 2932
RAC: 0

Hi Bill And Michael

thanks for all the input, i will give it time to settle down after changes and see if it starts sharing with other projects as you suggest.

as for "scrapping" seti.......i only have 2 work units to upload that are both finished so going to put it into suspend mode and just wait until seti is back online and upload them. hopefully by that point the BOINC manager will have settled down somewhat and will be sharing the CPU time fairly

CROSS FINGERS ALL ;-)

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5877
Credit: 118620004044
RAC: 18098112

Hey Guys,

I gotta sleep sometime!!

I haven't got much time right now but PLEASE DON'T follow Michael's recipe and try aborting anything.

The real key, which I thought had already been done by Jason's comment earlier about suspending Seti, is simply to SUSPEND SETI and get it out of the equation. Seti admins have already announced the extension of deadlines and if you leave the stuck stuff alone and then you see it start to move, it'll be a good indication that Seti might be over its problems so go check them out in the announcements on their front page.

Follow what Bill is saying. I believe he is pretty much on the money.

I'll be back when I can.

Cheers,
Gary.

Tern
Joined: 27 Jul 05
Posts: 309
Credit: 99440614
RAC: 0

Message 21339 in response to message 21338

Quote:
I gotta sleep sometime!!

Who gave you permission to do THAT?

Quote:
Follow what Bill is saying. I believe he is pretty much on the money.

... why did I just hear something in the back of my head about "even a blind hog..." - hm.

As far as the SETI work - I have no idea when they'll get the servers going, or how long they'll leave the credits "open" after that, but killing off two uploads isn't going to make any difference in their situation, when there are people out there with hundreds or thousands _each_ in some cases. If these two make it up and get credit, great, if not, he's no worse off than if he aborted them.

I _am_ hearing some complaints that if you have a _lot_ of uploads pending, the actual upload portion may run at a higher-than-idle priority on your computer, causing it to be sluggish. It shouldn't matter until you have so many that, even spread over 3 or 4 hours, you're spending more time trying to upload than waiting. Not an issue in _this_ case, luckily!
