A thought on WU quota

history
Joined: 22 Jan 05
Posts: 127
Credit: 7573923
RAC: 0

This is a follow up on my

This is a follow-up on my previous post regarding my new Opty 146. I am currently burning it in on Rosetta at 2.8 GHz on air. Can you imagine if I were running the optimized apps at Einstein? This is a 150.00 OEM chip performing like an Athlon 64 4400. The Einstein server would be trashing this chip with "no work available" every 12 hours. If I am going to heat the house, I'll do it with work 24/7. BTW, the CDNgeezer bugged out of the "Curse of 32" team at Rosetta almost as soon as it was established. Thanks for the bait and switch. Send your Rosetta credits to the "Curse of 32" team. Joining a team does not eliminate your individual credit.

Regards-tweakster

AnRM
Joined: 9 Feb 05
Posts: 213
Credit: 4346941
RAC: 0

RE: BTW, the CDNgeezer

Message 28316 in response to message 28315

Quote:

BTW, the CDNgeezer bugged out of the "Curse of 32" team at Rosetta almost as soon as it was established. Thanks for the bait and switch. Send your Rosetta credits to the "Curse of 32" team. Joining a team does not eliminate your individual credit.

Regards-tweakster


>Not to worry my Son....just a temporary lapse. I was MIA for a few days....Cheers, Rog.

Honza
Joined: 10 Nov 04
Posts: 136
Credit: 3332354
RAC: 0

RE: Given we have plenty of

Message 28317 in response to message 28313

Quote:
Given we have plenty of computing power available, why not set the initial replication to 4, and if need be increase the daily quota to something like 40 (to compensate for the extra result needed)?

This is a good point to raise my "smarter scheduler" idea again.
If WUs were issued to hosts with similar characteristics (i.e. similar turn-around time), there would be perhaps only 1/3 as many pending WUs. Plus, it would solve the trouble of fast crunchers waiting a week for slow ones and/or those with a large cache (hence a high turn-around time).

I hope host characteristics will be actively used in the (near) future and do not remain dead statistical numbers...
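
To make the idea concrete, here is a minimal Python sketch of matching hosts by turn-around time. It is only an illustration, not how the BOINC scheduler actually works: the function names, the host names and the 2-day bucket width are all made up for the example.

```python
from collections import defaultdict


def bucket_hosts_by_turnaround(hosts, bucket_days=2.0):
    """Group host ids into buckets of similar average turn-around time.

    hosts: dict mapping a host id to its average turn-around time in days.
    bucket_days: hypothetical bucket width, not a real scheduler setting.
    """
    buckets = defaultdict(list)
    for host_id, turnaround_days in hosts.items():
        buckets[int(turnaround_days // bucket_days)].append(host_id)
    return dict(buckets)


def pick_partners(hosts, requesting_host, copies=3, bucket_days=2.0):
    """Pick the other hosts that would share a WU with requesting_host.

    Only hosts from the same turn-around bucket are considered, so a fast
    host is never paired with one that takes a week.
    """
    buckets = bucket_hosts_by_turnaround(hosts, bucket_days)
    own_bucket = int(hosts[requesting_host] // bucket_days)
    peers = [h for h in buckets[own_bucket] if h != requesting_host]
    return peers[:copies - 1]


# Hypothetical hosts with average turn-around times in days.
hosts = {"fast_a": 0.5, "fast_b": 1.0, "fast_c": 1.5, "slow_d": 7.0, "slow_e": 8.5}
print(pick_partners(hosts, "fast_a"))  # ['fast_b', 'fast_c']
```

In this toy version a slow host with no peers in its bucket simply gets no partners; a real scheduler would need a fallback to neighbouring buckets.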

Webmaster Yoda
Joined: 15 Mar 05
Posts: 17
Credit: 608427
RAC: 0

RE: This is a good point to

Message 28318 in response to message 28317

Quote:
This is a good point to raise my "smarter scheduler" idea again.
If WUs were issued to hosts with similar characteristics (i.e. similar turn-around time), there would be perhaps only 1/3 as many pending WUs. Plus, it would solve the trouble of fast crunchers waiting a week for slow ones and/or those with a large cache (hence a high turn-around time).

Yep, sounds good to me too. Also, going along with this, where a WU does need to be re-issued (after someone fails to return it, etc.), send it to a host with a fast turn-around time, rather than one that takes another 2 weeks.

I have something like 200 WUs pending, many of them due to someone not returning a result, and in many of those cases the fourth host has a turn-around time of 7 or more days.

Pooh Bear 27
Joined: 20 Mar 05
Posts: 1376
Credit: 20312671
RAC: 0

This project used to send 4,

This project used to send 4, but then it was deemed the project was tight enough that 3 was enough. Now that it has expanded, the idea of 4 might be something to think of, again.

As for quotas, I run multiple projects, so I never see the quota. I know it's a subject not everyone wants to think about, but now some people are finally going other places. BOINC does a good job of multiple projects. If you set Einstein to a high number to be your main work, and then another project as a low number, you would not have to do any managing. You'd do your limit of Einstein, then crunch the other project until the next day, keeping your machines nice and warm. There are many good projects that could be considered.

I am sure the limits are there for a reason. I have seen some things written about it by the development team before (if you search through the forums, there are some answers). What I want to say about it is that they are asking us to do a project for them, but it's their project. They control it the way they see fit. They take advice (as seen with Akos), but they also do things for themselves. They own it, we just crunch for it. I am happy to be a part of this project, and am excited to see it going so well. If they upped the quota and started crashing because they could not keep up with the demand, everyone would be crabbing about that. They know what their systems can handle, so allow them to keep their project the most stable project around.

Honza
Joined: 10 Nov 04
Posts: 136
Credit: 3332354
RAC: 0

...yet another reason why

...yet another reason to make the scheduler smarter, Pooh Bear 27.
I don't have enough info to express an opinion on an initial replication of 3 or 4. But either way, it doesn't help much with the quorum issue (not at all), nor with pending WUs (as long as the scheduler works as it does now), nor the project with server load (same number of result IDs pending, crunched... and fewer WUs completed).

The implication that the servers can't handle more results and are already near the limit is, I hope, wrong.
At least now, when newer/faster official applications for various platforms are being tested.

Yes, BOINC does very good on multiple projects (and pretty good on multi-core/CPU).
There are participants who prefer to use their CPUs efficiently.
Consider this scenario: instead of splitting each CPU 50:50 between two projects (and consuming more RAM while switching apps), put one machine (say an AMD, which does better here) 100% on Einstein and another one (say Intel's Presshot) on... let's say SETI, where Intel's CPU does a better job.
A single platform, multiple projects... and one glitch: the daily quota.

The idea of a "backup" project is a good one: once a project runs dry of WUs (LHC) or hits its daily quota (EAH), run a backup project. Not a solution, just a cure for the symptoms... but it works.

paul and kirsty yates
Joined: 25 Nov 05
Posts: 15
Credit: 623610
RAC: 0

i have just checked my

I have just checked my results and see that I am waiting for one more cruncher to return (and my next w/u is on their PC as well).
I checked their account and found that they have over 320 w/u to crunch.
They crunch each w/u in about 45000.00 CPU secs.
They obviously have a day-to-day uplink, as they are d/loading w/u every day.

At present they are working on w/u from the 20th and still d/loading more.

How is this fair on EVERYONE else with these w/u?

Also, I have to suspend projects at the moment, due to the upgrade to 5.4.9, to get just 1 w/u.

"Don't want to pick on the poor person as I don't know them, but ???!!!"

MarkF
Joined: 12 Apr 05
Posts: 393
Credit: 1516715
RAC: 0

RE: i have just checked my

Message 28322 in response to message 28321

Quote:

I have just checked my results and see that I am waiting for one more cruncher to return (and my next w/u is on their PC as well).
I checked their account and found that they have over 320 w/u to crunch.
They crunch each w/u in about 45000.00 CPU secs.
They obviously have a day-to-day uplink, as they are d/loading w/u every day.

At present they are working on w/u from the 20th and still d/loading more.

How is this fair on EVERYONE else with these w/u?

Also, I have to suspend projects at the moment, due to the upgrade to 5.4.9, to get just 1 w/u.

"Don't want to pick on the poor person as I don't know them, but ???!!!"


It has been my experience that folks who are gumming up the works frequently don't know it. It might be worthwhile to address them directly with a polite explanation of the problems they are creating.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5842
Credit: 109386720141
RAC: 35926451

RE: i have just checked my

Message 28323 in response to message 28321

Quote:
I have just checked my results and see that I am waiting for one more cruncher to return (and my next w/u is on their PC as well).
I checked their account and found that they have over 320 w/u to crunch.
They crunch each w/u in about 45000.00 CPU secs.

Yes, you are waiting and the person you are waiting for will most likely never return that particular result you are waiting for. If you go to the very earliest part of his results list you will see that he was downloading work and then returning it quite promptly, only a day or so after receiving it. If you check one of those results you will see that the version of BOINC is 4.19, which is now way out of date. We can conclude that this person is probably the "set and forget" type and he probably doesn't even realise there is a problem.

The problem started on May 21. There was valid work downloaded and returned on that day. Since then until very recently, there have been around 280 downloads which haven't been returned. Now on May 26, his machine is suddenly returning work again. His machine would not be processing and returning the most recent work if it actually had the older work, so we can conclude that his machine doesn't actually have the older work despite what the results list on the server says. He probably has a whole batch of "ghost" results.

Back in the days when version 4.19 was the current BOINC version, there was a situation where some people were plagued with "ghost" results. This was caused by a glitch in communications where the server had sent work but the client never received it. This problem doesn't occur now with current versions because of improved client/server communications. Network problems still occur but now they are recognised and the data is retransmitted. I'm pretty sure that version 4.19 can't do this, so the 280 missing results will never be resent to the original machine. We will just have to wait for them to time out and be resent to different machines.

Should we be blaming the person who owns the offending machine? Of course not since there is no requirement for him to be closely monitoring his list of results. BOINC is supposed to be "set and forget". If he is not reading these message boards, there is no way that we can inform him of the problem. The only people who could contact him are project staff and they have far more important things to do than attempt to babysit all the potential problem situations out there. BOINC will take care of it itself in due course so we shouldn't even give it a second thought.

Quote:
how is this fair on EVERYONE else with these w/u

As I've mentioned, it's not a serious problem since the work will time out and be resent. I know it is frustrating for you but you will eventually get the credit for that work. However, there is a different situation that you are inflicting on others, and others may get upset with you because this different situation will not be corrected over time. Did you notice that you are only claiming 9 credits for that pending result? This is caused by your choice to run an Akosf optimised science app without running a calibrating version of the BOINC client. Any person running the standard app would be claiming much higher, probably around 45 - 50. If two of the three only claim 9 credits then everybody gets 9, so the person running the standard app gets heavily penalised. Some people get very upset about this. There is a "sticky" thread in this forum about running calibrating clients.
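
As a rough illustration of that arithmetic, here is a small Python sketch. It assumes the granted credit is the middle one of the three claims, which matches the outcome described above; the function name and the claim values are only for the example and are not taken from the server code.

```python
from statistics import median


def granted_credit(claimed):
    """Return the credit granted to every host in a quorum of three.

    Assumption for this sketch: the middle (median) claim is granted to all.
    """
    if len(claimed) != 3:
        raise ValueError("this sketch only handles a quorum of three")
    return median(claimed)


# Two optimised-app hosts claiming 9 and one standard-app host claiming 47:
print(granted_credit([9, 9, 47]))   # 9
# Only one low claim among the three:
print(granted_credit([9, 47, 48]))  # 47
```

Under that assumption a single low claim does little harm, but two low claims in the same quorum pull everyone down to the low value.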

Quote:
also i have to suspend projects at moment due to upgrade to 5.4.9 to get just 1 w/u

If BOINC doesn't want to download more EAH work, it is probably because other projects have a higher priority. By suspending other projects just to get EAH work you are forcing BOINC to break your own preferences for resource shares. Why don't you just change your preferences to give EAH a bigger resource share if you want to do more EAH work? This behaviour you are seeing is how BOINC is supposed to manage the work according to your resource share settings and is probably nothing to do with an upgrade to 5.4.9. If you want more specific advice you should state how many projects you are running and what their respective resource shares are set at. How many hours a day you run your machine could also be a significant factor. You have a turnaround time listed as close to six days so either the EAH resource share is very low or you don't run your machine very often.
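
To put numbers on how resource shares divide the work, here is a small Python sketch. It assumes the long-run split is simply each project's share divided by the total of all shares; the project names and share values are hypothetical, and the real client also has to respect things like deadlines, which this ignores.

```python
def share_fractions(resource_shares):
    """Convert resource shares into the fraction of crunching time per project.

    resource_shares: dict mapping a project name to its resource share.
    """
    total = sum(resource_shares.values())
    if total <= 0:
        raise ValueError("at least one project needs a positive resource share")
    return {project: share / total for project, share in resource_shares.items()}


# Hypothetical setup: Einstein at the same share as one project, half of another.
shares = {"Einstein@Home": 100, "Project B": 100, "Project C": 200}
for project, fraction in share_fractions(shares).items():
    print(f"{project}: {fraction:.0%}")
```

With these made-up numbers Einstein would get a quarter of the time, so raising its share is the way to get more EAH work without suspending anything.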

Hope this helps and good luck with your pendings!!

Cheers,
Gary.

paul and kirsty yates
Joined: 25 Nov 05
Posts: 15
Credit: 623610
RAC: 0

thanks for your reply

Thanks for your reply, Gary.

1) Should we be blaming the person who owns the offending machine?
As I said, I wasn't blaming them; it's just a pain.
Is there any way that ANY project could check the version of BOINC and inform the user of any upgrades (in case they don't check the forums on a regular basis)?
Or could BOINC do this itself?
At least they will be resent.

2) Did you notice that you are only claiming 9 credits for that pending result? This is caused by your choice to run an Akosf optimised science app without running a calibrating version of the BOINC client.

Yes, I know, and I am in a cursed-if-I-do, cursed-if-I-don't situation. I run 14 projects and the only one that an optimised client will help with is Einstein, as SETI claims set credit now, and on all the others I will be accused of cheating by claiming more credit than I am meant to (I have read ALL THE THREADS on ALL the forums about this). As Einstein will soon also be using set credit (I believe), I thought that it was the easiest way to go. I have only just started to use an optimised app (around 20/5/06), but what a difference! I believe that calibrated clients take about 30 w/u to settle down, so that would take me until about August at my present rate of crunching.

3) If BOINC doesn't want to download more EAH work, it is probably because other projects have a higher priority.

BOINC doesn't want to upload ANY projects at the moment. I just have to wait for it to settle down after the upgrade. (At the time of writing, work fetch has been enabled for 2 hours and still no work; I have ADSL, always connected, and projects set to allow new work.) It will only let me download if I suspend all "work" and update manually. I have let it get down to one 1-hour w/u before now and still no new w/u was sent.
I have to go out today, so I will leave it and see what happens when I return.
