Too fast

ADDMP
Joined: 25 Feb 05
Posts: 104
Credit: 7332049
RAC: 0
Topic 190653

I have a dual-core Athlon 64 computer that has been stopped by E@H for a 6-hour enforced delay because it completed its daily allotment of 32 units before 24 hours had passed. It has been running some "albert" units in about 4000 sec, or 1.1 hours, per unit per core. That means in 24 hours it should complete 2*(24/1.1) = 43.6 units with both cores running.
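The arithmetic above can be checked with a short Python sketch. The function names are hypothetical and the flat per-host daily quota is an assumption for illustration; the figures (about 4000 s per unit, 2 cores, 32 units per day) come from the post.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def units_per_day(seconds_per_unit: float, cores: int) -> float:
    """Units a host can finish in 24 hours if every core stays busy."""
    return cores * SECONDS_PER_DAY / seconds_per_unit


def hours_until_quota_exhausted(quota: int, seconds_per_unit: float, cores: int) -> float:
    """Hours of crunching needed to finish `quota` units across all cores."""
    return quota * seconds_per_unit / cores / 3600


if __name__ == "__main__":
    # Figures from the post: ~4000 s per unit, 2 cores, quota of 32 units per day.
    print(f"{units_per_day(4000, 2):.1f} units/day possible")
    print(f"quota used up after {hours_until_quota_exhausted(32, 4000, 2):.1f} h")
```

With exactly 4000 s per unit the result is 43.2 units per day (the 43.6 in the post comes from rounding to 1.1 hours), and 32 units are finished after about 17.8 hours, which leaves roughly 6 hours of the day without work.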

I think there might be some problem with the allotment calculation at E@H that assigned a maximum of 32 units per day.

I can delete the installation & re-install if that is the only fix.

[This computer is named XBOX...]

ADDMP

DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3562358667
RAC: 1580

Too fast

No, there's nothing wrong with your install. The problem is that, unlike Einsteins, which were fairly consistent in size, Alberts can vary by a factor of 4, and the scheduler doesn't take the difference into account when handing out work. You're not the first person to have problems with a really fast machine and short Alberts; some high-end Macs have been burned as well. For the moment the best you can do is to add a 2nd project and set the work distribution to 99/1. The 2nd project will then (almost) only run when you're out of work for E@H, and a starvation attack here will put it far enough ahead that the 2nd project won't do anything at all for a time.
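As a rough illustration of the 99/1 split, the sketch below turns resource share values into fractions of processing time, on the assumption that time is divided in proportion to each project's share. The project names and the helper function are hypothetical, and this is not BOINC's actual scheduling code.

```python
from typing import Dict


def share_fractions(resource_shares: Dict[str, float]) -> Dict[str, float]:
    """Convert per-project resource shares into fractions of total processing time."""
    total = sum(resource_shares.values())
    if total <= 0:
        raise ValueError("resource shares must sum to a positive number")
    return {name: share / total for name, share in resource_shares.items()}


if __name__ == "__main__":
    # Hypothetical project names; 99/1 is the split suggested in the post.
    fractions = share_fractions({"einstein": 99, "backup_project": 1})
    for name, fraction in fractions.items():
        print(f"{name}: {fraction:.0%} of processing time")
```

Under that assumption the backup project gets about 1% of the time in the long run, so it mostly fills the gaps when E@H has no work to hand out.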

Bruce Allen
Moderator
Joined: 15 Oct 04
Posts: 1119
Credit: 172127663
RAC: 0

RE: I have a dual-core

Quote:

I have a dual-core Athlon 64 computer that has been stopped by E@H for a 6-hour enforced delay because it completed its daily allotment of 32 units before 24 hours had passed. It has been running some "albert" units in about 4000 sec, or 1.1 hours, per unit per core. That means in 24 hours it should complete 2*(24/1.1) = 43.6 units with both cores running.

I think there might be some problem with the allotment calculation at E@H that assigned a maximum of 32 units per day.

I can delete the installation & re-install if that is the only fix.

[This computer is named XBOX...]

ADDMP

Looking at the scheduler logs (available by following on-line links) I see that your computer has:

2006-01-21 23:40:03.0533 [PID=19068] [debug   ] CONTENT_LENGTH=4514 
2006-01-21 23:40:03.1788 [PID=19068] [normal  ] Handling request:   host 522041, platform i686-pc-linux-gnu, version 5.2.13, RSF 1.000000
2006-01-21 23:40:03.1788 [PID=19068] [normal  ] OS version Linux 2.6.13-15-smp
2006-01-21 23:40:03.1876 [PID=19068] [debug   ] Request [HOST#522041] Database [HOST#522041] Request [RPC#0] Database [RPC#0]
2006-01-21 23:40:03.1884 [PID=19068] [normal  ] Processing request  [HOST#522041]  [RPC#0] core client version 5.2.13
2006-01-21 23:40:03.5179 [PID=19068] [debug   ]   Result is on [HOST#522041]: r1_0148.5__190_S4R2a_2
2006-01-21 23:40:03.5180 [PID=19068] [debug   ]   Result is on [HOST#522041]: r1_0148.5__189_S4R2a_1
2006-01-21 23:40:03.5180 [PID=19068] [debug   ]   Result is on [HOST#522041]: r1_0148.5__188_S4R2a_1
2006-01-21 23:40:03.5190 [PID=19068] [normal  ]   [HOST#522041] got request for 1831.035555 seconds of work; available disk 16.098672 GB


So really the question in my mind is, why doesn't this machine have a LOT more results on it? And why isn't it requesting more than 1800 seconds of work?
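To make the size of that request concrete, here is a minimal Python sketch that pulls the requested seconds out of the last log line above and compares it with the roughly 4000 s per unit reported earlier in the thread. The regular expression is written only against the line format shown here, and the function name is hypothetical.

```python
import re
from typing import Dict, Optional

# Matches scheduler lines of the form shown in the log excerpt above.
REQUEST_RE = re.compile(
    r"\[HOST#(?P<host>\d+)\] got request for (?P<seconds>[\d.]+) seconds of work; "
    r"available disk (?P<disk_gb>[\d.]+) GB"
)


def parse_work_request(line: str) -> Optional[Dict[str, float]]:
    """Return host id, requested seconds and free disk (GB), or None if no match."""
    match = REQUEST_RE.search(line)
    if match is None:
        return None
    return {
        "host": int(match["host"]),
        "seconds": float(match["seconds"]),
        "disk_gb": float(match["disk_gb"]),
    }


LINE = (
    "2006-01-21 23:40:03.5190 [PID=19068] [normal  ]   [HOST#522041] "
    "got request for 1831.035555 seconds of work; available disk 16.098672 GB"
)

request = parse_work_request(LINE)
if request is not None:
    # ~4000 s per unit is the run time reported by the original poster.
    print(f"requested {request['seconds'] / 4000:.2f} units' worth of work")
```

The request covers less than one such unit, which is why so small a request looks odd for a host this fast.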

PS: one of your Intel boxes is reporting results with zero CPU time. Consider updating BOINC to fix this problem.

Director, Einstein@Home

ADDMP
Joined: 25 Feb 05
Posts: 104
Credit: 7332049
RAC: 0

RE: No it's nothing wrong

Message 24373 in response to message 24371

Quote:
No, there's nothing wrong with your install. The problem is that, unlike Einsteins, which were fairly consistent in size, Alberts can vary by a factor of 4, and the scheduler doesn't take the difference into account when handing out work. You're not the first person to have problems with a really fast machine and short Alberts; some high-end Macs have been burned as well. For the moment the best you can do is to add a 2nd project and set the work distribution to 99/1. The 2nd project will then (almost) only run when you're out of work for E@H, and a starvation attack here will put it far enough ahead that the 2nd project won't do anything at all for a time.

Thanks for that news & suggestion. I had some trouble getting BOINC to allow me to run two E@H versions at once, but I now have both a linux/wine/windows version and a native linux version running simultaneously at different levels of "nice"ness. That should be OK, but I'll see what happens as they return results.

ADDMP

ADDMP
Joined: 25 Feb 05
Posts: 104
Credit: 7332049
RAC: 0

RE: RE: I think there

Message 24374 in response to message 24372

Quote:
Quote:

I think there might be some problem with the allotment calculation at E@H that assigned a maximum of 32 units per day.

I can delete the installation & re-install if that is the only fix.

[This computer is named XBOX...]

ADDMP

Bruce Allen wrote:
Looking at the scheduler logs (available by following on-line links) I see that your computer has:
[code]...

So really the question in my mind is, why doesn't this machine have a LOT more results on it? And why isn't it requesting more than 1800 seconds of work?


Sorry, I can't interpret the info you listed, but very likely you are looking at logs from after I did a lot of tinkering trying to force the computer to get more units. It is now running both the native Linux & the Windows versions of BOINC simultaneously. The native Linux version is niced-out to a much lower priority, & should not need many units.

When it was running straight linux/wine/windows, it usually had about 8 units simultaneously either waiting to run, running, or waiting to return.

But nevertheless, I think if you check its completed results, they were running in about 4000 seconds, & that is about 43 units a day, but it was restricted to receiving only 32 units a day.

Quote:
One of your Intel boxes is reporting results with zero CPU time. Consider updating BOINC to fix this problem.

Thanks, I'll check it. I am running the linux/wine/windows version on most boxes & I have not been able to get that working with the newer BOINCs. So it is a trade-off between running slower with the native Linux version & getting occasional glitches with wine.

I might convert the Intel boxes back to native Linux, since that version was more efficient with Intel than with Athlons.

ADDMP

Bruce Allen
Moderator
Joined: 15 Oct 04
Posts: 1119
Credit: 172127663
RAC: 0

RE: But nevertheless, I

Message 24375 in response to message 24374

Quote:
But nevertheless, I think if you check its completed results, they were running in about 4000 seconds, & that is about 43 units a day, but it was restricted to receiving only 32 units a day.


I've bumped up the per cpu quotas by another factor of two. Let's see if that fixes this problem.

Director, Einstein@Home

AnRM
Joined: 9 Feb 05
Posts: 213
Credit: 4346941
RAC: 0

Thanks, Bruce, for increasing

Thanks, Bruce, for increasing the MDQ to 32.....our dual core and faster 64s will be happy again! I can also increase the E@H share on these machines back to the pre-'Albert' levels. Tweakster will be happy and warm as well!....Cheers, Rog.

Michael Roycraft
Joined: 10 Mar 05
Posts: 846
Credit: 157718
RAC: 0

RE: I've bumped up the per

Message 24377 in response to message 24375

Quote:
I've bumped up the per cpu quotas by another factor of two. Let's see if that fixes this problem.

OMG, Sir, no, tell me you didn't! At least not without implementing an "abort queued work at uninstall" bit, AND returning to a replication of 4. Already, we have a huge and increasing quantity of "pendings" due to the normal numbers of project dropouts and eyes-bigger-than-their-bellies WU hoggers. Your databases will be bloated, and it will not be an unusual case to have folks waiting a month or more for pending credits to be resolved. This will be a headache of epic proportions, and even if quickly reversed, the "hangover" will last for a month.

Dr. Allen, the "too fast, not enough work" problem only affected a (relative) handful of the fastest hosts, crunching the shortest WUs, less than 3% (I would guesstimate) and was safely solved by adding a backup project with minimal share. It was at most temporary, as you said that the "shorties" have been nearly depleted. This 32/day quota will affect most of the balance of crunchers, and new complaints will increase exponentially. It ain't gonna be pretty!

Respectfully,

Michael

edited for typos

microcraft
"The arc of history is long, but it bends toward justice" - MLK

Wurgl (speak^Wcrunching for Special: Off-Topic)
Joined: 11 Feb 05
Posts: 321
Credit: 140550008
RAC: 0

RE: OMG, Sir, no, tell me

Message 24378 in response to message 24377

Quote:
OMG, Sir, no, tell me you didn't! At least not without implementing an "abort queued work at uninstall" bit, AND returning to a replication of 4. Already, we have a huge and increasing quantity of "pendings" due to the normal numbers of project dropouts and eyes-bigger-than-their-bellies WU hoggers.

What do you like more? A box which is sitting idle waiting for the next day to get more work or pendings which cause excellent cobblestones somewhen later?

I do not care about the pendings, they are as good as my money in the bank. Somewhen I will get all of them. Michael, be patient, time works for you :-)

Michael Roycraft
Joined: 10 Mar 05
Posts: 846
Credit: 157718
RAC: 0

RE: What do you like more?

Message 24379 in response to message 24378

Quote:
What do you like more? A box which is sitting idle waiting for the next day to get more work or pendings which cause excellent cobblestones somewhen later?

Again, no excuse for an idle box, except stubbornness. Repeat the mantra: secondary project, minimal share, secondary project, minimal share. That, exactly, is what BOINC was designed to do. It doesn't require a Nostradamus to foresee the upcoming flood of unhappy participants. Where we once could see the light at the end of the tunnel (depletion of short WUs), now the light is transformed into a diesel truck hauling a double-wide trailer home. Is it you few who are going to personally handle the whole flood of complaints? How selfish can a few people be, to cause problems for the majority instead of using the designed-in solution?

Quote:
I do not care about the pendings, they are as good as my money in the bank. Somewhen I will get all of them. Michael, be patient, time works for you :-)

I'm sorry, but I don't have time to be patient. I'm on a rather short "deadline" (bad pun) myself.

Respects,

Michael

edited - to soften the tone
edit - reference to personal difficulties deleted

microcraft
"The arc of history is long, but it bends toward justice" - MLK

AnRM
Joined: 9 Feb 05
Posts: 213
Credit: 4346941
RAC: 0

Michael, I think you should

Michael, I think you should have more faith in the project Admins on their MDQ change. I'm sure they have done the equivalent of a 'cost/benefit' analysis on this, and they realise that more and more dual-core/faster boxes are coming on line and that they will have to adjust the MDQ sooner or later. Yes, you may have to wait longer for a 'problem' WU, but they are still 'money in the bank'. IMHO it will not solve the problem of people attaching and leaving with WUs unprocessed. The delay in validation has already increased with the shift to 14 days and the initial replication reduction to 3. I think the MDQ has some effect but is not the critical problem....Cheers, Rog.
