Segregation of hosts according to “speed”

Ziran
Joined: 26 Nov 04
Posts: 194
Credit: 339738
RAC: 1036
Topic 190493

With the new Albert application we have WUs whose runtimes vary from 25-100% of an “ordinary” Einstein WU. If a fast host gets short WUs, it will run into the maximum daily WU quota. To reduce this problem the project has increased the maximum daily WU quota. The quota is there to stop hosts with problems from downloading more work than they could possibly handle. Increasing the quota means that more work is sent out to a host with problems before it can be stopped.

From what I understand, the “length” of the WU depends on the frequency, and all WUs in a data set are on about the same frequency.

My suggestion is that the data sets containing longer WUs are sent to fast hosts and shorter WUs to slower hosts. What’s needed is a rough estimate of the speed, so the floating-point speed or integer speed would do as an indicator of the host’s speed.

This would reduce the number of fast hosts reaching the maximum daily WU quota, making it possible to set the quota to a lower number. It would also reduce the turnaround time for the slower hosts and reduce the number of results returned late.
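To make the idea concrete, here is a minimal sketch in Python of pairing hosts with data sets by speed. It is not how the E@H scheduler actually works; the names `Host`, `DataSet`, `fpops_benchmark`, `relative_runtime` and `pair_by_speed` are all made up for illustration, and it assumes the server already has a benchmark figure for each host and a rough relative runtime for each data set.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    fpops_benchmark: float  # hypothetical: benchmarked floating-point speed


@dataclass
class DataSet:
    name: str
    relative_runtime: float  # hypothetical: 0.25 to 1.0 of an "ordinary" WU


def pair_by_speed(hosts, data_sets):
    """Pair the fastest hosts with the longest-running data sets."""
    # Sort both lists from largest to smallest, then match them up in order.
    hosts_sorted = sorted(hosts, key=lambda h: h.fpops_benchmark, reverse=True)
    sets_sorted = sorted(data_sets, key=lambda d: d.relative_runtime, reverse=True)
    return {h.name: d.name for h, d in zip(hosts_sorted, sets_sorted)}
```

Only the ordering of the benchmark figures matters here, which is why a rough estimate of the speed is enough.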

When you're really interested in a subject, there is no way to avoid it. You have to read the Manual.

Ananas
Joined: 22 Jan 05
Posts: 272
Credit: 2500681
RAC: 0

Segregation of hosts according to “speed”

Speed isn't the only thing to consider.

I prefer long WUs especially on those PCs that are behind a firewall because the firewall doesn't allow multiple connects with the same account - with a penalty timeout if I try it.

Fast WUs make the BOINC boxes connect more often which increases the chance to get those timeouts.

earlboy
Joined: 1 Nov 05
Posts: 5
Credit: 7196033
RAC: 0

RE: My suggestion is that

Quote:

My suggestion is that the data sets containing longer WUs are sent to fast hosts and shorter WUs to slower hosts. What’s needed is a rough estimate of the speed, so the floating-point speed or integer speed would do as an indicator of the host’s speed.

I agree, this just happened to me today. My fastest box just got a bunch of these short WUs, while my slower PCs are crunching the longer WUs. The result? The fastest PC is now idle because it reached its maximum daily quota, while the others are still crunching. If the short WUs had been sent to my slower boxes, none of my computers would be idle today.

MarkF
Joined: 12 Apr 05
Posts: 393
Credit: 1516715
RAC: 0

How about mixing the WUs so

How about mixing the WUs so that each consecutive WU is a different frequency/speed?
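Something along these lines is what I have in mind, as a rough Python sketch. The function name `interleave_by_frequency` and the grouping of WUs into per-frequency lists are assumptions made for illustration, not anything taken from the E@H server code.

```python
from itertools import zip_longest


def interleave_by_frequency(groups):
    """Round-robin over per-frequency WU lists so that neighbours differ."""
    mixed = []
    # Take one WU from each frequency group in turn; shorter groups run out
    # first and are padded with None, which is skipped.
    for batch in zip_longest(*groups.values()):
        mixed.extend(wu for wu in batch if wu is not None)
    return mixed
```

If one frequency group is much larger than the others, its leftover WUs still end up next to each other at the tail of the list, so this only mixes things as far as the supply allows.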

Michael Roycraft
Joined: 10 Mar 05
Posts: 846
Credit: 157718
RAC: 0

Hello, A week or more ago,

Hello,

A week or more ago, I posted to Dr. Allen, suggesting this very idea, and he assured me that the tiny WUs were nearly exhausted. I'm seeing some of them on my rig now.
Easy solution is to choose another project (not CPDN, for obvious reasons, and I've heard that LHC has no work at the moment), give it a 1% resource share, and it will keep your rigs from going idle while never being crunched while you have E@H work, except when it nears deadline.
I know, Mark, your sentiments against being pure Einstein, but we from the Wolverine state have always been adaptable, right?

Respects,

Michael

microcraft
"The arc of history is long, but it bends toward justice" - MLK

MarkF
Joined: 12 Apr 05
Posts: 393
Credit: 1516715
RAC: 0

Michael: Please consider the

Michael:
Please consider the suggestion on its merits. I think mixing the WUs up would resolve the problem and require few changes in the code and/or procedures that support E@H.
To answer your question I am very adaptable IMHO. I do not expect the devs to drop everything to resolve a problem that in the overall is costing the project very little. But that does not keep me from suggesting alternatives to a change in E@H that is lowering productivity.

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5779100
RAC: 0

RE: and I've heard that LHC

Message 23054 in response to message 23052

Quote:
and I've heard that LHC has no work at the moment


Better get a hearing aid, Michael, and hurry over to LHC. ;)

Up, 408392 workunits to crunch
44880 workunits in progress
45 concurrent connections

All 1,000,000 turns though.

Michael Roycraft
Joined: 10 Mar 05
Posts: 846
Credit: 157718
RAC: 0

RE: Michael: Please

Message 23055 in response to message 23053

Quote:
Michael:
Please consider the suggestion on its merits. I think mixing the WUs up would resolve the problem and require few changes in the code and/or procedures that support E@H.
To answer your question I am very adaptable IMHO. I do not expect the devs to drop everything to resolve a problem that in the overall is costing the project very little. But that does not keep me from suggesting alternatives to a change in E@H that is lowering productivity.

Mark,

I do think the idea has merit. I conceived the notion independently last week and went so far as to propose it to Bruce. I'm just another cruncher and volunteer on the helldesk, remember, and as such I hold no more influence in developers' decisions than anyone else. The alternative I suggested here earlier is just a way that we ourselves can solve the problem of idle machinery, for the moment, an easy way to take matters into our own hands. :-)

Michael
edited for punctuation
and for this:

in the previous post I wrote "your sentiments against being pure Einstein", whereas what I meant to say was your sentiments in favor of...

microcraft
"The arc of history is long, but it bends toward justice" - MLK

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5779100
RAC: 0

RE: I'm just another

Message 23056 in response to message 23055

Quote:
I'm just another cruncher and volunteer on the helldesk, remember, and as such I hold no more influence in developers' decisions than anyone else.


You hang around me too much. :-)

Now, as for the actual question/problem, BOINC was intended to do this. It may still be going to do this on certain projects, but that's for the future. BOINC isn't mature yet, let it grow.

Quote:
From what I understand, the “length” of the WU depends on the frequency, and all WUs in a data set are on about the same frequency.


Maybe it is for this project, but not for all projects. A Seti unit can go fast or slow, depending on the elevation angle it was taken from. Ingleside has a good list of them for the new Seti_Enhanced application, here. The deadline of the unit will change accordingly, though.

MarkF
Joined: 12 Apr 05
Posts: 393
Credit: 1516715
RAC: 0

Michael: I knew what you

Michael:
I knew what you meant and thank you for the response. Yours is the first response to the idea since I posted it back in the 4th message. Keep up the good work.

SwampMidget
Joined: 26 Dec 06
Posts: 2
Credit: 326
RAC: 0

Please excuse if this

Please excuse me if this question is a little unclear, as I'm not as physics savvy as you guys. I'm very enthusiastic to play even a small part by providing my computer's down time, but I'm not even clear on what the proggy is processing LOL! I think it would help the newbie users' interest level if there could be some indication made on the client side when something out of the ordinary is found within their data.

