The coming distributed processing explosion

Alex
Joined: 1 Mar 05
Posts: 449
Credit: 377,965,681
RAC: 28

Hello

Message 99618 in response to message 99617

Hello Ageless,

Quote:


Yet there's a problem here. Seti Enhanced tasks are 350KB, whereas Astropulse are 8MB. Seti has got one 100Mbit connection into the lab, so when they allow Astropulse tasks to go out, they can send 1 at a time with a second slowly filtering through on the rest of the bandwidth. One AP takes up 64Mbit of the connection already. Most people out there getting the tasks in don't even have a speedy connection at 100Mbit. They may have it on their personal network, but not their internet connection, so as long as two computers are downloading Astropulse from Seti, all the other 249,511 that may be knocking on the door at that time will have to wait: saturated bandwidth.

Before you say, increase the feeder, or add another feeder... it won't help as they only have one 100Mbit connection into the lab that connects to all hosts out there. The Space and Sciences Lab just had a 1,000Mbit connection laid up the mountain, yet Seti is still only getting 100Mbit off of that. For them to lay their own Gigabit connection down the mountain would cost approximately 60,000 dollars. Money they do not have.
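The saturation numbers in the quote can be checked with simple arithmetic. A rough sketch (ignoring protocol overhead, so real throughput would be a bit lower):

```python
# Back-of-envelope check of the bandwidth figures quoted above; the link
# speed and task sizes come from the quote, the rest is arithmetic.
LINK_MBIT = 100.0               # lab uplink: 100 Mbit/s
AP_TASK_MBIT = 8 * 8.0          # an 8 MB Astropulse task = 64 Mbit
SETI_TASK_MBIT = 0.350 * 8.0    # a 350 kB SETI Enhanced task = 2.8 Mbit

# How many tasks per second can the link push out at full saturation?
ap_tasks_per_s = LINK_MBIT / AP_TASK_MBIT
seti_tasks_per_s = LINK_MBIT / SETI_TASK_MBIT
print(ap_tasks_per_s, seti_tasks_per_s)
```

So the link can move only about one and a half Astropulse tasks per second, versus roughly 35 Enhanced tasks, which is exactly why two concurrent Astropulse downloads saturate it.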

Quote:
MW has a deadline of more than a week, so I assume you are referring to the three-day outage of the SETI servers.

Stop assuming things. I was talking about their old deadline. I haven't run MW in a long time; I only ran a test on my ATI HD4850 a couple of months ago. But even a 7-day deadline isn't very useful if you are allowed to download 5,000 to 10,000 tasks on the CPU. Not unless those tasks take no more than 60 seconds apiece.

The theme here is 'The coming distributed processing explosion'.

Some people posted problems specific to one project.
I posted how other projects have solved their specific problems. And I also posted that, in my mind, there is always a way to solve a problem.
I did ask whether a solution from another project could be used at SETI. To be exact, I posted
'Would that help?' or
'Is it really necessary ...'

What I earned:

Quote:
Madness (personal opinion).

This is not my way of life. I do it differently.
I tried to zip a result file from SETI. The result: 367 kB unzipped, 268 kB zipped (7-Zip). I'm sure that someone may find a different algorithm to get better compression. And I'm also sure that someone can explain what that means for the given bandwidth. And I'm also sure that programmers know how to do this zipping/unzipping automatically.
Maybe they are already doing that, but I never found any hint of it. Or I simply didn't recognize it, just as you did not know the 8-day deadline from MW or the very common server message: 'No work sent. Reached limit of 48 tasks in progress' (i7, 2 GPUs).
But please keep in mind, this is a form of brainstorming, not an attempt to be smarter than the developers.
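The zip experiment is easy to automate in software; a minimal sketch using Python's zlib (the sample data here is invented, so its ratio will not match the 268/367 kB result above):

```python
import zlib

def compression_ratio(data: bytes, level: int = 9) -> float:
    """Compressed size as a fraction of the original size."""
    return len(zlib.compress(data, level)) / len(data)

# Hypothetical stand-in for a result file; real SETI result files are
# XML-like text, which is exactly the kind of data that compresses well.
sample = b"<peak>1.2345e-21</peak><chirp>4.2e-3</chirp>\n" * 8000
ratio = compression_ratio(sample)
print(f"compressed to {ratio:.1%} of original size")
```

The same call pair (compress on the client, decompress on the server) is all the automation that would be needed, at the price of a little CPU time on both ends.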

Quote:
Uhm... make an educated guess.


No more comment on that.

Quote:
Assume something. You're good at doing that.


I take this as a compliment.

Regards

Alexander

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 769
Credit: 301,031,726
RAC: 380,180

Well.. it's trivial that a

Well... it's trivial that a project which needs to transfer large amounts of data also needs an appropriate internet connection. That's also a problem which is trivial to solve if you've got the money. Otherwise you could obviously compress your data (I remember people suggesting this back in the days of SETI Classic; I'm not sure if it's being done in the current SETI project). The up-/downloads of some files might also be distributed among different locations, as Einstein does.

Over at Milkyway it's not madness.. at least not the part which has been discussed here. Their project setup is totally different from SETI or Einstein. They've got a function which describes the difference between the observed milkyway structure and a calculated structure. The latter is derived from a bunch of parameters. Their goal is to find the real structure by searching through the parameter space.
Such a search requires many function evaluations, which are costly. That's what we're doing (each WU contains 3 separate function calls now, if I remember correctly). It also requires collecting results rather quickly and deciding which steps to take next, i.e. which parameters will probably yield a better solution. They can't send out new work before previous results get back. They've got many different searches going at a time; otherwise they couldn't keep our machines busy.

That's why each host is only allowed a rather small number of concurrent WUs there - they need results back. And that's why you'll never see large amounts of WUs ready to send. The SETI WUs are generated in a different way (basically data splitting) and there it doesn't matter when results get back, as long as the database can take it.

If SETI can't change the size of the data packets (flexible programming anyone?) then they could still pack multiple data packets into one BOINC WU to reduce database load. Old PCs wouldn't like it, so one could just as well vary the amount of work within a WU and either hand them out based on host speed or create a user preference similar to Rosetta@Home, where users can choose how long their WUs should run.

Anyway, a project has to deal with growing traffic if it's supporting GPUs, with their very large range of possible host speeds as well as the rapidly increasing speed of individual hosts. IMO, adapting the amount of work per WU (not completely arbitrarily, just allowing some sane steps) is a really nice way to keep the server load manageable. I think BOINC works best if WUs take >1 h and hosts only have to contact the server every couple of hours at most.
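The "sane steps" idea could look something like this; the step list, per-packet cost model and target runtime below are all hypothetical numbers, not anything a real scheduler uses:

```python
# Hypothetical sketch: pick how many data packets to bundle into one WU
# so that each host runs for roughly a target time.
ALLOWED_PACKETS = [1, 2, 4, 8, 16]   # sane steps, not arbitrary sizes
PACKET_GFLOP = 720.0                 # assumed compute cost per data packet
TARGET_RUNTIME_S = 2 * 3600.0        # aim for ~2 h per WU

def packets_for_host(host_gflops: float) -> int:
    """Largest allowed step that stays within the target runtime."""
    seconds_per_packet = PACKET_GFLOP / host_gflops
    best = ALLOWED_PACKETS[0]
    for n in ALLOWED_PACKETS:
        if n * seconds_per_packet <= TARGET_RUNTIME_S:
            best = n
    return best

print(packets_for_host(0.05))  # old PC: smallest WU
print(packets_for_host(1.0))   # modern CPU core
print(packets_for_host(20.0))  # fast GPU: largest WU
```

A user preference (as on Rosetta@Home) could simply override TARGET_RUNTIME_S; either way the server hands out far fewer, larger WUs to fast hosts.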

MrS

Scanning for our furry friends since Jan 2002

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6,208
Credit: 135,987,005
RAC: 66,468

RE: But hopefully we can

Message 99620 in response to message 99616

Quote:
But hopefully we can get the answers. It's the basis for Matt's and my next investment.


Alex, you really should read the threads that you wish to contribute to, and any answers given. The content of your posts reveals that you actually haven't done either. If you had, then you'd already know that the answers aren't yet known. No one here is your personal secretariat. Specifically: when the developers get the time to work further on the CUDA stuff and produce releases for testing and comment, you will know. Keep your money in your wallet for now.

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter. Blaise Pascal

Matt Giwer
Joined: 12 Dec 05
Posts: 144
Credit: 6,891,649
RAC: 0

RE: People can do what they

Message 99621 in response to message 99607

Quote:
People can do what they want with their money. Here in Italy they buy SUVs just to go to the supermarket instead of crossing the Sahara. So do people at SETI@home buying Fermi cards. I wonder if they ever read 'The Theory of the Leisure Class' by Thorstein Veblen. It was one of the first books they made me read at Thomas Jefferson School in St. Louis back in the Fifties. Conspicuous Consumption was the term used by Veblen.
Tullio

I couldn't care less what people do with their money either. But if the choice is between squeezing 20-30% more work out of the hottest computer and GPUs on the market, or 200% more work by getting two computers for the same money, the obvious choice is the latter.

tullio
Joined: 22 Jan 05
Posts: 2,076
Credit: 45,831,521
RAC: 28,774

Perhaps the best way to judge

Perhaps the best way to judge a computer would be to know its Mflops/watt performance. See:
GreenList
Tullio

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 769
Credit: 301,031,726
RAC: 380,180

No: purchase cost and usable

Message 99623 in response to message 99622

No: purchase cost and usable lifetime also count.

Consider this: dynamic, non-leakage power consumption of CPUs increases with V^2, whereas maximum frequency scales approximately with V^1. That means you gain efficiency by reducing the voltage. But how far should you go? Judging just by performance per watt, you should run your mighty 3 GHz i7 (or whatever modern CPU) at the lowest voltage at which it can still work (probably between 0.7 and 0.9 V). The resulting frequency will be a few hundred MHz.

That'll give you maximum CPU efficiency in terms of performance per watt, but you're not getting much work done for your investment. If you factor in the entire system power consumption, the optimum working point shifts to higher voltage & frequency, because you're adding a constant baseline power draw.

If you consider that you'll only run this CPU a couple of years and that buying it cost you the same, no matter how fast you run it, you might want to clock it even higher. The question then becomes: how much power (=money) is more work done worth to you?
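The shift of the optimum can be seen in a toy model; the constants below are invented for illustration, not measurements of any real CPU:

```python
# Toy model of the voltage/efficiency argument above.  Dynamic CPU power
# scales ~ V^2 * f, and with f ~ V that gives power ~ V^3 while
# performance ~ V.  K and P_BASE are made-up numbers.
K = 40.0        # W of CPU dynamic power at V = 1.0
P_BASE = 100.0  # W constant baseline draw of the rest of the system

def efficiency(v: float) -> float:
    """Work per joule (arbitrary units): performance / total system power."""
    return v / (P_BASE + K * v**3)

# Sweep 0.70 V .. 1.30 V: with a baseline draw the optimum sits at a
# finite voltage instead of "the lower the better".
volts = [0.7 + 0.05 * i for i in range(13)]
best = max(volts, key=efficiency)
print(f"most efficient point in sweep: {best:.2f} V")
```

With P_BASE set to zero the same sweep would always prefer the lowest voltage, which is exactly the trap of judging the CPU in isolation.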

MrS

Scanning for our furry friends since Jan 2002

archae86
Joined: 6 Dec 05
Posts: 3,008
Credit: 4,843,969,390
RAC: 3,361,935

RE: If you factor in the

Message 99624 in response to message 99623

Quote:
If you factor in the entire system power consumption the optimum working point shifts to higher voltage & frequency points, because you're adding a constant baseline power draw.


Agreed on that part (though an earlier part is non-factual: the minimum operating voltage stops going down with frequency at a rather higher frequency than you think, at least on the parts where I've had occasion to generate or see the data).

A businessman or economist would not have much trouble saying that to first order you want to get the minimum cost per unit lifetime work generated, using for cost the present discounted value of all costs, including purchase price, power consumption, maintenance (including your own time fiddling), and applicable charge for space consumed, annoyed spouse...
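That first-order criterion can be sketched numerically; the discount rate and all cost/work numbers below are invented for illustration only:

```python
# Sketch of "minimize the present discounted value of all costs per unit
# of lifetime work".  The discount rate and host numbers are invented.
DISCOUNT = 0.05   # assumed annual discount rate

def pdv(purchase: float, annual_cost: float, years: int) -> float:
    """Purchase price now, plus discounted running costs for each year."""
    return purchase + sum(annual_cost / (1.0 + DISCOUNT) ** t
                          for t in range(1, years + 1))

def cost_per_work(purchase: float, annual_cost: float,
                  years: int, work_per_year: float) -> float:
    return pdv(purchase, annual_cost, years) / (years * work_per_year)

# An efficient new box vs. keeping an already-paid-for old one running
# (power, space, fiddling time all folded into annual_cost):
new_box = cost_per_work(900.0, 250.0, 4, 100.0)
old_box = cost_per_work(0.0, 300.0, 4, 25.0)
print(new_box, old_box)
```

Even with a zero purchase price, the old box can lose on cost per unit of work once its running costs are spread over its much smaller output.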

Quote:
If you factor in the entire system power consumption the optimum working point shifts to higher voltage & frequency points...


So true. I have purring behind me a brand new build. While it is to be my daily driver and my audio hobby machine, I meant it to be a good and efficient BOINC contributor as well. To that end I specifically wanted a 32nm Westmere, as that process is reputed to benefit from a more serious attempt to reduce leakage current, and generally that product/design combination is thought to have especially high computational output per watt. It is not a bad machine, but not a great one, as I failed to consider just how high-power my motherboard and graphics choices might be. The idle power of the box at default CPU parameters is 100 watts. So I think my best plan is to run it at an appreciably over-spec clock rate at slightly over nominal voltage, and save back the power cost by relegating one E6600 machine to being turned on only when used, rather than 24/7. I don't think any X58 chipset motherboard is even moderately low-power. Six months from now I suspect a consumer-oriented Sandy Bridge CPU with no added graphics card on a consumer motherboard would do a very nice job of filling the role I tried to build for. (I don't game.)

I think for many people posting here, turning off their least power-efficient machine would be the single greenest thing they could do easily. My E6600 may well be above the fleet median on compute/power. But it is very, very far below my new E5620, even with the higher than expected motherboard, graphics card, and CPU leakage power.

Matt Giwer
Joined: 12 Dec 05
Posts: 144
Credit: 6,891,649
RAC: 0

RE: Perhaps the best way to

Message 99625 in response to message 99622

Quote:
Perhaps the best way to judge a computer would be to know its Mflops/watt performance. See:
GreenList
Tullio

Problem with those lists is that they take the rated wattage of the power supply, not what the machine actually draws. Spinning up HDs takes more peak watts than keeping them running, for example. Actually measuring the wattage drawn in normal operation is required.

In winter the heat is good and saves money, while in summer it requires more A/C. In your climate, is that a break-even?

A long time ago I did a quick calculation, copying those charity ads: 'For 17 cents a day you can help ET find a friend in this lonely universe.' I won't swear to that number, but 17 * 365 isn't that much in comparison to the cost of the computer. A monitor that goes into standby is much more important if you leave the machine on 24/7 to save on replacement cost.
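The 17-cents figure is plausible for reasonable inputs; a quick sketch (the 100 W draw and 7 cents/kWh rate are assumptions, not measured values):

```python
# Quick plausibility check of "17 cents a day": steady draw in watts,
# times 24 hours, times an assumed electricity rate.
def daily_cost(watts: float, price_per_kwh: float) -> float:
    return watts / 1000.0 * 24.0 * price_per_kwh

# A ~100 W box at an assumed 7 cents/kWh lands right on the quoted figure.
print(round(daily_cost(100.0, 0.07), 2))        # dollars per day
print(round(daily_cost(100.0, 0.07) * 365, 2))  # dollars per year
```

Plugging in a metered wattage and the local rate turns this into the "actually measuring" approach suggested above.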

On top of that, Mflops do not translate from CPU to GPU very easily.

===

There are always two separate items: the cost of acquisition and the cost of ownership. For myself, I get a new computer every three years or so and migrate the previous one to crunching, but a six-year-old computer's performance is relatively so slow that it is not worth running. So more than a single old computer is a waste of electricity. The estimated cost of ownership is therefore the sum over six years, and that does get fairly close to the acquisition cost.

tullio
Joined: 22 Jan 05
Posts: 2,076
Credit: 45,831,521
RAC: 28,774

I am running a SUN WS 24/7

Message 99626 in response to message 99625

I have been running a SUN WS 24/7 since January 2008 and consume 5 or 6 kWh/day. At 16 eurocents/kWh that is less than 1 euro/day. I have no air conditioning and no dishwashing machine, only a washing machine, a TV set, two stereos and an occasional lawn mower. Lamps are all high-efficiency bulbs. Heating is by methane gas, and that is my biggest expenditure. I have a 4.8 Mbit ADSL connection out of a nominal 7 and have bought a UPS just to protect the PC running SuSE Linux. Before the SUN with its Opteron 1210 at 1.8 GHz, I used a 400 MHz PII for eight years and then gave it to my son so that his son could enjoy MS Flight Simulator. On Linux I only have an ACM combat simulator. I am running 6 BOINC projects and never miss a deadline. But now SETI is down and LHC is not giving any work to Linux boxes. I am waiting for the CERN-VM virtual machine. I was also running an OpenSolaris guest on top of VirtualBox with a SETI@home app by Dotsch on it, but it was very slow. Cheers.
Tullio

tolafoph
Joined: 14 Sep 07
Posts: 122
Credit: 74,642,612
RAC: 0

Hi, I just found a

Hi,

I just found a screenshot I took almost 3 years ago. The floating point speed was 66 TFLOPS.
