The coming distributed processing explosion

ExtraTerrestria...

Joined: 10 Nov 04

Posts: 770

Credit: 540280797

RAC: 133365

Just a side note: as far as I

2 Oct 2010 13:43:09 UTC

Message 99669 in response to message 99668

Just a side note: as far as I know, greenhouses work of course by capturing hot air, but the other main factor is capturing radiation: the visible surface of the sun is at about 6000K and emits with a spectrum quite close to that of a black body at that temperature (that's why we like to keep our monitors at ~6000K color temperature: because that's the "white" we're used to). This radiation reaches earth and parts of it are filtered out (by specific molecular resonances, and especially the UV is absorbed by the ozone layer etc.). The remaining spectrum is modified (called AM1, AM1.5 or something in the solar world, depending on your latitude), but basically still the same rather highly energetic / short wavelength one. Upon absorption within the greenhouse the absorbing body is raised in temperature and emits black body radiation itself (in fact it does so all the time, but when it's heated above ambient it loses energy to the surroundings via this radiation). However, the spectrum is different since the radiating body will be at approximately room temperature (300K). This radiation is centered far in the infrared, in the low energy / long wavelength region. If you make your glass reflect and/or absorb this light you effectively capture it and suppress another cooling mechanism.
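To put rough numbers on the two spectra, here is a minimal sketch using Wien's displacement law, which gives the peak wavelength of an ideal black body. The 6000K and 300K figures are the ones from the post; treating both the sun and the heated body as ideal black bodies is an assumption, and the function name is just for illustration.

```python
# Wien's displacement law: peak wavelength = b / T for an ideal black body.
WIEN_B = 2.897771955e-3  # Wien displacement constant in metre-kelvin


def peak_wavelength_nm(temperature_k):
    """Return the black-body peak wavelength in nanometres."""
    return WIEN_B / temperature_k * 1e9


for label, temp in (("sun's surface", 6000.0), ("room-temperature body", 300.0)):
    print(f"{label} ({temp:.0f} K): peak near {peak_wavelength_nm(temp):.0f} nm")
```

The 6000K peak lands in the visible range (a bit under 500 nm), while the 300K peak is roughly twenty times longer, near 10 micrometres, which is the far-infrared region the glass would need to reflect or absorb.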

MrS

Scanning for our furry friends since Jan 2002

Matt Giwer

Joined: 12 Dec 05

Posts: 144

Credit: 6891649

RAC: 0

RE: Just a side note: as

3 Oct 2010 8:23:17 UTC

Message 99670 in response to message 99669

Quote:

Just a side note: as far as I know, greenhouses work of course by capturing hot air, but the other main factor is capturing radiation: the visible surface of the sun is at about 6000K and emits with a spectrum quite close to that of a black body at that temperature (that's why we like to keep our monitors at ~6000K color temperature: because that's the "white" we're used to). This radiation reaches earth and parts of it are filtered out (by specific molecular resonances, and especially the UV is absorbed by the ozone layer etc.). The remaining spectrum is modified (called AM1, AM1.5 or something in the solar world, depending on your latitude), but basically still the same rather highly energetic / short wavelength one. Upon absorption within the greenhouse the absorbing body is raised in temperature and emits black body radiation itself (in fact it does so all the time, but when it's heated above ambient it loses energy to the surroundings via this radiation). However, the spectrum is different since the radiating body will be at approximately room temperature (300K). This radiation is centered far in the infrared, in the low energy / long wavelength region. If you make your glass reflect and/or absorb this light you effectively capture it and suppress another cooling mechanism.

MrS

Mea Culpa for the digression on heating. There is so much half-baked nonsense going around I thought it worth including.

So let me drop back to the major point, one I did not realize until I moved to Florida: how high the sun is in the sky. Before coming here I was in Washington, DC and Cincinnati, Ohio, which are at essentially the same latitude. It is a noticeable contributor even without thinking about it. The fact that shadows are shorter is noticeable. I didn't notice it when I was here on business, but as soon as I was outside every day I noticed something was different.

And I didn't notice the internal effect until the A/C was out for a few days. The place got incredibly hot. And then zone A/C leaves some rooms incredibly hot, hotter than I ever experienced back home even during A/C failures up north.

Even though the climate in Florida is better than in states further north because of the peninsula effect, the increase in population in Florida is a function of the popularity of air conditioning. If you look just at a Mercator projection map you don't realize just how much further south Florida is.

Matt Giwer

Joined: 12 Dec 05

Posts: 144

Credit: 6891649

RAC: 0

The thread got off topic last

1 Apr 2011 7:03:52 UTC

Message 99671

The thread got off topic last time. Hoping that will not happen again let me talk about the explosion again.

And again, NOT a bragging war. I am simply recounting what has been my "standard" setup since 2000. Everyone has their own standard configuration: your own computer, the wife's, the kids', whatever. I probably have no more than most people, and I have neither a wife who would never settle for a half-powered hand-me-down nor kids who insist upon something they can brag about or something fast enough for online games. What I have now is more like what I had when I had a wife and kids around the house. So NO bragging war, please. The cost savings from buying a few extravagant gifts for grandchildren instead of feeding, clothing, and housing children are so great that even with the computers I am ahead of the game.

My main "power" use for computers is rendering animations. For many years I kept my previous computers online, buying a new one whenever processor speed doubled. So it was an easy equation: CPU cycles were 1 + 1/2 + 1/4, and 1/8 was not worth the electricity, so buy the newest and retire the oldest. As rendering is a few days of real frame creation per month of scene development, first seti@home and later boinc became the obvious use for the unused cycles.
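The retire-the-oldest rule can be sketched as follows. This assumes each older generation runs at exactly half the speed of the one after it and that every machine draws about the same power; both are simplifying assumptions rather than measured figures, and the function name is hypothetical.

```python
from fractions import Fraction


def fleet_shares(n_machines):
    """Share of total cycles per machine, newest first, with speed halving per generation."""
    speeds = [Fraction(1, 2 ** age) for age in range(n_machines)]
    total = sum(speeds)
    return [speed / total for speed in speeds]


shares = fleet_shares(4)
for age, share in enumerate(shares):
    # Equal power draw per machine is an assumption, so each costs 1/n of the electricity.
    print(f"{age} generation(s) old: {float(share):.1%} of cycles, "
          f"{1 / len(shares):.0%} of the power")
```

Under those assumptions a fourth machine would add under 7% of the cycles while using a quarter of the electricity, which is the reasoning behind stopping at three.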

Computers last longer when running 24/7 than if turned on and off every day. And they last even longer if they run at a constant temperature, avoiding heat cycling, so boinc keeps them warm. Being noble and altruistic in contributing processing has a practical, Ayn Randian side.

About 18 months ago I put my first 4-core machine online, and since then the equation has been not processor speed but cores. Just this week I completed my three-computer setup with 4 cores each at 1.8, 2.6 and 3.4 GHz. So by my old rule of doubled clock I have an extra computer at 2.6, but I am back where I wanted to be with three computers to handle the workload. (And I have yet to up the quality of my rendering to match the available processing.)

As for the processing explosion I asked about: I have now gone from three single cores at 2.4, 1.5 and 0.8 GHz to the above setup. In roughly 18 months I have increased my boinc throughput by a factor of 12, keeping the original three-machine configuration but with state of the art machines. My target price per machine has always been about $500, and I only violated that with the fastest new machine at $700 -- but if I had waited just one month I would have spent $800 for a 6 core 4GHz machine, so cost per CPU cycle would have been even lower.
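As a rough cross-check of that jump, here is a sketch that counts only cores times clock speed for the two setups described. It deliberately ignores how much work each core does per clock cycle, so it is a lower bound under that assumption, not a measurement of boinc throughput.

```python
def core_ghz(machines):
    """Sum cores * clock (GHz) over a list of (cores, ghz) machines."""
    return sum(cores * ghz for cores, ghz in machines)


old_setup = [(1, 2.4), (1, 1.5), (1, 0.8)]  # three single-core machines
new_setup = [(4, 1.8), (4, 2.6), (4, 3.4)]  # three quad-core machines

ratio = core_ghz(new_setup) / core_ghz(old_setup)
print(f"old: {core_ghz(old_setup):.1f} core-GHz, new: {core_ghz(new_setup):.1f} core-GHz")
print(f"ratio from cores and clock alone: {ratio:.1f}x")
```

Cores and clock alone account for a bit under 7x; the rest of a factor of 12 would have to come from newer chips doing more work per clock cycle, which this sketch does not model.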

So the explosion has started as I predicted. Everyone who upgrades is going to be on this curve. It is now hard to find single core new machines. Four and more cores will become the norm as software expands to use the available processing just as work expands to fill the time available. Where 2 cores are the most common today it will be 4 cores by next year this time.

Now I can play server and bandwidth games as well as anyone not in the business of making it work. But with a normal upgrade cycle of three years, and given when 2 core machines first appeared, by next year this time 2+ cores should be the average, with a doubling every three years at the most. Looking at some of the literature, there appears to be a good chance of doubling the number of cores every three years as well as at least a 25% increase in clock speed per three years.
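A minimal sketch of what that growth rule implies per three-year upgrade cycle, assuming cores double and clock speed rises 25% each cycle and that work units scale perfectly across cores (reasonable for independent tasks, but still an assumption). The function name and defaults are illustrative only.

```python
def projected_throughput(cycles, core_growth=2.0, clock_growth=1.25):
    """Relative per-machine throughput after a number of three-year upgrade cycles."""
    return (core_growth * clock_growth) ** cycles


for cycles in range(4):
    print(f"after {3 * cycles} years: {projected_throughput(cycles):.2f}x per machine")
```

Under those assumptions each replacement machine returns about 2.5 times the results of the one it replaces, and the factor compounds with every cycle.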

That will not bother my bandwidth in the least even if it does not increase, though in fact it increases so fast the local cable company can't make TV ads fast enough to advertise the latest speeds.

I regularly read of problems at S@H with bandwidth and server limits, which is about what one would expect of the oldest and most popular project. Milkyway seems to have had such problems, but since they kicked me out for my politics I haven't been able to follow it up.

Without intent of ingratiation, einstein appears to be up to the current workload given the roughly constant pending credits, i.e. awarded credit appears like clockwork whereas S@H has to call a weekly two day holiday. Not criticizing just fact -- they hate criticism so don't let them read this. ;)

This is going to hit huge. Literally on the order of 4x the cores meaning 4x the results in three years. And the good news is the servers on the @home side will also have the same increase and are hopefully using multi-core Xeon chips with virtualization already. Everything on the server side has to increase to deal with this.

Hard drives are behind the curve at the moment. There should be 4TB drives at $100 by now but there are only a couple of 3TB drives near $200 available. Gbit LAN chips are the state of the art.

The big boys solve this by creating cloud computing. And they are big enough to be selling cloud services. I'm an old fart and I don't trust clouds either. But if all clouds would connect up and only sell their local management services and split other profits by hardware contribution I would be less skeptical -- think Redhat linux. If one goes down they all go down. It would become a planet-wide infrastructure. If it goes down, my loss is the least of the world's problems.

Anyway, just a heads up on what is not just my fevered imagination but in fact what I have done in the normal course of what I do without concern for any of the boinc projects.

And let's be clear: I am NOT a gamer. I am not buying CUDA cards just to increase my RAC. I am not buying computers just to increase my RAC -- not that I mind the bragging rights.

This is just to do what I do every day. It is what everyone is going to have just to do what they do every day very soon in terms of annual budgets and the five year plans government grant agencies love. Grant proposals for 2012 should have been completed months ago. You have already lost a year.

ExtraTerrestria...

Joined: 10 Nov 04

Posts: 770

Credit: 540280797

RAC: 133365

Hey Matt, actually I'm not

3 Apr 2011 12:19:15 UTC

Message 99672 in response to message 99671

Hey Matt,

actually I'm not sure what you're looking for, or what's left to discuss. For me the problem is quite clear:

- multi-processors and GPUs are increasing the available crunching power much faster than in the single core era, no doubt about this
- due to its inherently parallel "independent task"-based nature distributed computing can make perfect use of this power (in contrast to most other software)

The increase in theoretical performance does translate into real world performance for us. This puts a higher load onto the projects and servers. The limiting factors are:

1) internet bandwidth
2) database load

Point 1 is expensive, but straightforward to solve: buy a fat pipe, use your bandwidth efficiently (compress data, avoid sending duplicates) and if this is not enough distribute your load over several locations with their own connections. You've got to have the money to do this, otherwise you probably shouldn't try to run a project (or make it smaller).

Point 2 is more challenging, but not hopeless. First and most obvious: keep the WUs large enough so your BOINC database is not overwhelmed. You should always be able to do this by putting more work into each WU. Ideally you'd offer WUs with different amounts of work in them, so you can serve high end as well as low end CPUs. This becomes even more important with GPUs, where the spread in speed is even larger.
Using these techniques you should be able to handle almost any load on your BOINC database. Furthermore, by now there are goodies such as this available, offering 200k random IOPS for "a mere" $10k. That's Bitchin' fast storage, for sure :D
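A back-of-the-envelope sketch of why larger WUs help: the number of results the BOINC database has to handle per day is roughly the number of active cores divided by how long each WU keeps a core busy. The host counts and WU lengths below are made-up illustrative numbers, not figures from any project.

```python
def results_per_day(active_cores, wu_hours_per_core):
    """Approximate results returned per day if every core crunches around the clock."""
    return active_cores * 24.0 / wu_hours_per_core


# Hypothetical numbers: 100,000 cores today on 6-hour WUs.
today = results_per_day(100_000, 6.0)
# Four times the cores with unchanged WUs means four times the database traffic...
more_cores = results_per_day(400_000, 6.0)
# ...but putting four times the work into each WU brings it back down.
bigger_wus = results_per_day(400_000, 24.0)

print(f"today: {today:,.0f}  more cores: {more_cores:,.0f}  bigger WUs: {bigger_wus:,.0f}")
```

In this simple model the database load stays flat as long as WU size grows at the same rate as the available crunching power.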

And generally having more projects helps spread the load as well. One project which supports several applications or sub-projects could also distribute these among different physical servers for further load balancing.

Maybe the most problematic would then be the actual science database, which does not have to fully correspond to the BOINC database. The granularity of results will probably be different and results have to be kept much longer. But much less information per result is needed. And at some point the actual outcome of the results can be merged into something different (a solid archive rather than a database, e.g.). What I mean is this: once you have decided that your result is correct you no longer care who handed it in, or when it was sent out and returned. [For Einstein] All you need to save is "at which point in the sky did I find anything interesting at which date and frequency".

This is a storage problem fundamentally different from the BOINC database. You need to store lots of data, but less per WU / result. And you don't need realtime access and modifications to this, so you can archive it somewhere once you're done with it. How this is done in practice depends on each project and its requirements and capabilities.
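To illustrate the difference between the two kinds of record, here is a minimal sketch. Every field name is hypothetical and chosen only to mirror the description above (who handed it in and when, versus sky position, date and frequency); it is not the schema of BOINC or of any real project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoincResult:
    """Hypothetical bookkeeping record needed while a result is in flight."""
    result_id: int
    host_id: int
    sent_time: float
    received_time: float
    right_ascension_deg: float
    declination_deg: float
    frequency_hz: float
    observation_date: str


@dataclass(frozen=True)
class ArchivedCandidate:
    """Hypothetical long-term science record: only what was found, and where."""
    right_ascension_deg: float
    declination_deg: float
    frequency_hz: float
    observation_date: str


def archive(result: BoincResult) -> ArchivedCandidate:
    """Drop the per-host bookkeeping once a result has been validated."""
    return ArchivedCandidate(
        right_ascension_deg=result.right_ascension_deg,
        declination_deg=result.declination_deg,
        frequency_hz=result.frequency_hz,
        observation_date=result.observation_date,
    )


validated = BoincResult(1, 42, 1301000000.0, 1301050000.0, 83.6, 22.0, 29.7, "2011-03-25")
print(archive(validated))
```

The archived record is smaller and never changes after it is written, which is why it can live in a write-once archive rather than a live database.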

Summing it up I think the direction is pretty clear. Actually going there requires some effort, though. As always ;)

MrS

Scanning for our furry friends since Jan 2002

telegd

Joined: 17 Apr 07

Posts: 91

Credit: 10212522

RAC: 0

Interestingly, now that the

4 Apr 2011 6:00:02 UTC

Message 99673 in response to message 99671

Interestingly, now that the BRP application is running, CUDA apps have really come into their own on E@H. This was not the case for the ABP app when the earlier part of this thread was happening.

Quote:

And let's be clear: I am NOT a gamer. I am not buying CUDA cards just to increase my RAC.

I use a console for games because upgrading a computer all the time is a waste (as well as having to run a useless operating system). However, you can pick up a basic CUDA card for $60:

http://www.newegg.com/Product/Product.aspx?Item=N82E16814134090&cm_re=gt220-_-14-134-090-_-Product

I have one of these and (for the Binary pulsar search), it beats by miles anything my 4-core i7 can do running full blast. That is not for bragging rights (it is a very basic card), I just like to contribute to a great project. RAC is a simple approximation to how useful I am being.

Of course, if you want to put money into CUDA, many people are running 3 to 4 BRP apps on one GPU at the same time. The throughput is staggering. Even at around $300 for a card, you can't match that kind of computing power even with lots of refurbished multicore systems. You do need enough CPU power to feed the thirsty GPU, but probably not a top-of-the-line system.

Of course, this should be qualified by the fact that the main science goal of E@H, the gravitational wave data, does not run on GPUs and so your scenario is probably more efficient for contributing to that.

So, back to your original comment, I think that E@H has coped pretty well in the last few months with the upsurge in GPU contribution to processing power. We would have to ask the Devs if they have had to make any major changes lately to cope with the throughput...

astro-marwil

Joined: 28 May 05

Posts: 518

Credit: 417580166

RAC: 629294

RE: We would have to ask

6 Apr 2011 19:40:28 UTC

Message 99674 in response to message 99673

Quote:

We would have to ask the Devs if they have had to make any major changes lately to cope with the throughput...

E@H enlarged the server farm before starting GPU work for us. Since then the waiting time for validation, as represented by the number of files waiting for validation, has dropped drastically. But this may also be due to changes in the server software.

Kind regards
Martin

Matt Giwer

Joined: 12 Dec 05

Posts: 144

Credit: 6891649

RAC: 0

RE: Hey Matt, actually I'm

10 Apr 2011 10:51:09 UTC

Message 99675 in response to message 99672

Quote:

Hey Matt,

actually I'm not sure what you're looking for, or what's left to discuss. For me the problem is quite clear:

- multi-processors and GPUs are increasing the available crunching power much faster than in the single core era, no doubt about this
- due to its inherently parallel "independent task"-based nature distributed computing can make perfect use of this power (in contrast to most other software)

The increase in theoretical performance does translate into real world performance for us. This puts a higher load onto the projects and servers. The limiting factors are:

1) internet bandwidth
2) database load

Point 1 is expensive, but straightforward to solve: buy a fat pipe, use your bandwidth efficiently (compress data, avoid sending duplicates) and if this is not enough distribute your load over several locations with their own connections. You've got to have the money to do this, otherwise you probably shouldn't try to run a project (or make it smaller).

Point 2 is more challenging, but not hopeless. First and most obvious: keep the WUs large enough so your BOINC database is not overwhelmed. You should always be able to do this by putting more work into each WU. Ideally you'd offer WUs with different amounts of work in them, so you can serve high end as well as low end CPUs. This becomes even more important with GPUs, where the spread in speed is even larger.
Using these techniques you should be able to handle almost any load on your BOINC database. Furthermore, by now there are goodies such as this available, offering 200k random IOPS for "a mere" $10k. That's Bitchin' fast storage, for sure :D

And generally having more projects helps spread the load as well. One project which supports several applications or sub-projects could also distribute these among different physical servers for further load balancing.

Maybe the most problematic would then be the actual science database, which does not have to fully correspond to the BOINC database. The granularity of results will probably be different and results have to be kept much longer. But much less information per result is needed. And at some point the actual outcome of the results can be merged into something different (a solid archive rather than a database, e.g.). What I mean is this: once you have decided that your result is correct you no longer care who handed it in, or when it was sent out and returned. [For Einstein] All you need to save is "at which point in the sky did I find anything interesting at which date and frequency".

This is a storage problem fundamentally different from the BOINC database. You need to store lots of data, but less per WU / result. And you don't need realtime access and modifications to this, so you can archive it somewhere once you're done with it. How this is done in practice depends on each project and its requirements and capabilities.

Summing it up I think the direction is pretty clear. Actually going there requires some effort, though. As always ;)

MrS

Thank you. I do need to back up a step or three. I have not actually been on the receiving end from clients so my terminology is likely a bit odd. And I can also be out in left field making a problem out of nothing.

The receiving side has servers with many cores (two dual core Xeons) per server and with even more virtual machines on the same server. The server side is already where clients are now going. There is no "ahead of the curve" improvement to keep servers ahead of clients. Therefore client capacity is going to quickly grow faster than server capacity.

The improvement is no longer simply clock speed where server and client speeds advanced in lockstep. What we have are client cores catching up to server cores with no equivalent server side improvement in the pipeline.

This is a different kind of problem. Web servers, database servers and most any other I can think of do not see this problem as their load is per computer regardless of the number of cores. And for people like that the more clients, the more income and thus more servers.

Boinc projects have always been per core but until recently there has only been one core per computer.

So there are some simple questions. Is the present server running at more than half capacity? If so it should be overwhelmed by the increase in number of cores by next year this time. I am looking at dual core computers at average user prices having been around for nearly two years and assuming a three year replacement rate for home computers. I am also looking at four core computers falling into the normal purchase price range some time this summer.

Last time I was involved in a bureaucratic budget process one had to budget years in advance. Heads up. You will need to increase server capacity and given the budget cycle the cost better be in the 2013 budget because the 2012 budget was frozen months ago.

On to the bandwidth, it is proportional to the cores available. If it is metered that too will double soon. If flat rate, how much is being used?

I am not quite sure about the database issue. The current glitch hangup at 2TB drives should be over soon with 4 and 8 following rapidly. It has been immensely satisfying retiring 360 and 500 GB drives and I look forward to retiring the last of my 1TBs by next week this time. But I do not have massive random access requirements so I am not familiar with Boinc type problems.

But if the issue is preparing WUs and receiving the results again, then there is the same question: what is the current utilization (top results for linux)? If it is more than half, then demand is going to exceed supply real soon.
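The "more than half" rule of thumb can be sketched as follows, assuming server load scales linearly with the number of client cores and that cores per client double over the period. Both are simplifying assumptions, and the utilization figures are made up for illustration.

```python
def projected_utilization(current_utilization, load_growth):
    """Utilization after client load grows by the given factor, assuming linear scaling."""
    return current_utilization * load_growth


def is_overwhelmed(current_utilization, load_growth=2.0):
    """True if projected demand exceeds the server's full capacity (1.0)."""
    return projected_utilization(current_utilization, load_growth) > 1.0


for utilization in (0.3, 0.5, 0.6):
    status = "overwhelmed" if is_overwhelmed(utilization) else "still has headroom"
    print(f"{utilization:.0%} busy today -> "
          f"{projected_utilization(utilization, 2.0):.0%} after doubling: {status}")
```

The same check applies to metered bandwidth: whatever fraction of the pipe is used today roughly doubles when the core count does.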

Maybe I am talking about a falling sky when there is no problem at all. It would be nice if that were the case but I do not think that is the case.

Matt Giwer

Joined: 12 Dec 05

Posts: 144

Credit: 6891649

RAC: 0

RE: Interestingly, now that

10 Apr 2011 11:09:28 UTC

Message 99676 in response to message 99673

Quote:

Interestingly, now that the BRP application is running, CUDA apps have really come into their own on E@H. This was not the case for the ABP app when the earlier part of this thread was happening.

Quote:
And let's be clear: I am NOT a gamer. I am not buying CUDA cards just to increase my RAC.

I use a console for games because upgrading a computer

BTW: NOT denigrating gaming. I did not quit gaming, I swore off gaming after I started getting calluses on my left palm from holding the Atari 800 joystick.

Quote:

all the time is a waste (as well as having to run a useless operating system). However, you can pick up a basic CUDA card for $60:

http://www.newegg.com/Product/Product.aspx?Item=N82E16814134090&cm_re=gt220-_-14-134-090-_-Product

I have one of these and (for the Binary pulsar search), it beats by miles anything my 4-core i7 can do running full blast. That is not for bragging rights (it is a very basic card), I just like to contribute to a great project. RAC is a simple approximation to how useful I am being.

Actually I have two graphics cards unused and because of the risk of game addiction I have avoided installing them. I do not want to spend that kind of time on games and I know I will.

I agree about the projects, and as a licensed physicist I stick with solid physics boinc projects like this and seti, and milkyway until the opinionated stuffed arseholes decided they did not like my politics and stole all my credit points, the animals.

Sorry about that but at times even I need to vent.

Quote:

Of course, if you want to put money into CUDA, many people are running 3 to 4 BRP apps on one GPU at the same time. The throughput is staggering. Even at around $300 for a card, you can't match that kind of computing power even with lots of refurbished multicore systems. You do need enough CPU power to feed the thirsty GPU, but probably not a top-of-the-line system.

Of course, this should be qualified by the fact that the main science goal of E@H, the gravitational wave data, does not run on GPUs and so your scenario is probably more efficient for contributing to that.

So, back to your original comment, I think that E@H has coped pretty well in the last few months with the upsurge in GPU contribution to processing power. We would have to ask the Devs if they have had to make any major changes lately to cope with the throughput...

As with all my good resolutions I will certainly some day install the cards after I get the other things done and finish my book debunking the old testament and a couple other projects.

I agree with you really. I should stop procrastinating and just install the damned things and get it over with.

Matt Giwer

Joined: 12 Dec 05

Posts: 144

Credit: 6891649

RAC: 0

RE: RE: We would have to

10 Apr 2011 11:13:20 UTC

Message 99677 in response to message 99674

Quote:

Quote:
We would have to ask the Devs if they have had to make any major changes lately to cope with the throughput...

E@H enlarged the server farm before starting GPU work for us. Since then the waiting time for validation, as represented by the number of files waiting for validation, has dropped drastically. But this may also be due to changes in the server software.

Kind regards
Martin

I have nothing but praise for the E@H servers. I have not noticed a lag in throughput whereas other projects can be described as erratic at best. It makes one almost believe in the Newtonian clockwork universe.
