Most efficient CPU: PPC G5 single-core

Martin P.
Joined: 17 Feb 05
Posts: 162
Credit: 40,156,217
RAC: 0
Topic 193073

The single-core PPC G5 CPU is, although 4 years old, still by far the most efficient CPU for Einstein. My G5 Dual 2.7 GHz machine takes approx. 64,000 seconds for a typical 451-credit WU. My second Mac, a G5 Dual 2.5 GHz, needs approx. 72,000 seconds for a 471-credit WU. Both computers run E@H at a rate of only 35% (the rest is dedicated to SETI@Home, Rosetta and World Community Grid).

When you compare this to the most recent quad-core Xeons or AMDs, only the top 3 computers can keep up with that pace (calculated per core!). Congratulations to the programmers, who obviously optimized the worker app perfectly!

This is amazing - such an old technology outperforming the latest.

archae86
Joined: 6 Dec 05
Posts: 3,071
Credit: 6,030,584,122
RAC: 2,374,063

hummm... I beg to differ

At 64000 CPU seconds for a 451 credit Work Unit, you are consuming 142 CPU seconds for each credit.

My Q6600, on the 4.33 stock application, was taking 87000 CPU seconds for 656 credit WUs, or 132 CPU seconds per credit.

Since it has four cores, it is 4.3 times faster than your PowerPC in Einstein crunching.

There certainly are applications which get no use out of extra cores, and for those applications a per-core comparison makes sense, but it makes no sense at all for Einstein in particular, or BOINC in general.

If one were to define efficiency in terms of Einstein output per incremental watt-hour: my system goes from 155 watts at idle to 214 watts when doing four Einsteins, so the incremental energy consumption per Einstein credit is 0.54 watt-hours. I'll wager that the PPC is far higher than that, and hence far less efficient.
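These figures can be rechecked with a few lines of Python. This is only a sketch: the function names are invented for illustration, the G5 figure uses the 451 credits stated in the opening post, and it assumes CPU time is close to wall-clock time with one work unit running on each core.

```python
# Recheck of the per-credit figures quoted in this post.
# Function names are illustrative; the inputs are the numbers from the thread.

def cpu_seconds_per_credit(cpu_seconds, credits):
    """CPU time one core spends to earn one credit."""
    return cpu_seconds / credits


def incremental_wh_per_credit(idle_watts, load_watts, cpu_seconds, credits, cores):
    """Energy above idle per credit, with one work unit running on each core.

    Assumes CPU time is roughly equal to wall-clock time.
    """
    extra_watts = load_watts - idle_watts
    hours = cpu_seconds / 3600
    return extra_watts * hours / (cores * credits)


g5_core = cpu_seconds_per_credit(64000, 451)     # one G5 core
q6600_core = cpu_seconds_per_credit(87000, 656)  # one Q6600 core

# Whole Q6600 (four cores) against a single G5 core.
q6600_vs_g5_core = 4 * g5_core / q6600_core

q6600_incremental = incremental_wh_per_credit(155, 214, 87000, 656, cores=4)

print(f"G5:    {g5_core:.0f} CPU s/credit")
print(f"Q6600: {q6600_core:.0f} CPU s/credit per core")
print(f"Q6600 (4 cores) vs one G5 core: {q6600_vs_g5_core:.1f}x")
print(f"Q6600 incremental energy: {q6600_incremental:.2f} Wh/credit")
```

Note that the 4.3x factor compares all four Q6600 cores against a single G5 core; per core the two are within about 7% of each other.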

Dave Burbank
Joined: 30 Jan 06
Posts: 275
Credit: 1,548,376
RAC: 0

I agree with archae86, the Intel quads/octos are the most efficient crunchers in credits per watt.

My Q6600 (overclocked to 3.2 GHz) completes a 449 credit WU in 53,600 s, which is still faster than your G5 per core. Yes, it's overclocked, but that is half of the appeal of the Core architecture, and should be considered.

It does look like the PPC/Intel Mac app is the most efficient of the apps, but the lack of "overclockability" of Macs is IMO a hindrance to the platform.

There are 10^11 stars in the galaxy. That used to be a huge number. But it's only a hundred billion. It's less than the national deficit! We used to call them astronomical numbers. Now we should call them economical numbers. - Richard Feynman

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,516
Credit: 485,748,203
RAC: 1,369

Message 72126 in response to message 72124

Quote:

hummm... I beg to differ

At 64000 CPU seconds for a 451 credit Work Unit, you are consuming 142 CPU seconds for each credit.

My Q6600, on the 4.33 stock application, was taking 87000 CPU seconds for 656 credit WUs, or 132 CPU seconds per credit.
...

If one were to define efficiency in terms of Einstein output per incremental watt-hour, then my system goes from 155 watts at idle to 214 watts when doing four Einsteins. Thus the incremental energy consumption per Einstein credit is .54 watt-hour. I'll wager that the PPC is far higher than that, and hence far less efficient.

If your Q6600 takes 87000 CPU seconds for 656 credits, at 214 watts for the whole system, that comes to
214 W * (87000 / 3600) h / (4 * 656) credits = 1.97 Wh/cr

My Mac Mini (Core Duo) runs two cores at approx. 40 watts (seems low but is plausible, see here), and takes 98,312.99 CPU sec for 455.85 credits:

40 W * (98312.99 / 3600) h / (2 * 455.85) credits = 1.2 Wh/cr

Hey, my Mac Mini is 60% more energy efficient than your Q6600 system ;-)! And it's not even overclocked or undervolted. OK, I guess if we compared just the CPU power consumption it would be a different picture. But still ...
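For anyone who wants to plug in their own box, the whole-system calculation above can be sketched in Python. The function name is made up for illustration, and it assumes the wattage is measured at the wall with every core busy and that CPU time is close to wall-clock time.

```python
# Whole-system energy per credit, using the numbers from this post.
# The function name is illustrative, not part of any BOINC tool.

def system_wh_per_credit(system_watts, cpu_seconds, credits, cores):
    """Watt-hours drawn by the whole machine per credit earned.

    Assumes all cores crunch one work unit each for cpu_seconds.
    """
    hours = cpu_seconds / 3600
    return system_watts * hours / (cores * credits)


q6600 = system_wh_per_credit(214, 87000, 656, cores=4)
mac_mini = system_wh_per_credit(40, 98312.99, 455.85, cores=2)

# How many more credits per watt-hour the Mac Mini delivers.
advantage = q6600 / mac_mini - 1

print(f"Q6600 system: {q6600:.2f} Wh/credit")
print(f"Mac Mini:     {mac_mini:.2f} Wh/credit")
print(f"Mac Mini advantage: {advantage:.0%}")
```

Unlike the incremental figure, this charges the idle draw of the whole machine to Einstein, so it measures the system rather than the processor.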

CU

H-BE

archae86
Joined: 6 Dec 05
Posts: 3,071
Credit: 6,030,584,122
RAC: 2,374,063

Message 72127 in response to message 72126

Quote:

Hey, my Mac Mini is 60% more energy efficient than your Q6600 system

Doubtless--mine is handicapped with three hard drives, two optical drives, a graphics card, fan regulators, a sound card, excess fans, an excessively capable power supply, and more, all either not needed at all or overdone for the narrow crunching role.

This is not a crunching PC--it is my primary personal computer, with uses from servicing all my Internet addictions, to being the platform for the investment management I provide for my mother and brothers, to doing all the processing related to my heavy audio recording, CD preparation, and production hobby. It just happens to crunch with its (huge) spare CPU capacity.

That's why I suggested looking at the credits per incremental watt-hour, if one is trying to measure the merits of the processor with respect to Einstein.

But I do agree that at the system level your numbers say you are beating me, fair and square. There may well be laptops out there with power-efficient internals and Core 2 chips chosen out of the low-voltage capable portion of the distribution which are very much better yet on a total system watt-hours/credit basis.

[AF>Le_Pommier>Macbidouille.com]CRISTOBOOL
Joined: 10 Dec 05
Posts: 59
Credit: 62,971
RAC: 0

Hello, I think Martin P. is right when I look at other computers that crunched the same WU as my computer (G5 1.8 GHz), rather than comparing different WUs.

The Core 2 Duo and the Athlon X2 have almost the same speed as my G5 (compared core to core at the same clock frequency, of course).

But SETI@Home (alexkan app) runs at almost the same speed (10% faster) on my G5, even though the x86 app is compiled with the Intel C/C++ compiler while the PPC app is compiled with gcc, which is not the best compiler for G4/G5 (that would be IBM's VisualAge XL C/C++ compiler for Mac OS X).

Look at the SPEC2000 and SPEC2006 results: if you compare the PPC970MP, Athlon X2 and Core 2 Duo, you will be surprised (and take a look at POWER5 and POWER6 :) ).

For example, my last WU:

Over Success Done 66,816.16 seconds 405.93 405.93 (Opteron 185, 2 x 2.6 GHz)

Over Success Done 84,547.14 seconds 405.93 405.93 (G5, 1 x 1.8 GHz)
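A rough way to put numbers on this "same clock frequency" comparison is to multiply CPU time by clock speed. The Python sketch below does that for the two results above. The helper name is invented, and it assumes both results are for the same work unit, each crunched on one core running at its nominal clock.

```python
# Clock-normalized comparison of the two results listed above.
# The helper name is illustrative; clocks are the nominal values quoted.

def gigacycles(cpu_seconds, clock_ghz):
    """Clock cycles spent on the work unit, in units of 1e9 cycles."""
    return cpu_seconds * clock_ghz


opteron_s, opteron_ghz = 66816.16, 2.6
g5_s, g5_ghz = 84547.14, 1.8

opteron_cycles = gigacycles(opteron_s, opteron_ghz)
g5_cycles = gigacycles(g5_s, g5_ghz)

cycle_ratio = g5_cycles / opteron_cycles  # below 1: G5 needs fewer cycles
time_ratio = g5_s / opteron_s             # above 1: G5 takes longer

print(f"G5 cycles relative to Opteron: {cycle_ratio:.2f}")
print(f"G5 time relative to Opteron:   {time_ratio:.2f}")
```

On these figures the G5 uses about 12% fewer clock cycles for the work unit, yet takes about 27% longer to finish it, because it runs at a lower clock.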

Dave Burbank
Joined: 30 Jan 06
Posts: 275
Credit: 1,548,376
RAC: 0

Message 72129 in response to message 72128

Quote:
The Core 2 Duo and the Athlon X2 have almost the same speed as my G5 (compared core to core at the same clock frequency, of course)

Ahhh, but that's like saying that my Ferrari, if I drive it at 100 kph, is just as fast as my Ford Focus.

The newer Intel and AMD chips are capable of running at higher frequency with multiple cores using similar power consumption, and that IMO makes these chips so attractive.


archae86
Joined: 6 Dec 05
Posts: 3,071
Credit: 6,030,584,122
RAC: 2,374,063

Message 72130 in response to message 72128

Quote:
The Core 2 Duo and the Athlon X2 have almost the same speed as my G5 (compared core to core at the same clock frequency, of course)


That is not a very good way to compare machine capability.

Suppose I decided to compare the speed of my Audi with the speed of a Lamborghini at the same rpm. At a stroke I'd assume away the enormous advantage the Lamborghini has from a vastly higher rpm capability.

Work done per clock cycle is just a design decision, not a measure of some abstract virtue. As you consider breaking up the work flow into more steps (call it deeper pipelining, if that makes you feel good), you find you can raise the clock rate each time the previously speed-limiting work per step gets broken up. But the extra stages require more machine state (call it flip-flops if you like), and much more hardware, which consumes power. So the more weight you give to power consumption and hardware cost, the less far you want to go down this road. Also, the pipelining itself gets in your way for control transfers, and trying to manage that problem causes a further explosion in the amount of hardware.

The whole game of pretending compute/clock is a measure of virtue is fairly recent, and seems mostly a matter of those seeking to justify their corporate preferences. Back in the early 1980s, two sequential models of IBM mainframes changed a major design choice in this matter, differing by about a factor of two in work done per clock cycle, and no one batted an eye. We all understood it was total throughput that mattered. Memory is fading, but I think it was the 3081 that did much _less_ work/clock than its predecessor, and it was generally regarded as one of their good ones, not one of their bad ones.

Been there, done that, own way too many of the T-shirts. (I was a microprocessor design engineer intermittently from 1975 to 1988, and worked in the part of the organization building the things from 1988 to 2004).

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,516
Credit: 485,748,203
RAC: 1,369

Hi!

Clock frequency is nothing, efficiency everything. Note that IIRC, 4 of the Top 10 supercomputers on this planet are Blue Gene installations by IBM. They are built from PowerPC 440 cores running at a clock frequency of 700 (!!) MHz.

CU
H-BE

th3
Joined: 24 Aug 06
Posts: 208
Credit: 2,208,434
RAC: 0

Anyway, I agree that G5s have good performance and that the app seems to be well optimized for PPC. Not as good as Core 2, but still very respectable; it's old but obviously not yet "old crap" the way, for example, Prescott already is =). And as pointed out already, in performance per watt G5s are not much to write home about, so "efficient" is not the word I would use.

archae86
Joined: 6 Dec 05
Posts: 3,071
Credit: 6,030,584,122
RAC: 2,374,063

I have nothing good to say about Prescott, not when it was an infant, young, old, or any other age. Hardly anyone allowed by their job to be candid does, by the way.

As to clock frequency, it is neither everything nor nothing. Just multiply it by useful work per clock, and you have total throughput, which means a great deal.
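A toy Python illustration of that product follows. Both designs and all of their numbers are purely hypothetical and describe no real chip.

```python
# Toy illustration: throughput = clock rate x useful work per clock.
# Both "designs" below are hypothetical and describe no real processor.

def throughput(clock_hz, work_per_clock):
    """Useful work completed per second."""
    return clock_hz * work_per_clock


fast_clock_design = throughput(3.0e9, 1.0)  # high clock, less work per cycle
slow_clock_design = throughput(1.5e9, 2.0)  # half the clock, twice the work

print(fast_clock_design == slow_clock_design)
```

Neither factor alone tells you which of the two is the better cruncher.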

I don't know what the folks touting efficiency mean by it either. Power efficiency is a useful, though not all-important parameter. Cost efficiency is also interesting. Price efficiency yet another.
