I just built an AMD Phenom II X6 which does 6 Einstein GC WUs in about 18,000 sec (no GPU). The system has 8 GB of 1600 MHz DDR3 and a 1.5 TB Barracuda, for right around $700. Running full bore it draws 186 W (average over 10 days). It's running at stock 3.2 GHz.
The Binary Pulsar search takes right around 60,000 sec (CPU).
Not a fanboy of either manufacturer, but I will probably need another system in a month or so and I'm trying to decide between an i7 and another Phenom.
Thanks,
Joe
There are reports that Intel does better here at Einstein. That may be true, but I am not sure I would let it be my deciding factor.
System costs are not that much different. It all depends on how much you are into overclocking. The 2500K (200€) has no Hyper-Threading but can clock quite high, so its credits per day are about the same as a Phenom II X6's (~200€). The HT of the 2600K (300€) adds roughly 30% more crunching power.
The AMD mainboards seem a little less expensive, but this also depends on your preferences.
I wouldn't swap my system for a Phenom X6, or vice versa, because the crunching power is more than sufficient.
Mike, Hotze,
Thanks. E@H helps me justify a bit more processing power (money) but the critical factor for me is response time in my other apps.
I'm still trying to wrap my feeble brain around the common knowledge that HT increases throughput of these compute-bound tasks by 30%. I thought HT would only speed up context switches, but it's hard to believe total multitasking overhead is 30% in an environment with a reasonable number of tasks, so I must be missing something.
I'm trying to figure out how to turn off HT on my older Xeon box and my i5-powered laptop to see for myself. Any links? I'm running Ubuntu 10.04 on both.
Joe
Did you try the BIOS?
I can disable it for my i5 dual-core, or use only one core with HT.
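For the Ubuntu machines, HT can also be checked without rebooting into the BIOS, and a logical CPU can even be taken offline at runtime via sysfs (`echo 0 > /sys/devices/system/cpu/cpuN/online`, as root) for a quick A/B test. A minimal sketch, assuming the standard Linux `/proc/cpuinfo` fields `siblings` and `cpu cores` (the helper `ht_active` is mine, not a standard tool):

```python
import os

def ht_active(cpuinfo_text):
    """Return True if logical CPUs per package exceed physical cores,
    False if they match, None if the fields are missing."""
    siblings = cores = None
    for line in cpuinfo_text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        key = key.strip()
        if key == "siblings":
            siblings = int(value)       # logical CPUs per package
        elif key == "cpu cores":
            cores = int(value)          # physical cores per package
        if siblings is not None and cores is not None:
            break
    if siblings is None or cores is None:
        return None
    return siblings > cores

if os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as f:
        print("HT active:", ht_active(f.read()))
```

With HT disabled in the BIOS (or half the logical CPUs offlined), `siblings` drops to equal `cpu cores`.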
I did not. I was hoping for something to read first but I'll check it out when I get back home today.
Joe
Actually these speedups are observed in cases where the multitasking overhead is negligible.
It is true that one good way to think about HT is as extremely rapid context switching, but it is not a substitute for the explicit context switches visible to the OS. Instead, it takes a pair of tasks, each of which appears to the OS to run uninterrupted (and un-task-switched) on its own CPU for a nice long stretch, and has them trade active use of the CPU at an extremely fine grain.
So why does that help? Because the CPU has a number of separately scheduled and somewhat independent resources, commonly including, but not limited to, the floating-point unit, the main ALU, and external memory access. At any moment, one of these can be the unit the running thread is waiting on for data, which means the others sit idle. But with a full copy of processor state for the other thread almost instantly available, an idle unit may find that the other thread has data ready for it to process. In fact, though at any given moment only one of the two tasks is being "charged" CPU time, there are sufficient interlocks to allow some overlapped processing, so work for both tasks is often in flight in different parts of the machine at the same time.
It is actually more complicated than that, but the big picture is that when hyperthreading works well, the various computing resources in the machine spend a higher fraction of their time actually doing something useful. There is switch overhead, but it is so small that it is quite rare to find cases where HT actually slows things down. There was such a case here on Einstein a few years ago, where some late stage in the wonderful sequence of akosf optimizations created code which gave slightly lower throughput with HT enabled on my Gallatin host than with it turned off. I don't know whether that same code would also suffer on the rather different HT implementation of the Nehalem family--quite possibly not.
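The stall-filling idea can be illustrated with a toy model: two instruction streams share a single issue port, a memory op stalls its own stream for a few cycles, and the second stream issues during the first one's stalls. This is a cartoon with invented latencies, not a model of any real pipeline:

```python
# Toy SMT model: one issue port, "mem" ops stall their stream,
# the other stream fills the gaps. All latencies are invented.
MEM_LATENCY = 4  # extra cycles a stream is blocked after a memory op

def run(streams):
    """Cycles to retire all ops, issuing at most one op per cycle."""
    pos = [0] * len(streams)    # next op index per stream
    ready = [0] * len(streams)  # cycle at which each stream may issue
    cycle = 0
    while any(pos[i] < len(streams[i]) for i in range(len(streams))):
        for i in range(len(streams)):
            if pos[i] < len(streams[i]) and ready[i] <= cycle:
                op = streams[i][pos[i]]
                pos[i] += 1
                if op == "mem":
                    ready[i] = cycle + 1 + MEM_LATENCY
                break  # single issue port: one op per cycle
        cycle += 1
    return cycle

task = ["alu", "mem", "fpu", "mem", "alu", "fpu"] * 50

serial = run([task]) + run([task])   # run the two threads one after the other
smt = run([list(task), list(task)])  # interleave them on the shared port
print("serial:", serial, "smt:", smt)
```

With interleaving, total cycles come in well under the serial sum because the second stream issues during the first one's memory stalls; the floor is one issue per cycle.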
Thank you.
I'm beginning to understand.
Joe
Here are comparisons to a couple of other platforms, in addition to the OP's comparison to the always-popular overclocked Q6600.
Configurations are as follows -
Sandy Bridge: i7-2600K @ 3.5 GHz, 2 x 4GB DDR3 1333, Nvidia GT 240
Nehalem: i7-920 @ 3.5 GHz, 3 x 2GB DDR3 1410, ATI 4670
Lynnfield: i7-860 @ 2.8 GHz, 4 x 2GB DDR3 1333, ATI 4670
Notes: the Sandy Bridge was just assembled, without tuning, and is not currently running any GPU jobs (trouble loading Nvidia drivers); OS = Ubuntu 10.10. The other two platforms have been under-volted to optimize power draw; OS = Windows 7.
All systems are running with Hyper-Threading = ON and 8 jobs at the same time. I also limited the comparison to the current gravity-wave jobs.
First I set my i7-920 to 3.5 GHz for a clock-for-clock comparison:
i7-920: GW Job = 20,000 sec drawing 285 watts
i7-2600K: GW Job = 18,200 sec drawing 155 watts
Next, with both the i7-860 and the i7-2600K drawing 155 watts:
i7-860: GW Job = 28,300 sec
i7-2600K: GW Job = 18,200 sec
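The posted run times and wall-power figures fold into a rough throughput-per-watt number. A sketch using the numbers from this thread (it assumes every concurrent job takes the quoted wall time, which is only approximately true, and the watts are whole-system draw):

```python
SECONDS_PER_DAY = 86400

def jobs_per_day(concurrent_jobs, seconds_per_job):
    # With N jobs always in flight, N jobs finish every seconds_per_job.
    return concurrent_jobs * SECONDS_PER_DAY / seconds_per_job

# host: (concurrent jobs, sec/job, system watts) -- figures from the thread
hosts = {
    "Phenom II X6 (OP)":  (6, 18000, 186),
    "i7-2600K @ 3.5 GHz": (8, 18200, 155),
    "i7-920  @ 3.5 GHz":  (8, 20000, 285),
    "i7-860  @ 2.8 GHz":  (8, 28300, 155),
}

for name, (jobs, secs, watts) in hosts.items():
    jpd = jobs_per_day(jobs, secs)
    print(f"{name}: {jpd:.1f} jobs/day, {jpd / watts:.3f} jobs/day/watt")
```

On these figures the Sandy Bridge is roughly twice as efficient per watt as the Nehalem at the same clock.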
For those interested in under-volting this new chip, here are some power numbers. I was not overly aggressive with the under-volt, just a -0.1 V offset (this board uses offsets rather than setting an absolute voltage). It has been running stable for 1 week.
Configuration:
i7-2600K at stock 3.4 GHz
2 x 4GB DDR3-1333 at 1.5 volts
ASUS P8H67-M mATX motherboard (should have gotten a P67 board)
ATI 4670 with ATI drivers (GT-240 had HW failure, so removed)
620 watt 80+ Bronze PSU
Ubuntu 10.10
Running 8 gravity wave jobs at the same time (HT=ON) with v1.07 application.
Power = 128 watts, or about 3.07 kWh per day
Monthly electrical cost = 3.07 kWh/day * 30 days * $0.11/kWh ≈ $10.14
Can also confirm an idle power draw of 60 watts.
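The cost arithmetic generalizes to any measured draw. A small sketch (the $0.11/kWh rate is the poster's; substitute your own utility rate):

```python
def monthly_cost(watts, dollars_per_kwh, days=30):
    """Average wall draw in watts -> (kWh/day, cost per `days` days)."""
    kwh_per_day = watts * 24 / 1000
    return kwh_per_day, kwh_per_day * days * dollars_per_kwh

for draw, label in [(128, "crunching"), (60, "idle")]:
    kwh, cost = monthly_cost(draw, 0.11)
    print(f"{draw} W {label}: {kwh:.2f} kWh/day, ${cost:.2f}/month")
```

At these rates the box costs about three times as much per month crunching as it would sitting idle.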
As you have an H67 board, I suppose you could have used the onboard graphics on your 2600K and saved appreciable power.
Why not? Games...?
Any idea how much the 4670 is contributing to your power numbers?
While your supply has a high efficiency rating, your system is running so far down in its capacity range that you are probably giving up some efficiency compared with a comparably good but lower-capacity supply.