Difference between 2 PCs

Nico2
Joined: 8 Jun 12
Posts: 6
Credit: 23025616
RAC: 0
Topic 197437

Hi, I am using 2 LGA775 PCs for E@H: PC1 and PC2. PC1 is not just faster than PC2 but much faster. I expected it to be faster, since it has a better processor (Q9550 vs. Q6600) and a better motherboard (ASUS P5Q3 vs. ASUS Benicia IPIBL-LB). However, I did not expect to see estimated run times 4 to 10 times shorter (it works out to about 4x faster after calculations). The slower PC even has the benefit of a slightly better graphics card with 20% more shaders (R9 270X instead of an HD7850), so I don't understand the slowness of PC2.

Faster: PC1: Q9550+P5Q3+16GB DDR3 ram + HD7850
Slower: PC2: Q6600+Benicia+8GB DDR2 ram + AMD R9 270X

Both PCs run Comodo Internet Security Premium.

Here are 2 example WUs from the faster PC1's BOINC manager (followed by comparable WUs for PC2):

Application: Gravitational Wave S6 Directed Search (CasA) 1.05 (SSE2)
100800 GFLOPs
Estimated time: 10:41:11

Application: Gamma-ray pulsar search #3 1.11 (FGRPopencl-ati)
150000 GFLOPs
Estimated time: 8:28:52

From the slower PC2's BOINC manager: similar WUs have 4 to 10 times longer estimated run times:

Application: Gravitational Wave S6 Directed Search (CasA) 1.05 (SSE2)
100800 GFLOPs
Estimated time: 41:52:22

Application: Gamma-ray pulsar search #3 1.11 (FGRPopencl-ati)
150000 GFLOPs
Estimated time: 99:41:51
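
As a quick check of the "4 to 10 times" figure, the estimated times listed above can be converted to seconds and divided. This is a minimal Python sketch; the helper names are made up for illustration, and only the times come from the listings above:

```python
def hms_to_seconds(text: str) -> int:
    """Convert an 'H:MM:SS' string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def estimate_ratio(pc1: str, pc2: str) -> float:
    """How many times longer PC2's estimate is than PC1's."""
    return hms_to_seconds(pc2) / hms_to_seconds(pc1)


# Estimated times quoted above, as (PC1, PC2) per application.
estimates = {
    "CasA (SSE2)": ("10:41:11", "41:52:22"),
    "FGRPopencl-ati": ("8:28:52", "99:41:51"),
}

for app, (pc1, pc2) in estimates.items():
    print(f"{app}: PC2 estimate is {estimate_ratio(pc1, pc2):.1f}x PC1's")
```

These are ratios of BOINC's estimates only, so they say nothing yet about actual completion times.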

Can someone suggest why PC2 is crunching 4x slower than PC1? I would love to improve the timing! Perhaps PC2 is running into some issue with the older, less sophisticated Benicia motherboard. If so, I have an unused Asus Maximus Formula motherboard that should do better.

Thanks for any comments, Nick

archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7271731734
RAC: 1806222

If you want to know how fast they are crunching, look at their completion times, not at the estimated times.

Seriously. There are several important ways the estimates can go off base in ways that make them a spectacularly bad way to compare the merits of specific computers.

Nico2
Joined: 8 Jun 12
Posts: 6
Credit: 23025616
RAC: 0

Thanks. Completion times are 4x faster on PC1. PC2 is comparable in hardware but unexpectedly slow.

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

For the Gravity Wave S6 tasks the run times vary between ~37000-43000 s on PC1 and ~56500-58500 s on PC2. If I were to guess, I'd say that the newer and faster processor combined with faster RAM might account for most of that difference.

Looking at the GPU tasks, I'm wondering if you load up CPU tasks on all cores? If so, try to set BOINC to use one core less so that the free core can be used to support the GPU tasks.
Are you running more than one task at a time on the GPUs? The same number of tasks on both GPUs?

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 582474809
RAC: 139180

In your completed tasks I see the following roughly averaged runtimes, PC1 vs. PC2:

BRP4G-opencl-ati: 8ks vs. 11ks per task (33% difference), which is not too bad considering the ~20% difference in GPU performance. These WUs run mostly within the GPU.

FGRPopencl-ati: 10ks vs. 16ks, which starts to become more severe. These WUs require more CPU support and probably more communication between CPU and GPU.

CasA: 40ks vs. 58ks, a significant difference. These WUs run entirely on the CPU and hence strongly depend on CPU clock (11% advantage for PC1) as well as main memory bandwidth and latency.
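
To put the three runtime pairs above on a common scale, here is a small Python sketch. The function names are illustrative, and the inputs are just the roughly averaged kilosecond figures listed above:

```python
# Roughly averaged runtimes in kiloseconds, as (PC1, PC2), from the list above.
runtimes_ks = {
    "BRP4G-opencl-ati": (8, 11),
    "FGRPopencl-ati": (10, 16),
    "CasA": (40, 58),
}


def slowdown(pc1_ks: float, pc2_ks: float) -> float:
    """How many times longer PC2 takes than PC1."""
    return pc2_ks / pc1_ks


def extra_time_pct(pc1_ks: float, pc2_ks: float) -> float:
    """PC2's extra run time as a percentage of PC1's run time."""
    return (pc2_ks - pc1_ks) / pc1_ks * 100


def time_saved_pct(pc1_ks: float, pc2_ks: float) -> float:
    """PC1's time saving as a percentage of PC2's run time."""
    return (pc2_ks - pc1_ks) / pc2_ks * 100


for app, (pc1, pc2) in runtimes_ks.items():
    print(
        f"{app}: {slowdown(pc1, pc2):.2f}x, "
        f"+{extra_time_pct(pc1, pc2):.1f}% vs. PC1, "
        f"-{time_saved_pct(pc1, pc2):.1f}% vs. PC2"
    )
```

The percentage depends on which machine is taken as the baseline, which is why a single "difference" figure for rounded averages like these can only be approximate.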

In the latter case PC1 has an advantage as well: DDR3 and the faster 1333 MHz FSB (vs. 1066). Your Q6600 could actually run with a faster FSB as well, if the BIOS lets you set this. You'd need to lower the CPU multiplier, though, in order not to accidentally overclock the CPU.
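
The FSB and multiplier trade-off can be sketched numerically. This Python snippet assumes a stock Q6600 multiplier of 9 and integer-only multipliers, neither of which is stated in this thread, so treat both as assumptions to check against your BIOS:

```python
def core_clock_mhz(fsb_mt_s: float, multiplier: float) -> float:
    """Core clock for a quad-pumped FSB: base clock (FSB / 4) times multiplier."""
    return fsb_mt_s / 4 * multiplier


def highest_safe_multiplier(new_fsb_mt_s, stock_clock_mhz, candidates):
    """Largest candidate multiplier that does not exceed the stock core clock."""
    safe = [m for m in candidates if core_clock_mhz(new_fsb_mt_s, m) <= stock_clock_mhz]
    return max(safe) if safe else None


# Assumption: stock Q6600 runs a 1066 FSB with a 9x multiplier (about 2.4 GHz).
stock_clock = core_clock_mhz(1066, 9)

# Assumption: the BIOS offers integer multipliers from 6 to 9.
multiplier = highest_safe_multiplier(1333, stock_clock, range(6, 10))
print(f"stock: {stock_clock:.0f} MHz")
print(f"1333 FSB: multiplier {multiplier} -> {core_clock_mhz(1333, multiplier):.0f} MHz")
```

Under these assumptions the faster FSB comes with a slightly lower core clock, so any gain would have to come from the memory side.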

Regarding the GPU WUs, and especially the FGRs, there's another factor entering the equation: your Benicia mainboard has a G33 chipset, which only offers PCIe 1.1. The P5Q3 and the Maximus Formula can run the GPU at PCIe 2.0 for double the bandwidth. This does help Einstein. The GPU would still like it a bit faster with PCIe 3, but the step from 1 to 2 is more important.
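
For a sense of scale, the theoretical per-direction bandwidth of each PCIe generation follows from the transfer rate and line encoding. This Python sketch assumes a full x16 link, which is not confirmed for either board in this thread:

```python
# Per-lane transfer rate in GT/s and line-encoding efficiency per generation.
PCIE_GENERATIONS = {
    "1.1": (2.5, 8 / 10),     # 8b/10b encoding
    "2.0": (5.0, 8 / 10),     # 8b/10b encoding
    "3.0": (8.0, 128 / 130),  # 128b/130b encoding
}


def bandwidth_gb_s(generation: str, lanes: int = 16) -> float:
    """Theoretical one-direction bandwidth in GB/s (x16 link assumed by default)."""
    rate_gt_s, efficiency = PCIE_GENERATIONS[generation]
    # GT/s times efficiency gives usable Gbit/s per lane; divide by 8 for GB/s.
    return rate_gt_s * efficiency * lanes / 8


for generation in PCIE_GENERATIONS:
    print(f"PCIe {generation} x16: {bandwidth_gb_s(generation):.2f} GB/s per direction")
```

These are link-level maxima; real transfers between CPU and GPU reach less, so the figures only illustrate the relative step between generations.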

BTW: these LGA775 systems are rather energy-inefficient compared to today's offerings. If you're running 24/7 and paying about as much for electricity as over here in Germany, this could cost you a roughly estimated 100€ per year more than a moderately powerful modern system. This gets even worse if these machines still have their original power supply. Is either one of them a dedicated cruncher?
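
A rough way to relate extra power draw to yearly cost is sketched below in Python. The electricity price of 0.28 €/kWh is a placeholder assumption, not a figure from this thread, so substitute your own tariff:

```python
HOURS_PER_YEAR = 24 * 365


def annual_cost_eur(extra_watts: float, price_eur_per_kwh: float, duty_cycle: float = 1.0) -> float:
    """Yearly cost of drawing extra_watts for the given fraction of the time."""
    kwh = extra_watts / 1000 * HOURS_PER_YEAR * duty_cycle
    return kwh * price_eur_per_kwh


def extra_watts_for_cost(cost_eur: float, price_eur_per_kwh: float, duty_cycle: float = 1.0) -> float:
    """Extra power draw in watts that would produce the given yearly cost."""
    return cost_eur / (HOURS_PER_YEAR * duty_cycle * price_eur_per_kwh) * 1000


ASSUMED_PRICE = 0.28  # EUR per kWh; hypothetical value, use your own tariff

watts = extra_watts_for_cost(100, ASSUMED_PRICE)
print(f"100 EUR/year at 24/7 corresponds to about {watts:.0f} W of extra draw")
```

At a 50% duty cycle the same yearly cost would correspond to twice the extra draw.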

MrS

Scanning for our furry friends since Jan 2002

Alex Plantema
Joined: 9 Feb 05
Posts: 16
Credit: 20362656
RAC: 0

DDR3 is faster than DDR2, the Q9550 has more 2nd level cache memory than the Q6600, supports SSE4.1 instructions and uses less power. The Q9550 was released in January 2008, the Q6600 in January 2007.

BackGroundMAN
Joined: 25 Feb 05
Posts: 58
Credit: 246736656
RAC: 0

Quote:
DDR3 is faster than DDR2, the Q9550 has more 2nd level cache memory than the Q6600, supports SSE4.1 instructions and uses less power. The Q9550 was released in January 2008, the Q6600 in January 2007.

In one of my PCs (AMD Phenom II X6 1090T) I changed from DDR2-1066 to DDR3-1600 (with a new mainboard as well) without any change in the computation times for Gravitational Wave S6 Directed Search (CasA-LinuxX64). I didn't change anything else in the PC (e.g. I did not reinstall Linux). The overall performance gain for this PC is about 30% (compilation tasks).

I am changing some kernel configuration options to see if there is any impact on CasA performance, but I assume that the CasA computations depend more on cache than on memory.

Nico2
Joined: 8 Jun 12
Posts: 6
Credit: 23025616
RAC: 0

Quote:
Looking at the GPU tasks I'm wondering if you load up CPU tasks on all cores? If so try to set Boinc to use one core less so that free core can be used to support the GPU tasks.
Are you running more than one task at a time on the GPUs? The same number of tasks on both GPUs?

Yes, I am trying multiple simultaneous GPU tasks on both machines. On the faster PC (Q9550+P5Q3+HD7850+DDR3) I sometimes see up to 3 GPU tasks, and the GPU utilization goes over 90% and is relatively stable from minute to minute.

The slower machine (Q6600+Benicia+R9 270X+DDR2) also sees up to 3 simultaneous GPU tasks but the GPU utilization is usually less than 40%. The GPU utilization varies in a choppy way from minute to minute, like a boxcar function.

The slower PC might work better if I could set the GPU utilization factor (in E@H preferences) to 1.0 for the slow PC and keep the GPU utilization factor at 0.33 for the fast PC.
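
The relationship between the utilization factor and the number of simultaneous GPU tasks can be sketched as follows. This Python snippet assumes that tasks are started as long as their combined factor does not exceed 1.0, which matches the 3 tasks seen at 0.33 but is an approximation of what BOINC actually does:

```python
def tasks_per_gpu(utilization_factor: float) -> int:
    """Number of tasks that fit on one GPU when each task claims this fraction of it."""
    if not 0 < utilization_factor <= 1:
        raise ValueError("utilization factor must be in (0, 1]")
    # The small epsilon guards against float rounding, e.g. 1 / 0.2.
    return int(1 / utilization_factor + 1e-9)


for factor in (1.0, 0.5, 0.33):
    print(f"factor {factor}: {tasks_per_gpu(factor)} task(s) per GPU")
```

So a factor of 1.0 on the slow PC would mean one GPU task at a time, while 0.33 on the fast PC keeps three running.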

I appreciate all the thoughtful replies, thanks so much. Nick

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

Quote:
The slower PC might work better if I could set the GPU utilization factor (in E@H preferences) to 1.0 for the slow PC and keep the GPU utilization factor at 0.33 for the fast PC.


Then go to your Einstein@Home prefs, click on "Add separate preferences for" and just choose a location. Once that's done, go to your list of computers, click on the details for the one you want to change, and set the location at the bottom. Match that up with the new prefs, and on the next server contact BOINC will get the new set of prefs. For the utilization factor, you need to be assigned new GPU work for the change to take effect.

Nico2
Joined: 8 Jun 12
Posts: 6
Credit: 23025616
RAC: 0

Quote:

BTW: these LGA775 systems are rather energy-inefficient compared to today's offerings. If you're running 24/7 and paying about as much for electricity as over here in Germany, this could cost you a roughly estimated 100€ per year more than a moderately powerful modern system. This gets even worse if these machines still have their original power supply. Is either one of them a dedicated cruncher?

MrS

Thanks MrS. After looking further, the worst-case difference in validated task run times is with the BRP5-opencl-ati tasks: 20k seconds versus 80k seconds run time, even though the CPU time is the same on both PCs, about 3500 seconds. I was hoping to get more out of the Benicia mobo, but you and the others have confirmed what I suspected: I will have to swap it out for the old Asus Formula. Hopefully I will replace one of the PCs with an LGA2011 system.
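
To show how little of that run time is CPU work, here is a small Python sketch using only the three figures above (the function names are illustrative):

```python
def cpu_fraction(run_time_s: float, cpu_time_s: float) -> float:
    """Share of the wall-clock run time that was spent as CPU time."""
    return cpu_time_s / run_time_s


def non_cpu_seconds(run_time_s: float, cpu_time_s: float) -> float:
    """Run time not accounted for by CPU time (GPU work, transfers, waiting)."""
    return run_time_s - cpu_time_s


CPU_TIME_S = 3500  # about the same on both PCs
for name, run_time_s in (("PC1", 20_000), ("PC2", 80_000)):
    print(
        f"{name}: CPU share {cpu_fraction(run_time_s, CPU_TIME_S):.1%}, "
        f"non-CPU time {non_cpu_seconds(run_time_s, CPU_TIME_S):.0f} s"
    )
```

Since the CPU time is the same, the extra 60k seconds on PC2 must come from the GPU side or from waiting on it, though running several tasks at once also inflates the per-task run time on both machines.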

My 2 BOINC PCs are both self-built from my "old" parts bin... with new power supplies, graphics, solid state drives etc. They are sort of dedicated, running about 50% of the time. I am building up capability to do millimetre wave modelling at home and I enjoy the idea of contributing what I can to E@H. I monitor the total power consumed with a new Cyberpower power-factor correcting UPS. The electricity for the 2 PC systems costs about 200 euro annually. - Nick

Nico2
Joined: 8 Jun 12
Posts: 6
Credit: 23025616
RAC: 0

Thanks Holmis! I was wondering how to do that.

OK, there it is and it works better: the R9 270X's GPU utilization on the slower PC is much more uniform from minute to minute. In addition, I let one of the 4 cores sit idle. The utilization is 33%, running 1 CPU + 1 ATI GPU with "Gamma Ray Pulsar Search #3". 33% GPU is less than I would like, but for now it is OK since it is still running off the old Benicia motherboard.

Quote:

Then go to your Einstein@Home prefs, click on "Add separate preferences for" and just choose a location. Once that's done, go to your list of computers, click on the details for the one you want to change, and set the location at the bottom. Match that up with the new prefs, and on the next server contact BOINC will get the new set of prefs. For the utilization factor, you need to be assigned new GPU work for the change to take effect.
