AMD 9600 Black OC effort

Astro
Joined: 18 Jan 05
Posts: 257
Credit: 1000560
RAC: 0

I'll try to put some "perspective" on what I've found. I own and run an AMD64 2800, a mobile AMD64 3700, an AMD64 3700, an AMD64 X2 4800, an AMD64 X2 5200, and an AMD64 X2 6000, and am currently running them all on Boincsimap (whose wus are pretty consistently finishing at nearly identical times). Each is on a micro mobo, and only overclocked between 5% and 8%.

I'm a bit disappointed with the 9600. I had hopes and dreams of it being able to do more, and faster, than it does. The AMD64 2800 takes 1:02 to do one simap wu, the AMD64 3700 takes 44 minutes, the AMD64 X2 4800 takes 42 minutes, the AMD64 X2 5200 takes 38 minutes, and the AMD64 X2 6000 does one in 35 minutes.

This quad at stock took 45 minutes, so it was somewhat slower per wu than the AMD64 X2 4800, but now that I'm up to 2.7 GHz, it's finishing them in 38:40 to 38:50, or about the same as my AMD64 X2 5200. So, I'm seeing this machine behave like two X2 4800s or 5200s, depending on clock.
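
One way to put the per-wu times above on a common footing is work units per hour for the whole box: cores x 60 / minutes per wu. The following is a minimal Python sketch, assuming each core runs one simap wu at a time and taking the 2.7 GHz figure as 38.75 minutes (the midpoint of 38:40 and 38:50). The function name is made up for illustration, and the mobile 3700 is left out because no time was quoted for it.

```python
def wus_per_hour(cores: int, minutes_per_wu: float) -> float:
    """Whole-machine throughput, assuming one wu per core at a time."""
    return cores * 60.0 / minutes_per_wu


# (cores, minutes per wu), taken from the timings quoted in this post
machines = {
    "AMD64 2800": (1, 62.0),
    "AMD64 3700": (1, 44.0),
    "AMD64 X2 4800": (2, 42.0),
    "AMD64 X2 5200": (2, 38.0),
    "AMD64 X2 6000": (2, 35.0),
    "9600 stock": (4, 45.0),
    "9600 @ 2.7 GHz": (4, 38.75),  # midpoint of 38:40 and 38:50 (assumption)
}

throughput = {name: wus_per_hour(c, m) for name, (c, m) in machines.items()}

# Print slowest to fastest
for name, rate in sorted(throughput.items(), key=lambda kv: kv[1]):
    print(f"{name:16s} {rate:5.2f} wu/hour")
```

On these numbers the overclocked quad comes out at roughly 6.2 wu/hour, against roughly 6.3 for two X2 5200s, which matches the "two X2s" impression.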

Astro

Last night before retiring, I bumped her down a notch. This morning I awoke to see something different from the lockup/screen freeze I've been getting. Gkrellm showed two cores off and two cores in an orange color instead of the green. The boxes for the BOINC manager had an outline but no content.

I bumped it back to the 2.6 setting shown earlier. Looks like I'm having trouble getting stable at 2.7 too.

Astro

Here are the current settings. Upped Vdimm to 2.15 and cut timings to 4,4,4,12, at 2.6 GHz:

FSB freq - 200
PCIE freq - 100
Processor freq mult - 13X
cpu-nb link speed - auto
cpu volt - 1.30
cpu-nb volt - auto
dram volt - 2.15
Mem freq 533/1066
2T, 4,4,4,12
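
For reference, the 2.6 GHz figure follows directly from the first and third settings: core clock = reference ("FSB") clock x multiplier. Below is a small Python sketch of that arithmetic. The helper names are invented for illustration, and the second call assumes the multiplier is held at 13X while only the reference clock is raised, which is only one of the ways to reach 2.7 GHz.

```python
def core_clock_mhz(ref_clock_mhz: float, multiplier: float) -> float:
    """Core clock = reference ("FSB") clock x CPU multiplier."""
    return ref_clock_mhz * multiplier


def ref_clock_for(target_mhz: float, multiplier: float) -> float:
    """Reference clock needed to hit a target core clock at a fixed multiplier."""
    return target_mhz / multiplier


def ddr_data_rate(mem_clock_mhz: float) -> float:
    """DDR memory transfers twice per clock, hence the 533/1066 pairing."""
    return mem_clock_mhz * 2


print(core_clock_mhz(200, 13))            # the 2.6 GHz setting listed above
print(round(ref_clock_for(2700, 13), 1))  # reference clock for 2.7 GHz at 13X
print(ddr_data_rate(533))                 # DDR2-1066
```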

Not sure if this will help with Boincsimap, but it may elsewhere. Gkrellm reports 3.993 G free out of the 4 G I have, so most of what simap works on must reside in the 2M L2. Is that right?

Astro

It's run stable all day after the last adjustments.

I installed 4 gig of Corsair XMS2 DDR2 1066 memory (4X1), but it was my intention to use two of those sticks for the upcoming Q6600 (along with the identical extra PSU and hard drive I ordered with the bits for this one). Today I placed the order for the Q6600, Asus P5K-VM mobo, and Arctic Cooling Freezer 7 Pro heatsink/fan. Anyway, after several wus with two sticks pulled I see no difference in performance, and will be leaving them out from this point forward. So, down to 2 gig of RAM.

Winterknight
Joined: 4 Jun 05
Posts: 1222
Credit: 312469585
RAC: 646769

Tony,
Have just read an article on the 3 core Phenom, and there are some links there to several forums that say there is a problem with the third cpu (core 2, assuming they are numbered 0, 1, 2, 3).

The basic advice is to keep this core at standard speed and overclock only the other three cores.

The article is at Tom's Hardware, which now seems to be down, so I cannot give a direct link to it. Googling "Phenom BSOD" brings up lots of discussions, even on AMD's own forum, of "A clock interrupt was not received on a secondary"

Andy

Winterknight

Tom's Hardware is back up. The article is "A First Look at AMD's Triple Core Phenom", and it's the next-to-last para on page 2 that has the links.

ML1
Joined: 20 Feb 05
Posts: 347
Credit: 86308057
RAC: 599

Message 79819 in response to message 79817

Quote:

Tony,
Have just read an article on the 3 core Phenom, and there are some links there to several forums that say there is a problem with the third cpu (core 2, assuming they are numbered 0, 1, 2, 3).

The basic advice is to keep this core at standard speed and overclock only the other three cores.

The article is at Tom's Hardware, which now seems to be down, so I cannot give a direct link to it. Googling "Phenom BSOD" brings up lots of discussions, even on AMD's own forum, of "A clock interrupt was not received on a secondary"


I've just stumbled upon similar comments elsewhere. So far, there's no clear detail other than "something to do with Core #2 perhaps"...

Tony,

if you're still testing Linux on your box and trying OC-ing, a very interesting test would be to disable each core in turn and see what difference in stability you might see...

As root, use:

echo 0 > /sys/devices/system/cpu/cpu2/online

to disable cpu2.

And:

echo 1 > /sys/devices/system/cpu/cpu2/online

to enable it once more.

Try that for cpu0 cpu1 and cpu3 also?
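
The one-core-at-a-time test can be sketched in Python as below. This is an illustration only, not a tested tool: the function names are invented, it needs root to touch the real sysfs files, and it assumes a core without an `online` file simply cannot be switched off (often the case for cpu0 on x86 kernels). Run as-is it only prints the four configurations and changes nothing.

```python
from pathlib import Path

SYSFS_CPU = Path("/sys/devices/system/cpu")  # standard Linux sysfs location


def set_core_online(cpu: int, online: bool, base: Path = SYSFS_CPU) -> bool:
    """Write 1 or 0 to cpuN/online. Returns False if that core has no
    'online' control file (assumed here to mean it cannot be offlined)."""
    control = base / f"cpu{cpu}" / "online"
    if not control.exists():
        return False
    control.write_text("1\n" if online else "0\n")
    return True


def one_core_off_plans(n_cores: int = 4):
    """Yield one configuration per core: that core off, all others on."""
    for off in range(n_cores):
        yield {cpu: cpu != off for cpu in range(n_cores)}


if __name__ == "__main__":
    for plan in one_core_off_plans():
        print(plan)  # dry run: shows what would be switched, changes nothing
```

To apply a configuration for real, call `set_core_online(cpu, state)` for each entry in a plan, run the overclock stability test, then bring every core back online before moving to the next plan.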

The results could be a clincher to clear up the various blurred comments whizzing around like wildfire at the moment!

Good luck,
Martin

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7024914931
RAC: 1808801

Message 79820 in response to message 79819

Quote:

Try that for cpu0 cpu1 and cpu3 also?

The results could be a clincher to clear up the various blurred comments whizzing around like wildfire at the moment!


That would indeed be an interesting experiment, which I'd love to see results from.

As a guy with a career in microprocessor design, test, and manufacture, let me just caution a little on the "clincher" part.

Any design, manufactured on any production equipment, will have unintended internal variation, some of it systematic, and some of it random.

If you see variation in speed among your cores, on a single sample, by itself it won't clarify whether a similar result may be expected on all parts of that stepping, all parts whose polysilicon layer was printed with the same reticle, or is just a unique one-off caused by a local subcritical defect.

I agree in this case, given the net buzz, that if you find the same core to be slower by the same amount that others have reported, the unique one-off case becomes less likely. If the reports are actually based on more than one observed sample, then your additional observation would certainly support the systematic model.

Even if it is systematic, it may well not affect all material. No two reticles are identical, and no two core subfields are even close to identical. On the only slightly charitable assumption that the AMD folks did not make a major foul-up, and worse yet, fail to notice it, my favorite theories at the moment are either that:

1. confirmation bias has helped people looking for the effect to see it when their method may have been inadequate.
2. a production reticle had a larger than usual difference among core sub-fields, and that reticle accounted for substantial production.

If you don't realize how very near the edge of frank pattern failure modern photo-masking runs, the idea that normal reticle variation would be a factor may seem strange to you. But in fact the underlying fundamental line-width control of production photolithography has for generations been failing to keep up with the Moore's law trend. So margins which were many sigma when I got into the manufacturing end of the trade in 1988 had shrunk hugely by the time I retired in 2004. People live with it by "managing the marginality", but things like this can and do happen. (Those who know a little more may object that the exposure wavelength has shortened, but in fact it has not shortened by anywhere near the reduction in printed line-width. The many other improvements have not in sum kept up the pace.)

Of course, we all make mistakes, and AMD may have just fouled up, seen it, and under the hideous deadline pressure of new product introduction, may have just decided to ship it. I've certainly seen worse.

I don't know to what degree software means really allow one to "turn off" cores of a Core 2 Quad. Further, many overclockers are running near motherboard limitations, which might obscure the effect. Still, it would be great fun if a group tried this sort of test on a few E6600s. Probably the best way to proceed would be to reduce the CPU voltage to nominal, or even slightly below, to avoid ambiguities from motherboard limitations.

If we compared how much core to core speed variation is seen on a few E6600s to that seen on a few of the AMD quads we'd have a better grip on whether something unusual was present on the AMD part.
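
A comparison like that could be reduced to two numbers per chip: how far apart the fastest and slowest cores are, and which core is the slowest. Here is a rough Python sketch, assuming the measurement is each core's highest stable clock in MHz. The function names are invented, and the sample figures are placeholders made up only to exercise the code, not measurements from any real part.

```python
from statistics import mean


def core_spread_pct(per_core_mhz: list[float]) -> float:
    """Gap between fastest and slowest core on one chip, as % of the mean."""
    return 100.0 * (max(per_core_mhz) - min(per_core_mhz)) / mean(per_core_mhz)


def slowest_core(per_core_mhz: list[float]) -> int:
    """Index of the core with the lowest stable clock (first one on a tie)."""
    return per_core_mhz.index(min(per_core_mhz))


# PLACEHOLDER numbers, invented for illustration; not measured data.
samples = {
    "chip A": [2700.0, 2700.0, 2600.0, 2700.0],
    "chip B": [2650.0, 2700.0, 2650.0, 2700.0],
}

for name, clocks in samples.items():
    print(f"{name}: spread {core_spread_pct(clocks):.1f}%, "
          f"slowest core {slowest_core(clocks)}")
```

The same slowest-core index turning up on several chips, with a spread clearly larger than on the comparison parts, would favour the systematic explanation. Scattered indices, or a spread no bigger than the comparison parts show, would not.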

While it would be "fairer" to check Q6600s, I'm pretty sure the two dice in them don't have any fixed position on wafer relation, so while that also would be interesting, it is a completely different question. Still, if Q6600 cores, with the extra variation of somewhat randomized die selection, proved better matched than the AMD quads, that would also be a comparison making the case that an unusual problem is at hand, and not just net buzz.

Astro

OK, then, here are the current settings, at 2.6 GHz:

FSB freq - 200
PCIE freq - 100
Processor freq mult - 13X
cpu-nb link speed - auto
cpu volt - 1.30
cpu-nb volt - auto
dram volt - 2.15
Mem freq 533/1066
2T, 4,4,4,12

Exactly the same for the last couple of days, and stable as far as I know. No freeze-ups, strange behaviour, etc. Remember it has the following components:

Enermax 620W 80+ eff PSU
2 Gigs Corsair XMS2, DDR2 1066 ram (two 1 G sticks, configured for dual channel).
Asus M3A motherboard
AMD 9600 Black Edition processor
Arctic Cooling Freezer 64 Pro heatsink/fan (has been lapped)
Hitachi Deskstar 80G SATA II
Mandriva 2007 Spring Linux OS, 64-bit

I just got home, it's mighty warm in this office so I'll leave the windows open to get the room back to normal, then start increasing the FSB and Voltage in the manner previously demonstrated to show where it begins to get flaky. I have to do it in the next few hours, as I go to bed early, and tomorrow is "play with the new Q6600 day". LOL

th3
Joined: 24 Aug 06
Posts: 208
Credit: 2208434
RAC: 0

IMO it's not ideal to run your RAM so high when establishing the CPU's limit. I would slow it down a notch or loosen the timings for now, and then tighten it up after you've found your CPU's limit. Isolate the CPU without stressing other components first.
