Need your help!

Zhang Chi
Joined: 27 Aug 06
Posts: 210
Credit: 4406105
RAC: 0

Special thanks to you!
I've learnt more about computers from your posts.
I will go and test my CPU according to your suggestions.
I know little about computers. If there is anything wrong in what I say, please point it out. I am glad to learn from you!

Hello everyone! I'm Zhang Chi from China. I am 16 and I am a middle school student, and I love science. I want to be a scientist in the future!

Dave Burbank
Joined: 30 Jan 06
Posts: 275
Credit: 1548376
RAC: 0

And I am glad to share my thoughts with you.

Good luck with the overclocking!

There are 10^11 stars in the galaxy. That used to be a huge number. But it's only a hundred billion. It's less than the national deficit! We used to call them astronomical numbers. Now we should call them economical numbers. - Richard Feynman

DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3562358667
RAC: 1580

Longer-term stability is also something to watch out for. My A64 X2 can pass 12 hr Memtest86 and 72 hr Prime95 stress tests at 2.9 GHz, but anywhere above 2.8 GHz I'd suffer a random crash once a month.

Dave Burbank
Joined: 30 Jan 06
Posts: 275
Credit: 1548376
RAC: 0

Reboot or BSOD? Could it be a random power surge?

Wow, you really stressed that system. I don't think I've ever gone longer than 24 hr, probably because I couldn't handle a failure at that point; I'd be so disappointed. I've had it fail just after 8 hr... I wasn't too pleased!


DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3562358667
RAC: 1580

My machines are on UPSes and I'm using a brand-name 750 W PSU, so a power fluctuation is unlikely as a cause. I've seen total system freeze with an optional 0.5 s audio loop, BSOD, and random reboot as manifestations of the problem.

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7023824931
RAC: 1805761

Message 70826 in response to message 70820

Quote:


Here is a link to download Orthos stress tester : http://www.overclock.net/attachments/downloads/36840-orthos-v20060420-orthos_exe_20060420.zip?d=1165737486

I believe "small FFT's" is generally regarded as the best test to run.


Thanks for the link. I was a bit surprised to see that it drew almost exactly the same power on my Q6600 as do four Einsteins. Then I finally checked Windows Task Manager and realized it only had work running on two of the four CPUs.

I suppose if one's intent is to get higher temps, one could set BOINC for a two CPU limit, run Einstein, then start up Orthos.

I did not run long enough to get truly documented comparisons, but here is a rough idea on my Q6600 system, currently running 3.24 GHz at a commanded 1.4375 V CPU voltage. The power levels are measured at the input, so they include three hard drives, two optical drives, a fairly low-power graphics card, power supply inefficiency, and all the motherboard stuff in addition to the CPU.

Idle: 153 watts
Orthos: 245 watts
Einstein*4: 247 watts
Einstein*2: 207 watts
Einstein*2 plus Orthos: 270 watts

Interestingly, according to Task Manager, the 2 Einstein plus Orthos combination only gets two cores up to 100% utilization, leaving the other two very roughly at 90%.

I imagine the overclockers will want a Mersenne Prime application that loads quad cores properly.

Googling a bit, I spotted someone's workaround for getting two copies of Prime95 up to stress a dual core. The method seemed to generalize easily to four, so I tried it. Neglecting the methods, the key point is that four instances are started, each with affinity to a specific core, and each limited in memory to about a fifth of the actual installed system memory. I filled the cores in order 0 to 3 (I suspect this may affect power levels at the intermediate steps, but did not check).

1 instance: 218 watts
2 instances: 255 watts
3 instances: 262 watts
4 instances: 271 watts
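
For what it's worth, the increment per added instance can be pulled out of those readings with a few lines of Python. This is only a sketch of the arithmetic: the 153 W idle figure comes from the earlier list in this post, and the variable names are made up for illustration.

```python
# Wall-socket (input) readings from this post, in watts.
idle_watts = 153
prime95_watts = [218, 255, 262, 271]  # 1, 2, 3 and 4 instances

# Extra input power drawn by each added instance.
previous = idle_watts
increments = []
for watts in prime95_watts:
    increments.append(watts - previous)
    previous = watts

total_rise = prime95_watts[-1] - idle_watts

print(increments)   # [65, 37, 7, 9]
print(total_rise)   # 118
```

The first instance accounts for more than half of the 118 W rise over idle. These are input readings, so power supply inefficiency is folded into every step.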

In bringing up a Core 2 Duo E6600 in March 2007, I noticed that at a CPU voltage adequate for all other purposes, I'd get occasional Result errors (early aborts detected by the app, not misvalidations on results reported as successfully completed). I think by the end I'd raised the CPU voltage by at least four increments over that required for other purposes, and at the last level at which errors happened the rate was something like once in two weeks. SETI (on the current KWSN app of the time) was error-free at an appreciably lower voltage.

Your mileage may vary.

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7023824931
RAC: 1805761

Message 70827 in response to message 70826

Quote:

4 instances: 271 watts


Not one minute after I posted that comment, with four instances of Prime95 on CPU affinity still running, my system did a spontaneous reboot.

This is intriguing, as this exact configuration has run nearly 100 hours on this system at this clock rate and voltage, usually with four Einsteins, but also with one to four SETIs.

The reported die temperatures were nowhere near the maximum I've seen, as in some tests the room has been several degrees C hotter, and more importantly I've greatly lowered the fan rates a few times with the express purpose of running up die temperatures. I've seen about 10C higher, if not 15. So it appears something in Orthos triggered a system reboot at conditions Einstein would happily tolerate.

This was after perhaps ten minutes, so I intend to raise the CPU voltage a little and try again. I doubt 3.24 GHz is my long-term clock for this system, but I'd like to establish a safe operating point near here, to help understand margins, before dropping a bit to conserve power, and extend the likely life of the system.

Dave Burbank
Joined: 30 Jan 06
Posts: 275
Credit: 1548376
RAC: 0

Ahhh yes, I should have mentioned that: for quad cores you need to run two instances of Orthos with affinity set accordingly. Apparently the newest version of Prime95 will start four threads with the affinity properly set to each core.
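
As a rough illustration of "affinity set accordingly", here is a small Python sketch that works out one affinity bitmask per instance, with bit n standing for core n. It assumes each Orthos instance keeps two cores busy (which matches the two-of-four loading reported earlier in the thread) and that pairing adjacent cores is acceptable. The function name `affinity_masks` is hypothetical, and the sketch only computes the masks; it does not launch or pin anything.

```python
def affinity_masks(total_cores, cores_per_instance):
    """Return one bitmask per instance, each covering a run of adjacent cores."""
    masks = []
    for first in range(0, total_cores, cores_per_instance):
        mask = 0
        last = min(first + cores_per_instance, total_cores)
        for core in range(first, last):
            mask |= 1 << core  # bit n selects core n
        masks.append(mask)
    return masks

# Two two-thread instances on a quad core (the assumed Orthos case).
orthos_masks = [hex(m) for m in affinity_masks(4, 2)]
# Four single-thread instances, one per core.
single_masks = [hex(m) for m in affinity_masks(4, 1)]

print(orthos_masks)  # ['0x3', '0xc']
print(single_masks)  # ['0x1', '0x2', '0x4', '0x8']
```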

Great job with testing total power consumption. I was concerned that an o/c'ed Q6600 would draw much more power. 271 watts, not bad for four cores.

I find that Prime95/Orthos is harder on a system (will cause errors) than E@H. A sudden reboot or BSOD can quite often be attributed to unstable RAM, maybe try upping the RAM voltage a bit, or loosen the timings a little.


archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7023824931
RAC: 1805761

Message 70829 in response to message 70828

Quote:
I find that Prime95/Orthos is harder on a system (will cause errors) than E@H. A sudden reboot or BSOD can quite often be attributed to unstable RAM, maybe try upping the RAM voltage a bit, or loosen the timings a little.


Thanks for the advice. I think maybe I'll try a stretch of MemTest86+.

On the other hand, my RAM is actually underclocked (720 vs. spec of 800 MHz), and I let the board choose the RAS/CAS etc. stuff, which looked conservative. And, yes I've raised the RAM voltage 0.1 volt so it should be running at its nominal 1.8 instead of the chipset's nominal 1.7 that hardly any RAM has as its nominal.

None of which means it is not a RAM problem--I might just have a bad stick, for that matter.

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7023824931
RAC: 1805761

Message 70830 in response to message 70820

Quote:
Quote:
...in other words, I think Speedfan is closer to the truth for Q6600 B3 stepping than is Coretemp--there is a 15C offset because of this issue

I've gotten into many a discussion on OC forums on this topic, and have come to the opposite conclusion (with respect).

I started typing up a response, but then realized a post I wrote on another forum basically summarizes my opinion and explains it as best I can: http://www.overclock.net/2405987-post25.html

I have no computer industry experience, and have come to this conclusion through my readings from Intel data sheets and other info on the net. I fully realize I could be wrong and am curious as to how you have come to your conclusion.

Quote:
I've tested my CPU, but not using Prime95 or Orthos. I used Super Pi. It did pretty well. Now the temperature is down to 55 C, still overclocked to 3000 MHz.
And a question:
How high can an E4400 overclock? I've tried overclocking to 3500 MHz. The temperature reaches 65 C.

IMHO SuperPi is not enough to fully stress test a CPU, mainly because it doesn't run for a long period of time. I can run SuperPi 32M at 3906 MHz successfully, but it would fail Prime95 or Orthos (a multithreaded version of Prime95 suited for multicore CPUs) within minutes or even seconds.

With a DC project like E@H, where your computer will be running at 100% CPU load for hours...days...weeks...months...years, having an overclocked system that is fully stable is crucial. No one likes to lose a WU after 15 hours of crunch time because of an unstable overclock. So I highly recommend running Orthos for at least 8 hours and ideally even more. It will generate much more heat than SuperPi and a few degrees C more than BOINC, so it is a good test of how hot your CPU will get on a hot day.

Here is a link to download Orthos stress tester : http://www.overclock.net/attachments/downloads/36840-orthos-v20060420-orthos_exe_20060420.zip?d=1165737486

I believe "small FFT's" is generally regarded as the best test to run.


Clearly the key question, assuming common reports on the net are correct, is "What is the correct TJunction to be used in interpreting DTS data for Q6600 B3 stepping?"

Some references assert that MSR register 0xEE bit 30 is the magic decoder ring. 1 means 85C, 0 means 100 C.

Everest 4.00.976, downloaded today, reports for my B3 Q6600 purchased on July 23 (darn, I missed the G0):
MSR 000000EE 0000-0000-817D-CB00

I've read that bit 30 of register 0xEE is the 85/100 key, with 1 meaning 85. Assuming they count from the LSB, with LSB numbering 0, then I see 0 here, which should mean 100--fitting the pessimist view. However the Lavalys developer has posted quite recently that he is awaiting clarification from Intel--he clearly thinks there are inconsistencies here. But this probably explains why so many applications think the Q6600 is so amazingly hot, even when it is idle, and the resulting readings are preposterous.
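
To double-check the bit counting, here is a short Python sketch that pulls bit 30 (LSB numbered 0) out of the value Everest displayed. The 85/100 mapping is the unofficial rule described above, not something confirmed by Intel, and the variable names are just for illustration.

```python
# Value Everest displayed for MSR 0xEE, with the dashes stripped.
msr_ee = int("0000-0000-817D-CB00".replace("-", ""), 16)

# Bit 30, counting from the least significant bit as bit 0.
bit30 = (msr_ee >> 30) & 1

# Unofficial rule from the references: 1 -> 85 C, 0 -> 100 C.
tjunction_c = 85 if bit30 else 100

print(bit30, tjunction_c)  # 0 100
```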

At the moment the one that actually contains temperature data reads:

MSR 0000019C 0000-0000-881C-0000

The reference I found (_not_ official, that apparently requires NDA) says the "1C" part here is hex offset from TJunction. That gets me 28C, which for a TJunction of 85 gives me 57, or for a TJunction of 100 gives me 72.
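
The same arithmetic as a Python sketch, in case anyone wants to try it on their own readout. It assumes, per the unofficial reference, that the offset is the 7-bit field at bits 22:16, which is where the "1C" sits in this value; the names are made up for illustration.

```python
# Value Everest displayed for MSR 0x19C, with the dashes stripped.
msr_19c = int("0000-0000-881C-0000".replace("-", ""), 16)

# Assumption: the DTS offset below TJunction is the 7-bit field at bits 22:16.
offset_c = (msr_19c >> 16) & 0x7F

# Core temperature implied by each candidate TJunction.
readings = {tj: tj - offset_c for tj in (85, 100)}

print(offset_c)   # 28
print(readings)   # {85: 57, 100: 72}
```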

I picked ambient temperature and comparison to an E6600 Conroe system in the same case/motherboard as my reference. On that basis, a TJunction of 85 makes much more sense for Q6600 than does 100.

So far as I can see, you picked comparison to reported temperature from the "processor" measurement as your gold standard. In an ideal world that should work fine, but in contrasting the processor and DTS sensors, I think we should consider:

1. different location
2. different sensor technology
3. different vendor

as reasons for appreciable cumulative relative error. I can see right off the bat that the processor sensors for the two Gigabyte GA965P-DS3 motherboards I own are appreciably offset. One reads far below ambient when I idle the system and open the case, much more than the other.

I continue to think that 85 is closer to the mark for my sample of Q6600 than 100, but admit I'm not sure.

It sure does not help that Intel treats this all as some sort of state secret, and also something that we poor scum just should not care about.
