Hyperthreading and Task number Impact Observations

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6591
Credit: 327096324
RAC: 179000

RE: Nr. 3: what if the

Quote:
Nr. 3: what if the scheduler wasn't HT-aware? What if it assigned tasks completely at random? I just quickly tried to count the possibilities for scheduling 3 tasks over 3 HT cores and arrived at 24 lucky possibilities and 20 unlucky ones. If we assume the same probability for each one, that would mean 20/(24+20) = 45% unlucky assignments. That's not exactly the 28% from the previous paragraph and I'd be surprised if I didn't make some mistake here. But in my opinion it's somewhat close.. and should be much further away if the scheduler worked remotely the way I'd imagine.


Ah. Three physical cores ( 1, 2, 3 ) each twice virtualised ( a, b ) with three distinct processes ( A, B, C )?[pre]
Phys Virt
 1  | a | A A A A A A A A A A . . . . . . . . .
    | b | B B B B . . . . . . A A A A A A . . .
 2  | a | C . . . B B B . . . B B B . . . A A .
    | b | . C . . C . . B B . C . . B B . B B A
 3  | a | . . C . . C . C . B . C . C . B C . B
    | b | . . . C . . C . C C . . C . C C . C C
_______________________________________________
          . . . . . Y Y Y Y . . Y Y Y Y . . . .[/pre]
Where Y indicates 'good' combinations that don't compete for a physical core. Have I expressed your desired scenario correctly?

You just permute from ABC to ACB, BAC, BCA, CAB, CBA to get the others. But the fraction of good ones is still the same. The total good is 8 * 6 = 48, but out of 19 * 6 = 114 in all. Thus 8/19 = 48/114 ~ 0.42105 or 42% lucky and thus 48% unlucky ( assumes random assignment of task to virtual core ).

Cheers, Mike.

( edit ) I suppose we're gonna want a 4 HT core by 4 tasks matrix .... as Arnie says : I'll be back. :-)

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

BilBg
Joined: 27 May 07
Posts: 56
Credit: 23998
RAC: 0

RE: Thus 8/19 = 48/114 ~

Quote:
Thus 8/19 = 48/114 ~ 0.42105 or 42% lucky and thus 48% unlucky ( assumes random assignment of task to virtual core ).

Excellent explanation!

The funny part is that you got the hard part right and the easy final arithmetic wrong ;)

It's 100-42 = 58% unlucky

The "Answer to the Ultimate Question of Life, the Universe, and Everything" (42) got in the way :)
[url]http://en.wikipedia.org/wiki/42_(number)[/url]
http://en.wikipedia.org/wiki/Answer_to_the_Ultimate_Question_of_Life,_the_Universe,_and_Everything#The_number_42


- ALF - "Find out what you don't do well ..... then don't do it!" :)

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6591
Credit: 327096324
RAC: 179000

Typo? No .... err sales tax?

Typo? No .... err sales tax? No .... err brain fade? yes .... :-)

As they say in Anchorman ( The Legend of Ron Burgundy ).... They've done studies, you know. 60% of the time it works, every time. ...

Cheers, Mike.


Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6591
Credit: 327096324
RAC: 179000

OK here's, I believe, the

OK here's, I believe, the cases for 4 tasks on 4HT cores :

As you can see out of 70 cases in all, those marked with a 'Y' are 'good' ( 16 ). The remainder are not equivalent though, some as marked by the caret '^' are where two physical cores are contending and two cores idle ( 6 ). Those not especially marked have one physical core contending, two active but not contending, and one idle ( 70 - 16 - 6 = 48 ).

[ Same comment as before regarding the ( 24 ) permutations of A, B, C and D amongst themselves. ]

So that's 22.86% ( 16/70 ) are good, 8.57% ( 6/70 ) are worst, and 68.57% ( 48/70 ) are mediocre [ yup, that adds to 100! :-) ]
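
For anyone who wants to cross-check the tally without drawing the grid, here is a minimal brute-force sketch in Python. It assumes, as above, that every choice of 4 distinct virtual CPUs out of 8 is equally likely. The function name `classify_placements` and the numbering of virtual CPUs ( 2k and 2k+1 sharing physical core k ) are conventions made up for this illustration, not anything an OS promises.

```python
from collections import Counter
from itertools import combinations


def classify_placements(n_cores: int, n_tasks: int) -> Counter:
    """Count placements by how many physical cores host two tasks.

    Virtual CPUs are numbered 0 .. 2*n_cores - 1, and virtual CPUs
    2k and 2k+1 are ASSUMED to share physical core k (illustrative only).
    """
    tally = Counter()
    for slots in combinations(range(2 * n_cores), n_tasks):
        # How many tasks landed on each physical core in this placement?
        per_core = Counter(slot // 2 for slot in slots)
        contended = sum(1 for used in per_core.values() if used == 2)
        tally[contended] += 1
    return tally


if __name__ == "__main__":
    tally = classify_placements(n_cores=4, n_tasks=4)
    total = sum(tally.values())
    for contended in sorted(tally):
        share = 100 * tally[contended] / total
        print(f"{contended} contended core(s): "
              f"{tally[contended]} of {total} ({share:.2f}%)")
```

Key 0 is the 'good' class, key 1 the mediocre one and key 2 the worst. Permuting the task labels multiplies every class by the same 24, so the shares are unchanged.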

Cheers, Mike.

( edit ) Ya, 70 total cases is right : 8! / ((8 - 4)! * 4!) = (8 * 7 * 6 * 5)/(4 * 3 * 2 * 1) = 7 * 2 * 5 = 70.
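
The three class counts can also be reached by counting choices rather than enumerating, which makes a handy sanity check ( same assumption : every set of 4 virtual CPUs equally likely ) :

```latex
\begin{aligned}
\text{good}     &= 2^4 = 16
                && \text{one task per core, either virtual CPU on each} \\
\text{worst}    &= \binom{4}{2} = 6
                && \text{choose the 2 cores that are fully occupied} \\
\text{mediocre} &= 4 \cdot \binom{3}{2} \cdot 2^2 = 48
                && \text{1 doubled core, 2 of the other 3 singly used} \\
\text{total}    &= 16 + 6 + 48 = 70 = \binom{8}{4}
\end{aligned}
```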

( edit ) Whoops, missed a case ( marked with '*' ) for 3 tasks on 3 HT cores :[pre]
Phys Virt
 1  | a | A A A A A A A A A A . . . . . . . . . .
    | b | B B B B . . . . . . A A A A A A . . . .
 2  | a | C . . . B B B . . . B B B . . . A A A .
    | b | . C . . C . . B B . C . . B B . B B . A
 3  | a | . . C . . C . C . B . C . C . B C . B B
    | b | . . . C . . C . C C . . C . C C . C C C
_________________________________________________
          . . . . . Y Y Y Y . . Y Y Y Y . . . * .[/pre]
Thus 8/20 = 0.4 or 40% lucky and thus 60% unlucky. Sorry ......
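
To guard against another miscount, the lucky fraction can be written in closed form for n tasks on n two-way HT cores : 2^n good placements ( one virtual CPU picked on each core ) out of C(2n, n) in all. A small Python sketch, assuming again that every placement is equally likely; `lucky_fraction` is just an illustrative name.

```python
from fractions import Fraction
from math import comb


def lucky_fraction(n_cores: int) -> Fraction:
    """Fraction of placements in which no physical core is shared.

    n_cores tasks sit on distinct virtual CPUs of n_cores two-way HT
    cores; every placement is ASSUMED equally likely.
    """
    good = 2 ** n_cores                 # one of 2 virtual CPUs on each core
    total = comb(2 * n_cores, n_cores)  # any n_cores of the 2*n_cores virtual CPUs
    return Fraction(good, total)


if __name__ == "__main__":
    for n in (3, 4):
        frac = lucky_fraction(n)
        print(f"{n} tasks on {n} HT cores: {frac} = {float(frac):.4f}")
```

For n = 3 this agrees with the corrected 8/20, and for n = 4 with the 16/70 above.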


archae86
Joined: 6 Dec 05
Posts: 3162
Credit: 7319055021
RAC: 2313306

RE: So that's 22.86% (

Quote:

So that's 22.86% ( 16/70 ) are good, 8.57% ( 6/70 ) are worst, and 68.57% ( 48/70 ) are mediocre [ yup, that adds to 100! :-) ]

Cheers, Mike.


I spent some time getting relative performance estimates for the assignments Mike here calls Good, Mediocre, and Worst.

For this purpose I used a new range of frequencies/seq, having exhausted my stock of the previous, but actually think these are still very close to the previous in work content. However, I switched the metric from reported task CPU time to elapsed time, particularly because I observed some anomalous behavior in my test condition:

Some of the time, Windows would not activate all of the currently executing Einstein tasks, even though the affinity prescriptions left open a virtual CPU. One practical impact was a much bigger discrepancy between elapsed time and reported CPU time than usual.

Paradoxically, consider the most restrictive case of allowing a task to run on one and only one virtual CPU. One would expect it to suffer from the occasional wait for a higher priority system task which happened to get assigned that CPU during one of the many times per second that the Einstein task is out on a context switch (waiting for disk, or ...). Suffer it doubtless does, but on my rather artificial test cases, sometimes the most restrictive assignment was far more productive than the most free one (specifically, in the case of running four Einstein tasks on two physical cores--thus four CPUs--it was much more productive to hard assign each of the four tasks than to allow all four free range).
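
As a concrete picture of the two extremes, here is a small Python sketch that builds the affinity bitmasks involved, in the hexadecimal form accepted by tools such as the Windows `start /affinity` command. The numbering (logical CPUs 0 to 3 being the four virtual CPUs of two physical cores) is an assumption for illustration; check how your own machine enumerates its virtual CPUs. `affinity_mask` and `pin_one_per_cpu` are hypothetical helper names, not part of any tool.

```python
def affinity_mask(logical_cpus):
    """Return the bitmask with one bit set per allowed logical CPU."""
    mask = 0
    for cpu in logical_cpus:
        mask |= 1 << cpu
    return mask


def pin_one_per_cpu(n_tasks, first_cpu=0):
    """Most restrictive case: task i may run only on logical CPU first_cpu + i."""
    return [affinity_mask([first_cpu + i]) for i in range(n_tasks)]


if __name__ == "__main__":
    # Four tasks hard-assigned to four logical CPUs (ASSUMED to be CPUs 0-3,
    # i.e. both virtual CPUs of two physical cores).
    for task, mask in enumerate(pin_one_per_cpu(4)):
        print(f"task {task}: affinity mask 0x{mask:X}")
    # "Free range" over the same four logical CPUs is one shared mask.
    print(f"free range: 0x{affinity_mask(range(4)):X}")
```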

Here are some comparison numbers. I'll leave it to others to estimate whether these, matched with Mike's proportions, suggest that Windows is doing a bit better or a bit worse than one might predict on random assignment.

The top populated HT_4 row corresponds to Mike's Good (.7622), the next two rows both represent Mike's Bad, but I suspect the first of the two (.5692) is more representative of actual occurrence, and the throughput composite line (.6596) is my estimate of Mike's Mediocre case. The relative performance numbers in this paragraph are all estimates of aggregate system throughput compared to a HT_8 case on the same work.
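
For anyone taking up that estimate, here is a rough sketch of the arithmetic in Python: weight each relative throughput figure from the paragraph above by Mike's share of the 70 placements. It assumes placements are chosen uniformly at random and that tasks then stay where they landed, which a real scheduler certainly does not do, and it takes .5692 as the figure for the worst case. Treat the result as a yardstick for random assignment, not as a measurement.

```python
from fractions import Fraction

# (share of the 70 placements, relative throughput versus HT_8)
# Shares are Mike's counts; throughputs are the figures quoted above.
CASES = {
    "good":     (Fraction(16, 70), 0.7622),
    "mediocre": (Fraction(48, 70), 0.6596),
    "worst":    (Fraction(6, 70), 0.5692),  # ASSUMES the first of the two rows
}


def expected_throughput(cases):
    """Probability-weighted mean of the relative throughput figures."""
    return sum(float(share) * perf for share, perf in cases.values())


if __name__ == "__main__":
    print(f"expected relative throughput under random assignment: "
          f"{expected_throughput(CASES):.4f}")
```

An observed aggregate throughput above this weighted mean would suggest the scheduler does better than random placement, and one below it worse.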

One should perhaps mention that the designers of the Windows scheme probably did not consider maximizing system aggregate throughput of persistent very low priority tasks as high on their desired outcome list.

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6591
Credit: 327096324
RAC: 179000

To be clear for other readers

To be clear for other readers : by elapsed time is meant as per a clock on the wall from the task beginning until completed, whereas CPU time is the total of accumulated time slices devoted to the task. The difference would be how long the task is waiting to be executed on a CPU. Think of a CPU, physical or virtual, as a contended resource which requires allocation - this is an OS dependent function that switches tasks to CPUs on a priority basis. Tasks thus compete amongst themselves for CPU time, including the tasks we "don't see" like a heap of mundane OS stuff. But our tasks need those mundane ones to perform ( disk access say ) and thus may be 'blocked' in proceeding while awaiting their completion.
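
A minimal Python sketch of the distinction, using only the standard library : `time.perf_counter()` plays the clock on the wall and `time.process_time()` the accumulated CPU time of the process. The helper names `measure` and `busy` are made up for the illustration, and a sleep stands in for a task that is blocked or waiting its turn.

```python
import time


def measure(work, *args):
    """Run work(*args) and return (elapsed_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # wall-clock reading
    cpu_start = time.process_time()    # CPU time charged to this process
    work(*args)
    cpu = time.process_time() - cpu_start
    elapsed = time.perf_counter() - wall_start
    return elapsed, cpu


def busy(n):
    """Pure computation: elapsed and CPU time should be close."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    jobs = (("busy", busy, 2_000_000), ("sleep", time.sleep, 0.5))
    for name, work, arg in jobs:
        elapsed, cpu = measure(work, arg)
        print(f"{name:5s}: elapsed {elapsed:.3f} s, CPU {cpu:.3f} s, "
              f"waiting {elapsed - cpu:.3f} s")
```

On an otherwise idle machine the busy loop shows the two figures nearly equal, while the sleep shows plenty of elapsed time and almost no CPU time.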

Cheers, Mike.


archae86
Joined: 6 Dec 05
Posts: 3162
Credit: 7319055021
RAC: 2313306

Mike Hewson wrote: by elapsed

Mike Hewson wrote:
by elapsed time is meant as per a clock on the wall from the task beginning until completed, whereas CPU time is the total of accumulated time slices devoted to the task.


Just so. And most of the time you want to quote CPU time for various comparisons, as it is generally less subject to variation from the "otherwise" state of the system than is elapsed time.

But for my tests I believe I did a pretty good job of avoiding nearly all the appreciable time consumers, so that for most cases either would serve--save in this toxic mis-allocation, where an available task theoretically in execution nevertheless fails to get assigned to an available CPU at substantial likelihood over an extended period of time.

On a completely unrelated note--I appear to have killed my Westmere host late this afternoon. I was debugging the problem of only getting 4G out of 6G of RAM, and had just completed the last step in satisfying Corsair's RMA requirements by testing each RAM module of the set separately with memtest86+. It failed even to boot with the offending module, so could be deemed to have failed the test. But something happened in transitioning back to a known good RAM configuration (for one thing, I think I failed to turn off the power supply before shuffling RAM modules, a rookie mistake for sure), and as of now the system gives no signs of life at all save for consuming 20 watts from the wall. No fan spins, no beep codes, no Mobo Dr Debug digits displayed, no sounds from hard drive or CD drive--in fact no detectable response to pressing the front panel "power" button at all--not even in power consumption, which remains steady at 20W (before this death, the behavior was that on turning on the real power switch on the supply, it would go up to about 20W over a couple of seconds, stay there for a couple of seconds, then descend to about 5W until the front panel button was pressed). Yes, I have exercised the ClrCMOS jumper. At the moment I suspect death of the motherboard or of the power supply, though there are some other possibilities. I plan to sleep on it, and tomorrow disconnect very nearly everything (eventually including the CPU).

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6591
Credit: 327096324
RAC: 179000

RE: On a completely

Quote:
On a completely unrelated note--I appear to have killed my Westmere host late this afternoon. I was debugging the problem of only getting 4G out of 6G of RAM, and had just completed the last step in satisfying Corsair's RMA requirements by testing each RAM module of the set separately with memtest86+. It failed even to boot with the offending module, so could be deemed to have failed the test. But something happened in transitioning back to a known good RAM configuration (for one thing, I think I failed to turn off the power supply before shuffling RAM modules, a rookie mistake for sure), and as of now the system gives no signs of life at all save for consuming 20 watts from the wall. No fan spins, no beep codes, no Mobo Dr Debug digits displayed, no sounds from hard drive or CD drive--in fact no detectable response to pressing the front panel "power" button at all--not even in power consumption, which remains steady at 20W (before this death, the behavior was that on turning on the real power switch on the supply, it would go up to about 20W over a couple of seconds, stay there for a couple of seconds, then descend to about 5W until the front panel button was pressed). Yes, I have exercised the ClrCMOS jumper. At the moment I suspect death of the motherboard or of the power supply, though there are some other possibilities. I plan to sleep on it, and tomorrow disconnect very nearly everything (eventually including the CPU).


Oh. With a bit of luck it'll turn out to be something cheaper and simpler like the power supply not supplying a trickle current to the board ( so that it knows when the power switch has been toggled via the mobo input pins ) for full switch on. Swap in a known good PSU and see what happens ... I've seen this before and found/claimed the PSU capacitors at fault ( age plus paper'n'paste and not solid-state ). So your 20W could represent a 'short' across the capacitors, dropping the voltage of outputs ( including the PSU's own fans ), and of course ripple control, thus little happens on the mobo. Look for eburnation of the capacitor connections to the printed circuit board. I really like Corsairs myself.

I'll look at the recent Westmere data and see if I can soundly deduce anything.

Cheers, Mike.


archae86
Joined: 6 Dec 05
Posts: 3162
Credit: 7319055021
RAC: 2313306

Mike Hewson wrote:Oh. With a

Mike Hewson wrote:
Oh. With a bit of luck it'll turn out to be something cheaper and simpler like the power supply not supplying a trickle current to the board ( so that it knows when the power switch has been toggled via the mobo input pins ) for full switch on. Swap in a known good PSU and see what happens ...


I disconnected the suspect supply and did rudimentary testing. With a 575 ohm resistor across Vsb to meet the minimum current requirement, I saw 5.1V on Vsb. When I shorted /PS_ON to ground, I saw voltages on all main supplies too close to correct to give this stone cold dead behavior, even though I was not providing any load to them. Then I attached a different supply to the motherboard 8 pin and 24 pin ATX connectors, and got the same behavior, save only that the power consumed was 10W instead of 20W.

I failed to mention this before, but when healthy, the system draw when "off" (meaning standby) before was something like 2W. I suspect something failed on the motherboard is putting a heavy load on Vsb. If the motherboard itself is not at fault, I think the most likely thing is that I fried the CPU during mishandling the RAM in a way that happens to present an intolerable load to motherboard or supply. So once I have the HSF and CPU off, I'll probably do a last trial to see if the mobo shows a little life (debug LED at least) in that state.

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6591
Credit: 327096324
RAC: 179000

RE: I disconnected the

Quote:

I disconnected the suspect supply and did rudimentary testing. With a 575 ohm resistor across Vsb to meet the minimum current requirement, I saw 5.1V on Vsb. When I shorted /PS_ON to ground, I saw voltages on all main supplies too close to correct to give this stone cold dead behavior, even though I was not providing any load to them. Then I attached a different supply to the motherboard 8 pin and 24 pin ATX connectors, and got the same behavior, save only that the power consumed was 10W instead of 20W.

I failed to mention this before, but when healthy, the system draw when "off" (meaning standby) before was something like 2W. I suspect something failed on the motherboard is putting a heavy load on Vsb. If the motherboard itself is not at fault, I think the most likely thing is that I fried the CPU during mishandling the RAM in a way that happens to present an intolerable load to motherboard or supply. So once I have the HSF and CPU off, I'll probably do a last trial to see if the mobo shows a little life (debug LED at least) in that state.


Darn. Let's hope it's the mobo.

[aside]
I lost a CPU once because of a cheap case. When screwing in the case cover screws, the soft metal in the edges of the screw holes made little shavings. One fell across the CPU pins at one edge ( pre CPU heat-sink days, and pre air-in-a-can days ) causing a dead short on boot up and dead CPU. What are the odds on that? Some days it can be like 'for want of a nail, a horseshoe was lost ....' :-)
[/aside]

Cheers, Mike.

