WCG ARP tasks used to take about 12 hours to run. Now they take 14-20 hours on the same CPU.
U@h tasks don't seem to take as long. Neither does PrimeGrid (if you cherry-pick your list for shorter run times).
A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!
Tom M wrote: WCG ARP tasks used to take about 12 hours to run…
This and a couple of other PCs were taking over 24 hours to finish an ARP task. These PCs run in the low-2 GHz range and are only good for the number of tasks their physical cores can handle at one time. Their PSUs limit them to single-plug GPUs, which is okay with me since I have enough of those.
Swapping out the CPUs is a test to see if it works. If it does, I may be able to extend the life of these machines for another couple of years and retire some of the 6- and 8-core PCs that are also running.
What happens when you power-restrict two performance-leading CPUs?
Here: https://www.anandtech.com/show/17641/lighter-touch-cpu-power-scaling-13900k-7950x
Bad link in your post. Fixed.
A Lighter Touch: Exploring CPU Power Scaling On Core i9-13900K and Ryzen 9 7950X
Keith Myers wrote: Bad link in your post. Fixed.
Fat fingers on phone. Thank you, Keith.
I was reading the Epyc "tuning" manual here.
And in the HPC (High Performance Computing) options it includes cTDP "OPN-Max" and PPL "OPN-Max" for settings on manual.
I can't seem to find a glossary.
I am inferring they mean 240 W for an Epyc 7742 CPU, which is apparently its max according to Google.
Is that right?
It also has all sorts of other interesting settings we could try too :)
Tom M
Those settings are just another name for the fused limits in the silicon.
So, yes, the cTDP and PPL fused limits for the 7742 are 240 W.
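For a rough sense of what that package ceiling means per core, here is the simple arithmetic on the figures above (a back-of-the-envelope sketch, not a measured number):

```python
# EPYC 7742: 64 cores sharing the 240 W fused cTDP/PPL package limit.
CORES = 64
PACKAGE_LIMIT_W = 240

per_core_w = PACKAGE_LIMIT_W / CORES
print(f"~{per_core_w:.2f} W per core at the all-core limit")  # ~3.75 W per core
```

Actual per-core draw varies with the workload, and the I/O die and uncore also eat into that budget, so the real figure available to each core is somewhat lower.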
The settings in the various tuning guides are generally for very specific workloads. So specific that particular CPU SKUs are designed for them.
As in, some settings are only for the CPUs that power the server routing calls through a cell tower, or settings specifically for an AWS cloud server running a database.
Generally you should just leave all the settings at default for generic workloads. Our distributed computing varies enough between projects that defaults are best. About the only thing worth "tuning" is to set the CPU for maximum all-core performance, with the cTDP and PPL limits at the fused max.
If you were to limit an Epyc CPU to a single project with very high demands on the L3 cache, for example, you might get some benefit from tuning L3 cache use. But that setting might then hurt performance on a different project that uses the CPU in a different manner.
Unless you have lots of time to test the various tweaks and levers you can pull in the BIOS, I find it best to just ignore most of them. I run too many projects that would require divergent tunings, so it's best to tune for a generic workload that gives acceptable performance across all projects.
When I was reading the reviews/info about the newly announced Xeon line, my head was spinning with all of the acceleration varieties these CPUs offer. It's pretty wild.
Although we employ a large number of Xeons, and probably will have more in the future, I wouldn't even know where to begin with these upcoming Xeons and how they could help BOINC projects.
I have a lot more reading to do...
For those who were interested in the Threadripper PRO 5995WX 64-core system: I will have more data next week about system temperatures (along with pictures of the heatsink, etc.). It is a pretty amazing CPU in a relatively small form factor.
Running at 100% load (all-core speed is 3.05 GHz), the CPU holds at 67.8°C on air cooling. Room is ~24°C.
Trying to determine why this A4500 takes longer to run a work unit than our other A4500 in a much older system. I think it has to do with Windows 11 (Professional), but I am not sure. Feel free to roll your eyes about Windows instead of Linux... students are working on two other self-built workstations that WILL have Ubuntu, but those are still in limbo.
Link to the host info: https://einsteinathome.org/host/13073591
Pushing WCG tasks through it to give it a nice stress test.
Could be an issue with the Windows scheduler, which still has a lot of catching up to do on AMD hardware to achieve parity with how it works on Intel hardware.
I don't know whether benchmarking the new TR host would make BOINC and the OS choose better optimizations, but I see that host hasn't been benchmarked yet. Still at the default BOINC values for new systems.
I have a teammate who spent over a year investigating why older Intel systems use fewer CPU resources than AMD systems in identical configurations.
Turns out the science application was not configured correctly for AMD systems while it was correct for Intel CPUs.
So it could be any of a lot of different variables causing the discrepancy.
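As an illustration of how that kind of misconfiguration creeps in: applications sometimes branch on the CPU vendor string to pick an optimized code path, and a branch keyed only to Intel silently drops AMD hosts onto a slow default path. A minimal Linux sketch (the "unknown" fallback is my own placeholder):

```python
def cpu_vendor() -> str:
    """Return the CPU vendor string from /proc/cpuinfo (Linux only)."""
    try:
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("vendor_id"):
                    # Line looks like: "vendor_id : AuthenticAMD"
                    return line.split(":", 1)[1].strip()
    except OSError:
        pass
    return "unknown"

# "GenuineIntel" on Intel, "AuthenticAMD" on AMD; a dispatch table that only
# lists "GenuineIntel" would misconfigure every AMD system it runs on.
print(cpu_vendor())
```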
I see the other, older Intel host is also faster than the TR host, though not as glaringly.
[Edit] I also wonder about core-to-core latencies, which are generally higher on AMD's chiplet processors than on the ring-bus architecture of Intel CPUs.
You might want to play around with locking the GPU feeder thread to a specific core so that it doesn't get moved off it and bounced around to other dies.
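On Linux, the pinning described above can be done from the standard library (on Windows 11, this host's OS, you would use `start /affinity` or a tool that wraps the Win32 affinity API instead). A minimal sketch, using pid 0 as shorthand for the calling process:

```python
import os

def pin_to_cores(pid: int, cores: set[int]) -> set[int]:
    """Pin a process to the given CPU cores; return its previous affinity mask."""
    previous = os.sched_getaffinity(pid)
    os.sched_setaffinity(pid, cores)
    return previous

# Demo on the current process (pid 0): pin to a single core we already own,
# verify the mask took effect, then restore the original mask.
target = {min(os.sched_getaffinity(0))}
old_mask = pin_to_cores(0, target)
assert os.sched_getaffinity(0) == target
os.sched_setaffinity(0, old_mask)
```

For a BOINC GPU task you would look up the science app's PID first; the point of the pin is to keep the feeder thread's working set in one die's L3 instead of letting the scheduler hop it across dies.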