AMD 5800X3D, magic cache or pricy disappointment?

Peter van Kalleveen
Joined: 15 Jan 19
Posts: 45
Credit: 250329645
RAC: 0
Topic 227454

Hey Guys,

Here to share some very early experiences with the 3D cache SKU.

I could not find any really good compute references, so I just purchased one. I was hoping it might resolve the relatively long loading time when starting a GPU WU and during the last 10% of the WU, when it seems not much is happening but minutes pass by.

So my first impression of the 5800X3D was: holy crap, it just leaves the starting blocks, blasts the WU to the GPU and keeps a good tempo.

Right past the 90% mark, up to 99%... and then it again takes minutes before completing, unfortunately.

I do have to say my GPU utilization is pretty good; I have the feeling the cache is helping up until the last 1%.

 

Currently running the 5800X3D with SMT disabled and Infinity Fabric at 1:1 with DRAM at 3200 MHz CL14.

I have 3 WUs running at once, with 1 core assigned per WU.

The GPU is an AMD Radeon Pro VII and an average O3AS WU takes around 11 minutes, +/- 30 sec.

 

Any other testing ideas for the cache monster? I could get some better DRAM and go to 3800 MHz at 1:1.

Or does anyone else have experience with this SKU? I can't really remember whether it's much better than a 5800X or similar.

Feels a bit faster at least.

Keith Myers
Joined: 11 Feb 11
Posts: 4699
Credit: 17543361110
RAC: 6390909

Do you have any CPU-only project to compare runtimes of a non-3D SKU against the 3D SKU? Curious if the larger L3 cache has any benefits at all. It is probably still up to the individual science application whether the large L3 cache can be used.

 

Keith Myers
Joined: 11 Feb 11
Posts: 4699
Credit: 17543361110
RAC: 6390909

Michael at Phoronix.com did both gaming and productivity/machine learning tests. It was a wash for Linux gaming, but the CPU test suite showed the same impressive improvements that his previous Milan-X tests did.

 

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6534
Credit: 284700169
RAC: 113726

FWIW: the last bit of the WU is spent producing a quite long ordered list of the candidate signals discovered, to be reported back to the project. This typically produces that pause in the apparent progress of the processing after 90%. I'd say your observations probably indicate that this phase is either not particularly cache dependent, or if it is, then it produces many cache misses.

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

Peter van Kalleveen
Joined: 15 Jan 19
Posts: 45
Credit: 250329645
RAC: 0

Unfortunately I don't have a normal 5800X on hand.

Only my TRX40, but that's a gen 2 Ryzen, so I put the Radeon Pro in my highly overclocked i7-12700KF rig.

The results were very surprising indeed.

This rig has all efficiency cores turned off so the ring bus can clock to 4.8 GHz at a 1:1 ratio with the performance cores. Memory is DDR4 at 4000 MHz, CL16. So a very beefy processing setup.

1 Core per WU

And it comes in at around 11.30 minutes with only 2 WUs active; the actual GPU processing part even takes a bit longer than on the Ryzen with 3 WUs.

The Ryzen did 3 WUs in around 11 minutes.
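
To put the two setups on the same scale, the batch times can be converted to work units per hour. This is only a rough sketch: it assumes "11.30 minutes" means 11 min 30 s, that all concurrent WUs in a batch finish in about the same wall-clock time, and the function name is made up for illustration.

```python
def wu_per_hour(concurrent_wus: int, minutes_per_batch: float) -> float:
    """Work units completed per hour when `concurrent_wus` tasks
    run side by side and each batch takes `minutes_per_batch`."""
    if concurrent_wus <= 0 or minutes_per_batch <= 0:
        raise ValueError("inputs must be positive")
    return concurrent_wus * 60.0 / minutes_per_batch


# Figures quoted in the post (the 11.5 is an assumed reading of "11.30").
ryzen = wu_per_hour(concurrent_wus=3, minutes_per_batch=11.0)
intel = wu_per_hour(concurrent_wus=2, minutes_per_batch=11.5)

print(f"5800X3D, 3 WUs at once: {ryzen:.1f} WU/h")
print(f"12700KF, 2 WUs at once: {intel:.1f} WU/h")
print(f"throughput ratio: {ryzen / intel:.2f}x")
```

Since the two runs used a different number of concurrent WUs, the ratio mixes the CPU difference with the concurrency difference; a 2 WU run on the 5800X3D would be needed to separate them.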

After some looking around at performance metrics: in Task Manager you have the option of additionally displaying the kernel time, so you can see how fully loaded the core is and how long the core is actually waiting for work, if I understand correctly.

With the i7 the cores are loaded to 80-100%, but the kernel time metric is really low, around 10-40% depending on the stage of the WU.

With the 5800X3D the kernel time almost matches the core load: if the load is around 80-90%, the kernel time is around 70-80%.

So it's not a true apples-to-apples comparison, but I must say I was pleasantly surprised that the 5800X3D sort of blows the 12700 out of the water, because to my knowledge the i7 was superior to the 5800X in almost all cases.

Maybe I will do a 2 WU test with the Radeon and the 5800X3D to make a better comparison, but I'm almost certain it will be even faster.

More cache for the people!

Peter van Kalleveen
Joined: 15 Jan 19
Posts: 45
Credit: 250329645
RAC: 0

Don't impatiently click post buttons or your post gets there in triple.

Peter van Kalleveen
Joined: 15 Jan 19
Posts: 45
Credit: 250329645
RAC: 0

Keith Myers wrote:

Do you have any CPU-only project to compare runtimes of a non-3D SKU against the 3D SKU? Curious if the larger L3 cache has any benefits at all. It is probably still up to the individual science application whether the large L3 cache can be used.

That would probably be a good test as a sort of benchmark, but GPU acceleration is just so much better at getting a lot of compute done.

It's also only an 8-core, so the amount of CPU compute that can be done is very limited.

I hoped it would help with the CPU work around the GPU WUs, and it does seem to help a bit. Unfortunately the CPU part is still around 50% of total processing time, so it's not a magic 500% speedup for pushing more compute through the system.

But does seem to help a pretty decent amount at least.
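
The limit described here is essentially Amdahl's law. As a minimal sketch, assuming the CPU and GPU phases of a WU run strictly one after the other and that the CPU share really is about 50% (both are simplifications, and the function name is invented for illustration):

```python
def overall_speedup(cpu_fraction: float, cpu_speedup: float) -> float:
    """Amdahl's law: whole-task speedup when only the CPU-bound
    fraction of the runtime gets faster by `cpu_speedup`."""
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    if cpu_speedup <= 0.0:
        raise ValueError("cpu_speedup must be positive")
    return 1.0 / ((1.0 - cpu_fraction) + cpu_fraction / cpu_speedup)


CPU_FRACTION = 0.5  # "around 50% of total processing time" from the post

# Hypothetical CPU-side speedups, purely to show the shape of the curve.
for s in (1.25, 1.5, 2.0, 5.0):
    print(f"CPU part {s:.2f}x faster -> WU {overall_speedup(CPU_FRACTION, s):.2f}x faster")

# Even an infinitely fast CPU part cannot beat this ceiling.
ceiling = 1.0 / (1.0 - CPU_FRACTION)
print(f"upper bound: {ceiling:.1f}x")
```

With half the time spent on the GPU, the whole WU can never get more than 2x faster from CPU improvements alone, which matches the "helps a decent amount, but no miracle" impression.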

Now I have to find an AM4 motherboard that can house 4 GPUs; with 2 WUs per GPU it would be a beast.

 

And a small detail: it runs very efficiently.

It's humming along at around 75 W for the CPU, while the Alder Lake system hovers around 160 W.
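
Combining these power figures with the batch times mentioned earlier in the thread gives a rough CPU energy cost per WU. This is a back-of-the-envelope sketch only: it counts CPU power and ignores the GPU and the rest of the system, it assumes the 12700KF run took 11 min 30 s for 2 WUs and the 5800X3D run 11 min for 3 WUs, and the function name is hypothetical.

```python
def cpu_wh_per_wu(cpu_watts: float, minutes_per_batch: float, concurrent_wus: int) -> float:
    """CPU energy in watt-hours attributed to each work unit in a batch."""
    if cpu_watts < 0 or minutes_per_batch <= 0 or concurrent_wus <= 0:
        raise ValueError("invalid input")
    batch_wh = cpu_watts * minutes_per_batch / 60.0  # energy for the whole batch
    return batch_wh / concurrent_wus


ryzen_wh = cpu_wh_per_wu(cpu_watts=75.0, minutes_per_batch=11.0, concurrent_wus=3)
intel_wh = cpu_wh_per_wu(cpu_watts=160.0, minutes_per_batch=11.5, concurrent_wus=2)

print(f"5800X3D: {ryzen_wh:.1f} Wh of CPU energy per WU")
print(f"12700KF: {intel_wh:.1f} Wh of CPU energy per WU")
print(f"ratio: {intel_wh / ryzen_wh:.1f}x")
```

Total energy per WU would be dominated by the GPU, so the gap at the wall will be smaller than this CPU-only ratio suggests.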

Keith Myers
Joined: 11 Feb 11
Posts: 4699
Credit: 17543361110
RAC: 6390909

The MSI X570 Godlike comes to mind, with 4 full-length PCIe slots spaced 2 card slot widths apart. It would support 4 GPUs natively on the board without extenders.

 

mikey
Joined: 22 Jan 05
Posts: 11888
Credit: 1828065059
RAC: 206393

Peter van Kalleveen wrote:

Don't impatiently click post buttons or your post gets there in triple.

Try two periods as the only thing in a post and it might disappear, as long as you are within the edit time.

Keith Myers
Joined: 11 Feb 11
Posts: 4699
Credit: 17543361110
RAC: 6390909

Unfortunately negated if you have a signature.  That keeps the post visible.  Guilty as charged.

 

Peter van Kalleveen
Joined: 15 Jan 19
Posts: 45
Credit: 250329645
RAC: 0

I know, I tried to find one. It's not for sale anymore and it was crazy expensive when it was.

I wonder if GPU performance suffers much if I just use a bifurcation riser on the primary x16 slot to split it into x4x4x4x4.

If the WUs are not too heavy on PCIe bus transfers, that could be a great solution, with some good PCIe ribbon cables to keep it at Gen3.
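
For a sense of scale, the theoretical per-direction link bandwidth can be worked out from the PCIe signalling rate. The sketch below uses the standard figures for Gen3 and Gen4 (8 and 16 GT/s per lane, 128b/130b encoding) and ignores packet and protocol overhead, so real transfers will come in lower. Whether the application actually needs that bandwidth is not something this calculation can answer.

```python
# Per-lane signalling rate in GT/s for the generations relevant here.
GT_PER_S = {3: 8.0, 4: 16.0}
# Gen3 and Gen4 both use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128.0 / 130.0


def link_gb_per_s(gen: int, lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    if gen not in GT_PER_S:
        raise ValueError("only Gen3 and Gen4 are covered in this sketch")
    if lanes <= 0:
        raise ValueError("lanes must be positive")
    # One transfer carries one bit per lane; divide by 8 for bytes.
    return GT_PER_S[gen] * ENCODING_EFFICIENCY * lanes / 8.0


full_slot = link_gb_per_s(gen=3, lanes=16)
per_gpu_bifurcated = link_gb_per_s(gen=3, lanes=4)

print(f"Gen3 x16: {full_slot:.2f} GB/s")
print(f"Gen3 x4 (one GPU on an x4x4x4x4 riser): {per_gpu_bifurcated:.2f} GB/s")
print(f"Gen4 x4 for comparison: {link_gb_per_s(gen=4, lanes=4):.2f} GB/s")
```

Each GPU on the riser gets a quarter of the slot, so watching the PCIe bus utilization of a single card during a WU would show whether x4 is likely to become the bottleneck.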
