(Solved) AMD FX 4320 and 8350 performance

Konz Andy
Joined: 22 Aug 10
Posts: 7
Credit: 12449779
RAC: 0
Topic 201195

Hello

 

I got a new (used) computer: an FX 8350 on an Asus Crosshair Formula-Z board, into which I installed my GTX 660 Ti.

My brother has the FX 4320 on an Asus M5A standard board with a GTX 770.

 

How is it possible that a CPU task takes 50,000 s on my computer, while it takes only around 40,000 s on my brother's computer?

Do I have a weak setup or poor drivers? All the cores run at 100%, and both machines run 3 WUs on the GPU.

 

You can check the results under the usernames "Andy KONZ" and "petz1972".

 

Thanks for your help

 

Andy

Zalster
Joined: 26 Nov 13
Posts: 2852
Credit: 2878991581
RAC: 318330

You didn't specify how many work units you are running on the CPUs in both systems. 

Konz Andy
Joined: 22 Aug 10
Posts: 7
Credit: 12449779
RAC: 0

Hello,

 

On the 4320: 4 WUs on the CPU and 3 WUs on the GPU

 

On the 8350: 8 WUs on the CPU and 3 WUs on the GPU

 

The only advantages my brother has with the 4320 are the better GPU card and 1866 MHz RAM compared to my 1600 MHz RAM.

 

greetings

Zalster
Joined: 26 Nov 13
Posts: 2852
Credit: 2878991581
RAC: 318330

While it's true your brother does have a better GPU, that should not affect the time to complete your CPU work units. RAM speed does affect the times somewhat, but not to the extent that you have stated.

I kind of figured that you and he would be running full bore on the CPUs with work units.

The GPU tasks each require a small amount of CPU time. They get this from a CPU core that is currently crunching a CPU work unit. Usually this is not a lot, so a GPU task can sneak a few CPU cycles for its own use without disrupting the CPU work unit too much.

So that leaves the CPUs themselves as the possible reason why the CPU work units are slower on your machine compared to your brother's.

You are probably thinking: I have an 8-core and he has a 4-core, but his times are faster; that doesn't make sense.

Except you have to remember that while he has a 4-core, he also has 4 floating-point units. So each core gets its own floating-point unit. The 8350 has 8 cores but only 4 floating-point units, so 2 cores will have to take turns sharing 1 floating-point unit. As such, the time to complete will be longer, as each work unit is waiting its turn at using that floating-point unit.

Of course, this is a guess (I used to own an 8350 and upgraded to the 9370 at one point, but I don't own any of those chips anymore). The easiest way to test this theory is to limit the number of CPU work units to, say, 3 and see if the times decrease significantly. Why 3? 1 core for the GPU and 3 cores for CPU work units.

Of course, this presents the problem of how to get your machine to crunch only 3 CPU work units. For me, I used an app_config.xml to limit the number of work units by type so that I only run a certain number on the CPU. This might be more than you are willing to try at this point.

You could try limiting the number of cores in the BOINC preferences (I'm sure someone will suggest this), but I don't believe that will actually change your times much. You can try it and see, but if it doesn't do anything, you can ask me and I'll try to explain why I think that method doesn't work.
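As a rough illustration of the app_config.xml approach, here is a minimal Python sketch that writes out such a file. It assumes BOINC's `app_config.xml` layout with `<app>`, `<name>` and `<max_concurrent>` elements; the application name `PLACEHOLDER_CPU_APP` is hypothetical and would have to be replaced with the real short name of the CPU application (it can usually be found in `client_state.xml`). Nobody in this thread tested it, so treat it as a starting point only.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, max_concurrent: int) -> str:
    """Return the text of an app_config.xml that caps how many tasks
    of one application BOINC runs at the same time."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    # ET.indent only exists on Python 3.9+; the XML is valid without it.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # PLACEHOLDER_CPU_APP is a made-up name, not a real Einstein@Home app.
    print(build_app_config("PLACEHOLDER_CPU_APP", 3))
```

The resulting file would go into the Einstein@Home project folder inside the BOINC data directory, and the client picks it up after re-reading its config files or restarting.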

Sorry to throw so much at you. I hope some of this gives you some ideas to work with.

Zalster

Sebastian M. Bobrecki
Joined: 20 Feb 05
Posts: 59
Credit: 659547936
RAC: 311791

Zalster wrote:

...

Except you have to remember that while he has a 4-core, he also has 4 floating-point units. So each core gets its own floating-point unit. The 8350 has 8 cores but only 4 floating-point units, so 2 cores will have to take turns sharing 1 floating-point unit. As such, the time to complete will be longer, as each work unit is waiting its turn at using that floating-point unit.

...

This isn't true. The FX-4320 has 2 modules with 2 cores (threads) each, and the FX-8350 has 4 modules with 2 cores (threads) each. Each module also has two 128-bit floating-point units, one for each core (thread). As they are 128 bits wide, they have to be combined when a 256-bit AVX instruction needs to be performed, which is not recommended, as it's very slow in that mode. But that is not the case here, because only tasks using SSE run on these hosts.

@Konz Andy

I bet the behavior you observe is caused by the available memory bandwidth, but not by the small difference between 1600 and 1866. It's because in your case the memory bandwidth is shared between twice as many tasks (excluding the GPU tasks). Try setting the max percentage of CPUs in BOINC Manager to 50% for some time. You should see times similar to your brother's or even better, as the OS task scheduler should assign each task to a separate module so they don't have to share the L2 cache (each module has 2 MB of L2 cache shared between its two cores (threads)). But of course your overall throughput will be lower than it is now, because you will trade 50% of your tasks for a ~25% shorter computing time per task.
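To put numbers on that trade-off, here is a small Python sketch. It uses the 50,000 s figure from the thread and takes the "~25% shorter" estimate at face value; the function name and the 25% figure as an exact value are assumptions for illustration, not measurements.

```python
SECONDS_PER_DAY = 86_400


def tasks_per_day(concurrent_tasks: int, seconds_per_task: float) -> float:
    """Finished CPU tasks per day when `concurrent_tasks` run side by side
    and each one needs `seconds_per_task` of wall-clock time."""
    return concurrent_tasks * SECONDS_PER_DAY / seconds_per_task


# All 8 cores busy, 50,000 s per task (figure from the thread).
all_cores = tasks_per_day(8, 50_000)

# 50% of the cores, each task assumed to finish 25% sooner.
half_cores = tasks_per_day(4, 50_000 * 0.75)

print(f"8 tasks at once: {all_cores:.1f} tasks/day")
print(f"4 tasks at once: {half_cores:.1f} tasks/day")
print(f"relative throughput: {half_cores / all_cores:.2f}")
```

Under these assumptions the machine drops from roughly 13.8 to roughly 9.2 tasks per day, about two thirds of the original throughput, even though each individual task finishes sooner.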

Konz Andy
Joined: 22 Aug 10
Posts: 7
Credit: 12449779
RAC: 0

Hello

 

Thanks for your help. I will try to reduce the CPU usage to 50%.

 

But I have one other question:

 

I want to deactivate 1 core for CPU WUs and reserve it for the GPU tasks.

When I reduce the multi-CPU usage to 87%, the load on all of the cores drops by a small amount, but none of them is really idle.

How can I deactivate 1 core completely and tell BOINC to use that idle core only for the GPU WUs?

 

greetings, Andy

archae86
Joined: 6 Dec 05
Posts: 2323
Credit: 1454638175
RAC: 1421401

Konz Andy wrote:
How can I deactivate 1 core completely and tell BOINC to use that idle core only for the GPU WUs?

BOINC does not do CPU affinity, which is what you are talking about.  One application which does give you an interface to tell Windows to do program-specific CPU affinity is Process Lasso.

However, I don't think you will find the result of the specific configuration you mention to be favorable.

Zalster
Joined: 26 Nov 13
Posts: 2852
Credit: 2878991581
RAC: 318330

As you can see, Andy, using the above methods will not allow you to free up any CPU cores only for the GPU.

But can it be done? Yes.

But this isn't for everyone, nor is it something that is commonly recommended here on Einstein.

However, you can restrict how many work units are run on the CPU and GPU by use of the app_config.xml.

I use it when I want to run only a few CPU tasks on my CPU and leave the rest free for the GPUs.

If you wish to proceed, I can show you what an app_config.xml would look like and where you would install it in the Einstein@home folder.

But I would need to test it before posting it for you, as I don't currently run Gamma Rays. I was running the Gravity Wave search and would have to rewrite a portion of the app_config.xml to make sure it works for you.

Zalster

 

fubared
Joined: 1 Jul 06
Posts: 1
Credit: 1068660
RAC: 0

You haven't mentioned cooling. Could you be throttling? I had an A10-5800, and that ran hot at idle. If you don't have a large tower cooler or aren't running on water, 100% usage will get toasty.

You have to expect that doubling the workload will result in slower processing per WU due to resource contention. But it took you 50,000 sec to do 8 WUs while it took him 40,000 sec to do 4. It will take him 80,000 sec to do 8, so overall, your computer is still faster.
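The same comparison can be written out as a short Python sketch. It assumes, as the post does, that every WU on a machine takes the same time and that WUs run in full batches; the function names are made up for illustration.

```python
import math


def seconds_for_batch(total_wus: int, concurrent: int, seconds_per_wu: int) -> int:
    """Wall-clock seconds to finish `total_wus` when `concurrent` WUs run
    side by side and each takes `seconds_per_wu`."""
    rounds = math.ceil(total_wus / concurrent)
    return rounds * seconds_per_wu


def effective_seconds_per_wu(concurrent: int, seconds_per_wu: int) -> float:
    """Average wall-clock seconds per finished WU across the whole machine."""
    return seconds_per_wu / concurrent


# Figures from the thread: FX 8350 runs 8 at once, FX 4320 runs 4 at once.
print(seconds_for_batch(8, 8, 50_000), "s for 8 WUs on the FX 8350")
print(seconds_for_batch(8, 4, 40_000), "s for 8 WUs on the FX 4320")
print(effective_seconds_per_wu(8, 50_000), "s per WU on the FX 8350")
print(effective_seconds_per_wu(4, 40_000), "s per WU on the FX 4320")
```

With the thread's numbers, the FX 8350 averages 6,250 s per finished WU against 10,000 s on the FX 4320.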

mikey
Joined: 22 Jan 05
Posts: 4010
Credit: 369510802
RAC: 24322

Konz Andy wrote:

 

How can I deactivate 1 core completely and tell BOINC to use that idle core only for the GPU WUs?

 

greetings, Andy

Go into the BOINC Manager on your PC and, under Options, Computing Preferences, on the Computing tab, change the setting to "use at most 99% of the available CPUs". That way one CPU core is free for whatever else the PC wants to use, and BOINC will not use it. In your case it will be used for the GPU if it's needed; if not, it will sit idle.
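A quick way to see why 99% frees exactly one core on an 8-core machine is the Python sketch below. It assumes the client multiplies the core count by the percentage and rounds down to a whole number of CPUs, with a minimum of one; that rounding rule is an assumption about BOINC's behavior, not something confirmed in this thread.

```python
import math


def usable_cpus(total_cpus: int, max_percent: float) -> int:
    """Whole number of CPUs BOINC would use, assuming it rounds the
    product down and never goes below one."""
    return max(1, math.floor(total_cpus * max_percent / 100))


for percent in (100, 99, 87.5, 87, 50):
    print(f"{percent}% of 8 cores -> {usable_cpus(8, percent)} CPU tasks")
```

Under that assumption, 99% on 8 cores allows 7 CPU tasks, while the 87% mentioned earlier in the thread would allow only 6, since 8 × 0.87 is just under 7. The operating system still decides which physical core each task runs on, so no single core will look permanently idle.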

Konz Andy
Joined: 22 Aug 10
Posts: 7
Credit: 12449779
RAC: 0

Hello

 

The problem is solved

 

I swapped the 1600 MHz DDR3 RAM for 1866 MHz DDR3 RAM with lower latency.

 

Now the CPU time has gone from 50,000 s to 40,000 s.

You can check this in my task list.

 

Thanks for your help

Greetings, Andy KONZ
