GPU Problem

TJ
Joined: 11 Feb 05
Posts: 178
Credit: 21,041,858
RAC: 0

RE: I can run 0 or 1 cpu

Quote:

I can run 0 or 1 CPU task and have a level GPU load. Running a second CPU task gets me a smaller downgrade in speed.

Even with 2 tasks and the GPU load, I'm under 75% processor load, so experimentally it seems like the bottleneck, but I can't work out the reason for it.


I have seen that too on my quad-core. One CPU WU reduces the GPU load by 1 or 2%; two CPU tasks reduce it further, by up to 10% depending on the tasks. Three reduce it even more. Several crunchers set CPU use to 99%. If I do that, the GPU load is 0, with occasional bursts of 25-40% for a few seconds, then 0 again, and so on.
Keep in mind that when BOINC says 0.49, 0.75, or 0.xx CPU use, it will always be one core, and with HT that means 2 logical CPUs (as those are one core).

Greetings from
TJ

Pavel Hanak
Joined: 27 Jul 06
Posts: 9
Credit: 58,214,482
RAC: 265,544

Hi all, I just want to report

Hi all, I just want to report that I seem to have a somewhat extreme case of the problem discussed here with the "einsteinbinary_BRP5_1.34_windows_x86_64__opencl-ati" app. I have a PC with a 6-core/12-thread CPU (Intel i7-970) and a Radeon HD7970 GPU, running Windows 7 64-bit and BOINC Manager 7.0.64. And like others have reported here, the more CPU cores are crunching, the slower the GPU gets. I played with "use at most xxx % of processors" in BOINC Manager's local computing preferences, and here is a short table of results:

1 CPU core crunching - 82% GPU utilization
2 CPU cores crunching - 73% GPU utilization
3 CPU cores crunching - 67% GPU utilization
4 CPU cores crunching - 66% GPU utilization
5 CPU cores crunching - 65% GPU utilization
6 CPU cores crunching - 64% GPU utilization
7 CPU cores crunching - 60% GPU utilization
8 CPU cores crunching - 54% GPU utilization
9 CPU cores crunching - 50% GPU utilization
10 CPU cores crunching - 35% GPU utilization
11 CPU cores crunching - isolated spikes of GPU activity (similar to the picture in Chris's post number 125395), average utilization around 2%
12 CPU cores crunching - essentially 0% GPU utilization
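
To see where the utilization falls off fastest, the figures in the table above can be put into a short script. This is only a sketch for reading the numbers: the values for 11 and 12 cores are the approximate readings given in the post ("around 2%" and "essentially 0%"), and the variable names are made up for illustration.

```python
# GPU utilization (%) by number of CPU cores crunching, copied from the table.
# The entries for 11 and 12 cores are approximate ("around 2%", "essentially 0%").
gpu_util = {
    1: 82, 2: 73, 3: 67, 4: 66, 5: 65, 6: 64,
    7: 60, 8: 54, 9: 50, 10: 35, 11: 2, 12: 0,
}

# Drop in GPU utilization caused by adding the n-th crunching core.
drops = {n: gpu_util[n - 1] - gpu_util[n] for n in range(2, 13)}

# The core count at which the single largest drop occurs.
steepest = max(drops, key=drops.get)

for n, drop in drops.items():
    print(f"core {n:2d}: -{drop} percentage points")
print(f"largest single drop when adding core {steepest}")
```

With these numbers the curve is nearly flat from 3 to 6 cores, declines again from the 7th core on, and collapses when the 10th and 11th are added.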

I also tested Milkyway and SETI apps for ATI GPU, but as far as I can tell, those work fine. Well, apart from the CPU affinity problem I described in this thread:

http://milkyway.cs.rpi.edu/milkyway/forum_thread.php?id=3293

In that thread, one Milkyway programmer mentioned that BOINC currently (?) has problems with the CPU scheduler, which might be the reason why the Einstein GPU app acts so weird. Unfortunately, the CPU affinity fix I found for the Milkyway GPU app has no effect on the Einstein GPU app.
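
For readers unfamiliar with CPU affinity: both Windows and Linux describe the set of logical CPUs a process may run on as a bitmask, one bit per logical CPU. The sketch below only builds such a mask; it does not reproduce the fix from the Milkyway thread, whose details are not given here. The function name is hypothetical, and which logical CPU numbers share a physical core under Hyper-Threading depends on the OS and is an assumption in the example.

```python
def affinity_mask(logical_cpus):
    """Build an affinity bitmask with one bit set per allowed logical CPU.

    Hypothetical helper for illustration only; it computes the mask but
    does not apply it to any process.
    """
    mask = 0
    for cpu in logical_cpus:
        if cpu < 0:
            raise ValueError("logical CPU index must be non-negative")
        mask |= 1 << cpu
    return mask


# On a 12-thread CPU, allow all logical CPUs:
print(hex(affinity_mask(range(12))))   # 0xfff

# Restrict a process to logical CPUs 10 and 11 (assumed here, not verified,
# to be the two hardware threads of one physical core):
print(hex(affinity_mask([10, 11])))    # 0xc00
```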

Chris
Joined: 9 Apr 12
Posts: 61
Credit: 45,056,670
RAC: 0

Now, rather suddenly I've

Now, rather suddenly, I've gone from a RAC of 13,000 to 18,000. I do have one core of CPU work doing E@H, but the GPU spikes have leveled off.

I'm going to give it a third core of CPU work to see what happens.

Logforme
Joined: 13 Aug 10
Posts: 332
Credit: 1,714,373,961
RAC: 0

RE: I also tested Milkyway

Quote:
I also tested Milkyway and SETI apps for ATI GPU, but as far as I can tell, those work fine. Well, apart from the CPU affinity problem I described in this thread:


My understanding is that the E@H GPU apps are very demanding on the CPU and the PCIe bandwidth. Much of the work can't be done on the GPU but has to be done on the CPU. This means that a lot of data needs to be moved between the CPU and the GPU very often.
My guess is that MW@H has calculations that can be done almost entirely on the GPU, so other apps running on the CPU don't affect the GPU ones as much.

Pavel Hanak
Joined: 27 Jul 06
Posts: 9
Credit: 58,214,482
RAC: 265,544

RE: My understanding is

Quote:

My understanding is that the E@H GPU apps are very demanding on the CPU and the PCIe bandwidth. Much of the work can't be done on the GPU but has to be done on the CPU. This means that a lot of data needs to be moved between the CPU and the GPU very often.

My guess is that MW@H has calculations that can be done almost entirely on the GPU, so other apps running on the CPU don't affect the GPU ones as much.

I find such an explanation highly unlikely. If it were true, I would barely be able to move the cursor on the desktop when all CPU cores were crunching, and things like playing videos would be completely impossible. Yet that doesn't happen, not even if I assign the Einstein app realtime CPU priority. Only the Einstein GPU app almost stops when all CPU cores crunch. So there must be a problem somewhere in BOINC or the Einstein app.

Besides, the HD7970 is in a PCIe x16 slot, which has 8 GB/s bandwidth in each direction. I'd hardly call that a bottleneck...
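
As a quick check, the 8 GB/s figure matches what a PCIe 2.0 x16 link provides per direction: 5 GT/s per lane with 8b/10b encoding, times 16 lanes. The sketch below works that out and shows PCIe 3.0 for comparison. It assumes the slot in question runs at PCIe 2.0 speed, which the post does not state, and the function name is made up for illustration.

```python
def pcie_bandwidth_gb_s(gt_per_s, lanes, payload_bits, total_bits):
    """Peak one-direction bandwidth in GB/s for a PCIe link.

    gt_per_s:      raw transfer rate per lane in gigatransfers/s (1 bit each)
    payload_bits / total_bits: line-encoding efficiency (8b/10b, 128b/130b)
    """
    usable_gbit_per_lane = gt_per_s * payload_bits / total_bits
    return usable_gbit_per_lane / 8 * lanes  # 8 bits per byte


# PCIe 2.0: 5 GT/s per lane, 8b/10b encoding
gen2_x16 = pcie_bandwidth_gb_s(5.0, 16, 8, 10)
# PCIe 3.0: 8 GT/s per lane, 128b/130b encoding
gen3_x16 = pcie_bandwidth_gb_s(8.0, 16, 128, 130)

print(f"PCIe 2.0 x16: {gen2_x16:.2f} GB/s per direction")
print(f"PCIe 3.0 x16: {gen3_x16:.2f} GB/s per direction")
```

This is peak throughput only; it says nothing about how often small transfers occur or how long each one waits for the CPU.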

TJ
Joined: 11 Feb 05
Posts: 178
Credit: 21,041,858
RAC: 0

RE: RE: My understanding

Quote:
Quote:

My understanding is that the E@H GPU apps are very demanding on the CPU and the PCIe bandwidth. Much of the work can't be done on the GPU but has to be done on the CPU. This means that a lot of data needs to be moved between the CPU and the GPU very often.

My guess is that MW@H has calculations that can be done almost entirely on the GPU, so other apps running on the CPU don't affect the GPU ones as much.

I find such an explanation highly unlikely. If it were true, I would barely be able to move the cursor on the desktop when all CPU cores were crunching, and things like playing videos would be completely impossible. Yet that doesn't happen, not even if I assign the Einstein app realtime CPU priority. Only the Einstein GPU app almost stops when all CPU cores crunch. So there must be a problem somewhere in BOINC or the Einstein app.

Besides, the HD7970 is in a PCIe x16 slot, which has 8 GB/s bandwidth in each direction. I'd hardly call that a bottleneck...

The explanation from Logforme is quite good. Some projects, and even different WUs in the same project, have different GPU and MCU loads, while the CPU works like a feeder. And even when an app says 0.695 CPU use, 0.26 CPU use, or 0.xxx CPU use, it is always one core. And when all cores are busy (crunching or whatever), the GPU load will drop to almost zero.

That one can still use a mouse is either because the app does not occupy the CPU completely or because of a setting on the project page where you can set how many cores you want to keep free for graphics.

Greetings from
TJ

Pavel Hanak
Joined: 27 Jul 06
Posts: 9
Credit: 58,214,482
RAC: 265,544

RE: The explanation from

Quote:


The explanation from Logforme is quite good. Some projects, and even different WUs in the same project, have different GPU and MCU loads, while the CPU works like a feeder. And even when an app says 0.695 CPU use, 0.26 CPU use, or 0.xxx CPU use, it is always one core. And when all cores are busy (crunching or whatever), the GPU load will drop to almost zero.

That one can still use a mouse is either because the app does not occupy the CPU completely or because of a setting on the project page where you can set how many cores you want to keep free for graphics.

Except the Milkyway people gave me exactly the same "explanation", but in the end I found out the problem lay in CPU affinities. And the SETI GPU app uses 0.5 CPU too, yet it runs fine. So pardon me if I still think there is some problem with the Einstein GPU app. And I'm not alone here, obviously... ;-)

Chris
Joined: 9 Apr 12
Posts: 61
Credit: 45,056,670
RAC: 0

That was not the way it was

That was not the way it was working for months after I got this GPU. I don't know if it was a system change of some sort or a change here. I was able to run 4 CPU cores (CPDN mostly) and saw nearly no change in GPU run times whether I left a core open or not.

Now, I still can't use 3 CPU cores. Why I need to have 2 free cores to run something that requires 0.2 CPU cores is not exactly clear at the moment. CPU load is 60% (2 CPU tasks, Windows, and the GPU feeding).

TJ
Joined: 11 Feb 05
Posts: 178
Credit: 21,041,858
RAC: 0

Was it exactly the same type

Was it exactly the same type of WU?

You can go to BOINC Manager, set "On multiprocessor systems, use at most ... % of the processors" to 99, and watch in GPU-Z what happens. Then you will see.
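
Why 99% frees a core: the percentage is turned into a whole number of usable CPUs. The sketch below assumes the client truncates the result and never goes below one CPU; that rounding rule is an assumption for illustration, not something confirmed in this thread, and the function name is hypothetical.

```python
def usable_cpus(logical_cpus, max_pct):
    """Number of CPUs BOINC would use for a given "use at most ... %" value.

    Assumption: the product is truncated to a whole number, with a floor
    of one CPU. Check your own client if the exact rule matters.
    """
    return max(1, int(logical_cpus * max_pct / 100))


print(usable_cpus(4, 99))    # quad-core at 99%  -> 3, one core left free
print(usable_cpus(12, 99))   # 12 threads at 99% -> 11
print(usable_cpus(12, 100))  # 12 threads at 100% -> 12
```

Under that assumption, any value from 75 to 99 on a quad-core gives the same result of 3 usable CPUs.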

Greetings from
TJ

Alinator
Joined: 8 May 05
Posts: 927
Credit: 9,352,143
RAC: 0

RE: RE: The explanation

Quote:
Quote:


The explanation from Logforme is quite good. Some projects, and even different WUs in the same project, have different GPU and MCU loads, while the CPU works like a feeder. And even when an app says 0.695 CPU use, 0.26 CPU use, or 0.xxx CPU use, it is always one core. And when all cores are busy (crunching or whatever), the GPU load will drop to almost zero.

That one can still use a mouse is either because the app does not occupy the CPU completely or because of a setting on the project page where you can set how many cores you want to keep free for graphics.

Except the Milkyway people gave me exactly the same "explanation", but in the end I found out the problem lay in CPU affinities. And the SETI GPU app uses 0.5 CPU too, yet it runs fine. So pardon me if I still think there is some problem with the Einstein GPU app. And I'm not alone here, obviously... ;-)

I suggest you turn off Hyperthreading and re-run your experiments.

I can tell you for a fact that locking CPU affinity for GPU tasks on AMD/AMD platforms hurts GPU utilization significantly once you reach the point where the GPU needs close to one core's worth of CPU support, even if you have left one core free per GPU. If you run multiple tasks per GPU and lock affinity, you might as well disable the GPU.

One thing you've neglected to mention through all this is what impact locking affinity has had on the CPU tasks running on the machine at the same time. My money is on at least one of them being brought to a virtual standstill when using all cores, regardless of which way you run the CPU.

The other bone I have to pick is about your conversation with Jake over at MW regarding the BOINC scheduler. The simple fact of the matter is that the BOINC task scheduler only determines which BOINC tasks need to be running at any given time in order to meet all task deadlines and honor resource share preferences. For the CPU, it has NEVER had anything to do with deciding where a given type of task is going to run. For the GPU, the only 'control' it has (other than passing along to the OS a more than likely ill-advised application request to lock affinity) is whether the user has specified which GPU resources to use or, in the default BOINC configuration, to use just the most 'powerful' one if more than one type exists. Some would argue BOINC can just barely do that adequately, let alone try to actually run the physical hardware (now there's a scary thought).

The determination at the hardware level is, and always has been, the exclusive province of the OS. And guess what, the operating system is under no obligation to honor that request if it conflicts with other priorities it has to meet. That being said, the most current releases of all of them will try their best to honor it (including Windows, incredibly enough). The part I find amazing is how well it all works nowadays overall without having to go in and do a lot of hand tuning (really tedious due to shooting at a constantly moving target).

As far as 'low' GPU utilization for EAH apps goes, who are you to say the app has a problem? Have you done a thorough algorithm and code analysis? I know I haven't so I can't say anything about it one way or the other.

Two things I have learned about this project over the years. The first is they NEVER release an application here which hasn't been pretty well thought out and tested. In other words, it performs the calculations they need to have done, the way they have to be done, in order to meet the mission objectives of the project.

The second is they have shown time and time again they have no problem with innovative new solutions to getting more work done in less time, as long as point one is not compromised. Therefore, I have to assume there is a really good reason why I can't drive EAH tasks to 100 percent GPU utilization on any of my hosts, regardless of whether I can do that with other projects' applications or not.

So at best, I think the only thing you can safely say on this matter is that for HT enabled Intels there may be an advantage in some cases to overriding default OS control of thread scheduling. However, the simple fact it didn't do anything for EAH proves it is NOT the 'silver bullet' solution for all cases as you were implying it was at MW, and thus the idea that a project (or BOINC for that matter) should summarily decide what's best for a given host in this regard is a really bad idea. When it comes to BOINC, heaven knows they make enough bad decisions that don't get outside of their own process space, let alone start making ones which could have REALLY negative effects on the host where they reside as a background task guest.
