AMD R9 280X OpenCL application issues
Hello folks,
I recently bought an MSI R9 280X to replace my MSI GTX 660 Ti, and the resulting switch has caused Einstein@Home performance to plummet.
When a GPU application is running, the card's clock speed mostly stays at the low-power 300/150 MHz (core/memory) state, occasionally rises to 500/1500 MHz, and never reaches the full 1050/1500 MHz. All this time, GPU load is between 0 and 50%. Note that games fully tax the GPU (in terms of clock speed and load), and so does Folding@Home's OpenCL benchmark (single- and double-precision).
Here's a picture of the clock speeds from my keyboard LCD system monitor (the baseline is 300 MHz, the peaks are 500 MHz, and the max extent of the graph is 1100 MHz):
Any ideas? Before switching, I completed all my tasks on the old GPU, uninstalled BOINC, deleted BOINC's program data folder, and then reinstalled BOINC and reattached to my projects, just to make sure no CUDA/OpenCL confusion could arise.
Thanks; I just want to get back up to speed.
Edit: This means that tasks are taking much much longer to finish, and likely won't complete before the deadline (I've used the 'No new tasks' button though, so hopefully there'll be no more waste). My current tasks (2013/11/09) have a 2013/11/13 deadline, and each one will take an estimated 20 hours to finish. The tasks have gone from taking ~2,200 seconds on the NVIDIA GPU to ~45,000 seconds on the AMD GPU.
Are you running CPU-tasks on all of your CPU cores?
With OpenCL, the general advice is to free at least one CPU core to support the GPU; otherwise the performance of OpenCL apps drops significantly.
So go to your computing prefs and change "On multiprocessors, use at most 100% of the processors" to 75%, or if you're using local prefs, change the corresponding setting in BOINC Manager.
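As a rough illustration of what that percentage means in whole cores, here is a minimal sketch. The function name is hypothetical, and the rounding rule (truncate to whole cores, never fewer than one) is an assumption about how the limit is applied, not something confirmed in this thread.

```python
def usable_cores(total_cores: int, percent: float) -> int:
    """Cores available to CPU tasks under a 'use at most N% of the processors' limit.

    Assumption: the limit is truncated to whole cores, with a floor of one core.
    """
    return max(1, int(total_cores * percent / 100))


if __name__ == "__main__":
    total = 4  # a quad-core CPU, as an example
    for pct in (100, 75, 50):
        cpu_task_cores = usable_cores(total, pct)
        print(f"{pct}% of {total} cores: {cpu_task_cores} for CPU tasks, "
              f"{total - cpu_task_cores} left free for feeding the GPU")
```

On a quad-core machine, 75% leaves three cores for CPU tasks and one free core to feed the GPU.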
Smells indeed like your GPU is only running one task and is hitting the default "0.5 CPU per GPU task" issue (the GPU task gets just about no CPU time, because BOINC erroneously rounds that 0.5 down to 0 when all other CPU cores are loaded, instead of guaranteeing 50% of a core's time as it should).
IMHO it would be best to set your BRP utilization factor in the Einstein@Home preferences to 0.5 (2 parallel GPU tasks) or even 0.25 (4 parallel GPU tasks).
That's basically the easiest and most efficient fix: it should solve the issue and skyrocket your GPU output.
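The relationship between the utilization factor and the number of parallel tasks is just a reciprocal. A small sketch, with a hypothetical helper name and the assumption that any fractional remainder is dropped:

```python
def tasks_per_gpu(utilization_factor: float) -> int:
    """Number of GPU tasks that run in parallel for a given utilization factor.

    Each task claims `utilization_factor` of a GPU, so 1 / factor tasks fit.
    Assumption: a fractional remainder is dropped (0.33 gives 3 tasks).
    """
    if not 0 < utilization_factor <= 1:
        raise ValueError("utilization factor must be in (0, 1]")
    # The small epsilon guards against floating-point results such as 3.9999999
    return int(1 / utilization_factor + 1e-9)


if __name__ == "__main__":
    for factor in (1.0, 0.5, 0.25):
        print(f"factor {factor}: {tasks_per_gpu(factor)} task(s) per GPU")
```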
Perfect! This worked. Now my CPU usage from the CPU tasks occupies cores 1-3 and the last CPU core is at 50-60% feeding the GPU (I presume). GPU frequency is rock solid at 1050 MHz and GPU load hovers between 50-70% most of the time, dropping to 20% for a few seconds at a time.
I freed up a CPU core as Holmis suggested, but GPU load is still not 100%. I suppose running two (or more) tasks on the GPU will fix that?
Is there any downside to changing the BRP Utilization Factor? That DANGEROUS! message is ominous.
The warning is a CYA statement. I am running 3 GPU jobs on a GTX 650 Ti. I too was a bit concerned at first, so I tried two GPU tasks and monitored the GPU temperature to get a feel for how it would be affected. Later I upped it to 3 tasks, which seems to be ideal for my setup.
Right, but why would it need a Cover Your Ass statement unless it doesn't work as it should? I'm just wondering why it's there in the first place.
Glad it's working better now!
The "dangerous" warning is there to make you think one more time before increasing the load on your system. If you opt to run more tasks in parallel on the GPU, its power requirement will increase, and that could overload the PSU if it's not up to par with the increased power usage. It could also overheat the GPU or the whole system if the case cooling is too weak.
If you opt to change the utilization factor, just keep an eye on the temps and back off if they're getting too high. You could also set the processor usage back to full, as BOINC adds together the CPU usage of the GPU tasks and reserves cores by itself once the sum reaches 1 or more. So if you run 2 tasks (each 0.5 CPU + 0.5 ATI), it will add them up and reserve 1 core, and if you do as FalconFly suggested and try 4 tasks (each 0.5 CPU + 0.25 ATI), it will reserve 2 cores.
Just remember that the setting will not be used until new GPU work is downloaded.
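The core-reservation arithmetic described above can be sketched as follows. This is a simplified model of the behaviour reported in this thread, not BOINC's actual scheduler code; the function name is hypothetical, and the rule that only whole cores are reserved is an assumption drawn from the examples given here.

```python
import math


def reserved_cpu_cores(gpu_factor: float, cpu_per_task: float = 0.5) -> tuple[int, int]:
    """Return (parallel GPU tasks, whole CPU cores reserved for them).

    Simplified model: the CPU fractions of all running GPU tasks are summed,
    and only whole cores are set aside (assumption based on this thread).
    """
    tasks = int(1 / gpu_factor + 1e-9)              # e.g. 0.5 gives 2 tasks
    cores = math.floor(tasks * cpu_per_task + 1e-9)  # e.g. 2 x 0.5 gives 1 core
    return tasks, cores


if __name__ == "__main__":
    for factor in (1.0, 0.5, 0.25):
        tasks, cores = reserved_cpu_cores(factor)
        print(f"GPU factor {factor}: {tasks} task(s), {cores} CPU core(s) reserved")
```

With a single task, the summed CPU share is only 0.5, so no core is reserved, which matches the starved-GPU symptom reported at the top of the thread.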
Thanks! Cooling and power are not a problem. I'll try a factor of 0.5 when I run out of the current work units.
Remember, when you do that (BRP Utilization factor), restricting BOINC CPU usage is no longer needed (2 x 0.5 CPU will cause BOINC to dedicate one CPU core to the GPU tasks). The setting only takes effect after a new GPU workunit created with this setting is downloaded and processed, so keep that in mind and be patient while waiting for it to kick in.
I'm under the impression that this is what most users run (minimum 2 tasks per GPU).
Intel systems equipped with PCIe 3.0 slots can afford to run 4 (midrange GPU) or even 6 to 8 (high-end GPU) tasks in parallel, provided the video card runs in a full 16x PCIe 3.0 slot.
This will further improve performance and GPU load to its realistic maximum (true 100% is impossible to achieve due to overhead and other technical reasons).
AMD platforms hardly gain performance beyond 2 parallel tasks due to PCIe 2.0 limitations; 4 tasks is IMHO the absolute maximum that makes sense there for high-end GPUs.
Thanks again!
Yes, I've set the BRP factor to 0.5, but I've left the CPU cores setting to 'use three cores at most' for now, until I get new tasks that use half a GPU.
I do have an Intel Z87 motherboard with a 16x PCI-E 3.0 slot so that shouldn't be a problem for the full 288 GB/s bandwidth that this R9 280X memory can handle.
This is going slightly off topic, but how do I determine how many concurrent tasks is most efficient for my GPU (2/4/6/8)? Is it just trial and error? This AMD R9 280X is (exactly, except for clock speeds) equivalent to a Radeon HD 7970.
This is entirely based on the experiences of other users. Most reported that for a card this powerful on PCIe 3.0, up to 8 GPU tasks still produce a slight gain if the CPU is also powerful enough.
This is only valid IMHO if the system is a purely dedicated cruncher not used for any other task.
If it's a typical System used for other tasks as well, I'd start off with 2 GPU tasks to get a good number of results and see how it behaves (i.e. GPU temperatures and generated fan noise) - then step it up to 4 tasks and compare.
Also, I imagine having 4 or even more tasks loaded on the GPU will increasingly interfere with normal everyday desktop operations (slowdowns, visual artifacts, HD videos or Flash animations not running smoothly, etc.). It will definitely increase temperatures, which need to be checked at least once to ensure they're still in a safe range. Again, a good set of fans blowing fresh air onto the entire video card is almost mandatory, not so much for the GPU itself but for the many capacitors, voltage regulators, etc. on the video card's PCB. Those tend to suffer the most from excess or prolonged heat, which decreases their lifespan.
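One way to make the "step it up and compare" approach concrete is to compare throughput rather than per-task runtime, since each task gets slower as more run in parallel. A minimal sketch follows; the function names are hypothetical and the runtimes in the example are made-up placeholders, not measurements from this thread.

```python
SECONDS_PER_DAY = 86_400


def tasks_per_day(parallel_tasks: int, seconds_per_task: float) -> float:
    """Throughput when `parallel_tasks` run at once and each takes `seconds_per_task`."""
    return parallel_tasks * SECONDS_PER_DAY / seconds_per_task


def best_task_count(measured: dict[int, float]) -> int:
    """Pick the parallel-task count with the highest throughput.

    `measured` maps number of parallel tasks to average runtime per task in seconds.
    """
    return max(measured, key=lambda n: tasks_per_day(n, measured[n]))


if __name__ == "__main__":
    # Placeholder runtimes for illustration only; substitute your own averages.
    measured = {1: 2500.0, 2: 4200.0, 4: 8000.0}
    for n, secs in measured.items():
        print(f"{n} parallel: {tasks_per_day(n, secs):.1f} tasks/day")
    print("best setting:", best_task_count(measured))
```

Average over a good number of completed results at each setting before comparing, and weigh any small throughput gain against the extra heat, noise, and desktop sluggishness mentioned above.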