Hello again,
My GPU usage sits at around 20-30% when running a single GPU WU. I lowered the GPU utilization factor to 0.33, hoping to run 3 at a time and increase the usage, and updated the project through BOINC, yet I still only run 1 at a time.
Is this setting only for multi-GPU setups, or can it be used to increase simultaneous WU crunching on a single GPU?
Thanks.
Lowering GPU Utilization factor won't allow more WUs to be done
Sounds like you have the right setting, but it doesn't take effect until BOINC downloads a new task.
OpenCL apps usually like a free CPU core as well. You can free one up by reducing the % of CPUs available to BOINC, or by using an app_config file to override the default value used by the various Einstein apps.
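As a rough sketch of the app_config route: a file like the one below, placed in the Einstein@Home project directory, tells BOINC to budget a full CPU core for each GPU task. The app name shown is an assumption on my part; take the exact name from client_state.xml on your own machine.

```xml
<!-- app_config.xml: place in the Einstein@Home project directory.
     The app name below is an assumption; check client_state.xml
     on your own machine for the exact <name> of the GPU app. -->
<app_config>
  <app>
    <name>einsteinbinary_BRP6</name>
    <gpu_versions>
      <!-- one task per GPU -->
      <gpu_usage>1.0</gpu_usage>
      <!-- budget a full core for the supporting CPU task -->
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

BOINC will subtract that budgeted core from the number of CPU tasks it starts, which is the "free core" effect people are after.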
Your most recent GPU task was downloaded at 25 Dec 2015, 22:57:53 UTC, probably before you made the adjustment request on the webpage. In the special case of this one parameter, which adjusts the number of simultaneous jobs run on a single GPU, the request does not take effect on your machine until a new task is downloaded after the request is posted.
Once a new task is downloaded, the revised simultaneous work request takes effect immediately, not waiting until that specific task begins execution.
To address your other question: no, this parameter has nothing to do with handling multiple GPUs.
I thought it changed once I pushed Update on the project, but I'll wait until a new task downloads.
The GPU usage you report, of 20 to 30%, is extremely low for a GTX 660 running Parkes PMPS. It strongly suggests that something unusual on your system is causing an unusually long delay between the time the GPU requests service from the supporting CPU application and the time that request gets acted on.
Possibly you are giving the CPU on that system other work to do which does not share well with the Einstein Parkes CPU support task.
Usually, configuring to run two tasks at a time on a GPU gives considerable benefit here, but in the special case of your unusually low utilization I am less confident of the benefit. Running three often gives a modest gain over two, but usually much smaller than the jump from one to two.
Your account shows that you run work from several other projects. Sometimes project mixes can induce unusual performance impacts. If you want to isolate that issue, one method would be to temporarily suspend work from all but one project, or, more drastically, all but one task, just to get an idea of how the task behaves on your system without those interactions.
I switched over. Currently I'm doing only Mindmodeling, DistributedDataMining, Burp, and Einstein. Mindmodeling and Burp don't have tasks often, so it's mostly DistributedDataMining. I have Einstein as my sole GPU project.
Oddly enough, I suspended all but Einstein and it spiked to 65%. Even after I resumed the other CPU projects, it stayed around that load. I assume the 0.2 CPU it utilizes isn't enough.
The 0.2 is more a planning number than anything resembling a real-time allocation. The OS scheduler on your system decides which task gets serviced when more than one is requesting service per available core. People here talk about "reserving a core" for servicing GPU tasks, but the methods they use don't actually do that, in general. In the absence of extensive affinity manipulation, what they usually mean is instructing BOINC not to start so many tasks that every core is kept busy purely with BOINC work.
Still, the simple measure of cutting down the number of BOINC tasks using a setting at Computing Preferences|Usage Limits|Use at most nn% of the CPUs is often quite beneficial for GPU users here. That setting has no power over non-BOINC claimants to CPU usage, but for many of us BOINC tasks are most of the story. Our GPU is so many times more productive than our CPUs that sacrificing a little CPU output to get more GPU output seems right for many of us.
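As I understand it, BOINC effectively rounds the allowed core count down when applying that percentage; a quick sketch of the arithmetic (the 8-core machine is just an example, not your hardware):

```python
import math

def boinc_cores(ncpus: int, pct: float) -> int:
    """Number of CPU tasks BOINC will run for a given
    'Use at most pct% of the CPUs' setting (floor, minimum 1)."""
    return max(1, math.floor(ncpus * pct / 100.0))

# Example: on an 8-core machine, 87.5% leaves one core free
# to service GPU tasks, while 100% keeps every core loaded.
print(boinc_cores(8, 87.5))   # 7
print(boinc_cores(8, 100.0))  # 8
```

So on an 8-core box, anything at or below 87.5% gets you at least one core that BOINC will not fill with CPU work.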
65% utilization is much more in line with what I'd expect a healthy system to get for single-task Parkes GPU utilization than the 20-30% you reported before.
The thing is, I didn't cut down on anything. I just suspended all but Einstein, resumed a minute later, and BOINC seemed to magically kick itself up to 60% GPU load. Even after restarting both the client and the manager, it stayed there.
Maybe it's just this WU and it will drop later, but so far it's churning at 60-65% GPU load.
I'll check back soon to see if it stuck.
Edit: New WU started. It stays at 55-65% load, dropping to 35% for a second every minute. No idea why it was so low at first, but this effectively shaved almost 2 hours off the length of my WUs.
Running 2 at a time changed it; it's 45-60% now. I freed up one core and it didn't change much. I can't cease my CPU projects just to get the GPU to 80%. I'll try three at a time and go from there.
What are you using to measure this load? Don't get too hung up on this number, as long as it's not ridiculously low. Elapsed time to crunch tasks is a better metric to follow. First establish a baseline performance when doing tasks singly (1x). You should expect fairly constant elapsed times, though there can be some variation. I see just two tasks at around 8-9 ksec, so I presume they were perhaps done at 1x?
You should then run 2x until you complete enough tasks to determine a proper elapsed time for that mode. The time should be less than double what you get at 1x, perhaps quite a bit less. When you have that number, try 3x; you'll perhaps see a much smaller improvement. If you don't do things systematically, you can easily fool yourself. Individual tasks vary, but the average becomes reasonably stable once you accumulate a decent sample size: at least 10-20 tasks under stable crunching conditions.
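To make the comparison concrete: the metric that matters is tasks per hour, i.e. concurrency divided by mean elapsed time. A small sketch, where the elapsed times are made-up illustration numbers, not measurements from any real host:

```python
def tasks_per_hour(elapsed_secs, concurrency):
    """Throughput at a given concurrency level: with N tasks
    running at once, N tasks complete per mean elapsed time."""
    mean = sum(elapsed_secs) / len(elapsed_secs)
    return concurrency * 3600.0 / mean

# Hypothetical samples (seconds); gather 10-20 real ones per mode.
t1x = [8500, 8700, 8600]      # running tasks singly
t2x = [13000, 13400, 13200]   # running two at a time

print(round(tasks_per_hour(t1x, 1), 2))
print(round(tasks_per_hour(t2x, 2), 2))
# 2x wins whenever its tasks/hour exceeds the 1x figure,
# even though each individual task takes longer.
```

In this made-up example, 2x tasks take over 50% longer each, yet throughput still improves because two finish per cycle.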
To enable quick changes in task concurrency, you may find it better to use the application configuration mechanism (app_config.xml) which is documented towards the bottom of this page. This file overrides any GPU utilization factor setting and changes are applied immediately by using 'reread config files' function in BOINC Manager advanced view. If you'd like to use this file, a number of people could provide you with a basic example.
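For the concurrency change specifically, a hedged example of the relevant fragment: a gpu_usage of 0.5 asks BOINC to run two tasks per GPU (0.33 would allow three). As before, the app name is my assumption; verify the exact name in client_state.xml.

```xml
<!-- app_config.xml fragment; gpu_usage 0.5 => 2 tasks per GPU.
     The app name is an assumption; verify it in client_state.xml. -->
<app_config>
  <app>
    <name>einsteinbinary_BRP6</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.2</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

After editing the file, use 'Options > Read config files' in BOINC Manager and the new concurrency applies without waiting for a download.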
My experience with nvidia GPUs is that you don't need a 'free core' for sensible rates of GPU task concurrency; I use 'free cores' with AMD GPUs but not with nvidia. I wouldn't expect any significant improvement at 2x, though you may get a slight benefit at 3x, and I've seen people report better GPU performance on high-end nvidia GPUs when they do keep a core free. I wouldn't be in a hurry to run 3x until I'd really determined the true performance at 2x.
You'd be better off establishing the true 2x performance first. You'd also be better off deciding to use the cuda55 beta test app rather than the older (and slower) cuda32 one. To do this you need to select the preference to allow beta test apps. You could expect about a 20% performance improvement over the older app. This cuda55 app seems quite stable so there's no real reason not to be using it.
Cheers,
Gary.