Only using one or two GPUs most of the time, 4 available

Kashra
Joined: 2 Apr 12
Posts: 4
Credit: 2112544077
RAC: 565156
Topic 203750

I have a problem getting GPU work units to run on all available GPUs.
The cc_config.xml has the use_all_coprocs entry, and it works well for other projects like Seti. But with Einstein there is some odd, erratic behavior with regard to parallel GPU work units.

Sometimes there is only one GPU actively crunching a WU, sometimes 2 or 3 and rarely 4 at the same time.

But the second I start up another project (unpause it), like Seti, suddenly all 4 GPUs are occupied with crunching an Einstein GPU task until the competing project is paused again.

I have no idea why Einstein won't just use all the GPUs it can, but instead shows this odd behavior.

My settings are to use 50% of all CPUs and all 4 GPUs.
The GPUs are R9 295X2 cards.
Does anyone know a solution to this?

 

Here is my Boinc startup log:

 

15.12.2016 09:30:51 |  | Starting BOINC client version 7.6.33 for windows_x86_64
15.12.2016 09:30:51 |  | log flags: file_xfer, sched_ops, task
15.12.2016 09:30:51 |  | Libraries: libcurl/7.47.1 OpenSSL/1.0.2g zlib/1.2.8
15.12.2016 09:30:51 |  | Data directory: F:\BOINC\data
15.12.2016 09:30:51 |  | Running under account xxx
15.12.2016 09:30:54 |  | OpenCL: AMD/ATI GPU 0: AMD Radeon R9 200 Series (driver version 2117.14 (VM), device version OpenCL 2.0 AMD-APP (2117.14), 4096MB, 4096MB available, 5519 GFLOPS peak)
15.12.2016 09:30:54 |  | OpenCL: AMD/ATI GPU 1: AMD Radeon R9 200 Series (driver version 2117.14 (VM), device version OpenCL 2.0 AMD-APP (2117.14), 4096MB, 4096MB available, 5519 GFLOPS peak)
15.12.2016 09:30:54 |  | OpenCL: AMD/ATI GPU 2: AMD Radeon R9 200 Series (driver version 2117.14 (VM), device version OpenCL 2.0 AMD-APP (2117.14), 4096MB, 4096MB available, 5519 GFLOPS peak)
15.12.2016 09:30:54 |  | OpenCL: AMD/ATI GPU 3: AMD Radeon R9 200 Series (driver version 2117.14 (VM), device version OpenCL 2.0 AMD-APP (2117.14), 4096MB, 4096MB available, 5519 GFLOPS peak)
15.12.2016 09:30:54 |  | OpenCL CPU:  (OpenCL driver vendor: Advanced Micro Devices, Inc., driver version , device version )
15.12.2016 09:30:54 | SETI@home | Found app_info.xml; using anonymous platform
15.12.2016 09:30:54 |  | Host name: g36-PC
15.12.2016 09:30:54 |  | Processor: 16 GenuineIntel  [Family 6 Model 45 Stepping 6]
15.12.2016 09:30:54 |  | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes syscall nx lm avx vmx smx tm2 dca pbe
15.12.2016 09:30:54 |  | OS: Microsoft Windows 7: Ultimate x64 Edition, Service Pack 1, (06.01.7601.00)
15.12.2016 09:30:54 |  | Memory: 31.94 GB physical, 72.96 GB virtual
15.12.2016 09:30:54 |  | Disk: 476.94 GB total, 116.20 GB free
15.12.2016 09:30:54 |  | Local time is UTC +1 hours
15.12.2016 09:30:54 |  | Config: use all coprocessors
15.12.2016 09:30:54 | Einstein@Home | URL http://einstein.phys.uwm.edu/; Computer ID 11714846; resource share 200
15.12.2016 09:30:54 | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 7477312; resource share 100
15.12.2016 09:30:54 | Einstein@Home | General prefs: from Einstein@Home (last modified 15-Oct-2016 06:04:17)
15.12.2016 09:30:54 | Einstein@Home | Host location: none
15.12.2016 09:30:54 | Einstein@Home | General prefs: using your defaults
15.12.2016 09:30:54 |  | Reading preferences override file
15.12.2016 09:30:54 |  | Preferences:
15.12.2016 09:30:54 |  | max memory usage when active: 32384.10MB
15.12.2016 09:30:54 |  | max memory usage when idle: 32384.10MB
15.12.2016 09:30:54 |  | max disk usage: 100.00GB
15.12.2016 09:30:54 |  | max CPUs used: 7

chainsaw
Joined: 11 Aug 16
Posts: 2
Credit: 40083499
RAC: 0

I have a similar problem as well. My older NVIDIA drivers crashed, but Windows was able to recover and restart the driver. After I updated the driver, I noticed Einstein was utilizing only one of two GPUs. The new drivers crash occasionally as well, but Windows does not recover and I need to restart the computer. The crash always occurs when suspending the process.

archae86
Joined: 6 Dec 05
Posts: 3145
Credit: 7056744931
RAC: 1602209

While I don't have a model for the exact symptoms you describe, two candidate issues are an excessive total queue, which leads to "high priority" processing-order distortions, and an inadequate supply of allowed CPU cores for the desired number of GPU tasks. The current major Einstein GPU work type is unusually CPU-hungry and, at least on my hosts, is distributed with an indication to the scheduler that it should assume each GPU task consumes a full CPU core. This is quite unlike earlier work. Especially if one has the controls set to allow more than one GPU task at a time per GPU, one can rather quickly run out of CPU cores on a multi-GPU machine.
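
The core-budget arithmetic can be sketched roughly as follows. This is a simplified model and not how the BOINC scheduler actually decides: it assumes every GPU task reserves a fixed number of CPU cores out of the cores BOINC is allowed to use, and that nothing else claims them. The function name and parameters are made up for illustration; the numbers come from the startup log in the first post (4 GPUs, "max CPUs used: 7").

```python
import math


def max_concurrent_gpu_tasks(n_gpus, tasks_per_gpu, cpu_budget, cpu_per_gpu_task):
    """Upper bound on GPU tasks running at once under a simple model.

    Assumption (not taken from BOINC itself): each GPU task reserves
    `cpu_per_gpu_task` cores out of `cpu_budget` allowed cores, and no
    CPU-only task competes for those cores.
    """
    gpu_slots = n_gpus * tasks_per_gpu
    if cpu_per_gpu_task <= 0:
        # No CPU reservation: only the number of GPU slots matters.
        return gpu_slots
    cpu_limited = math.floor(cpu_budget / cpu_per_gpu_task)
    return min(gpu_slots, cpu_limited)


# 4 GPUs and 7 usable cores, one full core reserved per GPU task.
one_per_gpu = max_concurrent_gpu_tasks(4, 1, 7, 1.0)  # limited by GPU slots
two_per_gpu = max_concurrent_gpu_tasks(4, 2, 7, 1.0)  # limited by CPU cores
print(one_per_gpu, two_per_gpu)
```

Under this model, 7 allowed cores are enough for four tasks at one task per GPU, so the core budget would only become the limit when running more than one task per GPU or when CPU tasks already occupy the allowed cores.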

Zalster
Joined: 26 Nov 13
Posts: 3117
Credit: 4050672230
RAC: 0

Where do you have your Use all GPUs file located? I don't see a mention of cc_config.xml in there. I do see "use all coprocessors", but at what level of the BOINC directory is the file located?
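
One quick way to answer that is to look for the file in the data directory named in the startup log. Below is a minimal sketch, assuming cc_config.xml sits directly in the BOINC data directory and that the option is written as a `use_all_gpus` element inside `options`, which is how I understand the client configuration format (the first post calls the entry use_all_coprocs, so adjust the tag name if your file differs). The helper name `check_use_all_gpus` is made up for this example.

```python
import os
import xml.etree.ElementTree as ET


def check_use_all_gpus(data_dir):
    """Return (path, status) for cc_config.xml in a BOINC data directory.

    status is "missing", "unreadable", "not set", or the option's text.
    Assumption: the option is <options><use_all_gpus> under the root element.
    """
    path = os.path.join(data_dir, "cc_config.xml")
    if not os.path.isfile(path):
        return path, "missing"
    try:
        root = ET.parse(path).getroot()
    except ET.ParseError:
        # The file exists but is not well-formed XML.
        return path, "unreadable"
    node = root.find("./options/use_all_gpus")
    if node is None:
        return path, "not set"
    return path, (node.text or "").strip()


if __name__ == "__main__":
    # Data directory as reported in the startup log; change it for other hosts.
    print(check_use_all_gpus(r"F:\BOINC\data"))
```

A status of "1" would mean the file is in the expected place and the option is switched on, which would match the "Config: use all coprocessors" line in the log.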

chainsaw
Joined: 11 Aug 16
Posts: 2
Credit: 40083499
RAC: 0

My computer ran for six months without problems. I remember now that the first time I noticed this problem with the graphics drivers was when I launched Path of Exile. This happened maybe five times, but unpredictably, not at every launch. At that time BOINC was not affected in any way that I noticed. Back then I ran BOINC in the background when CPU load was less than 25%. Path of Exile is very easy on the CPU. Just the graphics driver crashed, and Windows recovered after freezing for minutes.

 

Since my first reply I have removed BOINC along with \Program Data\ and \users\chainsaw\whatnot\BOINC and all, and reinstalled. Now BOINC runs as I expect it to run. However, I have already had one hang after resuming GPU work after playing Path of Exile.

https://www.dropbox.com/s/6os3uin969fkmzc/BOINC.png?dl=0
