Can anyone explain this result?

Darrell
Joined: 11 Nov 04
Posts: 32
Credit: 15397991
RAC: 0

Don't have an Einstein app at the moment, but here is the init_data.xml from a PrimeGrid app on the HD5850:

ATI
0
0
1.000000
0.129620

and from a Moo app on the HD3000

ATI
1
0
0.500000
0.250000

and from a Collatz on the HD3000

ATI
1
0
0.500000
0.250000

Darrell

More interesting data from the Collatz init_data.xml:


2
AuthenticAMD
AMD Athlon(tm) II X2 250 Processor [Family 16 Model 6 Stepping 3]
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni cx16 popcnt syscall nx lm svm sse4a osvw ibs skinit wdt page1gb rdtscp 3dnowext 3dnow
2776355959.056587
7092600648.636612
500000000.000000
1397686473.251541
0
8320770048.000000
1048576.000000
16639631360.000000
1000096137216.000000
878541012992.000000
Microsoft Windows 7
Professional x64 Edition, Service Pack 1, (06.01.7601.00)
4.3.10

2
ATI Radeon HD 5800/5900 series (Cypress/Hemlock)
1039138816.000000
1
1
4176000000000.000000
1.4.1741
8
1024
2047
2047
725
1000
64
18
1
256
4096
16384
16384
16384


ATI Radeon HD 5800/5900 series (Cypress/Hemlock)
Advanced Micro Devices, Inc.
4098
1
0
190
63
1
1
cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_khr_d3d10_sharing
1073741824
32768
725
18
OpenCL 1.2 AMD-APP (938.2)
OpenCL 1.2 AMD-APP (938.2)
CAL 1.4.1741 (VM)

Advanced Micro Devices, Inc.

AMD Athlon(tm) II X2 250 Processor
AuthenticAMD
4098
1
0
191
63
1
3
cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_khr_d3d10_sharing
8320770048
32768
3000
2
OpenCL 1.2 AMD-APP (938.2)
OpenCL 1.2 AMD-APP (938.2)
2.0 (sse2)

Note: The HD5850 runs the standard driver from AMD and the HD3000 is using a modded driver from a friend over on Guru3d.

Darrell

Have removed the exclusions and set the preferences to send only the BRP4G or CasA tasks, both of which I've seen run on the HD3000, but shouldn't. Unfortunately right now I have a PrimeGrid task running on the HD3000 (which it shouldn't), so it may be a while before I get an Einstein task.

From the init_data.xml from the PrimeGrid task:

2
ATI Radeon HD 5800/5900 series (Cypress/Hemlock)
1039138816.000000
1
1
4176000000000.000000
1.4.1741

Note the count. There is another section which describes the OpenCL-capable CPU. So it appears that BOINC is either telling the apps that there are two OpenCL GPUs, or just saying that I have two ATI GPUs and the HD5850 is the OpenCL-capable one.

Also, I have run the CLinfo program and have the data in a text file, let me know how much of it you would like to see.

While composing and editing this post Boinc downloaded a CasA task.

4/25/2014 5:19:37 PM | Einstein@Home | [coproc] ATI instance 0; 0.500000 pending for h1_0959.70_S6Directed__S6CasAf40a_959.95Hz_1169_1
4/25/2014 5:19:37 PM | SETI@home Beta Test | [coproc] ATI instance 0; 0.500000 pending for 22my13ag.11060.9883.438086664200.16.28_2
4/25/2014 5:19:37 PM | SETI@home Beta Test | [coproc] ATI instance 0; 0.500000 pending for 22my13ag.11060.9883.438086664200.16.81_0
4/25/2014 5:19:37 PM | PrimeGrid | [coproc] ATI instance 0; 0.500000 pending for genefer_1048576_388708_4
4/25/2014 5:19:37 PM | Einstein@Home | [coproc] ATI instance 1: confirming 0.500000 instance for h1_0959.70_S6Directed__S6CasAf40a_959.95Hz_1169_1
4/25/2014 5:19:37 PM | SETI@home Beta Test | [coproc] ATI instance 0: confirming 0.500000 instance for 22my13ag.11060.9883.438086664200.16.28_2
4/25/2014 5:19:37 PM | SETI@home | [coproc] Insufficient ATI for 26au08ae.22880.260603.438086664195.12.111_0: need 0.500000
4/25/2014 5:19:37 PM | SETI@home Beta Test | [coproc] ATI instance 0: confirming 0.500000 instance for 22my13ag.11060.9883.438086664200.16.81_0
4/25/2014 5:19:37 PM | SETI@home | [coproc] Insufficient ATI for 24mr08af.22844.16841.438086664198.12.185_1: need 0.500000
4/25/2014 5:19:37 PM | PrimeGrid | [coproc] ATI instance 1: confirming 0.500000 instance for genefer_1048576_388708_4

The pending lines are all for instance 0 (the HD5850), but the confirming lines put the SETI tasks on the HD5850 and the Einstein and PrimeGrid tasks on instance 1 (the HD3000).
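The mismatch between the pending and confirming lines can be picked out mechanically. Below is a minimal Python sketch, assuming the `[coproc]` log lines keep exactly the layout shown above; the names `COPROC_RE`, `summarize`, and `moved_tasks` are made up for illustration and are not part of BOINC.

```python
import re
from collections import defaultdict

# Matches the two [coproc] line shapes seen in the log excerpt above:
#   "... | Project | [coproc] ATI instance 0; 0.500000 pending for TASK"
#   "... | Project | [coproc] ATI instance 1: confirming 0.500000 instance for TASK"
# The layout is taken from this thread only; other client versions may differ.
COPROC_RE = re.compile(
    r"\|\s*(?P<project>[^|]+?)\s*\|\s*\[coproc\]\s+ATI instance (?P<dev>\d+)[;:]\s*"
    r"(?:(?P<pending>[\d.]+) pending for|confirming (?P<confirmed>[\d.]+) instance for)"
    r"\s+(?P<task>\S+)"
)


def summarize(lines):
    """Return {task: {"pending": device or None, "confirming": device or None}}."""
    result = defaultdict(lambda: {"pending": None, "confirming": None})
    for line in lines:
        match = COPROC_RE.search(line)
        if match is None:
            continue  # e.g. the "Insufficient ATI for ..." lines
        phase = "pending" if match.group("pending") else "confirming"
        result[match.group("task")][phase] = int(match.group("dev"))
    return dict(result)


def moved_tasks(summary):
    """Tasks whose confirmed device differs from the device they were pending on."""
    return sorted(
        task
        for task, devs in summary.items()
        if devs["pending"] is not None
        and devs["confirming"] is not None
        and devs["pending"] != devs["confirming"]
    )


if __name__ == "__main__":
    import sys

    # Usage: python coproc_check.py < stdoutdae.txt
    for task in moved_tasks(summarize(sys.stdin)):
        print(task)
```

Fed the ten log lines above, this should list the CasA and Genefer tasks, the two that were pending on instance 0 but confirmed on instance 1.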

Going to stop the client and run it through the BOINC simulator.

Darrell

Scenario is number 115 at http://boinc.berkeley.edu/dev/sim_web.php?action=show_scenario&name=115

Upon restart of the client, the CasA task was moved to the HD5850 with the Seti task and the PrimeGrid task is by itself on the HD3000. This matches the results in the simulator, but not what Boinc was showing before shutdown.

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5779100
RAC: 0

Thank you, I'll go through the logs to see if I can find anything out of the ordinary.

Darrell

So after the CasA task had been running for a while, I suspended it and added exclusions for BRP4G and CasA to not use device 0 (HD5850), had BOINC reread the config files and resumed the CasA task. When the CasA task restarted, BOINC said it was confirmed on device 1 (HD3000), so when I went to bed I had two SETI tasks on device 0 (HD5850) and CasA and PrimeGrid's Genefer on device 1 (HD3000). Since all four tasks have a thread using "amdocl.dll", they must really be executing on the HD5850 or the CPU. Checking BOINC this morning, the CasA task did not have as much elapsed time as I thought it should, and after checking my tasks on the Einstein site I saw that at 6 am the CasA task had thrown an exception:

http://einsteinathome.org/task/433137527

BOINC received another CasA task, so I am back to 2 SETI tasks on device 0 and Genefer and CasA on device 1. Process Explorer shows the SETI and CasA tasks to be using < 1% of the CPU and < 5% of the GPU; in fact CasA is using < 1% of the GPU. Genefer is using about 40% of the CPU and 90% of the GPU.

I would like to take a moment to praise the Einstein developers, as they are the only ones whose screensaver works when running an OpenCL task. THANK YOU!!

Jord

With thanks, David added a possible fix for this problem for a future (BOINC 7.3/7.4) client:

client: don't include GPUs that lack OpenCL/Cal/CUDA when main GPU has it

E.g.: if the "best" AMD GPU can do OpenCL, don't include AMD GPUs that can't,
even if use_all_gpus is set.
Otherwise lots of jobs will error out.


I did also point out that none of your work errored out, but that it merely showed as if it ran on the HD3000 while in reality, as near as I can figure it, it ran on the HD5850. The above fix will probably fix that as well.
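To make the commit message concrete, here is a rough Python sketch of the rule it describes. This is not the BOINC client code: the `Gpu` class, the `usable_gpus` function, ranking the "best" GPU by peak FLOPS, collapsing OpenCL/CAL/CUDA into one `has_compute_api` flag, and the HD3000 FLOPS figure in the usage example are all assumptions made for illustration. The case without `use_all_gpus` is also simplified to "only the best GPU per vendor".

```python
from dataclasses import dataclass


@dataclass
class Gpu:
    # Hypothetical record; field names are not taken from the BOINC source.
    name: str
    vendor: str
    peak_flops: float
    has_compute_api: bool  # stands in for "has OpenCL/CAL/CUDA"


def usable_gpus(gpus, use_all_gpus=False):
    """Apply the rule from the commit message, vendor by vendor.

    The best GPU of each vendor is always kept. Other GPUs are kept only
    when use_all_gpus is set, and even then they are dropped if the best
    GPU has a compute API and they do not.
    """
    by_vendor = {}
    for gpu in gpus:
        by_vendor.setdefault(gpu.vendor, []).append(gpu)

    kept = []
    for vendor_gpus in by_vendor.values():
        # Assumption: "best" means highest peak FLOPS.
        best = max(vendor_gpus, key=lambda g: g.peak_flops)
        for gpu in vendor_gpus:
            if gpu is best:
                kept.append(gpu)
            elif not use_all_gpus:
                continue  # simplified: lesser GPUs are ignored by default
            elif best.has_compute_api and not gpu.has_compute_api:
                continue  # the new rule: this GPU would only cause errors
            else:
                kept.append(gpu)
    return kept


if __name__ == "__main__":
    cards = [
        Gpu("HD5850", "ATI", 4.176e12, True),
        Gpu("HD3000", "ATI", 1.0e11, False),  # FLOPS value is a placeholder
    ]
    print([g.name for g in usable_gpus(cards, use_all_gpus=True)])
```

Under these assumptions, a host like Darrell's would report only the HD5850 to the apps even with `use_all_gpus` set, so nothing could be scheduled on, or shown as running on, the HD3000.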

Jord

There is however a caveat here. Apparently BOINC showing work being done on the wrong GPU is either a problem with the science application(s), or with the BOINC API. To be continued.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5866
Credit: 111900762609
RAC: 35355792

Jord,

I'd just like to give you a big 'thank you' for the efforts you put in liaising with the BOINC Devs and bringing problems like this to their attention. It must take a lot of time and effort and I'd just like to acknowledge that and express my appreciation.

Thanks also go to Darrell for his persistence with this particular matter.

Cheers,
Gary.

Jord

Quote:
It must take a lot of time and effort


This one wasn't as bad. My time at BOINC is severely limited these days, and will still be over the coming 3 months, but this problem was easy to relay to the developers because, thanks to Darrell and Richard, there was a lot of information to read through and point out. Well documented.

There are still lingering problems, but those are for the developers to solve. That is entering my mailbox but doesn't need my complete attention. :)

Which leaves me with a lot more hours to spend on TESO. If you want something bugged: that's the game to get. LOL.
