Here's another:
cat /var/lib/boinc-client/slots/0/init_data.xml
<gpu_type>NVIDIA</gpu_type>
<gpu_device_num>1</gpu_device_num>
<gpu_opencl_dev_index>1</gpu_opencl_dev_index>
cat /var/lib/boinc-client/slots/1/init_data.xml
<gpu_type>NVIDIA</gpu_type>
<gpu_device_num>0</gpu_device_num>
<gpu_opencl_dev_index>0</gpu_opencl_dev_index>
<gpu_usage>1.000000</gpu_usage>
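The same two fields can be pulled out of every running slot at once. A minimal sketch (the slot path is the one from the cat commands above; the script itself is my own scaffolding, not part of BOINC):

```python
import glob
import xml.etree.ElementTree as ET

def gpu_assignment(path):
    """Return (gpu_device_num, gpu_opencl_dev_index) from an init_data.xml."""
    root = ET.parse(path).getroot()
    return (root.findtext("gpu_device_num"),
            root.findtext("gpu_opencl_dev_index"))

if __name__ == "__main__":
    # Empty list (and no output) if the BOINC data dir isn't present.
    for f in sorted(glob.glob("/var/lib/boinc-client/slots/*/init_data.xml")):
        num, idx = gpu_assignment(f)
        print(f, "device_num:", num, "opencl_dev_index:", idx)
```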
Interesting. Hm, there's only a gpu_opencl_dev_index in init_data.xml. The list it points to must be in the client (or actually in a shared memory). The index is different, but the entries may still point to the same device. Weird, though.
BM
I snipped out some parts of the init_data.xml; are there other parts that would be helpful?
The behaviour on my computer is reproducible, so I can generate data easily.
How is the actual launching of the application on a specific GPU done?
In clinfo each GPU has a GUID, so it should be possible to check.
The app calls a BOINC API function boinc_get_opencl_ids() [1] and passes this info to the OpenCL clCreateContext() [2] function. That's all from the app side.
I don't know exactly how boinc_get_opencl_ids() works, and apparently I was even mistaken about that. I can guess that <gpu_opencl_dev_index> points to an internal table that is not exposed to the app, and I have no means to debug the client. Does anyone know? Is there any debugging you can activate to get a better understanding?
[1] https://boinc.berkeley.edu/trac/wiki/OpenclApps
[2] https://man.opencl.org/clCreateContext.html
BM
David Anderson would prefer us to refer to https://github.com/BOINC/boinc/wiki/OpenclApps, but it doesn't look as if the actual content has changed in the last 10 years.
When the client has evaluated the hardware at startup, it writes a file "coproc_info.xml" in the user's BOINC data folder, with a separate OpenCL section listing what it found for each device. That might provide some clues.
Remember that BOINC re-sequences the device numbers, to make what it assesses to be the 'best' device number 0 - that may differ from other detection tools.
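That file can be summarized quickly. A sketch that prints, for each <nvidia_opencl> entry, the re-sequenced device_num against the opencl_device_index (the file path and tag names are taken from the coproc_info.xml dump later in this thread; the script is mine):

```python
import os
import xml.etree.ElementTree as ET

def opencl_devices(path):
    """List (device_num, opencl_device_index, name) per <nvidia_opencl> entry."""
    root = ET.parse(path).getroot()
    return [(dev.findtext("device_num"),
             dev.findtext("opencl_device_index"),
             dev.findtext("name"))
            for dev in root.iter("nvidia_opencl")]

if __name__ == "__main__":
    path = "/var/lib/boinc-client/coproc_info.xml"
    if os.path.exists(path):  # skip quietly on non-BOINC machines
        for row in opencl_devices(path):
            print(*row)
```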
The boinc_get_opencl_ids() function is in this code:
https://github.com/BOINC/boinc/blob/058a8094f5a1ab9759f76aa7551f5df5ac6bd370/client/gpu_opencl.cpp#L706
It just builds the list and passes back the data, I think.
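If so, the index is simply a position in a flat, client-built list of GPU devices, and boinc_get_opencl_ids() hands back the entry at that position. A toy illustration of that presumed lookup (mock data only; this is a guess at the idea, not the actual client code):

```python
# Mock of the client-side enumeration: each OpenCL platform contributes
# its GPU devices, in driver order, to one flat list.
platforms = [
    ("NVIDIA CUDA", ["RTX A5000 (bus 65)", "RTX A5000 (bus 97)"]),
]

def build_device_table(platforms):
    """Flatten all GPU devices across platforms into one ordered list."""
    table = []
    for _platform_name, devices in platforms:
        table.extend(devices)
    return table

def lookup(table, gpu_opencl_dev_index):
    """What boinc_get_opencl_ids() presumably returns for the slot's index."""
    return table[gpu_opencl_dev_index]

table = build_device_table(platforms)
print(lookup(table, 0))  # prints "RTX A5000 (bus 65)"
print(lookup(table, 1))  # prints "RTX A5000 (bus 97)"
```

If the driver enumerates the cards in a different order than the client did at startup, the same index would land on a different physical card; that is the kind of mismatch worth checking against the device UUIDs from clinfo.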
Here is the full data from init_data.xml:
cat /var/lib/boinc-client/slots/0/init_data.xml
<app_init_data>
<major_version>7</major_version>
<minor_version>24</minor_version>
<release>1</release>
<app_version>106</app_version>
<userid>1048642</userid>
<teamid>0</teamid>
<hostid>13159167</hostid>
<app_name>einstein_O3AS</app_name>
<project_preferences>
<graphics fps="20" width="800" height="600" quality="low"/>
<libc215>0</libc215>
<also_run_cpu>0</also_run_cpu>
<allow_non_preferred_apps>0</allow_non_preferred_apps>
<gpu_util_brp>1.00</gpu_util_brp>
<gpu_util_fgrp>1.00</gpu_util_fgrp>
<gpu_util_gw>1.00</gpu_util_gw>
</project_preferences>
<user_name>Toby Broom</user_name>
<project_dir>/var/lib/boinc-client/projects/einstein.phys.uwm.edu</project_dir>
<boinc_dir>/var/lib/boinc-client</boinc_dir>
<authenticator>59a0bdafedcbb4656046b4a734060916</authenticator>
<wu_name>h1_0977.80_O3aC01Cl1In0__O3ASHF1b_978.00Hz_13135</wu_name>
<result_name>h1_0977.80_O3aC01Cl1In0__O3ASHF1b_978.00Hz_13135_0</result_name>
<shm_key>-1</shm_key>
<slot>0</slot>
<client_pid>2416</client_pid>
<wu_cpu_time>0.000000</wu_cpu_time>
<starting_elapsed_time>0.000000</starting_elapsed_time>
<using_sandbox>0</using_sandbox>
<vm_extensions_disabled>0</vm_extensions_disabled>
<user_total_credit>34150653.000000</user_total_credit>
<user_expavg_credit>934735.709994</user_expavg_credit>
<host_total_credit>34150653.000000</host_total_credit>
<host_expavg_credit>934750.060478</host_expavg_credit>
<resource_share_fraction>1.000000</resource_share_fraction>
<checkpoint_period>60.000000</checkpoint_period>
<fraction_done_start>0.000000</fraction_done_start>
<fraction_done_end>1.000000</fraction_done_end>
<gpu_type>NVIDIA</gpu_type>
<gpu_device_num>1</gpu_device_num>
<gpu_opencl_dev_index>1</gpu_opencl_dev_index>
<gpu_usage>1.000000</gpu_usage>
<ncpus>0.900000</ncpus>
<rsc_fpops_est>720000000000000.000000</rsc_fpops_est>
<rsc_fpops_bound>14400000000000000.000000</rsc_fpops_bound>
<rsc_memory_bound>314572800.000000</rsc_memory_bound>
<rsc_disk_bound>419430400.000000</rsc_disk_bound>
<computation_deadline>1700046125.000000</computation_deadline>
<vbox_window>0</vbox_window>
<no_priority_change>0</no_priority_change>
<process_priority>-1</process_priority>
<process_priority_special>-1</process_priority_special>
<host_info>
<timezone>3600</timezone>
<domain_name>Nitrogen</domain_name>
<ip_addr>127.0.1.1</ip_addr>
<host_cpid>8979589f4a7c5339b993bedf684458c9</host_cpid>
<p_ncpus>32</p_ncpus>
<p_vendor>AuthenticAMD</p_vendor>
<p_model>AMD Ryzen Threadripper PRO 5955WX 16-Cores [Family 25 Model 8 Stepping 2]</p_model>
<p_features>fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 invpcid_single hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd amd_ppin brs arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm</p_features>
<p_fpops>2979926657.001706</p_fpops>
<p_iops>6860599356.319954</p_iops>
<p_membw>1000000000.000000</p_membw>
<p_calculated>1698093010.991753</p_calculated>
<p_vm_extensions_disabled>0</p_vm_extensions_disabled>
<m_nbytes>540645339136.000000</m_nbytes>
<m_cache>524288.000000</m_cache>
<m_swap>2147479552.000000</m_swap>
<d_total>1006450962432.000000</d_total>
<d_free>932461023232.000000</d_free>
<os_name>Linux Ubuntu</os_name>
<os_version>Ubuntu 23.10 [6.5.0-10-generic|libc 2.38]</os_version>
<n_usable_coprocs>2</n_usable_coprocs>
<wsl_available>0</wsl_available>
<virtualbox_version>7.0.10_Ubuntur158379</virtualbox_version>
<coprocs>
<coproc_cuda>
<count>2</count>
<name>NVIDIA RTX A5000</name>
<available_ram>25425608704.000000</available_ram>
<have_cuda>1</have_cuda>
<have_opencl>1</have_opencl>
<peak_flops>27770880000000.000000</peak_flops>
<cudaVersion>12020</cudaVersion>
<drvVersion>53599</drvVersion>
<totalGlobalMem>25425608704.000000</totalGlobalMem>
<sharedMemPerBlock>49152.000000</sharedMemPerBlock>
<regsPerBlock>65536</regsPerBlock>
<warpSize>32</warpSize>
<memPitch>2147483647.000000</memPitch>
<maxThreadsPerBlock>1024</maxThreadsPerBlock>
<maxThreadsDim>1024 1024 64</maxThreadsDim>
<maxGridSize>2147483647 65535 65535</maxGridSize>
<clockRate>1695000</clockRate>
<totalConstMem>65536.000000</totalConstMem>
<major>8</major>
<minor>6</minor>
<textureAlignment>512.000000</textureAlignment>
<deviceOverlap>1</deviceOverlap>
<multiProcessorCount>64</multiProcessorCount>
<coproc_opencl>
<name>NVIDIA RTX A5000</name>
<vendor>NVIDIA Corporation</vendor>
<vendor_id>4318</vendor_id>
<available>1</available>
<half_fp_config>0</half_fp_config>
<single_fp_config>191</single_fp_config>
<double_fp_config>63</double_fp_config>
<endian_little>1</endian_little>
<execution_capabilities>1</execution_capabilities>
<extensions>cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_khr_gl_event cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_nv_kernel_attribute cl_khr_device_uuid cl_khr_pci_bus_info cl_khr_external_semaphore cl_khr_external_memory cl_khr_external_semaphore_opaque_fd cl_khr_external_memory_opaque_fd</extensions>
<global_mem_size>25425608704</global_mem_size>
<local_mem_size>49152</local_mem_size>
<max_clock_frequency>1695</max_clock_frequency>
<max_compute_units>64</max_compute_units>
<nv_compute_capability_major>8</nv_compute_capability_major>
<nv_compute_capability_minor>6</nv_compute_capability_minor>
<amd_simd_per_compute_unit>0</amd_simd_per_compute_unit>
<amd_simd_width>0</amd_simd_width>
<amd_simd_instruction_width>0</amd_simd_instruction_width>
<opencl_platform_version>OpenCL 3.0 CUDA 12.2.147</opencl_platform_version>
<opencl_device_version>OpenCL 3.0 CUDA</opencl_device_version>
<opencl_driver_version>535.129.03</opencl_driver_version>
</coproc_opencl>
<pci_info>
<bus_id>65</bus_id>
<device_id>0</device_id>
<domain_id>0</domain_id>
</pci_info>
<pci_info>
<bus_id>97</bus_id>
<device_id>0</device_id>
<domain_id>0</domain_id>
</pci_info>
</coproc_cuda>
</coprocs>
</host_info>
<proxy_info>
<use_http_proxy/>
<socks_server_name></socks_server_name>
<socks_server_port>80</socks_server_port>
<http_server_name>192.168.1.179</http_server_name>
<http_server_port>3128</http_server_port>
<socks5_user_name></socks5_user_name>
<socks5_user_passwd></socks5_user_passwd>
<socks5_remote_dns>0</socks5_remote_dns>
<http_user_name></http_user_name>
<http_user_passwd></http_user_passwd>
<no_proxy></no_proxy>
<no_autodetect>0</no_autodetect>
</proxy_info>
<global_preferences>
<source_project>https://lhcathome.cern.ch/lhcathome/</source_project>
<mod_time>1697832210.000000</mod_time>
<battery_charge_min_pct>90.000000</battery_charge_min_pct>
<battery_max_temperature>40.000000</battery_max_temperature>
<run_on_batteries>0</run_on_batteries>
<run_if_user_active>1</run_if_user_active>
<run_gpu_if_user_active>1</run_gpu_if_user_active>
<suspend_if_no_recent_input>0.000000</suspend_if_no_recent_input>
<suspend_cpu_usage>0.000000</suspend_cpu_usage>
<start_hour>0.000000</start_hour>
<end_hour>0.000000</end_hour>
<net_start_hour>0.000000</net_start_hour>
<net_end_hour>0.000000</net_end_hour>
<leave_apps_in_memory>0</leave_apps_in_memory>
<confirm_before_connecting>0</confirm_before_connecting>
<hangup_if_dialed>0</hangup_if_dialed>
<dont_verify_images>0</dont_verify_images>
<work_buf_min_days>0.200000</work_buf_min_days>
<work_buf_additional_days>0.100000</work_buf_additional_days>
<max_ncpus_pct>90.000000</max_ncpus_pct>
<niu_max_ncpus_pct>90.000000</niu_max_ncpus_pct>
<niu_cpu_usage_limit>100.000000</niu_cpu_usage_limit>
<niu_suspend_cpu_usage>0.000000</niu_suspend_cpu_usage>
<cpu_scheduling_period_minutes>60.000000</cpu_scheduling_period_minutes>
<disk_interval>60.000000</disk_interval>
<disk_max_used_gb>400.000000</disk_max_used_gb>
<disk_max_used_pct>100.000000</disk_max_used_pct>
<disk_min_free_gb>10.000000</disk_min_free_gb>
<vm_max_used_pct>50.000000</vm_max_used_pct>
<ram_max_used_busy_pct>97.000000</ram_max_used_busy_pct>
<ram_max_used_idle_pct>97.000000</ram_max_used_idle_pct>
<idle_time_to_run>1.000000</idle_time_to_run>
<max_bytes_sec_up>0.000000</max_bytes_sec_up>
<max_bytes_sec_down>0.000000</max_bytes_sec_down>
<cpu_usage_limit>100.000000</cpu_usage_limit>
<daily_xfer_limit_mb>0.000000</daily_xfer_limit_mb>
<daily_xfer_period_days>0</daily_xfer_period_days>
<override_file_present>1</override_file_present>
<network_wifi_only>1</network_wifi_only>
</global_preferences>
<app_file>einstein_O3AS_1.06_x86_64-pc-linux-gnu__GW-opencl-nvidia-2</app_file>
<app_file>O3ASHF1b_0.config</app_file>
<app_file>O3ASHF1b_1.config</app_file>
</app_init_data>
It seems OK; there is a different PCI ID for the two cards.
cat /var/lib/boinc-client/coproc_info.xml
<coprocs>
<have_cuda>1</have_cuda>
<cuda_version>12020</cuda_version>
<coproc_cuda>
<count>1</count>
<name>NVIDIA RTX A5000</name>
<available_ram>25425608704.000000</available_ram>
<have_cuda>1</have_cuda>
<have_opencl>0</have_opencl>
<peak_flops>27770880000000.000000</peak_flops>
<cudaVersion>12020</cudaVersion>
<drvVersion>53599</drvVersion>
<totalGlobalMem>25425608704.000000</totalGlobalMem>
<sharedMemPerBlock>49152.000000</sharedMemPerBlock>
<regsPerBlock>65536</regsPerBlock>
<warpSize>32</warpSize>
<memPitch>2147483647.000000</memPitch>
<maxThreadsPerBlock>1024</maxThreadsPerBlock>
<maxThreadsDim>1024 1024 64</maxThreadsDim>
<maxGridSize>2147483647 65535 65535</maxGridSize>
<clockRate>1695000</clockRate>
<totalConstMem>65536.000000</totalConstMem>
<major>8</major>
<minor>6</minor>
<textureAlignment>512.000000</textureAlignment>
<deviceOverlap>1</deviceOverlap>
<multiProcessorCount>64</multiProcessorCount>
<pci_info>
<bus_id>65</bus_id>
<device_id>0</device_id>
<domain_id>0</domain_id>
</pci_info>
</coproc_cuda>
<coproc_cuda>
<count>1</count>
<name>NVIDIA RTX A5000</name>
<available_ram>25425608704.000000</available_ram>
<have_cuda>1</have_cuda>
<have_opencl>0</have_opencl>
<peak_flops>27770880000000.000000</peak_flops>
<cudaVersion>12020</cudaVersion>
<drvVersion>53599</drvVersion>
<totalGlobalMem>25425608704.000000</totalGlobalMem>
<sharedMemPerBlock>49152.000000</sharedMemPerBlock>
<regsPerBlock>65536</regsPerBlock>
<warpSize>32</warpSize>
<memPitch>2147483647.000000</memPitch>
<maxThreadsPerBlock>1024</maxThreadsPerBlock>
<maxThreadsDim>1024 1024 64</maxThreadsDim>
<maxGridSize>2147483647 65535 65535</maxGridSize>
<clockRate>1695000</clockRate>
<totalConstMem>65536.000000</totalConstMem>
<major>8</major>
<minor>6</minor>
<textureAlignment>512.000000</textureAlignment>
<deviceOverlap>1</deviceOverlap>
<multiProcessorCount>64</multiProcessorCount>
<pci_info>
<bus_id>97</bus_id>
<device_id>0</device_id>
<domain_id>0</domain_id>
</pci_info>
</coproc_cuda>
<nvidia_opencl>
<name>NVIDIA RTX A5000</name>
<vendor>NVIDIA Corporation</vendor>
<vendor_id>4318</vendor_id>
<available>1</available>
<half_fp_config>0</half_fp_config>
<single_fp_config>191</single_fp_config>
<double_fp_config>63</double_fp_config>
<endian_little>1</endian_little>
<execution_capabilities>1</execution_capabilities>
<extensions>cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_khr_gl_event cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_nv_kernel_attribute cl_khr_device_uuid cl_khr_pci_bus_info cl_khr_external_semaphore cl_khr_external_memory cl_khr_external_semaphore_opaque_fd cl_khr_external_memory_opaque_fd</extensions>
<global_mem_size>25425608704</global_mem_size>
<local_mem_size>49152</local_mem_size>
<max_clock_frequency>1695</max_clock_frequency>
<max_compute_units>64</max_compute_units>
<nv_compute_capability_major>8</nv_compute_capability_major>
<nv_compute_capability_minor>6</nv_compute_capability_minor>
<amd_simd_per_compute_unit>0</amd_simd_per_compute_unit>
<amd_simd_width>0</amd_simd_width>
<amd_simd_instruction_width>0</amd_simd_instruction_width>
<opencl_platform_version>OpenCL 3.0 CUDA 12.2.147</opencl_platform_version>
<opencl_device_version>OpenCL 3.0 CUDA</opencl_device_version>
<opencl_driver_version>535.129.03</opencl_driver_version>
<device_num>0</device_num>
<peak_flops>27770880000000.000000</peak_flops>
<opencl_available_ram>25425608704.000000</opencl_available_ram>
<opencl_device_index>0</opencl_device_index>
<warn_bad_cuda>0</warn_bad_cuda>
</nvidia_opencl>
<nvidia_opencl>
<name>NVIDIA RTX A5000</name>
<vendor>NVIDIA Corporation</vendor>
<vendor_id>4318</vendor_id>
<available>1</available>
<half_fp_config>0</half_fp_config>
<single_fp_config>191</single_fp_config>
<double_fp_config>63</double_fp_config>
<endian_little>1</endian_little>
<execution_capabilities>1</execution_capabilities>
<extensions>cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_khr_gl_event cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_nv_kernel_attribute cl_khr_device_uuid cl_khr_pci_bus_info cl_khr_external_semaphore cl_khr_external_memory cl_khr_external_semaphore_opaque_fd cl_khr_external_memory_opaque_fd</extensions>
<global_mem_size>25425608704</global_mem_size>
<local_mem_size>49152</local_mem_size>
<max_clock_frequency>1695</max_clock_frequency>
<max_compute_units>64</max_compute_units>
<nv_compute_capability_major>8</nv_compute_capability_major>
<nv_compute_capability_minor>6</nv_compute_capability_minor>
<amd_simd_per_compute_unit>0</amd_simd_per_compute_unit>
<amd_simd_width>0</amd_simd_width>
<amd_simd_instruction_width>0</amd_simd_instruction_width>
<opencl_platform_version>OpenCL 3.0 CUDA 12.2.147</opencl_platform_version>
<opencl_device_version>OpenCL 3.0 CUDA</opencl_device_version>
<opencl_driver_version>535.129.03</opencl_driver_version>
<device_num>1</device_num>
<peak_flops>27770880000000.000000</peak_flops>
<opencl_available_ram>25425608704.000000</opencl_available_ram>
<opencl_device_index>1</opencl_device_index>
<warn_bad_cuda>0</warn_bad_cuda>
</nvidia_opencl>
<warning>NVIDIA library reports 2 GPUs</warning>
<warning>[coproc] cuMemGetInfo(0) returned 201</warning>
<warning>[coproc] cuMemGetInfo(1) returned 201</warning>
<warning>ATI: libaticalrt.so: cannot open shared object file: No such file or directory</warning>
</coprocs>
I think it's an NVIDIA issue? The process hops when the 2nd task starts:
8 Nov 2023 21:11:21 | Einstein@Home | [coproc] Assigning NVIDIA instance 1 to h1_0977.80_O3aC01Cl1In0__O3ASHF1b_978.00Hz_13137_0
8 Nov 2023 21:11:21 | Einstein@Home | [task] ACTIVE_TASK::start(): forked process: pid 131447
8 Nov 2023 21:18:13 | Einstein@Home | [coproc] Assigning NVIDIA instance 0 to h1_0977.80_O3aC01Cl1In0__O3ASHF1b_978.00Hz_13136_0
8 Nov 2023 21:18:13 | Einstein@Home | [task] ACTIVE_TASK::start(): forked process: pid 132103
Maybe not. The WU that is on GPU 1 pauses at 49.5%, then when it restarts at 50% it moves back to GPU 0; what's on 0 stays on 0.
I'm not sure what's happening at 50%?
The new app is designed to process the 2 Hz range in two 1 Hz passes instead of one pass, to reduce VRAM use. Maybe Bernd can comment on whether it's releasing the GPU context and creating a new one during the switch, but that would be my guess.
Is it possible that, if the app is grabbing a new context and it's not well defined which GPU to use at that time, it's just picking the first one by default?
_________________________________________________________________________
This is certainly interesting. Indeed the app releases the context in the middle and creates a new one, but nothing is different from the first half; it should use the same device information. I think boinc_get_opencl_ids() is called again there, but the info is stored in a global variable, so it may also be that it's not called again and the existing info is just reused. In any case it shouldn't differ from the first time. I'll take a look at the code again.
BM