Important news on BRP7 and FGRPB1 work on E@H

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3,914
Credit: 44,079,279,309
RAC: 63,751,429

That might explain it. I know the Arc series of cards do not have hardware FP64 support, and maybe most of their iGPUs lack it as well?

is there a mechanism in the code to fall back to CPU for the FP64 parts? or is that supposed to be handled by the driver?

_________________________________________________________________________

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,305
Credit: 249,002,894
RAC: 33,973

The plan class (BRP7-opencl-intel_gpu) is set to require fp64 support, which means to get such "work" the BOINC client must (erroneously) report that the device supports that. That's a bad bug in the client then.

In FGRP double precision is required only in the first stage before the FFT. The app does this on the CPU if the GPU lacks double precision support. In BRP7 double precision is used deeper in the computation; there is not much left for the GPU to do if you do that part on the CPU. BRP4 is all single precision, though.
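
To make the placement argument concrete, here is a minimal sketch of how such a fallback plays out. It is not the actual Einstein@Home code: the stage names and the `place_stages` helper are hypothetical, and the BRP7-like layout is an assumption. The only thing taken from the description above is that FP64 stages move to the CPU when the GPU lacks double precision.

```python
# Hypothetical sketch: stage names and place_stages() are illustrative only,
# not taken from the FGRP or BRP7 sources.

def place_stages(stages, gpu_has_fp64):
    """Map each stage to 'gpu' or 'cpu'.

    stages: list of (name, needs_fp64) tuples in pipeline order.
    A stage that needs FP64 runs on the CPU when the GPU lacks it.
    """
    placement = {}
    for name, needs_fp64 in stages:
        if needs_fp64 and not gpu_has_fp64:
            placement[name] = "cpu"
        else:
            placement[name] = "gpu"
    return placement

# FGRP-like: FP64 only in the first stage, before the FFT.
fgrp_like = [("pre-FFT stage", True), ("FFT", False), ("post-FFT stages", False)]
# BRP7-like: FP64 deeper in the computation (assumed layout).
brp7_like = [("early stage", False), ("inner stage A", True), ("inner stage B", True)]

for label, stages in (("FGRP-like", fgrp_like), ("BRP7-like", brp7_like)):
    print(label, place_stages(stages, gpu_has_fp64=False))
```

With the FGRP-like layout only the first stage leaves the GPU, while with the BRP7-like layout most of the pipeline ends up on the CPU, which is the "not much left to do for the GPU" case.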

In the early days of GPU computing NVidia could emulate double precision computation in software (using two floats IIRC), but that was CUDA only.
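
For readers who have not seen the "two floats" trick: the sketch below shows the generic textbook idea (often called double-single arithmetic), where a value is kept as an unevaluated sum `hi + lo` of two single-precision numbers. This is an illustration under my own assumptions, not a reconstruction of what NVidia's software actually did. The helpers `f32`, `split`, `two_sum` and `ds_add` are names made up for this example, and single precision is simulated on the CPU by rounding through `struct`.

```python
import struct

def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def split(x):
    """Represent a double as an unevaluated sum hi + lo of two floats."""
    hi = f32(x)
    lo = f32(x - hi)
    return hi, lo

def two_sum(a, b):
    """Error-free addition of two floats: returns (rounded sum, rounding error)."""
    s = f32(a + b)
    bb = f32(s - a)
    err = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, err

def ds_add(x, y):
    """Add two (hi, lo) pairs using only single-precision operations."""
    s, e = two_sum(x[0], y[0])
    e = f32(e + f32(x[1] + y[1]))
    hi = f32(s + e)
    lo = f32(e - f32(hi - s))
    return hi, lo

# In plain single precision the small term is lost entirely ...
plain = f32(1.0 + 1e-9)
# ... while the pair keeps it in the low word.
hi, lo = ds_add(split(1.0), split(1e-9))
print(plain, hi, lo)
```

The cost is several single-precision operations per emulated addition, so this kind of emulation trades speed for precision.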

BM

Keith Myers
Joined: 11 Feb 11
Posts: 4,918
Credit: 18,471,264,220
RAC: 5,966,341

Intel Arc GPUs don't support hardware-accelerated FP64 calculations. However, they do support emulated FP64 calculations for niche cases. This is because Intel removed FP64 hardware support in their 12th generation GPU architecture and instead added code to the driver to emulate FP64 calculations with software instructions.

Boca Raton Community HS
Joined: 4 Nov 15
Posts: 234
Credit: 9,538,042,253
RAC: 15,827,438

This might be a rookie question, but does only the opencl version do this ("app does this on the CPU if the GPU lacks double precision support")? Does the cuda version of the app use fp64 on the GPU? 

Keith Myers
Joined: 11 Feb 11
Posts: 4,918
Credit: 18,471,264,220
RAC: 5,966,341

Bernd's reply applies only to the FGRPB1G application; it does not apply to the BRP7 applications.

All Nvidia cards (AFAIK) have always had hardware implemented FP64 calculation capability.

So it's not necessary to worry about shifting FP64 calcs from the GPU to the CPU. (At least, if you're happy with GPU FP64 precision.)

Scrooge McDuck
Joined: 2 May 07
Posts: 1,008
Credit: 17,389,650
RAC: 12,484

Bernd Machenschalk wrote:

The plan class (BRP7-opencl-intel_gpu) is set to require fp64 support, which means to get such "work" the BOINC client must (erroneously) report that the device supports that. That's a bad bug in the client then.

I have an older Core i7-4770 with Intel GPU. This host received lots of BRP7 GPU tasks for the new beta BRP7 app for Intel GPU some days ago. They all errored out immediately.

This host finished FGRPB1G GPU tasks for years successfully (also validates). Logfiles of each FGRPB1G task always state the following:

Using OpenCL device "Intel(R) HD Graphics 4600" by: Intel(R) Corporation
Max allocation limit: 340158054
Global mem size: 1360632218
Error in OpenCL context: Build program failure.
OpenCL compiling FAILED! : -11 . Error message: :10:30: error: use of type 'double' requires cl_khr_fp64 extension to be enabled
 
error: front end compiler failed build.
OpenCL device has no FP64 support

Clearly this iGPU doesn't support FP64 but successfully crunched FGRPB1G GPU tasks on iGPU.

Nevertheless, a few days ago, for the first time, it received lots of BRP7 tasks, all of which errored out immediately. Why does this host get BRP7 tasks if it's unfit for them? There's only the Intel GPU. No Nvidia or ATI/AMD card.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,870
Credit: 116,032,029,494
RAC: 35,669,721

Gary Roberts wrote:

....  GPU tasks are being completely starved of CPU support.

Out of interest since there hasn't been a response, I took a quick look at the 4 most recently completed BRP7 tasks, all returned 28th Sep UTC.  The run times (and status) were:-

  1.   27,365 secs - valid
  2.   26,564 secs - valid
  3.   13,942 secs - validate error
  4.     1,269 secs - validate error

Looks like the GPU is throwing a hissy fit now that it's being pushed to actually do some work.  The OP should check for cooling or perhaps voltage/frequency issues.

Cheers,
Gary.

Keith Myers
Joined: 11 Feb 11
Posts: 4,918
Credit: 18,471,264,220
RAC: 5,966,341

Re-read your own quote of Bernd's reply for your answer.  The problem is the BOINC client mis-identifying a gpu as FP64 capable.

There is nothing that Bernd can do with the BRP7 application other than reconfiguring the server scheduler to not send any BRP7 task to any Intel gpu.

 

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,870
Credit: 116,032,029,494
RAC: 35,669,721

With regard to the validation performance of the Linux v0.17 BRP7 app, yesterday I posted the following:-

Gary Roberts wrote:

...  As of right now (running single tasks) it has completed and returned 25 tasks.  The results are:-

  • Pending      = 10
  • Valid        = 10
  • Invalid      =  1
  • Error        =  0
  • Inconclusive =  4

In my experience, because Linux is the minor player, a big fraction of inconclusive results in a Windows/Linux comparison eventually translate to invalid for the Linux machine.

Here is an updated set of results (52 tasks returned) just to check if the initial trend is changing in any way:-

  • Pending      = 19
  • Valid        = 23
  • Invalid      =  2
  • Error        =  0
  • Inconclusive =  8

The trend is continuing - all numbers are increasing at much the same rate.
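
As a quick check of that statement, the two snapshots can be turned into percentages of tasks returned. The short script below uses only the counts listed above; the `shares` helper is just a name chosen for this example.

```python
# Counts copied from the two snapshots above (25 and 52 tasks returned).
first = {"Pending": 10, "Valid": 10, "Invalid": 1, "Error": 0, "Inconclusive": 4}
second = {"Pending": 19, "Valid": 23, "Invalid": 2, "Error": 0, "Inconclusive": 8}

def shares(counts):
    """Return each category as a percentage of all returned tasks."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}

for label, counts in (("25 tasks", first), ("52 tasks", second)):
    line = ", ".join(f"{name} {pct:.1f}%" for name, pct in shares(counts).items())
    print(f"{label}: {line}")
```

The invalid share stays at roughly 4% in both snapshots (1/25 and 2/52) and the inconclusive share at roughly 15 to 16% (4/25 and 8/52), so the proportions have barely moved.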

This doesn't seem any better than an earlier test I ran on a completely different machine using the v0.15 app.

 

Cheers,
Gary.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3,914
Credit: 44,079,279,309
RAC: 63,751,429

Bernd Machenschalk wrote:

The plan class (BRP7-opencl-intel_gpu) is set to require fp64 support, which means to get such "work" the BOINC client must (erroneously) report that the device supports that. That's a bad bug in the client then.

I'd be interested to see what Scrooge's coproc_info.xml shows. I'm not sure which parameter exactly the project server is looking at, but in my coproc file it lists:

Quote:
      <name>Intel(R) Iris(R) Xe Graphics [0x9a49]</name>
      <vendor>Intel(R) Corporation</vendor>
      <vendor_id>32902</vendor_id>
      <available>1</available>
      <half_fp_config>63</half_fp_config>
      <single_fp_config>63</single_fp_config>
      <double_fp_config>0</double_fp_config>
      <endian_little>1</endian_little>
      <execution_capabilities>1</execution_capabilities>
      <extensions>cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_intel_command_queue_families cl_intel_subgroups cl_intel_required_subgroup_size cl_intel_subgroups_short cl_khr_spir cl_intel_accelerator cl_intel_driver_diagnostics cl_khr_priority_hints cl_khr_throttle_hints cl_khr_create_command_queue cl_intel_subgroups_char cl_intel_subgroups_long cl_khr_il_program cl_intel_mem_force_host_memory cl_khr_subgroup_extended_types cl_khr_subgroup_non_uniform_vote cl_khr_subgroup_ballot cl_khr_subgroup_non_uniform_arithmetic cl_khr_subgroup_shuffle cl_khr_subgroup_shuffle_relative cl_khr_subgroup_clustered_reduce cl_intel_device_attribute_query cl_khr_suggested_local_work_size cl_intel_spirv_media_block_io cl_intel_spirv_subgroups cl_khr_spirv_no_integer_wrap_decoration cl_intel_unified_shared_memory cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_intel_planar_yuv cl_intel_packed_yuv cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_3d_image_writes cl_intel_media_block_io cl_intel_va_api_media_sharing cl_intel_sharing_format_query cl_khr_pci_bus_info cl_intel_subgroup_local_block_io </extensions>



I'm not questioning my own errored tasks, since I forced its hand with a custom app_info to get the tasks and override any such checks. But you can see my device does report a lack of fp64 support.
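
For anyone who wants to check their own file the same way, here is a small sketch that reads the two fields visible in the listing above. As said, I don't know which parameter the server actually looks at, so treating either a nonzero `double_fp_config` or a `cl_khr_fp64` entry in `extensions` as "reports FP64" is an assumption. The `<device>` wrapper element and the `reports_fp64` helper are made up for this example, and the sample is a trimmed stand-in rather than a real coproc_info.xml.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical fragment modelled on the fields quoted above.
# The <device> wrapper is an assumption, not the real coproc_info.xml layout.
SAMPLE = """
<device>
  <name>Intel(R) Iris(R) Xe Graphics [0x9a49]</name>
  <double_fp_config>0</double_fp_config>
  <extensions>cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_icd</extensions>
</device>
"""

def reports_fp64(xml_text):
    """True if the fragment advertises FP64 by either of the two fields."""
    node = ET.fromstring(xml_text)
    # In OpenCL, a double_fp_config of 0 means doubles are not supported.
    fp_config = int(node.findtext("double_fp_config", default="0"))
    extensions = (node.findtext("extensions") or "").split()
    return fp_config != 0 or "cl_khr_fp64" in extensions

print(reports_fp64(SAMPLE))
```

To try it on a real file you would have to locate the matching element in your own coproc_info.xml first, since the surrounding structure is not shown in the listing above.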

_________________________________________________________________________
