Important news on BRP7 and FGRPB1 work on E@H

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3945

Credit: 46627192642

RAC: 64186629

28 Sep 2023 15:03:18 UTC

Message 217592

That might explain it. I know the Arc series of cards does not have hardware FP64 support, and maybe most of Intel's iGPUs lack it as well?

Is there a mechanism in the code to fall back to the CPU for the FP64 parts, or is that supposed to be handled by the driver?

_________________________________________________________________________

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4312

Credit: 250375056

RAC: 35277

28 Sep 2023 17:54:00 UTC

Message 217599

The plan class (BRP7-opencl-intel_gpu) is set to require fp64 support, which means to get such "work" the BOINC client must (erroneously) report that the device supports that. That's a bad bug in the client then.

In FGRP, double precision is required only in the first stage, before the FFT. The app does this on the CPU if the GPU lacks double precision support. In BRP7, double precision is used deeper in the computation; there would not be much left for the GPU to do if you did that part on the CPU. BRP4 is all single precision, though.

In the early days of GPU computing NVidia could emulate double precision computation in software (using two floats IIRC), but that was CUDA only.
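As an illustration of the "two floats" idea mentioned above, here is a minimal Python sketch of float-float addition, where a value is stored as an unevaluated sum of two single-precision numbers. It is not taken from any NVidia driver or from the Einstein@Home apps; the helper names (f32, two_sum, split, ff_add) are hypothetical, and single precision is simulated by rounding through struct.

```python
import struct

def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def two_sum(a: float, b: float):
    """Knuth's TwoSum in simulated float32: s = fl(a + b), and s + e == a + b."""
    s = f32(a + b)
    bb = f32(s - a)
    e = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, e

def split(x: float):
    """Represent a double as a (hi, lo) pair of float32 values."""
    hi = f32(x)
    lo = f32(x - hi)
    return hi, lo

def ff_add(x, y):
    """Add two (hi, lo) pairs using only float32-rounded operations."""
    s, e = two_sum(x[0], y[0])        # exact sum of the high parts
    e = f32(e + f32(x[1] + y[1]))     # fold in the low parts
    hi = f32(s + e)                   # renormalise so that |lo| is tiny vs |hi|
    lo = f32(e - f32(hi - s))
    return hi, lo

def to_double(p):
    """Collapse a (hi, lo) pair back into one double for comparison."""
    return p[0] + p[1]

exact = 0.1 + 0.2                                  # reference in double precision
naive = f32(f32(0.1) + f32(0.2))                   # plain single precision
paired = to_double(ff_add(split(0.1), split(0.2))) # float-float arithmetic

print("single precision error:", abs(naive - exact))
print("float-float error:     ", abs(paired - exact))
```

The pair carries roughly twice the mantissa bits of a single float, so the second error printed is many orders of magnitude smaller than the first, at the cost of several single-precision operations per addition.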

Keith Myers

Joined: 11 Feb 11

Posts: 4963

Credit: 18701699043

RAC: 6265491

28 Sep 2023 19:00:29 UTC

Message 217601

Intel Arc GPUs don't support hardware accelerated FP64 calculations. However, they do support emulated FP64 calculations for niche cases. This is because Intel removed FP64 hardware support in their 12th generation GPU architecture. Instead, they added code to the driver to emulate FP64 calculations with software instructions.

Boca Raton Comm...

Joined: 4 Nov 15

Posts: 238

Credit: 10518185586

RAC: 27095247

28 Sep 2023 19:16:09 UTC

Message 217603

This might be a rookie question, but does only the OpenCL version do this ("app does this on the CPU if the GPU lacks double precision support")? Does the CUDA version of the app use FP64 on the GPU?

Keith Myers

Joined: 11 Feb 11

Posts: 4963

Credit: 18701699043

RAC: 6265491

28 Sep 2023 21:14:36 UTC

Message 217610 in response to message 217603

Bernd's reply applies only to the FGRPB1G application. It does not apply to the BRP7 applications.

All Nvidia cards (AFAIK) have always had hardware-implemented FP64 calculation capability.

So it's not necessary to worry about shifting FP64 calcs from the GPU to the CPU. (At least, if you're happy with GPU FP64 precision.)

Scrooge McDuck

Joined: 2 May 07

Posts: 1052

Credit: 17869688

RAC: 12142

28 Sep 2023 21:35:48 UTC

Message 217611 in response to message 217599

Bernd Machenschalk wrote:

The plan class (BRP7-opencl-intel_gpu) is set to require fp64 support, which means to get such "work" the BOINC client must (erroneously) report that the device supports that. That's a bad bug in the client then.

I have an older Core i7-4770 with Intel GPU. This host received lots of BRP7 GPU tasks for the new beta BRP7 app for Intel GPU some days ago. They all errored out immediately.

This host has successfully finished FGRPB1G GPU tasks for years (they also validate). The log file of each FGRPB1G task always states the following:

Using OpenCL device "Intel(R) HD Graphics 4600" by: Intel(R) Corporation
Max allocation limit: 340158054
Global mem size: 1360632218
Error in OpenCL context: Build program failure.
OpenCL compiling FAILED! : -11 . Error message: :10:30: error: use of type 'double' requires cl_khr_fp64 extension to be enabled
 
error: front end compiler failed build.
OpenCL device has no FP64 support

Clearly this iGPU doesn't support FP64, yet it successfully crunched FGRPB1G GPU tasks.

Nevertheless, a few days ago it received lots of BRP7 tasks for the first time, and they all errored out immediately. Why does this host get BRP7 tasks if it's unfit for them? There's only the Intel GPU. No Nvidia or ATI/AMD card.

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5872

Credit: 117478790537

RAC: 35434673

28 Sep 2023 21:33:57 UTC

Message 217612 in response to message 217581

Gary Roberts wrote:

TRAPPIST-713 wrote:
https://einsteinathome.org/host/12832734
.... GPU tasks are being completely starved of CPU support.

Out of interest since there hasn't been a response, I took a quick look at the 4 most recently completed BRP7 tasks, all returned 28th Sep UTC. The run times (and status) were:-

27,365 secs - valid
26,564 secs - valid
13,942 secs - validate error
1,269 secs - validate error

Looks like the GPU is throwing a hissy fit now that it's being pushed to actually do some work. The OP should check for cooling or perhaps voltage/frequency issues.

Cheers,
Gary.

Keith Myers

Joined: 11 Feb 11

Posts: 4963

Credit: 18701699043

RAC: 6265491

28 Sep 2023 21:40:15 UTC

Message 217613 in response to message 217611

Re-read your own quote of Bernd's reply for your answer. The problem is the BOINC client mis-identifying a gpu as FP64 capable.

There is nothing that Bernd can do with the BRP7 application other than reconfiguring the server scheduler to not send any BRP7 task to any Intel gpu.

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5872

Credit: 117478790537

RAC: 35434673

28 Sep 2023 22:06:00 UTC

Message 217615 in response to message 217576

With regard to the validation performance of the Linux v0.17 BRP7 app, yesterday I posted the following:-

Gary Roberts wrote:

... As of right now (running single tasks) it has completed and returned 25 tasks. The results are:-

Pending = 10
Valid = 10
Invalid = 1
Error = 0
Inconclusive = 4

In my experience, because Linux is the minor player, a big fraction of inconclusive results in a Windows/Linux comparison eventually translate to invalid for the Linux machine.

Here is an updated set of results (52 tasks returned) just to check if the initial trend is changing in any way:-

Pending = 19
Valid = 23
Invalid = 2
Error = 0
Inconclusive = 8

The trend is continuing - all numbers are increasing at much the same rate.
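To make the "much the same rate" observation concrete, here is a small Python sketch that turns the two sets of counts quoted above into shares of the tasks returned. The variable and function names are only illustrative; the numbers are the ones from the two snapshots.

```python
# Counts copied from the two snapshots in this post (25 and 52 tasks returned).
first = {"Pending": 10, "Valid": 10, "Invalid": 1, "Error": 0, "Inconclusive": 4}
second = {"Pending": 19, "Valid": 23, "Invalid": 2, "Error": 0, "Inconclusive": 8}

def shares(counts):
    """Return each category as a fraction of all returned tasks."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}

first_shares = shares(first)
second_shares = shares(second)

# Print the share of each category in both snapshots, side by side.
for name in first:
    print(f"{name:<13}{first_shares[name]:7.1%}{second_shares[name]:7.1%}")
```

Invalid stays at about 4% and inconclusive at about 15 to 16% in both snapshots, which is what "increasing at much the same rate" amounts to.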

This doesn't seem any better than an earlier test I ran on a completely different machine using the v0.15 app.

Cheers,
Gary.

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3945

Credit: 46627192642

RAC: 64186629

28 Sep 2023 22:12:45 UTC

Message 217616 in response to message 217599

Bernd Machenschalk wrote:

The plan class (BRP7-opencl-intel_gpu) is set to require fp64 support, which means to get such "work" the BOINC client must (erroneously) report that the device supports that. That's a bad bug in the client then.

I'd be interested to see what Scrooge's coproc_info.xml shows. I'm not sure which parameter exactly the project server is looking at, but in my coproc file it lists:

Quote:

      <name>Intel(R) Iris(R) Xe Graphics [0x9a49]</name>
      <vendor>Intel(R) Corporation</vendor>
      <vendor_id>32902</vendor_id>
      <available>1</available>
      <half_fp_config>63</half_fp_config>
      <single_fp_config>63</single_fp_config>
      <double_fp_config>0</double_fp_config>
      <endian_little>1</endian_little>
      <execution_capabilities>1</execution_capabilities>
      <extensions>cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_intel_command_queue_families cl_intel_subgroups cl_intel_required_subgroup_size cl_intel_subgroups_short cl_khr_spir cl_intel_accelerator cl_intel_driver_diagnostics cl_khr_priority_hints cl_khr_throttle_hints cl_khr_create_command_queue cl_intel_subgroups_char cl_intel_subgroups_long cl_khr_il_program cl_intel_mem_force_host_memory cl_khr_subgroup_extended_types cl_khr_subgroup_non_uniform_vote cl_khr_subgroup_ballot cl_khr_subgroup_non_uniform_arithmetic cl_khr_subgroup_shuffle cl_khr_subgroup_shuffle_relative cl_khr_subgroup_clustered_reduce cl_intel_device_attribute_query cl_khr_suggested_local_work_size cl_intel_spirv_media_block_io cl_intel_spirv_subgroups cl_khr_spirv_no_integer_wrap_decoration cl_intel_unified_shared_memory cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_intel_planar_yuv cl_intel_packed_yuv cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_3d_image_writes cl_intel_media_block_io cl_intel_va_api_media_sharing cl_intel_sharing_format_query cl_khr_pci_bus_info cl_intel_subgroup_local_block_io </extensions>

I'm not questioning my own errored tasks, since I forced its hand with a custom app_info to get the tasks and override any such checks. But you can see my device does report a lack of FP64 support.
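For anyone who wants to check their own file the same way, here is a minimal Python sketch that reads a device fragment like the one quoted above and tests the two fields that indicate FP64 support: a non-zero double_fp_config, or cl_khr_fp64 in the extensions list. Which of these the project server actually evaluates is not known here, so treating either one as sufficient is an assumption. The <device> wrapper element and the shortened extensions list are also made up for the example.

```python
import xml.etree.ElementTree as ET

# Shortened fragment modelled on the listing above. The <device> wrapper is
# hypothetical and only there so the fragment parses as a single element.
FRAGMENT = """
<device>
  <name>Intel(R) Iris(R) Xe Graphics [0x9a49]</name>
  <single_fp_config>63</single_fp_config>
  <double_fp_config>0</double_fp_config>
  <extensions>cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_icd</extensions>
</device>
"""

def reports_fp64(device_xml: str) -> bool:
    """True if the device description claims double precision support.

    Assumption: either a non-zero double_fp_config or the cl_khr_fp64
    extension counts as a claim of FP64 support.
    """
    dev = ET.fromstring(device_xml)
    double_cfg = int(dev.findtext("double_fp_config", default="0"))
    extensions = dev.findtext("extensions", default="").split()
    return double_cfg != 0 or "cl_khr_fp64" in extensions

print(reports_fp64(FRAGMENT))
```

With the values from this listing (double_fp_config of 0 and no cl_khr_fp64 entry) the check comes out false, matching the conclusion that this device reports no FP64 support.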
