Double precision ...

G.L.I.S.

Joined: 27 Dec 08

Posts: 2

Credit: 41247458

RAC: 4087

20 Oct 2023 18:56:30 UTC

Topic 230228


Maybe I am just discovering hot water, but at least several GPUs do succeed at double-precision calculation (although the performance drops dramatically)...

mikey

Joined: 22 Jan 05

Posts: 12881

Credit: 1884391640

RAC: 129881


21 Oct 2023 4:23:51 UTC

Message 218383


G.L.I.S. wrote:

Maybe I am just discovering hot water, but at least several GPUs do succeed at double-precision calculation (although the performance drops dramatically)...

This is normal with the GPU tasks here at Einstein; the CPU does a lot of the finishing work and prepares the task to be sent back to the server. Nvidia GPUs have reduced their 64-bit to 32-bit throughput ratio dramatically over the years, and while AMD GPUs are better, they too have recently dropped their double-precision ratios. It seems games don't need it, and they are by far the bigger driver of GPU sales.

Mike Hewson

Moderator

Joined: 1 Dec 05

Posts: 6594

Credit: 334859798

RAC: 387317


21 Oct 2023 6:31:00 UTC

Message 218385


Broadly you'd think that for a given GPU, the ratio of single-precision speed to double-precision speed would be around 2:1, as it is on, say, a Fermi (or 3:1 on a Kepler). But most Nvidia consumer (i.e. gaming) GPUs are much worse, at 32:1 or even 64:1 on recent generations. Games just don't need a 64-bit solution; they don't even need a 33rd bit. Also, such implementations don't usually adhere to the IEEE standards for FP64, for instance in the handling of rounding and some fiddly exceptions. That is woeful for scientific aims.
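To make the precision gap concrete, here is a minimal sketch in plain Python. It is only an illustration, not anything taken from the E@H applications: the helper `to_fp32` and the function `accumulate` are made-up names, and single precision is emulated on the CPU by rounding every intermediate result to binary32 with the standard `struct` module.

```python
import struct


def to_fp32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


# binary32 carries a 24-bit significand, so 2**24 + 1 cannot be stored.
big = float(2**24)
fp64_result = big + 1.0           # representable in binary64
fp32_result = to_fp32(big + 1.0)  # rounds back down to 2**24


def accumulate(n: int, term: float, fp32: bool) -> float:
    """Add `term` to a running total n times, optionally rounding to FP32."""
    step = to_fp32(term) if fp32 else term
    total = 0.0
    for _ in range(n):
        total += step
        if fp32:
            total = to_fp32(total)  # emulate a single-precision accumulator
    return total


n = 100_000
err64 = abs(accumulate(n, 0.1, fp32=False) - 10_000.0)
err32 = abs(accumulate(n, 0.1, fp32=True) - 10_000.0)
print(f"2**24 + 1 -> FP64: {fp64_result:.1f}, FP32: {fp32_result:.1f}")
print(f"error after {n} additions of 0.1 -> FP64: {err64:.3e}, FP32: {err32:.3e}")
```

The first comparison shows a single addition that FP32 cannot represent; the second shows how rounding error builds up over a long accumulation, which is the kind of loop a search code runs constantly.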

Hence at E@H we search for candidates that are then given closer/conforming scrutiny after the GPU is finished. Some of that extra attention is on the user's CPU, and some is possibly done on the Atlas cluster at AEI after return of the work-unit results. Why this works at all as a search strategy is that we get to complete the examination of some subset of the parameter space in a reasonably finite time using the high parallelism of GPUs. (The technical detail on such issues is mind boggling.)
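The "fast screen first, careful check afterwards" idea can be sketched roughly as follows. This is a toy under loud assumptions: the signal, the template bank, the function `power_at` and the cut-off of 10 candidates are all invented for illustration, and the low-precision stage is emulated on the CPU rather than run on a GPU. It is not the E@H pipeline.

```python
import heapq
import math
import struct


def to_fp32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def power_at(freq: float, samples, low_precision: bool) -> float:
    """Toy detection statistic: spectral power of `samples` at `freq`."""
    rnd = to_fp32 if low_precision else float  # float() leaves doubles as-is
    re = im = 0.0
    for k, s in enumerate(samples):
        phase = rnd(2.0 * math.pi * freq * k)
        re = rnd(re + rnd(s * rnd(math.cos(phase))))
        im = rnd(im + rnd(s * rnd(math.sin(phase))))
    return rnd(re * re + im * im)


F_TRUE = 0.123  # made-up signal frequency, in cycles per sample
samples = [math.cos(2.0 * math.pi * F_TRUE * k) for k in range(256)]
templates = [i / 1000 for i in range(1, 500)]  # made-up template bank

# Stage 1: cheap low-precision screen of every template, keep the 10 best.
top10 = heapq.nlargest(10, templates, key=lambda f: power_at(f, samples, True))

# Stage 2: re-evaluate only those 10 in double precision and re-rank them.
verified = sorted(((power_at(f, samples, False), f) for f in top10), reverse=True)
best_power, best_freq = verified[0]
print(f"best candidate after FP64 follow-up: f = {best_freq:.3f}")
```

The point of the design is that the expensive, trustworthy arithmetic touches 10 templates instead of 499, while the bulk of the work runs in the cheaper format.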

Cheers, Mike

(edit) A related issue is how finely or coarsely we divide the parameter space that we study. An E@H work-unit comes from a relatively coarse-grained 'slice' of parameter space compared to the finer division that could be used, because we want to finish some inquiry of the parameter space in a reasonable time. So a real signal present in the detector data will give a lower (probabilistic) match statistic in a user's results than it would in, say, a follow-up examination of an interesting candidate by the Atlas cluster. This is called a hierarchical search. If we didn't have GPUs and their speed, then some searches would not be practicable at all!
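A hierarchical search of this kind can be sketched in a few lines. Everything here is hypothetical: the one-dimensional parameter, the Gaussian-shaped `match` function, and the values of `F_TRUE` and `WIDTH` are stand-ins chosen for illustration, not the statistic or grid spacing used by E@H.

```python
import math

F_TRUE = 0.3337  # made-up "true" signal parameter
WIDTH = 0.01     # made-up width of the match peak


def match(f: float) -> float:
    """Toy match statistic: 1.0 on the signal, falling off away from it."""
    return math.exp(-((f - F_TRUE) / WIDTH) ** 2)


def best_on_grid(start: float, step: float, count: int):
    """Return (match, parameter) for the best point of a regular grid."""
    return max((match(start + i * step), start + i * step) for i in range(count))


# Stage 1: coarse grid over the whole range [0, 1], 101 templates.
coarse_match, coarse_f = best_on_grid(0.0, 0.01, 101)

# Stage 2: fine grid only around the coarse candidate, 201 templates.
fine_match, fine_f = best_on_grid(coarse_f - 0.01, 0.0001, 201)

print(f"coarse: f = {coarse_f:.4f}, match = {coarse_match:.3f}")
print(f"fine:   f = {fine_f:.4f}, match = {fine_match:.3f}")
```

The coarse stage recovers the signal with a reduced match because no grid point sits exactly on it; the fine stage then costs 201 extra evaluations instead of the 10001 a fine grid over the whole range would need.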

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

tullio

Joined: 22 Jan 05

Posts: 2118

Credit: 61407735

RAC: 0


21 Oct 2023 13:15:29 UTC

Message 218392


I have been running LATeah GPU tasks on an Intel i5 CPU with its integrated GPU. Slow, but no errors or invalid tasks.

Tullio

Scrooge McDuck

Joined: 2 May 07

Posts: 1114

Credit: 18706253

RAC: 10825


23 Oct 2023 9:37:45 UTC

Message 218452 in response to message 218392


tullio wrote:

I have been running LATeah GPU tasks on an Intel i5 CPU with its integrated GPU. Slow, but no errors or invalid tasks.

If I remember Bernd's last explanations correctly, the FGRPB1G search (LATeah on GPU) does not use FP64 calculations in the main part of the analysis, but only in the final follow-up of the top 10 candidates, which is done on the CPU for Intel iGPUs. With BRP7, the main part of the analysis requires FP64, which is why there is no science app for iGPUs.

I can only underline what Mike Hewson explained in detail. FP64 floating point on a GPU does not by itself mean reliable arithmetic in the scientific sense, i.e. arithmetic that returns exactly the same result regardless of GPU type (rounding errors, exceptions, etc.). That is what the FPU of a CPU provides, namely compliance with IEEE 754.
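One well-known way for results to differ between devices, offered here as an illustration rather than as a description of any particular GPU, is that floating-point addition is not associative: the same numbers summed in a different order can round differently. The sketch below uses ordinary Python doubles; `sequential_sum` and `tree_sum` are made-up names standing in for a serial loop and a pairwise parallel reduction.

```python
import math


def sequential_sum(xs):
    """Add left to right, as a simple serial loop would."""
    total = 0.0
    for x in xs:
        total += x
    return total


def tree_sum(xs):
    """Add neighbouring pairs repeatedly, as a parallel reduction might."""
    xs = list(xs)
    while len(xs) > 1:
        paired = [xs[i] + xs[i + 1] for i in range(0, len(xs) - 1, 2)]
        if len(xs) % 2:
            paired.append(xs[-1])  # carry an unpaired last element forward
        xs = paired
    return xs[0]


values = [1e16, 1.0, -1e16, 1.0]  # the mathematically exact sum is 2

print("sequential:", sequential_sum(values))
print("tree:      ", tree_sum(values))
print("math.fsum: ", math.fsum(values))  # correctly rounded reference

# The same effect at a smaller scale:
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))
```

All three sums use the same 64-bit format and differ only in the order of operations, which is why projects that compare results across many different machines need either tolerances or a final check done in one fixed, well-defined way.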

StephieDolores

Joined: 23 Dec 24

Posts: 6

Credit: 13332

RAC: 1046


2 Feb 2025 20:46:55 UTC

Message 232812 in response to message 218452


Scrooge McDuck wrote:

tullio wrote:

I have been running LATeah GPU tasks on an Intel i5 CPU with its integrated GPU. Slow, but no errors or invalid tasks.

If I remember Bernd's last explanations correctly, the FGRPB1G search (LATeah on GPU) does not use FP64 calculations in the main part of the analysis, but only in the final follow-up of the top 10 candidates, which is done on the CPU for Intel iGPUs. With BRP7, the main part of the analysis requires FP64, which is why there is no science app for iGPUs.

I can only underline what Mike Hewson explained in detail. FP64 floating point on a GPU does not by itself mean reliable arithmetic in the scientific sense, i.e. arithmetic that returns exactly the same result regardless of GPU type (rounding errors, exceptions, etc.). That is what the FPU of a CPU provides, namely compliance with IEEE 754.

After so much searching, someone finally explained it properly.

Thank you from the bottom of my heart.
