EM searches, BRP Radiopulsar and FGRP Gamma-Ray Pulsar

Boca Raton Comm...

Joined: 4 Nov 15

Posts: 303

Credit: 11593905286

RAC: 14818671

1 Sep 2022 2:11:04 UTC

Message 200436 in response to message 200435

Thanks for the insight. I leave a few cores open to help prevent bottlenecking, which definitely helps. GPU memory shouldn't be an issue with our GPUs. I am still learning the details of GPUs, but do single-precision tasks use the same cores of the GPU as double-precision tasks? I understand that our Ampere cards have more active tensor cores; are those for double-precision tasks? Am I off base with this?

Ian&Steve C.

Joined: 19 Jan 20

Posts: 4159

Credit: 50341041542

RAC: 41854517

Boca Raton Community HS

1 Sep 2022 13:19:35 UTC

Message 200440 in response to message 200436

Boca Raton Community HS wrote:

Thanks for the insight. I leave a few cores open to help prevent bottlenecking, which definitely helps. GPU memory shouldn't be an issue with our GPUs. I am still learning the details of GPUs, but do single-precision tasks use the same cores of the GPU as double-precision tasks? I understand that our Ampere cards have more active tensor cores; are those for double-precision tasks? Am I off base with this?

The tensor cores are not the same as the FP cores. Tensor cores are specialized hardware for inferencing workloads like ML and AI. No BOINC project (yet) uses this hardware.

The GA102 die in your A6000 and in the higher-end GeForce 30-series cards (or any GA10x, really) ~~doesn't really have dedicated FP64 hardware. Pretty sure they just double up FP32 cores for that~~. But the higher-end Nvidia cards based on the GA100 core, like the A100, do have dedicated FP64 cores.

edit, correction:

The GA10x cards (GeForce 30x0, Ax000 "Quadro", etc.) have only 2 FP64 cores per SM. This is not depicted in most architecture diagrams, so I missed it and had to dig into the white paper to find it. With 128 FP32 cores per SM, that explains the 1:64 FP64:FP32 ratio in performance.

The GA100 (A100) cards, by contrast, have 32 FP64 cores and 64 FP32 cores per SM, for that nice 1:2 ratio in performance.

On GA10x they basically swapped out the FP64 cores for ray tracing cores, which GA100 does not have.

https://images.nvidia.com/aem-dam/en-zz/Solutions/geforce/ampere/pdf/NVIDIA-ampere-GA102-GPU-Architecture-Whitepaper-V1.pdf
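
For anyone who wants to check the ratio arithmetic, here is a minimal Python sketch. It uses only the per-SM core counts quoted in this post, and it assumes every FP32 and FP64 core retires the same number of operations per clock, which is a simplification. The names `CORES_PER_SM` and `fp64_to_fp32_ratio` are made up for illustration.

```python
from fractions import Fraction

# Per-SM core counts as quoted in the post above.
CORES_PER_SM = {
    "GA10x": {"fp32": 128, "fp64": 2},
    "GA100": {"fp32": 64, "fp64": 32},
}


def fp64_to_fp32_ratio(chip: str) -> Fraction:
    """Peak FP64:FP32 throughput ratio for one SM.

    Assumption: each core does the same work per clock, so the
    ratio of core counts equals the ratio of peak throughput.
    """
    cores = CORES_PER_SM[chip]
    return Fraction(cores["fp64"], cores["fp32"])


for chip in CORES_PER_SM:
    ratio = fp64_to_fp32_ratio(chip)
    print(f"{chip}: FP64:FP32 = {ratio.numerator}:{ratio.denominator}")
# GA10x: FP64:FP32 = 1:64
# GA100: FP64:FP32 = 1:2
```

Real-world throughput also depends on clocks, memory bandwidth, and how the application is written, so treat this as a peak-rate estimate only.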

Keith Myers

Joined: 11 Feb 11

Posts: 5063

Credit: 19325242173

RAC: 7531312

1 Sep 2022 3:09:18 UTC

Message 200441 in response to message 200436

Unless an application is coded specifically to use Tensor cores they won't be used.

archae86

Joined: 6 Dec 05

Posts: 3165

Credit: 7406601687

RAC: 1934878

1 Sep 2022 4:58:47 UTC

Message 200444 in response to message 200421

Bernd Machenschalk wrote:

It seems with v. 0.11 we now have a working CUDA version on Windows. The first step is to get something working at all, then to improve validation,...

let's see how validation goes.

I've run dozens of tasks on the Windows/AMD v0.11 code. Most of these WUs are old ones, with multiple tasks already run on various hosts with older versions of the application, and I've gotten no validations from them.

But I recently got a validation, and the fabulous news is that my quorum partner was Windows/Cuda55 (so Nvidia), also running v0.11.

As I had seen zero previous cases of successful validation for my AMD against Nvidia quorum partners of any description, this is a ray of hope on the validation front.

https://einsteinathome.org/workunit/667201533

Time will tell if this is a false dawn, or a harbinger of better times ahead.

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4350

Credit: 253898194

RAC: 35434

1 Sep 2022 6:44:46 UTC

Message 200447

Actually, the only validation we're interested in is the one among the four latest app versions (0.08 on Linux, 0.11 on Windows). All other app versions from which results may still be around (including, unfortunately, the Windows 0.08) use slightly different computation code that would likely prevent successful validation.

But it seems we're getting closer. So far we have 361 valid and only one invalid result from comparisons among these app versions, which is by far the best validation so far.
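
To put those counts in proportion, here is a small Python sketch of the arithmetic. The function name `valid_fraction` is made up for illustration; the only inputs are the 361 valid and 1 invalid results mentioned above.

```python
def valid_fraction(valid: int, invalid: int) -> float:
    """Fraction of compared results that validated."""
    total = valid + invalid
    if total == 0:
        raise ValueError("no compared results")
    return valid / total


rate = valid_fraction(361, 1)
print(f"{rate:.2%} valid")  # 99.72% valid
```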

Alexander Favorsky

Joined: 18 Jun 16

Posts: 36

Credit: 182361747

RAC: 69978

1 Sep 2022 7:28:27 UTC

Message 200448

Hi! I'm doing v0.11 on Windows right now and it has been stuck at 99.998% for almost an hour. I guess I'm gonna abort it if computation time exceeds 3 hours.

Ereignishorizont

Joined: 17 May 21

Posts: 19

Credit: 3137450195

RAC: 1752889

1 Sep 2022 9:22:11 UTC

Message 200450

Nearly all of the v0.08 WUs have massive problems on my computers. They start, but the GPU doesn't compute anything right from the beginning. The load is between 1% and 10%, mostly at 1%. Maybe that's the regular load caused by Linux.

Although they do show a little progress, I aborted them because they get stuck sooner or later.

Alexander Favorsky

Joined: 18 Jun 16

Posts: 36

Credit: 182361747

RAC: 69978

Alexander Favorsky

1 Sep 2022 10:13:24 UTC

Message 200451 in response to message 200448

Alexander Favorsky wrote:

Hi! I'm doing v0.11 on Windows right now and it has been stuck at 99.998% for almost an hour. I guess I'm gonna abort it if computation time exceeds 3 hours.

Have aborted the task at 4:13:12 of running time and still at 99.998%.

Maybe there's something wrong with v0.11 because v0.03 worked fine.
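
A rough Python sketch of the "abort if it stays stuck" rule of thumb described here. It works on a list of (elapsed seconds, fraction done) readings that you would have to collect yourself. The function names, the thresholds, and the sample readings are all assumptions for illustration; none of them come from BOINC or from a real task log.

```python
from typing import Sequence, Tuple

Sample = Tuple[float, float]  # (elapsed_seconds, fraction_done)


def stalled_seconds(samples: Sequence[Sample]) -> float:
    """Seconds for which the latest fraction_done value has not changed.

    samples must be in chronological order.
    """
    if not samples:
        return 0.0
    last_time, last_fraction = samples[-1]
    first_seen = last_time
    # Walk backwards while the reported progress is identical.
    for elapsed, fraction in reversed(samples):
        if fraction != last_fraction:
            break
        first_seen = elapsed
    return last_time - first_seen


def should_abort(samples: Sequence[Sample],
                 max_stall: float = 3600.0,
                 max_runtime: float = 3 * 3600.0) -> bool:
    """Abort once the task has run too long AND has been stuck too long."""
    if not samples:
        return False
    elapsed = samples[-1][0]
    return elapsed > max_runtime and stalled_seconds(samples) >= max_stall


# Illustrative readings only (not from a real log).
readings = [(0, 0.0), (3600, 0.5), (7200, 0.99998),
            (9000, 0.99998), (11000, 0.99998)]
print(stalled_seconds(readings))  # 3800
print(should_abort(readings))     # True
```

Requiring both conditions avoids killing a task that is merely slow but still making progress.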

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4350

Credit: 253898194

RAC: 35434

1 Sep 2022 11:50:00 UTC

Message 200454

0.12 is out; it should fix the memory issue reported by Petri33 (thanks!).

Hopefully this is the only reason for the "hang" problem.

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4350

Credit: 253898194

RAC: 35434

1 Sep 2022 12:27:00 UTC

Message 200456

FWIW, the memory issue that probably caused the "hang" problem has been in the BRP7 code all along, since version 0.01. Whether and when it is triggered depends on the data (i.e. the workunit), not so much on the app version.
