EM searches, BRP Radiopulsar and FGRP Gamma-Ray Pulsar

6dj72cn8
Joined: 24 Jan 06
Posts: 24
Credit: 13,286,415
RAC: 0

Thank you for the explanation, Bernd. As my Mac seems happy enough crunching (or perhaps only nibbling at) GRPB1G, I didn't realise the situation was different for more recent apps.

mikey
Joined: 22 Jan 05
Posts: 9,772
Credit: 1,342,359,276
RAC: 1,044,631

Bernd Machenschalk wrote:

6dj72cn8 wrote:

A Mac app might be ready by Christmas perhaps?

Apple dropped support for OpenCL and issued its last box with an NVIDIA GPU what, ten years ago? Unless some enthusiastic volunteer shows up who wants to port the app to Metal and Apple Silicon GPUs, I don't see that happening at all. As far as E@H is concerned, GPU computing on Apple is dead. We simply don't have the manpower to dive into Apple's own universe.

That sounds like a call to Richard Haselgrove and his contacts might be in order: first to see if it's feasible, and second to see if the time and energy are worth it for the potential number of users who would use it. I hate to see users excluded, but time marches on and technology changes...

Ian&Steve C.
Joined: 19 Jan 20
Posts: 2,572
Credit: 21,140,601,886
RAC: 36,722,099

I think Bernd was referencing the newer Macs with their own silicon (M1, M2, etc.), not the older x86-based Macs. really old x86 Macs and macOS versions could support nvidia GPUs, and the more recent ones only supported AMD.

the x86 AMD OpenCL app could likely be compiled for macOS with little porting necessary. but I wouldn't bother with an nvidia app.

and as Bernd says, there's no chance for any OpenCL Apple silicon app for the newer macs.

_________________________________________________________________________

Boca Raton Community HS
Joined: 4 Nov 15
Posts: 68
Credit: 3,733,840,984
RAC: 12,264,155

Bernd Machenschalk wrote:

FAST data was and still is in planning and discussion; however, I haven't seen a bit of real data yet, not even simulated. I don't know whether we'll get it at all, or when, or what it will look like. We needed to develop and change our pre-processing and application code quite a bit for the MeerKAT data, so I really don't know whether we could process FAST data with the same pipeline.

Got it, thanks!

Ian&Steve C. wrote:

the tensor cores are not the same as FP cores. tensor cores are specialized hardware for inferencing workloads like ML and AI. No BOINC project (yet) uses this hardware. 
 

GA102 dies like the one in your A6000 or higher-end GeForce 30-series cards (or any GA10x really) don’t really have dedicated FP64 hardware. Pretty sure they just double up FP32 cores for that. But the higher-end Nvidia cards based on the GA100 core, like the A100, do have dedicated FP64 cores.

edit, correction:

the GA10x (GeForce 30x0, Ax000 "Quadro", etc.) cards have only 2 FP64 cores per SM. this is not depicted in most architecture diagrams, so I missed it and had to dig into the white paper to find it. with 128 FP32 cores/SM, that explains why there's a 1:64 ratio in performance.

while the GA100 (A100) cards have 32 FP64 cores per SM and 64 FP32 cores per SM, for that nice 1:2 ratio in performance.

they basically swapped out the FP64 cores for the Ray Tracing cores on GA10x, which are not present on GA100.
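
A quick check of the per-SM arithmetic in the correction above, as a minimal Python sketch. The core counts are the ones quoted in the post; the dictionary and function names are hypothetical, and the result is only the theoretical core-count ratio, not a measured throughput.

```python
from fractions import Fraction

# Per-SM core counts as quoted in the post above.
CORES_PER_SM = {
    "GA10x": {"fp32": 128, "fp64": 2},   # GeForce 30x0, Ax000 cards
    "GA100": {"fp32": 64, "fp64": 32},   # A100
}


def fp64_to_fp32_ratio(chip: str) -> Fraction:
    """Return the FP64:FP32 core-count ratio per SM for the given chip."""
    cores = CORES_PER_SM[chip]
    return Fraction(cores["fp64"], cores["fp32"])


for chip in CORES_PER_SM:
    ratio = fp64_to_fp32_ratio(chip)
    print(f"{chip}: FP64:FP32 = {ratio.numerator}:{ratio.denominator}")
```

Using `Fraction` keeps the ratios exact, so they reduce to the same 1:64 and 1:2 figures given in the post.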
 

https://images.nvidia.com/aem-dam/en-zz/Solutions/geforce/ampere/pdf/NVIDIA-ampere-GA102-GPU-Architecture-Whitepaper-V1.pdf

I just saw this edit; this makes sense. Thanks for digging into it. So, and correct me if I am wrong, the GA100 would be fantastic with the MeerKAT data and not great at work like gamma-ray pulsar binary search #1 or any other GPU work on E@H? Does the project code have to be different to work with/take advantage of these FP64 cores?

Ian&Steve C.
Joined: 19 Jan 20
Posts: 2,572
Credit: 21,140,601,886
RAC: 36,722,099

The A100 (GA100 core) is good at everything. it's the fastest card available now, but also costs like $10,000/ea. well maybe the H100 is faster, but not really available.

 

the new app isn't totally dependent on FP64 speed, but seems to have some component of FP32 and also seems to scale well with memory bandwidth. the A100 excels in all three.

 

 

_________________________________________________________________________

TRAPPIST-713
Joined: 13 May 20
Posts: 6
Credit: 1,348,021,004
RAC: 2,350,547

(MeerKAT) v0.12 runs extremely slowly on my AMD Radeon RX 580: the run time is 29,256 s and GPU utilization is close to 0%.

https://einsteinathome.org/task/1348804697

Gamma-ray pulsar binary search #1 on GPUs v1.22 (FGRPopencl1K-ati) runs ~ 520 s on the same computer.

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6,400
Credit: 206,678,736
RAC: 73,129

Well, for this rig at least, the Binary Radio Pulsar Search (MeerKAT) v0.13 (BRP7-opencl-nvidia) x86_64-pc-linux-gnu has been working a treat since 4th September: consistently validating against Windows, ATI and the v0.12 (BRP7-cuda55) for that matter. It has the usual 1 - 2 invalids per day, i.e. about 5%, so good job! Maybe not too much to adjust on your return after all? ;-)

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

mikey
Joined: 22 Jan 05
Posts: 9,772
Credit: 1,342,359,276
RAC: 1,044,631

TRAPPIST-713 wrote:

(MeerKAT) v0.12 runs extremely slowly on my AMD Radeon RX 580: the run time is 29,256 s and GPU utilization is close to 0%.

https://einsteinathome.org/task/1348804697

Gamma-ray pulsar binary search #1 on GPUs v1.22 (FGRPopencl1K-ati) runs ~ 520 s on the same computer.

I also have an AMD 580, and it's doing 1,140 seconds with a CPU time of 186 seconds for 3,333 credits. Mine is also running Windows. I would suggest you stop all other CPU tasks and see if yours speeds up; it could be running out of CPU resources if it's showing close to 0% GPU utilization.
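
To put the two RX 580 reports side by side, here is a rough Python sketch. The run times are the ones posted above; the dictionary keys and the `slowdown` helper are hypothetical, and it assumes both figures refer to the same MeerKAT app, which the posts don't state outright.

```python
# Run times in seconds as reported in the posts above.
REPORTED_RUNTIME_S = {
    "TRAPPIST-713 (RX 580)": 29_256,
    "mikey (RX 580)": 1_140,
}


def slowdown(slow_s: float, reference_s: float) -> float:
    """Return how many times longer slow_s is than reference_s."""
    if reference_s <= 0:
        raise ValueError("reference run time must be positive")
    return slow_s / reference_s


factor = slowdown(
    REPORTED_RUNTIME_S["TRAPPIST-713 (RX 580)"],
    REPORTED_RUNTIME_S["mikey (RX 580)"],
)
print(f"about {factor:.1f}x slower on the same GPU model")
```

A gap of that size between the same GPU model suggests a configuration or resource problem on the slow host rather than a limit of the card itself.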

 

 

stfn
Joined: 7 Jun 21
Posts: 3
Credit: 27,048,285
RAC: 46,724

I have an ASUS Mining GPU with 4 GB of RAM, which I bought as an RX 470 but which reports itself as an RX 570. For the MeerKAT tasks the runtime is around 1,450 s with CPU time around 120 s. I also had a few outlier tasks that completed in around 430 s. Here are my stats for those curious.

Jim1348
Joined: 19 Jan 06
Posts: 463
Credit: 257,440,532
RAC: 22,092

I am beginning to like this. 

I am beginning to like the 0.12 (BRP7-cuda55).  The first one ran for 1,289 seconds on my GTX 1650 Super under Win10.

But I especially like the fact that it used less than 7% of a CPU core (Ryzen 5700X), and the current one less than 4% when I reserve two cores for it.  And the card is only at 61C, which makes it feasible for the summer.

As for valids, it looks like they will take care of themselves, though only against Windows and Nvidia thus far. (Maybe Bernd limited them to that?)

https://einsteinathome.org/host/12871756/tasks/4/0

 
