EM searches, BRP Radiopulsar and FGRP Gamma-Ray Pulsar

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250420824
RAC: 34987

Indeed, I disabled all app versions; there is no use for these now. The CUDA App needs fixing beyond simple configuration, and we got enough results from the OpenCL App versions. I'm currently focusing on validation. It looks like the OpenCL results from NVidia don't agree very well with those from AMD; this will also need a deeper look.

However, there's something more urgent that I have to work on first, with a deadline next Wednesday.

Thank you for your contributions, testing, reports and patience!

BM

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3945
Credit: 46716102642
RAC: 64308662

Thanks for the update, Bernd. Looking forward to the next round of testing :)

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3945
Credit: 46716102642
RAC: 64308662

Bernd Machenschalk wrote:

The CUDA App needs fixing beyond simple configuration

If you have time, reach out to Petri33. He's well versed in CUDA applications (he hand-wrote the SETI special CUDA app and successfully built a BRP4G CUDA app) and might be able to help get the CUDA app at least working. He doesn't really write Windows apps, but he knows a lot of the best practices for CUDA optimization and compiling that your team could translate to Windows, and at the very least you could get a nice Linux CUDA app going, if you are interested in that.

But is there a reason you are going the CUDA route for Windows/Nvidia, but OpenCL for Linux? Or are you just trying different configurations to test the waters before deciding which to stick with?

CUDA has speed advantages on Nvidia devices, with the caveat that new devices need a new application build (unless the app is built to include PTX versions of its kernels).

OpenCL will always incur some overhead on Nvidia in the translation from OpenCL to CUDA, but it has the advantage of being more portable across devices, so fewer application updates would be necessary. The overhead can at least be reduced with optimization, however.
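To make the PTX caveat concrete, here is a minimal sketch in Python of the general CUDA compatibility rule: precompiled machine code (a cubin) runs only on devices of the same major architecture with an equal or higher minor revision, while embedded PTX can be JIT-compiled by the driver for any device at or above the PTX target. The function name `can_run` and the list of build targets are hypothetical; I don't know which architectures the actual cuda55 app was built for, and the sketch ignores driver and toolkit support limits.

```python
from typing import Iterable, Tuple

Capability = Tuple[int, int]  # (major, minor), e.g. (3, 5) for sm_35


def can_run(device: Capability,
            cubins: Iterable[Capability],
            ptx: Iterable[Capability] = ()) -> bool:
    """Return True if a CUDA binary contains usable code for `device`."""
    # Precompiled machine code only runs on the same major architecture,
    # at an equal or higher minor revision.
    for major, minor in cubins:
        if major == device[0] and minor <= device[1]:
            return True
    # Embedded PTX can be JIT-compiled for any device whose compute
    # capability is at least the PTX target (tuples compare element-wise).
    return any(target <= device for target in ptx)


# Hypothetical build targets, chosen only for illustration.
old_build = [(2, 0), (3, 0), (3, 5)]

print(can_run((2, 1), old_build))                # True: sm_20 code runs on a 2.1 device
print(can_run((8, 6), old_build))                # False: no cubin for major version 8
print(can_run((8, 6), old_build, ptx=[(3, 5)]))  # True: PTX is JIT-compiled
```

So an app that ships only cubins needs a rebuild for each new GPU generation, while one that also embeds PTX can keep running on newer cards, at the cost of a JIT compile on first launch.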


Jim1348
Joined: 19 Jan 06
Posts: 463
Credit: 257957147
RAC: 0

Ian&Steve C. wrote:
If you have time, reach out to Petri33. he's well versed in CUDA applications (he hand wrote the SETI special CUDA app, and successfully built a BRP4G CUDA app) and might be able to help getting the CUDA app at least working.

If anyone knows how to contact him, he might be interested in the SiDock work also.  They could use his skills.

https://www.sidock.si/sidock/forum_thread.php?id=207&postid=1679#1679

mikey
Joined: 22 Jan 05
Posts: 12680
Credit: 1839082786
RAC: 3918

Jim1348 wrote:

Ian&Steve C. wrote:
If you have time, reach out to Petri33. he's well versed in CUDA applications (he hand wrote the SETI special CUDA app, and successfully built a BRP4G CUDA app) and might be able to help getting the CUDA app at least working.

If anyone knows how to contact him, he might be interested in the SiDock work also.  They could use his skills.

https://www.sidock.si/sidock/forum_thread.php?id=207&postid=1679#1679 

I believe he is or was a teammate of Keith Myers, but I don't know how much influence Keith has on what Petri wants to work on. I do agree, though, that EVERY project could use his skills to at least look into making things more efficient, but I also understand that could involve A LOT of work outside of whatever else he does.

Keith Myers
Joined: 11 Feb 11
Posts: 4963
Credit: 18711186236
RAC: 6353694

The last I heard from Petri was back in April, when he finished up the FGRPB1G "special sauce" "AIO" app he released to the Einstein general public.

He said he was going to start tackling the BRP source code. I know that he has been noodling with a custom BRP4G GPU application. You can see the results from his computers. Typical 200X speed improvement over the stock apps.

The validator for the BRP4G apps is very picky. It only likes results paired with like platforms, apps and OSes. I believe Bernd has mentioned that this is an issue they still haven't resolved well. And the current issues they are having with the beta BRP7 apps point to a long development time, I think.
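One general reason results from different platforms can fail to match is that floating-point addition is not associative, so the same sum evaluated in a different order (as different GPUs or apps may do) can differ in the last bits. The Python sketch below only illustrates that effect and a tolerance-based comparison; it is not the Einstein@Home validator, and the names `naive_sum` and `results_agree` as well as the tolerance value are made up for the example.

```python
import math
from typing import Sequence


def naive_sum(values: Sequence[float]) -> float:
    """Add values strictly left to right, with no error compensation."""
    total = 0.0
    for v in values:
        total += v
    return total


def results_agree(a: Sequence[float], b: Sequence[float],
                  rel_tol: float = 1e-5) -> bool:
    """Compare two result vectors element by element within a relative tolerance."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b)
    )


# The same three numbers, summed in two different orders.
forward = naive_sum([0.1, 0.2, 0.3])
backward = naive_sum([0.3, 0.2, 0.1])

print(forward == backward)                   # False: the last bit differs
print(results_agree([forward], [backward]))  # True: equal within tolerance
```

A bit-for-bit comparison would reject this pair, while a tolerance accepts it. How loose the tolerance can be before wrong results slip through is a question for the project.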

But I don't know whether he has started on anything related to BRP7 yet. I would expect that, unless the BRP7 tasks/apps are completely different from the BRP4G tasks/apps, the code he is currently developing would port over fairly fast.

So far he has only worked on one project/app at a time. I doubt he would split resources/time between two or more projects simultaneously. He has tended to take a project app as far as it can go before moving on to the next challenge. I believe BRP4G/BRP7 is his current focus.

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7220454931
RAC: 977844

The Einstein applications page:

https://einsteinathome.org/apps.php

currently shows under BRP7 a Linux BRP7-cuda55 application marked as having been created a bit over two hours ago.  

For days now the BRP7 section has been empty each time I looked, so this may indicate Bernd has found some time to get back to this particular matter.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3945
Credit: 46716102642
RAC: 64308662

I got a few dozen of them. All failed with the same errors seen on the Windows systems: first RSA checksum errors, then

Error launching CUDA TSM kernel

like before.

cuda55 is a bust, I think. Try a more recent CUDA version, or stick with the OpenCL version.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 3945
Credit: 46716102642
RAC: 64308662

Attempts with a cuda55-appropriate GTX 550 Ti failed in a similar fashion.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4312
Credit: 250420824
RAC: 34987

The curious thing is that the same tasks run successfully with the very same (Linux) app even on our modern GPU machines - in standalone mode, i.e. without the BOINC client.

The CUDA code of the BRP7 app hasn't changed a bit since BRP4(G) times, where it also worked perfectly. The build process is also unchanged (well, libz and binutils might have been updated, but that shouldn't be relevant here). All that changed is the CPU code (it generates a template bank in memory rather than reading it from a file).

We did re-compile the (Linux) app with CUDA 10 and CUDA 11 (the versions that we use on Atlas) and didn't see a speedup of more than 10%. That isn't worth losing compatibility for. And the Windows cross-build of the CUDA app depends on a couple of patches that I'm not at all sure would work with more recent versions of the headers, nvcc and cudart.

Of course, the BOINC version is a newer one; the one we used for the BRP4 App doesn't even build nowadays. That's more likely to be the reason.

BM
