ABP1 CUDA applications

[B^S] Elphidieus
Joined: 20 Feb 05
Posts: 222
Credit: 12,347,408
RAC: 0

Curious... Just noticed the

Curious... Just noticed the option for ABP Search (SP) on my Preferences. Am I to assume it's referring to the CUDA App...?

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,515
Credit: 420,696,603
RAC: 238,892

RE: Curious... Just noticed

Message 95628 in response to message 95627

Quote:
Curious... Just noticed the option for ABP Search (SP) on my Preferences. Am I to assume it's referring to the CUDA App...?

Not exactly.

The two options for "Arecibo Pulsar Search" and "ABP Search (SP)" refer to two different searches, ABP1 and ABP2.

Currently only ABP1 work is distributed; there are ABP1 apps for both the CPU and the GPU (CUDA). If you want to continue crunching ABP1 on the CPU but would prefer not to crunch it on the GPU, deselect the "Use GPU" option on the same configuration screen.

While the ABP1 search does some of its processing in single precision (e.g. the part that is also executed on the GPU in the CUDA app), other parts are still done in double precision. The new ABP2 apps will do nearly everything in single precision, which will also make it possible to put more load on the GPU with high efficiency.

CU
Bikeman

[B^S] Elphidieus
Joined: 20 Feb 05
Posts: 222
Credit: 12,347,408
RAC: 0

RE: RE: Curious... Just

Message 95629 in response to message 95628

Quote:
Quote:
Curious... Just noticed the option for ABP Search (SP) on my Preferences. Am I to assume it's referring to the CUDA App...?

Not exactly.

The two options for "Arecibo Pulsar Search" and "ABP Search (SP)" refer to two different searches, ABP1 and ABP2.

Currently only ABP1 work is distributed; there are ABP1 apps for both the CPU and the GPU (CUDA). If you want to continue crunching ABP1 on the CPU but would prefer not to crunch it on the GPU, deselect the "Use GPU" option on the same configuration screen.

While the ABP1 search does some of its processing in single precision (e.g. the part that is also executed on the GPU in the CUDA app), other parts are still done in double precision. The new ABP2 apps will do nearly everything in single precision, which will also make it possible to put more load on the GPU with high efficiency.

CU
Bikeman

Nah... I'm not too worried about CUDA apps; Macs can't even support CUDA. Just curious about the new SP option.

Any idea when the new ABP2 search will be implemented...?

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,515
Credit: 420,696,603
RAC: 238,892

RE: Any idea when will the

Message 95630 in response to message 95629

Quote:

Any idea when the new ABP2 search will be implemented...?

According to Oliver's message here, probably in the next 1-2 weeks. I guess the road map will be something like:

- Beta-Test for ABP2 CPU app
- Beta-Test for ABP2 CUDA app
- Release of ABP2 CPU app
- Release of ABP2 CUDA app

CU
Bikeman

Olaf
Joined: 16 Sep 06
Posts: 26
Credit: 190,763,630
RAC: 0

RE: CUDA Beta App testers

Quote:


CUDA Beta App testers should drain their work cache and switch back to the normal project work.

BM

This looks like an opt-out condition ;o) The beta version worked on my computers. First I simply removed the app_info.xml (after I had finished all work and stopped BOINC). But starting it again resulted in the message that BOINC was too old (installed just one month ago) to get work from Einstein@Home. Surprising, because other computers without GPUs still run BOINC 5.10 and get work from Einstein@Home. (I cannot update those, because the home directory is shared between several computers, which seems to stop newer versions of BOINC from working, even though they are installed in different directories for each computer.)

I switched to BOINC 6.10 for this new computer, and the result is that the GPU gets no work anymore because of the driver/CUDA 2.3 condition - BOINC was not able to detect the correct driver version and obviously uses CUDA 2.2. Well, no big problem, because there is not much difference anyway whether it crunches with or without the GPU. Now the CPUs crunch ABP again. The installed driver already comes straight from NVIDIA, not the one Debian indicates as the current stable version.

Because I do not want to install new experimental drivers every month on an otherwise stable computer, this Core i7 crunches without the GPU again for now. For another new notebook I think I will keep crunching with this app_info.xml, simply because it worked without problems, and for that notebook's GPU the ABP tasks are fine (it looks like the GPU is already too small or too slow for GPUGrid ;o)

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6,018
Credit: 96,796,329
RAC: 145,458

RE: RE: RE: ...And hey,

Message 95632 in response to message 95626

Quote:
Quote:
Quote:
...And hey, don't tell me that I can leave the project if I don't like it. This is the most antisocial attitude I've heard of. So if you don't like to share the resources of this planet with others, maybe it's time for you to leave it!

i agree with that !!


No one told anyone to leave the project. XJR-Maniac was just quite rude without reason.

If you don't want Einstein CUDA tasks, deselect them in your preferences. Nothing easier than that.


Yes, I too am unsure how that deduction was made from Gary's post. :-)

In any case the primary requirement for CUDA to yield significant benefit is that the problem must lend itself to massive parallelism ( ideally thousands of threads, plus other restrictions ). This is a basic reason ( along with issues like compiler technology, of course ) for the variable success of CUDA apps.

The development here at E@H is quite cautious, with a considerable user pool feeding back via beta testing. CUDA is no exception. While not always successful ( a failure outcome is within the definition of testing ), one hopes to be able to generalise productively beyond the test participants. One can opt out of CUDA if it doesn't fly well enough. In fact that is likely to be a common response from those whose hardware is unsuitable for optimal CUDA use. Alas, as Oliver pointed out, without changes to the BOINC code ( which is not under E@H control ), a default opt-out setting was/is not available.

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter. Blaise Pascal

Jim Kleine
Joined: 25 Jan 06
Posts: 3
Credit: 1,053,126
RAC: 0

Hmmm ... the CUDA app also

Hmmm ... the CUDA app also seems to have a similar problem to GPUGrid, in that it fails to correctly detect CUDA hardware on a Windows host that has been connected to via Remote Desktop and has not yet been accessed again from the console:

6.10.17

Activated exception handling...
[20:00:57][4508][INFO ] Starting data processing...
[20:00:57][4508][INFO ] Using CUDA device #0 "Device Emulation (CPU)" (518.40 GFLOPS)
[20:00:57][4508][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...

[20:00:58][4508][WARN ] Couldn't allocate 25165824 bytes of CUDA pinned host memory for resampled time series! Trying fallback...
[20:00:58][4508][WARN ] Couldn't allocate 25165832 bytes of CUDA pinned host memory for resampled time series FFT! Trying fallback...
[20:00:58][4508][ERROR] Error allocating CUDA device memory: 25165832 bytes (no CUDA-capable device is available)
[20:00:58][4508][ERROR] Demodulation failed (error: 3)!
20:00:58 (4508): called boinc_finish

The GPU in this host is a 512MB 9800GT:

6.10.17

# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce 9800 GT"
# Clock rate: 1.62 GHz
# Total amount of global memory: 536543232 bytes
# Number of multiprocessors: 14
# Number of cores: 112

GPUGrid work units that are already running will continue to run without issue across a Remote Desktop session, but no new work units can be started until you have logged in again from the console. This may be related to GPUGrid seemingly using a deprecated function to test for the presence of CUDA hardware; I'm not sure whether you have the same problem.

My current workaround for GPUGrid is to suspend work fetch when I am expecting to be away and to suspend all tasks except the currently active one. This strategy, of course, regularly results in GPU idle time.

Unfortunately, the "stealth" release of the CUDA version caught me unawares and resulted in a swag of errored work units within the space of ~90 seconds (about five hours ago, while I was out) due to this problem and that, in turn, has reduced this host's quota to 2/day, so it seems I won't be doing much Einstein work on this host (CPU or GPU) for some time :-(

In an ideal world (a) the CUDA hardware would continue to be available despite the Remote Desktop video driver having been invoked and (b) if a work unit fails for this reason, the queue of GPU WUs needs to be paused, since once the first has failed, all the others in the queue *are* going to suffer a similar fate.

Gundolf Jahn
Joined: 1 Mar 05
Posts: 1,079
Credit: 341,280
RAC: 0

That has nothing to do with

Message 95634 in response to message 95633

That has nothing to do with any project's application and everything to do with Remote Desktop and how Microsoft handles the graphics drivers.

There have been plenty of threads at SETI and BOINC dev about that topic.

Gruß,
Gundolf

Computer sind nicht alles im Leben. (Kleiner Scherz)

Jim Kleine
Joined: 25 Jan 06
Posts: 3
Credit: 1,053,126
RAC: 0

RE: That has nothing to do

Message 95635 in response to message 95634

Quote:

That has nothing to do with any project's application and everything to do with Remote Desktop and how Microsoft handles the graphics drivers.

There have been plenty of threads at SETI and BOINC dev about that topic.

Gruß,
Gundolf

I don't wish to appear rude, but did you actually read my post?

a) "... GPUGrid appears to be using a deprecated function ..."

b) If it is "unfixable" (and I'm not yet convinced, but I don't frequent the SETI board), then my suggestion that "if a work unit fails for this reason, the queue of GPU WUs needs to be paused" seems all the more relevant.

More importantly, running work units don't spontaneously abort upon a Remote Desktop connection being initiated, so the CUDA hardware clearly remains accessible to the *running* app, which leads me to question whether there is an assertion in the start-up code that isn't testing what it thinks it is testing.

As a follow up:

[BOINC] #936: CUDA devices not detected when logged in through Remote Desktop
[pre]
---------------------------+------------------------------------------------
Reporter: mart0258 | Owner:
Type: Enhancement | Status: closed
Priority: Undetermined | Milestone: Undetermined
Component: Undetermined | Version: 6.6.31
Resolution: fixed | Keywords:
---------------------------+------------------------------------------------[/pre]
Changes (by romw):
* status: new => closed
* resolution: => fixed
Comment:
This is now fixed in the 6.10 version of the BOINC client.

"resolution: => fixed" doesn't seem consistent with your comment, although this issue doesn't actually seem to be fixed as of 6.10.17.

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 768
Credit: 25,160,422
RAC: 0

Hi Jim, RE: Hmmm ...

Message 95636 in response to message 95633

Hi Jim,

Quote:
Hmmm ... the CUDA app also seems to have a similar problem to GPUGrid, in that it fails to correctly detect CUDA hardware on a Windows host that has been connected to via Remote Desktop and has not yet been accessed again from the console:

We (and the BOINC team) are already aware of that problem. Do you use Windows Vista or Windows 7 and run BOINC as a service?

Thanks,
Oliver

 


Einstein@Home Project
