ABP1 CUDA applications

SeersantLoom

Joined: 11 Nov 04

Posts: 3

Credit: 17343656

RAC: 0

RE: We have finally begun

14 Dec 2009 10:15:37 UTC

Message 95678

Quote:

We have finally begun to automatically deliver CUDA work & applications (plan class "ABP1cuda23") to machines that satisfy the following requirements:

- enabled NVIDIA GPU work in Einstein@home preferences
- NVidia GPU with at least 450MB of free memory
- Display Driver version 190.38 (& up), i.e. CUDA 2.3 capability
- BOINC Core Client version 6.10 (& up)

CUDA Beta App testers should drain their work cache and switch back to the normal project work.

BM

My box has all noted requirements: 8800GT card w/ 512MB RAM (don't know how much of it is exactly free, though, resolution is set to 1920x1200 24bpp), nvidia-drivers-190.42-r3, CUDA 2.3 and BOINC 6.10.18.

System and OS: AMD Athlon(tm) 64 X2 Dual Core Processor 6000+, 8GB RAM, running Gentoo Linux.

Problem: ABP1cuda23 won't run on it. It stops right at the beginning of computation with the error: "[ERROR] Error creating CUDA FFT plan (error code: 2)".

All the E@H CUDA tasks this client was assigned to failed with the same message.

One example: http://einsteinathome.org/task/150643235

Gundolf Jahn

Joined: 1 Mar 05

Posts: 1079

Credit: 341280

RAC: 0

RE: My box has all noted

14 Dec 2009 11:13:16 UTC

Message 95679 in response to message 95678

Quote:

My box has all noted requirements: 8800GT card w/ 512MB RAM (don't know how much of it is exactly free, though, resolution is set to 1920x1200 24bpp), nvidia-drivers-190.42-r3, CUDA 2.3 and BOINC 6.10.18.

Could you reduce the resolution, just to check if it's the culprit?

If it isn't, or if you don't want to leave the resolution low, I think that you'll have to disallow CUDA processing for Einstein (Einstein@Home preferences, Use NVIDIA GPU if present?), unless someone else knows another reason for the errors.
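As a rough sanity check on the resolution theory, the arithmetic can be sketched as follows. This is only an estimate under stated assumptions: 24bpp is assumed to be stored as 4 bytes per pixel, and only plain desktop framebuffers are counted. Driver allocations, compositing, and other 3D applications are not modelled. The function name `framebuffer_mib` is made up for illustration.

```python
def framebuffer_mib(width, height, bytes_per_pixel=4, buffers=1):
    """Approximate video memory used by plain 2D framebuffers, in MiB.

    Assumption: 24bpp is padded to 4 bytes per pixel.
    """
    return width * height * bytes_per_pixel * buffers / (1024 * 1024)


CARD_MIB = 512      # 8800GT from the original post
REQUIRED_MIB = 450  # free memory required for ABP1cuda23 tasks

single = framebuffer_mib(1920, 1200)
double = framebuffer_mib(1920, 1200, buffers=2)
headroom = CARD_MIB - REQUIRED_MIB

print(f"one buffer:  {single:.1f} MiB")
print(f"two buffers: {double:.1f} MiB")
print(f"headroom on a 512 MB card: {headroom} MiB")
```

Under these assumptions the desktop framebuffer by itself stays well below the headroom, so lowering the resolution is a cheap test but may not free enough memory if something else is holding it.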

To the developers:
It should be possible to give informational output in stdout about total/remaining GPU memory (SETI could do it at least ;-).

Perhaps (not so easily done:-) a "fallback to CPU mode" could be implemented when insufficient memory is detected.
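The two suggestions above (report GPU memory, fall back to CPU when it is short) could look roughly like the sketch below. It is not taken from the Einstein@Home application: `choose_backend` is a hypothetical helper, the 430 MiB figure is an invented example value, and only the 450 MB threshold comes from the requirements quoted in this thread. In a real CUDA application the two numbers would presumably come from the runtime call `cudaMemGetInfo`.

```python
REQUIRED_FREE_MIB = 450  # from the project requirements quoted in this thread


def choose_backend(total_mib, free_mib, required_mib=REQUIRED_FREE_MIB):
    """Return (backend, message). Hypothetical helper, not project code."""
    report = f"GPU memory: {free_mib} MiB free of {total_mib} MiB"
    if free_mib >= required_mib:
        return "cuda", report
    return "cpu", f"{report}; need {required_mib} MiB, falling back to CPU"


# Example values only: a 512 MiB card with 430 MiB currently free.
backend, message = choose_backend(total_mib=512, free_mib=430)
print(backend, "-", message)
```

Printing the report in both branches means the task's stdout shows the memory situation even when the GPU path is taken.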

Regards,
Gundolf

Computers aren't everything in life. (Just kidding)

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5885

Credit: 119138534174

RAC: 24447449

RE: All the E@H CUDA tasks

14 Dec 2009 20:01:25 UTC

Message 95680 in response to message 95678

Quote:

All the E@H CUDA tasks this client was assigned to failed with the same message.

One example: http://einsteinathome.org/task/150643235

To make it easier for others, I've made the link clickable.

The full error message is

[19:35:55][4928][ERROR] Error creating CUDA FFT plan (error code: 2)
[19:35:55][4928][ERROR] Demodulation failed (error: 3)!
19:35:55 (4928): called boinc_finish

If you check out this response from Oliver in the CUDA app beta test thread, to an identical message, you can guess that you are running out of memory. Be aware that your other activities may have tied up graphics memory and perhaps that memory is still unavailable when E@H tries to start a task.

Cheers,
Gary.

Cupojoe

Joined: 24 Feb 05

Posts: 23

Credit: 33068058

RAC: 0

All I know is, I've gotten

15 Dec 2009 5:50:19 UTC

Message 95681

All I know is, I've gotten way too many CUDA units. I don't mind that, but if they're all I have and I'm using my computer, nothing gets done: no work is done on my GPU while the computer is in use, and when it is not in use, only one task is processed at a time, even though I have a dual-core system, because I only have one video card. If there were a better balance of CUDA to non-CUDA tasks, einstein@home would be getting a lot more math out of my computer...

_badger

Joined: 8 Mar 05

Posts: 12

Credit: 4623547

RAC: 0

RE: RE: All the E@H CUDA

15 Dec 2009 7:52:52 UTC

Message 95682 in response to message 95680

Quote:

Quote:
All the E@H CUDA tasks this client was assigned to failed with the same message.

One example: http://einsteinathome.org/task/150643235

To make it easier for others, I've made the link clickable.

The full error message is
[19:35:55][4928][ERROR] Error creating CUDA FFT plan (error code: 2)
[19:35:55][4928][ERROR] Demodulation failed (error: 3)!
19:35:55 (4928): called boinc_finish
If you check out this response from Oliver in the CUDA app beta test thread, to an identical message, you can guess that you are running out of memory. Be aware that your other activities may have tied up graphics memory and perhaps that memory is still unavailable when E@H tries to start a task.

Elmet,

I have had the same error message, 'Error creating CUDA FFT plan', after downloading a block of ABP1cuda23 WUs. I have processed these successfully before and since.

My setup is (kinda) similar to yours: Ubuntu 9.10 x64 on an AMD Athlon 64 X2 4400, 2GB RAM, 2 x Asus 9800GT 512MB (not in SLI mode!). I'm running a Philips CRT at 2048x1536 resolution, so this is not likely your problem.

BOINC 6.10.17 (manual install into /home//BOINC, not from the Ubuntu repository).

nVidia driver 190.42 (downloaded from nVidia and manually installed).

My problem was a kernel upgrade, after which the nVidia driver ceased to work properly. I re-installed it (following these instructions), and all has been well since.

Notes to Developers:

* Doesn't like the GPUs in SLI mode: got 'GPU device missing' on partially completed WUs, and ignored any others still in the queue.
* With a 2-core CPU and 2 GPUs, the BOINC scheduler is now ONLY running ABP1cuda23 WUs. Any CPU WUs are being ignored. I have watched another thread with a discussion on whether the BOINC scheduler will enter panic mode and process these. I waited until < 24 hrs before deadline, chickened out, set 'no new work' for the project, suspended the GPU WUs and processed the stragglers. I have more with a deadline of 24th Dec, so I'll see what happens this time.
* Something that's not being said enough in this forum: cheers on the work you guys are doing. Niggles like this are minor! GPU integration is a major step forward, well done!

Michael Goetz

Joined: 11 Feb 05

Posts: 21

Credit: 3067690

RAC: 0

RE: * Doesn't like the

15 Dec 2009 9:08:39 UTC

Message 95683 in response to message 95682

Quote:

* Doesn't like the GPUs in SLI mode: got 'GPU device missing' on partially completed WUs, and ignored any others still in the queue.

That's a known bug/feature. You can't use SLI with BOINC. Actually, it may be that you can't use SLI with CUDA (I don't remember.)

Quote:

* With a 2-core CPU and 2 GPUs, the BOINC scheduler is now ONLY running ABP1cuda23 WUs. Any CPU WUs are being ignored. I have watched another thread with a discussion on whether the BOINC scheduler will enter panic mode and process these. I waited until < 24 hrs before deadline, chickened out, set 'no new work' for the project, suspended the GPU WUs and processed the stragglers. I have more with a deadline of 24th Dec, so I'll see what happens this time.

It would have been interesting to see if it actually did let the CPU tasks go beyond deadline. I would have chickened out too, however!

I know that there's been a lot of work on CPU vs. GPU scheduling in the BOINC client -- or at least a lot of discussion and angst about it in the project forums. To be honest, I have no idea how well BOINC handles it, and it possibly (probably?) varies significantly depending on which version of the client you're running.

Want to find one of the largest known primes? Try PrimeGrid. Or help cure disease at WCG.

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5885

Credit: 119138534174

RAC: 24447449

You have two computers

15 Dec 2009 9:11:15 UTC

Message 95684 in response to message 95681

You have two computers currently active on E@H. Firstly, there is a Core 2 Duo whose task list shows tasks that have used a GPU for crunching. Secondly, there is an AMD64 X2 whose task list shows both GW and ABP1 tasks but isn't listed as having a suitable GPU. So I guess your following comments refer to the C2D.

As a general observation, both hosts have quite large caches - perhaps in the order of 6 days or so, or even more if you aren't crunching 24/7, and about double that if you are only crunching on 1 CPU as your comments seem to suggest. Your C2D has lots of aborted tasks and lots of 'client detached' tasks, so it might be a good idea to lower your cache a bit if you have tasks in excess of your requirements.

Quote:

All I know is, I've gotten way too many CUDA units. I don't mind that, but if they're all I have and I'm using my computer, nothing gets done: no work is done on my GPU while the computer is in use, and when it is not in use, only one task is processed at a time, even though I have a dual-core system, because I only have one video card.

Let's see if I've got this straight. Your C2D gets both types of tasks, perhaps more GPU tasks than CPU tasks and you don't really mind that because it makes up for when you can't get CPU tasks? And you only ever have one task crunching at a time? Is this what you are saying?

You can control most things with preferences so it should be possible to have pretty much what you want (within reason) :-). The GPU app actually takes up 1 CPU + 1 GPU while a GPU task is crunching. The GPU is only quite lightly used for the duration. You should be able to have 2 tasks crunching simultaneously if you wish. If you are not using your computer and there is still only 1 task crunching, something is most likely at fault with your preference settings.
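The "1 CPU + 1 GPU per GPU task" rule can be turned into a small back-of-the-envelope model. This is a deliberately simplified sketch, not the BOINC scheduler's actual algorithm: it assumes every GPU task reserves one whole core, that GPU tasks are started first, and that preferences allow all cores to be used. The function name `concurrent_tasks` is made up for illustration.

```python
def concurrent_tasks(n_cpus, n_gpus):
    """Simplified model: each GPU task reserves one GPU and one CPU core.

    Returns (gpu_tasks, cpu_tasks) that could run at the same time.
    """
    gpu_tasks = min(n_gpus, n_cpus)
    cpu_tasks = n_cpus - gpu_tasks
    return gpu_tasks, cpu_tasks


print(concurrent_tasks(2, 1))  # dual core, one video card
print(concurrent_tasks(2, 2))  # dual core, two video cards
```

In this model a dual core with one video card has room for one GPU task plus one CPU task, which is why two tasks should be able to crunch simultaneously. A dual core with two GPUs has no core left over for CPU-only tasks, which would be consistent with the two-GPU report earlier in this thread.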

Quote:

If there were a better balance of CUDA to non-CUDA tasks, einstein@home would be getting a lot more math out of my computer...

You create that balance with your preferences, so can you please go to your account page and tell us what the following settings are:

1. Under computing preferences

* Suspend work while computer is in use?
* Suspend GPU work while computer is in use?
* On multiprocessors, use at most -- processors
* On multiprocessors, use at most --% of the processors
* Use at most -- percent of CPU time

2. Under E@H preferences

* Use CPU -- I imagine this is 'yes'?
* Use NVIDIA GPU -- I imagine this is 'yes'?
* Run only the selected applications -- tell us the 4 values here

Once you give us the answers we can take this further, thanks.

Cheers,
Gary.

hotze33

Joined: 10 Nov 04

Posts: 100

Credit: 368387400

RAC: 0

It would have been

15 Dec 2009 11:35:11 UTC

Message 95685 in response to message 95683

Quote:

It would have been interesting to see if it actually did let the CPU tasks go beyond deadline. I would have chickened out too, however!

The scheduler definitely kills the units. I have seen this in the beta app before.

SeersantLoom

Joined: 11 Nov 04

Posts: 3

Credit: 17343656

RAC: 0

RE: RE: My box has all

17 Dec 2009 8:47:11 UTC

Message 95686 in response to message 95679

Quote:

Quote:
My box has all noted requirements: 8800GT card w/ 512MB RAM (don't know how much of it is exactly free, though, resolution is set to 1920x1200 24bpp), nvidia-drivers-190.42-r3, CUDA 2.3 and BOINC 6.10.18.

Could you reduce the resolution, just to check if it's the culprit?

If it isn't, or if you don't want to leave the resolution low, I think that you'll have to disallow CUDA processing for Einstein (Einstein@Home preferences, Use NVIDIA GPU if present?), unless someone else knows another reason for the errors.

To the developers:
It should be possible to give informational output in stdout about total/remaining GPU memory (SETI could do it at least ;-).

Perhaps (not so easily done:-) a "fallback to CPU mode" could be implemented when insufficient memory is detected.

Regards,
Gundolf

My 24" LCD display has this as its "native" resolution, and I don't like the thought of lowering it. CUDA packages and nvidia-drivers should be OK. GPUGRID is doing fine (even with a 3D screensaver and EVE-Online running at the same time). There are no compiz packages in the system, so I guess Xorg/KDE are not using much 3D here.

E@H GPU tasks are now set to disabled. I did that soon after I saw them failing one after another. SETI hasn't sent me any GPU tasks; I think the cause is the same as with E@H, the only difference being that it detected this in advance.

Elmet

DanNeely

Joined: 4 Sep 05

Posts: 1364

Credit: 3589128696

RAC: 939585

RE: That's a known

19 Dec 2009 0:07:48 UTC

Message 95687 in response to message 95683

Quote:

That's a known bug/feature. You can't use SLI with BOINC. Actually, it may be that you can't use SLI with CUDA (I don't remember.)

That might've been true at one point, but my SLI 260s have been running 2x Collatz and, before that, 2x MilkyWay WUs for months. (Dunno if I was still doing GPUGRID when I bought the 2nd card or not.)
