New to the GPU computing thing

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 984
Credit: 25171438
RAC: 34

RE: E.g: task 216348627

Quote:

E.g: task 216348627 shows:
[17:17:04][20435][INFO ] Seed for random number generator is 1075756411.
[17:17:08][20435][ERROR] Error creating CUDA FFT plan (error code: 2)
[17:17:08][20435][ERROR] Demodulation failed (error: 1011)!
[17:17:08][20435][WARN ] CUDA memory allocation problem encountered!
------> Returning control to BOINC, delaying restart for at least five minutes...
------> If this problem persists you should consider aborting this task.
[17:17:09][20444][INFO ] Application startup - thank you for supporting Einstein@Home!
[17:17:09][20444][INFO ] Starting data processing...

You left out the next two lines, which show the memory stats:

[17:17:09][20444][INFO ] CUDA global memory status (initial GPU state, including context):
------> Used in total: 257 MB (255 MB free / 512 MB total) -> Used by this application: 0 MB

So in this case there are only 255 MB free, which is typically not enough: we require 300 MB.
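For anyone who wants to check their own task output the same way, here is a rough Python sketch that pulls the numbers out of the memory status line and compares the free amount with the 300 MB figure. The function names and the regular expression are assumptions based only on the log format quoted in this thread; they are not part of the Einstein@Home application.

```python
import re

# Requirement quoted in this thread; other application versions may differ.
REQUIRED_FREE_MB = 300

# Matches the status line as it appears in the log excerpt above (assumed format).
STATUS_RE = re.compile(
    r"Used in total:\s*(\d+)\s*MB\s*\((\d+)\s*MB free\s*/\s*(\d+)\s*MB total\)"
)

def parse_memory_status(line):
    """Return (used_mb, free_mb, total_mb), or None if the line does not match."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    used_mb, free_mb, total_mb = (int(group) for group in match.groups())
    return used_mb, free_mb, total_mb

def has_enough_memory(line, required_mb=REQUIRED_FREE_MB):
    """True if the free memory reported in the line meets the requirement."""
    status = parse_memory_status(line)
    if status is None:
        raise ValueError("not a CUDA global memory status line")
    return status[1] >= required_mb

line = ("------> Used in total: 257 MB (255 MB free / 512 MB total) "
        "-> Used by this application: 0 MB")
print(parse_memory_status(line))  # (257, 255, 512)
print(has_enough_memory(line))    # False, since 255 MB is below 300 MB
```

With the line from task 216348627 the check fails, which matches the allocation error in the log.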

Best,
Oliver

Einstein@Home Project

Oliver Behnke
Moderator
Administrator
Joined: 4 Sep 07
Posts: 984
Credit: 25171438
RAC: 34

RE: RE: I've been

Quote:
Quote:

I've been registered with Einstein for - awhile, lol. Haven't done a lot of crunching though.

Anyway - I have a GPU that Boinc can use, and Einstein is sending GPU work for it to use. Sadly, it seems that some of it can't be crunched.

I've reset the project, as per the suggestion in my logs.

I looked at three of the workunits mentioned in the first log excerpt - all of them are still in progress, meaning that our server doesn't yet know that (and why) they failed. If they already failed on your machine, please click the BOINC client's "Update" button so that we can have a look at the details.

Seems that your problem is solved now: you completed at least one of the tasks you mentioned initially as problematic.

@Ageless: this is what the memory allocations should look like. It's close (just 62 MB left), but there is enough global memory available to run the task successfully.

Best,
Oliver

Einstein@Home Project

Runaway1956
Joined: 24 Dec 05
Posts: 6
Credit: 134266
RAC: 0

Sorry, I didn't check this

Sorry, I didn't check this thread for a day or two . . .

It appears that, yes, those problematic tasks finished, one at a time. I did reset once, and it looked like I got all the same tasks sent to me again, so they fed through again, one at a time.

Reboot? LOL, I only ever reboot when the kernel is updated, or the power goes off. At the moment, I'm not seeing any error messages, so I'll just let things run.

I DID change my preferences, telling Einstein NOT to run CPU jobs for tasks which have GPU jobs available. I would like to keep my average on Seti and Rosetta up. (Funny, Rosetta still hasn't sent a GPU job to this machine - maybe I need to look into that next.)

Hopefully, Einstein can make full use of my GPU without cutting into my other crunching very much.

Thanks for your time, guys - if anything seems to change, I'll post again!!

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5893653
RAC: 76

RE: Reboot? LOL, I only

Quote:
Reboot? LOL, I only ever reboot when the kernel is updated, or the power goes off.


You may think it's funny, but do know that the only way to reset anything stuck in video memory is a reboot. You won't be able to reset it by exiting and restarting BOINC, as removing the science application from physical (computer) memory won't fix anything stuck in video memory. Only a power cycle (a quick power off -> on) can reset video memory.

If you don't, you'll only be trashing work and will eventually run into the "quota of 1 task" problem.
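To see how much video memory is actually free before deciding whether a reboot is needed, something like the following Python sketch may help. It assumes an NVIDIA driver that ships the `nvidia-smi` tool with `--query-gpu` support, which may not be the case for older cards or drivers; the function names are made up for illustration.

```python
import subprocess

def parse_nvidia_smi_csv(output):
    """Parse 'total, used, free' rows (MiB, one row per GPU) into dicts."""
    gpus = []
    for row in output.strip().splitlines():
        total, used, free = (int(field.strip()) for field in row.split(","))
        gpus.append({"total_mb": total, "used_mb": used, "free_mb": free})
    return gpus

def query_gpu_memory():
    """Ask nvidia-smi for per-GPU memory figures (assumes the tool is installed)."""
    output = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.total,memory.used,memory.free",
            "--format=csv,noheader,nounits",
        ],
        text=True,
    )
    return parse_nvidia_smi_csv(output)

# Sample text that mirrors the numbers quoted earlier in this thread;
# it is NOT output captured from a real nvidia-smi run.
sample = "512, 257, 255\n"
print(parse_nvidia_smi_csv(sample))
```

On a machine with a supported card, calling `query_gpu_memory()` while BOINC is stopped would show whether memory is still being held. Cards that report a field as unsupported would make the integer conversion fail, so treat this as a starting point only.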

Runaway1956
Joined: 24 Dec 05
Posts: 6
Credit: 134266
RAC: 0

Hmmm. I wasn't aware of

Hmmm. I wasn't aware of that. Things can get "stuck" in video memory? Odd.

Anyway - power went off last evening it seems - when I got home from work, I had to login again, and start all my stuff up. There is also a new kernel available from the repositories, so I'll be rebooting again - oh, tomorrow morning I guess.

Right now, my RAC keeps climbing, and I have an entire page of downloads queued - whatever the problem was seems to have been resolved - at least for now. :)
