CUDA errors - The window cannot act on the sent message. (0x3ea) - exit code 1002 (0x3ea)

Jacob Klein
Joined: 22 Jun 11
Posts: 45
Credit: 114028547
RAC: 0
Topic 197535

Does anyone know why I get these errors on Einstein GPU tasks on my laptop's Quadro FX 3800M?

The machine is admittedly strapped for RAM at the moment, and I'll upgrade that in the future, but why do some of the CUDA tasks fail with this error while others process successfully?

http://einsteinathome.org/host/10383741

Please help.

7.3.15

The window cannot act on the sent message.
(0x3ea) - exit code 1002 (0x3ea)

Activated exception handling...
[21:35:17][976][INFO ] Starting data processing...
[21:35:18][976][ERROR] Failed to enable CUDA thread yielding for device #0 (error: 2)! Sorry, will try to occupy one CPU core...
[21:35:18][976][ERROR] Couldn't acquire CUDA context of device #0 (error: 2)!
[21:35:18][976][ERROR] Demodulation failed (error: 1002)!
21:35:18 (976): called boinc_finish

mikey
Joined: 22 Jan 05
Posts: 12718
Credit: 1839121161
RAC: 3588

Quote:

Does anyone know why I get these errors on Einstein GPU tasks on my laptop's Quadro FX 3800M?

The machine is admittedly strapped for RAM at the moment, and I'll upgrade that in the future, but why do some of the CUDA tasks fail with this error while others process successfully?

http://einsteinathome.org/host/10383741

Please help.

7.3.15

The window cannot act on the sent message.
(0x3ea) - exit code 1002 (0x3ea)

Activated exception handling...
[21:35:17][976][INFO ] Starting data processing...
[21:35:18][976][ERROR] Failed to enable CUDA thread yielding for device #0 (error: 2)! Sorry, will try to occupy one CPU core...
[21:35:18][976][ERROR] Couldn't acquire CUDA context of device #0 (error: 2)!
[21:35:18][976][ERROR] Demodulation failed (error: 1002)!
21:35:18 (976): called boinc_finish

Are you leaving a CPU core free while the GPU crunches? Or are you getting an error in Windows too?

Jacob Klein
Joined: 22 Jun 11
Posts: 45
Credit: 114028547
RAC: 0

I am not getting any Windows errors, and I am not presently leaving a CPU core free for the GPU. Do you honestly believe that would cause this error?

This honestly seems more like either an application error, or a CUDA programming resource allocation error... doesn't it?

MAGIC Quantum Mechanic
Joined: 18 Jan 05
Posts: 1895
Credit: 1418492986
RAC: 1083934

Looks like you got several of them last week but got some good ones after the 7th.

http://einsteinathome.org/host/10383741/tasks&offset=0&show_names=1&state=5&appid=0

I've never had the error myself, but I found this: http://www.errorfixes.net/1002-0x3ea.php

Jacob Klein
Joined: 22 Jun 11
Posts: 45
Credit: 114028547
RAC: 0

Thanks. I've done preliminary research as well. I'm fairly positive that the problem I'm having is not due to an operating system installation error.

It could be caused by one of the following: a) the system runs out of memory (it only has 4 GB, but is set to start 8 CPU tasks alongside 1 non-CPU-intensive task and 1 GPU task); b) it gets too hot (but in that case I'd expect different types of errors, not the same one every time); or c) some error in the application itself (since I haven't noticed any SETI errors on that same GPU).

Is there any way an Einstein application developer could chime in with his/her opinion here?

Thanks,
Jacob

Edit:
Hmm... Looks like this laptop has had similar problems with SETI and SETI Beta. So, perhaps the error is in the drivers? Sometimes I restart the laptop and it goes better for a while.

http://setiathome.berkeley.edu/result.php?resultid=3447519609
http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=16477974

Edit:
Looks like 337.50 was actually released for this mobile Quadro GPU, so I'll be upgrading the drivers and testing with them. If the issue resurfaces, I'll report back, but the more I think about it, the more it sounds like a driver error.

mikey
Joined: 22 Jan 05
Posts: 12718
Credit: 1839121161
RAC: 3588

Quote:

Thanks. I've done preliminary research as well. I'm fairly positive that the problem I'm having is not due to an operating system installation error.

It could be caused by one of the following: a) the system runs out of memory (it only has 4 GB, but is set to start 8 CPU tasks alongside 1 non-CPU-intensive task and 1 GPU task); b) it gets too hot (but in that case I'd expect different types of errors, not the same one every time); or c) some error in the application itself (since I haven't noticed any SETI errors on that same GPU).

Is there any way an Einstein application developer could chime in with his/her opinion here?

Thanks,
Jacob

Edit:
Hmm... Looks like this laptop has had similar problems with SETI and SETI Beta. So, perhaps the error is in the drivers? Sometimes I restart the laptop and it goes better for a while.

http://setiathome.berkeley.edu/result.php?resultid=3447519609
http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=16477974

Edit:
Looks like 337.50 was actually released for this mobile Quadro GPU, so I'll be upgrading the drivers and testing with them. If the issue resurfaces, I'll report back, but the more I think about it, the more it sounds like a driver error.

The last SEVERAL Nvidia driver releases have NOT benefited crunchers, but HAVE benefited gamers instead. Some people are reporting a 10% decline in their RAC after upgrading.

Jacob Klein
Joined: 22 Jun 11
Posts: 45
Credit: 114028547
RAC: 0

Yes, I know. But this thread is about a specific error, and not about performance. Also, the latest Quadro 337.50 drivers actually do include quite a bit of new functionality that's not meant for gaming.

Jacob Klein
Joined: 22 Jun 11
Posts: 45
Credit: 114028547
RAC: 0

I just wanted to chime in, with an update to this issue.

I upgraded the laptop's memory, boosting it from 4 GB to 20 GB of RAM. I believe this has fixed the problem, as I haven't had any task failures since the upgrade, though I don't know why it wasn't working properly with just 4 GB.
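
As a rough back-of-envelope check on why 4 GB might have been tight, here is a small Python sketch. It assumes the task mix mentioned earlier in the thread (8 CPU tasks, 1 non-CPU-intensive task, 1 GPU task) and simply divides physical RAM evenly among them. The function name and the `reserved_mb` parameter are hypothetical, and real tasks do not share memory evenly, so treat the numbers as an illustration only.

```python
def per_task_share_mb(total_ram_gb, task_count, reserved_mb=0):
    """Average RAM per task in MB, after setting aside reserved_mb.

    reserved_mb is a hypothetical allowance for the OS and other programs;
    it defaults to 0 because the thread gives no figure for it.
    """
    total_mb = total_ram_gb * 1024
    return (total_mb - reserved_mb) / task_count


# Task mix described in the thread: 8 CPU + 1 non-CPU-intensive + 1 GPU task.
TASK_COUNT = 8 + 1 + 1

if __name__ == "__main__":
    for ram_gb in (4, 20):
        share = per_task_share_mb(ram_gb, TASK_COUNT)
        print(f"{ram_gb} GB RAM -> about {share:.0f} MB per task")
```

With 4 GB that works out to roughly 410 MB per task before Windows itself takes anything, versus about 2 GB per task with 20 GB.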

mikey
Joined: 22 Jan 05
Posts: 12718
Credit: 1839121161
RAC: 3588

Quote:

I just wanted to chime in, with an update to this issue.

I upgraded the laptop's memory, boosting it from 4 GB to 20 GB of RAM. I believe this has fixed the problem, as I haven't had any task failures since the upgrade, though I don't know why it wasn't working properly with just 4 GB.

Hmmm, since that isn't an option for most people, maybe you can work with someone to set up a test that recreates the problem and find a workaround through BOINC.

Jacob Klein
Joined: 22 Jun 11
Posts: 45
Credit: 114028547
RAC: 0

I'm not sure it's that easy to replicate/test.

I think what is happening is that Windows does not have enough free memory at the time that CUDA performs its mallocs (memory allocation requests), causing the CUDA app to fail.
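
One observation that may support this: the log lines report `(error: 2)`, and in the CUDA driver API status code 2 is `CUDA_ERROR_OUT_OF_MEMORY` (the runtime API's `cudaErrorMemoryAllocation` has the same value). Assuming the application prints a raw CUDA status code there, which nothing in this thread confirms, a minimal Python sketch for pulling those codes out of a task's stderr could look like this. The function names are made up for illustration.

```python
import re

# Matches lines such as "[21:35:18][976][ERROR] message text".
LOG_LINE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\[(\d+)\]\[([A-Z]+)\s*\]\s*(.*)$")
ERROR_CODE = re.compile(r"\(error: (\d+)\)")

# Assumption: codes are raw CUDA status values. Only the two values that are
# the same in the driver and runtime APIs are listed here.
KNOWN_CUDA_CODES = {
    0: "success",
    2: "out of memory",
}

SAMPLE_LOG = """\
[21:35:17][976][INFO ] Starting data processing...
[21:35:18][976][ERROR] Failed to enable CUDA thread yielding for device #0 (error: 2)! Sorry, will try to occupy one CPU core...
[21:35:18][976][ERROR] Couldn't acquire CUDA context of device #0 (error: 2)!
[21:35:18][976][ERROR] Demodulation failed (error: 1002)!
21:35:18 (976): called boinc_finish
"""


def extract_errors(log_text):
    """Return (time, code, message) for each ERROR line that carries a code."""
    results = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if not match or match.group(3) != "ERROR":
            continue
        code_match = ERROR_CODE.search(match.group(4))
        if code_match:
            results.append((match.group(1), int(code_match.group(1)), match.group(4)))
    return results


def describe_code(code):
    """Map a code to a label, under the CUDA-status assumption above."""
    return KNOWN_CUDA_CODES.get(code, "not a code this sketch knows")


if __name__ == "__main__":
    for time, code, message in extract_errors(SAMPLE_LOG):
        print(time, code, describe_code(code))
```

Note that 1002 on the last error line is the application's own exit code (the one in the thread title), not a CUDA status, so the sketch deliberately leaves it unlabeled.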

It might have something to do with running VM tasks (from Test4Theory and Climate@Home) on that laptop; I'm not sure. Previously I had found, reported, and fixed a memory issue with VM tasks (where BOINC was overcommitting memory while VM tasks were running), and that fix is in the latest public release of BOINC and the latest alphas. But now I'm wondering if there might still be some lingering issue, such as when a VM task is paused, or started and waiting to run.

I doubt I'll pull the RAM back out to do additional testing, unless I happen to find tons of extra time (and patience) to do so.

Regards,
Jacob

mikey
Joined: 22 Jan 05
Posts: 12718
Credit: 1839121161
RAC: 3588

Quote:

I'm not sure it's that easy to replicate/test.

I think what is happening is that Windows does not have enough free memory at the time that CUDA performs its mallocs (memory allocation requests), causing the CUDA app to fail.

It might have something to do with running VM tasks (from Test4Theory and Climate@Home) on that laptop; I'm not sure. Previously I had found, reported, and fixed a memory issue with VM tasks (where BOINC was overcommitting memory while VM tasks were running), and that fix is in the latest public release of BOINC and the latest alphas. But now I'm wondering if there might still be some lingering issue, such as when a VM task is paused, or started and waiting to run.

I doubt I'll pull the RAM back out to do additional testing, unless I happen to find tons of extra time (and patience) to do so.

Regards,
Jacob

I was just thinking that BOINC should have recognized the problem and said so instead of just continuing to try. Someone else may get frustrated and stop crunching, when it is just a minor problem with some units at some projects. And yes, it could be a long process to figure out what went wrong where, and to track and log it. I guess it could be a project problem too, with the project not recognizing that your GPU has too few resources available to crunch a unit as it starts up. This may even go back to the previous problem we discussed of how BOINC itself handles GPUs in general. If so, then hopefully a fix is 'in the works'.
