CUDA errors - The window cannot act on the sent message. (0x3ea) - exit code 1002 (0x3ea)

Jacob Klein

Joined: 22 Jun 11

Posts: 45

Credit: 114028547

RAC: 0

12 Apr 2014 22:02:29 UTC

Topic 197535


Does anyone know why I get these errors, on Einstein GPU tasks, on my laptop's Quadro FX 3800M?

The machine is admittedly strapped for RAM at the moment, and I'll upgrade that in the future, but... why do some of the CUDA tasks fail with this error, but some process successfully??

http://einsteinathome.org/host/10383741

Please help.

7.3.15

The window cannot act on the sent message.
(0x3ea) - exit code 1002 (0x3ea)

Activated exception handling...
[21:35:17][976][INFO ] Starting data processing...
[21:35:18][976][ERROR] Failed to enable CUDA thread yielding for device #0 (error: 2)! Sorry, will try to occupy one CPU core...
[21:35:18][976][ERROR] Couldn't acquire CUDA context of device #0 (error: 2)!
[21:35:18][976][ERROR] Demodulation failed (error: 1002)!
21:35:18 (976): called boinc_finish


mikey

Joined: 22 Jan 05

Posts: 12829

Credit: 1883752515

RAC: 1103183


12 Apr 2014 22:58:41 UTC

Message 121165


Quote:

Does anyone know why I get these errors, on Einstein GPU tasks, on my laptop's Quadro FX 3800M?

The machine is admittedly strapped for RAM at the moment, and I'll upgrade that in the future, but... why do some of the CUDA tasks fail with this error, but some process successfully??

http://einsteinathome.org/host/10383741

Please help.

7.3.15

The window cannot act on the sent message.
(0x3ea) - exit code 1002 (0x3ea)

Activated exception handling...
[21:35:17][976][INFO ] Starting data processing...
[21:35:18][976][ERROR] Failed to enable CUDA thread yielding for device #0 (error: 2)! Sorry, will try to occupy one CPU core...
[21:35:18][976][ERROR] Couldn't acquire CUDA context of device #0 (error: 2)!
[21:35:18][976][ERROR] Demodulation failed (error: 1002)!
21:35:18 (976): called boinc_finish


Are you leaving a cpu core free while the gpu crunches? Either that or are you getting an error in Windows too?

Jacob Klein

Joined: 22 Jun 11

Posts: 45

Credit: 114028547

RAC: 0


12 Apr 2014 23:05:43 UTC

Message 121166 in response to message 121165


I am not getting any Windows errors, and I am not presently leaving a CPU core free for the GPU. Do you honestly believe that would cause this error?

This honestly seems more like either an application error, or a CUDA programming resource allocation error... doesn't it?

MAGIC Quantum M...

Joined: 18 Jan 05

Posts: 1929

Credit: 1465326144

RAC: 1314926


13 Apr 2014 5:46:41 UTC

Message 121167


Looks like you got several of them last week but got some good ones after the 7th

http://einsteinathome.org/host/10383741/tasks&offset=0&show_names=1&state=5&appid=0

Never had the error myself but saw this http://www.errorfixes.net/1002-0x3ea.php

Jacob Klein

Joined: 22 Jun 11

Posts: 45

Credit: 114028547

RAC: 0


13 Apr 2014 13:39:55 UTC

Message 121168 in response to message 121167


Thanks. I've done preliminary research as well. I'm fairly positive that the problem I'm having is not due to an operating system installation error.

It is possible that it could be caused by one of the following:
a) The system runs out of memory (it only has 4 GB, but is set to start 8 CPU tasks alongside 1 non-CPU-intensive task and 1 GPU task).
b) It gets too hot (but in that case I'd expect different types of errors, not the same one every time).
c) Some error in the application itself (since I haven't noticed any SETI errors on that same GPU).
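
To put option a) in rough numbers, here is a back-of-the-envelope sketch. The 4 GB total and the task counts come from the post; the 1.5 GB reserved for Windows and other programs is an assumed figure for illustration only, not a measurement.

```python
# Rough per-task memory budget for the setup described in the post.
# TOTAL_RAM_GB and the task counts come from the post; OS_RESERVED_GB is an
# assumed, illustrative figure, not a measured value.
TOTAL_RAM_GB = 4.0
OS_RESERVED_GB = 1.5  # assumption


def per_task_budget_mb(total_gb, reserved_gb, task_count):
    """Return the average RAM (in MB) left for each task after the reserve."""
    if task_count <= 0:
        raise ValueError("task_count must be positive")
    available_mb = max(total_gb - reserved_gb, 0.0) * 1024
    return available_mb / task_count


tasks = 8 + 1 + 1  # 8 CPU tasks, 1 non-CPU-intensive task, 1 GPU task
budget = per_task_budget_mb(TOTAL_RAM_GB, OS_RESERVED_GB, tasks)
print(f"{budget:.0f} MB per task on average")  # prints "256 MB per task on average"
```

Under that assumption, each task gets only a few hundred MB on average, so a single large task could plausibly leave too little for the rest.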

Is there any way an Einstein application developer could chime in with his/her opinion here?

Thanks,
Jacob

Edit:
Hmm... Looks like this laptop has had similar problems with SETI and SETI Beta. So, perhaps the error is in the drivers? Sometimes I restart the laptop and it goes better for a while.

http://setiathome.berkeley.edu/result.php?resultid=3447519609
http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=16477974

Edit:
Looks like 337.50 was actually released for this mobile Quadro GPU, so.. I'll be upgrading the drivers, and testing with them. If the issue resurfaces, I'll report back, but it sounds like a driver error the more I think about it.

mikey

Joined: 22 Jan 05

Posts: 12829

Credit: 1883752515

RAC: 1103183


14 Apr 2014 11:23:03 UTC

Message 121169 in response to message 121168


Quote:

Thanks. I've done preliminary research as well. I'm fairly positive that the problem I'm having is not due to an operating system installation error.

It is possible that it could be caused by one of the following:
a) The system runs out of memory (it only has 4 GB, but is set to start 8 CPU tasks alongside 1 non-CPU-intensive task and 1 GPU task).
b) It gets too hot (but in that case I'd expect different types of errors, not the same one every time).
c) Some error in the application itself (since I haven't noticed any SETI errors on that same GPU).

Is there any way an Einstein application developer could chime in with his/her opinion here?

Thanks,
Jacob

Edit:
Hmm... Looks like this laptop has had similar problems with SETI and SETI Beta. So, perhaps the error is in the drivers? Sometimes I restart the laptop and it goes better for a while.

http://setiathome.berkeley.edu/result.php?resultid=3447519609
http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=16477974

Edit:
Looks like 337.50 was actually released for this mobile Quadro GPU, so.. I'll be upgrading the drivers, and testing with them. If the issue resurfaces, I'll report back, but it sounds like a driver error the more I think about it.

The last SEVERAL Nvidia driver releases have NOT benefited crunchers, but HAVE benefited gamers instead. Some people are reporting a 10% decline in their RAC after upgrading.

Jacob Klein

Joined: 22 Jun 11

Posts: 45

Credit: 114028547

RAC: 0


14 Apr 2014 11:29:29 UTC

Message 121170 in response to message 121169


Yes, I know. But this thread is about a specific error, and not about performance. Also, the latest Quadro 337.50 drivers actually do include quite a bit of new functionality that's not meant for gaming.

Jacob Klein

Joined: 22 Jun 11

Posts: 45

Credit: 114028547

RAC: 0


10 May 2014 18:03:34 UTC

Message 121171


I just wanted to chime in, with an update to this issue.

I upgraded the laptop's memory, boosting it from 4 GB to 20 GB of RAM. I believe this has fixed the problem, as I haven't had any task failures since the upgrade. I still don't know why it wasn't working properly with just 4 GB, though.

mikey

Joined: 22 Jan 05

Posts: 12829

Credit: 1883752515

RAC: 1103183


11 May 2014 10:59:00 UTC

Message 121172 in response to message 121171


Quote:

I just wanted to chime in, with an update to this issue.

I upgraded the laptop's memory, boosting it from 4 GB to 20 GB of RAM. I believe this has fixed the problem, as I haven't had any task failures since the upgrade. I still don't know why it wasn't working properly with just 4 GB, though.

Hmmm, since that isn't an option for most people, maybe you can work with someone to set up a test that recreates the problem and find a workaround through BOINC.

Jacob Klein

Joined: 22 Jun 11

Posts: 45

Credit: 114028547

RAC: 0


11 May 2014 12:17:53 UTC

Message 121173 in response to message 121172


I'm not sure it's that easy to replicate/test.

I think what is happening is that Windows does not have enough free memory at the time that CUDA performs its mallocs (memory allocation requests), causing the CUDA app to fail.
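
The log from the first post fits this theory if the "(error: N)" values on the CUDA lines are CUDA driver API result codes. That is an assumption about how the app reports errors, not something confirmed in this thread. A minimal sketch that decodes the lines under that assumption:

```python
import re

# Sample lines copied from the task log in the first post.
LOG = """\
[21:35:18][976][ERROR] Failed to enable CUDA thread yielding for device #0 (error: 2)! Sorry, will try to occupy one CPU core...
[21:35:18][976][ERROR] Couldn't acquire CUDA context of device #0 (error: 2)!
[21:35:18][976][ERROR] Demodulation failed (error: 1002)!
"""

# Assumption: the codes on the CUDA lines are CUDA driver API result codes.
# In that API, 2 is CUDA_ERROR_OUT_OF_MEMORY.
CUDA_DRIVER_CODES = {
    0: "CUDA_SUCCESS",
    1: "CUDA_ERROR_INVALID_VALUE",
    2: "CUDA_ERROR_OUT_OF_MEMORY",
    3: "CUDA_ERROR_NOT_INITIALIZED",
}

ERROR_LINE = re.compile(r"\[ERROR\]\s*(?P<msg>.*?)\s*\(error: (?P<code>\d+)\)")


def decode(log_text):
    """Return (message, code, name) for each ERROR line that mentions CUDA."""
    results = []
    for match in ERROR_LINE.finditer(log_text):
        msg, code = match.group("msg"), int(match.group("code"))
        if "CUDA" in msg:
            results.append((msg, code, CUDA_DRIVER_CODES.get(code, "unknown")))
    return results


for msg, code, name in decode(LOG):
    print(f"{code} ({name}): {msg}")
```

If the assumption holds, both CUDA lines report an out-of-memory result. The final 1002 looks like the app's own exit code: 1002 is 0x3ea, and Windows happens to describe system error 1002 as "The window cannot act on the sent message.", which would explain the odd wording in the thread title.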

It might have something to do with running VM tasks (from Test4Theory and Climate@Home) on that laptop, not sure. Previously I had found, reported, and fixed a memory issue with VM tasks (where BOINC was overcommitting memory while VM tasks were running), and that fix is in the latest public release of BOINC and the latest alphas. But now I'm wondering if there might still be some lingering issue, like when a VM task is paused, or started and waiting to run.

I doubt I'll pull the RAM back out to do additional testing, unless I happen to find tons of extra time (and patience) to do so.

Regards,
Jacob

mikey

Joined: 22 Jan 05

Posts: 12829

Credit: 1883752515

RAC: 1103183


12 May 2014 12:51:41 UTC

Message 121174 in response to message 121173


Quote:

I'm not sure it's that easy to replicate/test.

I think what is happening is that Windows does not have enough free memory at the time that CUDA performs its mallocs (memory allocation requests), causing the CUDA app to fail.

It might have something to do with running VM tasks (from Test4Theory and Climate@Home) on that laptop, not sure. Previously I had found, reported, and fixed a memory issue with VM tasks (where BOINC was overcommitting memory while VM tasks were running), and that fix is in the latest public release of BOINC and the latest alphas. But now I'm wondering if there might still be some lingering issue, like when a VM task is paused, or started and waiting to run.

I doubt I'll pull the RAM back out to do additional testing, unless I happen to find tons of extra time (and patience) to do so.

Regards,
Jacob

I was just thinking that BOINC should have recognized the problem and said so instead of just continuing to try. Someone else may get frustrated and stop crunching, when it is just a minor problem with some units at some projects. And yes, it could be a long process to figure out what went wrong where, and to track and log it. I guess it could be a project problem too: the app not recognizing, as it starts up, that your GPU has too few resources available to crunch a unit. This may even go back to the previous problem we discussed of how BOINC itself handles GPUs in general. If so, then hopefully a fix is 'in the works'.
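
The kind of start-up check described above could look roughly like the sketch below. It is not how BOINC or the Einstein@Home app actually works. The function `can_start_gpu_task`, its parameters, and the memory figures are all hypothetical, chosen only to illustrate postponing a task with a clear reason instead of letting it fail.

```python
def can_start_gpu_task(free_host_mb, free_gpu_mb, need_host_mb, need_gpu_mb):
    """Decide whether a GPU task should start, and explain why not if it shouldn't.

    Every name and number here is hypothetical; real free-memory figures would
    have to come from the operating system and the GPU driver.
    """
    if free_host_mb < need_host_mb:
        return False, (
            f"not enough free system RAM: {free_host_mb} MB free, "
            f"{need_host_mb} MB needed"
        )
    if free_gpu_mb < need_gpu_mb:
        return False, (
            f"not enough free GPU memory: {free_gpu_mb} MB free, "
            f"{need_gpu_mb} MB needed"
        )
    return True, "enough memory to start"


# Illustrative values only: a host that is short on RAM but has GPU memory free.
ok, reason = can_start_gpu_task(
    free_host_mb=150, free_gpu_mb=900, need_host_mb=300, need_gpu_mb=200
)
print("start task" if ok else f"postpone task ({reason})")
```

Returning a reason along with the decision is the point of the sketch: the user would see why a unit was held back rather than just a cryptic exit code.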
