Hello,
My NVIDIA GPU is correctly detected, but all the Einstein@Home GPU tasks fail with a computation error (Milkyway GPU tasks work fine).
For example: LATeah0046L_468.0_0_0.0_140560_1 (task 703746622).
I don't understand where the problem is.
First output lines:
<core_client_version>7.8.4</core_client_version>
<![CDATA[
<message> process exited with code 6 (0x6, -250)</message>
<stderr_txt>
18:25:34 (11661): [normal]: This Einstein@home App was built at: Feb 15 2017 10:50:14
18:25:34 (11661): [normal]: Start of BOINC application '../../projects/einstein.phys.uwm.edu/hsgamma_FGRPB1G_1.20_x86_64-pc-linux-gnu__FGRPopencl1K-nvidia'.
18:25:34 (11661): [debug]: 1e+16 fp, 4.4e+09 fp/s, 2372293 s, 658h58m12s99
18:25:34 (11661): [normal]: % CPU usage: 1.000000, GPU usage: 1.000000
command line: ../../projects/einstein.phys.uwm.edu/hsgamma_FGRPB1G_1.20_x86_64-pc-linux-gnu__FGRPopencl1K-nvidia --inputfile ../../projects/einstein.phys.uwm.edu/LATeah0046L.dat --alpha 4.42281478648 --delta -0.0345027837249 --skyRadius 2.152570e-06 --ldiBins 15 --f0start 460.0 --f0Band 8.0 --firstSkyPoint 0 --numSkyPoints 1 --f1dot -1e-13 --f1dotBand 1e-13 --df1dot 3.344368011e-15 --ephemdir ../../projects/einstein.phys.uwm.edu/JPLEPH --Tcoh 2097152.0 --toplist 10 --cohFollow 10 --numCells 1 --useWeights 1 --Srefinement 1 --CohSkyRef 1 --cohfullskybox 1 --mmfu 0.1 --reftime 56100 --model 0 --f0orbit 0.005 --mismatch 0.1 --demodbinary 1 --BinaryPointFile ../../projects/einstein.phys.uwm.edu/templates_LATeah0046L_0468_140560.dat --debug 1 --device 0 -o LATeah0046L_468.0_0_0.0_140560_1_0.out
output files: 'LATeah0046L_468.0_0_0.0_140560_1_0.out' '../../projects/einstein.phys.uwm.edu/LATeah0046L_468.0_0_0.0_140560_1_0' 'LATeah0046L_468.0_0_0.0_140560_1_0.out.cohfu' '../../projects/einstein.phys.uwm.edu/LATeah0046L_468.0_0_0.0_140560_1_1'
18:25:34 (11661): [debug]: Flags: X64 SSE SSE2 GNUC X86 GNUX86
18:25:34 (11661): [debug]: glibc version/release: 2.26/stable
18:25:34 (11661): [debug]: Set up communication with graphics process.
-- signal handler called: signal 6
2 stack frames obtained for this thread:
Frame 32: Binary file: ../../projects/einstein.phys.uwm.edu/hsgamma_FGRPB1G_1.20_x86_64-pc-linux-gnu__FGRPopencl1K-nvidia (0x48b261) Source file: hs_boinc_extras.c (Function: sighandler / Line: 290)
Frame 31: Binary file: /lib64/libc.so.6 (0x7f5293a3869b) Offset info: gsignal+0xcb
Frame 30: Binary file: /lib64/libc.so.6 (0x7f5293a3869b) Offset info: gsignal+0xcb
Frame 29: Binary file: /lib64/libc.so.6 (0x7f5293a3a3b1) Offset info: abort+0x141
Frame 28: Binary file: /lib64/libc.so.6 (0x7f5293a82a87) Offset info: +0x81a87
Frame 27: Binary file: /lib64/libc.so.6 (0x7f5293a89e8e) Offset info: +0x88e8e
Frame 26: Binary file: /lib64/libc.so.6 (0x7f5293a8b989) Offset info: +0x8a989
Frame 25: Binary file: /lib64/libc.so.6 (0x7f5293a942ee) Offset info: cfree+0x6e
Frame 24: Binary file: ../../projects/einstein.phys.uwm.edu/hsgamma_FGRPB1G_1.20_x86_64-pc-linux-gnu__FGRPopencl1K-nvidia (0x6a7598) Offset info: _ZNSt13runtime_errorD2Ev+0x58 Source file: basic_string.h (Function: y / Line: 249) Source file: basic_string.h (Function: ~basic_string / Line: 539) Source file: stdexcept.cc (Function: y / Line: 68)
Frame 23: Binary file: /lib64/libMesaOpenCL.so.1 (0x7f528abc8d9e) Offset info: +0x20d9e
Frame 22: Binary file: ../../projects/einstein.phys.uwm.edu/hsgamma_FGRPB1G_1.20_x86_64-pc-linux-gnu__FGRPopencl1K-nvidia (0x69992f) Source file: eh_throw.cc (Function: / Line: 52)
Frame 21: Binary file: /lib64/libMesaOpenCL.so.1 (0x7f528ac4ae1f) Offset info: +0xa2e1f
Frame 20: Binary file: /lib64/libMesaOpenCL.so.1 (0x7f528abf4cc4) Offset info: +0x4ccc4
Frame 19: Binary file: /lib64/libMesaOpenCL.so.1 (0x7f528abf4cf4) Offset info: +0x4ccf4
Frame 18: Binary file: /lib64/ld-linux-x86-64.so.2 (0x7f529478ee83) Offset info: +0x10e83
Frame 17: Binary file: /lib64/ld-linux-x86-64.so.2 (0x7f5294793dda) Offset info: +0x15dda
Frame 16: Binary file: /lib64/libc.so.6 (0x7f5293b5f4df) Offset info: _dl_catch_error+0x8f
Frame 15: Binary file: /lib64/ld-linux-x86-64.so.2 (0x7f52947932e9) Offset info: +0x152e9
Frame 14: Binary file: /lib64/libdl.so.2 (0x7f529413bf96) Offset info: +0xf96
Frame 13: Binary file: /lib64/libc.so.6 (0x7f5293b5f4df) Offset info: _dl_catch_error+0x8f
Frame 12: Binary file: /lib64/libdl.so.2 (0x7f529413c715) Offset info: +0x1715
Frame 11: Binary file: /lib64/libdl.so.2 (0x7f529413c021) Offset info: dlopen+0x41
Frame 10: Binary file: /lib64/libOpenCL.so.1 (0x7f5294563a82) Offset info: +0x5a82
Frame 9: Binary file: /lib64/libOpenCL.so.1 (0x7f5294565a74) Offset info: clGetPlatformIDs+0x114
Frame 8: Binary file: ../../projects/einstein.phys.uwm.edu/hsgamma_FGRPB1G_1.20_x86_64-pc-linux-gnu__FGRPopencl1K-nvidia (0x5baf44) Offset info: _Z24boinc_get_opencl_ids_auxPciiPP13_cl_device_idPP15_cl_platform_id+0x74 Source file: unknown (Function: / Line: 0)
Frame 7: Binary file: ../../projects/einstein.phys.uwm.edu/hsgamma_FGRPB1G_1.20_x86_64-pc-linux-gnu__FGRPopencl1K-nvidia (0x5bb46a) Offset info: _Z20boinc_get_opencl_idsPP13_cl_device_idPP15_cl_platform_id+0xe6 Source file: unknown (Function: I / Line: 0)
Frame 6: Binary file: ../../projects/einstein.phys.uwm.edu/hsgamma_FGRPB1G_1.20_x86_64-pc-linux-gnu__FGRPopencl1K-nvidia (0x48bc66) Offset info: eah_boinc_get_opencl_ids+0x26 Source file: hs_boinc_options.cpp (Function: eah_boinc_get_opencl_ids / Line: 136)
Frame 5: Binary file: ../../projects/einstein.phys.uwm.edu/hsgamma_FGRPB1G_1.20_x86_64-pc-linux-gnu__FGRPopencl1K-nvidia (0x48dcf4) Offset info: gen_fft_get_ctx+0x44 Source file: unknown (Function: gen_fft_get_ctx / Line: 0)
Frame 4: Binary file: ../../projects/einstein.phys.uwm.edu/hsgamma_FGRPB1G_1.20_x86_64-pc-linux-gnu__FGRPopencl1K-nvidia (0x47975c) Offset info: MAIN+0x15c Source file: HSgammaPulsar.c (Function: MAIN / Line: 4251)
Frame 3: Binary file: ../../projects/einstein.phys.uwm.edu/hsgamma_FGRPB1G_1.20_x86_64-pc-linux-gnu__FGRPopencl1K-nvidia (0x46c0ff) Offset info: main+0x5ff Source file: hs_boinc_extras.c (Function: worker / Line: 832) Source file: hs_boinc_extras.c (Function: main / Line: 1038)
Frame 2: Binary file: /lib64/libc.so.6 (0x7f5293a2203a) Offset info: __libc_start_main+0xea
Frame 1: Binary file: ../../projects/einstein.phys.uwm.edu/hsgamma_FGRPB1G_1.20_x86_64-pc-linux-gnu__FGRPopencl1K-nvidia (0x46e5f9) Source file: unknown (Function: _start / Line: 0)
End of stcaktrace
18:25:34 (11661): called boinc_finish
</stderr_txt>
]]>
Hi and welcome to Einstein@home!
When checking your tasks for hostID 12596871 I found that your computer managed to finish a few tasks before the errors started and that Task 702186181 seems to be the first one to fail.
The error message given in stderr is:
And then the task restarted and failed with "-- signal handler called: signal 6".
Something seems to have gone wrong while processing this task, and that might have put the graphics card or the driver into an unstable state, so that every task after it fails with the same "signal 6" error.
Have you tried to reboot your computer?
Hi and thank you for your answer,
I rebooted my computer a few minutes ago and did a project update, but the problem is still there.
Errors on the four tasks (after the reboot):
LATeah0046L_612.0_0_0.0_1165895_0 703911012
LATeah0046L_612.0_0_0.0_1149580_0 703910985
LATeah0046L_612.0_0_0.0_1148325_0 703910983
LATeah0046L_612.0_0_0.0_1126990_1 703910949
Right after these four computation errors on the Einstein project, my GPU completed a Milkyway task without any error (the Milkyway GPU tasks never fail).
I looked at a few of your tasks that have had problems, and other hosts seem to be having problems with the same workunits too. Not everyone, but the problem may not be on your end alone.
Thank you for your help, I feel less lonely.
Sorry to hear that.
Being a Windows user, I'm afraid I can't help you with this problem, but I hope that one of our Linux users will stop by and offer some advice.
Beats me. Have you tried resetting the project?
Benoit,
One thing I noticed in your stack trace is that it seems to be using the Mesa OpenCL library (libMesaOpenCL.so.1) here, whereas one of your successful jobs over at Milkyway seems to be using the NVIDIA OpenCL stuff.
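For anyone who wants to check their own trace the same way, here is a rough Python sketch that pulls the distinct "Binary file:" paths out of a pasted stderr log. The function name `libraries_in_trace` and the shortened `sample` text are made up for illustration; the regular expression only assumes the "Binary file: <path>" layout seen in the trace above.

```python
import re


def libraries_in_trace(stderr_text):
    """Return the distinct paths named in 'Binary file:' entries, in first-seen order."""
    seen = []
    # \s+ also matches a line break, so a path wrapped onto the next line is still found.
    for path in re.findall(r"Binary file:\s+(\S+)", stderr_text):
        if path not in seen:
            seen.append(path)
    return seen


# Shortened sample built from three frames of the trace in this thread.
sample = (
    "Frame 23: Binary file: /lib64/libMesaOpenCL.so.1 (0x7f528abc8d9e) Offset info: +0x20d9e "
    "Frame 10: Binary file: /lib64/libOpenCL.so.1 (0x7f5294563a82) Offset info: +0x5a82 "
    "Frame 9: Binary file: /lib64/libOpenCL.so.1 (0x7f5294565a74) Offset info: clGetPlatformIDs+0x114"
)

for lib in libraries_in_trace(sample):
    print(lib)
```

If libMesaOpenCL.so.1 shows up between clGetPlatformIDs and the abort, as it does here, the crash is probably happening while the OpenCL loader is opening the Mesa implementation rather than the NVIDIA one.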
I've got my one NVIDIA+Ubuntu set-up organized so that there's no trace of mesa-opencl-icd. I don't know whether you are able to try that or whether that would interfere with something else you have running...
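A quick way to see which OpenCL implementations are registered on a Linux box is to look at the .icd files the OpenCL loader reads. The sketch below assumes the usual location /etc/OpenCL/vendors; the helper names `registered_icds` and `mesa_entries` are hypothetical, and the actual .icd file names vary between distributions.

```python
from pathlib import Path


def registered_icds(vendors_dir="/etc/OpenCL/vendors"):
    """Map each .icd file name to the library name written inside it."""
    result = {}
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return result  # no ICD directory on this machine
    for icd in sorted(directory.glob("*.icd")):
        result[icd.name] = icd.read_text().strip()
    return result


def mesa_entries(icds):
    """Return the .icd file names whose name or library mentions Mesa."""
    return [name for name, lib in icds.items() if "mesa" in (name + lib).lower()]


icds = registered_icds()
for name, lib in icds.items():
    print(f"{name}: {lib}")
for name in mesa_entries(icds):
    print(f"Mesa OpenCL is registered via {name}")
```

If a Mesa entry is listed next to the NVIDIA one, removing the Mesa OpenCL package through the distribution's package manager is the clean way to drop it, provided nothing else on the machine depends on it.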
Just a thought...
Good luck - Al.