Computation errors

Pete
Joined: 31 Jul 10
Posts: 14
Credit: 1020243718
RAC: 0
Topic 205524

I am running Gamma-ray pulsar binary search #1 on GPUs v1.20 and today suddenly started to get computation errors, with this complaint in the log:

17/02/2017 13:55:26 | Einstein@Home | Task LATeah0012L_1124.0_0_0.0_5451720_0 exited with a DLL initialization error.

There are many, many repeats of this error.
I tried a project reset and the same errors resulted. Looking at the tasks in error, I find that no one has yet successfully submitted a good result. They are almost all still in progress, which I think means they may not have started yet. I found one result where someone else has a compute error, and he had also had a lot of errors.

Is this my problem, or is there a problem elsewhere?

The GPU still runs quite happily on MilkyWay@home GPU tasks, which I suppose only proves that my GPU is not totally bad.

Peter

Zalster
Joined: 26 Nov 13
Posts: 3117
Credit: 4050672230
RAC: 0

DLL errors usually mean the file is either missing or corrupt. I'm assuming this is the Nvidia machine?

Detach and move the Einstein folder to trash. Reboot the machine so nothing is left in memory, then reattach to the project.

Hopefully that will force the server to send all needed items to the new folder it creates when you reattach.

Also make sure your antivirus isn't blocking or scanning that folder. I remember some time back some antivirus programs didn't like the DLL in another project.

Pete
Joined: 31 Jul 10
Posts: 14
Credit: 1020243718
RAC: 0

Thank you for your reply. It is the Nvidia machine. "Detach and move the Einstein folder to trash": I took this to mean remove the project and trash all the Einstein stuff. Removing the project also removed almost all the Einstein folders and files that I could find, certainly the projects folder in prog data\boinc anyway. I also trashed some other files found by dir *einstein*.* that looked interesting. I left some job logs that I doubted would affect anything. I restarted the machine, turned off the antivirus and attached to the project again. A large number of files downloaded, and as soon as the first task started I got the same sort of error, though on a different file.

17/02/2017 20:25:12 | Einstein@Home | Starting task LATeah0012L_1132.0_0_0.0_8739820_0
17/02/2017 20:25:13 | Einstein@Home | Task LATeah0012L_1132.0_0_0.0_8739820_0 exited with a DLL initialization error.
I also see that one of the tasks I failed on earlier has now been verified on 2 other computers. So it looks like the error is on my side.

Any further help would be appreciated.
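
For anyone wanting to count how many tasks are affected, here is a minimal sketch that pulls the failing task names out of event-log lines. It assumes the "date time | project | message" layout of the two lines quoted above; the function name dll_init_failures is made up for illustration, and other BOINC versions or locales may format the log differently.

```python
import re

# Assumed event-log layout, taken from the lines quoted in this thread:
#   DD/MM/YYYY HH:MM:SS | project | message
LINE_RE = re.compile(
    r"^(?P<stamp>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) \| "
    r"(?P<project>[^|]+?) \| "
    r"Task (?P<task>\S+) exited with a DLL initialization error\.?\s*$"
)


def dll_init_failures(lines):
    """Return (timestamp, project, task) for every DLL-initialization failure."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            hits.append((m.group("stamp"), m.group("project"), m.group("task")))
    return hits


# The two log lines quoted in the post above.
sample = [
    "17/02/2017 20:25:12 | Einstein@Home | Starting task LATeah0012L_1132.0_0_0.0_8739820_0",
    "17/02/2017 20:25:13 | Einstein@Home | Task LATeah0012L_1132.0_0_0.0_8739820_0 exited with a DLL initialization error.",
]

for stamp, project, task in dll_init_failures(sample):
    print(stamp, project, task)
```

Feeding it the full event log instead of the two sample lines would show whether every Einstein task fails this way or only some of them.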

Logforme
Joined: 13 Aug 10
Posts: 332
Credit: 1714373961
RAC: 0

Without the name of the missing DLL file it's guesswork.

My guess is OpenCL. Make sure your Nvidia graphics driver is up to date and supports OpenCL.
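
One way to test the OpenCL guess is to see whether the system can find and load an OpenCL library at all. The sketch below uses only Python's standard ctypes module. The function name find_opencl_runtime is made up for illustration, and a successful load only shows that the OpenCL loader is present, not that the driver behind it works or that no other DLL is missing.

```python
import ctypes
import ctypes.util


def find_opencl_runtime():
    """Try to locate and load the OpenCL loader library.

    Returns (name_or_path, loaded). name_or_path is None when the
    library cannot be found on the normal search path.
    """
    name = ctypes.util.find_library("OpenCL")
    if name is None:
        return None, False
    try:
        ctypes.CDLL(name)  # raises OSError if the library fails to load
        return name, True
    except OSError:
        return name, False


name, loaded = find_opencl_runtime()
print("OpenCL library:", name, "| loadable:", loaded)
```

If this reports that nothing was found on the machine with the DLL error, reinstalling the graphics driver would be the first thing to try.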

mikegee81
Joined: 15 Nov 16
Posts: 1
Credit: 11729642
RAC: 0

Hello, I am encountering a problem with GPU tasks. Any suggestions would be appreciated.

https://einsteinathome.org/task/612957426

Outcome: Computation error
Client state: Compute error
Exit status: 1 (0x00000001) Unknown error code
Validation state: Invalid
Application: Gamma-ray pulsar binary search #1 on GPUs v1.18 (FGRPopencl1K-ati)

 

'../../projects/einstein.phys.uwm.edu/LATeah0012L_1116.0_0_0.0_10361280_2_1'
17:05:42 (25956): [debug]: Flags: X64 SSE SSE2 GNUC X86 GNUX86
17:05:42 (25956): [debug]: glibc version/release: 2.23/stable
17:05:42 (25956): [debug]: Set up communication with graphics process.
boinc_get_opencl_ids returned [0x2173878 , 0x7fd913904ae0]
Using OpenCL platform provided by: Mesa
Using OpenCL device "AMD PITCAIRN (DRM 2.43.0 / 4.4.0-62-generic, LLVM 3.8.0)" by: AMD
Max allocation limit: 268435456
Global mem size: 1073741824
OpenCL device has FP64 support
LLVM ERROR: Cannot select: 0x45f3840: i32,ch = AtomicCmpSwap<Volatile LDST4[%1592(addrspace=1)]> 0x4813360, 0x29d94b0, 0x47e48b0, 0x266ef60
  0x29d94b0: i64,ch = CopyFromReg 0x4813360, Register:i64 %vreg241
    0x2670690: i64 = Register %vreg241
  0x47e48b0: i32,ch = CopyFromReg 0x4813360, Register:i32 %vreg244
    0x45f09e0: i32 = Register %vreg244
  0x266ef60: i32 = bitcast 0x46ea540
    0x46ea540: f32 = fadd 0x2676e10, 0x2b48df0
      0x2676e10: f32,ch = CopyFromReg 0x4813360, Register:f32 %vreg236
        0x4820510: f32 = Register %vreg236
      0x2b48df0: f32 = bitcast 0x47e48b0
        0x47e48b0: i32,ch = CopyFromReg 0x4813360, Register:i32 %vreg244
          0x45f09e0: i32 = Register %vreg244
In function: kernel_ts_2_phase_diff_sorted
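
To compare dumps like this across machines, it helps to reduce them to the operation LLVM could not select and the kernel it was compiling. A minimal sketch follows, assuming the "LLVM ERROR: Cannot select: ..." and "In function: ..." line shapes seen in the log above; summarize_llvm_error is a made-up helper name. As a tentative reading, the f32 fadd bitcast to i32 and fed into AtomicCmpSwap looks like a floating-point atomic add emulated with compare-and-swap, which this Mesa/LLVM combination apparently could not compile for the card.

```python
import re


def summarize_llvm_error(stderr_text):
    """Extract the unsupported operation and kernel name from a
    'LLVM ERROR: Cannot select' dump. Returns None if no such error is present."""
    op = re.search(r"LLVM ERROR: Cannot select: \S+ \S+ = (\w+)", stderr_text)
    if op is None:
        return None
    kernel = re.search(r"^In function: (\S+)", stderr_text, re.MULTILINE)
    return {
        "operation": op.group(1),
        "kernel": kernel.group(1) if kernel else None,
    }


# Shortened copy of the stderr output quoted above.
sample = """OpenCL device has FP64 support
LLVM ERROR: Cannot select: 0x45f3840: i32,ch = AtomicCmpSwap<Volatile LDST4[%1592(addrspace=1)]> 0x4813360, 0x29d94b0, 0x47e48b0, 0x266ef60
  0x266ef60: i32 = bitcast 0x46ea540
    0x46ea540: f32 = fadd 0x2676e10, 0x2b48df0
In function: kernel_ts_2_phase_diff_sorted
"""

print(summarize_llvm_error(sample))
```

Two hosts that report the same operation and kernel are very likely hitting the same compiler limitation rather than separate hardware faults.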

 
Pete
Joined: 31 Jul 10
Posts: 14
Credit: 1020243718
RAC: 0

I upgraded the driver for the GPU, tried a system restore to an earlier date, and reset and re-added the project; none of this made the slightest difference. Reluctantly, I have moved the GPU to another project, where it is happily working with both CUDA and OpenCL. If anyone has any other suggestions, I will try them back here.

jay
Joined: 25 Jan 07
Posts: 99
Credit: 84044023
RAC: 0

Hello Pete!!

 

I am having a similar problem, perhaps under different circumstances.

I had been running Ubuntu(-Mate) 15.10 on two machines, each with a Radeon 7750.

I upgraded one to 16.04 and ran into the problem that the old fglrx drivers are no longer available.

I added the amd-gpu package, which should have OpenCL, and that crashed the video so that I was unable to boot.

(I forgot the trick of putting in an old Nvidia card and rebooting to fix it.)

Anyway, I reinstalled again, this time with Ubuntu(-Mate) 16.10.

Then BOINC found no GPU, I suppose because the default Mesa drivers included did not have OpenCL.

Then I added mesa-opencl-icd, which had:

         libclang-common-3.8-dev{a} libclc-amdgcn{a} libclc-dev{a} libclc-r600{a} mesa-opencl-icd ocl-icd-libopencl1{a}

Then BOINC could see the GPU as:

Tue 28 Feb 2017 03:48:38 AM EST |  | Starting BOINC client version 7.6.33 for x86_64-pc-linux-gnu
Tue 28 Feb 2017 03:48:38 AM EST |  | log flags: file_xfer, sched_ops, task
Tue 28 Feb 2017 03:48:38 AM EST |  | Libraries: libcurl/7.50.1 OpenSSL/1.0.2g zlib/1.2.8 libidn/1.33 librtmp/2.3
Tue 28 Feb 2017 03:48:38 AM EST |  | Data directory: /var/lib/boinc-client
Tue 28 Feb 2017 03:48:38 AM EST |  | OpenCL: AMD/ATI GPU 0: AMD CAPE VERDE (DRM 2.46.0 / 4.8.0-39-generic, LLVM 3.8.1) (driver version 12.0.3, device version OpenCL 1.1 Mesa 12.0.3, 1024MB, 1024MB available, 512 GFLOPS peak)

I reinstalled Einstein and got some GPU WUs, but they errored out, as shown in:

https://einsteinathome.org/task/616702363

which said:

<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
03:57:51 (5516): [normal]: This Einstein@home App was built at: Jan 16 2017 08:09:16

03:57:51 (5516): [normal]: Start of BOINC application '../../projects/einstein.phys.uwm.edu/hsgamma_FGRPB1G_1.18_x86_64-pc-linux-gnu__FGRPopencl1K-ati'.
03:57:51 (5516): [debug]: 1e+16 fp, 3.9e+09 fp/s, 2719384 s, 755h23m03s62
command line: ../../projects/einstein.phys.uwm.edu/hsgamma_FGRPB1G_1.18_x86_64-pc-linux-gnu__FGRPopencl1K-ati --inputfile ../../projects/einstein.phys.uwm.edu/LATeah0013L.dat --alpha 4.42281478648 --delta -0.0345027837249 --skyRadius 2.152570e-06 --ldiBins 15 --f0start 1116.0 --f0Band 8.0 --firstSkyPoint 0 --numSkyPoints 1 --f1dot -1e-13 --f1dotBand 1e-13 --df1dot 3.344368011e-15 --ephemdir ../../projects/einstein.phys.uwm.edu/JPLEPH --Tcoh 2097152.0 --toplist 10 --cohFollow 10 --numCells 1 --useWeights 1 --Srefinement 1 --CohSkyRef 1 --cohfullskybox 1 --mmfu 0.1 --reftime 56100 --model 0 --f0orbit 0.005 --mismatch 0.1 --demodbinary 1 --BinaryPointFile ../../projects/einstein.phys.uwm.edu/templates_LATeah0013L_1124_9018430.dat --debug 1 --device 0 -o LATeah0013L_1124.0_0_0.0_9018430_0_0.out
output files: 'LATeah0013L_1124.0_0_0.0_9018430_0_0.out' '../../projects/einstein.phys.uwm.edu/LATeah0013L_1124.0_0_0.0_9018430_0_0' 'LATeah0013L_1124.0_0_0.0_9018430_0_0.out.cohfu' '../../projects/einstein.phys.uwm.edu/LATeah0013L_1124.0_0_0.0_9018430_0_1'
03:57:51 (5516): [debug]: Flags: X64 SSE SSE2 GNUC X86 GNUX86
03:57:51 (5516): [debug]: glibc version/release: 2.24/stable
03:57:51 (5516): [debug]: Set up communication with graphics process.
boinc_get_opencl_ids returned [0x3082e58 , 0x7f822211dae0]
Using OpenCL platform provided by: Mesa
Using OpenCL device "AMD CAPE VERDE (DRM 2.46.0 / 4.8.0-39-generic, LLVM 3.8.1)" by: AMD
Max allocation limit: 268435456
Global mem size: 1073741824
OpenCL device has FP64 support
LLVM ERROR: Cannot select: 0x34bc120: i32,ch = AtomicCmpSwap<Volatile LDST4[%1580(addrspace=1)]> 0x323c240, 0x31fe1a0, 0x34a6e90, 0x4bb0fe0
0x31fe1a0: i64,ch = CopyFromReg 0x323c240, Register:i64 %vreg232

0x3a0c8a0: i64 = Register %vreg232
0x34a6e90: i32,ch = CopyFromReg 0x323c240, Register:i32 %vreg235
0x34b0240: i32 = Register %vreg235
0x4bb0fe0: i32 = bitcast 0x4b9d8f0
0x4b9d8f0: f32 = fadd 0x34adc30, 0x34a69d0
0x34adc30: f32,ch = CopyFromReg 0x323c240, Register:f32 %vreg227
0x34adb00: f32 = Register %vreg227
0x34a69d0: f32 = bitcast 0x34a6e90
0x34a6e90: i32,ch = CopyFromReg 0x323c240, Register:i32 %vreg235
0x34b0240: i32 = Register %vreg235
In function: kernel_ts_2_phase_diff_sorted

</stderr_txt>

 So... Anybody got some hints? Love Letters? Advice?

 

Thanks!!!!

Jay

-- edit:fix typos

jay
Joined: 25 Jan 07
Posts: 99
Credit: 84044023
RAC: 0

update

The  WU is of type: FGRPopencl1K-ati (Gamma-ray pulsar binary search #1 1.18)

Same type runs ok on other machine.

The driver with Ubuntu 16.10 sees only 1GB of 2GB on the AMD 7750 card.

 

edit: Complete rewrite - was looking at wrong data.

 

jay
Joined: 25 Jan 07
Posts: 99
Credit: 84044023
RAC: 0

Sudden Hopes!

Ubuntu 16.10 had an update today to mesa-opencl-icd and mesa-va-drivers

Do Upgrade.
The following packages will be upgraded:
  libegl1-mesa libegl1-mesa-drivers libgbm1 libgl1-mesa-dri libgl1-mesa-glx libglapi-mesa libgles1-mesa libgles2-mesa libwayland-egl1-mesa
  libxatracker2 mesa-opencl-icd mesa-va-drivers

I hoped it might fix the error. I waited for all WUs to finish, removed all projects, removed all of BOINC, then reinstalled BOINC and the projects.

Got two Einstein WU tasks

Still get

OpenCL device has FP64 support
LLVM ERROR: Cannot select: 0x30b91b0: i32,ch = AtomicCmpSwap<Volatile LDST4[%1580(addrspace=1)]> 0x1e5e100, 0x390a400, 0x1d31a70, 0x31ed890

and the WU errors out in the first few seconds.

I want to try a SETI GPU WU but haven't been able to get any (Tuesday = SETI maintenance day), plus I'm having internet flow problems (Amazon AWS problem day; related?). Oh well.

Please help. Opinions needed. Do I need the boinc-client-opencl?

PS the BOINC install had:

Preparing to unpack
.../00-libboinc7_7.6.33+dfsg-1_amd64.deb ...
.../01-boinc-client_7.6.33+dfsg-1_amd64.deb ...
.../02-libwxbase3.0-0v5_3.0.2+dfsg-2_amd64.deb
.../03-libjavascriptcoregtk-1.0-0_2.4.11-3_amd64.deb ...
.../04-libwebkitgtk-1.0-0_2.4.11-3_amd64.deb ...
.../05-libwxgtk3.0-0v5_3.0.2+dfsg-2_amd64.deb ...
.../06-libwxgtk-webview3.0-0v5_3.0.2+dfsg-2_amd64.deb ...
.../07-boinc-manager_7.6.33+dfsg-1_amd64.deb ...
../08-boinc_7.6.33+dfsg-1_all.deb ...
.../09-libgeoclue-2-0_2.4.3-1_amd64.deb ...
.../10-geoclue-2.0_2.4.3-1_amd64.deb ...
.../11-iio-sensor-proxy_1.3-1ubuntu2_amd64.deb ...

edit: made a bit more readable.

 

Zalster
Joined: 26 Nov 13
Posts: 3117
Credit: 4050672230
RAC: 0

Pete, 

Don't know if you are still following this thread. One of my buddies said to do the following:

I recommend uninstalling the video card's associated software and any other software with a specialized uninstaller like crapcleaner, then doing a registry cleanup.

Shut down, restart, then reinstall.

Hope this helps

Zalster

JBird
Joined: 22 Dec 14
Posts: 1963
Credit: 4046216051
RAC: 0

OK here's a puzzler

What the heck is the solution for this? WHICH DLL is missing? Please name it and it shall be found!

Exit status -1073741515 (0xC0000135) STATUS_DLL_NOT_FOUND
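
For what it's worth, the negative number and the hex code in that line are the same 32-bit value, shown once as signed and once as unsigned. A small sketch of the conversion follows (to_ntstatus is a made-up helper name). The status code itself does not carry the name of the missing DLL, so that has to come from somewhere else.

```python
def to_ntstatus(exit_status):
    """Reinterpret a signed 32-bit exit status as an unsigned hex code."""
    return "0x{:08X}".format(exit_status & 0xFFFFFFFF)


print(to_ntstatus(-1073741515))  # prints 0xC0000135
```

The same conversion applies to any other negative exit status reported for a Windows task, which makes the code easier to look up.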

 
