Errors on R9 280X

ROSSOTRON
Joined: 4 Sep 17
Posts: 3
Credit: 126,628,173
RAC: 0
Topic 210283

Hello,

All my Einstein WUs have been erroring out recently. I am using an XFX R9 280X GPU this time.

https://einsteinathome.org/host/12566591/tasks/6/0?page=1

Thanks for any input.

Richie
Joined: 7 Mar 14
Posts: 490
Credit: 1,562,256,617
RAC: 639,616

Hi!

It looks like on 12 Oct 2017 there were a few download errors of this type:

Exit status: -186 (0xFFFFFF46) ERR_RESULT_DOWNLOAD
<file_xfer_error>
  <file_name>JPLEPH.405</file_name>
  <error_code>-119 (md5 checksum failed for file)</error_code>

You're not alone. The recent MD5 checksum problem has been discussed here:

https://einsteinathome.org/content/md5-check-failed-1

Then on 17 Oct 2017 tasks crashed with "error while computing", but I suspect that might've been caused by the same underlying problem.

Exit status: 68 (0x00000044) Unknown error code
<message>
The name limit for the local computer network adapter card was exceeded. (0x44) - exit code 68 (0x44)
</message>
dpleph[initephem]: Cannot open file ..\..\projects\einstein.phys.uwm.edu\JPLEPH.405, result = 107
Error while initializing ephemeris; status: 104
BarycenterSingleDate failed!
Returned time: 0+0.000000
barycenter_photons() failed!
09:24:02 (2152): [CRITICAL]: ERROR: MAIN() returned with error '4'

That JPLEPH.405 reference points to the file that's been misbehaving lately, if I understood correctly.
ROSSOTRON
Joined: 4 Sep 17
Posts: 3
Credit: 126,628,173
RAC: 0

My other computer is still getting work, so I thought maybe my GPU was obsolete for Einstein. Hopefully it gets fixed soon.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 4,954
Credit: 31,089,532,226
RAC: 28,529,145

ROSSOTRON wrote:
My other computer is still getting work, so I thought maybe my GPU was obsolete for Einstein. Hopefully it gets fixed soon.

It has been happening for a while and hasn't yet been commented on or fixed by the Devs.  If you know how to use a plain text editor like notepad, it's pretty easy to fix for yourself.  It's a 'once off' fix for a single machine.

As Richie notes, the problem is to do with the JPLEPH.405 file which should be in your einstein project folder.  Initially, the problem was that the file itself was quite correct but the reference to it in the state file (client_state.xml in the BOINC data folder) contained a bad MD5 checksum.  Someone pointed out that the checksum was actually correct for a completely empty file.  In the light of what you are now seeing, this is quite important in explaining why waiting for a fix from the project may not immediately fix things for you.
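The empty-file observation is easy to check for yourself. Here is a minimal sketch in Python (my choice of language, nothing in this thread depends on it) using only the standard hashlib module. The variable names are just illustrative.

```python
import hashlib

# MD5 of zero bytes of input, i.e. what a completely empty file hashes to.
EMPTY_MD5 = hashlib.md5(b"").hexdigest()

# The checksum that appears in the bad <md5_cksum> entry quoted in this thread.
BAD_STATE_MD5 = "d41d8cd98f00b204e9800998ecf8427e"

print(EMPTY_MD5)
print("matches the bad state-file entry:", EMPTY_MD5 == BAD_STATE_MD5)
```

If the two values match, the state file was describing an empty JPLEPH.405, which is why a zero-length file can pass the check.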

On the latest failed tasks the error is not about the MD5 checksum.  This implies that your current JPLEPH.405 file is actually passing the MD5 check.  In other words, it must be an empty file or the process wouldn't get as far as trying to read from it - which is now the current error.

To fix the problem, all you need to do is stop BOINC on the 'bad' machine and do two things.

  1. Take a copy of the JPLEPH.405 file from your 'good' machine (say on a USB stick) and use it to overwrite the file on your 'bad' machine.
  2. Open the client_state.xml file in your BOINC data folder on the 'bad' machine and search for the string JPLEPH.405 in a <file>...</file> block. You should find something like this:
    <file>
        <name>JPLEPH.405</name>
        <nbytes>0.000000</nbytes>
        <max_nbytes>0.000000</max_nbytes>
        <md5_cksum>d41d8cd98f00b204e9800998ecf8427e</md5_cksum>
        <status>1</status>
        <sticky/>
        <download_url>http://einstein8.aei.uni-hannover.de/EinsteinAtHome/download/29/JPLEPH.405</download_url>
    </file>
    It's the 32-character string between the <md5_cksum> tags that needs replacing.  Just replace it with d6ce12bacd2a81a56423f5f238ba84eb  (confirm you are replacing 32 characters with exactly 32).

Save the edited state file making sure it is still called client_state.xml without any .txt extension.  Once you restart BOINC on that machine, you should be able to request tasks and when you receive them they should crunch without any error.
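If you would rather not hand-edit, the checksum swap in step 2 can be sketched in Python. This is only an illustration under some assumptions: the function name replace_md5 is made up for this example, it assumes the <name> tag comes directly after <file> as in the block quoted above, and it works on the text of the state file rather than touching the file on disk. Stop BOINC and keep a backup copy of client_state.xml before writing anything back.

```python
import re

def replace_md5(state_text, file_name, new_md5):
    """Return state_text with the <md5_cksum> of the named <file> block replaced."""
    if not re.fullmatch(r"[0-9a-f]{32}", new_md5):
        raise ValueError("expected a 32-character lowercase hex MD5")
    # Match from <file><name>NAME</name> up to the first <md5_cksum> value,
    # without running past the closing </file> of that block.
    pattern = re.compile(
        r"(<file>\s*<name>" + re.escape(file_name) + r"</name>"
        r"(?:(?!</file>).)*?<md5_cksum>)[0-9a-f]{32}(</md5_cksum>)",
        re.DOTALL,
    )
    new_text, count = pattern.subn(
        lambda m: m.group(1) + new_md5 + m.group(2), state_text, count=1
    )
    if count != 1:
        raise LookupError("no <file> block with an md5_cksum found for " + file_name)
    return new_text

# The block quoted in this thread, used here as sample input.
SAMPLE = """    <file>
        <name>JPLEPH.405</name>
        <nbytes>0.000000</nbytes>
        <max_nbytes>0.000000</max_nbytes>
        <md5_cksum>d41d8cd98f00b204e9800998ecf8427e</md5_cksum>
        <status>1</status>
        <sticky/>
    </file>
"""

fixed = replace_md5(SAMPLE, "JPLEPH.405", "d6ce12bacd2a81a56423f5f238ba84eb")
print(fixed)
```

The length check at the top is the scripted version of "replace 32 chars with 32 exactly". To apply it for real you would read client_state.xml into a string, pass it through replace_md5, and save the result under the same file name.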

Cheers,
Gary.

ROSSOTRON
Joined: 4 Sep 17
Posts: 3
Credit: 126,628,173
RAC: 0

Thank you for this information, Gary. I will do this when I get home tonight.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 4,954
Credit: 31,089,532,226
RAC: 28,529,145

You're welcome!  Just be aware there has been a response from one of the Devs so by the time you examine your state file the MD5 checksum might have changed.  You should still check if you have a zero length JPLEPH.405 file and replace it with a good copy if you have, rather than have more tasks error out :-).
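A quick way to check for a zero-length file is to look at its size in a file manager, but it can also be scripted. A rough Python sketch, where check_ephemeris is a made-up helper name and the path and expected checksum are whatever applies on your machine (the expected value is optional, since the checksum in the state file may have changed):

```python
import hashlib
import os

def check_ephemeris(path, expected_md5=None):
    """Return (size_in_bytes, md5_hex, ok) for the file at path.

    ok is False for a zero-length file, or when expected_md5 is given
    and does not match the file's actual MD5.
    """
    size = os.path.getsize(path)
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so a large file is not loaded all at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    md5_hex = digest.hexdigest()
    ok = size > 0 and (expected_md5 is None or md5_hex == expected_md5)
    return size, md5_hex, ok
```

Call it with the full path to JPLEPH.405 in your einstein project folder. If the size comes back as 0, replace the file with a good copy before requesting more tasks.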

Cheers,
Gary.
