Asus R9280x Computation Error Win 10

iwajabitw
Joined: 27 Feb 16
Posts: 8
Credit: 214989406
RAC: 0
Topic 213763

I started seeing errors on one system today. I looked at the WU files and they all state this:

% Binary point 443/1255
% Starting semicoherent search over f0 and f1.
% nf1dots: 31  df1dot: 3.344368011e-015  f1dot_start: -1e-013  f1dot_band: 1e-013
% Filling array of photon pairs
Error in computing index of fft input array, i:1057382904 pair:4258498
ERROR: prepare_ts_2_phase_diff_sorted() returned with error 18934888
19:06:58 (11200): [CRITICAL]: ERROR: MAIN() returned with error '1'
FPU status flags:  PRECISION
19:07:09 (11200): [normal]: done. calling boinc_finish(65).
19:07:09 (11200): called boinc_finish

The system https://einsteinathome.org/host/12505524 also has an MSI R9 280x in it, which does not error out. It's also not every task, but I would say 50% or better, as I watch (device 0) in the queue to make sure. Temp is never over 70C and 1 task doesn't max the GPU out, maybe 77%. I tested with Enigma and Milkyway to see if it was a card issue, but they ran tasks just fine. I updated drivers, removed and reinstalled the Einstein project in BOINC, and it's still happening. Any help is appreciated.

 

Log file

3/3/2018 7:21:30 PM | Einstein@Home | Output file LATeah0056L_804.0_0_0.0_4136480_0_0 for task LATeah0056L_804.0_0_0.0_4136480_0 absent

3/3/2018 7:21:30 PM | Einstein@Home | Output file LATeah0056L_804.0_0_0.0_4136480_0_1 for task LATeah0056L_804.0_0_0.0_4136480_0 absent

3/3/2018 7:21:30 PM | Einstein@Home | Starting task LATeah0056L_804.0_0_0.0_4235625_0

3/3/2018 7:22:20 PM | Einstein@Home | Computation for task LATeah0056L_804.0_0_0.0_4235625_0 finished

3/3/2018 7:22:20 PM | Einstein@Home | Output file LATeah0056L_804.0_0_0.0_4235625_0_0 for task LATeah0056L_804.0_0_0.0_4235625_0 absent

3/3/2018 7:22:20 PM | Einstein@Home | Output file LATeah0056L_804.0_0_0.0_4235625_0_1 for task LATeah0056L_804.0_0_0.0_4235625_0 absent

3/3/2018 7:22:20 PM | Einstein@Home | Starting task LATeah0056L_804.0_0_0.0_4243155_1

3/3/2018 7:23:06 PM | Einstein@Home | Computation for task LATeah0056L_804.0_0_0.0_4243155_1 finished

3/3/2018 7:23:06 PM | Einstein@Home | Output file LATeah0056L_804.0_0_0.0_4243155_1_0 for task LATeah0056L_804.0_0_0.0_4243155_1 absent

3/3/2018 7:23:06 PM | Einstein@Home | Output file LATeah0056L_804.0_0_0.0_4243155_1_1 for task LATeah0056L_804.0_0_0.0_4243155_1 absent

3/3/2018 7:23:06 PM | Einstein@Home | Starting task LATeah0056L_804.0_0_0.0_4248175_1

3/3/2018 7:23:43 PM | Einstein@Home | Computation for task LATeah0056L_804.0_0_0.0_4248175_1 finished

3/3/2018 7:23:43 PM | Einstein@Home | Output file LATeah0056L_804.0_0_0.0_4248175_1_0 for task LATeah0056L_804.0_0_0.0_4248175_1 absent

3/3/2018 7:23:43 PM | Einstein@Home | Output file LATeah0056L_804.0_0_0.0_4248175_1_1 for task LATeah0056L_804.0_0_0.0_4248175_1 absent

3/3/2018 7:23:43 PM | Einstein@Home | Starting task LATeah0056L_804.0_0_0.0_4278295_1

3/3/2018 7:24:31 PM | Einstein@Home | Computation for task LATeah0056L_804.0_0_0.0_4278295_1 finished

3/3/2018 7:24:31 PM | Einstein@Home | Output file LATeah0056L_804.0_0_0.0_4278295_1_0 for task LATeah0056L_804.0_0_0.0_4278295_1 absent

3/3/2018 7:24:31 PM | Einstein@Home | Output file LATeah0056L_804.0_0_0.0_4278295_1_1 for task LATeah0056L_804.0_0_0.0_4278295_1 absent

3/3/2018 7:24:31 PM | Einstein@Home | Starting task LATeah0056L_804.0_0_0.0_4241900_1

3/3/2018 7:27:09 PM | Einstein@Home | Computation for task LATeah0056L_804.0_0_0.0_4241900_1 finished

3/3/2018 7:27:09 PM | Einstein@Home | Output file LATeah0056L_804.0_0_0.0_4241900_1_0 for task LATeah0056L_804.0_0_0.0_4241900_1 absent

3/3/2018 7:27:09 PM | Einstein@Home | Output file LATeah0056L_804.0_0_0.0_4241900_1_1 for task LATeah0056L_804.0_0_0.0_4241900_1 absent
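For anyone wanting to count failures like these without reading the log by eye, here is a minimal Python sketch. It assumes the event log has been saved as text and that failed tasks appear with the "Output file ... for task ... absent" wording shown above. The function name tasks_with_absent_output and the sample string are made up for illustration and are not part of BOINC.

```python
import re
from collections import defaultdict

# Assumed message format, copied from the log excerpt in this thread:
# "... | Einstein@Home | Output file <file> for task <task> absent"
ABSENT_RE = re.compile(r"\| Output file (\S+) for task (\S+) absent")


def tasks_with_absent_output(log_text):
    """Map each task name to the output files the client reported absent."""
    missing = defaultdict(list)
    # finditer scans the whole text, so it also copes with several
    # log entries run together on one line.
    for match in ABSENT_RE.finditer(log_text):
        output_file, task = match.groups()
        missing[task].append(output_file)
    return dict(missing)


# Three entries taken from the log above, used as sample input.
sample = (
    "3/3/2018 7:21:30 PM | Einstein@Home | Output file "
    "LATeah0056L_804.0_0_0.0_4136480_0_0 for task "
    "LATeah0056L_804.0_0_0.0_4136480_0 absent\n"
    "3/3/2018 7:21:30 PM | Einstein@Home | Output file "
    "LATeah0056L_804.0_0_0.0_4136480_0_1 for task "
    "LATeah0056L_804.0_0_0.0_4136480_0 absent\n"
    "3/3/2018 7:22:20 PM | Einstein@Home | Output file "
    "LATeah0056L_804.0_0_0.0_4235625_0_0 for task "
    "LATeah0056L_804.0_0_0.0_4235625_0 absent\n"
)

for task, files in tasks_with_absent_output(sample).items():
    print(task, "missing", len(files), "output file(s)")
```

Tasks that finish normally produce no "absent" line, so they never show up in the result.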

 

mmonnin
Joined: 29 May 16
Posts: 291
Credit: 3416686540
RAC: 3567362

Odd. What happens if the cards are switched?

iwajabitw
Joined: 27 Feb 16
Posts: 8
Credit: 214989406
RAC: 0

Box runs headless so I have not swapped them yet.  Trying some old drivers out right now.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117719215671
RAC: 35006209

iwajabitw wrote:
I started seeing errors on one system today.

Your tasks list shows 5 errors from around 8 Feb with nothing more (good or bad) until 1 Mar. Were tasks failing previously, or have you just restarted crunching Einstein after a break? When I looked, there were 202 listed as 'errors' and (at a short glance) many of those were aborted. There were also 30 invalid results, mainly validate errors. A validate error means that the result was so wrong that it didn't even make it to the point where it was compared with a result from another host. Validate errors usually point to hardware operating outside its comfort zone.

Two Tahiti series GPUs use a lot of power and produce a lot of heat.  Are you confident your PSU is up to the job?  How old is it and what's its brand and rating?

To eliminate lack of power as a problem, try removing, say, the MSI card and see if the Asus card on its own suddenly starts producing all good results. If that happens, maybe there is insufficient power for two cards. If there are still errors, try the MSI card on its own. If you get errors with the Asus card and not with the MSI, it really points to a problem with the Asus card. I think it's unlikely to be a driver problem if one card always works properly and the other doesn't. It would be interesting to see if the validate errors stop with only one card present.
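The elimination steps above can be written out as a small decision function. This is only a paraphrase of the reasoning in this post, not a diagnostic tool. The name diagnose and its return strings are invented for illustration, and the case where both cards fail on their own is not covered in the post, so it is marked inconclusive.

```python
from typing import Optional


def diagnose(asus_alone_ok: bool, msi_alone_ok: Optional[bool] = None) -> str:
    """Return the conclusion suggested by single-card test results.

    asus_alone_ok: True if the Asus card, run alone, gives all good results.
    msi_alone_ok:  result of the same test for the MSI card, or None if
                   that test has not been run yet.
    """
    if asus_alone_ok:
        # The Asus card is fine alone, so the pair may be drawing too much.
        return "maybe insufficient power for two cards"
    if msi_alone_ok is None:
        return "next step: try the MSI card on its own"
    if msi_alone_ok:
        return "points to a problem with the Asus card"
    # Assumption: this outcome is not discussed in the post above.
    return "inconclusive: both cards fail on their own"


print(diagnose(asus_alone_ok=False, msi_alone_ok=True))
```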

 

Cheers,
Gary.

iwajabitw
Joined: 27 Feb 16
Posts: 8
Credit: 214989406
RAC: 0

UPDATE: Rolling back to the Crimson 16.10 drivers I had in my download folder and swapping the cards around late last night seems to have stopped the issue. Thanks MMonnin and Gary! The PSU is an 850W Gold modular unit, and the system's been up about a year through the Pentathlon, Formula-BOINC races, and a 2 month Folding Challenge for Dec-Jan. I let them rest most of Feb as it's been very warm outside. Most aborted tasks were from trying to find the solution.
