Tasks fail on Radeon R7 250X

Hal Bregg
Joined: 21 Oct 18
Posts: 7
Credit: 3730764
RAC: 0
Topic 219377

I installed a Radeon R7 250X on Linux Mint 19 using the official AMD drivers. Unfortunately, all tasks fail immediately.

Here's the error output for one of the tasks:

<core_client_version>7.9.3</core_client_version>
<![CDATA[
<message>
process exited with code 69 (0x45, -187)</message>
<stderr_txt>
10:01:31 (15248): [normal]: This Einstein@home App was built at: Jan 16 2017 08:09:16
10:01:31 (15248): [normal]: Start of BOINC application '../../projects/einstein.phys.uwm.edu/hsgamma_FGRPB1G_1.18_x86_64-pc-linux-gnu__FGRPopencl1K-ati'.
10:01:31 (15248): [debug]: 1e+16 fp, 4.6e+09 fp/s, 2281607 s, 633h46m46s93
command line: ../../projects/einstein.phys.uwm.edu/hsgamma_FGRPB1G_1.18_x86_64-pc-linux-gnu__FGRPopencl1K-ati --inputfile ../../projects/einstein.phys.uwm.edu/LATeah1061L16.dat --alpha 1.41058464281 --delta -0.444366280137 --skyRadius 5.526880e-07 --ldiBins 30 --f0start 292.0 --f0Band 8.0 --firstSkyPoint 0 --numSkyPoints 1 --f1dot -1e-13 --f1dotBand 1e-13 --df1dot 2.512676418e-15 --ephemdir ../../projects/einstein.phys.uwm.edu/JPLEPH --Tcoh 2097152.0 --toplist 10 --cohFollow 10 --numCells 1 --useWeights 1 --Srefinement 1 --CohSkyRef 1 --cohfullskybox 1 --mmfu 0.1 --reftime 56100 --model 0 --f0orbit 0.005 --mismatch 0.1 --demodbinary 1 --BinaryPointFile ../../projects/einstein.phys.uwm.edu/templates_LATeah1061L16_0300_13819463.dat --debug 1 --device 0 -o LATeah1061L16_300.0_0_0.0_13819463_1_0.out
output files: 'LATeah1061L16_300.0_0_0.0_13819463_1_0.out' '../../projects/einstein.phys.uwm.edu/LATeah1061L16_300.0_0_0.0_13819463_1_0' 'LATeah1061L16_300.0_0_0.0_13819463_1_0.out.cohfu' '../../projects/einstein.phys.uwm.edu/LATeah1061L16_300.0_0_0.0_13819463_1_1'
10:01:31 (15248): [debug]: Flags: X64 SSE SSE2 GNUC X86 GNUX86
10:01:31 (15248): [debug]: glibc version/release: 2.27/stable
10:01:31 (15248): [debug]: Set up communication with graphics process.
boinc_get_opencl_ids returned [0x17a8630 , 0x7f6548bbc1b0]
Using OpenCL platform provided by: Advanced Micro Devices, Inc.
Using OpenCL device "Capeverde" by: Advanced Micro Devices, Inc.
Max allocation limit: 1242064281
Global mem size: 1654190080
Couldn't create OpenCL command queue (error: -6)!
OpenCL shutdown complete!
initialize_ocl returned error [2013]
OCL context null
OCL queue null
Error generating generic FFT context object [5]
10:01:31 (15248): [CRITICAL]: ERROR: MAIN() returned with error '5'
FPU status flags:
mv: cannot stat 'LATeah1061L16_300.0_0_0.0_13819463_1_0.out': No such file or directory
mv: cannot stat 'LATeah1061L16_300.0_0_0.0_13819463_1_0.out': No such file or directory
mv: cannot stat 'LATeah1061L16_300.0_0_0.0_13819463_1_0.out': No such file or directory
mv: cannot stat 'LATeah1061L16_300.0_0_0.0_13819463_1_0.out': No such file or directory
mv: cannot stat 'LATeah1061L16_300.0_0_0.0_13819463_1_0.out': No such file or directory
mv: cannot stat 'LATeah1061L16_300.0_0_0.0_13819463_1_0.out': No such file or directory
mv: cannot stat 'LATeah1061L16_300.0_0_0.0_13819463_1_0.out': No such file or directory
mv: cannot stat 'LATeah1061L16_300.0_0_0.0_13819463_1_0.out': No such file or directory
mv: cannot stat 'LATeah1061L16_300.0_0_0.0_13819463_1_0.out': No such file or directory
mv: cannot stat 'LATeah1061L16_300.0_0_0.0_13819463_1_0.out': No such file or directory
mv: cannot stat 'LATeah1061L16_300.0_0_0.0_13819463_1_0.out.cohfu': No such file or directory
mv: cannot stat 'LATeah1061L16_300.0_0_0.0_13819463_1_0.out.cohfu': No such file or directory
mv: cannot stat 'LATeah1061L16_300.0_0_0.0_13819463_1_0.out.cohfu': No such file or directory
mv: cannot stat 'LATeah1061L16_300.0_0_0.0_13819463_1_0.out.cohfu': No such file or directory
mv: cannot stat 'LATeah1061L16_300.0_0_0.0_13819463_1_0.out.cohfu': No such file or directory
mv: cannot stat 'LATeah1061L16_300.0_0_0.0_13819463_1_0.out.cohfu': No such file or directory
mv: cannot stat 'LATeah1061L16_300.0_0_0.0_13819463_1_0.out.cohfu': No such file or directory
10:01:43 (15248): [normal]: done. calling boinc_finish(69).
10:01:43 (15248): called boinc_finish
</stderr_txt>
]]>

 Am I missing something or is this GPU not supported?
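
For reference, the "(error: -6)" in the log above can be looked up against the standard OpenCL error codes, where -6 is CL_OUT_OF_HOST_MEMORY. The following is a minimal Python sketch that pulls such codes out of a task's stderr text. The function name find_opencl_errors and the lookup table are illustrative only and cover just a few codes; the symbolic name by itself does not explain why the command queue could not be created on this card.

```python
import re

# A small subset of the standard OpenCL error codes (as defined in CL/cl.h).
OPENCL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}


def find_opencl_errors(stderr_text):
    """Return (code, name) pairs for every '(error: N)' found in the text.

    Hypothetical helper: it only matches the '(error: N)' pattern seen in
    the log quoted in this thread.
    """
    codes = [int(m) for m in re.findall(r"\(error:\s*(-?\d+)\)", stderr_text)]
    return [(code, OPENCL_ERRORS.get(code, "unknown")) for code in codes]


# The failing line copied from the task output above.
sample = "Couldn't create OpenCL command queue (error: -6)!"
print(find_opencl_errors(sample))
```

Pasting the full stderr text into the sample variable works the same way, since the pattern is matched anywhere in the string.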

Matt White
Joined: 9 Jul 19
Posts: 120
Credit: 280798376
RAC: 0

Greetings Hal,

You might find some useful information in this thread:

https://einsteinathome.org/content/quick-guide-how-install-opencl-amd-gpus-linux-kubuntu-1804-and-similar-distro

I recently installed a Radeon RX560 on my Ubuntu box and found it required more than just loading the standard driver.

Clear skies,
Matt
Hal Bregg
Joined: 21 Oct 18
Posts: 7
Credit: 3730764
RAC: 0

Hi Matt,

The problem I ran into during installation is that when I issue

./amdgpu-pro-install -y --opencl=pal,legacy

the terminal shows some packages being downloaded, and at the end a list of arguments for the apt command is displayed.

Nothing else happens.

Matt White
Joined: 9 Jul 19
Posts: 120
Credit: 280798376
RAC: 0

Hal,

This link is from the AMD site:

https://www.amd.com/en/support/kb/release-notes/amdgpu-installation

I used the information here to get my card running, along with some information from the link I posted earlier. As I recall, I used the command ./amdgpu-pro-install -y --opencl=pal,legacy --headless. I'm not quite sure what is happening after you run the install script. I issued a reboot command once the install completed, and once BOINC launched, GPU jobs ran without error.

In the above link, there is an option to uninstall everything and start over. Perhaps the driver didn't load cleanly. I would probably try that next. There are quite a few people with both Linux and AMD experience who may be able to provide some additional insight if the above information doesn't help.

Clear skies,
Matt
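
One quick sanity check after a reinstall and reboot is whether an OpenCL vendor file got registered. On Linux the OpenCL ICD loader conventionally reads .icd files from /etc/OpenCL/vendors, and the Python sketch below simply lists them. The helper name list_opencl_icds is hypothetical, it assumes that default directory, and an entry being present does not guarantee that tasks will run (the log at the top of this thread shows the AMD platform being found and the task still failing).

```python
from pathlib import Path


def list_opencl_icds(vendors_dir="/etc/OpenCL/vendors"):
    """Map each .icd file name to the library name written inside it.

    Hypothetical helper. Returns an empty dict if the directory does not
    exist, which usually means no OpenCL runtime has been registered.
    """
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return {}
    return {
        icd.name: icd.read_text().strip()
        for icd in sorted(directory.glob("*.icd"))
    }


# Prints {} when nothing is registered in the default location.
print(list_opencl_icds())
```

An empty result would suggest the OpenCL part of the driver package was never installed, which would fit an install script that stops after printing the apt argument list.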
Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5874
Credit: 117973991646
RAC: 22106507

Hal Bregg wrote:

I installed Radeon R7 250X on Linux Mint 19 using official AMD drivers. Unfortunately all tasks fail immediately.

.....

 Am I missing something or is this GPU not supported?

Welcome to the GCN 1st Generation club :-).

The R7 250X (Cape Verde) is an SI (Southern Islands) GPU, along with others like Pitcairn and Tahiti. It's possible, through the use of certain environment variables, to get past the compute errors that you are seeing. However, the real problem, for which there doesn't seem to be a solution (yet), is that the results of successful computation then fail the validation process.

I have a large fleet of SI GPUs crunching on Linux.  They have been doing so for many years.  The only way I've found to get validated results with them is to stay with the long deprecated proprietary fglrx driver.  I've been documenting some of my attempts to get things working properly under amdgpu in this thread.

I haven't tried a Cape Verde variant of GCN gen 1 - only tried HD 7850 and R7 370.  If you want to get your Cape Verde to crunch without comp errors, take a look at the exported variables documented in this particular message.   Using these stops compute errors for me so should do likewise for you.

Maybe (if you're lucky) a Cape Verde might be sufficiently different from a Pitcairn to allow your results to actually validate.  It would be interesting to know :-).

Cheers,
Gary.

Hal Bregg
Joined: 21 Oct 18
Posts: 7
Credit: 3730764
RAC: 0

Hi Gary,

Thanks for your contribution. I will look into your solution more closely, but the prospect of messing with the settings of an otherwise working setup is not very appealing to me. In fact, I am tempted to get an NVIDIA GPU, which seems to be better supported on Linux.
