Gravitational Wave S6 Directed Search (CasA) - Instruction 'ld' requires SM 1.3 or higher, or map_f64_to_f32 directive

Jacob Klein
Joined: 22 Jun 11
Posts: 45
Credit: 114028547
RAC: 0
Topic 197541

For me, all "Gravitational Wave S6 Directed Search (CasA)" tasks currently fail on my GTS 240. It sits in my triple-GPU (GTX 660 Ti, GTX 460, GTS 240) Windows 8.1 x64 Update 1 PC, which uses the latest x64 beta drivers, 337.61.

The error says:
2014-04-18 19:04:13.6448 (8668) [normal]: Using OpenCL platform provided by: NVIDIA Corporation
2014-04-18 19:04:13.6584 (8668) [normal]: Using OpenCL device "GeForce GTS 240" by: NVIDIA Corporation
2014-04-18 19:04:14.1457 (8668) [CRITICAL]: OpenCL compiling FAILED! : -42 . Error message: ptxas application ptx input, line 50; : error : Instruction 'ld' requires SM 1.3 or higher, or map_f64_to_f32 directive

Does this mean that I cannot do this task type on my GTS 240? If so... hmmm. I guess I'm being offered the tasks because I actually have 3 GPUs in the system (GTX 660 Ti, GTX 460, and GTS 240), though I basically have it set up such that Einstein only runs on the GTS 240.

So, for a triple-GPU system such as mine, where I generally only run Einstein on my GTS 240... is the best option for me to edit my project settings to disable that app?

Or (pressing my luck here)... is there any way to write the app such that it'd work on a GTS 240? :) The "or map_f64_to_f32 directive" part of the error message hints that it might be possible, maybe?

Let me know,
Thanks,
Jacob

Full error:
7.3.15

Incorrect function.
(0x1) - exit code 1 (0x1)

2014-04-18 19:04:13.1458 (8668) [normal]: This program is published under the GNU General Public License, version 2
2014-04-18 19:04:13.1477 (8668) [normal]: For details see http://einstein.phys.uwm.edu/license.php
2014-04-18 19:04:13.1497 (8668) [normal]: This Einstein@home App was built at: Apr 16 2014 14:38:02

2014-04-18 19:04:13.1507 (8668) [normal]: Start of BOINC application 'projects/einstein.phys.uwm.edu/einstein_S6CasA_1.08_windows_x86_64__GWopencl-nvidia-Beta.exe'.
Activated exception handling...
command line: projects/einstein.phys.uwm.edu/einstein_S6CasA_1.08_windows_x86_64__GWopencl-nvidia-Beta.exe --skyRegion=(6.1237713,1.0264572) --refTime=960541454.5 --Freq=991.45 --FreqBand=0.05 --dFreq=5.3519e-07 --f1dot=-2.95236798883e-09 --f1dotBand=7.76938944429e-11 --df1dot=8.2281e-12 --gammaRefine=90 --f2dot=9.664e-19 --f2dotBand=2.2125385378e-17 --df2dot=1.9328e-18 --gamma2Refine=60 --computeLV --LVuseAllTerms=0 --LVrho=2.7564e+17 --LVlX=0.000165865,0.000165865 --nCand1=3000 --SortToplist=3 --recalcToplistStats=1 -o ../../projects/einstein.phys.uwm.edu/h1_0991.25_S6Directed__S6CasAf40a_991.45Hz_1311_0_0 --printCand1 --semiCohToplist --ephemE=../../projects/einstein.phys.uwm.edu/earth_09_11 --ephemS=../../projects/einstein.phys.uwm.edu/sun_09_11 --segmentList=../../projects/einstein.phys.uwm.edu/seglist-CasAf40.dat --Dterms=8 --DataFiles1=..\..\projects\einstein.phys.uwm.edu\h1_0991.25_S6Directed;..\..\projects\einstein.phys.uwm.edu\l1_0991.25_S6Directed;..\..\projects\einstein.phys.uwm.edu\h1_0991.30_S6Directed;..\..\projects\einstein.phys.uwm.edu\l1_0991.30_S6Directed;..\..\projects\einstein.phys.uwm.edu\h1_0991.35_S6Directed;..\..\projects\einstein.phys.uwm.edu\l1_0991.35_S6Directed;..\..\projects\einstein.phys.uwm.edu\h1_0991.40_S6Directed;..\..\projects\einstein.phys.uwm.edu\l1_0991.40_S6Directed;..\..\projects\einstein.phys.uwm.edu\h1_0991.45_S6Directed;..\..\projects\einstein.phys.uwm.edu\l1_0991.45_S6Directed;..\..\projects\einstein.phys.uwm.edu\h1_0991.50_S6Directed;..\..\projects\einstein.phys.uwm.edu\l1_0991.50_S6Directed;..\..\projects\einstein.phys.uwm.edu\h1_0991.55_S6Directed;..\..\projects\einstein.phys.uwm.edu\l1_0991.55_S6Directed;..\..\projects\einstein.phys.uwm.edu\h1_0991.60_S6Directed;..\..\projects\einstein.phys.uwm.edu\l1_0991.60_S6Directed;..\..\projects\einstein.phys.uwm.edu\h1_0991.65_S6Directed;..\..\projects\einstein.phys.uwm.edu\l1_0991.65_S6Directed;..\..\projects\einstein.phys.uwm.edu\h1_0991.70_S6Directed;..\..\projects\einstein.
phys.uwm.edu\l1_0991.70_S6Directed --device 2
2014-04-18 19:04:13.6350 (8668) [debug]: Flags: LAL_NDEBUG, OPTIMIZE, HS_OPTIMIZATION, X64, SSE, SSE2, GNUC X86 GNUX86
2014-04-18 19:04:13.6360 (8668) [debug]: Set up communication with graphics process.
Code-version: %% LAL: 6.10.0.1 (CLEAN 14312d5a9fafa5b46fc6ccc57a08bdfab14361f1)
%% LALApps: 6.12.0.1 (CLEAN 14312d5a9fafa5b46fc6ccc57a08bdfab14361f1)

2014-04-18 19:04:13.6448 (8668) [normal]: Using OpenCL platform provided by: NVIDIA Corporation
2014-04-18 19:04:13.6584 (8668) [normal]: Using OpenCL device "GeForce GTS 240" by: NVIDIA Corporation
2014-04-18 19:04:14.1457 (8668) [CRITICAL]: OpenCL compiling FAILED! : -42 . Error message: ptxas application ptx input, line 50; : error : Instruction 'ld' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 51; : error : Instruction 'ld' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 52; : error : Instruction 'mul' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 53; : error : Instruction 'ld' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 54; : error : Instruction 'ld' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 55; : error : Instruction 'mul' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 56; : error : Instruction 'add' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 57; : error : Instruction 'ld' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 58; : error : Instruction 'ld' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 59; : error : Instruction 'mul' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 60; : error : Instruction 'add' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 61; : error : Instruction 'st' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 62; : error : Instruction 'ld' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 63; : error : Instruction 'mul' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 64; : error : Instruction 'ld' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 65; : error : Instruction 'mul' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 66; : error : Instruction 'add' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 67; : error : Instruction 'ld' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 68; : error : Instruction 'mul' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 69; : error : Instruction 'add' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 70; : error : Instruction 'st' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 71; : error : Instruction 'add' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 72; : error : Instruction 'st' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 73; : error : Instruction 'ld' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 74; : error : Instruction 'mul' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 75; : error : Instruction 'rcp' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 76; : error : Instruction 'st' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 77; : error : Instruction 'ld' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 79; : error : Instruction 'cvt' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 80; : error : Instruction 'ld' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 81; : error : Instruction 'mul' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 82; : error : Instruction 'ld' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 83; : error : Instruction 'add' requires SM 1.3 or higher, or map_f64_to_f32 directive
ptxas application ptx input, line 85; : error : Instruction 'cvt' requires SM 1.3 or higher, or map_f64_to_f32 directiv
2014-04-18 19:04:14.1828 (8668) [CRITICAL]: Failed to initialize OpenCL
initOpenCL() failed (1).
2014-04-18 19:04:14.1838 (8668) [CRITICAL]: ERROR: MAIN() returned with error '1'
FPU status flags:
2014-04-18 19:04:14.1848 (8668) [normal]: done. calling boinc_finish(1).
19:04:14 (8668): called boinc_finish

Tom*
Joined: 9 Oct 11
Posts: 54
Credit: 366729484
RAC: 0

Gravitational Wave S6 Directed Search (CasA) - Instruction 'ld'

In the announcement in NEWS they said

Quote:
you should have a card which supports double precision FP in hardware.

The GTS 240 only has compute capability 1.2; it needs 1.3 to be able to do
double precision.

Sorry

Jacob Klein
Joined: 22 Jun 11
Posts: 45
Credit: 114028547
RAC: 0

So... if I want to keep Einstein as a backup project (for my 2 beefier GPUs), while also keeping it as a main project (for my GTS 240), then I should set the Resource Share to 0 (obviously), but... if I don't want to "chew through and error" those task types on my GTS 240, I should now also change my project settings to not do work for that app, right?

Claggy
Joined: 29 Dec 06
Posts: 560
Credit: 2699403
RAC: 0

Quote:
So... if I want to keep Einstein as a backup project (for my 2 beefier GPUs), while also keeping it as a main project (for my GTS 240), then I should set the Resource Share to 0 (obviously), but... if I don't want to "chew through and error" those task types on my GTS 240, I should now also change my project settings to not do work for that app, right?


No, all you need to do is set an <exclude_gpu> for that GPU, for that appname, in your cc_config.xml:

Client configuration
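
For reference, a minimal sketch of what such an exclusion might look like. It assumes the GTS 240 is BOINC device 2 (as the `--device 2` in the log above suggests) and that the app name is einstein_S6CasA; both are assumptions to check against your own client's startup messages and task list, and the `<type>` line is optional:

```xml
<cc_config>
   <options>
      <!-- Keep the CasA app off device 2 (assumed here to be the GTS 240) -->
      <exclude_gpu>
         <url>http://einstein.phys.uwm.edu/</url>
         <device_num>2</device_num>
         <type>NVIDIA</type>
         <app>einstein_S6CasA</app>
      </exclude_gpu>
   </options>
</cc_config>
```

A client restart may be needed before a changed exclusion takes effect.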

Since I believe you were the one that convinced DA to introduce this feature, you'd know that already, ;-)

Claggy

Jacob Klein
Joined: 22 Jun 11
Posts: 45
Credit: 114028547
RAC: 0

Yes, I'm fully aware of <exclude_gpu> :)

But... If I used that here, wouldn't that mean that the server could still send me that app, and then when I'd get that app, I'd be forced to run it on one of the beefier GPUs that I don't want to run Einstein on? I only want Einstein to run on one of the beefier GPUs when no other work can run on them. That's my dilemma.

MAGIC Quantum Mechanic
Joined: 18 Jan 05
Posts: 1887
Credit: 1411451176
RAC: 1184046

Well Jacob, I have never had a 240, but I know your 660 Ti works best here running the Binary Radio Pulsar Search (Perseus Arm Survey) tasks.

The one I have runs those tasks x4 with a RAC of about 40,000 on a quad core.

Your 8-core could run more, or just run a few other tasks if you want.

My only 8-core is my laptop, so I just run those GPU tasks x2 plus a T4T CPU task from that project, and once in a while I run 6 LHC tasks just to use all but one core. It works fine, but I suspend the other CPU tasks myself instead of running all 8. I have tested running all 8 in the past, and it mainly just slowed down one of those GPU tasks while the CPU tasks stayed the same (and it did the same with previous Einstein CPU tasks).

I have several other quad- and triple-core hosts with 550 Ti's and 650 Ti's that I have set up to run only the Binary Radio Pulsar Search (Perseus Arm Survey) tasks and those T4T tasks from the other project, so that the BRP-PAS tasks run as fast as they can with these cards and CPUs.

Been running my 8-core like that 24/7 for 2 years so far.

Jacob Klein
Joined: 22 Jun 11
Posts: 45
Credit: 114028547
RAC: 0

Thanks MAGIC. I know the GTX 660 Ti is a great card for lots of projects, and I currently want it and the GTX 460 to work on: WCG (when GPU tasks available), POEM (when GPU tasks available), GPUGrid (GPU tasks are normally available), and then Einstein/SETI/Albert/SETIBETA as a last resort (when no tasks are available from other projects).

So... :) I think my system is very unique, in that I've got 3 generations of GPUs, and I'm trying to delicately manage work loads to them. The GTS 240, for instance, cannot do WCG, GPUGrid, nor POEM. So... I have configuration in place (to prevent those tasks from erroring out on the GTS 240), and then I keep it busy doing the Einstein/SETI/Albert/SETIBETA combo.

But this new CasA OpenCL app, apparently won't work on a GTS 240. Yet the server still gives me those tasks, because the server doesn't even know that my GTS 240 exists! It happens to think I have 3 GTX 660 Ti's :)

Again, my system is unique. I guess I'll get it figured out. It's just a shame that the tasks from this app cannot be made to work on my GTS 240.

Right now, I think my options are to either exclude the app in my project settings (which I don't want, just in case I have another computer in the future that can do that app), or set up an <exclude_gpu> to exclude that app from my GTS 240 (such that, if the server does send me that task type, it'll end up running on one of the 2 beefier GPUs, even though I would have preferred not to have been given that task type).

David Anderson said that, some day, he might implement "each GPU as a resource type" to allow the server to be more aware of the GPUs we have, and assign out tasks accordingly. In the meantime.... I'll have to find workarounds to deploy.

mikey
Joined: 22 Jan 05
Posts: 12692
Credit: 1839096974
RAC: 3689

Quote:

But this new CasA OpenCL app, apparently won't work on a GTS 240. Yet the server still gives me those tasks, because the server doesn't even know that my GTS 240 exists! It happens to think I have 3 GTX 660 Ti's :)

If you look at most people's PCs, you will see that happening when they have multiple GPUs. It says (?) for the number of GPUs and then a description of one of them (I am guessing the best), implying that all of them are the same. I'm guessing that most of us don't have the resources to go out and buy 3 of the same kind of GPU for each of our PCs. So I'm thinking BOINC is just doing a brief check and skipping all the details.

Quote:
David Anderson said that, some day, he might implement "each GPU as a resource type" to allow the server to be more aware of the GPUs we have, and assign out tasks accordingly.

That would be nice to see. The Boinc software itself sees all the gpu's individually during the start-up process, I can see them being listed as gpu 0, gpu 1 etc. So the Boinc software on my end knows what I have now, the Server side just seems to be seeing the total number of gpu's in a machine then assuming that they are all the same as the best one.

Jacob Klein
Joined: 22 Jun 11
Posts: 45
Credit: 114028547
RAC: 0

Quote:
the Server side just seems to be seeing the total number of gpu's in a machine then assuming that they are all the same as the best one.

Yup, you got it, that's what it has always done, and it's always been a problem for people with heterogeneous GPUs :) 3 generations here, and they all do work. It's quite difficult to maintain (I have like 12 exclusions, and 7 app_config.xml files lol)

mikey
Joined: 22 Jan 05
Posts: 12692
Credit: 1839096974
RAC: 3689

Quote:
Quote:
the Server side just seems to be seeing the total number of gpu's in a machine then assuming that they are all the same as the best one.

Yup, you got it, that's what it has always done, and it's always been a problem for people with heterogeneous GPUs :) 3 generations here, and they all do work. It's quite difficult to maintain (I have like 12 exclusions, and 7 app_config.xml files lol)

I am hoping Boinc 8, or some other future version, has the ability to pass the info on so we users can better tweak our systems. In the beginning, as you are well aware, gpu's weren't even included. Thru small steps we are where we are today, more steps are needed to be sure, hopefully they are 'in the works' as more and more people join the ranks of crunching and want to make it work for them.

Jacob Klein
Joined: 22 Jun 11
Posts: 45
Credit: 114028547
RAC: 0

Just to prove how "fun" it has been... I'm going to post all of my .xml files here. I'm not requesting advice, I guess, but really am just "sharing" the amount of tweaking I've had to use to get things to work how I want...

I have 3 GPUs. The 2 beefy ones (GTX 660 Ti and GTX 460) work on WCG/POEM/GPUGrid, with POEM restricted to 1 GPU since its tasks fail on the other. The tiny GTS 240 works on Einstein/SETI/Albert/SETIBETA, which I arrange by setting those projects to 0 resource share. All the while, custom cpu_usage values keep the CPUs fully busy. The configuration here even keeps the GPUs busy in scenarios like MindModeling releasing CPU tasks with ridiculous due dates, or MilkyWay's multithreaded (mt) tasks going into high priority. For the latter I had set their CPU usage to 6 instead of 8, but I have commented that out since a BOINC client change made it no longer necessary.

Anyway, fun indeed. Maybe this post will help someone, or give them ideas. I have tons of comments within the XML to describe my logic everywhere, mainly so I remember it when I make changes later :)

--------------------------------------------------------------------------------
cc_config.xml
--------------------------------------------------------------------------------
[pre]

1
0

1
0

1
0

1

0
0
0

0
0
0
0

0
0

0
0
0

1
0

0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0



20
12
NotepadTest01.exe
NotepadTest02.exe
-->


0
0
4000

10
4

iRacingSim.exe
iRacingSim64.exe
Aces.exe
TmForever.exe
TmForeverLauncher.exe


1


0
-->


http://www.worldcommunitygrid.org
0
hcc1

-->


http://einstein.phys.uwm.edu/
0


http://albert.phys.uwm.edu/
0


http://setiathome.berkeley.edu/
0


http://setiweb.ssl.berkeley.edu/beta/
0


http://milkyway.cs.rpi.edu/milkyway/
0

-->


1
-->






http://boinc.fzk.de/poem/
1
poemcl


http://www.gpugrid.net
1

-->


http://einstein.phys.uwm.edu/
1


http://albert.phys.uwm.edu/
1


http://setiathome.berkeley.edu/
1


http://setiweb.ssl.berkeley.edu/beta/
1


http://milkyway.cs.rpi.edu/milkyway/
1

-->


2
-->




http://www.worldcommunitygrid.org
2
hcc1




http://boinc.fzk.de/poem/
2
poemcl




http://www.gpugrid.net/
2




http://milkyway.cs.rpi.edu/milkyway/
2



http://einstein.phys.uwm.edu/
2
einstein_S6CasA


http://albert.phys.uwm.edu/
2
einstein_S6CasA

-->

[/pre]

--------------------------------------------------------------------------------
Albert@Home app_config.xml
--------------------------------------------------------------------------------
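[pre]
<app_config>
   <app>
      <name>einsteinbinary_BRP4</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>0.3</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>einsteinbinary_BRP4G</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>0.3</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>einsteinbinary_BRP5</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>0.3</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>einstein_S6CasA</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>1</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>hsgamma_FGRP2</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>1</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>hsgamma_FGRP3</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>1</cpu_usage>
      </gpu_versions>
   </app>
</app_config>
[/pre]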
--------------------------------------------------------------------------------
POEM@Home app_config.xml
--------------------------------------------------------------------------------
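[pre]
<app_config>
   <app>
      <name>einsteinbinary_BRP4</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>0.3</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>einsteinbinary_BRP4G</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>0.3</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>einsteinbinary_BRP5</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>0.3</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>einstein_S6CasA</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>1</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>hsgamma_FGRP2</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>1</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>hsgamma_FGRP3</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>1</cpu_usage>
      </gpu_versions>
   </app>
</app_config>
[/pre]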
--------------------------------------------------------------------------------
MilkyWay@Home app_config.xml
--------------------------------------------------------------------------------
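[pre]
<app_config>
   <app>
      <name>milkyway</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>0.5</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>milkyway_separation__modified_fit</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>0.5</cpu_usage>
      </gpu_versions>
   </app>
   <!--
   <app_version>
      <app_name>milkyway_nbody</app_name>
      <plan_class>mt</plan_class>
      <avg_ncpus>6</avg_ncpus>
   </app_version>
   -->
</app_config>
[/pre]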
--------------------------------------------------------------------------------
MindModeling@Home app_config.xml
--------------------------------------------------------------------------------
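[pre]
<app_config>
   <app>
      <name>python2.7_wrap</name>
      <max_concurrent>2</max_concurrent>
   </app>
   <app>
      <name>python2.7_wrap_winOnly</name>
      <max_concurrent>2</max_concurrent>
   </app>
   <app>
      <name>pypy1.9_wrap</name>
      <max_concurrent>2</max_concurrent>
   </app>
   <app>
      <name>R2.15.1_wrap</name>
      <max_concurrent>2</max_concurrent>
   </app>
</app_config>
[/pre]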
--------------------------------------------------------------------------------
SETI@Home app_config.xml
--------------------------------------------------------------------------------
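[pre]
<app_config>
   <app>
      <name>setiathome_enhanced</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>0.3</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>setiathome_v7</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>0.3</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>astropulse_v6</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>1</cpu_usage>
      </gpu_versions>
   </app>
</app_config>
[/pre]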
--------------------------------------------------------------------------------
SETI BETA app_config.xml
--------------------------------------------------------------------------------
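[pre]
<app_config>
   <app>
      <name>setiathome_enhanced</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>0.3</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>setiathome_v7</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>0.3</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>astropulse_v6</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>1</cpu_usage>
      </gpu_versions>
   </app>
</app_config>
[/pre]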
--------------------------------------------------------------------------------
GPUGrid.net app_config.xml
--------------------------------------------------------------------------------
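[pre]
<app_config>
   <app>
      <name>acemdshort</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>1</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>acemdlong</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>1</cpu_usage>
      </gpu_versions>
   </app>
   <app>
      <name>acemdbeta</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>1</gpu_usage>
         <cpu_usage>1</cpu_usage>
      </gpu_versions>
   </app>
</app_config>
[/pre]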
--------------------------------------------------------------------------------
World Community Grid app_config.xml
--------------------------------------------------------------------------------
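[pre]
<app_config>
   <app>
      <name>hcc1</name>
      <max_concurrent>0</max_concurrent>
      <gpu_versions>
         <gpu_usage>0.5</gpu_usage>
         <cpu_usage>1</cpu_usage>
      </gpu_versions>
   </app>
</app_config>
[/pre]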
