Error Computing

Silesius
Joined: 10 Mar 12
Posts: 8
Credit: 2,139,974
RAC: 0
Topic 226682

I installed another GPU in my box today and subsequently had 31 Tasks error out.

I have enabled multiple GPUs in the BOINC config file and I see both GPUs being used when running Amicable Numbers.
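For readers following along: the post does not say which setting was changed, but the usual BOINC option for this is `<use_all_gpus>` in `cc_config.xml`. A minimal sketch that generates such a file is below. The helper name `build_cc_config` is hypothetical, and the assumption that `<use_all_gpus>` is the setting meant here is mine, not the poster's.

```python
import xml.etree.ElementTree as ET


def build_cc_config(use_all_gpus: bool = True) -> str:
    """Return the text of a minimal cc_config.xml (hypothetical helper)."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    # <use_all_gpus>1</use_all_gpus> asks the client to use every GPU it
    # detects, rather than only the most capable one.
    ET.SubElement(options, "use_all_gpus").text = "1" if use_all_gpus else "0"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_cc_config())
```

The printed text would be saved as `cc_config.xml` in the BOINC data directory, after which the client needs to re-read its config files or be restarted.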

Not sure if the GPU I added is incompatible with the Einstein Tasks or there is something else I didn't do in BOINC when adding another GPU.

Log for one of the Tasks that errored out:

<core_client_version>7.16.20</core_client_version>
<![CDATA[
<message>
Incorrect function.
 (0x1) - exit code 1 (0x1)</message>
<stderr_txt>
putenv 'LAL_DEBUG_LEVEL=3'
2021-12-31 14:03:12.8354 (10716) [normal]: This program is published under the GNU General Public License, version 2
2021-12-31 14:03:12.8383 (10716) [normal]: For details see http://einstein.phys.uwm.edu/license.php
2021-12-31 14:03:12.8407 (10716) [normal]: This Einstein@home App was built at: Aug  5 2021 15:20:43

2021-12-31 14:03:12.8432 (10716) [normal]: Start of BOINC application 'projects/einsteinathome.org/einstein_O3AS_1.01_windows_x86_64__GW-opencl-nvidia.exe'.
Activated exception handling...
[DEBUG} GPU type: 1
[DEBUG} got GPU info from BOINC
[DEBUG} got VendorID 4318
2021-12-31 14:03:12.9428 (10716) [debug]: Flags: LAL_DEBUG, OPTIMIZE, HS_OPTIMIZATION, GC_SSE2_OPT, X64, SSE, SSE2, GNUC X86 GNUX86
2021-12-31 14:03:12.9467 (10716) [debug]: Set up communication with graphics process.
2021-12-31 14:03:12.9574 (10716) [normal]: Parsed user input successfully

DEPRECATION WARNING: program has invoked obsolete function XLALGetVersionString(). Please see XLALVCSInfoString() for information about a replacement.
Code-version: %% LAL: 6.21.0.1 (CLEAN 8d0838c264f9ff9adc8c3cdbfa17b5154eaa2994)
%% LALPulsar: 1.18.2.1 (CLEAN 8d0838c264f9ff9adc8c3cdbfa17b5154eaa2994)
%% LALApps: 6.25.1.1 (CLEAN 8d0838c264f9ff9adc8c3cdbfa17b5154eaa2994)

2021-12-31 14:03:12.9623 (10716) [normal]: Initialise compartments with freqWidth = 0.05 and candidates per compartment = 3000.
XLAL Error - MAIN (/home/jenkins/workspace/workspace/EaH-GW-OpenCL-Testing/SLAVE/MinGW6.3/TARGET/windows-x64/EinsteinAtHome/source/lalsuite/lalapps/src/pulsar/GCT/HierarchSearchGCT.c:897): Generic Backend not implemented yet.
XLAL Error - MAIN (/home/jenkins/workspace/workspace/EaH-GW-OpenCL-Testing/SLAVE/MinGW6.3/TARGET/windows-x64/EinsteinAtHome/source/lalsuite/lalapps/src/pulsar/GCT/HierarchSearchGCT.c:897): Function not implemented
2021-12-31 14:03:12.9716 (10716) [CRITICAL]: ERROR: MAIN() returned with error '1'
Code-version: %% LAL: 6.21.0.1 (CLEAN 8d0838c264f9ff9adc8c3cdbfa17b5154eaa2994)
%% LALPulsar: 1.18.2.1 (CLEAN 8d0838c264f9ff9adc8c3cdbfa17b5154eaa2994)
%% LALApps: 6.25.1.1 (CLEAN 8d0838c264f9ff9adc8c3cdbfa17b5154eaa2994)

FPU status flags:
2021-12-31 14:03:12.9818 (10716) [debug]: worker done. return(1) to caller
2021-12-31 14:03:12.9828 (10716) [normal]: done. calling boinc_finish(1).
14:03:12 (10716): called boinc_finish

</stderr_txt>
]]>
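With 31 failed tasks, it can help to pull just the exit code and the `XLAL Error` lines out of each stderr dump. A rough sketch follows, assuming the log format matches the one above. The function name `summarize_stderr` and the two regular expressions are illustrative and are not part of BOINC or the Einstein@Home application.

```python
import re

# Matches lines such as:
#   XLAL Error - MAIN (/path/to/HierarchSearchGCT.c:897): Function not implemented
XLAL_RE = re.compile(r"^XLAL Error - (\w+) \((.+):(\d+)\): (.+)$")
# Matches the "exit code N" fragment in the <message> block.
EXIT_RE = re.compile(r"exit code (\d+)")


def summarize_stderr(text):
    """Collect the first exit code and all XLAL errors from stderr text."""
    errors = []
    exit_code = None
    for raw in text.splitlines():
        line = raw.strip()
        match = XLAL_RE.match(line)
        if match:
            func, path, lineno, message = match.groups()
            errors.append({
                "function": func,
                "file": path.rsplit("/", 1)[-1],  # keep only the file name
                "line": int(lineno),
                "message": message,
            })
            continue
        match = EXIT_RE.search(line)
        if match and exit_code is None:
            exit_code = int(match.group(1))
    return {"exit_code": exit_code, "errors": errors}
```

Run over the log above, this would surface the two messages raised at `HierarchSearchGCT.c:897`, which is the part of the output that actually describes the failure.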


Keith Myers
Joined: 11 Feb 11
Posts: 2,133
Credit: 5,332,585,033
RAC: 19,223,406

Cards may be too old and not have enough memory for the O3AS tasks.

If BOINC is identifying your most capable card as a 2GB GTX 550Ti, I would not like to guess what other "lesser" card you have installed.

You might try the Gamma Ray Pulsar tasks instead which are less demanding.

 

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,542
Credit: 76,581,469,294
RAC: 64,969,156

Silesius wrote:
I installed another GPU in my box today and subsequently had 31 Tasks error out.

The only current computer listed for the account under which you are posting last contacted the project on 22 Dec and shows no compute errors and no O3AS tasks like the one for which you gave the stderr output.

The only tasks that do show (FGRPB1G and FGRP5) were sent on 22 Dec and none have been returned, successfully or otherwise.

My guess is that perhaps you have multiple accounts and the one which has the problem computer is not the one under which you have posted your problem report.  Perhaps you would like to provide a link to the problem host so that we can see the hardware details.  Otherwise you would need to give much more information about the GPUs you are trying to use and how you have set them up.

Cheers,
Gary.

Silesius
Joined: 10 Mar 12
Posts: 8
Credit: 2,139,974
RAC: 0

Here is the link with all the

Silesius
Joined: 10 Mar 12
Posts: 8
Credit: 2,139,974
RAC: 0

The 550 was crunching just fine until I installed a 1060. 

Ian&Steve C.
Joined: 19 Jan 20
Posts: 1,602
Credit: 11,877,015,876
RAC: 27,547,807

Silesius wrote:

The 550 was crunching just fine until I installed a 1060. 

It doesn't look like you processed any Gravitational Wave tasks prior to adding the 1060, though. You were crunching the Gamma Ray tasks. It could be a coincidence that you started getting some Gravitational Wave tasks after installing the 1060, possibly because the project now sees that you have a 3GB GPU instead of the 2GB on your 550Ti. Prior to executing any task, the project is only "aware" of your best GPU, per type. The application will in some cases record which GPU model ran the task, but that information is not forwarded to the project scheduler.

I think you should just uncheck gravitational wave tasks from your compute preferences and just crunch the gamma ray tasks on your GPUs to avoid issues.


Keith Myers
Joined: 11 Feb 11
Posts: 2,133
Credit: 5,332,585,033
RAC: 19,223,406

I only see the Gamma Ray application being used by the 550 Ti.  You never attempted to run the Gravity Wave application with that card. Which is smart because it doesn't have enough memory.

But all your errors are from attempting to run the Gravity Wave application since you installed the 1060.

The stderr.txt file doesn't show which card was used, because the task errored out so soon that the application never had a chance to enumerate which card was trying to run it.

The 1060 is only a 3GB card and it is recommended to use cards with at least 4GB of memory for the Gravity Wave tasks.

The easiest solution is to deselect the Gravity Wave application in your project preferences and only crunch the Gamma Ray application. Both cards are capable of running that app because it is less demanding than Gravity Wave.
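The memory reasoning above can be written out as a small check. This is only a sketch of the rule of thumb quoted in this thread (at least 4GB recommended for Gravity Wave tasks), using the memory sizes mentioned here (2GB for the 550 Ti, 3GB for the 1060). The names `GW_RECOMMENDED_GB` and `meets_gw_recommendation` are made up for illustration, and the project scheduler's real logic is not shown anywhere in this thread.

```python
# Recommended minimum GPU memory (GB) for Gravity Wave tasks, as quoted
# in this thread. It is a recommendation, not a hard limit.
GW_RECOMMENDED_GB = 4

# Memory sizes as stated in this thread.
CARDS_GB = {"550 Ti": 2, "1060": 3}


def meets_gw_recommendation(memory_gb):
    """True if a card has at least the recommended memory for GW tasks."""
    return memory_gb >= GW_RECOMMENDED_GB


if __name__ == "__main__":
    for name, memory_gb in CARDS_GB.items():
        verdict = "meets" if meets_gw_recommendation(memory_gb) else "is below"
        print(f"{name} ({memory_gb}GB) {verdict} the {GW_RECOMMENDED_GB}GB recommendation")
```

Neither card reaches the recommended figure, which is why sticking to the Gamma Ray application is the low-risk choice for this host.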

 

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,542
Credit: 76,581,469,294
RAC: 64,969,156

Thanks for the link to this host's task list.

Silesius wrote:
The 550 was crunching just fine until I installed a 1060.

That's because it was ONLY crunching GRP (FGRPB1G) tasks. The scheduler was probably aware that it couldn't handle GW tasks and hadn't previously sent any.

After restarting with the extra 1060 GPU, the scheduler is now sending GW tasks (since the 1060 can handle them) and all the failures have probably come from the 550Ti now trying to crunch them as well.

The stderr output for the very last GRP task in your full tasks list shows that processing stopped and started a couple of times and that both the 550Ti and the 1060 were used at various times during crunching.  The second last GRP task shows that crunching started with the 550Ti and was then completed by the 1060 after a restart.

Unfortunately, the stderr output for GW tasks doesn't show which GPU was being used.  I'm guessing perhaps only the 550Ti.  The crunch times were so short (couple of seconds) that all those tasks would have disappeared very quickly, perhaps without a chance for the 1060 to get involved - most likely because the 1060 was finishing off a GRP task.

Your safest course of action is to disable all GW tasks and allow GRP tasks only.  Both GPUs should be able to handle those.  The 1060 should be quite a bit faster than the slow 1.5 hr times of the 550Ti.

Cheers,
Gary.

Silesius
Joined: 10 Mar 12
Posts: 8
Credit: 2,139,974
RAC: 0

Thanks, I updated my Project

I have one Gravitational Wave task running now on GPU0 which I'm assuming is going to finish w/o errors as it's been running for 16 mins now.  GPU 1 is crunching Amicable Numbers.  I noticed that Amicable tasks were also throwing errors (All tasks for computer 164837 (sech.me)) stating that a GPU could not be found.  Looking in the device manager I found that the 550Ti was not functioning properly.  I reinstalled the drivers for both cards and all seems fine, at least for now.

I don't think I can update my project prefs to stop getting the Gravitational Wave Tasks as this box is running under the gridcoin pool. 

Keith Myers
Joined: 11 Feb 11
Posts: 2,133
Credit: 5,332,585,033
RAC: 19,223,406

Being in the gridcoin pool does not preclude you from changing your project preferences as long as you can see your host here on this website.

 

Silesius
Joined: 10 Mar 12
Posts: 8
Credit: 2,139,974
RAC: 0

I don't see this box in my account.  Computer 12912426 | Einstein@Home (einsteinathome.org)

 
