Error while computing Gamma-ray pulsar search #3 v1.11 (FGRPopencl-ati)

sergioclr
Joined: 16 Jan 13
Posts: 10
Credit: 393027
RAC: 0
Topic 197593

Unable to process application "Gamma-ray pulsar search #3 v1.11 (FGRPopencl-ati)".

Application starts OK then, about 10-20 seconds later, always ends with "Error while computing".

Please note that "Gamma-ray pulsar search #3 v1.11 (FGRPSSE)" runs OK, and "Binary Radio
Pulsar Search (Perseus Arm Survey) v1.39 (BRP5-opencl-ati)" also runs OK.

Computer info:
GenuineIntel Pentium(R) Dual-Core CPU E6500 @ 2.93GHz [Family 6 Model 23 Stepping 10]
Number of processors 2
Coprocessors CAL AMD Radeon HD 6350/6450/7450/7470 series (Caicos) (1024MB) driver: 1.4.1741

Note: my GPU card is HD6450 (single precision FP)

Operating System Linux
3.2.0-61-generic
BOINC client version 7.2.42
Memory 3954.38 MB

Here is a workunit that failed:

Name LATeah0092C_96.0_19107_-8.33e-10_0
Workunit 190668146
Created 21 May 2014 0:05:49 UTC
Sent 21 May 2014 0:30:49 UTC
Received 21 May 2014 0:33:02 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 65 (0x41)
Computer ID 11137048
Report deadline 4 Jun 2014 0:30:49 UTC
Run time 14.24
CPU time 7.50
Validate state Invalid
Claimed credit 0.08
Granted credit 0.00
application version Gamma-ray pulsar search #3 v1.11 (FGRPopencl-ati)

Stderr output

7.2.42

process exited with code 65 (0x41, -191)

21:32:58 (2510): [normal]: This Einstein@home App was built at: Feb 18 2014 15:42:42

21:32:58 (2510): [normal]: Start of BOINC application '../../projects/einstein.phys.uwm.edu/hsgamma_FGRP3_1.11_x86_64-pc-linux-gnu__FGRPopencl-ati'.
command line: ../../projects/einstein.phys.uwm.edu/hsgamma_FGRP3_1.11_x86_64-pc-linux-gnu__FGRPopencl-ati --inputfile ../../projects/einstein.phys.uwm.edu/LATeah0092C.dat --outputfile results.cand.out --alpha 1.85606135223 --delta -0.187105282293 --pcutfu 0.06251659 --skyRadius 1.402826e-02 --f0start 32 --f0Band 64 --firstSkyPoint 19107 --numSkyPoints 99 --f1dot -8.34e-10 --f1dotBand 1e-12 --df1dot 8.207629151e-15 --ephemdir ../../projects/einstein.phys.uwm.edu/JPLEPH --Tcoh 524288.0 --toplist 5 --cohFollow 1 --numCells 1 --useWeights 1 --Srefinement 1 --CohSkyRef 1 --cohfullskybox 1 --interbinning 2 --useDiriWin 10 --mmfu 0.15 --reftime 55471 --debug 1 --device 0
output files: 'results.cand.out' '../../projects/einstein.phys.uwm.edu/LATeah0092C_96.0_19107_-8.33e-10_0_0' 'results.cand.out.cohfu' '../../projects/einstein.phys.uwm.edu/LATeah0092C_96.0_19107_-8.33e-10_0_1'
21:32:58 (2510): [debug]: Flags: X64 SSE SSE2 GNUC X86 GNUX86
21:32:58 (2510): [debug]: glibc version/release: 2.15/stable
21:32:58 (2510): [debug]: Set up communication with graphics process.
boinc_get_opencl_ids returned [0x2396ae0 , 0x7f8e6a54de00]
Using OpenCL platform provided by: Advanced Micro Devices, Inc.
Using OpenCL device "Caicos" by: Advanced Micro Devices, Inc.
Max allocation limit: 134217728
% Opening inputfile: ../../projects/einstein.phys.uwm.edu/LATeah0092C.dat
% Total amount of photon times: 10000
% Preparing toplist of length: 5
read_checkpoint(): Couldn't open file 'results.cand.out.cpt': No such file or directory (2)
% fft_size: 33554432 (0x2000000)
% Sky point 1/99
% Creating FFT plan.
Error allocating device memory: 268435456 bytes (error: -61)
21:32:59 (2510): [CRITICAL]: ERROR: MAIN() returned with error '1'
FPU status flags: PRECISION
mv: cannot stat `results.cand.out': No such file or directory
mv: cannot stat `results.cand.out': No such file or directory
mv: cannot stat `results.cand.out': No such file or directory

mv: cannot stat `results.cand.out': No such file or directory
mv: cannot stat `results.cand.out': No such file or directory
mv: cannot stat `results.cand.out.cohfu': No such file or directory
mv: cannot stat `results.cand.out.cohfu': No such file or directory
mv: cannot stat `results.cand.out.cohfu': No such file or directory
mv: cannot stat `results.cand.out.cohfu': No such file or directory
mv: cannot stat `results.cand.out.cohfu': No such file or directory
mv: cannot stat `results.cand.out.cohfu': No such file or directory
mv: cannot stat `results.cand.out.cohfu': No such file or directory
21:33:11 (2510): [normal]: done. calling boinc_finish(65).
21:33:11 (2510): called boinc_finish

Please help!
Thanks in advance.

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

The significant error is:

Quote:
Error allocating device memory: 268435456 bytes (error: -61)


The app is trying to allocate 256MB of graphics memory and failing. I don't know why, as the card has 1024MB of memory and that should be enough.

Do you run more than one GPU app simultaneously?
If yes then how many?

Hopefully someone more knowledgeable about AMD cards will drop in soon.

mikey
Joined: 22 Jan 05
Posts: 12663
Credit: 1839062161
RAC: 4269

Quote:

Unable to process application "Gamma-ray pulsar search #3 v1.11 (FGRPopencl-ati)".

Application starts OK then, about 10-20 seconds later, always ends with "Error while computing".

Please note that "Gamma-ray pulsar search #3 v1.11 (FGRPSSE)" runs OK, and "Binary Radio
Pulsar Search (Perseus Arm Survey) v1.39 (BRP5-opencl-ati)" also runs OK.

Computer info:
GenuineIntel Pentium(R) Dual-Core CPU E6500 @ 2.93GHz [Family 6 Model 23 Stepping 10]
Number of processors 2
Coprocessors CAL AMD Radeon HD 6350/6450/7450/7470 series (Caicos) (1024MB) driver: 1.4.1741

Note: my GPU card is HD6450 (single precision FP)

Please help!
Thanks in advance.

Are you leaving a cpu core free to feed the gpu? If not try suspending the cpu crunching and run a gpu alone and see if it finishes okay. If so then the unit needs more cpu time to keep the gpu running properly.

Also have you read this thread yet?
http://einsteinathome.org/node/197587

sergioclr
Joined: 16 Jan 13
Posts: 10
Credit: 393027
RAC: 0

No, but after reading thread 10764, it seems very similar to my case.

Thread 10764:
Using OpenCL platform provided by: Advanced Micro Devices, Inc.
Using OpenCL device "Turks" by: Advanced Micro Devices, Inc.
Max allocation limit: 134217728
.
.
Error allocating device memory: 268435456 bytes (error: -61)
13:21:47 (6248): [CRITICAL]: ERROR: MAIN() returned with error '1'

My case:
Using OpenCL platform provided by: Advanced Micro Devices, Inc.
Using OpenCL device "Caicos" by: Advanced Micro Devices, Inc.
Max allocation limit: 134217728
.
.
Error allocating device memory: 268435456 bytes (error: -61)
21:32:59 (2510): [CRITICAL]: ERROR: MAIN() returned with error '1'

I will keep on researching.
Thanks for the info.

sergioclr
Joined: 16 Jan 13
Posts: 10
Credit: 393027
RAC: 0

One thing that I've noticed, before the workunit ["Gamma-ray pulsar search #3 v1.11 (FGRPopencl-ati)"] fails,
is that Boinc client allocates 1 CPU + 1 GPU so, in essence, only 2 tasks can be run simultaneously:

1 CPU processor for FGRPopencl-ati
1 GPU for FGRPopencl-ati

As my computer has 2 CPUs + 1 ATI Radeon GPU card, I don't know how Boinc client would behave if Boinc
preferences "% of processors" is set to values different from '0' or '100'.

Einstein@Home user Tom suggested using the 'clinfo' command to gather additional info, and it gets right
to the point:

>clinfo
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 1.2 AMD-APP (1214.3)
Platform Name: AMD Accelerated Parallel Processing
Platform Vendor: Advanced Micro Devices, Inc.
Platform Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices
.
.
Platform Name: AMD Accelerated Parallel Processing
Number of devices: 2
Device Type: CL_DEVICE_TYPE_GPU
Device ID: 4098
Board name: AMD Radeon HD 6450
Device Topology: PCI[ B#1, D#0, F#0 ]

Max clock frequency: 625Mhz
Address bits: 32
Max memory allocation: 134217728
.
.
Global memory size: 536870912

Why 'Global memory size' of 536870912 bytes? The GPU card is supposed to have 1024 MB or roughly 1GB.
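
To make the clinfo numbers easier to compare, here is a small Python sketch that pulls the two size fields out of the output and prints them in MiB. It only parses the text pasted above; the sample string and the helper name parse_clinfo_sizes are mine, not part of clinfo or BOINC.

```python
# Excerpt of the clinfo output pasted in this post (only the lines that matter here).
CLINFO_SAMPLE = """\
  Max clock frequency: 625Mhz
  Address bits: 32
  Max memory allocation: 134217728
  Global memory size: 536870912
"""

MIB = 1024 * 1024


def parse_clinfo_sizes(text):
    """Return the byte values of the two memory fields found in clinfo text."""
    wanted = ("Max memory allocation", "Global memory size")
    sizes = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in wanted:
            sizes[key] = int(value.strip())
    return sizes


sizes = parse_clinfo_sizes(CLINFO_SAMPLE)
for name, size in sizes.items():
    print(f"{name}: {size} bytes = {size // MIB} MiB")
```

With the values above this gives 128 MiB for a single allocation and 512 MiB of global memory, so OpenCL sees only half of the card's 1024 MB.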

I am still researching the issue but it will take about 100 hours to finish the tasks that are running in my computer so I can begin experimenting again.

Thanks for the suggestion.

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

If clinfo is providing accurate information then I think you might have found the problem:

Quote:
Max memory allocation: 134217728

and

Quote:
Error allocating device memory: 268435456 bytes (error: -61)


I don't claim to have any knowledge at all about AMD cards but from the above I would guess that the card supports memory allocation of 128MB at a time and the app tries to allocate 256MB in one go, so ain't gonna work...
Might be driver dependent but more likely a hardware limit...
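
For anyone who wants to check their own task logs for the same pattern, here is a minimal Python sketch that compares the two numbers from the stderr output. The regular expressions are guesses based only on the two log lines quoted in this thread, and the function name parse_allocation_failure is made up for illustration. As far as I know, -61 is CL_INVALID_BUFFER_SIZE in the OpenCL headers, which would fit a request larger than the per-allocation limit.

```python
import re

# Two lines copied from the failing task's stderr output in this thread.
STDERR_SAMPLE = """\
Max allocation limit: 134217728
Error allocating device memory: 268435456 bytes (error: -61)
"""

MIB = 1024 * 1024


def parse_allocation_failure(stderr_text):
    """Return (limit_bytes, requested_bytes); a value is None if its line is missing."""
    limit = re.search(r"Max allocation limit:\s*(\d+)", stderr_text)
    request = re.search(r"Error allocating device memory:\s*(\d+) bytes", stderr_text)
    return (
        int(limit.group(1)) if limit else None,
        int(request.group(1)) if request else None,
    )


limit_bytes, requested_bytes = parse_allocation_failure(STDERR_SAMPLE)
print(f"limit:     {limit_bytes // MIB} MiB")
print(f"requested: {requested_bytes // MIB} MiB")
print("request exceeds limit:", requested_bytes > limit_bytes)
```

For the log in this thread the request (256 MiB) is exactly twice the reported limit (128 MiB).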

Anyone with more knowledge about AMD cards around?

Claggy
Joined: 29 Dec 06
Posts: 560
Credit: 2699403
RAC: 0

Quote:

If clinfo is providing accurate information then I think you might have found the problem:

Quote:
Max memory allocation: 134217728
and
Quote:
Error allocating device memory: 268435456 bytes (error: -61)

I don't claim to have any knowledge at all about AMD cards but from the above I would guess that the card supports memory allocation of 128MB at a time and the app tries to allocate 256MB in one go, so ain't gonna work...
Might be driver dependent but more likely a hardware limit...

Anyone with more knowledge about AMD cards around?


The first thing to try is a later driver. Earlier (Windows) AMD APP runtimes have different 'Max memory allocation' values across different Catalyst versions. You're running APP runtime 1214.3, which is somewhere between Cat 13.4 and 13.9. Try Cat 14.4 if possible:

ATI Driver Version Cheat Sheet

Claggy

sergioclr
Joined: 16 Jan 13
Posts: 10
Credit: 393027
RAC: 0

Quote:
Quote:

If clinfo is providing accurate information then I think you might have found the problem:

Quote:
Max memory allocation: 134217728
and
Quote:
Error allocating device memory: 268435456 bytes (error: -61)

I don't claim to have any knowledge at all about AMD cards but from the above I would guess that the card supports memory allocation of 128MB at a time and the app tries to allocate 256MB in one go, so ain't gonna work...
Might be driver dependent but more likely a hardware limit...

Anyone with more knowledge about AMD cards around?


The first thing to try is a later driver. Earlier (Windows) AMD APP runtimes have different 'Max memory allocation' values across different Catalyst versions. You're running APP runtime 1214.3, which is somewhere between Cat 13.4 and 13.9. Try Cat 14.4 if possible:

ATI Driver Version Cheat Sheet

Claggy

My computer: Mint 16 (Petra) KDE 64-bit (updated), Boinc 7.2.7 and driver "fglrx-updates version 2:13.101-0ubuntu3" (Mint uses the same driver from Ubuntu repository).

I tried two different releases (latest 14.x) of the drivers downloaded directly from the AMD site
and tested them individually after uninstalling the current driver. In both cases
they installed correctly and were recognized by the Boinc client/manager but, and I
do not know why, applications that use the GPU could not be retrieved from Einstein@Home.
I guess it is something related to OpenCL, because there is no reference to OpenCL
when you analyze the Event Log, as opposed to when you have the drivers from the Ubuntu repository:
Dom 25 Mai 2014 15:08:22 BRT | | CAL: ATI GPU 0: AMD Radeon HD 6350/6450/7450/7470 series (Caicos) (CAL version 1.4.1741, 1024MB, 933MB available, 400 GFLOPS peak)
Dom 25 Mai 2014 15:08:22 BRT | | OpenCL: AMD/ATI GPU 0: AMD Radeon HD 6350/6450/7450/7470 series (Caicos) (driver version 1214.3, device version OpenCL 1.2 AMD-APP (1214.3), 1024MB, 933MB available, 400 GFLOPS peak)

Note 1: there is a thread in AMD forum that addresses "Maximum memory allocation problem" (Feb 8, 2012 10:48 PM). Since I cannot insert links
in this Message Board please Google devgurus thread 158397 and read the first match.

Summary: "The AMD APP SDK v2.3 currently defaults to exposing 50% of the physical GPU memory to OpenCL™ applications.
For developers who wish to experiment with increasing the amount of physical memory that is accessible to their OpenCL™ applications,
the default 50% setting can be changed by setting the environment variable GPU_MAX_HEAP_SIZE to the percentage of total GPU memory that should be exposed.
For example, if you wanted to set the exposed GPU physical memory size to 75%, you need to set the GPU_MAX_HEAP_SIZE environment variable to 75."

Note: there are some caveats or trade-offs when this parameter is tweaked

In fact clinfo shows
Global memory size: 536870912, which is 50% of total memory (1024 MB),
but Max memory allocation: 134217728, which is only 12.5% of total memory.
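
The ratios can be checked with a few lines of Python. One extra observation, offered as my own reading rather than anything confirmed by AMD: as far as I know the OpenCL 1.2 specification only requires the maximum single allocation to be at least max(1/4 of the global memory size, 128 MB), and the value reported here matches that minimum exactly.

```python
MIB = 1024 * 1024

card_memory = 1024 * MIB      # physical memory on the HD 6450, per BOINC
global_memory = 536870912     # "Global memory size" from clinfo
max_allocation = 134217728    # "Max memory allocation" from clinfo

print(f"global / card:      {global_memory / card_memory:.1%}")
print(f"max alloc / card:   {max_allocation / card_memory:.1%}")
print(f"max alloc / global: {max_allocation / global_memory:.1%}")

# Assumption: minimum allowed by the OpenCL 1.2 spec, max(global / 4, 128 MB).
spec_minimum = max(global_memory // 4, 128 * MIB)
print("reported value equals that minimum:", max_allocation == spec_minimum)
```

So the single-allocation limit is a quarter of what OpenCL exposes, which in turn is half of the card.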

I am willing to experiment with GPU_MAX_HEAP_SIZE but I must confess that I don't know how to implement AMD APP SDK v2.3 in my Linux distro.
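
Since the quoted text describes GPU_MAX_HEAP_SIZE as an environment variable, one way to experiment might be to set it only for a single child process. Below is a rough Python sketch of that idea. The helper names are hypothetical, I have not verified that the fglrx driver honors the variable, and the child process here is just a stand-in that echoes the variable back; on a real system you would put clinfo (or the BOINC client, since the science apps inherit its environment) in its place.

```python
import os
import subprocess
import sys


def env_with_heap_size(percent, base_env=None):
    """Return a copy of the environment with GPU_MAX_HEAP_SIZE set to percent."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range 1..100")
    env = dict(os.environ if base_env is None else base_env)
    env["GPU_MAX_HEAP_SIZE"] = str(percent)
    return env


def run_with_heap_size(command, percent):
    """Run command with the variable set for that child process only."""
    return subprocess.run(
        command,
        env=env_with_heap_size(percent),
        capture_output=True,
        text=True,
        check=False,
    )


# Stand-in child that prints the variable it received; replace with ["clinfo"]
# to see whether "Global memory size" changes on your own machine.
child = [sys.executable, "-c", "import os; print(os.environ.get('GPU_MAX_HEAP_SIZE'))"]
result = run_with_heap_size(child, 75)
print(result.stdout.strip())
```

Because the variable is set only for the child, the rest of the session is unaffected, which makes it easy to back out if the caveats mentioned above turn out to matter.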

Thanks for your suggestion.

Sergio

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

Quote:
Since I cannot insert links in this Message Board...


Inserting links is as simple as copying the link here, highlighting the link and hitting the URL-button above the post editing window. That should insert url-tags before and after the highlighted text.

mikey
Joined: 22 Jan 05
Posts: 12663
Credit: 1839062161
RAC: 4269

Quote:
Quote:
Since I cannot insert links in this Message Board...

Inserting links is as simple as copying the link here, highlighting the link and hitting the URL-button above the post editing window. That should insert url-tags before and after the highlighted text.

I have NEVER tried that before, I always did it manually. THANKS for teaching me something new today!!

Mike Davies
Joined: 12 Mar 11
Posts: 10
Credit: 56170852
RAC: 75802

For the record, I have the same problem with Gamma-ray pulsar search #3 v1.11 (FGRPopencl-ati).

But, Binary Radio Pulsar Search (Perseus Arm Survey) v1.39 (BRP5-opencl-ati) works OK.

I have a different video card (AMD Radeon HD 6700 Series). clinfo reports

Max memory allocation: 2147483648

but a work unit log for a failure says

Max allocation limit: 134217728
Error allocating device memory: 268435456 bytes (error: -61)
