Fermi LAT Gamma-ray pulsar search #4 "FGRP4"

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117551096736
RAC: 35319010

I did try to get more beta

I did try to get more beta test tasks a few hours after my previous post but by that time there were none available and apparently none since.

Has anyone actually spotted a FGRP4 task with a reasonable estimate?

Cheers,
Gary.

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7220894931
RAC: 953910

RE: Has anyone actually

Quote:
Has anyone actually spotted a FGRP4 task with a reasonable estimate?


My laptop got what appears to be a full length (not "short end") FGRP4 beta created at 23 Aug 2014 19:00:21 UTC and sent to me five seconds later--so maybe later than you've mentioned.

It seems to be running OK, and grinding down estimated time in some accord with real time. This might not be cause for much rejoicing, as the host DCF is currently 9.03079. Were the distributed estimate "reasonable", I imagine it would be clobbering the estimated time handily.

Though my slice was created on the 23rd and sent as v 1.03; the original was created on the 21st and sent as v1.01, so this appears not really to be fresh work.
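
To put the DCF figure above in perspective, here is a minimal sketch of how a duration correction factor scales a task's time estimate. It assumes the BOINC client simply multiplies the uncorrected estimate by the project's DCF; the one-hour raw estimate is a hypothetical value chosen for illustration, not a number taken from any task in this thread.

```python
def corrected_estimate(raw_estimate_s: float, dcf: float) -> float:
    """Scale an uncorrected runtime estimate (seconds) by the host's DCF."""
    return raw_estimate_s * dcf


def format_hms(seconds: float) -> str:
    """Format a duration in seconds as H:MM:SS, as the BOINC manager displays it."""
    total = int(round(seconds))
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


DCF = 9.03079            # host DCF quoted in the post above
RAW_ESTIMATE_S = 3600.0  # hypothetical uncorrected estimate (1 hour)

print(format_hms(corrected_estimate(RAW_ESTIMATE_S, DCF)))
```

With a DCF this large, even a badly underestimated task is shown with an estimate roughly nine times longer than the one the server distributed, which is why an estimate that tracks real time on this host says little about whether the distributed estimate was reasonable.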

Harri Liljeroos
Joined: 10 Dec 05
Posts: 4341
Credit: 3205415164
RAC: 1980106

I have received three 1.03

I have received three 1.03 FGRP4 WUs. The first was downloaded on Friday the 22nd at about 9 o'clock UTC. It failed last night after running for about 11 hours with a "maximum time exceeded" error. I received the other two on Saturday the 23rd at 20:30 UTC. They haven't started crunching yet, but both show an estimated time of 1:04:52, so they will probably fail with the same error.

I just forced them to start, so we'll see how they'll do.

Anonymous

I too am getting these errors

I too am getting these errors on Gamma-ray pulsar search #3 v1.11 (FGRPopencl-ati)

7.2.42

process exited with code 65 (0x41, -191)

08:04:33 (30995): [normal]: This Einstein@home App was built at: Feb 18 2014 15:42:42

08:04:33 (30995): [normal]: Start of BOINC application '../../projects/einstein.phys.uwm.edu/hsgamma_FGRP3_1.11_x86_64-pc-linux-gnu__FGRPopencl-ati'.
command line: ../../projects/einstein.phys.uwm.edu/hsgamma_FGRP3_1.11_x86_64-pc-linux-gnu__FGRPopencl-ati --inputfile ../../projects/einstein.phys.uwm.edu/LATeah0111C.dat --outputfile results.cand.out --alpha 2.28186379714 --delta -0.826671692095 --pcutfu 0.07274839 --skyRadius 2.217964e-03 --f0start 32 --f0Band 64 --firstSkyPoint 5562 --numSkyPoints 32 --f1dot -4.44e-10 --f1dotBand 1e-12 --df1dot 8.49468019e-15 --ephemdir ../../projects/einstein.phys.uwm.edu/JPLEPH --Tcoh 524288.0 --toplist 5 --cohFollow 1 --numCells 1 --useWeights 1 --Srefinement 1 --CohSkyRef 1 --cohfullskybox 1 --interbinning 2 --useDiriWin 10 --mmfu 0.15 --reftime 55471 --debug 1 --device 0
output files: 'results.cand.out' '../../projects/einstein.phys.uwm.edu/LATeah0111C_96.0_5562_-4.43e-10_0_0' 'results.cand.out.cohfu' '../../projects/einstein.phys.uwm.edu/LATeah0111C_96.0_5562_-4.43e-10_0_1'
08:04:33 (30995): [debug]: Flags: X64 SSE SSE2 GNUC X86 GNUX86
08:04:33 (30995): [debug]: glibc version/release: 2.15/stable
08:04:33 (30995): [debug]: Set up communication with graphics process.
boinc_get_opencl_ids returned [0x13c57f0 , 0x7f2cc4a06500]
Using OpenCL platform provided by: Advanced Micro Devices, Inc.
Using OpenCL device "Pitcairn" by: Advanced Micro Devices, Inc.
Max allocation limit: 239075328
% Opening inputfile: ../../projects/einstein.phys.uwm.edu/LATeah0111C.dat
% Total amount of photon times: 10000
% Preparing toplist of length: 5
read_checkpoint(): Couldn't open file 'results.cand.out.cpt': No such file or directory (2)
% fft_size: 33554432 (0x2000000)
% Sky point 1/32
% Creating FFT plan.
Error allocating device memory: 268435456 bytes (error: -61)
08:04:33 (30995): [CRITICAL]: ERROR: MAIN() returned with error '1'
FPU status flags: COND_1 PRECISION
mv: cannot stat `results.cand.out': No such file or directory
mv: cannot stat `results.cand.out': No such file or directory
mv: cannot stat `results.cand.out': No such file or directory
mv: cannot stat `results.cand.out': No such file or directory
mv: cannot stat `results.cand.out': No such file or directory
mv: cannot stat `results.cand.out.cohfu': No such file or directory
mv: cannot stat `results.cand.out.cohfu': No such file or directory
mv: cannot stat `results.cand.out.cohfu': No such file or directory
mv: cannot stat `results.cand.out.cohfu': No such file or directory
mv: cannot stat `results.cand.out.cohfu': No such file or directory
mv: cannot stat `results.cand.out.cohfu': No such file or directory
mv: cannot stat `results.cand.out.cohfu': No such file or directory
08:04:45 (30995): [normal]: done. calling boinc_finish(65).
08:04:45 (30995): called boinc_finish


Some tasks finish OK while others fail like the one above. Is it memory related?
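
The numbers in the log above can be checked against each other with a few lines of arithmetic. This is a sketch only: the three constants are copied from the log, while the 8 bytes per FFT element (one complex value made of two 32-bit floats) is an assumption that happens to reproduce the failed request size exactly.

```python
FFT_SIZE = 33554432          # "% fft_size" from the log
MAX_ALLOC_BYTES = 239075328  # "Max allocation limit" from the log
REQUESTED_BYTES = 268435456  # size in "Error allocating device memory"
BYTES_PER_ELEMENT = 8        # assumption: complex value of two 32-bit floats

MIB = 1024 * 1024


def mib(n_bytes: int) -> float:
    """Convert a byte count to MiB."""
    return n_bytes / MIB


# Size of one FFT buffer under the 8-bytes-per-element assumption.
buffer_bytes = FFT_SIZE * BYTES_PER_ELEMENT

print("buffer matches failed request:", buffer_bytes == REQUESTED_BYTES)
print(f"requested {mib(REQUESTED_BYTES):.0f} MiB, limit {mib(MAX_ALLOC_BYTES):.0f} MiB")
print(f"over the limit by {mib(REQUESTED_BYTES - MAX_ALLOC_BYTES):.0f} MiB")
```

So the single allocation being asked for is larger than the maximum the driver reports it will allow. That fits error -61, which the OpenCL headers define as CL_INVALID_BUFFER_SIZE.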

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117551096736
RAC: 35319010

RE: I too am getting these

Quote:
I too am getting these errors on Gamma-ray pulsar search #3 v1.11 (FGRPopencl-ati)


No you're not! We're talking about "maximum time limit exceeded" in FGRP4 beta and you're talking about a memory allocation problem in FGRP3.

Try looking at this message and see if the solution there works for you.

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117551096736
RAC: 35319010

RE: Though my slice was

Quote:
Though my slice was created on the 23rd and sent as v 1.03; the original was created on the 21st and sent as v1.01, so this appears not really to be fresh work.


Yep, both you and Harri have 'resends' which are just extra copies of the original faulty tasks. Because of the 'time limit' problem, there are quite a few resends being sent out. I don't think any of the new 'fixed' workunits have entered the pipeline yet.

Cheers,
Gary.

Anonymous

RE: RE: I too am getting

Quote:
Quote:
I too am getting these errors on Gamma-ray pulsar search #3 v1.11 (FGRPopencl-ati)

No you're not! We're talking about "maximum time limit exceeded" in FGRP4 beta and you're talking about a memory allocation problem in FGRP3.

Try looking at this message and see if the solution there works for you.

I am not running Ubuntu's BOINC package. Rather I downloaded BOINC and installed per their instructions.

The referenced message assumes, I believe, that the AMD SDK is installed, because I cannot find the file that it says to modify. I therefore edited /etc/profile and added the following line:
export GPU_MAX_HEAP_SIZE=100

I stopped BOINC, logged out of the account I run BOINC under, and logged back in so that /etc/profile would be re-read and the "export GPU_MAX_HEAP_SIZE=100" line picked up. I then restarted BOINC. I will now monitor whether this procedure fixes the memory issue noted in my original post.
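
Since /etc/profile is normally read only by login shells, it may be worth confirming that the variable actually reached the running BOINC process. The following is a minimal sketch for Linux that reads a process's initial environment from /proc; the helper names are made up for this example, and the PID has to be supplied by hand (the script must run as the same user as the process, or as root).

```python
import sys
from pathlib import Path


def parse_environ(raw: bytes) -> dict:
    """Parse the NUL-separated KEY=VALUE block found in /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def process_env(pid: int) -> dict:
    """Return the environment a process was started with (Linux only)."""
    return parse_environ(Path(f"/proc/{pid}/environ").read_bytes())


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_env.py <pid of the BOINC client or science app>
    value = process_env(int(sys.argv[1])).get("GPU_MAX_HEAP_SIZE")
    print("GPU_MAX_HEAP_SIZE =", value if value is not None else "not set")
```

If this prints "not set" for the BOINC client, the export in /etc/profile did not reach it, and the variable would need to be set wherever the client is actually launched from.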

Harri Liljeroos
Joined: 10 Dec 05
Posts: 4341
Credit: 3205415164
RAC: 1980106

RE: I have received three

Quote:

I have received three 1.03 FGRP4 WUs. The first was downloaded on Friday the 22nd at about 9 o'clock UTC. It failed last night after running for about 11 hours with a "maximum time exceeded" error. I received the other two on Saturday the 23rd at 20:30 UTC. They haven't started crunching yet, but both show an estimated time of 1:04:52, so they will probably fail with the same error.

I just forced them to start, so we'll see how they'll do.


Quoting myself.

I have managed to run the last two WUs mentioned in the quote successfully by reducing the allowed number of CPUs. I have an 8 core CPU (with hyper-threading on) and I reduced the number of allowed CPU cores to 4. The two remaining FGRP4 WUs got a performance boost, finished in time, and I have now received credit for them.

Now I'm resuming normal operation.

Bent Vangli
Joined: 6 Apr 11
Posts: 23
Credit: 725742660
RAC: 0

My first unit finished and

My first unit finished and validated, giving 2.58 points :-).

I am running AMD A10-7850K APU and Linux 3.13.0-34-generic. No overclocking.

Two more units are in the queue, with a time estimate of over 600 hours :-). However, they seem to finish in about 5-6 hours on this CPU. The first unit reports using less than one hour of CPU time. I wonder what will happen to the two running now.

With best regards

Bent, Oslo, Norway

Jeroen
Joined: 25 Nov 05
Posts: 379
Credit: 740030628
RAC: 0

Several of the latest batch

Several tasks in the latest batch my two systems received took significantly longer to complete. On an Intel 920 and a 3930K with HT disabled, these particular tasks took over 33,000 seconds to complete.
