Questions, comments and problems on new Fermi LAT gamma-ray pulsar search

FrankHagen
Joined: 13 Feb 08
Posts: 102
Credit: 75,762
RAC: 391

Quote:

If you're asking for a simple way to run only the Fermi-LAT gamma-ray pulsar search (which is an application of the Einstein@home project), I'd say that this is not advisable at all.

We won't send out more than a few thousand tasks of that application this week, and an unknown number of these will produce only client- and validation errors. That's what testing is for.

If you would restrict yourself to just that application, you will likely get no work from the project at all.

BM

OK - I did not realize it's currently for testing only.

astro-marwil
Joined: 28 May 05
Posts: 419
Credit: 148,729,774
RAC: 4,982

When will the SERVER STATUS page be fully back? Large fields of data have been missing since about 23:00 MESZ (CEST) last night.

Kind regards
Martin

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 3,851
Credit: 182,888,576
RAC: 38,297

Quote:
When will the SERVER STATUS page be fully back? Large fields of data have been missing since about 23:00 MESZ (CEST) last night.

See here.

BM

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 3,851
Credit: 182,888,576
RAC: 38,297

Quote:
We will probably send out the first few tasks (for testing) on Monday.

Sorry, won't work out today. Will hopefully be shipped tomorrow morning (CEST).

BM

astro-marwil
Joined: 28 May 05
Posts: 419
Credit: 148,729,774
RAC: 4,982

On the Server Status page, a column "FGRP1" has already been added to the table "Workunits and tasks". But I believe a table "FGRP1 search progress" with the absolute, relative and time data is still missing?

Kind regards
Martin

Richard Haselgrove
Joined: 10 Dec 05
Posts: 1,910
Credit: 130,151,421
RAC: 66,591

Snapped up one of the gamma-ray tasks on my old Pentium 4 Windows 2000 server host 475735. GW tasks (SSE2 variant) run about 18 hours on this host: the new g-r came in at an estimate of 22 hours, and with a 5-day deadline, so not surprisingly started running immediately in high priority (not a problem there).

But after 45 minutes so far, it still hasn't registered any progress - still showing 0.000% in BOINC Manager. GW tasks are also slow to start registering progress, especially on slow machines like this, but this seems a touch excessive. I'll keep you posted.

Edit - ah, that did the trick - I thought it might! Progress jumped to 2.000% at around 50 minutes elapsed, which suggests the estimate might be a bit low for this hardware. But we'll have a better idea of that tomorrow. Or the next day :P

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 3,851
Credit: 182,888,576
RAC: 38,297

Quote:
GW tasks (SSE2 variant) run about 18 hours on this host: the new g-r came in at an estimate of 22 hours

Looks about right.

Quote:
and with a 5-day deadline, so not surprisingly started running immediately in high priority (not a problem there).

Yep, this is still testing. Will set it back to the normal 2 weeks for the actual run.

Quote:
But after 45 minutes so far, it still hasn't registered any progress - still showing 0.000% in BOINC Manager. GW tasks are also slow to start registering progress, especially on slow machines like this, but this seems a touch excessive. I'll keep you posted.

Hm - we are still juggling with the workunit parameters.

The task should examine 50 sky positions in total, and report progress (and checkpoint!) after each of these. The (missing) progress display is a nuisance, but I'm worried even more about the checkpointing.
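
For illustration only, here is a minimal Python sketch of the scheme described above: one progress update per finished sky position. The function names are hypothetical and are not taken from the actual application; only the figure of 50 sky positions comes from this post.

```python
TOTAL_SKY_POSITIONS = 50  # figure quoted in the post; everything else is assumed


def fraction_done(positions_finished: int, total: int = TOTAL_SKY_POSITIONS) -> float:
    """Fraction of the task complete after a number of finished sky positions."""
    if not 0 <= positions_finished <= total:
        raise ValueError("positions_finished must be between 0 and total")
    return positions_finished / total


def progress_steps(total: int = TOTAL_SKY_POSITIONS) -> list[float]:
    """Percent values that would be reported, one per finished sky position."""
    return [100.0 * fraction_done(i, total) for i in range(1, total + 1)]


if __name__ == "__main__":
    steps = progress_steps()
    print(len(steps), "updates, first few:", steps[:3])
```

With 50 positions, each update is worth 2%, which would match the 2.000% jumps reported earlier in this thread.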

BM

archae86
Joined: 6 Dec 05
Posts: 2,605
Credit: 2,080,510,938
RAC: 2,250,049

Quote:
But after 45 minutes so far, it still hasn't registered any progress - still showing 0.000% in BOINC Manager. GW tasks are also slow to start registering progress, especially on slow machines like this, but this seems a touch excessive. I'll keep you posted.


I noticed that a task of "0.16 Gamma-ray pulsar search #1" flavor had arrived in my primary host queue this morning, so suspended all non-started Einstein work plus one executing task to get it going.

After the usual task of reassuring Kaspersky Anti-Virus that I trusted this new application even though it was not detected as digitally signed, it started right up. Unlike Richard's experience, mine started reporting progress in under a minute, at something on the order of 0.1% or less. The reporting steps so far are quite closely spaced--I just watched 2.410% increment to 2.439% complete.

Possibly the application has already been updated, or possibly some difference in the WU matters, or in my execution environment. While my host is way faster than the one Richard is reporting on, that does not match the difference:

The host is an E5620 Westmere (the Xeon flavor of second-generation Nehalem 4-core), running HT at a moderate 3.42 GHz overclock. Windows 7 64-bit OS. Recent S6bucket GW jobs cluster near 5 hours 24 minutes or so.

Windows Explorer currently reports the Working Set at 269,728K, compared to about 97,650K for current GW jobs.

Ah, while I was typing, the progress indicator seems to have gone through a behavior boundary. It now reads 2.00%, so it must have jumped back at a mode switch. I don't know just when it did that, but 17 minutes of CPU time have been consumed so far. I shall edit this post with a performance or other behavior update presently.

[edited before first post to add: Before I finished proof-reading, it was reading 4.00% with 20:57 CPU consumed. So for this WU, the discontinuity in reported progress at the mode switch was pretty severe].

Richard Haselgrove
Joined: 10 Dec 05
Posts: 1,910
Credit: 130,151,421
RAC: 66,591

There does seem to be a problem with checkpointing. The server ran for a while, until:

05/07/2011 15:04:37 | Einstein@Home | Starting task LAT00002124_32.0_0_-9.2e-11_0 using hsgamma_FGRP1 version 16
05/07/2011 19:25:35 |  | Suspending computation - running CPU benchmarks
05/07/2011 19:26:09 |  | Resuming computation
05/07/2011 19:29:14 | Einstein@Home | Restarting task LAT00002124_32.0_0_-9.2e-11_0 using hsgamma_FGRP1 version 16


Although I wasn't around at the time, that 'Restarting...' seems to have been a restart from 0.000% (it's currently showing 4hr 31mn elapsed, and 20% complete, which is right for a restart).
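
As a rough cross-check of the log excerpt above, the timestamps can be parsed to see how much wall-clock time had passed before the benchmark suspension. This is only a sketch: the day/month order of the date is an assumption (it does not affect the differences here), and wall-clock time is an upper bound on the CPU time that a restart from zero would discard.

```python
from datetime import datetime

# The four event-log lines quoted in this post, copied verbatim.
LOG = """\
05/07/2011 15:04:37 | Einstein@Home | Starting task LAT00002124_32.0_0_-9.2e-11_0 using hsgamma_FGRP1 version 16
05/07/2011 19:25:35 |  | Suspending computation - running CPU benchmarks
05/07/2011 19:26:09 |  | Resuming computation
05/07/2011 19:29:14 | Einstein@Home | Restarting task LAT00002124_32.0_0_-9.2e-11_0 using hsgamma_FGRP1 version 16
"""


def parse_line(line: str):
    """Split one log line into (timestamp, project, message)."""
    stamp, project, message = (part.strip() for part in line.split("|", 2))
    # Assumption: dates are day/month/year.
    return datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), project, message


events = [parse_line(line) for line in LOG.splitlines() if line.strip()]

started = next(t for t, _, m in events if m.startswith("Starting task"))
suspended = next(t for t, _, m in events if m.startswith("Suspending computation"))
restarted = next(t for t, _, m in events if m.startswith("Restarting task"))

ran_before_suspend = suspended - started
start_to_restart = restarted - started

print("ran before suspension:", ran_before_suspend)   # 4:20:58
print("start to restart:     ", start_to_restart)     # 4:24:37
```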

I have a second task running on my Q9300. That's reached 66% after 6hr 40mn, at what seems to be a steady 12 minutes per 2% progress step.
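
The arithmetic behind those figures can be written out as a small Python sketch. It assumes the rate of 12 minutes per 2% step stays constant for the whole task, which is not guaranteed; the helper names are made up for this example.

```python
MINUTES_PER_STEP = 12   # observed on the Q9300, per the post
PERCENT_PER_STEP = 2    # size of one progress step


def minutes_to_reach(percent: float) -> float:
    """Minutes needed to reach a given percent at the constant observed rate."""
    return percent / PERCENT_PER_STEP * MINUTES_PER_STEP


def fmt(minutes: float) -> str:
    """Format a number of minutes as hours and minutes."""
    hours, mins = divmod(round(minutes), 60)
    return f"{hours}h {mins:02d}m"


print(fmt(minutes_to_reach(66)))    # 6h 36m
print(fmt(minutes_to_reach(100)))   # 10h 00m
```

The 6h 36m figure is close to the 6hr 40mn actually observed at 66%, so a steady rate looks plausible; if it holds, the whole task would take about 10 hours on this host.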

Finally, there seem to be checkpoint (.CPT) files written into the slot directory, and references in stderr_txt, but BOINC itself doesn't seem to be aware of any checkpointing - I have checkpoint debug logging turned on for the Q9300, and although GW task checkpoints are logged, those for g-r are not.

[boinc.at] Nowi
Joined: 6 Jul 05
Posts: 13
Credit: 1,206,569
RAC: 0

Hello,

I notice a problem with checkpointing, too. My first WU started and has run for about 1:30 h, but BoincTasks shows no checkpoint. My system is not running 24/7, so checkpointing is necessary for me.

System: Intel Q9550, Win 7 64-Bit, 8 GB, Boinc: 6.12.28

Cheers Nowi
