Web replica down

TPCBF
Joined: 24 Nov 12
Posts: 17
Credit: 145310756
RAC: 9857

Quote:

They are all flattened out to keep the chaos within bounds. My rough estimate is that getting this fixed will take the rest of the week.

As Einstein@Home is working relatively well and reliably at the moment, I don't think they will bother spending a minute on it.

I'll try one or two things tomorrow to get at least the stats updated again.

BM

Many thanks for the update... ;-)

Seems every time something with the hardware on pretty much any of the DC projects goes tits up, it does so big time... :-(

Ralf

robl
Joined: 2 Jan 13
Posts: 1709
Credit: 1454570221
RAC: 3375

I am not sure it's just a "stats" issue. I share my computer with SETI and have plenty of work from them. However, my downloads from E@H have dropped off quite a bit in the last few days. As of right now I have 2 tasks in progress. This is not the norm, so something is not quite right.

I am seeing the following in the online log which I don't understand:

2013-03-06 16:20:12.3545 [PID=12285] Request: [USER#xxxxx] [HOST#6382800] [IP xxx.xxx.xxx.22] client 7.0.29
2013-03-06 16:20:12.3568 [PID=12285] [debug] [HOST#6382800] Resetting nresults_today
2013-03-06 16:20:12.3576 [PID=12285] [handle] [HOST#6382800] [RESULT#351302051] [WU#151592225] got result (DB: server_state=4 outcome=0 client_state=0 validate_state=0 delete_state=0)
2013-03-06 16:20:12.3576 [PID=12285] [handle] cpu time 0.000000 credit/sec 0.003894, claimed credit 0.000000
2013-03-06 16:20:12.3578 [PID=12285] [handle] [RESULT#351302051] [WU#151592225]: setting outcome SUCCESS
2013-03-06 16:20:12.4147 [PID=12285] [send] effective_ncpus 8 max_jobs_on_host_cpu 999999 max_jobs_on_host 999999
2013-03-06 16:20:12.4147 [PID=12285] [send] effective_ngpus 1 max_jobs_on_host_gpu 999999
2013-03-06 16:20:12.4147 [PID=12285] [send] Not using matchmaker scheduling; Not using EDF sim
2013-03-06 16:20:12.4147 [PID=12285] [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
2013-03-06 16:20:12.4147 [PID=12285] [send] CUDA: req 0.00 sec, 0.00 instances; est delay 0.00
2013-03-06 16:20:12.4147 [PID=12285] [send] work_req_seconds: 0.00 secs
2013-03-06 16:20:12.4147 [PID=12285] [send] available disk 87.30 GB, work_buf_min 0
2013-03-06 16:20:12.4147 [PID=12285] [send] active_frac 0.999992 on_frac 0.999911 DCF 0.897591
2013-03-06 16:20:12.4169 [PID=12285] Sending reply to [HOST#6382800]: 0 results, delay req 60.00
2013-03-06 16:20:12.4172 [PID=12285] Scheduler ran 0.069 seconds

What is the "matchmaker" comment about?

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2142
Credit: 2774588476
RAC: 842135

Quote:

I am not sure it's just a "stats" issue. I share my computer with SETI and have plenty of work from them. However, my downloads from E@H have dropped off quite a bit in the last few days. As of right now I have 2 tasks in progress. This is not the norm, so something is not quite right.

I am seeing the following in the online log which I don't understand:

2013-03-06 16:20:12.3545 [PID=12285] Request: [USER#xxxxx] [HOST#6382800] [IP xxx.xxx.xxx.22] client 7.0.29
2013-03-06 16:20:12.3568 [PID=12285] [debug] [HOST#6382800] Resetting nresults_today
2013-03-06 16:20:12.3576 [PID=12285] [handle] [HOST#6382800] [RESULT#351302051] [WU#151592225] got result (DB: server_state=4 outcome=0 client_state=0 validate_state=0 delete_state=0)
2013-03-06 16:20:12.3576 [PID=12285] [handle] cpu time 0.000000 credit/sec 0.003894, claimed credit 0.000000
2013-03-06 16:20:12.3578 [PID=12285] [handle] [RESULT#351302051] [WU#151592225]: setting outcome SUCCESS
2013-03-06 16:20:12.4147 [PID=12285] [send] effective_ncpus 8 max_jobs_on_host_cpu 999999 max_jobs_on_host 999999
2013-03-06 16:20:12.4147 [PID=12285] [send] effective_ngpus 1 max_jobs_on_host_gpu 999999
2013-03-06 16:20:12.4147 [PID=12285] [send] Not using matchmaker scheduling; Not using EDF sim
2013-03-06 16:20:12.4147 [PID=12285] [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
2013-03-06 16:20:12.4147 [PID=12285] [send] CUDA: req 0.00 sec, 0.00 instances; est delay 0.00
2013-03-06 16:20:12.4147 [PID=12285] [send] work_req_seconds: 0.00 secs
2013-03-06 16:20:12.4147 [PID=12285] [send] available disk 87.30 GB, work_buf_min 0
2013-03-06 16:20:12.4147 [PID=12285] [send] active_frac 0.999992 on_frac 0.999911 DCF 0.897591
2013-03-06 16:20:12.4169 [PID=12285] Sending reply to [HOST#6382800]: 0 results, delay req 60.00
2013-03-06 16:20:12.4172 [PID=12285] Scheduler ran 0.069 seconds

What is the "matchmaker" comment about?


That's an interesting question, which I'm sure one of our esteemed (and technically adept) moderators will answer in due course.

But it has nothing to do with your downloads dropping off. You're not getting any new work because you're not asking for any new work. Look closer to home.
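The point about not asking for work can be read straight off the quoted log: the `work_req_seconds` line shows 0.00. Below is a minimal Python sketch of that check, assuming only the line format visible in the log excerpts in this thread. The function names `requested_seconds` and `diagnose` are made up for illustration and are not part of BOINC.

```python
import re

# Matches the "[send] work_req_seconds: N secs" line as it appears in the
# scheduler log excerpts posted in this thread.
WORK_REQ = re.compile(r"\[send\] work_req_seconds: ([0-9.]+) secs")


def requested_seconds(log_text):
    """Return the seconds of work the client asked for, or None if not found."""
    match = WORK_REQ.search(log_text)
    return float(match.group(1)) if match else None


def diagnose(log_text):
    """Summarise whether the client asked the scheduler for any work."""
    secs = requested_seconds(log_text)
    if secs is None:
        return "no work request line found"
    if secs == 0.0:
        return "client did not ask for work"
    return f"client asked for {secs:.2f} s of work"


# Two lines copied from the log quoted above.
sample = (
    "2013-03-06 16:20:12.4147 [PID=12285] [send] work_req_seconds: 0.00 secs\n"
    "2013-03-06 16:20:12.4169 [PID=12285] Sending reply to [HOST#6382800]: "
    "0 results, delay req 60.00\n"
)
print(diagnose(sample))
```

A request of 0.00 seconds followed by "0 results" means the server had nothing to decide; the client simply reported a result without asking for more.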

Darth Beaver
Joined: 28 Jul 08
Posts: 49
Credit: 14208989
RAC: 0

Hi, thanks for the update. So, Scotty, that week you gave it: is that Trekkie time? Like the captain says, "So, Scotty, that's 7 hours then." I bloody hope it's not a week.

paris
Joined: 11 Jan 06
Posts: 48
Credit: 7580340
RAC: 9915

I am having a similar issue and have had for a few days now. The messages I get are as follows:

2013-03-06 23:47:50.1974 [PID=30612] Request: [USER#xxxxx] [HOST#2952443] [IP xxx.xxx.xxx.47] client 6.10.56
2013-03-06 23:47:50.2035 [PID=30612] [debug] [HOST#2952443] Resetting nresults_today
2013-03-06 23:47:50.2036 [PID=30612] [send] effective_ncpus 2 max_jobs_on_host_cpu 999999 max_jobs_on_host 999999
2013-03-06 23:47:50.2036 [PID=30612] [send] effective_ngpus 0 max_jobs_on_host_gpu 999999
2013-03-06 23:47:50.2036 [PID=30612] [send] Not using matchmaker scheduling; Not using EDF sim
2013-03-06 23:47:50.2036 [PID=30612] [send] CPU: req 34091.36 sec, 0.20 instances; est delay 0.00
2013-03-06 23:47:50.2037 [PID=30612] [send] work_req_seconds: 34091.36 secs
2013-03-06 23:47:50.2037 [PID=30612] [send] available disk 54.81 GB, work_buf_min 172800
2013-03-06 23:47:50.2037 [PID=30612] [send] active_frac 0.990677 on_frac 0.994236 DCF 1.163416
2013-03-06 23:47:50.2043 [PID=30612] [send] [HOST#2952443] is reliable
2013-03-06 23:47:50.2044 [PID=30612] [send] set_trust: random choice for error rate 0.003387: yes
2013-03-06 23:47:50.2185 [PID=30612] [version] Checking plan class 'BRP4cuda32OSX'
2013-03-06 23:47:50.2191 [PID=30612] [version] reading plan classes from file '/BOINC/projects/EinsteinAtHome/plan_class_spec.xml'
2013-03-06 23:47:50.2192 [PID=30612] [version] OS version required min: 100800, supplied: 81101
2013-03-06 23:47:50.2192 [PID=30612] [version] Checking plan class 'opencl-ati-lion'
2013-03-06 23:47:50.2192 [PID=30612] [version] OS version required min: 110000, supplied: 81101
2013-03-06 23:47:50.2192 [PID=30612] [version] no app version available: APP#19 (einsteinbinary_BRP4) PLATFORM#6 (i686-apple-darwin) min_version 0
2013-03-06 23:47:50.2192 [PID=30612] [version] no app version available: APP#19 (einsteinbinary_BRP4) PLATFORM#3 (powerpc-apple-darwin) min_version 0
2013-03-06 23:47:50.2238 [PID=30612] [send] stopping work search - no locality app selected
2013-03-06 23:47:50.2239 [PID=30612] [send] stopping work search - no locality app selected
2013-03-06 23:47:50.2255 [PID=30612] [debug] [HOST#2952443] MSG(high) No work sent
2013-03-06 23:47:50.2255 [PID=30612] [debug] [HOST#2952443] MSG(high) see scheduler log messages on http://einstein.phys.uwm.edu//host_sched_logs/2952/2952443
2013-03-06 23:47:50.2255 [PID=30612] [debug] [HOST#2952443] MSG(high) No work available for the applications you have selected. Please check your preferences on the web site.
2013-03-06 23:47:50.2255 [PID=30612] Sending reply to [HOST#2952443]: 0 results, delay req 60.00
2013-03-06 23:47:50.2258 [PID=30612] Scheduler ran 0.035 seconds

I have noticed that the BRP work generators are frequently not running. Is anyone else having a problem? If I am interpreting the above correctly, there has been an upgrade that requires a newer operating system. Any help or enlightenment would be appreciated.

(edit): I have been running OS X (Tiger) 10.4.11 on a Mac mini core duo for a long time with no problems. Do I now need Lion?


Plus SETI Classic = 21,082 WUs

Nobody316
Joined: 14 Jan 13
Posts: 141
Credit: 2008126
RAC: 0

Quote:
I have noticed that the BRP work generators are frequently not running. Is anyone else having a problem? If I am interpreting the above correctly, there has been an upgrade that requires a newer operating system. Any help or enlightenment would be appreciated.

I can't say for sure whether the OS has any effect. I am running Windows 7 x64 and at the moment only run BRP4 for CPU and GPU. I still get work daily with no problems. The only problem at the moment is the stats update on BOINCstats; the on-site update works fine.

PC setup MSI-970A-G46 AMD FX-8350 8 core OC'd 4.45GHz 16GB ram PC3-10700 Geforce GTX 650Ti Windows 7 x64 Einstein@Home

robl
Joined: 2 Jan 13
Posts: 1709
Credit: 1454570221
RAC: 3375

Quote:
Quote:

I am not sure it's just a "stats" issue. I share my computer with SETI and have plenty of work from them. However, my downloads from E@H have dropped off quite a bit in the last few days. As of right now I have 2 tasks in progress. This is not the norm, so something is not quite right.

I am seeing the following in the online log which I don't understand:

2013-03-06 16:20:12.3545 [PID=12285] Request: [USER#xxxxx] [HOST#6382800] [IP xxx.xxx.xxx.22] client 7.0.29
2013-03-06 16:20:12.3568 [PID=12285] [debug] [HOST#6382800] Resetting nresults_today
2013-03-06 16:20:12.3576 [PID=12285] [handle] [HOST#6382800] [RESULT#351302051] [WU#151592225] got result (DB: server_state=4 outcome=0 client_state=0 validate_state=0 delete_state=0)
2013-03-06 16:20:12.3576 [PID=12285] [handle] cpu time 0.000000 credit/sec 0.003894, claimed credit 0.000000
2013-03-06 16:20:12.3578 [PID=12285] [handle] [RESULT#351302051] [WU#151592225]: setting outcome SUCCESS
2013-03-06 16:20:12.4147 [PID=12285] [send] effective_ncpus 8 max_jobs_on_host_cpu 999999 max_jobs_on_host 999999
2013-03-06 16:20:12.4147 [PID=12285] [send] effective_ngpus 1 max_jobs_on_host_gpu 999999
2013-03-06 16:20:12.4147 [PID=12285] [send] Not using matchmaker scheduling; Not using EDF sim
2013-03-06 16:20:12.4147 [PID=12285] [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
2013-03-06 16:20:12.4147 [PID=12285] [send] CUDA: req 0.00 sec, 0.00 instances; est delay 0.00
2013-03-06 16:20:12.4147 [PID=12285] [send] work_req_seconds: 0.00 secs
2013-03-06 16:20:12.4147 [PID=12285] [send] available disk 87.30 GB, work_buf_min 0
2013-03-06 16:20:12.4147 [PID=12285] [send] active_frac 0.999992 on_frac 0.999911 DCF 0.897591
2013-03-06 16:20:12.4169 [PID=12285] Sending reply to [HOST#6382800]: 0 results, delay req 60.00
2013-03-06 16:20:12.4172 [PID=12285] Scheduler ran 0.069 seconds

What is the "matchmaker" comment about?


That's an interesting question, which I'm sure one of our esteemed (and technically adept) moderators will answer in due course.

But it has nothing to do with your downloads dropping off. You're not getting any new work because you're not asking for any new work. Look closer to home.

I am running on Linux x64. If I update E@H using the "Update" button in BOINC Manager, I download new WUs. It almost seems as though the "automatic" update is not taking place when jobs are complete. I have looked at the various parameters on this site and do not see any that could affect automatic download of WUs.

What am I missing?

Holmis
Joined: 4 Jan 05
Posts: 1118
Credit: 1055935564
RAC: 0

Quote:
Quote:
Quote:

I am not sure it's just a "stats" issue. I share my computer with SETI and have plenty of work from them. However, my downloads from E@H have dropped off quite a bit in the last few days. As of right now I have 2 tasks in progress. This is not the norm, so something is not quite right.

I am seeing the following in the online log which I don't understand:

2013-03-06 16:20:12.3545 [PID=12285] Request: [USER#xxxxx] [HOST#6382800] [IP xxx.xxx.xxx.22] client 7.0.29
2013-03-06 16:20:12.3568 [PID=12285] [debug] [HOST#6382800] Resetting nresults_today
2013-03-06 16:20:12.3576 [PID=12285] [handle] [HOST#6382800] [RESULT#351302051] [WU#151592225] got result (DB: server_state=4 outcome=0 client_state=0 validate_state=0 delete_state=0)
2013-03-06 16:20:12.3576 [PID=12285] [handle] cpu time 0.000000 credit/sec 0.003894, claimed credit 0.000000
2013-03-06 16:20:12.3578 [PID=12285] [handle] [RESULT#351302051] [WU#151592225]: setting outcome SUCCESS
2013-03-06 16:20:12.4147 [PID=12285] [send] effective_ncpus 8 max_jobs_on_host_cpu 999999 max_jobs_on_host 999999
2013-03-06 16:20:12.4147 [PID=12285] [send] effective_ngpus 1 max_jobs_on_host_gpu 999999
2013-03-06 16:20:12.4147 [PID=12285] [send] Not using matchmaker scheduling; Not using EDF sim
2013-03-06 16:20:12.4147 [PID=12285] [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
2013-03-06 16:20:12.4147 [PID=12285] [send] CUDA: req 0.00 sec, 0.00 instances; est delay 0.00
2013-03-06 16:20:12.4147 [PID=12285] [send] work_req_seconds: 0.00 secs
2013-03-06 16:20:12.4147 [PID=12285] [send] available disk 87.30 GB, work_buf_min 0
2013-03-06 16:20:12.4147 [PID=12285] [send] active_frac 0.999992 on_frac 0.999911 DCF 0.897591
2013-03-06 16:20:12.4169 [PID=12285] Sending reply to [HOST#6382800]: 0 results, delay req 60.00
2013-03-06 16:20:12.4172 [PID=12285] Scheduler ran 0.069 seconds

What is the "matchmaker" comment about?


That's an interesting question, which I'm sure one of our esteemed (and technically adept) moderators will answer in due course.

But it has nothing to do with your downloads dropping off. You're not getting any new work because you're not asking for any new work. Look closer to home.

I am running on Linux x64. If I update E@H using the "Update" button in BOINC Manager, I download new WUs. It almost seems as though the "automatic" update is not taking place when jobs are complete. I have looked at the various parameters on this site and do not see any that could affect automatic download of WUs.

What am I missing?

The work cache settings work like this in BOINC version 7:
"Computer is connected to the Internet about every: xx days" is a low water mark.
"Maintain enough work for an additional xx days" forms a high water mark.
BOINC will request enough work for low + high and then wait until the cache drops below the low water mark again before asking for more.
So if you set it to something like 1 + 0.1, BOINC will always keep about one day's worth of work.
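The rule described above can be sketched in a few lines of Python. This is only an illustration of the description in this post, not the actual BOINC client logic. The function name `work_to_request` and its parameters are hypothetical, and real work fetch also weighs resource share, per-resource buffers and deadlines.

```python
def work_to_request(buffered_days, connect_interval_days, additional_days):
    """Days of work to ask for under the simplified low/high water mark rule.

    connect_interval_days stands for the "connected about every xx days"
    setting (low water mark); additional_days stands for the "additional xx
    days" setting. Both names are illustrative assumptions.
    """
    if buffered_days >= connect_interval_days:
        # Still at or above the low water mark: request nothing.
        return 0.0
    # Below the low water mark: refill up to low + additional.
    return (connect_interval_days + additional_days) - buffered_days


# Settings from the example above: 1 day + 0.1 additional days.
print(work_to_request(1.05, 1.0, 0.1))            # above the low mark
print(round(work_to_request(0.95, 1.0, 0.1), 2))  # just below the low mark
```

With 1 + 0.1 the cache oscillates in a narrow band around one day, which is why a host that is still above the low water mark sends a request for 0 seconds of work.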

If you run more than one project, you also have to consider resource share. You have probably run more Einstein work than SETI work recently, and now SETI is being allowed to catch up.

TPCBF
Joined: 24 Nov 12
Posts: 17
Credit: 145310756
RAC: 9857

Quote:

That's an interesting question, which I'm sure one of our esteemed (and technically adept) moderators will answer in due course.

But it has nothing to do with your downloads dropping off. You're not getting any new work because you're not asking for any new work. Look closer to home.

I run E@H on several different hosts (4x XP, one of them exclusively, and 3x Win7), and they just get new tasks as usual. I have not seen any significant variation in getting new tasks since the issue with the server showed up. I checked all machines precisely because I noticed a couple of days ago that the stats weren't updating.
The missing stats update is the only thing I noticed, and as long as everything else seems to be working OK, that's fine with me. It just takes a bit more effort to monitor a few hosts for anything out of the ordinary for a few days...

Ralf

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4273
Credit: 245277134
RAC: 11956

- stats have been dumped from the master DB now; when the stats sites pick them up depends on those sites.

- work at UWM is progressing faster than feared; we may have the replica working (or at least being worked on) later today.

- BOINC offers three "schedulers", referred to as "old"/"array", "locality" and "matchmaker". On Einstein@Home we're using the array scheduler to send work for BRP(4) and FGRP(2) and the locality scheduler for GW (S6BucketLVE) work, the matchmaker isn't used. The log entry about it can safely be ignored.

- scheduling (i.e. reporting and getting tasks) is completely independent of the replica.

- An App selection in BOINC is opt-in. If the project issues a new application after you once made a selection, you won't get work for this new application until you revisit your preferences and select this application. On Einstein@Home FGRP1 has been superseded by FGRP2, and previous GW apps (S6LV1, S6Bucket and older) by the recent S6BucketLVE. If you once chose to run applications that don't exist anymore, you may not get any work at all. The "run other apps if no work is available for selected apps" setting is meant to work around this problem, but it doesn't work reliably on Einstein@Home. For now you need to revisit your Einstein@Home preferences every time we release a new application. Changing this behavior is under discussion, but hasn't been implemented yet.
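The opt-in behaviour described in the last point can be pictured as a set intersection. The Python sketch below is only an illustration under that assumption, not the project's scheduler code. The function `eligible_apps` is hypothetical; the application names are the ones mentioned in this post.

```python
def eligible_apps(selected, current):
    """Apps a host could get work for: selected by the user AND still issued."""
    return sorted(set(selected) & set(current))


# Applications currently sending work, as named in this post.
current_apps = {"BRP4", "FGRP2", "S6BucketLVE"}

# A selection made before the newer applications were released.
old_selection = {"FGRP1", "S6LV1", "S6Bucket"}

print(eligible_apps(old_selection, current_apps))      # nothing matches
print(eligible_apps({"FGRP1", "BRP4"}, current_apps))  # only BRP4 remains
```

An empty result corresponds to the "No work available for the applications you have selected" message; revisiting the preferences and ticking the new applications makes the intersection non-empty again.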

BM
