Silent Client Errors -- lost over 100 credits -- Now what?

Mike
Joined: 12 Jan 08
Posts: 6
Credit: 1293131
RAC: 0

Message 97113 in response to message 97112

Thanks again for your replies.

Nils: Are you asking that I stop posting here and continue this thread under the topic of "Client Errors when throtteling"(sic), following Message 102272?

Gary: Thanks for your informative post.
a. My PC is now un-hidden. Hope this helps.

b. I changed my preferences from CPU usage of 70% to 100%. Also, I now only allow computing overnight, and not at all when the computer is "idle". I also have preferences set to unload BOINC when it is not in use.

c. In message 102272 Nils said the latest development version was 6.10.29 which I guess you suggested be installed 'over the top' of my current version 6.10.18. However, I noticed that the most recent development version (as of 25 Feb) was 6.10.35. I thought before I install that version I should try to change my preferences to see if that resolves the problem.

I am not sure if this is good news or bad, but since I changed my preferences I have not seen any client errors in the log that I check daily. Also I see that eventually, credits are applied. The only thing I do not see (when computation is complete) is pending credits. That is, I see tasks are completed but credits don't show up for a few days and nothing is listed in pending credits.

If it would be of help I would be willing to go back to my previous preferences to see if the original problems re-occur. If you or Nils would like me to do this, what settings should I use and what version should I install that would give us the maximum amount of useful information.

Best

Mike Behar

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4305
Credit: 248475895
RAC: 29681

Message 97114 in response to message 97113

Hi Mike!

Quote:
If it would be of help I would be willing to go back to my previous preferences to see if the original problems re-occur. If you or Nils would like me to do this, what settings should I use and what version should I install that would give us the maximum amount of useful information.

It would help us (and potentially other participants, possibly of all BOINC projects) to track down the problem if you could
- install a recent development BOINC Core Client (any client from 6.10.29 on should have the necessary features),
- add the cc_config.xml file described in the other thread (I think you need to restart the client to recognize it) and
- restore your original computing settings (70% CPU usage)

I'm monitoring both threads, so feel free to post in either one.

BM

Mike
Joined: 12 Jan 08
Posts: 6
Credit: 1293131
RAC: 0

Message 97115 in response to message 97114

Hi Bernd,

OK, I did the following:

1. Created the following cc_config.xml (and added two items for fun ;-).
I put this file in my BOINC folder.

1
1

1
90

2. I installed BOINC 6.10.35 (and verified that my cc_config.xml file was not altered).

3. Changed my preferences back to 70% usage and allowed BOINC to run when the computer is idle (i.e., unused by me for more than 10 minutes).

Below is a copy of the global_preferences_override.xml file (which was in the same BOINC folder as the cc_config.xml file).
-- OVERRIDE --

0
1
0
10.000000
0.000000
22.000000
10.000000
0.000000
0.000000
0
0
0
0
0.100000
4.000000
100.000000
60.000000
10.000000
10.000000
25.000000
0.000000
40.000000
50.000000
85.000000
10240000.000000
10240000.000000
70.000000

4. Here are a few of the startup messages copied from the manager screen. NOTE: The benchmark reports the number of CPUs as 2, which seems odd to me.

2/28/2010 11:38:33 AM Starting BOINC client version 6.10.35 for windows_intelx86
2/28/2010 11:38:33 AM log flags: file_xfer, sched_ops, task, cpu_sched
2/28/2010 11:38:33 AM Libraries: libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
2/28/2010 11:38:33 AM Running as a daemon
2/28/2010 11:38:33 AM Data directory: D:\Data\BOINC
2/28/2010 11:38:33 AM Running under account boinc_master
2/28/2010 11:38:33 AM Processor: 2 GenuineIntel Intel(R) Pentium(R) 4 CPU 3.20GHz [Family 15 Model 2 Stepping 9]
2/28/2010 11:38:33 AM Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pbe
2/28/2010 11:38:33 AM OS: Microsoft Windows XP: Professional x86 Edition, Service Pack 3, (05.01.2600.00)
2/28/2010 11:38:33 AM Memory: 2.00 GB physical, 5.85 GB virtual
2/28/2010 11:38:33 AM Disk: 55.63 GB total, 47.56 GB free
2/28/2010 11:38:33 AM Local time is UTC -5 hours
2/28/2010 11:38:33 AM No usable GPUs found
2/28/2010 11:38:34 AM Version change (6.10.18 -> 6.10.35)

BUT I also noticed this:

2/28/2010 11:38:34 AM Running CPU benchmarks
2/28/2010 11:38:34 AM Suspending computation - running CPU benchmarks
2/28/2010 11:39:05 AM Benchmark results:
2/28/2010 11:39:05 AM Number of CPUs: 2
2/28/2010 11:39:05 AM 1366 floating point MIPS (Whetstone) per CPU
2/28/2010 11:39:05 AM 2243 integer MIPS (Dhrystone) per CPU

I will watch what happens over time and report back if (when) I see errors.

Please let me know if there are any settings or other items you would like me to change.

Best,

Mike

Mike
Joined: 12 Jan 08
Posts: 6
Credit: 1293131
RAC: 0

Message 97116 in response to message 97115

Hi Bernd,

In the Messages tab of BOINC Manager I see the sequence below quite often, but I do NOT see any local error message. However, when I check online under 'tasks for user' I do see "client errors".

Here are some details:

from Message Tab:
3/1/2010 10:42:51 PM [cpu_sched] Suspending - CPU throttle
3/1/2010 10:42:51 PM SETI@home [cpu_sched] Preempting 27mr07ad.14077.184825.6.10.133_0 (left in memory)
3/1/2010 10:42:51 PM rosetta@home [cpu_sched] Preempting tyrsim_3gbn_2ibb_Protein_interface_design_25Feb2010_18415_432_0 (left in memory)
3/1/2010 10:42:52 PM [cpu_sched] Resuming - CPU throttle
3/1/2010 10:42:52 PM SETI@home [cpu_sched] Resuming 27mr07ad.14077.184825.6.10.133_0
3/1/2010 10:42:52 PM rosetta@home [cpu_sched] Resuming tyrsim_3gbn_2ibb_Protein_interface_design_25Feb2010_18415_432_0
3/1/2010 10:42:54 PM [cpu_sched] Suspending - CPU throttle
3/1/2010 10:42:54 PM SETI@home [cpu_sched] Preempting 27mr07ad.14077.184825.6.10.133_0 (left in memory)
3/1/2010 10:42:54 PM rosetta@home [cpu_sched] Preempting tyrsim_3gbn_2ibb_Protein_interface_design_25Feb2010_18415_432_0 (left in memory)
3/1/2010 10:42:55 PM [cpu_sched] Resuming - CPU throttle
3/1/2010 10:42:55 PM SETI@home [cpu_sched] Resuming 27mr07ad.14077.184825.6.10.133_0
3/1/2010 10:42:55 PM rosetta@home [cpu_sched] Resuming tyrsim_3gbn_2ibb_Protein_interface_design_25Feb2010_18415_432_0
3/1/2010 10:42:57 PM Suspending computation - user is active
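For anyone wanting to quantify how often the throttle fires, here is a minimal Python sketch that pulls the client-level "Suspending - CPU throttle" / "Resuming - CPU throttle" lines out of a message log like the one above and estimates the fraction of time spent running. The function names (`throttle_events`, `duty_cycle`) are made up for illustration and are not part of BOINC; the line format is assumed to match the excerpt exactly. Because the timestamps only have one-second resolution, the result is a rough estimate at best.

```python
import re
from datetime import datetime

# Matches client-level throttle lines such as:
#   "3/1/2010 10:42:51 PM [cpu_sched] Suspending - CPU throttle"
# Per-project lines ("SETI@home [cpu_sched] Preempting ...") do not match.
THROTTLE_RE = re.compile(
    r"^(\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M)\s+"
    r"\[cpu_sched\] (Suspending|Resuming) - CPU throttle\s*$"
)


def throttle_events(lines):
    """Return (timestamp, action) pairs for throttle suspend/resume messages."""
    events = []
    for line in lines:
        m = THROTTLE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group(1), "%m/%d/%Y %I:%M:%S %p")
            events.append((ts, m.group(2)))
    return events


def duty_cycle(events):
    """Estimate the running fraction from consecutive event pairs.

    Time from "Resuming" to the next "Suspending" counts as running;
    time from "Suspending" to the next "Resuming" counts as paused.
    Returns None if no time span could be measured.
    """
    running = paused = 0.0
    for (t0, a0), (t1, a1) in zip(events, events[1:]):
        span = (t1 - t0).total_seconds()
        if a0 == "Resuming" and a1 == "Suspending":
            running += span
        elif a0 == "Suspending" and a1 == "Resuming":
            paused += span
    total = running + paused
    return running / total if total else None


sample = [
    "3/1/2010 10:42:51 PM [cpu_sched] Suspending - CPU throttle",
    "3/1/2010 10:42:51 PM SETI@home [cpu_sched] Preempting 27mr07ad.14077.184825.6.10.133_0 (left in memory)",
    "3/1/2010 10:42:52 PM [cpu_sched] Resuming - CPU throttle",
    "3/1/2010 10:42:54 PM [cpu_sched] Suspending - CPU throttle",
    "3/1/2010 10:42:55 PM [cpu_sched] Resuming - CPU throttle",
]
events = throttle_events(sample)
print(len(events), "throttle events; estimated running fraction:", duty_cycle(events))
```

A short window like this one is dominated by edge effects, so a longer stretch of the log gives a more meaningful number.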

/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
From ONLINE report I see
"
too many exit(0)s
"
see here for examples with details:

http://einsteinathome.org/task/164408331

http://einsteinathome.org/task/164139372

http://einsteinathome.org/task/163847243

I will try changing the settings so that throttling happens less often.

From a user perspective, it would be nice if the error was shown in the local message tab, otherwise the only way to know that things are going badly is to do an online check...

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4305
Credit: 248475895
RAC: 29681

Message 97117 in response to message 97116

Thanks a lot!

I need to analyze this a little bit more to find the actual problem, but it's definitely helpful!

(It looks like the process ID that I thought should be in the logs is missing; I need to find out why.)

If you want, you can restore your previous compute settings (100% CPU overnight) to avoid the errors. The client version and config file shouldn't do any harm. Please leave them in place; we may ask you for one or two more experiments like this.

Thanks a lot for your help!

BM

Mike
Joined: 12 Jan 08
Posts: 6
Credit: 1293131
RAC: 0

Message 97118 in response to message 97117

Great. Glad to be of help.

I will revert to 100% overnight processing and will continue to use version 6.10.35 in case you need additional testing.

Since I might not check back here very often, if you would like to reach me, feel free to email me at: withforesight at gmail dot com.

Mike

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4305
Credit: 248475895
RAC: 29681

Thanks again!

For the techs: It looks like at some point there are multiple instances of the same application running in the same slot (i.e. writing to the same stderr file). One possible cause is that quitting an application (when BOINC suspends a task) may not work while the application has threads suspended (for throttling / CPU usage).

A fix for at least this possible issue was checked into BOINC last night. I need to build new Apps with it, and then we'll see if that actually fixes the problem. I'll do this ASAP, but there are a couple of higher-priority things on my table that will probably occupy the rest of this week.

BM
