...In the meantime, until the devs have responded: if you are courageous, you can try the following (experts only, and only if you get this nasty -177 error!):
* Stop BOINC
* edit the client_state.xml file:
**** find tags <rsc_fpops_bound>1100000000000000.000000</rsc_fpops_bound>
**** add two more zeros after the 11
* save client_state.xml
* restart BOINC...
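Adding two zeros after the 11 is the same as multiplying the value by 100. A rough sketch of that edit in Python, for anyone who prefers not to do it by hand: the tag name rsc_fpops_bound is an assumption here (it is BOINC's name for the per-workunit operations bound), and the sample text is made up. Only run something like this on a copy of client_state.xml, with BOINC stopped.

```python
import re

# Assumption: the bound is stored as <rsc_fpops_bound>VALUE</rsc_fpops_bound>.
BOUND_RE = re.compile(r"(<rsc_fpops_bound>)([0-9.eE+-]+)(</rsc_fpops_bound>)")


def raise_fpops_bound(xml_text: str, factor: float = 100.0) -> str:
    """Return xml_text with every rsc_fpops_bound value multiplied by factor."""

    def _scale(match: re.Match) -> str:
        new_value = float(match.group(2)) * factor
        # Keep the fixed-point notation with six decimals seen in the file.
        return f"{match.group(1)}{new_value:f}{match.group(3)}"

    return BOUND_RE.sub(_scale, xml_text)


# Made-up fragment standing in for one workunit entry.
sample = (
    "<workunit>\n"
    "    <rsc_fpops_bound>1100000000000000.000000</rsc_fpops_bound>\n"
    "</workunit>\n"
)
patched = raise_fpops_bound(sample)
print(patched)
```

Working on the raw text instead of re-serialising the XML leaves the rest of the file byte-for-byte unchanged.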
Wouldn't it be better to introduce the <flops> directive into the app_info.xml file? I don't know much about it, since I don't have a CUDA device, but from reading on the SETI boards, it sounds as if it could help.
Here's a short reminder of the "Maximum elapsed time exceeded" error:
Quote:
The splitters provide both <rsc_fpops_est>, used to estimate runtime, and 10 times that as <rsc_fpops_bound>, used to determine when work has run so long that BOINC should kill it with a "Maximum elapsed time exceeded". BOINC uses DCF in the estimate but not in the max. If DCF were 1, a VLAR which wasn't rebranded would be killed at 10 times the estimate; if DCF were 0.1, it would have to run 100 times the estimate to reach the max limit.
Obviously, the <flops> directive can be used to "adjust" the DCF to be consistent between different applications (and to be lower than 1, which helps avoid the error).
Here's another thread from the SETI boards, where the p_fpops to flops conversion is shown. The values of course have to be evaluated anew here at Einstein.
Regards,
Gundolf
Computers aren't everything in life. (Just kidding)
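The arithmetic in the quote above can be checked with a small sketch. It assumes the simplified model the quote describes (the estimate is scaled by DCF, the limit is not); the function names and the example numbers are made up for illustration and are not taken from the BOINC source.

```python
def estimated_seconds(fpops_est: float, flops: float, dcf: float) -> float:
    """Runtime estimate: operations / speed, scaled by the duration correction factor."""
    return fpops_est / flops * dcf


def max_elapsed_seconds(fpops_bound: float, flops: float) -> float:
    """Kill limit: operations bound / speed, with no DCF applied."""
    return fpops_bound / flops


def kill_factor(dcf: float, fpops_est: float = 1.1e14, flops: float = 3.0e9) -> float:
    """How many times the displayed estimate a task may run before it is killed."""
    fpops_bound = 10.0 * fpops_est  # the splitter sets the bound to 10x the estimate
    return max_elapsed_seconds(fpops_bound, flops) / estimated_seconds(fpops_est, flops, dcf)


for dcf in (1.0, 0.5, 0.1):
    print(f"DCF {dcf}: killed at about {kill_factor(dcf):.0f}x the estimate")
```

The ratio works out to 10 / DCF regardless of the speed figure, which is why a DCF below 1 leaves more headroom before the error.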
Thanks for the (as always) excellent advice, I didn't know about that thing in the app_info.xml schema.
So, as the CUDA beta app's runtime is of the same order of magnitude as the CPU variant's, would it make sense to set x in <flops>x</flops> to (say) half the value of <p_fpops> in your client_state.xml?
Started my first Einstein beta unit on my home server with a 9600 GT card.
Running it under BOINC 6.10.13 and using the latest version provided.
I am going to see if this runs as well as it does on GPUGrid or Collatz.
But reading through the messages, I don't see much performance gain being reported,
while on most CUDA/ATI projects enormous gains are being reported when using the CUDA/ATI cards.
...So, as the CUDA beta app's runtime is of the same order of magnitude as the CPU variant's, would it make sense to set x in <flops>x</flops> to (say) half the value of <p_fpops> in your client_state.xml?
I hope you aren't asking me :-) since I have no experience with that matter.
I'd say try 0.5 (or perhaps 0.25 of estimated GFlops?) and watch DCF behaviour, thus using trial and error - unless someone else shows up with a sound analytical approach.
Regards,
Gundolf
Computers aren't everything in life. (Just kidding)
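For the trial-and-error approach, a starting value can be derived from the benchmark figure in client_state.xml. This is only a sketch: it assumes the benchmark is stored in a p_fpops element somewhere in that file, and the sample document and the fraction are placeholders to experiment with.

```python
import xml.etree.ElementTree as ET


def suggested_flops(client_state_xml: str, fraction: float = 0.5) -> float:
    """Return fraction * p_fpops, using the first p_fpops element found."""
    root = ET.fromstring(client_state_xml)
    node = next(root.iter("p_fpops"), None)
    if node is None or node.text is None:
        raise ValueError("no p_fpops element found")
    return float(node.text) * fraction


# Made-up minimal document; a real client_state.xml holds much more.
sample = """<client_state>
  <host_info>
    <p_fpops>2400000000.000000</p_fpops>
  </host_info>
</client_state>"""

print(f"{suggested_flops(sample):.1f}")        # half the benchmark
print(f"{suggested_flops(sample, 0.25):.1f}")  # a quarter of it
```

Whichever fraction is tried, the DCF should be watched over several tasks before settling on a value.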
David Anderson checked in changeset 19282 less than an hour ago, apparently in response to this problem.
His solution: assume the CUDA card runs at the same speed as the CPU, unless app_info has a flops line to the contrary. While waiting for a new BOINC client to be compiled, you could test it by putting your CPU floating point benchmark figure in app_info and see how well his idea stacks up!
Make sure you get the orders of magnitude right. The figure to put in app_info is measured in individual FLOPs (billions of them). Benchmarks are usually stated in MFLOPs (thousands of them), and CPU speeds in GFLOPs (not very many of them at all). You can use exponential format in app_info, I'm told, but Bikeman's suggestion of pulling <p_fpops> out of client_state.xml is probably the easiest and most reliable.
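To keep the orders of magnitude straight, the conversions can be written down explicitly. A small sketch follows; the helper names are made up, and the exponential form is shown only because it is mentioned above as reportedly accepted, which is not verified here.

```python
def mflops_to_flops(mflops: float) -> float:
    """Benchmark figure in MFLOPS -> individual floating point operations per second."""
    return mflops * 1e6


def gflops_to_flops(gflops: float) -> float:
    """Speed figure in GFLOPS -> individual floating point operations per second."""
    return gflops * 1e9


def as_plain(value: float) -> str:
    """Plain decimal notation, e.g. 3000000000.0."""
    return f"{value:.1f}"


def as_exponential(value: float) -> str:
    """Exponential notation, e.g. 3.000000e+09."""
    return f"{value:e}"


print(as_plain(mflops_to_flops(3000)))      # a 3000 MFLOPS benchmark
print(as_plain(gflops_to_flops(3)))         # the same speed stated as 3 GFLOPS
print(as_exponential(gflops_to_flops(3)))
```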
Indeed, that change is related to the discussion here and is basically reverting an earlier change. If you are using a BOINC version older than 6.10.5, you are probably not affected by this problem anyway (problem = result terminated prematurely by BOINC because BOINC thinks it is in an endless loop, as its resource limit was exceeded).
Task ID 142450599
Name p2030_53995_04256_0033_G63.04-02.74.C_5.dm_443_1
Workunit 59849668
Created 7 Oct 2009 8:32:53 UTC
Sent 7 Oct 2009 19:40:50 UTC
Received 8 Oct 2009 3:21:09 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status -177 (0xffffffffffffff4f)
Computer ID 2086650
Report deadline 21 Oct 2009 19:40:50 UTC
CPU time 0
stderr out 6.10.13
Maximum elapsed time exceeded
Validate state Invalid
Claimed credit 0
Granted credit 0
application version 3.10
Hi!
This error seems to be caused by what we were discussing a few messages ago in this thread: Newer BOINC versions will perform the calculation of the maximum runtime per result differently.
To fix this, you can try the following:
- shut down BOINC
- find the app_info.xml file; it should be in "C:\Documents and Settings\All Users\Application Data\BOINC\projects\einstein.phys.uwm.edu" or similar
- insert this line after the line ending in :
<flops>3000000000.0</flops>
- save the modified app_info.xml
- restart BOINC
Hope this helps,
Bikeman
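The manual edit above could also be scripted. The sketch below makes two assumptions that should be checked before use: that the flops element belongs inside each app_version block of app_info.xml, and that its position within that block does not matter (the post names a specific line to insert after, which is not reproduced here). The sample document and app name are made up.

```python
import xml.etree.ElementTree as ET


def add_flops(app_info_xml: str, flops: float = 3000000000.0) -> str:
    """Append a flops element to every app_version that does not have one yet."""
    root = ET.fromstring(app_info_xml)
    for app_version in root.iter("app_version"):
        if app_version.find("flops") is None:
            element = ET.SubElement(app_version, "flops")
            element.text = f"{flops:.1f}"
    return ET.tostring(root, encoding="unicode")


# Made-up minimal app_info.xml; the real file also lists files and file references.
sample = (
    "<app_info>"
    "<app_version>"
    "<app_name>hypothetical_app</app_name>"
    "<version_num>310</version_num>"
    "</app_version>"
    "</app_info>"
)
patched = add_flops(sample)
print(patched)
```

Running it a second time leaves the file unchanged, since blocks that already carry a flops line are skipped. As with the manual steps, BOINC should be shut down first and the modified file saved before restarting.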
Darn, I had forgotten to save the file when I edited it.
http://einsteinathome.org/node/194499&nowrap=true#99848
seeing without seeing is something the blind learn to do, and seeing beyond vision can be a gift.