Windows Beta Test App 4.24 available

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,027
Credit: 215,662,186
RAC: 76,719

RE: How about the

Message 69311 in response to message 69308

Quote:
How about the 0xc0000142 crash issues? I don't know if you got my email, as you haven't replied... I wish I knew more of what to help with, but that error is a vexing one...


Yep, got it. Sorry for not replying immediately, had two rather chaotic days. Wrote to Rom about it as you suggested.

Quote:
Edit: BTW, SIGABRT still seems to come up for Linux. See this result.


Yep. But not too many (190 in the past week), and most come from the same 4 machines. Not my highest priority right now.

BM


Alinator
Joined: 8 May 05
Posts: 927
Credit: 9,352,143
RAC: 0

Just had a 4.24 crap out

Just had a 4.24 crap out about halfway through its first result run with 4.24.

85487605

Looks like it failed on a routine task switch restart.

Alinator

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,516
Credit: 468,679,238
RAC: 58,045

RE: Just had a 4.24 crap

Message 69313 in response to message 69312

Quote:

Just had a 4.24 crap out about halfway through its first result run with 4.24.

85487605

Looks like it failed on a routine task switch restart.

Alinator

Very strange: it restarts, finds the checkpoint file (!), tries to open it but somehow can't (!), and exits with an error message that the checkpoint file isn't there at all ...

CU

BRM

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,491
Credit: 63,508,163,253
RAC: 53,685,498

I have noticed one of these

I have noticed one of these as well. At first glance it seems to be the same situation as Alinator's. It happened on the third result since the switch to 4.24.

Before I saw Alinator's report, I had attributed this error to hardware problems. With a large number of older machines, I've run across quite a number of motherboards which have developed the "swollen capacitor syndrome". Being curious by nature and owning a good quality Weller soldering iron, I've attempted the repair of about 10 such motherboards. Until now, my success rate has been 100% since all such repaired systems are back in production.

The result linked above was crunched on a machine where about 8 caps were replaced. I've only replaced caps that are obviously swollen so there are still some original caps left. It has been running fine for about 2 months since the repair but has started locking up about once a day recently. I've been restarting it as required and it has been completing work without any client errors until now. So I don't really know if the client error was associated with more faulty caps or a problem with the 4.24 app. Alinator's seemingly identical error is making me wonder if it's the app.

This weekend I'll probably take the mobo out and see if I can spot any more dodgy caps. If so I'll replace some more of them and see if that cures the lockups. It'll also be interesting to see if I get any more client errors on that particular combination of hardware, 4.24 app, and particular frequency data file, once I cure the lockups.

Cheers,
Gary.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,027
Credit: 215,662,186
RAC: 76,719

RE: Very strange: It

Message 69315 in response to message 69313

Quote:
Very strange: it restarts, finds the checkpoint file (!), tries to open it but somehow can't (!), and exits with an error message that the checkpoint file isn't there at all ...


Yep. This has kept me confused ever since I made the error messages a little more verbose. We actually get a lot of these errors; I'll write to Rom about that. Maybe boinc_fopen() does some funny things...
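To make the "found it, but can't open it" confusion concrete, here is a minimal sketch of how an app could tell "file is missing" apart from "file exists but could not be opened". It uses plain fopen() and errno, not boinc_fopen(), and the helper names open_checkpoint() and describe_open_error() are hypothetical, not taken from the Einstein@Home code.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: try to open a checkpoint file for reading.
 * Returns the stream, or NULL with *err set to the errno value left by
 * fopen(). POSIX specifies that fopen() sets errno on failure; ISO C alone
 * does not guarantee it, so treat the value as a hint. */
FILE *open_checkpoint(const char *path, int *err)
{
    errno = 0;
    FILE *fp = fopen(path, "rb");
    *err = (fp == NULL) ? errno : 0;
    return fp;
}

/* Hypothetical helper: turn the saved errno into a message that does not
 * claim "file not there" unless that is what the runtime reported. */
const char *describe_open_error(int err)
{
    if (err == 0) {
        return "opened";
    }
    if (err == ENOENT) {
        return "checkpoint file does not exist";
    }
    return strerror(err); /* e.g. a permission or locking problem */
}
```

Logging the second message instead of a fixed "file not found" string would at least show whether the open failed for a reason other than a missing file.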

BM


RandyC
Joined: 18 Jan 05
Posts: 3,636
Credit: 111,139,797
RAC: 0

To answer the question below

Message 69316 in response to message 69270

To answer the question below about 4.24 vs 4.17 with and without the patch, I went ahead and applied the 'ABC' patch and got the following results:

4.17 with 'ABC' patch applied yields approx 85.3k sec/WU
4.24 with no patch yields approx 90.9k sec/WU
4.24 with 'ABC' patch applied yields approx 88.3k sec/WU...about a 50 min. penalty/WU for running 4.24 vs 4.17 on this machine.

Quote:
Quote:


What is the effect that happens when you "ABC" again? Is that working against the modf() -> ftol() change, or is there still some activity going to modf() despite the change, meaning there's another "buggy detection" different from the one that was already worked around, or is that string changing some other function?

Brian

Anyway...the effect of the "ABC" patch is that on AMD CPUs that support SSE2, a global flag in the runtime lib is set differently. This flag toggles the (usually) faster SSE2 codepath for several functions, not just modf. What Bernd did was to rewrite the code in the hot-loop so that it would no longer call modf but ftol, for which, in VS 2003, only one code path exists, which is reasonably fast. The slow codepath will continue to be executed in the new Win apps, but no longer in the hot-loop, as I understand it, so the overall effect of "ABC"ing the app should be much smaller now.
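As a rough illustration of the modf -> ftol change, the two functions below compute the same fractional part. This is only a sketch, not the actual hot-loop code: it assumes "ftol" refers to the runtime helper that the compiler calls for a truncating float-to-integer cast, and the names frac_modf() and frac_cast() are made up for the example.

```c
#include <math.h>

/* Fractional part via the C library: modf() splits x into an integer part
 * (stored in ipart) and returns the fractional part, both with the sign of x. */
double frac_modf(double x)
{
    double ipart;
    return modf(x, &ipart);
}

/* Same result via a truncating cast, which avoids the modf() call.
 * Only valid while the integer part of x fits into a long. */
double frac_cast(double x)
{
    long ipart = (long)x;        /* conversion truncates toward zero */
    return x - (double)ipart;
}
```

For values in range the two agree exactly, for example both give 0.75 for 2.75 and -0.75 for -2.75.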

CU

H-B


Seti Classic Final Total: 11446 WU.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,027
Credit: 215,662,186
RAC: 76,719

Just to keep you updated of

Just to keep you updated on our plans, mainly regarding the cross-platform differences:
- Early next week (probably Monday) we'll issue a new validator that should make the transition easier and will probably fix some invalid results by itself.
- After the new validator is in place, we'll issue a new set of Apps for public Beta Test (for all platforms) that incorporate the fixes accomplished so far. I'll keep on tracking problems and fixing bugs I find until the very last moment. The new Apps will also incorporate a new feature that we might need.
- If it turns out that we need this feature (using pre-calculated files instead of doing the calculations in the Apps to avoid platform differences there), we will issue new workunits (actually a new workunit generator) that will make use of this feature after the new Apps have been made "official".
- Once we've got the validation working properly, I'll work on speeding up the computation in the Apps. The current code I plan to use for parts of the calculation, btw., doesn't make use of either modf() or ftol() anymore but actually uses bit-operations to achieve something similar.
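For readers wondering what "bit-operations to achieve something similar" could look like, here is one possible sketch. It is not the code planned for the Apps (the post does not show it); it assumes IEEE 754 64-bit doubles, and the names trunc_bits() and frac_bits() are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Truncate x toward zero by clearing the mantissa bits that encode the
 * fraction. Assumes IEEE 754 binary64: 1 sign bit, 11 exponent bits,
 * 52 mantissa bits. */
double trunc_bits(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);     /* reinterpret the bytes safely */

    /* Unbiased exponent: the value is 1.m * 2^exponent for normal numbers. */
    int exponent = (int)((bits >> 52) & 0x7FF) - 1023;

    if (exponent < 0) {
        /* |x| < 1 (including zero and subnormals): keep only the sign. */
        bits &= UINT64_C(0x8000000000000000);
    } else if (exponent < 52) {
        /* The lowest (52 - exponent) mantissa bits hold the fraction. */
        bits &= ~(UINT64_C(0x000FFFFFFFFFFFFF) >> exponent);
    }
    /* exponent >= 52: already an integer (or inf/NaN), leave unchanged. */

    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Fractional part with the sign of x, like the return value of modf(). */
double frac_bits(double x)
{
    return x - trunc_bits(x);
}
```

Unlike modf(), frac_bits() yields NaN for infinite input, so a real implementation would need to decide how to treat such values.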

BM


Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,516
Credit: 468,679,238
RAC: 58,045

Excellent news, and just in

Excellent news, and just in time to deal with the 'new' monster workunits (>= 630 credits) that would otherwise cause quite a bit of frustration if crunched for zero credits because of the cross-platform validation issue.

As to performance, I was surprised to see that the new app with ftol instead of modf seems to be slightly *faster*, at least on some modern SSE2-capable Intel (!!!) CPUs. I know it's a tad slower on my Pentium M, but I checked one of the top 3 computers (see link on E@H homepage) and there was no decrease in crunching performance when the switch happened.

CU

BRM

Mats Nilsson
Joined: 10 Dec 05
Posts: 94
Credit: 15,011,147
RAC: 0

My AMD 3500+ liked the 4.24,

My AMD 3500+ liked the 4.24: it went from about 38 hr with 4.17 to ~28 hr with 4.24 on WUs from the same data file. I'm still waiting for my crunch partner to see if it is valid. If there is more that can be done to speed it up, that's great, but I understand that the validation problem must be looked at first.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,491
Credit: 63,508,163,253
RAC: 53,685,498

RE: After the new validator

Message 69320 in response to message 69317

Quote:
After the new validator is in place, we'll issue a new set of Apps for public Beta Test (for all platforms) that incorporate the fixes accomplished so far...

Bernd,

You might like to consider posting a short news item (linking to your latest message) on the front page right now. This would give more people who might like to participate in the next beta test some time to do a bit of research before things get going in earnest. There probably aren't a whole lot of participants following this particular thread anymore :).

The other major benefit is for all those people who wouldn't participate in a beta test anyway. At least they should be highly encouraged to see that something is happening to address issues that may currently be turning them off this project.

Just IMHO of course :).

Cheers,
Gary.
