Windows Beta Test App 4.24 available

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,870
Credit: 116,815,990,004
RAC: 36,228,971

RE: It turns out there are

Message 69300 in response to message 69282

Quote:

It turns out there are a number of issues that lead to these cross-platform validation problems, some of which have been addressed recently, some we're still digging for. Solving these problems will probably require both a new validator and a complete set of Apps. I am confident that we will have all these pieces together next week.

BM

Bernd,

When you get to the point of deploying the new validator and the new set of apps, are you intending to run a (perhaps short) beta test phase first, as you did with the 4.24 Windows app?

If you are, might I make a suggestion about the app_info.xml file that would accompany each test app? As you warn quite clearly on the beta test page, changing the app aborts any work in progress with a client error. However you can easily avoid this with a small modification to the app_info.xml file. If you are already fully aware of this and do not want to allow a change of app in the middle of a result, that is fine - no change is needed.

My thinking is that the beta test period could be kept shorter and the number of potential beta testers could be increased if people were allowed to "re-brand" the results in their caches so that they didn't have to abort or wait for their caches to drain or in any way disrupt their normal crunching patterns in order to participate in the test. I'm sure that people have done this in the past by editing their state files. I think it's much safer to do it through the app_info.xml mechanism.

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,870
Credit: 116,815,990,004
RAC: 36,228,971

Bernd, Hopefully, whilst

Bernd,

Hopefully, whilst I've got your attention, you might like to review this thread concerning stalled results. I've noticed this behaviour a few times now and I've recorded the result ID of my latest stalled result there.

The result in question was being crunched with the 4.17 Windows app. A little while after I kicked it back to life, I decided to test out my app_info.xml mods in order to speed up the completion of the result as much as possible by using 4.24 instead of 4.17. Even though my result was past the deadline, a third result had not been issued at that point. I hoped that I might be able to beat the system and keep the third result "unsent" :).

Although there was a 25%+ speedup of the final stages of crunching, I still missed out on stopping the third result being issued by just 37 mins.

Cheers,
Gary.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,307
Credit: 249,732,370
RAC: 34,610

RE: When you get to the

Message 69302 in response to message 69300

Quote:
When you get to the point of deploying the new validator and the new set of apps, are you intending to run a (perhaps short) beta test phase first, as you did with the 4.24 Windows app?

If new Apps are needed, I'll definitely publish them for a public Beta test first.

Currently it looks like upgrading some server-side components (validator and workunit generator) may solve the problem and be the best choice, but we're still looking into this.

Quote:

If you are, might I make a suggestion about the app_info.xml file that would accompany each test app? As you warn quite clearly on the beta test page, changing the app aborts any work in progress with a client error. However you can easily avoid this with a small modification to the app_info.xml file. If you are already fully aware of this and do not want to allow a change of app in the middle of a result, that is fine - no change is needed.

My thinking is that the beta test period could be kept shorter and the number of potential beta testers could be increased if people were allowed to "re-brand" the results in their caches so that they didn't have to abort or wait for their caches to drain or in any way disrupt their normal crunching patterns in order to participate in the test. I'm sure that people have done this in the past by editing their state files. I think it's much safer to do it through the app_info.xml mechanism.


Actually, I won't advise people to manually hack the client_state.xml files; they are too fragile.

However, in the future the app_info.xml files in the Beta Test packages will include entries for previous (maybe both official and beta) App versions. Installing the Beta Test Package in the middle of a result will then not lead to a Client Error; the result will simply be finished with the old App version, and new work will be assigned to the new App.
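
As a rough illustration of that idea, here is a minimal Python sketch that writes an app_info.xml with one app_version entry per App version, each pointing at its own executable. The app name and file names are made-up placeholders (the real ones are not given in this thread), and the tag layout follows the usual anonymous-platform app_info.xml structure, so treat it as an assumption rather than the contents of an actual Beta Test package.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical names: the thread does not give the real app or file names.
APP_NAME = "einstein_example"
EXECUTABLES = {
    417: "einstein_example_4.17_windows.exe",  # previous official App
    424: "einstein_example_4.24_windows.exe",  # Beta Test App
}


def build_app_info(app_name, executables):
    """Return an <app_info> element with one <app_version> per version number."""
    root = ET.Element("app_info")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name

    # Declare each distinct executable once.
    for file_name in sorted(set(executables.values())):
        info = ET.SubElement(root, "file_info")
        ET.SubElement(info, "name").text = file_name
        ET.SubElement(info, "executable")

    # One <app_version> per version, each referencing its executable.
    for version_num, file_name in sorted(executables.items()):
        version = ET.SubElement(root, "app_version")
        ET.SubElement(version, "app_name").text = app_name
        ET.SubElement(version, "version_num").text = str(version_num)
        ref = ET.SubElement(version, "file_ref")
        ET.SubElement(ref, "file_name").text = file_name
        ET.SubElement(ref, "main_program")
    return root


root = build_app_info(APP_NAME, EXECUTABLES)
# Pretty-print the tree so the result is easy to compare by eye.
print(minidom.parseString(ET.tostring(root)).toprettyxml(indent="  "))
```

With both entries present, a result already tagged with version 417 still has a matching App to run under, which is why installing the package mid-result should no longer produce a Client Error.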

Furthermore, if you really want to switch the App version halfway through a result, see the sticky post on this subject. I cannot guarantee that it will work at all, as, for example, the syntax of the checkpoint file might change between versions.

BM


Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5,870
Credit: 116,815,990,004
RAC: 36,228,971

RE: Furthermore if you

Message 69303 in response to message 69302

Quote:

Furthermore, if you really want to switch the App version halfway through a result, see the sticky post on this subject. I cannot guarantee that it will work at all, as, for example, the syntax of the checkpoint file might change between versions.

Hi Bernd,

Thanks for the reply.

I'm fully aware of the sticky you link to, and I'm also NOT suggesting any hacking of the state file. My comments were about making some additions to the app_info.xml file so that the state file would remain pristine. Nor would there be any need to rename the new executable so that it could pretend to be the old one (as was mentioned in the sticky).

Take the transition from 4.17 to 4.24 as an example. Here there were desirable bugfixes and apparently no change in output syntax. It would therefore be prudent for any 4.17 "branded" results in a person's cache to be crunched by 4.24, rather than the old buggy app. This can be achieved very simply using a bit more intelligence built into app_info.xml. No dodgy editing of the state file is required at all.
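
To make the idea concrete, here is a minimal Python sketch of the kind of addition I mean: it takes an app_info.xml that only knows about the new app and appends a second app_version entry, labelled with the old version number but pointing at the same new executable. The app name and file name are placeholders I've made up for illustration, and whether a result resumes cleanly still depends on the checkpoint format being unchanged between versions.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical app_info.xml that only describes the new app; names are placeholders.
BETA_ONLY = """
<app_info>
  <app><name>einstein_example</name></app>
  <file_info>
    <name>einstein_example_4.24_windows.exe</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>einstein_example</app_name>
    <version_num>424</version_num>
    <file_ref>
      <file_name>einstein_example_4.24_windows.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
"""


def add_alias_version(root, existing_num, alias_num):
    """Copy the <app_version> for existing_num and relabel the copy as alias_num."""
    for version in root.findall("app_version"):
        if version.findtext("version_num") == str(existing_num):
            alias = copy.deepcopy(version)
            alias.find("version_num").text = str(alias_num)
            root.append(alias)
            return alias
    raise ValueError(f"no app_version {existing_num} found")


root = ET.fromstring(BETA_ONLY)
# Results branded 417 now resolve to the 4.24 executable as well.
add_alias_version(root, existing_num=424, alias_num=417)
print(ET.tostring(root, encoding="unicode"))
```

The state file is never touched and the executable keeps its real name; only app_info.xml gains the extra entry.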

Cheers,
Gary.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,307
Credit: 249,732,370
RAC: 34,610

RE: Taking the case of the

Message 69304 in response to message 69303

Quote:
Take the transition from 4.17 to 4.24 as an example. Here there were desirable bugfixes and apparently no change in output syntax. It would therefore be prudent for any 4.17 "branded" results in a person's cache to be crunched by 4.24, rather than the old buggy app. This can be achieved very simply using a bit more intelligence built into app_info.xml. No dodgy editing of the state file is required at all.


I understand.

I guess I have to think about this a little more.

BM


Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,522
Credit: 699,374,007
RAC: 229,763

RE: Currently it looks

Message 69306 in response to message 69302

Quote:


Currently it looks like upgrading some server-side components (validator and workunit generator) may solve the problem and be the best choice, but we're still looking into this.

BM

Wouldn't it be worthwhile to correct the uninitialized data problem in the Linux and Mac apps? As those were detected by compiler runtime checks, to me it sounds as if they were relevant.

CU

BRM

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4,307
Credit: 249,732,370
RAC: 34,610

RE: RE: Currently it

Message 69307 in response to message 69306

Quote:
Quote:
Currently it looks like upgrading some server-side components (validator and workunit generator) may solve the problem and be the best choice, but we're still looking into this.

Wouldn't it be worthwhile to correct the uninitialized data problem in the Linux and Mac apps? As those were detected by compiler runtime checks, to me it sounds as if they were relevant.


On Linux and Mac we haven't seen a single result that has been affected by this bug, i.e. it didn't have an effect on the final outcome of the calculation. With this 4.24 Windows App we have found another problem in the same module (which might have been introduced by the fix to the earlier problem). We're working on this. So we'll definitely release a new generation of Apps anyway with some bugfixes.

However, for the cross-platform validation problem alone, it may be that we only need to deal with it on the server side.

BM


Brian Silvers
Joined: 26 Aug 05
Posts: 772
Credit: 282,700
RAC: 0

How about the 0xc0000142

How about the 0xc0000142 crash issues? I don't know if you got my email, as you haven't replied... I wish I knew more of what to help with, but that error is a vexing one...

Edit: BTW, SIGABRT still seems to come up for Linux. See this result.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,522
Credit: 699,374,007
RAC: 229,763

RE: How about the

Message 69309 in response to message 69308

Quote:
How about the 0xc0000142 crash issues? I don't know if you got my email, as you haven't replied... I wish I knew more of what to help with, but that error is a vexing one...

Is it still happening with the new app? I would have guessed that the majority of these bugs were secondary problems resulting from a failure to initialize the runtime debugger (which should now work).

CU

BRM

Brian Silvers
Joined: 26 Aug 05
Posts: 772
Credit: 282,700
RAC: 0

RE: RE: How about the

Message 69310 in response to message 69309

Quote:
Quote:
How about the 0xc0000142 crash issues? I don't know if you got my email, as you haven't replied... I wish I knew more of what to help with, but that error is a vexing one...

Is it still happening with the new app? I would have guessed that the majority of these bugs were secondary problems resulting from a failure to initialize the runtime debugger (which should now work).

He emailed me the other day asking about it. It is with 4.24. 0xc0000142 means that a DLL did not initialize. It is a Windows stop error. From what I read through googling it, it could be a science app problem or it could be a graphics subsystem problem. On the graphics side, I found a few mentions of the issue happening with ATI video cards. Sooooo, based off of what I recall from the initial Linux Signal 11 ("SIGABRT") issue with some OpenGL library, it could be whatever OpenGL software the ATI Catalyst drivers use...

Ultimately, it's way out of my league. I mentioned he should contact Rom Walton...one of the main BOINC developers...

Brian
