Windows Beta Test App 4.24 available

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5845

Credit: 109879735956

RAC: 30587762

RE: It turns out there are

3 Jul 2007 7:58:05 UTC

Message 69300 in response to message 69282

Quote:

It turns out there are a number of issues that lead to these cross-platform validation problems, some of which have been addressed recently, some we're still digging for. Solving these problems will probably require both a new validator and a complete set of Apps. I am confident that we will have all these pieces together next week.

BM

Bernd,

When you get to the point of deploying the new validator and the new set of apps, are you intending to run a (perhaps short) beta test phase first, as you did with the 4.24 Windows app?

If you are, might I make a suggestion about the app_info.xml file that would accompany each test app? As you warn quite clearly on the beta test page, changing the app aborts any work in progress with a client error. However you can easily avoid this with a small modification to the app_info.xml file. If you are already fully aware of this and do not want to allow a change of app in the middle of a result, that is fine - no change is needed.

My thinking is that the beta test period could be kept shorter and the number of potential beta testers could be increased if people were allowed to "re-brand" the results in their caches so that they didn't have to abort or wait for their caches to drain or in any way disrupt their normal crunching patterns in order to participate in the test. I'm sure that people have done this in the past by editing their state files. I think it's much safer to do it through the app_info.xml mechanism.

Cheers,
Gary.

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5845

Credit: 109879735956

RAC: 30587762

Bernd, Hopefully, whilst

3 Jul 2007 9:00:15 UTC

Message 69301

Bernd,

Hopefully, whilst I've got your attention, you might like to review this thread concerning stalled results. I've noticed this behaviour a few times now and I've recorded the result ID of my latest stalled result there.

The result in question was being crunched with the 4.17 Windows app. A little while after I kicked it back to life, I decided to test out my app_info.xml mods in order to speed up the completion of the result as much as possible by using 4.24 instead of 4.17. Even though my result was past the deadline, a third result had not been issued at that point. I hoped that I might be able to beat the system and keep the third result "unsent" :).

Although there was a 25%+ speedup of the final stages of crunching, I still missed out on stopping the third result being issued by just 37 mins.

Cheers,
Gary.

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4273

Credit: 245184601

RAC: 13895

RE: When you get to the

4 Jul 2007 10:25:16 UTC

Message 69302 in response to message 69300

Quote:

When you get to the point of deploying the new validator and the new set of apps, are you intending to run a (perhaps short) beta test phase first, as you did with the 4.24 Windows app?

If new Apps are needed, I'll definitely publish them for a public Beta test first.

Currently it looks like upgrading some server-side components (validator and workunit generator) may solve the problem and be the best choice, but we're still looking into this.

Quote:

If you are, might I make a suggestion about the app_info.xml file that would accompany each test app? As you warn quite clearly on the beta test page, changing the app aborts any work in progress with a client error. However you can easily avoid this with a small modification to the app_info.xml file. If you are already fully aware of this and do not want to allow a change of app in the middle of a result, that is fine - no change is needed.

My thinking is that the beta test period could be kept shorter and the number of potential beta testers could be increased if people were allowed to "re-brand" the results in their caches so that they didn't have to abort or wait for their caches to drain or in any way disrupt their normal crunching patterns in order to participate in the test. I'm sure that people have done this in the past by editing their state files. I think it's much safer to do it through the app_info.xml mechanism.

Actually, I won't advise people to manually hack the client_state.xml files; they are too fragile.

However, in the future the app_info.xml files in the Beta Test packages will include entries for previous (maybe both official and beta) App versions, so installing the Beta Test package, even in the middle of a result, will not lead to a Client Error. The result in progress will simply be finished with the old App version, and new work will be assigned to the new App.

Furthermore, if you really want to switch the App version halfway through a result, see the sticky post on this subject. I cannot guarantee that it will work at all, as e.g. the syntax of the checkpoint file might change between versions.

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5845

Credit: 109879735956

RAC: 30587762

RE: Furthermore if you

4 Jul 2007 11:03:37 UTC

Message 69303 in response to message 69302

Quote:

Furthermore if you really want to switch the App version halfway through a result, see the sticky post on this subject. I can not guarantee that it will work at all, as e.g. the syntax of the checkpoint file might change between versions.

Hi Bernd,

Thanks for the reply.

I'm fully aware of the sticky you link to, and I'm also NOT suggesting any hacking of the state file. My comments were about making some additions to the app_info.xml file so that the state file would remain pristine, and so that there would be no need to rename the new executable to pretend to be the old one either (as was mentioned in the sticky).

Take the transition from 4.17 to 4.24 as an example. Here there were desirable bugfixes and apparently no change in output syntax. It would therefore be prudent for any 4.17 "branded" results in a person's cache to be crunched by 4.24, rather than by the old buggy app. This can be achieved very simply by building a bit more intelligence into app_info.xml. No dodgy editing of the state file is required at all.
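To make the idea concrete, here is a minimal sketch in Python that generates such an app_info.xml. The element names follow the usual BOINC anonymous-platform layout; the app name and executable file name are assumptions made up for illustration, not the project's actual names. It also assumes the client matches each cached result to an `<app_version>` entry by its version number, so listing 417 alongside 424 and pointing both at the 4.24 executable is what "re-brands" the cached 4.17 results.

```python
import xml.etree.ElementTree as ET

# Hypothetical names for illustration only; substitute the real ones.
APP_NAME = "einstein_S5R2"
NEW_EXE = "einstein_S5R2_4.24_windows_intelx86.exe"


def build_app_info(app_name, version_to_exe):
    """Build an <app_info> tree with one <app_version> per version number.

    version_to_exe maps a BOINC version_num (e.g. 417 for 4.17) to the
    executable that should process results tagged with that version.
    """
    root = ET.Element("app_info")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name

    # Declare each distinct executable once.
    for exe in sorted(set(version_to_exe.values())):
        file_info = ET.SubElement(root, "file_info")
        ET.SubElement(file_info, "name").text = exe
        ET.SubElement(file_info, "executable")

    # One <app_version> per version number, each referencing its executable.
    for version_num, exe in sorted(version_to_exe.items()):
        app_version = ET.SubElement(root, "app_version")
        ET.SubElement(app_version, "app_name").text = app_name
        ET.SubElement(app_version, "version_num").text = str(version_num)
        file_ref = ET.SubElement(app_version, "file_ref")
        ET.SubElement(file_ref, "file_name").text = exe
        ET.SubElement(file_ref, "main_program")
    return root


# Results branded 4.17 and 4.24 are both handled by the 4.24 executable.
rebrand = build_app_info(APP_NAME, {417: NEW_EXE, 424: NEW_EXE})
ET.indent(rebrand)  # pretty-printing; requires Python 3.9+
print(ET.tostring(rebrand, encoding="unicode"))
```

Mapping 417 to the old executable instead would give the more conservative variant, where a result in progress is finished by the old app and only new work goes to the new one. Whether a mid-result switch is safe still depends on the checkpoint format being unchanged between versions.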

Cheers,
Gary.

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4273

Credit: 245184601

RAC: 13895

RE: Taking the case of the

4 Jul 2007 11:41:57 UTC

Message 69304 in response to message 69303

Quote:

Taking the case of the transition from 4.17 to 4.24 as an example. Here there were desirable bugfixes and apparently no change in output syntax. It would be prudent therefore for any 4.17 "branded" results in a person's cache to be crunched by 4.24, rather than the old buggy app. This can be achieved very simply using a bit more intelligence built into app_info.xml. No dodgy editing of the state file is required at all.

I understand.

I guess I have to think about this a little more.

Bikeman (Heinz-...

Moderator

Joined: 28 Aug 06

Posts: 3522

Credit: 688499096

RAC: 208155

RE: Currently it looks

4 Jul 2007 13:09:19 UTC

Message 69306 in response to message 69302

Quote:

Currently it looks like upgrading some server-side components (validator and workunit generator) may solve the problem and be the best choice, but we're still looking into this.

BM

Wouldn't it be worthwhile to correct the uninitialized data problem in the Linux and Mac apps? As those were detected by compiler runtime checks, to me it sounds as if they were relevant.

BRM

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4273

Credit: 245184601

RAC: 13895

RE: RE: Currently it

4 Jul 2007 14:11:00 UTC

Message 69307 in response to message 69306

Quote:

Quote:
Currently it looks like upgrading some server-side components (validator and workunit generator) may solve the problem and be the best choice, but we're still looking into this.

Wouldn't it be worthwhile to correct the uninitialized data problem in the Linux and Mac apps? As those were detected by compiler runtime checks, to me it sounds as if they were relevant.

On Linux and Mac we haven't seen a single result that has been affected by this bug, i.e. it didn't have an effect on the final outcome of the calculation. With this 4.24 Windows App we have found another problem in the same module (which might have been introduced by the fix to the earlier problem). We're working on this. So we'll definitely release a new generation of Apps anyway with some bugfixes.

However for the cross-platform validation problem (only) it might be that we'll need to deal with this only on the server side.

Brian Silvers

Joined: 26 Aug 05

Posts: 772

Credit: 282700

RAC: 0

How about the 0xc0000142

4 Jul 2007 18:28:04 UTC

Message 69308

How about the 0xc0000142 crash issues? I don't know if you got my email, as you haven't replied... I wish I knew more of what to help with, but that error is a vexing one...

Edit: BTW, SIGABRT still seems to come up for Linux. See this result.

Bikeman (Heinz-...

Moderator

Joined: 28 Aug 06

Posts: 3522

Credit: 688499096

RAC: 208155

RE: How about the

4 Jul 2007 18:33:09 UTC

Message 69309 in response to message 69308

Quote:

How about the 0xc0000142 crash issues? I don't know if you got my email, as you haven't replied... I wish I knew more of what to help with, but that error is a vexing one...

Is it still happening with the new app? I would have guessed that the majority of these bugs were secondary problems resulting from a failure to initialize the runtime debugger (which should now work).

BRM

Brian Silvers

Joined: 26 Aug 05

Posts: 772

Credit: 282700

RAC: 0

RE: RE: How about the

4 Jul 2007 18:55:17 UTC

Message 69310 in response to message 69309

Quote:

Quote:
How about the 0xc0000142 crash issues? I don't know if you got my email, as you haven't replied... I wish I knew more of what to help with, but that error is a vexing one...

Is it still happening with the new app? I would have guessed that the majority of these bugs were secondary problems resulting from a failure to initialize the runtime debugger (which should now work).

He emailed me the other day asking about it. It is with 4.24. 0xc0000142 means that a DLL failed to initialize. It is a Windows stop error. From what I read through googling it, it could be a science app problem or it could be a graphics subsystem problem. On the graphics side, I found a few mentions of the issue happening with ATI video cards. Sooooo, based on what I recall from the initial Linux Signal 11 ("SIGABRT") issue with some OpenGL library, it could be whatever OpenGL software the ATI Catalyst drivers use...
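For anyone searching their own results for this error, a small sketch of the arithmetic may help. 0xc0000142 is the Windows status code STATUS_DLL_INIT_FAILED; when the same 32-bit value is printed as a signed integer it shows up as a large negative number instead. The helper names below are made up for illustration, and whether a given log shows the hex or the signed decimal form is an assumption to check against your own output.

```python
def to_signed32(code: int) -> int:
    """Reinterpret an unsigned 32-bit status code as a signed 32-bit integer."""
    code &= 0xFFFFFFFF
    return code - 0x100000000 if code & 0x80000000 else code


def to_unsigned_hex(code: int) -> str:
    """Render a (possibly negative) 32-bit exit code in the familiar hex form."""
    return f"0x{code & 0xFFFFFFFF:08x}"


STATUS_DLL_INIT_FAILED = 0xC0000142

print(to_signed32(STATUS_DLL_INIT_FAILED))  # signed decimal form of the code
print(to_unsigned_hex(-1073741502))         # back to the hex form
```

Searching for either form should find the same class of failure.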

Ultimately, it's way out of my league. I mentioned he should contact Rom Walton...one of the main BOINC developers...

Brian
