Searching for pulsars in PALFA data from Arecibo

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5779100
RAC: 0

Resetting the project, after

Message 89066 in response to message 89065

Resetting the project, after deleting the app_info.xml file & restarting BOINC, will at the very least make sure the correct application version is downloaded again.
Just deleting the app_info.xml file will leave information in the client_state.xml file that an anonymous platform is being used.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2142
Credit: 2774461696
RAC: 847433

RE: Resetting the project,

Message 89067 in response to message 89066

Quote:
Resetting the project, after deleting the app_info.xml file & restarting BOINC, will at the very least make sure the correct application version is downloaded again.
Just deleting the app_info.xml file will leave information in the client_state.xml file that an anonymous platform is being used.


I was just going to suggest the same thing!

In fact, I'm surprised that Gary's method works, because in general Beta apps are unsigned, but BOINC requires a signature for the production apps.

Gundolf Jahn
Joined: 1 Mar 05
Posts: 1079
Credit: 341280
RAC: 0

RE: In fact, I'm surprised

Message 89068 in response to message 89067

Quote:
In fact, I'm surprised that Gary's method works, because in general Beta apps are unsigned, but BOINC requires a signature for the production apps.


That would explain this:-)

19/09/2009 15:41:20|Einstein@Home|[error] Application file einstein_S5R5_3.05_windows_intelx86.exe missing signature
19/09/2009 15:41:20|Einstein@Home|[error] BOINC cannot accept this file

I had only deleted the app_info.xml file. However, there have been no problems since, even without a reset.

Regards,
Gundolf

Computers aren't everything in life. (Just kidding)

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5850
Credit: 110030870096
RAC: 22431869

RE: Resetting the project,

Message 89069 in response to message 89066

Quote:
Resetting the project, after deleting the app_info.xml file & restarting BOINC, will at the very least make sure the correct application version is downloaded again.


True - but I've always had an aversion to resetting projects if there is another way ... :-).

Quote:
Just deleting the app_info.xml file will leave information in the client_state.xml file that an anonymous platform is being used.


Which was why I said to "delete the reference to the test app that is inserted into the state file."

For someone with a single machine and no experience with playing around in the state file, resetting is obviously a far less risky way to go.

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5850
Credit: 110030870096
RAC: 22431869

RE: ... I'm surprised that

Message 89070 in response to message 89067

Quote:
... I'm surprised that Gary's method works, because in general Beta apps are unsigned, but BOINC requires a signature for the production apps.


Not only does the method work but cutting and pasting the signature from another machine running the official app works like a charm as well. I tried cutting and pasting first but then discovered I didn't even need to do that as long as I removed the AP stuff from the state file. The state file ended up with the signature inserted automatically - I guess as a result of the scheduler exchange that ended with the "file exists - skipping download" message.

It's a while since I last did that so I'm a bit hazy on the details but I'm sure that I was able to do this without even running the cache dry. I just examined the differences between two state files, one running under AP and the other running the same app 'officially'. It was pretty obvious what needed to be deleted. I used to pre-prepare the signature as a small block to be inserted and then do each machine sequentially. It took around a minute per machine to go from crunching under AP to crunching the same tasks with the same app - but 'officially'. These days I don't even bother inserting the signature - I just delete the AP stuff. I don't know if anything has changed with more recent BOINCs that would invalidate the procedure.
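
The comparison of two state files described above can be sketched with Python's standard difflib. This is only an illustration, not the procedure actually used: the function name state_file_diff is made up, and the XML fragments in the demo (including the hypothetical_ap_marker tag) are invented placeholders, not real client_state.xml content.

```python
import difflib


def state_file_diff(ap_text, official_text):
    """Return unified-diff lines between two state-file contents.

    ap_text       -- contents of a state file from a host running under
                     the anonymous platform (AP)
    official_text -- contents from a host running the same app officially
    """
    return list(difflib.unified_diff(
        ap_text.splitlines(),
        official_text.splitlines(),
        fromfile="client_state.xml (anonymous platform)",
        tofile="client_state.xml (official app)",
        lineterm="",
        n=1,  # one line of context around each difference
    ))


if __name__ == "__main__":
    # Invented fragments for demonstration only; real files are much larger
    # and the tag names below are NOT taken from an actual state file.
    ap = "\n".join([
        "<app_version>",
        "<app_name>example_app</app_name>",
        "<hypothetical_ap_marker/>",
        "</app_version>",
    ])
    official = "\n".join([
        "<app_version>",
        "<app_name>example_app</app_name>",
        "</app_version>",
    ])
    for line in state_file_diff(ap, official):
        print(line)
```

Lines prefixed with "-" exist only in the AP file and are the candidates for deletion; lines prefixed with "+" exist only in the official file.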

Another thing I've found I can do by playing around in the state file is completely recover trashed caches. On quad core machines, I quite often have a cache of 50-100 tasks or more. My internet plan with my ISP has a 6GB monthly limit with a 60GB off-peak free allowance. I run a cron job twice a day on one machine that uses boinccmd to stop network access on all machines just before the start of the peak period and then to allow access just after the start of the off-peak period. That way all uploading and downloading is done in the free period.
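
A minimal sketch of what such a cron-driven script might look like, assuming each machine accepts remote boinccmd connections and shares one GUI RPC password. The host names, the password handling and the function names are assumptions for illustration; only the boinccmd options --host, --passwd and --set_network_mode are taken from the boinccmd command line itself.

```python
import subprocess

VALID_MODES = ("always", "auto", "never")


def network_mode_commands(hosts, mode, password=None):
    """Build one boinccmd argument list per host for the given network mode."""
    if mode not in VALID_MODES:
        raise ValueError("mode must be one of %s" % ", ".join(VALID_MODES))
    commands = []
    for host in hosts:
        argv = ["boinccmd", "--host", host]
        if password is not None:
            argv += ["--passwd", password]
        argv += ["--set_network_mode", mode]
        commands.append(argv)
    return commands


def apply_network_mode(hosts, mode, password=None):
    """Run the commands; a host that is down does not stop the others."""
    for argv in network_mode_commands(hosts, mode, password):
        subprocess.run(argv, check=False)


if __name__ == "__main__":
    # Hypothetical host names; print the commands instead of running them.
    for argv in network_mode_commands(["quad1", "quad2"], "never"):
        print(" ".join(argv))
```

One cron entry would call apply_network_mode with "never" just before the peak period starts, and a second entry would call it with "auto" (or "always") just after the off-peak period starts.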

So with this regimen in place and with the peak period being during the day and evening, I tend to keep an eye out for problems during that time. If something goes wrong and a cache gets trashed, nothing gets reported to the project and an easy recovery is usually possible. As an example, I had an overheating problem on a machine with a dodgy CPU fan. Fortunately it happened and was discovered during peak hours when comms were disabled. The machine didn't crash - it just trashed the entire cache :-). After replacing the fan, I was able to restart the machine and edit the state file to remove all the trashed results, leaving the small number of results that had been completed without error prior to the fan going south. When I restarted BOINC and re-enabled network access, the server promptly sent a flood of missing results and accepted the upload of the good results that had been saved. All I lost was the time that had been spent on the 4 current tasks in flight, up to the point that the temperature became too hot.
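
The state-file edit described above could be sketched roughly as follows. The post does not say which entries identify a trashed result, so the sketch assumes each result is a top-level result element with name and exit_status children, and treats a non-zero exit_status as "trashed". Those assumptions, and the function name drop_failed_results, are illustrative only. Anyone trying this should do it with BOINC stopped, network access still disabled, and a backup copy of the state file.

```python
import xml.etree.ElementTree as ET


def drop_failed_results(state_xml):
    """Remove result elements with a non-zero exit status.

    Returns (new_xml_text, names_of_removed_results).
    ASSUMPTION: results are direct children of the root element and carry
    <name> and <exit_status> children; results without an <exit_status>
    are treated as not failed and are kept.
    """
    root = ET.fromstring(state_xml)
    removed = []
    for result in list(root.findall("result")):
        exit_status = result.findtext("exit_status", default="0").strip()
        if exit_status != "0":
            removed.append(result.findtext("name", default="?").strip())
            root.remove(result)
    return ET.tostring(root, encoding="unicode"), removed


if __name__ == "__main__":
    # Invented miniature state file for demonstration only.
    sample = (
        "<client_state>"
        "<result><name>task_ok</name><exit_status>0</exit_status></result>"
        "<result><name>task_trashed</name><exit_status>1</exit_status></result>"
        "</client_state>"
    )
    new_xml, removed = drop_failed_results(sample)
    print("removed:", removed)
```

After the edited file is put back and BOINC is restarted with network access re-enabled, it is the server-side 'resend lost results' feature that replaces the removed tasks; the sketch only covers the local edit.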

I would have done at least 10 full cache recoveries like this (for various reasons) over the last year or two. As yet, I haven't had a single recovery fail. Gotta love that 'resend lost results' BOINC feature :-).

Cheers,
Gary.
