No Progress

ebahapo
Joined: 22 Jan 05
Posts: 47
Credit: 755276
RAC: 0
Topic 187852

Some WUs accumulate time but the progress indicator doesn't move beyond zero. If left alone, the WUs are eventually aborted for taking too long. Examples are H1_0160.4__0160.8_0.1_T05_Test02_0 and H1_0160.4__0160.7_0.1_T05_Test02_0. What gives?

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5893653
RAC: 0

Which BOINC CC version are you using?

Some of the Alpha products (4.62, 4.64, 4.66, 4.20, 4.21 and 4.22) don't show any progress in the progress bars, or totally lack the progress bars.

ebahapo
Joined: 22 Jan 05
Posts: 47
Credit: 755276
RAC: 0

Message 4088 in response to message 4087

> Which BOINC CC version are you using?

I'm using 4.19. Some E@H WUs do show the progress bar moving on the same machine, it's only some WUs that seem stuck...

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4349
Credit: 253632788
RAC: 35273

You can tell from the error code ERR_RESULT_START that, for most of your results, the client apparently couldn't start the app. The stderr reads "aborted via GUI RPC". To me this looks like something on your machine is sending trash to the RPC port. Are you running any malware scanners or the like?

BM

ebahapo
Joined: 22 Jan 05
Posts: 47
Credit: 755276
RAC: 0

Message 4090 in response to message 4089

> You can tell from the error code ERR_RESULT_START that for most of your
> results apparently the Client couldn't start the App. The stderr reads
> "aborted via GUI RPC". To me this looks like something on your machine is
> sending Trash to the RPC port. Running any malware scanners etc.?

Well, after several hours (about 20h) without progress, I did abort the WUs through the GUI RPC... Typically, the WUs on that system take between 10 and 15h to complete. Was I too impatient?

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4349
Credit: 253632788
RAC: 35273

No, I guess. Anyway - in these cases the App couldn't even start properly. Hm, you are running a 4.19 client - how did you abort the Result? Using telnet or the boincmgr of a different client version?

BM

ebahapo
Joined: 22 Jan 05
Posts: 47
Credit: 755276
RAC: 0

Message 4092 in response to message 4091

> No, I guess. Anyway - in these cases the App couldn't even start properly. Hm,
> you are running a 4.19 client - how did you abort the Result? Using telnet or
> the boincmgr of a different client version?

Why do you say that it couldn't start properly? It did clock 3728s, although it ran for over 20h before I aborted it using BoincView 0.9.2d (http://boincview.amanheis.de/).

TIA

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5893653
RAC: 0

BoincView, as far as I know, only reports the amount of time written to client_state.xml; it doesn't read the actual time that BOINC has been crunching the unit.

Since the version of BV you are using is still at least a beta client, the problem could lie there. It could also be a problem with BV working together with BOINC CC 4.19.

Then there's the fact that BV does not abort units in BOINC. So the units must have run for the amount of time reported and then been aborted, while BV kept showing them as running. That's a bug for you to report to the BV developers.

You can only abort units in BOINC CC 4.2x, not in BOINC CC 4.19.

To abort units there, you have to either reset the project through the BOINC GUI or detach and reattach.

ebahapo
Joined: 22 Jan 05
Posts: 47
Credit: 755276
RAC: 0

Message 4094 in response to message 4093

> BoincView as far as I know only reports the amount of time written to the
> client_state.xml, it's not reading the actual time that BOINC is crunching the
> unit.

The only interface with the system in question is through TCP port 31416...

On another note, I noticed that a WU wasn't being run at all. This system has 2 processors, yet only one was running anything (S@h specifically). Although BOINC thought that it was running E@H too, the other processor was 100% idle. And I mean idle, no nice time.
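For anyone wanting to confirm the same thing on a Linux host, here is a minimal sketch that compares two snapshots of /proc/stat and reports how many ticks each processor spent in user, nice and system time. The function names are made up for illustration; only the first four numeric columns of the per-CPU lines are used.

```python
def parse_proc_stat(text):
    """Parse the per-CPU lines of /proc/stat into a dict keyed by CPU name."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-CPU lines look like "cpu0 user nice system idle ...".
        # The aggregate "cpu" line (no digit suffix) is skipped.
        if len(fields) >= 5 and fields[0].startswith("cpu") and fields[0][3:].isdigit():
            user, nice, system, idle = (int(v) for v in fields[1:5])
            stats[fields[0]] = {"user": user, "nice": nice, "system": system, "idle": idle}
    return stats


def busy_delta(before, after):
    """Ticks spent in user + nice + system between two snapshots, per CPU."""
    return {
        cpu: sum(after[cpu][key] - before[cpu][key] for key in ("user", "nice", "system"))
        for cpu in after
        if cpu in before
    }
```

To use it, read /proc/stat, wait a few seconds, read it again, and pass both parsed snapshots to busy_delta. A processor whose delta stays at or near zero while BOINC claims to be running a WU on it is not actually crunching anything.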

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5893653
RAC: 0

But BoincView is a separate application, not programmed by anyone who is programming BOINC. BOINC projects write their state to the hard drive every so many seconds, at an interval set either by you or by the application. They write this state to the client_state.xml file.

All of the external applications read the client_state.xml file or another file they fancy.

When the unit is done, the results will be written back to the server of the project you participate in.

So the port you specify may be some network port on another computer that you can check on, but it isn't a port that BOINC communicates on.

ebahapo
Joined: 22 Jan 05
Posts: 47
Credit: 755276
RAC: 0

Message 4096 in response to message 4095

> All of the external applications read the client_state.xml file or another
> file they fancy.

That's not the case here. I run BoincView on a system that's not the one in question. As a matter of fact, I use it on Windows and the system in question, http://einsteinathome.org/host/8009, runs Linux. It is not sharing the BOINC directory.

> So the port you specify may be something of a network port to another computer
> that you can check on, but it isn't any port that BOINC communicates on.

No, it's the port used for RPC by BOINC client 4.19. BoincView is getting that information through the RPC interface.
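A quick way to check from another machine that the client is listening on that port is a plain TCP connection attempt. This is only a sketch: it tests whether port 31416 is reachable and does not speak the RPC protocol itself. The function name and the timeout value are made up for illustration.

```python
import socket


def rpc_port_open(host, port=31416, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and connects; the socket is
        # closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Calling rpc_port_open with the name of the Linux host returns True if something accepts connections on the port. It says nothing about which program is listening there.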
