Work Unit not finishing

ric
Joined: 4 Jan 05
Posts: 51
Credit: 236006
RAC: 0
Topic 187217

Windows Advanced Server 2003 AMD 1200, 512MB RAM running at regular speed.

The work unit in the first line of the hardcopy (screenshot) is not finishing.

This is obviously the same situation as described in this thread.

Actions taken so far:

- suspended & restarted the BOINC client
- restarted the client
- upgraded the client from 4.13 to 4.56
- under 4.56, suspended the single work unit, keeping the remaining work queue.

--> The next work unit was completed and returned.

Now the work unit after that one is under way.

If it helps the project, let me know what I can do or what more information is needed,
or whether it's OK to "abort" this single work unit.

regards

ric

Richard M
Joined: 11 Nov 04
Posts: 78
Credit: 250778534
RAC: 939752

Work Unit not finishing

I had the same problem with WU pt14_I12_f59.898_b0.104 (same one?).
I aborted the WU by moving its files out of the slot directory and restarting BOINC.

stderr.txt file shows:
Resuming computation at 31469/19104567/19120703
Resuming computation at 31453/18151777/18185468
Resuming computation at 31469/19104567/19120703
Resuming computation at 31481/18173254/18185468
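For anyone who wants to check their own stderr.txt for the same pattern, here is a minimal sketch that pulls out the "Resuming computation at" lines and reports any resume point that shows up more than once. The thread does not say what the three numbers mean, so they are treated as opaque integers; the function names and the `SAMPLE` text are made up for illustration and are not part of BOINC or the Einstein app.

```python
import re
from collections import Counter

# Pattern for the stderr.txt lines quoted above. The meaning of the three
# numbers is not stated in the thread, so they are kept as opaque integers.
RESUME_RE = re.compile(r"Resuming computation at (\d+)/(\d+)/(\d+)")


def parse_resumes(text):
    """Return one (a, b, c) tuple of ints per 'Resuming computation' line."""
    return [tuple(int(g) for g in m.groups()) for m in RESUME_RE.finditer(text)]


def repeated_resume_points(text):
    """Return the resume points that occur more than once, with their counts."""
    counts = Counter(parse_resumes(text))
    return {point: n for point, n in counts.items() if n > 1}


# Hypothetical sample: the four lines quoted in this post.
SAMPLE = """\
Resuming computation at 31469/19104567/19120703
Resuming computation at 31453/18151777/18185468
Resuming computation at 31469/19104567/19120703
Resuming computation at 31481/18173254/18185468
"""

if __name__ == "__main__":
    print(len(parse_resumes(SAMPLE)), "resume lines found")
    print("repeated resume points:", repeated_resume_points(SAMPLE))
```

A resume point that repeats only shows that the app restarted from the same place more than once; it does not by itself prove the endless loop.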

Computer specs:
Dual (2) Intel® XEON 2.8GHz Processors
Quad (4) Hyper-Threading Processors
EMT 64-Bit, 800MHz FSB, 1MB L2 Cache
2GB ECC Registered DDR Memory
Dual Channel DDR266 PC2100
1.44MB Floppy Drive (Black)
Lite-On 52x CDRW Drive (Black)
Onboard SATA Raid Controller
Raid 0, 1, 10, and JBOD Support
(2) 200GB SATA Hard Drives, 8MB Cache
SuperMicro X6DAL-TG 800MHz 64-Bit Board
SuperMicro SATA HotSwap Tower, 450Watt
Onboard Intel Gigabit 10/100/1000 NIC
Video: 128MB PCI Express Video Card

Richard

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4274
Credit: 245316457
RAC: 12004

Sorry, a bug in our code is triggered by data in this Workunit. It ends up in an endless loop at the final calculation, so the counter will stay at 100% until the result reaches the "maximum number of floating point operations" that is defined in the workunit; the client will then continue with the next one. So you don't have to do anything special, just wait...

We found and fixed the bug, but the new app we built is still going through some internal tests. Sorry for the inconvenience; this is an alpha test.
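To get a feel for how long "just wait" might be, here is a rough back-of-the-envelope sketch. It assumes the client turns the workunit's floating-point-operations bound into a time limit by dividing it by the host's benchmarked speed; that is an assumption for illustration, not a description of the actual client code. The function names and the numbers in the example are hypothetical and are not taken from this workunit.

```python
SECONDS_PER_DAY = 86_400


def seconds_until_bound(fpops_bound, host_flops):
    """Estimate how long a stuck result runs before it hits the FLOP bound.

    fpops_bound: upper limit on floating point operations for the workunit
    host_flops:  benchmarked floating point speed of the host (ops/second)
    Both inputs are assumptions supplied by the caller.
    """
    if host_flops <= 0:
        raise ValueError("host_flops must be positive")
    return fpops_bound / host_flops


def as_days(seconds):
    """Convert seconds to days."""
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical values: a bound of 4e15 operations on a host that
    # benchmarks at 1e9 operations per second.
    wait = seconds_until_bound(4 * 10**15, 10**9)
    print(f"estimated wait: {wait:.0f} s, about {as_days(wait):.1f} days")
```

With a high bound and a slow host, the estimate can easily run to days or weeks.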

BM

ric
Joined: 4 Jan 05
Posts: 51
Credit: 236006
RAC: 0

Message 824 in response to message 823

Thanks, Bernd, for answering and for the explanation.

Don't be sorry; I'm fully aware that this is still alpha status and what that means.

I will reactivate the isolated unit and let it run overnight.

It's no inconvenience at all; what matters is supporting the project.

ric

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4274
Credit: 245316457
RAC: 12004

The maximum FLOPS are, I think, set quite high, so you'll have to wait quite a while and still won't get any credit as the result file isn't valid. So if you get bored, "update" the project to upload the Results that you have finished so far, then "reset" the project (and hope you get a different WU...).

BM

ric
Joined: 4 Jan 05
Posts: 51
Credit: 236006
RAC: 0

Message 826 in response to message 825

No!

There is no need to reset the whole project; that would mean all other work for this attached project gets lost.

Right now, 16 other Einstein WUs are sitting in the queue of this host, just enough to stay roughly within the deadline minus 2 days.

I will use the new option to reset a single work unit rather than the whole project.

The RPC "abort result" can be used in this case.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4274
Credit: 245316457
RAC: 12004

Well, I was thinking about the poor users who use the stock 4.13 client. If, however, you've got a newer one, you have better options there.

BM

Marco Niese
Joined: 11 Nov 04
Posts: 63
Credit: 38527
RAC: 0

Message 828 in response to message 827

I've Googled a link for BOINC 4.58, but it still has its share of problems (the service can't be used when BOINC is not installed to the default directory, and there is the ./username login issue).
I don't have this problem, so I'll stick with 4.13 for now, but other people might want to Google "BOINC CC 4.58 released for Windows" (use at your own risk, of course).

> Well, I was thinking about the poor users who use the stock 4.13 client. If,
> however, you got a newer one, the possibilities are better there.
>
> BM
>

- Marco
Team Canada


Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4274
Credit: 245316457
RAC: 12004

The new set of apps we just put on the server (4.69-4.71) should have this bug fixed. Please report if not.

BM

Anonymous

Message 830 in response to message 829

> The new set of apps we just put on the server (4.69-4.71) should have this bug
> fixed. Please report if not.
>

It would appear not. I have WU pt16_I12_f59.998_b0.104_8 being crunched by Einstein 4.71, and it seems to have the same trouble of not finishing after it gets to 100%.

BOINC CC 4.14
Windows 2000 SP4
Athlon 1100 MHz
512 MB RAM

Rebirther
Joined: 4 Jan 05
Posts: 22
Credit: 31576
RAC: 0

I have an endless loop in "H1_0059.9__0060.0_0.1_T05_Test02" after 100%, plus a memory leak, now at 88 MB and still increasing. Please check it out!

Using BOINC 4.16 with Einstein v4.72
