Pending credit piling up

ldhcria
Joined: 4 Jun 05
Posts: 34
Credit: 518917
RAC: 0
Topic 190829

I just noticed that I have 60 pending credits (2,264.77 claimed) from as far back as 2/2/06. It seems odd to me for pending credit to pile up like that and I was wondering what could be going on to cause it.

Michael Roycraft
Joined: 10 Mar 05
Posts: 846
Credit: 157718
RAC: 0

Read this history lesson carefully - there will be a short quiz following :-)

This situation arises partly due to the shorter (vs einstein) albert WUs. A host takes less time to process an albert, thus produces a higher # of results.

The project directors made four decisions, all of them for good reasons, that set up the risk for the "pending problem", which is in truth not a problem, just a consequence.

{1} A few months ago, in response to requests by participants crunching on slower hosts, WU deadlines were doubled, to 14 days, so that these hosts have a better chance of completing work before deadline and contributing to the projects.

Last December, the switch was made to albert WUs, which vary greatly in size.

{2} At that time, it was decided to reduce the "replications", the number of hosts initially assigned to a WU, from 4 to 3, the reason being to increase the throughput by eliminating redundancy (the 4th host was usually not needed to complete the quorum and validate the WU, and therefore was increasing inefficiency because it could more usefully be crunching on another WU). The completed, validated WU could then be deleted from the database, relieving server crowding.

{3} At the same time, a decision was made to double the max daily quota (from 8/day to 16/day).

An inordinate number of the early alberts were tiny "shorties", that the faster hosts could crunch in an hour or less. A datafile of shorties contains literally hundreds of those tiny WUs, and a host was "stuck" crunching one after another of them for weeks or longer. Those hosts crunching solely for Einstein@Home would run short of work on a daily basis, because the max daily quota was only 16 WUs/day. That host would then be idle, for nearly a half-day in the worst-case scenario, and that scenario was not at all uncommon. Thus there were a series of requests to address the "shortie" problem from many of our most-productive contributors who for their own personal and entirely valid reasons did not wish to attach to another BOINC project to fill their hosts' idle time. And that leads us to decision 4...

{4} About 5 weeks ago, it was decided to re-double the max daily quota (to 32/day), so that no host attached to Einstein should be forced to be idle.
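A rough sketch of the arithmetic behind the shortie problem and the quota change, assuming a host that runs one WU at a time and crunches only Einstein@Home. The function name and the 45-minute run time are illustrative assumptions, not project values; only the quotas (16/day and 32/day) come from the history above.

```python
def idle_hours_per_day(daily_quota: int, hours_per_wu: float) -> float:
    """Hours per day a one-WU-at-a-time host sits idle once its daily quota is used up."""
    busy_hours = daily_quota * hours_per_wu
    # A host cannot be busy for more than 24 hours, so idle time never goes negative.
    return max(0.0, 24.0 - busy_hours)


# Hypothetical shortie that takes 45 minutes on a fast host.
for quota in (16, 32):
    print(f"quota {quota}/day -> idle {idle_hours_per_day(quota, 0.75)} h/day")
```

With these assumed numbers the host is idle for half of each day at 16/day and not at all at 32/day, which is the gap decision 4 was meant to close.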

Note: Even before any of these changes were instituted, it was possible to accumulate a large "pending" collection. Last summer, my own grew to around 35 einstein WUs, worth nearly 3000 credits, and that's on a single-core AMD Athlon XP, one machine. It came down to a combination of factors regarding hosts co-assigned to WUs I was crunching: hosts that connected to the 'net infrequently could bottle up a bunch of completed WUs for several days; "newbies" who'd attach, be assigned several WUs, crunch 1 or 2 of them and decide that E@H wasn't that much fun but fail to detach or abort the work they'd never finish; and one of my personal favorites, hosts afflicted with the "graphics bug", which errored-out WU after WU after WU while the owners would presumably be drooling over the pretty screensaver, too uninterested in what work their machine may or may not be doing to check out the message boards for what was a well-known, easy and effective workaround. These last, I dubbed "serial WU killers".

Now, we have a situation where, for example, on your 252731 host, you have been assigned to crunch WUs from the z1_0204.0 large datafile. This datafile has also been assigned to a limited number of other hosts. It has literally hundreds of individual WUs that will be "sliced" from it and assigned to you as your machine requests more work, and if you check several of these WUs, you'll find that many times, your co-crunchers will reappear on more of them. All too often one of these others will overdo it on their cache size, and remember that the possibility now exists, due to larger daily quotas and longer deadlines, for one or more of them to have several hundred WUs queued, so their host is heavily committed. Now, suppose that this person decides to "explore" a little, adding another project or three to the mix. Their host is now severely over-committed, with no hope of completing all that Einstein work before deadline, so maybe in panic they start playing with all the buttons, hopelessly confusing the scheduler in the process. Possibly they wonder why, when they've added Climate Prediction or SETI or Rosetta to the mix, they don't see their machine working on those projects, so they "suspend" Einstein to force work on another one. Having not figured it out yet, they fail to recognize that Einstein will go past deadline, and they don't think to abort those WUs that haven't a chance of completing on time. These WUs now will go to the full 14-day deadline before they're overdue and get re-assigned to another host.
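To put a number on "several hundred WUs queued", here is a back-of-the-envelope sketch. It assumes, as a crude upper bound, that a host could fetch its full daily quota every day of the deadline window without returning anything; real clients are also limited by their cache settings. Both function names and the 1.5-hour run time are hypothetical.

```python
def max_queued_wus(daily_quota: int, deadline_days: int) -> int:
    """Crude upper bound: one full daily quota fetched on each day of the deadline window."""
    return daily_quota * deadline_days


def days_to_clear(queued_wus: int, hours_per_wu: float, crunch_hours_per_day: float) -> float:
    """Days needed to finish a queue at a given per-WU run time and daily crunching time."""
    return queued_wus * hours_per_wu / crunch_hours_per_day


queue = max_queued_wus(32, 14)           # quota and deadline from the history above
needed = days_to_clear(queue, 1.5, 24.0)  # assumed 1.5 h per WU, crunching around the clock
print(queue, needed)
```

Under these assumptions the bound is 448 WUs, which would take 28 days of nonstop crunching, twice the 14-day deadline, and that is before the host's time is shared with any other project.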

A nightmare in the making? Not so. As in my case last summer, every last one of those WUs eventually reached quorum, was validated and credit issued - just a matter of time. The numbers of pending have the potential of becoming much greater now due to all the changes, but the bottom line is the same - all credit will be received, in time. The developers have taken a calculated risk in order to increase the capacity and throughput of data from project participants - they have faith in us - that the vast majority of us can and will adjust and adapt to the new conditions, rather than let impatience overwhelm us and send us packing to another project or none at all. I think that most of us will adapt, and be pleased that our Einstein project has found a sound way to accelerate its research, and Einstein@Home is counting on us to grow with it.

Regards,

Michael

microcraft
"The arc of history is long, but it bends toward justice" - MLK

Richard M
Joined: 11 Nov 04
Posts: 78
Credit: 221506129
RAC: 1210197

Well...that pretty much covers that subject! :-)

That was a very good explanation Michael. I think this thread should be a sticky for future reference.

Richard
(Proud owner of 196 pending results)

Bruce Allen
Moderator
Joined: 15 Oct 04
Posts: 1119
Credit: 172127663
RAC: 0

Michael's reply says it all. Just keep on crunching: eventually those credits will arrive. If you are patient you will eventually get into a 'steady state' condition where the number of pending workunits is constant, and you will be accumulating credits at a constant rate.
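One way to picture that steady state is Little's law: the average number of items waiting equals the arrival rate times the average wait. The sketch below applies it to pending results; the function name and the example figures are illustrative assumptions, not measurements from the project.

```python
def steady_state_pending(results_per_day: float, avg_days_to_quorum: float) -> float:
    """Little's law: average pending count = reporting rate x average wait for validation."""
    return results_per_day * avg_days_to_quorum


# Hypothetical host reporting 10 results a day.
print(steady_state_pending(10, 6))   # co-crunchers take 6 days on average
print(steady_state_pending(10, 12))  # same host, wait doubled by slower co-crunchers
```

The pending count scales with how long your co-crunchers take, not with anything your own host is doing wrong, and once the wait stops growing, credit is granted as fast as new results are reported.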

Cheers,
Bruce

Director, Einstein@Home

Pooh Bear 27
Joined: 20 Mar 05
Posts: 1376
Credit: 20312671
RAC: 0

Message 25429 in response to message 25428

Quote:

Michael's reply says it all. Just keep on crunching: eventually those credits will arrive. If you are patient you will eventually get into a 'steady state' condition where the number of pending workunits is constant, and you will be accumulating credits at a constant rate.

Cheers,
Bruce


Depends on your definition of constant. I see pending results fluctuate quite a bit, up and down, over periods of time. They stay within something of a range, but they do fluctuate a fair amount. I don't mind the fluctuation.

This is undoubtedly one of the best projects out there, though. I have upped my work here quite a bit. Too bad I lost one of my crunchers (I didn't own it, but it was in my home until yesterday). I want to build more, but the electric company gets too much of my cash right now. It does help keep some of the heating costs down. Two rooms have the vents totally closed, 'cause they stay warm enough. Summer is gonna kick my butt.

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6534
Credit: 284700169
RAC: 113726

What a brilliant explanation Michael! It certainly deserves attribution. So where's the quiz you promised? :-)
That reminded me of an apt quotation I read in The Mythical Man-Month by Frederick P. Brooks, Jr., ( a brilliant and funny book about the human factor and foibles of large software projects ). Chapter Two subtitle - 'Good cooking takes time. If you are made to wait, it is to serve you better, and to please you'. He had quoted this in turn from the heading of a French restaurant menu.
Cheers ( 3954.47 credits pending ) Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

ldhcria
Joined: 4 Jun 05
Posts: 34
Credit: 518917
RAC: 0

Message 25431 in response to message 25426

Quote:

Read this history lesson carefully - there will be a short quiz following :-)
...

Regards,

Michael

I just realized I'd never gone back to look for an answer to this question. I wasn't really worried about whether the credit would eventually be given; I was just curious as to why it started piling up recently. Thanks for the explanation.

Leland.
