Magic Number

Jamie Plucinski
Jamie Plucinski
Joined: 25 Feb 05
Posts: 27
Credit: 17460
RAC: 0
Topic 188549

Magic Number, such a good idea :P
All work units for any project should be assigned a magic number, and should only be distributed to clients with, at minimum, a matching magic number.

The magic number is created as follows:
RAM = Maximum of 5 points
* 0 to 128 MB = 1pt(s)
* 128MB to 256MB = 2pt(s)
* 256MB to 512MB = 3pt(s)
* 512MB to 1024MB = 4pt(s)
* >1024MB = 5pt(s)

Total Cache = Maximum of 2 points
* 0 to 512 KB = 1pt(s)
* > 512KB = 2pt(s)

FPU Speed = Maximum of 5 points
* 0 to 500 million ops/sec = 1pt(s)
* 500 to 1000 million ops/sec = 2pt(s)
* 1000 to 1500 million ops/sec = 3pt(s)
* 1500 to 2000 million ops/sec = 4pt(s)
* > 2000 million ops/sec = 5pt(s)

Number of CPUs = Maximum of 3 points
* 1 cpu = 1 pt(s)
* 2 cpus = 2pt(s)
* >2 cpus = 3pt(s)

Measured integer speed = Maximum of 7 points
* 0 to 1000 million ops/sec = 1pt(s)
* 1000 to 2000 million ops/sec = 2pt(s)
* 2000 to 3000 million ops/sec = 3pt(s)
* 3000 to 4000 million ops/sec = 4pt(s)
* 4000 to 5000 million ops/sec = 5pt(s)
* 5000 to 6000 million ops/sec = 6pt(s)
* >6000 million ops/sec = 7pt(s)

22 points available. Then assign each work unit a ranking of 10, where machines that meet or exceed the ranking of 10 will receive the unit, but not machines that only rank at 5 or 9.
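
For concreteness, the score and the eligibility check could be computed roughly as in this Python sketch (the function and field names are made up for illustration; this is not anything from the actual BOINC scheduler):

[code]
# Rough sketch of the proposed scoring; ram_mb, cache_kb, flops, ncpus
# and iops are invented names, not real BOINC host-record fields.

def points(value, thresholds):
    """1 point for the lowest band, plus 1 for every threshold exceeded."""
    return 1 + sum(1 for t in thresholds if value > t)

def magic_number(ram_mb, cache_kb, flops, ncpus, iops):
    return (points(ram_mb, [128, 256, 512, 1024])            # RAM, max 5
            + points(cache_kb, [512])                         # cache, max 2
            + points(flops, [500e6, 1e9, 1.5e9, 2e9])         # FPU, max 5
            + points(ncpus, [1, 2])                           # CPUs, max 3
            + points(iops, [1e9, 2e9, 3e9, 4e9, 5e9, 6e9]))   # integer, max 7

def eligible(host_score, wu_rank):
    """Only hosts that meet or exceed the work unit's rank get the unit."""
    return host_score >= wu_rank

# Example: 768 MB RAM, 1 MB cache, ~1.4 GFLOPS, 1 CPU, ~2.5 G int ops/sec
host = magic_number(768, 1024, 1.4e9, 1, 2.5e9)   # 4 + 2 + 3 + 1 + 3 = 13
print(host, eligible(host, 10))                    # 13 True
[/code]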

The rating values were picked at random; they will probably need changing so that the majority of users are included.

Credit would then be based on the credit requested by machines on a par with each other, rather than being compared with lesser machines and reduced because of the difference in requested credit.

Thoughts??

Jamie Plucinski
Jamie Plucinski
Joined: 25 Feb 05
Posts: 27
Credit: 17460
RAC: 0

Magic Number


Modification:
> 22 points available. Then assign each work unit a ranking of 10, where
> machines that meet or exceed the ranking of 10 will receive the unit, but not
> machines that only rank at 5 or 9.

should read

22 points available. Then you can assign each work unit a ranking of x, where
only machines that are >= x will receive the unit, but not machines that are < x.

Saenger
Saenger
Joined: 15 Feb 05
Posts: 403
Credit: 33009522
RAC: 0

The only thing implemented

The only thing of this kind implemented yet, and used at least by Predictor, is homogeneous redundancy (http://boinc.ssl.berkeley.edu/homogeneous_redundancy.php), and it exists because of real calculation problems, not because of minor inconveniences ;)

Greetings from Sänger

Jamie Plucinski
Jamie Plucinski
Joined: 25 Feb 05
Posts: 27
Credit: 17460
RAC: 0

> The only thing implemented

Message 8554 in response to message 8553

> The only thing of this kind implemented yet, and used at least by Predictor, is
> homogeneous redundancy (http://boinc.ssl.berkeley.edu/homogeneous_redundancy.php),
> and it exists because of real calculation problems, not because of minor inconveniences ;)
>

I honestly think the world of distributed computing would benefit from the magic number approach. And I've given up on speeding up validation; I guess I'll have to accept that a 486 is in the same league as my AMD 64 3000+, unless people start implementing something along the lines of the magic number. Science is moving faster and faster, computer technology is moving faster and faster, so why does scientific computing move so damn slowly? It's all well and good having a million computers crunching data, but giving an old machine such a big task in one lump is pointless; smaller chunks would make it a more manageable task. I don't want older machines excluded, but I think smaller tasks would work much better and would let these clients participate actively in the community without sacrificing the speed benefits of faster clients.

gravywavy
gravywavy
Joined: 22 Jan 05
Posts: 392
Credit: 68962
RAC: 0

I strongly disagree. You

I strongly disagree.

You are trying to do two different things with the magic number, and a magic number is not the best way to do either of them.

The first thing some of you are trying to do is to make sure you don't get left waiting for a WU to be processed by somebody else. The other is to match scores, given that different machines seem to generate different scores for the same work.

Matching the waiting times

From the project's point of view this does not matter much, as it does not affect total throughput over a year, just how long a given result spends in the pipeline.

There are two reasons why it might be good to address this, however. One is that if all the slow-turn-round machines are crunching the same WU, then there will be fewer entries in the database that tracks WUs from the time they are issued to when they are verified.

The second is that there is a strong user-satisfaction issue: even though it does not matter much to the timescale of the project, many users don't like to wait for their credits. Every user is a volunteer, giving their machines, electricity, etc. for free; you all do it because you like the physics, or you like the IT, or you like something else about it, or you just plain like being generous. Anything that annoys users subtracts from that liking, and so is bad for the project in the long term.

My suggestion on how to do this is to have a way (in the Einstein prefs, maybe?) to choose the shelf-life of your WUs. Those who know they can turn round a WU in 2 hrs would pick a shelf-life of maybe 6 hrs to be on the safe side, and so on. With shelf-lives of 6 hrs, 12 hrs, 1 day, 2 days, 4 days, and 8 days, you'd be self-selecting into the group that tends to turn round a WU in about the same time you do.

The facility to choose would have to be variable between machines, or at least between the existing three categories of home, school, and work, so that users could make appropriate choices for differing machines.
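
As a rough illustration of that self-selection (just a Python sketch; nothing like this exists in the scheduler, and the names are invented):

[code]
from datetime import timedelta

# Shelf-life choices from the suggestion above.
SHELF_LIVES = [timedelta(hours=6), timedelta(hours=12), timedelta(days=1),
               timedelta(days=2), timedelta(days=4), timedelta(days=8)]

def pick_work(queue, host_shelf_life):
    """Hand the host the first queued work unit tagged with the same
    shelf-life it chose in its preferences, so hosts with similar
    turn-round times end up validating against each other."""
    for wu in queue:
        if wu["shelf_life"] == host_shelf_life:
            return wu
    return None  # nothing waiting for this turn-round group yet

# A host that chose a 12-hour shelf-life skips the 8-day work unit:
queue = [{"name": "wu_1", "shelf_life": timedelta(days=8)},
         {"name": "wu_2", "shelf_life": timedelta(hours=12)}]
print(pick_work(queue, timedelta(hours=12))["name"])   # wu_2
[/code]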

Same scores

The same-scores issue is a different one. It is basically an illusion created by an imperfect measuring tool. If your machine scores higher than another machine on every unit, that does not mean your machine always does more work than the other one. It is certainly not more valuable work, as the project gets the same value out of a WU whoever crunches it (if it is calculated correctly, of course). If you score 125 and someone else scores 88 for delivering the same result to the project, then clearly the true value is somewhere in between.

The averaging process is a fairer way of allocating credit than you'd get by grouping all the similar machines together: by taking the 'average' of four measurements of the value, you always get a better figure than any one of the diverse results would be (the precise way the 'average' is deduced is irrelevant).
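
As a toy illustration of the arithmetic (this is not the actual validator rule, just a simple mean):

[code]
def granted_credit(claims):
    """Toy version: everyone in the quorum is granted the mean of the
    claimed credits, so over- and under-claiming benchmarks pull each
    other towards a common figure."""
    return sum(claims) / len(claims)

# The 125-vs-88 example from above, with a quorum of four claims:
print(granted_credit([125, 88, 101, 96]))   # 102.5 -- somewhere in between
[/code]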

The better answer here, when the project team have time to address the point, is to make a benchmarking tool that is more consistent across differing machines, not to perpetuate the unfair differences between different measurements. I rep

~~gravywavy
