granted credit zero

Ensor
Joined: 9 Feb 05
Posts: 49
Credit: 1450362
RAC: 0

Message 5572 in response to message 5564


Hi,

> On this particular wu, what would have happened had the 2
>linux systems returned their results first? It hardly seems
>that simply crossing the finish line first should either
>validate or invalidate science.

Actually, take a look at this WU: #366758.

In this case the two Linux systems reported first and effectively locked out the Windows boxes - by the look of it anyway.

I wonder what it is about the results that causes this problem?

> All that said, I'm really not overly concerned with points,
>but I really hate it when my contribution is useless.

I'm with you on that. I don't think our contributions are completely useless in cases like this, but it is somewhat frustrating - seems like such a waste of time, IYKWIM?

TTFN - Pete.


wijata.com
Joined: 11 Feb 05
Posts: 113
Credit: 25495895
RAC: 0

Well, I guess there are some differences in computation between the Linux and Windows versions.

I guess it's time for developers to say something here. Our work (and CPU power and energy power) is wasted this way.

Peter Wagner
Joined: 24 Feb 05
Posts: 3
Credit: 73454
RAC: 0

Message 5574 in response to message 5573

> I guess it's time for developers to say something here. Our work (and CPU
> power and energy power) is wasted this way.
>
Developers, please do something. This is an obvious bug.

john.mac
Joined: 9 Feb 05
Posts: 85
Credit: 167393
RAC: 0

I examined some results from the Merlin cluster running Linux.

I found this one very ???
machine ID 3701 WU 1217945

It reported first, but with checksums different from those of the 2 other WinXP machines, which had identical ones.

On the other hand, I also found late reports from a Merlin machine that were still credited together with the XP machines.

????????????????

John,

Bruce Allen
Moderator
Joined: 15 Oct 04
Posts: 1119
Credit: 172127663
RAC: 0

Message 5576 in response to message 5574

> > I guess it's time for developers to say something here. Our work (and CPU
> > power and energy power) is wasted this way.
> >
> developers please do something. this is an obvious bug.

We are looking into this: it appears that our validator may be setting the agreement threshold slightly too tight in some cases. This is hard to 'tune in advance' without having access to the actual results. So please be patient: one of our developers is now working on this.

Bruce

Director, Einstein@Home
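
For illustration only, here is a minimal Python sketch of what an "agreement threshold" can look like. The real Einstein@Home validator is not shown in this thread, so the function name `results_agree`, the sample numbers, and both tolerance values are assumptions made up for the example.

```python
import math

def results_agree(a, b, rel_tol):
    """Hypothetical check: do two result vectors match within a relative tolerance?"""
    if len(a) != len(b):
        return False
    return all(math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b))

# Made-up numbers standing in for the same work unit computed on two platforms.
windows_result = [1.0000000, 2.5000000, 3.1415926]
linux_result = [1.0000004, 2.5000011, 3.1415931]

# A threshold that is too tight rejects results that differ only by rounding noise.
print(results_agree(windows_result, linux_result, rel_tol=1e-8))  # False

# A slightly looser threshold accepts the same pair.
print(results_agree(windows_result, linux_result, rel_tol=1e-5))  # True
```

In a scheme like this, whichever results arrive first and agree with each other form the quorum, so a tolerance that is too tight can lock out later results that are scientifically just as good.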

john.mac
Joined: 9 Feb 05
Posts: 85
Credit: 167393
RAC: 0

Message 5577 in response to message 5575

> I examined some result from the Merlin cluster running linux;
>
> I found this one very ???
> machine ID 3701 WU 1217945
>
> It reported first but with different checksums with the 2 other WinXP's which
> had identical ones.
>
> Found on the other hand also late reports for a Merlin machine but was
> credited also together with XP's.
>
> ????????????????
>
> Sorry, work ID = 342166 (and granted credit)
>

John,

Peter Wagner
Joined: 24 Feb 05
Posts: 3
Credit: 73454
RAC: 0

Message 5578 in response to message 5576


> We are looking into this: it appears that our validator may be setting the
> agreement threshold slightly too tight in some cases. This is hard to 'tune
> in advance' without having access to the actual results. So please be
> patient: one of our developers is now working on this.

thanks in advance.
I'll be patient.

But it's hard to understand for a newbie like me: two computers running the same software receive the same input, do the same computations, should get the same results, and apply the same checksum algorithm to those results. Why can there be a difference at all?

Peter

wijata.com
Joined: 11 Feb 05
Posts: 113
Credit: 25495895
RAC: 0

Bruce Allen, note that this could be a bigger problem than simply tuning the validator.
The history shows that there were cases where the first two Windows machines returned similar results, and they were OK, as they should be. Then two Linux machines returned results identical to each other (but different from the Windows machines) and they were marked invalid.

On some other thread I read that the core has a different version on Linux than on Windows (4.80 vs 4.79).
Maybe they just do different computations? Maybe the difference is too big?

Greetings from Poland!

Keck_Komputers
Joined: 18 Jan 05
Posts: 376
Credit: 5744955
RAC: 0

Message 5580 in response to message 5578

>
> > We are looking into this: it appears that our validator may be setting the
> > agreement threshold slightly too tight in some cases. This is hard to 'tune
> > in advance' without having access to the actual results. So please be
> > patient: one of our developers is now working on this.
>
> thanks in advance.
> I'll be patient.
>
> But hard to understand for an newbie like me. two computers running the same
> software - receive the same input - do the same computational stuff - should
> get the same results - apply the same checksum algorithm on these results -
> why can there be a difference at all?
>
> Peter
>
In two words: rounding errors.

The best example of this is 1/2=4.999999999 with as many 9's as your default floating type has significant digits. Different OSes and CPUs handle this differently. After doing this repeatedly and using the output as input to the next calculation, the differences can become significant.

BOINC WIKI

BOINCing since 2002/12/8
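
For illustration only, a short Python sketch of how rounding can make two "identical" computations disagree, and why a checksum then differs completely. It does not reproduce anything the Einstein@Home application does; the values, the `checksum` helper, and the choice of SHA-256 are assumptions picked for the example.

```python
import hashlib
import math
import struct

# The same three numbers added in two different orders, as two compilers
# or CPUs might legitimately do.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)

def checksum(x):
    """Hypothetical checksum: hash of the raw 8 bytes of a double."""
    return hashlib.sha256(struct.pack("<d", x)).hexdigest()

print(a == b)                              # False: the last bit differs
print(checksum(a) == checksum(b))          # False: one bit changes the whole hash
print(math.isclose(a, b, rel_tol=1e-12))   # True: numerically the same answer
```

A checksum comparison demands bit-for-bit equality, while a tolerance comparison only asks whether the numbers are close enough, which is why the two kinds of check can give opposite verdicts on the same pair of results.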

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5779100
RAC: 0

Message 5581 in response to message 5580

Not sure what calculator you used to come to 1/2=4.9999999999999 John, but I would throw it in the bin and use some old fashioned paper and a pencil. ;)
