Slow Wingman

disturber
Joined: 26 Oct 14
Posts: 30
Credit: 57155818
RAC: 0
Topic 197961

I was looking at why my RAC was going flat on two of my AMD GPUs (7970/280X), and a quick check shows that they were being paired with really slow GPUs. In one case the wingman's cards were 9800GTs, which are much slower. The other thing I noticed was that the wingman's computer had many days' worth of tasks in progress.

example:

http://einsteinathome.org/workunit/211080692

Any thoughts on why the server will pick such mismatched wingmen?

Pooh Bear 27
Joined: 20 Mar 05
Posts: 1376
Credit: 20312671
RAC: 0

Everyone will tell you it's random. The tasks are sent out more or less in queue order, and whoever is next to ask for work gets the next one. No favoritism. The logic needed to match faster or slower machines could slow down production and is unnecessary. Things work well the way they are.
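A toy sketch of that first-come, first-served idea, in Python. This is not the real BOINC scheduler code; the function name, the host names and the queue layout are all made up for illustration. The only point is that host speed is never looked at.

```python
from collections import deque


def dispatch(pending_tasks, requests):
    """Hand out task copies in queue order to hosts in the order they ask.

    pending_tasks: deque of workunit names waiting to be sent (hypothetical).
    requests: host names, in the order their work requests arrive.
    Host speed is never consulted.
    """
    assignments = []
    for host in requests:
        if not pending_tasks:
            break  # nothing left to send
        assignments.append((pending_tasks.popleft(), host))
    return assignments


# Each workunit needs two results, so it sits in the queue twice.
queue = deque(["wu1", "wu1", "wu2", "wu2"])
result = dispatch(queue, ["fast_7970", "slow_9800GT", "slow_9800GT", "fast_280X"])
print(result)
```

Here the fast 7970 ends up sharing wu1 with the slow 9800GT simply because that host asked next.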

As for getting tons of work, those are settings you make on your account. Go to Global Preferences and lower the values for "Computer is connected to the Internet about every ... Leave blank or 0 if always connected. BOINC will try to maintain at least this much work (max 10 days)" and "Maintain enough work for an additional".

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7213814931
RAC: 987004

Quote:
Any thoughts on why the server will pick such mismatched wingmen?


I don't think the code makes any attempt to "match" quorum partners, except that for those Einstein applications which re-use portions of input data for multiple WUs, it prefers to send work to a host which already has downloaded the re-usable portion.

Yes, it is true that sometimes one can get partnered a lot with a host which is not only slow in returning results, but never does. In that case you just wait until the affected work times out, gets sent to another host, and gets returned.
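Roughly, the timeout rule amounts to something like the sketch below. It is only an illustration: the function name and the 14-day deadline are assumptions for the example, not values taken from the project's server.

```python
from datetime import datetime, timedelta


def needs_resend(sent_at, deadline_days, now, reported):
    """Return True if a task copy passed its deadline without a result.

    deadline_days is an assumed value; the real deadline is set per task
    by the project.
    """
    return (not reported) and now > sent_at + timedelta(days=deadline_days)


sent = datetime(2014, 11, 1)
still_waiting = needs_resend(sent, 14, datetime(2014, 11, 10), reported=False)
timed_out = needs_resend(sent, 14, datetime(2014, 11, 16), reported=False)
print(still_waiting, timed_out)
```

Until the deadline passes, the copy stays with the slow host and your own result just sits there pending.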

Generally it all seems to come right in the end.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 713875684
RAC: 903474

Quote:
Quote:
Any thoughts on why the server will pick such mismatched wingmen?

I don't think the code makes any attempt to "match" quorum partners, except that for those Einstein applications which re-use portions of input data for multiple WUs, it prefers to send work to a host which already has downloaded the re-usable portion.

Yes, it is true that sometimes one can get partnered a lot with a host which is not only slow in returning results, but never does. In that case you just wait until the affected work times out, gets sent to another host, and gets returned.

Generally it all seems to come right in the end.

I fully agree.
Also note one thing: the purpose of the pairing is to make sure that even if one host reports faulty results, there is a good chance that the second wingman will have it right. If anything, we would want the wingmen to be as different as possible in hardware, drivers, etc., or at least not correlated in some way. Otherwise we would degrade this system by, say, preferring to match Maxwell GPUs with other Maxwell GPUs, which would be bad if there were a particular problem with them. And if you try to be REALLY smart and, say, pair only fast "vendor A" cards with fast "vendor B" cards, you run into problems as well, as there are possibly more hosts with fast "vendor A" cards ... so a more or less random pairing seems to be quite OK.
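To make the correlation point concrete, here is a toy model in Python. It assumes, purely for illustration, that every host in one GPU family returns the same wrong answer; the host records and function names are hypothetical and have nothing to do with the real validator.

```python
def returned_result(host, buggy_family):
    """Toy model: every host in the buggy family returns the same wrong answer."""
    return "wrong" if host["family"] == buggy_family else "right"


def bad_result_validates(host_a, host_b, buggy_family):
    """True if both wingmen agree on a wrong answer, so the error slips through."""
    a = returned_result(host_a, buggy_family)
    b = returned_result(host_b, buggy_family)
    return a == b == "wrong"


maxwell_1 = {"name": "host1", "family": "Maxwell"}
maxwell_2 = {"name": "host2", "family": "Maxwell"}
tahiti = {"name": "host3", "family": "Tahiti"}

same_family = bad_result_validates(maxwell_1, maxwell_2, buggy_family="Maxwell")
mixed = bad_result_validates(maxwell_1, tahiti, buggy_family="Maxwell")
print(same_family, mixed)
```

With two hosts from the affected family the wrong answers agree and would pass; with a mixed pair they disagree, so the problem gets noticed.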


HB
