Pascal again available, Turing may be coming soon

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7219624931
RAC: 975722


Mikey,

Thanks for posting the PrimeGrid discussion link.

It appears that person asserts that, for one multi-hour PrimeGrid computation, the RTX 2080's productivity improvement relative to Pascal-generation GTX cards is approximately:

GTX 1060     2.9x
GTX 1070     2.0x
GTX 1080     1.7x
GTX 1080 Ti  1.2x

If something like that comes true at Einstein, and if power consumption is on the good side of hopes, then Turing may not be a completely nutty Einstein purchase, at least for people with high power costs.  On the other hand, if it comes in somewhat slower and has disappointing power consumption, the extremely high purchase price may well rule it out for most of us.  It seems thoroughly unlikely to reach economic break-even for Gary Roberts (but if it did, Nvidia probably underpriced the cards).

 

 

 

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2955336548
RAC: 720084


Just a quick heads up for users of the new RTX range under BOINC. Current versions of BOINC will report double the true GFLOPS peak value when detecting the new cards at startup: this is because the architecture has changed from 128 cores to 64 cores per SM.

I've submitted code (which has been accepted) to correct that discrepancy: version 7.14 of BOINC will display the correct value.

Which, for the record, appears to be 10,687 GFLOPS peak for the RTX 2080 (from real testing with BOINC and SIV).
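For anyone who wants to sanity-check that figure: the peak value is essentially SM count x cores per SM x 2 FLOPs (one fused multiply-add per core per clock) x clock speed. A back-of-the-envelope sketch follows; the SM count and boost clock in it are my own assumed numbers, not values read back by BOINC or SIV.

// Back-of-the-envelope check of a "GFLOPS peak" figure of the kind BOINC reports.
// The SM count and boost clock are assumptions for illustration only.
#include <cstdio>

int main() {
    const int    sm_count     = 46;     // RTX 2080 (assumed)
    const int    cores_per_sm = 64;     // Turing: 64 FP32 cores per SM (Pascal had 128)
    const double clock_ghz    = 1.815;  // assumed boost clock
    const double flops_per_core_per_clock = 2.0;  // one FMA counts as 2 FLOPs

    const double gflops_peak =
        sm_count * cores_per_sm * flops_per_core_per_clock * clock_ghz;
    std::printf("peak: %.0f GFLOPS\n", gflops_peak);   // ~10687 with these numbers

    // A client that still assumes 128 cores per SM would report double this value.
    std::printf("doubled (old client): %.0f GFLOPS\n", 2.0 * gflops_peak);
    return 0;
}

Run the same sum with 128 cores per SM and you get roughly 21,374 GFLOPS, which is the inflated figure a pre-7.14 client would display for this card.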

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7219624931
RAC: 975722


Richard Haselgrove wrote:
Current versions of BOINC will report double the true GFLOPS peak value when detecting the new cards at startup

Does that suggest initial fetching requests may be considerably more aggressive than wanted if we fail to take the precaution of greatly restricting our queue size when converting?

Also, can you by any chance provide any comparative context for the 10,687 GFLOPS peak number?  Is that good or bad, considering?

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2955336548
RAC: 720084


archae86 wrote:
Does that suggest initial fetching requests may be considerably more aggressive than wanted if we fail to take the precaution of greatly restricting our queue size when converting?

That's a very interesting question. At most (generic) BOINC projects, I'd say no, because runtime estimates are controlled from the server components of CreditNew. Estimates would stabilise gradually under the influence of the results returned by the new device, taking maybe 100 tasks to re-normalise.

But here at Einstein, you don't use CreditNew, and you don't have that server-side stabilisation. I don't know exactly what use (if any) is made of the client-side GFlops peak in GPU task estimation. In the worst case, the client would double-fetch the first time, but then DCF would double the instant the first task with the new device completed, and no further work would be fetched until the double fetch had been 50% processed.
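To put rough numbers on that worst case, here is a simplified sketch of my own; the per-task FLOP estimate is assumed, and the formula is a simplification rather than the actual BOINC or Einstein@Home code.

// Simplified sketch of the doubled-peak / DCF interaction described above.
// Numbers and formula are illustrative assumptions, not real project values.
#include <cstdio>

int main() {
    const double task_fpops_est  = 5.0e16;            // server's FLOP estimate per task (assumed)
    const double true_gflops     = 10687.0;           // correct peak for the new card
    const double reported_gflops = 2.0 * true_gflops; // pre-7.14 client doubles it
    double dcf = 1.0;                                  // duration correction factor, starts near 1

    // Estimated runtime (seconds) ~= fpops / peak_speed * DCF  (simplified)
    const double estimate = task_fpops_est / (reported_gflops * 1e9) * dcf;
    const double actual   = task_fpops_est / (true_gflops * 1e9);
    std::printf("initial estimate %.0f s vs actual %.0f s -> roughly a double fetch\n",
                estimate, actual);

    // When the first task finishes late, DCF jumps to about actual/estimate (~2),
    // so later estimates match reality and fetching pauses until the backlog drains.
    dcf = actual / estimate;
    std::printf("after one completion: DCF %.1f, new estimate %.0f s\n",
                dcf, task_fpops_est / (reported_gflops * 1e9) * dcf);
    return 0;
}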

Best advice: first user sets a low cache before installing the device, and reports back.

Quote:
Also, can you by any chance provide any comparative context for the 10,687 GFLOPS peak number?  Is that good or bad, considering?

Not really - I don't chase the dragon where new GPUs are concerned. My highest comparator is a GTX 970 at 4,087 GFLOPS peak.

Your best bet would be to ask current users of the GTX 1080 (not Ti) to report their values - any BOINC client since 2014 will do; that bit of code hadn't changed for over four years until now.

Edit - exactly the same thing would have happened with the Titan V (again, 64 cores per SM) when that was released. If anyone was paying enough attention.

DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3562358667
RAC: 0


Richard Haselgrove wrote:
archae86 wrote:
Does that suggest initial fetching requests may be considerably more aggressive than wanted if we fail to take the precaution of greatly restricting our queue size when converting?

That's a very interesting question. At most (generic) BOINC projects, I'd say no, because runtime estimates are controlled from the server components of CreditNew. Estimates would stabilise gradually under the influence of the results returned by the new device, taking maybe 100 tasks to re-normalise.

But here at Einstein, you don't use CreditNew, and you don't have that server-side stabilisation. I don't know exactly what use (if any) is made of the client-side GFlops peak in GPU task estimation. In the worst case, the client would double-fetch the first time, but then DCF would double the instant the first task with the new device completed, and no further work would be fetched until the double fetch had been 50% processed.

Best advice: first user sets a low cache before installing the device, and reports back.

Ugh. I certainly hope that's not the case; CPU vs. GPU DCF is already a garbage fire in this project if you want to have enough work to survive more than a few hours' network/server outage.

 

With a 1080 and an i7-4790 running GPU tasks three at a time (and the third task is already mostly for DCF moderation, even if it does add a few percent to total GPU throughput at the cost of taking a CPU core offline), wanting four days of backup work already means I'm barely able to finish GW CPU tasks before the deadline.  With a 2080 Ti adding ~50% real speed and a potentially 2x bugged DCF, even a one-day work cache would be problematic.

Jim1348
Joined: 19 Jan 06
Posts: 463
Credit: 257957147
RAC: 0


DanNeely wrote:
Ugh. I certainly hope that's not the case; CPU vs. GPU DCF is already a garbage fire in this project if you want to have enough work to survive more than a few hours' network/server outage.

You really have to use two BOINC instances, one for the CPU and the other for the GPU, to get BOINC working correctly for both on the same project (they work OK on different projects).  It is another BOINCism that we have to live with.

https://www.overclock.net/forum/18056-boinc-guides-tutorials/1628924-guide-setting-up-multiple-boinc-instances.html

(And to start the second instance upon reboot, I use Task Scheduler, which they don't mention there.  You will also need to start the BOINC Manager by placing the icon they mention in the Windows Startup folder.  It is all DIY.)
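For what it's worth, the second client itself can also be started from a command prompt, or from that scheduled task, using the client's documented options; the data directory and RPC port below are placeholders I picked, not values from the guide:

rem Example only - the directory and port number are placeholders.
"C:\Program Files\BOINC\boinc.exe" --allow_multiple_clients --dir "D:\BOINC2" --gui_rpc_port 31418

BOINC Manager can then be pointed at that instance via its Select Computer dialog using localhost:31418.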

 

Keith Myers
Joined: 11 Feb 11
Posts: 4963
Credit: 18705579869
RAC: 6283886


Looks like the Phoronix article the other day simply forgot to run the FAHBench benchmarks in their compute tests.  They have a new article up with results on cards from the GTX 680 through the RTX 2080 Ti.

Folding@Home Performance Is Looking Good On The GeForce RTX 2080 Ti

 

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117499753692
RAC: 35441136


DanNeely wrote:
Ugh. I certainly hope that's not the case; CPU vs. GPU DCF is already a garbage fire in this project if you want to have enough work to survive more than a few hours' network/server outage.

You seem to be running only GPU tasks here.  I guess you run tasks for other projects on your CPU cores.

Since the DCF is only being driven by GPU tasks, doesn't it settle on a value that allows estimates to match reality?  I would like to understand why there would be a problem with a cache size of at least a day or two.  I only run Einstein, so I'm unfamiliar with inter-project interactions.

 

Cheers,
Gary.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2955336548
RAC: 720084


The DCF problem - if any - would be incredibly short-lived. If you place a new card in an existing host, just one completed task would cure it.

Alternatively, you could delay installing the new card until BOINC v7.14 has been released. And if that's too far into the future, you could build your own copy of BOINC from source.

As I said before, it's an interesting technical observation, but no way a major issue.

DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3562358667
RAC: 0


Gary Roberts wrote:
DanNeely wrote:
Ugh. I certainly hope that's not the case; CPU vs. GPU DCF is already a garbage fire in this project if you want to have enough work to survive more than a few hours' network/server outage.

You seem to be running only GPU tasks here.  I guess you run tasks for other projects on your CPU cores.

Since the DCF is only being driven by GPU tasks, doesn't it settle on a value that allows estimates to match reality?  I would like to understand why there would be a problem with a cache size of at least a day or two.  I only run Einstein, so I'm unfamiliar with inter-project interactions.

 

 

I run GW tasks when they're available.  Running Fermi tasks on a CPU when there's an efficient GPU app available is a waste of electricity, so I'm running backup projects on my CPUs (mostly Asteroids@Home).
