Pascal again available, Turing may be coming soon

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5872

Credit: 117010767659

RAC: 36436894

24 Sep 2018 23:23:59 UTC

Message 166965 in response to message 166964

DanNeely wrote:

I run GW tasks when they're available, running Fermi tasks ... is a waste of electricity ...

I'm sorry if you thought I was suggesting you should be running FGRP5. That wasn't my intention. Whilst FGRPB1G and FGRP5 are looking for different things, everybody should be quite free to pick and choose.

So when you are running GW, could you please explain your preferred mix of tasks? Do you still run 3x on your GPU? You have 4 cores/8 threads. How many CPU tasks do you run and how do you control the split? Is it just a GPU utilization factor of 0.33 with the remaining 5 threads doing GW tasks?

Cheers,
Gary.

DanNeely

Joined: 4 Sep 05

Posts: 1364

Credit: 3562358667

RAC: 0

25 Sep 2018 10:54:39 UTC

Message 166969 in response to message 166965

Gary Roberts wrote:

DanNeely wrote:
I run GW tasks when they're available, running Fermi tasks ... is a waste of electricity ...

I'm sorry if you thought I was suggesting you should be running FGRP5. That wasn't my intention. Whilst FGRPB1G and FGRP5 are looking for different things, everybody should be quite free to pick and choose.

So when you are running GW, could you please explain your preferred mix of tasks? Do you still run 3x on your GPU? You have 4 cores/8 threads. How many CPU tasks do you run and how do you control the split? Is it just a GPU utilization factor of 0.33 with the remaining 5 threads doing GW tasks?

Correct for the NVIDIA hosts. My AMD host runs 7 CPU/1 GPU because the 460's Windows drivers make concurrent GPU tasks run an order of magnitude slower than one at a time.

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5872

Credit: 117010767659

RAC: 36436894

26 Sep 2018 3:01:00 UTC

Message 166974

OK, thanks, I was only thinking about the nvidia situation anyway, in relation to your DCF experiences.

There are two things that give you problems in trying for a "more than a few hours" cache size.

The first is that GPU tasks run faster than estimate and GW CPU tasks probably run slower than estimate so that the DCF swings can be very large. I completely understood your comment about running 3x on the GPU ("the 3rd task is already mostly for DCF moderation") since I also take advantage of the same effect. This stops the DCF from going quite as low, and the CPU task estimates dropping quite as much, as they otherwise would. The lower the DCF goes, the higher the CPU task overfetch will become and the more likely that problems occur.

There is another factor that is probably more important if you would like to have a somewhat higher cache size without additional misbehaviour. It certainly works for me. This second factor is caused by the mismatch that BOINC sees when it looks at the number of threads compared to the number of CPU tasks that can run. Unfortunately, in your case, BOINC is fetching based on 8 threads when only 5 can be used for crunching. If you fix that, you will have a much better experience.

To negate this effect, I use BOINC's '% of cores' setting to limit the usable threads that BOINC sees. If you are happy to run 5 CPU threads concurrently with 3 GPU tasks, you could set 62.5% for that (5 of 8 threads). The trick is to make sure that you don't allow BOINC to 'reserve' any more threads for GPU support. If you use app_config.xml to set gpu_usage to 0.33 and cpu_usage to, say, 0.3, you would have a 3+5 mixture whilst BOINC would fetch for 5 threads rather than the full 8 it does now. BOINC would not reserve any additional cores since 3 x 0.3 is < 1.
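The arithmetic behind those settings can be sketched in a few lines of Python. This is only a simplified model of the thread budget described above, not BOINC's actual scheduling code: the function name `cpu_plan` and its parameters are made up for illustration, and the rule that whole threads are reserved only once the summed cpu_usage of the running GPU tasks reaches 1 is taken from the description in this post.

```python
import math


def cpu_plan(total_threads, gpu_tasks, cpu_usage_per_gpu_task, cpu_tasks_wanted):
    """Simplified thread-budget model (an assumption, not BOINC's real logic).

    Returns (percent_of_cores, threads_reserved_for_gpu, cpu_tasks_running).
    """
    # '% of cores' value that exposes only the wanted number of threads.
    percent_of_cores = 100.0 * cpu_tasks_wanted / total_threads

    # Threads BOINC may use once the '% of cores' limit is applied.
    usable_threads = math.floor(total_threads * percent_of_cores / 100.0)

    # Assumed rule: a whole thread is set aside for GPU support only when
    # the summed cpu_usage of the running GPU tasks reaches 1.
    reserved_for_gpu = math.floor(gpu_tasks * cpu_usage_per_gpu_task)

    cpu_tasks_running = usable_threads - reserved_for_gpu
    return percent_of_cores, reserved_for_gpu, cpu_tasks_running


# 8 threads, 3 GPU tasks at cpu_usage 0.3, 5 CPU tasks wanted.
percent, reserved, cpu_tasks = cpu_plan(8, 3, 0.3, 5)
print(f"% of cores: {percent}, reserved: {reserved}, CPU tasks: {cpu_tasks}")
```

Under this model, the 8-thread case with 3 GPU tasks gives 62.5%, no reserved threads and 5 CPU tasks, which is the 3+5 mixture. Whether a particular BOINC version behaves exactly this way would need to be checked on the host itself.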

I use this all the time with AMD GPUs and it improves things significantly. I imagine it would work the same with nvidia but you would need to test that out for yourself. I have 2 dual core hosts with 1GB 750Tis where I use 50% cores whilst running a cpu_usage of 0.3 and it works exactly as expected in stopping BOINC fetching CPU tasks for both cores. I don't imagine there would be any problems with more powerful GPUs.

Cheers,
Gary.

DanNeely

Joined: 4 Sep 05

Posts: 1364

Credit: 3562358667

RAC: 0

26 Sep 2018 11:10:41 UTC

Message 166977

Isn't the % of cores setting global though? I don't want to undermine automatic switching to/from backup projects with a setting that would need to be manually toggled any time I was running a different GPU app.

archae86

Joined: 6 Dec 05

Posts: 3157

Credit: 7205854931

RAC: 928988

26 Sep 2018 22:03:49 UTC

Message 166982

First results from a 2080 running on the host I call Stoll9 which most recently was running a 1070 + a 1060:

I have at least one validation from a result which started on one of the old cards and completed on the new, and at least two validations on results run entirely on the new, at stock, at 1X.

A snapshot of the power meter without averaging shows total box power consumption at this condition about 237 watts. I formerly believed the idle power of this box to be 65 watts, but have not taken a recent measurement.

Initial 1X stock completion elapsed times are 8 minutes 16 seconds. Bearing in mind we are getting low-paying work at the moment, this implies a daily credit of a little over 600,000.

I intend to switch to 2X very soon, and assuming the elapsed times come in at less than 16 minutes 32 seconds, I'll leave it there for a day of stability running before trying higher multiplicity, and then trying some overclocking.

This same box, running this same work, drew 277 watts while producing work at a daily credit rate just under 620,000 this morning, when set back to stock clocks without power limitation at 2X. On the overclock and power limitation settings I've run here for the last six months, it drew 250.5 watts this morning with a daily credit rate just under 658,000, again running 2X.

So an initial possible result is that I'll see pretty near breakeven for both total Einstein productivity and Einstein power efficiency, which, if true, is quite a poor return on the considerable investment in the upgrade. It is possible that the 2080 will not turn out to be the price or power efficiency peak of the curve, so perhaps other Turing models will do better.
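For anyone who wants to redo the comparison, here is a rough Python sketch of the credit-per-watt arithmetic. The numbers are the ones quoted in this post, with "just under" and "a little over" rounded to the stated figure, and the 2080 power reading is the unaveraged snapshot, so the results are approximate. The labels and the helper `credit_per_watt` are illustrative names only.

```python
# (daily credit, total box power in watts), as quoted in the post.
# Credit figures are rounded; the 2080 power value is an unaveraged snapshot.
configs = {
    "2080, stock, 1X": (600_000, 237.0),
    "previous setup, stock, 2X": (620_000, 277.0),
    "previous setup, tuned, 2X": (658_000, 250.5),
}


def credit_per_watt(daily_credit, watts):
    """Daily credit earned per watt of total box power."""
    return daily_credit / watts


efficiency = {
    name: credit_per_watt(credit, watts)
    for name, (credit, watts) in configs.items()
}

# Print the configurations from most to least power-efficient.
for name, value in sorted(efficiency.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.0f} credit/day per watt")
```

On these rounded inputs the stock 2080 lands between the two earlier configurations, which is consistent with the "pretty near breakeven" reading, though properly averaged power could shift the ordering.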

Barring trouble, I should have 2X average elapsed time and properly averaged power within a few hours, and a look at whether 3X and perhaps 4X help by tomorrow. Overclocking results will take quite a bit longer.

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5872

Credit: 117010767659

RAC: 36436894

27 Sep 2018 1:45:33 UTC

Message 166985 in response to message 166977

DanNeely wrote:

Isn't the % of cores setting global though?

Sure. I was unaware you were running other projects' GPU apps besides FGRPB1G. I was just interested in making sure as many people as possible get to understand that it's not just DCF that causes the behaviour.

Sorry for the interruption and apologies to Archae86 for polluting his thread.

Cheers,
Gary.

archae86

Joined: 6 Dec 05

Posts: 3157

Credit: 7205854931

RAC: 928988

27 Sep 2018 3:04:53 UTC

Message 166986 in response to message 166982

archae86 wrote:

I intend to switch to 2X very soon, and assuming the elapsed times come in at less than 16 minutes 32 seconds, I'll leave it there for a day of stability running before trying higher multiplicity, and then trying some overclocking.

2X indeed came in under 16:32, but not by much (about 16:02). Power was up by almost eight watts, so marginal power productivity actually degraded slightly.
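A quick Python check of that 2X comparison, as a sketch only. It assumes the 237 W snapshot from the earlier post as the 1X baseline and treats "almost eight watts" as a full 8.0 W, which is an upper bound; both are assumptions rather than measured averages, and the variable names are illustrative.

```python
def seconds(minutes, secs):
    """Convert a minutes:seconds elapsed time to seconds."""
    return 60 * minutes + secs


t_1x = seconds(8, 16)   # one task at a time
t_2x = seconds(16, 2)   # two tasks running together (approximate)

# Effective time per task at 2X, and the resulting throughput gain over 1X.
per_task_2x = t_2x / 2
throughput_gain = t_1x / per_task_2x - 1.0

# Assumptions: 237 W baseline (unaveraged snapshot), 8.0 W increase (upper bound).
p_1x = 237.0
p_2x = p_1x + 8.0
power_increase = p_2x / p_1x - 1.0

# Below 1.0 means work per watt got worse when moving from 1X to 2X.
efficiency_ratio = (1.0 + throughput_gain) / (1.0 + power_increase)

print(f"throughput gain: {throughput_gain:.1%}")
print(f"power increase:  {power_increase:.1%}")
print(f"efficiency ratio (2X vs 1X): {efficiency_ratio:.4f}")
```

With these inputs the throughput gain and the power increase are both a little over 3%, and the ratio comes out just under 1. The margin is small enough that a power rise of 7 W instead of 8 W would tip it the other way, so properly averaged power matters here.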

I'll check 3X tomorrow, but I'd be really surprised if it looks good enough to persuade anyone running the current Windows code to give up a supervising core to get the tiny bit extra.

Zalster

Joined: 26 Nov 13

Posts: 3117

Credit: 4050672230

RAC: 0

27 Sep 2018 4:45:43 UTC

Message 166988 in response to message 166986

Thanks for the data, Archae86. I'm running a 1080Ti under Ubuntu. Just for comparison, I'm averaging 430 seconds for 1 work unit at a time. I have to wonder what that 2080 would do in Ubuntu. Unfortunately I don't think there are drivers for it yet. Looking forward to seeing what 3X does.

Keith Myers

Joined: 11 Feb 11

Posts: 4957

Credit: 18619857825

RAC: 5614225

27 Sep 2018 4:52:35 UTC

Message 166989

Yes, there are drivers for the Turing cards in Linux already. I am running the 410.57 drivers on my daily driver right now.

mikey

Joined: 22 Jan 05

Posts: 12642

Credit: 1839031099

RAC: 5192

27 Sep 2018 12:23:40 UTC

Message 166991 in response to message 166989

Keith Myers wrote:

Yes, there are drivers for the Turing cards in Linux already. I am running the 410.57 drivers on my daily driver right now.

Over on PG a guy running a 1080 upgraded to the 411.?? drivers on Windows and said his run times actually went up, so we may be getting into the "be careful which driver you use for which card" phase again.
