FGRPB1G downclocks memory

shuhui1990
shuhui1990
Joined: 16 Sep 06
Posts: 20
Credit: 1,236,571,011
RAC: 235,696
Topic 217922

I read that Einstein@home is bottlenecked by GPU memory bandwidth. However I discovered that FGRPB1G downclocks the memory when it runs.

The default memory clock of GTX 1080 Ti is 5600 MHz. When FGRPB1G starts it downclocks to 5100 MHz. If I overclock the memory by 1000 MHz in MSI Afterburner, the memory clock becomes 6600 MHz. As soon as FGRPB1G starts the memory clock drops to 6100 MHz.

https://drive.google.com/file/d/1ZR00QFjqqFtzTb0_4Fjp_lj6Q2NNsD2u/view

Another example. The default memory clock of GTX 980 Ti is 3500 MHz. When FGRPB1G runs it downclocks to 3300 MHz. If I overclock the memory by 400 MHz, the memory clock becomes 3900 MHz in gaming benchmark. But when FGRPB1G runs it stays at 3300 MHz.

Is this a bug? Einstein@home is definitely memory-bandwidth dependent, so I don't know why it downclocks the memory. Below are test results from my GTX 1080 Ti on LATeah2008L tasks, averaged over a couple of tasks. Overclocking the memory by 20% increases power consumption by 6% but shortens the time per WU by 4.5%.

Concurrency   Mem clock (MHz)   Power (W)   Temp (°C)   Total time (s)   Time per WU (s)
3             6100              212         47          966              322
3             5100              200         46          1010             337
2             6100              201         45          689              345
2             5100              191         44          713              357
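A quick sanity check of those percentages from the 3x-concurrency rows (plain arithmetic, nothing project-specific):

```python
# Overclocking trade-off from the table above (3-task concurrency rows;
# clocks in MHz, power in W, time per work unit in s).
base = {"clock": 5100, "power": 200, "time_per_wu": 337}
oc   = {"clock": 6100, "power": 212, "time_per_wu": 322}

clock_gain = (oc["clock"] - base["clock"]) / base["clock"]
power_cost = (oc["power"] - base["power"]) / base["power"]
time_saved = (base["time_per_wu"] - oc["time_per_wu"]) / base["time_per_wu"]

print(f"memory clock: +{clock_gain:.1%}")   # +19.6%
print(f"power:        +{power_cost:.1%}")   # +6.0%
print(f"time per WU:  -{time_saved:.1%}")   # -4.5%
```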

BTW Vega 64 has a memory bandwidth of 484 GB/s and a RAC between 110k and 150k. Since Radeon VII has a memory bandwidth of 1024 GB/s, should we expect a RAC about 250k on a Radeon VII?
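A back-of-the-envelope check of that estimate, assuming RAC scales perfectly linearly with memory bandwidth (real scaling is usually sub-linear, which is why ~250k is a sensible, conservative guess):

```python
# Scale the observed Vega 64 RAC range by the memory-bandwidth ratio.
# Assumes RAC is purely bandwidth-limited, so this overestimates real gains.
vega64_bw, radeon7_bw = 484, 1024          # GB/s
vega64_rac = (110_000, 150_000)            # observed RAC range

ratio = radeon7_bw / vega64_bw             # about 2.12x
estimate = tuple(round(r * ratio) for r in vega64_rac)
print(estimate)  # (232727, 317355)
```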

 

Added 15 Jan 2019 18:54:45 UTC

Apparently a stable overclock for gaming is not a stable overclock for crunching. Since Jan 14th I have gotten 169 invalid results.

Zalster
Zalster
Joined: 26 Nov 13
Posts: 2,934
Credit: 3,071,507,649
RAC: 729

It's not the work units that

It's not the work units that downclock the memory, it's Nvidia. Nvidia has stated on their website that when the GPU recognizes a scientific (compute) workload, it is moved to the P2 state, which by default runs lower GPU and memory clocks, so that the scientific work units don't get corrupted by the normal P0-state clock levels. AMD doesn't have this issue.

Richie
Richie
Joined: 7 Mar 14
Posts: 392
Credit: 1,516,772,584
RAC: 59

I'm not sure if this works

I'm not sure if this works with GTX 10xx series cards but with 9xx you can use a program called Nvidia Inspector to adjust the P2 mem clock to max. https://www.guru3d.com/files-details/nvidia-inspector-download.html

I'm still using version 1.9.7.8 as I recall there was something strange with installing or using the newer version, but might have been just a user error by me.

In Inspector: Show Overclocking, then under Overclocking select Performance Level (2) - (P2), move the Memory Clock Offset slider to max, and Apply Clocks & Voltage. You can see the memory bandwidth (Bus Width GB/s) on the left side change. Then Exit.
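For reference, nvidia-smi exposes related knobs from the command line, though setting application clocks is often restricted to Tesla/Quadro cards; whether it works on a 9xx/10xx GeForce depends on the card and driver, and the clock values below are placeholders, not recommendations:

```shell
# Inspect current clocks and the P-state the card is sitting in.
nvidia-smi -q -d CLOCK

# List which <memory,graphics> clock pairs the card will accept.
nvidia-smi -q -d SUPPORTED_CLOCKS

# On cards that allow it, pin application clocks to <memory,graphics>
# in MHz (placeholder values shown).
nvidia-smi -ac 3505,1455

# Reset application clocks back to default.
nvidia-smi -rac
```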

shuhui1990
shuhui1990
Joined: 16 Sep 06
Posts: 20
Credit: 1,236,571,011
RAC: 235,696

Thank you for pointing it

Thank you for pointing it out. Could you elaborate on "get corrupted by the normal P0 state levels"?

Now that I've gotten 169 invalid results since yesterday, I think the P2 state has a point. Memory overclocked to 6100 MHz should be stable for games, but obviously Einstein@home doesn't agree. I need to find a sweet spot between shorter completion time and failure rate.

shuhui1990
shuhui1990
Joined: 16 Sep 06
Posts: 20
Credit: 1,236,571,011
RAC: 235,696

Thanks. It seems the P2 state

Thanks. It seems the P2 state can be disabled.

https://www.reddit.com/r/RenderToken/comments/9w2rd9/how_to_use_maximum_p0_power_state_with_nvidia/

However I got lots of invalid results by going 500 MHz above the base memory clock (1000 MHz above P2). Do you have any recommendations on a safe overclock relative to the base clock?

Zalster
Zalster
Joined: 26 Nov 13
Posts: 2,934
Credit: 3,071,507,649
RAC: 729

shuhui1990 wrote:Thank you

shuhui1990 wrote:

Thank you for pointing it out. Could you elaborate on "get corrupted by the normal P0 state levels"?

Now that I've gotten 169 invalid results since yesterday, I think the P2 state has a point. Memory overclocked to 6100 MHz should be stable for games, but obviously Einstein@home doesn't agree. I need to find a sweet spot between shorter completion time and failure rate.

 

Without going into too much detail: P0 states are fine for gaming. No one really cares if you drop small bits of data here and there; it gets overwritten quickly as the screen changes. However, when doing scientific work, any error in the calculations corrupts the entire work unit. Nvidia knows this, so to prevent errors from occurring they restrict scientific processes to the P2 state with slower clocks so that no corruption gets incorporated in the analysis. Remember, these are gaming cards, not scientific cards like the Teslas. If you want more info you can google P0 vs P2 states and find miners discussing the difference, etc. This is just a quick explanation in a nutshell.

shuhui1990
shuhui1990
Joined: 16 Sep 06
Posts: 20
Credit: 1,236,571,011
RAC: 235,696

Thanks. I do understand when

Thanks. I do understand that once the memory clock crosses a certain point, the error rate increases exponentially with clock speed. So it seems to me the P2 memory clock is the absolutely safe clock with zero errors, while the P0 memory clock is already "officially overclocked": a few errors, but fine for gaming.

archae86
archae86
Joined: 6 Dec 05
Posts: 2,605
Credit: 2,080,807,463
RAC: 2,245,258

The only way to find the

The only way to find the failure threshold for a particular sample of a particular card running a particular application is large-scale testing.

I've done this.  The answer varies from card to card of the same make and model.  So don't trust anyone who gives you a number.

The other side of the coin is "how much benefit?". Several generations ago, when it became known that Maxwell2-generation cards downclocked memory a lot, there was an appreciable gain available in Einstein performance by tampering with it. I had sworn off CPU overclocking several years before, but got into GPU overclocking for the first time on that occasion.

I think you may find that the current Einstein application gives less performance improvement with memory overclocking than you might suppose, making it a bit questionable whether it is worth the time and effort to find a safe overclock.

Also, as not all the data sets are the same, there is no guarantee that a carefully found just barely safe operating point will stay safe into the future.

Warnings aside, I personally do overclock, but I do it by slowly creeping up in clock rate until I find error, then backing down until I find a rate that gives zero errors in 24 hours, then back down two more increments.

Your preferred method will vary, naturally.
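The creep-up/back-off procedure described above can be sketched as a small loop. Everything here is a stand-in: `runs_clean()` represents "run E@H tasks at this memory-clock offset for 24 hours and see zero errors", and the step size is arbitrary.

```python
# Sketch of the creep-up/back-off overclock search. runs_clean(offset)
# is a hypothetical callback: True if 24 h of crunching at that
# memory-clock offset produced zero errors.
def find_safe_offset(runs_clean, start=0, step=50, backoff=2):
    """Creep up until errors appear, back down to the first clean
    setting, then back off two more increments as a safety margin."""
    offset = start
    while runs_clean(offset):          # creep up until errors appear
        offset += step
    while not runs_clean(offset):      # back down until a 24 h clean run
        offset -= step
    return offset - backoff * step     # two more increments of margin

# Toy stand-in: pretend everything below +400 MHz is error-free.
print(find_safe_offset(lambda mhz: mhz < 400))  # prints 250
```

In practice each `runs_clean` call costs a day of crunching, which is exactly why the answer varies card to card and isn't worth trusting second-hand.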

 

shuhui1990
shuhui1990
Joined: 16 Sep 06
Posts: 20
Credit: 1,236,571,011
RAC: 235,696

I was gonna take your method.

I was gonna take your method. I do hope there's an application that tests VRAM fidelity, like MemTest does for RAM, so I don't need to screw up E@H tasks.

Did you see performance improvement with core clock overclocking?

archae86
archae86
Joined: 6 Dec 05
Posts: 2,605
Credit: 2,080,807,463
RAC: 2,245,258

shuhui1990 wrote:Did you see

shuhui1990 wrote:
Did you see performance improvement with core clock overclocking?

Yes, but again somewhat less than one might suppose.  But more than the memory clock for recent cards and the current Einstein application, if memory serves.

rjs5
rjs5
Joined: 3 Jul 05
Posts: 32
Credit: 125,291,237
RAC: 150

Do you know if there is any

Do you know if there are any app_config options to pass a command-line option to disable the throttling?

Do you think this might be related to the Nvidia Series 20 problems?

I took a GPU-Z log at the point of the EAH screen blank, and it seems the GPU and memory frequencies drop by more than 1 GHz. I was not aware that EAH messed around with the frequencies. If EAH lowers the frequency too much, it might be the source of the timeout.

Thoughts?

 

Time      Core clock (MHz)   Mem clock (MHz)   Temp (°C)
26:30.2   1350               1750              64
26:30.5   1350               1750              64
26:30.9   1350               1750              64
26:31.2    435 *              101.3 *          63 *
26:31.5    435                101.3            63
26:31.8    435                101.3            63
26:32.1    330                101.3            63
26:32.4    330                101.3            63

 

