Hi colleagues,
I just changed my GPU from an 8 GiB card to an 11 GiB one, but the detected memory went from 4 GiB to 3 GiB. Is the core of the GW computation here still 32-bit?
Copyright © 2024 Einstein@Home. All rights reserved.
Detection of GPU memory happens via BOINC and has nothing to do with Einstein tasks.
But yes, BOINC still uses an old memory detection method for Nvidia GPUs that is limited to 32 bits, so it can only read a maximum of 4095 MB (4 GB) for Nvidia. The memory detection used for AMD doesn't have this limitation.
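To illustrate where the 4095 figure comes from, here is a minimal sketch of the arithmetic. It assumes the old method stores the byte count in an unsigned 32-bit value and saturates at the largest value that fits; that saturation behaviour is an assumption made for illustration, and `reported_mib` is a hypothetical helper, not BOINC code.

```python
MIB = 1024 * 1024
UINT32_MAX = 2**32 - 1  # largest byte count an unsigned 32-bit field can hold


def reported_mib(actual_bytes: int) -> int:
    """MB figure a 32-bit byte counter would show, assuming it saturates."""
    return min(actual_bytes, UINT32_MAX) // MIB


# Cards below 4 GiB are reported correctly; anything larger collapses to 4095.
for gib in (2, 8, 11):
    print(f"{gib} GiB card -> {reported_mib(gib * 1024**3)} MB reported")
```

Under that assumption, both an 8 GiB and an 11 GiB card come out as 4095 MB, which matches what is seen in BOINC.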
This limitation has been known for a long time, and BOINC devs have long blamed the issue on Nvidia, saying there was nothing they could do about it. However, a member of our team discovered that BOINC was simply using an old detection method and that a newer method does exist. He modified the BOINC code, recompiled the BOINC client with the new method, and proved that it works. This information has been shared with the BOINC devs, but they don't seem interested in fixing it, since it still hasn't been implemented. It's been about a year already.
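If you want to check what the driver itself reports, independently of BOINC, you can ask `nvidia-smi` directly. The sketch below is not the modified BOINC detection described above; it just shells out to `nvidia-smi` with its `--query-gpu=memory.total` option and parses the result. The function names are made up for this example, and it assumes `nvidia-smi` is on the PATH.

```python
import subprocess


def parse_memory_total(csv_text: str) -> list[int]:
    """Parse one MiB value per GPU; skip any non-numeric lines such as warnings."""
    return [int(s) for s in (line.strip() for line in csv_text.splitlines()) if s.isdigit()]


def query_gpu_memory_mib() -> list[int]:
    """Ask nvidia-smi for the total memory of each GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_memory_total(result.stdout)


if __name__ == "__main__":
    try:
        print(query_gpu_memory_mib())
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print("nvidia-smi not available:", exc)
```

A value well above 4095 here alongside 4095 MB in BOINC would point to the detection limit rather than to the card.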
Yes, 4095 MB is detected as the maximum memory for Nvidia on both the P4000 card and the 2080 Ti card. I missed that the 2080 Ti reports the same value (the 3105 I saw is my free disk space). Just now I got an nvidia-smi warning about my P4000 card, so maybe it is a memory error (or a trust problem?). This is my first CUDA device, and I lost my mind at the time. Thanks.
What was the error message?
The free program MemtestCL uses OpenCL to test GPU memory. I have used it on a GTX 1070 that I thought had a problem. I ended up junking the motherboard, and the card has been working fine on another motherboard.
There is a companion version for CUDA, MemtestG80, but I have not used it.
Thanks. The error appeared when I installed my P4000 card. I tested the same board (a Dell 7910 case with X99) with my 2080 Ti, and no error has appeared so far. The error was reported by `nvidia-smi`:
WARNING: infoROM is corrupted at gpu 0000:03:00.0
(This is the same warning I get for my P4000 card, and it was also reported for a Titan X on the Nvidia forum.)
Nvidia's official response said it is an error in the non-volatile memory (ROM). I will try your suggestion, but I suspect something is wrong with my purchase of this card, since I found that the warranty location is Australia.