Comprehensive GPU performance list?

mmonnin
mmonnin
Joined: 29 May 16
Posts: 291
Credit: 3417906540
RAC: 3603248

I've always seen no GPU load and 100% of a CPU thread load at 90%. With a RX 580 there was very little CPU util prior to 90% completion and then 100% CPU util.

Gary Roberts
Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117727412276
RAC: 34986072

mmonnin wrote:
I've always seen no GPU load and 100% of a CPU thread load at 90%. With a RX 580 there was very little CPU util prior to 90% completion and then 100% CPU util.

I have an RX 580 in a Q6600 quad core host.  The Q6600 is 2008 vintage, so relatively slow by today's standards.  It runs 3 GPU tasks and 1 CPU task concurrently.  Using the properties function in BOINC Manager, I recorded the following information for one of the concurrent GPU tasks in the final stages of crunching:

          % Complete ->          89.54%          89.93%          100%
          CPU Time   ->           1:00            1:00            1:02
          Tot Time   ->          28:18           28:28           28:31

The time figures are minutes:seconds as displayed on the properties page, with a granularity of 1 second.  The main point I'm trying to make is that this is quite different from what it was about a month or so ago.  At that earlier time on this particular GPU, the follow-up stage (% complete sitting on 89.997%) lasted around 30 seconds to a minute.  I never measured it precisely.

On the above figures, the very last bit of crunching, plus the follow-up stage (if there still is one), plus retrieving the final results from the GPU, shutting down the app, and writing the results to disk ready for uploading, took 3 seconds of elapsed time and very approximately 2 seconds of CPU time.  So the above quote doesn't really give the complete picture, particularly regarding how short the 90-100% stage really is and how little of the total CPU consumption occurs there.
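Those 3 s elapsed / ~2 s CPU figures follow directly from the quoted properties readings; a quick pure-Python sketch of the subtraction (values exactly as quoted above):

```python
# Reproduce the arithmetic from the BOINC Manager "properties" readings
# quoted above (times are MM:SS strings with 1-second granularity).
def to_seconds(mmss: str) -> int:
    """Convert an 'MM:SS' string to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

# Readings at ~89.93% complete vs. at 100% complete.
cpu_delta = to_seconds("1:02") - to_seconds("1:00")    # CPU time in final stage
tot_delta = to_seconds("28:31") - to_seconds("28:28")  # elapsed time in final stage

print(f"final stage: {tot_delta} s elapsed, {cpu_delta} s CPU")
```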

 

Cheers,
Gary.

lunkerlander
lunkerlander
Joined: 25 Jul 18
Posts: 46
Credit: 31464094
RAC: 0

Hi, in case it helps, here is a link to a list from Tom's Hardware ranking GPU performance from 2018:

https://www.tomshardware.com/reviews/gpu-hierarchy,4388.html

 

Chooka
Chooka
Joined: 11 Feb 13
Posts: 134
Credit: 3717955759
RAC: 1644407

No Titan V?

Must have assumed most people don't have that kind of cash for a GPU.


mmonnin
mmonnin
Joined: 29 May 16
Posts: 291
Credit: 3417906540
RAC: 3603248

Gary Roberts wrote:
mmonnin wrote:
I've always seen no GPU load and 100% of a CPU thread load at 90%. With a RX 580 there was very little CPU util prior to 90% completion and then 100% CPU util.

I have an RX 580 in a Q6600 quad core host.  The Q6600 is 2008 vintage, so relatively slow by today's standards.  It runs 3 GPU tasks and 1 CPU task concurrently.  Using the properties function in BOINC Manager, I recorded the following information for one of the concurrent GPU tasks in the final stages of crunching:

          % Complete ->          89.54%          89.93%          100%
          CPU Time   ->           1:00            1:00            1:02
          Tot Time   ->          28:18           28:28           28:31

The time figures are minutes:seconds as displayed on the properties page, with a granularity of 1 second.  The main point I'm trying to make is that this is quite different from what it was about a month or so ago.  At that earlier time on this particular GPU, the follow-up stage (% complete sitting on 89.997%) lasted around 30 seconds to a minute.  I never measured it precisely.

On the above figures, the very last bit of crunching, plus the follow-up stage (if there still is one), plus retrieving the final results from the GPU, shutting down the app, and writing the results to disk ready for uploading, took 3 seconds of elapsed time and very approximately 2 seconds of CPU time.  So the above quote doesn't really give the complete picture, particularly regarding how short the 90-100% stage really is and how little of the total CPU consumption occurs there.

 

 

No matter how long the 90%-to-100% stage takes, GPU utilization drops during it, which was my point. Whether it's the few seconds it takes now or the couple of minutes in the earlier data, there was little load on the GPU during that window.

shuhui1990
shuhui1990
Joined: 16 Sep 06
Posts: 27
Credit: 3631456971
RAC: 0

I took a look at the Top Computers list. Here's the single-GPU performance I gathered. Actual performance depends on overclocking and on the number of work units run at the same time.

 

GPU            RAC
GTX 1080 Ti    700k~800k
GTX 1080       550k~700k
Vega 64        1100k~1500k
RX 480/580     550k~750k
RX 570         ~500k

AMD GPUs obviously have a huge advantage over NVIDIA GPUs even when they have similar FLOPS. For example, the GTX 1080 Ti and Vega 64 both have 484 GB/s of memory bandwidth. The GTX 1080 Ti has 11,340 GFLOPS of FP32 while the Vega 64 has 12,583 GFLOPS. However, the Vega 64 is almost twice as fast in Einstein@home. I understand FGRPG uses OpenCL, but AMD cards do not outperform NVIDIA cards in most OpenCL benchmarks.
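One possible reading of those numbers (my own assumption, not something established in this thread): if the app were limited by memory bandwidth rather than FP32 throughput, the near-identical bandwidth would predict near-identical RAC, which is not what we see either. A small sketch of the FLOP-per-byte ceilings and the observed RAC gap, using the figures quoted in this post and the midpoints of the RAC ranges from the table earlier in this post:

```python
# Figures as quoted above; the memory-bound vs compute-bound framing is
# an assumption, not a measurement.
BANDWIDTH_GBS = 484.0  # GB/s -- same for GTX 1080 Ti and Vega 64

fp32_gflops = {"GTX 1080 Ti": 11_340, "Vega 64": 12_583}
for card, gflops in fp32_gflops.items():
    # FLOPs available per byte of memory traffic (arithmetic-intensity ceiling)
    print(f"{card}: {gflops / BANDWIDTH_GBS:.1f} FLOP/byte")

# Midpoints of the RAC ranges from the table earlier in this post
rac_midpoint = {"GTX 1080 Ti": (700 + 800) / 2, "Vega 64": (1100 + 1500) / 2}
ratio = rac_midpoint["Vega 64"] / rac_midpoint["GTX 1080 Ti"]
print(f"observed RAC ratio (Vega 64 / 1080 Ti): {ratio:.2f}")
```

The FP32 gap is only about 11%, while the RAC midpoints differ by about 73%, so neither raw FLOPS nor bandwidth alone explains the difference.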

https://www.phoronix.com/scan.php?page=article&item=12-opencl-98&num=6

I don't understand why Einstein@home has such poor optimization for NVIDIA cards when, according to https://einsteinathome.org/server_status.php, the number of hosts with NVIDIA GPUs is more than twice the number of hosts with AMD GPUs.

DanNeely
DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3562358667
RAC: 0

E@H's GPU apps are written in OpenCL so the project only has to maintain one code base, not two.  Unfortunately NVidia's OpenCL implementation kinda sucks; many people suspect that's deliberate on NVidia's part and that they're trying to encourage developers to target their proprietary CUDA API instead.

shuhui1990
shuhui1990
Joined: 16 Sep 06
Posts: 27
Credit: 3631456971
RAC: 0

Vega 64 doesn't outperform GTX 1080 Ti by more than 50% in any of the tests in this article.

https://www.phoronix.com/scan.php?page=article&item=12-opencl-98


In the Single Precision FFT benchmark, which I think is the most pertinent to E@H, the Vega 64 only leads by 10%.
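For context on why an FFT benchmark is the relevant comparison here: the searches these apps run are FFT-heavy. Purely as an illustration of the algorithm being benchmarked (this toy `fft` is my own sketch and has nothing in common with the tuned clFFT/cuFFT kernels real apps use), a minimal radix-2 Cooley-Tukey FFT:

```python
import cmath

def fft(x):
    """Minimal radix-2 Cooley-Tukey FFT; len(x) must be a power of two.
    Illustrative only -- GPU FFT libraries use far more sophisticated,
    hardware-tuned kernels."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])  # FFT of even-indexed samples
    odd = fft(x[1::2])   # FFT of odd-indexed samples
    # Combine the half-size transforms with the twiddle factors e^(-2*pi*i*k/n)
    twiddled = [cmath.exp(-2j * cmath.pi * k / n) * odd[k] for k in range(n // 2)]
    return [even[k] + twiddled[k] for k in range(n // 2)] + \
           [even[k] - twiddled[k] for k in range(n // 2)]

print(fft([1, 0, 0, 0]))  # an impulse transforms to four equal bins of 1
```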

Richie
Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

Looks like an old AMD R9 390 still produces as much as an RX 480/580 or GTX 1080. The initial cost to get one is much lower, but it will use more electricity.
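A rough sketch of that running-cost trade-off. The wattages and electricity price here are my own assumptions, not from this thread: board TDPs of roughly 275 W for an R9 390 and 185 W for an RX 580, power at $0.12/kWh, crunching 24/7.

```python
# Assumed numbers (not from the thread): ~275 W for an R9 390,
# ~185 W for an RX 580, $0.12/kWh, running flat-out all year.
HOURS_PER_YEAR = 24 * 365

def yearly_cost(watts: float, price_per_kwh: float = 0.12) -> float:
    """Electricity cost in dollars for running a card flat-out for a year."""
    return watts / 1000 * HOURS_PER_YEAR * price_per_kwh

r9_390, rx_580 = yearly_cost(275), yearly_cost(185)
print(f"R9 390: ${r9_390:.0f}/yr, RX 580: ${rx_580:.0f}/yr, "
      f"difference: ${r9_390 - rx_580:.0f}/yr")
```

Under those assumptions the extra electricity is on the order of $95/year, which can eat up the lower purchase price fairly quickly.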

Gordon Haverland
Gordon Haverland
Joined: 28 Oct 16
Posts: 20
Credit: 428489605
RAC: 0

I have 6 computers on my LAN, all with AMD CPU.

One machine currently "has no GPU" (it has an HD 5450, but that proprietary crappy add-on driver from AMD is not supported by the version of X it is running under Debian Jessie).  It's an 8320E (8 cores).

I have a 2-core AMD CPU paired with an RX-550.

I have a 4-core AMD APU (though its GPU part is being ignored) paired with an RX-550.

I have an 8-core 8320E with an RX-460.  Its fans have been making noises, which got me worried, and I (somewhat accidentally) ordered two RX-570 replacements for when it dies or I get tired of hearing it.

I have a Ryzen-1600X with 32G of RAM and currently an RX-560 in it.  One of the RX-570s I bought has 8 GB of RAM; it will go in this machine.

I have a Ryzen-1600 with 16G of RAM which had an RX-560 in it.  That was recently replaced with an RX-570 with 4G of RAM.  This machine has the Ryzen idle-freeze problem.  I am only running 3 BOINC jobs on it at the moment, because I still haven't found a solution to the idle freeze.  It is running the latest BIOS, I have installed the most recent firmware-amd-graphics package, and I am playing with BIOS settings and a Zen-states Python program to disallow C6.

The machine with the 8320E and (effectively) no GPU was meant to get hardware upgrades last winter, but I ran out of time.  Soon it will get them, which will mean putting an RX-560 in it.

 

I have done nothing so far about tuning the number of jobs or anything else.  By and large, I was hoping to lean on BOINC to learn about doing computations across machines on a LAN.  At some point, I am hoping that something (OpenMPI, OpenCL, ...) may allow me to work on problems where my whole LAN contributes to a solution.  One of those problems is such that I might need to set up a BOINC server so that other computers can contribute.  That's really vague.

I live in the "Peace Country", which is partly in NE BC Canada, and partly in NW Alberta, Canada.  It is an area about the size of Germany, with about 150,000 people.  That is not enough people for anyone to consider climate modeling for.  The Peace Country is known worldwide for honey, because with our long summer days (19 hours? 20?) we have lots of flowers and hence bees.  I've written to a couple of people who have a lot of experience with climate/weather stuff, and they both thought I have enough CPU, GPU, RAM and storage to do this.  My background is materials science, but heavy on the computing side.  So I am hoping to figure out one or more variations on downscaling, so that I can try and couple global circulation model results, to what is happening in the "Peace".

 
