Comprehensive GPU performance list?

mmonnin
mmonnin
Joined: 29 May 16
Posts: 291
Credit: 3417906540
RAC: 3603248

I've always seen no GPU load and 100% of a CPU thread load at 90%. With a RX 580 there was very little CPU util prior to 90% completion and then 100% CPU util.

Gary Roberts
Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117727412276
RAC: 34986072

mmonnin wrote:
I've always seen no GPU load and 100% of a CPU thread load at 90%. With a RX 580 there was very little CPU util prior to 90% completion and then 100% CPU util.

I have an RX 580 in a Q6600 quad core host.  The Q6600 is 2008 vintage, so relatively slow by today's standards.  It runs 3 GPU tasks and 1 CPU task concurrently.  Using the properties function in BOINC Manager, I recorded the following information for one of the concurrent GPU tasks in the final stages of crunching:

          % Complete ->          89.54%          89.93%          100%
          CPU Time   ->           1:00            1:00            1:02
          Tot Time   ->          28:18           28:28           28:31

The time figures are minutes:seconds as displayed on the properties page, with a granularity of 1 second.  The main point I'm trying to make is that this is quite different from what it was about a month or so ago.  At that earlier time on this particular GPU, the follow-up stage (% complete sitting on 89.997%) lasted around 30 seconds to a minute.  I never measured it precisely.

On the above figures, the very last bit of crunching, plus the follow-up stage (if there still is one), plus retrieving the final results from the GPU, shutting down the app, and writing the results to disk ready for uploading, took 3 seconds of elapsed time and very approximately 2 seconds of CPU time.  So the above quote doesn't really give the complete picture, particularly regarding how short the 90-100% stage really is and how little of the total CPU consumption occurs there.
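Those 3 s elapsed / ~2 s CPU figures follow directly from the quoted properties readings; a quick pure-Python sketch of the subtraction (values exactly as quoted above):

```python
# Reproduce the arithmetic from the BOINC Manager "properties" readings
# quoted above (times are MM:SS strings with 1-second granularity).
def to_seconds(mmss: str) -> int:
    """Convert an 'MM:SS' string to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

# Readings at ~89.93% complete vs. at 100% complete.
cpu_delta = to_seconds("1:02") - to_seconds("1:00")    # CPU time in final stage
tot_delta = to_seconds("28:31") - to_seconds("28:28")  # elapsed time in final stage

print(f"final stage: {tot_delta} s elapsed, {cpu_delta} s CPU")
```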

 

Cheers,
Gary.

lunkerlander
lunkerlander
Joined: 25 Jul 18
Posts: 46
Credit: 31464094
RAC: 0

Hi, in case it helps, here is a link to a list from Tom's Hardware ranking GPU performance from 2018:

https://www.tomshardware.com/reviews/gpu-hierarchy,4388.html

 

Chooka
Chooka
Joined: 11 Feb 13
Posts: 134
Credit: 3717955759
RAC: 1644407

No Titan V?

Must have assumed most people don't have that kind of cash for a GPU.


mmonnin
mmonnin
Joined: 29 May 16
Posts: 291
Credit: 3417906540
RAC: 3603248

Gary Roberts wrote:
mmonnin wrote:
I've always seen no GPU load and 100% of a CPU thread load at 90%. With a RX 580 there was very little CPU util prior to 90% completion and then 100% CPU util.

I have an RX 580 in a Q6600 quad core host.  The Q6600 is 2008 vintage, so relatively slow by today's standards.  It runs 3 GPU tasks and 1 CPU task concurrently.  Using the properties function in BOINC Manager, I recorded the following information for one of the concurrent GPU tasks in the final stages of crunching:

          % Complete ->          89.54%          89.93%          100%
          CPU Time   ->           1:00            1:00            1:02
          Tot Time   ->          28:18           28:28           28:31

The time figures are minutes:seconds as displayed on the properties page, with a granularity of 1 second.  The main point I'm trying to make is that this is quite different from what it was about a month or so ago.  At that earlier time on this particular GPU, the follow-up stage (% complete sitting on 89.997%) lasted around 30 seconds to a minute.  I never measured it precisely.

On the above figures, the very last bit of crunching, plus the follow-up stage (if there still is one), plus retrieving the final results from the GPU, shutting down the app, and writing the results to disk ready for uploading, took 3 seconds of elapsed time and very approximately 2 seconds of CPU time.  So the above quote doesn't really give the complete picture, particularly regarding how short the 90-100% stage really is and how little of the total CPU consumption occurs there.

 

 

No matter how long the 90%-to-100% stage takes, GPU utilization drops during it, which was my point. Whether it's the few seconds it takes now or the couple of minutes in the earlier data, there was little load on the GPU during that window.

shuhui1990
shuhui1990
Joined: 16 Sep 06
Posts: 27
Credit: 3631456971
RAC: 0

I took a look at the Top Computers list. Here's the single-GPU performance I gathered. Actual performance depends on overclocking and on the number of work units run at the same time.

 

GPU            RAC
GTX 1080 Ti    700k~800k
GTX 1080       550k~700k
Vega 64        1100k~1500k
RX 480/580     550k~750k
RX 570         ~500k

AMD GPUs obviously have a huge advantage over NVIDIA GPUs even when they have similar FLOPS. For example, the GTX 1080 Ti and Vega 64 both have 484 GB/s of memory bandwidth. The GTX 1080 Ti has 11,340 GFLOPS of FP32 while the Vega 64 has 12,583 GFLOPS. However, the Vega 64 is almost twice as fast in Einstein@home. I understand FGRPG uses OpenCL, but AMD cards do not outperform NVIDIA cards in most OpenCL benchmarks.
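One possible reading of those numbers (my own assumption, not something established in this thread): if the app were limited by memory bandwidth rather than FP32 throughput, the near-identical bandwidth would predict near-identical RAC, which is not what we see either. A small sketch of the FLOP-per-byte ceilings and the observed RAC gap, using the figures quoted in this post and the midpoints of the RAC ranges from the table earlier in this post:

```python
# Figures as quoted above; the memory-bound vs compute-bound framing is
# an assumption, not a measurement.
BANDWIDTH_GBS = 484.0  # GB/s -- same for GTX 1080 Ti and Vega 64

fp32_gflops = {"GTX 1080 Ti": 11_340, "Vega 64": 12_583}
for card, gflops in fp32_gflops.items():
    # FLOPs available per byte of memory traffic (arithmetic-intensity ceiling)
    print(f"{card}: {gflops / BANDWIDTH_GBS:.1f} FLOP/byte")

# Midpoints of the RAC ranges from the table earlier in this post
rac_midpoint = {"GTX 1080 Ti": (700 + 800) / 2, "Vega 64": (1100 + 1500) / 2}
ratio = rac_midpoint["Vega 64"] / rac_midpoint["GTX 1080 Ti"]
print(f"observed RAC ratio (Vega 64 / 1080 Ti): {ratio:.2f}")
```

The FP32 gap is only about 11%, while the RAC midpoints differ by about 73%, so neither raw FLOPS nor bandwidth alone explains the difference.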

https://www.phoronix.com/scan.php?page=article&item=12-opencl-98&num=6

I don't understand why Einstein@home has such poor optimization for NVIDIA cards when, according to https://einsteinathome.org/server_status.php, the number of hosts with NVIDIA GPUs is more than twice the number of hosts with AMD GPUs.

DanNeely
DanNeely
Joined: 4 Sep 05
Posts: 1364
Credit: 3562358667
RAC: 0

E@H's GPU apps are written in OpenCL so the project only has to maintain one code base, not two.  Unfortunately NVidia's OpenCL implementation kinda sucks; many people suspect that's deliberate on NVidia's part and that they're trying to encourage developers to target their proprietary CUDA API instead.

shuhui1990
shuhui1990
Joined: 16 Sep 06
Posts: 27
Credit: 3631456971
RAC: 0

Vega 64 doesn't outperform GTX 1080 Ti by more than 50% in any of the tests in this article.

https://www.phoronix.com/scan.php?page=article&item=12-opencl-98


In the Single Precision FFT benchmark, which I think is the most pertinent to E@H, the Vega 64 only leads by 10%.
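For context on why an FFT benchmark is the relevant comparison here: the searches these apps run are FFT-heavy. Purely as an illustration of the algorithm being benchmarked (this toy `fft` is my own sketch and has nothing in common with the tuned clFFT/cuFFT kernels real apps use), a minimal radix-2 Cooley-Tukey FFT:

```python
import cmath

def fft(x):
    """Minimal radix-2 Cooley-Tukey FFT; len(x) must be a power of two.
    Illustrative only -- GPU FFT libraries use far more sophisticated,
    hardware-tuned kernels."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])  # FFT of even-indexed samples
    odd = fft(x[1::2])   # FFT of odd-indexed samples
    # Combine the half-size transforms with the twiddle factors e^(-2*pi*i*k/n)
    twiddled = [cmath.exp(-2j * cmath.pi * k / n) * odd[k] for k in range(n // 2)]
    return [even[k] + twiddled[k] for k in range(n // 2)] + \
           [even[k] - twiddled[k] for k in range(n // 2)]

print(fft([1, 0, 0, 0]))  # an impulse transforms to four equal bins of 1
```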

Richie
Richie
Joined: 7 Mar 14
Posts: 656
Credit: 1702989778
RAC: 0

Looks like an old AMD R9 390 still produces as much as an RX 480/580 or GTX 1080. The initial cost to get one is much lower, but it will use more electricity.
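A rough sketch of that running-cost trade-off. The wattages and electricity price here are my own assumptions, not from this thread: board TDPs of roughly 275 W for an R9 390 and 185 W for an RX 580, power at $0.12/kWh, crunching 24/7.

```python
# Assumed numbers (not from the thread): ~275 W for an R9 390,
# ~185 W for an RX 580, $0.12/kWh, running flat-out all year.
HOURS_PER_YEAR = 24 * 365

def yearly_cost(watts: float, price_per_kwh: float = 0.12) -> float:
    """Electricity cost in dollars for running a card flat-out for a year."""
    return watts / 1000 * HOURS_PER_YEAR * price_per_kwh

r9_390, rx_580 = yearly_cost(275), yearly_cost(185)
print(f"R9 390: ${r9_390:.0f}/yr, RX 580: ${rx_580:.0f}/yr, "
      f"difference: ${r9_390 - rx_580:.0f}/yr")
```

Under those assumptions the extra electricity is on the order of $95/year, which can eat up the lower purchase price fairly quickly.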

Gordon Haverland
Gordon Haverland
Joined: 28 Oct 16
Posts: 20
Credit: 428489605
RAC: 0

I have 6 computers on my LAN, all with AMD CPU.

One machine currently "has no GPU" (it has an HD 5450, but that proprietary crappy add-on driver from AMD is not supported by the version of X it is running under Debian Jessie).  It's an 8320E (8 cores).

I have a 2-core AMD CPU paired with an RX-550.

I have a 4-core AMD APU (though its GPU part is being ignored) paired with an RX-550.

I have an 8-core 8320E with an RX-460.  Its fans have been making noises, which got me worried, and I (somewhat accidentally) ordered two RX-570 replacements for when it dies or I get tired of hearing it.

I have a Ryzen-1600X with 32G of RAM and currently an RX-560 in it.  One of the RX-570s I bought has 8 GB of RAM; it will go in this machine.

I have a Ryzen-1600 with 16G of RAM which had an RX-560 in it.  That was recently replaced with an RX-570 with 4G of RAM.  This machine has the Ryzen idle-freeze problem.  I am only running 3 BOINC jobs on it at the moment, because I still haven't found a solution to the idle freeze.  It is running the latest BIOS, I have installed the most recent firmware-amd-graphics package, and I am playing with BIOS settings and a Zen-states Python program to disallow C6.

The machine with the 8320E and (effectively) no GPU was meant to get hardware upgrades last winter, but I ran out of time.  Soon it will get them, which will mean putting an RX-560 in it.

 

I have done nothing so far about tuning the number of jobs or anything else.  By and large, I was hoping to lean on BOINC to learn about doing computations across machines on a LAN.  At some point, I am hoping that something (OpenMPI, OpenCL, ...) may allow me to work on problems where my whole LAN contributes to a solution.  One of those problems is such that I might need to set up a BOINC server so that other computers can contribute.  That's really vague.

I live in the "Peace Country", which is partly in NE BC Canada, and partly in NW Alberta, Canada.  It is an area about the size of Germany, with about 150,000 people.  That is not enough people for anyone to consider climate modeling for.  The Peace Country is known worldwide for honey, because with our long summer days (19 hours? 20?) we have lots of flowers and hence bees.  I've written to a couple of people who have a lot of experience with climate/weather stuff, and they both thought I have enough CPU, GPU, RAM and storage to do this.  My background is materials science, but heavy on the computing side.  So I am hoping to figure out one or more variations on downscaling, so that I can try and couple global circulation model results, to what is happening in the "Peace".

 
