Times (Elapsed / CPU) for BRP5/6/6-Beta on various CPU/GPU combos - DISCUSSION Thread

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5840
Credit: 109038570431
RAC: 33861136
Topic 198005

If you have any general comments or discussion (but not hard data) about BRP6-Beta that you want to share, please post them here. If you have actual results and are prepared to document your setup, please use the corresponding Results ONLY thread and please have a think about the guidelines I gave there.

Thank you for your cooperation!

Cheers,
Gary.

robl
Joined: 2 Jan 13
Posts: 1709
Credit: 1454435721
RAC: 14716

HOST 01: 619

CPU: Intel Core I7-4770

PCIe3 slot x16
1st GPU: AMD HD7850 2048MB (Pitcairn)
Concurrency: 4 @ 0.5 CPUs + 0.25 GPUs
CPU Tasks: 4 x FGRP4-SSE2

[pre]
                                run time    cpu time

BRP6 (Parkes PMPS XT -v1.41)    ~18,000     ~2400
[/pre]

Zalster
Joined: 26 Nov 13
Posts: 3117
Credit: 4050672230
RAC: 0

Long running Single Work unit v1.50 per GPU

PM 0007_016D1_316_0 14360 runtime 10533 CPU time
PM 0007_016D1_326_0 14363 runtime 10345 CPU time
PM 0007_016D1_362_0 15212 runtime 10400 CPU time
PM 0007_016D1_206_1 16121 runtime 7034 CPU time

Edit... Each work unit uses between 68-72% of one core.

Sid
Joined: 17 Oct 10
Posts: 160
Credit: 918163110
RAC: 281607

CPU: Intel Core I7-4790k
PCIe3 slot x16
one GPU: AMD HD7970
CPU Tasks: 5 x other project
                                       run time
6 WUs BRP6 (Parkes PMPS XT -v1.50)     03:38:00

CPU: Intel Core I7-2600k
PCIe2 slot x16
one GPU: NVidia 770
CPU Tasks: 5 x other project
                                       run time
6 WUs BRP6 (Parkes PMPS XT -v1.50)     05:52:00

CPU: Intel Celeron G1850
PCIe2 slot x16
one GPU: NVidia 560Ti
CPU Tasks: None
                                       run time
6 WUs BRP6 (Parkes PMPS XT -v1.50)     08:21:00

Zalster
Joined: 26 Nov 13
Posts: 3117
Credit: 4050672230
RAC: 0

http://einstein.phys.uwm.edu/

http://einsteinathome.org/host/10267709

CPU type AuthenticAMD AMD FX(tm)-9370 Eight-Core Processor [Family 21 Model 2 Stepping 0]
Operating System Microsoft Windows 7
Number of processors 8
Coprocessors [3]

PCIex16_1 NVIDIA GeForce GTX 780 (3072MB) driver: 34725
PCIex4_1 -----
PCIex16_2 NVIDIA GeForce GTX 780 (3072MB) driver: 34725
PCIex4_2 NVIDIA GeForce GTX 780 (3072MB) driver: 34725
Memory 4 x 8GB
Concurrent 1 @ 1 CPU + 1 GPU
CPU Tasks None
Free CPU Cores 5

                   Run time    CPU time    CPU utilization
BRP6-beta v1.50    ~4220       ~1180       25-28% of a core per work unit

http://einsteinathome.org/host/11681771

CPU type AuthenticAMD AMD FX(tm)-9370 Eight-Core Processor [Family 21 Model 2 Stepping 0]
Operating System Microsoft Windows 7
Number of processors 8
Coprocessors [4]

PCIex16_1 NVIDIA GeForce GTX 980 (4095MB) driver: 34725
PCIex4_1 -------
PCIex8_1 NVIDIA GeForce GTX 980 (4095MB) driver: 34725
PCIex4_2 -------
PCIex16_2 NVIDIA GeForce GTX 980 (4095MB) driver: 34725
PCI --------
PCIex8_2 NVIDIA GeForce GTX 980 (4095MB) driver: 34725

Memory 4 x 8 GB
Concurrent 1 @ 1 CPU + 1 GPU
CPU Tasks None
Free CPU Cores 5

                   Run time    CPU time    CPU utilization
BRP6-Beta v1.50    ~4965       ~1063       18-34% of a core per work unit

Work units with high CPU utilization (78-84% of a core per work unit) were removed from this list. Those had excessive run and CPU times as well.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3522
Credit: 679998445
RAC: 513169

Quote:

Long running Single Work unit v1.50 per GPU

PM 0007_016D1_316_0 14360 runtime 10533 CPU time
PM 0007_016D1_326_0 14363 runtime 10345 CPU time
PM 0007_016D1_362_0 15212 runtime 10400 CPU time
PM 0007_016D1_206_1 16121 runtime 7034 CPU time

Edit... Each work unit uses between 68-72% of one core.

I've got a question concerning that host. If I look at the results on that host from the official app (before the beta), the run times are all quite similar but in the 20 ksec range. Did that app run with more instances in parallel? If not, that would mean that the fast-running beta app units (ca. 5 ksec) execute at 4 times the speed of the average work units before (hinting at PCIe bus saturation, I guess). This is exceptional. I think a speed-up by a factor of 1.5 is closer to what most people with fairly modern systems and one GPU per system will see, but this thread will certainly shed some light on exactly this question.

Cheers
HB

Zalster
Joined: 26 Nov 13
Posts: 3117
Credit: 4050672230
RAC: 0

I was running 2 work units per GPU (pre-beta, 0.6 CPU + 0.5 GPU) when the run times were in the 20 ksec range.

With a single work unit per card (pre-beta, 1 CPU + 1 GPU) the time was down around 14 ksec.

Compared to this 14 ksec (v1.39?), beta v1.50 work units are running just under 5 ksec (1 CPU + 1 GPU).

Currently testing 3 work units with a 0.5 CPU + 0.33 GPU
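
To compare these configurations on a common footing, each run time can be divided by the number of tasks sharing the GPU. A minimal Python sketch using the rounded figures quoted above (the function name is only illustrative, and "just under 5 ksec" is taken as 5000 s):

```python
def effective_seconds_per_task(run_time_s, tasks_per_gpu):
    """Wall-clock seconds of GPU time per completed task."""
    return run_time_s / tasks_per_gpu

# Rounded figures from the post above (approximate, in seconds).
pre_beta_2x = effective_seconds_per_task(20_000, 2)  # 2 tasks per GPU, pre-beta
pre_beta_1x = effective_seconds_per_task(14_000, 1)  # 1 task per GPU, pre-beta
beta_1x = effective_seconds_per_task(5_000, 1)       # 1 task per GPU, beta v1.50

print(f"beta vs pre-beta at 1 task per GPU:  {pre_beta_1x / beta_1x:.1f}x")
print(f"beta vs pre-beta at 2 tasks per GPU: {pre_beta_2x / beta_1x:.1f}x")
```

On these rounded numbers the beta comes out roughly 2 to 3 times faster per task, rather than the factor of 4 that a direct 20 ksec vs 5 ksec comparison suggests.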

Zalster
Joined: 26 Nov 13
Posts: 3117
Credit: 4050672230
RAC: 0

Just a quick note on some testing. No CPU work was done, only GPU work with CPU support. Both machines have 8 cores (refer to my prior post for specifics on the motherboards).

Machine ID :10267709
GTX 780s
Memory clock 7000
Bit Width 384
Memory Bandwidth 336 GB/s
Memory 3072 GDDR5

BRP6-beta v1.50                              Run time     CPU time     CPU utilization
1 work unit per GPU    1 CPU + 1 GPU         ~4220 sec    ~1180 sec    25-28% of a core per work unit
2 work units per GPU   0.5 CPU + 0.5 GPU     ~6330 sec    ~1048 sec    18% of CPU core
3 work units per GPU   Not done due to GPU temps

Machine 11681771
GTX 980s
Memory clock 7010
Bit Width 256
Memory Bandwidth 224.3 GB/s
Memory 4096 GDDR5
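
The bandwidth figures for both cards follow from the memory clock and bus width. A small Python check, assuming the quoted "Memory clock" is the effective data rate in MHz (that reading is an assumption, and the function name is illustrative):

```python
def memory_bandwidth_gb_s(effective_clock_mhz, bus_width_bits):
    """Peak bandwidth = effective memory clock x bus width in bytes."""
    bytes_per_transfer = bus_width_bits / 8
    return effective_clock_mhz * bytes_per_transfer / 1000  # MB/s -> GB/s

gtx780 = memory_bandwidth_gb_s(7000, 384)
gtx980 = memory_bandwidth_gb_s(7010, 256)

print(f"GTX 780: {gtx780:.1f} GB/s")
print(f"GTX 980: {gtx980:.1f} GB/s")
```

This gives 336.0 GB/s and 224.3 GB/s, matching the values listed for the two machines.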

BRP6-Beta v1.50                       Run time      CPU time     CPU utilization
1 work/gpu   1 CPU + 1 GPU            ~4965 sec     ~1063 sec    18% of CPU core
2 work/gpu   0.5 CPU + 0.5 GPU        ~7800 sec     ~1100 sec    14% of CPU core
3 work/gpu   0.5 CPU + 0.33 GPU       ~11450 sec    ~1250 sec    11% of CPU core
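
Since run time per task goes up with concurrency, the number that matters is throughput. A rough Python sketch that converts the approximate run times above into tasks per day per GPU (the helper name and the dictionary layout are only for illustration):

```python
SECONDS_PER_DAY = 86_400

def tasks_per_day(run_time_s, tasks_per_gpu):
    """Tasks one GPU completes per day at a given concurrency."""
    return tasks_per_gpu * SECONDS_PER_DAY / run_time_s

# (tasks per GPU, approximate run time in seconds) from the tables above.
results = {
    "GTX 780": [(1, 4220), (2, 6330)],
    "GTX 980": [(1, 4965), (2, 7800), (3, 11450)],
}

for gpu, rows in results.items():
    for concurrency, run_time_s in rows:
        rate = tasks_per_day(run_time_s, concurrency)
        print(f"{gpu} {concurrency}x: {rate:.1f} tasks/day per GPU")
```

On these approximate figures, going from 1x to 2x gains about a third on the GTX 780 and a bit over a quarter on the GTX 980, while going from 2x to 3x on the GTX 980 adds only about 2%.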

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5840
Credit: 109038570431
RAC: 33861136

I have made some significant modifications to the opening post in the RESULTS ONLY thread. If you are in any doubt, please review the guidelines there because I have made clarifications/changes.

I have added extra hosts to that post - now a total of 4 - and have put more effort into the comments for each host. If you don't understand any of the numbers or descriptions/specs posted for each host, please ask here.

I intend to continue adding hosts - I haven't even started on NVIDIA yet :-). Keep an eye on the 'Last modified:' string at the top under the message ID. That will change each time I add new hosts to the list. I used a 2-digit host number because I'll probably end up putting more than 9 configurations in the list. I'm not sure if there's a limit on the maximum size of a post :-).

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5840
Credit: 109038570431
RAC: 33861136

Quote:
HOST 01: 619


Hi robl,

I moved your post to this thread because you haven't actually supplied any BRP6-beta results. Since BRP6 (non-beta) is just likely to be the same as BRP5 plus approximately 33% for the larger tasks, the real purpose of posting results is to identify the improvement of BRP6-beta and to throw light on the expected variability of crunch times.

It would be much appreciated if you would care to repost in the RESULTS thread when you have some BRP6-beta data to report. You need to have a significant sample size (which you should clearly state) so that the values you give are likely to be meaningful.
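
One simple way to report an average together with its sample size and spread is sketched below in Python. The four run times are the long-running v1.50 tasks posted earlier in this thread, used purely as an illustration; a sample of four is far too small to be meaningful, and the `summarize` helper is hypothetical.

```python
from statistics import mean, stdev

def summarize(run_times_s):
    """Return sample size, mean and sample standard deviation in seconds."""
    return {
        "n": len(run_times_s),
        "mean_s": mean(run_times_s),
        "stdev_s": stdev(run_times_s),
    }

# Illustrative sample only: run times (seconds) quoted earlier in the thread.
sample = [14360, 14363, 15212, 16121]
summary = summarize(sample)

print(f"n = {summary['n']}, mean = {summary['mean_s']:.0f} s, "
      f"stdev = {summary['stdev_s']:.0f} s")
```

Stating `n` alongside the mean lets readers judge how much weight the average deserves.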

Thank you.

Cheers,
Gary.

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5840
Credit: 109038570431
RAC: 33861136

Quote:
CPU: Intel Core I7-4790k


Hi Sid,

I moved your post to this thread because you haven't specified GPU concurrency and only 6 tasks reported per host is too few a number from which to calculate a meaningful average. It's fine to report run time only if you wish but please report an average over a much larger number of tasks. If you don't report concurrency, the assumption is that you are only running one task at a time. 03:38:00 seems way too slow for a HD7970 running only a single task.

If you'd like to repost with a larger sample size and clearly stated concurrency, it would be much appreciated.

Thank you for your interest in posting performance data for the beta app.

Cheers,
Gary.
