UPDATE: 09-Mar-2015 UTC There is a new (further optimised - v1.52) beta and an extra line just for it in the template below. If you are submitting results, please keep v1.52 results separate from earlier beta results and use the appropriate line in the table.
PLEASE NOTE: Please read these guidelines.
This is intended to be a Results ONLY thread. Its main purpose is to give hard data on how the new, optimised, BRP6-beta app is performing in relation to either BRP5 or the standard non-optimised BRP6 app. If you don't have a decent sample size of BRP6-beta results, please don't post here.
If you do have such results (say a minimum of around 30 data points but preferably more) here are detailed instructions and a template to make things as easy as possible for you. It's NOT mandatory to use this template and you don't have to fill in every field but there are minimum requirements - which will be enforced. The intended audience includes the Devs, who have already expressed interest in having reliable information like this. I'm particularly interested in creating information on which volunteers can base their decisions about how to make their contribution more efficient and/or effective. I think there may be quite a few people wondering about adding/upgrading a GPU in an existing (perhaps older) rig. Hopefully this stuff may help them.
INSTRUCTIONS - for preparing and submitting results using the template.
1. Use the 'quote' button to reply to this post.
2. From the very top of the text in the text box that appears, delete everything up to and INCLUDING the first line that reads **** Make FIRST CUT here ****. You can do this later if you need to leave these instructions in place whilst you are editing the data in the template below. If you read these notes whilst looking at the template below, hopefully you will understand them better.
3. What immediately follows this FIRST CUT line is what I'm calling the 'template'. Just fill in your data to replace the placeholders that are there. Remove placeholders if you don't have particular items to report.
4. Be very careful with 'data table' type fields between the 'pre' and '/pre' tags. Make sure you replace the placeholder with the same number of characters, with leading or trailing blanks as appropriate. If you don't, the formatting will be stuffed up.
5. If you are happy for others to see your results list and general computer details (NOTE: NO PERSONAL INFORMATION IS VISIBLE) then replace the *** LINK to host *** placeholder in the HOST line with a link to your host.
6. For all the lines that follow until you get to the actual data table just replace *** .... *** with the information you wish to submit about your setup.
7. For the actual data table itself, just replace the '#####' strings with exactly the same number of characters (digits plus leading or trailing blanks as appropriate) to make things line up nicely. I also insert commas to make large numbers more readable - eg 1,234,567 instead of 1234567. You can leave blank any fields for which you don't have data. BRP6-beta data PLUS sample size are mandatory.
8. The COMMENTS section is optional but is useful to document any particular points you noticed and to draw attention easily to them.
9. When finished data editing, immediately below you should find the line that reads **** Make SECOND CUT here ****. Delete that line and everything that follows - likely to be quite a lot :-). To do this, I would click the start of the line to mark it, drag the scrollbar of the text-box to the very end of its travel and shift-click after the very last text you see there. Everything in between the two points gets selected and can be deleted with a single 'backspace'.
If anything in the above list of instructions (or what follows) is not clear, or perhaps even blatantly wrong, please complain in the DISCUSSION thread. I'd like to fix it so it's clear to all who wish to contribute. Suggestions for improvement are very welcome.
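If you prefer a script to a spreadsheet or an online calculator, the statistics and the fixed-width cells described in steps 4 and 7 could be produced along the following lines. This is only a minimal sketch: the function names `summarise` and `cell` and the sample times are invented for illustration, and it assumes the sample standard deviation (n - 1) is wanted, which may differ slightly from what a particular online calculator reports.

```python
import statistics


def summarise(times):
    """Return min, mean, max, sample SD and sample size for times in seconds."""
    return {
        "min": min(times),
        "mean": statistics.mean(times),
        "max": max(times),
        "sd": statistics.stdev(times),  # sample SD, divides by n - 1
        "n": len(times),
    }


def cell(value, width):
    """Round to a whole number, add thousands commas, right-align to width."""
    return f"{round(value):>{width},}"


# Invented placeholder values, NOT real results from any host.
elapsed = [18053, 18190, 18251, 18298]
stats = summarise(elapsed)

# Widths follow the '#' runs in the elapsed-time half of the template.
row = " ".join([
    cell(stats["min"], 7),
    cell(stats["mean"], 7),
    cell(stats["max"], 7),
    cell(stats["sd"], 5),
    cell(stats["n"], 4),
])
print(row)
```

The width passed to `cell` should match the number of '#' characters in the column being filled, so that the table still lines up between the 'pre' tags.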
**** Make FIRST CUT here ****
HOST NN - *** LINK to host ***
[pre]
CPU: *** Specify at least the model -- eg 'Intel Core i3 2120' plus any more you wish - eg '3.30GHz Sandy Bridge' - give speed if overclocked ***
Cores/Threads: *** Specify both numbers if you have a HT capable CPU -- eg 2 / 4 - this makes it clear if HT is on ***
Motherboard: *** Specify at least Manufacturer and Model -- eg 'ASUS P5-KPL AM/PS' or 'Dell xxxx' if proprietary ***
PCIe slot *** Specify PCIe version (generation) and electrical width of the bus -- eg 'V1.x x16' - add more lines if additional slots are in use ***
1st GPU: *** Specify GPU model and RAM size -- eg 'AMD HD7850 2GB' plus speeds if not standard clocks (or if you know them) ***
2nd GPU: -
3rd GPU: -
RAM: *** Specify number, size, speed of each module -- eg '2 x 2GB 1333MHz' as a minimum ***
Concurrency: *** Specify number of GPU tasks in parallel and resource shares (from project or from app_config.xml perhaps) -- eg '4 @ 0.5 CPUs + 0.25 GPUs' ***
CPU Tasks: *** Specify number and type -- eg '2 x FGRP4' or '4 x Seti' or 'none' etc, as appropriate ***
Free CPU cores: *** Specify number -- add (virtual) or something else if Hyper-threading is on to flag this ***
OS: *** Specify the details ***
Driver: *** Specify whatever details you know ***
BOINC Version: *** Specify details -- eg '7.2.42 64bit' ***
Elapsed Time Statistics CPU time Statistics
----------------------------------- ------------------------------ Sample
Search Min Mean Max S.D. Min Mean Max S.D. Size Notes / Comments
====== ======= ======= ======= ===== ====== ====== ====== ==== ====== ================
BRP5 ####### ####### ####### ##### ###### ###### ###### #### #### *** Add something here if you wish ***
BRP6 (non-beta) ####### ####### ####### ##### ###### ###### ###### #### #### *** Add something here if you wish ***
BRP6-beta (Mean +- 1 x SD): 10
Longest CPU time in sample: 1,571
Shortest CPU time in sample: 953
[/pre]
COMMENTS
====####====
HOST 03 - fx_6300-01
[pre]
CPU: AMD FX 6300 Hex core
Threads: 6
PCIe slot x16 Version 1.x (I suspect) (single slot)
1st GPU: AMD HD7850 2GB
2nd GPU: -
3rd GPU: -
RAM: 2 x 4GB
Concurrency: 4 @ 0.67 CPUs + 0.25 GPUs
CPU Tasks: 3 x FGRP4
Free CPU cores: 3
Mean Times - secs Elapsed time Stats CPU time Stats
----------------- --------------------- ------------------- Sample
Search Elapsed CPU part Std Devn Variance Std Devn Variance Size Notes / Comments
====== ======= ======== ======== ======== ======== ======== ====== ================
BRP5 ~19,100 ~2,020 - - - - - Long term averages - little variation
BRP6 25,622 2,641 343 118,068 21 432 33 Stats results from online calculator
BRP6-beta 18,554 1,426 1,415 2,002,159 232 53,816 56 Stats results from online calculator
Extra information for BRP6-Beta results above
---------------------------------------------
Longest elapsed time in sample: 23,770
Shortest elapsed time in sample: 17,694
Longest CPU time in sample: 2,214
Shortest CPU time in sample: 1,302
[/pre]
COMMENTS
====####====
HOST 04 - g3258-01
[pre]
CPU: Intel Pentium Dual Core G3258 (Haswell Refresh)
Threads: 2
PCIe slot x16 Version 2 (single slot)
1st GPU: AMD HD7850 2GB
2nd GPU: -
3rd GPU: -
RAM: 2 x 4GB
Concurrency: 4 @ 0.45 CPUs + 0.25 GPUs
CPU Tasks: 1 x FGRP4
Free CPU cores: 1
Mean Times - secs Elapsed time Stats CPU time Stats
----------------- --------------------- ------------------- Sample
Search Elapsed CPU part Std Devn Variance Std Devn Variance Size Notes / Comments
====== ======= ======== ======== ======== ======== ======== ====== ================
BRP5 ~15,600 ~1,160 - - - - - Long term averages - little variation
BRP6 ~20,200 ~1,540 - - - - ~20 ~33% larger than BRP5 as expected
BRP6-beta 18,092 685 930 854,500 108 11,622 90 Stats results from online calculator
Extra information for BRP6-Beta results above
---------------------------------------------
Longest elapsed time in sample: 20,776
Shortest elapsed time in sample: 16,347
Longest CPU time in sample: 1,188
Shortest CPU time in sample: 604
[/pre]
COMMENTS
====####====
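One way to put hosts 03 and 04 on a single number is daily throughput. The sketch below assumes throughput is roughly the number of concurrent GPU tasks multiplied by 86,400 seconds and divided by the mean elapsed time. That formula and the name `tasks_per_day` are assumptions made for illustration, not something stated in the results, and the estimate ignores any idle gaps between tasks.

```python
SECONDS_PER_DAY = 86_400


def tasks_per_day(concurrency, mean_elapsed_secs):
    """Estimate completed tasks per day when `concurrency` tasks run in parallel."""
    return concurrency * SECONDS_PER_DAY / mean_elapsed_secs


# BRP6-beta mean elapsed times and concurrency taken from the two tables above.
host03 = tasks_per_day(4, 18_554)
host04 = tasks_per_day(4, 18_092)

print(f"HOST 03: {host03:.1f} tasks/day")
print(f"HOST 04: {host04:.1f} tasks/day")
```

On these figures the two HD7850 hosts come out at roughly 18.6 and 19.1 beta tasks per day respectively.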
Please Note: At this point I changed the template to show the stats values of mean, std dev, and variance for each time (CPU and Elapsed) grouped together for easier reference rather than having the mean separated, as previously. I think this will be easier to decipher. Sorry for any inconvenience caused.
HOST 05 - 13_2120-01 - Updated Stats: 11-Mar-2015
[pre]
CPU: Intel Core i3 2120 (Sandy Bridge)
Cores/Threads: 2 / 4
PCIe slot x16 Version 2 (single slot)
1st GPU: AMD HD7850 2GB
2nd GPU: -
3rd GPU: -
RAM: 2 x 4GB
Concurrency: 4 @ 0.5 CPUs + 0.25 GPUs
CPU Tasks: 2 x FGRP4
Free CPU cores: 2 (virtual)
Elapsed Time Statistics CPU time Statistics
----------------------------------- ------------------------------ Sample
Search Min Mean Max S.D. Min Mean Max S.D. Size Notes / Comments
====== ======= ======= ======= ===== ====== ====== ====== ==== ====== ================
BRP5 - ~16,300 - - - ~2,260 - - - Long term averages prior to arrival of BRP6.
BRP6 (non-beta) - *21,400* - - - *2,900* - - - *Inferred value* - from BRP5. All BRP6 crunched with beta app.
BRP6-beta 1.52
BRP6-beta 1.52 18,053 18,214 18,298 60 1,337 1,371 1,403 22 24 First tasks all done with beta 1.52
Note: All beta tasks stats were done in LibreOffice calc after importing directly from the website tasks list.
There are two lines for beta 1.47. I left the previous values when I added the extra points so you could easily see the precise change.
[/pre]
COMMENTS
====####====
Please note: At this point I added a set of instructions and an actual template full of editable placeholders so that it should be very clear as to the details we would like to see in future submissions.
I would like to thank Jeroen and Gavin for their excellent contributions of results. I would also like to thank Gord for pointing out the 'Import from URL' feature in modern spreadsheets which should help people to collect data and calculate the stats.
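For anyone who would rather script the number crunching after importing the tasks list into a spreadsheet, a minimal sketch follows. It assumes the sheet has been saved as CSV with a header row; the column names `run_time` and `cpu_time` and the function name `load_times` are hypothetical, so adjust them to whatever your export actually contains.

```python
import csv
import io


def load_times(csv_text, column):
    """Return one named column as floats, skipping blank cells."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = []
    for row in reader:
        # Strip thousands commas such as "18,053" before converting.
        text = (row.get(column) or "").replace(",", "").strip()
        if text:
            values.append(float(text))
    return values


# Tiny made-up export, only to show the expected shape of the input.
sample = 'run_time,cpu_time\n"18,053","1,337"\n18298,1403\n'

print(load_times(sample, "run_time"))
print(load_times(sample, "cpu_time"))
```

The resulting lists can then be fed to whatever calculates the mean and standard deviation.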
====####====
HOST 06 - hebe
[pre]
CPU: Intel core2 quad Q8400 (Yorkfield) 2.66GHz clocked to 3.04GHz
Cores/Threads: 4 / 4
Motherboard: ASUS P5QPL-AM
PCIe slot PCIe 1.x x16 (single slot)
1st GPU: AMD HD7850 2GB - default clocks: core=900MHz mem=1200MHz temp=73C GPU load=95%
2nd GPU: -
3rd GPU: -
RAM: 2 x 2GB DDR2 800MHz
Concurrency: 4 @ 0.5 CPUs + 0.25 GPUs
CPU Tasks: 2 x FGRP4
Free CPU cores: 2
OS: PCLinuxOS 2014.04 - kernel 3.12.16-pclos3
Driver: fglrx driver and OpenCL libs from Catalyst 13.12
BOINC Version: 7.2.42 64bit
Elapsed Time Statistics CPU time Statistics
-------------------------- --------------------------- Sample
Search Mean Std Dev Variance Mean Std Devn Variance Size Notes / Comments
====== ====== ======= ======== ====== ======== ======== ====== ================
BRP5 ~23,200 - - ~4,800 - - - Estimated long term average
BRP6 ~30,200 - - ~6,500 - - ~12 Small sample size - averages estimated but agree with 1.33 x BRP5
BRP6-beta 20,045 1,675 2,807,166 2,276 680 461,983 97 Stats calculated from website data imported into LibreOffice
Extra information for BRP6-Beta results above
---------------------------------------------
Longest elapsed time in sample: 24,550
Shortest elapsed time in sample: 18,737
Longest CPU time in sample: 4,164
Shortest CPU time in sample: 1,773
[/pre]
COMMENTS
====####====
Please note: At this point I added an extra BRP6-beta 1.52 line to the template to allow for separate reporting of the new optimisations now added to the app.
Cheers,
Gary.
Thanks for creating this thread and sharing your host results. Below, I have shared the results of one of my two hosts dedicated full time to this project. I have not had a chance to work through the list of results for my second host yet.
HOST 01 - Host Details
[pre]
CPU: Intel Core i7 4930K 6-Core CPU (Ivy Bridge Extreme) – 4.0 GHz
Threads: 6 – HT Disabled
PCIe slot x16/x16/x8 PCI Express Version 3.0
1st GPU: AMD HD7970 3GB – PCI-E 3.0 x16 – 1040 GPU Frequency / 1500 MHz GPU Memory Frequency
2nd GPU: AMD HD7970 3GB – PCI-E 3.0 x16 – 1040 GPU Frequency / 1500 MHz GPU Memory Frequency
3rd GPU: -
RAM: 4 x 4GB – 4-channel – 1600 MHz CL9
Concurrency: 4 @ 0.5 CPUs + 0.25 GPUs
CPU Tasks: None
Free CPU cores: 3
OS: Linux Slackware64
Kernel: 3.12.31 compiled with GCC flag –march=core-avx-i
Driver: 14.6 Beta
Mean Times - secs Elapsed time Stats CPU time Stats
----------------- --------------------- ------------------- Sample
Search Elapsed CPU part Std Devn Variance Std Devn Variance Size Notes / Comments
====== ======= ======== ======== ======== ======== ======== ====== ================
BRP5 8,101 3,639 94 8,838 44 1,898 40 Stats from online stats calculator
BRP6 10,009 4,739 41 1,705 246 60,254 18 Application listed as BRP5-opencl-ati for Parkes PMPS XT data
BRP6-beta 9,328 3,067 226 51,256 215 46,221 40 Stats from online stats calculator
Extra information for BRP6-Beta results above
---------------------------------------------
Longest elapsed time in sample: 10,168
Shortest elapsed time in sample: 9,080
Longest CPU time in sample: 3,982
Shortest CPU time in sample: 2,951
[/pre]
COMMENTS
* Since there is still BRP5 data available, the GPUs sometimes run a mix of BRP5 and BRP6-beta tasks. This can cause varying GPU loads depending on which tasks are running.
* The average CPU load is approximately 3 cores, or half the available resources.
* In my testing, there appears to be a fine balance between GPU and CPU frequencies. Setting either one too high or too low can lengthen run times.
* In Linux, driver 14.6 Beta is still the best I have found for the BRP5 project. I have not yet tested BRP6 with the latest driver version.
* I previously set up this host with three 7970 GPUs. For BRP6, there was a substantial production loss per GPU compared to a two-card configuration. I believe this is because one card connects at x8 while the other two cards connect at x16. I plan to retest this configuration with the new BRP6 beta optimized application.
====####====
Jeroen
Following on from the good advice in the discussion thread I have compiled results from the first two of my hosts and will add more as time permits:
HOST 01 - Host Details
[pre]
CPU: Intel Core i5 3330 4-Core CPU (Ivy Bridge) – 3.0 GHz on H61 motherboard
Threads: 4
PCIe slot x16 PCI Express Version 3.0 (single slot)
1st GPU: NVidia GTX660Ti 2GB 1188Mhz (with boost)/1502Mhz memory
2nd GPU: -
3rd GPU: -
RAM: 2 x 8GB – dual-channel – 1600 MHz CL9
Concurrency: 3 @ 0.2 CPUs + 0.33 GPUs
CPU Tasks: None
Free CPU cores: 4
OS: Windows 7 Pro 64 bit
Driver: 340.52
Mean Times - secs Elapsed time Stats CPU time Stats
----------------- --------------------- ------------------- Sample
Search Elapsed CPU part Std Devn Variance Std Devn Variance Size Notes / Comments
====== ======= ======== ======== ======== ======== ======== ====== ================
BRP5 16,903 4,078 491 241,959 417 174,361 16 Stats from online stats calculator
BRP6 21,346 6,527 437 191,315 659 435,173 30 Application listed as BRP5 v1.39 for Parkes PMPS XT data
BRP6-beta 16,432 934 1150 1,323,976 1017 1,035,214 30 Stats from online stats calculator
Extra information for BRP6-Beta results above
---------------------------------------------
Longest elapsed time in sample: 20,531
Shortest elapsed time in sample: 15,362
Longest CPU time in sample: 4,747
Shortest CPU time in sample: 564
[/pre]
COMMENTS
* The results for BRP5 and non beta BRP6 may include tasks that ran as a mixture of the two. BRP6-beta results are ~99.98% pure beta.
* GPU load with BRP6-beta is 98-99% with 84% memory controller load. Current temp at 72C - ~10 degrees hotter than 'normal'.
HOST 02 - Host Details
[pre]
CPU: Intel Pentium G3258 2-Core CPU (Haswell Refresh) – 3.2 GHz on H81 motherboard
Threads: 2
PCIe slot x16 PCI Express Version 2.x (single slot)
1st GPU: AMD R9 280X 3GB 1070/1600Mhz
2nd GPU: -
3rd GPU: -
RAM: 2 x 4GB – dual-channel – 1333 MHz CL9
Concurrency: 4 @ 0.5 CPUs + 0.25 GPUs
CPU Tasks: None
Free CPU cores: 2
OS: Kubuntu 64 bit
Driver: Unknown vintage distro driver (!)
Mean Times - secs Elapsed time Stats CPU time Stats
----------------- --------------------- ------------------- Sample
Search Elapsed CPU part Std Devn Variance Std Devn Variance Size Notes / Comments
====== ======= ======== ======== ======== ======== ======== ====== ================
BRP5 8,890 1,442 111 12,399 11 131 30 Stats from online stats calculator
BRP6 11,618 1,882 248 61,825 9 86 30 Application listed as BRP5 v1.41 for Parkes PMPS XT data
BRP6-beta 8,925 796 708 501,405 216 46,845 100 Stats from online stats calculator
Extra information for BRP6-Beta results above
---------------------------------------------
Longest elapsed time in sample: 12,709
Shortest elapsed time in sample: 8,533
Longest CPU time in sample: 1,936
Shortest CPU time in sample: 714
[/pre]
COMMENTS
* The results for BRP5 and non beta BRP6 may include tasks that ran as a mixture of the two. BRP6-beta results are pure beta.
* GPU load with BRP6-beta is unknown due to lack of working tools. Current temp at 62C - ~2 degrees hotter than 'normal'.
====####====
Gav.
Another one for you :-)
HOST 03 - 4965420
[pre]
CPU: Intel Q6600 - 3.0Ghz
Cores/Threads: 4/4
Motherboard: Asus P5Q
PCIe slot PCI-e v2.x x16
1st GPU: NVidia GTX560Ti 448 1280MB default clocks core= 750Mhz memory= 3900Mhz temp=55C Load=96%
2nd GPU: -
3rd GPU: -
RAM: 2 x2GB 1066Mhz DDR2
Concurrency: 3 @ 0.2 cpus + 0.33 gpus
CPU Tasks: None
Free CPU cores: 4
OS: Kubuntu 14.0 64bit - Kernel Linux 3.13.0-40-generic
Driver: Distro version 331.38
BOINC Version: 7.2.42 64bit
Elapsed Time Statistics CPU time Statistics
-------------------------- --------------------------- Sample
Search Mean Std Dev Variance Mean Std Devn Variance Size Notes / Comments
====== ====== ======= ======== ====== ======== ======== ====== ================
BRP5 ####### ###### ######### ##### ##### ####### #### No data left to refer to...
BRP6 21,100 85 7,234 8,881 88 7,898 70 Stats from online stats calculator
BRP6-beta 13,590 3,895 15,177,136 2,364 4,013 16,107,649 58 Stats from online stats calculator
Extra information for BRP6-Beta results above
---------------------------------------------
Longest elapsed time in sample: 29102
Shortest elapsed time in sample: 9885
Longest CPU time in sample: 18709
Shortest CPU time in sample: 629
[/pre]
COMMENTS
* GPU temps with this host have not changed from pre-beta app values.
* CPU times are considerably better with the beta app but CPU can regularly hit 100% utilisation with CPU temps up 5 - 10C per core.
====####====
Gav.
And another :-)
HOST 04 - 3518088
[pre]
CPU: Intel Pentium G3258 2-Core CPU (Haswell Refresh) – 3.6 GHz
Cores/Threads: 2/2
Motherboard: Asus Z97-K
PCIe slot PCI-e v3 x16
1st GPU: AMD HD7970 3GB core= 1100Mhz memory= 1500Mhz temp=62C Load=86%
2nd GPU: -
3rd GPU: -
RAM: 2 x4GB 1866Mhz DDR3 CL9
Concurrency: 4 @ 0.5 cpus + 0.25 gpus
CPU Tasks: None
Free CPU cores: 2
OS: Kubuntu 14.0 64bit - Kernel Linux 3.13.0-44-generic
Driver: AMD Catalyst 14.9
BOINC Version: 7.2.42 64bit
Elapsed Time Statistics CPU time Statistics
-------------------------- --------------------------- Sample
Search Mean Std Dev Variance Mean Std Devn Variance Size Notes / Comments
====== ====== ======= ======== ====== ======== ======== ====== ================
BRP5 8,021 137 19,021 2,618 216 47,084 57 Stats from online stats calculator
BRP6 10,436 81 6,700 3,152 19 363 61 Stats from online stats calculator
BRP6-beta 8,876 533 284,414 2,465 363 131,876 207 Stats from online stats calculator
Extra information for BRP6-Beta results above
---------------------------------------------
Longest elapsed time in sample: 11875
Shortest elapsed time in sample: 7817
Longest CPU time in sample: 4460
Shortest CPU time in sample: 1944
[/pre]
COMMENTS
====####====
Gav.
OK just two more results for you, it's time for me to let someone else have a go!!! :-)
HOST 05 - 6109304
[pre]
CPU: Intel core i7 3770K (Ivy Bridge) – 3.5 GHz
Cores/Threads: 4/8 HT enabled
Motherboard: Asus Z77-V-LX
PCIe slot PCI-e v3 x16
1st GPU: AMD R9 280X 3GB core= 1070Mhz memory= 1600Mhz temp=60C Load=98%
2nd GPU: -
3rd GPU: -
RAM: 2 x4GB 1600Mhz DDR3 CL9
Concurrency: 4 @ 0.5 cpus + 0.25 gpus
CPU Tasks: None
Free CPU cores: 8 logical
OS: Kubuntu 14.0 64bit - Kernel Linux 3.13.0-43-generic
Driver: AMD Catalyst 14.9
BOINC Version: 7.2.42 64bit
Elapsed Time Statistics CPU time Statistics
-------------------------- --------------------------- Sample
Search Mean Std Dev Variance Mean Std Devn Variance Size Notes / Comments
====== ====== ======= ======== ====== ======== ======== ====== ================
BRP5 7,551 294 86,810 2,936 159 25,389 90 Stats from online stats calculator
BRP6 10,006 148 22,015 3,682 39 1,577 15 Stats from online stats calculator
BRP6-beta 8,675 461 212,546 2,739 337 113,822 243 Stats from online stats calculator
Extra information for BRP6-Beta results above
---------------------------------------------
Longest elapsed time in sample: 10828
Shortest elapsed time in sample: 7778
Longest CPU time in sample: 5420
Shortest CPU time in sample: 2386
[/pre]
COMMENTS
* GPU temps increased 2 - 3C with beta app.
* GPU load increased ~6% with beta app.
====####====
HOST 06 - 5116827
[pre]
CPU: Intel core i5 3570K (Ivy Bridge) – 4.2 GHz
Cores/Threads: 4/4
Motherboard: Asus Z68-V-gen3
PCIe slot PCI-e v3 x16
1st GPU: AMD R9 280X 3GB core= 1100Mhz memory= 1600Mhz temp=61C Load=99%
2nd GPU: -
3rd GPU: -
RAM: 2 x4GB 1866Mhz DDR3 CL9
Concurrency: 3 @ 0.2 cpus + 0.33 gpus
CPU Tasks: None
Free CPU cores: 4 logical
OS: Windows 7 Ultimate 64Bit
Driver: AMD Catalyst 13.11-beta-6
BOINC Version: 7.4.36 64bit
Elapsed Time Statistics CPU time Statistics
-------------------------- --------------------------- Sample
Search Mean Std Dev Variance Mean Std Devn Variance Size Notes / Comments
====== ====== ======= ======== ====== ======== ======== ====== ================
BRP5 6,292 123 15,169 932 17 300 64 Stats from online stats calculator
BRP6 8,195 339 114,939 1,176 16 275 25 Stats from online stats calculator
BRP6-beta 6,795 463 214,536 771 75 5686 252 Stats from online stats calculator
Extra information for BRP6-Beta results above
---------------------------------------------
Longest elapsed time in sample: 8995
Shortest elapsed time in sample: 4368
Longest CPU time in sample: 1211
Shortest CPU time in sample: 697
[/pre]
COMMENTS
* GPU temps increased 5 - 8C with beta app.
* GPU load increased ~7% with beta app.
====####====
Gav.
Here's some data from my host and, to emphasize the variable run times, I've also added 2 graphs at the end of the post for run time and CPU time.
HOST 01 - 11468519
[pre]
CPU: Intel Core i7 3770K @ 4.2GHz
Cores/Threads: 4/8 - HT on
Motherboard: MSI Z77A-GD65
PCIe slot V3.0x16
1st GPU: Nvidia 660Ti 2GB - , (6508MHz effective), temp ~65C @ 80% fan speed, load ~90%, mem controller ~75%
2nd GPU: Intel HD4000 - used to run BRP4 tasks.
RAM: 2 x 8GB Corsair 1600MHz
Concurrency: 2 @ 0.5 CPUs + 0.5 GPUs, 1 @ 1.0 CPUs + 1.0 GPUs for the HD4000
CPU Tasks: 5 FGRP4 and/or S6Bucket Follow-up as Boinc and the project see fit.
Free CPU cores: 1 CPU core free
OS: Win 7 HP 64-bit
Driver: Nvidia 347.52, Intel 10.18.10.4061
BOINC Version: 7.4.36 64-bit
Elapsed Time Statistics CPU time Statistics
---------------------------------- ------------------------------------ Sample
Search Min Mean Max Std Dev Min Mean Max Std Dev Size Notes / Comments
====== ====== ====== ====== ======= ======= ====== ====== ======= ====== ================
GPU-BRP5 13456 14874 19130 1696 4666 5629 7356 886 13 Data from the online database, small sample size.
GPU-BRP6 10797 18690 25095 2119 4224 7543 11054 1351 48 Data from the online database
GPU-BRP6b 8058 12268 34133 4585 1042 3356 24500 4597 64 Data from the online database, decent sample size.
[/pre]
COMMENTS
* Beta tasks may have run together with BRP4G or BRP5 tasks, but the majority have been paired with another beta task.
* Check out the graphs below to get a feel for how beta tasks vary in length.
* The dots in the graphs match each other so that e.g. the 2nd dot from the left shows both the run time and cpu time for the same task.
* Otherwise there's no real sorting in the graphs.
* Post edited to correct some finger fumbles.
Below is an update of my host with BRP6 v1.52 data, graphs, and a three-GPU configuration. I selected sample ranges from older data matching concurrency and GPU quantity.
HOST 01 - Host Details
[pre]
CPU: Intel Core i7 4930K 6-Core CPU (Ivy Bridge Extreme) – 4.0 GHz
Threads: 6 – HT Disabled
PCIe slot x16/x16/x8 PCI Express Version 3.0
1st GPU: AMD HD7970 3GB – PCI-E 3.0 x16 – 1040 GPU Frequency / 1500 MHz GPU Memory Frequency
2nd GPU: AMD HD7970 3GB – PCI-E 3.0 x16 – 1040 GPU Frequency / 1500 MHz GPU Memory Frequency
3rd GPU: AMD HD7970 3GB – PCI-E 3.0 x8 – 1040 GPU Frequency / 1500 MHz GPU Memory Frequency
RAM: 4 x 4GB – 4-channel – 1600 MHz CL9
Concurrency: 3 @ 0.5 CPUs + 0.25 GPUs
CPU Tasks: None
Free CPU cores: 1
OS: Linux Slackware64
Kernel: 3.12.31 compiled with GCC flag –march=core-avx-i
Driver: 14.6 Beta
Mean Times - secs Elapsed time Stats CPU time Stats
----------------- --------------------- ------------------- Sample
Search Elapsed CPU part Std Devn Variance Std Devn Variance Size Notes / Comments
====== ======= ======== ======== ======== ======== ======== ====== ================
BRP5 v1.39 7,464 4,247 883 779,568 376 141,264 67 Stats from online stats calculator
BRP6 v1.41 9,170 5,848 2,196 4,822,322 557 310,743 29 Stats from online stats calculator
BRP6 v1.52 7,066 3,170 72 5,232 38 1,435 188 Stats from online stats calculator
Extra information for BRP6-Beta results above
---------------------------------------------
Longest elapsed time in sample: 7,184
Shortest elapsed time in sample: 6,715
Longest CPU time in sample: 3,285
Shortest CPU time in sample: 3,002
[/pre]
COMMENTS
* PCI-E bandwidth appears to have much less of an impact on elapsed time for BRP6 v1.52 than before. In the graphs below, the spikes in elapsed time with older applications were from the single card connecting at the slower link speed.
I limited the sample size to 60 for the graphs below. The samples graphed are part of the same data set used above.
====####====
@Gary An update with 1.52 (please remove the original below)
Splitting the data into GPU0 and GPU1 is easy when they perform radically differently and the tasks are very uniform.
Not so now with the beta; I had to extract the data manually and filter it. This will probably be the last time I will do it, and I would not ask others to do it.
I have used the log files and some basic awk scripts to filter out 100 recent BRP5 tasks. I have tried to pick a range of time where the system has been stable running at 0.33.
I had been trying to run BRP5 as GPU 0.25 for a few days but that was proving a little unreliable on GPU0, so I reverted to 0.33 and started the beta at about the same time.
HOST NN - 4918234
[pre]
CPU: Intel(R) Core(TM) i3 CPU 530 @ 2.93GHz [Family 6 Model 37 Stepping ***1
Cores/Threads: 2/4
Motherboard: Gigabyte GA-H55M-UD2H
PCIe slot PCIEX16 v2.0m PCIX4 ***2
1st GPU: nVidia GTX-460 768MB (MSI) PCIe v2.0 x16 - system monitor
2nd GPU: nVidia GTX-460 768MB (MSI) PCIe v1.0 x4 - no monitor
3rd GPU: -
RAM: 2 x 2GB DIMM 1333 MHz (0.8 ns)
Concurrency: 3 tasks per GPU (ie share 0.33 GPU)
CPU Tasks: none
Free CPU cores: 4
OS: Ubuntu 2.6.32 ***1 ***7
Driver: 337.12 ***3
BOINC Version: 6.10.17 ***1 ***7
Elapsed Time Statistics CPU time Statistics
---------------------------------- ------------------------------------ Sample
Search Min Mean Max Std Dev Min Mean Max Std Dev Size Notes / Comments
====== ====== ====== ====== ======= ======= ====== ====== ======== ====== ================
GPU0-BRP5 20492 24022 26095 629 4975 5741 6262 225 70 ***5
GPU0-BRP6 20759 30381 34598 4007 7345 9897 14314 1749 15
GPU0-BRP6b 17151 23132 37241 5831 1089 4362 14559 4934 8 Two very large outliers - Maximums. ***4
GPU1-BRP5 54792 57800 58537 717 12273 13385 13978 325 30 ***5
GPU1-BRP6 71021 76565 77692 1973 20072 21210 21601 434 9
GPU1-BRP6b 8232 21225 23423 5305 1080 1162 1237 59 7 Suspect outlier minimum ET may have occurred when running x1 ***6
[Update]
GPUS-1.52 19089 23318 25483 988 947 1011 1195 57 33 Figures are 2 GPUs consolidated, both are completing within 5% of the other. ***8
[/pre]
COMMENTS
* ***2 PCIe slot config needed PER GPU CARD.
* ***3 can be obtained from a task
* ***4 Two tasks (PM0008_02821_54_1, PM0008_01021_260_0) were validated quicker by a BRP6, not a BRP6b.
* ***5 Selected my last 100 PB00* BRP5 Tasks running at x3 0.33GPU
* ***6 Note no matching outlier in CPU time - only ET.
* ***7 Don't laugh - while you've been updating I've been crunching
* ***8 Note std deviation has dropped - almost a standard candle.
* GPU0 temp up 5C (to 63C) , GPU1 10C (to 57C).
* PCI bandwidth usage down to under 8% on GPU0 and under 5% on GPU1; previously always over 40%, often 90%+
* Resident memory was higher for the beta; now back to nominal 64-80MB
* No overclocking
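With samples as small as the 7 or 8 beta tasks per GPU above, one extreme value shifts both the mean and the standard deviation considerably. A rough way to flag suspect points is to measure distance from the median rather than the mean. The sketch below is an assumption about how one might do that, not how the figures above were produced: the function name `flag_outliers`, the cut-off `k=3.0` and the sample list are all illustrative, and only the 8232 value is taken from the table.

```python
import statistics


def flag_outliers(times, k=3.0):
    """Return values more than k scaled MADs away from the median."""
    med = statistics.median(times)
    # Median absolute deviation: a spread estimate that resists outliers.
    mad = statistics.median([abs(t - med) for t in times])
    if mad == 0:
        return []  # no spread to compare against
    # 1.4826 scales the MAD to match the SD for normally distributed data.
    limit = k * 1.4826 * mad
    return [t for t in times if abs(t - med) > limit]


# Made-up elapsed times around 21,000 s plus the reported 8232 s minimum.
sample = [21000, 21200, 21100, 21300, 20900, 21150, 8232]
print(flag_outliers(sample))
```

A flagged value is only a prompt to look at that task more closely, for example to check whether it ran alone on the GPU.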
HOST 01 - Stoll8
[pre]
CPU: i3-4130 stock clocked at 3.4 GHz
Cores/Threads: 2/4
Motherboard: ASROCK Z87 EXTREME3 LGA1150
PCIe slot V 3.0 X16
1st GPU: EVGA GeForce GTX 970 Superclocked ACX 2.0 4GB GDDR5 256bit, overclocked see comments
RAM: 2 X 4 GB see comments for speeds
Concurrency: 3 @ 0.2C + 0.33 NV
CPU Tasks: 2 X FGRP4
Free CPU cores: 2 (virtual)
OS: Microsoft Windows 7 Home Premium x64 Edition, Service Pack 1, (06.01.7601.00)
Driver: 344.60
BOINC Version: 7.4.36
Elapsed Time Statistics CPU time Statistics
----------------------------------- ------------------------------ Sample
Search Min Mean Max S.D. Min Mean Max S.D. Size Notes / Comments
====== ======= ======= ======= ===== ====== ====== ====== ==== ====== ================
BRP6 (non-beta) 15819 16126 17328 248 6102 6270 6540 90 43 all version 1.39 Parkes PMPS
BRP6-beta <1.52 8164 10387 26960 1872 1024 1571 16634 1883 137 all version 1.50--1.47 could not run on this host
BRP6-beta 1.52 9009 10124 13035 661 959 1055 1673 120 72 wonderful improvement over 1.39
[/pre]
COMMENTS
* CPU-Z reports the RAM to be running at 799.6 MHZ with 9-9-9-24-256-2T timings
* nominally two "free" virtual CPU cores, but tampered with by Process Lasso affinity to hold one side of each physical core free
* but the support CPU job allowed to use any virtual core at any time
* GTX 970 overclocked using nVidia Inspector to memory clock=3899, core clock == 1427 as reported by MSI Afterburner
* invalid rate negligible on 1.39, roughly 5% on 1.50/1.52 until core clock overclock slightly reduced
====####====
Here's an update with data for v1.52 added. The graphs at the end of the post only contain data for the beta versions, for a comparison with the non beta versions please see my earlier post in this thread.
HOST 01 - 11468519
[pre]
CPU: Intel Core i7 3770K @ 4.2GHz
Cores/Threads: 4/8 - HT on
Motherboard: MSI Z77A-GD65
PCIe slot V3.0x16
1st GPU: Nvidia 660Ti 2GB - , (6508MHz effective),
temp ~65C @ 80% fan speed, load ~90%, mem controller ~75%, bus interface load 3%
2nd GPU: Intel HD4000 - used to run BRP4 tasks.
RAM: 2 x 8GB Corsair 1600MHz
Concurrency: 2 @ 0.5 CPUs + 0.5 GPUs, 1 @ 1.0 CPUs + 1.0 GPUs for the HD4000
CPU Tasks: 5 FGRP4 and/or S6Bucket Follow-up as Boinc and the project see fit.
Free CPU cores: 1 CPU core free
OS: Win 7 HP 64-bit
Driver: Nvidia 347.52, Intel 10.18.10.4061
BOINC Version: 7.4.36 64-bit
Elapsed Time Statistics CPU time Statistics
---------------------------------- ------------------------------------ Sample
Search Min Mean Max Std Dev Min Mean Max Std Dev Size Notes / Comments
====== ====== ====== ====== ======= ======= ====== ====== ======= ====== ================
BRP5 13456 14874 19130 1696 4666 5629 7356 886 13 Data from the online database
BRP6 v1.39 10797 18690 25095 2119 4224 7543 11054 1351 48 Data from the online database
BRP6 v1.50 8058 12268 34133 4585 1042 3356 24500 4597 64 Data from the online database
BRP6 v1.52 10422 11164 13256 540 858 1262 1821 216 68 Data from the online database
[/pre]
COMMENTS
* BRP6 tasks are about 33% bigger than BRP5 tasks but only take about 75% of the time to run. That's just amazing! Thank you to HB et al. for taking the time to implement these optimizations.
* Before BRP6 and the release of the beta apps this host hovered at or just above 50,000 RAC; it's now at 65,000 and still climbing.
* I'm not sure why there's a blip in the curve but I think it corresponds to my iGPU running 2 BRP6 tasks. No more BRP6 tasks have been issued to the iGPU since then. It could also be that I used the machine for something else at the time.
* I should also state that I use Process Lasso to change the priority of the CPU helper to "Above normal" for the BRP6 tasks and "High" for the BRP4 iGPU tasks. I do not change the CPU affinity.
* I'll continue to collect data in anticipation of a new CUDA 5.5 app so I can make comparisons if and when it's released.
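The "about 75% of the time" figure can be checked directly from the mean elapsed times in the table, and combined with the task size to estimate the gain per unit of work. The sketch below treats the 33% size difference quoted above as a flat factor of 1.33, which is an approximation, and the function names are made up for illustration.

```python
def time_ratio(new_mean_secs, old_mean_secs):
    """Fraction of the old run time that the new app needs per task."""
    return new_mean_secs / old_mean_secs


def work_adjusted_speedup(size_factor, ratio):
    """Speed-up per unit of work when tasks are size_factor times larger."""
    return size_factor / ratio


brp5_mean = 14_874      # BRP5 mean elapsed time from the table, seconds
beta_152_mean = 11_164  # BRP6 v1.52 mean elapsed time from the table, seconds
size_factor = 1.33      # BRP6 tasks taken as ~33% bigger than BRP5 (approximate)

ratio = time_ratio(beta_152_mean, brp5_mean)
speedup = work_adjusted_speedup(size_factor, ratio)

print(f"v1.52 needs {ratio:.0%} of the BRP5 elapsed time per task")
print(f"work-adjusted speed-up: about {speedup:.2f}x")
```

On this host that works out to 75% of the elapsed time for a task that is about a third bigger, or roughly 1.77 times as much work per second as the old BRP5 app.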