I've been seeing Astrocrab post BRP completion times on various video cards (most recently 7970s) that are vastly superior to mine. For example, this system of mine:
- AMD FX-6100
- 8 GB DDR3
- 2x 7970 @ PCI-E 2.0 16x/16x (if I understand the capabilities of the AMD chipset correctly and as reported by GPU-Z)
- Windows 8
Running 4 simultaneous BRP-5 tasks per card, I'm getting completion times at almost exactly the 6-hour (21600 seconds) mark, or 5400 seconds per task. Astrocrab's numbers on a Linux machine with a 3570K and dual 7970s are running a little in excess of 3000 seconds per unit.
I know the CPU isn't ideal (it's what I had at the time), but completion times seem unaffected by the number of cores free and scale normally if I try more or fewer tasks, so as far as I can tell it is feeding the cards what they need. My understanding of this CPU / motherboard tells me both cards are running at PCI-E 2.0 x16 each, which means they're being throttled somewhat by not having 3.0 x16, but I can't believe this would account for a 40% difference.
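To put a number on that gap, here is a minimal Python sketch of the arithmetic, using only the figures quoted above (21600 seconds for 4 concurrent tasks, and roughly 3000 seconds per task on the other machine). The function and variable names are illustrative, and the 3000-second figure is approximate.

```python
def seconds_per_task(wall_seconds, concurrent_tasks):
    """Effective time per task when several tasks share one GPU."""
    return wall_seconds / concurrent_tasks

# Figures from the post: 4 tasks per card finish in about 6 hours.
mine = seconds_per_task(21600, 4)

# Approximate per-task time reported for the Linux / 3570K machine.
reference = 3000.0

# Two ways to express the same gap.
extra_time = (mine - reference) / reference  # how much longer my tasks take
time_saved = (mine - reference) / mine       # how much less time the reference needs

print(f"mine: {mine:.0f} s per task, reference: {reference:.0f} s per task")
print(f"my tasks take {extra_time:.0%} longer")
print(f"the reference needs {time_saved:.0%} less time")
```

Depending on which machine is taken as the baseline, the same two numbers read as either an 80% or a roughly 44% difference.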
I do have a good Ivy Bridge motherboard available to me. It's got an i3 on it now, so I'd have to upgrade to an i5 to get access to PCI-E 3.0, but the motherboard would only support 8x/8x with two cards, and my understanding is that the 3570K (for example) only supports 16 lanes anyway. 3.0 @ 8x would be about the same as 2.0 @ 16x.
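As a rough check on the 3.0 @ 8x versus 2.0 @ 16x comparison, the sketch below computes theoretical one-direction link bandwidth from the published per-lane signalling rates (5 GT/s with 8b/10b encoding for PCI-E 2.0, 8 GT/s with 128b/130b encoding for 3.0). It ignores packet and protocol overhead, so real transfer rates will be lower, and the helper name is made up for illustration.

```python
# Per-lane signalling rate (GT/s) and line-encoding efficiency per generation.
PCIE_GENERATIONS = {
    "2.0": (5.0, 8 / 10),     # 8b/10b encoding
    "3.0": (8.0, 128 / 130),  # 128b/130b encoding
}

def pcie_bandwidth_gb_s(generation, lanes):
    """Theoretical bandwidth in GB/s, one direction, before protocol overhead."""
    rate_gt_s, efficiency = PCIE_GENERATIONS[generation]
    # Each transfer moves one bit per lane; divide by 8 to convert bits to bytes.
    return rate_gt_s * efficiency * lanes / 8

gen2_x16 = pcie_bandwidth_gb_s("2.0", 16)
gen3_x8 = pcie_bandwidth_gb_s("3.0", 8)
gen3_x16 = pcie_bandwidth_gb_s("3.0", 16)

print(f"2.0 x16: {gen2_x16:.2f} GB/s")
print(f"3.0 x8:  {gen3_x8:.2f} GB/s")
print(f"3.0 x16: {gen3_x16:.2f} GB/s")
```

On these figures 2.0 x16 and 3.0 x8 come out within about 2% of each other, while 3.0 x16 is roughly double either one.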
Any thoughts as to what's going on? PCI lanes, CPU, operating system?
Alec
P.S. As another data point, when we were doing BRP-4, my optimal setup was 4 per card and they were coming in at 2400 seconds, or 600 each, which was about 20% slower than other times I'd seen reported for the same card.
Is Linux faster for BRP work than Windows?
Four tasks per GPU, so 8 tasks in total, should be too many for your CPU, and I'm a bit surprised that running fewer tasks didn't improve the runtime. The rule of thumb for the BRP AMD/OpenCL app is that there should not be more tasks than the CPU can handle as parallel threads.
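The rule of thumb can be written down as a quick check. This is only a sketch: the function names are invented for illustration, and the value of 6 hardware threads for the FX-6100 is an assumption based on it being a six-core part, not something measured on this host.

```python
def exceeds_rule_of_thumb(tasks_per_gpu, gpu_count, cpu_threads):
    """True when the total number of GPU tasks is larger than the CPU thread count."""
    return tasks_per_gpu * gpu_count > cpu_threads

def max_tasks_per_gpu(cpu_threads, gpu_count):
    """Largest per-GPU task count that keeps total tasks within the thread count."""
    return max(1, cpu_threads // gpu_count)

cpu_threads = 6  # assumption: FX-6100 exposes six hardware threads
gpu_count = 2    # two 7970s, as in the original post

# Current setup from the post: 4 tasks per card, 8 in total.
over_limit = exceeds_rule_of_thumb(4, gpu_count, cpu_threads)
suggested = max_tasks_per_gpu(cpu_threads, gpu_count)

print(f"4 tasks per GPU exceeds the rule of thumb: {over_limit}")
print(f"suggested tasks per GPU: {suggested}")
```

With those assumed numbers the check flags the 4-per-card setup and suggests 3 tasks per card.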
Cheers
HB
Removing one card and doing only 3 simultaneous (based on FX-6100 having 3 floating point units) resulted in a 10-15% increase in credits / hour, so that's part of it. Perhaps the rest is 3.0 > 2.0.
I guess for now I will just accept reduced performance until I can get an i7. Thank you.
A 7970 can handle even 8-10 WUs at once with 99% load. Try it!
RE: 7970 can handle even
YOW... 32GB of DDR3 and that processor may have something to do with the 70K RAC too, Astro.
RE: RE: 7970 can handle
The 32GB means nothing in Einstein, and Astrocrab is only running GPU WUs, so the 72k RAC is GPU only. Running CPU WUs wouldn't add much to the RAC anyway...
Plus there is major overclocking on the i5-3570K's...
Gord