Task taking 1 CPU plus 1 GPU ?

Matthew S. McCleary

Joined: 5 May 09

Posts: 6

Credit: 183078

RAC: 0

RE: These are dominated by

27 Jan 2010 20:42:40 UTC

Message 96735 in response to message 96734

Quote:

These are dominated by the AMD ATI graphics cards as the nVidia ones are so, relatively, slow.

It's not so much that the ATI cards are faster than nVidia's, but rather the ATI cards have so many more stream processors. Even the entry-level ATI cards have 800 stream processors whereas nVidia's GTX295 tops out in the 400 range. Thus the overall throughput is remarkably better with ATI cards than nVidia.

Hopefully the 300-series from nVidia, due real soon now, will rectify the situation.

Ver Greeneyes

Joined: 26 Mar 09

Posts: 140

Credit: 9562235

RAC: 0

RE: Hopefully the

27 Jan 2010 22:58:13 UTC

Message 96736 in response to message 96735

Quote:

Hopefully the 300-series from nVidia, due real soon now, will rectify the situation.

At the very least, their Fermi architecture looks very sensibly designed, and should be easily extensible. Only time will tell if they can catch up to AMD/ATI's impressive current performance, though. Although it may not help out Einstein@Home specifically, I do hope they move the front-line of performance to double/extended precision. Exclusively single precision does not belong in the realm of GPGPU.

DanNeely

Joined: 4 Sep 05

Posts: 1364

Credit: 3562358667

RAC: 0

RE: RE: These are

27 Jan 2010 23:26:31 UTC

Message 96737 in response to message 96735

Quote:

Quote:

These are dominated by the AMD ATI graphics cards as the nVidia ones are so, relatively, slow.

It's not so much that the ATI cards are faster than nVidia's, but rather the ATI cards have so many more stream processors. Even the entry-level ATI cards have 800 stream processors whereas nVidia's GTX295 tops out in the 400 range. Thus the overall throughput is remarkably better with ATI cards than nVidia.

Hopefully the 300-series from nVidia, due real soon now, will rectify the situation.

It's not the number of cores that's the main difference; if it were, the 4870 would be about 3-4x as fast as the 285, whereas the two are similar in performance as GPUs. ATI's advantage in distributed computing is largely driven by the fact that only 1/3rd of the SPs in the 8xxx, 9xxx, and 2xx series have double precision floating point support (something not needed much, if at all, in rendering graphics), while all of the SPs in ATI cards from the 38xx forward have been capable of DP calculations.

Nothing I've seen about Fermi has mentioned whether all its SPs will support DP or not; but between ATI cleaning their clocks among credit whores and NV's emphasis on supercomputing with the chip, I'd be shocked if they don't.

NV has significantly improved the cache structure on Fermi. This won't matter much for simplistic programs like Collatz, which run zillions of completely independent threads. But for more general-purpose apps where all the threads are working on the same data, or where the threads don't have extremely linear memory access (e.g. S5R6), the reduction in time needed to communicate results and access data should result in major improvements in capability.
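
To make the "completely independent threads" point concrete, here is a minimal sketch in Python. It is not the Collatz project's actual GPU kernel; the function names and the worker count are made up for illustration. Each starting value is processed without reading or writing anything another task touches, so no caching or communication between tasks is needed. Python threads won't give a real speedup here; the point is only the shape of the workload.

```python
from concurrent.futures import ThreadPoolExecutor


def collatz_steps(n: int) -> int:
    """Count the 3x+1 steps needed to bring n down to 1 (hypothetical helper)."""
    steps = 0
    while n != 1:
        # Odd: 3n + 1. Even: n / 2.
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    return steps


def steps_for_range(start: int, stop: int, workers: int = 4) -> list[int]:
    """Run every starting value in [start, stop) as its own independent task."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # No task depends on another task's result or data, so the work can be
        # split across any number of workers without coordination.
        return list(pool.map(collatz_steps, range(start, stop)))
```

For example, steps_for_range(1, 6) handles the starting values 1 through 5 as five unrelated tasks. An app like S5R6, by contrast, has threads working over shared data, which is where the better cache should pay off.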

J Langley

Joined: 30 Dec 05

Posts: 50

Credit: 58338

RAC: 0

nVidia is claiming Fermi DP

28 Jan 2010 8:05:20 UTC

Message 96738 in response to message 96737

nVidia is claiming Fermi DP performance is 4 x that of GT200: see http://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf

Fred J. Verster

Joined: 27 Apr 08

Posts: 118

Credit: 22451438

RAC: 0

RE: nVidia is claiming

5 Feb 2010 23:39:44 UTC

Message 96739 in response to message 96738

Quote:

nVidia is claiming Fermi DP performance is 4 x that of GT200: see http://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf

Hi, almost 2 months ago I got an ATI HD5770 and tried it on Collatz Conjecture (formerly 3x+1).
I'm amazed at how well those (single precision) cards can compute.
Especially when the data fits in the video memory, little communication over the PCIe x16/x8 bus is necessary. These tasks run completely on the GPU.
The CPU, I believe, is used to copy the data to the GPU.
