The cruel truth is that two ABP2 tasks just would not fit on your video card at the same time, memory wise.
CU
HB
Will the new Einstein app support Fermi and OpenCL? What are the requirements for future apps? Is double precision required? And as I've seen in an earlier post, memory could also be important.
The fact that only part of the computation is done on the GPU limits the speedup you can expect for this app. If you have a very fast CPU and a low-end video card (those selling for around 50 $ for example), the GPU app might even run slower than the CPU version!
RE: The cruel truth is
How much memory is required to run two CUDA apps at a time? Would a 1024MB card be sufficient? I'm leaning strongly toward a GTX460, which is available as a 768MB and as a 1GB card.
The GPUGRID-Forum says:
GTX 460 GF104 40nm Compute Capable 2.1 907 BoincGFlops peak (768MB)
GTX 460 GF104 40nm Compute Capable 2.1 907 BoincGFlops peak (1GB)
This is ~50% faster than my GTX260, draws less power and sells for about €205. But would it satisfy my wishes for Einstein?
Regards,
Alexander
RE: Will the new
Someone tested the current ABP2 app on Fermi and found it worked. I guess Fermi support will be high on the agenda for all the CUDA-enabled BOINC projects and should not be that hard to achieve.
Bernd and Oliver have stated in the past that they are looking into OpenCL.
For the ABP2 pulsar search, which currently has CUDA support already, double precision is not required. The GC1 app AFAIK also does not require double precision in the most performance-critical parts, but is probably harder to rewrite for GPUs.
Memory: well, more is always better. Having said that, the next generation of ABP2 CUDA apps will put far more load on the GPU than the current one, so running more than one app instance per GPU will then no longer make sense (a single task will already saturate the GPU almost fully). Still, I'd go for the 1GB version.
Happy crunching
HB
RE: ... Still, I'd go for
I'd go for a 512MB version ... :-)
... but it would be an ATI 4850 picked up on ebay for maybe around $70 ...
... and I'd stick it on Milkyway and watch the tasks roll past every 205 seconds ...
... hang on a minute -- I'm already doing that!! :-).
Buying an overpriced and underperforming GPU to run on the project that currently makes quite inefficient use of it seems like a bit of a waste.
If E@H does develop an OpenCL app down the track, it might completely change things. While the current app is still so inefficient, it would be sensible to wait and see what develops. If you really want to get a taste for GPU crunching, you'd be better off trying a cheap ATI (double precision) on MW. At least all your CPU cores would still be going full steam on E@H.
Cheers,
Gary.
RE: RE: ... Still, I'd go
Hi Gary,
You're right, it's a kind of waste; this is why I'm looking for a way to use my GPU more efficiently.
And yes, you're right, looking at MW@Home, the crunching speed is outstanding. I have 2 GPUs doing that.
But the most important thing is to participate in scientific work, to understand more about our world and to increase our knowledge. In my case it's the microcosmos and the macrocosmos that are of the highest interest. And Einstein fits into my view of the world. Credits come only in second (or third) place.
Unfortunately OpenCL came very late and it's not working correctly right now. If you're interested in more info, check the GPUGRID forum. Programming new apps in STREAM makes no sense; it's a limitation and it's outdated (as posted on GPUGRID).
If one compares GFLOPS per € or $, or GFLOPS per watt, there is a clear preference for ATI. And if you take a closer look at the overview of graphics cards in the GPUGRID forum, you find that a lot of cards will work, a lot of cards MIGHT work (like my GTX260), and a lot of cards don't work. But as long as programmers write apps in CUDA and you want to participate in these projects, you need to have an Nvidia card.
If you take a look at the CAL (STREAM) apps, you find MW, Collatz, Seti Astropulse and DNETC.
DNETC is more a game than a scientific app (if you want your data decrypted, don't encrypt it), Seti WUs are very rare, MW is very active, and if it has no WUs most of the MW people crunch Collatz.
I'm very sure that a lot of people would switch to other apps if they could use their ATI cards. Speaking for myself, I would give 50% of my GFLOPS to other (astrophysics) apps. And I have 5 GPUs, two of them running 6/16.
From my point of view, that is a clear preference for OpenCL.
@Bikeman
Thanks again for your effort. Presumably you're right: an optimized app should make at least 90% use of the GPU, and then there is no need to run two apps at the same time. But until this is implemented ...
I hope Bernd and Oliver will use OpenCL! Send them my best wishes!
As you can see earlier in my post here, it really makes sense to check which cards will crunch which app. I have never had problems like this with ATI. If it is DP, it WILL work.
I would prefer a rule like: if your card fully supports OpenCL, it will crunch the app, no matter whether it is an ATI or an Nvidia.
Regards
Alexander
RE: this is why I'm
I've found a thread in the GPUGRID forum regarding driver architecture differences between Win7 and XP.
I tested it overnight. The speed difference is pretty high. The average runtime of a CUDA E@H WU under Win7 is in the range of 170 min, under XP in the range of 135 min. This is 25% faster!
Well, exact figures need a longer run, but I'm strongly inclined to return to XP.
That raises another question: Is XP still fully supported with drivers for the latest cards (GTX460), or is this a dead end?
Regards,
Alexander
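A note on the "25% faster" figure above: the two averages can be compared in two ways, and they give different percentages. The following is a minimal Python sketch (the function name `compare_runtimes` is made up for illustration) that uses only the two runtimes quoted in the post, which were themselves described as rough overnight figures.

```python
def compare_runtimes(slow_min: float, fast_min: float) -> dict:
    """Compare two average task runtimes given in minutes.

    time_saved_pct:      how much shorter one task runs on the faster setup
    throughput_gain_pct: how many more tasks finish per unit time
    """
    if slow_min <= 0 or fast_min <= 0:
        raise ValueError("runtimes must be positive")
    return {
        "time_saved_pct": (slow_min - fast_min) / slow_min * 100.0,
        "throughput_gain_pct": (slow_min / fast_min - 1.0) * 100.0,
    }


# Figures from the post: ~170 min under Win7, ~135 min under XP.
result = compare_runtimes(170.0, 135.0)
print(f"Runtime reduction per task: {result['time_saved_pct']:.1f}%")
print(f"Throughput gain:            {result['throughput_gain_pct']:.1f}%")
```

With these inputs the runtime per task drops by about 21%, while throughput rises by about 26%, so the "25%" in the post corresponds to the throughput view.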
RE: That raises another
No, it's not a dead end; they're still supported. You'll miss some features, though, and can't do certain multi-SLI combinations. The missing features are mostly attributable to the OpenGL support and the DirectX version available on XP.
http://www.nvidia.com/object/winxp-258.96-whql-driver.html
Hi, yesterday I did some
Hi,
Yesterday I did some tests regarding the CUDA app.
I installed MSI Afterburner and checked all I could while running an E@H CUDA app. The result is:
GPU load: < 10%
Used RAM: 130 MB
Now: my GTX260 has ~800 MB RAM.
My CPU has 4 cores.
I really would like to run at least 2 CUDA apps, but I'm not 'fit' enough to compose the required app_info.xml. From other apps I know how to change the number of required GPUs per task, but that's the limit (at least mine). And as I've learned in other threads, running the other E@H apps is 'a must', so they have to be included in that XML as well.
Could anyone help please?
Regards,
Alexander
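As a rough check on the memory side only, the numbers from the post above can be plugged into a small calculation. This is a sketch under stated assumptions: the helper `tasks_that_fit` and its `reserved_mb` parameter are hypothetical, the 130 MB footprint is the single Afterburner reading quoted above, and the per-task footprint is assumed to stay constant. It says nothing about GPU load, CPU support work, or how the BOINC client would have to be configured to schedule several tasks per GPU.

```python
def tasks_that_fit(card_mb: int, per_task_mb: int, reserved_mb: int = 0) -> int:
    """Return how many tasks fit into video memory, by memory alone.

    card_mb:     total video memory on the card
    per_task_mb: measured footprint of one running task
    reserved_mb: memory assumed to be held back for the desktop/display
                 (an assumption for illustration, not a measured value)
    """
    if per_task_mb <= 0:
        raise ValueError("per_task_mb must be positive")
    usable_mb = max(card_mb - reserved_mb, 0)
    return usable_mb // per_task_mb


# Figures from the post: ~800 MB on the GTX260, 130 MB used by one task.
print(tasks_that_fit(800, 130))                   # no reserve
print(tasks_that_fit(800, 130, reserved_mb=200))  # with a hypothetical 200 MB reserve
```

By this estimate memory is not the limiting factor for two instances of the current app on this card; a later app version with a larger footprint would change the result.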
RE: The fact that only part
Certainly the case for me! With my (admittedly poor) built-in video card (GeForce 310, 512MB), the WUs were taking 15% longer on average.
Back to CPU only, fortunately, they crunch very nicely :)
RE: Hi, yesterday I did
Hi everyone :-)
Interesting
I'm seeing less than 5% usage on my Fermi cards, though they do seem to go through WUs at a decent rate.
Can't wait for a new, improved app; it will be interesting to see just how far they can go with the current generation of GPUs.
Are there any optimised apps for Einstein?
RE: Are there any
Depends on the definition :-)
There are no third-party apps available, but the S5GC1 Windows and Linux apps come in three flavors that are automatically distributed to matching PCs:
* a version that makes use of the SSE2 SIMD instruction set
* a version that makes use of the SSE SIMD instruction set, but not SSE2
* a compatibility version for old PCs that do not support SSE or SSE2
The Mac (Intel) version is always optimized for SSE2.
The SSE and SSE2 apps include hand-written assembly code.
The ABP2 (CPU) app requires SSE but uses no explicit SIMD code except where the compiler may insert it automatically. Still, it has been "optimized" for performance: a few months ago, modifications both to the app itself and to the design of the search parameters made it run approx. 5 to 7 times faster.
Happy crunching
HB
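The three S5GC1 flavors listed above amount to a simple priority order: prefer SSE2, fall back to SSE, otherwise use the compatibility build. How the project actually matches builds to hosts is not described in this thread, so the following Python sketch is only an illustration of that ordering; the function `pick_flavor` and the flavor labels are hypothetical.

```python
from typing import Iterable


def pick_flavor(cpu_features: Iterable[str]) -> str:
    """Pick the most capable app flavor a CPU can run.

    cpu_features: feature names reported for the host CPU, e.g. {"sse", "sse2"}.
    Returns one of the hypothetical labels "sse2", "sse" or "generic".
    """
    features = {name.lower() for name in cpu_features}
    if "sse2" in features:
        return "sse2"      # SSE2 build
    if "sse" in features:
        return "sse"       # SSE build without SSE2
    return "generic"       # compatibility build for old PCs


print(pick_flavor({"SSE", "SSE2"}))
print(pick_flavor({"SSE"}))
print(pick_flavor({"MMX"}))
```

The order of the checks matters: a CPU that reports SSE2 normally reports SSE as well, so SSE2 has to be tested first.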