My twin Opty 275 (U41.05):
S5 8.73 hrs, S4 35 min
Mac G5s:
S5 ~8.1 hrs, S4 2.2 hrs
Looks like there is a lot of room for optimizing the Windows app.
RE: my twin opty 275
They made the long units about twice as long, and the short ones about 5 times as long, so the stuff we are crunching is a lot longer. The S5 app is optimized up to SSE, but not SSE2 or SSE3 yet. The next version of BOINC will tell the servers what your processor is, and then the servers can send better-optimized code.
So, with SSE3 and the current results we are crunching, you will gain approximately 5-8%, and not much more.
RE: They made the long
I'd say that's a bit off the mark. Read through this thread and you'll see a number of people saying long units are at least 8-12 times longer.
The first one I've done was 15 times longer. Unless you're comparing standard app to standard app? In which case I'd say that it was about 6-7x.
The more I look (not trying to start something here), the more evident it seems that this new app really doesn't like Pentiums, at least not the modern ones. Maybe, just as you say, it's not optimized for SSE2 or SSE3, so it's not taking much advantage.
Just looking at the few units I have, the Pentiums take a little over 11 hours but the AMDs took 8-9 hours.
http://einsteinathome.org/workunit/9896743 two to compare
http://einsteinathome.org/workunit/9780466 Haven't sent it back yet, but it is half done and will take well over 10 hours.
RE: Got my first S5 on a
Completed two of the "long" WUs now, and they each took a bit over 9 hours.
RE: RE: They made the
I think you've got the size factors reversed: it was 5x bigger for the longs and 2x for the shorts. I've got several of both types in my queue and the ETAs are ~8:1. The S4 long:short ratio was ~3:1 with the stock client, so 3 x 5 : 1 x 2 = 15:2 ~= 8:1.
Also, the speedup is relative to the standard app. The S5 apps are slightly modified S4 beta apps. The S5 app is ~3x faster than the stock S4 app, or about half the speed of Akos's latest S4 apps. Relative to the Akos apps you've got 5x the size and half the speed, which gives 10x longer to crunch.
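To make the arithmetic in this post easy to re-check, here is a minimal Python sketch. It only uses the rough factors quoted above (5x size, half the speed, ~3:1 long:short ratio in S4); the function and variable names are made up for illustration and are not part of any Einstein@Home or BOINC tool.

```python
# Rough model: crunch time scales with work-unit size and inversely with app speed.
def relative_crunch_time(size_factor, speed_factor):
    """How many times longer a unit takes, given size and speed multipliers."""
    return size_factor / speed_factor

# Long S5 unit vs. long S4 unit on an Akos-optimized S4 app:
# about 5x the size, about half the speed.
long_vs_akos = relative_crunch_time(5.0, 0.5)

# Long:short ratio in S5, starting from the ~3:1 ratio seen in S4,
# with longs grown 5x and shorts grown 2x.
s5_long_to_short = 3.0 * 5.0 / 2.0

print(long_vs_akos)      # 10.0
print(s5_long_to_short)  # 7.5
```

The 7.5 matches the "15:2 ~= 8:1" estimate from the queue ETAs.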
I got it backwards. Direct quote from Bernard:
- To make up for the faster Apps we increased the size of the workunits. The "long" ones will be roughly five times as long as the "long" ones from S4, the "short" ones will be roughly twice as long as their S4 counterparts.
This is for the standard app.
The standard S5 app does have optimizations, so it will not gain a lot from processor-specific optimizations, but it should get some gain.
This is a direct quote from Akos:
I hope you know that SSE3 optimization doesn't mean a big performance improvement (cca. +5%).
It seems that a lot of people here have fallen for the Intel propaganda of "higher GHz = faster computations for everything". This has not been true for more than 5-6 years.
AMD CPUs have been much more efficient in work done per cycle for a long time. It seems to me that with longer work units this is simply more obvious. Admittedly, there are likely more contributors to this than I know about. But what might have been an extra 5 minutes before may now become an extra 50 or more minutes (as an example) due to the longer work units.
I guess we'll see soon how this changes with the Conroe core.
Bottom line is that Intel is seldom the better value when it comes to work done per cycle - they just advertise more effectively.
Those who don’t build must burn. It’s as old as history and juvenile delinquents.
Ray Bradbury - Fahrenheit 451
I've done 19 short work units with a Pentium 4 3.0 GHz system with HT off.
They averaged 4860 seconds, claiming and receiving 19.5 credits each time.
I've switched to HT on and now average 7400 seconds per unit, based on a sample of 4, claiming 19.5 credits each time. With HT on, two units run at the same time.
Two units take 4860*2 = 9720 seconds with HT off, versus 7400 seconds with HT on.
(9720-7400)/9720 = 0.239
So I'm getting a 23.9% improvement in crunch times using HT.
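The same HT comparison can be written as a short Python sketch. It assumes, as the post does, that HT on means two units crunch side by side while HT off means one at a time; `ht_improvement` is a made-up helper name.

```python
def ht_improvement(t_off, t_on, threads=2):
    """Fractional time saved for `threads` units when they run concurrently.

    t_off: seconds per unit with HT off (units run one after another)
    t_on:  seconds per unit with HT on (all `threads` units run at once)
    """
    serial = t_off * threads      # time for the same units back to back
    return (serial - t_on) / serial

gain = ht_improvement(4860, 7400)
print(f"{gain:.1%}")  # 23.9%
```

Seen as throughput, that is 9720/7400, or roughly 1.31 units done with HT on in the time HT off needs for one.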
I have no problem with the longer work units.
s4 long std app - 4h 20m
s4 long Akos app - 45m
s4 short Akos app - 07m
s5 short std app - 1h 10m
s5 long std app - 10h 05m
So I think some of you have your figures wrong.
If you work that out, it comes to about 2.3x as long comparing std app to std app on the long units.
98SE XP2500+ @ 2.1 GHz Boinc v5.8.8
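For anyone who wants to redo this with their own times, a small Python sketch of the ratio using the figures listed in the post above. The helper name is made up, and the comparison against the Akos app is an extra calculation that the post does not make itself.

```python
def to_minutes(hours, minutes):
    """Convert an 'Xh Ym' time to minutes."""
    return hours * 60 + minutes

s4_long_std = to_minutes(4, 20)    # S4 long, standard app
s4_long_akos = to_minutes(0, 45)   # S4 long, Akos app
s5_long_std = to_minutes(10, 5)    # S5 long, standard app

std_ratio = s5_long_std / s4_long_std    # standard app vs. standard app
akos_ratio = s5_long_std / s4_long_akos  # new standard app vs. old Akos app

print(round(std_ratio, 2))   # 2.33
print(round(akos_ratio, 1))  # 13.4
```

The second number shows why people coming from the Akos app report units that are 10x or more longer, while std app to std app it is only about 2.3x on this machine.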
RE: so i think some of you
Mine never took more than 2 1/2 hours on the standard app. Usually they were done in 1:50 hours. The opti app got it down to 35-45 min. The new app is taking almost 12+ hours.
What was wrong with the idea of just increasing the unit length to the old (standard app) length? Use all those new implementations and just make the units a bit longer to meet the usual unit length? As you said, it took you about 4 hours, which was a decent time. Instead they've gone and put the optimizations in and doubled or more again over that length?
The ratio of the completion times for the SR4 WUs, Opty vs G5 at about 4:1, should remain the same for the SR5 WUs provided similar optimizations were applied to both apps. Granted, the SR4 G5 times are for the standard app, but the 4.67 PPC optimization did not result in much of a gain for the G5 (25%-28%), so the ratio might be 3:1.
Instead the G5s are now faster than the Opty.
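A quick Python sketch of that ratio, using the times from the first post in this thread (Opty: S4 35 min, S5 8.73 hrs; G5: S4 2.2 hrs, S5 ~8.1 hrs). The dictionary layout is just for illustration.

```python
# Completion times in hours, taken from the first post of the thread.
opty = {"s4": 35 / 60, "s5": 8.73}
g5 = {"s4": 2.2, "s5": 8.1}

# Ratio > 1 means the G5 takes longer than the Opty; < 1 means the G5 is faster.
ratios = {run: g5[run] / opty[run] for run in ("s4", "s5")}

for run, ratio in ratios.items():
    print(run, round(ratio, 2))
# s4 3.77
# s5 0.93
```

So the G5 went from taking nearly 4x as long as the Opty on S4 to finishing slightly ahead of it on S5.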