I doubt it, since the box is doing okay on all other BOINC projects. Besides, I'm using Notebook Hardware Control (dunno if you know it; very nice tool IMO), and that always shows constant CPU voltage, clock speed and temperature when I have the AC cord plugged in and the notebook doing BOINC.
Still, it might be worth a try (at least lacking better ideas). But first I'm going to take some samples under Linux and compare the two.
RE: I doubt it, since the
I use Notebook Hardware Control myself, and like it a lot. I've set the power profile to "max performance" on AC and dynamic switching on battery. Seems to work perfectly for me.
I'm curious about your Linux results. I guess the "cycles per retired instruction" metric would be cool to have in the output as well, just as a reality check to make sure we are not overlooking something.
BTW, is the other core idle while you test this? Wondering about core affinity... Not that Einstein is somehow hopping between the cores on Windows.
CU
BRM
RE: [I'm curious about your
Yeah, as soon as I have figured out how to actually use the thing... the howto is not really helpful; it seems to point to directories that don't actually exist...
Well, as the other core didn't have any Einstein WUs left under Windows (I want to let it run dry, since I'm planning to use Linux more), I used it to make Rainbow Tables... if you think that could be an issue, I could re-run the tests while the other core is idle...
CU Annika
Might be a good idea to have
Might be a good idea to have the other core idle.
I haven't even tried to install VTune under Linux yet; will do that next. Nice tool anyway, and free for non-commercial use under Linux.
Maybe Akos will join us here and shed some light on this question, maybe he knows all the answers already.
CU
BRM
RE: Maybe Akos will join us
The SSE2 routines are part of a (math) library. These parts don't run on some SSE2-capable CPUs (not only AMD) because of a wrong CPU identification.
RE: RE: Maybe Akos will
I guess this could be corrected rather easily. But then those hosts that are really not SSE2-capable (but SSE-capable) will still have a problem under Windows, and even more so once the main hot loop has been SSE optimized by you (or will it be SSE2 optimized?). Because the fallback modf function will then be the new bottleneck. Right?
And for Annika's T2060 Core Duo there must be a different problem, because it definitely uses the SSE2 routines and still shows poor performance compared to when it runs under Linux. Strange.
CU
BRM
RE: The SSE2 routines are
LOL
And this for sure is only by mistake. ;-)
Well, AFAIK this is known from (some of) the Intel compilers, but Bernd uses the Microsoft compiler. Anyway, this really makes me wonder.
BTW, what the hell did you do with your X2 3600+, i.e. what app is running there?
28,387.48 sec for 516.79 credits (pending)!!!
cu,
Michael
RE: RE: The SSE2 routines
Hi Michael!
See Akos' message here
Quote:
My Core2 runs an SSE optimised version of the XLALComputeFaFb subroutine. It shows about a 70% performance improvement. Bernd will implement it in the source code and compile it for all x86-based platforms.
Excellent news, isn't it?
RE: And for Annika's T2060
Some runtime differences can be explained by different WUs. One of my VMs jumped up from ~22 cr/h to ~28 cr/h. I have never before seen such big differences (usually ±5%), so it might just be a problem with the time measurement of VMware Player.
cu,
Michael
RE: Hi Michael! See Akos'
Hi Bikeman,
yeah, great news, and I had already read that, but the difference in speed on his X2 is far more than 70%. Even if I add the "AMD penalty", this damn thing is still much faster.
His credits/hour rose from ~12 to ~65!!!
Running Windows and getting about 12 cr/h on that X2 3600+ is already quite a lot, 'cause my father's X2 5000+ (Win) gets about 14. So Akos's host is probably OC'd. His results are not validated yet, but if they are, then the speed increase will be even bigger. :-)
cu,
Michael
I don't really think so. From
I don't really think so. From what I heard from other crunchers, C/H is normally fairly constant, and all WUs give about equal credit relative to crunching time. Maybe it really has to do with running BOINC in a VM... never tried it myself, but it sounds plausible to me.
Really great news about the new science app :-D very nice work from Akos (again). Am I correct that the performance increase will be for all platforms?
EDIT: I was referring to the post about different WU sizes and differences in C/H.