Good luck. I'd always assumed AMD used a consolidated computational unit and converted the two instruction sets into a common type on the fly. Would adding 387-based FPU work also be possible?
Thanks. :-)
AMD does that.
C40 is the fastest on 387.
I don't have enough tried-and-tested tricks to improve its speed notably.
OK, I'm wondering if I've misunderstood what you were proposing. I thought it was a client running SSE2 and 3DNow! calculations on different parts of the WU in parallel, but if the same hardware would be doing both 'threads' at once there doesn't seem to be any gain. Were you instead proposing an app that would select between SSE and 3DNow! depending on the CPU's architecture?
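For what it's worth, the second reading (one app that picks its code path from what the CPU supports) could be sketched roughly like this. The bit positions are the documented CPUID feature flags; the function name, the path labels and the preference order are made up for illustration and are not taken from any of the existing apps. The sketch only decodes register values that something else has already read with CPUID, and it assumes the caller passes 0 for the extended leaf on CPUs that don't report it.

```python
# Documented CPUID feature bits.
SSE2_BIT = 1 << 26        # CPUID leaf 1, EDX
SSE3_BIT = 1 << 0         # CPUID leaf 1, ECX
EXT_3DNOW_BIT = 1 << 30   # CPUID leaf 0x80000001, EDX (AMD): extended 3DNow!
THREEDNOW_BIT = 1 << 31   # CPUID leaf 0x80000001, EDX (AMD): 3DNow!


def choose_code_path(leaf1_edx, leaf1_ecx, ext_edx):
    """Hypothetical helper: map raw CPUID register values to a code-path label.

    The labels and the order of preference are illustrative assumptions only.
    """
    has_sse2 = bool(leaf1_edx & SSE2_BIT)
    has_sse3 = bool(leaf1_ecx & SSE3_BIT)
    has_3dnow = bool(ext_edx & THREEDNOW_BIT)
    has_ext_3dnow = bool(ext_edx & EXT_3DNOW_BIT)

    if has_ext_3dnow and has_sse3:
        return "ext3dnow+sse3"   # combined path for newer AMD CPUs
    if has_3dnow and has_sse2:
        return "3dnow+sse2"      # combined path
    if has_sse2:
        return "sse2"            # no 3DNow! reported
    if has_3dnow:
        return "3dnow"           # older AMD CPUs without SSE2
    return "x87"                 # plain FPU fallback
```

A caller would read the three register values once at startup and branch on the returned label, so the check costs nothing inside the hot loop.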
I propose a combined code, but I think you forgot some things.
SSE2 works with 2x8 double-precision registers.
3DNow! works with 2x8 single-precision registers, and it destroys the FPU stack.
Benefits of combined usage:
- less memory swapping
- parallelism (arithmetic / data preparation)
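As a rough illustration of the "less memory swapping" point, the register budget can be counted like this. It reads "2x8" as 8 registers holding 2 values each, and it assumes 32-bit mode, where only XMM0 to XMM7 are available; the names in the snippet are hypothetical.

```python
import struct

XMM_REGISTERS = 8   # assumption: 32-bit mode, XMM0..XMM7
MMX_REGISTERS = 8   # MM0..MM7, which share storage with the x87 FPU registers


def register_capacity():
    """Count how many values fit in registers when both units are used together."""
    xmm_bytes = struct.calcsize("<2d")   # two packed doubles per XMM register
    mmx_bytes = struct.calcsize("<2f")   # two packed singles per MMX register
    return {
        "doubles_in_xmm": XMM_REGISTERS * 2,
        "singles_in_mmx": MMX_REGISTERS * 2,
        "total_bytes": XMM_REGISTERS * xmm_bytes + MMX_REGISTERS * mmx_bytes,
    }
```

Using both register files gives room for twice as many live values as either one alone, which is where fewer loads and stores would come from. The price is that x87 code cannot be mixed in while the 3DNow! registers are in use, because they occupy the same storage as the FPU stack.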
More like I've never done anything where the details of implementation were relevant at that low a level. The only assembly-level coding I've ever done was in college, on an emulation of a RISC CPU (presumably to keep people from turning in a disassembly of their C++ hello world programs).
So what you are basically suggesting is hyper-threading, but on the instruction level: using both parts of the processor in parallel.
If it had been hyper-threading in the P4 sense of the word, one would have run S40 on the SSE side and D40 on the 3DNow! side of the processor. But instead of running two results in parallel, your thought is to split the work of one result and feed the two parts of the CPU small pieces, making better use of each clock cycle by parallelizing parts of the work.
When you're really interested in a subject, there is no way to avoid it. You have to read the Manual.
This AMD fan is still using D40 and loving it.
When the official app comes out I will still use D40.
AMD forever!
Akos, I was wondering if you could get more using 3DNow! Professional or Enhanced 3DNow! :)
98SE XP2500+ @ 2.1 GHz Boinc v5.8.8
I bought a Sempron on eBay, so if I get it I will write a combined extended 3DNow! + SSE3 code.
But it wasn't a good idea to order from the USA. Items sometimes don't arrive from there. :-)
A nice way of saying: an international multi-million-dollar science project is apparently not able to equip its most important unpaid voluntary contributor with decent funding or hardware donations in the sub-1000-dollar range.
Edit/addition: Akos, if you'd like to add a method for making monetary donations (perhaps a PayPal account, to cover costs for hardware/energy/coffee/beer ;) ), e.g. to your profile, I would be honored to help your efforts.
akosf, email me at welltender at gmail.com. I have some AMD CPUs that are surplus to my needs.
Terry