The next advice is to take a look at the access pattern of twiddle_dee: that would shave off some 50 seconds on NVIDIA and 17% on some ATI model I know of.
Petri33, could I ask you what the "access pattern of twiddle_dee" is?
Thanks for asking.
Twiddle_dee is a small table of constant values in the code that is used in FFT calculations (to bring out repeating patterns of signals).
On NVIDIA, constant values placed in the __constant memory area should be read from one and the same memory address by all GPU threads (2000-5000 of them) at the same time.
Einstein software fetches the twiddle_dee values almost randomly, so a performance penalty occurs: the nearly instantaneous parallel fetch of one value turns into a sequential one. Instead of one clock cycle it takes thousands of clock cycles, because all computing waits for the last read from memory before it can continue.
The FIX is to define twiddle_dee (a small buffer of constants) to reside in global memory, so that it is no longer fetched sequentially through the constant cache. A read from global memory can benefit from caching and from nearby values that were fetched by some earlier read: a read of one address fills the cache with more values ahead. So global memory can be served to the threads from 'random' addresses much faster.
The 'slowdown bug' was not really a bug. It is quite common to think that constants should always be placed in constant memory space or at least be fetched through constant cache (a programming technique used with GPUs).
--
Petri
p.s. The access pattern has localIndex(0) (lI.x, aka thread id) as its base variable. So every thread reads from a different address.
...
"int xInd=lI.x+localExtent.x*(lI.y%wTE.y);"
"int yInd=lI.y/wTE.y;"
"int gInd=xInd+rowSizeinUnits*yInd;"
"int TW3Ind=(groupIndex.x*wTE.x+xInd)*(currDimIndex*wTE.y*wgUnroll+yInd);"
"int TW3add=(groupIndex.x*wTE.x+xInd)*wTE.y;"
"tileIn+=gInd;"
"\n#pragma unroll 16\n"
"for(int t=0;t<wgUnroll;t++)"
"{"
"tmp=SLf2(tileIn,sizeof(float2)*t*rowSizeinUnits*wTE.y);"
"int u=TW3Ind;"
"TW3Ind+=TW3add;"
"global float2 *restrict t0=&twiddle_dee[0][0],*restrict t1=&twiddle_dee[1][0],*restrict t2=&twiddle_dee[2][0];"
"float2 result=GLf2(&t0[u & 0xff]),T1=GLf2(&t1[(u>>8)&0xff]),T2=GLf2(&t2[(u>>16)&0xff]);"
"result=(float2)((result.x*T1.x-result.y*T1.y),"
...
You're welcome. It's not often you show up with your wisdom. ;*)
It sounds to me like this was something that was developed long before the high-performance CPUs and GPUs came about. It's time Einstein updated a bit of their software.
Proud member of the Old Farts Association
I've read that Avi Loeb of Harvard University is to launch a new project called "GALILEO" to search for ETI. But I could not find anything about its technology, either hardware or software.
Tullio
From what I read on their website, it's not looking for radio signals or anything like that but rather actual, physical objects such as potential satellites orbiting the earth.
"The Galileo Project research group will aim to identify the nature of UAP and ‘Oumuamua-like interstellar objects using the standard scientific method based on a transparent analysis of open scientific data to be collected using optimized instruments. This ground-based project is complementary to traditional SETI, in that it searches for physical objects, and not electromagnetic signals associated with extraterrestrial technological civilizations. For the Galileo Project only ‘known physics’ explanations are in scope. ‘Alternative physics’ hypotheses, while interesting, are explicitly not part of the Galileo Project. Moreover, the Galileo Project will not engage in retroactive attempts to analyze existing images or radar data, or speculate on prior UAP, observations or anecdotal reports, as these are not conducive to cross-validated, evidence-based scientific explanations."
https://projects.iq.harvard.edu/galileo/scope
https://projects.iq.harvard.edu/galileo/activities
There's a fresh podcast; perhaps it has something of interest for somebody:
Gear Club Podcast #78: Richard Factor: Ready, SETI, Go
https://www.youtube.com/watch?v=UHSqvb76oiQ
"On this episode of Gear Club, Eventide founder Richard Factor takes us to outer space on a Search for Extra-Terrestrial Intelligence. Richard is the founder and president of the SETI League: a non-profit, membership supported group of amature sky watchers hoping to catch signals of intelligent life from outside our solar system. Richard walks John and Stewart through the inception of SETI, the hows and whys of this galactic search, and one story about a real life tin foil hat."
The e-mail from Eventide had this text: "Check out Gear Club Podcast's latest episode where Richard, John Agnello, and Stewart Lerman discuss everything from the formation of SETI League, to the ongoing Search for Extra-Terrestrial Intelligence."
Thanks for the lead!!!
A Proud member of the O.F.A. (Old Farts Association). Be well, do good work, and keep in touch.® (Garrison Keillor) I want some more patience. RIGHT NOW!
I was trolling through the SETI Institute website. It has a data collection opportunity with a specific telescope vendor.
It might even include cellphones...
Tom M
No matter what the project, the code is always ready for optimization.
--
petri33