Parallella, Raspberry Pi, FPGA & All That Stuff

Mike Hewson

Moderator

Joined: 1 Dec 05

Posts: 6588

Credit: 315474883

RAC: 321799

5 Feb 2015 23:00:58 UTC

Message 111894

@BackGroundMAN : Yup, you can use double angle formulae and the like to generate (co)sines from other (co)sines. As you say, a higher precision is needed to contain the errors there. The Epiphany has fused instructions which (allegedly) mitigate that by not rounding intermediates. So you start with sufficiently high precision values and/or maintain precision while deriving from them. I'm having another look at the FloPoCo VHDL generator (for FPGAs), which (still) looks promising in that arbitrary operator precision is a main design input, such that:

Quote:

... fully parameterized in precision, so that your application may use just the precision it needs, and accurate to the last bit, so that your wires don't carry meaningless noise. Internally, FloPoCo operators are carefully designed to ensure that no bit is computed that is not useful to the final result.

... and so an FPGA block could punt out a suitable precision for the outside world by rounding off as the last thing.

Also, this can partly depend upon where you want to stop the factorisation/recursion, i.e. what base case triggers the winding back out of the recursion, and for that matter which factorisations of N are performed on the way in to the base case. Or if you like: how many powers of the Nth root of unity do you need at each step? That doesn't have to be the same number for each recursive step. Mathematically you only have to adhere to the prime factors of N or products of those. For E@H at least we have simple power-of-two choices.
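To make the "where do you stop the recursion" point concrete, here is a minimal Python sketch of a recursive radix-2 FFT whose base case is adjustable. It is only an illustration, not how E@H or FFTW do it; the names `direct_dft`, `fft_radix2` and `base_size` are made up for this example.

```python
import cmath

def direct_dft(x):
    """Plain O(n^2) DFT, used here as the base case of the recursion."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def fft_radix2(x, base_size=4):
    """Recursive radix-2 decimation-in-time FFT.

    base_size (a hypothetical tuning knob) is the length at or below which
    the recursion stops and the direct DFT takes over. Odd lengths also
    fall back to the direct DFT, since they cannot be split in two.
    """
    n = len(x)
    if n <= base_size or n % 2:
        return direct_dft(x)
    even = fft_radix2(x[0::2], base_size)
    odd = fft_radix2(x[1::2], base_size)
    half = n // 2
    out = [0j] * n
    for k in range(half):
        # Only n/2 powers of the n-th root of unity are needed at this level.
        w = cmath.exp(-2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + half] = even[k] - w * odd[k]
    return out
```

Whatever base case is chosen, the result should agree with the direct DFT up to rounding; only the mix of work between the recursion and the base case changes.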

Some 'simplicities' do emerge with power-of-two multiple-angle formulae, e.g. sin[4*A] in terms of (co)sin[A]. But simplicity is a relative term here, as you get mixed powers of sines and cosines:

sin[4A] = 4[sin[A]cos^3[A] - sin^3[A]cos[A]]

... so this may be an arguable advantage. As a generality (co)sin[M*A] can be expressed as sums/differences of product terms each with a total of M (co)sin[A] factors.
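As a quick numerical sanity check of the sin[4A] formula, the Python sketch below compares it with two applications of the ordinary double-angle step. The function names and the test angle are arbitrary choices for illustration.

```python
import math

def double_angle(s, c):
    """Return (sin 2A, cos 2A) given (sin A, cos A)."""
    return 2.0 * s * c, c * c - s * s

def sin4_direct(s, c):
    """sin 4A = 4 (sin A cos^3 A - sin^3 A cos A), each term of total degree 4."""
    return 4.0 * (s * c**3 - s**3 * c)

a = 0.3                       # arbitrary test angle in radians
s, c = math.sin(a), math.cos(a)
s2, c2 = double_angle(s, c)   # values for 2A
s4, c4 = double_angle(s2, c2) # values for 4A, by doubling twice
```

Both routes should agree with `math.sin(4 * a)` to within double-precision rounding; in lower precision the accumulated error of repeated doubling is what the extra guard bits are for.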

Another point of merit is that if you can do SQRT[1 - sin^2] real fast then you get the cosines from the sines efficiently. Again you swap time for space.
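A small Python sketch of that trade, with two caveats that are my own assumptions rather than anything stated above: the square root only gives the magnitude of the cosine, so the sign has to come from knowing the quadrant, and the factored form (1 - s)(1 + s) is used because it tends to behave a little better than 1 - s*s when the sine is close to one.

```python
import math

def cos_from_sin(s):
    """Return |cos A| given s = sin A, with -1 <= s <= 1.

    The sign of the cosine is not recoverable from the sine alone;
    the caller must supply it from the quadrant of A.
    """
    return math.sqrt((1.0 - s) * (1.0 + s))
```

For angles in the first quadrant this can replace a second table of cosines with one multiply and one square root per value.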

One especial feature is that all the sines and cosines, and by extension their powers, are bounded in magnitude below by zero and above by one. So you can do much in a fixed-point format and not have to worry excessively about the relative size of operands. The bits you might have put aside for an exponent can be used for extra mantissa. That means fewer operations for normalising, and hence fewer shifters and leading-zero counters ..... :-)
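For illustration, a fixed-point sketch in Python, assuming a signed format with 15 fractional bits (often called Q15). The format, the saturation behaviour and all names are assumptions for this example; an FPGA design would pick its own width.

```python
FRAC_BITS = 15               # assumed number of fractional bits
SCALE = 1 << FRAC_BITS
Q_MAX = SCALE - 1            # largest representable value, just below +1.0
Q_MIN = -SCALE               # exactly -1.0

def saturate(q):
    """Clamp an integer to the representable range."""
    return max(Q_MIN, min(Q_MAX, q))

def to_q15(x):
    """Quantise a real value in [-1, 1] to the fixed-point integer."""
    return saturate(round(x * SCALE))

def q15_mul(a, b):
    """Multiply two fixed-point values.

    The raw product carries 2 * FRAC_BITS fractional bits, so half an LSB
    is added for rounding and the result is shifted back. No exponent
    handling, normalisation or leading-zero count is needed.
    """
    return saturate((a * b + (1 << (FRAC_BITS - 1))) >> FRAC_BITS)

def from_q15(q):
    """Convert the fixed-point integer back to a float for inspection."""
    return q / SCALE
```

Note that +1.0 itself does not fit in this format and saturates to the largest value, which is one of the small wrinkles such a design has to decide about.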

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

Bikeman (Heinz-...

Moderator

Joined: 28 Aug 06

Posts: 3522

Credit: 722700165

RAC: 1154419

5 Feb 2015 23:08:44 UTC

Message 111895 in response to message 111893

Quote:

It's got a couple of Neon tasks from here, and a couple of non-Neon tasks from Albert, just running two up at present:

Computer 11741356 at Einstein

COMPUTER 12650 at Albert

Claggy

Cool, thx!

For perspective, I assume this is (still) at stock CPU clock?

I wonder if you can still get the CPU temperature by

cat /sys/devices/virtual/thermal/thermal_zone*/temp

It would be interesting to see if it's getting any hotter than the old one.

P.S.: still busy downloading Raspbian image ...

Claggy

Joined: 29 Dec 06

Posts: 560

Credit: 2699403

RAC: 0

5 Feb 2015 23:19:30 UTC

Message 111896 in response to message 111895

Quote:

Quote:

It's got a couple of Neon tasks from here, and a couple of non-Neon tasks from Albert, just running two up at present:

Computer 11741356 at Einstein

COMPUTER 12650 at Albert

Claggy

Cool, thx!

For perspective, I assume this is (still) at stock CPU clock?

I wonder if you can still get the CPU temperature by

cat /sys/devices/virtual/thermal/thermal_zone*/temp

It would be interesting to see if it's getting any hotter than the old one.

HB

P.S.: still busy downloading Raspbian image ...

It's at stock clock at present. I haven't got my spare heatsinks with me (they're at my work address), and didn't want to push it too hard yet.

This is what i got as a temp for the Pi 2 :

pi@raspberrypi ~ $ cat /sys/devices/virtual/thermal/thermal_zone*/temp
51920

and my original Model B, with a pair of heatsinks and at stock clock, gives (running my self-compiled armv6 Stock Seti 7.0 app):

pi@raspberrypi ~ $ cat /sys/devices/virtual/thermal/thermal_zone*/temp
48692
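The raw numbers are in millidegrees Celsius, which is the unit the Linux thermal sysfs interface reports, so 51920 is 51.92 deg C and 48692 is 48.692 deg C. A small Python sketch for reading the same files in degrees follows; the function names are made up, and it simply returns an empty result on systems without these files.

```python
import glob

def millideg_to_celsius(raw):
    """Convert a sysfs temperature string (millidegrees C) to degrees C."""
    return int(raw.strip()) / 1000.0

def read_thermal_zones(pattern="/sys/devices/virtual/thermal/thermal_zone*/temp"):
    """Return {path: degrees C} for every readable thermal zone."""
    readings = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                readings[path] = millideg_to_celsius(f.read())
        except (OSError, ValueError):
            # Skip zones that cannot be read or do not contain a number.
            continue
    return readings

if __name__ == "__main__":
    for path, temp in read_thermal_zones().items():
        print(f"{path}: {temp:.1f} deg C")
```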

Claggy

Bikeman (Heinz-...

Moderator

Joined: 28 Aug 06

Posts: 3522

Credit: 722700165

RAC: 1154419

6 Feb 2015 13:12:47 UTC

Message 111897 in response to message 111896

Hi!

It's a shame that the BOINC version supplied by Raspbian wheezy is so old that it doesn't get the CPU features right for the ARM, so it will not allow you to get NEON tasks. I see why you compiled your own version now.

I selected the overclocking profile "PI2" from the raspi-config menu (CPU frequency at 1 GHz), and with 4 Einstein@Home tasks running in parallel I could easily get my PI2's CPU to over 75 deg C (!!) while it was enclosed in a PI B+ case. With the case open, the temperature fell to more reasonable values below 70 deg C, but then again it's not exactly summer in Germany now... I wonder what would happen if the ambient temperature were more like 30 deg C or so.
I suspect many of the PI cases that do not have ventilation vents over the CPU will not work well with a Raspi 2 under full load. It's also time to buy those tiny heat sinks, I guess. Not as bad as the Parallella though, which even needs a little fan.

The other oddity is that Raspbian currently only allows access to about 3/4 of the 1 GB total RAM :-( . This is a known problem and is expected to be fixed soon.

Cheers
HB

Bikeman (Heinz-...

Moderator

Joined: 28 Aug 06

Posts: 3522

Credit: 722700165

RAC: 1154419

8 Feb 2015 13:29:10 UTC

Message 111898

Those who are using a self-compiled BOINC version on the PI2 (not the 7.0.x version that comes with Raspbian wheezy which will not correctly detect the presence of the NEON CPU feature) can experiment with providing optimized "wisdom" for the FFTW library. A wisdom file contains hints on the performance of individual building-blocks of the FFT implementation the library can choose from on the particular hardware. When generating an FFT "plan" (assembling a full FFT-code from those building blocks) those hints can better guide the library and help get a better plan.

I'm currently experimenting with this wisdom file:

(fftw-3.3.2 fftwf_wisdom #x4a633eef #xb5a95564 #x91014bdd #x9c85ce5f
  (fftwf_dft_r2hc_register 0 #x10bdd #x10bdd #x0 #x3e5f426c #x871be279 #x8093fac5 #xadc9069b)
  (fftwf_dft_r2hc_register 0 #x11bdd #x11bdd #x0 #x23ece1a3 #x2f36c0a5 #x2a1a993d #xd716bd1a)
  (fftwf_codelet_t2fv_8_neon 0 #x10bdd #x10bdd #x0 #x0c80af69 #x2fe458bb #x81ef8f35 #xa61df1c9)
  (fftwf_codelet_n1fv_12_neon 0 #x10fdd #x10fdd #x0 #x25ec8e79 #x4f81cbf6 #x55ad6f62 #xba2232de)
  (fftwf_codelet_q1fv_8_neon 0 #x11bdd #x11bdd #x0 #x8362fe9a #x94cfdedf #x6e4ea842 #xc791a85e)
  (fftwf_rdft_rank0_register 2 #x11bdd #x11bdd #x0 #x6b5e85f4 #xd7cd09c3 #x1c676ef6 #x73f0852e)
  (fftwf_codelet_t2_4 0 #x11bdd #x11bdd #x0 #xb33d15b2 #x5a54c41e #x8b7ec909 #x775234b1)
  (fftwf_dft_vrank_geq1_register 0 #x10fdd #x10fdd #x0 #x7d124f7d #x51c2b7a9 #x67614754 #x6f84eff6)
  (fftwf_dft_vrank_geq1_register 0 #x11bdd #x11bdd #x0 #xa48ccbb9 #xa7ba449b #x28f0f2a4 #x2df52e2b)
  (fftwf_dft_vrank_geq1_register 0 #x11bdd #x11bdd #x0 #xaf83d15c #xe13833ac #x821ef65e #x141682da)
  (fftwf_codelet_hc2cfdft_16 1 #x11bdd #x11bdd #x0 #xc338dbbd #x81477318 #xc96aed6b #xb15ea60a)
  (fftwf_codelet_q1fv_4_neon 0 #x11bdd #x11bdd #x0 #xe28af32f #x752b225f #x4fa246ed #x920e9f91)
  (fftwf_codelet_t2fv_8_neon 0 #x10fdd #x10fdd #x0 #x805d94c8 #x3a7cf408 #x485262cd #x459be9d0)
  (fftwf_dft_vrank_geq1_register 0 #x11bdd #x11bdd #x0 #x961f629e #xd12bd7a7 #xce177b8b #x7c477dca)
  (fftwf_codelet_r2cf_16 2 #x11bdd #x11bdd #x0 #x8f3ef9f7 #xe67e11ab #xe25a4700 #x8eed687a)
  (fftwf_dft_buffered_register 0 #x11bdd #x11bdd #x0 #x64e7d51b #x237dd059 #xc2026fdf #x6e8f2b5f)
  (fftwf_dft_vrank_geq1_register 0 #x10fdd #x10fdd #x0 #xb2d87b1c #x93efaa57 #xe925483a #x16a9f313)
  (fftwf_codelet_r2cfII_16 2 #x11bdd #x11bdd #x0 #xf4d971ab #x381e69c1 #xc4398fe0 #x3f2135b1)
  (fftwf_codelet_q1fv_8_neon 0 #x10fdd #x10fdd #x0 #x03ffa301 #xc7de6e45 #x69aa6d45 #x72039939)
  (fftwf_rdft_rank0_register 2 #x10bdd #x10bdd #x0 #x0d83a237 #x87c5fa52 #x75a543cb #xbd29ad91)
  (fftwf_dft_nop_register 0 #x11bdd #x11bdd #x0 #x1512547f #x11070ad5 #x0adb07e2 #xb74c28d3)
  (fftwf_dft_vrank_geq1_register 0 #x11bdd #x11bdd #x0 #xcc076c18 #x857fa31d #x02fbb535 #xe9b7e9d1)
  (fftwf_dft_indirect_register 0 #x10bdd #x10bdd #x0 #xbb00a19e #x3791698e #xf9e1695e #x4dccccac)
)

To use it, save it to a file named wisdomf and copy it to the predefined system-wide location:

sudo mkdir -p /etc/fftw
sudo cp wisdomf /etc/fftw/wisdomf
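
Before copying a wisdom file around, it can be handy to see which building blocks it mentions. The Python sketch below only treats the first token of each entry as a name and counts how many of them end in `_neon`; it makes no attempt to interpret the numbers after the name. The function name and the regular expression are my own, not part of FFTW.

```python
import re
from collections import Counter

# Matches the first token of an entry line such as
#   (fftwf_codelet_t2fv_8_neon 0 #x10bdd ...)
# The header line "(fftw-3.3.2 fftwf_wisdom ..." is deliberately not matched.
ENTRY_RE = re.compile(r"^\s*\((fftwf?_\w+)\s", re.MULTILINE)

def summarize_wisdom(text):
    """Return (Counter of entry names, number of entries ending in '_neon')."""
    names = ENTRY_RE.findall(text)
    counts = Counter(names)
    neon_entries = sum(1 for name in names if name.endswith("_neon"))
    return counts, neon_entries

# Example use, assuming the file was saved as "wisdomf":
#   with open("wisdomf") as f:
#       counts, neon_entries = summarize_wisdom(f.read())
#   print(neon_entries, "NEON entries out of", sum(counts.values()))
```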

Let's see if this does any good.

Cheers
HB

EDIT:
this might work better:

(fftw-3.3.2 fftwf_wisdom #x4a633eef #xb5a95564 #x91014bdd #x9c85ce5f
  (fftwf_codelet_q1fv_4_neon 0 #x11bdd #x11bdd #x0 #xe28af32f #x752b225f #x4fa246ed #x920e9f91)
  (fftwf_codelet_hc2cfdft_16 0 #x11bdd #x11bdd #x0 #xc338dbbd #x81477318 #xc96aed6b #xb15ea60a)
  (fftwf_dft_vrank_geq1_register 0 #x10bdd #x10bdd #x0 #x32f28f47 #xc8b6d48d #xb13fad96 #x72b9aed8)
  (fftwf_dft_nop_register 0 #x11bdd #x11bdd #x0 #x1512547f #x11070ad5 #x0adb07e2 #xb74c28d3)
  (fftwf_codelet_q1fv_8_neon 0 #x11bdd #x11bdd #x0 #x8362fe9a #x94cfdedf #x6e4ea842 #xc791a85e)
  (fftwf_codelet_t2fv_16_neon 0 #x10bdd #x10bdd #x0 #x4b7d60a2 #xc461d6bb #x1ec678d0 #xf21215db)
  (fftwf_rdft_rank0_register 2 #x11bdd #x11bdd #x0 #x6b5e85f4 #xd7cd09c3 #x1c676ef6 #x73f0852e)
  (fftwf_dft_vrank_geq1_register 0 #x11bdd #x11bdd #x0 #x961f629e #xd12bd7a7 #xce177b8b #x7c477dca)
  (fftwf_dft_vrank_geq1_register 0 #x11bdd #x11bdd #x0 #xcc076c18 #x857fa31d #x02fbb535 #xe9b7e9d1)
  (fftwf_codelet_t1fv_4_neon 0 #x11bdd #x11bdd #x0 #xb33d15b2 #x5a54c41e #x8b7ec909 #x775234b1)
  (fftwf_dft_vrank_geq1_register 0 #x10bdd #x10bdd #x0 #xbb00a19e #x3791698e #xf9e1695e #x4dccccac)
  (fftwf_codelet_t1fuv_8_neon 0 #x10bdd #x10bdd #x0 #x0c80af69 #x2fe458bb #x81ef8f35 #xa61df1c9)
  (fftwf_dft_vrank_geq1_register 0 #x11bdd #x11bdd #x0 #xaf83d15c #xe13833ac #x821ef65e #x141682da)
  (fftwf_codelet_n2fv_8_neon 0 #x10bdd #x10bdd #x0 #x4822b253 #xaef99394 #x90ae1375 #xd6bd44fc)
  (fftwf_dft_buffered_register 0 #x11bdd #x11bdd #x0 #x64e7d51b #x237dd059 #xc2026fdf #x6e8f2b5f)
  (fftwf_codelet_r2cf_16 2 #x11bdd #x11bdd #x0 #x8f3ef9f7 #xe67e11ab #xe25a4700 #x8eed687a)
  (fftwf_dft_vrank_geq1_register 0 #x11bdd #x11bdd #x0 #xa48ccbb9 #xa7ba449b #x28f0f2a4 #x2df52e2b)
  (fftwf_dft_r2hc_register 0 #x11bdd #x11bdd #x0 #x23ece1a3 #x2f36c0a5 #x2a1a993d #xd716bd1a)
  (fftwf_codelet_t1_6 0 #x10bdd #x10bdd #x0 #x0b251471 #x2000d3c9 #xc2813bb8 #x7e68294b)
  (fftwf_codelet_r2cfII_16 2 #x11bdd #x11bdd #x0 #xf4d971ab #x381e69c1 #xc4398fe0 #x3f2135b1)
)

Claggy

Joined: 29 Dec 06

Posts: 560

Credit: 2699403

RAC: 0

8 Feb 2015 19:19:20 UTC

Message 111899 in response to message 111898

I've come across a PI 2 scheduler problem at Albert: the scheduler won't send NEON work. I've reported it there with the scheduler log:

http://albertathome.org/content/pi-2-scheduler-problem

Claggy

Bikeman (Heinz-...

Moderator

Joined: 28 Aug 06

Posts: 3522

Credit: 722700165

RAC: 1154419

9 Feb 2015 10:21:50 UTC

Message 111900 in response to message 111899

Quote:

I've come across a PI 2 scheduler problem at Albert: the scheduler won't send NEON work. I've reported it there with the scheduler log:

http://albertathome.org/content/pi-2-scheduler-problem

Claggy

Indeed a scheduler problem. Bernd already has a fix for this, which will be committed shortly.

Cheers
HB

Alex

Joined: 1 Mar 05

Posts: 451

Credit: 507044931

RAC: 111931

9 Feb 2015 20:39:44 UTC

Message 111901

Just to inform the Pi-2 fans:
Raspberry Pi 2 switches itself off when hit by a camera flash (article in German)
http://www.tomshardware.de/raspberry-pi2-blitzlicht-bug,news-252128.html

Bikeman (Heinz-...

Moderator

Joined: 28 Aug 06

Posts: 3522

Credit: 722700165

RAC: 1154419

9 Feb 2015 22:33:38 UTC

Message 111902 in response to message 111901

Quote:

Just to inform the Pi-2 fans:
Raspberry Pi 2 switches itself off when hit by a camera flash (article in German)
http://www.tomshardware.de/raspberry-pi2-blitzlicht-bug,news-252128.html

Indeed, the Raspberry Pi 2 has a photosensitive component that can be triggered into a transient failure by extremely bright light (a xenon flash from a short distance, or a laser pointer aimed directly at the chip in question):

http://www.raspberrypi.org/xenon-death-flash-a-free-physics-lesson/

It's the oddest electromagnetic compatibility issue in electronics I ever saw or heard of (I couldn't resist trying this, successfully, on my own Pi2... it's almost like this flashy-thing in MIB ;-) ).

Now this little component will hardly have been built exclusively for the Pi2, and I wonder what other pieces of electronics might turn out to be susceptible to the 'Xenon flash of death'.

But there's also good news: an update of the firmware and kernel via rpi-update will now fix the problem that almost 250 MB of the RAM was not usable. Finally you get the full 1 GB (minus GPU memory). Even running 4 E@H tasks in parallel will leave plenty of free RAM ;-)
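
To check how much RAM the kernel actually sees after the update, one option is to read the MemTotal line from /proc/meminfo. A minimal Python sketch, with made-up function names, that returns None where the file is missing:

```python
def parse_mem_total_kib(meminfo_text):
    """Return the MemTotal value in KiB from /proc/meminfo text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Line format: "MemTotal:   <number> kB"
            return int(line.split()[1])
    return None

def mem_total_mib(path="/proc/meminfo"):
    """Return total RAM visible to the kernel in MiB, or None if unavailable."""
    try:
        with open(path) as f:
            kib = parse_mem_total_kib(f.read())
    except OSError:
        return None
    return None if kib is None else kib / 1024.0

if __name__ == "__main__":
    print("MemTotal (MiB):", mem_total_mib())
```

The figure will be somewhat below 1024 MiB even after the fix, since the GPU memory split and kernel reservations are taken off first.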

Cheers
HB

Highlander

Joined: 1 Jul 05

Posts: 24

Credit: 141580701

RAC: 7710

10 Feb 2015 9:22:43 UTC

Message 111903

Only for comparison, an Odroid C1 features:
(I have no intention of running any BOINC project on this, but some numbers are always interesting, IMO)

Processor: 4 ARM ARMv7 Processor rev 1 (v7l)
Processor features: swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4
Number of CPUs: 4
623 floating point MIPS (Whetstone) per CPU
2489 integer MIPS (Dhrystone) per CPU
