Parallella, Raspberry Pi, FPGA & All That Stuff

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6,578
Credit: 303,947,530
RAC: 256,404

Well, is my face red or what ???

More than happy to be wrong ! :-) :-)

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

MarkJ
Joined: 28 Feb 08
Posts: 437
Credit: 139,002,861
RAC: 368

Update #44 today.

Last week they got their first 200 bare boards back. This week they got the first batch back from being assembled and are now testing them. All appears to be back on track to ship the first 200 boards to backers in December, with the rest following as they are completed.

I wonder if mine will arrive by Australia Day (26th of January)?

They have had to suspend taking Parallella orders because there was so much interest. I expect once they clear the backer orders they will start accepting orders again.

Mac.teh.Knife
Joined: 4 Feb 13
Posts: 17
Credit: 865,586
RAC: 0

FPGAs have come a long way. I am convinced that there exists within the BOINC community more than sufficient expertise and "volunteerism" to engineer and produce a board based on FPGA plus a few support ICs (for example ethernet and specialized functions) that has everything required to implement BOINC and project apps but nothing more. It could also implement, maybe, a PCIe x16 3.0 slot to allow attaching a GPU? Maybe 2 such slots.

I am talking about a lean, mean, very low-cost board sans USB, PS/2, SATA, IDE, parallel port, RS-232 port, sound and likely a few other things we don't need to crunch tasks. Let's get rid of all that superfluous crap and make the fastest, cheapest (low cost not low quality), smallest, most power efficient crunching device we can. No frills, just maximum crunching. I don't know enough about FPGA to outline the concept in detail but from the user's perspective it might go something like this... you have this $25 board I am dreaming about connected to your PC via ethernet (not directly but through a router, maybe) or perhaps directly via USB, you download 10 or 20 MB of code from E@H and flash the FPGA with that code, the FPGA fires up BOINC and downloads tasks and crunches them as fast as or maybe even faster than a $2000 PC. For disk storage it uses NAS on ethernet. The FPGA could do whatever exotic math and logic the project devs want to do and do it their way rather than within the confines of whatever libs they are forced to use. It would allow 1 codebase for 1 platform, assuming the device replaced PCs, instead of 1 codebase for 3 OS's plus hardware variations. Think of what that could do for reducing error rates and the rather wasteful but necessary multiple iterations needed for result comparison and verification. It eliminates some security concerns and might make it harder for people to cheat on credits. It might make the credit system easier to handle since tasks would be running on the same device instead of numerous different devices.

So you crunch E@H that way for a month or whatever then decide you want to crunch some Rosetta tasks for a change. No problem. You download the code from Rosetta, flash the FPGA and crunch Rosetta for a while. Or maybe you get 10 such cards for $250 and crunch 10 different projects. Maybe plug in a GPU and crunch some GPU tasks. Oh yes, I'm making a lot of assumptions and over-simplifications but I think we have some very bright, innovative, tenacious volunteers in this community and all the expertise to do everything that needs to be done to engineer, prototype and debug the device, build 200 or so units as a cottage industry just to prove the concept, then collect orders for 1,000 units and have them made professionally. Some users would want nothing more than kits they solder themselves. Some would want fully assembled units. Some need only schematics or an Eagle CAD file to make their own PCB. Perhaps it could be implemented on Kickstarter.

Bikeman (Heinz-Bernd Eggenstein)
Moderator
Joined: 28 Aug 06
Posts: 3,522
Credit: 692,291,622
RAC: 2,833

Hi!

If you are looking at a $25 price point, all you will get is a Raspberry Pi or BeagleBone Black type of ARM board, which is just fine for FPGA experiments because there are actually FPGA boards meant to be used with this kind of board, e.g. the PIF: http://www.bugblat.com/products/pif/

Cheers
HBE

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 566,047,035
RAC: 107,198

Very ambitious plan! Not impossible.. but I'd first start with adding FPGAs as traditional co-processors to BOINC, then see what projects come up with and how it performs. If this demonstration goes well enough there should be enough support for pushing "bare-FPGA" crunching boards. If done well these could also interest universities, HPC etc.

MrS

Scanning for our furry friends since Jan 2002

MarkJ
Joined: 28 Feb 08
Posts: 437
Credit: 139,002,861
RAC: 368

There was an attempt to do this by some guys who had an FPGA programming background. They were looking at porting the Seti@home multi-beam app. It seems to have stalled. Their blog can be found at FPGA@Home

The main issue seems to be getting the C++ apps that each project has and converting them to HDL. Hardware is readily available from PCIe add-in cards to the Raspberry Pi board.

Mac.teh.Knife
Joined: 4 Feb 13
Posts: 17
Credit: 865,586
RAC: 0

@ Bikeman,

Yes, $25 is probably far too optimistic but quadruple that to $100 and you still have an incredible bargain if it could crunch as fast as an Intel i3 on CPU projects. Even $200 would be a bargain if it included a case and PSU. If on top of that it plus a bigger PSU could drive even a single GTX 680 or AMD 7970, just as examples, it would be "like totally awesome" as the kids say. Sure I'm nuts but I've designed and built more things from scratch than most 10 other men and I'll build BOINC into an FPGA too. The only question is how much it will cost. Nuts is my name, innovation's my game.

@ ETA,

Ambitious project is right. Your advice on where to start is good but I'll be happy if I can just get an FPGA to send "Hello world I'm an FPGA. I can haz BOINC burgers?" to my PC without frying anything.

@ MarkJ

Thanks for that link! I had no idea somebody was already working on this.

I don't know much about FPGAs but I've become very interested in them since reading this thread and the stuff available at the Parallella site. Since then I've been Googling new terms (new to me) and concepts related to FPGAs, and from what I've learned so far I can see where converting from C or Fortran to HDL could be one of the bigger hurdles. If I can get my head around HDL then the project as a whole is doable. I learned 8051 assembler on my own and designed and built custom industrial process control boards using assembler so I have some confidence I can pull this off too. If anybody else is interested in helping, advising, or even just following along and watching, let me know.

ExtraTerrestrial Apes
Joined: 10 Nov 04
Posts: 770
Credit: 566,047,035
RAC: 107,198

Quote:
The main issue seems to be getting the C++ apps that each project has and converting them to HDL.


Yeah, that's what I imagined would be the largest hurdle. You almost need an automatic conversion process, since with any app-updates you'd also need to update the HDL design. A lot of work has been done on this, as pretty much everyone not coding directly in HDL has this problem, but I have no idea how good the existing solutions are and if they're freely available (probably not the really good ones).

And SETI is a comparatively complex app, featuring library calls for FFTs etc. To begin with I'd want the simplest possible program, like Collatz or probably ABC. Or something where CPUs struggle to get decent resource utilization (branchy code) and stay at comparatively low power consumption - here the gain from FPGAs might be the largest.
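
For a sense of how small such a kernel can be, here is a minimal Python sketch of a Collatz step counter. It is purely illustrative: the name `collatz_steps` is made up for this example and this is not the code of the Collatz BOINC project, whose application is not shown in this thread. The point is only that the inner loop is integer-only and branchy, with no library calls at all.

```python
def collatz_steps(n: int) -> int:
    """Count the steps needed for n to reach 1 under the Collatz rule."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        # Even: halve. Odd: triple and add one.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


if __name__ == "__main__":
    for start in (6, 27):
        print(start, collatz_steps(start))
```

A loop like this has one data-dependent branch per step and nothing else, which is the kind of workload the paragraph above has in mind as a first porting target.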

Which leads to the fundamental problem: hardly any project publishes their source code. So they would have to do it themselves and probably lack the manpower to do so. Mhhh... Einstein does have comparatively lots of manpower and is willing to try new things. Is anyone from the staff listening?

MrS

Mac.teh.Knife
Joined: 4 Feb 13
Posts: 17
Credit: 865,586
RAC: 0

Quote:

Which leads to the fundamental problem: hardly any project publishes their source code. So they would have to do it themselves and probably lack the manpower to do so. Mhhh... Einstein does have comparatively lots of manpower and is willing to try new things. Is anyone from the staff listening?

MrS

I know for a fact ABC provided their source to a limited number of individuals who signed NDAs to allow them to help with optimization by compiling their source on their very expensive compilers. Asteroids may have too but not 100% sure. Two admins have indicated that if I can ship them a device that works they would be willing to try porting to it.

The history with GPUs has shown that if a new platform emerges that isn't likely to disappear any time soon and it's beneficial to a project to port to it then they do, eventually. If I can convince projects my platform meets those criteria they'll adopt, eventually.

Mike Hewson
Moderator
Joined: 1 Dec 05
Posts: 6,578
Credit: 303,947,530
RAC: 256,404

Quote:
.... getting the C++ apps that each project has and converting them to HDL ....


[My 2 cents, from minimal & spurious knowledge of VHDL]

VHDL does have sequential constructs in that it relies upon the synthesizer ( of VHDL to per-platform FPGA bit stream ) to produce a design that performs 'as-if' according to some sequential process. This hence relies upon the quality of the synthesis tool provided, plus of course the manner in which the problem is expressed. My reading indicates that if one writes VHDL code while thinking in procedural/sequential terms then one will : (a) fall over very quickly and/or (b) get an inefficient circuit, and (c) both the more so with even modest complexity.

Now if one can make concurrent VHDL constructs from a sequential language like C++ etc then one will likely get efficient and operable circuits as a result. However this shifts the 'sequential to concurrent' transform into the application programmer's head from the VHDL synthesizer tool writer's head. But the cognitive challenge remains either way, and if anything the tool writer has less chance of success as his/her output will be in generality ( not problem domain dependent ) and thus not able to take advantage of any special feature(s) of a given scenario.
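
To make the 'sequential to concurrent' transform a little more concrete, here is a small Python sketch (not VHDL, and not anything from a real synthesis tool). It sums the same values two ways: as a loop with one addition after another, and as a pairwise tree in which every addition within a level is independent and could in principle be its own adder in hardware. The function names and the step/depth counters are assumptions made up for the illustration.

```python
def sequential_sum(values):
    """Add values one after another; returns (total, dependent_steps)."""
    total = 0
    steps = 0
    for v in values:
        total += v  # each addition depends on the previous one
        steps += 1
    return total, steps


def tree_sum(values):
    """Add values pairwise in levels; returns (total, tree_depth)."""
    level = list(values)
    depth = 0
    while len(level) > 1:
        # All pairs in one level are independent of each other,
        # so hardware could evaluate them at the same time.
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd element passes through unchanged
        level = nxt
        depth += 1
    return (level[0] if level else 0), depth


if __name__ == "__main__":
    data = list(range(1, 9))
    print(sequential_sum(data))
    print(tree_sum(data))
```

Both give the same total, but for eight inputs the loop has eight dependent steps while the tree is only three levels deep. Recognising that the additions can be regrouped this way is exactly the kind of problem-specific knowledge that sits more easily with the application programmer than with a general-purpose tool.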

So we seem to come around in a circle ( how annoying!! ) back to the inevitable detail that lies within each specific line of inquiry that uses calculation machines, in whatever form. What I reckon is worthy of consideration is the generation of 'soft-cores' or CPU designs ( in VHDL or similar language ) that may be tuned/optimised to a given task ( eg. Xilinx's MicroBlaze variants ). That gives control all the way through to instruction sets and encodings thereof etc, but most importantly the impression of a sequential machine*. What would remain thus is to compile C/C++/whatever to said soft-CPU's instruction set. That might be as 'simple' as writing a back-end port to, say, gcc.
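
As a rough picture of what 'a sequential machine with its own instruction set' means, here is a toy accumulator machine in Python. Everything in it is hypothetical: the opcodes (`LOAD`, `ADD`, `STORE`, `SUBI`, `JNZ`, `HALT`), the `run` function and the memory layout are invented for this sketch and have nothing to do with MicroBlaze or any real soft-core. A compiler back-end would be the piece that emits programs like `SUM_TO_N` from C/C++ source.

```python
def run(program, memory, max_steps=10_000):
    """Interpret a list of (opcode, argument) pairs on a one-register machine."""
    acc = 0  # single accumulator register
    pc = 0   # program counter
    for _ in range(max_steps):
        op, arg = program[pc]
        pc += 1
        if op == "LOAD":
            acc = memory[arg]
        elif op == "STORE":
            memory[arg] = acc
        elif op == "ADD":
            acc += memory[arg]
        elif op == "SUBI":
            acc -= arg  # subtract an immediate value
        elif op == "JNZ":
            if acc != 0:
                pc = arg  # jump when the accumulator is non-zero
        elif op == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode {op!r}")
    raise RuntimeError("step limit reached without HALT")


# memory[0] holds a counter n (assumed >= 1), memory[1] the running total.
SUM_TO_N = [
    ("LOAD", 1),   # acc = total
    ("ADD", 0),    # acc = total + counter
    ("STORE", 1),  # total = acc
    ("LOAD", 0),   # acc = counter
    ("SUBI", 1),   # acc = counter - 1
    ("STORE", 0),  # counter = acc
    ("JNZ", 0),    # loop while counter != 0
    ("HALT", 0),
]

if __name__ == "__main__":
    print(run(SUM_TO_N, [5, 0]))
```

With a soft-core the designer gets to choose which of these instructions exist at all, so a core aimed at one project's inner loop could drop whatever that loop never uses.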

One could have hybrids eg. like Parallella having a sequential ( Zynq/ARM ) machine passing on especially concurrent work to an Epiphany instance.

Given the above being true, that makes any efforts rather expensive in someone's time and other assets. What are the odds on that being available in the absence of an IP cost ??

In a real sense the Parallella project is a serious take on dropping at least the hardware entry cost to any open-source solutions. FWIW I certainly will make all of my efforts with Parallella in the free public domain.

[/My 2 cents, from minimal & spurious knowledge of VHDL]

Cheers, Mike.

* I could be cheeky and say that many calculation machines have always been concurrent, only interfaced to give us humans the impression of being sequential ... by way of example I seriously doubt if any hardware floating point multiplier would actually perform steps in the order in which I was taught to do at school. What is the point, say, of doing Karnaugh maps if not to transform from some linear-thinking exposed in the problem domain ( IF, AND, THEN etc ) into a much more efficient and likely concurrent evaluation to yield the same effective result? The black box hides all ..... :-)
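
A tiny worked case of that Karnaugh-map idea, as a hedged Python sketch: a three-input majority vote written first in step-by-step 'IF ... THEN' style, and then as the sum-of-products form that a Karnaugh map of the same truth table reduces to. The function names are made up for the example; the exhaustive check at the end confirms the two agree on all eight input combinations.

```python
from itertools import product


def majority_procedural(a: bool, b: bool, c: bool) -> bool:
    """Step-by-step version: count the true inputs, then compare."""
    count = 0
    if a:
        count += 1
    if b:
        count += 1
    if c:
        count += 1
    return count >= 2


def majority_minimized(a: bool, b: bool, c: bool) -> bool:
    """Minimized form: three AND terms feeding one OR, all evaluable at once."""
    return (a and b) or (a and c) or (b and c)


def forms_agree() -> bool:
    """Check both forms against every row of the truth table."""
    return all(
        majority_procedural(a, b, c) == majority_minimized(a, b, c)
        for a, b, c in product([False, True], repeat=3)
    )


if __name__ == "__main__":
    print(forms_agree())
```

In hardware the minimized form is three AND gates and an OR gate working side by side, with no notion of 'first count, then compare' left in it.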

( edit ) The bogeyman hiding under all the beds here is that concurrent circuits actually aren't. There are always small but finite transmission and gate delays within any real circuit. So as with all micro-electronic circuit design the leading three key concerns are timing, timing and timing.

( edit ) BTW : clocking in electronic circuits is a way of dividing time up into segments for which outputs ( of circuit subsets ) should be deemed as valid vs. those segments when such states ought be ignored ( being possibly in transition across the gap b/w logic levels ).
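
The timing point can be illustrated with a deliberately crude Python model of a ripple-carry adder. The assumptions are loud ones: every carry wire starts at 0, each stage's carry output takes exactly one delay unit to respond to its inputs, and the sum logic is treated as instantaneous. Real gates are not this tidy, and the names `majority` and `ripple_add_at_time` are invented for the sketch.

```python
def majority(a: int, b: int, c: int) -> int:
    """Carry-out of a full adder: 1 when at least two inputs are 1."""
    return (a & b) | (a & c) | (b & c)


def ripple_add_at_time(a: int, b: int, width: int, t: int) -> int:
    """Adder output as seen t delay units after the inputs are applied."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    carries = [0] * (width + 1)  # carries[i] feeds stage i; carries[0] is carry-in
    for _ in range(t):
        new = carries[:]
        for i in range(width):
            # Each stage only sees the carry its neighbour had one unit ago.
            new[i + 1] = majority(a_bits[i], b_bits[i], carries[i])
        carries = new
    result = carries[width] << width  # carry-out becomes the top bit
    for i in range(width):
        result |= (a_bits[i] ^ b_bits[i] ^ carries[i]) << i
    return result


if __name__ == "__main__":
    for t in range(5):
        print(t, ripple_add_at_time(0b1111, 0b0001, 4, t))
```

For 15 + 1 on four bits the output in this model reads 14, 12, 8 and 0 at successive delay units and only settles on 16 after four units, once the carry has rippled through every stage. A clock edge that samples the output any earlier latches garbage, which is why the clock period has to cover the slowest path.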

( edit ) Much of this would be moot if it weren't for current FPGAs being re-programmable as many times as you like. The original Programmable Logic Arrays ( PLA ) of yore were one-shot writes ( fusing and/or anti-fusing ) per unit and thus more heavily relied upon simulation prior to implementation, lest you throw away a lot of money.

( very late edit ) '... my efforts with Parallella in the free public domain ...' : bear in mind this would not be my income ( I have a day job ) and so I risk nought but my spare time. Hence let us not begrudge the IT professionals with such a sole income, with mortgages et al, the payment for their good efforts.
