Programmable Logic: Computing Bit by Bit

My name is John Hollar. I'm the CEO, and it's a pleasure to have you here today for another in our series of events (not just lunches, but events) that are designed to coincide with the 50th anniversary of the invention of the integrated circuit and the growth of the semiconductor. For those of you who showed up today knowing that you were going to hear about programmable logic: you are here because we have one of the best people in the field to talk about it, and we aren't afraid to get a little technical at lunch as we do that.

Steve Trimberger holds more than 150 patents in programmable logic devices. He has chaired numerous steering committees and blue-ribbon industry committees in the field. He's written three books on design automation and field-programmable gate array technology, the famous FPGA industry. But you probably know Steve because he's a Xilinx Fellow. He's led many, many breakthrough FPGA projects since joining Xilinx in 1988. He was a member of the technical architecture group for the XC4000 FPGA, a technical leader for many other segments of that whole field at Xilinx, and he managed the Xilinx Advanced Development group for many years.

Steve was really sought after to make the presentation that he's going to make today. After he does that, Steven Smith, who as I just found out once shared a cubicle, or was across a cubicle, I guess, from our speaker, is going to be up here to conduct the question-and-answer session. So we'll have Steve's presentation. You have question cards on your table; please fill those out. We'll be collecting those about a quarter to the hour. Steven will be coming up here for the conversation and the Q&A, and please just feel free to ask as many questions as you like. So it's now my pleasure to welcome our guest for this lunch, Steve Trimberger.

Thank you. Well, it's quite a pleasure to be here at the Computer History Museum, one of my favorite places to visit. Down in the Visible Storage room, you get to stand back and
listen to the comments, and not just the comments from the docents. Basically, the usual comment is, "Oh yeah, I've got one that lives in my garage," sort of the spouse's lament. But what I really like to hear is when people start talking about these machines they know. It's exciting to me because I started programming computers before the homogenization period. Back in the '70s we had interesting machines, and you hear comments like, "Oh yeah, that was a decimal, digit-serial machine," or, "Did you know this machine has 36-bit words and you can break them up into six-bit bytes? There's an 18-bit address field with an indirect bit, and if you jump indirect to yourself, that puts the microcode in an infinite loop." All this interesting stuff you really need to know about old hardware.

And then Moore's law came along. The reason for all these interesting old machines was not that they were silly, but that they were efficient. They were targeted to a specific thing. If you wanted to run payroll, maybe you wanted decimal digits. If you wanted to do ballistic trajectories, you had a specific word size for your arithmetic in mind. So these machines were very efficient. But once Moore's law kicked in, well, you could get a 2x performance improvement just by sitting on your chair for a couple of years and letting the process engineers turn you into a genius. Many people made a living at that. But now we're sort of in an era where we think maybe that's slowing down a bit. I'm going to talk a little bit about computers, and keep in mind these old machines, because I think they're interesting.

So, computing bit by bit. First I get to brag a little bit (I've got another pun). This year IEEE Spectrum magazine announced their "25 Microchips That Shook the World," and one of them was the Xilinx XC2064 FPGA, a field-programmable gate array. I'm going to be using the terms field-programmable gate array, FPGA, and programmable logic pretty much interchangeably, and
I saw a couple of people out here in the audience who would probably object to that. Sorry, guys, it's my talk.

So what is it about this machine? Why was it honored along with the 8088, Z80, ARM1, SPARC, and 6502? What is this thing? The field-programmable gate array was designed as a replacement for custom logic. Back in 1985, if you wanted some custom circuitry, you could contract it out or have it built yourself, go through the manufacturing cycle, and buy yourself a few thousand, or a few tens of thousands, or, if you wanted to be economical about it, a few million. But if you had low volume, maybe that was not a good idea. Or if you didn't really know how to design integrated circuits, maybe that wasn't a good idea. So to reduce the risk of all that, and save money and time, you could buy this previously built thing and customize it yourself.

This was not the first programmable logic device. What was special about it? It was special because it was programmed by writing into memory on the device, and you could overwrite that memory and change it. The marketing department at Xilinx decided this was an extremely bad thing. They didn't want to publicize this at all. The reason was that our competitors were talking about "volatile logic": where does your logic go when the power goes out? Now it's hiding on the circuit board, I don't know where it is. So this was a problem.

It really had two big advantages. One was in test: we could program it and reprogram it, and we knew that a device that we shipped would work. The other one was in manufacturing. In a new process technology, the first things you're able to do are wires and transistors, and if you can do that, you can do memory and logic, and you can make this device. If you had to wait until you had a fuse or an EPROM structure, that would take longer. So this allowed the company to target new process technology and take advantage of Moore's law earlier. It also allowed us to trade off manufacturing partners against
one another, which was also valuable early in the life of the company.

So where are these things really used? (Let me get this to work.) Well, Xilinx has over 20,000 customers, so programmable logic, this customized logic, is used nearly everywhere. I'm not going to go through the list of applications. How is it used? Well, it turns out that despite having this ability to reprogram easily, 90% of the devices only get programmed once. They're doing things like what we call glue logic: "I have a keyboard that's producing a hundred and eight different signals and I need something just to encode that," or "I've got a memory that's got a different voltage than my processor," or "I've got a display device that needs some buffering." A variety of problems like that, where there's no standard product and you need to build it.

But it also gives another advantage. You can upgrade this hardware product in a similar fashion to how you would upgrade a software product, and for the same reasons. So you could have rev 1.0 of your display driver, and then 1.1 and 1.2, and even if you're not updating them in the field, you could at least charge somebody money to buy the next one. So being able to program and reprogram wasn't essential to the technology or to using the technology, but it's very useful in the product life cycle. That's another aspect I'm not really going to be talking about.

What I really want to jump into is this: I got the go-ahead to say something technical, so let me talk a little bit technical. I'm not going to go real deep into this, and there won't be a quiz, so don't worry about that.

So how does this really work? It's not quite black magic, but it's probably more brute force than you imagine. How do we make logic that's programmed by writing into a memory? Easy: you build a lookup table out of the memory. I have this red lookup table of 64 words of one bit. OK, that's a funny-looking memory. Yes, we build
funny-looking little memories. But when I do that, I have a six-bit address, and I can address any one of those bits. If I think of my address as independent signals, then this lookup table is the truth table of the function. So I can have absolutely any function of those six inputs just by writing the right pattern into the lookup table.

OK, so what are some interesting patterns? Well, parity might be a good one. Or I could look at this and say: two of these inputs are the select lines, the other four are my data inputs, and I have a 4-to-1 multiplexer. I can build that in this function. Now, those of you who are thinking, "Gee, if I were to design a 4-to-1 multiplexer on my own at home, I could do this a lot cheaper than 64 memory cells and a big multiplexer," you're absolutely right. This is tremendously inefficient in its use of transistors. And guess what's been happening for the last 20 years? Transistors have gotten really, really cheap. So it may be inefficient in the use of transistors, but transistors are pretty cheap.

That result I can pull out and use immediately, or I can put it into a register, clock it, and use it next time. And something that turned out to be very handy: if I build a suitable exclusive-or in that lookup table, I can compute the sum, and if I build a dedicated carry logic, I can have one bit of arithmetic here. OK, one bit of arithmetic doesn't sound very exciting. Well, I can gang them together and make 8, 16, 32; I can make 2,000-bit arithmetic; I can make six-bit arithmetic. I can make buses of any width I want. If I need a bus of 2,000 bits, I can make that. If it's running into an adder of 2,000 bits, I can make that. Or I can build hundreds of smaller arithmetic units or logic units. So I can build a custom computer or, more interestingly, a custom execution unit.

OK, so how much can I really do? From those early beginnings, what
we have today is about half a million of those logic cells. A large device contains about nine megabytes of memory. It's not one block of memory; it's two thousand small memories, each independently addressable. With 2,000 memories running at 500 [MHz], I've got terabytes of bandwidth inside this device, in and out of those memories. Microprocessors? Oh yeah, there are a couple of microprocessors. These are drawn to scale, so if you throw on a microprocessor with its cache, it's really not that big on this device. It's 2 billion transistors, along with all the rest of the interfacing and so on. Now, this is the large, battleship device. We actually make a family of devices, and there are smaller ones, and basically what customers wind up doing is buying their logic by the yard: how much do you need? That's the size device I'll buy.

So let's talk a little bit about computing, and a specific computing problem that's faced by some people just down the road in Mountain View. The SETI Institute is building a radio telescope. With radio telescopes, you want to collect as many photons as you can. One way to do this is to build just a really large dish of metal, and that's been done. The problem is that as these dishes get bigger, the cost of the steel gets astronomical. The alternative is that instead of making one great big dish, you make a whole bunch of small dishes. If you make a whole bunch of small dishes, then you have these little problems: I've got to bring all the signals together and I've got to correlate them. Well, first I've got to offset for the speed of light between the dishes, then bring the signals together. The correlation problem, though, is basically Fourier transforms, and I've got an N-squared kind of problem here, because I've got to correlate every dish with every other one. So as you do smaller dishes and more of them, your steel costs go down but your electronics costs go up.

OK, so where is that point? Well, the Allen Telescope Array currently has about... well, it has thirty-two dishes working,
OK, forty-three dishes. And if you look at the correlation problem (because you can run out that calculation), they're doing about 4.7 tera multiply-accumulates per second to correlate their data in real time. So that's a few hundred thousand processors or something like that. It's currently being done in sixteen FPGAs. Now, they could have done it in four, but they saved a little money. But for the actual telescope they're building, the plan is to build three hundred and fifty dishes. This is three hundred tera-ops per second, estimated to take a couple of hundred FPGAs. If you were going to do this with more traditional microprocessor technology, this would be about a million dollars a month in electricity. So if you build a custom supercomputer, which is what they're doing, you can get by with many orders of magnitude improvement in performance and in power.

How do you do it? I've got a couple more slides on this. This introduces a little difference in viewpoint. How do I get a thousandfold increase from something that's running at maybe an order of magnitude slower clock rate? This is actually a simple example of a simple filtering application. In traditional sequential processing, I multiply and I add, and I multiply and I add, and if I'm doing 640 taps, I can do 640 cycles and get an answer. But if I can make more multipliers, if I have as many multipliers as I need and an adder that takes in as many inputs as I need (and there are ways around that too), I can get a result out every cycle. So I get a 600-fold increase here by going parallel. I get a 600-fold increase in the number of multipliers that I've got, and I've got to have a system that's going to support that.

That didn't look too much like computing, so let's look at something that does. Here's a bit of DES application code, and I've got this green array
of rectangles, and this is my logic array. In this application I've got a few operations. Let's ignore the outer loop for a moment. That's temp48: the expand operation is taking a 32-bit value and making a 48-bit value, so it's duplicating some bits and moving them around. Well, it's easy in this technology to move bits around. I just wire them to where they belong. There are no operations required. If I need a bit to go to two different places at the same time, I can wire it to two different places at the same time. No operations are required.

The blue part: these are eight different lookups in small lookup tables with small addresses. There's a lot of activity there, masking out bits and shifting bits. Again, this is just moving the bits around. This requires no logic. So in this application, the blue S-boxes have turned into the blue regions on the device. I can build those out of lookup tables. I've got the exclusive-or operations and the rotate, which actually does a select, and those fit into the six-input lookup tables that I had. So I'm doing all these operations as small bit operations. I've replicated. I have movement of data that just evaporates out of my logic, because I can just wire the bits to where I want, so I've built sort of a custom shuffle operation there. Basically what I'm doing is building a custom computer for this application.

And I don't need to stop here. I could unroll this loop. If I unroll this loop in space, I replicate it. This loop has a loop-carried dependency: each iteration requires the result of the previous one. So what do I do? I pipeline. I save that result, but I don't have to push it anywhere else. Remember those registers that I had on my lookup tables? I save it there. So now I'm flowing data through this machine. So: replicating, taking advantage of small word sizes and small byte sizes, and
deep pipelining. How deep can you go? I was looking at a presentation from Michael Flynn at Stanford recently, and in an application he had (he was doing oil and gas exploration, so thirty thousand sensors, and they're pinging about every 10 seconds), there were 468-deep pipelines. 468-deep-latency pipelines. It's only running at 250 megahertz, but it's a teraflop, a tera floating-point operations, in two FPGAs. So you can make arbitrary wiring, arbitrary busing, and build a custom supercomputer for your example.

One other thing we can do that is pretty clever is partial reconfiguration. So yes, we can reprogram the devices, but we can also reprogram part of a device at a time while the rest of it continues to work, and that operation can be managed internally. We weren't the first ones to think of this. The first reference in the literature I found to this partial reconfiguration, this dynamic reconfiguration, was by a researcher named James Blish, in his Cities in Flight novels from the '50s, where he had computers rearranging themselves to solve new problems. Of course, his computers rode on railroad tracks, rumbling around in the basement of the building, and you had to keep out of the way when they were doing that. We've made a little bit of progress since then.

So if you have these pieces, one interesting application is a software-defined radio. You have your cell phone, but maybe you would like to be able to do push-to-talk, receive GPS signals, use Wi-Fi when it's available. You would like this machine to even perhaps bridge these different networks, or serve as a router in case of emergencies. So this is an application where you would like to not have to carry around six different radios, and you would like to not have to carry around one device that has six different boards in it. And so just as you can reprogram the software stack to manage the different protocols, you can reprogram the programmable logic stack to manage the different
encoding and error-correction schemes and security schemes of these different radios.

So we can face this question: is it computing, or is it just emulating computing, the way some science fiction writers have said computers would emulate emotions or emulate true intelligence? Well, maybe we could apply the Turing test here and say: if it's something that we would ordinarily think of as computing, but now it's done in a programmable logic device, that's still computing. And yes, it's doing computing. We have done several applications that were done on more traditional computers, but where we needed a custom supercomputer to do them.

That brings up the question: what is a computer? One aspect of a computer is the processing element, and I've contrasted a microprocessor processing element with a spatial, programmable logic device processing element. Both can actually transform input data into output results. Another component of what a computer is, is the system component. I highlight this by saying, well, you know, your microprocessor chip can't download a web page. It can't run Word. Why not? Well, there are a few other things it needs. It actually needs an interface to the network, and it's going to need memory, and it's going to need a BIOS, and a bunch of other things. The system, the rest of the box, provides all that stuff. That's actually been an area where programmable logic, these spatial computers, have been weak. We've built these machines and we haven't built the appropriate environment around them. One of the successes of the PC industry today is the success of not just the glorious microprocessor but that whole system that allows you to use that processor effectively.

So there's interesting work going on, actually in the cubicle next to mine: taking a large FPGA, putting it on a board with a physical interface that matches a Xeon processor, sticking it in the server, and talking, well, FSB, but
now QPI, getting access to that whole system. So now one of the slots in your server is a custom supercomputer.

Probably the weakest point in the whole programmable logic development chain, or the spatial computing development chain, is programming. We are very comfortable in the computing industry with a programming model that's a sequential flow of control. That's not the programming model that we have in the programmable logic business. Remember our history: we come from hardware description. And so although our programming language still looks like characters (we have languages, VHDL and Verilog), these are not simple sequential programming languages. They leverage Boolean algebra, but they require explicit serialization and explicit synchronization. Much of that is hidden in today's programming languages for sequential computers, but that flexibility is required to get the performance out of spatial computers. We also have this problem with proximity and speed. Timing isn't guaranteed. So you write your program, you build your custom computer, and then you find out how fast it runs. Well, this doesn't agree well with a lot of people.

So what are we doing in programming languages? Not surprisingly, there's tremendous interest in C or C-like programming for spatial computers, or programmable logic devices. To date these languages haven't been particularly well received because they haven't given particularly good results, and in fact they haven't given particularly good results because they're being compared with a hardware-description mentality: how good is this compared to what I could do if I were sitting down and thinking about all the problems myself? But there still have been some interesting successes. It depends on how much of that problem you abstract away. Last year at the MEMOCODE conference there was a competition using a high-level language: can
you accelerate this problem, and by how much? And the problem was not one of these signal-processing, how-much-can-I-shove-through-the-custom-computer problems. The problem was: retrieve data records, sort them, and save them back. The wrinkle was that the data records were encrypted, so you had to retrieve the data records, decrypt, sort, re-encrypt, and save. The winning team, from MIT, used a language called Bluespec and achieved an 1,100-times improvement over a standard processor. And this was not designing it like hardware; this was designing it like software. Their software was able to take that C specification, or close-to-C specification, and insert the parallelism and the pipelining, those operations we talked about earlier.

So programmable logic is headed toward a more sequential type of specification, because of its ease of use and something to do with human psychology, I suppose. At the same time, we note that our sequential processors, our temporal processors, are looking to multi-core and many-core, and having to deal with the same sort of parallelization problem that the programmable logic, the spatial computers, have been looking at for years. So while spatial computing is inching its way toward temporal computing, temporal computing is inching its way toward spatial for the performance advantage. We're going to meet in the middle. I don't know where the middle is, but we will meet in the middle, because there's nowhere else to go.

I'd like to invite Steven Smith up here, and we'll take some questions.

So, Steve, I'd like to start off with the first question I was given. You talked a lot about bitwise computation, but what about word-oriented FPGAs? Do you see any future in that?

What about word-oriented FPGAs? What matters is what the structure of your problem looks like. If you start with a word-oriented kind of model, then yes, word-oriented FPGAs make a lot of sense. But a lot of the applications that
are seen have worked especially with very small bit sizes, and some of the high performance has come from exploiting those very small sizes. So that's been pretty successful. There have been commercial ventures for word-wise devices. They might not call them FPGAs; maybe they're just avoiding the terminology for whatever reason. No commercial success yet. Part of that may be, as I mentioned, that most programmable logic use doesn't exploit the computing aspect of this. It's not a large segment of the business. So basically what's happening is that all those people buying their programmable logic devices as ASIC replacements, buying them for just doing custom logic, are subsidizing the computing field right now. And so there may not be sufficient business until the software matures.

OK, great answer, thank you. So this is a great question, actually: can you tell us about the next great leap in the field that you're most excited about?

No. I guess not. OK, actually, the next great leap in this field has got to be in the area of design technology, design methodology. We've made great strides in exploiting Moore's law to produce just bigger devices and faster devices. Look at that slide I had on what a modern FPGA looks like. I said, wow, I've got a couple of PowerPCs, I've got memories, I've got Ethernet MACs. Designing this thing is not just sitting down and figuring out what connections I want or what logic I want. There are system design issues that we really don't have a good handle on. There are a few people in the world who seem to do this very, very well, but there aren't twenty thousand of them. We'd like to have at least twenty thousand of them and make them our customers.

OK. You mentioned Moore's law a few times. So when do you expect it to end? And as we
approach the end, if there is one, what are FPGAs doing about it?

So when is the end of Moore's law? I need to go on record for this one. So far everyone has been wrong, so I might as well be wrong too: I expect it to last until after I'm retired, and those may not be independent events.

Now, what are FPGAs doing about it? Interesting. FPGAs have leveraged Moore's law pretty heavily. We compete against custom logic, but as I pointed out, we take about 30 or 50 times as much logic to implement something as you would if you actually built it directly. So if I want one gate of logic, it takes a programmable device about 30 gates to do that. Well, that's a big deficit. Programmability has been a big advantage. But the other thing we use is process technology. We're very aggressive on process technology. Not quite as aggressive as Intel, they're kind of in a class by themselves, but compared to our competition for custom devices, we generally stay a generation and maybe two ahead. That doesn't make up a 30x deficit, but it does mitigate it. And so should Moore's law come to an end, this is a problem for us, because the gap we have from being aggressive now starts to close.

On the other hand, how Moore's law would end is an interesting question. Does it end by, well, one generation you stop getting performance improvement, the next generation you stop getting power improvement, and the next generation it stops? If it ends in dribbles like that, well, maybe programmable logic has a lot to offer in that case. If it ends because we just have difficulty manufacturing something and it comes out highly defective, well, programmable logic by its nature has a lot of redundancy in it. I put an adder in one block; I could put it in another one, no big deal. So you could avoid the defects in the
parts while you build your custom supercomputer. Doing this mapping to non-defective regions of your device has been done before. HP's Teramac did this over a decade ago, very effectively. You can use parts that have a lot of defects in them and never notice it. Sort of like hard drives, right? Hard drives have a lot of defects in them, but you never notice.

OK, thank you. This next one is an excellent question: why are the FPGA bit file formats proprietary, and why not publish them like the instruction sets of general-purpose processors?

Did you set me up, or did someone else? Thank you, whoever did that. I love talking about this topic, and you may have to stop me when I run out of time. So why are they proprietary? They're proprietary because it's work to make them non-proprietary. If we were to publish the format, we would have to tell everybody what it does. And honestly, yes, we do know what everything does, but to put that in a form that everyone else could use means a documentation effort. OK, so why is it that we don't think that documentation effort is worthwhile? Actually, I should take this back: we did this once. We did this with the 6200. We published that format, and that was a particularly easy one, with a very simple format.

So why is it not important? What would someone do with it? Well, someone could write their own software, I suppose. That's wonderful, and we have people who do that. The bit file is that last bit of translation, from "I know what I've got, I know where I put it" to precisely how you encode it. It's going from assembly language to machine code. So it's not a big task. We have tools that do that, and frankly, the tools that we have to do that are pretty darn cheap. So there's not a lot of perceived value in
publishing that, putting everybody in that business. We also retain the ability to change by keeping it proprietary. We haven't done it often, but occasionally the bit file has changed, and of course nobody notices because they never look at that.

OK, thanks. Getting back to your SETI application: is the SETI data streamed in real time, or is it stored and computed offline?

It's streamed in real time. There are plans to store it. Actually, there are several very interesting plans going on, and you should really talk to those guys. One plan is to actually retain a buffer, maybe several minutes' worth, up to an hour's worth of data, so if you see something interesting you can go back in time and look at it again. That's pretty clever. There's also a plan to just save snapshots of the data. But there's so much data that it's hard to save it and do anything sensible with it. They have this real problem: there's a huge amount of data, and it's on the top of a mountain, so even if you could get all the computers up there, you couldn't get the electricity up there to do it. If you were going to store this in a big storage server and then put it on the back of a truck and truck it to San Diego, which has been proposed, then you'd actually need a truck about every two or three miles down I-5 if you were going to take all the data.

But astronomers are very comfortable with the fact that they throw data away. I still have a hard time with that, but for them it's no problem. Take optical telescopes: well, we can only use them when the sun's down, so that's half the time, and when the moon's not up, and that's half the time, and then when I'm not drinking coffee, and that's half the time. Sorry. So they're very comfortable with the fact that not all the data gets observed or gets reduced. And I'm sure you hear every once in a while
that someone saw an asteroid that might have hit us three weeks ago; they discovered it two months ago but didn't look at the data until yesterday. So they're very comfortable with that, and one of these proposals is: let's go ahead, let's throw away ninety percent, ninety-eight percent of our data, and just put the rest on the truck and reduce it as time becomes available.

Okay. As we move to new application spaces, mobility is dominating a lot of people's thoughts today. Do you think we'll see an FPGA in a cell phone anytime soon?

An FPGA in a cell phone? I sure hope so. Actually, we've had programmable logic devices in cell phones for quite some time, not the large FPGAs but smaller CPLDs, and we have had FPGA wins in the high-end data assistants. So really the answer to that question is yes, and the reason is that you have to think about the dynamics of that business. They come out with cell phones pretty darn rapidly. Yes, they're extremely cost sensitive, but on the high end those extra little features are pretty darn valuable. So I would expect, yes, this is a natural extension of the temporal data processing front end: to have an FPGA do the spatial data processing on the back end, especially for these software-defined radio types of applications. I expect to see that become rather common.

Well, sticking with applications and directions for the marketplace, what about FPGAs and robotics? Do you think the real-time aspects and reprogrammability of FPGAs are particularly helpful for robotics?

I really do, and there are several aspects of robotics where FPGAs are particularly useful. First is control functions. I mentioned in passing where we get used: we get these bizarre control functions or unique things. Putting a little bit more intelligence into a limb, for example: our bodies do that,
and it would make sense to do that in a robotic system as well. There's also a case where you want this supercomputing. You want a custom supercomputer because you're heavily constrained, and really you don't need a custom supercomputer unless you are constrained: unless you have an amount of data so large that you're just not going to get through it in time, or it consumes too much power, or you're space or weight limited. All these things happen in robotics. If you want a robot to be autonomous, you're constrained on power, you're constrained on space, and you have this problem that you have to do image processing or complex control processing, and do that in real time. You don't get to truck that to San Diego and wait for it; your robot's got to stand up or it's going to fall down. And there are similar applications, for example automotive control. As a society, we would like to have automobiles with control systems so they don't run into one another no matter who's driving, or who's hands-off driving. I would like to have the most powerful supercomputer under the hood of my car, and everybody else's car, when that's going on. So absolutely, it's a beautiful application of spatial computing. Whether it's going to use programmable logic devices as we see them today, or it's going to take maybe a word-oriented device or something else, remains to be seen, but spatial computing is a fundamental component of that. It's got to happen.

Okay. In your presentation you talked about the idea of dynamically reconfiguring an FPGA, which has been something fairly advanced for the industry. Why hasn't a killer app appeared so far, or has it appeared, for dynamic reconfiguration? And what are the difficulties of making one?

Oh boy. So why don't we have a killer application for dynamic reconfiguration? I can't think of one.

Is it a difficulty in the application design?

There we do have
several problems. We don't have good simulation capabilities for something that's going to change its programming at that level, and that makes it difficult. But what we tend to find is that the overhead for reprogramming these devices can be large. In a large FPGA, how big is the program? It could be about 10 megabytes of program to load in to do something. Now, in a partial reconfiguration maybe you're going to get by with a couple of megabytes or maybe a little bit less. So it takes a lot of bits to change the programming, and so it takes a lot of time to change the programming. That puts you in a situation where, if you want an application that's going to reconfigure, and you're going to avoid spending all your time reconfiguring and spend most of your time doing the processing, then you have to minimize that somehow. That leads us to small reconfigurable regions, which is interesting because you can do that fast. But it also brings back this question, and I come back to it again, about word-wise devices. We're programming all of our devices at the bit level. If the programmable cells were smaller, or we could program word-wise, or somehow compress out the redundancy in that, then perhaps we could reduce that configuration overhead and make some of these rapid reconfiguration applications more interesting. I should say that there have been devices that have addressed that by having a second copy of the configuration memory, so you can program one copy while you're running out of the other. But if the configuration memory makes up about 30% of your die size, which is pretty reasonable, you're adding 30% to the cost of this, or more than that for the large devices that most people don't ever see. Gee, instead of a 30% adder for a second copy of the memory, maybe I should just make a device that's 30% bigger
and have that other 30% be my reprogramming region. So these are all questions that go into the system design around what that device would look like.

Okay, thanks. I have a question about configuration, because it's particularly important, obviously, for FPGAs. Most FPGAs get configured when power is applied to a system, and the data comes from an external memory or perhaps a microprocessor. But if you want to get into spatial computing, you can imagine having hundreds of bit patterns ready to configure the same FPGA, and configuring it over time. If those bit patterns come from a network, are you concerned that those patterns could contain viruses? Today a virus is data-destructive; it doesn't harm a computer. But you could be physically destructive by introducing viruses into configuration patterns. How would you handle that?

So yes, it's true. I guess you would have to know a lot about that application to know specifically how to insert a virus. We have a similar problem in terms of data security, or theft of the data. What we currently have in our large devices is that we can encrypt the bitstream, so when you're sending the bitstream, even over the network, it can be sent encrypted. Now, somebody could tamper with that and you get something that's gibberish. We have checks to make sure that you don't actually try to execute something that's gibberish. So we do have control over that: we could prevent an intentional virus with the same mechanism we use to prevent theft.

Okay, staying with configuration...

Yeah, it seems all your buddies are here.

Staying with configuration: as we get to smaller geometries, the potential for single event upsets is always there, and as you mentioned we have hundreds of megabits of configuration space. How can you handle radiation hardness for FPGAs in space applications?

So how do we handle radiation hardness? We actually have devices that
are built with a hardened process, and we do specialized testing. There are a couple of other things that we do to address this problem. One is what you might think of as brute force: it's relatively straightforward to triplicate, what's called triple modular redundancy. We can certainly do that, and we have a tool that will take an application, triplicate it, and drop it onto the device. So now if you're worried about a single event upset, it's only going to hit one of your triplicated designs. That's something you can do since you have this spatial aspect to your computing: you can just make three copies. Something else: in particular, there's been a lot of concern about hits in the configuration memory. It's something that's not always obvious: that memory is controlling the operation of the device. It's always being read, right? You can't do ECC on it, because every bit must be correct. If one of those bits changes, a connection that you thought you had is no longer there, or your lookup table now has the wrong value in it, so your logic changed. Okay, this should scare you. What we do about that is we have a scrubbing function that we can run, so the device is continuously reading and correcting errors in its own configuration memory. It can make a pass through the configuration memory every few milliseconds, correct any error that might have occurred, and flag the fact that there was an error. Now you get back to how the system handles high-reliability applications: you may need to roll back your system to the last time you knew it was correct, with transaction-oriented operations like that. So there are a couple of things we can do about very high reliability.

Okay. So we talked about computing where the CPU is central to the computer, or the FPGA is central. What about peer processing, where an FPGA and a CPU compute together and one isn't subservient necessarily to the
other? What kind of FPGA technologies are available today to allow an FPGA and a microprocessor to work together in harmony, where one is not slave to the other?

So I'm going to say that is the step I mentioned at one point in the talk, where we're putting an FPGA in a Xeon socket, and that thing is talking over QPI, so it's actually dealing with the same memory hierarchy the processors are. So now these things can communicate just the way you would communicate with another processor, back through the hierarchy. So that's good. This is one of those system questions about how things are put together, and I think leveraging an existing system that has had a lot of effort go into it is really the right way to be doing that.

Okay. Well, staying on that theme, how successful has it been to put PowerPC architectures embedded inside FPGAs?

How successful? Judging by how well they're used, I think that has been rather successful. In those devices we have a very high percentage of the processors being used. It's kind of a funny question: gee, there's a microprocessor in there, do you use it or not? It turns out that it's there whether you use it or not, so you might as well use it. So we get quite a bit of use out of that. Some operations, for example bitstream scrubbing, can use that processor. We actually find there's tremendous value. I talked about spatial computing and all the wonders of that; spatial computing kind of breaks down with local control-oriented operations, and putting the processor in there to do that is actually a pretty good deal. So I would say how successful? Sort of medium-high successful.

Okay. There are so many questions, this is excellent; we could be here for another half hour. We have about another two minutes, I believe. Yes, well, here's a quick one: what's the most common product that
contains an FPGA?

Most common product? One of the most common products was probably HDTVs; the high-end HDTVs have FPGAs in them. But probably the one you've run into most often: we sell a lot of devices into data communications and telecommunications, so, you know, Cisco was a big customer and we like them a lot. I tell people: when you make a cell phone call, your voice went through our chips; when you download a web page, the page came through our chips. In these applications you have a huge amount of data and a small amount of processing, exactly the kind of structure I was talking about. We've set up something that's going to strip out the header, determine where this goes, and direct the packet to where it belongs. These are very, very good applications, and that's where most of the devices have gone.

Okay. We've got many, many technical questions, and unfortunately we're running out of time, but I'll just ask this one very last question, which is simply: how do I get started? What's the minimum development kit I need?

The minimum development kit is actually free off the web. Go to the Xilinx website and download it from there. For a getting-started board, there's a $59 board that's sold by Digilent and is actually very popular in academia, and there's a huge amount of courseware and example applications and stuff built around this board. In some ways perhaps too successful, because for the first round of professors it was great: they could give all these assignments and students did them. In the second round, the students went on the web to look to see if the assignment had already been done and posted. And now they're just too clever: they're downloading those solutions through our chips, I guess.

Well, I'd like to thank everyone for the questions. We've got almost a centimeter-thick stack of questions here, and I want to remind everyone Steve will be available to
take some questions shortly after the end of the session. So once again, thanks to everyone for the questions, and to Steve Trimberger. Steve, thank you very much.

I just wanted to be sure to say thank you to Steve, who's a member of the semiconductor special interest group and produced today's session, and of course, Steve, to you: thanks very much, it's just a real pleasure to have you here. As Steve mentioned, Steve's going to hang around and answer some of your questions. We have to let him go so you can see the Babbage engine get cranked at two o'clock, if you will stick around for that. And then we have a gift for the two of you: it's our special Core Memory book, which has some of the best photographs of some of the best items in the collection. So thanks, Frank. All right, guys, thanks, thanks everyone, have a good day.
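
Editorial illustration (not part of the talk): the answer on radiation hardness describes two ideas, triplicating a design so that a single event upset corrupts only one copy, and a scrubbing pass that reads the configuration memory, corrects errors, and flags that one occurred. The minimal Python sketch below shows both in software terms. The function names are hypothetical, and comparing frames against a stored known-good copy is an assumption made for simplicity; the talk does not say how the device detects a bad bit.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority over three redundant copies of a value.

    If an upset flips bits in only one copy, the other two outvote it.
    """
    return (a & b) | (a & c) | (b & c)


def scrub(frames: list, golden: list) -> list:
    """One scrubbing pass over a toy configuration memory.

    ASSUMPTION: errors are found by comparing each frame with a known-good
    ("golden") copy. Mismatched frames are rewritten in place, and the
    indices of corrected frames are returned so the caller can flag them.
    """
    corrected = []
    for index, (frame, reference) in enumerate(zip(frames, golden)):
        if frame != reference:
            frames[index] = reference  # repair the upset frame
            corrected.append(index)    # report that an error occurred
    return corrected


if __name__ == "__main__":
    # One of three copies takes a hit; the vote still yields the right value.
    print(bin(majority_vote(0b1010, 0b1010, 0b0010)))

    golden = [0x11, 0x22, 0x33]
    frames = list(golden)
    frames[1] ^= 0b100                 # simulate a single event upset
    print(scrub(frames, golden), frames == golden)
```

In a real system the returned list is what the application would act on, for example by rolling back to the last known-correct state as the speaker suggests.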
Info
Channel: Computer History Museum
Views: 11,588
Rating: 4.807229 out of 5
Keywords: Computer, History, Museum, PLD, Programmable, Logic, Device, Xilinx, Chips, Semiconductors
Id: b2HjhaNnCIg
Length: 55min 50sec (3350 seconds)
Published: Wed Jul 29 2009