Jim Keller: Moore’s Law is Not Dead

Reddit Comments

Jim Keller isn't using his "Moore's law is not dead" talk as a marketing ploy. He's speaking to a group of future engineers.

👍 12 · u/Remesar · Sep 22 2019

Nanowires!

We had planar transistors, we went to FinFET, we’re all building nanowires in the fab. Intel, TSMC, Samsung, everybody’s working on it.

👍 6 · u/bizude · Sep 22 2019

If Moore’s law is not dead or is not close to being dead, why is Intel investing so much in creating next generation MESO transistors? They keep publishing papers, filing patents.

I don’t understand this event Intel held and this constant hype around Moore’s law they started this year. Fine, we get it, Intel still thinks they can extract a lot from silicon. What’s the point of having this debate around Moore’s law? What do they hope to achieve?

👍 14 · u/dudewithbatman · Sep 22 2019

moore's law is not dead, we simply cannot afford it

👍 5 · u/HardcoreGamesTM · Sep 22 2019

The layman might be convinced but the reality is extremely obvious:

2005 - Athlon64 90nm

2008 - Wolfdale 45nm

2011 - SandyBridge 32nm

2012 - IvyBridge 22nm. <- End of Moores Law.

2015 - Skylake 14nm

2017 - Kabylake 14nm

2018 - CoffeeLake 14nm

2019 - Whiskeylake 14nm

2020 - Cometlake 14nm

2021 - ??? 14nm

👍 3 · u/InfiniteIsolation · Sep 22 2019

It's over.

👍 1 · u/ilostmyoldaccount · Sep 22 2019
Captions
[Eric Paulos] All right, are we ready? Let's go. Welcome, everyone, to the EECS Colloquium, and thank you for being here. My name is Eric Paulos. A very brief pointer to next week: we'll have Rodney Brooks here as a speaker. But I want to give our speaker his time — and you heard it here: Moore's law is not dead. Today we have a really special introduction by our very own Dean, Tsu-Jae King Liu.

[Dean Tsu-Jae King Liu] Thanks very much, Eric. Good afternoon, everyone. I'm Tsu-Jae, Dean of the College of Engineering, but I'm also a professor of Electrical Engineering and Computer Sciences, and it's my pleasure to introduce Jim Keller. He's a Senior Vice President and general manager of the Silicon Engineering Group at Intel Corporation. You might know that Moore's law refers to Gordon Moore, who is an alumnus of Berkeley, and Moore's law has truly set the pace for exponential advancement in computing performance over the last 50-plus years. There has been debate for many years about Moore's law being dead, and Jim will call out some people who have said that on the record. In any case, Jim is really well qualified. Intel Corporation was co-founded by Gordon Moore, so we would hope that Intel continues to be the company that ensures Moore's law is not dead. Jim has an impressive background: before coming to Intel he had more than 20 years of experience designing microchips and computer processing units, not only with Intel's x86 architecture but also the ARM architecture, which was derived from the reduced-instruction-set computing architecture developed here at Berkeley. He led the development of these chips for many applications — not only servers but PCs and mobile devices. Before Intel he was at Tesla, where he helped design their chip for automated driving. Before that he was at AMD — you might have heard of AMD's recent successes with the Zen CPU architecture; Jim led that chip development effort. Before that he was at Apple, which he joined through a company acquisition, where he led the design of the A4 processor that powered the iPhone 4, and the subsequent A5 processor. Really impressive experience and a great perspective to share with us today. He got his bachelor's degree in electrical engineering a few years ago from Pennsylvania State University. So without further ado, please join me in welcoming Jim to Berkeley.

[Jim Keller] Great. Well, first: it's almost 40 years — it'll be 40 years in computer design next summer — which has started to make me feel old. I'm an architect. I have a title that says SVP, and I really wanted that because of the big S; I was going to put it on my shirt. My staff knows better: I'm not really a manager, but I have a really good staff, so we run a big organization — silicon engineering at Intel is something like ten thousand people. It's a wild phenomenon and I'm delighted to be there; it's super fun. But when I joined, everybody was saying Moore's law is dead, and since Intel is the Moore's law company, I thought, well, that's kind of a bad career move — what am I doing there? Then again, people have been telling me Moore's law would be dead in ten to fifteen years for my entire career, and about ten years ago I stopped worrying about it, because despite its imminent demise it kept proceeding. So at some level I don't care that much.
But then I had this funny problem. Let me walk you through a little two-by-two matrix — everybody knows how they work: two axes, Moore's law is not dead or it's dead, and you believe in Moore's law or you don't. I'll walk through the matrix a bit and then tell you why I think it matters. If it's not dead and you believe in it, it's challenging, because you get twice as many transistors every couple of years and your designs get bigger and harder, but you're ready. If it's dead and you believe it's not, you're a little delusional — and many people think that's how the world works. If it is dead and you don't believe in it, it's going to be sad, because a high-tech computer company where things aren't really changing becomes a race to the bottom. And here's my problem: if it's alive and you think it's dead, your design teams, your methodologies, your architects aren't getting ready for the next wave of transistors. That's a tough situation, and at Intel we had groups where the design team literally doubled when we got twice as many transistors, because they had twice as much to do and the tools didn't scale.

I don't want to be pedantic about the details of Moore's law, because my fundamental interest is the scaling. The endless growth in transistor count has enabled a whole bunch of changes. People say computer design is X, but it has changed so much over the 40 years I've been doing it, and it continues to change — we're reinventing CAD tools. Not only is the transistor roadmap continuing, but adjacent things — memory, packaging, many others — are too, and that's causing us to rethink architectures and how we do things.

One interesting thing — again, I'm intrigued by the phenomenon of how things scale. Glance down the list: transistors per core, up a thousand x; frequency, up a thousand x (I'll get back to some details later); wafer size hasn't changed that much; transistors on a chip are way up; cost per transistor is down something like ten million x; operating voltages are down five to ten x, depending on where you started. Things have scaled very irregularly — it depends on which piece of the technology you look at — but the net has been very strong scaling.

Everybody has seen the "Moore's law is dead, now what?" headlines, but I like the other one better: "Everything that can be invented has been invented" — 1899. There's a deep intellectual framework for how things scale and move along, it's easy to be attached to a set of technologies and think that's the limit, and the declaration of the end has happened multiple times now. When I worked at Apple, two vendors came to talk to us about CPU performance. One said performance per clock had basically plateaued and the future was parallel programming and accelerators. Intel showed up and said they were going to deliver 5 to 10 percent a year, compounding, which sounds a lot better than plateauing. Our internal plan was to do a new core that was twice as fast, and then another one after that that was twice as fast — and if you look at the Apple roadmap, that's what happened. The expectation and the mindset set your direction and your possibilities, and that matters.
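A quick aside on the arithmetic behind those three pitches — plateau, 5-10 percent compounding, and twice as fast per generation. The sketch below is my own illustration (the two-year cadence is an assumption, not a number from the talk), just to show how far apart the curves end up after a decade:

    # Illustrative compounding arithmetic: a 5-10% yearly improvement versus
    # "twice as fast every generation" (assuming a hypothetical 2-year cadence).

    def compound(rate_per_year: float, years: int) -> float:
        """Total speedup multiplier from compounding a fixed yearly rate."""
        return (1.0 + rate_per_year) ** years

    years = 10
    print(f"5% per year for {years} years:  {compound(0.05, years):.1f}x")  # ~1.6x
    print(f"10% per year for {years} years: {compound(0.10, years):.1f}x")  # ~2.6x
    print(f"2x every 2 years for {years} years: {2 ** (years // 2)}x")      # 32x

Even the optimistic end of the compounding pitch lands an order of magnitude below the doubling cadence, which is the gap the Apple roadmap story is pointing at.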
Here are some famous skeptics. My understanding is that they're both very smart people — I've talked to Dave and I can validate that. Jensen is kind of interesting; I never quite got it, because his whole business is bigger, faster GPUs, so why is he running around saying Moore's law is dead? Somebody told me that one year they announced 12 nanometer when AMD was going to announce 7, and he wanted everybody to think the technology didn't matter. I don't know if that's true, but it's a fine story. Either way, it's puzzling. And there's a funnier version of this — there's a bit of a cultural idea wrapped up in it.

Again, my problem is: if I'm getting twice as many transistors — and sometimes we mean transistor count, sometimes transistor count times frequency; a whole bunch of things drive it — what are we, as computer architects, designers, tool makers, and fabricators, going to do? So we've been digging into understanding this, and it's interesting how many different things popped up.

Everybody is familiar with the diminishing-return curve: you have a technology, you make an invention, at first it improves quickly, and then you reach its limits. Rotary phones were super exciting, and at some point you had the best rotary phone you could build; then push buttons came along and that was super exciting, and the buttons got better and better; and then the touch screen. It's a cascade of these things, and if you look for S-curves in the computing stack, they're everywhere. We wrote assembly and hit the limits of that, then we wrote C. I worked at Digital Equipment when the VMS operating system was being written in assembly, and as everybody moved to C, they thought, well, we'll just make a better assembly — they called it BLISS, which is essentially assembly with a compiler on it. And we keep moving: processor architecture, single thread — there are plateau reasons there — so we went to multithreading, to GPUs, and now we're building computers that know how to plow through petabytes of data very effectively. We keep changing the game as we go.

So what would Gordon say? I really like this — the first time I read it I thought it was really funny: with the number of components per circuit rising, by 1975 we'll get as many as 65,000 components on a chip. And I thought, he's only off by a factor of a million, because we're at about 65 billion transistors on a chip now — we measure density in hundreds of millions of transistors per square millimeter. But of course he was the founder, and he did say it back then. It's a pretty simple rule, which is fun. Moore's law famously drove what's called Bell's law: mainframe, mini, workstation, PC, laptop — a faster, smaller version of the same computer every ten years or so as we shrank it down. There are all kinds of incredible things in this history; IBM built an out-of-order machine essentially out of tubes and discrete transistors years ago. That particular transformation kind of hit its limit at smartphones, and I'll return to that point at the end.

There are a lot of arguments about Moore's law. One is the limit of optical scaling; another is the limit of cost.
What we keep finding is that as we drive into mass production, we always solve for cost, we solve for power, we solve for whatever limiter is in the way.

And then Kurzweil — I really like this graph. He went and plotted it, and I don't know if it's exactly right, but it's super fun: computing technology has been on a log-linear curve since the 1800s. It didn't just start; there have been many transitions in what you build a computer out of, but the whole thing has continued to generate exponential growth in computing power. It's just a really fun slide. So on the one hand, individual technologies tend to sit on diminishing-return curves, but total computing power is on an accelerating-return curve: you build faster computers, you get more technology ideas, you have better tools to analyze things, you get compound growth, and it keeps going.

So what's going on under the sheets? Many people have observed this: accelerating-return curves are typically made out of a cascade of diminishing-return curves, often of very different kinds — rotary phone, keypad phone, touch phone. What drives the next wave of innovation is often a very different thing. And here's the human-nature problem: when you understand something really well, you're hyper-aware of what's going into the innovation curve you're standing on, and it's really easy to see it as plateauing — but, as we've seen over and over, the next innovation curve moves us along. Does that make sense?
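That "cascade of S-curves adds up to an exponential" claim can be sketched with a toy model. This is my own illustration, not data from the talk: each hypothetical technology wave is a logistic curve that saturates, but each successor saturates ten times higher, so the best-available capability keeps climbing even as every individual curve flattens out.

    # Toy model: individual technologies saturate (logistic S-curves), but each
    # new wave has a ~10x higher ceiling, so the envelope keeps growing.
    import math

    def logistic(t: float, ceiling: float, midpoint: float, steepness: float = 1.0) -> float:
        """Classic S-curve: slow start, fast middle, saturation at `ceiling`."""
        return ceiling / (1.0 + math.exp(-steepness * (t - midpoint)))

    # Three hypothetical waves: (ceiling, midpoint in years), each 10x the last.
    waves = [(1e2, 10.0), (1e3, 20.0), (1e4, 30.0)]

    for year in range(0, 41, 5):
        best_available = max(logistic(year, c, m) for c, m in waves)
        print(f"year {year:2d}: best available capability ~ {best_available:8.1f}")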
One of Intel's own scientists, Paul Packan, got quoted — this one is pretty funny. He was interviewed by the Wall Street Journal and said we're not really sure we can drive the technology much further. The next day there was a press conference with the late Andy Grove, then CEO of Intel, and somebody said, "So, Intel's engineers think Moore's law is dead." He said, "Oh, I talked to Gordon this morning and he's doing fine," and then he turned around and said, "Who the hell is Paul Packan?" Interestingly enough, Intel is a very tolerant company: he's still there, driving our seven-nanometer transistor definition along with Ruth — appropriately named — Ruth Brain, who is super smart. The factors that go into driving a transistor shrink are complicated; there are many, many technologies involved, and it's easy to point a finger at one of them and say "I see the limits," but the details of how we drive this forward and the trade-offs we make are really quite interesting.

When I first started hearing about Moore's law, it was about optical scaling: we had 2D transistors and we slowly made them smaller. Since then we've changed materials — low-k dielectrics — and we've changed the transistor architecture; the things underneath keep changing. Does everybody remember when the big problem in semiconductors was planarization? Looking around, maybe nobody does. You put one metal layer down, then another one had to go over it, and the next one over that, and the bumps got higher and higher; all the technology development was about keeping the metal from thinning as the stack grew. Then somebody figured out how to grow oxide over it and sand it flat — they called it planarized metal, but it was sandpaper — and that was transformative. Now when you look at a metal stack you see these beautiful flat metal layers with vias. There is so much technology in that stack; there are different metal technologies; the sandpaper got better — literally, it's very smooth. Super fun. So we keep working through the limits of this stuff.

I worked at Tesla for a couple of years and got to know Elon a little bit, and one of the things Elon is fantastic at is deciding what he really wants to build. He used to say: first you figure out the configuration of atoms you want, then you figure out how to put them there. He meant that in terms of rocket ships, car factories, lithium batteries — and he's right. He started building electric cars confident in the model that, if you look at the fundamental physics of lithium-ion batteries, there's a cost trajectory that gets you there. Everybody said lithium is expensive. No, it's not — it's a common metal in the earth. It was expensive because the lithium mines had been sized for lithium grease for industrial applications. It was a capital problem: it costs a billion dollars to build a lithium mine, but once you build one, lithium is relatively cheap; it's very low-technology extraction. So the summary I give engineers now is: don't let the hows constrain the whats. When you want to build something, sometimes iterating on the current set of hows is important, and sometimes you need something completely different.

I want to say something about power scaling, because I often hear that power is the limit. Back in the late '90s I was at Digital working on Alpha computers, and we were building a supercomputer with Cray. Their line was a megawatt per teraflop, and they said we really needed to improve that by 10 to 100x while we were working on it — and at a megawatt, getting 100x out of that is a hard number. We just crossed a watt per teraflop: a million-x improvement in power in about 20 years. I think it's almost ironic that as power scaling slowed down, we started to hit the thermal limits of power flux density through a square centimeter of silicon just as we figured out how to really stack silicon. That's one of the ironies. But we're using stacking for a whole bunch of things — memory today — and we're starting to look at logic architectures where we stack, and even where the peak power flux through the vertical is too high, there's a whole bag of tricks to manage that power. So I'm personally confident we're going to keep moving the power wall; a number of the changes are already in hand, and we're not done yet.

Now let's talk about current thinking on where we are. We had planar transistors, we went to FinFET, and now we're all building nanowires in the fab — Intel, TSMC (they've announced it), Samsung — everybody's working on it. And here's a really interesting thing: while the world thinks Moore's law is dead, the fabs and the technologists think it's not, and everybody has now announced roughly a ten-year roadmap. If you look at it from the left side to the right side, it's about a 5,000x shrink — if we had drawn it to scale you wouldn't be able to see the right-hand side, so we cheated a little.

So what's driving this? Here's a fun slide: the wavelengths of light used in lithography. When I first heard about this, the worry was that the features had gotten smaller than the wavelength and the light starts interfering. But if you look at the wavelength steps, we went from 436 to 193 nanometers while the feature scaling stayed exponential, which is pretty wild. And then EUV shows up, and that really resets the wavelength, so we get back to direct printing.
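The wavelength-versus-feature-size relationship this part of the talk is circling is usually written as the Rayleigh criterion, half-pitch ≈ k1·λ/NA. The values below are typical published ballpark numbers, not figures from the talk: immersion ArF with k1 pushed near its floor, versus EUV resetting the problem.

    # Rayleigh criterion for lithographic resolution: half_pitch ~= k1 * wavelength / NA.
    # Parameter values are typical published ballparks, not numbers from the talk.

    def half_pitch_nm(k1: float, wavelength_nm: float, numerical_aperture: float) -> float:
        return k1 * wavelength_nm / numerical_aperture

    # ArF immersion: 193 nm light, NA ~1.35, k1 squeezed toward its ~0.25 floor.
    print(f"193 nm immersion, single exposure: ~{half_pitch_nm(0.28, 193.0, 1.35):.0f} nm half-pitch")

    # EUV: 13.5 nm light, NA ~0.33, with a much more relaxed k1.
    print(f"EUV 13.5 nm, single exposure:      ~{half_pitch_nm(0.40, 13.5, 0.33):.0f} nm half-pitch")

    # Multi-patterning splits one layer across several exposures to go below the
    # single-exposure limit, at the cost of the mask and step count described next.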
We were saying earlier today that EUV should have been introduced way back, before they had to go so crazy — but look at what they actually did. Here's a different way to graph it: wavelength on the vertical axis, printed dimension on the horizontal. This is a super fun thing. When the wavelength first started to interfere, they started tweaking the pattern, they tweaked the wavelengths, they tweaked the materials, and then they started doing computational lithography. They started around 1990 printing with no correction, then minor corrections, and then basically they started printing on the mask the field you actually wanted, so that when the light interfered a little bit, you got what you wanted. And then things went crazy: that's the pattern you print on the mask to print an X, and they got something like 20 to 50x out of it. Then there's a set of patterns where you look at it and go, "What does that even print?" So the limitation of wavelength that everybody thought they understood turned out not to be the limit. At some level the semiconductor business is just complicated: there are physicists, chemists, materials people, machine people, optics people — the diversity of technology driving this is so high. Now we're even using AI: we evaluate how we build things, we have big data sets about what works and what doesn't, and we're starting to close the loop — what do we build, what do we print, what causes defects, and what do we do about it. Super interesting.

Let me whip through a path to 50x. I talked to a bunch of engineers at Intel and said I really want a path to 100x scaling — a hundred times more transistors per square millimeter. After about three weeks a roomful of people came back looking kind of glum, because they had only gotten to 50x in three weeks. I asked them to maybe spend a couple more weeks on it, and we're still looking at it. But here's the first part. There are fins, and we have clear line of sight to pitch-scaling the fins — this is about printing fins, and there's a whole bunch of metallurgy in there. I do want to point out that the tops of the fins are still over 100 atoms wide. We're not running out of atoms: we know how to print single layers of atoms, and while there are process steps where the metallurgy involves interestingly small numbers of atoms, the fins themselves are mountains in terms of atomic layers.

Next we go to nanowire stacks. The way we build the nanowires we get more drive current, because with gate-all-around you get much better control of the device, especially at low voltage — we're super excited about that, and it's another factor of two. Then we'll stack the nanowires, so in the same nanowire stack we get both P and N devices. We already do wafer-to-wafer bonding — I won't go into all the details — and we're starting to build stacks of transistors, and metal stacks too; there's goodness both in how the metal stack reaches the transistors from multiple sides and in how we build logic functions. 3D stacking is going to matter more and more. Think about a big building: if you want a lot of square feet on one level, the distance between two points is the X-Y distance; if it's stacked, it's X-Y-Z, which is way shorter. And finally, we already do die-to-die and die-to-wafer stacking, and it's not clear how high we can take that.
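One way to read that "path to ~50x" is as a product of independent multipliers. The step names below follow the talk, but the individual factors are placeholders I chose so the product lands near 50x — the talk does not give per-step numbers.

    # A "path to ~50x density" as a product of multipliers. Step names follow the
    # talk; the numeric factors are illustrative placeholders only.
    steps = [
        ("fin pitch scaling",                  3.0),
        ("nanowires / gate-all-around",        2.0),
        ("stacked P-over-N nanowires",         2.0),
        ("wafer-to-wafer bonding",             2.0),
        ("die-to-wafer / die-to-die stacking", 2.0),
    ]

    density = 1.0
    for name, factor in steps:
        density *= factor
        print(f"{name:38s} x{factor:<3} -> cumulative {density:5.1f}x")
    # the cumulative product ends near 48x, in the neighborhood of the quoted 50x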
People have said, well, that's more wafer steps, more processing, more cost. I've been talking to fabs for about 30 years now, and they always promise me that at some point in the near future wafers are going to be too expensive and cost per transistor will stop crossing over — and it has never, ever happened. It's unbelievable how fast the cost comes down once something goes into high-volume manufacturing and people work on it. We've seen some remarkable things: when TSMC announced their 16FF+ process it was fairly complicated, and by 16FFC they had radically reduced the number of steps. You can put a wafer through Samsung 14 — I probably shouldn't quote the number of days, but it would shock you how simple it is. We always figure out how to do the "what," and then we radically improve the "how."

Here's a funny one. Wire-bond technology, way back in the '80s and '90s, was the limit because of the inductance in the wires. Then the wire-bond people and the flash people got together and shipped production products where the bonded die went one, two, three, four, five, six, seven, eight — this picture shows eight — and they got up to 16 in production. The flash cells were getting smaller to fit more flash in the package, they stacked the die, and when that got too complicated they decided to stack layers of flash cells vertically. I found this picture and thought it was really funny, because that's actually the Babbage difference engine from 1854 — that was built at meter scale, and modern devices are at nanometer scale, call it a billion-fold shrink — and here's a modern flash device, which looks a whole lot like a Babbage engine too. Cracks me up. They're now up to 128 layers, and those are very simple layers; now imagine those layers getting a little more complicated, and we start building really deep 3D devices. So on the limits of shrinking, I sometimes feel we're still far away. I'm not sure how many atoms across we need before things get really quantum mechanical — two or three? We know how to print single layers of atoms, and we're going to learn how to build down to small numbers of atoms. This is going to keep going.

OK, back to how this changes computer design. The really interesting thing — remember, I'm an architect — is that frequency moved by a thousand, transistors per core moved by a thousand, and transistors per square millimeter moved by a hundred thousand, but the bottom two numbers are kind of sad: memory latency as seen by the processor went up a hundred x — at four gigahertz, 100 nanoseconds is a lot of cycles — and instructions per cycle, while much improved, is nowhere near the rest of the curves. So when I sit down with my partners — the frequency people, the transistor people, the microarchitects — I'm the low man on the totem pole. We often look at a famous graph, which I quite like, where performance ramped linearly for a while with frequency and process scaling, and it has been tougher to ramp since frequency stopped ramping, even though we had lots more transistors. We are continuing to move it along, and my personal belief is that some of the limit has been a mindset about how far single-threaded performance can go — because we're actually going to do something about it. But there are a whole bunch of dimensions to this.

The programming folks kind of crack me up. Here's an old C program that does something interesting — a 3x3 convolution — and an OpenCL version looks about the same, and then we wrote a PyTorch version that's basically a one-line call. I want to talk about abstraction layers, because this graph is super interesting. When we went from assembly — where you write a line and you get a line — to C, you get something like ten executed instructions for every line you write; with C++ it goes up again, and depending on who you are and which programs you look at, you get a graph like this. There are two arguments about it. One is: boy, modern programmers are inefficient — and if you're using JavaScript to write "hello world," that's probably true. But today you can write one line of code that fires up a data center and finds a cat photo, and if you tried to do that in assembly you'd have a thousand people working for a thousand years, and they'd fail. The interesting thing about abstraction layers — and this goes for computing in general — is that the reason processor performance is sublinear in transistor count is that it's limited by predictability: branch predictability, data predictability, instruction predictability. But we are building new abstraction layers to meet that. In some places scaling is linear — we added lots of floating-point extensions, wider floating point, and that can be linear in transistors — and the AI work has been super-linear in transistors, because the architectural innovation enabled by the huge increase in transistor count came along with it.
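To make the abstraction-layer point concrete, here is the same 3x3 operation (cross-correlation, as deep-learning frameworks define "convolution") at two levels: explicit loops, roughly what the old C code spells out, and a single framework call. The loop helper `conv3x3_loops` is mine, and the one-liner assumes PyTorch is installed, standing in for the call on the slide.

    # The same 3x3 operation at two abstraction levels: hand-written loops versus
    # a one-line framework call (PyTorch here, standing in for the slide's example).
    import numpy as np

    def conv3x3_loops(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
        """Valid-mode 3x3 cross-correlation, written the 'old C program' way."""
        h, w = image.shape
        out = np.zeros((h - 2, w - 2), dtype=np.float32)
        for y in range(h - 2):
            for x in range(w - 2):
                acc = 0.0
                for ky in range(3):
                    for kx in range(3):
                        acc += image[y + ky, x + kx] * kernel[ky, kx]
                out[y, x] = acc
        return out

    image = np.random.rand(8, 8).astype(np.float32)
    kernel = np.random.rand(3, 3).astype(np.float32)
    by_hand = conv3x3_loops(image, kernel)

    # The "one line" version (requires torch; same math, batched NCHW layout):
    import torch
    by_framework = torch.nn.functional.conv2d(
        torch.from_numpy(image)[None, None], torch.from_numpy(kernel)[None, None])

    print(np.allclose(by_hand, by_framework[0, 0].numpy(), atol=1e-5))  # True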
I gave a talk on microarchitecture recently, and the innovation track there has been hilarious. Computers used to be pretty dumb — fetch, execute, write back — super small, super short. Then the RISC innovation stretched out the pipeline and did the minimal amount of work in each stage, which was super exciting. Then machines went superscalar, then superscalar and super-pipelined, and then we started building big out-of-order superscalar, super-pipelined machines, and we saw steady improvements in performance. One of my favorite computers, because I was an architect on it, was the EV6, Digital's first big out-of-order machine: we fetched four instructions, we had four integer pipes and two floating-point pipes — a big execution monster — with a whopping 24-instruction scheduler window and on the order of a hundred instructions in flight. And here's an abstracted diagram of a recent Sunny Cove core: an 800-instruction window sustaining between three and six x86 instructions per clock, massive data predictors, massive branch predictors. If you look inside it carefully, it's not one big computer anymore: there's a branch predictor, an instruction fetcher, a micro-op engine that decodes and executes — every single piece of that machine looks more complicated than a whole computer used to. Now, going from 0.3 IPC to, say, three or four IPC is only 10x, but memory latency went from one clock in the good old days to hundreds of clocks, and when you're running floating-point code where we can really apply that acceleration, it's a completely different machine.
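The "100 nanoseconds at 4 GHz is a lot of cycles" remark is worth doing as explicit arithmetic; the figures below just restate the round numbers used in the talk.

    # Memory latency as seen by the core, using the talk's round numbers:
    # a 4 GHz clock, ~100 ns to DRAM, and a machine that sustains 3-4 IPC.
    clock_hz = 4.0e9
    dram_latency_s = 100e-9

    cycles_per_miss = dram_latency_s * clock_hz
    print(f"one DRAM access ~ {cycles_per_miss:.0f} core cycles")          # ~400

    for ipc in (3, 4):
        print(f"  at {ipc} IPC, that's ~{cycles_per_miss * ipc:.0f} lost instruction slots")

    # Hence the ~800-entry instruction windows, prefetchers, and predictors:
    # the machine has to find other work to overlap with those ~400 cycles.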
I was talking with some software people, and one of them looked at this and said, "You're scaring me — how do you ever verify the complexity of this engine?" People have recently pointed out that a computer with a million flip-flops has, in principle, two-to-the-millionth states. We don't really verify the whole thing, by the way. But it has been a long time since we executed C programs the way they're written: the compiler generates basic blocks averaging about six instructions, we drop them into an instruction window of eight hundred instructions, we find many parallel dependency graphs inside it, and we execute those in parallel. And we're working on a generation significantly bigger than this, closer to the linear curve on performance. That's a really big mindset change.

Let me finish up in a few minutes. Why do we make new computers? This is just data from the web somewhere, but look at the totals: IPC went way up, frequency went from 5 megahertz to the 4.2 gigahertz we ship now, benchmark performance is up something like 14,000x — these are huge numbers. Lots of people think we're hitting some kind of limit. I really doubt it. We have a roadmap to 50x more performance, 50x more transistors, and big steps to take on every single piece of the stack. And remember, computers are built by large numbers of people organized as many, many small teams: better prediction, better instruction set architecture, better optimization, better CAD tools, better libraries. The number of different places where we're innovating is really high; it's not one thing.

I call this Raja's law, because Raja is the first one who showed me this graph, and it's an interesting phenomenon. We had single-core, single-threaded computing, which went up and then kind of plateaued; then we went to multithreading, then to GPU computing, and now we're working on AI computing — essentially machines that compute across very large, very sparse data sets with very high computational intensity. This is one place where Moore's law is letting the architects have a field day. And you can see when the idea sets are powerful. When we first built parallel computers, lots of people thought we'd build a parallelizing compiler: hand it your C program and it makes it parallel. I saw the first one of those in 1985, and it didn't work at all; parallelizing code was something only a small number of people could do. I still remember when Google built massive data centers out of low-cost servers: they built the Google File System that anybody could use, they used MapReduce and a bunch of tools, and they got amazing scaling. Now, you could say it's not efficient — it might take 10,000 cores to go a thousand times as fast — but a thousand times as fast is really good. So there's a real creative tension there. The GPU-computing folks started trying to do computing on GPUs back when they couldn't even get the right answer and had no tools — remember trying to write OpenGL programs and coaxing shaders into doing what you wanted — but they co-evolved the software stack, the computing stack, and the math engines underneath, and they now do an incredible job of computing. We're starting to do the same thing for AI, it's super interesting, and there's so far still to go.
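The Google example a paragraph up — roughly 10,000 cores for a 1,000x speedup — is an efficiency-versus-absolute-speed trade that is easy to put numbers on. The core and speedup counts are the ones quoted in the talk; the 30-day baseline job is a made-up illustration.

    # Scaling efficiency versus absolute speedup, with the talk's round numbers.
    cores = 10_000
    speedup = 1_000

    print(f"parallel efficiency: {speedup / cores:.0%}")                   # 10%

    single_core_days = 30                                                  # hypothetical job
    minutes = single_core_days * 24 * 60 / speedup
    print(f"a 30-day single-core job finishes in ~{minutes:.0f} minutes")  # ~43

Ten percent efficiency sounds wasteful, and it is — but "a month becomes three-quarters of an hour" is the trade the talk says users happily make.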
And this is what Raja says specifically: it's very hard to get software communities to move for 20 percent, or 50 percent, or even 2x — but 10x enables a whole different genre of computing. So the architecture challenge is to take all the transistors we're building and build the architectures that let people solve problems all the way up the stack. It's a super interesting problem, and we're literally in this transition: there were hundreds of millions of devices, then a billion devices, and we're heading into a world of ten billion smart devices, and I believe that curve is not stopping. We'll see how diverse it gets — cloud computing, mobile computing, personal computing, image processors — the number of smart devices running around is really incredible. So the opportunities for people in college today, across all kinds of applications, are phenomenal and will continue to grow.

I want to end on this, and then we'll take questions. Richard Feynman famously said there's plenty of room at the bottom. Our current transistors are roughly a thousand by a thousand by a thousand atoms; if we can get that to ten by ten by ten, that's a million-x scaling, and I don't know that we'd even know how to imagine the kind of computer we'd build out of it. There could be many such transitions over the next 10 or 20 years. So I believe Moore's law is not dead — I hope I convinced somebody — but more deeply: if your idea set says this is going to keep going, then there's a whole bunch of challenges and I think we'll rise to them; I've seen that over and over. If you think it's running out of gas, then for you it is; if you think it's not, it's not going to. And there are so many people in the industry working on this. So many people told me computer architecture can't move any further — really? How many times has it changed over the last thirty years? Over and over and over. The hardware-software contract is really interesting, and we've definitely found that a more open world helps, because everybody participates and you gather it all together. We saw that first in open-source software, and amazingly we're seeing it in security today: if you think security is keeping something secret, then when the secret gets out you're dead; if you think security is getting the best people to look at your stuff, investigate it, and collaborate with you, then you have a plan for security. We see this in software, and the co-optimization of software, architecture, and the actual chips we build is going to keep going. All right — with that, any questions?

[Moderator] We get to be interactive. This is the question box: ask your question, then pass it to the next person. I'll start here in the front and it will work its way back.

Q: Thanks for your talk. You have a lot of hope for conventional architecture, but you just built a neural-net accelerator at Tesla. Where does that fit in the computing landscape going forward — is it going to revolutionize things, or is it a niche product?

A: I have mixed feelings about it. If you look at the human brain, there's the motor cortex, the reptile brain, the lower primate layers — there's a diversity of thinking in our own heads — so I think computers will have diverse programming too. Right now the AI world is changing very fast: there are a lot of accelerators that are pretty bespoke to a narrow set of problems, and then there are things that are much more programmable, and I like those.
The widget we built at Tesla was super interesting because it runs the output of Caffe, which is a very programmable inference framework, and we can port other things onto it too. It had some very nice properties: it's not hard to move the data around, because the on-chip SRAMs hold the data, and it's not hard to run the workload, because it executes Caffe-style operations. So it was an interesting accelerator experiment, and it let us go from a standing start to driving a car in eighteen months, without six years of software development. The world is really interesting right now: we're putting AI acceleration instructions into CPUs, the GPU folks are getting better at making GPUs programmable, and the bespoke accelerators have their place. We saw this stabilize in video encoding — this is Raja's point: there are encoders and decoders, and you can get hardware for both for H.264, but in the cloud the arms race is on the encoder, which is all software, because squeezing a bit more compression out of a new encoder saves a lot of internet bandwidth, while everybody's phone runs a hardware decoder — the decoder is the fixed target. AI could be the same way: training is a monster that will keep evolving, and inference is more constrained but still pretty open-ended. I think the experimental world is the right answer.

Q: If I could sharpen up that question: imagine the pie chart of all the silicon coming off fabs in five years. How much of it is CPU, how much is GPU, and how much is neural-net accelerator?

A: Software people move slower than you think, and AI is moving faster than you think. Today it's roughly 80 percent CPU, 20 percent GPU, zero everything else. If things move quickly it could be one third, one third, one third — I don't think it'll move that fast, but I couldn't call it.

Q: So Moore's law is not dead and Dennard scaling is not dead — that's what you said. Then why is the number of semiconductor companies shrinking?

A: Here's the really wild thing — you know how pendulums work: they swing back and forth. Back in the late '90s you couldn't kick over a rock without finding a chip startup; everybody was doing ASICs. Five years later you couldn't get funding if you were doing hardware, because, quote, hardware doesn't make any money. Now there are a hundred AI chip startups. On fabs: the big ones are Intel, TSMC, and Samsung, then GF, UMC, and SMIC, and then there are maybe five more. GF famously said they're not doing seven nanometer, but they have a huge business in other technologies. The business model of a semiconductor company is complicated — TSMC brags on their website that they just keep building new fabs and keep the old ones running; we had a whole bunch of products in hundred-and-something-nanometer processes. So I'd say today there's consolidation at the leading edge — there are three players; a step behind there are about three more; another step behind, about ten more; and then there's a whole set of bespoke technologies for RF and special-purpose power devices, where there's actually some new stuff happening. So that's pretty good, and the investment is good. And people always say fabs are getting so expensive that nobody can build a new one.
One thing the world does not have is a shortage of money — it's amazing how much we're spending on this. With EUV machines: you go to a fab and they always have a billion-dollar row, ten of these hundred-million-dollar lithography machines, and now we're replacing them with three-hundred-million-dollar machines, so it's going to be a three-billion-dollar row. I don't think we're going to run out of money. The worldwide appetite for square feet of semiconductors is still going up, all the big fabs are adding capacity, so we'll see what happens in the long run. And you can now get a startup funded to build chips; you can actually build a chip in eighteen months from fifty IPs, integrate them, power it on, and drive a car around — we proved that.

Q: Your abstract seemed to indicate you weren't that optimistic about voltage scaling and frequency scaling, but some of us think that, especially with an infinite amount of money, you can solve those problems too. What's your attitude about voltage and frequency scaling?

A: The roadmap I showed didn't include much for voltage or frequency scaling. We're working on both. I don't want to disclose the frequency numbers, but they're pretty good. On voltage: silicon-germanium in fins and nanowires moves the voltage a little, and there are bigger developments to push voltage scaling further. I'm totally interested in it, and we're definitely going to see significant changes over the next ten years, but I have nothing to announce today.

Q: Thanks for the talk. What's your view on trading off power and frequency against computation accuracy — are you moving toward approximate computing, say 12 transistors for an adder cell instead of 28? If the data is statistical, why shouldn't we?

A: Does anybody in here know anything about AI? A couple of years ago there was a whole lot of work on reduced precision, variable precision, and, let's say, fuzzy answers. The problem is that people are moving to bigger and bigger data sets and trying to get repeatability and convergence, and having fixed answers and well-defined data types really mattered — several chips whose first spin bet on aggressively reduced precision had to revisit it. There are also some standards now: there are 16-bit floats — two formats of those — and of course people are working on 8-bit formats. Your intuition says the brain seems to do a lot of computation, it's pretty fuzzy, and it gets the right answer, and we know that when you're training on big data sets you can randomly nuke things without much loss of coherence. But the current reality is that in development mode, repeatability is super important on the training side. We did a lot of work along the lines of: get your new algorithm working on 32-bit float, for chrissakes, because the stability is so high; as you refine it you can drop down to 16-bit, and the inference engines can go pretty low. So there's a lot of variability in that problem today. Fuzzy compute and approximate answers do seem like the right long-term answer, but today it has been difficult to drive projects to completion that way.
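The "get it working on 32-bit float first" advice comes from effects like the one below — a minimal NumPy sketch of my own, not an example from the talk — where naive half-precision accumulation silently stalls while the same sum in float32 is exact. Real mixed-precision training sidesteps this with float32 accumulators and loss scaling.

    # Why "make it work in fp32 first": naively accumulating in float16 stalls
    # once the running sum is large relative to the addend. (Illustration only;
    # real mixed-precision training keeps fp32 accumulators.)
    import numpy as np

    values = np.ones(10_000, dtype=np.float16)

    acc16 = np.float16(0.0)
    for v in values:                      # accumulate in half precision
        acc16 = np.float16(acc16 + v)

    acc32 = values.astype(np.float32).sum()

    print(f"float16 running sum: {float(acc16):.0f}")   # stalls at 2048
    print(f"float32 running sum: {float(acc32):.0f}")   # 10000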
Q: What are your thoughts on FPGAs?

A: FPGAs are really cool — I've built a lot of stuff with FPGAs. Generically, when you go from a fixed-function accelerator to a programmable device you give up something like 10x in performance on average, and when you go to an FPGA you might lose another 10x. The good thing about an FPGA is that you can build exactly the algorithm you want, and that often works very well; and there are a whole bunch of places in industry where you want a specific chip but don't have the volume to justify the capex and the cost of building it, and there FPGAs are great. I've been intrigued that the FPGA companies managed to put multipliers and adders into FPGAs and use them to get real compute density, but there's a pretty big spread on frequency and cost, so the FPGA businesses are growing without taking over computing. The things you can do with FPGAs are great: we build FPGA models of everything we do now — they're the heart of all our emulation technology — and if you go into the networking space and open things up, there are FPGAs all over, because there are lots of special-purpose, low-volume devices that FPGAs are great in. So I'm a fan and I've used them a lot, but as soon as a part of computing gets really big, building the purpose-built accelerator, or the right extension to a programmable computer, seems to win.

Q: It's interesting to see your graph of where architecture is going, from single-core to multi-core, and it seems that for the next 20 years AI will be a focus. Do you imagine AI being built into Intel chips at the same level as the branch predictor and the instruction fetcher, or at a different abstraction? How will the AI components integrate with the x86 instruction executor?

A: That's a really big question. Intel just recently introduced what's called CXL, which is basically a coherent memory port, because we think there are going to be standard computers that run lots of C programs and JavaScript and everything else; then there's work that runs really well on the vector architectures GPUs have; and then today's AI chips tend to be dense computational engines — convolutions, matrix multiplies — things that can take tensors straight out of TensorFlow and do the right operations on them. But the algorithms are going to want to talk to each other, and customers are uncertain about what to build. One reason GPUs haven't gone bigger in the data center is that if the GPU part is half of your workload but only a 10 percent duty cycle, then standing up a GPU data center is super expensive. What people want is CPU and GPU acceleration in the same box, with the flexibility to move data back and forth — because if they guess "I need this much GPU and AI compute" and it turns out to be wrong, they don't want to be stuck with the wrong set of computers. Did I answer your question?

Q: Do you foresee a future Intel CPU having instructions specialized for convolution?

A: Yes.
The short answer is that there's a whole bunch of work where the code is pretty good and some kind of data-type acceleration really works, and we'll definitely do that — we're working on multiple projects. Then there are other cases where either the math intensity doesn't pair well with unpredictable C code, or the data set is so big and strange that having a CPU grovel through it isn't the right answer. So AI has significant diversity in how you want to architect for it.

Q: Is Intel looking at new materials, especially for the Richard Feynman regime you highlighted — a handful of atoms?

A: Yes. We made a funny slide — and then I saw TSMC cribbed it — where we walked through the periodic table. It used to be silicon and aluminum and silicon dioxide; the number of materials used in chip fabrication just keeps going up, and we use more and more of the periodic table. As for what you're alluding to — the actual device type — it's really interesting what's going to happen, but that's not exactly my wheelhouse. Do you want to answer that one?

[Dean] Carbon. We'll see what happens.

A: On materials science we've just scratched the surface. You get a couple of atoms together and have them talk, and it's so unpredictable. Schrodinger's equation works — you can write it down, it's pretty simple — well, we solved hydrogen; sad story. Materials science is amazing. And look at that: a handoff, relay cooperation, teamwork.

Q: I was interested in your comments about FPGAs, because one thing missing from all your charts with the big multipliers is anything about NRE — non-recurring engineering. The cost to develop a new design is probably going up pretty steeply as well. You were a little hesitant about FPGAs; can you say more? Do you have any other way to beat down the NRE so that small companies can come up with new products quickly?

A: At 130 nanometers you can build a sophisticated device for about one to two million dollars. On 28 nanometer I've built multiple parts for around 10 million; on 14 nanometer, around 40. At an advanced node it depends on how you allocate the cost — who the hell knows: hundreds of millions if you take the process for granted, billions if you need to build out the process. So the cost of doing a new product ranges from roughly two million to two billion dollars; it's fairly diverse. FPGAs are really interesting because with a small number of parts, even a two-million-dollar NRE takes a lot of units to amortize, and if you're doing experiments in a research lab and you don't care about frequency but you really want to run architecture experiments, modern FPGAs are great tools — way more capable than they used to be. You can get a stack of prototyping boards with a boatload of FPGAs on each one and standard interfaces, and I'm sure some of the students here have built really cool computers with FPGAs, working memory controllers, Ethernet NICs, and embedded CPUs. I'm a huge fan of that. But I would say the range really is two million to two billion, and you know what you're paying for: with advanced technology like 16 FinFET you're at maybe 25 to 45 million, and once you go to seven nanometer you're looking at five to ten million for mask costs alone. Everybody says this is going to explode out of control, but then mask costs on the trailing nodes keep coming down: 16 FinFET, when it first came out, was a hundred-million-dollar minimum; it's down under 40 and on its way to 20. So it keeps moving, and you have to ask what you need to make a competitive product: if your architectural innovation is 10x over the other guy, then the 2x from the process isn't the thing; but if everybody is on the same process with equal architecture, then the process is the differentiator. That part is pretty stable.
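The FPGA-versus-ASIC part of that answer is an amortization problem. The sketch below reuses the quoted ~$10M NRE for a 28nm-class part, while the per-unit prices and volumes are hypothetical numbers of mine, purely to show where the crossover lands.

    # NRE amortization: spread the design cost over the production volume.
    # The $10M NRE echoes the answer above; unit prices and volumes are hypothetical.
    def cost_per_unit(nre: float, unit_cost: float, volume: int) -> float:
        return nre / volume + unit_cost

    asic_nre, asic_unit = 10e6, 20.0      # custom chip: big NRE, cheap per die
    fpga_nre, fpga_unit = 0.0, 500.0      # off-the-shelf FPGA: no NRE, pricey per part

    for volume in (1_000, 10_000, 100_000, 1_000_000):
        asic = cost_per_unit(asic_nre, asic_unit, volume)
        fpga = cost_per_unit(fpga_nre, fpga_unit, volume)
        print(f"{volume:>9,} units: ASIC ${asic:>9,.0f}/unit, FPGA ${fpga:,.0f}/unit"
              f" -> {'ASIC' if asic < fpga else 'FPGA'} wins")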
[Moderator] I'm going to take one more question from the audience, and then I want to ask one myself.

Q: Hi, great talk. One thing I was wondering: in that chart of processors, the whole machine still seems to be scaling pretty fast, but the size of the register files and caches isn't scaling nearly as fast. Is that going to bottleneck things, and how might we mitigate it?

A: There's a big step function there. The L1 caches are 32 to 64K because we're trying to stay inside a four-cycle load-to-use pipeline; to grow past that you add a cycle, and you get maybe 2 percent from the bigger cache while losing five to seven percent on the latency. So we build L1 caches, mid-level caches, and a last-level cache, and the real trade-off is the shape of that cache hierarchy — the instruction-stream caches have gotten bigger and faster — and it all comes out of a fairly sophisticated performance-model environment. The other thing is that we build more load-store pipes: instead of making the L1 bigger, you put more ports on it, so modern caches have a lot of ports — five or six or eight read/write ports. We've been spending the bits, the transistors, and the metal tracks on ports more than on size, and then compensating with big block moves out of the mid- and last-level caches.
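That L1 sizing answer is a trade you can put numbers on: the +2 percent and the -5 to -7 percent are the figures quoted in the answer, and multiplying them together (my arithmetic) shows why the bigger-but-slower L1 loses.

    # The quoted L1 trade: ~+2% from more capacity vs. -5..-7% from one extra
    # load-to-use cycle. Combining the two factors shows the net effect.
    capacity_gain = 1.02

    for latency_loss_pct in (5, 7):
        net = capacity_gain * (1.0 - latency_loss_pct / 100.0)
        print(f"+2% capacity, -{latency_loss_pct}% latency -> net {net - 1.0:+.1%}")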
Q: This may be a slightly off-topic question, but Intel and IBM have both come up with quantum computers, with something like 49 and 51 qubits. Is there a competition between the two kinds of processing, since they're so different — and if so, which do you think is going to win?

A: I literally have no opinions. I've been watching quantum computing for years; I had a friend who worked on it for a couple of years and said, "I'm not sure it's not a fraud, because we haven't really gotten results out of it." I'm curious, because the physics says a whole bunch of interesting things should be happening, but we haven't gotten the results we expect. They are making progress, though, so I'd have to direct you to the current papers for real answers.

[Moderator] All right, I said I had one more question. You've had this amazing career across so many different places, and you have an audience here of the next generation of practitioners in this space. What advice do you have for students — what do you wish someone had told you when you were in their seat?

A: Well, first, if they'd told me something, I wouldn't have listened. When I was in college I was an EE, and senior year we had a two-inch wafer fab that my advisor ran, so I worked on semiconductor physics and thought that was super cool. Then I took my first job in Florida because I wanted to live next to the beach and surf. The surfing was lousy and the company was horrible, but I learned a lot: they threw me into the lab and I fixed stuff, I got to be a scope jockey, and I could fix bloody anything in that lab — one of those worst-two-years-of-your-life, best-two-years-of-experience deals. Then somebody told me I should work for Digital Equipment, so — back then it was in the newspaper — I found a recruiting ad for Digital, read the VAX-11/780 manual and the 11/70 manual on the plane, and talked to Bob Stewart, the chief architect of the VAX-11/780, which is a very famous computer. He said, "All right, so why are we here?" and I said, "I have a lot of questions for you." It turns out he thought I was a complete goof, but he thought it would be funny to have me work for him. My thing is that I like to throw myself into stuff and learn what I can — and make sure you're working with people who are excited. I worked at Harris, the Florida company, and at lunch all people did was complain. Then at Digital, in my first eight years there, a friend's wife said, "What do they put in the water? All you guys do is work, and when you go out drinking you talk about work." It was super fun. Way back, I gave a talk at Microprocessor Forum on the VAX 8800, the first computer I worked on. It was kind of weird: I was a junior person, my boss quit, then his boss quit, and by the time we were done I was the chief architect of the cache — no poison involved. I gave the talk and I was a nervous wreck, and after me was Marty Hopkins, a Fellow at IBM and a great guy. Afterwards we were standing around chatting — back then you'd go to a computer conference during the RISC wars, you'd say this and they'd say that and everybody was wound up — and he said, "You know, we're so passionate about this because it's a great endeavor." I thought that was the perfect way to say it. There are so many technologies involved, it's so transformative to society — so many good things have happened, some bad things obviously. So find some smart people who are excited about it. If everybody around you is bummed out and ragging on the company, go somewhere else — or be a change agent. I worked at AMD right after they'd fired half the people, and it was not a happy place, but my conviction was that if we went and did the right thing — designed the right thing and put our energy into it — we'd do something cool, and now it's a happy place. Intel is a way wilder place: we've had record revenue, we have some stuff that's great and some that's not so good, and it's super fun for me because I like thinking about those problems.

[Moderator] Great — thank you for bringing that energy here, and thanks, Jim, for a great talk. He's going to stick around a little, so if you want to interact with him, please come up. Thank you again. [Applause] [Music]
Info
Channel: UC Berkeley EECS Events
Views: 55,795
Rating: 4.9673023 out of 5
Keywords:
Id: oIG9ztQw2Gc
Length: 64min 53sec (3893 seconds)
Published: Wed Sep 18 2019