Rajat Monga: TensorFlow | Lex Fridman Podcast #22

Captions
The following is a conversation with Rajat Monga. He's an engineering director at Google, leading the TensorFlow team. TensorFlow is an open-source library at the center of much of the work going on in the world in deep learning, both the cutting-edge research and the large-scale application of learning-based approaches. But it's quickly becoming much more than a software library. It's now an ecosystem of tools for the deployment of machine learning in the cloud, on the phone, in the browser, on both generic and specialized hardware: TPU, GPU, and so on. Plus, there's a big emphasis on growing a passionate community of developers. Rajat, Jeff Dean, and a large team of engineers at Google Brain are working to define the future of machine learning with TensorFlow 2.0, which is now in alpha. I think the decision to open-source TensorFlow was a definitive moment in the tech industry. It showed that open innovation can be successful and inspired many companies to open-source their code, to publish, and in general engage in the open exchange of ideas.

This conversation is part of the Artificial Intelligence Podcast. If you enjoy it, subscribe on YouTube, iTunes, or simply connect with me on Twitter @lexfridman, spelled F-R-I-D. And now, here's my conversation with Rajat Monga.

You were involved with Google Brain since its start in 2011, with Jeff Dean. It started with DistBelief, the proprietary machine learning library, and turned into TensorFlow in 2014, the open-source library. So what were the early days of Google Brain like? What were the goals, the missions? How do you even proceed forward when there are so many possibilities before you?

It was interesting back then, you know, when I started, when we were even just talking about it. The idea of deep learning was interesting and intriguing in some ways. It hadn't yet taken off, but it held some promise; it had shown some very promising early results. I think the idea where Andrew and Jeff had started was: what if we can take this, what people are doing in research, and scale it to what Google has in terms of compute power, and also put that kind of data together? What does that mean? So far the results had been: if you scale the compute, scale the data, it does better. Would that work? So that was the first year or two: can we prove that out? And with DistBelief, when we started, the first year we got some early wins, which is always great.

What were the wins like? What were the moments where you felt there's something to this, this is going to be good?

I think there were two early wins. One was speech, where we collaborated very closely with the speech research team, who were also getting interested in this. And the other one was on images, where, you know, the cat paper, as we call it — that was covered by a lot of folks.

And the birth of Google Brain was around neural networks. That was deep learning from the very beginning; that was the whole mission. So in terms of scale, what was the dream of what this could become? Were there echoes of this open-source TensorFlow community that might be brought in? Was there a sense of TPUs? Was there a sense that machine learning is going to be at the core of the entire company, that it's going to grow in that direction?

Yeah, I think so. It was interesting, if I think back to 2012 or 2011. First it was: can we scale it? In the year or so, we had started scaling it to hundreds and thousands of machines. In fact, we had some runs even going to 10,000 machines, and all of those showed great promise in terms of machine learning at Google. The good thing was, Google had been doing machine learning for a long time. Deep learning was new, but as we scaled this up, we were pretty sure that, yes, it was possible and it was going to impact lots of things. We started seeing real products wanting to use this. Again, speech was the first; there were image things that Photos came out of, and many other products as well. So that was exciting. As we went into that a couple of years in, externally, academia started to — you know, there was lots of push on, OK, deep learning is interesting, we should be doing more, and so on. And so by 2014 we were looking at, OK, this is a big thing, it's going to grow, and not just internally but externally as well. Yes, maybe Google's ahead of where everybody else is, but there's a lot to do. So a lot of this started to make sense and come together.

So the decision to open-source — I was just chatting with Chris Lattner about this — the decision to go open-source with TensorFlow, I would say, for me personally, seems to be one of the big seminal moments in all of software engineering, ever. When a large company like Google decides to take a large project that many lawyers might argue has a lot of IP, and just decides to go open-source with it, and in so doing leads the entire world in saying, you know what, open innovation is a pretty powerful thing, and it's OK to do that — that's an incredible moment in time. So do you remember those discussions happening, about whether open-sourcing should happen? What was that like?

I would say the initial idea came from Jeff, who was a big proponent of this. I think it came off of two big things. One was research. We were a research group; we were putting all our research out there. We were building on others' research and we wanted to push the state of the art forward, and part of that was to share the research. That's how I think deep learning and machine learning have really grown so fast. So the next step was: OK, now, would software help with that? And it seemed like there were a few libraries existing out there, Theano being one, Torch being another, and a few others, but they were all done by academia, and the level was significantly different. The other one was from a software perspective. Google had done lots of software that we'd used internally, and we'd published papers. Often there was an open-source project that came out of that, where somebody else picked up the paper and implemented it, and those were very successful. Back then it was like: OK, there's Hadoop, which has come off of tech that we've built. We know the tech we've built is better, for a number of different reasons; we've invested a lot of effort in that tech. And it turns out we have Google Cloud, and we are now not really providing our tech, but we are saying: OK, we have Bigtable, which was the original thing, and we're going to now provide HBase APIs on top of that, which isn't as good, but that's what everybody's used to. So it's like: can we make something that is better, and really help the community in lots of ways, but also help push a good standard forward?

So how does Cloud fit into that? There's the TensorFlow open-source library, and how does the fact that you can use so many of the resources that Google provides in the cloud fit into that strategy?

So TensorFlow itself is open, and you can use it anywhere, right? And we want to make sure that continues to be the case. On Google Cloud, we do make sure that there are lots of integrations with everything else, and we want to make sure that it works really, really well there.

So you're leading the TensorFlow effort. Can you tell me the history and the timeline of the TensorFlow project, in terms of major design decisions — like the open-source decision, but really, what to include and not? There's this incredible ecosystem that I'd like to talk about, all these parts. But can you just sample moments that defined what TensorFlow eventually became through its — I don't know if you're allowed to say "history" when it's so recent, but in deep learning everything moves so fast that a few years is already history.

Yes, yes. So, looking back, we were building TensorFlow — I guess we open-sourced it in 2015, November 2015. We had started on it in the summer of 2014, I guess, and somewhere around three to six months in, late 2014, by then we had decided that, OK, there's a high likelihood we'll open-source it. So we started thinking about that and making sure we were heading down that path. By that point we had seen lots of different use cases at Google. So there were things like: OK, yes, you want to run at large scale in the data center; yes, we need to support different kinds of hardware. We had GPUs at that point; we had our first TPU at that point, or it was about to come out, roughly around that time. So the design sort of included those. We had started to push on mobile, so we were running models on mobile at that point; people were customizing code, so we wanted to make sure TensorFlow could support that as well. So that sort of became part of the overall design.

When you say mobile, you mean like pretty complicated algorithms running on the phone?

That's correct. So then you have a model that you deploy on the phone and run it there.

So already at that time there were ideas of running machine learning on the phone?

That's correct. We already had a couple of products that were doing that by then, and in those cases we had basically customized, handcrafted code, or some internal libraries that we were using.

So I was actually at Google during this time, in a parallel, I guess, universe, but we were using Theano and Caffe. Was there some degree to which you were bouncing off those — trying to see what Caffe was offering people, trying to see what Theano was offering, to make sure you're delivering on whatever that is, perhaps the Python part of things? Did that influence any design decisions?

Totally. So when we built DistBelief, some of that was in parallel with some of these libraries coming up — I mean, Theano itself is older — but we were building DistBelief focused on our internal needs, because our systems were very different. By the time we got to TensorFlow, we looked at a number of libraries that were out there: Theano; there were folks in the group who had experience with Torch, with Lua; there were folks here who had seen Caffe — actually, Yangqing was here as well. There were other libraries too; we might even have looked at Chainer back then, I'm trying to remember if it was around. In fact, we did discuss ideas around: OK, should we have a graph or not? Supporting all these together — there were definitely key decisions that we wanted to get right. We had seen limitations in our prior DistBelief things. A few of them were: research was moving so fast, we wanted flexibility; the hardware was changing fast, and we expected that to keep changing. So those probably were the two things, and yeah, I think the flexibility, in terms of being able to express all kinds of crazy things, was definitely a big one then.

So what about the graph decisions? With the move towards TensorFlow 2.0, the default will be eager execution — sort of hiding the graph a little bit, because the graph is less intuitive in terms of the way people develop, and so on. What was that discussion like, in terms of using graphs? It seems it's kind of the Theano way; it seemed the obvious choice.

So I think where it came from was: DistBelief had a graph-like thing as well, much more simple. It wasn't a general graph; it was more like a straight-line thing, more like what you might think of Caffe, I guess, in that sense. But we always cared about the production stuff. Even with DistBelief we were deploying a whole bunch of stuff in production, so the graph did come from that. Then we thought: OK, should we do that in Python? And we experimented with some ideas where it looked a lot simpler to use, but not having a graph meant: OK, how do you deploy now? So that was probably what triggered the balance for us, and eventually we ended up with a graph.
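The trade-off described here — eager execution for intuitive development, graphs for deployment and optimization — is roughly what TF 2.0 surfaces through `tf.function`. A minimal sketch, assuming TensorFlow 2.x; the function and values are illustrative:

```python
import tensorflow as tf

def square_sum(x, y):
    # Same math either way: sum of x^2 + y^2 over all elements.
    return tf.reduce_sum(x * x + y * y)

# Eager mode: the ops run immediately, op by op, easy to debug.
eager_result = square_sum(tf.constant([1.0, 2.0]), tf.constant([3.0, 4.0]))

# Graph mode: tf.function traces the same Python function into a graph
# that can be optimized and exported for deployment.
graph_fn = tf.function(square_sum)
graph_result = graph_fn(tf.constant([1.0, 2.0]), tf.constant([3.0, 4.0]))

print(float(eager_result))  # 30.0
print(float(graph_result))  # 30.0
```

Same numbers either way; the difference is only in how the computation is staged before it runs.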
And I guess the question there is — I mean, production seems like a really good thing to focus on, but did you even anticipate the other side of it, where there could be — what are the numbers, something crazy — 41 million downloads?

Yep.

I mean, was that even a possibility in your mind, that it would become as popular as it did?

So I think we did see a need for this a lot, from the research perspective, in the early days of deep learning, in some ways. 41 million? No, I don't think I imagined that number then. It seemed like there was a potential future where lots more people would be doing this, and the question was how we enable that. This kind of growth, I probably started seeing it somewhat after the open-sourcing, where it was like: OK, deep learning is actually growing way faster, for a lot of different reasons, and we are in just the right place to push on that, leverage that, and deliver on a lot of the things that people want.

So what changed once it was open-sourced? With this incredible amount of attention from a global population of developers, how did the project start changing? Do you actually remember it during those times? I know looking now there's really good documentation, there's an ecosystem of tools, there's a community, a blog, a YouTube channel now. It's very community-driven. Back then, I guess, the 0.1 version — is that the version?

I think we called it 0.6 or 0.5, something like that.

What changed leading into 1.0?

It's interesting. I think we've gone through a few things there. When we first came out, people loved the documentation we had, because it was just a huge step up from everything else, because all of those were academic projects — people doing research don't think about documentation. I think what that changed was, instead of deep learning being a research thing, some people who were just developers could now suddenly take this and do some interesting things with it, right? People who had no clue what machine learning was before then. And that, I think, really changed how things started to scale up, in some ways, and pushed on it. Over the next few months, we looked at how to stabilize things — we looked at not just researchers; now we wanted stability, for people who want to apply things. That's how we started planning for 1.0, and there were certain needs from that perspective. So again, documentation comes up, designs, more kinds of things to put that together. And it was exciting to get to a stage where more and more enterprises wanted to buy in and really get behind it. And I think post-1.0, with the next few releases, that enterprise adoption also started to take off. I would say, between the initial release and 1.0, it was researchers, of course, then a lot of hobbyists and early-interest people, excited about this, who started to get on board; and then over the 1.x releases, lots of enterprises.

I imagine anything that's below 1.0, enterprises get some pressure — they probably want something that's stable.

Exactly.

And do you have a sense of where TensorFlow is these days? It feels like deep learning in general is an extremely dynamic field; so much is changing. And TensorFlow has been growing incredibly. Do you have a sense of stability at the helm of it? I know you're in the midst of it, but —

Yeah, I think in the midst of it, it's often easy to forget what an enterprise wants, what some of the people on that side want. There are still people running models that are three years old, four years old. So Inception is still used by tons of people. Even ResNet-50 is, what, a couple of years old now, or more, but tons of people use that, and they're fine. They don't need the last couple of bits of performance or quality; they want stability, and things that just work. And so there is value in providing that kind of stability, and in making it really simple, because that allows a lot more people to access it. And then there's the research crowd, which wants to do these crazy things, exactly like you're saying, right? Not just deep learning in the straight-up models that used to be there. They want RNNs, and even RNNs are maybe old now; there are Transformers, and now it needs to combine with RL and GANs, and so on. So there's definitely that area, the boundary that's shifting and pushing the state of the art. But I think more and more of the past is much more stable, and even stuff that was two, three years old is very, very usable by lots of people. So that part makes it all easier.

So I imagine — maybe you can correct me if I'm wrong — one of the biggest use cases is essentially taking something like ResNet-50 and doing some kind of transfer learning on a very particular problem that you have. That's basically what the majority of the world does, and you want to make that as easy as possible.

So I would say, from the hobbyist perspective, that's the most common case, right? In fact, the apps on phones and stuff that you'll see, the early ones — that's the most common case. I would say there are a couple of reasons for that. One is that everybody talks about it; it looks great on slides.

Yeah, it makes for a good visual presentation, exactly.

What enterprises want — that is part of it, but that's not the big thing. Enterprises really have data that they want to make predictions on. This is often what they used to do with the people who were doing ML before: it was just regression models — linear regression, logistic regression, linear models — or maybe gradient-boosted trees, and so on. Some of them still benefit from deep learning, but that's the bread and butter: the structured data, and so on. So depending on the audience you look at, it's a little bit different.

And I mean, the best case for an enterprise is probably just having a very large dataset where deep learning can really shine.

That's correct, right.
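The ResNet-50 transfer-learning workflow described above can be sketched in a few lines of Keras. This is an illustrative sketch, assuming TensorFlow 2.x; the 5-class head and the input shape are made up, and `weights=None` is used so the sketch runs offline (in practice you would pass `weights="imagenet"` to download the pretrained features):

```python
import tensorflow as tf

# Load ResNet-50 without its ImageNet classification head. In practice you
# would pass weights="imagenet" here to get the pretrained features.
base = tf.keras.applications.ResNet50(
    include_top=False, weights=None, pooling="avg",
    input_shape=(224, 224, 3))
base.trainable = False  # freeze the backbone so only the new head trains

# Attach a small classification head for a hypothetical 5-class problem.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# model.fit(train_images, train_labels) would then train just the new head.
```

With the backbone frozen, only the final Dense layer's weights are updated, which is why this works even on small, domain-specific datasets.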
And then, I think, the other piece — what they want, and again, with 2.0 and the developer summit we put together — is the whole TensorFlow Extended piece, which is the entire pipeline. They care about stability across their entire thing; they want simplicity across the entire thing. "I don't need to just train a model; I need to do that every day, again, over and over again."

I wonder to what degree you have a role in — I don't know — so I teach a course on deep learning, and I have people like lawyers come up to me and say, you know, when is machine learning going to enter the legal realm? The same thing in all kinds of disciplines: immigration, insurance. Often, what I see it boils down to is that these companies are often a little bit old-school in the way they organize their data. The data is just not ready yet; it's not digitized. Do you also find yourself in the role of an evangelist for "let's get your data organized, folks, and then you'll get the big benefit of TensorFlow"? Do you have those conversations?

Yeah, yeah. You know, I get all kinds of questions there, from "OK, what can I do, what do I need to make this work?" to "Do we really need deep learning? I already use this linear model; why would this help?" or "I don't have enough data," or "I want to use machine learning, but I have no clue where to start." So it really starts all the way there, and goes up to the experts who ask very specific things.

It's interesting — is there a good answer? It boils down, oftentimes, to digitizing data. Whatever you want automated, whatever data you want to make predictions based on, you have to make sure that it's in an organized form. And within the TensorFlow ecosystem, you're now providing more and more datasets and more pre-trained models. Are you finding yourself also being the organizer of datasets?

Yes, I think with TensorFlow Datasets, which we just released, that's definitely come up. People want these datasets; can we organize them and make that easier? So that's definitely one important thing. The other related thing, I would say, is I often tell people: you know what, don't think of the fanciest thing, the newest model that you see. Make something very basic work, and then you can improve it. There are just lots of things you can do with it.

Yeah, start with the basics. One of the big things that made TensorFlow even more accessible was the appearance, whenever that happened, of Keras — the Keras standard, sort of outside of TensorFlow. I think it was Keras on top of Theano at first only, and then Keras became available on top of TensorFlow. Do you know when Keras chose to also add a TensorFlow backend? Was it just the community that drove that initially? Do you know if there were discussions, conversations?

Yeah, so François started the Keras project before he was at Google, and the first backend was Theano — I don't remember if that was after TensorFlow was created or way before. And then at some point, when TensorFlow started becoming popular, there were enough similarities that he decided to create this interface and put TensorFlow in as a backend. I believe that might still have been before he joined Google, so we weren't really talking about it; he decided on his own, and thought it was interesting and relevant to the community. In fact, I didn't find out about him being at Google until a few months after he was here. He was working on some research ideas and doing Keras as his nights-and-weekends project.

So he wasn't part of the TensorFlow team?

He didn't join that team; he joined research, and he was doing some amazing work — he has papers on the research he's done; he's a great researcher as well. And at some point we realized: oh, he's doing this good stuff, people seem to like the API, and he's right here. So we talked to him, and he said, "OK, why don't I come over to your team and work with you for a quarter, and let's make that integration happen?" And we talked to his manager, and he said, "Sure, my quarter's fine." And that quarter's been something like two years now.

So Keras got integrated into TensorFlow in a deep way. And now, with TensorFlow 2.0, Keras is kind of the recommended way for a beginner to interact with TensorFlow, which makes that initial transfer learning, or the basic use cases, even for an enterprise, super simple, right?

That's right.

So what was that decision like? That seems like a kind of bold decision as well.

We did spend a lot of time thinking about that one. We had a bunch of APIs, some built by us — there was a parallel layers API that we were building — and then we decided to do Keras in parallel. So there were two things that we were looking at, and the first thing we tried to do was just have them look similar: be as integrated as possible, share all of that stuff. There were also like three other APIs that others had built over time, because we didn't have a standard one. But one of the messages that we kept hearing from the community was: OK, which one do we use? They kept seeing, here's a model in this one, and here's a model in that one — which should I pick? So we had to address that head-on with 2.0. The whole idea is, you need to simplify; you had to pick one. Based on where we were, we said: OK, let's see what the people like. And Keras was clearly one that lots of people loved; there were lots of great things about it. So we settled on that.

Organically — that's kind of the best way to do it. And it was great, because it was surprising, nevertheless, to sort of bring in an outside project. I mean, there was a feeling like Keras might be almost like a competitor, in a certain kind of way, to TensorFlow, and in a sense it became an empowering element of TensorFlow.

That's right, yeah. It's interesting how you can put two things together which can align.
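As a sketch of what that Keras-first path looks like in TF 2.0 — assuming TensorFlow 2.x, with layer sizes that are arbitrary, just to illustrate the API:

```python
import tensorflow as tf

# A minimal tf.keras model: declare the input, stack layers, compile.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),                       # 4 input features
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),   # 3 output classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# model.fit(features, labels, epochs=5) would train it on real data.
```

Three lines of layer definitions plus a compile call is the whole model definition, which is the simplicity being described here.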
And in this case, I think François, the team, and I — a bunch of us — have chatted, and I think we all want to see the same kinds of things. We all care about making it easier for the huge set of developers out there, and that makes a difference.

So Python has Guido van Rossum, who until recently held the position of Benevolent Dictator for Life, right? Does a huge, successful open-source project like TensorFlow need one person who makes the final decision? You did a pretty successful TensorFlow Dev Summit just now, the last couple of days. There's clearly a lot of different new features being incorporated into an amazing ecosystem, and so on. How are those design decisions made? Is there a BDFL in TensorFlow, or is it more distributed and organic?

I think it's somewhat different, I would say. I've always been involved in the key design directions, but there are lots of things that are distributed, where there are a number of people — Martin Wicke being one, who has really driven a lot of our open-source work, a lot of the APIs in there — and a number of other people who have pushed and been responsible for different parts of it. We do have regular design reviews. Over the last year we've really spent a lot of time opening up to the community and adding transparency. We're setting more processes in place — RFCs, special interest groups — to really grow the community and scale it. I don't think we could scale to the size of this ecosystem with me as the sole decision-maker.

Yeah. So the growth of that ecosystem — maybe you can talk about it a little bit. First of all, when it started with Andrej Karpathy, when he first put out ConvNetJS, the fact that you could train a neural network in the browser, in JavaScript, was incredible. So now TensorFlow.js is really making that a serious, legit way to operate, whether it's in the back end or the front end. Then there's TensorFlow Extended, like you mentioned. There's TensorFlow Lite for mobile. And all of it, as far as I can tell, is really converging towards being able to save models in the same kind of way, move them around — you can train on the desktop and then move it to mobile, and so on. Is that the cohesiveness? Maybe give me, whatever I missed, a bigger overview of the mission of the ecosystem that's being built, and where it's moving forward.

Yeah, so, in short, the way I like to think of this is: our goal is to enable machine learning, and in a couple of ways. One is: there are lots of exciting things going on in ML today. We started with deep learning, but we now support a bunch of other algorithms too. So one direction is, on the research side, to keep pushing on the state of the art: how do we enable researchers to build the next amazing thing? BERT came out recently; it's great that people are able to do new kinds of research, and there's lots of amazing research happening across the world. So that's one direction. The other is: how do you take that across to all the people outside, who want to take that research, do some great things with it, and integrate it to build real products, to have a real impact on people? So that's the other axis, in some ways.

At a high level, one way I think about it is: there's a crazy number of compute devices across the world, and we often used to think of ML, and training, and all of this as something you do either on the workstation, or in the data center, or in the cloud. But we see things running on phones; we see things running on really tiny chips — we had some demos at the developer summit. So the way I think about this ecosystem is: how do we help get machine learning onto every device that has the compute capability? And that continues to grow. In some ways, this ecosystem has looked at various aspects of that and grown over time to cover more of them, and we continue to push the boundaries. In some areas we've built more tooling around it to help you. I mean, the first tool we started with was TensorBoard — you wanted to see just the training piece. TFX, or TensorFlow Extended, is there to do your entire ML pipeline, if you care about all that production stuff. But then there's going to the edge, going to different kinds of devices. And it's not just us now; we're at a place where there are lots of libraries being built on top. There are some for research, maybe things like TensorFlow Agents or TensorFlow Probability, that started as research things, for researchers focusing on certain kinds of algorithms, but they're also being used by production folks. Some have come from within Google, from teams across Google who wanted to build these things. Others have come from the community, because there are different pieces that different parts of the community care about. And I see our goal as enabling even that, right? We cannot, and won't, build every single thing; that just doesn't make sense. But if we can enable others to build the things they care about, and there's a broader community that cares about those things, and we can help encourage that — that's great. That really helps the entire ecosystem, not just us. One of the big things about 2.0 that we're pushing on is: OK, we have so many different pieces, right? How do we help make all of them work well together? There are a few key pieces there that we're pushing on, one being the core format — how we share the models themselves, through SavedModel and TensorFlow Hub and so on — and a few other pieces that really put this together.

I was very skeptical when TensorFlow.js came out — or deeplearn.js, as it was first called. It seemed like a technically very difficult project. As a standalone it's not as difficult, but as a thing that integrates into the ecosystem, it seems very difficult.
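That shared-model idea — train once, then hand the same artifact to Serving, Lite, or the JS converter — rests on the SavedModel format mentioned above. A minimal sketch, assuming TensorFlow 2.x; the module and the path are illustrative stand-ins for a real trained model:

```python
import tensorflow as tf

class Scaler(tf.Module):
    """A tiny stand-in for a trained model: y = w * x."""

    def __init__(self):
        super().__init__()
        self.w = tf.Variable(3.0)

    @tf.function(input_signature=[tf.TensorSpec(shape=None, dtype=tf.float32)])
    def __call__(self, x):
        return self.w * x

module = Scaler()
export_dir = "/tmp/demo_saved_model"  # illustrative path

# Export as a SavedModel: a language-neutral graph plus weights that
# TensorFlow Serving, TF Lite, and the TF.js converter can all consume.
tf.saved_model.save(module, export_dir)

# Reload it elsewhere; the restored object is callable like the original.
restored = tf.saved_model.load(export_dir)
result = restored(tf.constant(2.0))  # 3.0 * 2.0 = 6.0
```

Because the export is a traced graph rather than Python code, the consumer of the SavedModel doesn't need the original class definition at all, which is what makes handoffs between tools in the ecosystem possible.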
many challenges have to be overcome here a lot and still have to be yes that's the other question here too there are lots of steps to a training leave iterated over the last few years so there's lot we've learned I yeah often when things come together well things look easy that's exactly the point it should be easy for the end user but there are lots of things that go behind that if I think about still challenges ahead there are you know we have a lot more devices coming on board for example from the hardware perspective how do we make it really easy for these vendors to integrate with something like tensorflow right so there's a lot of compiler stuff that others are working on there things we can do in terms of our API is and so on that we can do as we you know tensorflow started as a very monolithic system and to some extent it still is there are less lots of tools around it but the core is still pretty large in monolithic one of the key challenges for us to scale that out is how do we break that apart with clear interfaces it's you know in some ways its software engineering 101 but for a system that's now four years old I guess or more and that's still rapidly evolving and that we're not slowing down with it's hard to you know change and modify and really break apart it's sort of like as people say right it's like changing the engine with a car running or fake professor that's exactly what we're trying to do so there's a challenge here because the downside of so many people being excited about tensorflow and becoming to rely on it in many of their applications is that you're kind of responsible it's the technical debt you're responsible for previous versions to some degree still working so when you're trying to innovate I mean it's probably easier to just start from scratch every few months absolutely so do you feel the pain of that a 2.0 does break some back compatibility but not too much it seems like the conversion is pretty straightforward and do you think 
Do you think that's still important, given how quickly deep learning is changing? Can you just throw away the things you've learned and start over, or is there pressure not to?

It's a tricky balance. If it was just a researcher writing a paper, who a year later will never look at that code again — sure, it doesn't matter. But there are a lot of production systems that rely on TensorFlow, at Google and across the world, and people worry about this; these systems run for a long time. So it is important to keep that compatibility, and yes, it does come with a huge cost: we have to think about a lot of things as we make new changes. I think it's a trade-off. You might slow certain kinds of things down, but the overall value you're bringing because of it is much bigger — because it's not just about not breaking the person from yesterday, it's also about telling the person coming tomorrow: this is how we do things, and we're not going to break you when you come on board. There are lots of new people who are also going to come on board. One way I like to think about this, and I always push the team to think about it as well: when you want to do new things, you want to start with a clean slate. Design with a clean slate in mind, and then figure out how to make sure all the other things still work. Yes, we do make compromises occasionally, but unless you've designed with the clean slate, without worrying about all that, you'll never get to a good place.

That's brilliant — so even though you are responsible, in the idea stage, when you're thinking of something new, you just put all that behind you. — Yeah, that's right. — That's really well put. So I have to ask this, because a lot of students and developers ask me how I feel about PyTorch versus TensorFlow. I've recently completely switched my research group to TensorFlow; I wish everybody would just use the same thing, and TensorFlow is as close to that as we have. But do you enjoy competition? TensorFlow is leading in many ways, on many dimensions — in terms of the ecosystem, the number of users, momentum, production readiness and so on — but a lot of researchers are now also using PyTorch. Do you enjoy that kind of competition, or do you just ignore it and focus on making TensorFlow the best it can be?

Just like in research, or anything people do, it's great to get different kinds of ideas. When we started with TensorFlow, like I was saying earlier, it was very important for us to also have production in mind — we didn't want just research — and that's why we chose certain things. Then PyTorch came along and said: I only care about research, this is what I'm trying to do, what's the best thing I can do for that? And it started iterating: okay, I don't need to worry about graphs, let me just run things; I don't care if it's not as fast as it can be, let me just make this part easy. There are things you can learn from that. They had the benefit of seeing what had come before, but also of exploring different kinds of spaces, and they had some good things there, building on things like Chainer before that. So competition is definitely interesting. This was an area we had thought about very early on, and over time we revisited it a couple of times. At some point we said: it seems like this can be done well, so let's try it again. That's how we started pushing on eager execution — how do we combine those two together — which has finally come together very well in 2.0, though it took us a while to get all the pieces in place.

So let me ask it another way. I think eager execution is a really powerful thing that was added. Do you think — you know, Muhammad Ali versus Frazier — do you think it wouldn't have been added as quickly if PyTorch wasn't there?

It might have taken longer. We had tried some variants of that before, so I'm sure it would have happened, but it might have taken longer.

I'm grateful that TensorFlow responded the way it did; it's been doing some incredible work the last couple of years. What are the things we didn't talk about that you're looking forward to in 2.0? We talked about some of the ecosystem stuff, making it easily accessible through Keras, eager execution — are there other things we missed?

I would say one is just where 2.0 is, and, with all the things we've talked about, what it enables us to do beyond that — things we're excited about. What it's setting us up for: here are these really clean APIs; we've cleaned up the surface for what the users want; and that also allows us to do a whole bunch of stuff behind the scenes once we're ready with 2.0. For example, in TensorFlow with graphs and all the things you could do, you could always get a lot of good performance if you spent the time to tune it, and we've clearly shown that lots of people do. With 2.0, with these APIs where they are, we can give you a lot of performance just with whatever you write — because the APIs are much cleaner, we know most people are going to do things this way, and we can really optimize for that and get a lot of it out of the box. It really allows us, both for single-machine and distributed training, to explore other spaces behind the scenes in future versions as well; the team is really excited about that, and over time I think we'll see it. The other piece I was talking about — restructuring the monolithic thing into more pieces and making it more modular — I think that's going to be really important for a lot of the other people in the ecosystem.
Organizations and so on that want to build things. Can you elaborate a little bit on what you mean by making TensorFlow more modular?

The way it's organized today, there are lots of repositories in the TensorFlow organization on GitHub. The core one, where we have TensorFlow itself, has the execution engine, the key backends for CPUs and GPUs, and the code to do the distributed stuff, and all of these just work together in a single library or binary. There's no way to split them apart easily. There are some interfaces, but they're not very clean. In a perfect world, you would have clean interfaces where, if I want to run it on my fancy cluster with some custom networking, I just implement this interface and I'm done. We kind of support that, but it's hard for people today. As we start to see more interesting things in some of these spaces, having that clean separation will really start to help. And given the large size of the ecosystem and the different groups involved, enabling people to evolve and push on things more independently just allows it to scale better.

And by people you mean individual developers and organizations?

Individuals and organizations, that's right.

So the hope is that major corporations — I don't know, Pepsi or something — go to TensorFlow and do this kind of thing?

If you look at enterprises like Pepsi, a lot of them are already using TensorFlow. They're not usually the ones doing development or changes in the core — some do, but a lot don't; they contribute small pieces. There are lots of these: some are hardware vendors who are building custom hardware and want their own pieces to integrate; some are bigger companies, say IBM — they're involved in some of our special interest groups, and they see a lot of users who want certain things, and they want to optimize for that. So folks like that.

Autonomous vehicle companies, perhaps? — Exactly, yes.

So, like I mentioned, TensorFlow has been downloaded 41 million times, with 50,000 commits, almost 10,000 pull requests, and 1,800 contributors. I'm not sure if you can explain it, but what does it take to build a community like that? In retrospect, what do you think was the critical thing that allowed this growth to happen, and how does that growth continue?

That's an interesting question. I wish I had all the answers there, I guess, so you could replicate it. I think a number of things need to come together. One, just like with any new thing, there's a sweet spot of timing, of what's needed — and does it grow with what's needed. In this case, for example, TensorFlow hasn't grown just because it was a good tool; it's also grown with the growth of deep learning itself. So those factors come into play. Beyond that, though, I think it's hearing and listening to the community — what they need — and being open to external contributions. We've spent a lot of time making sure we can accept contributions well, that we can help contributors get them in, putting the right processes in place, building the right kind of community, and welcoming people. Over the last year we've really pushed on transparency, which is important for an open source project: people want to know where things are going, so we say, here's a process where you can see that — here are RFCs, and so on. There are lots of community aspects you can really work on. As a small project it's maybe easy, because there are two developers and you can just do those things; as you grow, you put more of these processes in place — thinking about the documentation, about what developers care about, what kinds of tools they want to use. All of these come into play.

So one of the big things that feeds the TensorFlow fire is people building something on TensorFlow — someone implements a particular architecture that does something cool and useful, and they put it out there on GitHub — and that just feeds this growth.
Do you have a sense that with 2.0 and 1.0 there may be a bit of a partitioning, like with Python 2 and 3, where there will be code bases on older versions of TensorFlow that won't be easily compatible? Or are you pretty confident this kind of conversion is pretty natural and easy to do?

We're definitely working hard to make it very easy. There's lots of tooling, which we talked about at the developer summit this week, and we'll continue to invest in that tooling. With any significant version change there's always a risk, and we're really pushing hard to make the transition very, very smooth. At some level, people move when they see the value in the new thing; they don't want to move just because it's new. Some people do, but most people want a really good thing, and I think over the next few months, as people start to see the value, we'll see that shift happening. So I'm pretty excited and confident that we'll see people moving. And as you said earlier, this field is moving rapidly, so that'll help, because the new things will clearly happen in 2.x — so people will have lots of good reasons to move.

So what do you think TensorFlow 3.0 looks like? Is everything moving so crazily that even the end of this year seems impossible to plan for, or is it possible to plan for the next five years?

I think it's tricky. There are some things we can expect: yes, change is going to happen; some things will stick around and some won't. I would say the basics of deep learning — say, convolutional models, the basic kinds of things — will probably still be around in some form in five years. Will RNNs and GANs stay? Very likely, based on where they are. Will we have new things? Probably, but those are hard to predict. Directionally, some things we can see in what we're starting to do right now: with 2.0, combining eager execution and graphs, we're starting to make it more like just your natural programming language — you're not trying to program something else. Similarly with Swift for TensorFlow, we're taking that approach: can you do something from the ground up? Some of those ideas seem like the right direction, and in five years we expect to see more in that area. Other things we don't know: will hardware accelerators be the same? Will we be able to train with four bits instead of 32 bits? I think the TPU side of things is exploring that — TPUs are already on version three.

It seems that the evolution of TPUs and TensorFlow are co-evolving, in the sense that both are learning from each other, and from the community, and from the applications where the biggest benefit is achieved. — That's right.

You've been trying, with eager execution, with Keras, to make TensorFlow as accessible and easy to use as possible. What do you think, for beginners, is the biggest thing they struggle with? Have you encountered that, or is it basically what Keras and eager, like we talked about, are solving?

For some of them, like you said, beginners just want to be able to take some image model — they don't care if it's Inception or ResNet or something else — and do some training or transfer learning on that kind of model. Being able to make that easy is important. In some ways you do that by providing them simple models, say in TensorFlow Hub or so on; they don't care what's inside the box, they just want to be able to use it. So we're pushing on different levels. At one end there's a component you get with the layers already smushed in.
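The transfer-learning workflow described here — grab a pre-trained image model, freeze it, train a new head — might be sketched as below. To stay self-contained this uses a Keras MobileNetV2 backbone with random weights (`weights=None`); in practice you would load pre-trained weights or a TensorFlow Hub module instead, and the five-class head is an arbitrary example:

```python
import tensorflow as tf

# Backbone: a known architecture without its classification top.
# weights=None keeps the sketch offline; real transfer learning would
# use weights="imagenet" or a hub.KerasLayer.
base = tf.keras.applications.MobileNetV2(
    input_shape=(96, 96, 3), include_top=False, weights=None)
base.trainable = False  # freeze the backbone; only the new head trains

# New head for a hypothetical 5-class task.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

From here, `model.fit` on a small labeled dataset trains just the pooling-plus-dense head, which is why this recipe works even for beginners with little data or compute.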
Beginners probably just want that. The next step is building models with layers. And if you go out to research, people are probably writing custom layers themselves. So there's a whole spectrum there.

And providing the pre-trained models seems to really decrease the time it takes to get started — you can basically achieve what you need in a Colab notebook. So I'm basically answering my own question, because I think what TensorFlow delivered on recently is making this trivial for beginners. I was just wondering if there were other pain points you've tried to ease, but those are probably the big ones.

Those are probably the big ones. I see high schoolers doing a whole bunch of things now — it's pretty amazing.

It's both amazing and terrifying — in the sense that, when they grow up, some incredible ideas will be coming from them. So, there's certainly a technical aspect to your work, but you also have a management aspect to your role with TensorFlow, leading the project, a large number of developers and people. What do you look for in a good team? Google has been at the forefront of exploring what it takes to build a good team, and TensorFlow is one of the most cutting-edge technologies in the world — so in this context, what do you think makes for a good team?

It's definitely something I think a fair bit about. In terms of the team being able to deliver something well, one of the important things is cohesion across the team — being able to execute together. At this scale, an individual engineer can only do so much; there's a lot more they can do together, even though we have some amazing superstars across Google and on the team. Often, the way I see it, the product the team generates is far larger than the individuals put together. So how do we get all of them to work together? The culture of the team itself matters. Hiring good people is important, but it's not just that you hire smart people, throw them together, and let them do things. People have to care about what they're building; people have to be motivated by the right kinds of things — that's often an important factor. And finally, how do you put that together with a somewhat unified vision of where you want to go — are we all looking in the same direction, or is everyone going all over the place? Sometimes it's a mix. Google is a very bottom-up organization in some sense — research even more so — and that's how we started. But as we've become this larger product and ecosystem, I think it's also important to combine that with a direction: here's where we want to go; there's exploration we'll do around it, but let's keep heading that way, not just wander all over the place.

And is there a way you monitor the health of the team — a way you know you did a good job, that things are going well? You're saying nice things, but it's sometimes difficult to determine how aligned people are, because it's not binary; there are tensions and complexities. And the other element of this is the superstars: even at Google, such a large percentage of the work is done by individual superstars, and sometimes those superstars can cut against the dynamic of a team. Though I'm sure in TensorFlow it might be a little easier, because the mission of the project is so beautiful — you're at the cutting edge, it's exciting.

Have you struggled with that? Have there been challenges?

There are always people challenges, in different kinds of areas. That said, I think we've been helped by getting people who care and who have the same kind of culture — that's Google in general, to a large extent. But also, like you said, given that the project has had so many exciting things to do, there's been room for lots of people to do different kinds of things and grow, which does make the problem a bit easier, I guess. It allows people, depending on what they're doing, to find room around them. But yes, we do care, superstar or not, that they work well with the team.

That's interesting to hear. So superstar or not, productivity, broadly, is about the team.

Yeah — they might add a lot of value, but if they're hurting the team, that's a problem.

So, hiring engineers — the hiring process is so interesting. What do you look for? How do you determine a good developer, or a good member of a team, from just a few minutes or hours together? Again, no magic answers, I'm sure.

Google has a hiring process that we've refined over the last 20 years, I guess, and that you've probably heard and seen a lot about, so we work with that same process, and it has really helped. For me in particular, I would say that in addition to the core technical skills, what matters is their motivation — what they want to do. If that doesn't align well with where we want to go, it's not going to lead to long-term success for them or for the team. And I think that becomes more important the more senior the person is, but it matters at every level: even the most junior engineer, if they're not motivated to do well at what they're trying to do, however smart they are, it's going to be hard for them to succeed.
Does the Google hiring process touch on that passion? As far as I understand — maybe you can speak to it — the hiring process determines the skill set: there's the puzzle-solving, the problem-solving ability. But determining whether a person has that fire inside them, that burns to do something — it almost doesn't matter what, just some cool stuff — is that something that ultimately comes out when they have a conversation with you, once it gets closer to joining the team?

One of the things we do have as part of the process is a culture fit, as part of the interview process itself. In addition to the technical skills, each interviewer is supposed to rate the person on culture and on culture fit with Google, so that is definitely part of the process. Now, there are various kinds of projects, so there can be variance in the kind of culture you want, and yes, that does vary. For example, TensorFlow has always been a fast-moving project, and we want people who are comfortable with that. But at the same time, we're now also a very full-fledged product, and we want to make sure the things that work really, really work right — you can't cut corners all the time. Balancing that out, and finding the people who are the right fit for that, is important. Those things do vary a bit across projects, teams, and product areas across Google, so you'll see some differences in the final checklist, but a lot of the core culture comes along with just the engineering excellence.

What is the hardest part of your job?

It's fun, I would say. Hard — yes, lots of things at different times; it varies.

Let me clarify: difficult things are fun — when you solve them.

Yes, fun in that sense. I think the key to a successful thing across the board — in this case it's a large ecosystem now, but even for a small product — is striking that fine balance across the different aspects of it. Sometimes it's how fast you go versus how perfect it is. Sometimes it's how you involve this huge community: who do you involve, and when do you say, okay, now is not a good time to involve them because it's not the right fit. Sometimes it's saying no to certain kinds of things. Those are often the hard decisions. Some you make quickly because you don't have the time; some you get time to think about; but they're always hard, because both choices are often pretty good. That's what makes those decisions hard.

What about deadlines? Do you find TensorFlow to be driven by deadlines, to the degree a product might be, or is there still a balance? It seems less deadline-driven — but you had the dev summit, and that came together incredibly well; it didn't look like it, but there were clearly a lot of moving pieces. Did that deadline make people rise to the occasion, releasing the TensorFlow 2.0 alpha? I'm sure that was done last-minute as well.

Up to the last point, yes. Again, it's one of those things where you need to strike a good balance. There's some value that deadlines bring: a sense of urgency, getting the right things together — instead of getting the perfect thing out, you ship something that's good and works well. The team definitely did a great job putting that together, and I was amazed and excited by how it came together. That said, across the year we try not to set artificial deadlines. We focus on the key things that are important, figure out how much of each is important, and we develop in the open — internally and externally, everything is available to everybody, so you can look at exactly where things are. We do releases at a regular cadence, so if something doesn't make it this month, it'll be in the next release, in a month or two, and that's okay. But we want to keep moving as fast as we can in these different areas, because we can iterate and improve on things. Sometimes it's okay to put things out that aren't fully ready, as long as it's clear that this is experimental — but it's out there if you want to try it and give feedback, and that's very, very useful. That quick cycle and quick iteration is important; that's what we focus on, rather than a deadline by which you must have everything.

2.0 is about to be released — is there pressure to make it stable? For example, WordPress 5.0 just came out after a lot of build-up; it was delivered way too late, and they said, okay, we're going to release a lot of updates really quickly to improve it. Do you see TensorFlow 2.0 in that same kind of way, or is there pressure that once it hits 2.0, once you get through the release candidate to the final, that's going to be the stable thing?

It's going to be stable in the sense that the APIs that are there are going to remain and work. That doesn't mean we can't change things under the covers, and it doesn't mean we can't add things. There's still a lot more for us to do, and we'll keep having more releases — so in that sense, I don't think we'll be done two months after we release this.

I don't know if you can say, but — there are no external deadlines for TensorFlow 2.0, but are there internal deadlines, artificial or otherwise, that you set for yourselves, or is it whenever it's ready?

We want it to be a great product, and that's the big, important piece for us. TensorFlow is already out there — we have 41 million downloads of 1.x — so it's not like we have to rush it out.

It's not like people are waiting on it.

Exactly. A lot of the features that we're really polishing and putting together don't have to be rushed just for the sake of it. In that sense, we want to get it right and really focus on that. That said, we have said that we're looking to get this out in the next few months, in the next quarter, and as far as possible we'll try to make that happen.

My favorite line was "spring is a relative concept." — Yes. — Spoken like a true developer.

So, something I'm really interested in from your previous line of work: before TensorFlow, you led a team at Google on search ads. I think this is a very interesting topic on every level, including the technical level, because at their best, ads connect people to the things they want and need, and at their worst, they're these things that annoy the heck out of you to the point of ruining the entire user experience of whatever you're actually doing. So they have a bad rap, I guess. On the other end, connecting users to the things they need and want is a beautiful opportunity for machine learning to shine — huge amounts of personalized data, and you've got to map it to the thing the person actually wants without annoying them. So, what have you learned from that time at Google, which is leading the world in this area? And what do you think is the future of ads?

That takes me back a few years — yes, it's been a while — but I totally agree with what you said. Search ads, the way it was always looked at, and I believe still is, is as an extension of what search is trying to do. The goal is to make the world's information accessible — and with ads it's not just information, but maybe products or other things that people care about. So it's really important for them to align with what the users need.
And in search ads, there's a minimum quality level before an ad will be shown. If we don't have an ad that hits that quality bar, it won't be shown, even if we have one — okay, maybe we lose some money there, but that's fine. That is really, really important, and it's something I really liked about being there. Advertising is a key part of the web — as a model it's been around for ages, it's not new; it was adapted to the web and became a core part of search, at Google and at many other search engines across the world. Like I said, there are aspects of ads that are annoying: if I go to a website and things keep popping up in my face, not letting me read, that's clearly annoying. So I hope we can strike the balance between showing a good ad — one that's valuable to the user and provides the monetization for the service, whether that's search or a website, because these services do need monetization to exist — versus showing random stuff that's distracting rather than actually valuable.

So do you see it continuing, moving forward, as the model that funds businesses like Google? It's a significant revenue stream, and that's one of the most exciting but also limiting things about the Internet: nobody wants to pay for anything. And advertisements, again, at their best are actually really useful and not annoying. Do you see that continuing and growing and improving, or do you see more Netflix-type models, where you have to start to pay for content?

I think it's going to be a mix. It'll take a long while for everything to be paid on the Internet, if that ever happens — probably not; I think there will always be things that are monetized with things like ads. But over the last few years we've definitely seen a transition towards more paid services across the web, and people are willing to pay for them because they see the value. Netflix is a great example; YouTube is doing things there; people pay for the apps they buy. More people, I find, are willing to pay for newspaper content, for the good news websites across the web. That wasn't the case even a few years ago, and I see that change in myself and in lots of people around me. So I'm definitely hopeful we'll see that transition to a mixed model — maybe you get to try something for free, maybe with ads, but then there's a clearer revenue model that goes beyond that.

So, speaking of revenue: how is it that a person can use a TPU in a Google Colab for free? I guess the question is, what's the future of TensorFlow in terms of empowering, say, a class of 300 students, maybe at MIT? What's the future of them being able to do their homework in TensorFlow — where are they going to train these networks? What does that future look like with TPUs, with cloud services, and so on?

A number of things there. TensorFlow is open source, so you can run it wherever — you can run it on your desktop, and desktops keep getting more powerful, so maybe you can do more. My phone is I-don't-know-how-many times more powerful than my first desktop.

You'll probably be able to train on your phone, eventually.

Right — so in that sense, the power you have in your hands is a lot more. Clouds are actually very interesting from, say, a student's or a course's perspective, because they make it very easy to get started. The great thing about Colab is you go to a website and it just works — no installation needed, nothing; you're just there and things are working. That's really the power of the cloud as well, and I do expect that to grow. Colab is a free service — it's great for getting started, playing with things, exploring — but that said, with free you can only get so much.

Yeah — so, just like we were talking about, free versus paid: there are services you can pay for and get a lot more.

Great. So, for a complete beginner interested in machine learning and TensorFlow, what should they do?

Probably start by going to our website and playing — there's a lot on tensorflow.org, just start clicking on things. Check out our tutorials and guides; there's stuff you can just click on that takes you to a Colab where you can do things, no installation needed. You can get started right there.

Okay, awesome. Rajat, thank you so much for talking today. — Thank you. This was great.
Info
Channel: Lex Fridman
Views: 28,565
Rating: 4.9649563 out of 5
Id: NERNE4UThHU
Length: 70min 58sec (4258 seconds)
Published: Mon Jun 03 2019