Andrew Ng: Deep Learning, Education, and Real-World AI | Lex Fridman Podcast #73

Captions

The following is a conversation with Andrew Ng, one of the most impactful educators, researchers, innovators, and leaders in the artificial intelligence and technology space in general. He co-founded Coursera and Google Brain, launched deeplearning.ai, Landing AI, and the AI Fund, and was the chief scientist at Baidu. As a Stanford professor, and with Coursera and deeplearning.ai, he has helped educate and inspire millions of students, including me. This is the Artificial Intelligence Podcast. If you enjoy it, subscribe on YouTube, give it five stars on Apple Podcasts, support it on Patreon, or simply connect with me on Twitter at lexfridman, spelled F-R-I-D-M-A-N. As usual, I'll do one or two minutes of ads now and never any ads in the middle that can break the flow of the conversation. I hope that works for you and doesn't hurt the listening experience. This show is presented by Cash App, the number one finance app in the App Store. When you get it, use code LexPodcast. Cash App lets you send money to friends, buy Bitcoin, and invest in the stock market with as little as one dollar. Brokerage services are provided by Cash App Investing, a subsidiary of Square and member SIPC. Since Cash App allows you to buy Bitcoin, let me mention that cryptocurrency in the context of the history of money is fascinating. I recommend The Ascent of Money as a great book on this history. Debits and credits on ledgers started over 30,000 years ago, the US dollar was created over 200 years ago, and Bitcoin, the first decentralized cryptocurrency, was released just over ten years ago. So given that history, cryptocurrency is still very much in its early days of development, but it's still aiming to, and just might, redefine the nature of money. So again, if you get Cash App from the App Store or Google Play and use the code LexPodcast, you'll get $10, and Cash App will also donate $10 to FIRST, one of my favorite organizations that is helping to advance robotics and STEM education for young people around the world. And now, here's my conversation
with Andrew Ng. The courses you taught on machine learning at Stanford, and later on Coursera, which you co-founded, have educated and inspired millions of people. So let me ask you: what people or ideas inspired you to get into computer science and machine learning when you were young? When did you first fall in love with the field, is another way to put it. Growing up in Hong Kong and Singapore, I started learning to code when I was five or six years old. At that time I was learning the BASIC programming language, and they would take these books and, you know, they'd tell you, type this program into your computer. So I typed that program into my computer, and as a result of all that typing I would get to play these very simple shoot-'em-up games that, you know, I had implemented on my own computer. So I thought it was fascinating as a young kid that I could write this code, which was really just copying code from a book into my computer, to then play these cool video games. Another moment for me was when I was a teenager and my father, because he's a doctor, was reading about expert systems and about neural networks. So he got me to read some of these books, and I thought it was really cool that you could write a computer program that started to exhibit intelligence. Then I remember doing an internship while I was in high school, this was in Singapore, where I remember doing a lot of photocopying as an office assistant, and the highlight of my job was when I got to use the shredder. So the teenage me remembers thinking, boy, this is a lot of photocopying; if only we could write software, build a robot, something to automate this, maybe I could do something else. So I think a lot of my work since then has centered on the theme of automation. Even the way I think about machine learning today, we're very good at writing learning algorithms that can automate things that people can do. Or even launching the first MOOCs, massive open online courses, that later led to Coursera: I was trying to automate what could be automatable in how I
was teaching on campus. The process of education, you tried to automate parts of that, to have more impact from a single teacher, a single educator. Yeah, I felt, you know, teaching at Stanford, teaching machine learning to about 400 students a year at the time, I found myself filming the exact same video every year, telling the same jokes in the same room, and I thought, why am I doing this? Why don't we just take last year's video, and then I can spend my time building a deeper relationship with students. So that process of thinking through how to do that, that led to the first MOOCs that we launched. And then you have more time to write new jokes. Are there favorite memories from your early days at Stanford, teaching thousands of people in person and then millions of people online? You know, teaching online, what not many people know is that a lot of those videos were shot between the hours of 10:00 p.m. and 3:00 a.m. A lot of times, when we were launching the first MOOCs at Stanford, we had already announced the course and about a hundred thousand people had signed up. We had just started to write the code, and we had not yet actually filmed the videos. So, you know, a lot of pressure, a hundred thousand people waiting for us to produce the content. So many Fridays and Saturdays I would go out, have dinner with my friends, and then I would think, okay, do I want to go home now, or do I want to go to the office to film videos? And the thought of, you know, being able to help a hundred thousand people potentially learn machine learning, unfortunately, that made me think, okay, I'm going to go to my office, go to my tiny recording studio. I would adjust my Logitech webcam, adjust my, you know, Wacom tablet, make sure my lapel mic was on, and then I would start recording, often until 2:00 a.m. or 3:00 a.m.
I think I'm fortunate it doesn't show that it was recorded that late at night, but it was really inspiring, the thought that we could create content to help so many people learn about machine learning. How does that feel? The fact that you're probably somewhat alone, maybe with a couple of friends, recording with a Logitech webcam and kind of going home alone at 1:00 or 2:00 a.m. at night, and knowing that that's going to reach sort of thousands of people, eventually millions of people. What's that feeling like? I mean, is there a feeling of just satisfaction of pushing through? I think it's humbling, and I wasn't thinking about what I was feeling. I think one thing I'm proud to say we got right from the early days was that I told my whole team back then that the number one priority is to do what's best for learners, do what's best for students. And so when I went into the recording studio, the only thing on my mind was, what can I say, how can I design my slides, what do I need to draw, right, to make these concepts as clear as possible for learners. I think, you know, I've seen that sometimes for instructors it's tempting: hey, let's talk about my work, maybe if I teach you about my research someone will cite my papers a couple more times. And I think one of the things we got right, launching the first few MOOCs and later building Coursera, was putting in place that bedrock principle of let's just do what's best for learners and forget about everything else. And I think that as a guiding principle turned out to be really important to the rise of the MOOC movement. And the kind of learner you imagined in your mind is as broad as possible, as global as possible, so really trying to reach as many people interested in machine learning and AI as possible. I really want to help anyone that had an interest in machine learning to break into the field. And I think sometimes people ask me, hey, why are you spending so much time explaining gradient descent? And my answer was, if I look at what I think the learner needs
and would benefit from, I felt that having a good understanding of the foundations, coming back to the basics, would put them in better stead to then build a long-term career. So we've tried to consistently make decisions on that principle. So one of the things you actually revealed to the narrow AI community at the time, and to the world, is that the amount of people who are actually interested in AI is much larger than we imagined. By you teaching the class, and how popular it became, it showed that, wow, this isn't just a small community of sort of people who go to NeurIPS, it's much bigger. It's developers, it's people from all over the world. I mean, I'm Russian, so everybody in Russia is really interested. There's a huge number of programmers who are interested in machine learning: India, China, South America, everywhere. There are just millions of people who are interested in machine learning. So how big, do you get a sense, is this number of people that are interested, from your perspective? I think the number has grown over time. I think it's one of those things that maybe feels like it came out of nowhere, but as an insider building it, it took years. It's one of those overnight successes that took years to get there. My first foray into this type of education was when we were filming my Stanford class and sticking the videos on YouTube, and then some other things, uploading the homeworks and so on, but, you know, basically the one hour fifteen minute video that we put on YouTube. And then we had four or five other versions of websites that I had built, most of which you would never have heard of because they reached small audiences, but that allowed me to iterate, allowed my team and me to iterate, to learn which ideas work and which don't. For example, one of the features I was really excited about and really proud of was building this website where multiple people could be logged into the website at the same time. So today if you go to a website, you know, if
you're logged in and then I want to log in, you need to log out if it's the same browser, the same computer. But I thought, well, what if two people, say you and me, were watching a video together in front of the computer? What if a website could have you type your name and password, have me type my name and password, and then now the computer knows both of us are watching together, and it gives both of us credit for anything we do as a group. We built this feature and rolled it out in a high school in San Francisco. We had about 20-something users. The teacher there, at Sacred Heart Cathedral Prep, the teacher's great. And guess what, zero people used this feature. It turns out people studying online want to watch the videos by themselves, so you can play back and pause at your own speed, rather than in groups. So that was one example of a tiny lesson learned, out of many, that allowed us to hone in on the set of features. And it sounds like a brilliant feature. So I guess the lesson to take from that is, there's something that looks amazing on paper and then nobody uses it; it doesn't actually have the impact that you think it might have. And so, yeah, I saw that you really went through a lot of different features and a lot of ideas to arrive at Coursera, the final kind of powerful thing that showed the world that MOOCs can educate millions. And I think with the whole machine learning movement as well, I think it didn't come out of nowhere. Instead, what happened was, as more people learned about machine learning, they would tell their friends, and their friends would see how applicable it is to their work, and then the community kept on growing. And I think we're still growing. You know, I don't know in the future what percentage of all developers will be AI developers. I could easily see it being more than 50 percent, right? Because so many AI developers, broadly construed, not just people doing the machine learning modeling but the people building the infrastructure, data pipelines, you know,
all the software surrounding the core machine learning model, maybe it's even bigger. I feel like today almost every software engineer has some understanding of the cloud. Not all, you know, maybe a microcontroller developer doesn't need to deal with the cloud, but I feel like the vast majority of software engineers today sort of have an appreciation of the cloud. I think in the future maybe we'll approach nearly a hundred percent of all developers being, you know, in some way an AI developer, or at least having an appreciation of machine learning. And my hope is that there's this kind of effect that there are people who are not really interested in being a programmer or being into software engineering, like biologists, chemists, and physicists, even mechanical engineers, and all these disciplines that are now more and more sitting on large data sets. And here they didn't think they're interested in programming until they have this data set and they realize there's this set of machine learning tools that allow you to use the data set. So they actually learn to program and they become new programmers. So it's not just, as you've mentioned, that a larger percentage of developers become machine learning people; it seems like more and more the kinds of people who are becoming developers is also growing significantly. Yeah, I think once upon a time only a small part of humanity was literate, could read and write, and maybe you thought, maybe not everyone needs to learn to read and write. You know, you just go listen to a few monks read to you, and maybe that was enough. Or maybe we just need a handful of authors to write the bestsellers and then no one else needs to write. But what we found was that by giving as many people, you know, in some countries almost everyone, basic literacy, it dramatically enhanced human-to-human communications, and we can now write for an audience of one, such as if I send you an email or you send me an email. I think
in computing we're still in that phase where so few people know how to code that the coders mostly have to code for relatively large audiences. But if everyone, well, most people, became developers at some level, similar to how most people in developed economies are somewhat literate, I would love to see the owners of a mom-and-pop store be able to write a little bit of code to customize the TV display for their special this week. And I think it would enhance human-to-computer communications, which is becoming more and more important today as well. So you think it's possible that machine learning becomes kind of similar to literacy, where, like you said, the owners of a mom-and-pop shop, basically everybody in all walks of life, would have some degree of programming capability? I could see society getting there. There's one other interesting thing. You know, if I go talk to the mom-and-pop store, if I talk to a lot of people in their daily professions, I previously didn't have a good story for why they should learn to code. Yeah, we could give them some reasons. But what I found with the rise of machine learning and data science is that I think the number of people with a concrete use for data science in their daily lives, in their jobs, may be even larger than the number of people with a concrete use for software engineering. For example, if you run a small mom-and-pop store, I think if you can analyze the data about your sales, your customers, I think there's actually real value there, maybe even more than traditional software engineering. So I find that for a lot of my friends in various professions, be it recruiters or accountants or, you know, people that work in the factories, which I deal with more and more these days, I feel if they were data scientists at some level they could immediately use that in their work. So I think that data science and machine learning may be an even easier entree into the developer world for a lot of people than software engineering. That's interesting,
and I agree with that, that's beautifully put. We live in a world where most courses and talks have slides, PowerPoint, Keynote, and yet you famously often still use a marker and a whiteboard. The simplicity of that is compelling, and for me at least fun to watch. So let me ask, why do you like using a marker and whiteboard, even on the biggest of stages? I think it depends on the concepts you want to explain. For mathematical concepts, it's nice to build up the equation one piece at a time, and the whiteboard marker or the pen and stylus is a very easy way, you know, to build up the equation, to build up a complex concept one piece at a time while you're talking about it, and sometimes that enhances understandability. The downside of writing is that it's slow, and so if you want a long sentence it's very hard to write that. So I think there are pros and cons, and sometimes I use slides and sometimes I use a whiteboard or a stylus. The slowness of a whiteboard is also its upside, in that it forces you to reduce everything to the basics. Some of your talks involve the whiteboard, and you go very slowly and you really focus on the most simple principles, and that's beautiful. That enforces a kind of minimalism of ideas that I think, surprisingly, at least for me, is great for education. A great talk, I think, is not one that has a lot of content; a great talk is one that just clearly says a few simple ideas, and I think the whiteboard somehow enforces that. Pieter Abbeel, who's now one of the top roboticists and reinforcement learning experts in the world, was your first PhD student. So I bring him up just because I kind of imagine this must have been an interesting time in your life. Do you have any favorite memories of working with Pieter, your first student, in those uncertain times, especially before deep learning really sort of blew up? Any favorite memories from those times? You know, I was really fortunate to have had Pieter Abbeel as my first
PhD student, and I think even my long-term professional success builds on early foundations, or early work, that Pieter was so critical to. So I was really grateful to him for working with me. You know, what not a lot of people know is just how hard research was, and is. Pieter's PhD thesis was using reinforcement learning to fly helicopters, and so, you know, actually even today the website heli.stanford.edu, H-E-L-I dot stanford dot edu, is still up, and you can watch videos of us using reinforcement learning to make a helicopter fly upside down, fly loops, rolls. It's so cool, it's one of the most incredible robotics videos ever, so people should watch it. Oh yeah, thanks. It's inspiring. That's from like 2008 or seven or six, like that range? Yeah, something like that. Yeah, so it's over ten years old. That was really inspiring to a lot of people. Yeah, what not many people see is how hard it was. So Pieter and Adam Coates and Morgan Quigley and I were working on various versions of the helicopter, and a lot of things did not work. For example, it turns out one of the hardest problems we had was, when the helicopter's flying around upside down doing stunts, how do you figure out the position? How do you localize the helicopter? So we wanted to try all sorts of things. Having one GPS unit doesn't work, because when you're flying upside down the GPS unit's facing down, so it can't see the satellites. So we experimented with having two GPS units, one facing up and one facing down, so that one still faces up if you flip over. That didn't work, because the downward-facing one couldn't synchronize if you're flipping quickly. Morgan Quigley was exploring this crazy, complicated configuration of specialized hardware to interpret GPS signals, looking into FPGAs, completely insane. Spent about a year working on that. Didn't work. So I remember Pieter, great guy, him and me, you know, sitting down in my office looking at some of the latest things we had tried that didn't work, and saying, you know, darn it, like, what now? Because we tried so many things
and it just didn't work. In the end, what we did, and Adam Coates was crucial to this, was put cameras on the ground and use cameras on the ground to localize the helicopter. And that solved the localization problem, so that we could then focus on the reinforcement learning and inverse reinforcement learning techniques, to then actually make the helicopter fly. And, you know, I'm reminded, when I was doing this work at Stanford, around that time there were a lot of reinforcement learning theoretical papers but not a lot of practical applications. So the autonomous helicopter work, for flying helicopters, was one of the few, you know, practical applications of reinforcement learning at the time, which caused it to become pretty well known. I feel like we might have almost come full circle today. There's so much buzz, so much hype, so much excitement about reinforcement learning, but again we're hunting for more applications of all of these great ideas that the community has come up with. What was the drive, sort of in the face of the fact that most people were doing theoretical work? What motivated you, in the uncertainty and the challenges, to do the applied work, to get the actual system to work, in the face of fear, uncertainty, sort of the setbacks that you mentioned for localization? I like stuff that works in the physical world. So it's back to the shredder. And, you know, I like theory, but when I work on theory myself, and this is personal taste, I'm not saying anyone else should do what I do, but when I work on theory, I personally enjoy it more if I feel that the work I do will influence people, have positive impact, or help someone. I remember, many years ago, I was speaking with a mathematics professor, and I kind of just asked, hey, why do you do what you do? And then he said, you know, he had stars in his eyes when he answered, and this mathematician, not from Stanford, a different university, he said, I do what I do because it helps me to discover truth
and beauty in the universe. He had stars in his eyes when he said it. And I thought, that's great. I don't want to do that. I think it's great that someone does that, I'm fully supportive of people that do it, a lot of respect for people that do that, but I am more motivated when I can see a line to how the work that my teams and I are doing helps people. The world needs all sorts of people, I'm just one type. I don't think everyone should do things the same way as I do. But when I delve into either theory or practice, if I personally have conviction, you know, that here's a pathway to help people, I find that more satisfying. To have that conviction, that's your path. You were a proponent of deep learning before it gained widespread acceptance. What did you see in this field that gave you confidence? What was your thinking process like in that first decade of the, I don't know what that's called, the 2000s, the aughts? Yeah, I can tell you the thing we got wrong and the thing we got right. The thing we really got wrong was the early importance of unsupervised learning. So in the early days of Google Brain, we put a lot of effort into unsupervised learning rather than supervised learning. And there was this argument, I think it was around 2005, after NeurIPS, at that time called NIPS but now NeurIPS, had ended, and Geoff Hinton and I were sitting in the cafeteria outside, you know, the conference. We were having lunch, just chatting, and Geoff pulled up this napkin. He started sketching this argument on a napkin. It was very compelling, and I'll repeat it. The human brain has about a hundred trillion, so that's 10 to the 14, synaptic connections. You will live for about 10 to the 9 seconds. That's 30 years. You actually live for two by 10 to the 9, maybe three by 10 to the 9 seconds, so let's just say 10 to the 9. So if each synaptic connection, each weight in your brain's neural network, has just a one-bit parameter, that's 10 to the 14 bits you need to learn in up to 10 to the 9 seconds of your life. So via this simple argument, which has a lot of problems,
it's very simplified, that's 10 to the 5 bits per second you need to learn in your life. And I have a one-year-old daughter. I am not pointing out 10 to the 5 bits per second of labels to her. And I think I'm a very loving parent, but I'm just not going to do that. So from this, you know, very crude, definitely problematic argument, there's just no way that most of what we know is through supervised learning. The way you get so many bits of information is from sucking in images, audio, just experiences in the world. And so that argument, and there are a lot of known flaws with this argument that we could go into, really convinced me that there's a lot of power to unsupervised learning. So that was the part that we actually maybe got wrong. I still think unsupervised learning is really important, but in the early days, you know, 10, 15 years ago, a lot of us thought that was the path forward. Oh, so you're saying that perhaps was the wrong intuition for the time. For the time, that was the part we got wrong. The part we got right was the importance of scale. So Adam Coates, another wonderful person, fortunate to have worked with him, he was in my group at Stanford at the time, and Adam had run these experiments at Stanford showing that the bigger we train a learning algorithm, the better its performance. And it was based on that; there was a graph that Adam generated, you know, with the x-axis, y-axis, lines going up and to the right. So the bigger you make this thing, the better its performance; accuracy is the vertical axis. So it was really that chart that Adam generated that gave me the conviction that if we could scale these models way bigger than what we could on the few CPUs we had at Stanford, then we could get even better results. And it was really that one figure that Adam generated that gave me the conviction to go with Sebastian Thrun to pitch, you know, starting a project at Google, which became the Google Brain project. Google Brain, you co-founded Google Brain. And there, the intuition was scale
will bring performance for the system, so we should chase larger and larger scale. And I think people don't realize how groundbreaking that is. It's simple, but it's a groundbreaking idea that bigger data sets will result in better performance. It was controversial at the time. Some of my well-meaning friends, you know, senior people in the machine learning community, I won't name, but some of whom we know, my well-meaning friends came and were trying to give me friendly advice, like, hey Andrew, why are you doing this? This is crazy. It's in the neural network architecture, look at these architectures we're building. You just want to go for scale? Like, this is a bad career move. So my well-meaning friends, you know, some of them were trying to talk me out of it. But I find that if you want to make a breakthrough, you sometimes have to have conviction and do something before it's popular, since that lets you have a bigger impact. Let me ask you just a small tangent on that topic. I find myself arguing with people, saying that greater scale, especially in the context of active learning, so very carefully selecting the data set, but growing the scale of the data set, is going to lead to even further breakthroughs in deep learning. And there's currently pushback at that idea, that larger data sets are no longer the answer, so you want to increase the efficiency of learning, you want to make better learning mechanisms. And I personally believe that bigger data sets will still, with the same learning methods we have now, result in better performance. What's your intuition at this time on this dual side? Do we need to come up with better architectures for learning, or can we just get bigger, better data sets that will improve performance? I think both are important, and it's also problem dependent. So for a few data sets we may be approaching, you know, the Bayes error rate, or approaching or surpassing human-level performance, and then there's that
theoretical ceiling that we will never surpass, the Bayes error rate. But then I think there are plenty of problems where we're still quite far from either human-level performance or from the Bayes error rate, and bigger data sets with neural networks, without further algorithmic innovation, will be sufficient to take us further. But on the flip side, if we look at the recent breakthroughs using, you know, transformer networks for language models, it was a combination of novel architecture, but also scale had a lot to do with it. If we look at what happened with GPT-2 and BERT, I think scale was a large part of the story. Yeah, that's not often talked about, the scale of the data set it was trained on and the quality of the data set, because it was like Reddit threads that were upvoted highly. So there's already some weak supervision on a very large data set that people don't often talk about, right? I find that today we have maturing processes for managing code, things like Git, right, version control. It took us a long time to evolve the good processes. I remember when my friends and I were emailing each other C++ files in email, you know, but then we had, what was that, CVS, Subversion, Git, maybe something else in the future. We're very immature in terms of tools for managing data, and thinking about how to clean data and how to solve very hot, messy data problems. I think there's a lot of innovation there to be had still. I love the idea that you were versioning through email. I'll give you one example. When we work with manufacturing companies, it's not at all uncommon for there to be multiple labelers that disagree with each other, right? And so, when we were doing the work in visual inspection, we will, you know, take, say, a plastic part and show it to one inspector, and the inspector, sometimes very opinionated, will go, clearly that's a defect, this scratch is unacceptable, you've got to reject this part. Take the same part to a different inspector, different, very opinionated: clearly the scratch is small, it's fine, don't
throw it away, you're going to make us lose money. And then sometimes you take the same plastic part and show it to the same inspector in the afternoon as opposed to in the morning, and, very opinionated, in the morning they say clearly it's okay, and in the afternoon, equally confident, clearly this is a defect. And so what is the AI team supposed to do if sometimes even one person doesn't agree with himself or herself in the span of a day? So I think these are the types of very practical, very messy data problems that, you know, my teams wrestle with. In the case of large consumer internet companies, where you have a billion users, you have a lot of data, you don't worry about it, just take the average, it kind of works. But in the case of other industry settings, we don't have big data, just small data, very small data sets, maybe 100 defective parts or 100 examples of a defect. If you have only 100 examples, these little labeling errors, you know, if 10 of your 100 labels are wrong, that actually is 10% of your data set, and that has a big impact. So how do you clean this up? What are you supposed to do? This is an example of the types of things that my teams, this is a Landing AI example, are wrestling with to deal with small data, which comes up all the time once you're outside consumer internet. Yeah, that's fascinating. So then you invest more effort and time in thinking about the actual labeling process: what are the labels, how are disagreements resolved, and all those kinds of pragmatic real-world problems. That's a fascinating space. Yeah, I find that actually, when I'm teaching at Stanford, I increasingly encourage students at Stanford to try to find their own project for the end-of-term project rather than just downloading someone else's nicely cleaned data set. It's actually much harder if you need to go and define your own problem and find your own data set, rather than go to one of the several good websites, very good websites, with clean, scoped data sets that you could just
work on. You're now running three efforts: the AI Fund, Landing AI, and deeplearning.ai. As you've said, the AI Fund is involved in creating new companies from scratch, Landing AI is involved in helping already established companies do AI, and deeplearning.ai is for education of everyone else, or of individuals interested in getting into the field and excelling in it. So let's perhaps talk about each of these areas. First, deeplearning.ai. The basic question: how does a person interested in deep learning get started in the field? deeplearning.ai is working to create courses to help people break into AI. So my machine learning course that I taught through Stanford is one of the most popular courses on Coursera to this day. It's probably one of the courses, sort of, if I ask somebody how did you get into machine learning, or how did you fall in love with machine learning, or what got you interested, it always goes back to Andrew Ng at some point. The amount of people you've influenced is ridiculous, so for that I'm sure I speak for a lot of people to say a big thank you. No, yeah, thank you. You know, I was once reading a news article, I think it was Tech Review, and I'm going to mess up the statistic, but I remember reading an article that said something like one-third of all programmers are self-taught. I may have the number wrong, one-third, maybe it was two-thirds, but when I read that article I thought, this doesn't make sense. Everyone is self-taught, because you teach yourself. I don't teach people. Yeah, so how does one get started in deep learning, and where does deeplearning.ai fit into that? So the Deep Learning Specialization offered by deeplearning.ai is, I think, it was Coursera's top specialization, it might still be. So it's a very popular way for people to take that specialization to learn about everything from neural networks, to how to tune a neural network, to what is a ConvNet, to what is an RNN or sequence model, or what is an attention model. And so the Deep Learning
specialization um steps everyone through those algorithms so you deeply understand them and can implement them and use them you know for whatever application from the very beginning so what would you say are the prerequisites for somebody to take the deep learning specialization in terms of maybe math or programming background you need to understand basic programming since there are programming exercises in Python and the math prereq is quite basic so no calculus is needed if you know calculus that's great you get better intuitions but we deliberately try to teach that specialization without requiring calculus so I think high school math would be sufficient if you know how to multiply two matrices I think that's great so a little basic linear algebra is great basic linear algebra even very very basic linear algebra and some programming I think that people that have done the machine learning course will find the deep learning specialization a bit easier but it's also possible to jump into the deep learning specialization directly but it'll be a little bit harder since we tend to you know go over faster concepts like how does gradient descent work and what is an objective function which is covered more slowly in the machine learning course could you briefly mention some of the key concepts in deep learning that students should learn that you envision them learning in the first few months in the first year or so so if you take the deep learning specialization you learn the foundations of what is a neural network how do you build up a neural network from you know a single logistic unit to a stack of layers to different activation functions you learn how to train the neural networks one thing I'm very proud of in that specialization is we go through a lot of practical know-how of how to actually make these things work so what are the differences between different optimization algorithms so what do you do if the algorithm overfits so how do you tell if the algorithm is overfitting when do you collect more
data when should you not bother to collect more data I find that um even today unfortunately there are engineers that will spend six months trying to pursue a particular direction such as collecting more data because we heard more data is valuable but sometimes you could run some tests and could have figured out six months earlier that for this problem collecting more data isn't going to cut it so just don't spend six months collecting more data spend your time modifying the architecture or trying something else so we go through a lot of the practical know-how so that when someone when you take the deep learning specialization you have those skills to be very efficient in how you build these networks so dive right in to play with the network to train it to do the inference on a particular dataset to build an intuition about it without building it up too big to where you spend like you said six months learning building up your big project without building an intuition of a small aspect of the data that could already tell you everything you need to know about that data yes and also the systematic frameworks of thinking for how to go about building practical machine learning maybe to make an analogy um when we learn to code we have to learn the syntax of some programming language right be it Python or C++ or Octave or whatever but the equally important or maybe even more important part of coding is to understand how to string together these lines of code into coherent things so you know when should you put something in a function call and when should you not how do you think about abstraction so those frameworks are what makes a programmer efficient even more than understanding the syntax I remember when I was an undergrad at Carnegie Mellon um one of my friends would debug their code by first trying to compile it and then it was C++ code and then every line that had a syntax error they wanted to get rid of the syntax errors as quickly as possible so how
do you do that well they would delete every single line of code with a syntax error so really efficient for getting rid of syntax errors but a horrible way to debug so I think we learn how to debug and I think in machine learning the way you debug a machine learning program is very different than the way you you know like do binary search or whatever use the debugger trace through the code in traditional software engineering so it's an evolving discipline but I find that the people that are really good at debugging machine learning algorithms are easily 10x maybe 100x faster at getting something to work so and the basic process of debugging is so the bug in this case is why isn't this thing learning improving sort of going into the questions of overfitting and all those kinds of things that's the logical space that the debugging is happening in with a neural network yeah often the question is why doesn't it work yet or can I expect it to eventually work and what are the things I could try change the architecture more data more regularization different optimization algorithm you know different types of data so to answer those questions systematically so that you don't spend six months heading down a blind alley before someone comes and says why did you spend six months doing this what concepts in deep learning do you think students struggle the most with or sort of what is the biggest challenge for them that once they get over that hill it hooks them and it inspires them and they really get it similar to learning mathematics I think one of the challenges of deep learning is that there are a lot of concepts that build on top of each other if you ask me what's hard about mathematics I have a hard time pinpointing one thing is it addition subtraction is it carry is it multiplication long there's a lot of stuff I think one of the challenges of learning math and of learning certain technical fields is that there are a lot of concepts and if you
miss a concept then you're kind of missing the prerequisite for something that comes later so in the deep learning specialization we try to break down the concepts to maximize the odds of each component being understandable so when you move on to the more advanced thing when we learn convnets hopefully you have enough intuitions from the earlier sections to then understand why we structure convnets in a certain way and then eventually why we build you know RNNs and LSTMs or attention models in a certain way building on top of the earlier concepts I'm curious you do a lot of teaching as well do you have a favorite this is the hard concept moment in your teaching well I don't think anyone's ever turned the interview on me I think that's a really good question yeah it's really hard to capture the moment when they struggle I think you put it really eloquently I do think there's moments that are like aha moments that really inspire people I think for some reason reinforcement learning especially deep reinforcement learning is a really great way to really inspire people and get what the use of neural networks can do even though you know neural networks really are just a part of the deep RL framework but it's a really nice way to paint the entirety of the picture of a neural network being able to learn from scratch knowing nothing and explore the world and pick up lessons I find that a lot of the aha moments happen when you use deep RL to teach people about neural networks which is counterintuitive I find like a lot of the inspiration the fire in people's passion in people's eyes comes from the RL world do you find reinforcement learning to be a useful part of the teaching process or not I still teach reinforcement learning in one of my Stanford classes and my PhD thesis was on reinforcement learning thank you I find that if I'm trying to teach students the most useful techniques for them to use today I end up shrinking the
amount of time I talk about reinforcement learning it's not what's working today now our world changes so fast maybe this will be totally different in a couple years I think we need a couple more things for reinforcement learning to get there to get there yeah one of my teams is looking into reinforcement learning for some robotic control tasks so I see the applications but if you look at it as a percentage of all of the impact of you know the types of things we do it's at least today outside of you know playing video games right and a few other games at NeurIPS a bunch of us were standing around saying hey what's your best example of an actual deployed reinforcement learning application and you know among like senior machine learning researchers right and again there are some emerging ones but there are not that many great examples well I think you're absolutely right the sad thing is there hasn't been a big impactful real-world application of reinforcement learning I think its biggest impact to me has been in the toy domain in the game domain in a small example that's what I mean for educational purposes it seems to be a fun thing to explore neural networks with but I think from your perspective and I think that might be the best perspective is if you're trying to educate with a simple example in order to illustrate how this can actually be grown to scale and have a real world impact then perhaps focusing on the fundamentals of supervised learning in the context of you know a simple dataset even like the MNIST dataset is the right way is the right path to take I just the amount of fun I've seen people have with reinforcement learning it's been great but not in the applied impact on the real-world setting so it's a trade-off how much impact you want to have versus how much fun you want to have yeah that's really cool and I feel like you know the world actually needs all sorts even within machine learning I feel like deep
learning is so exciting but the AI team shouldn't just use deep learning I find that my teams use a portfolio of tools and maybe that's not the exciting thing to say but some days we use a neural net some days we use you know PCA the other day I was sitting down with my team looking at PCA residuals trying to figure out what's going on with PCA applied to a manufacturing problem and sometimes we use a probabilistic graphical model sometimes we use a knowledge graph which is one of the things that has tremendous industry impact but the amount of chatter about knowledge graphs in academia is really thin compared to the actual real-world impact so I think reinforcement learning should be in that portfolio and then it's about balancing how much we teach all of these things and the world should have diverse skills instead of you know everyone just learning one narrow thing yeah the diverse skills help you discover the right tool for the job what is the most beautiful surprising or inspiring idea in deep learning to you something that captivated your imagination is it the scale the performance that can be achieved with scale or are there other ideas I think that if my only job was being an academic researcher and I had an unlimited budget and you know didn't have to worry about short-term impact and only focused on long-term impact I'd probably spend all my time doing research on unsupervised learning I still think unsupervised learning is a beautiful idea at both this past NeurIPS and ICML I was attending workshops and listening to various talks about self-supervised learning which is one vertical segment maybe of sort of unsupervised learning that I'm excited about maybe just to summarize the idea I guess you know the idea should I describe it briefly no please so here's an example of self-supervised learning let's say we grab a lot of unlabeled images off the internet so with infinite amounts of this type of data I'm going to take each image and rotate it by a random multiple of
90 degrees and then I'm going to train a supervised neural network to predict what was the original orientation so has it been rotated 90 degrees 180 degrees 270 degrees or zero degrees so you can generate an infinite amount of labeled data because you rotated the image so you know what's the ground truth label and so various researchers have found that by taking unlabeled data and making up labeled datasets and training a large neural network on these tasks you can then take the hidden layer representation and transfer it to a different task very powerfully um learning word embeddings where we take a sentence delete a word and predict the missing word which is how we learn you know one of the ways we learn word embeddings is another example and I think there's now this portfolio of techniques for generating these made-up tasks another one called jigsaw would be if you take an image cut it up into a you know three by three grid so like a 3x3 puzzle jumble up the nine pieces and have a neural network predict which of the nine factorial possible permutations it came from so many groups including you know OpenAI Pieter Abbeel has been doing some work on this too Facebook Google Brain I think DeepMind oh Aaron van den Oord has great work on the CPC objective so many teams are doing exciting work and I think this is a way to generate infinite labeled data and I find this a very exciting piece of unsupervised learning so long-term you think that's going to unlock a lot of power in machine learning systems is this kind of unsupervised learning I don't think this is the whole enchilada I think it's just a piece of it and I think this one piece self-supervised learning is starting to get traction we're very close to it being useful well word embeddings are really really useful I think we're getting closer and closer to just having a significant real world impact maybe in computer vision and video but I think this concept and then
I think there'll be other concepts around it you know other unsupervised learning things that I worked on I've been excited about I was really excited about sparse coding and ICA slow feature analysis I think all of these are ideas that various of us were working on about a decade ago before we all got distracted by how well supervised learning was working yeah so we would return to the fundamentals of representation learning that really started this movement of deep learning I think there's a lot more work that one could explore around this theme of ideas and other ideas to come up with better algorithms so if we could return to maybe talk quickly about the specifics of deeplearning.ai the deep learning specialization perhaps how long does it take to complete the course would you say the official length of the deep learning specialization is I think 16 weeks so about 4 months but it's go at your own pace so if you subscribe to the deep learning specialization there are people that finish it in less than a month by working more intensely and studying more intensely so it really depends on the individual when we created the deep learning specialization we wanted to make it very accessible and very affordable and with you know Coursera and deeplearning.ai's education mission one thing that's really important to me is that if there's someone for whom paying anything is a financial hardship then just apply for financial aid and get it for free if you were to recommend a daily schedule for people in learning whether it's through the deeplearning.ai specialization or just learning in the world of deep learning what would you recommend how do they go about day to day is there specific advice about learning about their journey in the world of deep learning machine learning I think getting in the habit of learning is key and that means regularity so for example we send out our weekly newsletter The Batch every Wednesday so people know it's coming Wednesday you
can spend a little bit of time on Wednesday catching up on the latest news through The Batch on Wednesday and for myself I've picked up a habit of spending some time every Saturday and every Sunday reading or studying and so I don't wake up on the Saturday and have to make a decision do I feel like reading or studying today or not it's just what I do and the fact that it's a habit makes it easier so I think if someone can get in that habit it's like you know just like we brush our teeth every morning I don't think about it if I thought about it it's a little bit annoying to have to spend two minutes doing that but it's a habit that takes no cognitive load but this would be so much harder if we had to make a decision every morning so and actually that's the reason why I wear the same thing every day as well it's just one less decision I just get up and wear my shirt so I think if you can get that habit that consistency of studying then it actually feels easier so yeah it's kind of amazing in my own life like I play guitar every day I force myself to at least for five minutes play guitar it's a ridiculously short period of time but because I've gotten into that habit it's incredible what you can accomplish in a period of a year or two years you could become you know exceptionally good at certain aspects of a thing by just doing it every day for a very short period of time it's kind of a miracle that that is how it works it adds up over time yeah and I think it's often not about the bursts of sustained effort and all-nighters because you can only do that a limited number of times it's the sustained effort over a long time I think you know reading two research papers is a nice thing to do but the power is not reading two research papers it's reading two research papers a week for a year then you've read a hundred papers and you actually learn a lot when you read a hundred papers so regularity
and making learning a habit do you have general other study tips for particularly deep learning that people should use in their process of learning is there some kind of recommendations or tips you have as they learn one thing I still do when I'm trying to study something really deeply is take handwritten notes it varies I know there are a lot of people that take the deep learning courses during their commutes or something where it may be more awkward to take notes so I know it may not work for everyone but when I'm taking courses on Coursera you know and I still take some every now and then the most recent one I took was a course on clinical trials because I was interested in that I got out my little Moleskine notebook and I was sitting at my desk just taking down notes of what the instructor was saying and we know that that act of taking notes preferably handwritten notes increases retention so as you're sort of watching the video just kind of pausing maybe and then taking the basic insights down on paper yeah so there have been a few studies if you search online you can find some of these studies that taking handwritten notes because handwriting is slower as we were saying just now um it causes you to recode the knowledge in your own words more and that process of recoding promotes long-term retention this is as opposed to typing which is fine again typing is better than nothing and taking a class and not taking notes is better than not taking any class at all but comparing handwritten notes and typing um you can usually type faster for a lot of people than you can handwrite notes and so when people type they're more likely to transcribe verbatim what they heard and that reduces the amount of recoding and that actually results in less long-term retention I don't know what the psychological effect there is but it's so true there's something fundamentally different about handwriting I wonder what that is I wonder if it is as simple as just the time it
takes to write it's slower yeah and because you can't write as many words you have to take whatever they said and summarize it into fewer words and that summarization process requires deeper processing of the meaning which then results in better retention that's fascinating oh and then I think yeah because of Coursera I've spent so much time studying pedagogy it's one of my passions I really love learning how to more efficiently help others learn yeah one of the things I do both in creating videos or when we write The Batch is um I think is one minute spent with us going to be a more efficient learning experience than one minute spent anywhere else and we really try to you know make it time efficient for the learners because you know everyone's busy so when we're editing I often tell my teams every word needs to fight for its life and if we can delete a word let's just delete it let's not waste the learners' time wow that's so it's so amazing that you think that way because there are millions of people that are impacted by your teaching and sort of that one minute spent has a ripple effect right through years of time which is just fascinating to think about how does one make a career out of an interest in deep learning give advice for people we just talked about sort of the beginning early steps but if you want to make it an entire life's journey or at least a journey of a decade or two how do you do it so the most important thing is to get started right and I think in the early part of a career coursework um like the deep learning specialization is a very efficient way to master this material so because you know instructors be it me or someone else or you know Laurence Moroney who teaches our TensorFlow specialization and other things we're working on spend effort to try to make it time efficient for you to learn new concepts so coursework is actually a very efficient way for people to learn concepts
in the beginning parts of breaking into a new field in fact one thing I see at Stanford some of my PhD students want to jump into research right away and I actually tend to say look in your first couple years as a PhD student spend time taking courses because it lays the foundation it's fine if you're less productive in your first couple of years you'd be better off in the long term um beyond a certain point there's material that doesn't exist in courses because it's too cutting edge the course hasn't been created yet there's some practical experience that we're not yet that good at teaching in a course and I think after exhausting the efficient coursework then most people need to go on to either ideally work on projects and then maybe also continue their learning by reading blog posts and research papers and things like that doing projects is really important and again I think it's important to start small and just do something today you read about deep learning and feel like all these people are doing such exciting things what if I'm not building a neural network that changes the world then what's the point well the point is sometimes building that tiny neural network you know be it MNIST or upgrade to Fashion-MNIST or whatever doing your own fun hobby project that's how you gain the skills to let you do bigger and bigger projects I find this to be true at the individual level and also at the organizational level for a company to become good at machine learning sometimes the right thing to do is not to tackle the giant project it's instead to do the small project that lets the organization learn and then build up from there but this is true both for individuals and for companies just taking the first step and then taking small steps is the key should students pursue a PhD do you think you can do so much that's one of the fascinating things in machine learning you can have so much impact without ever getting a PhD so what are your thoughts should people go
to grad school should people get a PhD I think that there are multiple good options of which doing a PhD could be one of them I think that if someone's admitted to a top PhD program you know at MIT Stanford top schools I think that's a very good experience or if someone gets a job at a top organization at a top AI team I think that's also a very good experience there are some things you still need a PhD to do if someone's aspiration is to be a professor you know at a top academic university you just need a PhD to do that but if the goal is to you know start a company build a company do great technical work I think a PhD is a good experience but I would look at the different options available to someone you know where are the places where you can get a job where are the places you can get into a PhD program and kind of weigh the pros and cons of those so just to linger on that for a little bit longer what final dreams and goals do you think people should have so what options should they explore so you can work in industry so for a large company like Google Facebook Baidu all these large companies that already have huge teams of machine learning engineers you can also do within industry sort of more research groups kind of like Google Research Google Brain then you can also do like we said be a professor in academia and what else oh you can build your own company you can do a startup is there anything that stands out between those options or are they all beautiful different journeys that people should consider I think the thing that affects your experience more is less are you in this company versus that company or academia versus industry I think the thing that affects your experience most is who are the people you're interacting with you know on a daily basis so even if you look at some of the large companies the experience of individuals in different teams is very different and what matters most is not the logo above the door when you walk into the giant building every day
what matters the most is who are the 10 people who are the 30 people you interact with every day so I actually tend to advise people if you get a job from a company ask who is your manager who are your peers who are you actually going to talk to we're all social creatures we tend to you know become more like the people around us and if you're working with great people you will learn faster or if you get admitted if you get a job at a great company or a great university maybe the logo you walk in under you know is great but you're actually stuck on some team doing really work that doesn't excite you and then that's actually a really bad experience so this is true both for universities and for large companies for small companies you can kind of figure out who you'd be working with quite quickly and I tend to advise people if a company refuses to tell you who you'd work with if they say oh join us the rotation system will figure it out I think that's a worrying answer because it means you may not actually get to a team with great peers and great people to work with it's actually really profound advice that we kind of sometimes sweep aside we don't consider too rigorously or carefully the people around you are really often especially when you accomplish great things it seems the great things are accomplished because of the people around you so it's not about the worry whether you learn this thing or that thing or like you said the logo that hangs up top it's the people that's fascinating and it's such a hard search process of finding just like finding the right friends and somebody to get married with and that kind of thing it's a very hard search process a people search problem yeah but I think when someone interviews you know at a university or the research lab or a large corporation it's good to insist on just asking who are the people who is my manager and if you refuse to tell me I'm gonna think well maybe
that's because you don't have a good answer it may not be someone I like and if you don't particularly connect if something feels off with the people then don't stick to it you know that's a really important signal to consider yeah and in my Stanford course CS230 as well as an ACM talk I think I gave like an hour-long talk on career advice including on the job search process and then some of these so if you're interested you can find those videos online awesome I'll point people to them beautiful so the AI Fund helps AI startups get off the ground or perhaps you can elaborate on all the fun things it's involved with what's your advice on how does one build a successful AI startup you know in Silicon Valley a lot of startup failures come from building products that no one wanted so you know cool technology but who's gonna use it so I think I tend to be very outcome driven um and customer obsessed ultimately we don't get to vote if we succeed or fail it's only the customer they're the only one that gets a thumbs up or thumbs down vote in the long term in the short term you know there are various people who get various votes but in the long term that's what really matters so as you build the startup you have to ask the question will the customer give a thumbs up on this I think so I think startups that are very customer focused customer obsessed deeply understand the customer and are oriented to serve the customer are more likely to succeed with the proviso that I think all of us should only do things that we think create social good and move the world forward so I personally don't want to build addictive digital products just to sell a lot of ads you know there are things that could be lucrative that I won't do but if we can find ways to serve people in meaningful ways I think those can be great things to do either in the academic setting or in a corporate setting or a startup setting so can you give me the idea of why you
started the AI Fund I remember when I was leading the AI group at Baidu I had two jobs two parts of my job one was to build an AI engine to support the existing businesses and that was running you know it just ran just performed by itself the second part of my job at the time was to try to systematically initiate new lines of businesses using the company's AI capabilities so you know the self-driving car team came out of my group the smart speaker team similar to what is Amazon Echo Alexa in the US but we announced it before Amazon did so Baidu wasn't following Amazon that came out of my group and I found that to be um actually the most fun part of my job so what I wanted to do was to build AI Fund as a startup studio to systematically create new startups from scratch with all the things we can now do with AI I think the ability to build new teams to go after this rich space of opportunities is a very important mechanism to get these projects done that I think will move the world forward so I've been fortunate to have built a few teams that had a meaningful positive impact and I felt that we might be able to do this in a more systematic repeatable way so a startup studio is a relatively new concept there are maybe dozens of startup studios you know right now but I feel like all of us many teams are still trying to figure out how do you systematically build companies with a high success rate so I think even though my you know venture capital friends seem to be more and more building companies rather than investing in companies I find it a fascinating thing to do to figure out the mechanisms by which we could systematically build successful teams successful businesses in areas that we find meaningful so a startup studio is a place and a mechanism for startups to go from zero to success so you try to develop a blueprint it's actually a place for us to build startups from scratch so we often
bring in founders and work with them or maybe even have existing ideas that we match founders with and then this launches you know hopefully into successful companies so how close are you to figuring out a way to automate the process of starting from scratch and building a successful AI startup yeah I think we've been constantly improving and iterating on our processes for how we do that so things like you know how many customer calls do we need to make in order to get customer validation how do we make sure this technology can be built well all of our businesses need cutting-edge machine learning algorithms so you know kind of algorithms that were developed in the last one or two years and even if it works in a research paper it turns out taking it to production is really hard there are a lot of issues for making these things work in real life that are not widely addressed in academia so how do you validate that this is actually doable how do you build a team get the specialized domain knowledge be it in education or healthcare or whatever sector we're focusing on so I think we've been getting much better at giving the entrepreneurs a high success rate but I think we're still I think the whole world is still in the early phases of figuring this out but do you think there are some aspects of that process that are transferable from one startup to another to another to another yeah very much so you know starting a company to most entrepreneurs is a really lonely thing and I've seen so many entrepreneurs not know how to make a certain decision like how do you do B2B sales right if you don't know that it's really hard or how do you market this efficiently other than buying ads which is really expensive are there more efficient tactics or for a machine learning project you know basic decisions can change the course of whether a machine learning product works or not and so there are so many hundreds of decisions that entrepreneurs need to make and making a
mistake in a couple of key decisions can have a huge impact on the fate of the company so I think a startup studio provides a support structure that makes starting a company much less of a lonely experience and also when facing these key decisions like trying to hire your first VP of engineering what's a good selection criteria how do you source should I hire this person or not by having an ecosystem around the entrepreneurs the founders to help I think we help them at the key moments and hopefully significantly make them more enjoyable and with a higher success rate there's somebody to brainstorm with in these very difficult decision points and also to help them recognize what they may not even realize is a key decision point right that's the first and probably the most important part yeah actually I can say one other thing you know I think building companies is one thing but I feel like it's really important that we build companies that move the world forward for example within the AI Fund team there was once an idea for a new company that if it had succeeded would have resulted in people watching a lot more videos in a certain narrow vertical type of video I looked at it the business case was fine the revenue case was fine but I looked at it and just said I don't want to do this like you know I don't actually just want to have a lot more people watch this type of video it wasn't educational is the educational Haiti and so I killed the idea on the basis that I didn't think it would actually help people so whether building companies or working in enterprises or doing personal projects I think it's up to each of us to figure out what's the difference we want to make in the world with Landing AI you help already established companies grow their AI and machine learning efforts how does a large company integrate machine learning into their efforts AI is a general purpose technology and I think it will transform every industry our community has already transformed the
software internet sector to a large extent most software internet companies outside the top right five or six or three or four already have reasonable machine learning capabilities or are getting there there's still room for improvement but when I look outside the software internet sector everything from manufacturing agriculture healthcare logistics transportation there's so many opportunities that very few people are working on so I think the next wave for AI is for us to also transform all of those other industries there was a McKinsey study estimating 13 trillion dollars of global economic growth the US GDP is 19 trillion dollars so 13 trillion is a big number or PwC estimates 16 trillion dollars so whatever number it is it's large but the interesting thing to me was a lot of that impact would be outside the software internet sector so we need more teams to work with these companies to help them adopt AI and I think this is one of the things that will you know help drive global economic growth and make humanity more powerful and like you said the impact is there so what are the best industries the biggest industries where AI can help perhaps outside the software tech sector frankly I think it's all of them some of the ones I'm spending a lot of time on are manufacturing agriculture looking into healthcare for example in manufacturing we do a lot of work in visual inspection where today there are people standing around using the eye the human eye to check if you know this plastic part or the smartphone or this thing has a scratch or a dent or something in it we can use a camera to take a picture use an algorithm deep learning and other things to check if it's defective or not and thus help factories improve yield and improve quality and improve throughput it turns out the practical problems we run into are very different than the ones you might read about in most research papers the data sets are really small so we face small data problems the factories keep on changing the
environment so it works well on your test set but guess what you know something changes in the factory the lights go on or off recently there was a factory in which a bird flew through the factory and pooped on something and so that you know so that changed stuff and so increasing our algorithmic robustness to all the changes that happen in the factory I find that we run into a lot of practical problems that are not as widely discussed in academia and it's really fun kind of being on the cutting edge solving these problems before you know maybe before many people are even aware that there is a problem there and that's such a fascinating space you're absolutely right but what is the first step that a company should take it's just a scary leap into this new world of going from the human eye inspecting to digitizing that process having a camera having an algorithm what's the first step like what's the early journey that you recommend that you see these companies taking I published a document called the AI Transformation Playbook that's online and taught briefly in the AI for Everyone course on Coursera about the long term journey that companies should take but the first step is actually to start small I've seen a lot more companies fail by starting too big than by starting too small take even Google you know most people don't realize how hard it was and how controversial it was in the early days so when I started Google Brain it was controversial you know people thought deep learning neural nets tried it didn't work why would you want to do deep learning so my first internal customer within Google was the Google speech team which is not the most lucrative project in Google not the most important it's not web search or advertising but by starting small my team helped the speech team build a more accurate speech recognition system and this caused their peers other teams to start to have more faith in deep learning my second internal customer was the Google Maps team
where we used computer vision to read house numbers from basic Street View images to more accurately locate houses within Google Maps so improve the quality of the geodata and it was only after those two successes that I then started a more serious conversation with the Google Ads team and so there's a ripple effect that you showed that it works in these cases and then it just propagates through the entire company that this thing has a lot of value and use for us I think the early small-scale projects help the teams gain faith but also help the teams learn what these technologies do I still remember our first GPU server it was a server under some guy's desk and you know and then that taught us early important lessons about how do you have multiple users share a set of GPUs which was really non-obvious at the time but those early lessons were important we learned a lot from that first GPU server that later helped the teams think through how to scale it up to much larger deployments are there concrete challenges that companies face that you see as important for them to solve I think building and deploying machine learning systems is hard there's a huge gulf between something that works in a Jupyter notebook on your laptop versus something that runs in a production deployment setting in a factory or agriculture plant or whatever so I see a lot of people you know get something to work on your laptop and say wow look what I've done and that's great that's hard that's a very important first step but all teams underestimate the rest of the steps so for example I've heard this exact same conversation between a lot of machine learning people and business people the machine learning person says look my algorithm does well on the test set and it's a clean test set I didn't peek and the business person says thank you very much but your algorithm sucks it doesn't work and the machine learning person says no wait I did well on the
test set and I think there is a gulf between what it takes to do well on a test set on your hard drive versus what it takes to work well in a deployment setting some common problems robustness and generalization you know you deploy something in the factory maybe they chop down a tree outside the factory so the tree no longer covers the window and the lighting is different so the test set changes and in machine learning and especially in academia we don't know how to deal with test set distributions that are dramatically different than the training set distribution there's research there's stuff like domain adaptation transfer learning you know there are people working on it but we're really not good at this so how do you actually get this to work because your test set distribution is going to change and I think also if you look at the number of lines of code in a software system the machine learning model is maybe five percent or even fewer relative to the entire software system you need to build so how do you get all that work done and make it reliable and systematic so good software engineering work is fundamental here to building a successful small machine learning system yes and the software system needs to interface with people's workflows so machine learning is automation on steroids if we take one task out of the many tasks that are done in a factory so a factory does lots of things one task is visual inspection if we automate that one task it can be really valuable but you may need to redesign a lot of other tasks around that one task for example say the machine learning algorithm says this is defective what are you supposed to do do you throw it away do you get a human to double check do you want to rework it or fix it so you need to redesign a lot of tasks around that thing you've now automated so planning for the change management and making sure that the software you write is consistent with the new workflow and you take the time to explain to people when
he so happens I think what Landing AI has become good at and I think we learned by making mistakes and you know painful experiences for my ring what we've become good at is working with our partners to think through all the things beyond just the machine learning model not just the Jupyter notebook but build the entire system manage the change process and figure out how to deploy this in a way that has an actual impact the processes that the large software tech companies use for deploying don't work for a lot of other scenarios for example when I was leading you know large speech teams if the speech recognition system goes down what happens well alarms go off and then someone like me will say hey you 20 engineers please fix this baby with an American but if you have a system go down in the factory there are not 20 machine learning engineers sitting around you can page the duty and have them fix it so how do you deal with the maintenance or the DevOps or the MLOps or the other aspects of this so these are concepts that I think Landing AI and a few other teams are on the cutting edge of but we don't even have systematic terminology yet to describe some of the stuff we do because I think we're inventing it on the fly so you mentioned some people are interested in discovering mathematical beauty and truth in the universe and you're interested in having a big positive impact in the world so let me ask the two are not inconsistent no they're all together I'm only half joking because you're probably interested a little bit in both but let me ask a romanticized question so much of the work your work and our discussion today has been on applied AI maybe you can even call it narrow AI where the goal is to create systems that automate some specific process that adds a lot of value to the world but there's another branch of AI starting with Alan Turing that kind of dreams of creating human level or superhuman level intelligence is this something you dream of as well
do you think we human beings will ever build a human level or superhuman level intelligent system I would love to get to AGI and I think humanity will but whether it takes a hundred years or 500 or 5,000 I find hard to estimate do you have some folks have worries about the different trajectories that path would take even existential threats of an AGI system do you have such concerns whether in the short term or the long term I do worry about the long term fate of humanity I do wonder as well I do worry about overpopulation on the planet Mars just not today I think there will be a day when maybe someday in the future Mars will be polluted there are these children dying and someone will look back at this video and say Andrew how was Andrew so heartless he didn't care about all these children dying on the planet Mars and I apologize to the future viewer I do care about the children but I just don't know how to productively work on that today your picture will be in the dictionary for the people who are ignorant about the overpopulation on Mars okay yes so it's a long term problem is there something in the short term we should be thinking about in terms of aligning the values of our AI systems with the values of us humans sort of something that Stuart Russell and other folks are thinking about as this system develops more and more we want to make sure that it represents the better angels of our nature the ethics the values of our society you know if you take self-driving cars the biggest problem with self-driving cars is not that there's some trolley dilemma and you teach this so you know how many times when you're driving your car did you face this moral dilemma of who do I crash into so I think a self-driving car will run into that problem roughly as often as we do when we drive our cars the biggest problem with self-driving cars is when there's a big white truck across the road and what you should do is brake and not crash into it and the self-driving car fails and
it crashes into it so I think we need to solve that problem first I think the problem with some of these discussions about AGI you know alignment the paperclip problem is that it is a huge distraction from the much harder problems that we actually need to address today some hard problems today I think bias is a huge issue I worry about wealth inequality AI and the internet are causing an acceleration of concentration of power because we can now centralize data use AI to process it and so industry after industry we've affected every industry so the internet industry has a lot of winner-take-most or winner-take-all dynamics but we've infected all these other industries so we're also giving these other industries winner-take-most or winner-take-all flavors so look at what Uber and Lyft did to the taxi industry so we're doing this type of thing a lot so we're creating tremendous wealth but how do we make sure the wealth is fairly shared I think that and then how do we help people whose jobs are displaced you know I think education is part of it there may be even more that we need to do than education I think bias is a serious issue there are adverse uses of AI like deepfakes being used for various nefarious purposes so I worry about some teams maybe accidentally and I hope not deliberately making a lot of noise about problems in the distant future rather than focusing on some of these much harder problems yeah they overshadow the problems that we have already today they're exceptionally challenging like those you said and even the silly ones but the ones that have a huge impact which is the lighting variation outside of your factory window that ultimately is what makes the difference between like you said the Jupyter notebook and something that actually transforms an entire industry potentially yeah and I think and then just to some companies when a regulator comes to you and says no your product is messing things up fixing it may have a
revenue impact it's much more fun to talk to them about how you promise not to wipe out humanity than to face the actually really hard problems we face so your life has been a great journey from teaching to research to entrepreneurship two questions one are there regrets moments that if you went back you would do differently and two are there moments you're especially proud of moments that made you truly happy you know I've made so many mistakes it feels like every time I discover something I go why didn't I think of this you know five years earlier or even ten years earlier and Reese's and then sometimes I read a book and I go I wish I read this book ten years ago my life would have been so different although that happened recently and then I was thinking if only I read this book when we were starting up Coursera it could have been so much better but I discovered that book had not yet been written when we were starting Coursera so that made me feel better but I find that the process of discovery we keep on finding out things that seem so obvious in hindsight but it always takes us so much longer than I wish to figure it out so on the second question are there moments in your life that if you look back that you're especially proud of or especially happy that fill you with happiness and fulfillment well two answers one is my daughter Nova yes of course you say no matter how much time I spend with her I just can't spend enough time with her congratulations by the way thank you and then second is helping other people I think to me I think the meaning of life is helping others achieve whatever are their dreams and then also to try to move the world forward by making humanity more powerful as a whole so the times that I felt most happy most proud were when I felt someone else allowed me the good fortune of helping them a little bit on the path to their dreams I think there's no better way to end it than talking about happiness and the meaning of life so Andrew it's a huge honor
me and millions of people thank you for all the work you've done thank you for talking today thank you so much thanks thanks for listening to this conversation with Andrew Ng and thank you to our presenting sponsor Cash App download it use code LexPodcast you'll get ten dollars and $10 will go to FIRST an organization that inspires and educates young minds to become science and technology innovators of tomorrow if you enjoy this podcast subscribe on YouTube give it five stars on Apple Podcasts support it on Patreon or simply connect with me on Twitter at Lex Fridman and now let me leave you with some words of wisdom from Andrew Ng ask yourself if what you're working on succeeds beyond your wildest dreams would you have significantly helped other people if not then keep searching for something else to work on otherwise you're not living up to your full potential thank you for listening and hope to see you next time

Info

Channel: Lex Fridman

Views: 304,761

Rating: 4.9510856 out of 5

Keywords: andrew ng, deep learning, machine learning, coursera, deeplearning.ai, landing.ai, stanford, introducation to machine learning, artificial intelligence, agi, ai, ai podcast, artificial intelligence podcast, lex fridman, lex podcast, lex mit, lex ai, lex jre, mit ai

Id: 0jspaMLxBig

Length: 89min 10sec (5350 seconds)

Published: Thu Feb 20 2020