Break into NLP hosted by deeplearning.ai

Captions
Hi everyone, and welcome. My name is Ryan Keenan, and I'm the director of products here at deeplearning.ai. We really appreciate you taking some time for this event today, and we hope that you and your families are doing well. These have been difficult times for the last several months. We know that wherever you are in the world, and we have people joining us from over 140 countries today, it's been a difficult several months, and it's not over yet. So we hope that you're doing well, and we hope that maybe something good has come out of this, like more time with family, a new opportunity, or something else.

Today we are gathered to celebrate the launch of the third course in our Natural Language Processing Specialization, and we've assembled a panel of experts to talk about NLP, about the work they're doing, about the future of the field, and to give some perspectives. NLP is really all about giving machines the ability to interpret and manipulate human language, and as all sorts of different fields in AI continue to expand and make significant contributions to industry, the demand for people with skills in these areas is only growing. So we're really excited to launch this course today.

The mission of deeplearning.ai is to create great courses and to make world-class AI education accessible to everyone. Our goal is to empower our community of developers, business leaders, and people changing their career paths, all of you, to be part of this AI transformation that's happening all over the world. These courses are created by experts from industry and academia. Course 3 is Sequence Models in NLP, and it went live just a few hours ago. This is a really exciting course in the sense that with Courses 1 and 2, the skills you built as a learner were foundational, the basic NLP toolkit; with this course you're moving into more real-world applications, with more powerful models.

Today's panel of experts have a lot of interesting input on the field and on all these different techniques in NLP. We've also lined up a sneak peek of the courses, and we'll have a course demo at the end, so stick around for that. On today's agenda, we'll open with a keynote by Łukasz Kaiser, who is one of the instructors in the NLP courses as well as a scientist at Google Brain. Following that, we'll have a talk by Andrew Ng, the founder of deeplearning.ai. Then we'll have a panel discussion where, in addition to Łukasz and Andrew, we'll be joined by Kenneth Church from Baidu USA, Professor Marti Hearst from UC Berkeley, and Younes Mourri from Stanford, who is also one of the instructors in the NLP course. After the panel discussion we'll have a Q&A session, where we'll select some of the most upvoted questions from our NLP learner Slack community, which you can join once you enroll in the specialization on Coursera; we've got several thousand of you in there already. And following all that, we'll do the Course 3 demo.

So, without further ado, I'd like to welcome our first speaker, Łukasz Kaiser, who is, as I said, one of the instructors in our NLP specialization.
Łukasz is a co-author of TensorFlow, as well as of the Tensor2Tensor and Trax libraries, and he's a co-author of the Transformer paper, which has really transformed the field of NLP and of learning in general. He's a staff research scientist at Google Brain, as I mentioned. It's a great honor and pleasure to welcome you to this forum today, Łukasz; I'll turn it over to you now.

Thank you very much, Ryan. It's great to be able to see you, and I want to tell you a little bit about how I came to deep learning in NLP. Launching this course is a very personal thing for me, because I've been working for the last number of years on using deep learning in NLP and making it accessible to people, and it's been a wild journey.

When we started this work, around the year 2014, people in the NLP community did not believe very much in deep learning. Of course not all of them, some did, but many did not, because the best-performing methods were still not based on deep learning; they were based on probabilistic models, which you've learned about in the earlier courses, and they gave good results. For example, in machine translation, which is the field I've mostly been working on, performance is measured by something called the BLEU score, where higher is better. Around 2013 or 2014, the best BLEU score on the standard English-German benchmark was obtained by phrase-based systems: it was about 21, and that was considered quite good. For comparison, a human translator would score around 30. People were thinking, okay, these deep learning systems that came around then maybe got 19, maybe managed to get 20.

But those networks were very large, and we managed to train even larger ones. These networks, you will learn about LSTMs in the course, are not complicated, but the larger you make them, the better they work. Around 2016 they got really large and really good: we got 25 BLEU, which is when Google Translate launched a neural network for translation for the first time. It was a big thing; it really improved performance over the old methods for the first time. It was still 25 BLEU, though, and training this model took about a month on 128 GPUs. It was absolutely not reproducible outside of Google; no one else could do this, because it was tied to Google-specific hardware and a special framework. But it worked, so RNNs started spreading into NLP. As you learn in this course, RNNs work by processing the input word by word and then decoding the output in the other language. RNNs are great, but as I told you, they're slow, they take hundreds of GPUs, and when the sentences get long, when you go beyond one sentence to paragraphs, the gradients don't propagate that well.

Then in 2017 we had this model called the Transformer, which does away with the recurrence, so it's far more parallel, it can use far less compute, and it gets higher BLEU scores: in the paper it was 28, if you train longer you get to 29, and with some improvements we've now got it to 30. So on the BLEU metric it's on a par with human translators. That's what you learn in Course 4.
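As a rough illustration of the BLEU metric Łukasz has been quoting, here is a minimal sketch using NLTK's sentence_bleu. The sentences are invented, and published numbers like "28 BLEU" are corpus-level scores scaled to 0-100 with standardized tokenization (for example via sacreBLEU), so this only shows the n-gram-overlap idea.

```python
# Minimal sketch of the BLEU idea: score a candidate translation by its
# n-gram overlap with reference translations. Sentences here are invented.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

references = [["the", "cat", "sat", "on", "the", "mat"]]  # one tokenized reference
candidate = ["the", "cat", "is", "on", "the", "mat"]      # tokenized system output

smooth = SmoothingFunction().method1  # avoid zero scores on short sentences
score = sentence_bleu(references, candidate, smoothing_function=smooth)
print(f"BLEU: {score:.3f}")  # 0..1 here; papers usually report corpus BLEU x 100
```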
The Transformer model is deceptively simple: it has no recurrence, just an attention matrix that attends everywhere, and it turns out it works not just for translation, it works for a large number of NLP tasks. You can...

Sorry, Łukasz, I'm sorry to interrupt; your camera is off. Would you be willing to turn your camera on for folks? Oh sure, sorry, I didn't notice that.

So, yes, that's the Transformer model, and it does better with gradients in long sequences. To understand why it does better with long sequences, you really need to understand RNNs and then look at attention. That's why it's a really good idea to pay attention in Course 3, understand RNNs with their features and problems, and then go to Course 4. With the attention mechanism you can generate really long text. This is text generated by the OpenAI Transformer: the text in green is the only thing you input, it's a prefix, and then the model continues the story. It really reads as if you could have written it, which five years ago was something the field of NLP would barely have believed, that a model could generate coherent stories.

So what are the frontiers? We are going to longer and longer sequences; we are getting models that are more able to reason; we're getting models that can do few-shot learning, there is now a GPT-3 model that does few-shot learning; and there are more and more use cases for these models. It's not just translation these days: a lot of NLP tasks, generation, and also language-to-programming. There are a lot of exciting developments still going on, but the basics you learn in these courses, so join them and learn about them. And these days, luckily, you can train your models in a Google Colab or even just on your own computer. I find it very exciting that you can train a whole translation model in your own environment rather than on hundreds of GPUs, and we can really explain to you every detail of how it works. I hope you will like it too. Thank you, Ryan.

Thank you, Łukasz, that's fantastic. It's super interesting how bigger and bigger models have been a path toward better and better outcomes, while at the same time those models have been made faster and faster, so that they can be not just big and slow and only runnable at Google, but big and fast; that's super cool. I also like how you emphasized that understanding the basics of building blocks like RNNs is really important to getting to the big, complicated models. I hope we can talk more about that in the panel discussion, but right now I'd like to turn it over to Andrew Ng, founder of deeplearning.ai, and get his perspective on NLP and where the field's going. Over to you, Andrew.

Thanks, Ryan. Hey everyone, it's nice to see you here online despite the circumstances of the world, and thank you for joining us. What I want to do is share with you some thoughts on the state of NLP today. I know that many of you are looking into breaking into NLP, hence the name of this session we're holding today, and I thought it might be useful to share some thoughts on where I think NLP is.

The rise of AI, the rise of the deep learning revolution, the rise of machine learning, is leading to more specialization into subfields. I feel like my background is as a machine learning generalist, but as the field matures there are now more and more specialized tools in the most important application sectors of AI. That includes, of course, NLP, with lots of specialized tools that grew up and are native to NLP, like some of the things Łukasz talked about, though some of them, like Transformers, have spread to other sectors as well. Then there's also computer vision, speech and audio processing, and a lot of exciting work on structured data that is
not widely appreciated in some of the published literature, I think. And then other sectors, everything from medical imaging to robotics to earth sciences: all of these different sub-disciplines are building up specialized tools to take advantage of the deep learning revolution.

Łukasz just alluded to the history of the growth of deep learning in NLP, and I want to share with you some of the history that I saw, because it has implications for some of the work that all of us do, maybe some of the work that you might end up doing. The rise of deep learning, we all know about that. From where I was sitting, what I saw was that deep learning as an application discipline first had its biggest impact in speech recognition. I remember it was about 10 years ago, around 2010, that the work of Li Deng and Geoff Hinton and a few others really showed that deep learning could make a big difference in speech recognition accuracy. I was leading the Google Brain team around that time, reading a lot of these papers and implementing these algorithms. This first wave, the injection of deep learning into speech recognition, resulted in dramatically improved accuracy and a wave of new applications: everything from smart speakers and voice assistants to the practical voice search you may use on your mobile phone. So the injection of deep learning into speech improved accuracy and led to a wave of applications.

I think the second wave was computer vision, where the story is again well told: 2012, ImageNet, AlexNet, and that in turn led to a slightly later but equally big, maybe even bigger, wave of new applications in computer vision. Everything from, today, some of my teams inspecting things in factories using computer vision, or teams driving big machines using computer vision, or computer vision enabling driver-assistance systems, maybe in a car you will soon drive, even if we aren't all the way to the completely self-driving cars we hope to get to some day. And of course face recognition, with all the problems and opportunities that creates too. But that's a huge wave of applications.

And then I think NLP, as Łukasz was alluding to just now, came a little bit later. I actually remember, when deep learning first hit speech recognition, the community had a lot of internal angst: we had spent all this time hand-engineering features, what are we doing with this work? Then the field of computer vision had the same controversy, angst, and discussions. And then I remember when my friends in NLP were going through the same thing; the deep learning folks were saying, look, you've got to use deep learning, but there were people who had spent a long time on other techniques. It was a journey, a transition. Because the wave of deep learning hit NLP later, this rise in accuracy came a bit later in NLP than in speech or computer vision, and so I think there is actually a very large set of untapped opportunities in NLP. Today we have very good speech recognition applications, with still lots of opportunities, and quite a lot of vision applications, but I think the set of untapped opportunities in NLP is just massive, because the wave of
improved accuracy has come relatively later, and there are still so many applications it opens up that no one has really thought of or invented yet. So to those of you thinking of jumping into NLP, I think this is actually a very good time. It's not a bad time to do vision or speech recognition either, but the space of untapped opportunities in NLP feels huge to me. And of course, in the history of deep learning, there are still other disciplines jumping in and enjoying these improvements in performance, which are also leading to improvements in lots of applications.

So today NLP has tremendous momentum behind it. Here's a market study; I tend to take these things with a grain of salt, but significant rises in the value of NLP applications are projected by this organization, Tractica. These numbers frankly seem low to me, but I find the trend useful as a confirmatory trend. And if you look at the growth of ACL, a big NLP conference, the number of submissions is rising rapidly. So the volume of activity very much seems up and to the right, and maybe even accelerating, for NLP.

What's exciting about all this, one of the magical pieces of NLP, is that today probably all of you use NLP dozens of times a day without even thinking about it. The applications that work so well have disappeared into the background; you don't think of them as NLP anymore, they're just something you use every day. Today most of us use web search probably every day, and web search is the obvious one, but lots of websites have product search capabilities: you go to a website, maybe buying something, looking for a movie, looking to buy something for your home. Search is not just a web search engine; it's pervasive across a lot of apps and websites. Summarization: again, when we do a web search, that snippet of summary text is really valuable. Autocomplete, in search functionality and in how we type, is useful all over the place. Anti-spam: without anti-spam my email system just won't function. Machine translation is getting really good, and we'll hopefully see the world become a place where anyone can talk to anyone else, even if we don't speak the same language; I find that very exciting. Smart speakers and smart assistants: in addition to the speech recognition portion, there's an important piece that understands the words you actually said and translates them into an action or a command. And nascent chatbots: there was a wave of hype around chatbots, and I think that hype has died down, but there are working chatbots today that provide real value, even though the hype isn't quite there. Those are some of today's NLP applications, many of which still have significant room for improvement or for reinvention, and if you learn about NLP, I hope some of these could be sectors where you dive in to build cool stuff, to move your company or the world forward.

One thing that excites me at least as much, maybe even more, is all the future applications that none of us have even envisioned yet. I don't know what the future applications will be, but here are some I personally find exciting; I'm just one person, and some of you may have better ideas than I do. I'm personally excited about the
application of NLP to education, specifically to automated grading and giving feedback to people at scale. I think there's a lot of room for improvement in email tools. I know email feels like a mature, relatively consolidated industry, but with the rise of modern NLP, and I think Google's Smart Reply, where you hit a button and get a reply back, is just one small piece, there's actually a lot of room. Sometimes I ask my collaborators to help me with emails, and some of the things I ask them to help with could, I think, be automated. Synthesizing academic papers: there is research on this, but I'm excited because there's so much literature in many disciplines, like the medical disciplines, that no human can read more than a small fraction of these papers. I'm seeing nascent progress, weak signals, that this could become a big thing in terms of spotting trends in a statistical way from this giant body of literature, to help drive medical research, or maybe even medical decision-making, in a heuristic way sometimes. RPA, robotic process automation, has been taking off rapidly, with lots of great companies doing RPA, and I think there's potential for NLP to make RPA much more flexible. And many more: the "dot dot dot" is really, I would love it if a few years from now I'm giving another talk and something that you come up with ends up being what I talk about, some idea in the "dot dot dot" that I certainly have not yet thought of today. So the rise of NLP accuracy opens up tons of room, both to improve today's applications, which are already important and pervasive, and to envision new things that just were not possible before.

On the research front, I want to highlight one other thing I'm excited about, and if you complete the NLP specialization, all the courses, you'd be well positioned to jump in and try your hand at some of these cutting-edge aspects. You may have heard a bunch of us, me, Yann LeCun, Geoff Hinton, and Yoshua Bengio, chat about the importance of unsupervised learning. There's one aspect of unsupervised learning I'm excited about that isn't a thirty-years-off thing; it's something that's working now. About 10-plus years ago, I think it was actually 12 or 13 years ago, I started getting excited about one application of unsupervised learning that we call self-taught learning. Here's the idea: you want to learn from a large amount of unlabeled data. The world has a lot more unlabeled data than labeled data, be it unlabeled images or unlabeled text, so learn from tons of unlabeled data. This thought was also motivated by the observation that most human learning is from unlabeled data. Some of you know I have an 18-month-old daughter called Nova, and I love her, but I'm not going to point out one object to her every second of her life. Frankly, I do my best to teach my daughter, but most of what any infant learns comes from just wandering around by themselves, with maybe a little bit of supervision. So we think most human infant learning is unsupervised. Can our algorithms learn from tons of unlabeled data? The self-taught learning task is to take what you've learned from tons of unlabeled data, be it
images or text or audio, whatever, and transfer that to a different task, usually a supervised task where we have only a minuscule, much smaller, labeled data set. We call this framework self-taught learning. Self-supervised learning, which I've talked about in other forums, is an exciting piece of progress in self-taught learning, and I think the best example of this, the most exciting progress, is in NLP. You heard Łukasz talk just now, and the thing that really enabled this type of learning to take off was the work that Łukasz and his collaborators have done on Transformers. One thing I've learned about the Transformer work is that when you look carefully, you find there is more to it than meets the eye, and the Transformer work enabled the rise of other great pieces of work, like BERT by Google and the various flavors of GPT by OpenAI. I recently read the GPT-3 paper, great work, and rather than using fine-tuning, they instead give a text prompt to GPT-3 to have it produce additional output. I thought that was really interesting and clever. If you haven't seen some of the demos of GPT-3 on the internet, it's worth searching around and finding some cool ones. This is an example of English language correction, actually an example drawn from the GPT-3 paper, but I thought it was a really exciting example of how this type of framework works: we learn from a massive amount of unlabeled data and then adapt it to some other task, maybe a supervised task, maybe something else, which in the case of GPT-3 means giving a text prompt and having it do auto-complete. I think this will open up tons of applications.

I want to share just one last thought before I wrap up. Just now Łukasz alluded to the early Transformer networks and some of the machine translation models being things that only Google could do back then. We've learned in tech that yesteryear's big giant supercomputer is tomorrow's cell phone or wristwatch. We'll see how long Moore's law holds up, maybe it will, maybe it won't. By the way, I remember the early days when I was leading Google Brain; I remember we were so proud of ourselves when we got sixteen thousand CPU cores training a neural network, and at that time probably only Google could have done that. Today any of you can go online to one of the cloud services, and for about two or three thousand dollars, by my back-of-the-envelope calculation about three thousand dollars' worth of cloud credits, which isn't nothing, three thousand dollars is a lot of money, you could replicate what we were so proud of doing slightly less than a decade ago. So all of these things that seem so incredible: when we work hard on improving these algorithms and finding the right applications, even if they seem niche, I can imagine a day when these models that we read about and are amazed by, I don't know if they'll ever run on your wristwatch, but this is how technology becomes more available. Right now I see a lot of weak signals in the field of NLP, really building off Łukasz's and the team's transformative work, so many puns you can make with this stuff, that I think will explode and become more capable and more widely available.
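To make the pretrain-then-prompt pattern Andrew describes concrete, here is a minimal sketch assuming the Hugging Face transformers package is installed. GPT-2 stands in for GPT-3, which is only reachable through OpenAI's API, and the prompt text is invented, so treat this as an illustration of the mechanics rather than of GPT-3's quality.

```python
# Sketch of "learn from unlabeled text, then adapt with a prompt":
# a language model pretrained on web text is steered by a text prompt,
# with no fine-tuning or gradient updates. GPT-2 is a small stand-in for GPT-3.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = (
    "Correct the grammar of this sentence.\n"
    "Input: I has eaten breakfast already.\n"
    "Output:"
)
completion = generator(prompt, max_length=60, num_return_sequences=1)
print(completion[0]["generated_text"])  # the model's continuation of the prompt
```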
And I hope that maybe you, watching this from home, wherever you are, by learning these tools, can also participate in that revolution. Thank you.

Thanks, Andrew. I think that's really cool, what you say about the future of NLP: there are exciting advances you can point to that might be just on the horizon, but perhaps the most exciting thing is the unknown, the things that nobody knows or could predict. That's exciting to think about. So right now I would like to continue the conversation by welcoming our other panelists. Today we have online with us Kenneth Church, distinguished scientist at Baidu USA; Marti Hearst, a professor in the School of Information and the Electrical Engineering and Computer Science department at UC Berkeley; and Younes Mourri, an instructor of AI at Stanford University. Before we start the panel discussion, I thought it would be good to get a little intro from each of you. Ken, could you start us off? Ken Church from Baidu USA.

Oh, I'm sorry. Oh, there you are. You want an intro? Yes, just an intro and a hello. Hello, I'm Ken Church. I work at Baidu in California, but right now I'm in New York. I've been working in the field since the 70s, mostly in computational linguistics, but also speech and language, and even things like data mining. It's a real pleasure to be here.

All right, thanks, Ken; we're really glad to have you. Marti Hearst, could you give a little intro to yourself and your background?

Hi everyone, my name is Marti Hearst. I'm a professor at UC Berkeley. Once, a long time ago, I was a grad student and worked with Ken Church; I visited him at Bell Labs. I'm an interdisciplinary scholar: NLP is one of my fields, another is human-computer interaction, or HCI, as well as information visualization and search. So I focus on applications a lot of the time, although I've been doing this since the late 80s, so I have done some foundational work from that time as well. I'm also very interested in accessible online education, like Andrew, so I applaud the kind of course that you all are offering here.

Great, thanks very much for the intro, Marti. Younes Mourri, could you give a little intro to yourself?

Thanks, Ryan. I'm Younes, and I teach here at Stanford and also at deeplearning.ai, so you've probably already seen me at deeplearning.ai with Łukasz in the specialization. I'm also very interested in accessible education, specifically for AI, and I hope to see you all soon.

And I'll just say, we're really grateful to Ken, Marti, and Younes, in addition to Łukasz, for being with us. I want to give a shout-out to Ken, who was a huge help to the whole team in creating the NLP specialization, in addition to being an insightful, leading figure in NLP, so thank you, Ken. Marti I've known for a long time; I had the privilege of learning from her and interacting with her way back when I was a student at UC Berkeley. And Łukasz, whose work I've always praised, I've known for a long time; Łukasz and I actually co-instruct one of Stanford's machine learning classes. So I'm really glad to have all of you here, and Younes as well, of course.

Great, yeah, this group goes way back. So what I'd like to do now, for the next 30 minutes or so, is ask some questions of our panelists, and while I'll be directing each question to one of you individually, I'd like to encourage anyone else to feel free to jump in
and offer your own input on the conversation; we'll just keep it rolling that way, and after that we'll be fielding some questions from the submissions we're getting online. So, Marti Hearst, I'd love to start off with you. As a professor, I'd imagine you have students coming to you a lot asking what to focus on, what to get into, or why to get into something like AI in general or NLP. What do you tell them?

That's a great question. We have both undergraduates and graduate students, and I also talk to people who are looking at second careers; I speak on panels as well, and I think a lot of your audience might fall into that group. So the first thing is: why are you considering this area? I do find that people often do something because everybody else is, and that's often not a good reason, so really look carefully at what your motives are. Now, there's nothing wrong with dipping your toe in, and with these online courses it's so wonderful that you can get your feet wet and start to learn about something without a really big commitment, learn a bit about it and see if it's for you. But really assess what interests you, what you like, and then see if something about the field resonates with you. Other parts of AI aren't really my specialization, but I can speak to NLP, and to what drew me and what's drawn other students I've worked with.

I'm really interested in language, and when I was in my first years of grad school, a lot of the people in NLP were linguists; we were computational linguists. It was people interested in language, or in psychology, in the mind, in philosophy. And I was interested in how we make a computer interact with language; that's just really fascinating. Today a lot of people are coming at it more from the machine learning angle and the mathematical angle, and that's a perfectly valid angle as well; there's room in the field for both types of people, for all types of people. But again, look at what your interests are and see which part of the field is right for you. And even if you are currently strong on one of these aspects but not the other, don't let that deter you, because you can learn the other aspect. People who start with machine learning but don't know a lot about language can learn it. I would say that language is a very deep field; it's not harder, but it takes more time than something like information retrieval. Language and linguistics is very deep, very complex, very rich, but you can still get a first pass at it, and it rewards years and decades of study; it never loses its interest. On the machine learning side, some people have that background and some don't; you can learn it, and there are so many great resources now, so different from when I was a student, that with study and practice you can learn those skills if you don't have them.

But I'd say the best thing about NLP is that it's interdisciplinary, so it's not just the machine learning side, and there are applications that are fascinating, as Andrew so wonderfully outlined. If you're interested in applications, in changing the world and making a difference for good, then there's a lot here for you. And if you're a skeptic too, I mean,
there's a lot to question about AI, and I think there's a lot you can contribute from that angle in this field, and it helps to be well informed about the technology if you want to look at it from a skeptical perspective.

Yeah, that's a really interesting point, that the field would definitely benefit from more skeptics and more folks challenging ideas. It's also interesting what you said about NLP, or language; I know you worked on computational linguistics, but I think sometimes we make the mistake of thinking of NLP as just another application of machine learning, when it's really language meets machine learning, and language itself is a whole world unto itself. That's an interesting thing to consider. I'd like to ask you a question, Ken Church. You've been in the field a long time, you've worked on a lot of different things, and you've seen the academic and the industry perspectives. First, could you paint a picture of where the field was when you got started in it and where it's come to today? I know that's a broad scope, but some historical perspective would be great.

Sure, I'd love to. Actually, let me go a little broader than you suggested. My father was at Harvard when Skinner was there, and in those days empiricism was all the rage, not only in psychology: Shannon was at his peak then with information theory; Firth gave us the very famous quote, "You shall know a word by the company it keeps"; Harris, who was Chomsky's professor, pushed the distributional hypothesis, the idea that you can understand what's going on from distributional statistics. It all feels very familiar again today. Then, twenty years later, I was at MIT when Chomsky and Minsky were there, and they were pushing a different position, called rationalism; it was anything but the empiricism of the 50s. Chomsky and Minsky had rebelled against their teachers, and then my generation turned against our teachers, Chomsky and Minsky, and we revived empiricism. It's hard to remember, but before about 1990 it was almost impossible to get a paper into a top conference that used statistics at all. And then twenty years later came along deep nets. So what I want to say is that every twenty years there's an oscillation; it goes back and forth, and I think the basic common theme is like that cliché about grandparents and grandchildren having a common enemy: every generation needs to rebel against its teachers in order to make its mark. I wrote a paper called "A Pendulum Swung Too Far" which goes over this whole history. In retrospect, maybe that's an unfortunate metaphor, because it's not really quite as simple as that. Chomsky was a strong personality; Minsky was actually a sweetheart deep down, although he never really let that show. I want to say that we never agree with all our teachers on everything, we don't agree with everyone on everything, and it's perfectly fine to disagree. What I want to say is that fads come and fads go, but a lot of these things have always been around, in good times and bad. Hinton and LeCun did what they did when it was fashionable and when it wasn't, and similarly Salton, who is the guy behind the vector space model, which is really like word2vec and really the foundation behind much of what we do: that stuff was very unpopular through most of his
career. We've mostly been talking about methods, empiricism, statistics, and so on, and I think Marti did a pretty good job of talking about some of the other things that are important too. Data matters a lot as well. Back in the 80s it was very, very hard to come up with data; nowadays it's a lot easier, you can get massive numbers of books and Wikipedia and all sorts of things, you can download these wonderful data sets. It wasn't so easy then. But even so, we tend to underestimate the importance of things like balanced corpora: a data set should be a sample of something, and we want to think about what population we're trying to sample; not all data is interchangeable. The last thing I want to highlight is the commercial impact. I no longer have to explain to people what I work on. There was a time when I had to explain what speech recognition is, what speech synthesis is; you've all experienced these now. There was a time when there really weren't any commercial successes. You've probably heard of the company Symantec, but you probably don't know why it's called Symantec. Nowadays there are lots of Fortune 500 companies in this area, and we no longer have to talk about just supply; there is tons of demand for what we do. That's all new; it wasn't like that in the 50s, and certainly not in the 60s, 70s, 80s, or 90s. Anyway, thank you.

That's fantastic perspective: working on things when they're not in demand yet, whereas now there's no question about whether this stuff is relevant or whether it's going to change the world. That's very interesting. Łukasz, I'd like to ask you a question. You talked about Transformers a little bit, and Andrew touched on them in his talk as well. A few years back you co-authored the paper called "Attention Is All You Need", and we've already had some good puns on this, but this paper has gotten a lot of attention, if you will, and the subsequent things that have come out based on Transformers have been really exciting. For the audience we have gathered today, many of whom might be familiar with the math and basic architectures around neural networks, how would you explain what the Transformer is and what it was designed to achieve?

Well, I think it's good to bring this back to what Ken said: everything comes in waves, and everything builds on top of the things that came before. You need to understand the context before the Transformer, which is where everything was, at least in the deep learning part of NLP: RNNs were doing everything. They were the big first wave of deep learning in NLP, and they have this super basic idea that you should process things recurrently, step by step, which is so intuitive, but it's really slow on modern hardware, because the accelerators are built for parallel processing. The idea of attention comes from another very old technique in NLP, which is alignment: when you try to translate sentences, you have an English one and a French one, and you try to align which words correspond to which words. If you put this alignment idea into a neural network, what you get is the attention mechanism. It is a soft version, a differentiable version, of aligning words with the words that come before, and self-attention is aligning within the same text, so that each word aligns with the words that appeared earlier.
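Here is a minimal NumPy sketch of the scaled dot-product self-attention Łukasz is describing, the soft, differentiable alignment of each word with the words that came before it. The toy dimensions are arbitrary, and a real Transformer adds learned query/key/value projections, multiple heads, and positional information.

```python
# Toy causal self-attention: each position computes a soft alignment over
# earlier positions and returns a weighted mix of their representations.
import numpy as np

def causal_self_attention(x):
    """x: (seq_len, d_model) token vectors; returns attended vectors of the same shape."""
    seq_len, d_model = x.shape
    # A real Transformer derives queries, keys, and values via learned projections.
    queries, keys, values = x, x, x
    scores = queries @ keys.T / np.sqrt(d_model)              # pairwise alignment scores
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -1e9, scores)                   # attend only to earlier words
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)   # softmax: the soft alignment
    return weights @ values                                   # weighted mix of values

x = np.random.randn(5, 8)               # 5 tokens with 8-dimensional embeddings (toy sizes)
print(causal_self_attention(x).shape)   # (5, 8)
```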
And this idea works surprisingly well in terms of the results you get, but it also works surprisingly fast if you implement it well on modern deep learning accelerators. So it's partly a matter of making the implementation really easy to use, and I think that was part of why it became so successful: you can just run it, and it runs fast enough. It always makes a big difference whether you need to wait for your result one day or one month. But the other thing is that alignment is a really good idea; this has been known in NLP for a long, long time. If you want to translate a whole paragraph, it's much easier when you focus on translating one part and only look at that one part at a time; that's what every human being does, and if you give this prior to a neural network, it works really well. So that's where attention came from. But I would also say that, as everything goes in waves, it's still a very good idea to understand RNNs, because there are modern papers on sparse Transformers that start showing that this attention is more like RNNs than people think, and that actually adding some recurrence into it makes it even better. So it wouldn't be surprising if people go back to recurrent networks in some shape or fashion again. That's why it's good to learn about more than just the most recent technique; it's good to know the things that came before and learn all about them.

That's really interesting, that in some sense, by pushing the boundaries of Transformers, you end up somewhat circling back to where you came from. It again highlights that the building blocks, where it all came from, are important. Younes, I'd like to ask you a question. You're the earliest-career panelist here, and I think it must be an interesting perspective to have been studying all this, developing your own mastery, while the field was advancing so fast and making such a big public splash. Maybe you could give some perspective on what your journey has been like, and any recommendations you'd have for people who want to start on that journey themselves.

Yeah, thanks, Ryan. My journey has been a little bit different. Education means a lot to me: I grew up in Morocco, where we did not have access to the latest technologies and state of the art, or as many resources as we have here at Stanford. So I really wanted to be part of the mission of taking the latest technologies and the state of the art and making them accessible to everyone around the world. That's how I started. I came here around 2014, where I started with the Machine Learning course on Coursera, and I began by writing lecture notes and trying to make them as accessible as possible to everyone. Eventually I built up from there, studied math, computer science, and statistics, and slowly started mastering building assignments and quizzes and so forth. That was pretty much the journey. Then we built the Deep Learning Specialization and started teaching on the Stanford campus. It's been interesting, because every time you finish building an assignment there's new technology that comes out, and then you have to read another paper and try to transform it into another assignment, and it just keeps going on and on. So it's definitely been exciting: as soon as you're done with a
new assignment, you go ahead and try to bring in the new state of the art.

Yeah, I can definitely say from my own perspective that working on these courses with you and Łukasz has been really interesting, in that the field has developed significantly while we were building the courses; it feels like it's been an exciting path. Andrew, I'd like to turn to you with another question. As you mentioned, you've worked on a wide range of AI applications, and I think one of the issues you've also thought about is the fact that AI often seems like a black box: people struggle with explainability, or at least there's skepticism around the black-box nature of AI algorithms, and particularly the bias that might come out of algorithms when the output is not fully understood. So in the field of NLP, what thoughts do you have on how we can think about bias, and what the big considerations are?

Boy, I don't know, this is a tough one. I think I've seen a few pieces of very good work. One of the most influential papers of recent memory was the one out of Microsoft showing the bias in learned word embeddings, where, if you learn from text off the internet, the model learns the really horrible analogy that "man is to computer programmer as woman is to homemaker", or something like that. To that team's credit, they proposed a solution to zero out this form of bias, so that you end up with the more appropriate conclusion, which is "man is to computer programmer as woman is to computer programmer", which seems like a much more appropriate way of completing analogies. I think explainability and understandability is one of those things that speaks to the importance of the field. Once upon a time, the field of AI, frankly the field of computer science, had much less of an impact, and it was just a few people writing code, writing software, doing cool things, trying to build things that work. But today the work of the entire discipline, computing, AI, NLP, has an outsized impact, and we owe it to society to really think through the real impact of our work and then do our best to move things in a positive direction.
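Since the fix Andrew refers to, zeroing out a bias direction in the embedding space, is a concrete vector operation, here is a minimal sketch of the "neutralize" step from Bolukbasi et al. (2016). The 50-dimensional random vectors stand in for real pretrained word embeddings, so only the shape of the computation is meaningful.

```python
# Sketch of embedding debiasing: estimate a gender direction from gendered
# word pairs and remove its component from words that should be neutral
# (the "neutralize" step of Bolukbasi et al., 2016). Vectors are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)
emb = {w: rng.normal(size=50) for w in ["he", "she", "man", "woman", "programmer"]}

pairs = [("he", "she"), ("man", "woman")]
g = np.mean([emb[a] - emb[b] for a, b in pairs], axis=0)  # estimated bias direction
g = g / np.linalg.norm(g)

def neutralize(vector, direction):
    """Remove the component of `vector` that lies along `direction`."""
    return vector - (vector @ direction) * direction

emb["programmer"] = neutralize(emb["programmer"], g)
print(abs(emb["programmer"] @ g))  # ~0: no remaining projection onto the bias direction
```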
When Ken and Łukasz were speaking, one thing occurred to me that I want to comment on: Ken, and Łukasz too, mentioned technology coming in waves, and Ken used the term "fads" in NLP, which I think is completely accurate. One lesson I've learned from thinking a lot about technology is that it's important to time the technology appropriately. Who invented the helicopter, that thing that flies around? It turns out Leonardo da Vinci had early drawings of what looks a lot like a helicopter. There were some problems with his designs, he did not get it right, but if you search online you'll find that Leonardo da Vinci was trying to invent a helicopter. It totally did not work: without the internal combustion engine, the human body could not generate enough power to get his contraption into the air, so it failed. How about smartphones? The Apple Newton, search online if you want, was a terrible idea: the technology wasn't ready, LCD screens weren't ready, batteries weren't ready, the touch interface wasn't ready, so the Apple Newton failed. A couple of decades later Steve Jobs launched the iPhone, and that transformed the world of mobile computing. So I think a lot of the fads that Ken accurately alluded to are about getting the timing right.

In my opinion, and this may be controversial, 30 years ago deep learning was a terrible idea: computers weren't fast enough, the data wasn't there, it just didn't work. Now, I'm not ungrateful to the people who were working on deep learning 30 years ago; I think they laid the foundation. And you can also argue, I actually have no idea, whether Leonardo da Vinci's early drawings of helicopters inspired what actually happened 50 or more years ago when the helicopter became practical, and even though the Apple Newton failed, you could argue it maybe laid the foundation for the modern smartphone. So I'm not dismissing the importance of the early foundational work, but I think 30 or 20 years ago the timing was wrong, and the right time to jump into deep learning was maybe 15 years ago, when the compute finally became powerful enough and the data was there, and deep learning finally overtook support vector machines or whatever we were doing a little bit before then. Leonardo da Vinci deserves tons of credit for envisioning the helicopter, even though his attempts to build one failed. But as we go through these things, I think the timing of NLP feels more solid now: the compute is there, the data is there, we're getting early, nascent results, and it's one of the ways to have an impact on the world. Not the only one: we need people working on all sorts of things, including things that are 50 years off, even 100 years off; that's fine. But if you look at the weak signals, I think one of the best ways to make a big contribution to society is to spot the weak signals about things that, with our help, can take off in a better or faster way over the next several years, and to me NLP pattern-matches very well to that: you can see a lot of weak signals suggesting that if you work on it, you can make a big difference. Arguably, maybe da Vinci did have a big impact on the rise of helicopters, I don't know, and certainly the pioneers of 30 years ago have had a big impact on the world today, but given that we all have limited lives and limited amounts of time, I want to do the work where I have the biggest impact, so I try to think through: if I do this now versus not now, what enables me to hopefully help others the most? And I do think NLP pattern-matches well to that. And actually, Marti, I know you've thought a lot about timing too; I don't know if Ryan wants us to jump in chaotically.

Yeah, I do, actually. I was going to follow up there, but Marti, I would love to get your perspective on timing, because I guess there's one way to think about it: if you want to time something right, you should be ahead of the wave, so if it's a big deal now, maybe it's too late and you should move on to the next thing. But it sounds like, for NLP at least, even though it seems like a really big deal right now, it's
perhaps just the beginning, or something like that. But Marti, I'd love to get your perspective.

Well, first, I loved Andrew's analogy to da Vinci and the helicopter, and neural nets not being ready when I was a grad student and people were tinkering around with them, hoping to make them work. Actually, a lot of people I know got interested in AI and got sucked into neural nets, and it was fine, they were productive, but it wasn't really going anywhere then. So that was just the wrong timing, and that links to something seeming like a fad as opposed to having its moment because it's now possible. That said, there are a lot of things that become fads that are just popular because something about them interests people, even though they're clearly not working; I think that does happen as well, and I can think of examples that I won't name. But in terms of timing and choosing research projects, and I know this audience is maybe largely practitioners, so I don't know how much this applies to people here, but for those who are thinking about research: my strategy, or I guess I couldn't call it a strategy, my bent, was always to look at an area where there weren't a lot of people working, or a problem people weren't that interested in. I was looking at search before the web, and to say it was unpopular is an understatement; it was very much a backwater. But I was really interested in the question. I just didn't like library catalogs; I had wanted to fix them since I was a kid, and I had ideas about it since I was a kid, and now those ideas are what you see on websites when you shop. That's very satisfying for me. A lot of times there's a problem that you want fixed and nobody else is thinking about it, so maybe you can fix it, or maybe a few people are thinking about it but you have a slightly better way of doing it. For me, that's what I started with: what are you interested in, not what is everybody else interested in. If you really want satisfaction in an intellectual pursuit, I think that's the most important thing to think about. And then, if you pursue something other people aren't doing, be prepared for a lot of rejection; be prepared for your papers to be rejected. I have a lot of well-cited papers, and most of them were rejected the first time I submitted them; I often had to send them to a smaller venue, and then, strangely, they got cited. A few of them were accepted the first time, but this happens to almost everyone who has a big new idea: their papers are rejected the first time. If it's a good idea, just keep working on it. Everyone has their own way of working; some people are really good at taking ideas that are established and making them deeper, making them more sound, and that's a great way to advance work as well. It's just so important to understand what your particular interests and strengths are and to do that, not what everyone else is doing for its own sake.

Yeah, absolutely. Ken, I wanted to get your perspective as well on the timing of things. I know you've probably seen a lot of different things timed well or timed poorly; what would you say?

So I think of timing a little bit like the stock market: you want to buy low and sell high, not vice versa, and the tendency is to do just the wrong thing. It's sort of like a kids'
soccer game, where everybody goes to the ball; it's much better to play your position and get a good assist. In my own case, I was very early on in the revival of empiricism in the 90s; my first empirical paper was in the late 80s. In those days there were hardly any papers on empirical methods; a decade later there were hardly any papers on anything but empiricism. So it was good to get started in the 90s, and by a decade later it was time to do something else. Deep nets came in at a pretty good time. I would worry about jumping in when everybody else is jumping in and doing what everyone else is doing; you want to get into the next new thing, and I don't know what that is, though I think we could make some suggestions. I don't think it's a good idea to be doing what everyone else is doing, and Marti put that very well. Thank you.

Yeah, I think the audience would certainly welcome suggestions if anyone has them; I know it's hard to predict what the next thing will be.

Maybe I can jump in with one comment on that. It might feel, with the internet, like everyone is doing deep learning in NLP; I feel it's actually the opposite. I think this is a field that has barely scratched the surface. Of course, nowadays everything happens online, and if you immerse yourself in a field it feels like there is a lot of movement, because the world of NLP is much bigger than it was decades ago. But I feel we are barely scratching the surface of what these deep models can actually do with language. As Marti says, language is an extremely deep field. Yes, we can generate you a story and it reads okay, but does it really have meaning? Can we really steer it to generate things that are true? Can we verify them? We talked about biases: you can actually ask the language model, "is this offensive?", and in a lot of cases it will tell you whether it is or not, but it will make some mistakes that are very non-human-like. Can we understand why? I feel that GPT-3, which was only released weeks ago, is the first model that shows you can do learning without gradient descent: you can just put things into the model as input, and it works as if it had been trained on them. This is very new, and it's very hard to test because you need large models, so I feel that in half a year, in a year, there will be models you'll be able to start doing this with hands-on. That's why I feel it is a great time to start learning, because it's only the beginning. For a lot of these things that actually go into language, we've gotten the technology, we've gotten the connection to deep learning, but actually going into the depths of language is still ahead of us, and I can imagine this will bring applications.
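To show the shape of the "learning without gradient descent" Łukasz mentions, here is what a few-shot prompt looks like. The example pairs are invented, and in practice the string would be sent to a very large model such as GPT-3 through its API, since small models handle this far less reliably.

```python
# In-context ("few-shot") learning: the task is specified entirely in the
# prompt text, and the model is asked to continue it; no parameters change.
# Example pairs are invented; a large model (e.g. GPT-3 via its API) is assumed.
few_shot_prompt = """Translate English to French.

English: Where is the library?
French: Où est la bibliothèque ?

English: I would like a coffee, please.
French: Je voudrais un café, s'il vous plaît.

English: The weather is nice today.
French:"""

print(few_shot_prompt)  # a capable model would complete the last line in French
```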
I think maybe Marti can say more about that.

Yeah, I think you're making a great point, Łukasz, and I didn't mean to imply that people should stop doing deep learning; it's a fundamental tool now, so one works with the fundamental tools of the field and moves forward, and I didn't want to imply otherwise, in case I did. I think the key thing, though, is, you just raised GPT-3: I think a lot of people are reacting to language modeling, getting a little technical here, being seen as the way to solve all the problems, and so people are looking for alternatives, since it's clearly not going to solve a lot of the hard problems. It doesn't model a lot of things about language and communication that are really important. And by the way, as an area for people to think about going into, Andrew mentioned chatbots, for example, which are a type of communication. We really don't know how to do communication well with language, or interaction; that is really unsolved, and it's at the intersection of a lot of interesting fields. Our machines do not read, despite that word; there's no understanding in these systems, I don't think. So we're doing really well on applications that we did terribly on 10 years ago. I used to open one of my classes with scenes from 2001: A Space Odyssey and have the students say what's possible and what's not; I don't even do that anymore, because almost all of it is possible now, except for the very end of the movie. Nonetheless, it's somewhat of a mirage or an illusion in terms of what's going on with NLP. So what I mean is: don't just work on language modeling, work on these other, more difficult problems, which I think is exactly what Łukasz is saying.

Absolutely, that's an interesting perspective. Ken, did you have something you wanted to add to that?

Yes. I certainly agree there are tons of interesting open problems in language and deep learning, and I don't want to discourage anyone from that. What I'm more worried about is that there are a lot of papers these days on sort of mindless benchmarks. I've got an opinion piece coming out soon called "Benchmarks and Goals", where I'm saying we should focus on the goals: what are we trying to address here? I worry that some of the benchmarks take on a life of their own, that they were posed for no particularly good reason, they might once have made sense but they certainly don't make sense now, and suddenly you see an enormous number of papers showing incremental progress on something that's not worth doing. That worries me. So I think the consensus here is that there are a lot of things worth doing, but that's not one of them.

Absolutely. I think it can seem like that sometimes, that there are just endless plots showing the number moving slowly higher and higher on tasks that are not worth doing. Fantastic.

Can I agree on the one hand, and on the other hand say that, to a lot of people, I think it's okay? There are a lot of online programming competitions, and some machine learning modeling competitions, and at some point, yes, they are pointless, but to a lot of people they're an important learning experience. When I was learning, actually, I'm still learning AI myself, but in the phases of my learning journey I did a lot of pointless things: I wrote a lot of code with no purpose other than wanting to try something out and see what happens, and I think that's actually okay. I was talking to someone at a very large travel company that had built a chatbot, and their executives were saying, hey, this is useless, how is this helping this very large travel company make money? And I think they said, no, it doesn't help the
company make money at all, but I think that's fine, because to this team, in this very large company that you've probably heard of, I'm not going to say their name, it was an important part of the AI team's learning journey to try these things out and do a little demo. Now, if they stay at that level for their whole life, that's not great, but as a stepping stone to then do something bigger and more significant, I think it's fine. So really, to anyone: go ahead and build some silly things. Download tweets and classify them in a way that is totally useless. If that's all you do your whole life, maybe that's not so great, but as a stepping stone to gain skills and then go on to the next level, I think that's fine. So if Ken is complaining about people who do this their whole life, I agree, but I also encourage people, as part of your learning journey, to go ahead and do a fun but useless project, celebrate that, and then use the skills you learn to do something even bigger. Absolutely, great perspectives. Younes, I wanted to give you a chance to add to the conversation; it looks like you're on mute, though. Yes, thank you. One other trend I've been seeing is that these NLP models have been getting larger and larger, like GPT-2, then GPT-3, and who knows what will come next, maybe GPT-4. One important research direction I'm getting interested in is: what if you could take these huge models and make them smaller while keeping the same performance, so you could load one in an app? That could be way more useful than having to log into a server and call an API. So model reduction, I was thinking, would also be a great research area. Absolutely. All right, at this point I'd like to turn to some of the questions that have been coming in online. We have a lot of people asking questions, and hopefully we can cover a few of them here. One thing that is on people's minds, and we touched on it a little bit, is GPT-3: what does this mean? Does it mean everything else is now irrelevant, that this is the thing everybody's going to use, and whatever all the other stuff was, don't bother, just think about this and the next thing? Maybe, Lucas, you could start us off with some thoughts. So GPT-3 is a big transformer. We will actually go through the whole code, and you'll understand every line of this model, in course four. It's a transformer model trained on data from all over the web; it predicts the next word, and it's wonderful that it generalizes with few-shot learning to new tasks, which is a little unexpected. But it really has been trained on data from all over the web, so in some sense it's not that unexpected, and you will see that even the smaller models we train in the courses generalize as well. So GPT-3 gives you answers, great, but as Marty said, what use are these answers? Maybe your task fits the "this is the question, this is the answer" paradigm, but usually you want to use the model to solve your own task. If you play with it, that's okay; solve a useless task, get experience, learn. But later you need to use the tool on a real-world task, and as I told you, we have great translation models, yet Google Translate is not just the model. You cannot launch a deep learning model just like this: you need to put in some pre-processing, some post-processing, all of the old techniques, even just simple regexes, to see whether it's producing some terrible output.
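To make that concrete, here is a minimal sketch of the kind of regex guard being described, assuming a hypothetical `model_translate` callable standing in for whatever model you wrap; the patterns are illustrative examples, not a recommended production list.

```python
import re

# A minimal sketch of the "defenses" described above: simple regex checks wrapped
# around a model so that obviously broken outputs never reach the user.
# `model_translate` is a hypothetical stand-in for the actual model call.

BAD_OUTPUT_PATTERNS = [
    re.compile(r"(\b\w+\b)(\s+\1){3,}"),   # the same word repeated many times in a row
    re.compile(r"^\s*$"),                  # empty or whitespace-only output
    re.compile(r"<unk>|\[UNK\]"),          # unknown-token markers leaking through
]

def safe_translate(text, model_translate, fallback="[translation unavailable]"):
    """Wrap a model call with simple pre- and post-processing checks."""
    cleaned = text.strip()                 # pre-processing can be as simple as this
    output = model_translate(cleaned)
    for pattern in BAD_OUTPUT_PATTERNS:
        if pattern.search(output):
            return fallback                # detect the cases where the model is really bad
    return output

# Example with a deliberately broken fake model:
print(safe_translate("bonjour", lambda s: "hello hello hello hello hello"))
```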
You really need to look at what you're outputting to your customers, to your users, and verify it, and you need to start thinking about why the model produces such bad outputs in certain situations. If you don't know how to repair the model, and there is still a lot we don't know, then you need to put up some defenses against this, maybe just detect the cases where the model is really bad. GPT-3 is bad in many cases and good in many others, so you need to learn: it's a tool, and you need to really learn how to use it. Gain experience, maybe on some useless problems, gain experience everywhere you can, because you can use it productively, but it's not just "take it, apply it, and it works." No, it's a lot of work to understand when it works and how it works, and as Marty says, hopefully over the years we'll get a better understanding of how these things actually work. They're tools that allow us to go a little bit deeper into language, but there's still a long way to go, and I think the courses will help you get a real understanding of where you can use it and where a lot of other things are needed too. And there will certainly be a GPT-4; it's not the end of the road, but it's a great step on the road. Fantastic, thanks for that perspective, Lucas. We have another question here, and I'd like to direct this to you, Marty, although I don't know everything about your background. The question is: can you tell us about the influence of cognitive science and neurobiology on neural network architectures? For example, computer vision architectures have been motivated by local receptive fields and compositionality in the visual pathway; can you say something about that? I'm not sure I'm the biggest expert on cognitive science and neural architectures, and maybe others in this group know more, but I can say that it had a lot of influence on NLP in the earlier days. There are the classic Rumelhart and McClelland books, which were very early neural net books on distributed cognition; they really were motivated by cognition and by modeling language and thought. Those are great classics to look at; they're a little hard to read, but if you're really interested in that, those are the books to look at. But maybe other people can answer too. Maybe I'll add: once upon a time, a lot of neural network work was inspired and motivated by neuroscience. I remember, maybe 12 or 13 years ago, I read a whole stack of neuroscience papers looking for inspiration for new deep learning architectures, so there has been an element of inspiration, and ideas like sparse coding, for example, really were inspired by neuroscientists like Bruno Olshausen. But I think the trend recently has been that deep learning has evolved into a standalone
engineering discipline, where the principles by which we get ideas to move the field forward tend to be engineering and computer science principles, and I think the inspiration from cognitive science and neuroscience has diminished significantly over the last decade. Not that it's a bad idea, and I'm not saying people shouldn't try it, but if I look at the percentage of new ideas coming into deep learning that seem to have their roots in biology, that percentage seems to have shrunk significantly in the last decade. Thanks, Andrew. We have another question, specifically for Ken: do you think the latest cutting-edge NLP techniques, like transformer applications, could be applied to all text formats and styles? For example, could you do something like neural style transfer, as has been shown on paintings, with writing styles? It seems like this is something people have already tried working on, but can I get a comment on this? Yes. I'm sure this is a topic that probably already has some papers on arXiv that I don't know about; it's just unbelievable what's going on. That said, I don't have anything bad to say about transformers, but just based on the history I talked about, if you ask me how long I expect transformers to stay on top, I'd be shocked if they stayed on top for more than a decade, and probably not that long. How long did LSTMs stay on top? Not that long. How long did any shiny object stay on top? It's just not that long. So we could start to ask: what are some of the big hairy deep open problems, open for 60 years, that maybe transformers have something to say about? For example, Chomsky talked about long-distance dependencies when he was criticizing Shannon and n-grams, and what he meant by "long distance" was something shorter than a sentence. Now transformers can deal with something longer than a sentence, but I don't know if they can really deal with something the size of a book. So large scale is one of many such things; robustness is another. In the 1962 DARPA mission statement for what the funding agency was supposed to work on, robustness was in the mission statement, and I'd say it's still a work in progress. The ACL best paper award this year was on that topic. We normally deal with average-case performance; all the numbers you heard today were about the average case. Occasionally we talk about the worst case, adversarial attacks, GANs, but what this best paper award suggested is that we should work on something that's harder than average case but not as hard as worst case, something like: what would a reasonable user expect the system to do? I worked for a company once that tried to do five-nines reliability, a few defects per million; that's very expensive, not as expensive as the worst case, but very expensive. For other products, it should basically work, and the user would be okay with a couple of defects. What I want to say is that neural nets right now are still mostly trained and evaluated for the average case.
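As a rough illustration of that gap, here is a small sketch with made-up numbers: overall accuracy can look fine while one slice of users sees many failures. The per-user grouping and the data are assumptions chosen only to show the difference between average-case reporting and something closer to what a reasonable user experiences.

```python
import numpy as np

# Illustrative only: per-example correctness (1 = correct) for some model,
# grouped by which user issued the request. All numbers are made up.
results_by_user = {
    "user_a": np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1]),
    "user_b": np.array([1, 1, 1, 0, 1, 1, 1, 1, 1, 1]),
    "user_c": np.array([1, 0, 0, 1, 0, 1, 1, 0, 1, 1]),  # one user sees many failures
}

all_results = np.concatenate(list(results_by_user.values()))
per_user_acc = {user: r.mean() for user, r in results_by_user.items()}

print("average-case accuracy:", all_results.mean())          # looks fine overall
print("worst user's accuracy:", min(per_user_acc.values()))  # closer to the worst case
print("per-user accuracies:", per_user_acc)
# A benchmark that only reports the first number hides the experience of user_c,
# which is the gap between "average case" and what a reasonable user expects.
```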
The kinds of questions people ask now are all of the form "can it do this, can it do that, can it do the other thing." We used to worry not only about what we can do but also about what we can't do: you can't compute all functions, Turing showed you can't solve the halting problem, you can't get around undecidability; there are many things like this that we can't do. What is the next opportunity for the next shiny object? There's probably going to be something better than transformers; right now transformers look really good, but that's right now. Thank you. Thanks, Ken. I would like to close out this part of the session. As Ken said, this was a problem very much on our mind. Transformers were three years ago, and we've been working these three years exactly on that; transformers are not the end of the road, they're a step, and there are new models. One paper we have just this year is the first transformer that can actually handle a book, which I will tell you a little about in the last course: it's called Reformer, and it also will not be the last step on the road. That's why you need to learn how these things work, so that in a year, when there is a next step, you'll be able to use whatever comes, whatever is needed for your application. That's why learning is so powerful: it's never the end. Fantastic, thanks, Lucas. I'd like to close out with one last question for Marty. Someone asked: given that the field is changing so rapidly, and we've just been talking about how even the stuff that seems exciting right now might not be exciting a few years down the road, does it make sense to dive deep into particular topics, like solving a specific problem such as speech recognition, or does it make sense to stay diverse and try to get a sense of everything? What would you tell somebody heading into this field? Well, I always tailor my advice to the individual, so it's hard to generalize, but I will say that the statistics Andrew presented, about the number of papers being submitted and the amount of work going on increasing exponentially, have made this field different than it has ever been in my life. I think this is how the biosciences are, where developments are extremely rapid, but for us it's new, and I think we're all trying to figure out how to navigate these waters; if someone says they've figured it out, they're probably lying. Even graduate students who do nothing but read these papers and work on them can't keep up, so imagine what it's like for a professor who has to do a lot of committee work and be in multiple fields. The fact that you all, who spend full time teaching these classes, have to make a new lesson whenever there's a new paper, that's not sustainable for a faculty member. So it's kind of a crazy time, and I don't have a solution for how to keep up with everything going on in the field right now. What people are doing is putting their heads down, picking a problem, learning what they can about some subset of techniques, also by talking to others, and looking for solutions within a subspace of all the possible solutions; at least that's my interpretation, and I'd love to hear other people's views. But you have to get the foundations first. If you're new to the area, you've got to take the courses and get the broad foundations on the language side, the machine learning side, and the programming side. The courses that give you examples to start
with, where you write some code and test it on a test set, like Andrew said, are incredibly valuable for letting you learn how to do things, and then you have to just keep learning and studying. I was actually taking an online course yesterday to finally learn GitHub; I should say I haven't learned it yet. It's my fourth version control system, and I'm tired of learning version control systems, but we always have to learn new ones; that's part of this field, and even professors have to constantly learn new tools. So the bottom line is: get broad first, then go deep. It's like a T shape, though not really a T, because I think you have to go deep in multiple places. But I'd love to know how other people are coping with this growth. Cool. Well, I feel like we could keep this conversation going for a long time, but we need to save some time at the end for the course demo with Younes. At this point I want to thank all of the panelists very much for all the insights. I also want to mention that for people looking to stay up to date, we publish a weekly newsletter called The Batch, where we spend a lot of time figuring out what's relevant to think about this week and bringing it down to a consumable size. Of course we can only do so much in there, but it's a great place to catch up on some things. So thank you very much to all of the panelists, and at this point, Younes, I'd like to turn it over to you to give us a little demo, a sneak peek into the course that's coming out today. Yes, thank you, Ryan. Perfect, I hope everyone can see my screen now. What I'm about to do is give you a course demo of course three. So why take this specialization? The specialization teaches NLP, which is one of the most sought-after skills in AI, and you use NLP every day: text is everywhere. Right now you're reading text; when you're looking at your phone you're using text; when you're sending a message to someone you're using autocorrect; when you're typing a search query into Google Search you're using autocomplete; and you're using machine translation every time you use Google Translate. You're using NLP tools and applications all the time. So why take this specialization specifically? Well, it was designed to build up to the latest state-of-the-art models. You start with courses one and two, where we lay the foundations, and then in courses three and four you start reaching the state of the art, so hopefully by the end you will not only know how these APIs work but also know how they're built from scratch and how to use them. So who is this specialization for? First of all, the prerequisites are just basic machine learning and basic coding; as long as you know those, you'll be able to do the specialization. Software engineers, developers, and students of all backgrounds can easily take it, and if you're a product manager or an entrepreneur who wants to transform your business into an AI-powered business, the specialization will also help you with that. And also people who just want to know the latest AI trends: the specialization covers the latest state-of-the-art models, so it will also be a very good fit if you just
wanted to know what industry is using right now. So let's take a look at what you learn. This is just a recap of course one, where you learn to do sentiment analysis with logistic regression and naive Bayes, and then you also learn about vector space models, PCA, and locality-sensitive hashing, so you can translate a word from English to French or complete analogies. So that's course one, and it lays the foundations. Then in course two you learn about dynamic programming and hidden Markov models; this is where you learn to build applications like autocorrect and autocomplete, and then part-of-speech tagging: for example, if someone says "book a flight" then "book" is a verb, and if someone says "I want to read the book" then "book" is a noun. This is also useful and has a lot of consumer and enterprise use cases. Course three is where we start getting into more deep learning: you cover dense and recurrent neural networks, LSTMs, GRUs, and siamese networks. For example, students ask us questions, and the same question gets asked again and again; with siamese networks you can identify whether a user has asked a question that has already been answered and just recommend that answer. You also learn about text generation, named entity recognition, and how to identify question duplicates. Then there is course four, which is currently state of the art, where you cover encoder-decoder models, causal attention, and self-attention, perform the latest state-of-the-art machine translation, learn how to take an article and summarize it, do question answering, and build chatbots. So you're not only using an API; what's really special about course four is that you get to see how these products are built from scratch, so you can use them in your own applications and projects. Some of the models covered are T5, BERT, the transformer, and the Reformer; the Reformer is the efficient transformer Lucas was talking about, which saves a lot of memory, makes things much faster, and lets you capture long sequences. By taking course four you'll also see how GPT-2 and GPT-3 are essentially just bigger versions of these models. Course four is coming in September, so stay tuned for that, but now let's focus on course three, which is sequence models. Remember, course one was classification and vector spaces, course two was probabilistic models, course three is sequence models, and course four is attention models. In week one of course three we're going to do sentiment analysis, and I know you're thinking, oh, we've already seen sentiment analysis in course one. In course one, if you had two inputs like "I'm happy" and "I'm sad," that's easy to classify: you can use logistic regression or naive Bayes, as you've already seen, and get it to work. But what if I gave you an input like "with a great setting I almost liked the movie, but things changed when I saw the end"? Here you have "great," you have "liked," so a naive model might easily classify it as a happy sentence, but it turns out it might be a sad sentence. This is much harder, and capturing this kind of dependency is exactly what you do in the first programming assignment.
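As a quick illustration of why that example is hard, here is a toy sketch (not the course's actual model) of the word-level scoring a naive classifier effectively does; the word lists are invented for illustration.

```python
# A toy illustration of why word-level scoring fails here: a naive approach
# effectively adds up evidence from individual words and ignores word order.
POSITIVE_WORDS = {"great", "liked", "nice", "happy"}
NEGATIVE_WORDS = {"sad", "hated", "worst"}

def naive_score(sentence):
    """Positive minus negative word counts; order and context are ignored."""
    tokens = sentence.lower().split()
    return sum(t in POSITIVE_WORDS for t in tokens) - sum(t in NEGATIVE_WORDS for t in tokens)

tricky = "with a great setting i almost liked the movie but things changed when i saw the end"
print(naive_score(tricky))  # > 0, so the naive view calls it positive
# A sequence model (an RNN/LSTM reading the whole sentence) can learn that
# "almost liked" and the "but things changed ..." clause flip the overall sentiment.
```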
In the demo, for example, "it's such a nice day, think I'll be taking Sid to Ramsgate for fish and chips" is a positive sentence, and you can see that the model says it's positive, but that one is simple. Then you have "I hated my day, it was the worst, I'm so sad," which is obviously a negative sentence. But with "with a great setting I almost liked the movie, but things changed when I saw the end," you have the word "great" and the word "liked," yet the neural network was capable of capturing the dependency, and it turns out the sentence is truly negative. So you'll be able to build things that capture these dependencies. Week two is text generation: you'll use character-level generation, and similar concepts carry over to other text generation applications; poems, question answering, and chatbots all make use of text generation. The slide shows an example of character generation, a Shakespeare-style snippet ("... which thou hears ...") produced one character at a time, along with some more character generation examples, and all of these are things you will build in the programming assignment of week two of course three. Week three is named entity recognition. Here's an example: "many French citizens are going to Morocco for Christmas," where "French" is a geopolitical entity, "Morocco" is a geographic entity, and "Christmas" is a time indicator. You'll be doing exactly this in week three. And here's a concrete use case: say you want to help businesses with customer support tickets by getting structured data from unstructured data. You can use named entity recognition to identify people, places, brands, monetary values, and more. Here's an example sentence: "Peter Navarro, the White House director of trade and manufacturing policy of the U.S., said in an interview on Sunday morning that the White House was working to prepare for the possibility of a second wave of the coronavirus in the fall, though he said it wouldn't necessarily come." You can see "Peter Navarro" is a person, "White House" is an organization (it appears twice), "Sunday morning" is a time indicator, "coronavirus" is tagged as well, and "fall" is a time indicator. You'll also be building this. Week four is question duplicates: is someone asking a question that has already been asked? You'll be using siamese networks applied to a specific use case with the Quora dataset, and you'll use them to find similar search queries or similar sentences. This is a very important application, because you do not want to answer the same question again and again. Here is an example from the programming assignment towards the end; we're showing you the built version, but you're going to have to code it from scratch. Question one says "do they enjoy eating the dessert" and question two says "do they like hiking in the desert," so these are not the same, and it returns false. Next, if you replace "hiking in" so the questions become "do they like the dessert" and "do they enjoy eating the dessert," it is still false. Now I'll add an "s," and you'll see these are still judged not the same, because I can like something, say kale, because it's healthy, but not enjoy eating it, so it's also false. But if you change it so the questions are "do they enjoy eating the desserts" and "do they like eating the desserts," then it gives you true. So this is just an example of week four and the programming assignments you'll be building.
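Here is a minimal sketch of the idea behind that duplicate check: encode both questions with the same encoder, compare the vectors with cosine similarity, and apply a threshold. The encoder below is a deliberately crude bag-of-words stand-in; the course trains a real neural siamese encoder, but the shared-encoder, similarity, and threshold structure is the same.

```python
import numpy as np

# A minimal sketch of duplicate-question detection in a siamese-style setup.
# `toy_encode` is a crude hashed bag-of-words stand-in for a trained encoder.

def toy_encode(question, dim=64):
    """Map a question to a fixed-size vector by hashing its words into buckets."""
    vec = np.zeros(dim)
    for word in question.lower().split():
        vec[hash(word) % dim] += 1.0
    return vec

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def is_duplicate(q1, q2, threshold=0.7):
    """Both questions go through the SAME encoder, then we threshold the similarity."""
    return cosine_similarity(toy_encode(q1), toy_encode(q2)) >= threshold

print(is_duplicate("do they enjoy eating the desserts",
                   "do they like eating the desserts"))   # likely True with this toy encoder
print(is_duplicate("do they enjoy eating the dessert",
                   "do they like hiking in the desert"))  # likely False
```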
In conclusion, we designed the specialization not only to show you how to use APIs but also how to build them from scratch. Most assignments could be turned into consumer or enterprise products, and the assignments are strategically designed to build up to the state of the art, like GPT-3, which is currently state of the art. You now have the tools and all the necessary equipment to start building any products you want and hopefully turn your ideas into fully fleshed-out products. Thank you. Thanks very much, Younes, a fantastic intro to the courses. With that, we are going to wrap up this event. If you'd like to learn more about the specialization Younes was just talking about, check out the link in the description below. Course four, which Younes mentioned, launches in September, and we'll be sending a follow-up email to everyone who attended today. We'd love to hear how we can make future events better, so we're offering a 50% discount code for any of our online courses to the first 200 people who submit qualifying responses; keep an eye out for that email and make sure to send your feedback. As we mentioned before, if you're looking to stay up to date on NLP topics, check out The Batch newsletter from deeplearning.ai. Thanks again to all the panelists who joined us today; to everyone else, keep learning and stay safe. We'll see you next time.
Info
Channel: DeepLearningAI
Views: 42,973
Id: SzAmGg2TVBg
Length: 93min 45sec (5625 seconds)
Published: Wed Jul 29 2020