AI Robotics for the Real World

Video Statistics and Information

Captions
Hi, I'm Pieter Abbeel, and I'm excited to be here to share with you some of the latest advances in artificial intelligence and robotics. I've spent the last twenty-something years as an AI robotics researcher, educator, and entrepreneur, and a lot has happened in AI robotics in the last few years. In this session I want to frame things around the question of when and how AI robotics will start making it into our world.

Now, if you watch movies, you might have seen the Hollywood version of this. Maybe you've watched the Terminator movies or Westworld, or maybe something more positive and more aligned with how I think and hope it'll go: Rosie the Robot from The Jetsons. Some of you might be thinking: hold on, wait a minute, don't we already have robots? Don't robots build our cars, for example? And indeed, robots actually build our cars, and it's amazing to see that in action. But in this session I want to make a distinction between traditional robotics, as we see in car factories, where robots go through dedicated, pre-programmed motions over and over and over, and AI robotics, where a robot has to look at what's in front of it and make decisions based on what it's seeing. So the question I'd really like to think about here with you is: when will AI robotics make it into our world? Robots that don't just repeat motions, but see what's around them and react to it.

Now, spoiler alert: I think that AI robotics, very quietly, much under the radar, actually made it into our world in 2020. What we're watching here is a robot doing order picking at a warehouse in Berlin. To do this successfully, the robot has to see what's in front of it, understand what it means to pick one item at a time out of that blue bin, and then place it into the outbound shipping box. This is AI robotics in action, helping fulfill orders when you order something online.

But that was jumping ahead a little bit. Let's take a step back for a moment and think about the recent fundamental breakthroughs we've seen in AI robotics, why operating in the real world is still so difficult, and what the path forward is.

Let's take a look at what's on this slide here. What do you see? You see a picture of a person; you might even recognize them. But what does a computer see? What if you want a computer to understand what's in the image? Well, the computer sees a bunch of numbers, and so for a computer to understand what's in that image, it somehow needs to process those numbers and turn them into some conclusion. Now, if you're an engineer trying to build such a system, you might write: if this number is bigger than that number, then it's a person; otherwise, maybe it's a coffee mug; and so forth. People tried that for a long time, but these more traditional programming approaches don't really work for this kind of computer vision problem.

So what does work? It turns out that the approach that works is much inspired by the human brain. Now, to be fair, we don't really know exactly how the brain works, but a lot of inspiration is still drawn from it. The way it's done today is that when an image comes in, it gets processed by a neural network, shown in the middle of the slide here. The first layer of this neural network takes the pixels of the image, then passes a processed version of that information on to the next layer, and the next, and the next, and finally outputs a decision. In this case we would hope the neural network fires on "dog," because there's a dog in the image. But what will it fire on? What determines that? It turns out it depends on how the network is wired together and how strong those connections are; the strengths are often referred to as the weights on the connections.
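To make that concrete, here is a minimal sketch of the forward pass: a tiny fully connected network, with made-up layer sizes and class names, that turns pixel numbers into class scores. This is purely illustrative and not the actual networks discussed in the talk.

```python
# Illustrative sketch only: a tiny fully connected network mapping
# an image's pixel values to class scores. Sizes and classes are
# made up; a real vision network is far larger and convolutional.
import numpy as np

rng = np.random.default_rng(0)

# Random "wiring": weight matrices for two layers. In a trained
# network, these numbers are what encode the knowledge.
W1 = rng.standard_normal((16, 64)) * 0.1   # pixels -> hidden layer
W2 = rng.standard_normal((3, 16)) * 0.1    # hidden layer -> 3 classes

def forward(pixels):
    """Pass pixel values layer by layer and return class scores."""
    hidden = np.maximum(0.0, W1 @ pixels)  # ReLU activation
    return W2 @ hidden                     # one score per class

image = rng.random(64)                     # stand-in for an 8x8 image
scores = forward(image)
classes = ["car", "cat", "dog"]
print(classes[int(np.argmax(scores))])     # what the network "fires on"
```

With random weights the output is essentially arbitrary; what follows is how the weights get set so the output becomes meaningful.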
So then you might say: okay, what we need to do is just wire up this network so that it makes the right decision. But that's actually really hard to do by hand. In a realistic network there will be millions, if not billions, of these connections, so you can't go in and rewire it by hand until you have a solution. So how do we find the right wiring of this network, such that the calculation happening in the network is indeed giving us the answer we want? It turns out this is done through machine learning. More specifically, you first collect a bunch of example images: in this case, images of cars, images of cats, images of dogs. Then you feed these example images into the network. So in this case a dog might be fed in, and the network will process it. But initially we don't know how to wire up the network, so it might not make the right decision; it might say "cat." If it says the wrong thing, that's okay; it's still being trained. A backward propagation happens in the network that rewires the connections so that it now better understands that this was a dog. Then we keep repeating this over and over and over, and if you feed it enough examples (and that's typically a lot of examples), at some point something magical happens: it starts understanding what's in those images. Then, when you feed it a new image, maybe of a cat, a car, or a dog, it will actually understand what's in that image.
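Here is an equally minimal sketch of that training loop, using a toy dataset and a single linear layer: the network guesses, is compared against the true label, and its weights are nudged so the right answer becomes more likely. Again, this illustrates the idea only; it is not the real systems described in the talk.

```python
# Illustrative sketch of the training loop: show labeled examples,
# compare the guess to the true label, and nudge the weights
# (a backpropagation-style update) toward the right answer.
import numpy as np

rng = np.random.default_rng(1)
n_classes, n_pixels, lr = 3, 64, 0.5

# Toy dataset: each class is "images" clustered around its own template.
templates = rng.random((n_classes, n_pixels))
X = np.vstack([t + 0.05 * rng.standard_normal((50, n_pixels))
               for t in templates])
y = np.repeat(np.arange(n_classes), 50)    # true labels

W = np.zeros((n_classes, n_pixels))        # initially knows nothing

for step in range(200):                    # repeat over and over
    scores = X @ W.T
    p = np.exp(scores - scores.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)      # softmax: network's beliefs
    p[np.arange(len(y)), y] -= 1.0         # error signal vs. true label
    W -= lr * (p.T @ X) / len(y)           # rewire toward correct answers

print((np.argmax(X @ W.T, axis=1) == y).mean())  # accuracy climbs toward 1.0
```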
Now, you might wonder how big a deal it is to use these neural networks for understanding what's in images, and how it compares to what came before. Well, this is a long-standing problem in computer vision, and there is a worldwide competition called ImageNet. In ImageNet, the organizers have a secret stash of images; nobody else has access to them. To participate in the competition, you send in your computer program, the organizers run it on all their images, and they report back your program's error rate. You can see here that the best program sent in in 2010 had an error rate of 30 percent at recognizing what's in those images, so 30 percent of the time it made the wrong decision. With traditional computer vision, this was not getting much better in 2011 and 2012. But then something really special happened in 2012, out of Geoff Hinton's lab at the University of Toronto: it was shown that instead of using traditional computer vision, if you use deep learning, where a deep neural network is trained to recognize what's in images from many, many examples, as we saw in the previous slide, you can actually cut the error rate in half. This was a massive breakthrough. People realized that this new approach was becoming the way to go, and as you can see, in subsequent years of the competition people switched to deep learning approaches and the error rate quickly decreased. In fact, by now this competition has been retired, because human-level error rates have been achieved. To be clear, there is no computer vision system yet that can recognize everything as well as humans, but within the confines of this competition human-level error rates were achieved, which was a lot of progress very, very quickly compared to what was happening before.

Understanding what's in front of it is the first thing for a robot to do, but the next thing is to act and try to achieve some goal. Typically the way this works is that the robot sees what's in front of it and takes an action; the world changes as a consequence; the robot observes that again, understands what's in front of it, and takes another action; and it keeps repeating this until it ultimately achieves its goal. This is not just for robots; it's really for any AI system that needs to achieve goals: there's this repeated process of observing the world and taking action based on the current situation. There's a field within AI called deep reinforcement learning, which is concerned with how you can make robots, and more generally AI systems, learn to make decisions. Probably the most famous result you might have heard of coming from deep reinforcement learning was AlphaGo, out of DeepMind. AlphaGo was the first computer player to beat the human world champion in Go. This was a big surprise; people had not predicted this happening anytime soon, but thanks to deep reinforcement learning it was all of a sudden possible in 2015. DeepMind also applied the same ideas to enable a computer to learn to play video games. And while DeepMind was working on deep reinforcement learning for video games and the game of Go, in my lab at Berkeley we were working on the same kinds of ideas, deep reinforcement learning, but to see if we could enable robots to learn new skills.

Now, so far in this session we haven't really seen any learning in action. I've talked a lot about the results of learning, but I haven't shown you learning happening. So let's take a look at what happens during learning. We have a robot here, and we want this robot to learn to run. The robot is controlled by a neural network, and we don't know how to wire that network up; it has too many connections for us to wire up by hand. So when we start it off, it's randomly wired, and it's not going to know how to make the right decisions; in fact, it's going to fall over. But it introduces some randomness into its decisions, and sometimes it'll be a bit more lucky and fall over just a little bit later. When that happens, it does a backpropagation in the network, rewiring it to make that lucky outcome more likely to happen again. That's what we're seeing in action here, and this process repeats over and over. Over time, in this case after 2,000 backpropagation updates to the network, it has actually learned how to run.
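That "reinforce the lucky outcomes" idea can be sketched on a toy problem that is far simpler than a running robot: a four-action bandit with a REINFORCE-style update, an illustrative simplification rather than the actual locomotion algorithm. The policy starts out effectively random, tries actions with some randomness, and nudges its parameters so that actions that happened to earn more reward become more likely.

```python
# Illustrative sketch of "reinforce the lucky outcomes" (REINFORCE
# on a toy bandit, not the actual robot-locomotion setup).
import numpy as np

rng = np.random.default_rng(2)
prefs = np.zeros(4)                  # policy parameters, initially uniform
true_reward = np.array([0.1, 0.3, 0.9, 0.2])  # unknown to the agent
lr = 0.1
baseline = 0.0                       # running average of reward

for episode in range(2000):          # many trials, like the 2,000 updates
    pi = np.exp(prefs) / np.exp(prefs).sum()   # softmax policy
    a = rng.choice(4, p=pi)                    # act with some randomness
    r = true_reward[a] + 0.1 * rng.standard_normal()  # noisy outcome
    baseline += 0.05 * (r - baseline)
    grad_logpi = -pi                           # d log pi(a) / d prefs ...
    grad_logpi[a] += 1.0                       # ... is onehot(a) - pi
    prefs += lr * (r - baseline) * grad_logpi  # make lucky actions likelier

print(np.argmax(prefs))              # converges to the best action (2)
```

The same loop structure (try, observe reward, nudge the policy) is what scales up, with much bigger networks and much richer action spaces, to the locomotion results in the talk.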
Now, the really interesting thing here is that the same algorithm that trains a two-legged robot to learn to run can then be run on another robot, a four-legged robot, and it learns to run. We can run the same code on the two-legged robot again and it learns to get up, and in fact we've run that same code on the Atari games we saw earlier and it will learn to play the games. For this robot we're seeing here, learning to run is one thing, but we've actually been able to have it acquire a very wide range of skills, as you're seeing here. At this point, this becomes very useful for, let's say, the design of video games or the making of animated movies.

Now, how about real robots? What we're watching here is BRETT, the Berkeley Robot for the Elimination of Tedious Tasks, and BRETT is learning to put the block into the matching opening. In about one hour of trial-and-error learning, BRETT figures it out and puts the block into the matching opening. At first you might say an hour is long: I can do it on the first attempt. And I bet you can, but keep in mind that this robot is starting from scratch; it has not learned anything before, whereas when you try it, you've done many, many similar things before in your life, so it's a variant on something you've done in the past; maybe you've even done this specific thing in the past. This robot starts from nothing, and in one hour it figures it all out, learning a vision system and a control system to solve this problem.

The postdoc who led this work in my lab then went on to Google and scaled this up. He reasoned: if the robot learns from its own experience, then the more robots we have, the more data we can collect and the more experience there is to learn from. And indeed, it turns out the more robots you have, the faster learning can happen. Why? The robots can share their data, and they can also share their neural networks, so-called fleet learning, and the larger your fleet, the faster the learning can happen; in this case, learning to pick up objects. Somewhat recently at OpenAI, it was shown that you can use the same ideas, deep reinforcement learning, to have a robot hand learn to solve the Rubik's Cube. That's a very hard task: in-hand manipulation is a very high-dimensional problem with complicated contact forces, yet they showed it is possible.

We've just covered a wide range of amazing breakthroughs in AI robotics, where robots are seeing, acting based on what they see, and learning on their own to complete a lot of these tasks. But everything we saw was either in simulation or in a lab environment. How about taking this into the real world? There's still a pretty big gap between what we're seeing in research demos and what's needed for the real world. When we think about research progress, we think about doing something that's never been done before, going zero to one, and once we've done that, maybe we try to improve it a bit, make it more reliable. But often, once we have a 70 percent success rate, we move on to the next thing, the next thing that can go zero to one. And that's okay; that's what research is all about: showing things that have never happened before. But for the real world, that's not good enough. If you want to put an AI robotic solution into the world, it has to be reliable, and not even 90 percent is good enough.

Let me double-click for a moment on "not even 90 percent is good enough." When you put a robot in the real world, the economics of that robot matter: you want this robot to create value. Now, let's say your robot is 90 percent reliable. If it operates at 500 picks per hour, that means it makes an error roughly every minute. What good is a robot that makes an error every minute? Since fixing up errors often takes a lot more time than just doing the work, you might actually need two humans to manage this one robot, and that doesn't make much sense. At 95 percent, the errors are about two minutes apart, so now maybe one human is enough to manage the robot, but in terms of creating economic value that's still probably a loss. But if we can go above 95 percent, we can start creating real value. For example, if we can hit 99 percent reliability, the errors are about ten minutes apart, and one human can start managing several robots; and of course, the higher the reliability, the larger the robot fleet one person can manage. The numbers I used are just an example, and the details of the math will of course vary a bit across use cases, but the absolute need for high reliability in the real world, that right there, is the fundamental gap between the lab and the real world.
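The back-of-the-envelope math here is easy to reproduce, assuming the illustrative rate of 500 picks per hour mentioned above:

```python
# The reliability arithmetic from the talk, with the illustrative
# rate of 500 picks per hour (the numbers vary across use cases).
picks_per_hour = 500

for reliability in (0.90, 0.95, 0.99):
    errors_per_hour = picks_per_hour * (1 - reliability)
    minutes_between_errors = 60 / errors_per_hour
    print(f"{reliability:.0%}: ~{minutes_between_errors:.1f} "
          f"minutes between errors")
# 90%: ~1.2 min, 95%: ~2.4 min, 99%: ~12 min between errors
```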
Now, if you're familiar with a lot of the recent progress in AI, especially deep learning, you might say: okay, we need to go beyond 90 percent; naively, we just set up a larger network, collect more data, train it longer, and it'll get better. And indeed, on many research benchmarks that is exactly what happened. But for real-world problems there are some additional complications. Let me highlight the subtleties in going beyond 90 percent for real-world problems. First, we cannot ignore the long tail of real-world scenarios. Second, we cannot ignore that the real world is always changing. Third, we cannot ignore that it's important to know when we don't know. Let me double-click on each of those for a moment.

What do I mean by the real world having high variability, a long tail? What's different from research? Well, in research, let's say you have a benchmark, and maybe the benchmark is image recognition with a thousand categories. If you want to do better at recognizing those thousand categories, you can collect more images of them, and at some point you have so much data that you've characterized each category and your system does really, really well. Now, if we go to the real world, there aren't just a thousand categories; there are millions of categories. Some of these categories have objects that are transparent, there's unknown variation happening within each category, and there's a long tail of things that don't occur very frequently individually. But there are so many of these infrequent items that together they add up to the robot encountering them a lot, and it's very important to understand them.

Second, in the real world things are always changing. Compare that to the scenario on the left here: you see a robot learn to run, a very exciting result, but the ground is always flat; nothing's changing. On the right, you're in a warehouse; you're opening a box that just arrived and you have to decant it. What's in there? How is it packed? It's always going to be something new, packed differently, so you need to adapt on the fly and handle whatever you're dealt, and that requires new research.

Another thing that happens in the real world is that you can encounter things that are just ambiguous, where you just don't know what to do. It's very important, if you're going to deploy AI robotic systems, that they don't just make their best guess at something. When something is ambiguous, something they can't understand, they should take pause and maybe call back to a human operator who might be able to give some feedback and help the AI robotic system decide what to do.
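A minimal sketch of that "know when you don't know" behavior: act autonomously only when confident, and escalate ambiguous cases to a human. The threshold value and the escalation hook are illustrative assumptions, not a description of any particular deployed system.

```python
# Illustrative sketch: abstain and escalate instead of always
# acting on the best guess. Threshold and hook are assumptions.
import numpy as np

CONFIDENCE_THRESHOLD = 0.90   # illustrative; tuned per application

def decide(class_probs, actions, ask_human):
    """Act on confident predictions; escalate ambiguous ones."""
    best = int(np.argmax(class_probs))
    if class_probs[best] >= CONFIDENCE_THRESHOLD:
        return actions[best]              # confident: act autonomously
    return ask_human(class_probs)         # ambiguous: pause and ask

actions = ["pick item", "skip item"]
escalate = lambda probs: "escalated to human operator"
print(decide(np.array([0.97, 0.03]), actions, escalate))  # acts
print(decide(np.array([0.55, 0.45]), actions, escalate))  # escalates
```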
Now, with this new knowledge in mind, it might not be too surprising that even though we've been promised self-driving cars for many, many years, it's still a big challenge ahead of us to make that happen.

But let's circle back to our original question: when and how will AI robots make their way into our world? Well, as I briefly alluded to at the beginning of this session, what we're watching here is the beginning of the next generation of robots: AI robots helping us out, in this case helping with order fulfillment in a warehouse. When you order something online, that item has to be retrieved from the warehouse and packed. This robot is picking it out of the blue storage tote and putting it into the shipping box that then gets shipped off to whoever ordered the item. You can see the robot diligently picking one item at a time and fulfilling orders. This is a very hard problem: the robot has to see what's there, and it will be faced with new situations, things it has never seen before, yet it is expected to reliably pick one item at a time. And this started becoming possible in 2020.

What we're watching here is actually powered by what we call the Covariant Brain, built at Covariant, a company we started about three years ago. The press has been following the Covariant efforts quite closely, but this is all part of a much bigger trend. Big companies like Amazon and ABB are pursuing robotic order fulfillment, and many, many startups in addition to Covariant are pursuing this challenge. I see this all as part of a big shared vision that the starting point, the entry point, of AI robotics into the real world is going to be robotic order fulfillment in warehouses, and it has just started.

Now, of course, it's very, very exciting to have robots starting to fulfill orders in warehouses, but in wrapping up here I want to zoom out a little bit and think again about the technology under the hood that we talked about. What's powering these robots? Well, as you might remember, these robots are trained: they learn to see, they learn to react to what's around them, and they keep training and getting better over time. These ideas are not specific at all to order fulfillment; in fact, they are very general, and I anticipate that in the near future we'll see these same AI robotics ideas power many new applications. Next might be farming, or maybe cooking, or maybe the manufacturing of clothing or electronics, or maybe recycling. I think we're at the very beginning of AI robotics having a very big, real impact on our lives.

Circling back to the very beginning, where I mentioned Rosie the Robot from The Jetsons: of course, you might still be curious when we're going to get a Rosie the Robot to help us in our homes. Well, our homes have a lot less structure than the application domains I've described so far, so I do think it's a harder problem, and it's still quite a way out before we'll see that happen. That said, the AI robotics technologies we covered in this session, those same technologies, I think, will carry over and will be at the core of what powers our future home robots.
Info
Channel: Covariant
Views: 5,329
Rating: 5 out of 5
Keywords: AI, Robotics, Artificial Intelligence, Machine Learning, Deep Learning
Id: AAr99hQ64AI
Length: 23min 57sec (1437 seconds)
Published: Fri Mar 05 2021