The truth about AI and why you should learn it - Computerphile explains

Captions
(atmospheric music swells) - This is a fascinating story we have for you of a senior Google engineer who says one of the company's artificial intelligence systems has become a sentient being. - He believed one of the company's artificial intelligence chatbots had become sentient. - Engineer Blake Lemoine, says a chatbot project he was working on called LaMDA, can express thoughts and feelings equivalent to that of a child. - Google has rejected claims that one of its programs had advanced so much that it had become sentient. - That's I think the big issue, right? Is that a lot of people get bogged down deciding, well, what does sentient actually mean? - Hey everyone, it's David Bombal back with a very special guest. Mike, welcome. - Oh, thanks for having me. - So, Mike, I've seen a lot of your videos on YouTube's Computerphile, millions of views on some of the topics that you've done, but can you just introduce yourself to the audience for people who might not have seen those videos or don't know what you're doing, 'cause you were telling me you're offline, YouTube isn't your main thing; you do more than that. - That's right. Yeah. So actually, in some sense, YouTube is an aside for me. It's just something I did 'cause I thought it would be fun. I'm an academic at Nottingham, Associate Professor, and I work teaching security. I research teaching AI and computer vision. It just happened that we have some ties at Nottingham to Brady and Sean who do things like Numberphile and Computerphile. And so Computerphile was kind of fledgling, (indistinct). It had, it was a bit established when I, when I started doing them and it kind of just took off really, I think because I did topics on security and AI and things, people thought those were interesting and so I get a lot of views on those now, but it is still a bit peculiar when people say hello to me, because I just turn up and do normal things that the rest of the time. - [David] So you get stopped on the street and- - It has happened. - Yeah. My wife is never impressed when that happens. She just thinks this is ridiculous. But you know, it happens from time to time. I do really enjoy it. And I get lots of emails from people saying, "Thanks for your videos. I enjoy them." And that's why I do it. That's what it's for, for me. - And I loved what you said offline that you meet, someone said that they started computer science because of you that's what's they- - I've had a couple of emails like that and that's, those are the best emails, right? 'Cause I want people to learn about computer science. I love computers. I'm a massive geek. Basically, I program for fun. And the more people do that, the more it's a win for me. So, you know, if I can encourage a few people by doing videos, that's what I really want to do. - That's fantastic. I, one of the videos I watched, obviously in preparation for this interview is this recent video that you've put out about AI. - Yeah. - And that's gonna be the topic that we wanna talk about today. So let me lead with this. I get these kind of emails all the time. "David, is it worth me studying cybersecurity?" "David, is it worth me studying computers because AI gonna take all the jobs away?" And I think movies over the years, like, you know, there's been so many of these movies where the robots take over and, this, can you talk about this? And you can go into the details if you like, but I think this sort of recent event that you spoke about in your video, hasn't helped the conversation at all. 
So can you tell us about that? Yeah? And what your thoughts are about, you know, what happened. - Yeah. So no, I absolutely agree that it didn't help the conversation at all, okay? I think in my video that was kind of what I tried to end with, basically: you know, in some sense the nuances of where this AI is don't interest me that much. All I know is it's not where they're suggesting it is, at least that's, you know, that's what I think. I think, I suppose at the moment AI is very application driven, right? So a lot of it is supervised. There are other ways of doing it, but a lot of it's supervised, which means that you have some kind of training set with some inputs and some outputs that you're trying to get the model to learn. And then you just train the model until that happens. That can work really, really, really well. And so for my own research, I do this on things like image segmentation, where I'm trying to find objects in images, and, you know, medical image segmentation and things like this. But, you know, in practice, if I then take that network and try and run it on street scenes, it won't work because it's not trained on street scenes. It doesn't know what they are. It hasn't got any ability to go, "Oh, it's a street now," you know, and take what it's learned somewhere and apply it somewhere else. You know, retraining a network is really the only way to do it. And that involves even more data, right? So I don't think at the moment it's realistic to suggest that there's going to be some general intelligence that can just do all of our jobs, right? You know, you've seen GitHub Copilot that just produces text, code? And sometimes it will produce a useful function and sometimes it'll produce a function full of bugs that you've gotta then spend time fixing. And have you actually saved any time? I dunno; the jury's out, I think. So I wouldn't worry at the moment. I'm not worried. I mean, maybe designing things to replace myself is a huge mistake, but I don't think we're there yet. - So I mean, tell us, just for people who haven't seen it, haven't seen your video and like haven't read perhaps what's going on. There's this Google person now- - LaMDA? - Yeah- - Yep. - So what is LaMDA and what was he basically saying? - Google LaMDA is a, it's what we call a large language model. So it's basically a very, very large neural network designed in a certain way. They're all designed in a very similar way and it has more parameters in it than we've ever seen really in a model, right? GPT-3 is also very, very big. And so really what this brings to the table is not so much something new that we've never seen before in AI. It's just, it's this huge, you know, orders of magnitude bigger than the kind of networks I would use to do complex imaging tasks. And what they've basically done is they've trained this model to read a sentence and then predict what the next word will be. And so you could imagine that if you wanted to do this by hand and you had infinite resources, you could just look at every sentence that's ever been written by humans and work out for any given, let's say, 10 words, what the next word will always be. And if you did that and you had that list of all the possible inputs, you'd do pretty well at generating sentences because at the end of the day, that's what people say, right? This is, you've got it on record as what they've said in the past, you can just say those things again.
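To make the "predict the next word" idea concrete, here is a minimal sketch in Python. It uses a toy frequency table built from a made-up ten-word corpus rather than a trained transformer, so it only illustrates the shape of the task, not how LaMDA or GPT-3 work internally.

```python
from collections import Counter, defaultdict

# Toy next-word prediction: count which word follows each word in a tiny
# made-up corpus, then always suggest the most frequent follower.
corpus = "the cat sat on the mat the cat ate the food".split()

following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word):
    # Return the word most often seen after `word` in the training text.
    counts = following.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))   # 'cat' (seen twice after 'the')
print(predict_next("sat"))   # 'on'
```

A large language model replaces this lookup table with a learned function over billions of parameters, but the input (some words) and the output (a plausible next word) are the same kind of thing.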
And so this model is one of those, this model is one where you put in some sentences. So you might put in a sentence that says, what do you think about quantum physics? And then what the model will do is predict the next likely word. And it will probably say, well I'm gonna start by saying, "What I think is," and then generate some plausible text on quantum physics because people have written about quantum physics before and that data is in the training set. What it hasn't done is learned what quantum physics is or connect it to an internet resource that has information on quantum physics and looked it up. So, in some sense, it's a bit like the, you know, in Star Trek, you've got the computer you can talk to and they often ask the computer to do things like, you know, put the shields up or whatever. - Computer dim lights. - It's like that, but it's not connected to any kind of anything on the ship. So it just talks to you and talks back, but it never actions anything. It never, it never has, you know, it's just going from the text in the training set. And I think that's something that's perhaps lost a bit in when it's discussed. It's not connected to anything. It doesn't even have a memory, basically, and so it can't reflect on past experience because it has no place to store past experience, it has no record of those events. And so when it produces sentences, that look really, really interesting, they're actually just really interesting sounding sentences, you know? And I think so anyway, I mean, if I sort of, I digress slightly, but you know, in this particular case, what happened was, someone from Google, who I think was in the ethics department, I don't think he was actually responsible for developing this AI, basically said, "Look at this chat I've had with it. Don't you think it's sentient?" basically is what he said. And the answer I think for me, and pretty much everyone who understands these models was, "No, no it's not." And what, and I think the thing that bothered me most about it was not particularly one person saying this 'cause he's very entitled to his opinion, right, you know, I think it was that the media took it massively seriously and it was all over everywhere: "Is this the next thing?" And that bugs me somewhat because I don't think it helps the conversation like you say, right? People start, people who don't know what a big language model is, are gonna be a bit worried about this. And there's really no reason at this time to be worried. And that bothers me slightly, which is why I do my videos to try and tell people about it. - I mean, the problem is the movies predict this happening. And then people see this stuff in the news and it's like, "It's the end of the world, and end of my job. Arnold and the robots are gonna take over." So it really doesn't have to be like, it doesn't, it really doesn't help. And I like what you said, I mean, in your video, which I'll link below, you said something that I thought was hilarious. You said, "Can Python functions get lonely?" - Yeah. - Can you explain what you were saying by that? - Yeah. So the, one of the comments in the original chat transcript between this researcher and his friend, and his colleague, and this LaMDA AI, was, "Do you get lonely?" And it's spouted off a whole paragraph about how lonely it is. And it doesn't make any sense because it's a function call, right? 
So you put in your words at the top, it runs what is essentially a big transformer network, which is pre-trained on all this data, and then it spits out words at the bottom, which you read, and then it stops executing, right? Because there's no kind of ongoing process like there is in my, I mean, I like to think that when I'm not immediately saying something to you, I'm still, there's something going on in there, right? Maybe, I mean, you know, I can't prove it to you, but this is not the case, you know? And it's just like, when you run, I mean, I made a joke about it, but when you run, you know, reverse string in Python, you don't worry that it gets lonely the rest of the time because it's not executing. That's just some code that executed. It finished executing and it just lies dormant in memory doing absolutely nothing of interest. And that's, for me, kind of what this model is doing. If they developed a model that was always on in some way, like maybe it was always doing something and it had memory and it had storage, I could, I still probably would think I would need some convincing that it had any kind of, you know, higher level thought process, but at least it would be plausible, you know? It would sort of think, "Well, at least it's got something going on in there," but I don't think it's designed that way. It's designed as a very, very big reverse string. And, you know, I don't worry about those things being sentient. - Yeah, but it's crazy. I mean, I mean the, well, I mean in my opinion, because they, kind of, they were implying that this AI or whatever was like a human, or equivalent to a human, and it seems like that's quite a stretch, but in, you know, popular culture, that's what people equate to AI it seems. - Yeah. That really, that's I think the big issue, right? Is that a lot of people get bogged down deciding, "Well, what does sentient actually mean?" And that doesn't interest me because when anyone uses the word, they're not using it in a different definition, they're using it in a definition we think of as like Terminator from Skynet, you know, this researcher wasn't saying, "Oh, I think it's sentient, but I define sentient as something like a slightly convoluted if statement," right? He was saying, "I think it's like a person, and it's got memories, and it's got experiences, and it gets lonely, and it needs a lawyer."- - Feelings, and that's not - Yeah. And, without with any, with zero evidence to support this and indeed not so much evidence is just, it doesn't even make sense. So I think you have to be extremely careful using the word sentient, not because you might have a different definition, but because everyone has the actual same definition, right? Which is actual, you know, human level cognitive ability, but, you know, which, so I don't spend a lot of time worrying about what the definition of sentience is, because if I go to someone in a conversation and say, "This is sentience," I think we both understand implicitly what that means to me to say that. And so I don't, I think that arguing about the definition is a bit silly because we actually all secretly agree on the definition. - Yeah. I mean, I think for the general population, I mean, I'm not into the AI piece like you are, and that's why you, I wanna talk to you about it. You know, I just think people go off movies and popular culture. That's sort of what people, that's the impression they get and that's why it was so big on the news perhaps. 
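For what it's worth, the "reverse string in Python" Mike jokes about really is just a function that runs, returns, and then does nothing at all until it is called again, which is his point about LaMDA being one (very large) function call rather than an ongoing process:

```python
def reverse_string(s):
    # Runs, returns, and then nothing is "going on in there" until it is
    # called again - there is no memory and no background process.
    return s[::-1]

print(reverse_string("LaMDA"))  # 'ADMaL'
```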
But can you explain AI versus machine learning and like, what is machine learning? What is AI? And perhaps just take us down the road now. - Yeah. Okay. - Like teach us sort of the basics of this stuff. - Yeah. So AI is misused in the sense that it's now a catch-all. And I will admit, I do that to an extent myself. And it's partly because I'm lazy, right? I think it's because it means I don't have to define the exact thing that I'm doing. At any given time, car engines are slightly different, but they all, I mean, combustion engines all do much the same thing, even though one's got more cylinders and one's got fewer cylinders and one has a turbo and one doesn't, you don't, well, I don't go on about those details. I just say, "I've got a car and it goes." So AI, I think, is a catch-all that includes machine learning. So you've got AI as a big kind of thing of stuff with loads of stuff in it. And even my maze solving video where I just do very simple looking around the corridors of the maze would be defined in some sense as AI, right? But the Dijkstra algorithm that we use to do network routing and things and other similar algorithms, you could define them in some ways as AI, because they adapt to messages coming in and they change weights and paths and things. But we wouldn't go as far as to say they were, you know, anywhere, you know, "smart" in some sense, right? So I think AI is quite a broad term. And then there are things like genetic algorithms, evolutionary algorithms, which do slightly different things. They are arguably less popular, or less prevalent perhaps would be the right way to put it, but they also come under the umbrella of AI. So AI is a very big umbrella term, which basically encompasses most things where you could imagine it was sort of intelligent. And then in that, you've got machine learning. And machine learning is just the idea that you want to try and program a computer without having to program it, essentially. You wanna give it some input examples or some other mechanism from which to learn, and it comes up with its own rules for what it's gonna do. So a decision tree is a good example of a very simple, conceptually simple machine learning approach, where you have some kind of data and every time you make a decision, you just split it in two. So maybe you're trying to analyze financial data to decide whether people get a new credit card, right? So the first decision you make is "Have they ever defaulted on a credit card?" Yes goes this way, no goes this way. And then the next decision is, "Okay, what's the current credit limit?" If it's above 7,000 it goes this way, below 7,000 goes this way. And you just split this data into two, and two, and two, until at the end, you get the actual nodes that have the decisions on, right? And it's machine learning, because what you can do is you can basically create this tree, but actually change the numbers and values in it and the decisions based on the data. So you can say, "Well, actually, maybe 7,000 doesn't work that well. We're gonna have it at 6,500," and change the thresholds and things. And you can do this all automatically in the training process. So that's the kind of thing we're talking about with machine learning. Now what happens of course, is there's a big push in deep learning, which I can also talk about, but- - Yeah, that'd be great.
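A rough sketch of that credit-card decision tree, using scikit-learn. The feature names, the six training rows, and the labels below are invented for illustration; the point is that the split values (like the 7,000 limit) are learned from the data during training rather than hand-written as if statements.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Made-up training data: [ever_defaulted (0/1), current_credit_limit],
# label 1 = approve a new card, 0 = decline.
X = [[1, 2000], [1, 9000], [0, 3000], [0, 8000], [0, 12000], [1, 500]]
y = [0,         0,         1,         1,         1,          0]

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)

# The learned split thresholds come out of the data, not out of our heads.
print(export_text(tree, feature_names=["defaulted", "credit_limit"]))
print(tree.predict([[0, 6500]]))  # decision for a new, unseen applicant
```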
- Yeah, but- - 'Cause I mean, we just hear the, I just hear these buzzwords, I mean, preparing for this interview, just like buzzword after buzzword after buzzword. And I think a lot of us, you know, who are not in this sort of field, but are interested in it. So yeah, if you can define as much in like the- - Yeah. Yeah. Sure, so- get rid of all the, rubs that among all the - The brunt, the fi- - So you've got, - Yeah, sorry, you've got AI, right? which is right here, in some subset of that is machine learning, which includes what I would kind of call traditional machine learning, like support vector machines, decision trees, random forests, right? These are all, linear regression even, where you are just fitting a line to some data. And then we have things like slightly more complicated like artificial neural networks, which it, they, I mean, they kind of take some inspiration from our brains, but I would be very careful saying that, do you know what I mean? I think, you know, to suggest it's like our brain is iffy. And that's what- - Yeah, yeah. Sure. But people don't - keep in mind, because that's what they're called. And then what we've basically done recently is we've made them much, much bigger. And we've introduced other terms like convolutional networks and transformers and things, but for the sake of, you know, this sentence they're just bigger, deeper networks that can learn more impressive functions. So they can map that input to that output more effectively. 'Cause that's what you wanna try and do. You've got some data, you've got some predictions you need to make on that data. And your hope is that once you've trained it, some new data comes along and you can make some good predictions, right? I mean, I, let's think through an example, suppose I want to do MRI segmentation for medical imaging, right? So I have 50 patients, some of whom unfortunately have some kind of illness, some of whom don't, and I train the network to try and find the ones that have illness, my hope is that when I then sort of fix that network in place, and bring in some new patients, it will be able to say whether they have that illness or not. That's the idea. And will have done that by basically reconfiguring itself based on the examples I gave it to begin with. - So doing a technology example, it could be something like spotting, is this a virus or is it just- - That's exactly right. And in fact, you know, modern antivirus' will include some kind of machine learning element probably. So, you know, you might have features derived from, so what we usually put into the front of a network is something we call features, which is our way of just saying input data, right? So sometimes you've crafted those features like you've chosen, what you think is interesting features to give the network and sometimes you'll just shove something in, like, you know, an antivirus you could, you could choose things like how many system calls does it make or, you know, how many bytes is executable or how many of this particular character does it have in the executable? And you could choose those features 'cause you think they are indicative sometimes of malware or not malware. You could stick them in some kind of a machine learning approach with a load of examples and then say, "Right now, change your weights and change your rules, internally, so that on this training set, your prediction is as accurate as possible," right? And so let's say you do that. You have a hundred thousand malware and regular samples. 
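A minimal sketch of that hand-crafted-features idea. The feature values and labels are invented, and a random forest stands in for whatever model a real antivirus product might actually use; it only shows the pattern of "features in, prediction out".

```python
from sklearn.ensemble import RandomForestClassifier

# Invented features per executable: [num_system_calls, size_in_kb,
# count_of_suspicious_chars]; label 1 = malware, 0 = benign.
X = [[120, 540, 30], [15, 200, 2], [300, 900, 55], [8, 150, 1],
     [250, 700, 40], [20, 180, 3]]
y = [1, 0, 1, 0, 1, 0]

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# A new, unseen sample: the hope is the same "suspicious" patterns show up
# and the model flags it.
print(clf.predict([[280, 820, 48]]))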
You give it to your AI and you just over and over again, say, "Right, you got that one wrong, reconfigure yourself so that next time you get a bit better at predicting it." You do that over and over again, and the hope is then, that when a new virus comes along that you've never seen those same sort of shall we say suspicious things exist in it, and the network flags that up. That's the idea. - So the training data is like the, the stuff you give it initially, which would be this a hundred thousand like virus and not virus. - Yes. Yes. - And then you, when you say weight, you like, it's basically saying like, if it makes like a hundred system calls rather than 10, or you set some kind of threshold, is that right? - Yeah. So it, okay. So in a decision tree or something like that, that's what would happen. There would be some kind of threshold decision basis at some point during it. For a neural network it's a little bit more complicated. What you actually do is you treat all these weights just as numbers, and you just calculate mathematical functions based on those numbers. So what you might do is multiply all of those numbers that come in, by some weights, let's say you multiply one of 'em by two, and one of 'em by negative four, and one of 'em by a half. And then you add them all up, and what that does is take a different amount of each one, and then you repeat that process over and over again, to try and basically learn a complicated mathematical function. That's really the only thing it does. You know, you are essentially trying to fit a really complicated curve through the data, essentially, so that you can distinguish between real and fake malware, or, you know, regular executables and malware. And, and so the weight, when I say weight, what I'm really talking about is the parameters of my model, which influence its mathematical function. - So the, and then you would adjust the weights and the mathematical functions based on the result that it get, did it correctly determine that this was malware or- - Yeah, exactly. So, so let's suppose we were doing malware, right? So we think one, an output of one means it's definitely malware and an output of zero means it's definitely not malware. An output of 0.5 is not very useful to us because we don't know. What we do is we put in, a piece of malware or many pieces of malware, we run through let's say our deep neural network or whatever it is we are running and it produce a value between zero and one. And then we say, "Well, look, you gave us a value of 0.7, but actually it was malware this time, so you've got an error of 0.3. I wanted you to produce more 0.3 higher for that one than you did. So can you adjust your mathematical function to next time when I put that malware in produce a value of one and not a value of 0.7?" Now, if you do that for one malware sample, it's gonna be the worst machine learning ever, because you're just gonna give it something else and it's gonna go, "I dunno what you mean," right? Yeah, because it's this is nonsense. So you have to give it a lot of data. And, and I guess the, what you're trying to do is calculate the best average mathematical function that does the best job it can in the general case of all of these malwares, right? Massively optimizing one malware is not useful because it's not gonna generalize; it's not gonna apply in real world to some new malware. 
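The "multiply one of them by two, one by negative four and one by a half, then add them up" step is just a weighted sum, usually followed by a squashing function so the output lands between zero and one. A toy version, using those example weights from the conversation (the inputs and bias are made up):

```python
import math

def neuron(inputs, weights, bias=0.0):
    # Weighted sum of the inputs, squashed into (0, 1) by a sigmoid so the
    # output can be read as "how malware-like does this look?".
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))

print(neuron([1.0, 0.5, 2.0], [2.0, -4.0, 0.5]))  # a value between 0 and 1
```

Training is then the business of adjusting those weights so that, averaged over lots of examples, the outputs land closer to the labels.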
So you put in 10, 20, a hundred different malwares at the same time, and all of them are trying to go to one or go to zero, and you are trying to change the weights to simultaneously do all of those at the same time. That's what machine learning does, basically, for a neural network. The process for actually doing this, it's complicated to describe, but it's fairly intuitive. What you do is, because all these weights were involved in the calculation, you put your features in for your malware, you go all the way forward through the deep learning or the network, then you calculate your error, and then you go backwards adjusting the weights based on what you just found out, essentially. And so if a weight doesn't have any impact on the decision, because let's say it just sets everything to zero, you won't adjust that weight 'cause it's not useful. Because you're calculating the influence that each of these weights has on the error, you adjust all the ones that have the biggest impact. And so the network will kind of try and find its way towards a good function, you know, and we use a process called stochastic gradient descent often to train this. So what we're doing is we're picking random malwares and putting them in, and it will often get them wrong, right? Because it's never seen any of these things before. And so over time, maybe you just nudge it slightly in a better direction. And then over many thousands of looks it slowly converges on something that actually makes reasonable decisions. That's, you know, that's the idea. So it's a long process. - And is this what you would call supervised, or is it, this? - Yeah, this is definitely supervised. So supervised is where you have your labeled (indistinct), you know, we have labels for data. So we're putting our data in, we have some labels against which we can compare, and that means that we have some idea of how right or wrong the network is in any given case, and that's very, very useful. And despite what people might say, the majority of deep learning or machine learning is supervised learning because it gets results the quickest. If I want to detect some illness in MRI, having examples of that illness is gonna be much, much easier. - So, Mike, supervised learning, if I understand it right, is you giving it examples of, like you said, actual malware or actual, like in your MRI scans, problems, and then you're supervising that it got it right, and then you're correcting it? - Yes. And it makes things much easier, right? So the majority of machine learning is supervised because it is simpler and easier to do. If you work in applied areas like me, where you're trying to get things to work really, really well, if you work in industry, a lot of what you're trying to do is just minimize that error term. You're trying to get as close to good predictions for the majority of cases. So getting some examples is gonna get you to converge on that much, much more quickly.
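A compressed sketch of that forward / error / backward / update loop in PyTorch. Random numbers stand in for real malware features, and the layer sizes and learning rate are arbitrary; it is only meant to show where the forward pass, the loss, the backpropagation and the stochastic-gradient-descent weight update each happen.

```python
import torch
import torch.nn as nn

# Fake data: 100 samples, 10 features each, binary labels (1 = malware).
X = torch.randn(100, 10)
y = torch.randint(0, 2, (100, 1)).float()

model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.BCELoss()

for epoch in range(50):
    optimizer.zero_grad()
    pred = model(X)          # forward pass through the network
    loss = loss_fn(pred, y)  # how wrong were we (e.g. 0.7 against a label of 1)?
    loss.backward()          # backpropagate: work out each weight's share of the error
    optimizer.step()         # nudge the weights in a slightly better direction
```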
This is, you know, distinct from something like weakly supervised or unsupervised learning. And there's lots of different variants. So unsupervised learning is where you don't have any labels. Like maybe the data is too big or the data is too hard to annotate or no one can agree on what the labels are, and so the best you're gonna be able to do is kind of partition the data into plausible groups. So you can say, "Well, look. We don't know exactly what all these things are, but we know that this group is distinct from this group," and that's unsupervised. So an example would be suppose you work for an online shop and you have a load of data on what different customers have bought. One thing you might do is start trying to group customers into some kind of plausible groups based on roughly the things they've bought, and they're not all gonna have bought the exact same thing, so it's not gonna be trivial, but they might have bought, so someone's buying mostly dog related stuff and someone's buying mostly technical gadgets. And then what you can do is say, "Well, look, I put all these people in the tech group, and this guy bought this really nice new microphone for his camera, so I'm gonna recommend that now to other people in the group." And maybe I get a few hits and I sell a few cameras that way. But you can get much more complicated in this, but that is an example of perhaps unsupervised learning, where you don't need to have some kind of label for everyone. You don't need to have labeled me ahead of time as a tech enthusiast, you just need to look at the stuff I've been buying and know it's the same as all these other people, and know that that's interesting, right? Rather than we know exactly what it means. - That's a great example. So in other words, you didn't tell the machine who the people were. It discovered that based on their, the patterns of data, right? - Yeah. And it didn't really even discover who they were. It mostly just grouped them, and that allowed us to make decisions based on the fact they were grouped. Now, as it happens, I've given this group a label of tech enthusiasts, but of course you don't need to even know that. You just need to know that on average, they buy more TVs than everyone else, so maybe send them emails about TVs, you know, it's that kind of idea. You can still do supervised learning and other forms of learning with stuff like marketing and recommender systems and things. But you might imagine that that could be one way you would do it. And I think it's a good example. - The problem I see, like from listening to you, is reality versus the movies or reality versus the news cycle, because you always hear about Google doing like, like teaching a machine to play chess or whatever the games are, and it just like magically gets this done. And it teaches itself kind of like, not even knowing what the rules of the game are- - Yeah. So that is a, that's something called reinforcement learning a lot of the time. Reinforcement learning is still supervised learning. It's just that you get the labels as you go from playing the game. So the way it works is, you know, what you might do, is you play a random game of chess, where you literally move at random, right? And you lose. And so you get a strong suggestion that maybe next time don't do that, right? That was stupid. So now you move slightly less at random than you did before, but it's still pretty bad, and you lose again but learn a bit, and this is basically how they train it. So what you do is you play millions and millions and millions of games of chess. And every time it goes well, you just learn a little something about what was better than that time and what the time before.
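Going back to the online-shop example for a moment, here is a rough sketch of that customer-grouping idea using k-means clustering from scikit-learn. The two "purchase count" features and the six customers are invented; note that no labels are given anywhere, the algorithm only splits the customers into groups we can then act on.

```python
from sklearn.cluster import KMeans

# Invented features per customer: [tech_items_bought, pet_items_bought].
customers = [[9, 0], [8, 1], [10, 2],   # mostly tech purchases
             [0, 7], [1, 9], [2, 8]]    # mostly pet-related purchases

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)

# Nobody told the model what "tech enthusiast" means; it just found two
# distinct groups we could, say, send different recommendation emails to.
print(kmeans.labels_)
print(kmeans.predict([[7, 1]]))  # which group a new customer falls into
```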
We're still talking about a network which is a big mathematical function, right? So we're still talking about something that has weights, that you adjust, so that when you put an input state in, you get the most desirable output state, which in this case, of course, is you won more often than you didn't. For me, I mean, these are fascinating, 'cause they're trained in a very different way to the way I would train a network. I'd come up with labeled data and put it in, and use the examples. With reinforcement learning you have to start trying to give it rewards, which is where it gets its labeled data from. So is it better that you go 25 moves in chess before you lose? Or is it better that you checkmate regardless of how long it takes, right? You know, 'cause you might end up in a stalemate, you know, there's things with playing chess where you might say, "Well look, these other goals are also important," or something like this. And so you can spend a lot of time thinking about different ways you could train the network, which I think is really interesting. - Perhaps I'm misinterpreting it, but it sounds like the hype cycle versus reality, there's a big disconnect. Like the people have this vision that the robots are gonna take over, but you, you don't think that's gonna happen like anytime soon, right? - Yeah. I mean, well, the funny thing is like, I did a lecture once where I said to everyone, "You know the SHA-1 hash function is absolutely fine," right? And then the next day, Google released their two PDFs that had the same SHA-1 hash, right? Now that's embarrassing when that happens as a lecturer, you know? (David laughing) So I, you know, I don't wanna say, you know- - You don't want to predict, yeah- - I don't want to predict it could never happen. What I would say is that something that's really, really good at Go, or something that's really, really good at chess is really, really good at chess, and that is it, right? It will do nothing else, right? As far as I can tell, human chess players are also good at other things. And we don't have that generalizability yet. And I don't- - The AGI thing, sorry, sorry to interrupt. Is this AGI the difference between, like, specialized knowledge and AI? - And I mean again we could get bogged down in what the definition means, but I think artificial general intelligence, to most people watching, is just something that kind of is a bit like a human, right? And certainly it's very, very general. So you could say, "Right, this now is a totally different game, learn to play it," and it would go off and play it. And it would still remember how to play chess and it could play all the games, you know, and it's just super, super impressive. That doesn't exist. Will it exist? Nah, I dunno. I mean, I think that if we keep making these models bigger, we'll probably get to a point within a few decades where they are very impressive at a lot of different tasks, but I still am not convinced yet that we've got any real strategy to get past the idea of just you need to like have a load of data, right? Or a load of, play a load of games. My daughter can have a go at playing a semi-coherent game of chess, just having been told the rules of chess. I mean, she didn't, you know, let's say she's not gonna be winning any competitions, right? Not yet, but she didn't need to play a million games against herself to work out what to do, right? There's something that she's doing, that is much, much more impressive than what this AI is doing.
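Chess is far too big to sketch here, but the "the only feedback is the reward" idea can be shown in a much simpler, made-up setting: an agent choosing between three moves with unknown win rates, nudging its estimate of each move after every observed win or loss. The win rates and the explore-a-little strategy below are invented for illustration; real game-playing systems use deep networks and far more sophisticated training, but the reward signal plays the same role the labels do in supervised learning.

```python
import random

# Three "moves" with hidden win probabilities; the agent never sees these,
# only the reward (win = 1, lose = 0) after each choice.
true_win_rate = {"a": 0.2, "b": 0.5, "c": 0.8}
value = {m: 0.0 for m in true_win_rate}   # current estimate per move
counts = {m: 0 for m in true_win_rate}

for step in range(5000):
    # Mostly pick what currently looks best, sometimes explore at random.
    if random.random() < 0.1:
        move = random.choice(list(value))
    else:
        move = max(value, key=value.get)
    reward = 1 if random.random() < true_win_rate[move] else 0
    counts[move] += 1
    # Nudge the estimate towards the observed reward (a running average).
    value[move] += (reward - value[move]) / counts[move]

print(value)  # estimates drift roughly towards 0.2, 0.5 and 0.8
```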
That isn't to say the AI isn't incredibly impressive. It's just very different. I do think that the hype cycle is very different to what we actually see on the ground, which is that basically a lot of the time, I mean, you know, aside from playing games and reinforcement learning and large language models, the majority of what people are doing is trying to find objects, segment images, and these things are mostly done in the supervised way, and they don't generalize but we don't care because we were trying to find those specific objects so that's good. And if we needed 'em to do something else, we'll retrain them to do something else. - Yeah. 'Cause my next question, I think you've already given us the answer and maybe you can just elaborate, is what is AI really good at? Compared to, you know, it just seems like it's like automation, automation has its place, but you still, it takes like, it's just correct me if I'm wrong but it seems to take away like low level tasks that are boring and monotonous, or difficult for a human to do and then humans can concentrate on other things. What is AI really good at? And where do you see it going? - Yeah. So AI is that, automation, you're exactly right on, but with the caveat that you've gotta have found a good way to train it to automate. It won't just automate stuff. You can't just stick it on a production line and say, automate that for me, because it won't know what to do. So yeah, from my point of view, what AI's really good at is, so before I worked in, you know, machine learning and deep learning, I just was a normal computer vision researcher, right? And so, you know, this is like early 2010, something like this, the time before, I mean, deep learning literally appeared in about 2014. And before that we didn't have it, right? There were some networks, but no one was really paying attention to them and everyone was just doing normal stuff; what I would describe as image processing. So if I wanted to find something in an image, what I would be trying to do is come up with rules in my head about what I needed to do to that image to find those objects, and then I would implement those rules in code. So I'd say, okay, first of all, the goal is, like, we're trying to find, you know, something in MRIs. So first find all the bright pixels. Now find all the bright pixels that form a continuous blob that's of this size, you know, and I start and I try and design an algorithm to find whatever it was I was finding through these if statements and rules, right? It's just code. And what machine learning lets me do is not worry about these rules, because the problem you have, if you do it by just coding, is you get stuck in edge cases. You solve 90% of the issues pretty quickly because 90% of the images are trivial. And then that 10%, you just will never solve because they're just, they don't apply the normal rules that everything else does. And you know, if you are looking at sort of medical diagnosis AI, or program, that's a huge problem, that you're just gonna miss 10% because you couldn't deal with the edge cases. And so from my point of view, coming from image analysis, that was what it let us solve. Because its mathematical function is very, very complicated, it can learn the edge cases, if you give it sufficient numbers of them. So actually a lot of the time when I work with biologists or medics, and they present me images, I'll say, "These are all very nice, but have you got any worse ones? Have you got any really bad ones?"
Because the more, because the more bad stuff we give it, the better it will get at at working when those things come along. If you train your AI on a 3- or a 7-Tesla MRI scanner, which is super clear, it won't work when you run it on a 1.5. You know, so maybe you want to get samples from all the different scanners. You know what I mean? It's these kind of decisions, it actually means that the problem is no longer one of, which if statements do I need to write to get this to work, it's now, what kind of data and how do I present the data to this network to get it to work, right? And that, so it becomes much more about the input and output problem than it becomes about what you do in the middle, which it just learns. - That's great. I mean, I just wanted to see if I understand those terms. I see terms like artificial intelligence, machine learning, neural networks, and deep learning. We've covered all of those. Is that right? - Yeah. So I mean, to go into some deep learning, what I would say in terms of a definition of deep learning is, you know earlier I said that you might derive features for your problem, right? So I suppose you're trying to sell cars. What you might do is you might come up with some properties of cars that are relevant to its purchase price. So you might say, "Okay, how many cylinders has it got? How many, how much horsepower has it got? Has it got leather seats, right? Has it got air conditioning?" And you would have all these features and you would come up with a list of, let's say a hundred different properties of a car, and you would stick them in some AI, decision tree, neural network, doesn't matter, and then it would spit out a value for you, and you would train it on a bunch of examples, and you would hopefully have a system that could really nicely predict the value of cars, right? Now, the problem is that suppose I've missed out a feature that's absolutely crucial to the value of cars. Suppose I forgot to put in the engine size and it turns out that 90% of the car's value is on how big the engine is, right? And so I've given it bad data then, right? And then I have to go back and have to put data in again, and I have to train it all again. And, you know, it's a waste of time. And what will actually happen if you tried to implement a system where you'd missed out features, is it would never work as well as you hoped. And a car would come along that looked good on the features I did give it, but actually had a really small engine and it would massively overvalue it or something like this, or undervalue it and you give away a really nice car for almost free. What deep learning does is something called representation learning. That's (indistinct) because it's deeper, it has the power to also learn the features as well as the decision based on those features. So you might say, "Well, I can't bother to decide all these features so I'm just gonna dump the raw specs or a picture of the car in at the front, and have it determine for me, the value." And it would be looking at the size, the model, shape, the color, the size of the wheels, and it would do all this and it would extract the features first inside the network, and then it would use that to make the decision. So deep learning is often described as just the same network, but deeper. But actually it's a different, I think, a different paradigm where you're basically no longer handcrafting what you put in, you're just shoving all of it in and it works out what's useful and what's not. 
- And so you've explained neural networks already. Is that right? - Yeah. I mean, mean, so a neural network. Yeah. So I, we talked about how a neural network calculates a weighted sum. So it takes some features at one layer and it weights them and then it calculates the sum of those for the next layer. And we have something called an activation function in there as well, which allows the, it basically makes the function a lot more complex, right? It makes it nonlinear. It makes it learn more powerful things. Modern deep networks actually have additional operations like convolutions and pooling operations, which work on grids of data often, right? It doesn't have to, but you know, often may do. So what you might do is instead of calculating a weighted sum of all the features, you might slide a filter over the image to calculate filters every location, and so it's like a sort of a map of activations. And then you might repeat that process over and over again. So what deep networks are capable of doing, convolutional networks, is determining features across the whole image, or across the whole of the data stream and then repeating that process over and over again. That's how they develop their representation learning. They use the filters to create interesting information before they make a decision. - You teach security at university, but you're doing a lot of the AI stuff as well. I think the question a lot of people will be asking, including myself, is, "Do I need to learn some kind of programming language? And which language would it be? Would you recommend? And do I need to learn like a whole bunch of math?" Because it sounds like math is one of the, or maths, as we say in the UK is, is something that you have to, it, 'cause you have to learn, is that right? To get into- - You, you know, having some idea of what's going on mathematically helps from an intuition point of view, right? 'Cause I understand the back propagation process, which is how the actual weights are adjusted. And that allows me to understand what would happen if I connect two bits of network together in a weird shape or something like this. But, in practice actually, day-to-day running of a deep network doesn't really involve any maths. And there is some disagreement in the community about whether you really need to know math at all. You know, I sort of go back and forth. I sometimes think it's useful and I sometimes think it's not. I certainly don't think people should be, if they don't like math, should be put off from having a go, because I'm always an advocate for have a go at something. You might really enjoy it. Right? What I would say is, that actually running a neural network doesn't require a lot of maths. It just requires a bit of Python basically. So that's the language you normally use. Python, and I have a love/hate relationship with Python. And I think that sometimes I just wanna declare what my types are and stop having runtime errors a half an hour into something. But what, what they've done is they've got a lot of libraries like TensorFlow and PyTorch to operate, that sit in Python, and then they very quickly go down into C and CUDA for fast matrix multiplications, which is all the stuff that goes on behind the scenes in these neural networks. So they're very, very quick because they're not implemented end-to-end in Python, but Python gives you a very convenient and nice way of doing all this, you know, load in the images, it just appears as a kind of array. 
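A small sketch of that "slide a filter over the image" idea using PyTorch's Conv2d. In a real network the filter values are learned during training; here they are just randomly initialised and the image is random noise, so this only shows the shapes involved when filters and pooling are stacked.

```python
import torch
import torch.nn as nn

# One fake 32x32 grayscale image: batch of 1, 1 channel.
image = torch.randn(1, 1, 32, 32)

# A convolutional layer slides 8 small 3x3 filters across every location
# of the image, producing a map of activations per filter; pooling then
# shrinks those maps. Repeating this is how the network builds features.
conv = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1)
pool = nn.MaxPool2d(2)

feature_maps = pool(torch.relu(conv(image)))
print(feature_maps.shape)  # torch.Size([1, 8, 16, 16])
```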
You know, you might have a list of images that you use for your data set, and then you put that into a network and so on, right? You know, a lot of it's just inputting and outputting lists and dictionaries like the rest of Python and so it makes things quite easy to use. You know, you'll have a look at Python, and Python for me is a nice enough language, in the sense that it's fairly easy to pick up particularly if you already know a language. It's often a language people recommend you start with anyway, because it's fairly relaxed about syntax and just you making a total mess of it. So that's, you know, that's always good, but going from knowledge of Python to having implemented a deep network will not take you very long. You won't understand everything the first time, but you can give it a go and you can watch it training and you can start to pick up on what's going on, and then you can make a change to the network and maybe improve your performance slightly. - Do you have to write it from scratch or like it's TensorFlow or something that like Google have created the- - Exactly, they do a huge amount of heavy lifting, right? Which is one of the reasons why you can kind of get away with not having all this mathematical background. So, I mean, I use PyTorch mainly, and in PyTorch, it handles all of the weights and learning for you. So you say, "I want my network to have this many layers and I want my layers to be like this. And I want it to take an image of this size and turn it into a 10-class classification problem," where I'm picking cats and dogs and airplanes, or what have you. And then it just trots off and does it: it puts the images in, it retrains the network, it puts the images in, it retrains the network, and it iterates. And you can watch your learning rate, so you watch your loss function go down as it gets better and better every iteration. And so eventually you can then just deploy it in some sort of production code or whatever. And maybe without, maybe test it first! (David and Mike both laugh) But, you know, like it does a huge amount. There's a lot of mathematics behind the scenes, not all of it particularly complicated, but it's definitely a lot of it. And it's all massively parallelized on a GPU and, you know, so you can actually get away with a few dozen lines of code to get a pretty nifty neural network going. - Let me say that's good to hear because, you know, when you start talking about the ins and outs, it's like, this sounds so complicated. So it's like PyTorch is just a library or something that you would import and then just, you just send some commands to it. Yeah? - Torch started off as a machine learning library, well, it was written in C presumably, and CUDA, but it was for Lua. And again, that's another language I have, should we say, very strong mixed opinions about. However, since then, TensorFlow came along in Python, I think Python was seen as more convenient for the majority of developers, and so PyTorch spawned off Torch basically, and is now the dominant library for this. So TensorFlow is Google, and PyTorch is Facebook AI, or Meta AI, I suppose it is now. - And that's the one you would start with, yeah, if you were starting? - I, yeah. So this is a, people have different opinions on this. I think that the, - Just give us your opinion because you know, we, I just, sorry to interrupt, I just want to put it this way.
I like to have paths, like when I talk to experts like yourself, it's like, "Okay, I'm new now. How do I go from like, knowing nothing to like, at least getting started?" So if there's anything you can help me with- - Yeah. Well, I mean, I tell you what. - like knowledge, whatever, would be great. - Yeah. Yeah. I would start with PyTorch personally. From a research point of view, PyTorch is more flexible, which helps me, but it also doesn't require a lot of lines of code to get running. And it also does a nice thing where it doesn't hide away all of the details. There's just enough detail in there that you can kind of type away and it will kind of work, but you do see the network going forward, and learning, and optimizing the weights, and things like this. There's a few lines of code that do that, that you can kind of look at and go, "Hmm." And then you kind of pick these things up, right? It's not a case that you just type PyTorch dot train, and pass it your input data and then it just does it and you have no idea what happened, which I like, because that wouldn't be fun, right? But also you wouldn't learn anything. So I like PyTorch for that reason. It also has a load of examples. So if you go on the GitHub repository for PyTorch or Torchvision, you've got all the, like, core networks from the literature in there. And you've also got some examples of simple data problems and things like this that you can run from end-to-end, and just basically run the file and it will start training a network. And then you can delve in and see what it is it's actually doing. - Do you need, you, I think you mentioned a GPU, do you need specific hardware or can you just run it on a single laptop? - You need, you really need a, so PyTorch uses CUDA, right? So you really could do with using one, I don't know if PyTorch supports OpenCL; I can't remember. Ideally you would have access to a CUDA enabled GPU that would make this process much, much faster. So as I mentioned, the back end of PyTorch, and most of these deep learning libraries, is written in C and CUDA, and it's just massively parallelized matrix multiplications most of the time. And that is something that you don't wanna be doing on a CPU, right? You can, for very small networks, run it on a CPU. So if you download the simplest PyTorch example and you run it on a CPU, it will run okay and you'll be able to see what happens. Anything with images, anything where the dimensionality is high, you're gonna be waiting half an hour for it just to finish one pass and you won't get anything done. One other thing you might like to try is Google Colab. So Google Colab is Google's public, Jupyter notebook-style laboratory environment that actually provides limited-time, but fair-use, access to GPUs to have a go at these things. It's a great place to go. And you can also download loads of Colab notebooks, existing implementations, to test them out. That's a great place to start. You know, I'm a big fan of Google Colab. I think that as a platform it's really, really useful and you can actually pay, I mean, I don't work for Google Colab, you can pay a small subscription to get, should we say, higher priority access to GPUs. So it's like fair use normally. So if you use it a lot, you might have to wait for half a day or something.
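Checking for a CUDA GPU and falling back to the CPU is a one-liner in PyTorch, and the same check works whether you are on your own machine or on a Colab GPU runtime. A minimal sketch (the tiny model and random batch are just placeholders):

```python
import torch

# Use a CUDA-capable GPU if one is available, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Training on:", device)

# Models and data are moved onto that device before training; toy examples
# run fine on a CPU, image-sized problems really want the GPU.
model = torch.nn.Linear(10, 1).to(device)
x = torch.randn(32, 10, device=device)
print(model(x).shape)  # torch.Size([32, 1])
```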
- I mean, in the best case scenario, I'd come and attend one of your classes, but not everyone's gonna be able to do that. Do you have books, or online courses, or stuff that you would personally recommend- - Yeah, so- - or suggest? - So, I mean, what I always recommend to people is Andrew Ng's Coursera course on machine learning, that's a great place to start. Now, it's lower level. So Andrew Ng is very well known in the machine learning community. He's, you know, he's done a load of great work. His Coursera course is really good. It's quite mathematical, right? So that isn't necessarily a problem. You just have to go in knowing that's gonna happen. But what it does do, is it gives you a lot of information on stuff that we haven't really talked about. So things like watching your learning rate, your loss function go down, right? So if you draw a graph of your loss, which is your error at the end of your network, over time what should happen, is it gets better and better, it goes down. But it might not go down. It might sort of do this. A lot of machine learning is understanding what that means and what you could try and do to rectify that problem. You know, for your first day of machine learning it's not important, but over time, some of the concepts that you talk about in this machine learning course will come in handy. And there's a book by Yoshua Bengio called "Deep Learning," which also, again, has a lot of maths in it, but it covers a lot of the core concepts. Personally, I've always been a kind of learn-by-doing kind of a person. - Yeah, exactly. Exactly. - So what I like to do is just get on the PyTorch or the TensorFlow tutorials and just start running some stuff and see what happens. And if you know Python, or you know any language that's even plausibly similar to Python, you're gonna have a great time doing that. - I think, especially for a lot of the audience, if they're starting out with this, let's say there's younger people who are starting their careers. And I spoke about this in the beginning, about people worrying that this will take their jobs away, but I'm assuming there's, whenever I see the hype cycle, there seems to be a lot of demand for AI skills. - Huge, huge demand. - Yeah. - Yeah. There's a huge demand. So I would say, you know, you've got your different levels of sort of data analyst, right? So you've got people who are pretty good at a spreadsheet up to people who are working, trying to train self-driving cars and things. I suppose, if I'm being sort of a bit random in my choices of job description and, you know, you got anywhere in between, there's huge demand everywhere. So, you know, if you have any kind of data analysis ability, if you can look at a table of data and start to pick out patterns and start to work out what's going on and make predictions on that data, that's a really useful skill to have in lots and lots of jobs. And it's a very, very, very, very popular thing that people have. So we have a lot of graduates who graduate with a few modules in machine learning and a few modules in data analysis and things like this and they're in a really strong position. These things are not, you know, you can learn these things yourself. So, you know, you can go in, I've got a data analysis course. It's not very long, obviously, 'cause you know, YouTube videos, but I have some data analysis videos, there are lots of data analysis videos. - On your YouTube channel. Yeah? - Yeah.
On our YouTube channel Computerphile, we have like a 10 part series on data analysis, which is just kind of like a taster, but you can have a go at that. There's lots of stuff on data analysis. Data analysis and modeling and machine learning in some ways go hand in hand. It's often good to have a little bit of a look at both of them because you know, cleaning data, for example, like you get, if you get a spreadsheet of data that doesn't make any sense, it's unwise just to stick that straight into a neural network and see what you get out because there could be some complete, you know, it could be missing values, there could be errors, there could all just have hugely different scales of data. These are all things to think about. So some knowledge of how to prepare that data for let's say a downstream task like machine learning is a really useful thing to know how to do as well. - I love that you're teaching at the university, you're teaching security, cybersecurity type stuff, but you're also doing AI. So there, do you see that, like that's a really good mix and I'm assuming based on what you've just said, you know, it's a really good idea if you are into cyber, or want get into cyber to, you know, add this to your skillset. - Yeah. I mean I would be hard pressed to find any career that wouldn't be at least helped a little bit by knowing some data analysis some machine learning because just, it just comes up a lot, right? You know, and also, I mean, as you know, we already spoke about how people can be misled by the hype cycle, right? And you will be much more resistant to this if you understand how these things work and that's gonna put you in a good position. Yeah. I think that, so I, as it happens, I teach security. I partly, I find it really interesting so I try and cling onto that module with, you know, with a vice grip, and not let anyone else have it. I also teach cryptography at university as well. - We need to get you back for some more interviews, man. - Yeah. Right. So yeah, by all means. But so those are subjects I find, I don't actively research day-to-day, but I do find very, very interesting and I do have some collaborations with, 'cause we are actual security researchers working at Nottingham and lots of places we have good collaborations with them. There is obviously machine learning involved in quite a lot of security, because it's one of many strategies for detecting malware or for anomaly detection, or, you know, any smart system that's doing something that, hopefully, you don't have to program all the rules yourself. So yeah, it does help. I've got, I think I've got a project, an undergraduate student starting, who's gonna look at malware detection with a bit of machine learning as well. And so she can bring the knowledge of the malware. I can bring the knowledge of the, of mostly the AI, you know, and it'll be great. - Mike, I always like to ask this question. If you were talking to your younger self, let's say you were 18 or, you know, I dunno, let's say some, not everyone is 18 who watches these videos, but let's say they were 25, 30, or whatever. What would you advise someone to do based on, you know, what you've seen? - I think if you are, if you're really interested in a career in cybersecurity, or a career in machine learning, it's worth noting that not everyone has a degree that does those things and that's fine, right? It's also fine if you do have a degree, I see people saying, "Well, you don't need a degree for this," or, "You do need a degree for this." 
I actually think: learn the skills, right? And then you get a job based on your experience and, you know, you're gonna have a great time. I think, again, it's not one of these debates I like to get into, because everyone has their own career path that they wanna follow. If you did a degree in something completely different and you've worked in a job you're not really enjoying and you wanna try something new, I think that's absolutely fine: have a go. There are so many resources online that there weren't 20, 30 years ago; there are, you know, people doing interviews and videos on different topics that you can just watch and learn from. And, as I say, I'm a very hands-on person. If I wanna try and learn a skill, I'm just gonna try and do it, and it will probably go really wrong the first time. So I think that practice, and this is true of coding as well, I'm a big, big believer that coding is mostly practice. People say, "Well, how did you know that was gonna be a bug?" 'Cause I've seen it so many times before, you know, because it happens all the time. I think, yeah, that would be what I would do: find something you love doing and do more of that. You know, I program at home for fun, and it's partly 'cause I find it fun, and also sometimes I wanna learn something new. I did a video a year or two ago on the Enigma machine. I don't need to program the Enigma machine for my job. I just thought it was super interesting and I just sat at home and did it. And I learned quite a lot, actually, about the whole process and the history of it, by just having to implement the thing. And so I think, yeah, crack on and learn, would be what I would do. - I love that. I mean, and I just have to say this: you are Doctor Mike, you've got a PhD. - Yeah. - Is that right? - Yeah. - In what, what was it? - In computer vision. - What I really love about this, and this is just my opinion, so I don't wanna put you on the spot, but I love that you, as someone with a PhD, are not excluding people who perhaps never had that opportunity. And I love that you're encouraging everyone, you know, just to go for it. Don't let your limitations or- - Yeah. - Or lack of resources stop you. So. - I mean, as it happens, I was a pretty average student at school, right? I mean, I didn't do much. There wasn't much in terms of computer science at school when I was younger. It was, you know, "Let's use Microsoft Word," and, "Let's try that out." So I barely did any computing at all. I could only program a tiny bit when I arrived at university. Loads of people arrive at university with huge programming experience, and loads of people arrive with no programming experience. And we always say to them, "You'll all be the same in the end," right? Like, that's the whole point of a degree. And it's the whole point of what we teach. I think it's never too late to get into computers and learn about programming and stuff. I try and teach people to program all the time. I mean, not all of them are interested, which is annoying, but, you know, if it was up to me, all my family would be able to program, 'cause I'd be giving them extra lessons, but some of them wanna do other things, apparently. But yeah, I don't wanna be a gatekeeper, because that's not gonna get more people doing cool computer stuff. There are some things where a massive specialism is important, right?
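For anyone tempted by the same "learn it by implementing it" exercise with the Enigma machine mentioned above, here is a deliberately tiny single-rotor sketch. The wirings are toy permutations invented for this sketch (not the historical Enigma tables, and nothing to do with the code in the Computerphile video); because the signal goes rotor, reflector, then back through the rotor, running a message through it twice gives the original text back.

```python
import string

ALPHABET = string.ascii_uppercase

# Toy wirings made up for this sketch.
ROTOR = "QWERTYUIOPASDFGHJKLZXCVBNM"      # a fixed permutation of A-Z
REFLECTOR = "BADCFEHGJILKNMPORQTSVUXWZY"  # swaps letters in pairs, so it's self-inverse

def encode_letter(ch, offset):
    """Send one letter through rotor -> reflector -> rotor (backwards)."""
    i = (ALPHABET.index(ch) + offset) % 26   # enter the rotor at its current position
    i = ALPHABET.index(ROTOR[i])             # forward through the rotor
    i = ALPHABET.index(REFLECTOR[i])         # bounce off the reflector
    i = ROTOR.index(ALPHABET[i])             # back through the rotor the other way
    return ALPHABET[(i - offset) % 26]       # leave at the rotor's current position

def encode(message):
    letters = [c for c in message.upper() if c in ALPHABET]
    # The rotor steps once per letter, which is what makes this more than a
    # simple substitution cipher.
    return "".join(encode_letter(c, step % 26) for step, c in enumerate(letters))

ciphertext = encode("HELLOWORLD")
print(ciphertext)           # looks like gibberish
print(encode(ciphertext))   # HELLOWORLD again, thanks to the reflector
```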
You know, I'm not proposing to go into a hospital and start surgery on people, because you need a lot of training to do these things. But I also think that if someone wanted to be a surgeon, they should crack on and do the training. You know, I think you can learn those skills. If you're gonna work at a university, we usually require a PhD, and that's something universities require, but there's a great deal I don't know about the real world and industry that people who are watching will know way more about than me, and that's also fine, right? You know, everyone's got their own expertise. So I like to learn from those people and hope I can teach 'em a bit about the things I know about. - I love that. I love that. Another thing, I mean, I said 18, but I get a lot of pushback sometimes on these videos, and I'm not sure if you've heard this question before: "Am I too old to start learning AI?" - No, no. I mean, consider also that the majority of academics who are using AI aren't 18-year-old fresh graduates. They are researchers who've been doing it for decades, and, you know, we've all had to learn it from scratch as well. Like I say, deep learning only really appeared around 2014, so it's been a mad rush since then. There's loads of scope to learn. And to get a little bit going, it doesn't take that many hours, you know; if you wanna do something around your job, or whatever your current life situation is, I think it's doable. - I love that. Any closing thoughts? - No, I just hope people found it interesting, right? And I'm happy to come back and talk about more topics in detail. But, you know, I love my job and telling people about stuff that I think is interesting. So I would encourage people to go off and look into it in a bit more detail and have a go. Just download a PyTorch tutorial, start running it, and you'll train a deep network. And then when someone goes, "All this deep learning's a bit scary," you can go, "Well, actually, I did that last week and it wasn't that difficult." Yeah. That's what I'd suggest. - So for everyone watching, please put in the comments below topics that you would like us to discuss. We definitely wanna try and get Mike back, so let us know what you want us to talk about. Computerphile has a lot of fantastic videos that Mike has created, so go and have a look at those. I'll link some of those below. Please give us your feedback. Mike, thanks so much. - Thanks so much. Lovely to be here. (upbeat music plays)
Info
Channel: David Bombal
Views: 444,850
Keywords: ai, artificial intelligence, google ai sentient, google ai lamda, google ai sentient conversation, google ai alive, google ai conversation, google ai phone call, google ai robot, google ai chatbot, google ai interview, google ai conscious, google ai engineer, google ai self aware, ai sentient, terminator, ai movies, robot, robot movies, ai jobs, elon musk, google ai, ai robots, machine learning, ai robots 2022, cybersecurity, cyber, cyber ai, ai cybersecurity
Id: PH9RQ6Yx75c
Length: 55min 38sec (3338 seconds)
Published: Sun Jul 31 2022