GPT-2: Why Didn't They Release It? - Computerphile

Video Statistics and Information

Captions
So I think it's worth talking a little bit, because I'm usually talking to you about safety, about the decision that OpenAI made not to release the fully trained model, the big one.

Because this has not been released, we only know how it works from the paper: it's a transformer, left to its own devices without being fine-tuned; it's just a massive amount of data and off you go. Is that right?

Yeah, there's enough information given in the paper to reproduce it, and you just need the giant dataset, which is a real hassle to make, especially because you really need high-quality data.

Does it say anywhere in the paper how long it took to train, and how many TPUs you need, and stuff like that? Yes. What's a TPU? A tensor processing unit: like a GPU, but fancy. You need a lot of money. If you tried to train this with something like Amazon's cloud computing offering, you'd end up with a bill that I expect would be in the hundreds of thousands of pounds. It's a lot of compute. But with all of these things, it's a lot of compute to train them and not that much compute to run them. This isn't a new architecture; this isn't some vast breakthrough from that perspective. It's just the same thing, but much bigger.
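To make the "lots of compute to train, not much to run" point concrete, here is a hedged back-of-envelope sketch in Python. The ~6·N·D training and ~2·N-per-token inference rules of thumb, and the token count, are illustrative assumptions, not figures from the paper or the video.

```python
# Back-of-envelope only: common rules of thumb, not numbers from the GPT-2 paper.
# Training a transformer costs roughly 6 * N * D FLOPs (N = parameters,
# D = training tokens); generating one token costs roughly 2 * N FLOPs.

n_params = 1.5e9   # parameter count of the full (withheld) GPT-2 model
n_tokens = 10e9    # ASSUMED training-token count; WebText is ~40 GB of text

train_flops = 6 * n_params * n_tokens   # total training compute
token_flops = 2 * n_params              # compute to generate a single token

print(f"training:  ~{train_flops:.1e} FLOPs")
print(f"inference: ~{token_flops:.1e} FLOPs per generated token")
print(f"one training run ~= {train_flops / token_flops:.0e} generated tokens")
```

On these assumptions, a single training run costs as much compute as generating tens of billions of tokens, which is the asymmetry the bot discussion below turns on.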
Nobody else is keeping their research back and declining to release their models to the public. So to say that you think your work might be dangerous and you're not releasing it is kind of like saying it's much more dangerous than other people's work, that your stuff is so good, so powerful, that you can't release it. I think people reacted to that; it smacked a little of a publicity stunt.

Assuming it's not a publicity stunt, which I don't believe it is, what are they worried about?

People make a big deal of it generating fake news: fake articles that will convince people there are actually unicorns, or whatever. I don't think that's the risk, and I don't think that's really what OpenAI thinks the risk is either. If you want to generate a fake thing, it's still not expensive to do that; you can just sit down and write something. You don't need a language model to write your fake news, and in fact you don't have that much control over it, so if you were actually trying to manipulate something you'd want to be tweaking it by hand anyway. I don't think that's the risk.

The thing that most concerns me about things like GPT-2 is that the content is not particularly good, but it is convincingly human, and so it creates a lot of potential for making fake users. There is a constant arms race between bot operators and the big platforms: there are teams at Google, at YouTube, at Facebook, everywhere, working on identifying accounts that aren't real, and there are various ways you can do that. One of them is to analyze the text the accounts write, because the language models that are out there aren't very good. If an account is repeating itself a lot, or a whole bunch of accounts are all saying exactly the same thing, then you know this is spam, or maybe a manipulation attempt, and so on. But with GPT-2 you can give it one prompt, post all of the outputs, and all of those outputs are different from each other, and they all look like they were written by a human. A human can probably look at them and figure out, hang on a second, this doesn't quite seem right, but only if you're really, really paying attention, and human attention at that scale is super expensive, much more expensive than the compute needed to generate the samples. So you're outmatched: if you spend ten times more, you cripple yourself financially; they spend ten times more and it's fine. You're going to lose that battle, so it becomes very difficult to identify fake users by their text.

The other way you can identify fake users is by analyzing the graph, the social graph or the interaction graph. Humans, when they see spam posts full of links to dubious websites and whatever, usually downvote them and don't reply to them. You can fake the voting metrics by having these accounts vote for each other's stuff, but then you can analyze the graph of that and say: all of these accounts only ever vote for each other, and the people we know are humans never vote for them, so we assume those are all bots and we can ignore them. But the samples the big GPT-2 model produces are convincing enough to get actual humans to engage with them. It's not "oh my god, that's so persuasive, I've read this article and now I believe this thing about unicorns"; it's "I believe a real human wrote this, and now I want to argue with them that there aren't unicorns", or whatever. And now you have real humans engaging in actual, meaningful conversation with bots, and now you've got a real problem.
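A minimal sketch of the behaviour described above: one prompt yielding many distinct, human-looking completions. It uses the small, publicly released GPT-2 checkpoint via the Hugging Face transformers library; the prompt and sampling parameters are just illustrative choices.

```python
# Sample several different completions of the same prompt from the released
# small GPT-2 model. Every sequence differs because we sample rather than
# decode greedily.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "Scientists were shocked to discover a herd of unicorns"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

outputs = model.generate(
    input_ids,
    do_sample=True,            # sample tokens instead of greedy decoding
    top_k=40,                  # truncated sampling, as in many GPT-2 demos
    max_length=60,
    num_return_sequences=5,    # five distinct "posts" from one prompt
    pad_token_id=tokenizer.eos_token_id,
)
for i, seq in enumerate(outputs):
    print(f"--- sample {i} ---")
    print(tokenizer.decode(seq, skip_special_tokens=True))
```

None of the five outputs repeat each other, which is exactly what defeats the "lots of accounts saying the same thing" heuristic.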
How are you going to spot who the bots are when you can't do it automatically just by analyzing the text? You can't even do it by aggregating the human responses to them, because the humans keep thinking they're actual humans. So now you have the ability to produce large numbers of fake users that the platforms can't spot, and therefore can't stop those users' votes from counting: upvoting things, downvoting things, liking them, subscribing and everything else, gaming the metrics that way.

One thing people would do is look at the profile pictures: if you're trying to generate a large number of bots, where are you going to get your pictures from? You can do a reverse image search and find that they're all using the same picture, or all using pictures from the same database of facial photos, or whatever. But now we have these really good generative adversarial networks that can generate convincing fake faces, so that's really difficult as well. And you can't automatically detect those, almost by definition, because of the way GANs work: the discriminator is a state-of-the-art fake-face detector, and it's being fooled; that's the whole point. If somebody came up with a really reliable way of spotting these fake images, you could just use that as the discriminator and keep training (see the sketch at the end of this transcript).

So not releasing their full-strength model feels very sensible to me, in the sense that people will figure it out anyway: they published the science, and it's worth somebody's while to spend the money to reproduce these results. But by not releasing it, they've bought the platforms several months to prepare, to understand what's going on. And they are, of course, working with them, sharing the full-strength model with selected partners, people they trust, to say: here's what it can do, take a moment, govern yourselves accordingly, get ready, because this stuff is coming. They're giving everybody a heads-up to mitigate the potential negative impacts this work might have.

The other thing is, it sets a really good precedent, I think. Maybe GPT-2 isn't that dangerous, but the stuff we're making is getting more and more powerful, and at some point somebody is going to develop something that really is dangerous. By then you want there to be accepted practices, social norms and industry standards about thinking through the impact of your work before you release it. So it's good to start with something where there's at least some argument that it could be dangerous, just so that everybody is aware this is a thing you can do, and so that people won't think you're weird, or bragging, or pulling a publicity stunt when you say: we found this cool result, and we're not going to put it out there, because we're not sure about the safety of it. I think that's something that's really, really necessary, and I think OpenAI is very smart to start it off now, before we really need it.
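Returning to the GAN point above: a minimal PyTorch sketch (with placeholder fully-connected networks and made-up shapes, nothing like a real face model) of why a published "reliable fake-image detector" doesn't stay reliable. Any such detector can be dropped in as the discriminator, and the generator is then optimised until that detector is fooled.

```python
# Train a generator against a fixed "reliable fake detector" used as the
# discriminator; everything here is a toy stand-in to show the dynamic only.
import torch
import torch.nn as nn

latent_dim, image_dim = 100, 64 * 64 * 3   # toy sizes, not a real face model

generator = nn.Sequential(
    nn.Linear(latent_dim, 512), nn.ReLU(),
    nn.Linear(512, image_dim), nn.Tanh(),
)
detector = nn.Sequential(                   # stand-in for the published detector
    nn.Linear(image_dim, 512), nn.LeakyReLU(0.2),
    nn.Linear(512, 1), nn.Sigmoid(),        # outputs P(image is real)
)

opt = torch.optim.Adam(generator.parameters(), lr=2e-4)  # generator only
bce = nn.BCELoss()

for step in range(1000):
    noise = torch.randn(32, latent_dim)     # random noise in, images out
    fakes = generator(noise)
    # Optimise the generator so the detector labels its fakes "real" (1).
    loss = bce(detector(fakes), torch.ones(32, 1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In a full GAN the discriminator is trained too; the point here is just that releasing a strong detector hands the other side exactly the training signal they need.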
Info
Channel: Computerphile
Views: 167,067
Keywords: computers, computerphile, computer, science, GPT-2, AI, Robert Miles, AI Safety, Bots, Language Models, Transformers, Computer Science, Unicorns, Unicorn Paper, Open AI, OpenAI, PR Stunt
Id: AJxLtdur5fc
Length: 9min 10sec (550 seconds)
Published: Wed Jul 24 2019