ChatGPT and large language model bias | 60 Minutes

Video Statistics and Information

Captions
60 Minutes Overtime. This week on 60 Minutes we take a look at ChatGPT, a new technology fueled by artificial intelligence that was developed by the company OpenAI. It is a chatbot that uses massive amounts of data to predict the most likely sequence of words, allowing it to seem like you're talking to a human. The bot can answer questions, explain complex topics, and even help write code, emails, and essays. Timnit Gebru, who is the founder and executive director of the Distributed Artificial Intelligence Research Institute, explained to us what she sees as the pitfalls of training artificial intelligence applications with mountains of indiscriminate data from the internet.

Well, let's start by you just telling us what generative AI is. What does that mean?

So broadly speaking, these are models that take in text as input, mostly, and output either images or other kinds of text, or even videos. The generative part means that they're kind of making up text as they go, based on how they've been trained.

And do all these generative AI systems just sop up everything they can from the internet? I mean, willy-nilly?

Almost all of them are trained with a lot of either text or images or videos, depending on what they're trained to do, from the internet, and often indiscriminately. But many of these companies have found out that that's not acceptable to have as an application.

Systems like ChatGPT have produced outputs that are nonsensical, factually incorrect, even sexist, racist, or otherwise offensive. Timnit Gebru was the co-head of Google's AI ethics team until 2020. She says she was forced out after she co-authored a paper highlighting the risks of certain AI systems.

Can you explain to us why we're hearing that output is biased?

There is an assumption by many people who build these types of models that just because the internet has lots and lots of text, or lots and lots of data, somehow when you train something based on that internet it will encode so many different points of view, so many diverse views, et cetera. And what we argue is that size doesn't guarantee diversity. There are so many different ways in which people on the internet are bullied off of the internet.

Let's ask which people are bullied off of the internet.

We know women get harassed online all the time. We know people in underrepresented groups get harassed and bullied online. When they're bullied, they leave.

They leave, so they're not represented in what is being sopped up.

Exactly. So the text that you're using from the internet to train these models is going to be encoding the people who remain online, who are not bullied off: all of the sexist and racist things that are on the internet, all of the hegemonic views that are on the internet. That's what you're going to be encoding. So we were not surprised to see racist and sexist and homophobic and ableist, et cetera, outputs.

Can't you train these systems to identify toxicity and not allow the system to spew it out?

There are many ways in which different organizations and research groups are building toxicity systems, toxicity detectors. There are so many different ways in which these models can be toxic, in so many different languages and so many different aspects and contexts; sometimes it's not even clear. So actually, similar to social media platforms that do content moderation, they now have a lot of people that they pay to try and figure out which content is extremely toxic or horrible, so that they can actually train another system that can tell you which is toxic and which is not.
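The process Gebru describes here, human reviewers labeling content so that a second system can be trained to flag toxicity, is essentially supervised text classification. Below is a minimal sketch in Python, assuming scikit-learn is available; the tiny inline dataset, labels, and model choice are invented for illustration only and are not drawn from any real moderation system.

```python
# Minimal sketch: a classifier learns from human-labeled examples,
# then scores new text for toxicity. Hypothetical toy data below.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical human-labeled examples (1 = toxic, 0 = not toxic).
texts = [
    "you are a wonderful person",
    "thanks for the thoughtful reply",
    "nobody wants you here, go away",
    "you people are all worthless",
]
labels = [0, 0, 1, 1]

# Turn text into TF-IDF features, then fit a simple linear classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# Score new content before it is shown to users or reused as training data.
print(model.predict_proba(["go away, nobody wants you"])[:, 1])
```

Real moderation pipelines work on the same labeled-data principle but at vastly larger scale, across many languages, and with models far more capable than this toy example, which is part of why Gebru likens the approach to whack-a-mole.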
Timnit Gebru says this approach, removing harmful content as it happens, is like playing whack-a-mole. She thinks the way to handle artificial intelligence systems like these going forward is to build in oversight and regulation.

Food, medicine, cars, airplanes: they each have an agency devoted to that. There's no tech regulation agency.

No.

Should there be?

Yeah, yeah, I absolutely think that there should be an agency. I don't know what it would look like, but I do think that there should be an agency that is helping us make sure that some of these systems are safe, that they're not harming us, that it is actually beneficial, you know. There should be some sort of oversight. I don't see any reason why this one industry is being treated so differently from everything else.
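The segment opens by describing ChatGPT as a model that "uses massive amounts of data to predict the most likely sequence of words." A rough sketch of that next-word prediction idea is shown below, using the small open GPT-2 model through the Hugging Face transformers library as a stand-in; ChatGPT itself is a much larger, instruction-tuned model accessed through OpenAI's API.

```python
# Sketch of next-word prediction: a pretrained language model continues a
# prompt with the tokens it judges most likely, based on its training data.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Ask the model to extend a prompt by a handful of tokens.
result = generator("The internet is full of", max_new_tokens=20)
print(result[0]["generated_text"])
```

Because such a model only continues whatever text it is given, its output reflects the data it was trained on, which is the source of the biased behavior Gebru describes.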
Info
Channel: 60 Minutes
Views: 42,585
Keywords: 60 Minutes, CBS News, researcher, timnit gebru, language models, chatgpt, technology, oversight
Id: kloNp7AAz0U
Length: 5min 39sec (339 seconds)
Published: Mon Mar 06 2023