Risks of Large Language Models (LLM)

Video Statistics and Information

Video

Captions Word Cloud

Reddit Comments

Captions

With all the excitement around ChatGPT, it's easy to lose sight of the unique risks of generative AI. Large language models, a form of generative AI, are really good at helping people who struggle with writing English prose. It can help them unlock the written word at low cost and sound like a native speaker. But because they're so good at generating the next syntactically correct word, large language models may give a false impression that they possess actual understanding or meaning. The results can include a flagrantly false narrative directly as a result of its calculated predictions versus a true understanding. So ask yourself: What is the cost of using an AI that could spread misinformation? What is the cost to your brand, your business, individuals or society? Could your large language model be hijacked by a bad actor? Let me explain how you can reduce your risk. It falls into four areas: Hallucinations, Bias, Consent, and Security. As I present each risk, I'll also call out the strategies you can use to mitigate these risks. You ready? Let's start with the falsehoods, often referred to as "AI hallucinations". Quick sidebar -- I really don't like the word "hallucinations" because I fear it anthropomorphizes AI. I'll explain it a bit. Okay, you've probably heard the news reports of large language models claiming they're human, or claiming they have emotions, or just stating things that are factually wrong. What's actually going on here? Well, large language models predict the next best syntactically correct word, not accurate answers based on understanding of what the human is actually asking for. Which means it's going to sound great, but might be 100% wrong in its answer. This wrong answer is a statistical error. Let's take a simple example. Who authored the poems A, B, C? Let's say they were all authored by the poet X, but there's one source claiming it was the author Z. We have conflicting sources in the training data. Which one actually wins the argument? Even worse, there may not be a disagreement at all, but again, a statistical error. The response could very well be incorrect because again, the large language models do not understand meaning; these inaccuracies can be exceptionally dangerous. It's even more dangerous when you have large language models annotate its sources for totally bogus answers. Why? Because it gives the perception it has proof when it just doesn't have any. Imagine a call center that has replaced its personnel with a large language model, and it offers a factually wrong answer to a customer. Right, here's your factually wrong answer. Now, imagine how much angrier this customer will be when they can't actually offer a correction via a feedback loop. This brings us to our first mitigation strategy: Explainability. Now, you could offer inline explainability and pair a large language model with the system that offered real data and data lineage and provenance via a knowledge graph. Why did the model say what it just said? Where did it pull its data from? Which sources? The large language model could provide variations on the answer that was offered by the knowledge graph. Next risk: Bias. Do not be surprised if the output for your original query only lists white male Western European poets. Want a more representative answer? Your prompt would have to say something like, "Can you please give me a list of poets that include women and non-Western Europeans?" Don't expect the large language model to learn from your prompt. This brings us to the second mitigation strategy: Culture and Audits. Okay, culture is what people do when no one is looking. It starts with approaching this entire subject with humility, as there is so much that has to be learned and even, I would say, unlearned. You need teams that are truly diverse and multidisciplinary in nature working on AI because AI is a great mirror into our own biases. Let's take the results of our audits of AI models and make corrections to our own organizational culture when there are disparate outcomes. Audit pre-model deployment as well as post-model deployment. Okay, next risk is Consent. Is the data that you are curating representative? Was it gathered with consent? Are there copyright issues? Right! Here's a little copyright symbol. These are things we can and should ask for. This should be included in an easy to find, understandable fact sheet. Oftentimes we subjects, we have no idea where the heck the training data came from these large language models. Where we were that gathered from? Did the developers hoover the dark recesses of the Internet? To mitigate consent-related risk, we need combined efforts of auditing and accountability. Right! Accountability includes establishing AI governance processes, making sure you are compliant to existing laws and regulations, and you're offering ways for people to have their feedback incorporated. Now on to the final risk, Security. Large language models could be used for all sorts of malicious tasks, including leaking people's private information, helping criminals phish, spam, scam. Hackers have gotten AI models to change their original programming, endorsing things like racism, suggesting people do illegal things. It's called jailbreaking. Another attack is an indirect prompt injection. That's when a third party alters a website, adding hidden data to change the AI's behavior. The result? Automation relying on AI potentially sending out malicious instructions without you even being aware. This brings us to our final mitigation strategy, and the one that actually pulls all of this together, and that is education. All right, let me give you an example. Training a brand new large language model produces as much carbon as over a 100 roundtrip flights between New York and Beijing. I know, crazy, right? This means it's important that we know the strengths and weaknesses of this technology. It means educating our own people on principles for the responsible curation of AI, the risks, the environmental cost, the safeguard rails, as well as what the opportunities are. Let me give you another example of where education matters. Today, some tech companies are just trusting that large language models training data has not been maliciously tampered with. I can buy a domain myself now and fill it with bogus data. By poisoning the dataset with enough examples, you could influence a large language model's behavior and outputs forever. This tech isn't going anywhere. We need to think about the relationship that we ultimately want to have with AI. If we're going to use it to augment human intelligence, we have to ask ourselves the question: What is the experience like of a person who has been augmented? Are they indeed empowered? Help us make education about the subject of data and AI far more accessible and inclusive than it is today. We need more seats at the table for different kinds of people with varying skill sets working on this very, very important topic. Thank you for your time.

Info

Channel: IBM Technology

Views: 41,946

Rating: undefined out of 5

Keywords: IBM Cloud, LLM, Large Language Models, AI Ethics, ChatGPT, Chat GPT, GPT, (Generative Pretrained Transformer, chatbot hallucinations

Id: r4kButlDLUc

Channel Id: undefined

Length: 8min 25sec (505 seconds)

Published: Fri Apr 14 2023