So I'm excited to share a few spicy
thoughts on artificial intelligence. But first, let's get philosophical by starting with this quote by Voltaire, an 18th century Enlightenment philosopher, who said, "Common sense is not so common." Turns out this quote
couldn't be more relevant to artificial intelligence today. Despite that, AI
is an undeniably powerful tool, beating the world-class "Go" champion, acing college admission tests
and even passing the bar exam. I’ve been a computer scientist for 20 years, and I work on artificial intelligence. I am here to demystify AI. So AI today is like a Goliath. It is literally very, very large. It is speculated that the recent ones
are trained on tens of thousands of GPUs and a trillion words. Such extreme-scale AI models, often referred to as "large
language models," appear to demonstrate sparks of AGI, artificial general intelligence. Except when it makes
small, silly mistakes, which it often does. Many believe that whatever
mistakes AI makes today can be easily fixed with brute force, bigger scale and more resources. What possibly could go wrong? So there are three immediate challenges
we face already at the societal level. First, extreme-scale AI models
are so expensive to train, and only a few tech companies
can afford to do so. So we already see
the concentration of power. But what's worse for AI safety, we are now at the mercy
of those few tech companies because researchers
in the larger community do not have the means to truly inspect
and dissect these models. And let's not forget
their massive carbon footprint and the environmental impact. And then there are these additional
intellectual questions. Can AI, without robust common sense,
be truly safe for humanity? And is brute-force scale
really the only way and even the correct way to teach AI? So I’m often asked these days whether it's even feasible
to do any meaningful research without extreme-scale compute. And I work at a university
and nonprofit research institute, so I cannot afford a massive GPU farm
to create enormous language models. Nevertheless, I believe
that there's so much we need to do and can do to make
AI sustainable and humanistic. We need to make AI smaller,
to democratize it. And we need to make AI safer
by teaching human norms and values. Perhaps we can draw an analogy
from "David and Goliath," here, Goliath being
the extreme-scale language models, and seek inspiration from
an old-time classic, "The Art of War," which tells us, in my interpretation, know your enemy, choose your battles,
and innovate your weapons. Let's start with the first,
know your enemy, which means we need
to evaluate AI with scrutiny. AI is passing the bar exam. Does that mean that AI
is robust at common sense? You might assume so, but you never know. So suppose I left five clothes
to dry out in the sun, and it took them five hours
to dry completely. How long would it take to dry 30 clothes? GPT-4, the newest, greatest
AI system, says 30 hours. Not good. A different one: I have a 12-liter jug and a six-liter jug, and I want to measure six liters. How do I do it? Just use the six-liter jug, right? GPT-4 spits out some
very elaborate nonsense. (Laughter) Step one, fill the six-liter jug. Step two, pour the water from the six-liter into the 12-liter jug. Step three, fill the six-liter jug again. Step four, very carefully, pour the water from the six-liter into the 12-liter jug. And finally you have six liters of water in the six-liter jug, which should be empty by now. (Laughter) OK, one more. Would I get a flat tire
by bicycling over a bridge that is suspended over nails,
screws and broken glass? Yes, highly likely, GPT-4 says, presumably because it cannot
correctly reason that if a bridge is suspended
over the nails, screws and broken glass, then the surface of the bridge
doesn't touch the sharp objects directly. OK, so how would you feel
about an AI lawyer that aced the bar exam yet randomly fails at such
basic common sense? AI today is unbelievably intelligent
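The reasoning the model misses here is simple. A minimal sketch, assuming the clothes dry in parallel, with enough room in the sun for all of them at once:

    # The drying puzzle: drying happens in parallel, so the time does not
    # depend on how many clothes there are (assuming they all fit in the sun).
    def drying_time(num_clothes: int, hours_for_five: float = 5.0) -> float:
        return hours_for_five  # 30 clothes still take about 5 hours, not 30

    # The jug puzzle: one of the jugs already holds exactly six liters,
    # so the "measuring" takes a single step.
    def measure_six_liters(jug_sizes=(12, 6), target=6):
        if target in jug_sizes:
            return [f"fill the {target}-liter jug"]
        raise NotImplementedError("only the trivial case from the talk is handled")

    print(drying_time(30))       # 5.0
    print(measure_six_liters())  # ['fill the 6-liter jug']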
and then shockingly stupid. (Laughter) It is an unavoidable side effect
of teaching AI through brute-force scale. Some scale optimists might say,
“Don’t worry about this. All of these can be easily fixed
by adding similar examples as yet more training data for AI." But the real question is this. Why should we even do that? You are able to get
the correct answers right away without having to train yourself
with similar examples. Children do not even read
a trillion words to acquire such a basic level
of common sense. So this observation leads us
to the next wisdom, choose your battles. So what fundamental questions
should we ask right now and tackle today in order to overcome
this status quo with extreme-scale AI? I'll say common sense
is among the top priorities. So common sense has been
a long-standing challenge in AI. To explain why, let me draw
an analogy to dark matter. So only five percent
of the universe is normal matter that you can see and interact with, and the remaining 95 percent
is dark matter and dark energy. Dark matter is completely invisible, but scientists speculate that it's there
because it influences the visible world, even including the trajectory of light. So for language, the normal matter
is the visible text, and the dark matter is the unspoken
rules about how the world works, including naive physics
and folk psychology, which influence the way
people use and interpret language. So why is this common sense
even important? Well, in a famous thought experiment
proposed by Nick Bostrom, AI was asked to produce
and maximize paper clips. And that AI decided to kill humans to utilize them as additional resources, to turn you into paper clips. Because AI didn't have a basic understanding of human values. Now, writing a better
objective and equation that explicitly states:
“Do not kill humans” will not work either because AI might go ahead
and kill all the trees, thinking that's a perfectly
OK thing to do. And in fact, there are
endless other things that AI obviously shouldn’t do
while maximizing paper clips, including: “Don’t spread fake news,” “Don’t steal,” “Don’t lie,” all of which are part of our commonsense understanding of how the world works.
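To make that concrete, here is a toy sketch, not taken from the talk or from Bostrom, of why patching the objective one rule at a time cannot work; the state keys below are invented for illustration:

    # A toy "paper clip" objective with hand-written patches. The point is not
    # the code but the pattern: every patch covers one harm and misses the rest.
    def patched_objective(state: dict) -> float:
        score = state["paper_clips"]        # original goal: maximize paper clips
        if state["humans_killed"] > 0:      # patch 1: "Do not kill humans"
            return float("-inf")
        if state["trees_killed"] > 0:       # patch 2: also "Do not kill all the trees"
            return float("-inf")
        # patches 3, 4, 5, ...: "Don't spread fake news," "Don't steal," "Don't lie," ...
        # The list of things the AI obviously shouldn't do never ends; that unwritten
        # remainder is exactly the commonsense understanding at issue here.
        return score

    print(patched_objective({"paper_clips": 10, "humans_killed": 0, "trees_killed": 0}))  # 10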
However, the AI field has for decades considered common sense a nearly impossible challenge. So much so that when my students
and colleagues and I started working on it several years ago,
we were very much discouraged. We’ve been told that it’s a research topic of the ’70s and ’80s; that we shouldn’t work on it because it will never work; in fact, that we shouldn't even say the word if we want to be taken seriously. Now fast-forward to this year, and I’m hearing: “Don’t work on it
because ChatGPT has almost solved it.” And: “Just scale things up
and magic will arise, and nothing else matters.” So my position is that giving
true common sense, human-like robust common sense, to AI is still a moonshot. And you don’t reach the Moon by making the tallest building in the world one inch taller at a time. Extreme-scale AI models do acquire an ever-increasing amount
of commonsense knowledge, I'll give you that. But remember, they still stumble
on trivial problems that even children can solve. So AI today is awfully inefficient. And what if there is an alternative path, or a path yet to be found? A path that can build on the advancements
of the deep neural networks, but without going so extreme
with the scale. So this leads us to our final wisdom: innovate your weapons. In the modern-day AI context, that means innovate
your data and algorithms. OK, so there are, roughly speaking,
three types of data that modern AI is trained on: raw web data, crafted examples
custom developed for AI training, and then human judgments, also known as human
feedback on AI performance. If the AI is only trained
on the first type, raw web data, which is freely available, it's not good because this data
is loaded with racism and sexism and misinformation. So no matter how much of it you use,
garbage in and garbage out. So the newest, greatest AI systems are now powered with the second
and third types of data that are crafted and judged
by human workers. It's analogous to writing specialized
textbooks for AI to study from and then hiring human tutors
to give constant feedback to AI. These are proprietary data, by and large, speculated to cost
tens of millions of dollars. We don't know what's in this data, but it should be open and publicly available so that we can inspect it and ensure that it supports diverse norms and values.
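As a rough, schematic sketch of how those three data types are typically combined, with every function below a stand-in rather than any particular lab's recipe:

    # Type 1: next-word prediction over huge amounts of scraped, noisy web text.
    def pretrain(raw_web_text):
        return {"stage": "pretrained", "web_documents": len(raw_web_text)}

    # Type 2: the "specialized textbooks": (prompt, ideal answer) pairs written for the AI.
    def supervised_finetune(model, crafted_examples):
        return {**model, "stage": "fine-tuned", "crafted_examples": len(crafted_examples)}

    # Type 3: the "human tutors": ratings of model outputs used as a training signal.
    def learn_from_feedback(model, human_judgments):
        return {**model, "stage": "aligned", "human_judgments": len(human_judgments)}

    model = pretrain(["<web page 1>", "<web page 2>"])
    model = supervised_finetune(model, [("prompt", "ideal answer")])
    model = learn_from_feedback(model, [("prompt", "model answer", 1)])
    print(model)  # which stages ran, and how much of each data type went in

The proprietary, expensive part is mostly the second and third types, which is why openness and inspection matter so much here.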
my teams at UW and AI2 have been working
on commonsense knowledge graphs as well as moral norm repositories to teach AI basic commonsense
norms and morals. Our data is fully open so that anybody
can inspect the content and make corrections as needed because transparency is the key
for such an important research topic. Now let's think about learning algorithms. No matter how amazing
large language models are, by design they may not be best suited to serve as reliable knowledge models. And these language models do acquire a vast amount of knowledge, but they do so as a byproduct, as opposed to a direct learning objective. That results in unwanted side effects such as hallucinations and a lack of common sense. Now, in contrast, human learning is never
about predicting which word comes next, but it's really about making
sense of the world and learning how the world works. Maybe AI should be taught
that way as well.
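To make that contrast concrete, here is a deliberately tiny stand-in for the "predict which word comes next" signal. Real models use neural networks rather than counts, but the nature of the feedback is the same: which word tends to follow which, never whether a statement is true or makes sense.

    from collections import Counter, defaultdict

    corpus = ["fill the six liter jug", "the six liter jug holds water"]

    # The entire learning signal: counts of which word follows which.
    next_word = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            next_word[current][following] += 1

    def predict(word):
        """Return the most frequently seen continuation; no model of the world involved."""
        counts = next_word[word]
        return counts.most_common(1)[0][0] if counts else None

    print(predict("six"))  # 'liter', learned from co-occurrence, not from understanding jugs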
So as a quest toward more direct commonsense knowledge acquisition, my team has been investigating
potential new algorithms, including symbolic knowledge distillation, which can take a very large language model, shown here, that I couldn't fit onto the screen because it's too large, and crunch that down to much smaller
commonsense models using deep neural networks. And in doing so, we also generate,
algorithmically, human-inspectable, symbolic, commonsense
knowledge representation, so that people can inspect
and make corrections and even use it to train
other neural commonsense models.
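Schematically, and only as a sketch of the idea as described above, the distillation loop might look like this; every component here is an illustrative stub, not the actual system:

    from typing import Callable, List, NamedTuple

    class Triple(NamedTuple):
        """One symbolic, human-inspectable piece of commonsense knowledge."""
        event: str        # e.g., "X leaves clothes out in the sun"
        relation: str     # e.g., "as a result"
        inference: str    # e.g., "the clothes dry"

    def distill(large_lm: Callable[[str], List[Triple]],
                critic: Callable[[Triple], float],
                seed_events: List[str],
                threshold: float = 0.5) -> List[Triple]:
        """Generate candidate triples with the very large model, then keep only
        the ones a critic judges plausible. The surviving triples are the symbolic
        representation people can read, correct, and reuse to train smaller models."""
        candidates = [t for event in seed_events for t in large_lm(event)]
        return [t for t in candidates if critic(t) >= threshold]

    # Toy usage with stand-in components:
    toy_lm = lambda event: [Triple(event, "as a result", "the clothes dry")]
    toy_critic = lambda triple: 0.9
    print(distill(toy_lm, toy_critic, ["X leaves clothes out in the sun"]))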
More broadly, we have been tackling this seemingly impossible giant puzzle of common sense, ranging from physical, social and visual common sense to theory of mind, norms and morals. Each individual piece
may seem quirky and incomplete, but when you step back, it's almost as if these pieces
weave together into a tapestry that we call human experience
and common sense. We're now entering a new era in which AI is almost like
a new intellectual species with unique strengths and weaknesses
compared to humans. In order to make this powerful AI sustainable and humanistic, we need to teach AI
common sense, norms and values. Thank you. (Applause) Chris Anderson: Look at that. Yejin, please stay one sec. This is so interesting, this idea of common sense. We obviously all really want this
from whatever's coming. But help me understand. Like, so we've had this model
of a child learning. How does a child gain common sense apart from the accumulation of more input and some, you know, human feedback? What else is there? Yejin Choi: So fundamentally,
there are several things missing, but one of them is, for example, the ability to make hypothesis
and make experiments, interact with the world
and develop this hypothesis. We abstract away the concepts
about how the world works, and then that's how we truly learn, as opposed to today's language model. Some of them is really
not there quite yet. CA: You use the analogy
that we can’t get to the Moon by extending a building a foot at a time. But the experience
that most of us have had of these language models
is not a foot at a time. It's this sort of breathtaking acceleration. Are you sure, given the pace at which those things are going? Each next level seems to be bringing with it what feels kind of like wisdom and knowledge. YC: I totally agree that it's remarkable how much scaling things up really enhances the performance
across the board. So there's real learning happening due to the scale of the compute and data. However, there's a quality of learning
that is still not quite there. And the thing is, we don't yet know whether
we can fully get there or not just by scaling things up. And if we cannot, then there's
this question of what else? And then even if we could, do we like this idea of having very,
very extreme-scale AI models that only a few can create and own? CA: I mean, if OpenAI said, you know,
"We're interested in your work, we would like you to help
improve our model," can you see any way
of combining what you're doing with what they have built? YC: Certainly what I envision will need to build on the advancements
of deep neural networks. And it might be that there’s some
scale Goldilocks Zone, such that ... I'm not imagining that smaller is better either, by the way. It's likely that there's a right amount of scale, but beyond that, the winning recipe
might be something else. So some synthesis of ideas
will be critical here. CA: Yejin Choi, thank you
so much for your talk. (Applause)