Well, the question
we want to consider is, how will Large Language
Models, Generative Pre-trained Transformers, or GPTs,
impact cybersecurity? And who should we ask? I'd like to go to
the future, and I'd like to ask someone who
already knows the answer. And I may as well ask my
namesake, Una Chin-Riley. But she actually
wasn't available. That season's not out yet. So instead, you've got me. And I'm a 21st-century
contemporary researcher here at MIT. I have a generation-- if
you'll excuse the pun-- of experience in
machine learning. And lately, my group has
been doing a lot of work in applying machine
learning to cybersecurity. So how am I going to
answer this question? I'm going to give you
the bottom line up front. I think that LLMs and their support will be rapidly taken up by ambitious and talented application developers at global enterprises, who are going to produce better, faster, and smarter tools and services for the defensive side of things. And these tools are actually
going to be very impressive. They're going to be
naturally intuitive. They're going to be more automatic and reduce our reliance on manual effort. And they're going to reason impressively compared to where AI has brought us before. Where this is going to help is the environment in the security operations center, which is almost adrenaline-soaked because it has to handle detection and response for threats that are coming in very, very rapidly. And we need this because,
of course, the same boost that we gain from LLM-supported improvements to our defensive security tools and services is inevitably going to be taken up by the adversaries on the other side of the fence. The alleviation we
will get from LLMs is going to be matched
by the aggravation that we get from them. What we're going to
see from threat actors is threats that will
tie us in knots, that will have additional speed,
that will be more relentless, and that will be even
smarter than before. So let me elaborate. But wait. Actually, I'm going
to stop myself because the original question
was, who should we ask? And come on. Here I am. Who do you think we
should ask how LLMs will influence cybersecurity? It's obvious, right? Let's ask GPT-4. So in preparing my talk, I was
chatting with postdoc Steve in my group. And he said he had just asked
GPT-4 a similar question. So I'm going to give you
a view of the interaction that Steve prepared. And I should say before I show
you the prompt that I consider Steve to be a prompt whisperer. He's been working with my
group for over six months on supporting cybersecurity,
threat actor modeling, and threat defenses using LLMs. And he knows the
magic that it takes. And when I show you this prompt,
you might see some of that because it's a little bit wordy. And that's because, remember,
LLMs are general purpose. And if we want to have
them do something for us, then we need to give them
context and point them to the topic we want
them to respond to. So here's the prompt. You are participating in
a roundtable discussion. You will be prompted with
cybersecurity questions related to AI. You are to respond to the
prompt and answer appropriately. Make your responses
concise and direct. Do not mention that you're
an AI language model. And your responses should sound
in the tone of an opinion. So in what ways can LLMs
help alleviate or aggravate cybersecurity problems?
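To make the setup concrete for readers of this transcript, here is a minimal sketch of how a prompt like this might be sent to GPT-4, assuming the OpenAI Python client; the model name and the split into system and user messages are assumptions for illustration, not a record of exactly how Steve ran it.

```python
# A sketch of one way to send a prompt like Steve's to GPT-4, using the
# OpenAI Python client. The model name and the system/user message split
# are assumptions for illustration, not necessarily how Steve ran it.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

system_prompt = (
    "You are participating in a roundtable discussion. You will be "
    "prompted with cybersecurity questions related to AI. You are to "
    "respond to the prompt and answer appropriately. Make your responses "
    "concise and direct. Do not mention that you're an AI language model. "
    "And your responses should sound in the tone of an opinion."
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": system_prompt},
        {
            "role": "user",
            "content": "In what ways can LLMs help alleviate or "
                       "aggravate cybersecurity problems?",
        },
    ],
)
print(response.choices[0].message.content)
```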
So this is the response we got. I've taken the first line, and I've made it the title of my slide. GPT-4 gave me a list of
alleviations and aggravations. And I've just put them
side by side verbatim on the slide for you to see. I don't expect you to
look at the detail. What I want to tell you is that
under the alleviation column, it hit everything I said
or was planning to say. And under the aggravation
column, maybe not so good. There are two things
that correspond to my mention of the defense and
the offense being correlated. But this item about
disinformation campaigns is out of scope. And the item about
adversarial attacks, if you study it a little
bit, it's off base. So being an academic,
if I want to grade GPT-4 on this response, it gets
a C. That's satisfactory. I would call it
satisfactory because it's startling with respect
to how much better it is than current technology. And I think that's part of it: the startle factor. We appreciate it now. But as we use these models more and more, that startling impact is going to go away. And I think we'll
see other things. I will remark that it's
incredibly convenient because I had all these ideas in my head. And the model's answer ordered them, itemized them, and actually reassured me
that my ideas were consistent. But if you start looking
at these a little bit more probingly, I
think what you'll find is that they become rote. I'd go so far as to say the responses are shallow. And that's because the model is not reflecting any sort of causal understanding of what it is in LLMs that gives rise to these alleviations and aggravations. And it doesn't connect the alleviations and aggravations the way I did, because it doesn't reflect any causal understanding of their relationship. The last point I want to bring
up is the most important, which is that this has lured us
into a sense of confidence that we have the
whole thing covered. But I had more things on
my list beyond my BLUF, my bottom line up front. And I'm going to tell you the
ones that GPT-4 didn't mention. So first, my team and I are
working on cyber hunting. And cyber hunting starts from a
lot of text and code knowledge that's out there on
the internet, compiled from cyber threat experience, including by our governments. And it basically looks for
threat actors in a system before they're detected. So it actually ups the defensive game by being a little more anticipatory.
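As a rough illustration of the idea, here is a toy sketch of one hunt step: matching host logs against indicators distilled from threat intelligence such as MITRE ATT&CK. Everything in it, the indicators, the technique mappings, and the log lines, is hypothetical; a real hunting pipeline is far richer.

```python
# A toy, hypothetical illustration of the hunting idea: scan host logs for
# indicators drawn from published threat intelligence. The indicator list,
# ATT&CK mappings, and log format are invented here for illustration only.

INDICATORS = {
    "mimikatz": "credential dumping (ATT&CK T1003)",
    "psexec": "lateral movement via remote services (ATT&CK T1021)",
    "certutil -urlcache": "ingress tool transfer (ATT&CK T1105)",
}

def hunt(log_lines):
    """Yield (log line, hypothesis) pairs for lines matching an indicator."""
    for line in log_lines:
        lowered = line.lower()
        for indicator, technique in INDICATORS.items():
            if indicator in lowered:
                yield line, technique

if __name__ == "__main__":
    sample_logs = [
        "2023-05-01 12:03:11 host7 cmd.exe spawned psexec.exe",
        "2023-05-01 12:04:02 host7 user login ok",
    ]
    for line, technique in hunt(sample_logs):
        print(f"possible {technique}: {line}")
```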
And this is one thing you would expect GPT-4 to think about. It was actually trained on all
that data I'm referring to. And yet it hasn't
given me that answer. And I think there's one
more thing I can mention, which is that when I connect
aggravation and alleviation, there's another
important impact. The current pace of
our cyber arms race is going to be
disrupted by LLMs. Consider our current defenses. That's on the y-axis here. They range from being
very strong, well resourced, to
unfortunately being weaker, perhaps out of date. And they face off with threat
actors that range from low competence, to mid-level competence (these are the ransomware people you really worry about), and up to state actors, who
have tremendous resources and are actually operating
on a different level. So there's a cyber arms
race between the defenses and the threat actors currently. What the LLMs are going
to do is they're going to float those boats higher. Now, you might say, OK, the defenses' boats will rise too. But what's important here is how fast each side moves, because that determines who gets to the top fastest and stays there longest. And this is a problem that we
all have to be concerned about. The answer from GPT-4 is missing something else, which is any acknowledgment that things are unknown, and any ability to think about the unknown. GPT-4 hasn't given any credit to human thought, including a dash of creativity, ingenuity, serendipity, and emergent thought processes, anything that provides the
potential for novelty. So, to summarize our answer from today, not the future: we know that LLMs will alleviate and aggravate. That's the GPT-4-level answer. But beyond that, we can
expect better cyber hunting and a rather worrisome arms
race with a dash of the unknown. Thank you. [APPLAUSE]