So I've been an AI researcher
for over a decade. And a couple of months ago,
I got the weirdest email of my career. A random stranger wrote to me saying that my work in AI
is going to end humanity. Now I get it, AI, it's so hot right now. (Laughter) It's in the headlines
pretty much every day, sometimes because of really cool things like discovering new
molecules for medicine or that dope Pope
in the white puffer coat. But other times the headlines
have been really dark, like that chatbot telling that guy
that he should divorce his wife or that AI meal planner app
proposing a crowd pleasing recipe featuring chlorine gas. And in the background, we've heard a lot of talk
about doomsday scenarios, existential risk and the singularity, with letters being written
and events being organized to make sure that doesn't happen. Now I'm a researcher who studies
AI's impacts on society, and I don't know what's going
to happen in 10 or 20 years, and nobody really does. But what I do know is that there are some
pretty nasty things going on right now, because AI doesn't exist in a vacuum. It is part of society, and it has impacts
on people and the planet. AI models can contribute
to climate change. Their training data uses art
and books created by artists and authors without their consent. And its deployment can discriminate
against entire communities. But we need to start tracking its impacts. We need to start being transparent
and disclosing them and creating tools so that people understand AI better, so that hopefully future
generations of AI models are going to be more
trustworthy, sustainable, maybe less likely to kill us,
if that's what you're into. But let's start with sustainability, because that cloud that AI models live on
is actually made out of metal, plastic, and powered by vast amounts of energy. And each time you query an AI model,
it comes with a cost to the planet. Last year, I was part
of the BigScience initiative, which brought together
a thousand researchers from all over the world to create Bloom, the first open large language
model, like ChatGPT, but with an emphasis on ethics,
transparency and consent. And the study I led that looked
at Bloom's environmental impacts found that just training it
used as much energy as 30 homes in a whole year and emitted 25 tons of carbon dioxide, which is like driving your car
five times around the planet just so somebody can use this model
to tell a knock-knock joke. And this might not seem like a lot, but other similar large language models, like GPT-3, emit 20 times more carbon. But the thing is, tech companies
aren't measuring this stuff. They're not disclosing it. And so this is probably
only the tip of the iceberg, even if it is a melting one. And in recent years we've seen
AI models balloon in size because the current trend in AI
is "bigger is better." But please don't get me started
on why that's the case. In any case, we've seen large
language models in particular grow 2,000 times in size
over the last five years. And of course, their environmental
costs are rising as well. The most recent work I led
found that switching out a smaller, more efficient model
for a larger language model emits 14 times more carbon
for the same task. Like telling that knock-knock joke. And as we're putting these models
into cell phones and search engines and smart fridges and speakers, the environmental costs
are really piling up quickly. So instead of focusing on some
future existential risks, let's talk about current tangible impacts and tools we can create to measure
and mitigate these impacts. I helped create CodeCarbon, a tool that runs in parallel
to AI training code that estimates the amount
of energy it consumes and the amount of carbon it emits. And using a tool like this can help us
make informed choices, like choosing one model over the other
because it's more sustainable, or deploying AI models
on renewable energy, which can drastically reduce
their emissions. But let's talk about other things because there are other impacts of AI
apart from sustainability. For example, it's been really
hard for artists and authors to prove that their life's work
has been used for training AI models without their consent. And if you want to sue someone,
you tend to need proof, right? So Spawning.ai, an organization
that was founded by artists, created this really cool tool
called “Have I Been Trained?” And it lets you search
these massive data sets to see what they have on you. Now, I admit it, I was curious. I searched LAION-5B, which is this huge data set
of images and text, to see if any images of me were in there. Now those two first images, that's me from events I've spoken at. But the rest of the images,
none of those are me. They're probably of other
women named Sasha who put photographs of
themselves up on the internet. And this can probably explain why, when I query an image generation model to generate a photograph
of a woman named Sasha, more often than not
I get images of bikini models. Sometimes they have two arms, sometimes they have three arms, but they rarely have any clothes on. And while it can be interesting
for people like you and me to search these data sets, for artists like Karla Ortiz, this provides crucial evidence
that her life's work, her artwork, was used for training AI models
without her consent, and she and two artists
used this as evidence to file a class action lawsuit
against AI companies for copyright infringement. And most recently -- (Applause) And most recently Spawning.ai
partnered up with Hugging Face, the company where I work, to create opt-in and opt-out mechanisms
for creating these data sets. Because artwork created by humans
shouldn’t be an all-you-can-eat buffet for training AI language models. (Applause) The very last thing I want
to talk about is bias. You probably hear about this a lot. Formally speaking, it's when AI models
encode patterns and beliefs that can represent stereotypes
or racism and sexism. One of my heroes, Dr. Joy Buolamwini,
experienced this firsthand when she realized that AI systems
wouldn't even detect her face unless she was wearing
a white-colored mask. Digging deeper, she found
that common facial recognition systems were vastly worse for women of color
compared to white men. And when biased models like this
are deployed in law enforcement settings, this can result in false accusations,
even wrongful imprisonment, which we've seen happen
to multiple people in recent months. For example, Porcha Woodruff
was wrongfully accused of carjacking at eight months pregnant because an AI system
wrongfully identified her. But sadly, these systems are black boxes, and even their creators can't say exactly
why they work the way they do. And for example, for image
generation systems, if they're used in contexts
like generating a forensic sketch based on a description of a perpetrator, they take all those biases
and they spit them back out for terms like dangerous criminal,
terrorists or gang member, which of course is super dangerous when these tools are deployed in society. And so in order to understand
these tools better, I created this tool called
the Stable Bias Explorer, which lets you explore the bias
of image generation models through the lens of professions. So try to picture
a scientist in your mind. Don't look at me. What do you see? A lot of the same thing, right? Men in glasses and lab coats. And none of them look like me. And the thing is that we looked at all these
different image generation models and found a lot of the same thing: significant representation
of whiteness and masculinity across all 150 professions
that we looked at, even when compared to the real world, the US Bureau of Labor Statistics. These models show lawyers as men, and CEOs as men,
almost 100 percent of the time, even though we all know
not all of them are white and male. And sadly, my tool hasn't been used
to write legislation yet. But I recently presented it
at a UN event about gender bias as an example of how we can make tools
for people from all walks of life, even those who don't know how to code, to engage with and better understand AI
because we use professions, but you can use any terms
that are of interest to you. And as these models are being deployed, are being woven into the very
fabric of our societies, our cell phones, our social media feeds, even our justice systems
and our economies have AI in them. And it's really important
that AI stays accessible so that we know both how it works
and when it doesn't work. And there's no single solution
for really complex things like bias or copyright or climate change. But by creating tools
to measure AI's impacts, we can start getting an idea
of how bad they are and start addressing them as we go. Start creating guardrails
to protect society and the planet. And once we have this information, companies can use it in order to say, OK, we're going to choose this model
because it's more sustainable, this model because it respects copyright. Legislators who really need
information to write laws can use these tools to develop
new regulation mechanisms or governance for AI
as it gets deployed into society. And users like you and me
can use this information to choose AI models that we can trust not to misrepresent us
and not to misuse our data. But what did I reply to that email that said that my work
is going to destroy humanity? I said that focusing
on AI's future existential risks is a distraction from its current, very tangible impacts and the work we should be doing
right now, or even yesterday, for reducing these impacts. Because yes, AI is moving quickly,
but it's not a done deal. We're building the road as we walk it, and we can collectively decide
what direction we want to go in together. Thank you. (Applause)