OK. So it's my great
pleasure to open the second half of our
program with another three talks from my faculty
colleagues, the first of whom is Mert Demirer. Mert, the floor is yours. [APPLAUSE]

So, today, I'm going to
talk about the productivity effects of generative AI. And I'm going to present
some initial results of an experiment
we are currently running with some
software developers using GitHub Copilot. And I have a bunch of
coauthors on this project. So generative AI is
transforming industries. And it is now clear
that it's going to have a significant impact
on the future of work and labor markets. But when you think
about generative AI, I think it seems different
than previous automation technologies. Previous automation
technologies mostly replaced low-skilled workers. And, if anything, they
augmented high-skilled workers. But generative AI is
different because it is about information. It is about processing
information. And it's about making decisions. So, in that sense, it is
more about knowledge workers and high-skilled workers. And early evidence is already
showing that exposure to generative AI is positively
correlated with salaries and education level. And among these knowledge
workers, software developers are early adopters
of generative AI. And they can offer
a leading indicator for the future of work and offer
lessons for other industries. But let me tell you about the
evolution of the LLM-based coding assistants that software
developers are currently using. It started with Codex, which
is a GPT-3-based model. OpenAI trained this
model using millions of public GitHub repositories. And Codex
was turned into a product, GitHub Copilot, in 2021. After that, we have seen
additional tools developed by many different companies. Amazon launched CodeWhisperer. Replit launched Ghostwriter. Then Google recently
introduced Codey. And now we also have
GitHub Copilot Enterprise. So we see more
and more tools being used by software developers. And these tools have been
widely adopted so far. For example, there are one
million paid individual users of GitHub Copilot and 37
enterprise subscribers. And to remind you when
ChatGPT was launched, it was launched in late 2022. So even a year
before ChatGPT, we had this GitHub Copilot
tool, which was widely used by software developers. So we have a relatively long
history of generative AI and AI-based tools for
software developers. And that's what
we want to study. So let me tell you what these
coding assistant tools do. A software engineer downloads
a coding assistant tool-- let's say, GitHub Copilot-- and then starts writing the
code in their preferred language and framework. GitHub Copilot reads the code
and provides some suggestions. And this could be
a line of code, this could be code snippets, or
it could be an entire function. The developer, while coding,
sees these suggestions, reviews them,
and either approves them or not. And if these suggestions
are accepted, then they are incorporated
into the code. So what is the benefit of this
tool for software developers? First, to the extent
that it completes the code, it's going to reduce the
number of keystrokes. It's going to substitute for the
need to go online and search for different functions. It's going to write
documentation. It's going to save time
for software developers. Moreover, it can improve
the quality of the code. It can suggest a new way
of coding that the software developer is not familiar with. There are, of course,
some potential concerns. These suggestions
could be incorrect. If developers blindly
accept these suggestions, the quality might worsen. And, of course, for
enterprise customers, there are open-source and
security implications. So these tools do,
actually, more than this, but this is a simpler way
to describe what they do and how software developers
interact with these tools. OK. So what we do is,
we want to understand how productive software
developers become when they use these tools. To do that, we run a field experiment
with 400 professional software developers. These are all full-time
Accenture employees working on a variety of
software development projects. These are all
located in East Asia. And we studied them in their
natural work environment. So that's, I think, the
first field experiment with software developers. There have been several
lab experiments giving software developers a task: OK, you use GitHub Copilot. You don't use GitHub Copilot. What is the effect? What we are doing here is,
we only do an intervention. We only introduce GitHub
Copilot and do nothing else. We don't give them a task. We study them in their
natural working environment. And we think this is important
because the results from a lab experiment in a
controlled environment might be different from
those in developers' natural environment. So in this
experiment, we started with 400 software developers. We randomly assigned
these developers into two groups, 200 treated
developers and 200 control developers. The treated group, they became
eligible to use GitHub Copilot. We sent them an email
saying that you are eligible to use GitHub Copilot. And we also provided
them some training to teach them how
to use these tools. The control group,
200 developers, they didn't have access
to GitHub Copilot, though they could
use different tools. They could use, for
example, ChatGPT. So the only difference
between these two groups is that one has access
to GitHub Copilot, and the other 200 do not have
access to GitHub Copilot. So after the experiment started, we
developed many activity metrics from the software
developers, such as number of pull requests, number of
commits, number of builds. I'm not going to go into
details of what these are, but these are all output
metrics of software developers in their
software-development process. So we started the
experiment in July 2023. And, currently, we
have three months of data from this experiment. And I'm going to show you the
results from the experiment. So, first, I wanted to
show you the adoption rate of GitHub Copilot
in the treated group, so what fraction of
eligible software developers are actually using
GitHub Copilot. So we see that the adoption
rate after three months is around 60%. And we see a slow
and gradual adoption. So in the first month, we
only have 30% of developers using GitHub Copilot. And over time, this increased
and converged to 60%. I think this slow and
not universal adoption was a bit surprising to
me because I was expecting that software
developers would use this tool, given
the hype around LLMs and different tools like ChatGPT. And I think this is an
interesting question in and of itself. Like, if a software
developer doesn't adopt this, what is the reason? What are the main barriers? OK. So we collect the data from
August 2022 to October 2023. So we have one year of
pre-experiment data and three months of data after the experiment started. The experiment is still
ongoing, so we are currently collecting data. I will show you the
initial results today, which will be updated
as we collect more data from these software developers. So what we do is, we follow
an event study design. We compare the
change in the output in the treated group with
the change in the output in the control group. So we compare the change in the
output of the eligible software developers who use GitHub
Copilot with that of the control group, who don't use GitHub Copilot. We have many output metrics,
including weekly pull requests, number of commits, and
some other output metrics. But we don't observe
quality, which might be important for
software development. And we don't take into
account team production. Sometimes many
software developers work on the same
project together, which can lead to peer effects
or a different allocation of tasks. Currently, we are not
speaking to those issues. OK. So let me show you our main
result from this experiment. So this shows the weekly number
of builds over time for developers who use
GitHub Copilot and developers in the control group. We have one year of data before
the experiment and three months of data after the experiment started. Before the experiment
started, we see that these two groups
follow a similar pattern. This is because these two
groups are randomly chosen. So this is ensured
by the experimental design. And after the experiment starts,
we see a huge increase in the number of builds
for the group that uses GitHub Copilot. So there's a sharp jump,
and then it comes down, but it remains higher than
in our control group. So this suggests that there
is some potential productivity increase from GitHub Copilot. Software engineers who
use GitHub Copilot, they produce more output. So in order to put some
numbers on this result, we compare the change
in [INAUDIBLE] output of the treated and
the control group. So we ask by what
percent the treated group is more productive,
relative to the control group. And we use three
different outcomes-- total number of builds, total
pull requests, and total commits. Overall, I think the
results are slightly different in terms of magnitude, depending
on the activity metric. We see that the total number
of builds increases by 50%, and the total number of pull
requests increases by 20%. And there is no statistically
significant effect on total commits. So, overall, I
think these results suggest that there
is some productivity increase for software developers
when they use GitHub Copilot. But it is, of course,
important to understand why these different metrics
provide different numbers. OK. So this was the main
result of the paper. And as I said, we are
still collecting data. And we are going to
have more results as we collect more data. s let me summarize
what we found so far. We found that coding
assistant tools raise the productivity
of software engineers, based on the evidence
from this experiment. And we are currently
running surveys to better understand the
developer experience. So this is the first
field experiment with software developers. And these results
confirm the results from the lab experiments. In general, the findings
from lab experiments typically point to a
30% to 50% increase in productivity. And our results are
consistent with that. So some other important
considerations: it is important to
remember that it is early days for these tools. They are going to
certainly improve, and the productivity
effect could also change. And I think another interesting
question-- at least to me-- is that the adoption
rate is only 60%. 40% of the engineers
were eligible to use this tool at no cost, but
they didn't adopt it. So it is important to
understand whether there are any barriers to
adopting these tools into their workflow. OK. So let me conclude
my presentation with the implications of
this study for the labor market. So I think the first and the
most important question is, will these tools replace
human software developers? I think my answer is, unlikely. There is still growing demand
for software developers with new skills. So even if these software
developers become 50% more productive, there are
always more advanced tasks to work on for these
software developers. I don't think these
tools are going to replace software developers. The crucial question is, what
is the joint production function between software developers
and coding assistant tools, generative AI? How do they interact
with each other when they work on a project? And, in particular, whether
generative AI is a substitute for or a complement to the
work previously done by humans is the most
important question. Because it's going to tell us
to what extent these tools are going to augment the
software engineers versus have the
potential to replace them. And, finally, in terms of
the policy implications, the most important
question is, who will benefit from these
tools and whether there is any role for policy, in
terms of providing training to software engineers to
utilize these tools as much as possible. Thank you. [APPLAUSE]