The era of the AI Copilot | KEY02H

Captions
[MUSIC]

ANNOUNCER: Please welcome Chief Technology Officer and Executive Vice President of AI, Kevin Scott.

KEVIN SCOTT: That was an amazing video, and thank you, Satya, for sharing it. It really is inspiring to see this technology getting diffused so quickly and having a real positive impact across the globe, not just in the urban innovation centers here in the United States and the capitals of the industrialized world. I'm so excited and happy to be with you all here today, in person at Build after a four-year hiatus. It goes without saying that a lot has changed in the world of technology over these past four years. One of the biggest changes, and it's the theme of this conference, is what has happened in the world of AI, even just in the past year, and what that means for you all as developers.

I wrote my first program as a developer in the early '80s, when I was 11 years old, I think. I remember what a thrilling moment that was, being able to do something for the first time that I didn't even realize was possible. I've been chasing that feeling my entire career, trying to find those moments where something impossible became possible, and then, as a developer, figuring out how I could participate in that change. The most exciting thing in the world to me, maybe the most exciting time I've experienced in my career, is what the power of AI is doing right now to help all of us have that moment: the ability to take something in our hands, look at what was impossible and is becoming possible now, and then go and do something great with it.

I'm going to spend the next half hour or so chatting with you all about some of the technological themes that are driving all of this great progress in AI that we're seeing. We're going to start with maybe the obvious thing. There's an incredible amount of attention being paid right now to the rapid progress of these AI models, these foundation models, as we're calling them now, and in particular to the rapid pace of innovation being driven by OpenAI and their partnership with Microsoft. We really are setting the pace of innovation in the field of AI right now. I think, even for us, it's been surprising to see how much of the zeitgeist is being captured by things like ChatGPT and the applications that people are building on top of these large foundation models.

The reason that this partnership between OpenAI and Microsoft has been so successful is that we really do have an end-to-end platform for building AI applications. We build the world's most powerful supercomputers. We have the world's most capable foundation models, whether hosted models that we built ourselves and make available to you all via API, or open-source models, which run great on Azure. We also have the world's best AI developer infrastructure, whether you are using these super-powerful computers to train your models from scratch or building the applications that we're going to be talking about at Build this year on top of that infrastructure. You're going to hear a ton about this platform today and tomorrow. Scott's keynote is right after mine; he's going to dive into detail on a bunch of this stuff, and then the breakout sessions are going to be amazing and will equip you all with the information you need to go do some pretty awesome stuff. This end-to-end platform starts with Azure. We really believe that Azure is the cloud for AI.
It's not just the amazing, technically complicated and brilliant work that our partners at OpenAI have done on top of all of this infrastructure. It's the things that the teams at Microsoft are doing to build our Copilot applications and our own advanced AI models, and it's also the things that our partners, some of you here in this room, are building on top of Azure, making Azure this amazing platform for doing the most ambitious flavors of AI in the world. But it's not just Azure; Windows, we believe, is the best client for AI development. You're going to see a bunch of that today, and Panos is going to dive into it pretty deeply tomorrow. Satya showed the Windows Copilot, which is going to be an amazing part of your productivity story, and GitHub Copilot works great on Windows. But increasingly, what you're going to see is the ability to run these powerful AI models on your Windows PC, so that you can develop true hybrid AI applications that span from the edge all the way to the cloud. It's just a really, really exciting thing.

But what I'm going to spend most of my talk discussing with you all is this idea of the Copilot. Satya has already referenced a whole bunch of the Copilots we've launched. As he said, it's almost as if we woke up on January the first and decided to do a whole bunch of press releases. But it's really been years of work, where we built a platform for building copilots that has enabled us to do these amazing releases, and today we are sharing with you some of the patterns that have helped us build copilots and opening up our platform so that you can build copilots of your own. Just to start with: a copilot, simply said, is an application that uses modern AI, has a conversational interface, and assists you with cognitive tasks. We're going to talk a lot about what that means later. We also believe that it must be an open ecosystem. One of the most important things we believe is that even though there are a whole bunch of copilots that Microsoft has built, maybe the most interesting copilots will be built by you all, using these powerful tools that you have on Azure, on Windows, and in the open-source community.

As we start talking about this, I would love to bring to the stage Greg Brockman, the President and Co-founder of OpenAI, to talk about his experiences building GPT-4, this powerful model that's powering a bunch of these copilots, and about ChatGPT, which may be the most interesting copilot in the world right now. Please join me on stage, Greg. [MUSIC]

GREG BROCKMAN: Awesome.

KEVIN SCOTT: Fantastic. Thank you so much for joining us here today, Greg. I wanted to start with the ChatGPT experience. I believe it's caught us all by surprise just how crazy the adoption of ChatGPT has been and how much interest there is. But it's a really big engineering challenge to build something like ChatGPT, so maybe you could talk a little bit with us about that.

GREG BROCKMAN: Yeah. ChatGPT was a really interesting process, both from an infrastructure perspective and an ML perspective. We'd actually been working on the idea of having a chat system for a number of years. We even demoed an early version called "WebGPT" at Build, and it was cool; it was a fun demo. We had a couple hundred contractors, literally people we had to pay to use the system. They were like, "It's kind of useful; it can kinda help with coding tasks."
But for me, the moment that really clicked was when we had GPT-4. The traditional process, through GPT-3, was that we just deployed the base model, which had only been pre-trained; we hadn't really tuned it in any direction, and that was what went into the API. For 3.5, we'd actually gotten to the point where we were doing instruction following, where contractors were given: here's an instruction, and here's how you're supposed to complete it. We did that training on GPT-4, and the thing that was so interesting was, just as a little experiment, I asked: well, what happens if you follow up with a second instruction after it has already generated something? The model provided a perfectly good response that incorporated everything from before. So you realized that this model was capable enough, that it had really generalized this idea of: well, if you really want me to follow instructions, you give me a new instruction; maybe you really want me to have a conversation. For me, that was the moment it clicked: okay, we have this infrastructure that's already in place from the earlier models, and this new model, even trained with a technique that wasn't meant for chat, wants to chat; it's going to work. So this was a real aha moment, and from there we were just like, we've got to get this thing out; it's going to work.

KEVIN SCOTT: Yeah. I think it was really surprising to me. I remember when Sam called me up and said, "Hey, we want to release this ChatGPT thing and we think it's going to be a few weeks' worth of work to condition one of these models." I was like, "Sure, why not?" I had no idea that it was going to work technically as well as it did and that it was going to be such a crazy success. Maybe related to that, I know that you are one of the principal architects of all of the infrastructure that was used to train GPT-4, which powers parts of ChatGPT. It has just really been a revelation for everyone who's been working in the field of AI. I wonder if you could share some of the interesting things that you found during the development of GPT-4?

GREG BROCKMAN: GPT-4 was very much a labor of love. As a company, after GPT-3, we'd actually had multiple failed attempts to surpass the performance of that model. It's not an easy thing. What we ended up doing was going back to the drawing board and rebuilding our entire infrastructure. A lot of the approach we took was: get every detail right. I'm sure that there are still bugs; I'm sure that there are still more details to be found. But an analogy from Yaakov, one of the leads on the project, that I really like is that it's almost like building a rocket, where you need all the engineering tolerances to be incredibly tiny. Lots of little details. For example, it turned out we had a bug in our checkpointing where, if you killed the job at exactly the wrong moment, you could end up with a blend of new weights and old weights when the job restarted. Machine learning mostly doesn't care; it's happy to recover from that. But it's one of those things where every time you see a weird wiggle in your graph, you wonder whether it was that particular issue or something else entirely. So you go back and really pay attention to every single detail and just do the boring engineering work; that is the main thing that I do.

KEVIN SCOTT: Yeah. Well, the boring engineering work you do is at just an unbelievable, phenomenal scale. But I do think that that's a good parable for everyone in the room.
It's sometimes the boring engineering work that really leads to success. So, Satya talked a little bit in his talk about this shared approach that we're developing for plugins, this idea that we're going to empower all of these folks in the room to write software that can extend the capability of things like ChatGPT and all of these copilots that we're building. I know that that also has been an interesting technical challenge; we still don't have all of the technical issues sorted out, and there's a lot of work left to do to get it into the state that we ultimately want it to be in, so I wonder if you have some thoughts you wanted to share on that.

GREG BROCKMAN: I love plugins. I think it's been a really amazing opportunity for every developer to leverage this technology in a way that just makes the system better for everyone. That's why it's so exciting. Part of the reason we designed it as an open standard was that, as a developer, you build this thing once and then any AI can use it. It's such a beautiful idea. With the web, part of what really drove it was that anyone could build a website and then everyone gets access to it; then you build an API and suddenly anyone can leverage it. I think this core design principle, of letting any developer who wants to plug in get the power of the system and bring all of the power of any domain into ChatGPT, is really amazing.

KEVIN SCOTT: Yeah, and the thing that I really love about plugins is that conceptually it's so simple. It reminds me a little bit of the first HTTP server that I ever wrote. If you understand the core concepts, you can stand up something very quickly that can do something very powerful, and I think that is an awesome thing as an engineer. In your role at OpenAI, you are constantly thinking about how to push the limits of the technology. I think one of the really amazing things about our partnership is that, working with you all, it feels like we get to see a little bit further into the future than we otherwise would be able to. I wonder if you could say a few things about what's exciting to you over the horizon, either with applications or with the models.

GREG BROCKMAN: The thing that's interesting to me is that we're almost on a bit of a tick-tock cycle, like Intel of yore, where you come up with an innovation and then you really push it. I think that with GPT-4 we're in that early stage of really pushing it. We have vision capabilities that have been announced but that we're still productionizing, and I think they'll just change how these systems work, how they feel, and the kinds of applications that can be built on top of them. If you look back at the history of the past couple of years, I think we did something like a 70 percent price reduction two years ago, and then this past year we did a 90 percent cost reduction. That's an intense cost drop; it's crazy. I think we're going to be able to do the same thing repeatedly with new models. So GPT-4 right now is expensive and it's not fully available, but that's one of the things that I think will change.

KEVIN SCOTT: That is the thing I would want to leave everyone here in the room with, and it's what we say to all of the developers inside of Microsoft building on top of these things: what's expensive today won't be tomorrow, because the progress there is so fantastic.
So, I think we've got time to squeeze one last thing in. You've already dispensed a bunch of really great advice for developers here in the room, but maybe there's one more thing you would leave the audience with.

GREG BROCKMAN: I think that in this field, the technology is getting better and better. But the thing that every developer can do, and that is hard for us and even for Microsoft at Microsoft's scale to do, is to really go into specific domains and figure out how to make this technology work there. Think of companies in the legal domain really building expertise, talking to lots of lawyers, and understanding what their pain points are with this technology. I think there's a huge amount of value to be added by the efforts of everyone.

KEVIN SCOTT: I think that's awesome. You heard it from Greg: you all are the ones who are going to make AI great. Thank you so much, Greg, for being with us here today, and thanks for all you're doing. Thank you very much.

One more interesting OpenAI thing: we have Andrej Karpathy here; I think I see Andrej in the front row. Andrej is going to be on this stage later today doing a "State of GPT" session, walking through the technology from beginning to end. It's going to be an awesome session and probably tight on seating, so try to get your spot early. You are not going to want to miss it.

Let's talk about Copilots. Satya mentioned a bunch of the Copilots that Microsoft and our partners have launched. We think of ChatGPT as fitting this Copilot pattern, along with Bing Chat, certainly GitHub Copilot, Microsoft Security Copilot, Microsoft 365 Copilot, Designer, and many, many more. The thing that we noticed as we were building these copilots, starting with GitHub Copilot several years ago, is that the idea of a copilot is actually pretty general. This notion that you're going to have a multi-turn, conversational, agent-like interface on your software that helps you do cognitively complex things applies to more than just helping someone do software development. That's what you've seen: we have search copilots, now we're going to have security copilots, we have productivity copilots, and we're going to have all of the copilots that you all build. The thing that we noticed at Microsoft is that we needed to look at what is common across all of these things so that we could understand how to design great user experiences, and what technology stack would empower us to deliver these things safely, responsibly, and cost-effectively at scale. The only reason we have been able to do this blitz of Copilot announcements and deliver these products to users so quickly is that we stopped and took the time and energy to build a Copilot technology stack that allows us to move quickly with safety. One of the things I want to talk with you about today is what that technology stack looks like.

But before we dive into the details, I think about Satya's reminder to us all: why do we do what we do? One of the important reasons we have taken the time to think about this Copilot stack as one coherent thing is that platforms are important. A platform gives us the opportunity to build things that are more ambitious than we otherwise would be able to build, and it gives you, the developers, a chance to build things that wouldn't be possible if the platform didn't exist. I love this quote from Bill Gates.
It may or may not be apocryphal, but it has been attributed to Bill for many, many years. What Bill is saying here is that the true value of a platform only materializes when the value created on that platform accrues to the people who are building on top of the platform, not to the platform builder itself. If that's not true of a platform, then it's really not a platform. The thing that makes platforms even greater than all of the value they can potentially produce is that they prevent folks from having to bear the burden of building very complicated things from the ground up just to build the application they want to go build. It's great if you want to build all of this stuff, if you want to be a platform company or an infrastructure company. But if what you want to do is build a legal copilot, like Greg was talking about, or a copilot for medicine, or a copilot for helping people get through their insurance claims, you are not going to want to build all of this stuff from the ground up. It would be economically infeasible. The amount of compute that we are investing in, and just the scale of all of that infrastructure, is absolutely astronomical. The fact that the things that come out of the other end of that compute, these foundation models and this entire platform, are reusable and generalizable is really a fantastic thing. One of the things we've been betting on for five years now is that this would be a durable property of these systems.

One of the things that you're going to hear a lot about at Build is this idea that the foundation models are powerful and getting more powerful, but they can't do everything. You shouldn't have to wait around until we train a model that can do the thing that you want to do. You should have ways to accommodate your application, to build your application on top of this technology, even when the model itself isn't complete or perfect. We're going to talk about a ton of ways that you can do that. Satya has already referenced plugins, and Greg and I chatted about plugins. Plugins are going to be one of those powerful mechanisms that you use to augment a copilot or an AI application so that it can do more than what the base platform allows. What a plugin does is augment your AI system so that it can access APIs, and via APIs it can do almost anything: change state in a digital system or retrieve information. For sure, people will use plugins to retrieve useful information; you've already seen some video demos of that, and you're going to hear a lot more about it. A plugin also allows you to perform arbitrary computations and to safely act on the user's behalf. Really, the way that we think about these plugins is that they're almost actuators of the digital world. Anything that you can imagine doing digitally, you can connect a copilot to via plugins, as the small sketch below illustrates.
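To make that plugin idea a little more concrete, here is a minimal, hypothetical sketch of how an orchestrator might let a model invoke a plugin. The `ACTION:` convention, the plugin registry, and the `get_weather` function are illustrative assumptions for this sketch only; the real Copilot and ChatGPT plugin mechanism is based on OpenAPI manifests rather than anything shown here.

```python
# Minimal, hypothetical sketch of plugin execution inside an orchestrator.
# The ACTION: convention and the plugin registry are illustrative only.
import json

def get_weather(city: str) -> str:
    """A stand-in plugin: a real plugin would call an external API here."""
    return json.dumps({"city": city, "forecast": "sunny", "high_c": 24})

# Plugins the orchestrator is willing to let the model invoke.
PLUGINS = {"get_weather": get_weather}

def maybe_execute_plugin(model_output: str) -> str:
    """If the model asked for a plugin (e.g. 'ACTION: get_weather{"city": "Seattle"}'),
    run it and return its result; otherwise pass the output through unchanged."""
    if not model_output.startswith("ACTION:"):
        return model_output
    call = model_output[len("ACTION:"):].strip()
    name, _, raw_args = call.partition("{")
    func = PLUGINS.get(name.strip())
    if func is None:
        return f"Unknown plugin: {name.strip()}"
    args = json.loads("{" + raw_args)   # parse the JSON arguments
    return func(**args)                 # act on the user's behalf

print(maybe_execute_plugin('ACTION: get_weather{"city": "Seattle"}'))
```

The point of the sketch is only the shape of the idea: the orchestrator decides which capabilities the model may call, and it executes them on the model's behalf.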
But what I'm going to spend the rest of this talk focusing on is the anatomy of a copilot. What does a copilot look like? What's common among all of these things that we've built, and what are the platform components that we're building to help you all build copilots of your own? It starts from the user experience: there are some things that are the same and some things that are different about building Copilot user experiences. There is an application architecture, and while some of it will be familiar, there is a bunch of new stuff to learn. Then, it is so important for all of us to think about safety and security. You'll inherit a lot of that by using the tools we've built for you, but it's something you need to think about from the very first steps of building your copilot applications.

I want to start with the thing that doesn't change when you're building a copilot: you have to build a great product. It is something that we sometimes forget. You have to understand what that unmet user need is, what it is that you are trying to make better, where you have a unique understanding of that thing that maybe no one else has, and then you need to apply the technology. Sure, the tech is great; it's making a whole bunch of things that were impossible, infeasible, or expensive become possible, easier, and cheaper. But it does not absolve any of us of the responsibility of thinking about what good product making looks like. One thing in particular that you have to bear in mind is that the model is not your product, unless you are an infrastructure company. The model itself is just infrastructure that enables your product; it isn't the thing in and of itself. One of the mistakes I've seen, being in the tech industry for over 20 years, is people fixating on infrastructure rather than fixating on product. It's the thing we have to remind our teams inside of Microsoft over and over again: use the infrastructure at hand that best enables you to solve your problem, and don't build infrastructure that you don't have to build. Again, it's up to you all, and it's up to us: we have to build great experiences, things that delight users. We've got to get things into the hands of users as quickly as possible, see what works, see what doesn't, iterate, and make them better.

Let's dive into the Copilot stack. Satya already showed this, and we're going to blow it up a little bit now. This is how our Copilots at Microsoft are structured, and these are some of the things that we're going to be diving into in greater detail in subsequent talks, for you all to have a look at, to pick up, to use, to learn about, and to make things with. Some of this may look familiar. There are three boxes, and you can think of them as roughly corresponding to the three tiers of a normal application: a front end, a mid tier, and a back end. The front end, as with the things we've already talked about, starts with understanding what your amazing product idea is. The thing that's a little bit different about user experience design with a Copilot is that we have more or less been building user experiences the same way for 180-plus years, since Ada Lovelace wrote the first program. We have had to understand what the machine is capable of, and then we fiddle around with how we express the connection between the human and the machine in very explicit ways. What that has meant for you all is fiddling around with user interface elements, menus, binding code to actions, trying to fully anticipate the needs of the user, and architecting your applications in particular, familiar ways so that people know how to get at all the functionality and capability you built into your code.
The thing that's a little bit different in a copilot is that you're going to spend less time thinking about what your user interface widgets are and less time trying to second-guess the user about what they want, because they have a really natural mechanism to express it: natural language. What you have to think about in the design of these copilots is what you want the copilot to actually be capable of. What are the things the model can't do that you need to augment, with the stuff I'm about to show you in the orchestration layer, with plugins, and maybe even with fine-tuned models or a portfolio of models? But it's going to be far less of that fiddling around mapping user interface elements to little chunks of code than you're accustomed to. On the flip side, you also have to think about what you want the copilot not to do. This matters for how you think about safety, but also because the thing at the bottom of the stack, these foundation models, is a big bucket of unrestrained capability; you're the one who often has to restrain it to your particular domain. For instance, with GitHub Copilot, a bunch of the work that we did is to keep the model on task, which is helping you solve your development problems. You're not trying to figure out the best menu item at Taco Bell when you're sitting in GitHub Copilot trying to write a piece of code. That's the user interface, in broad brush strokes, and what is different there.

Now let's talk about orchestration. Orchestration is the business logic of your copilot. When we started building our own copilots, every team inside the company was building its own orchestration layer: all of the logic to sequence through the models, do the filtering, and do the prompt augmentation that you have to do to build a really great app. We noticed that there was commonality across all of those things. One of the things that greatly affected our ability to get these copilots out to market at scale, and to do more ambitious things, was deciding that inside Microsoft we were going to have one orchestration mechanism that we would use to help build our apps. That is called Semantic Kernel, which we've open-sourced, and there's a session on Semantic Kernel later at Build, which I would encourage you all to attend. But we also know that we're not the only ones who see this commonality across orchestration, and there are some really great open-source orchestration tools that work super well inside of the Azure ecosystem we're building. Harrison from LangChain is here with us in the front row; give Harrison a round of applause, please. LangChain is one of the most popular open-source orchestration mechanisms, and Harrison, with a very small team, has built a thing that is useful to an extraordinary number of developers. Orchestration isn't a solved problem; we're going to see a lot of new ideas there, and the thing that I want to assure everyone here in the room is that you'll be able to use whatever orchestration mechanism you want. We'll give you some options that we think are great for us, and we'll point you to some of our open-source favorites. But if you want to roll your own thing, that's your choice. I'm a developer; I like rolling my own stuff sometimes too.
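Whether you use Semantic Kernel, LangChain, or roll your own, the rough shape of an orchestration turn is similar: filter the prompt, ground it, call a model, filter the response. Here is a hand-rolled, illustrative sketch of that loop in Python; the blocklist, the grounding placeholder, the meta prompt text, and the model name are assumptions for illustration, not the implementation behind any Microsoft copilot.

```python
# Illustrative, hand-rolled orchestration loop: filter -> ground -> model -> filter.
# The blocklist, grounding lookup, meta prompt, and model name are placeholder assumptions.
import openai

BLOCKED_TERMS = ["credit card number"]  # toy prompt/response filter
META_PROMPT = "You are a podcast assistant. Stay on topic and decline unsafe requests."

def violates_policy(text: str) -> bool:
    return any(term in text.lower() for term in BLOCKED_TERMS)

def ground(prompt: str) -> str:
    """Add extra context to the prompt; a real copilot might query a search
    index, a vector database, or a plugin here."""
    return prompt + "\n\nContext: (retrieved documents would go here)"

def run_turn(user_prompt: str) -> str:
    if violates_policy(user_prompt):            # prompt filtering
        return "Sorry, I can't help with that."
    grounded = ground(user_prompt)              # grounding
    completion = openai.ChatCompletion.create(  # call the foundation model
        model="gpt-3.5-turbo",
        messages=[{"role": "system", "content": META_PROMPT},
                  {"role": "user", "content": grounded}],
    )
    answer = completion.choices[0].message.content
    if violates_policy(answer):                 # response filtering
        return "The response was filtered."
    return answer
```

In a production copilot each of these steps is far more sophisticated, and may run multiple times per user turn, but the overall pipeline shape is what the orchestration layer owns.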
One of the things that you'll see in Scott Guthrie's talk, which is coming up next, is prompt flow, another orchestration mechanism that actually unifies LangChain and Semantic Kernel, and I encourage you all to dive a little bit deeper there. Inside the orchestration layer, the fundamental thing you're going to be manipulating is a prompt. A prompt is just a bucket of tokens generated by the user experience layer of your application. It could be, in something like Bing Chat or ChatGPT, a question, or a thing that a user is asking the model to do. Or it could be something your application constructs, where it's not a direct natural-language request from the user but a natural-language instruction that your application is conveying to the model.

A big part of handling those prompts at the beginning stages of orchestration is prompt and response filtering. Basically, you say: I'm not going to allow these prompts through, because maybe they will cause the model to respond in a way that doesn't meet the needs of your application, or to do something unsafe. You also filter responses on the way back up; after the model has produced a response to the prompt, you may decide that you want to filter some or all of the response out. A natural place where this happens is the safety infrastructure that you're going to see Sarah Bird talk about later, but there are other reasons you may want to do some filtering on the responses.

You also have a unit of prompt code called the meta prompt. The meta prompt is the standing set of instructions that you give to your copilot, which gets passed down to the model on every turn of the conversation and tells it how to accommodate itself to the copilot that you're trying to build. It's where a bunch of your safety tuning is going to happen, and it's where you tell the model what personality you want it to have. For instance, we use the meta prompt to do things like telling Bing Chat to be more balanced versus more precise. It is also how you teach the model new capabilities. You can even think of meta prompt design as a form of fine-tuning, and it's far easier to do things in the meta prompt than to go down to the lower layers of the infrastructure and start rolling your own things.

Once you get past the meta prompt and the prompt filtering stages, you start to think about grounding. Grounding is all about adding additional context to the prompt that may be useful for helping the model respond to the prompt flowing down. In the case of Bing Chat, which I think was the first place really doing retrieval-augmented generation before retrieval-augmented generation had a name, we look at the prompt, the user query, issue a query to the search index to find relevant documents, add those documents to the prompt, and send it to the model so that it has extra context to provide a good answer. Increasingly, people are using vector databases for retrieval-augmented generation: you take the prompt, compute embeddings for it, and then do a lookup in a vector database indexed by those embeddings to get relevant documents and give the model extra context for a better answer. But you may also augment the prompt and do grounding with arbitrary web APIs, and you can even think about using plugins for grounding; a minimal retrieval sketch follows below. The next step in the pipeline after grounding is where plugin execution happens.
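To make the grounding step a bit more concrete, here is a minimal retrieval-augmented generation sketch. An in-memory list stands in for a real vector database, and the documents, embedding model name, meta prompt, and prompt format are all assumptions for illustration.

```python
# Minimal retrieval-augmented generation sketch: embed, look up, augment, ask.
# An in-memory list stands in for a real vector database; documents and model
# names are illustrative assumptions.
import openai

DOCS = [
    "Behind the Tech is a podcast hosted by Kevin Scott.",
    "Neil deGrasse Tyson is an astrophysicist and science communicator.",
]

def embed(text: str) -> list[float]:
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=text)
    return resp["data"][0]["embedding"]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm

INDEX = [(doc, embed(doc)) for doc in DOCS]   # toy stand-in for a vector database

def grounded_answer(question: str) -> str:
    q_emb = embed(question)
    best_doc = max(INDEX, key=lambda item: cosine(q_emb, item[1]))[0]
    prompt = f"Answer using this context:\n{best_doc}\n\nQuestion: {question}"
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "system", "content": "You are a helpful assistant."},
                  {"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```

A production system would chunk documents, cache embeddings, and use a real vector store, but the flow, embed the query, retrieve the closest context, and prepend it to the prompt, is the essence of grounding.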
At this stage, again, beyond what I just mentioned about grounding, you may use a plugin to add extra context to the prompt before it goes down to the model, or you may execute a plugin on the way back up from the model so that you can take an action on a system. And I should say, once you get through all of the stuff in the orchestration layer, you may do multiple turns through this whole system, calling multiple models and making multiple passes through the pipeline in order to get what you need from it.

At the very bottom of the stack are foundation models and infrastructure, and we give you a bunch of choices for how to use foundation models in this Copilot platform on Azure and on Windows. You can choose one of the hosted foundation models, like the ChatGPT model or the GPT-4 model that are now available in the Azure OpenAI Service. You can fine-tune one of these hosted foundation models; the GPT-3.5 fine-tuning APIs are live now, and you'll be able to fine-tune GPT-4 soon. But if neither of those options works for you, if you have exhausted all of the things you can do in the orchestration layer to get your copilot to do what you need, if you can't wholly solve your problem with a hosted API for whatever reason and you can't use the fine-tuning APIs to accomplish what you want, you can bring your own model. We are incredibly excited about what's happening in the open-source community right now; there's a bunch of brilliant work happening with open-source models. One of the things that you will see in the next talk is the Azure AI model catalog, a place inside of Azure where you can find the most popular models on Hugging Face and on GitHub, and where you'll be able to push-button provision and deploy those models to Azure to use in your copilots. You can also train your own model from scratch. As we've mentioned several times, from the most ambitious models in the world, the ones that OpenAI is training, all the way down to smaller things, the Azure AI supercomputing infrastructure and environment gives you a great way to train your model from scratch if that's what you need to do. This is the Copilot stack, top to bottom.

What I want to do now is make this a little bit less abstract by talking about a copilot that I wrote. I host a podcast called "Behind the Tech," and every month, when the podcast airs, my team comes and bugs me to write a social media post to advertise it, and I suck at this. I forget to read my emails. They have to bug me over and over again, and they really want a Kevin social media copilot so they don't have to go through the irritation of dealing with me. I recently had the honor of interviewing Neil deGrasse Tyson on the podcast. I'm just going to walk you through this copilot that we built, which actually just ran and did the social media post for the Neil deGrasse Tyson episode that just went live. Here's what it looks like, end to end: the copilot runs on a Windows PC, it uses a mixture of open-source models and hosted models, it does retrieval-augmented generation, and it calls a plugin to finish its work. Let's walk through it step by step. The first step of this process is that we have an audio file and we need a transcript. On our Windows PC, we take the OpenAI open-source Whisper model and run the audio through it to get a transcript.
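For reference, here is a minimal sketch of that transcription step using the open-source whisper package; the model size and the audio file name are assumptions for illustration.

```python
# Transcribe the podcast audio locally with OpenAI's open-source Whisper model.
# The model size ("base") and the file name are illustrative assumptions.
import whisper

model = whisper.load_model("base")                 # downloads weights on first use
result = model.transcribe("behind_the_tech_episode.mp3")
transcript = result["text"]
print(transcript[:500])                            # peek at the first few sentences
```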
Whisper does a really amazing job. Once we have the transcript, the next stage in the orchestration is the Databricks Dolly 2.0 12-billion-parameter large language model running on our Windows PC, and we ask it some things about the transcript. For instance: who was the guest in this episode? Because again, we want to do this lights-out, without Kevin answering a bunch of questions, because he's slow and annoying. The next thing we do, once we have the transcript and the information we've extracted from it, is send a chunk of it to the Bing API; specifically, we send Neil's name to the Bing API to get a bio. Then we combine all of this together into a single packet of information, a big prompt that has some stuff about the transcript and some stuff about Neil, and we get our social media blurb. This is a pretty good blurb, so we go to the next step, which is that we need a thumbnail. We call the hosted OpenAI API to get an image from the DALL-E model; this one looks pretty good, it's cosmic, it's podcasty, plenty good enough for this post. The last step is invoking a plugin for LinkedIn that takes the thumbnail, the post, and the link to the podcast, and posts it all to my LinkedIn feed. Before we take an action on the user's behalf, we want to present to them what is going to happen, because if for some reason the model went haywire and produced something we didn't want to post, once I hit "Yes," it's going out to 800,000 people on LinkedIn. We review, we click "Yes," and we post. This is the live post that's on LinkedIn right now; you should go check out this episode with Neil, it's awesome. This is really just an illustration for you all; I'm not claiming that this is the most interesting copilot in the world, but it was really pretty easy to do. We posted all of the code in a GitHub repo, and I encourage all of you to check it out; it's a good template for thinking about how to build your first copilot.

The thing that we want to talk about last, before we jump into Scott's keynote, is AI safety. It's the first thing that we think about when we're building copilots, and we think about it at every step of the process. You're going to hear a ton about this great AI safety work from my colleague Sarah Bird, who runs our Responsible AI infrastructure team inside of the AI platform group. It's really super good stuff; we're giving you all some amazing tools to go build really safe, responsible AI applications. Very quickly, I want to mention one of the things you heard Satya mention: we're giving you a bunch of media provenance tools that will help users understand when they're seeing generated content. We're going to be watermarking all of the content that we produce, and we're giving you tools so that if your AI application, your copilot, is generating synthetic content, you'll be able to call our APIs and add these cryptographic provenance watermarks to your content. It's super exciting stuff.
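As a small illustration of two steps from the podcast copilot above, generating the thumbnail and confirming with the user before acting on their behalf, here is a hedged sketch. The `post_to_linkedin` function is a hypothetical stand-in for the LinkedIn plugin, and the blurb and image prompt are assumptions; only the DALL-E image call reflects a real API.

```python
# Sketch of the final steps: generate a thumbnail, then confirm before posting.
# post_to_linkedin is a hypothetical stand-in for the LinkedIn plugin.
import openai

def make_thumbnail(description: str) -> str:
    resp = openai.Image.create(prompt=description, n=1, size="1024x1024")
    return resp["data"][0]["url"]                  # URL of the generated image

def post_to_linkedin(text: str, image_url: str, link: str) -> None:
    print("(pretend this called the LinkedIn plugin)")  # placeholder action

blurb = "New Behind the Tech episode with Neil deGrasse Tyson!"
thumbnail_url = make_thumbnail("cosmic podcast artwork, stars and microphones")

# Show the user exactly what will be posted before acting on their behalf.
print("About to post:\n", blurb, "\n", thumbnail_url)
if input("Post to LinkedIn? (yes/no) ").strip().lower() == "yes":
    post_to_linkedin(blurb, thumbnail_url, "https://example.com/podcast")
```

The confirmation prompt is the important part of the sketch: whenever a copilot takes an action in the world on a user's behalf, showing the user what will happen before it happens is a simple and effective safety pattern.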
Copilots: you have heard from us that we have this amazing new software development pattern. You have heard about how we think about architecting copilots, and you have heard our enthusiasm that not only are there going to be a bunch of copilots from Microsoft and from our partners, but we really think that you all are going to be the ones who build the most interesting copilots in the world. It's just like any other major platform: the things that make your PC great, that make the internet great, that make a smartphone great, aren't the things that ship when those platforms launch; it's what you all create on top of them.

I want to share one anecdote before we go. I was an intern at Microsoft Research in 2001; I came to MSR with my PhD advisor when he went on sabbatical. Every Thursday we would go out with our research group to a burrito joint in Bellevue, which I think is closed now, called Acapulco Fresh. Occasionally this gentleman would join us; his name is Murray Sargent. Murray, to a 30-year-old PhD student like me, seemed like a legend, because Murray was the guy who had broken the 640K limit on the Intel microprocessors. Many of you may be too young to even remember this, but at one point in time, the computers that we shipped could only use 640 kilobytes of memory for the work they had to do, and Murray was the guy who, when the 286 came out, figured out protected mode and got Microsoft software to work beyond that 640K memory barrier. It's unbelievable to think about the impact small things like that had on the trajectory of the industry. I was in awe of Murray, and every time we had lunch with him I wondered: what am I ever going to do in my career that would allow someone like a younger version of myself to look at me and think, "Wow, this guy did some legendary stuff"? This is the moment for all of us now. We have capabilities in our hands with these new tools, in the early days of this new platform, to absolutely do amazing things, where literally the challenge for you all is to go do some legendary shit that someone will be in awe of one day. With that, I would like to bring to the stage my colleague, Executive Vice President of Cloud and AI, the legend himself, Scott Guthrie.
Info
Channel: Microsoft Developer
Views: 54,003
Keywords: AI, Azure, English (US), Greg Brockman, Intermediate (200), KEY02H, Kevin Scott, Keynote, OpenAI, The era of the AI Copilot | KEY02H, build, build 2023, microsoft, microsoft build, microsoft build 2023, ms build, ms build 2023, msft build, msft build 2023, w1m5
Id: FyY0fEO5jVY
Length: 44min 56sec (2696 seconds)
Published: Thu May 25 2023