Cooking with Semantic Kernel: Recipes for Building Chatbots, Agents, and more with LLMs (2023)

Video Statistics and Information

Captions
Well, thank you for that intro. In the interest of time, I think there's probably more interest in the content than in me, so we can talk later. I'm glad we're having this talk after lunch, so hopefully you're not super hungry. I like to use this metaphor of cooking because this really is about taking a lot of best practices and creating recipes out of them, templates if you want to call them that, for building an AI application. So let's dive in.

All right, if you haven't heard, Microsoft is all about Copilot. Copilot is the center of gravity of all of Microsoft's AI efforts, especially with OpenAI, for building products that are meant to assist humans with work tasks and creative tasks. It's your everyday AI companion. We already see a lot of this; it was all just announced, I think, last week: whether it's in Word, the revamped version of Paint, or even the newest Surface products, Copilot is everywhere, and it's being adopted more and more as a standard. One of my favorite demos was Copilot in Excel. A prior announcement, before the Copilot announcements, was Python being able to run in Excel, and with a Copilot interface, in this case a chat interface, you can do analysis over your data in a hopefully friendlier way, so you're not memorizing formulas or VBA or whatever the tool of choice was back then. Now it's hopefully more natural.

So the natural question is: how do you build copilots? This is a very interesting field that's emerging, and Microsoft is one of the voices in it, but as you've all seen and heard throughout the last couple of days, there are many different approaches to building a copilot. Before we go into that, let's take stock of the moment we're in. What is this burning moment in technology? For many of us who have been building in AI, or who are product or business leaders, you might have seen or experienced that AI was always hyped up and talked about, but it had a hard time graduating from the research lab, from experiments your data scientists were doing in their Jupyter notebooks, to actually manifesting itself in a real product. That was back then. What about now? We're at the point where ChatGPT just exploded onto the field. It's a very interesting story: OpenAI didn't intend to become a product company, but they put ChatGPT out there to learn how people would use AI, and in so doing they launched this revolution in building chatbots and copilots.

The number one use case we're hearing, and that I'm sure you're all experiencing, is wanting that ChatGPT-like experience on your own data. So how do you do that? One of the opinions we have is that you need more than just a model API, and that speaks to a new way of coding, a new way of development: when you're building with large language models, it's a bit unfamiliar.
Many of the developers in this room are very much attuned to writing code natively in their programming language of choice, but when you're interacting with large language models, with these foundation models, it's a little weird. You're talking in natural language; you're maybe going back to your high school days to figure out how to speak coherently to an AI. And there are all these other considerations you have to think about: tokens, temperature, and overall dealing with non-determinism. When you're building with AI, you're delving into the territory of probabilistic outcomes as opposed to deterministic ones.

At the same time, if you're interacting with OpenAI or any of these foundation model APIs, a huge benefit is that they've made it super easy: you can get your API key from OpenAI or Azure OpenAI and run something very simple out of the box. The challenge is that as you build more complex applications, as you build copilots, you're going to start hitting some real limits. Part of that is that there's no memory: these large language models are effectively stateless; you send in your prompt and get a response back, but the model by itself doesn't know anything about you or remember your past conversations. We had a lot of talks today from the vector database companies and others looking to build that type of capability, and we heard RAG, retrieval augmented generation, mentioned many times as a pattern; that's really there to help bring memory into your copilots. There's also the idea of the knowledge cutoff: these models are trained up to some point, they only have training data up to, say, 2021, and if you ask factual questions beyond that, you won't get an answer. And probably the trickiest thing is hallucination: the models can be very confident while giving a wrong response. Especially if you're building a copilot for a regulated or sensitive industry or use case, you don't want your model, or your application, to be wrong.
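To make the retrieval augmented generation pattern concrete, here is a toy, self-contained sketch of the retrieve-then-augment idea. This is not Semantic Kernel code and not how a production system would do it (a real copilot would use an embedding model and a vector database rather than the bag-of-words similarity used here); it only shows how retrieved context gets placed into the prompt to work around the memory and knowledge-cutoff limits.

```python
# Toy sketch of retrieval-augmented generation (RAG): retrieve relevant
# context first, then place it in the prompt so the model can ground its
# answer instead of relying on stale training data or making things up.
# A real copilot would use an embedding model and a vector database;
# the bag-of-words cosine similarity below is only a stand-in.
import math
from collections import Counter

documents = [
    "Contoso's PTO policy grants 20 vacation days per year.",
    "Semantic Kernel is an open-source SDK for orchestrating LLMs.",
    "The cafeteria closes at 3pm on Fridays.",
]

def vectorize(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    q = vectorize(query)
    return sorted(documents, key=lambda d: cosine(q, vectorize(d)), reverse=True)[:k]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How many vacation days do I get?"))
```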
So as a result, we believe, and we actually experienced this inside the Office of the CTO, that you need new tools. When foundation models introduce new capabilities, and GPT-3 to GPT-4 was a huge step function, we, having early access to that model, concluded that we needed to build common tools and common frameworks so that everyone isn't reinventing the same thing again and again. Imagine having experiences where you can talk with your AI or your chatbot indefinitely, with no forgetting of who you are or what you talked about: a conversation you had last year, or ten years ago, it should remember, just as we remember the people we meet at this conference. And especially when you start doing long-running tasks, multi-step things that require multiple decisions or actions, you need more advanced orchestration for that to happen.

So I'll let our CTO talk a little more about what you need to build this future, the copilot stack: "After all, we built all these copilots with one common architectural stack. We want to make that available so that everyone here can build their own copilot for their applications. We will have everything from the AI infrastructure to the foundation models to the AI orchestration. One of the things that we did that greatly affected our ability to get these copilots out to market at scale, and to do more ambitious things, was to decide that inside of Microsoft we are going to have one orchestration mechanism that we will use to help build our apps. That is called Semantic Kernel, which we've open sourced."

All right, so far I haven't even mentioned the title of the talk: Semantic Kernel. Semantic Kernel is an open source project that encapsulates all these best practices we learned inside Microsoft building first-party copilots, now open sourced to allow enterprises and developers to come along with us on that journey too. In one statement: Semantic Kernel is a lightweight, open source orchestration SDK that lets you integrate large language models with native code in languages such as C#, Python, and Java. I want to emphasize the lightweight part, and you'll see it in the higher-level architecture: we want it to be very easy to deploy, especially when you're talking about bringing something to production, not a framework that carries too much baggage or extra stuff with it. It's fully open source, with a growing community you can all contribute to; join us on Discord, and I host pretty regular office hours as well, so if you want to drop by, say hi, and talk about your use case or your challenges, I'm happy to talk there.

In one picture, Semantic Kernel is ultimately addressed to app developers. Microsoft is all about empowering not only the experts, the data scientists and ML researchers of the world, but also the hobbyists and the people whose job it is to build an enterprise application. Many of those people haven't been able to participate as much in AI, or at least haven't had the frameworks to interact with these things. That's why, with Semantic Kernel, we've taken an opinionated approach to building an AI orchestration framework that developers can choose to bring into their toolkits. At the top level, if you remember the picture of the copilot stack, there's the AI user experience, whether it's a chatbot or something else; that's a separate conversation, and I invite you to check out my colleague John Maeda, VP of AI and design, who's much better equipped to talk about that piece. But this middle layer is where the heart of the kernel is. As you all heard over the last couple of days, bringing memory and state into your applications is enabled by things like vector databases, and even traditional data stores, used in a more intelligent way. And if you want to start bringing in other tools, other connectors, having your application interact with the outside world, this is where the notion of plugins comes in.
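Before going into plugins and planning, here is a minimal sketch of the core idea, integrating an LLM prompt with native code through the kernel, using the 2023-era, pre-1.0 semantic-kernel Python package. The class and method names below are assumptions based on that era and have changed in newer releases (for example, "skills" were later renamed "plugins"), so treat this as illustrative rather than the current API.

```python
# Minimal sketch of the kernel idea using the 2023-era, pre-1.0
# semantic-kernel Python package. The names below (add_chat_service,
# create_semantic_function, ...) are assumptions from that era and
# have changed in newer releases.
import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion

kernel = sk.Kernel()

# Wire up a model endpoint; reads the OpenAI key and org id from a local .env file.
api_key, org_id = sk.openai_settings_from_dot_env()
kernel.add_chat_service("chat", OpenAIChatCompletion("gpt-3.5-turbo", api_key, org_id))

# A "semantic function" is just a templated prompt registered with the kernel;
# {{$input}} is filled in at call time. Native Python functions can be
# registered alongside it and mixed freely in plans.
summarize = kernel.create_semantic_function(
    "{{$input}}\n\nSummarize the text above in one sentence.",
    max_tokens=200,
    temperature=0.2,
)

print(summarize("Semantic Kernel is a lightweight open-source SDK that lets "
                "developers mix LLM prompts with native C#, Python, or Java code."))
```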
Imagine being able to interact with Klarna or Instacart: you just want that kind of magical experience where your chatbot, or whatever you're building, can use these tools, use these APIs, to bring a differentiated experience to your application. And at the center of it all is this notion of planning, which I'll spend the latter half of the talk on. Planning is really, I think, the magic unlock that AI has demonstrated. GPT-4 in particular showed off its reasoning capabilities: the ability to take a user's intent, ask, or question and lay out the steps needed to complete that goal. We call that planning, and it's wrapped into abstractions called planners, of which there are multiple versions. All of this is built on top of foundation models, and yes, while Microsoft and OpenAI are tied at the hip, we also support the open source models out there. If you have a model on Hugging Face, or you've built your own and you're hosting it behind some endpoint, you can bring that into Semantic Kernel very easily. And in terms of the full dev cycle, which again is super important if you're going to roll something out into production, things like telemetry and overall dev tools, we have that as well, but that's a subject for another talk. All of this big picture is meant to empower the AI app developer.

Okay, so let's dive a little deeper. It all begins with the user's ask. That ask comes into the kernel, which again is super lightweight, and the kernel takes the ask and either runs a deterministic workflow you've defined, a mix of native and semantic functions, or has AI do the planning: take the ask and generate a step-by-step roadmap, a blueprint for how to solve it, based on the skills or plugins you provide. The plan lists the set of functions to call, and you can choose whether to accept it and run with it, or go back and edit it if you see fit. I think that's a big point to call out, because at least for now I'm a firm believer that human in the loop is still super important, especially if you want to build trust in your AI applications. Having a plan that's transparent, plus being able to modify it and provide additional feedback as you see fit, is core for any application.

To put it in a different view: it starts with the user's ask; the ask goes into the kernel; the kernel takes the skills it has available and starts creating a plan, and it also brings in external context. These are the connectors I was talking about. Say you have Microsoft Graph and you want to know about your entire enterprise work profile: those connectors can come in as a plugin if you see fit, along with external data sources and external APIs. With that, the kernel does the orchestration for you, ultimately creating the plan and letting you run it as a pipeline, and that pipeline can be expressed as something you want to run over and over again. There are some really interesting ideas about caching plans and making them reusable, or, if you just want a more open-ended or creative ask, like "help me do research" or "help me plan a date," letting the kernel take multiple different paths to arrive at an answer.
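Stripped of the AI, that pipeline idea is simple to sketch: a plan is an ordered list of functions, and each function's output becomes the next function's input. The toy below hard-codes the plan; the whole point of a planner is that the LLM generates that list from the user's goal instead. This is not Semantic Kernel's actual plan format, just the core idea, and the skill names are made up for illustration.

```python
# Toy illustration of a plan executed as a pipeline:
# each step's output becomes the next step's input.
# Not Semantic Kernel's actual plan format; the skill names are invented.
from typing import Callable

# A "skill" here is just a named Python function taking and returning text.
skills: dict[str, Callable[[str], str]] = {
    "WriterSkill.Brainstorm": lambda ask: f"ideas for: {ask}",
    "WriterSkill.TranslateFrench": lambda text: f"(en français) {text}",
    "TextSkill.Uppercase": str.upper,
}

# A plan is an ordered list of skill names, hard-coded here; a planner
# would instead ask the LLM to produce this list from the user's goal.
plan = ["WriterSkill.Brainstorm", "WriterSkill.TranslateFrench", "TextSkill.Uppercase"]

def execute_plan(plan: list[str], user_ask: str) -> str:
    state = user_ask
    for step in plan:                 # a human could review or edit `plan` first
        state = skills[step](state)   # pipe output -> input
    return state

print(execute_plan(plan, "Valentine's Day date ideas"))
```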
Okay, so why build with Semantic Kernel? Ultimately, especially in a world where things are moving so quickly that it's hard to keep track of it all, you want to go from showing off a cool demo to actually putting something in production. You don't want to be in a world where you're shipping one-off features and rewriting code again and again. Semantic Kernel is part of a paved path for future applications built on top of large language models, and it helps you mix in the code you've already written: if you already have expertise writing native code, you should be able to bring that into your application, combine it with prompts, with semantic functions, and pursue scenarios you weren't able to before.

Okay, so let's talk planning. There was a blog post written by Lilian Weng from OpenAI called "LLM Powered Autonomous Agents," and "agents" is another important term: in summary, it means combining tool use, planning, and actions, and, if you're ambitious enough, incorporating memory as well. I think it's a good one-picture summary of what this concept looks like, and a big part of it is planning. In Semantic Kernel today we expose at least three main planners. The first is the action planner, whose main job is to take whatever the user's ask is and return the one function, or skill, you've imported into the kernel that best matches that goal. So if you say you want to do some addition and you have a math skill, the action planner will identify that one particular skill as the right one to use. The second is the sequential planner: this is for a more open-ended or multi-step ask, where the planner stitches together multiple skills to run in a sequence from start to finish, and I'll show that off shortly. The last one, which is probably even more magical and sometimes more concerning, is the stepwise planner, which can reflect on itself, make observations, take actions, and even correct itself if it's made mistakes.

So let's walk through a demo. This is a notebook we have on our GitHub repo, so I'm just walking through it. Again, it all begins with an ask. In this case the ask is: tomorrow is Valentine's Day and I need to come up with a few date ideas; my partner speaks French, so write it in French, then convert the text to uppercase. You can see all these little steps I've very explicitly defined, and I'm asking the kernel to help orchestrate them. To do that, you provide the kernel with some skills. In this case there's a summarize skill, a writer skill, and a text skill, and each of those skills has separate functions underneath it, each of which can have its own prompt. Those are the top-level things imported into the kernel.
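For reference, the notebook flow just described looks roughly like the sketch below using the 2023-era, pre-1.0 semantic-kernel Python package. The module paths, the method names (BasicPlanner, create_plan_async, execute_plan_async), and the skills directory are assumptions from memory of that era; newer releases renamed "skills" to "plugins" and reorganized the planners, so treat this as a sketch rather than current API.

```python
# Rough sketch of the planner notebook using the 2023-era, pre-1.0
# semantic-kernel Python package. Module paths, method names, and the
# skills directory are assumptions from that era and may have changed.
import asyncio
import semantic_kernel as sk
from semantic_kernel.core_skills import TextSkill
from semantic_kernel.planning.basic_planner import BasicPlanner

async def main():
    kernel = sk.Kernel()
    # ...add a chat service here, as in the earlier snippet...

    skills_dir = "./samples/skills"  # assumed location of the sample skill prompts
    kernel.import_semantic_skill_from_directory(skills_dir, "SummarizeSkill")
    kernel.import_semantic_skill_from_directory(skills_dir, "WriterSkill")
    kernel.import_skill(TextSkill(), "TextSkill")

    ask = ("Tomorrow is Valentine's Day. I need to come up with a few date ideas. "
           "My partner speaks French so write it in French, "
           "then convert the text to uppercase.")

    planner = BasicPlanner()
    plan = await planner.create_plan_async(ask, kernel)      # 1. generate the JSON plan
    print(plan.generated_plan)                               # 2. review it before running
    result = await planner.execute_plan_async(plan, kernel)  # 3. execute the pipeline
    print(result)

asyncio.run(main())
```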
In a basic planning scenario, the planner creates a JSON-based plan that aims to take that ask and solve it sequentially. This is the Python version, and the basic planner is more or less a sequential planner too, except it's JSON-based whereas the sequential one is XML-based; those are just implementation details. When I print out the plan, you can see what it looks like: it takes the input, Valentine's date ideas, and it has several subtasks and functions to call, and the output of each function becomes the input of the following function, run in a sequence.

If you want a bit more complexity and you want to introduce another skill, you can create a new semantic skill very simply, with a prompt and a one-line function creation: "Rewrite the above in the style of Shakespeare." So let's modify the ask and run the planner again, and you can see the generated plan is updated to reflect that. Then you execute the plan. Plan creation and plan execution are separate, two distinct steps, and I think that's by design: you want to be able to review the plan, to inspect it, before you actually run it, because running it can be an expensive set of calls to the LLM. In this case, that's what it created; I don't speak French, so hopefully it's okay. I mentioned sequential planning: the actual sequential planner, in contrast to the basic planner, is XML-based in Python, but it's more or less the same concept.

Okay, the action planner, as I brought up earlier: its goal is just to identify the single function that's appropriate for the task. So if the ask is "what is the sum of 110 and 99," it will look among all the skills it has; I printed them out, and there's a lowercase function, a trim function, an uppercase function, and more, plus the math skill at the very top. If it does the job correctly, it should identify the proper function to call, and in this case it does, because it gives the right result. I always have to double-check the math, but it looks right. I think this is super important in the case of math, or any case where you need more guarantees even though the system is non-deterministic: you want more control, and that's really the core of what we've heard enterprises care about. Give us the tools that give us more control over these powerful but sometimes unruly LLMs.

Okay, let me spend a little more time on this last one, the stepwise planner. It's based on MRKL (Modular Reasoning, Knowledge and Language) and ReAct (reasoning and acting in language models), where the AI is able to form thoughts and observations and take steps, and even fail, in order to ultimately achieve the user's goal. In this case, I won't go line by line, but I basically added a quick web search skill to let it search, in this case with Bing, and I added a few more skills like time and math. Now the planner takes this ask, which I wrote as "how many total championships do the top five teams in the NBA have, combined," and it runs.
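The thought/action/observation loop behind that stepwise planner (the MRKL/ReAct idea) can be sketched without any model at all. In the toy below, the thoughts are canned and the tools are stubs with made-up names and data, purely to show the control flow; a real stepwise planner prompts the LLM at each step to produce the next thought and action and to decide when it has enough information to stop.

```python
# Toy ReAct-style loop (the idea behind the stepwise planner):
# think -> pick an action -> observe the result -> repeat until done.
# The scripted steps stand in for model output; tool names and data are invented.
from typing import Callable

tools: dict[str, Callable[[str], str]] = {
    "WebSearch.Search": lambda q: "Celtics 17, Lakers 17, Warriors 7 (toy data)",
    "Math.Add": lambda expr: str(sum(int(x) for x in expr.split("+"))),
}

# Canned (thought, action, action_input) steps; a real planner would
# generate each of these with an LLM call.
scripted_steps = [
    ("I need each team's championship count.", "WebSearch.Search", "NBA titles by team"),
    ("Now I should add the counts together.", "Math.Add", "17+17+7"),
    ("I have the total, so I can answer.", "FINAL", "41 championships (toy data)"),
]

def run_stepwise(steps):
    for thought, action, action_input in steps:
        print(f"[THOUGHT] {thought}")
        if action == "FINAL":
            return action_input
        observation = tools[action](action_input)   # take the action
        print(f"[ACTION] {action}({action_input!r}) -> [OBSERVATION] {observation}")
    return None

print(run_stepwise(scripted_steps))
```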
It gives a result, but there's a difference between taking a result at face value and understanding it; I think it's more illuminating to look at the actual steps and see what's happening behind the scenes. If you look at the trace, and I've highlighted the different pieces, in order to solve the ask I gave it, the stepwise planner is forming different thoughts and taking actions: I'm going to search with this particular query, then get those results back and synthesize them, then formulate some new thoughts and take any further action I need to. That was just step zero, but fast-forward and you see: I found these things; did I collect enough information? The planner says, I think so, so now I'm just going to add this all up, using the math function to do it.

And this is where we get to some of the challenges: what's the answer to this question? I'll grant that "top five teams" can be an ambiguous ask, and that's actually a key principle of Semantic Kernel: you want to ask smart to get smart. You want to make sure that whatever you're providing to the large language model is as explicit as it can be, because otherwise it can make something up or go off in directions you don't intend. The actual answer is 52 championships; I had to make sure I added it up myself. All this is to say, and I was using GPT-3.5 Turbo to run this stepwise planning, that there's a long way to go. I literally took a screenshot of what Bing Chat responded, and it returned 42. So nothing's perfect here; we're all still learning, and there are many open challenges. I think we're all on this journey that really just kicked off, what, a year ago, at least in the mainstream, and we're all trying to figure this out together.

As part of that, if you want to join this community, I definitely invite you to join our Discord; we'd love to have you there, and a lot of great conversations are happening among real developers trying to build these things. A plug for my own YouTube channel, where I've been posting a lot of Semantic Kernel content: if you want to dive deeper into some of these topics, I even talked with Weaviate not too long ago, so it's a place for more content. Lastly, if you want to connect, I'm happy to; you can find me on LinkedIn, YouTube, Twitter, and so on. One more thing: a shout-out to Microsoft for Startups, they're out there at the booth, so if you're building a startup and you'd like some help from Microsoft, go to the booth and talk to them. Okay, I literally have four seconds left, so thank you everyone for listening. Hopefully it was clear, and feel free to reach out; I'll be upstairs.
Info
Channel: Alex Chao
Views: 2,622
Id: AX8xM9YnV3k
Length: 29min 56sec (1796 seconds)
Published: Wed Nov 01 2023