Which AI should you use? Copilot, Copilot Studio, Azure AI Studio and more!

Captions
Hey everyone. Today there are a lot of different conversations around artificial intelligence, generative AI, and copilots, and it can get a little confusing: what is the right AI technology to use for my different use cases? So in this video I want to walk through some of the key types of Microsoft-offered AI solutions and when I might use them. Now, artificial intelligence is not a new thing. Traditionally, AI is the idea that a computer is emulating some human behavior. Think about the things I can do as a human: I can see things and tell you, hey, this picture has a tree and a car in it; I can tell you where in the picture the tree and the car are (object detection); I can read text from it. So with traditional AI we have different types of vision services. As a human I can speak, so we have speech capabilities; I can make decisions, so we have decision-making capabilities; I can translate (maybe I can't do that very well, but some people can); and there are a number of other services all around this. Primarily, though, the pattern is: there is some input, and from that input the AI makes a prediction or a classification with a certain confidence level. It's primarily acting on existing information. Now we have this brand new set of AIs, which is all about generative AI, and as the name suggests, the idea is that it is creating new content. This can be text, it can be images, it can be videos; it's creating songs, as we see happening today. There's a whole set of capabilities where it's no longer just acting on existing data and making some prediction (although in a way it really is); it's seemingly being very creative: creating text, stories, poems, scripts, summarizing things, creating a picture based on a description, and
this ability to generate new content has really revolutionized the AI we're seeing adopted in consumer and corporate settings. That's what I want to focus on: these generative-AI-backed services. What are they being offered as? What are the copilots, what is Copilot Studio, what is Azure AI Studio, what is Windows AI? I want to go through all of those. Now, when we talk about generative AI you'll often hear the term large language model (LLM), which is generally a set of AI using a transformer architecture to enable natural language tasks. If you've ever seen a demo, I'm interacting in a very natural-language way; I'm not having to be strict about what I say or the words I use, and I get a very natural-language response back. We even have tests to work out whether we can tell it's a computer anymore. Very commonly today we see GPT, a generative pre-trained transformer, one of the more popular LLMs geared towards generating content. But I do want to stress it is not the only one, and OpenAI is not the only solution out there. If I jump over for a second and look at Azure AI Studio, focusing just on the idea of chat completion, we'll see these GPTs: GPT-4 is very popular, and there's also 3.5, 3.5 Turbo, and 4 Turbo. Those are from OpenAI, sure, but there are others as well: Mistral, Llama (an open-source large language model), and many more. And if I change the type of task (we can see down here on the little bar all the different types of tasks), we'll see there are a lot of different models available to us. So don't think it's just GPT from OpenAI; it's not that at all. There's a massive number
of these large language models; it's just that we hear a lot about OpenAI, and Microsoft has teamed very heavily with OpenAI, working on the OpenAI GPT. But Microsoft also has its own models through Microsoft Research, and there's a huge number of third-party models that Microsoft offers through different services. So it's not only OpenAI GPT; it just happens that there's a close partnership and Microsoft has built a number of its solutions on that GPT. We're going to focus primarily on the OpenAI GPT, where version 4 is currently the latest, though even within that they create slightly newer versions of the model, trained on newer data, maybe with new token limits (those terms will make more sense as we go through the video). You will still see 3.5 used, and you may wonder: if there's a 4, why would I ever bother with a previous version? Realize that as the models get newer, they typically get bigger: they have more parameters, which you can almost think of as neurons in your brain that connect together to let your brain work and do things. In AI terms, those parameters are different values (the biases and the weights in a particular node), all connected together in layers, and the more parameters there are, the more complicated, more sophisticated things the model can do. So as the versions go up they tend to have more and more parameters (hundreds of billions of parameters now), but that means they require more compute to leverage. If an older version like 3.5 can do the task we need, it's less computation and less cost, so we'll use it. That's why you'll still see previous versions even when there's a newer one. So we're focusing on GPT, and to make it very clear what this really is: OpenAI did the training; they created the model. And so
if we talk about those neurons for a second: I start out with the parameters, but they have no weights and no biases. So we have to train the model, and for that I need a huge amount of data. This is the way these work: you feed in data, and through a series of processes the training modifies the different parameters (the weights and biases on each of those little nodes and the connections between them), which is its capability. A huge amount of data goes in, and during the training all those little nodes in all those layers (I'm not drawing a hundred billion of them) connect and have their weightings modified. Once the training is complete, you end up with your model. This is what OpenAI has been creating in close partnership with Microsoft: Microsoft provides the massive clusters, because the training requires a huge amount of time, a huge amount of data, and a huge amount of compute, primarily GPUs, because GPUs (think tensors) are very good at performing mathematical operations on huge arrays of numbers, which is really what all of this is based on. Great, so we create this model, and the whole point is that once we have it, it gets exposed as an API. As the user of this solution, I give it my input; we call the input the prompt. I send my prompt to the API, and the API responds with an output; you might hear this called an inference. What it's actually doing is taking my input and predicting the next most likely word, then the next most likely word after that; it feeds that back in and continues until it reaches what it considers a completion token. Now, I'm saying words, but in reality it breaks the text down into something called tokens, because just words on their
own don't work well across different languages; that's not ideal. So it works on these things called tokens, which give it the flexibility to work in multiple languages and let it perform mathematical functions on the numeric values of those tokens; a token represents a part of a word, and we'll come back to that. The whole point is that I give these models a prompt and they predict the next token, then the next token, then the next, based on what I gave them and the knowledge they have; in a way, that's really all they're doing. They have a whole range of possible outputs, each with a certain probability, and there are things you can tweak to make the output more imaginative and creative, essentially raising the tolerance for choosing less-likely tokens. That's why you see that behavior, and we can see it live. If we jump over for a second, I'll start with Bing Chat, which is Copilot in Bing, and it tells you that you can ask it anything: "Tell me a story about YouTuber John Savill and a dragon named Bob," and I send it. So that's the prompt. It's not a very good prompt, and I know roughly what it's going to come out with, but notice the way it outputs the information: almost one word at a time, because it's predicting one word at a time, then feeding that word in along with the previous parts and asking, what's the next most likely token? And the next one, and the next, until it reaches what it considers a completion. So it has generated this very imaginative story about John and the flying dragon, and John recording and laughing (I'm going to get away from that pretty quickly; no idea what the rest says, and I don't want to know). The point is, you saw it: it got the prompt, then it output the tokens one at a time as it generated the story.
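That loop, score the candidate next tokens, pick one, feed it back in until a stop token appears, can be sketched with a toy model. This is a minimal illustration, not a real LLM: the vocabulary and scores below are invented, and the `temperature` parameter plays the role of the "imagination" knob just described, flattening the probabilities so less-likely tokens get chosen more often.

```python
import math
import random

# Invented scores for which token tends to follow which (NOT a real model).
NEXT_TOKEN_SCORES = {
    "the": {"quick": 2.0, "lazy": 1.5, "<end>": 0.1},
    "quick": {"brown": 2.5, "<end>": 0.2},
    "brown": {"fox": 3.0, "<end>": 0.2},
    "fox": {"<end>": 2.0, "jumped": 1.0},
    "lazy": {"dog": 2.5, "<end>": 0.3},
    "jumped": {"<end>": 2.0},
    "dog": {"<end>": 2.0},
}

def next_token_probs(token: str, temperature: float = 1.0) -> dict:
    """Softmax over candidate scores; higher temperature flattens the
    distribution, so less-likely tokens are sampled more often."""
    scores = NEXT_TOKEN_SCORES[token]
    exps = {t: math.exp(s / temperature) for t, s in scores.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}

def generate(start: str, temperature: float = 1.0, seed: int = 0) -> list:
    """Repeatedly sample the next token until the stop token appears."""
    rng = random.Random(seed)
    out = [start]
    while out[-1] != "<end>":
        probs = next_token_probs(out[-1], temperature)
        tokens, weights = zip(*probs.items())
        out.append(rng.choices(tokens, weights=weights)[0])
    return out
```

Raising `temperature` here does exactly what the "creativity" settings in real services do conceptually: the most likely token stops dominating, so runs diverge more.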
It doesn't understand what we're giving it; it's purely predicting the next most likely token. But because that prediction is based on a massive amount of data (not quite the complete internet and all the books, but huge amounts of information), the prediction is very intelligent. And how different is that from how we work? I don't know exactly how the human brain works, but when we talk, are we just predicting the next word that should come, and the next, and the next? We can see this tokenization if I jump over super quickly, just so you get the idea. If I enter some text, "the quick brown fox jumped over the lazy dog," you can see it's breaking it down into these various tokens. All of these words are quite common, so there was one token for each word. That's not always the case: if I type in "Oklahoma," we can see it was actually three different tokens, because Oklahoma is not one of the base words it understands. This is what actually gets passed in and what comes out, and this is where we start paying attention. You may ask, why do we even care about tokens at all? It's because a lot of the time when we use these capabilities, we pay for the number of input tokens and the number of output tokens, and there are also limits: a limit on how many tokens we can send in and how many it can send out. So we want to be thoughtful and only send what we need, especially once we start adding additional data. Now, this looks fantastic, but realize what it's doing: we give it a prompt and it gives us an output, which means on its own the only thing it can actually do is inference based on the set of training data it was given, which is vast (that's supposed to be a book I've drawn, by the way), but only up to the point in time the model was trained.
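Because billing and limits are per token, a rough back-of-envelope estimate is often useful before sending data in. The four-characters-per-token figure below is a common rule of thumb for English text, not an exact rule; real tokenizers should be used for precise counts, and the prices are placeholders, not actual rates.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token for English."""
    return max(1, round(len(text) / 4))

def estimate_cost(prompt: str, expected_output_tokens: int,
                  price_in_per_1k: float = 0.01,
                  price_out_per_1k: float = 0.03) -> float:
    """Estimate a request's cost; prices here are placeholder per-1k rates."""
    tokens_in = estimate_tokens(prompt)
    return (tokens_in / 1000) * price_in_per_1k \
         + (expected_output_tokens / 1000) * price_out_per_1k
```

The same estimate helps you stay under context-window limits: if the retrieved data you plan to add pushes the total past the model's token limit, you need to trim or summarize it first.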
And it's only on what it was trained on; that's all it can do. It has no idea of any other information, and often we want more. If I think about how we use this and what I need from it: I want it to be able to use additional, external data: my mailbox, my calendar, this PowerPoint I'm working on, the internet for more current information, data from my security system; the list goes on. I also want external compute: I might want to tell it about certain functions that exist, so it can ask for another API to be called to schedule a trip for me, or log my paid time off, whatever that might be. I want it to have a memory: if I want to chat, I give it a prompt, it gives me an output, and then I want to give it a follow-on question. This has none of that by default: it's a prompt and an output, and if I give it another prompt, it has no idea about the previous output. So I would like it to have memory; maybe that means that with my new prompt I have to send my previous prompt along with the output it gave me, but that's a facility I want. I might want to modify the behavior: how it responds, and in what format. Output this in a table; explain it to me like I'm six years old; whatever that might be, I want to be able to modify how it responds. And I might want to chain things together: do this thing, then this thing, then this thing, multiple sets of actions and prompts in different styles to achieve my overall goal. That's not really an unusual set of requirements. Think about a human being: I have a certain amount of knowledge I've accrued over my life, and you can ask me a question. You could give me your prompt and I would respond in my natural manner of talking, based on what I know, which is unlikely to be that useful to you.
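The external-compute idea, telling the model about functions it can ask the application to run, can be sketched like this. The reply format and the `log_time_off` function are invented for illustration; real chat APIs return a structured tool or function call, but the dispatch pattern on the application side is the same.

```python
import json

# A hypothetical function the app is willing to run on the model's behalf.
def log_time_off(start: str, days: int) -> str:
    return f"Logged {days} day(s) off starting {start}."

AVAILABLE_FUNCTIONS = {"log_time_off": log_time_off}

def handle_model_reply(reply: dict) -> str:
    """If the model asked for a function call, run it; otherwise return its text."""
    if reply.get("function_call"):
        call = reply["function_call"]
        fn = AVAILABLE_FUNCTIONS[call["name"]]
        return fn(**json.loads(call["arguments"]))
    return reply["content"]

# Simulated model output requesting a call (a real API would produce this):
fake_reply = {"function_call": {"name": "log_time_off",
                                "arguments": '{"start": "2024-07-01", "days": 2}'}}
```

The key point is that the model never executes anything itself; it only emits a request, and the application decides whether and how to run it, then typically feeds the result back into the next prompt.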
And so what you might say as well is: John, tell me about this; here's a book on the topic, or here's access to my mailbox; go and read through it and then summarize this stuff. Well, I can read the book, I can go through your email, and then I can do that. You might say, and I want you to explain this to me as if I'm a middle schooler, so I would adjust how I explain it to meet that; that was you asking me to change my behavior. Then you ask me a follow-on question; well, I remember what you asked previously, so I have that context and can continue. You might ask me to do multiple things. As human beings we have all of that, so how do we fill those gaps when I have an API that only lets me send a prompt and get some output? There are a number of different capabilities around this, but for the most part they boil down to two or three core concepts. Think about the idea of other data: I've got some other data over here, and if I want the model to use it, we have to add it into the prompt. You'll hear the term retrieval augmented generation, a very fancy term for saying we're going to add some data, retrieved from somewhere, to let the model generate from it. Hey, I want you to summarize my emails; here's the list of emails pertinent to that. It can go and get that data, maybe through the Microsoft Graph, maybe from SharePoint, maybe from a database somewhere; we'll talk a little more about how the details of that really work, but it's a huge area: adding additional data that's newer and pertinent to what we're asking it to do. We may include examples of the type of response we want it to give; this is called few-shot prompting. I might describe the functions that are available. And I might add additional prompts to describe how I want it to act: this is maybe the user prompt, but what I can add in is the idea of a system prompt, where I may specify certain rules: you are a
helpful AI assistant; you should never be rude; if you don't know the answer, say so, or whatever you want it to say. So I can guide how I want it to act, and then here's the user's prompt that they typed in. If there are functions it should be allowed to ask me to call on its behalf, I can add the descriptions of those functions into the prompt: hey, these functions are available to you if you want me to call one for you. All of this is called prompt engineering, which is really just a fancy term for: I'm going to change the prompt to make it behave the way I want, to meet my end result. I might even include that idea of memory. Remember what happens: I have some application, and the application is what's really calling the API. The application has the prompt, calls the API, and gets the response back. What I can also do is, when I get that response, include it back into the next prompt; this is how I get memory. So we can start to add in all of these things to make the interaction more human and make it behave exactly how we want. This is all background so you understand what has to be provided to give us this nice experience. Back to my terrible, terrible story over here: "Can you add eating a pizza?" It's going to use the memory, and that's what it says: absolutely, let's add pizza into the mix. So now it's going through and modifying the story; it has context, it knows what it was doing before, it has that memory to let it go and do other things. (I actually now want to read it, because it's talking about pizza, but I'll do that afterwards.) Great, that's the core of everything we're doing. Now, how do we actually start using it in the various environments? OpenAI trained the model, say GPT-4, in Azure, but we don't all then use the same instance of the model. What happens is they copy it: this completed model gets copied to many, many different places.
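Putting those pieces together, system prompt, retrieved data, replayed history for memory, then the new user prompt, an application's prompt assembly might look like this sketch. The message shape mirrors common chat-completion APIs; all names and content below are illustrative.

```python
def build_messages(system_rules, retrieved_docs, history, user_prompt):
    """Assemble the full message list an app might send per request."""
    # 1. System prompt: rules for how the model should behave.
    messages = [{"role": "system", "content": system_rules}]
    # 2. Retrieval augmented generation: inject retrieved data into the prompt.
    if retrieved_docs:
        context = "Use the following data to answer:\n" + "\n".join(retrieved_docs)
        messages.append({"role": "system", "content": context})
    # 3. Memory: replay earlier prompt/response pairs so the model has context.
    for user_turn, assistant_turn in history:
        messages.append({"role": "user", "content": user_turn})
        messages.append({"role": "assistant", "content": assistant_turn})
    # 4. The user's new prompt goes last.
    messages.append({"role": "user", "content": user_prompt})
    return messages

history = [("Summarize my inbox.", "You have 3 unread emails from your manager.")]
msgs = build_messages(
    "You are a helpful AI assistant. Never be rude.",
    ["Email 1: project deadline moved to Friday."],
    history,
    "What did the first one say?",
)
```

Notice that "memory" is nothing the model itself holds: the application resends the earlier turns on every request, which is also why long conversations consume more and more tokens.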
It's copied here and here and over there, and we can think about why we want different copies. They might be different environments it has to run in; there might be regulatory requirements; there might be data sovereignty requirements, so I need it in certain countries; it could be geography; it could be just for scale, because I can't meet every single inference request in one cluster. I also want to think about latency; I want the models close. Latency can actually mean two things when it comes to these GPTs: yes, there's a distance component, but it also comes down to scale, because it takes time to think and perform these operations. I might want the ability to have a certain amount of provisioned capacity for the inferences of my critical app, to reduce the latency of the inference; that's where you hear about things like a provisioned throughput unit, a PTU, which says: I want to provision this amount of GPU capacity for this model so I reduce the latency of those responses. So all these different instances exist all over the place, and we have different models of GPT available to us in different environments (this line is going to go up and up and up; I'll try to keep adding to it). We now have GPT available to us, fantastic: we have the ability to send a prompt and get a response. So how do we use this? What are our options? I want to start with copilots. Remember all of those different things: at heart, a copilot is an orchestrator. It knows about certain data, it knows about certain functions; it knows what it wants to add to the prompt, what to do with the response from the model, and maybe how to reformat it before giving it to the user. So it's really providing all of those different functions and access to different data based on the particular service it's in, and that's what gives me a unique copilot. It's primarily: hey,
what data am I going to pull from to add to the prompt? What are the functions I may want to tell the model it can call to get additional data and do other things? And what is the type of interaction, what instructions do I give it on how I want it to act? That really is the primary thing; beyond that, they don't modify or retrain the GPT. It's the GPT that just exists. So let's try to do this with a bit of a walkthrough and a decision matrix. We'll start here, and the first real copilot, I think, was GitHub Copilot; that's where all of this really kind of started. So I might ask: do I need development help? I'm a developer, I'm in a developer environment; well, yes, I need that type of help, so the logical solution here is GitHub Copilot. The idea is that we have our prompt, which very commonly includes some portion of code, and GitHub Copilot interacts with the model: it's constantly talking to our GPT with the prompt and getting a response back. But the functionality this enables goes way beyond generating code. It can also generate documentation: here's my code, document it; here's my code, add comments to describe it; here's my code, create a test plan. It can do all of these things and more. Think about migration scenarios: I've got some legacy mainframe system, and I can say, here's this old, old code; I would like to move this to a modern platform written in C# or Python, whatever you want. It can do that. It can help integrate code into a messaging platform: Service Bus, MQ, whatever I want to do. And it's not even a specialized coding model; it used to be, but GitHub Copilot now just uses GPT, because it's that good at coding as well.
And so if I'm a developer in a dev environment, GitHub Copilot is typically going to be that key solution I'm going to want to leverage. Okay, so I'm not a developer; I don't want to use it for development tasks (oh, wrong arrow, there we go). Now we come to the idea of: do I want a solution for my knowledge workers in Microsoft 365, in Dynamics, in one of those solutions that has a specific "Copilot for" type offering? So, a worker solution: I'm going to just write M365, but it could be Dynamics, and there's a huge number of those copilots available. So now we'll say: yes, I want to help my people use those types of solutions. But before we go any further with the actual solution: does it need any customization beyond what it can do out of the box? For right now we'll say no, I don't require any customization. So the solutions here are those "Copilot for" offerings: Copilot for Microsoft 365, for Dynamics, for Sales, for Service; you'll see all these different copilots designed to work with a particular service. Remember the interaction here: I as the user am working in some context. I'm working on some PowerPoint or some opportunity, or I'm in my mailbox, whatever that might be. So the context is me interacting with it, and I have certain data I have access to; in an Office world this is accessed via Microsoft 365. This copilot is really acting as the orchestrator: when I ask it to do something (hey, summarize all the emails from my manager over the last two days), the copilot goes and talks to the data, gets the relevant information, brings it back, and adds it to the prompt that it then sends to the large language model. That's the key part of what it's doing: it's just interacting with the GPT, and it gets the response back. Now, the important part is that when it's interacting with that data, it's acting in the user's context. It can't access anything the user can't access; it's using constrained delegation,
so I have authenticated, and then the copilot, the orchestrator, acts as me. It can't accidentally get access to data, returned and added into what gets sent to the large language model, that I don't have access to; I have to already have access to it. Then prompt engineering is used to improve the instructions so I get quality responses: it creates the prompt that actually gets sent, with the system prompt and the function descriptions ("you can ask me to go and get you extra data from Microsoft 365, or from Dynamics, or from this security plugin for Copilot for Security," and so on). It's all just different sets of data, different instructions on things it can call and how I want it to behave, that get sent, with the response returned. Now, this is a point of concern for a lot of organizations today. It's acting as the user, and the copilot cannot access anything the user couldn't already access. But as a human being, I may not be that great at finding the stuff I have access to, and this is going to be very good at finding anything pertinent to what it believes will help give a quality response. Behind the scenes, very often, it creates something called a semantic index. You'll hear about vector embeddings: if you think about natural language, I can use the same word to mean potentially different things, and many words can mean the same thing (dog, mutt, wolf; I don't know many names for a dog, but they're different names for similar things). What the semantic index does is convert the data into multi-dimensional arrays that represent the meaning of the data, which makes it very easy to find things even when using natural language. So this can find stuff I maybe would never find myself. This is not a weakness of the copilots or large language models, but it could absolutely be that I, as an organization, have not done a very good job of my data governance: I have not tagged data accordingly, and I do not have good data-leakage prevention.
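The semantic-index idea can be illustrated with a toy example: documents and queries become vectors, and similar meanings land near each other, so a query about a "puppy" can surface a "mutt" document even though the words differ. Real systems use learned embedding models with hundreds or thousands of dimensions; the three-dimensional vectors below are invented purely for demonstration.

```python
import math

# Invented 3-D "embeddings": the two dog-related documents deliberately
# sit close together, while the unrelated report sits elsewhere.
EMBEDDINGS = {
    "my dog chewed the couch": [0.9, 0.1, 0.0],
    "the mutt buried a bone": [0.8, 0.2, 0.1],
    "quarterly sales report": [0.0, 0.1, 0.95],
}

def cosine(a, b):
    """Cosine similarity: 1.0 for identical directions, ~0 for unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec, top_k=1):
    """Rank documents by similarity of meaning, not by matching words."""
    ranked = sorted(EMBEDDINGS,
                    key=lambda d: cosine(query_vec, EMBEDDINGS[d]),
                    reverse=True)
    return ranked[:top_k]
```

This is also why the governance concern is real: a semantic search finds content by meaning, so files a keyword search would never surface (and that nobody realized were broadly readable) suddenly become easy to retrieve.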
I've not set the permissions correctly. And this is why you'll see a lot of organizations now getting concerned about these copilots: they will go and find data the organization didn't realize was open to all employees, or to a particular group of employees. If I am concerned about this, this is where you would think about using something like Microsoft Purview. Microsoft Purview is the solution that will go and look at your data, and there are even special functionalities in Purview now: it will show me how AI apps are using my organizational data; it will find where my sensitive data is and suggest labels, protection policies, and permissions; it will find unlabeled data and sites being referenced by, for example, Copilot for Microsoft 365; it can find non-compliant and unethical use of AI interactions against your data; and it has compliance assessment templates. So there are solutions to help me if I don't have fantastic data governance and labeling. Again, this is not a copilot issue, but copilot will amplify any weaknesses I have in my data governance, labeling, and protection, because it's going to find that data very, very easily. So if I just need a solution for various types of workers, knowledge workers, whatever that might be, and I don't need customization: hey, "Copilot for whatever," if it exists, is probably going to be the right solution. Even on the internet, remember, you have Bing Chat; if I'm a home user I can use Copilot Pro, and you'll see a little toggle. If I'm in an organization I'll see a toggle that says "web": that performs internet searches, analyzes web pages, and brings the results to me. If I toggle it to the "work" option, it's going to search the internet, find my chats, find my emails; it really behaves a lot like the Copilot for Microsoft 365 chat, so I can do all of that from a nice web interface. Okay, but I said no to customization. What if I do want to
customize stuff? So I need customization; let me go over here: yes, I need customization. Now, there are many different types of people in an organization: I have my developers, I have my knowledge workers, and obviously they're comfortable with different levels of programming and of changing things. So maybe initially: do I want a low-code or maybe no-code, text-based solution? I don't need multimodal, I don't need images and video; it's just a text-based set of interactions, and I really want it as a SaaS solution, i.e. I don't want to be managing a bunch of back-end infrastructure. Okay, so my arrow goes: yes, I want that; that sounds really good. This is where you get into Copilot Studio (let me pick another color for this one). Copilot Studio is there to do these things (I'm going to move this over just a little bit): it's all about creating my own sets of capability to meet my requirements. It's a fully managed service with no-code to low-code types of capability. It used to be Power Virtual Agents; if you ever used Power Virtual Agents, it was basically rebranded, but then they added in the magic of generative AI, which transforms not just the types of output it can create and the interactions, but also what I have to do to light up various scenarios. So it's got a broader mandate than it used to have, and it can really do two different things. We start with the first thing it can do: I can create a custom copilot. Forget about anything else; I want to just create a custom copilot, and within the custom copilot there are a number of different elements. If we jump over, let's find my custom copilot area. Straight away when I go to it, it's telling me Power Virtual Agents is now Microsoft Copilot Studio, so it's telling me about that rebranding. I can go to Copilots, and I've created one already, but I could say, hey, I want to create a new copilot, and it's asking me for a name,
so I'm going to call it "MS copilot." It's got language settings, and notice what it's asking me for: enter a website. From here I can just enter any website I want (it has to be publicly accessible), but I can also hook this into SharePoint, OneDrive, and files that I can upload. It's giving me a simple option (I can do "edit advanced options" down at the bottom), but initially it's just saying: give me a website and I'll create a copilot that works off of that; it will use Bing to search the website. You can see here I could select different icons, types of solution I want, and schema names, but I'll just enter my website, say microsoft.com, hit create, and it's now setting up that copilot for me. Now, if I go back over to my base environment, I already created one, so I'll save a bit of time walking through all the capabilities. To get started, previously we would have to create topics. A topic is a single-turn conversational thread between the user and the copilot, where I would create an answer based on a specific utterance. So I would create a topic; notice I could do "from blank" up here, and from blank means I have to give it all the different phrases and all the different ways the user might ask about it. But the other thing I can now do is create a topic based on a description with copilot: I can use copilot to build the topic from a description. Here I give it a name, "look up documentation," and I just tell it what I want it to do: "lets the user search documents for answers," and I hit create. What it's now doing is, based on that description, creating some phrases; look at the phrases it created. If it sees any of these (and there are more on the scroll bar), it will call the next set of actions, and here I could enter particular questions. There's a whole set of different capabilities there, even advanced things: I can call HTTPS requests, I
can send an event I could have an Adaptive card presented to them so I still have all these capabilities to interact the user and call different things and do various things I have all of these topics and that's great but I don't even have to do that now because what we also have is generative AI so generative AI I can use the topics a lot less now notice here it's got my website so this will go and pull information using being it's up to two levels down if I remember correctly if we look at the help make sure it's external because it's got to be found by being don't use sites with forms or com comments because this could obviously reduce how good it's going to be and two levels of depth so I can give it multiple public websites but I could could also add in SharePoint providing I've got authentication enabled or one drive or I could directly upload files and if I directly upload files well then these will go and get stored into my data verse whatever my data verse limit is and with that it it's done like I can just do this and what it's doing is it's using these generative answers and I've got this little test co-pilot area down here and I could ask it a question um how can I reboot Windows 11 and so what it's now doing is it's using Bing to go and search the Microsoft documentation and it's giving me an answer okay what about Windows 10 now I didn't say how do I restart Windows 10 I just said what about so it's given me memory let's really challenge it what about Windows Vista I'm sorry can't help you that can you try refr raising or please get a better operating system it's basically what it's saying but we can see what we're doing with the custom co-pilots is we can bring in a whole set of different things so we have the generative Ai and what we can do is with the custom co-pilot and when we turn on the generative AI the generative AI is interacting with the model so it is going to send the stuff we're sending try me nuts it keeps doing that it is sending to 
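Under the hood, this "generative answers" pattern is retrieval augmented generation: find the most relevant snippets from the configured sources, then hand them to the model as grounding with an instruction to answer only from those sources. Here's a minimal, illustrative sketch; naive keyword scoring stands in for the real Bing or vector search, and all function names and documents are made up:

```python
import re

def words(text: str) -> set[str]:
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score(question: str, passage: str) -> int:
    """Naive relevance: count of shared words between question and passage."""
    return len(words(question) & words(passage))

def build_grounded_prompt(question: str, passages: list[str], top_k: int = 2) -> str:
    """Pick the top_k most relevant passages and wrap them as grounding context."""
    best = sorted(passages, key=lambda p: score(question, p), reverse=True)[:top_k]
    context = "\n".join(f"- {p}" for p in best)
    return (
        "Answer ONLY using the sources below. "
        "If the answer is not in the sources, say you cannot help.\n"
        f"Sources:\n{context}\n"
        f"Question: {question}"
    )

docs = [
    "To restart Windows 11, open the Start menu, select Power, then Restart.",
    "Dataverse storage limits depend on your licensed capacity.",
    "Windows 11 supports snap layouts for arranging windows.",
]
prompt = build_grounded_prompt("How can I restart Windows 11?", docs)
print(prompt)
```

The real service does this retrieval against Bing, SharePoint, or Dataverse-stored files rather than an in-memory list, but the shape of the final grounded prompt is the same idea.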
What I can then add, remember: the public web, SharePoint or OneDrive, files I upload to my Dataverse, and topics if I want them. There will still be times I want to drive certain behaviors, maybe bring up cards and drive specific flows, but I don't have to do as many things as I used to. So I still have topics, but I might want the copilot to do other things beyond just adding in these additional types of data. And just to be very clear, this is one reason I might use a custom copilot over Copilot for X: I want to use data from somewhere that is not native to any of those solutions. Say I just want to create a help chatbot for my software; maybe my help documentation is in my SharePoint site, on a public site, or in a PDF document. I can just add it to my custom copilot and make it available.

But what if I want to enable it to do other things? Well, I also have the ability to define actions. There's a whole set that are built in, and I can also create custom ones. There are other elements to this, but one of the really nice things is that previously, to use these actions, I would have to create a topic saying "if it's this utterance, go and call this action". I don't necessarily have to do that anymore. If I look at Actions over here (I already added one, but I can just do Create), it shows me a whole set that are built in, like "get the forecast for today" and "get a row", and there are others. I can also create a brand new flow: I could go to Power Automate, create a new custom connector, and do anything I can do in Power Automate. It's nice low code; I can call any existing REST API and do various things, and once I create that custom flow, it becomes available here as an action. So I've got all these different types of actions, or I can create a new one, upload a new skill, or create a new flow.

When I look at the MSN Weather action, for example, one thing you'll notice, and this is really important, is the model description: "Get the forecast for the current day in the specified location". Why is this model description so important? Without it, to use this action I would really need to go to my topics and create a topic with the right utterances to call it. But if, under generative AI, I turn on dynamic chaining, the copilot will automatically find the different plugins that exist and look at each one's model description. Now if I look at my topics, there's a whole bunch of new ones, a lot of them marked "dynamic chaining", and there's one for "unknown intent", the fallback for when it's not sure what the user wants. With this enabled, based on that model description, it will automatically call the action if it thinks it's the right thing to do. "What is the weather today in, I don't know, New York?" One of the nice things when dynamic chaining is on is this little icon up here that lets me see exactly what it's doing. Today in New York there will be rain, and the log shows it called "get the forecast for today"; because I gave it a location, it was able to leverage that automatically, and there was a whole set of outputs, with even more data I could have used had I requested it. The key point is this gives me a massive amount of power to do many, many different things without really coding anything. So this is called low code, but thus far it has mainly been a no-code type environment.
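Conceptually, dynamic chaining is intent routing driven by each plugin's model description. Here's a toy sketch of that routing idea, using naive keyword overlap where the real feature uses the language model itself; the plugin names and the overlap threshold are hypothetical:

```python
import re

# Each plugin exposes a natural-language description of what it does,
# mirroring the "model description" shown on the MSN Weather action.
PLUGINS = {
    "get_forecast": "Get the forecast for the current day in the specified location",
    "create_ticket": "Create a new support ticket for the user's reported problem",
}

def words(text: str) -> set[str]:
    return set(re.findall(r"[a-z]+", text.lower()))

def route(utterance: str, min_overlap: int = 2) -> str:
    """Pick the plugin whose description best matches the utterance,
    falling back to the unknown-intent handler when nothing matches well."""
    scores = {name: len(words(desc) & words(utterance))
              for name, desc in PLUGINS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_overlap else "unknown_intent"

print(route("what is the weather forecast today in New York"))  # get_forecast
print(route("tell me a joke"))                                  # unknown_intent
```

The real implementation is far smarter (the model reasons over the descriptions and extracts parameters like the location), but this is the essential mechanic: a good description is what makes a plugin discoverable.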
Then there are also things like entities: real-world subjects I want the natural language understanding to recognize and pull particular information from. I can go and add my own entities. Now, how do I actually use this copilot? If I go back to my model (this is my overall copilot right here, with various settings around it), I've got Publish. If I go to Publish and publish the model, there are all these different channels I can deploy it to: Teams, a demo website, a custom website, Facebook, Skype, Slack, GroupMe, and, depending on the solutions in my environment, SMS. So once I have my custom copilot created, I deploy it, and I can deploy it all over the place. Teams might be super popular; Slack, Facebook, Telegram, the list goes on; or your own custom site. And remember, I can enable authentication for these, depending on the types of data I need to reach. So now I've created my completely own copilot based on my data and based on actions; it might even be my own RESTful API that I created a little custom connector for, so the copilot knows when to call that action. Very powerful things that I can make available in a lot of different ways.

Now, for the licensing of this: I'm not a licensing person, but basically you buy a capacity pack of messages per month, I think in packs of 25,000 messages. I stack those, and then I can look at how many messages I'm actually using to make sure I stay compliant; that's at a per-tenant level. There is also a per-user license, but that just controls who can use Copilot Studio: I have to have the per-user license assigned to me to author copilots. It's not really a money thing; it's to control who is creating the copilots that consume those messages. The users leveraging the copilots I make available don't have to have any license applied to them, but obviously they will use up those messages.

Now, I said Copilot Studio can do two things. One of them was custom copilots; the other is extending a first-party copilot. This is where I have my existing copilots, in Teams and in Microsoft 365 primarily (that's where this is commonly used today), and I just want to add some additional functionality. I can do all the same things, but I'm deploying it into the first-party copilot: I take what it can already do and supplement it with maybe some additional actions and some additional sets of data. When I'm extending a first-party copilot, if I have Copilot for Microsoft 365, that includes the license to use Copilot Studio for the lighter features, typically conversational experiences in Teams, so that's licensed a little differently.

But what if I want to do more? What if I don't want just the low-code solution? I want multimodal speech and images and video, I want to call multiple models, maybe I want more complex chaining, and I want more control over how the responses come back and the functions I call. If I don't want the low-code solution, the next question is: do I want pro code, running the models in the cloud? If the answer is yes (let me give myself a bit more space), this is where we get into Azure AI Studio. Now, I say Azure AI Studio; there's also, as I'm sure you've heard, Azure OpenAI Studio, which is specific to the OpenAI models (the GPTs, Ada for embeddings, and so on), but a lot of the functionality is very similar. The idea here is that now I have complete control: I pick the models I want to use, I have complete control over the system prompt I send, I have complete control over the memory, and I have complete control over that
retrieval augmented generation, and where I'm getting it from: am I using Azure AI Search, Cosmos DB, or PostgreSQL with the embedding models added to it? I have complete control, and I can use provisioned compute capabilities; this is where I get into the pro-code type scenarios. Now, I'm not going to go into detail on this, because I have separate videos where I talk about using your own data and describe RAG, so I don't want to reinvent that wheel; I hadn't shown Copilot Studio before, which is why I went into more detail there. The point of Azure AI Studio is that I have control over all of those pieces.

I've got an OpenAI model deployed, so I'll quickly show that. In Azure AI Studio I could create a project and deploy any of the many, many types of models that exist; there's a huge number of them. If I look at collections, for example, I could filter to those created by Azure AI, and there are all these different models I could leverage: large language models, and small language models that are still very powerful for specific types of interactions. I could deploy one and then use it. I already have a deployment; here I'm using the Azure OpenAI Studio, but it really has a similar set of capabilities. What I've got deployed is GPT-4, plus Ada. Ada is used to create the text embeddings that give us the natural-language, nearest-neighbor way to find data based on semantic meaning, not on exact words. Then you have a playground, so I can chat right away. You'll see things like system message templates (ways you want it to act), and you'll see I can change the configuration: there are parameters for how creative I want it to be, so I have temperature and top P (generally you only adjust one of these at a time), plus the current token counts I'm using. I can also add different user/assistant example interactions, and I can add in my own data; that's all about retrieval augmented generation, and it lets me test those interactions.

So if I think about what I can do with Azure AI Studio: I have complete control. From here I completely drive the system prompt and how I want it to behave; I manage all of that. I can call things like Azure AI Search. Now, I don't have to use Azure AI Search; I just think it's very powerful for retrieval augmented generation. I absolutely could hook RAG directly into things like Cosmos DB or Postgres, and there are other databases that support vectors; I don't have to use Azure AI Search at all. But one of the really nice things Azure AI Search does, once it has the data it's talking to and embedding, is a hybrid search. Yes, it does the embedding-based vector search, finding the nearest neighbors in those multi-dimensional arrays based on the semantic meaning of what I'm asking. But it also does a lexical search based on specific words, which is useful if I'm looking for product names or part numbers. Then it does the hybrid part: it merges the two and finds the top results of both. Then it does a semantic reranking, similar technology to what Bing uses, to say: of these results, which are the most relevant to what was asked? It scores them and returns only the ones that are the most accurate and related to the question.

That's really important, because remember what happens next: those results go to the GPT, the large language model, where I pay for tokens; I pay for the number of input tokens. I don't want to return a whole bunch of data that's not relevant, because I'm going to pay to send it, I'm going to waste money, and I'm going to pollute the quality of the answer, because garbage in will be garbage out. So that hybrid search and that semantic ranker are really powerful: they don't just find the best results, they return only the most relevant information, so I'm not wasting tokens sending junk that muddies the answer and wastes my money. That's why it's such a useful component, but I absolutely don't have to use it; in fact, nearly every database now is adding some kind of embedding capability because of retrieval augmented generation, with ways to hook into that.

So I can customize the system prompt, I can call multiple models, I can hook into other API capabilities I want to call, and I can tweak all the different parameters: that temperature, how wide within the probability range I want it to sample, so I can be more creative or less creative. I have complete control. And when I talk about interacting with a GPT, remember that what I actually create in Azure AI Studio is a model deployment, and that's where I can optionally add things like provisioned throughput units. If I have business-critical generative AI and I want to remove as much latency as I can from generating those inferences, those tokens coming back, then with provisioned capacity it's not pay-as-you-go where I'm battling other people for capacity; I have a certain amount of compute provisioned for me, which reduces those latency interactions. So if I want all of the power of pro code: Azure AI Studio, and creating those things there.

One thing I will stress, though: yes, I can manually create the system prompt, manually tweak all those parameters, manually hook into the different search services and the models, and change the way I write my code. I have the playground, but eventually I'm going to run an app that calls APIs to do all of this. The playground is just helping me write my app.
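The hybrid idea, running a vector ranking and a keyword ranking and then merging them, can be sketched with Reciprocal Rank Fusion, one common way to merge two rankings. Azure AI Search's actual scoring and semantic reranker differ from this; the two-dimensional "embeddings" here are hand-made toys purely for illustration:

```python
import math
import re

# Each doc: text plus a fake 2-d "embedding" vector.
DOCS = {
    "d1": ("restart your PC from the Start menu", [0.9, 0.1]),
    "d2": ("part number X-200 replacement guide", [0.2, 0.8]),
    "d3": ("how to reboot the machine safely",    [0.85, 0.2]),
}

def cosine(a, b):
    """Cosine similarity between two small vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def keyword_score(query, text):
    """Lexical relevance: shared-word count (catches part numbers, names)."""
    q = set(re.findall(r"[a-z0-9-]+", query.lower()))
    return len(q & set(re.findall(r"[a-z0-9-]+", text.lower())))

def hybrid(query, query_vec, k=60):
    """Merge the vector ranking and keyword ranking with Reciprocal Rank Fusion."""
    vec_rank = sorted(DOCS, key=lambda d: cosine(query_vec, DOCS[d][1]), reverse=True)
    kw_rank = sorted(DOCS, key=lambda d: keyword_score(query, DOCS[d][0]), reverse=True)
    fused = {d: 1 / (k + vec_rank.index(d) + 1) + 1 / (k + kw_rank.index(d) + 1)
             for d in DOCS}
    return sorted(fused, key=fused.get, reverse=True)

# A "reboot"-flavoured query vector: semantically close to d1 and d3.
print(hybrid("how do I reboot", [0.9, 0.15]))
```

The point of the fused list is exactly the one above: documents that rank well on either semantic or exact-word relevance float to the top, and only those top results need to be spent as input tokens to the model.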
But then my app still has to go and talk to all of these things to make the work actually happen, and I most likely don't want to reinvent the wheel every time. Remember what we said the copilot really was: an orchestrator. It managed the prompt, it managed the interaction, it managed going and talking to the data. Now I'm writing my own app, but do I really want to define all the different interfaces for the different models, for the different data I might pull, for the prompting, for calling external APIs, for chaining multiple steps together? I probably don't. So while Azure AI Studio is fantastic for deploying the models, experimenting, and tweaking what I want, most likely I'll want to use an orchestrator to abstract away some of the details. With the orchestrator I create a particular set of configuration that ends up creating an agent, an agent that does a specific task. There are multiple orchestrators available; I'm really just going to talk about two.

One of the big ones right now is LangChain. Again, I'm a developer: there's a set of code I download, install on my machine, and develop against, and I leverage it when my application is running. It has a number of components. It has wrappers around the models, for example, which means I don't have to worry about the specifics of every different type of model interaction; it does a lot of that work for me. For customizing the behavior I want from the model, the system prompt, it has prompt templates I can leverage. For retrieval augmented generation it has the concept of indexes, and the indexes actually cover a number of different things: there's a document loader component (which could load from file storage, web content, or databases), a text-splitting component, and the vector database interaction. All of these pieces are about retrieval augmented generation, packaged as really nice, usable capabilities, and then it does the retrieval. It has the ability to talk to agents; an agent is typically going to be talking to an API, but it might also be doing decision making, so from here it can call an agent that talks to whatever API I want. It adds the concept of memory, so I get that contextual continuation, the ability to have an ongoing conversation with the bot without writing that code myself. And the really big deal is what I then create: chains. A chain is just "do this bit of work, then call that agent, then call the model to do this"; I'm chaining together different units of functionality. Imagine the user types a prompt: I map it to a certain prompt template, I send it to the model, I get the response back, I call a certain API with the response, I take that result and combine it with a different prompt template ("explain it to me like I'm 10"), send it to the large language model, get the response, and return it to the user. I've chained together three different things, each a unit of work. That's the "chain" in LangChain: it lets me create these sequences of actions to perform. Then there are output parsers and all sorts of other things it can do. LangChain is super powerful.

Another one, which is Microsoft-created, is the Semantic Kernel. I really think of the Semantic Kernel in three buckets. There are plugins: this might be "I want to do retrieval augmented generation" or "I want to run some kind of automation"; you'll hear these called tools or functions. Then there's the idea of planners, which handle function calling: deciding which function to call. And then there's the idea of a persona, which is really just a set of instructions plus the use of some plugins; this then makes the various calls to the large language model and gets the responses back. Those planners are actually really powerful. They use function calling, where the function's details are given directly to the model and the model works out what it needs, rather than me describing every single thing with a prompt-based approach that uses a lot of tokens and costs me money. So the Semantic Kernel is doing very similar things to LangChain: the goal is to abstract away some of the detail and provide building blocks, so that with minimal code I can have these more sophisticated capabilities: the memory, the context, the hooking into data, the customized prompts, the different functions I make available to the solution.

Prompt Flow is another one, which I believe is natively part of Azure AI Studio. OpenAI has the OpenAI Assistants API, which basically provides a thread (a history) and then a run (a call of a function), running in their cloud. There's also AutoGen, which I should quickly mention: if an orchestrator creates one agent, what AutoGen does is let me create lots of different agents and enables those agents to talk to each other to achieve a more complicated task. I'm really just telling you these names so you understand what they're doing. If you hear the term AutoGen: multiple agents working together. If you hear LangChain or Semantic Kernel: orchestrators that I as a pro-code developer would use.
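What all of these orchestrators abstract can be boiled down to a tiny pipeline: fill a prompt template, call a model, and, if the model asks for a tool, call the function and feed back the result. Here's a self-contained sketch with a stubbed model; none of these names are any real library's API:

```python
def prompt_template(question: str) -> str:
    """Step 1: map the user's input onto a prompt template."""
    return f"You are a helpful assistant. Question: {question}"

def fake_model(prompt: str) -> str:
    """Stub standing in for a real LLM call. A real orchestrator would send
    the prompt to a deployed model, which may reply with a function call."""
    if "weather" in prompt:
        return "CALL get_weather(location='New York')"
    return "I can answer that directly."

def get_weather(location: str) -> str:
    """Stand-in for a real external API the agent can call."""
    return f"Rain expected today in {location}."

def run_chain(question: str) -> str:
    """The 'chain': template -> model -> (optional) tool call -> answer."""
    reply = fake_model(prompt_template(question))
    if reply.startswith("CALL get_weather"):
        return get_weather("New York")
    return reply

print(run_chain("what is the weather today in New York"))
```

Wrappers, prompt templates, function-calling planners, and memory are all elaborations of this loop; the orchestrator's value is that you configure these steps rather than hand-writing the glue for every model and API.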
They give me some of the more advanced capabilities on top of a model: retrieval augmented generation, customizing the prompt, calling functions, adding in the memory. That's why you'll see those terms used. And finally (I'm running out of room): no, I don't want pro code in the cloud, but maybe I want to do pro code locally. I can't run a large language model on my PC, but what I can do is run some of the small language models, and these are getting really powerful. Here the solution could be Windows AI Studio, which lets me interact with small language models running locally on my PC (Microsoft Phi-2, Meta Llama 2, Mistral), experiment, and do various different things.

And that, I guess, is my summarization; it got pretty big. The whole point, fundamentally, is that we have a large language model (GPT is very popular today), and to make it useful I want to change the way it responds (prompt engineering), I might want to tell it about functions it can ask me to call on its behalf, and I want to give it additional data. It boils down to that. So if I'm a developer and I want help in my development, in my migration from legacy to modern, GitHub Copilot is the perfect solution; it integrates very nicely with my development tools. If I just want to browse the web, Copilot is there (formerly Bing Chat), and there's Copilot Pro if I'm a home user. At work I'll see the little "work" toggle, which hooks into my intranet and my mail, acting a lot like Microsoft 365 Copilot. If I'm a worker in Microsoft 365 or Dynamics, or any product that has a Copilot for it, and I don't need to customize it, we just use that copilot: it does all the work of hooking into the relevant data and customizing the prompts, I don't need to know anything, I just interact with it, it knows the context I'm coming in from, and it will do the right thing.

Well, what if I want to customize, to add some additional capability? Then we have Copilot Studio. It can extend, typically, Microsoft 365: add some new capabilities, some external API, maybe some additional data. Great. Want to create my completely own custom copilot? I can hook into my own data, be it public web, SharePoint, OneDrive, or files I upload; I can then hook in different types of actions, built-in actions or my own flows that call other APIs and do different things. It's commonly all generative-AI based today, so I don't even have to create manual topics for most things; it will work out the right thing to do. And I can deploy it to Teams, Slack, Facebook, Telegram, SMS (if I've got the right additional services), and custom websites. Copilot Studio is phenomenal, and as you saw, it's no code a lot of the time; even in the worst case it's low code.

No, I want to be a pro developer? Azure AI Studio lets me stand up the right models, experiment with different parameters, and experiment with hooking in different data. I love Azure AI Search because of the hybrid search and the semantic reranker: get the most relevant data, send only the relevant data, spend only that money. But then I still have to write my app to hook into the API that the model deployment is exposing, and I probably don't want to do all of that myself, so there are these orchestrators out there, LangChain and Semantic Kernel being two of the big ones, that make it very easy to abstract away some of the details: in very small amounts of code I'm customizing the prompt, adding in my own data, telling it about functions it can call, and adding in the memory capabilities so I have that ongoing context. AutoGen: I've got multiple agents doing different things; let them work together to achieve far more complicated tasks. And if I want to develop locally, Windows AI Studio can help me do that and run the actual model on my PC, using the GPU I have in there.

So hopefully that explains what these are, and with that basic flow it should be pretty obvious what you're trying to achieve and what the right tool would be. Again, I just went through a decision matrix to show where they kind of fit and layer on. I hope that helps. As always, till next video, take care.
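As a footnote, the decision flow from the walkthrough can be summarized as a small rule-of-thumb function; this is a simplification of the matrix described above, not official guidance:

```python
def pick_tool(*, coding_assistant=False, needs_custom_data=False,
              pro_code=False, run_locally=False) -> str:
    """Rough mapping of requirements to the offerings discussed above."""
    if coding_assistant:
        return "GitHub Copilot"
    if run_locally:
        return "Windows AI Studio (small language models)"
    if pro_code:
        return "Azure AI Studio (plus an orchestrator like LangChain/Semantic Kernel)"
    if needs_custom_data:
        return "Copilot Studio (custom or extended copilot)"
    return "A first-party Copilot (Microsoft 365, Dynamics, web)"

print(pick_tool(needs_custom_data=True))
```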
Info
Channel: John Savill's Technical Training
Views: 23,123
Keywords: azure, azure cloud, microsoft azure, microsoft, cloud, ai, LLM, GPT, generative AI, copilot, microsoft copilot, AI studio, copilot studio
Id: ArRpwLGA2Hk
Length: 79min 7sec (4747 seconds)
Published: Mon May 13 2024