AI-102 Study Cram - Azure AI Engineer Associate Certification

Captions
hey everyone, welcome to this AI-102 study cram. AI-102 is "Designing and Implementing a Microsoft Azure AI Solution", and when you pass that exam you get the Azure AI Engineer Associate certification, so a shiny new artificial intelligence certification for you. Now, it's a two-hour exam, but that two hours includes 20 minutes for survey time, so the actual exam time is only 1 hour 40. I had 42 questions and I actually finished in 30 minutes, so my point is: when you do this exam, take your time, because you have a lot of time. Now, 1 hour 40 might not sound like much, but it is only 42 questions, they're fairly short types of question, it's not a huge amount to have to read and comprehend, and I finished in less than a third of the time, and that's just because I was in a bit of a hurry anyway. You have a lot of time; you can really take your time on every question.

I did pass, and you now get this shiny little link you can use, in case you're curious. This is the page for the exam, the AI-102. I did the exam yesterday morning; you can see it was July 15th when I actually passed it, and you get this little page you can refer other people to, so it's online-verifiable that you actually did take this exam.

Now, I definitely recommend you go through the AI-102 page, and specifically I would go through the self-paced learning. Make sure you click the "see more" button, because there are lots of different modules, but if you go through all of these modules, I think you'll pass; it has all of the information you need. You should also look at the skills measured, and you'll see there's a study guide. The study guide breaks it down into more detail about exactly what skills it is going to test, so you want to go through it and tick off: yes, I understand that, I understand that, I'm good, I feel confident I can do those various things.

In terms of what the exam is: there is no hands-on coding as such, there is no lab. There will be questions where it shows you a piece of code, and it will ask you at the start whether you want to use C# or Python; based on which one you pick, that's what the examples in the questions will be in. So it may show you a piece of code, and maybe you have to select which function you'd use, or what endpoint you'd want, or what the code is doing. You need to be able to understand code as part of the exam; you're not actually sitting there writing any. Again, if you go through that self-paced learning, there are hands-on labs as part of it. I didn't do any of them, but I went and looked at the GitHub repo to get an idea; you should go through those hands-on labs, and they will give you the experience you'll need to handle all of those coding-type questions. I'm not going to go through the labs here (there's no point), but you should: they walk you through, step by step, exactly what you're going to do, with really nice sample code in C# and Python.

As I mentioned, you do have a lot of time, so take your time, and if you don't know the answer to something, just think logically. Microsoft do not design these technologies to trip up the developers trying to leverage the solution, so it's always going to be logical. Think about what the logical, likely sequence of events needs to be to achieve something, because some of the questions are: hey, here are all the possible actions you could do; which ones do you need, and what order should they be in? So just think about what makes the most sense.
Understand the endpoint naming structure; for example, they're all something.cognitiveservices.azure.com. Understand what each service does. Often what we need to do is leverage multiple services to solve a problem: hey, I want you to take this collection of books and translate it into a bunch of different languages. Okay, so on one hand I've got images of a whole bunch of books and I need to consume large amounts of text; that's probably the Read API. And I need to do translation into multiple languages; okay, there are the translation services, so I'd want to bring those in. So just think very logically: how would I solve the problem?

You need to understand what this exam is about. The whole point of the AI-102 is the AI engineer role. When we think about an AI engineer: I need to have experience in development, and specifically this is going to be focused on C# or Python; I don't need to know both, but I need to be able to develop in one of those languages. I need to be okay with the idea of calling APIs, for example a RESTful API: I'm doing a PUT or a POST or a GET to a URI, there's a certain structure to that URI, I have a JSON body, and then I get a JSON response back. So I need to be good with the idea of using an API, and also with using a software development kit; often an SDK abstracts away that API for me, and SDKs are available for many different languages. I need to be familiar with those concepts. I also want to understand the idea of DevOps: source and version control with things like Git, and pipelines, so continuous integration, continuous deployment, continuous delivery; be good with those key concepts.

Now, one of the things we'll hear time and time again when we think about artificial intelligence is this concept of responsible AI, so I want to be very familiar with those considerations, and we're hearing about them in the news right now when you think about the GPTs, the large language models; they're a great example highlighting some of the concerns.

We always think about fairness: the idea that an AI system should treat all people the same. There should be no bias on gender or race or ethnicity; there should be no unfair advantage or disadvantage to any specific group of applicants. So as part of training, we need to make sure we have the right data so we don't influence the model in those ways.

We need to think about reliability and safety: the more we use these systems, the more they need to perform in a reliable, safe manner. Think of an autonomous vehicle (again, we see those a lot in the news): imagine that wasn't safe; there's a human life at risk in those scenarios. Or imagine it's a model that diagnoses the health of a patient: if it wasn't done very well, it might miss a diagnosis and we cause a loss of life. So we need to ensure there is no unreliability in these systems; there needs to be rigorous testing and deployment management.

Privacy and security: once again, the idea that these systems should respect privacy. We have these models, and we train them on huge amounts of data; if you look at the GPT models, they took huge amounts of public-domain data, book databases, Wikipedia, you name it, and they used that data
to train the model. But we need to ensure that, as part of that, personally identifiable information is kept private; we don't expose things we shouldn't. And when we think about private use of the models, we're always making sure that, hey, if I'm using my data in my blob storage account, it's not being used to train the model and couldn't be exposed to other people as part of its predictions, its inferences, going forward. That's a really important component.

We think about the idea of inclusiveness: AI systems should empower everyone. They should engage all people, every part of society; regardless of gender or ethnicity, or maybe even physical abilities, everyone should be able to take advantage of and leverage this.

Transparency: I should understand how it's doing what it's doing, how it's working. What is the purpose of this system? Are there limitations? Now, as we get more and more into GPT models, we don't always understand how they do what they do: there are billions, maybe trillions, of parameters, and although we know how we designed the model, we're not 100% sure how it does some of what it does. But transparency should be key.

And then accountability: we, the designers and developers, should be accountable for the AI systems we're creating. There should be a framework of governance, of principles the solution meets from an ethical perspective, and legal standards clearly adhered to. So that's a really important part of all of this. Okay, so those are some of the core concepts we're thinking about.

So when we talk about AI, what exactly is artificial intelligence? The whole point of artificial intelligence is that it is software that exhibits human-like capabilities; that's the fundamental goal of AI, it should be able to do things that we as humans can do, and we often break those into different areas. One component is visual perception: I can look at a picture and tell you it's a car, I can maybe understand whether it's a happy image, I can understand video, I can extract text from a picture. We think about the idea of language: with language, you have the ability to use natural language processing to not just read, but understand the semantic meaning of text-based data, to actually understand the goals behind it. I can also think of speech: both hearing speech and being able to create text from it, and having text and being able to synthesize and speak it; and then taking that a step further with conversational AI. And then we have the idea of decision making: as a human, I can look at a trend and say this is normal, and when something goes outside of that normal range, passing a certain threshold, it triggers an alarm in my mind; so I can think about having an inference, a prediction, detect those anomalies as part of decision making.

Now, when we think about artificial intelligence, we often lump it together with machine learning, but AI builds on top of machine learning, and machine learning is all data and algorithms; that's the key point. We have huge amounts of data, and it's really built on data science, which we'll talk about in a second, but we use this data to train prediction models. That's the fundamental goal of what we're doing here: we have data that we label correctly, we
train the model with that data, and once we've trained the model, we test it with other labeled data to see if it gets things right or not; we can tune it, we deploy the model, and then we can feed it new data and it will predict based on the correlations it learned in the past. That's the key goal; that's what machine learning is, and AI builds on top of that idea.

Machine learning itself is built on top of data science, and data science is really just math and statistics to analyze data. It's interesting if you take a step back (I did a deep-dive video into what ChatGPT is): if you keep going back far enough down, it's really just billions of parameters in layers, and a parameter is just a fairly simple statistical algorithm that represents a particular part of a data set; by adding different stretching and cropping of elements of that algorithm, and putting billions of them together in layers, it really can do remarkable things.

Now, if we think about Azure and the services where this applies, the most basic one is just Azure Machine Learning. Azure ML is that idea of: I have data that I have labeled, so it's my training data; maybe I also have data that's going to be used for testing, so I split my core data set into data used for training and data used to test the model, to see if it's working and tune it if I need to. I take that data and use it to train the model; from that training I get a model, and once I'm happy with it, I deploy it, potentially as a web service, which I can then use from different applications however I need, passing in new data that it can predict on based on the correlations it found during training. That may then feed back: I tune, bring more data in, and train the model to make it better and better. That's one of the key goals.

So going beyond raw Azure Machine Learning, how do I really start using the Azure technologies around AI? If I break it down into the real core services, the big one right here is Azure Cognitive Services. These include a huge range of different capabilities, and what we're going to do is go through those services, first at a high level, and then come back and cover them in detail.

If we start with the idea of visual perception: the first thing we have is image analysis. Here's a picture; give me information about it: the category of the image, tags in the image, and where objects are in the image. Likewise, we can do the same thing for video, but then I might also want to break it up into different scenes, and maybe understand what is in the content of that video. I also have the idea of image classification: we can train it on our own labeled images, and then give it new images and it will give us the classification from our hierarchy of classifications. Then object detection: where in this image is a specific object, what are the coordinates of the bounding rectangle of where an object is?
Then facial analysis: where are faces, information about a face, detecting and verifying the same face across multiple pictures, or maybe even recognizing specific individuals. And of course OCR, optical character recognition, a very important technology. So we have all of those around the idea of visual perception.

Then, when we think of language: language has a huge, broad range of components. I can absolutely think about language understanding, and there are many aspects to language understanding that we're going to dive into in detail, but as sub-parts of that we have question answering, text analysis, and then a huge part of this is translation. So there are many different aspects just within language, and again, we're commonly going to combine these with other services.

Coming around the other side, as we talked about with speech, there are some fairly obvious ones: hearing speech and giving us the actual text, maybe to a file; but also the reverse, where we need to synthesize, we have text that we want spoken, so we can go the other way. There's also the concept of speech translation: we have incoming speech in one language and want to convert it to one or any number of target languages. And maybe even speaker recognition, so we can identify the individuals who are speaking. Imagine we were trying to transcribe a conversation: if we can detect the individuals, we can make that transcript so much richer, because we can actually identify who's speaking: hey, Bob is talking, John is talking, etc.

And then, if I think of decision making, we break that into the idea of anomaly detection: something is outside of what we would expect; maybe it's machine hardware and the temperature is rising; maybe it's one signal, maybe multiple signals feeding into it. I can think of content moderation, which is important when there's text or images we want to make sure are suitable for the audience. And even content personalization: if I'm running a shopping cart, let me recommend things you're really interested in, so you'll go and buy them and give me your money, which would be very nice, thank you.

So those are some of the services we can build a solution on top of. But in addition to the things I can use to build my own solution, which I absolutely can do, there are also a number of pre-built solutions I can essentially deploy ready to go; I don't have to do much. Again, we'll come back and look at these in detail, but I can think about things like Form Recognizer: I give it a form, and it understands it, extracts the information, and puts it maybe into a database. Metrics Advisor, a service built on the Anomaly Detector cognitive service, giving me real-time monitoring of, and response to, critical metrics. Video Analyzer for Media, again built on top of other services, but making it easier for me to get analysis of video content without building my own solution. And Immersive Reader: different ways to take text and make it more accessible; maybe it spreads the text out in different ways, maybe it shows pictures of different elements, but for all ages and all abilities, it helps with reading certain content.
Then think about the Bot Service; I think we've all probably experienced a bot somewhere in our interactions with computers. It's the whole idea of some kind of conversational interaction, delivering an AI-powered interaction with the customer, and we see these all over the place. There are different channels: I might see this in, for example, a web chat, through email interactions, through Teams, and there are others. So there are different ways to interact with the same bot through different channels.

And then a huge, huge service is Cognitive Search. Cognitive Search is all about the idea that you have data, and you're going to ingest that data; an indexer uses it to create an index, and then I can run different types of search against that index to get a result. And I don't just have to have the raw index: there are capabilities to enrich the index. I could take an image and run it through optical character recognition, gain more insight from it, and add that into the index; I might translate certain fields so they're available in multiple languages. So I can hook into other things; there are many different aspects to leverage, and we're going to come back to all of these in more detail, so we can really understand the benefits and how to get the most from them.

To start with, then: okay, we have Azure Cognitive Services; how do we actually use them? To leverage it, it lives in an Azure subscription. I have my Azure subscription, which is essentially a billing boundary for many things; I'm going to provision my service into a specific subscription, and within there I can use many different regions. So I provision an instance of my service, and what that gives me is my resource, my cognitive services resource.

Now, one of the things to think about for a second when I say "region": there are many regions throughout the world. If we quickly jump over to the portal and open up a Cloud Shell (a way to interact with Azure using PowerShell or the az CLI), and I paste a command, it shows me all of the regions; there are many, many regions I can leverage in Azure. Notice that when we provision things, we always use the name, not the display name, so get used to the idea of using the short name like eastus2 or westus2; we're not saying "East US 2" with spaces.
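The exact command isn't captured here; as a sketch, one way to list the regions and both name forms from Cloud Shell with the az CLI:

```bash
# List Azure regions: the short "name" (e.g. eastus2) is what you use when
# provisioning, versus the friendly "displayName" (e.g. "East US 2").
az account list-locations --query "[].{Name:name, DisplayName:displayName}" --output table
```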
So yes, there's a nice friendly display name, but what we really focus on when we use the service is that short name; that's how we interact, and when you provision things, you'll always use the short name.

Now, when we create a resource, we have a choice: the resource can be multi-service, or it can be single-service (remember, we drew that picture of all the different services). If it's multi-service, the nice thing is it's a single endpoint, because we're going to interact with a RESTful endpoint to call our services, so it's a single endpoint with a single access credential. Whereas if I do single-service, there will be a different endpoint for every single instance and a different set of keys for every single instance; I could use different regions, with different billing, for every instance. And I'll add a little note here: training may require, or you may want, a different resource. What I mean by that is we talked about there being multiple stages: we train a model, then we use a model. So it's very common, maybe for billing reasons, to do the training with one resource, see how much that costs, and charge it one way, and then handle the consumption of the model a completely different way.

If I jump over to the portal and look at Cognitive Services for a second, we have all these individual ones. For example, Computer Vision: I could create a Computer Vision resource, and you can see there are different aspects to it; there are different pricing tiers. When it's an individual service, I get both a standard and a free option; free limits me to a certain number of uses, but as the name suggests, it doesn't cost me anything, so it's a nice free way to try it. I give it a name, a region, and what I want to do. Likewise, if I look at Custom Vision for a second: notice that with Custom Vision, if I wanted to break it up, I could create a version of the resource that is only used for training, and then a completely different one for the actual prediction. So those are all examples of creating a resource specific to a single service.

But there's also the ability, down at the bottom, to create a multi-service resource. For multi-service, there is no free option: if we look at the creation screen, the pricing tier is standard; there is no free option for multi-service. The benefit is that the multi-service resource has its single endpoint (and again, my endpoint is the name of my service followed by .cognitiveservices.azure.com), and it has its two keys. If I look at my resources that aren't multi-service (I've got one that's just Computer Vision, one that's just Speech), these each have their own endpoint and their own keys, specific to that service. So maybe we want to split out the billing, or maybe we just want it all in one place; we have the choice in how we leverage those.

Now, I showed you the keys and the endpoint, so a key aspect is that our resource has some very important properties. One is that endpoint URI: I have to know that whenever I'm interacting with the service, and often I'll need to know the region as well. And remember, there are the two keys.
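As a sketch, grabbing those values for an existing resource with the az CLI (the resource and resource-group names here are placeholders):

```bash
# Show the endpoint URI of a cognitive services resource
az cognitiveservices account show \
  --name my-cog-service --resource-group my-rg \
  --query properties.endpoint --output tsv

# List the two access keys (key1 and key2)
az cognitiveservices account keys list \
  --name my-cog-service --resource-group my-rg
```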
Now, why are there two keys? I have to specify a key when I use the service, and the reason there are two is rotation: the idea is I start using key 1; while I'm using that, I regenerate key 2; once key 2 has finished regenerating, I update the apps to use the new key 2; and then I go and regenerate key 1. It's a way to rotate keys without interrupting the usage of the service. There are commands I can use to fetch the keys, and there are REST APIs; for example, if I wanted to regenerate a key, I'd POST to the management endpoint, basically saying, hey, I want to regenerate a key of my particular cognitive services account, and in the body I'd put the key name: do I want to regenerate key 1 or key 2?

Now, a very common pattern (this is not part of Azure Cognitive Services, but it's one of the very common things we do) is to have an Azure Key Vault: I store the key in my Key Vault and then apply role-based access control, which means only certain service principals have rights to get that key. It could be a managed identity, which is native to a resource, or a regular service principal that an application could use. Also, some services use a token that's time-limited, maybe to 10 minutes, and I have to use the access keys to get a new token; if I use the SDKs, they do that for me. Some services even use Azure AD for the authentication. It varies, so you'd want to check exactly how it works for the individual service.

Remember, when we interact with this, we're talking to that endpoint URI. One of the things we can do is leverage the firewall that sits in front of that endpoint URI: I could restrict it to only certain public IPs. I could also restrict it with a service endpoint. What a service endpoint does is: I have a virtual network, and it has subnets; ordinarily a firewall couldn't restrict access to the IP range of a virtual network (those are private IP ranges, they'd mean nothing to the firewall), but I can go and tag subnets. I make a change on the virtual network, on a particular subnet, to say: hey, I want to enable you for Azure Cognitive Services; then that particular subnet becomes visible to the firewall, which can say, yes, I want to let you through. So I can use a service endpoint to restrict access to only certain subnets in a certain virtual network. We can also use private endpoints: a private endpoint completely disables the public endpoint, and now there's a virtual NIC created in my VNet that points to the specific instance. Some of the services need special configuration: if we look at the documentation, it lists all of the services supported by the CognitiveServicesManagement service tag, which I can use on that subnet, and then certain services that need certain special permissions; the document goes through all of that, so it's a really nice thing to understand, how those pieces interact with each other.
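Circling back to key rotation for a second, here's a minimal sketch of that regenerateKey management call. The subscription ID, resource group, account name, api-version, and token are all placeholders/assumptions; the az CLI wraps the same call as `az cognitiveservices account keys regenerate`.

```python
# Sketch: regenerate key2 of a cognitive services account via the ARM REST API.
# <SUBSCRIPTION_ID>, my-rg, my-cog-service, and <TOKEN> are placeholders;
# <TOKEN> must be an Azure AD bearer token for https://management.azure.com.
import requests

url = (
    "https://management.azure.com/subscriptions/<SUBSCRIPTION_ID>"
    "/resourceGroups/my-rg/providers/Microsoft.CognitiveServices"
    "/accounts/my-cog-service/regenerateKey?api-version=2023-05-01"
)

response = requests.post(
    url,
    headers={"Authorization": "Bearer <TOKEN>"},
    json={"keyName": "Key2"},  # rotate key2 while the apps keep running on key1
)
print(response.status_code, response.json())  # returns the new key pair on success
```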
Okay, so then, how do we use this service? There's me on my machine, and I want to use it. Remember, the whole point is we have this endpoint URI; I have to know the endpoint URI and I have to know one of the keys, I can't do anything without those. So I've got one of the access keys. Maybe I know it, maybe it's in a config file, but we don't want to put things in a config file; ideally, this is running as some resource or service principal, and I went and did a read from the Key Vault to get the key I'm going to use.

Now, my interactions could use REST, so one option is I use REST directly. Remember, with REST I can do PUTs, POSTs; there's a whole bunch of verbs I can use. What it boils down to is that we are creating a JSON payload wrapped in HTTP: I create a request to the endpoint, and the response comes back, again as JSON over HTTP (HTTPS, actually; it's going to be encrypted). If we go and look for a second at some of the examples (this is the Git repo for the free training), the key thing we see in the app settings is the two things I have to tell it: the endpoint and the key. That's going to be the same no matter what I do. But then, in my actual program (I'm not going to go through all of it), it has to take those settings and construct a JSON payload: we can see it creating the JSON body, putting in the information it wants, then constructing the call to send that data; there'll be a response coming back, and we can see the status code; we did this POST over here, we created the header, and it was of type application/json. So we do all this work around the RESTful interaction. The benefit, obviously, is that this works from anything: I could use Postman and just play around with it, I could use curl; there's no limit to what I can use against the REST endpoint. But there's work involved, and I lose some of the benefits; for example, with the SDK, if there was that 10-minute token, it handles it for me, and I just call a native function.

So my other option, instead of using REST, is to use an SDK. SDKs are available for things like C# and Python; I think there's JavaScript, Go, Java, a whole bunch of them, you'd have to check the list of languages. What the SDK is really doing is abstracting the REST: I still need to give it the endpoint, I still need to give it a key, that doesn't change, but now I can do all of my interactions in a far more friendly, language-specific way. If I look at the same code using the SDK, I still need exactly the same endpoint and service key, but my program is a lot simpler: it creates a client from the endpoint and my credential, then just calls detect language, passing the text to that client object, and the code can just return the detected language's name. It abstracts away all that messing around with the JSON and the REST and makes it a native command in a native library, so it's much, much nicer. So there are options available, but if I'm writing in a language that has an SDK, most likely I'm going to use the SDK; it's a far more pleasant experience than messing around with raw REST. So we're going to think in terms of using the SDK if we can.
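As a sketch of that REST-versus-SDK difference, using language detection (the endpoint and key are placeholders; the REST path assumes the Text Analytics v3.1 API shape, and the SDK variant uses the azure-ai-textanalytics package):

```python
import requests
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

endpoint = "https://my-cog-service.cognitiveservices.azure.com"
key = "<ACCESS_KEY>"

# Option 1: raw REST - build the JSON body and headers yourself
body = {"documents": [{"id": "1", "text": "Bonjour tout le monde"}]}
rest_result = requests.post(
    f"{endpoint}/text/analytics/v3.1/languages",
    headers={"Ocp-Apim-Subscription-Key": key, "Content-Type": "application/json"},
    json=body,
).json()
print(rest_result["documents"][0]["detectedLanguage"]["name"])  # "French"

# Option 2: the SDK abstracts the REST plumbing away
client = TextAnalyticsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
sdk_result = client.detect_language(documents=["Bonjour tout le monde"])
print(sdk_result[0].primary_language.name)  # "French"
```

Same endpoint, same key in both cases; the SDK just hides the HTTP plumbing.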
Now, the next thing we always think about is: great, we created this resource, but it costs money; it's an Azure resource. As we saw, if I go single-service there is a free option with a certain amount included, but outside of that free tier, what are we paying for? It's an Azure service, which means it's consumption based: I pay on the number of interactions I do against it. We can look at the pricing page, which tells us, for the different services and regions: hey, if I was using Computer Vision, it's a dollar per thousand transactions until I get into higher volumes, where it's 40 cents per thousand. I'm just paying for what I use, which is very typical.

There's also a pricing calculator: I can add Azure Cognitive Services, as we see right here, and that lets me estimate. I tell it which service (notice there are three selected by default here); if it's free, it shows what I get, like 5,000 text records of the following six features; or I change it to the standard plan and it bills me based on the number of records and how I want to use it each month; or I pick a different service, the Face API, or maybe QnA Maker, whatever I want. I can go through the pricing calculator to get the detail on what the thing is actually going to cost me, so I can plan ahead. And as I showed before: single-service, you have that free option; multi-service, you do not, you're always paying on that version.

Now, one of the things I might want is alerting. For my spending, I can absolutely set budgets, and a budget can be based on one of two things: the actual spend I've done so far, or the forecasted spend: hey, based on the trend line you're on right now, if you carry on, you're going to bust your budget. When it alerts, it can send an email or a text message, integrate with an ITSM, call a serverless function like an Azure Function or a Logic App, or call a webhook; lots of options. I can use Azure Cost Management + Billing to see what it's actually costing me.

In addition to the cost element, there's a whole set of metrics, which again can be used for alerts; so I can do alerts on metrics and alerts based on my costs, and once again those alerts can call an action group, which is a common thing in Azure, and do many different things. And in addition to the basic metrics, there's the idea of logs: there are different logs, for example around audit, request and response, and tracing. All of these, if I need that data, I enable through something called diagnostic settings; they are not created by default, I have to turn them on, and I can optionally include the metrics in there as well. Diagnostic settings then let me send the data to a number of different places: a Log Analytics workspace, which is Azure Monitor Logs; an Event Hub (Event Hub is publish-subscribe, really useful if I had something like an external SIEM, a security information and event management solution, which could subscribe and take the events to do something with them); or just a storage account, if I need to store it for a long time as cheaply as possible.
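As a sketch, turning that on with the az CLI (the resource IDs are placeholders, and the exact log category names vary by service, so treat those as assumptions):

```bash
# Enable diagnostic settings on a cognitive services resource, sending
# audit logs plus all metrics to a Log Analytics workspace.
az monitor diagnostic-settings create \
  --name cog-diagnostics \
  --resource "<COGNITIVE_SERVICES_RESOURCE_ID>" \
  --workspace "<LOG_ANALYTICS_WORKSPACE_ID>" \
  --logs '[{"category": "Audit", "enabled": true}]' \
  --metrics '[{"category": "AllMetrics", "enabled": true}]'
```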
So there are many different things I can do to interact with it. The key point about all of this is that we're creating an Azure cloud resource: it's in Azure. We might be calling it from an Azure service, it could be a service on premises, it could be a service anywhere I want, but it's talking to the service running in the cloud, and I'm paying for it on that consumption basis. That's the common way we leverage these services.

However, sometimes the cloud is not a fit. Imagine a scenario: we talked about some of the safety elements, reliability and safety; imagine this was that autonomous vehicle idea, and it was driving the car, doing object detection. Am I okay with the idea that it only works as long as I've got a 5G connection? That may not be attractive to me; my car should always work. So sometimes I need to run the trained model locally: maybe because I wouldn't have consistent network connectivity; maybe it's a safety feature, like a camera in a factory looking for a dangerous circumstance, where I can't accept the latency of going to the cloud and getting the response back, it would take too long. There are scenarios where milliseconds count and I don't want to run it in the cloud.

So one option is I provision it in Azure, which is absolutely the normal way, but we do have another way: I can take the model and deploy it as a container. My other choice (I'll draw it in green to show on premises, but it doesn't have to be on premises; it could really be anywhere) is to use containers. Now, I'm not going to talk about what containers are (I've got lots of videos on my channel about that), but the point is there is some kind of container host; it could be Windows, more often it's Linux, and what containers do is create user-mode, sandboxed environments that are isolated from each other, with each running a certain container image: this is container 1, this is container 2, etc.

Okay, so what image am I running here? What Microsoft is doing for us is providing these images from a registry, e.g. the Microsoft Container Registry (I don't think it's actually called MCR anymore; I think it's the Microsoft Artifact Registry now, because it's more than just containers, but it's still mcr.microsoft.com). A container image is fundamentally layers, because an image is built on a layer, built on a layer, built on a layer: someone creates a Dockerfile, which is the definition of what I want; the Dockerfile starts with a FROM, so I'm building from another image pulled from a registry, and then I basically create my new image that I can publish. When I want to go and create something, I run it: I run a certain image on a container host to create a new container running that code.

The whole point here is that there is a set of Azure Cognitive Services container images I can use. Now, when I run them, I have to specify a few key things: I have to give it an API key, I have to give it the billing endpoint, and I have to accept the EULA, because you're still using Microsoft's intellectual property, so you still have to pay; it's not free.
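As a sketch of that run pattern (the image tag here is an assumption, and the endpoint and key are placeholders for your own resource; this uses the Read image that comes up in a moment):

```bash
# Pull one of the cognitive services container images from the Microsoft registry
docker pull mcr.microsoft.com/azure-cognitive-services/vision/read:3.2-model-2022-04-30

# Run it locally: the three mandatory arguments are the EULA acceptance,
# the billing endpoint URI of your Azure resource, and one of its keys.
docker run --rm -it -p 5000:5000 \
  mcr.microsoft.com/azure-cognitive-services/vision/read:3.2-model-2022-04-30 \
  Eula=accept \
  Billing=https://my-cog-service.cognitiveservices.azure.com/ \
  ApiKey=<ACCESS_KEY>
```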
So I still require the endpoint (that's the endpoint URI of the service), I still require one of the keys, and it has to have some internet connectivity so it can still report usage, and I say yes to the EULA to be able to leverage it. Those are the key parameters.

We can see what's available: there's a page listing all the Azure Cognitive Services containers, all the different ones I could use. If we look at one of them, it talks about the model, the prereqs, and there's a link to the image, which shows you the pull command: you can see I'm pulling it from mcr.microsoft.com, then azure-cognitive-services, and in this case vision/read. That's how I actually get it; I can go and pull these images. What's important, if you look at the samples for using the image, is that I must have these arguments: I must accept the EULA, I must give it the API key, and I must give it the billing endpoint. So to use one of these containers, I still have to have that connectivity, because it has to be able to check in: it has to be able to connect to that endpoint, and I give it the key to prove I'm allowed to use it. But now I can run it wherever I want, and that's the beauty of this capability: if I don't want to run it in the cloud, anywhere I can run a container, I can run the service. But it is not air-gapped: I still have to be able to talk to Azure for billing purposes, to make sure it's charging me for using that intellectual property. That's really the key point.

Okay, so let's go and look in a bit more detail at some of these services and what they're doing. If we start with visual perception, we had image analysis. Image analysis is the idea that we send it an image and get back a JSON response; the JSON response gives me information about the image, with a certain confidence level. Say I'm sending it a picture (that's supposed to be a car, on a road, maybe the sun is shining). It could give me, for example, an image description, which might be useful for captioning purposes: it might say "car on road"; so if I needed to automatically caption images, I can use that description. It can also give me a category, so maybe "vehicle" would be the category; it could have tags, so the tags might be "car" and "sun" while the overall category is "vehicle"; and then it might also have objects. An object would say: hey, it's a car, and here are the x and the y, which are the left and the top, and then the width and the height; from that left-top coordinate plus the width and the height, you can work out the surrounding bounding rectangle.
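As an illustration, the response for that picture might look roughly like this (invented values; the field names approximate the Image Analysis v3.2 response shape, so treat the exact schema as an assumption):

```json
{
  "description": { "captions": [ { "text": "a car on a road", "confidence": 0.93 } ] },
  "categories": [ { "name": "trans_car", "score": 0.98 } ],
  "tags": [
    { "name": "car", "confidence": 0.99 },
    { "name": "sun", "confidence": 0.87 }
  ],
  "objects": [
    {
      "rectangle": { "x": 120, "y": 80, "w": 310, "h": 160 },
      "object": "car",
      "confidence": 0.92
    }
  ]
}
```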
So this really is useful for getting information about an image. It can also do content moderation: it can scan for adult or violent images and give me a rating. There's also celebrity recognition, but you have to be onboarded: anything where it's identifying people requires special onboarding and permissions (again, think responsible AI; there are special programs you have to onboard to before you can use those technologies). It can also do a smart thumbnail: if I gave it a really big picture and said, hey, create me a smart thumbnail, it would use AI to identify the main subject of the picture, i.e. the car, and create a thumbnail cropped to the car, at a size I give it.

Then I can think about video analysis. Video analysis does similar things, but now there's a whole set around facial recognition, detecting the presence of individual people in the video (again, that requires approval). It might do optical character recognition again, seeing text within the video. It can do speech transcription, creating a text transcript of the dialogue. It can identify key topics in the video, and sentiment: is this a positive video, is this a negative video? It can apply labels, the key themes or objects within the video. Once again there's content moderation, adult or violent themes, with a confidence rating that that content is present. And it can do scene segmentation, identifying when there's a major shift in scene. There are also custom insights: people, where I could train it on specific people I want to recognize (again, I need that limited-access approval); language, where if there's specific terminology used in my industry or my company, I can train it to detect and transcribe language specific to me; and brands, specific brands I want to recognize: products, projects, companies, even animated characters, if I go and train that. So that's all possible.

Then I can think about image classification. The idea of image classification is that I want it to predict a class label based on the main subject of an image: it's a vehicle, it's a plane, it's a fruit, it's a vegetable. The goal here is that we have to train it. We start off with a bunch of data: that's a car, and I label it; we use that labeled data to do the training and create the model; and then in the future I can just send it other images, and it will predict: oh, it's a car, with a certain confidence out of 1, so 0.95 or whatever. So I'm using this to train my own classifications for images.
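A minimal sketch of calling a trained classifier like that through the Custom Vision prediction SDK (azure-cognitiveservices-vision-customvision); the endpoint, key, project ID, and published iteration name are all placeholders for your own project:

```python
# Classify a new image against a published Custom Vision model.
from msrest.authentication import ApiKeyCredentials
from azure.cognitiveservices.vision.customvision.prediction import CustomVisionPredictionClient

credentials = ApiKeyCredentials(in_headers={"Prediction-key": "<PREDICTION_KEY>"})
predictor = CustomVisionPredictionClient(
    "https://my-cog-service.cognitiveservices.azure.com/", credentials
)

with open("car.jpg", "rb") as image:
    results = predictor.classify_image("<PROJECT_ID>", "<PUBLISHED_ITERATION>", image.read())

for prediction in results.predictions:
    # e.g. "car: 0.95" - the confidence is out of 1, as described above
    print(f"{prediction.tag_name}: {prediction.probability:.2f}")
```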
Now, there's multi-class and multi-label. Multi-class means there are multiple possible labels, categories, but an image can only be one of them: it can only be car, or it can be vegetable. Multi-label lets one image have multiple labels attached to it. And once again, because we have these phases of training and then prediction, I can absolutely split those into different resources: one resource for the training (we saw that at the start), one resource for the prediction. I can use the Custom Vision portal, but I could also use the REST API or the SDK to upload images, label them, and do all of that classification. Then there's object detection; I think that's fairly obvious, it goes another step and identifies the location of specific classes of object. Again, we train it: we give it data, but now we also give it the bounding rectangle of each object. And once again, I can do that via the portal, the REST API, or the SDK.

Then we get to facial analysis, and facial analysis is very interesting because there's a whole set of different services around this face capability. Obviously, when we think about faces, there's a lot of personally identifiable information; we have to think about responsible AI. A face is obviously PII, it's someone's face: I need to ensure the privacy of that information, I need to be transparent about how the facial data is going to be used, and it needs to work for all individuals, remembering to treat everyone equally. That's why things like age and gender have been removed: I remember in the old days, when I would play around with this, I could walk up to it and it would guess my gender and my age. It doesn't do that anymore; it will not say gender, it will not say age, because it could be wrong and could cause offense to people, so those have been removed.

There are actually two services that can help with facial analysis. There's the general Computer Vision service, but that will just say: hey, there's a face, and here's where it is. Then there's the separate Face service, and the Face service goes a lot further: yes, there's a face, but here are its attributes. Do they have glasses? What's the pose of the head? I can do verification, I can do recognition. So yes, we get the face rectangle, the top-left coordinate and then the width and the height, sure, but we also get additional information: here's where the nose is, here's where the eyes are, how much blur is in the picture, how much exposure. So I get not just where the face is, but information about it.

It can also do detection and verification. For detection, every face it sees gets a unique ID generated for it, and that ID is cached for 24 hours, after which it's discarded. Where this is useful: imagine it can detect and verify a particular face within a limited window, say verifying that you came into a building and that you left the building; that would be one use case. So it doesn't remember you long term, but it will remember the face for 24 hours, so it can validate those types of scenarios. Then we also have the idea of recognition.
Now, with recognition, it's not just going to last 24 hours; this is a much longer-term thing. What we do here is create a person group, and within it we have person 1, person 2, person 3, and we give it a bunch of images of what each person actually looks like; now we're training it to recognize specific people. So I've got multiple images of people, ideally in multiple poses, and that is persisted; it's not going to expire after 24 hours (it would be pretty useless if it did).

And then, obviously, optical character recognition. Again, we have two different APIs for this. The primary one we want to use is the Read API. The Read API is really useful: I can read from images, I can read from PDF documents, and it can handle anything from small to very large volumes of text; I could take entire books or magazines. It has really good accuracy; it can do printed text in multiple languages (I think it's over 160), and it can do handwritten text. In fact, looking at the Read documentation quickly: it covers all the different image formats, PDF and TIFF up to 2,000 pages, files must be less than 50 MB (4 MB for the free tier), dimensions from 50 x 50 up to 10,000 x 10,000 pixels, with no size limit for PDF; and it lists 164 print languages and nine languages for handwritten text. So we get an idea of what it's capable of. And because it can handle far larger amounts of data, the way it works is that the call we make returns an asynchronous operation ID, and then we make a subsequent call to get the results for that ID. There's also the image analysis API: that's for small amounts of text in an image, giving the lines and positions of the text, but it's a single synchronous call (and obviously it can identify other things too, like brands, faces, and image categorization). So if it's a very limited amount of text, maybe as part of something else, sure, the image analysis API can do it; but our main, star service for OCR at any kind of volume is the Read API.

Okay, carrying on through the different capabilities, the next thing is language understanding. For language understanding, there are different elements to define and train models that predict user intent from natural language. There are learned features, where I do the training (I'll come back to those), and there's pre-configured functionality. Some of the pre-configured topics we'll come back to, but in addition there are things like summarization: give it a bunch of text, and it breaks it into the key sentences that convey the overall meaning. PII detection: IP addresses, emails, street addresses, names, protected health information, it's going to find them. And there's question answering and text analysis, which we'll come back to.

Then I can really think about the learned features, which I have to train: conversational language understanding (CLU), to understand the user's intent. The whole goal here is that we have to train it on a certain intent: to understand that particular intent, here are the possible utterances that map to it. For example, the intent could be GetTime; the utterances could be "what is the time", "what time is it", "tell me the time"; and there may optionally be an entity it relates to: "what is the time in London".
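As an illustration of what that training data amounts to (loosely modeled on a CLU project's labeled utterances; the exact authoring schema differs, so treat the field names as assumptions):

```json
{
  "intents": ["GetTime"],
  "entities": ["Location"],
  "utterances": [
    { "text": "what is the time", "intent": "GetTime" },
    { "text": "what time is it", "intent": "GetTime" },
    { "text": "tell me the time", "intent": "GetTime" },
    {
      "text": "what is the time in London",
      "intent": "GetTime",
      "entities": [ { "category": "Location", "offset": 20, "length": 6 } ]
    }
  ]
}
```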
When I think about the entities, there are pre-built entities; if we go and look at them, there's a whole bunch: age, number, percentage, ordinal, dimension, temperature, currency, email, and more. So there are all of these pre-built ones, or I can train my own: I can train the entities as well. And once again, like all of these things, there's going to be a loop of training: I train it, I test it, I deploy it, I review what it's doing, and I use that to train it some more. There are always these cycles.

There's also the idea of custom named entities. These are particular entities I want to train it to understand: it could be a person, a place, a thing, an event, a skill, a value. I define the entities I want, and I label existing data that corresponds to each entity, making it as accurate as possible; we're training it, and if we're not accurate in our training, that impacts everything else it will be able to do. The way I do this is by passing an array of labels and the entity those labels represent; I can do it from the portal, or again via JSON, and I want as diverse a set as possible. The whole point is I have a whole bunch of data with the labels, and I pass all of it as (as with everything we ever do here) basically JSON.

Then there's custom text classification, which I think is fairly obvious: I have documents, text, to be classified into custom groups. There are a couple of ways to set this up: single label and multi-label. With single label, I have a bunch of possible labels (L1, L2, L3, L4, whatever) and a whole bunch of different texts, and single label says: a text can only be one particular label. Multi-label is exactly the same thing, a bunch of labels and a bunch of texts, but the key point is that now a text can carry multiple labels; that's all it means. So for custom text classification the question is: do I have just one label per piece of data, or do I allow multiple?

With this, once again, we train on the data: I define my labels, I tag my sample data, I train the model. And when I do training, remember, there's the training and then there's the testing, so my data set gets split between an amount of data used for training and an amount used to test: is it trained correctly, is it accurately labeling and classifying what I would expect? I can view the model details to improve it; there may be misclassifications. I might get false positives, where it predicts a label when the text isn't actually that; I might get false negatives, where it doesn't predict the label, but the label should have been applied. So I'm going to get these various scores.
Now, out of the whole bunch of different values I'll get back, there are two that are really important to understand: one is called recall, and one is called precision. Recall is: of the actual labels, how many were identified; the ratio of true positives to everything that should have been labeled. What does that mean? Imagine I searched for pizza (we're always going to search for pizza). Four documents were returned to me, but only three of them were correct; one was wrong (we'll draw it in red). And there were actually three more matching documents that it didn't return at all. So of the set returned to me, three were correct, but it missed three. In that case my recall would be 3 out of a possible 6, so 0.5, because it missed some; it only gave me three out of what it should have returned. Precision, taking the exact same scenario: it gave me four documents, and three of them were actually correct, so my precision is 3 out of 4, so 0.75. So: recall is, out of everything it should have returned, how much did it give me (it recalled half of what it should have); precision is, out of what it returned, how many were actually correct (three quarters). And you can combine those into something called an F-score, like the F1 score, which is a function of the recall and the precision, giving a nice balance of how good the model actually is (there's a short worked sketch of this below). Those are important concepts to understand when you're judging how good a job the model is actually doing.

And then, coming all the way back up, there were the other pre-configured capabilities I mentioned, for example question answering. Question answering is mostly a pre-configured feature: give me answers to the questions provided as input. The way it works is that I create a knowledge base; think of this huge knowledge base I'm creating almost as a book, and it's populated from a number of different places: maybe it's pointing to an FAQ that's online that it can read in, maybe I'm passing it files, and maybe I have some chit-chat files, so it can handle small talk. I feed in all of that information, and what I end up with is question-and-answer pairs that I have defined; I may also define synonyms, where multiple words actually mean the same thing. Then I use that from my chat assistant: I'll probably have some bot that takes a question and feeds it into the RESTful endpoint, and what comes back is an answer. That's JSON, as always, and it contains the matched question (natural language goes in, and the service maps it to a specific question it knows), the answer, a confidence score, the source of the answer, and maybe other information, like follow-up prompts that could be applicable. As part of this interaction, I could also say: don't just give me one answer; I can use top, so top 3 would give me the top three answers, with the highest predicted confidence score (out of 1) first and then going down.
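Going back to the recall/precision idea for a second, here's a minimal sketch (plain Python, using the pizza numbers from above) of how the two combine into an F1 score:

```python
# Pizza search example: 4 documents returned, 3 of them relevant,
# and 3 more relevant documents that were missed entirely.
true_positives = 3   # relevant documents that were returned
false_positives = 1  # returned documents that were not relevant
false_negatives = 3  # relevant documents that were missed

precision = true_positives / (true_positives + false_positives)  # 3/4 = 0.75
recall = true_positives / (true_positives + false_negatives)     # 3/6 = 0.5
f1 = 2 * precision * recall / (precision + recall)               # = 0.6

print(f"precision={precision}, recall={recall}, f1={f1}")
```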
And then if we come all the way back up, there were the other pre-configured features I talked about — for example, question answering. Question answering is mostly a pre-configured feature: give me answers to questions that are provided as input. The way it works is I create a knowledge base — think of this huge knowledge base almost like a book — and that knowledge base is populated from a number of different places: maybe it's pointing at an FAQ that's online that it can read in, maybe I'm passing it files, maybe I add some chit-chat data. I feed in all of that information and end up with question-and-answer pairs that I have defined; I may also define synonyms, where multiple words actually mean the same thing.

Then I can use that for my chat assistant. I'll probably have some bot that takes the question and feeds it into the RESTful endpoint, and what I get back is an answer — JSON, as always — containing the matched question (natural language goes in, and the service maps it to a specific question), the answer itself, a confidence score, the source of the answer, and maybe other applicable prompts. As part of this interaction I can say don't just give me one answer: I can use top, so top three would give me the three answers with the highest predicted confidence score (out of 1) first, then descending.

It can also be a multi-turn type interaction: there can be a follow-up prompt, it might be a waterfall where it prompts, I respond, it prompts again, and I go on to the next step — a whole series of interactions. It might have a follow-up like "cancel my reservation" — "is it a flight or a hotel reservation?". So there are different types of interactions I may need to have with it.
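As a sketch of what that bot-to-endpoint call could look like with the question answering Python SDK (the endpoint, key, project and deployment names here are all hypothetical placeholders):

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.language.questionanswering import QuestionAnsweringClient

client = QuestionAnsweringClient(
    "https://<your-language-resource>.cognitiveservices.azure.com",
    AzureKeyCredential("<your-key>"),
)

# top=3 asks for the three answers with the highest predicted confidence.
response = client.get_answers(
    question="How do I cancel my reservation?",
    project_name="travel-faq",        # hypothetical knowledge base project
    deployment_name="production",
    top=3,
)

for answer in response.answers:
    print(answer.confidence, answer.answer, answer.source)
```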
Then for text analysis — designed to help me extract information from text — there's a whole bunch of different features I'd think about. There's obviously language detection: what language is this? That could return language = English. I might want key phrase extraction: "AI", "Azure". There's sentiment analysis — how positive is it? Maybe positive 0.9, neutral 0.1, negative 0.0 (hopefully, if it's sentiment analysis of this video). There's named entity recognition — it probably won't recognize "John Savill", I'm a nobody, unless maybe I trained it. And there's even entity linking, which links an entity to a specific instance — for example, a Wikipedia article — so it can help disambiguate common entities that share a name. Imagine I said "boots": boots could be a type of shoe, or Boots could be a chain of chemists in England. That's where entity linking helps when entities share the same name but are obviously very, very different.
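A minimal sketch of those text analysis calls with the Python SDK (endpoint and key are placeholders; the sample sentence is just for illustration):

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

client = TextAnalyticsClient(
    "https://<your-language-resource>.cognitiveservices.azure.com",
    AzureKeyCredential("<your-key>"),
)

docs = ["I really enjoyed John Savill's video about Azure AI."]

print(client.detect_language(docs)[0].primary_language.name)   # language detection
print(client.extract_key_phrases(docs)[0].key_phrases)         # key phrase extraction

sentiment = client.analyze_sentiment(docs)[0]
print(sentiment.sentiment, sentiment.confidence_scores)        # positive/neutral/negative

for entity in client.recognize_entities(docs)[0].entities:     # named entity recognition
    print(entity.text, entity.category)

for linked in client.recognize_linked_entities(docs)[0].entities:
    print(linked.name, linked.url)                             # entity linking, e.g. Wikipedia
```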
Then I can think about the idea of translation. There are some overlaps in some of these things: translation also includes language detection, very similar to text analysis. But one of the great things I can do here is a one-to-many translation, and again this can be combined with other services — maybe I use the Read API to read in a whole set of books, then pass the text to the translation service and do a one-to-many to convert it to French and German and Spanish and whatever else I want. There are also options like word alignment and profanity filtering, where I can either delete profanity or mark it with a character like an asterisk. And there's transliteration: converting text from a native script to an alternative script — a word from one alphabet to a different alphabet — so it's not converting the language, it's converting the alphabet it's written in. I can also have custom translation, adding business-specific or industry-specific vocabularies, so I can add all of those capabilities as well.

Then the next area we're focusing on is obviously speech, and there are different aspects to it. Speech to text is fairly obvious: a speech recognition API that accepts spoken input, maybe from a file. There are two different APIs: the speech-to-text API, which is the primary way to perform speech recognition, and a speech-to-text short audio API for audio up to 60 seconds.

There are some specific configuration objects we use as part of this. There's a speech config, where we give it the location and the key to use the service, and then optionally an audio config, which can override the default input — instead of, say, the system microphone, it could be a different source. These then get used to create a SpeechRecognizer object for speech to text, and the key thing it returns is the recognized text, along with information like the duration.

We also have text to speech, and it uses those same configuration objects — that same speech config — because now what we want to do is take text and speak it. So once again there's a text-to-speech API, and there's also a text-to-speech long audio API for batch operations: if I want to take a whole book and convert that text to audio, I can leverage that. With the speech config I can also set things like the audio format and the voices I want to use; once again the audio config can override where the output goes — the speakers, or maybe a file. And this time what these create is a SpeechSynthesizer.

For text to speech there are actually two different ways I can feed in the input. Yes, I can just feed it plain text, but I can also feed it SSML — Speech Synthesis Markup Language — which lets me have speaking styles, pauses, pronunciations, pitch, rate and more, and even multiple voices. So I could use basic text, which sounds the same throughout, or SSML if I want variations — a richer output dialog.

There's also speech translation. I need the source and the target language, and we use those short language codes — sometimes you'll see a dash because there are variants of a language, like Chinese Simplified or French Canadian. Whether I'm specifying the source or the target, I use those short codes, and I specify them as part of a speech translation configuration. The speech comes in, and we get back the original-language text but also the translated text — it's doing multiple things for us. We can also synthesize translations, which is really combining multiple steps: take the speech, understand what it is, then straight away take the text, translate it, and speak it. I can make that event-based, so it triggers straight away, but only if it's one-to-one — one input language and one output language. If it's one-to-N — one language coming in that I want converted to three or four different languages — I do the synthesis manually.

There's also speaker recognition — recognizing individual speakers based on their voice — and intent recognition, where I can understand the semantic meaning of the spoken input. So there are different capabilities around there.
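A minimal sketch of that speech translation flow with the Speech SDK for Python (key and region are placeholders; this listens on the default microphone):

```python
import azure.cognitiveservices.speech as speechsdk

# One-to-many speech translation: English speech in, French and German text out.
translation_config = speechsdk.translation.SpeechTranslationConfig(
    subscription="<your-key>", region="<your-region>"
)
translation_config.speech_recognition_language = "en-US"   # source, short language code
translation_config.add_target_language("fr")
translation_config.add_target_language("de")

recognizer = speechsdk.translation.TranslationRecognizer(
    translation_config=translation_config
)
result = recognizer.recognize_once()             # single utterance from the microphone

print("Recognized:", result.text)                # original-language text
print("fr:", result.translations["fr"])          # translated text
print("de:", result.translations["de"])

# Text to speech follows the same pattern: a SpeechConfig plus a SpeechSynthesizer.
speech_config = speechsdk.SpeechConfig(subscription="<your-key>", region="<your-region>")
synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
synthesizer.speak_text_async("Bonjour tout le monde").get()
```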
And then the last of these broad categories: anomaly detection. There are really two types. We can look at one signal — univariate — where the signal goes outside some expected variance. Or it can be multivariate — a whole set of different signals, where it's some combination across those signals that indicates a problem. That's probably a more complicated system, maybe a manufacturing piece of equipment, where there's a correlation of multiple signals I need to understand to spot an impending failure or issue.

There's Azure content moderation, which works for both images and text. For images, the goal is classification: is the image adult-classified or racy-classified? There are different levels — adult would be images that are sexually explicit, racy could be sexually suggestive or mature in certain situations. For text there's profanity, with categories like offensive, sexually explicit and sexually suggestive, plus personal data: email addresses, IP addresses, phone numbers. So we have all of those content areas covered.

Content personalization is really about making the best decision. Think of the shopping cart idea again: reinforcement learning, where we suggest something, learn whether that was good or bad, and make better suggestions next time. We see this all the time — "we recommend you watch this movie next", thumbs down, don't like that movie — OK, that's reinforcement learning, so the next time my media service suggests a movie, hopefully it gets it better. There's a learning loop, with the idea of ranks and rewards based on the feedback it's getting. We see these suggestions and recommendations in everything we do, and when we say "I'm not interested in this", hopefully it improves next time by not suggesting that thing again.

Then we move into the idea of the pre-built solutions. All those other things, I have to do something with — I have to build my solution to use them. The whole point of pre-built solutions is that they perform a function without me potentially having to do anything.

Form Recognizer is obviously a very common thing we might want, and there are pre-built models. I can feed it a JPEG, a PNG, a PDF or a TIFF, up to 500 MB max (4 MB if I'm using the free tier), with a maximum pixel size of 10,000 × 10,000. These use OCR, but they also look at whether there are key-value pairs, selection marks, maybe even tables in the documents. What it returns is JSON: the text, any bounding boxes, any selection marks. These pre-built models are really powerful for things like W-2 forms, invoices, receipts, ID documents and business cards. I call the analyze function, which gives me a result ID, and then I use that result ID with the get-result call to retrieve the actual result.

Or I might want to create a custom model. Obviously with a custom model there's more work I have to do to make it function the way I need. My goal would be to create a whole set of training data: the forms I want to train it on, the optical character recognition output, and the labels. I can either use Form Recognizer Studio to bring documents in, label them and train, or I can put sample data in a blob container in an Azure Storage account with corresponding JSON files: for every document there's an OCR JSON file and a label JSON file, plus one fields JSON file overall. These tell it the text from the OCR, the layout, and the labels it should find based on the fields that are available — and I use that to train my own custom model.
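A sketch of calling a pre-built model with the Form Recognizer Python SDK (endpoint, key and file name are placeholders; begin_analyze_document starts the analyze operation and the returned poller handles retrieving the result for you):

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.formrecognizer import DocumentAnalysisClient

client = DocumentAnalysisClient(
    "https://<your-form-recognizer>.cognitiveservices.azure.com",
    AzureKeyCredential("<your-key>"),
)

# Other pre-built model IDs include "prebuilt-invoice", "prebuilt-idDocument"
# and "prebuilt-businessCard".
with open("receipt.jpg", "rb") as f:
    poller = client.begin_analyze_document("prebuilt-receipt", document=f)

result = poller.result()   # polls until the analyze operation completes

for doc in result.documents:
    for name, field in doc.fields.items():
        print(name, "=", field.content)   # extracted fields, e.g. MerchantName, Total
```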
We saw anomaly detection; Metrics Advisor is just a pre-built solution on top of it for data monitoring and anomaly detection in time-series data — I can tune the detection model, and it can help trigger alerts. Video analyzer again uses those vision services we saw, but as a complete video analysis service: scene segmentation, content labeling, shot detection, people tracking, face identification, audio transcription, closed captioning, speaker enumeration, speaker stats, emotion detection — all of those things. Immersive Reader is really nice: the idea is that it can make content available to anyone, and this is actually best seen by just looking at the documentation. The point is it can make text easier to read: it can isolate the text, display pictures, highlight parts differently to help understanding, even read the content aloud, translate it, or split words into syllables. The docs talk about how you just create an iframe that calls the service, and it makes the content easier for people to absorb — it really is about bringing content to anybody and making it very easily accessible.

OK, so now we get to the Bot Service. This is really powerful — these conversational interactions we see time and time again. Interactions are initiated by an activity: a user joins a conversation, or sends a message. The message could be text, it could be speech, it could be a visual interface like a card or a button. We can maintain state if we have a multi-turn type dialog, and once again I can have multiple channels for a single solution — web chat, email, Teams — I can define all of this.

Now, there are really three layers to this solution: there's the Azure Bot Service, which is what gets exposed; there's the Bot Framework Service, which itself has a RESTful endpoint; and there's the Bot Framework SDK that calls it — and of course I do my development against the SDK. There are templates available: an Empty Bot, just a basic bot skeleton; an Echo Bot, like a hello world that just echoes the message back; and a Core Bot that has some core functionality, such as integration with the language understanding service.

What I need to do is implement activity handlers — event methods that I override to handle different types of activity. There's a turn context that exposes, for example, the text received in the turn. There are dialogs for more complicated handling of state across multi-turn conversations. We have a recognizer, which interprets the user's input; a trigger, to respond to the detected intent; and a language generator, to formulate the output to the actual user. There's a local Bot Framework Emulator, so I don't have to go and use the cloud — I can run the bot locally in my development environment — and again, a single bot can be delivered through multiple channels. There's the Bot Framework Composer, there's the Bot Framework SDK, and there are Power Virtual Agents for no-code — part of the Microsoft Power Platform, making it easier for your citizen developers to go and create these solutions.
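A minimal sketch of an activity handler with the Bot Framework SDK for Python — essentially what the Echo Bot template does (the class name and greeting text are my own illustration):

```python
from botbuilder.core import ActivityHandler, TurnContext

class EchoBot(ActivityHandler):
    # Override the event methods for the activity types you care about.

    async def on_message_activity(self, turn_context: TurnContext):
        # The turn context carries the incoming activity, including its text.
        await turn_context.send_activity(f"You said: {turn_context.activity.text}")

    async def on_members_added_activity(self, members_added, turn_context: TurnContext):
        # Fires when a user joins the conversation.
        for member in members_added:
            if member.id != turn_context.activity.recipient.id:
                await turn_context.send_activity("Hello and welcome!")
```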
And then finally — I guess the last key thing we'll focus on that's part of the exam topics at the time of recording — there's Azure Cognitive Search. This is so powerful today, and it's getting even more powerful when we start to think about what's coming with large language models and using our own data with them. Really what happens is we have huge amounts of data, and we want to be able to extract information and make it available to be consumed in different ways. The point of Azure Cognitive Search is that it's a cloud-based solution for the indexing, and then the querying, of a huge range of data sources. The data could be in blobs, it could be tables in a database, it could be in Cosmos DB — but it could also be something else. I can use something like Azure Data Factory to push data in using the index REST API, though there are some limitations with that push approach: it can't handle complex data types like arrays, and I have to make sure I'm not being throttled and handle a document in a batch failing. But it really opens it up to almost anything. I can then use cognitive skills to enrich this data — and I'll talk more about that in a second — for example taking images, extracting the text from the image and adding it to the index, or saying "I need this particular field translated" so it works across different languages.

Now, this is its own service: cognitive search is a complete solution, so it's not available as part of that multi-service resource — I create a dedicated Azure Cognitive Search resource. There are different tiers available: free, basic, standard, and then storage optimized, which is the L tier. Free I can obviously go and play around with, but I'm limited in what I can do; basic has a limited amount of data I can index; standard comes in different sizes with increasing scale and indexes; and L is storage optimized, for really, really large indexes. And obviously there's cost, so I need to look at these to make sure I'm optimizing how I spend my money.

What I'm really going to get is this giant index, and to use a giant index two things have to happen: I have to be able to store it, and I have to be able to talk to it — to search from it. So I have the idea of replicas: replica 1, replica 2, replica 3, up to N. Think of these as separate instances I can load balance queries across, improving my scale and my resiliency if there's a failure. Within each replica I have partitions: partition 1, partition 2, partition 3 and so on, and every replica has the same number of partitions. Each replica-partition combination is a search unit, so my number of search units is the number of replicas multiplied by the number of partitions — three replicas and four partitions would be twelve search units. And even within that, the index gets sharded to handle some of its own internal workings. Obviously, the more replicas I have, the better my resiliency gets.

If we look up reliability in Azure Cognitive Search, the docs talk about running a single instance, and they mention availability zones — remember, availability zones are distinct sets of data centers in a region with their own power, cooling and networking, all about high availability. But notice: if I want three nines (99.9%) for read-only, I need two replicas; I need three replicas for three nines of read and write. So that might guide how many replicas I want. And I can have up to 36 search units — 36 is the max — which I can reach through different combinations of partitions and replicas; that's really up to me.

There are other components that come into this. I talked about this idea of enrichment: enrichment comes from a skill set, and one of the things an enrichment may do is call something else and extract additional information. Maybe I want to keep that enrichment beyond just adding it to the index, so I can also do a projection into a knowledge store — I project the enriched data into something else to maintain that work. If the projection is objects, it's stored as JSON; if it's relational, schema-type data, it goes to a table; if it's unstructured, like an image, it goes to a file. So what the knowledge store actually holds varies, but I can absolutely do that.

So, great — we have these capabilities: we have the index, and we have the indexer, which actually drives the mapping of the fields into the index. The index can be multi-language — again, I can mark certain fields as needing translation as part of that.
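Once the index exists, querying it from Python is a small amount of code. A sketch with the azure-search-documents package (service name, index name and field names are hypothetical):

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

client = SearchClient(
    endpoint="https://<your-search-service>.search.windows.net",
    index_name="pizza-docs",                      # hypothetical index
    credential=AzureKeyCredential("<query-key>"),
)

# Search the index; top=3 caps the number of results returned.
results = client.search(search_text="pizza", top=3)
for doc in results:
    print(doc["@search.score"], doc.get("title"))  # "title" is an assumed field
```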
And when I talk about these skill sets, I can actually create my own custom skill. For example, I might use an Azure Function, which has an endpoint — a URI. I then create a custom skill with a Web API skill property that points to the URI of my Azure Function, so the function gets consumed as part of the skill set: Azure Function with a URI, custom skill pointing at that URI, and I'm good to go.

There are also things like synonym maps, for when different terms mean the same entity: UK, GB, Great Britain, United Kingdom — well, they're not exactly the same, but maybe I treat them as synonyms. If I want a field actually returned in results, I mark it as retrievable. And I can go beyond basic interactions: as someone's typing, I might want to autocomplete what they're typing, so there's an autocomplete option — as I type a word, it offers to finish the word I'm typing — and there are suggestions, where as I'm typing it fills out entire suggested results for what I may mean. For those particular capabilities I register certain fields with a suggester, and then I can use that suggester for the autocompletion and for the suggestions.

There's a whole studio experience I can leverage with this, and for querying I can do a basic search, but I can also use the Lucene query parser — a more precise type of query. I can boost certain terms using the caret (^) symbol, I can weight certain fields using scoring profiles, and if I have geographical data I can use geospatial functions to return results closest to a given point. So there are really powerful things I can do as part of this.
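A sketch of the autocomplete, suggestion and Lucene-syntax queries just described (the suggester name "sg" and the index are hypothetical; fields must have been registered with that suggester when the index was defined):

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

client = SearchClient(
    endpoint="https://<your-search-service>.search.windows.net",
    index_name="pizza-docs",
    credential=AzureKeyCredential("<query-key>"),
)

# Autocomplete finishes the word being typed...
for item in client.autocomplete(search_text="piz", suggester_name="sg"):
    print(item["text"])

# ...while suggest proposes whole matching results.
for item in client.suggest(search_text="piz", suggester_name="sg"):
    print(item["text"])

# Full Lucene syntax: boost the term "pizza" with the caret symbol.
results = client.search(search_text="pizza^3 pasta", query_type="full")
```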
And that's really, at the time of recording, the scope of the exam: you should have gone through the examples, you should understand all of these things, and you should be good to go. But I'm going to cover one other thing that, at the time of recording, is not in the exam — it almost certainly will be added — so go and check the study guide and see what's in scope; you can ignore this if it's not in scope today. It's a really powerful component, and that is Azure OpenAI. You've heard of this — I did a whole video about large language models. This is all about large language models, and there are different ones: there's GPT-3.5 Turbo; there's GPT-4, which at the time of recording is the newest, but there are bound to be newer ones; there are embeddings, to create vectors based on information; and there's DALL-E, where I give it a description and it creates me a picture — which I don't always like, because people keep using it to create pictures of me as some supervillain or something. These large language models were trained on vast, vast amounts of data, and it took huge amounts of computational power, money and time to create one of them. And if you look at when they were trained: at the time of recording, GPT-4 was trained on data only up to September 2021.

That's a long time ago, but it's trained, and the model really becomes read-only in the way we use it. What's happening with Azure OpenAI is that we can now leverage instances of these models and pay as we use them, by the number of tokens, and I can do a number of different things with that. The whole way we interact with these solutions is: from my machine, we send a prompt to the large language model, it performs an inference, and it passes the result back. An inference is predicting the next most likely token, and the next most likely token after that, and so on — on this phenomenally powerful model with billions of parameters and many, many layers, ultimately based on fairly simple statistics, which is crazy, because it feels natural, like I'm interacting with a human; it's very hard to tell the difference, and it's been trained on so much information that it seemingly knows nearly everything. If you look at GitHub Copilot, the copilots for Microsoft 365, Security and Windows, and Bing Chat, they're all using this — and they're not modifying the large language model. It takes so long to train that you can't retrain or tweak the model; it's essentially read-only. Which raises the question: how is that working, then? Well, I can use it for different things: there's completion; there's chat completion, which is more about roles within the messages being sent; and there are those embeddings, designed to return a vector — vectors are very useful to summarize the intent of, and relationships between, different things. I can test completions in the completions playground in the Azure OpenAI Studio, and test chat in the chat playground.

One of the most important things is that the large language model is responding to the prompt, so the more specific the prompt, the better the response will be. I say "prompt", but really there's what the user puts in, and then there's the idea of a system prompt to give it more context and useful information — or maybe I modify the prompt. We always talk about this idea of grounding: if I'm grounding the prompt, I'm enhancing it, making it more useful. With chat completion, for example, the system prompt might say: "You are a helpful assistant. You are teaching people about Azure. You should be respectful and polite. If you do not know the answer, say so." I'm giving it more instruction, and then I give it the user's prompt, which may be a bit vague and might not otherwise get the best response — so I can make the expectations more descriptive and clearer. I can also tweak how the model responds with things like temperature and top probability — make it more artistic and imaginative, or stricter — and I can give it cues, like "I want you to start your response with...". All of this helps make the output better and better.
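A sketch of a chat completion call against Azure OpenAI, using the 2023-era openai Python package (resource name, key, API version and deployment name are placeholders; note that engine is the name of *your deployment*, not the model):

```python
import openai

openai.api_type = "azure"
openai.api_base = "https://<your-openai-resource>.openai.azure.com"
openai.api_version = "2023-05-15"
openai.api_key = "<your-key>"

response = openai.ChatCompletion.create(
    engine="gpt-35-turbo",        # your deployment name
    temperature=0.2,              # lower = stricter, higher = more imaginative
    messages=[
        # The system prompt grounds the model with context and rules.
        {"role": "system", "content": "You are a helpful assistant teaching people "
                                      "about Azure. Be respectful and polite. If you "
                                      "do not know the answer, say so."},
        {"role": "user", "content": "What is a search unit in Azure Cognitive Search?"},
    ],
)
print(response["choices"][0]["message"]["content"])
```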
But the other thing I can do as part of this grounding — because the obvious question is: OK, if it's only trained on data up to September 2021, how does Bing Chat possibly work, and how can I make it use my data? — is describe to the large language model that certain APIs are available to it: "you can, if you want, ask me to go and find something out". So I describe an API, and instead of returning a response that goes straight to the user, the model might actually return a plan — I'm now essentially acting as an orchestrator — saying "go and find this out for me". If it was Microsoft 365, with the user's context it might say: go and run this search against Azure Cognitive Search to find these documents. Then the next prompt can be built in sections: the information that was requested, and then what the actual request is. And that's what enables me to add my own data. One of the nice things the studio does is let me bring in my own data: I could upload it, it could be in Azure Storage as a blob, it could be a Cognitive Search index. As part of this process I bring my data in, and the Azure AI Studio helps me use it as part of those responses.

So this is how it can do things with my data. I could expose how to search my knowledge base, so if people ask about my catalog, it can actually chat and respond with information. Or here's a help desk API with all the facts from previous problems we've ever had — it can now go and look up that information. My first request might be "hey, I'm stuck on this"; its first response might be a plan that says "go and call Azure Cognitive Search and search for these terms"; then I follow up with that information plus the previous prompt, so it has the context of the history and my request — and now it has more information to craft a nicer response for me.

This is also a very common thing for coding. There's no separate Codex model anymore — GPT-3.5 Turbo and GPT-4 are just good enough — but it can explain code, comment code, create documentation, complete code I've written, convert code from one language to another, create code based on a description, fix bugs, refactor code to make it better, and write unit tests. It's really just going to help with overall productivity and efficiency. Again, at the time of recording this is not in the exam, but I fully expect it to be added at some point.
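Putting that grounding pattern into a minimal sketch — retrieve relevant documents from a (hypothetical) help desk index, then feed them into the prompt as context; the index, field names, deployment and keys are all placeholders:

```python
import openai
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

question = "My VPN client will not connect. What should I try?"

# 1. Retrieve: search the knowledge base for relevant entries.
search = SearchClient(
    endpoint="https://<your-search-service>.search.windows.net",
    index_name="helpdesk-kb",
    credential=AzureKeyCredential("<query-key>"),
)
context = "\n".join(doc["content"] for doc in search.search(search_text=question, top=3))

# 2. Ground: give the model the retrieved facts plus the user's question.
openai.api_type = "azure"
openai.api_base = "https://<your-openai-resource>.openai.azure.com"
openai.api_version = "2023-05-15"
openai.api_key = "<your-key>"

response = openai.ChatCompletion.create(
    engine="gpt-35-turbo",
    messages=[
        {"role": "system",
         "content": "Answer using only the information provided below.\n" + context},
        {"role": "user", "content": question},
    ],
)
print(response["choices"][0]["message"]["content"])
```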
So I guess that's it — that's the study cram. I really hope it's useful. My key recommendation: go through the online learning, it's really everything you need, but also go through the labs and get more familiar with it. You want to understand what every service does, because the exam will present you with "how would you do this?" — OK, I'd use this solution, then that solution — what types of functions and calls might I make, what information would I need, what structure would those things have. Again, I'm not writing code from scratch, but it may show me code and ask what it's doing, or what function I'd use for a part of it. If you go through and understand it, I think you'll be fine. And if you don't pass the first time, go and look at the score descriptions, find your weakest area, and double down on that.

Again, you have a huge amount of time. An hour forty may not seem like much, but I finished in thirty minutes and did pretty well, so you have time — you're not in a rush. Look at each question; you can mark some of them and come back to them later if you're not sure. Sometimes a question later on will actually give you a hint — "oh wait, OK, I remember that" — and it helps you answer a previous one, so you can come back, review it and answer it. But pay attention if it says you won't be allowed to come back to a section: some questions you have to answer without being able to go back, because it's going to ask a series of questions with the same scenario and different possible solutions — "would this meet the requirement? would this? would this?" — and it won't let you go backwards, so really think about whether it does or does not meet the requirements. But take your time, don't stress out about it, and, uh, good luck!
Info
Channel: John Savill's Technical Training
Views: 66,400
Keywords: azure, azure cloud, microsoft azure, microsoft, ai, ai-102, cognitive services, artificial intelligence, certification, associate, engineer, openai, gpt
Id: I7fdWafTcPY
Length: 126min 31sec (7591 seconds)
Published: Mon Jul 17 2023