AI powered Q&A against your own custom data and api's such as schedules using HuggingFace and BERT

Video Statistics and Information

Captions
Hey, welcome back. In this video we're going to look at a really interesting technique that you can use with the question answering AI model we covered in our previous video. This time we're going to take data that you've got in a typical API or database format, convert it into English, and then ask questions against our data. I think it's a super interesting technique, and I think you'll be able to start using it against your own data sets really quickly. [Music]

To get started, what I thought we would do, which would be a bit of fun, is open up Hugging Face again so I can show you what I mean. Once that works, we'll code it up in Python really quickly, and then of course you can use that in JavaScript and so on later.

So I'm going to open up my browser and go to Hugging Face, just like we did in the previous video. We're going to select the question answering filter, as you can see, and I'm going to pick one of the existing models. It doesn't really matter which one I use, but I'm going to use the large uncased one, the same model we used last time. As you saw before, with Hugging Face you can ask a question, but you need to put in the text that you want to ask the question against.

I thought it would be interesting, and I'll show you how this is going to work, to be able to ask questions of data sets, things like schedules. As everyone has probably figured out by now, I'm a huge NFL fan, so I thought what could be useful is: imagine I had a bot where I was able to ask questions about the NFL schedule. You could imagine me typing in something like "in week one of the NFL season, the Tampa Bay Buccaneers play the Dallas Cowboys", and I can never remember when it is, so I'm just going to sort
of steal this text, actually, from here, and we'll just put it in. So: in week one of the regular NFL season, the Tampa Bay Buccaneers play the Dallas Cowboys in Tampa, Florida, at the Raymond James Stadium on September 10th, 2021, at 8 pm Eastern Time. That's the context, and as you can imagine, I can start asking questions of it now.

If I type in something like "who are the Dallas Cowboys playing in week one" and hit compute, it'll take a second and then come back with Tampa Bay Buccaneers. I can also ask "when are the Dallas Cowboys playing in week one", and it's going to come back and say September the 10th. Let's let it figure that out: September the 10th. I can even ask "what time are the Dallas Cowboys playing at": 8 pm Eastern Time. And I can ask "who are Tampa playing in week one", and it should come back with the Dallas Cowboys.

So that's the model. If you think about what I've done here, I've taken data that I could be getting from an API, and I could be converting that into English to ask questions. Really powerful technique, right? Just think about the data that you've got kicking around. It could be in a database format or an API format. Imagine you took that data and just converted it into an English text form; suddenly you can start asking questions about it.

Good use cases, as you've just seen on this one: it could be used for any type of scheduling, so you could imagine things like appointment systems and booking systems. You could start asking which slots are available, or when you have a free appointment, because all you would need to do is take the data for those appointments or bookings and write it out in a format that is English. So you could have "the following booking slot is available between 8 pm and 8:30 pm for" whatever, and then you can start asking questions:
What slots have you got available? Have you got a slot at 8 pm? Absolutely huge.

Then again, you can imagine this being used in completely different industries. Imagine banking and bank statements for a second. Think about the account statement that you have in your banking form. At the moment it's something like: here's the date, paid electricity bill, 20 pounds, on whatever day. Or you can imagine your direct debits: you've got a direct debit or a standing order, money comes out of this account at 2 pm on the 15th of August, pay electricity bill on the 17th of August. Once that's contextualized, once you take that data out of your database or your API and convert it into an English format, you can start asking questions. Imagine being able to ask your bank account: when does my electricity bill come out? Have I made a payment to the water company this month? How much have I paid to my Aunt Dora, for example? You could just ask those questions, and because the data has been translated into English, you could query things in the exact same way as we've done here.

Okay, so now that you've got the context, how do I convert that into English? Well, let's use my NFL example for a second. What I'm going to do is pull from an API that we've had before. If I take the ESPN API, you can see ESPN has got an API which essentially returns a JSON version of the schedule, and I can put in dates and queries behind it. Now, that is pretty unreadable, so I'll bring this up in something like Postman for a second. If you've never used Postman before, it's an API explorer tool. It's pretty cool, it's free, and you can just start using it. As you can see, I've got that open already, so just for a bit of fun I'm going to create a new HTTP request.
I'm just going to paste in the API that I showed you before and click Send, and you can see it's coming back with the formatted data. In this case it tells me the various dates and so on, but if I look for, let's say, Tampa, you can see here's an event, Dallas Cowboys at Tampa Bay Buccaneers, and it's got all the information I would need. It's got the dates, the stadium it's being played at, the state, and the capacity as well. It's got who's playing: Tampa Bay are at home, Dallas Cowboys are away. So there's my data, and all I really need to do, if you think about it (let me just close this down for a second), is parse that API response into English.

So if I bring up Visual Studio Code: I've pre-written this, because you don't want to watch me type it out. The quick version is that I've taken that JSON and stuck it in this regular week one JSON file. It's the exact same JSON you saw before; I've just put it in a file. What I'm doing here is reading that file, the regular week one JSON, which is the same data that we got from the API earlier, and then I'm just parsing it (that's the JSON.parse), and then I call my parse schedule function.

If you look in this, there's a little bit of date calculation. Remember I showed you when the game was being played; that's the date calculation. All it's going to do is loop through all of the week one matches, or events in this case, and do a bit of conversion to get the date and time. Then you can see I'm using template literals in Node.js, just in JavaScript, to say "in week" whatever (I'm passing in the week and the match type), and then I'm saying who the home team is, and that they're playing against the away team, and that the game is being played in, you know, Tampa Bay,
in Florida, at the stadium, and that the game will be played on the formatted date at the formatted time, Eastern Time. So I'm just doing that, and if I run this very quickly by typing in node schedule to text, you can see it's generated the textual representation of that data.

If I take that, let's just copy it, and come back into Hugging Face for a second, I can get rid of that single week one example and paste in every game. In the same way as it was coming back with Dallas Cowboys before when I asked who Tampa are playing in week one, that still has the same answer, but now I can ask questions about any of the teams. If I want to know who the New York Giants are playing, I can just type in New York Giants and hit compute, and you will see (it will take a second) the New York Giants are playing the Denver Broncos. Then I can change New York Giants to Denver Broncos and hit compute, boom, and it should come back with the New York Giants, which it does.

We can also ask things like "what time are the New York Giants playing at in week one". Let's hit compute, give it a second, and you can see it's at 4:25 Eastern Time. And we can ask what day, and you see it's on September 12th. So across the board, because I've converted my data into English textual form, that whole data set is queryable. And I could increase it: I put one week in there, but I could put all 16 weeks in if I wanted to.

You can see it is picking up some pretty cool stuff. Again, if I wanted to know what stadium: there you go, the Atlanta Falcons are playing the Eagles in Atlanta at Mercedes-Benz. So I could ask "what stadium are the Philadelphia Eagles playing in week one", hit compute, and it should be coming back
with the Mercedes-Benz Stadium in this case. Give it a second, and it does: it comes back with Mercedes-Benz Stadium.

This is a super powerful technique, as I said. Just imagine the data that you've got in your databases. We've said bank statements as an example, we've said direct debits, and you could do the same for insurance, booking, scheduling, anything you hold data for. You could be using this exact same technique: take the data out, convert it into some sort of English format, and add some context. You can pull data from multiple sources, make a sort of English essay, and then run question answering models over the top to provide chatbot capabilities. Really interesting technique, and again, it's something that you should go and play with.

Of course, as we've seen previously, if I wanted to run that in something like Python, I could very quickly (let me just close this) open up my Q&A example from before, or I could just type it up really quickly. So if I create a new notebook here for a second, we'll do exactly what we did in our previous video. We install the Hugging Face transformers library like we did last time; we just hit pip install, and that's going to install the transformers library. Then what we can very quickly do is code up a brand new question answering model. We type in from transformers import, and we'll use the pipeline libraries as we did before. Again, that's in my previous video, which I'm sure comes up in the top right-hand corner; there we look at how you can use that with something like TensorFlow, and in future videos we'll look at how we can embed this with JavaScript as well. But I will show you this working with Python again, since you can use that today.

Then I can just ask questions. I create a question answerer, and we'll set that to a pipeline and type in (terribly) question
answering. This is using the pre-built functions that they've got, which take away a lot of complexity.

What I'm going to do now is set the context, which is going to be the text that I've just copied from over there, so we'll set that in here. That is basically all of the "in week one..." sentences. All we're going to do now is ask it a very quick question. In our case we'll ask who the Dallas Cowboys are playing in week one, so we'll say question is equal to "who are the Dallas Cowboys playing in week one", and we'll set the context equal to the context that we set before. Then all we do is print out the result. If I hit the play button, it takes a second, and it should come back with the Tampa Bay Buccaneers, which it does. And that is it working.

Of course, if you wanted to use a different model, or if you wanted to start testing it against TensorFlow so that you can pull that into TensorFlow in your microservices later, then see my other video on how you can test out different model types in the Python notebooks.

So anyway, that has been our video. I hope you think this is a super great technique, and I hope you pick it up and start using it for your own use cases. I can think of a ton of use cases, and I'm sure you can too. Anyway, I will catch you soon in the next video. Thank you, goodbye.
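
The two steps described in the captions (turn structured records into English sentences, then run a question answering pipeline over that text) can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the code shown in the video: the video's converter is written in Node.js, the flat field names (`home`, `away`, `city`, `state`, `stadium`, `kickoff_utc`) and the two sample records are made up for illustration and do not mirror the real API response, and the fixed UTC-4 offset stands in for proper time zone handling.

```python
from datetime import datetime, timedelta, timezone

# Assumption: kickoff times arrive in UTC and are shown in US Eastern
# Daylight Time (UTC-4). A real script would use a proper time zone database.
EASTERN = timezone(timedelta(hours=-4))

# Hypothetical, simplified event records (illustrative sample data only).
EVENTS = [
    {"home": "Tampa Bay Buccaneers", "away": "Dallas Cowboys",
     "city": "Tampa", "state": "FL", "stadium": "Raymond James Stadium",
     "kickoff_utc": "2021-09-11T00:00:00+00:00"},
    {"home": "New York Giants", "away": "Denver Broncos",
     "city": "East Rutherford", "state": "NJ", "stadium": "MetLife Stadium",
     "kickoff_utc": "2021-09-12T20:25:00+00:00"},
]


def format_kickoff(iso_utc):
    """Convert an ISO 8601 UTC timestamp into '<date> at <time>' in Eastern."""
    local = datetime.fromisoformat(iso_utc).astimezone(EASTERN)
    hour12 = local.hour % 12 or 12
    suffix = "AM" if local.hour < 12 else "PM"
    date_text = f"{local:%B} {local.day}, {local.year}"
    return f"{date_text} at {hour12}:{local.minute:02d} {suffix}"


def schedule_to_text(events, week, season_type="regular"):
    """Write one English sentence per event, joined into a single context."""
    sentences = []
    for e in events:
        sentences.append(
            f"In week {week} of the {season_type} NFL season, the {e['home']} "
            f"play the {e['away']} in {e['city']}, {e['state']} at "
            f"{e['stadium']} on {format_kickoff(e['kickoff_utc'])} Eastern Time."
        )
    return " ".join(sentences)


def answer(question, context):
    """Ask a question of the generated text with the default QA pipeline.

    Requires `pip install transformers` plus a backend such as PyTorch, and
    downloads a model on first use, so the import is kept inside the function.
    """
    from transformers import pipeline
    question_answerer = pipeline("question-answering")
    return question_answerer(question=question, context=context)["answer"]


context = schedule_to_text(EVENTS, week=1)
print(context)
```

With the transformers library installed, calling `answer("Who are the Dallas Cowboys playing in week one?", context)` would run the same kind of query as in the notebook. The answer depends on which model the pipeline loads, so no output is shown here.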
Info
Channel: Chris Hay
Views: 123
Keywords: chris hay, chrishayuk, huggingface, bert, q&a, qna, question answering stanford, question answering squad, artificial intelligence, ai, huggingface tutorial, huggingface nlp, huggingface bert
Id: sZTijgwMH_o
Length: 16min 2sec (962 seconds)
Published: Mon Sep 13 2021