From 0 to Hero in Writing Elastic Search Query in 1 Hour | ELK | Kibana | Crash course

Captions
All right, good afternoon one and all, and welcome to my course, From Zero to Hero in Writing Elasticsearch Queries. My name is Soumil Shah and I work as a software engineer at JobTarget. I have been working on Elasticsearch at my company for nearly a year, and whatever skills I have gained over that time I would like to pass on to you. As I said, I'm not an expert, but whatever I know I would love to teach you, and I am sure I can contribute a lot. So let me share my slides.

First of all, what are we about to learn in this tutorial? We'll cover an introduction and the basics of queries, then match phrase queries, match queries, query string, nested boolean queries, filter queries, pagination queries, geo queries, aggregations, mappings, and much more.

A little bit about myself: I have experience in building scalable, high-performance software applications, combining distinctive skill sets in IoT, machine learning, and full-stack web development with Python. About my education: I completed my bachelor's in electronics and a double master's in electrical and computer engineering. The best way to reach out to me would be through email. Let's get started with the series.

The first step in this video tutorial series is to upload some data. We will be working with the Netflix data set, which can be found in the description section below. So let me share my screen and do the first part. I'm on Kibana right now, on localhost, as you can see, port 5601. I'll click on "Upload a CSV file" here. The screen is loading, so let's wait. All right: simply drag and drop your CSV file. I'm going to use this Netflix data set, and it's now analyzing. Usually we write a script to do all of
these jobs, but for understanding purposes let's call this index learn and click import. This should create an index called learn, or in simple language you could say a database named learn. Let's wait for it to complete; right now it's indexing all the documents.

Meanwhile, let me show you something. This is an open source library I have published; I'm the author. I wrote it to generate complex Elasticsearch queries, and it's officially published on PyPI. We will be using this library a lot to generate queries, because this tutorial is about writing queries, right?

Now let's go back to Kibana, and I'm going to show you how to write your first, very basic queries. In Kibana, click on the Dev Tools tab. The first command you want to enter is GET _cat/indices (I'll zoom in as much as I can so that you have no problem seeing what I'm typing). Click on the triangle button right here, and this should list all the indices you have. As you can see, I have an index called learn. Good job: you have the index.

Now let's learn how to query the index. Type GET, the name of the index, and _search. This searches the documents in that index. As you can see on the right-hand side, each document has these properties: country, show_id, director, release_year, rating, description, type, title, duration, cast, and date_added. So we have all of these attributes, which is good. As you can see, there are several documents. If you want to get a single document, you can say GET learn/_doc/ followed by the document ID; hit the run button and you should get the document. You could also leverage Postman for this example; you
don't have to use the Kibana dashboard. I can run all the commands from Postman as well. As you can see, this is the index followed by _search; I think we have a typo, the index is called learn, so let me change that. As you can see, I have all the data here. Everything I do in Kibana you can do in Postman as well: I can copy the same query and hit send, and this gives me the data for that particular document. If you want the code for this, say in Python, C, JavaScript, whatever, select the language you want and you get the code for that as well. Any language; Node.js with Axios, whatever you want, it's all there.

Now that you know how to do this, let's move on to writing our first queries. Please install the following library, which is developed by me: pip install elasticsearchquerygenerator. The link should be in the description, so check that out.

Let's understand the very first step in writing queries; don't worry about all of this code. As you can see in the documentation examples, the first thing you have to do is import the ElasticSearchQuery class from the elasticsearchquerygenerator package. Once you import the class, let's get started with the action. I created an instance and gave it a size; size means how many documents I want returned. I'm saying that when I search, I want 10 documents from Elasticsearch. For the bucket name, you can leave the default or give it whatever name you want; I'll cover bucket names later when we do aggregations. Whenever you create an instance of the class, you can access the base query object. As you can see, I'm accessing the base query and running that, and this gives me the query. I'll explain this in a bit more detail, so
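The base query described here can be sketched as a plain Python dict. This is only an illustration under stated assumptions: the keys (`_source`, `size`, `min_score`, and `query.bool` with `must`, `filter`, `should`, `must_not`) are the ones named in the walkthrough, while the function `base_query` and its defaults are hypothetical and are not the library's API.

```python
import json


def base_query(size=10, min_score=0.5, source=None):
    """Hypothetical helper: builds an empty bool-query scaffold.

    This is NOT the elasticsearchquerygenerator API; it only mirrors
    the JSON structure discussed in the walkthrough.
    """
    body = {
        "size": size,            # how many hits to return
        "min_score": min_score,  # drop hits scoring below this value
        "query": {
            "bool": {
                "must": [],      # clauses that all have to match (AND)
                "filter": [],    # clauses that have to match, without scoring
                "should": [],    # optional clauses (OR)
                "must_not": [],  # clauses that must not match (NOT)
            }
        },
    }
    if source is not None:
        body["_source"] = source  # restrict which fields come back
    return body


query = base_query(source=["title"])
print(json.dumps(query, indent=2))
```

The printed JSON is the kind of body that would be pasted under `GET learn/_search` in Kibana Dev Tools or sent as the raw JSON body from Postman.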
now what I can do is dump the query here: I say learn/_search and provide the query. Let's run this and see if it works. Sure enough, it does.

There are several fields in this JSON document; let's understand them one by one. First of all we have _source, which specifies which fields you want Elasticsearch to return. As you can see, it's returning a lot of fields. Say you want just the title and none of the other fields: you put that in the _source attribute, run it, and sure enough the documents come back with only the title field. That's easy, right?

The next thing is size, which is how many documents you want returned when you search. Right now I'm saying, please return 10 documents when I search. The maximum is about ten thousand; if you want more than ten thousand you have to do pagination, which we'll cover in more detail in the upcoming sections.

Then min_score. Whenever you write a query, Elasticsearch assigns each hit a score by default. Here you can see the scores are 1. So what we're saying is: filter out any document that scored less than 0.5; I don't need those.

Now this is the main chunk of the query: you have query, bool, and then must, filter, should, and must_not. It is extremely important to understand this. must stands for the AND operator, filter stands for filter conditions, should stands for the OR operator, and must_not stands for the NOT operator. That means if you want to write a query like "find me A and B, or C", you put the A-and-B queries in the must section and the OR part in the should section. If you want to filter something, put that in the filter section, and if the results should not contain something, you could put
that in the must_not section.

The most basic query types in Elasticsearch are match and match phrase; these are the two most popular queries Elasticsearch has. Let's understand the difference between them, because not everyone does, and that's why I'd like to show you a small demo. I have these three documents, with text like "I love ELK" and "it's great, I love Elastic"; so I have terms like ELK, Elastic, and search in there.

Let's experiment with search. First, the match query: let's search for the word ELK. The field is the field you want to search on; I have a field called text, so I'll search on that. Now, as you can see, all three documents are returned. If I add the word "search", let's try it and see what happens. Yes, still three documents are returned. What happens behind the scenes is that whenever you write a match query, Elasticsearch breaks your search into terms, here "ELK" and "search", and then returns results based on how many of those terms are present in each document.

But what if I don't want that and just want an exact match on a phrase, for example something like "Elastic search"? In that case you use what is called a match phrase. When you say match phrase, you are matching the exact phrase. The phrase, as you can see, is only in one document, and that's why only one document was returned. This is a very basic concept, and we need to understand it very clearly in order to write complicated queries.

With that being said, let's go to the scaffold we generated. Before moving on, I would say learn and _search, so I would search all the
documents.

Now, say I want to write a query for titles containing the word "blazing" or the word "killer". As I said, I want to search for a word, so would I use match or match phrase? Think about it. I would use match. So in the must section I'm going to add a match; where it says field I select title, and I'm looking for the word "killer". So, whichever documents have the word "killer".

Let me take a better example with a long title. Let's say I'm looking for the word "Nazi". If I do that, as you can see, these four documents have the word "Nazi". Now if you want to say, give me the documents that have the words "Nazi" and "camps", you can copy that clause again. Adding more queries here implies an AND operation: as I said, when you put something in the must section, it's an AND. Now if you search, only one document is returned. That means you told Elasticsearch: please look for the words "Nazi" and "camps". Does that make sense?

Now the same thing with the OR operator; see what happens. I put both clauses in the should section, that is, search for the word "Nazi" or "camps". When I run the query I get more documents: four documents that have the word "Nazi" or "camps". So you see, that's an OR operation.

Now let's say I've got all these documents from the OR, and I want to make sure that the word "mega" is not included. That's easy: you put it in the must_not section. So I'm saying, look for the word "Nazi" or "camps", and the document should not have
the word, let's say, "mega". I'm assuming this works; it should. And sure enough it does: as you can see, we now have three documents, so we removed the one with "mega". You see how easily you can form complicated queries.

Now let's say you want to filter by score. As you can see, here the score is 15, here we have a score of 3, and here a score of 5. You could filter for documents with a score greater than 4; in this case this document would be removed. Let's try that, and sure enough it works. So powerful and so easy; it's intuitive, right? If you understand the basics, everything else should be easy.

So let me go back to the basics again. We have four sections: must, filter, should, and must_not. As I said, must is the AND operator, should is the OR operator, must_not is the NOT operator, and filter filters the results. That was the match, the match phrase, and the boolean query.

Level two would be: can I write nested boolean queries? What I mean is that you can write a nested boolean query inside this one. That is possible too, and I'll show you how to do it, but the basics should be very clear first. So far we learned the _cat/indices command, we learned how to search, and we learned the basics of an Elasticsearch query: _source, size, min_score, and query, bool, must, filter, should, must_not.

Now let's take a look at nested boolean operations. Before adding anything, let me remove this; I want to show you something nice. This is what we had before, right? With this structure you can only achieve the following things; if you want to go beyond this
complex queries, here is what you can do. must represents the AND operator, should represents the OR operator, and must_not represents the NOT operator. This is very, very important, guys: if you understand this, all of Elasticsearch becomes a piece of cake for you. And filter basically filters the results.

With that being said, tell me whether you can do something like this. Whenever you put something in must_not, it means "don't include that word", right? What if I want to exclude several? I want to say: search for documents with this and this, and make sure the document does not have this word or that word. I know it's a little tricky, but try to follow. Let's take an easy case: the title should include "norm" or "king". That's an OR, so we just put it in should. Then you want to make sure that the title does not contain the word X or Y. If you want to do something like this, you have to use a nested boolean query.

"What do you mean by that, Soumil?" Basically, you have a bool again inside the must_not section. First let me implement the first part, the title contains "norm" or "king"; that's easy. We come here, we have a match query, we set the field to title, and we say, look for the word "king". Then I add a second match for the other word. I'll indent the query, so don't worry; I know it's a little hard to see. I'm zooming in because I want you to see this carefully and understand the concept, because once you understand this, everything else is a piece of cake. With that being said, if I just run this code right here... oops, seems like we have an issue. Let's remove the
must_not for now. Okay, so we have all of these documents. Let's see how many matches we have: 32. Now I just want to see the title, so I put title in _source and run this. These are the results: "Dinosaur King", because it has the word "king" in it, then this "King Jack", and so on; you know what I mean. So there are about 30 documents.

Now what I'm saying is: whenever you give me the results, make sure the document does not have this word or that word. So I want a NOT operation. I'll collapse the should section. Remember, whenever you put something in must_not, it means the document should not contain that word. In the must_not, you now want to do an OR, so I'm going to paste a snippet there. Let me run this through a JSON formatter, because it's getting a little ugly, and paste the formatted query back.

Now see what I did: in the must_not I put a bool and then a should. I'm saying, make sure the document does not have the word, let's say, "peaking". Let's run this query and see if it works. Sure enough it does: we no longer have the word "peaking" in the results.

Now let's say you want to do an OR inside the NOT. What I mean is: it should not have the word "peaking", or it should not have the word "jack". If you understand this, guys, you are a master, trust me. So now we add the word "jack", and see what happens: we have fewer documents. Before this we had, I guess, 31 matches, and now I'm adding one more rule. Try to read this with me in English; let me remove this operator so it's a little easier.
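The nested boolean query built up to this point can be sketched as a Python dict. This is a minimal sketch, not the exact query from the screen: the helper `match` is hypothetical, the `title` field and the terms "king" and "norm" follow the stated goal ("the title should include norm or king"), and the excluded terms are spelled as they appear in the captions, which may not match what was typed.

```python
import json


def match(field, value):
    """Hypothetical helper: one match clause on a single field."""
    return {"match": {field: value}}


nested_query = {
    "_source": ["title"],
    "query": {
        "bool": {
            # title contains "king" OR "norm"
            "should": [
                match("title", "king"),
                match("title", "norm"),
            ],
            # ...AND NOT (title contains "peaking" OR "jack"):
            # a second bool with its own should, nested inside must_not
            "must_not": [
                {
                    "bool": {
                        "should": [
                            match("title", "peaking"),
                            match("title", "jack"),
                        ]
                    }
                }
            ],
        }
    },
}

print(json.dumps(nested_query, indent=2))
```

Because the outer bool has should clauses but no must or filter clauses, Elasticsearch requires at least one of the should clauses to match, which is what gives the "A or B" behaviour here.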
Okay, so what are we doing, guys? We are saying: give me all the documents that have the word "king" or "Nazi", and make sure those documents do not have the word "peaking" or "jack". If you run the query, sure enough we have more filtered data: 29 hits, so we filtered out a few more documents. As you can see, there is no "jack" here, no "jack" here, and no "peaking" either. Seems to be working fine. You see how powerful Elasticsearch is: you can write a nested boolean inside a boolean and achieve whatever you want.

Let me write what I did in English: the title is A or B, and not (C or D). That is exactly what we did. Let me switch my screen again. This is not easy; we made a very complicated nested boolean query. You see how you can implement any sort of logic: A or B or C, and this or this or that. If you understand the basics, it's a piece of cake after that.

"That's great, Soumil, this looks amazing. Do you know how I can do scrolling, by chance?" Yeah, sure, let me show you. If you add ?scroll=1m to the search and run it, and I set the size to 2, you can see I have two documents and a scroll ID. Whenever I want the next result for the same query, I use that ID. Let me look it up: I search for "Kibana Elasticsearch scroll", because I don't want to give you wrong information and want to verify things. I think it's a POST API, so
I just have to verify that quickly. I guess this is the link; yeah, that's the one. So we copy that, and remember: when you ask for the next batch, you never specify the index name. Now I grab the scroll ID that I got, put it in, and we get the next result. You see how easy it is.

And guys, you can use this in Postman too. "Soumil, show me how I do that." No worries, relax. You take this URL, and the JSON is supplied as the body: go to raw, make sure the type is JSON, paste it there, and send. Here you can see we have the results. So you don't need Kibana; you can use Postman if you know what you're doing. Hopefully that makes sense.

Now I have something to say here, guys: do not use the scroll method. I'm telling you from my experience working in companies, and let me show you why. There is a better way to do pagination (I'll leave all the links, so don't worry about that). I uploaded a post on my blog, four ways to do pagination, and method three works the best: we use search after queries. You can go there and read a bit; I have a video as well. The reason I'm saying this is that when you paginate with scroll, Elasticsearch keeps things in a cache, and there is the issue of expiration. Think about 20 people doing different searches: all the results accumulate in the cache. I know each one is only active for one minute, but you see what I'm saying; it's not the best thing. So I'm saying, instead of this, use a search after
query. This is my recommendation based on my experience: use search after for pagination. The blog post is there; kindly go ahead and read how I did it, and if you have questions I would be more than happy to guide you. So that's the search after query.

Now comes a very interesting part. We did all of these queries and understood these complicated boolean queries. Give me one second. So you did all of this; congratulations, first of all, good job. Now you want to learn the different sorts of queries, so let's head over here and I'll walk you through them. This library, as I said, generates all the queries for you; you don't have to write them manually. For example, if you want to write a match query, it takes a field name, so let's say the field is title, and a value (I think I'm missing a quote here); let's say you're looking for the word "king", and you want to put this in the should operator. I'm just showing you a better way to do this. I developed this library so people could use it; you don't want to write all of these queries by hand. Good luck with that if you do. Let me run this and show you: we get the same query, only now it's done programmatically. Check this out: in the should section of the bool we have "king" or "norm". You just have to supply the operation, which is must, must_not, or should, and you can do all of those fancy things. Whatever complicated queries you want, this is the way to do it.

All right, that was match and match phrase. Match phrase would basically match the exact
phrase, right? I taught you about that, so hopefully you know about match and match phrase.

Now we are about to learn about aggregations. Aggregations can be complicated; these are all nested queries. But don't worry, I've got you: we have this library, which as I said is officially published on PyPI. Yeah, this one. Everything is there, so download it and have fun playing with it.

Now let's talk about aggregation. We have all of these geo queries in here; that's fine. To add an aggregation you use the helper's add-aggregation method and give the aggregation a name. Let's see what fields we have: GET learn/_search. Let's try an aggregation on, maybe, the duration. I'll call it Duration, with a capital D; that's the bucket name I'm providing. For the field I set duration, and I leave everything else at the defaults. Let's run the query and see what we get. Seems like... oh, I know why: I'm not printing. Let's print it and see the aggregation in action. We get this query; copy it and paste it here.

So we have this aggregation: we are saying, find the word "norm", and then do an aggregation on the duration field. We run this, and that's the aggregation. Here you can see the hits for the word "norm" as well, so let me remove those; it might be better to show you what
it actually does. I set the size to zero and search, and now you can see: for a duration of 90 minutes there are 111 documents, for 91 minutes there are 104 documents, and so on. So that's aggregations.

If you want to aggregate on multiple fields, say I did it for the type and also want it for the cast, it's very easy: you just add one more call to the same method. Let's call this one Cast, with a capital C, and give it a field name, meaning the field on which you want to aggregate. And that's pretty much it. Let's remove the match query and the wildcard query, we don't need them; I just want the aggregations. Grab the query and paste it here. Once we do that, we search the documents... oops, one sec. Maybe I'll do it on rating instead. Oh, my JSON is also a bit broken. Okay, so we do our aggregation on rating. You see, the library generated all of this for me, so you don't have to write these complicated queries; I'm just showing you how to make everything work together. Let's remove the hits, and this is the aggregation. For ratings it seems like it did not do the aggregation. Was it ratings? The field is rating. It seems the mapping of the rating field was not appropriate, and I guess that's why it's not able to aggregate. Let me set the size to 0 and try again. Yeah, it seems I was not able to aggregate on the rating; as you can see, it clearly shows an error. It's not a big deal. What I wanted to show you is how to do an aggregation and how to add multiple aggregations. The library does that job for you: as I showed you, you just call the add-aggregation method, and once you're done, you call the complete method. All of these
examples are already given in the documentation. So that's the aggregation part; we'll remove that one.

With aggregations, you can also add queries. I don't have a massive data set here, but hear me out; I'm thinking of a good example. Let's say you want to aggregate on job titles. Say someone asks for an aggregation for "software engineer": you filter on that, aggregate on the other fields, and you get a document count for each bucket. That's the easiest way I can explain aggregation.

Coming to the geo queries. One thing Elasticsearch does: if you put this document in (actually, I want to delete the document first to make sure it's not there, and then we'll create it), so, if you see, I'm adding this document. What happens is that if you do not specify a mapping, Elasticsearch creates one for you. For example, there's a field called geo, and you can see its type is text, so I cannot do a radius search and that kind of thing. To be able to do that, I have to define a custom mapping. So let's delete the index first, abc; I'm deleting that. Okay, that's done. Now, before even sending the document, you have to define a mapping. Let's see if I have the mapping; I think I had one, or I can write one, that's not a big deal. We can always get the mappings from Google, and it would be a good habit for you as well to know how to do that. So, okay,
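A custom mapping of the kind this part of the walkthrough calls for, together with a radius filter that such a mapping makes possible, can be sketched as follows. This is an illustration under assumptions: it uses the typeless mapping syntax of Elasticsearch 7 or later, the index name `abc` and field name `geo` come from the captions, and the helper functions and the approximate New York City coordinates are hypothetical.

```python
import json

INDEX = "abc"  # index name mentioned in the walkthrough


def geo_mapping(field="geo"):
    """Hypothetical helper: mapping body that declares `field` as a geo_point."""
    return {"mappings": {"properties": {field: {"type": "geo_point"}}}}


def geo_distance_query(lat, lon, distance="2000km", field="geo"):
    """Hypothetical helper: bool query whose filter keeps documents
    within `distance` of the point (lat, lon)."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {
                        "geo_distance": {
                            "distance": distance,
                            field: {"lat": lat, "lon": lon},
                        }
                    }
                ]
            }
        }
    }


mapping_body = geo_mapping()
# Approximate coordinates for New York City, chosen only for illustration.
query_body = geo_distance_query(40.7128, -74.0060)

print(f"PUT {INDEX}")
print(json.dumps(mapping_body, indent=2))
print(f"GET {INDEX}/_search")
print(json.dumps(query_body, indent=2))
```

The mapping has to be created before the first document is indexed; otherwise dynamic mapping will already have typed the field, as happened with the text type above.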
you want that, right? So: "Elasticsearch geo mapping". You google it, take the first result, and go to the official documentation; the documentation is usually the best place to get things, so always go to the official documentation.

So now we put a mapping here. A mapping defines the data types: we are saying that this field called geo will be of type geo_point. We are explicitly telling Elasticsearch, please don't do your own thing, I'm telling you what to do. So we gave it a mapping of geo_point. Once we do that, you can see it says acknowledged: true. Then I upload the same document, go to the mappings of that index, and search for the word geo, and you can see it's a geo_point.

Once that is done, it allows me to use geo queries. For example, I want to search for "New York" (as you can see, "New York" is in there) and then filter for everyone within two thousand kilometers. Right now there's just one document, so it's only going to show that one. But what I'm trying to say is that you can add rules here: I want this and this, and then from that result filter everyone who's within, like, 200 kilometers or 200 miles. That's the geo queries in short; it's pretty self-explanatory.

The other thing I'd like to mention is that you can also do an aggregation on geo fields. Let me show you; I think I'd have to disable all of these, I'm not sure, let's see if it works. The geo aggregation does an aggregation on the geo field. It seems like there's an error. Maybe... yeah, I think you have to
call the complete method. I think this function is not complete yet. I was working on this library, adding a new feature, but it seems like I never added this code. I have this one, so the base query, we return the base query. I mean, it should work, I don't see why it should not. Oh, it has to take all the fields. I'm just going to give it one shot and see what we get: geo aggregation, field name, lat, long, aggregation, my distance. What is this? A TypeError: 'NoneType' object does not support item assignment. Does it take anything else? Let me just make sure. Well, I'll fix that part, but you can probably do aggregations on geo points as well. So that's that. Now, what else do I want to cover? I guess I've covered most of the points on queries, and now I would like to cover a few additional aspects. I would like to teach you about aliases in Elasticsearch. Once we do aliases, I would like to show you a little bit of the autocomplete queries that I have been working on, just some fun stuff, and then how to update a document or add a field to it. So let me show you Elasticsearch aliases in detail: what aliases are and why they are so powerful that you should definitely learn about them if you are working with Elasticsearch. I'll talk first about simple aliases, then about filtered aliases, and a little bit about routing and write aliases. So let's get started. Let me minimize my video for a
while. Okay, so aliases. For example, my alias or nickname is "scientist", and my original name is Soumil. People know me as Soumil, but people also know me as "scientist"; that's called an alias, a nickname. Some people gave me other nicknames too. These are aliases: different ways of calling me. Now let's create an index first; I want to create an index called my_first_index. I usually like to explain with examples. So we created an index, perfect. Let's insert some documents. I'm going to insert a document for Suhas, that's my mom's name, then one for Soumil. Oops, I messed up. Now let's insert one more document, Soumil Shah, and then Nitin Shah. Okay, so I have these four documents here; nothing complicated. I want to show you all the documents I have in this index by running a search. One sec. Okay: Suhas, Nitin Shah, Soumil. These are the documents I have. Now I'm creating an alias for this index and giving it the name alias one. What happens is that now I can query the data through this alias. I'll show you what it means. So, created. Of course you can search this index by its original name, my_first_index, but now that you have an alias you can also run the search against alias one, and it returns the same set of documents. Now you would say, "that's amazing, but when will I use this?" Well, think about a situation where you have a read-only database in an Elasticsearch environment. You have some data, and after several months you have to ingest new data.
So what you would do is create a new index and push all the data there, then move the alias from the original index to the new one, and once the alias is in place you would delete the old index. So for database migrations, and for when new data comes in, you would use aliases. Very, very powerful. Now let me show you the power of filtered aliases. If you want to remove an alias, of course, you can use remove. All the code is there, so don't worry about the code, guys; focus on learning for now. So let's remove the alias. Okay, where is that index? my_first_index. Remember how in these documents we had Shah everywhere, and in one document we did not? Let me just do a normal search here. So remember, we have Nitin Shah, we have Soumil, and then we have Suhas, that's my mom's name, and that document does not have Shah. If you want to create an alias which only includes documents with the last name Shah, you can use the power of a filtered alias. Look at this: I'm creating an alias, alias two, and I'm saying to include only Shah in it. So I post this alias. Now when I search it, you see Suhas is not there, because this is a filtered alias: when creating the alias we already entered a filter, so it will not include anything that does not have Shah. Isn't that amazing? That's pretty powerful; for all the BI tools and so on, you can create multiple aliases like this. Now you might ask, "Soumil, where did you learn all of this?" Well, I usually read a lot of Stack Overflow. I read this guy's post; I will put the links
in the description. He explains this there, and he also gives a very nice example. Say you have log indices: a 2018-02 logs index which has all the logs for February 2018, and another index, 2018-03 logs, which has all the logs for the month of March. Now if you want to query both indices at the same time, you create an alias called current logs which points to both of them, and then you can just query the data. It's pretty amazing. If you want to learn about aliases in more depth, you can go to the official website; they have a decent amount of documentation, and there's a lot more there. They talk about routing, so you can check that out. I tried my best to explain the basics and the fundamentals. The only things I have not covered, I guess, are the write index and routing, and they are pretty self-explanatory if you read the docs. On routing, it says it is possible to associate routing values with aliases, and that this feature can be used together with filtering aliases in order to avoid unnecessary shard operations. That makes sense. The following command creates a new alias, alias1, that points to index test, and after alias1 is created, all operations with this alias are automatically modified to use value 1 for routing. So this creates alias1, which points to the index test, and as the documentation says, it is used to avoid unnecessary shard operations: you automatically use value 1 for routing, and you could use value 2 for routing on another alias. I haven't explored routing in much detail, but yeah, this is what aliases are.
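The alias operations covered above (simple alias, filtered alias, one alias over several log indices, and moving an alias during a migration) all go through the same _aliases endpoint. Here is a minimal Python sketch that only builds the request bodies. The index name my_first_index follows the walkthrough; the spellings alias_1, alias_2, current_logs, 2018-02-logs and 2018-03-logs, the name field used in the filter, and my_second_index are assumptions for illustration.

```python
import json


def add_alias(index, alias, filter_query=None):
    """Build one "add" action; an optional filter makes it a filtered alias."""
    action = {"index": index, "alias": alias}
    if filter_query is not None:
        action["filter"] = filter_query
    return {"add": action}


def remove_alias(index, alias):
    """Build one "remove" action."""
    return {"remove": {"index": index, "alias": alias}}


def aliases_body(*actions):
    """Body for `POST /_aliases`; all actions in one request apply together."""
    return {"actions": list(actions)}


# Simple alias: searching alias_1 returns the same documents as the index.
simple = aliases_body(add_alias("my_first_index", "alias_1"))

# Filtered alias: only documents matching the filter are visible through it.
# The field name "name" is a guess; use whatever field holds the person's name.
filtered = aliases_body(
    add_alias("my_first_index", "alias_2", {"match": {"name": "shah"}})
)

# One alias over two monthly log indices, so both can be queried at once.
logs = aliases_body(
    add_alias("2018-02-logs", "current_logs"),
    add_alias("2018-03-logs", "current_logs"),
)

# Migration: move the alias from the old index to the new one in one request.
swap = aliases_body(
    remove_alias("my_first_index", "alias_1"),
    add_alias("my_second_index", "alias_1"),
)

if __name__ == "__main__":
    for label, body in [("simple", simple), ("filtered", filtered),
                        ("logs", logs), ("swap", swap)]:
        print(label, json.dumps(body))
```

Putting the remove and the add in a single request is a design choice: readers of alias_1 never see a moment where the alias points at nothing.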
Next, we will learn about Elasticsearch array mapping and how to search within those arrays, so we're going to learn some queries for that. We have a question from Cable Jain; he posted it a day ago and we are addressing it here. He has an array of nested documents, and he wants to ask whether, say, the main vendor is N and the priority is 1. The thing is, this is considered one document, but he wants to get only the matching element in the result. So let's learn a little about array mapping and writing queries for it. I have a very nice demo here for you guys. First of all, sorry for the background noise; I'm in India. So: PUT my_index/_doc/1. I'm going to run this, and it should create the index. Here you can see the index is created. Let's look at the mapping that Elasticsearch generated for us by default. You can see user, then under properties we have first, which is text, with a keyword sub-field, and so on. Now, say you want something like "user.first is John and user.last is Smith", and it should return only this element; I don't want the complete document returned. The problem is that this is one entire document, and you are trying to search inside it. You can use a script query, of course, but that's slow and not very efficient. So let's understand what Elasticsearch does. Let me copy this above so I can explain. When you index an array of objects, Elasticsearch behind the scenes flattens it like this: user.first, so all the first names become one array, and the last names become another array. That's how Elasticsearch does it, and I'm not saying
this on my own; it is there on the official Elasticsearch website. So if you do something like this, it's not going to work. We have bool must, which means an AND, where user.first is Alice and user.last is, let's say, White, so Alice White. It should return only that particular JSON object, but this is one complete record, and you are trying to search inside that record. If you run this query, it's going to match, because those two words are there in the document, but I don't want to get this whole thing back. How do I do that? Of course you can write code to filter that out, but is there a way to do it at query time? Yes, there is, and let me show you. That's part two: how to do this with nested documents, or arrays of nested documents. First of all, and this is very important, you have to define the mapping as nested: user is of type nested. You have to tell Elasticsearch explicitly that this field is nested. If you already have an index without that mapping, you cannot do this; you have to delete the index, because you have to do this when you create the index. So you do that: acknowledged: true. Now we index the same document. Now look at the query. I'm saying I want to look for an object like Alice and White. But see the catch: this query is wrapped inside a nested clause. The path is user, which means you're telling it to look inside this field, and inside that you write a query,
bool and must, so you must have this and this inside the same nested object, and inner_hits of course is for the inner hits. Check it out. Oops, I messed up somewhere: "nested object under path [user] is not of type nested". Hold on, I got something wrong; it's fine. Let's do that again. Okay, now it's done. Let's put the document. Perfect. Search, and there you go. Now you have to understand how everything works here. First you have hits, and in those hits you have the outer document that matched. Let's collapse the inner hits here; sorry for the background noise. So we have this hits, and in it one more hits, and in each entry two things: _source and inner_hits. The _source is the outer document, and inner_hits is what you want to look at. In the _source under inner_hits you see the objects that matched, so Alice and White. If I collapse the inner hits and open the outer _source, that's the main source, where both objects are there. If you want to filter by inner documents, you go to inner_hits, and remember the flow: inside inner_hits you go into hits, hits, and then _source. Hopefully that makes sense. Now you would say, "Soumil, what if I want Alice or first name John?" Piece of cake, guys. Let me show you: should. Okay, user.last; actually, let's say first is White, and I want to put it in the OR section; let's say user.first is John, Alice. Let's see what we get. Okay, so run that. Oops, I
think I have a syntax error. I always do this. It's kind of hard to find the typos, especially without my specs. Okay, we'll fix it, no worries. Let's try that one: user is White; bool should match user.first. Is there a first name White? Oh, of course there is no White as a first name, so it should be Alice. Try that one. I'm not interested in the outer hits; I'm more interested in the inner hits. I told you the format, remember: hits, so that's the outer hits. One sec. Okay, so now you see White is there, with first name and last name, so Alice and White. That's that. But here is an interesting thing. Say you said, "Soumil, that's correct, it's working fine, but I just need the first name, I don't want the last name", so you still want to filter it down. I don't know if this is going to work. Usually in queries you can use a _source attribute, like this, with user.first. Let's try that, and it seems like I cannot do that for some reason: "query malformed, no start_object after query name". I think I cannot do it here. Let me look at the query for a second. Usually it goes above the query; of course it's not going to work here, I should have put it somewhere here. That should work. It still does not work. That's fine; we can of course filter that out in code, it's not a problem. That's interesting. I guess I have to go to the outer level, so maybe there. But the thing is, this is going to filter the outer document. If you look at hits, hits, and then the inner hits, we still have the last name there. So this filters the outer hits, that's the thing. For outer hits it's fine, but if you want to
filter the inner hits, I mean, technically it should work if I put it here, but for some reason it did not like that. Maybe that way? No, it did not work. Maybe here? No, it's not going to work, I'm pretty sure. It's fine: you can write JavaScript code on the client side or on the server side, and that should be pretty simple. So that's that. What else? Of course you can use the should section, the must section, the must_not section, and all of that to filter documents. If, say, the name is Alice and you want to filter by a last name that matches a regular expression, you can do that; you can keep adding as many queries as you want in the boolean sections. Hopefully this gives you a very good start on Elasticsearch mapping and on filtering nested JSON documents. I'll put the links in the description; this is where I learned from. They have an example here showing how to use these queries. So for nested fields you have to define the mapping first, then you put the document, and then the query should be of type nested, and you can put it in the bool section. They have more, too: you can do highlighting as well. So that's that. Next, I'm going to talk about how to update records in Elasticsearch. Let's get started. I'm opening Dev Tools here to show you. I have the Netflix data set, which I'll use to explain this. I'm just going to run a random search query here; I think I have to provide the JSON body. Okay, so I'm going to play with this particular record. I'm going to copy it to a notepad.
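Looking back at the nested part of the walkthrough, the three pieces (nested mapping, nested query with inner_hits, and walking hits, hits, inner_hits, hits, hits, _source on the client) can be sketched in Python as below. The index name my_index and the field user with first and last follow the demo; the helper names and the sample response are assumptions. The response is hand-written and heavily trimmed, not real Elasticsearch output.

```python
import json

# Body for `PUT /my_index`: `user` has to be declared nested when the index
# is created; it cannot be switched on an existing field.
mapping_body = {"mappings": {"properties": {"user": {"type": "nested"}}}}

# Body for `PUT /my_index/_doc/1`: one document holding an array of objects.
document = {
    "user": [
        {"first": "John", "last": "Smith"},
        {"first": "Alice", "last": "White"},
    ]
}


def nested_user_query(first, last):
    """Both conditions must hold inside the SAME object of the user array."""
    return {
        "query": {
            "nested": {
                "path": "user",
                "query": {
                    "bool": {
                        "must": [
                            {"match": {"user.first": first}},
                            {"match": {"user.last": last}},
                        ]
                    }
                },
                "inner_hits": {},  # ask for the matching inner objects too
            }
        }
    }


def matched_users(response):
    """Follow hits -> hits -> inner_hits -> user -> hits -> hits -> _source."""
    found = []
    for outer in response["hits"]["hits"]:
        for inner in outer["inner_hits"]["user"]["hits"]["hits"]:
            found.append(inner["_source"])
    return found


# Hand-written stand-in for a search response (assumption, trimmed).
sample_response = {
    "hits": {"hits": [{
        "_source": document,  # outer hit: the whole document
        "inner_hits": {"user": {"hits": {"hits": [
            {"_source": {"first": "Alice", "last": "White"}}
        ]}}},
    }]}
}

# Client-side trimming, as suggested in the talk: keep only the first name.
first_names = [user["first"] for user in matched_users(sample_response)]

if __name__ == "__main__":
    print(json.dumps(nested_user_query("Alice", "White"), indent=2))
    print(first_names)
```

For the "Alice or John" variant, the must list would be replaced by a should list inside the same bool clause.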
That's the record; let's play with it, guys. First, I'm going to teach you how to update a key-value pair. Let me grab the ID. Okay, that's much better. Not the dev tool; I don't know why that opened up. So let me get the record: GET netflix/_doc/ and the ID. Run it. Okay, so you have the data here. Now I'll add a new key, let's say name, with the value Soumil Shah. Let's try that. I have a template here: first you give the index name, then the ID of the document, then the script, and in the script you set whatever key you want, so name. Now I run this. All the snippets are in the description, so you should not worry about them. Okay, updated. Now if I get the document again, that record has a name attribute. Perfect. Okay guys, how do I delete that? I have a snippet for that as well. So easy: netflix, _update, the ID of the record, then ctx._source.remove with the key you want to remove, the key called name. Updated. And you see it has been removed. Now let me also talk about updating fields. For example, let's change the title. Alternatively, you can do it this way if you want to update things: the ID, then title. That should be fine. Updated. Let's get the record: the title has changed. One thing: if you update things, you cannot revert them, because you have already made the changes, so you have to be very careful. So I hope you have enjoyed this quick tutorial on the ELK stack, on updating or deleting a key-value pair in a document. I'll put the snippets that we have
done in this tutorial in the description of the video, all of them. Next we are talking about Elasticsearch analyzers: what analyzers are and how you can leverage them in Elasticsearch. So let's get started. I have some very good examples which will make things easy to understand. Let us see the types of analyzers. Elasticsearch offers the following analyzers: standard analyzer, simple analyzer, whitespace analyzer, stop analyzer, keyword analyzer, pattern analyzer, language analyzers, and fingerprint analyzer. Now, the very first question anybody who doesn't know about analyzers would ask is, "what is an analyzer?" To answer that, let me show you some of the Elasticsearch documentation, and then we'll understand these analyzers in detail by looking at examples. I'm on the official Elasticsearch website. It says Elasticsearch ships with a wide variety of built-in analyzers, which can be used in any index without further configuration. The standard analyzer divides text into terms on word boundaries, as defined by Unicode. The stop analyzer is like the simple analyzer, but also supports removal of stop words, words like "the". I will explain all of these with a nice example, but let me just show you what happens. I wrote a sentence here: "hello my name is soumil shah i'm a full stack software developer.111". By default, if you don't specify anything, it uses the standard analyzer. What I mean is that the entire text is broken down into tokens, words: hello, my, and so on; you can see this array. So whenever you say, "find me the word hello in this", it's going to quickly grab the
word hello and give it to you. That's the standard analyzer. Then we have the simple analyzer. With the simple analyzer, you see the number 111 is gone; the numbers are dropped. So if you have an application where you want to get rid of numbers, you can use the simple analyzer. If you want to read more, you can always go to the website: the simple analyzer divides text into terms whenever it encounters a character which is not a letter, and it lowercases all terms. Makes sense. Then comes whitespace: the whitespace analyzer splits the string on whitespace, which makes sense. The length of the array is still 12 here, and if you look at developer.111, since there was no whitespace in it, it stays as one token, and at index time it will be stored as that word. So it's very important to understand how analyzers work, because the analyzer you choose determines which results you get. For example, with the stop analyzer, you see the full stop was taken out, the 111 was taken out, and all the stop words were taken out. Similarly, keyword means the entire text is treated as one keyword. We also have the pattern analyzer and the fingerprint analyzer. That gives us a good idea of analyzers. Now the question might be, "okay, how do I use it?" I have a very nice example for you guys, so let's learn that. I'm going to create an index which has a field called my_text of type text; basically just creating a field. Before I run this, I need to delete the index, because I think I already have it. So I'm going to get rid of the old indices. Now I'm going to start with this one. So we created the index, and you can verify that with this command. For example,
the name of the index was my_test, right? Everything is okay. Now we post one document into it: "hello my name is soumil shah. okay i am". It's basically a sentence, or whatever you want to call it. I post that and we get an acknowledgement, and when we search for all the documents we should see that sentence. Now, here is where I need to explain things. At index time Elasticsearch converts this into an array, hello, my, name, is, soumil, and so on, because by default it uses the standard analyzer; remember that. So these are the words. I can prove to you how it works. For example, GET my_test/_search with a very basic match query; the field name is my_text, because we just have one field. Let's search for the word "the". It seems like we did not match anything. Let's try the word "hello". You see, it did match; hello was there. What if I try it capitalized? It still matches. Makes sense. If I search for "hello" followed by anything else, it still shows up. Let me try something like "hello xyz xyz": this searches for the word hello and for xyz, and because a match query by default needs only one of the terms to match, it still matches the document. So that's how Elasticsearch works: it breaks the sentence down into tokens so you can search efficiently. This will make even more sense with a better example, so let me show you. I'm going to delete this now, and let's create an index with a custom analyzer. Actually, let me take a better example before that. Let's look at this one; this is a very good example which will clear up the concepts. So let's
say I just run the code. Oh yeah, the index is not created, so I need to create the index first. Let me do that. Okay, perfect, now I'm going to try that. Check this out, guys, this is important. When I use the test index, for which I did not specify an analyzer, the length of the array, which I have noted in a comment here, is 11. So there are 11 elements: you can see the, the number 2, quick, brown, and so on. As I said, at index time it breaks the text down into tokens. But if you do specify an analyzer, for example because you want to remove all the stop words like "the", then when you search for the word "the" it's not going to show up, because at index time you skipped it. I hope it's making a bit more sense now. So we know that Elasticsearch has all these fancy analyzers, but how can we define our own custom analyzer, and how do we actually use it? Let me show you a very simple example. Okay, so we have mappings, properties, and this is the field name, my_text. In the first example we had just type text, but now I'm also specifying the analyzer: analyzer is my_analyzer, the analyzer that I defined. Now, this is how you define the analyzer: in the settings you say analysis, which is a JSON body. analysis takes a key called analyzer, which is again a JSON object, and that takes the name you want to give the analyzer; in this case I just said my_analyzer, with type as
custom, because we are making our own analyzer. You can use a built-in analyzer too, like the stop analyzer. So let me explain: type is custom, tokenizer is standard, which is the tokenization you want to use, and we're just using the standard one. Now filter; this is very important. Here are your custom filters. I'm saying that whenever you put any document into Elasticsearch, convert it all to lowercase; so if the sentence was "Hello", Elasticsearch interprets it as "hello". That's the lowercase filter. english_stop means we are defining our own filter called english_stop, because there's nothing built in called english_stop, so we need to define it. If you look below, in the filter section, I defined english_stop with type stop, and then I gave the stopwords, the stop words I want it to recognize. I'm saying whenever you encounter the words "the" and "over", remove them at index time. So if my text was "hello the tree is great", at index time it would be interpreted as hello, tree, is, great; the word "the" would be skipped. You also save memory that way. It all depends on your application, and you choose your analyzer accordingly. So let me run this. Okay, that was created. Now let me show you the difference, with the same sentence: "the 2 quick brown foxes jumped over the lazy dog's bone". If I run that here, check this out: the word "the" should be skipped at index time. You see the word "the" is not there, because in the stop words we defined that it should be skipped. So if you search for that word, it won't be there. Does that make sense, how elastic
search works? For example, with the test index, where I did not define my analyzer, you would see the word "the". Let me show you; check this out. See how easy it is? Once you start understanding it, you will really love learning this. So, as I said, my_test: now I'm going to query for the word "the" in it. I defined my analyzer, in which I said explicitly that this word should not be considered at index time, so if I search: no documents found. You see how it's working now. We specified "skip this word" in our analyzer, and that's the job of an analyzer; in layman's terms, that's how I would explain it. There are a couple more pieces. For example, we have character filters: you can say html_strip, so whenever you provide HTML, it strips the tags. I always have to show you with snippets, otherwise it's hard for beginners to understand. So it's broken down into "this is a test", depending on the analyzer you provide. Here the tokenizer is whitespace, so it breaks the text on whitespace. But if you set the tokenizer to keyword, the entire text is one token at index time; it won't be broken up. Let me prove it to you. Check this out: "this is a test". Same thing, and if I change the tokenizer to keyword, check this out. Makes sense, how analyzers work? I hope this is clear. Analyzers are used when you want to improve efficiency: you want certain words like "the" to be skipped at index time, you want to convert text to lowercase, and all that. That's when analyzers come into play. Now, at the end, I have a nice conclusion here. Whatever I explained, I have written down in plain English, so let's read that quickly. What
happens when you don't provide an analyzer? The default analyzer will be standard. Put simply, Elasticsearch converts the string into tokens for fast indexing. For example, the sentence "the 2 quick brown foxes jumped over the lazy dog's bone" in the field would be interpreted as this list of tokens. But if you provide a custom analyzer, say because you want to skip the word "the" or other stop words, that is applied at index time. So when you index the same string with that analyzer, this is what is stored instead. If you don't specify an analyzer, it keeps the stop words as well, like the word "the"; I showed you an example. These are the commands I was running in Kibana to explain this, and these are the references I used. I hope this clears up a lot of people's doubts about how to use analyzers. Lastly, I want to go over one more time how to write your own custom analyzer; I know it can be a little tedious for beginners. You know how to write mappings, and in the mapping you can provide an analyzer. If I don't want to define a custom analyzer, I can simply say stop, which tells Elasticsearch to use the built-in stop analyzer. If you don't want that, you can name your custom analyzer and then define it. You define an analyzer inside analysis, where you have analyzer, which is again a JSON object; it takes the name of the analyzer and all its attributes, for example filter, tokenizer, and type. type should always be custom, because it is a custom analyzer. Then whatever user-defined filter you reference, you define below the analyzer, right here. I would highly
encourage you guys to use Kibana, because it's pretty good for beginners: it helps you write queries and gives you snippets automatically. So I hope you have enjoyed this tutorial on Elasticsearch analyzers. I tried my best to explain this as simply as possible, because a lot of people overcomplicate these things, and I wanted to make it very simple for beginners or people who are learning this for the first time. Thank you so much for watching this video, and I hope you have enjoyed this small walkthrough of Elasticsearch queries. If you did enjoy this video, please give it a like, because it's very hard to create content; it takes a lot of effort and time. That's it. If you have any more questions, write me an email and I will get back to you. And as usual, guys: keep smiling, keep coding, never give up, and see you in the upcoming videos.
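To recap the custom analyzer from the analyzer part of this walkthrough, here is a minimal Python sketch under stated assumptions. index_body is the kind of request body one would send when creating the index; the names my_text, my_analyzer and english_stop follow the talk, while the stop word list (the, over) and the test sentence are my reading of the captions. rough_analyze is only a toy imitation of "standard tokenizer, then lowercase, then stop filter" for seeing the effect locally; it is not how Elasticsearch implements analysis, and it will differ on many inputs.

```python
import json
import re

# Stop words as understood from the captions (assumption).
STOPWORDS = ["the", "over"]

# Body for creating the index (for example `PUT /my_test` in Dev Tools).
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "my_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "english_stop"],
                }
            },
            # User-defined filter referenced by the analyzer above.
            "filter": {
                "english_stop": {"type": "stop", "stopwords": STOPWORDS}
            },
        }
    },
    "mappings": {
        "properties": {
            "my_text": {"type": "text", "analyzer": "my_analyzer"}
        }
    },
}


def rough_analyze(text, stopwords=()):
    """Toy stand-in: split into words, lowercase, drop the stop words."""
    skip = set(stopwords)
    tokens = [tok.lower() for tok in re.findall(r"\w+(?:'\w+)*", text)]
    return [tok for tok in tokens if tok not in skip]


# Sentence reconstructed from the garbled captions (assumption).
SENTENCE = "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."

default_tokens = rough_analyze(SENTENCE)            # no stop words removed
custom_tokens = rough_analyze(SENTENCE, STOPWORDS)  # "the" and "over" removed

if __name__ == "__main__":
    print(json.dumps(index_body, indent=2))
    print(len(default_tokens), default_tokens)
    print(len(custom_tokens), custom_tokens)
```

With the toy function, the default token list has 11 entries, matching the count mentioned in the talk, and the custom list no longer contains "the", which is why a match query for "the" finds nothing in an index analyzed this way.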
Info
Channel: soumilshah1995
Views: 7,390
Id: e5awiVnkuEc
Length: 75min 45sec (4545 seconds)
Published: Sat Mar 13 2021