Solving SQL Interview Query using a "VERY IMPORTANT SQL concept"

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hey guys tawfiq here in this video let's try to solve one more real SQL interview problem now this was shared to me by one of my Learners from my recently concluded SQL course and I had faced a similar problem in one of my earlier SQL interviews okay now again in this problem the problem statement is actually very interesting but the solution is very simple if you understand one of the most important concepts of SQL okay now straight away let's look at what the problem is and try to solve it so as you can see the problem statement is actually pretty straightforward we have been given a input table and we are expected to write a query which is going to convert this input data into something like this so this one is basically our expected output okay now if we carefully look at our input data we can see that we have three different columns so we have Source destination and distance so this looks like this is a table which kind of gives you the distance between two different cities so you can see here I have the first record sources Bangalore destination is Hyderabad and the distance is 400 and the second record is kind of like the repeated data but the source and destination are interchanged right so here the source is Hyderabad destination is Bangalore and the distance is again 400 okay and the same thing is repeated for a few other cities so we have Mumbai and Delhi as 400 and then Delhi and Mumbai as again 400 and then we have Chennai and Pune as 400 and then again Pune to Chennai is again 400 okay now just by looking at this data we can see that the data is repeat for the same cities right so the distance between Bangalore and Hyderabad and the distance between Hyderabad and Bangalore are basically should be the same right and that is why in our output we only have one record for Bangalore and Hyderabad and the distance is again 400 okay so just by looking at this input data and looking at the expected output can you imagine what concept of SQL you could use to solve this problem now I'm sure you can solve this problem in many different ways but the solution or the concept of SQL that pops into my mind using which you can solve this problem pretty easily is self-join and the reason why I say self-joint is because I can see that I have this table here okay and I can see that the source present in this column is also repeated in my destination column right so if I could join this same table to itself okay so I take one copy of this table and try to join it with another copy of the same table and I can join it using either the source column or the destination column okay and when I'm doing that join probably I will be able to join these two records to itself elf and then somehow I could put an additional condition so that out of these two records I only fetch one record okay now I'm sure this is not very clear so in order to explain this better let me show you a sample now as you can see I have basically copied the same table twice and the first table I'm calling like T1 and the second table I'm calling like table two okay and I'm going to do a self join okay now what is a self join it's basically when you join the same table to itself okay based on certain conditions right of course you need to provide a joint condition okay now my requirement is when these two records will join with these two records I want to somehow only fetch one record okay so first of all how can I join this data from my table one with the data present in my table too I can easily see that the source in this table is Bangalore here in the first record and that same Bangalore is also present in my destination column in my table two okay so somehow I could join this Source here with my destination here so hopefully I will be able to get a successful join okay but in addition to that I also do want to join the same record of my source table or basically my left table with the same record of my right table right whenever you are doing a left join you kind of try to avoid joining the same record to itself because then you will get duplicate records right so in order to avoid that and since here I do not have any unique identifier in any of these tables what I'm going to do is I'm going to add a unique manually I'm going to create a unique identifier so I'm just going to say one two three four five and six okay and I'm going to copy this and maybe I'm going to call it like ID or row number or whatever okay so I can easily create this ID for each record by using the row number window function okay I'm just going to use that and once I have this ID when I'm processing the every record from my left table when I say left table it's my table one okay I'll make sure that I'll put ID is kind of like maybe not equal to the ID from this table or maybe the ID is less than the ID from this table okay so I can just say maybe I'm going to call it like ID okay just so that it becomes clear here and I'll make this as ID as well okay and what I want want to do is ID from this table should be less than okay so maybe I'll call it like T1 dot ID is less than C2 dot ID okay so what basically will happen is and so this is my first join second joint condition okay and my first joint condition would be T1 dot Source okay source is equal to P2 dot destination okay so I'm just going to clip it like this okay now what I'm saying is this Source here that is Bangalore okay will join okay so let me Mark it in yellow we'll join with destination here which is Bangalore that is the second record okay so this record here will have a join with this record here additionally I want to also check that the ID from this table is less than the ID from this table because since I am doing a self join every record of this table will try to join with every record of the other table right but I want to make sure that for every record I only be able to join with one record from the other table so my final output will only we have one record for the same cities right so I'll say ID less than the ID in this table so when this record tries to match with this record the ID condition will fail so this will not get processed only these five records will get processed and out of these five records only Bangalore and Bangalore will match so only I'll get this will be matched okay so this will probably match okay and this will be successful and when I am processing the second record here here I can say that my source Hyderabad okay will try to match with destination Source here this I have Hyderabad here right in my very first record but the ID condition is the ID from this table that is 2 should be less than the ID in this table here the ID is one two less than one is actually false So This Record will not match okay so this will not match okay and the same thing will happen for every other record okay I hope it makes sense you kind of get an idea of what I'm trying to explain okay not very easy to explain but I hope you get an idea okay now what I'm going to do is I'm going to convert this explanation into a query hopefully that will make more sense okay now I have basically created the same table with the same data as you can see here okay now the very first thing that I want to do is since I want to have an ID a unique identifier created for my table what I'm going to do is I'm going to create a unique identifier here okay so I'm just going to say this is star and let's say I am going to have a new column here and I'm going to use a window function row number to create the unique identifier and I'm going to give the Over clause and I don't want to do any partition or any sorting because I just want a unique identifier I'm not really bothered about the order of the data in this case okay and I'm going to call this like let's say ID okay and if I run this now you can see that I have a unique identifier or unique number for each record okay this is fine now since I want to do a self join and since I don't want to repeat the same query twice I'll just use a CTE table so I'm just going to say with cpes okay and I'll just move this to the right I'll put this here okay and then I'm just going to say select from CTE and this will be let's say my table one and I'm doing a self joint so I'll join it with the same table again and I'm going to give an alias like T2 okay and now comes my join condition okay and here I'm just going to say T1 dot my source okay as I explained my source from this table I want to join with destination of my table 2 so I'm just going to say equal to T2 dot destination okay that's fine and additionally I also want to do a check for the ID so that the ID that is present in this table should be less than the ID from this table so two records does not get matched because I have two records with the same cities right I only want one of them to match okay and that is why I'm just going to say T1 dot ID is less than T2 dot ID okay and then when I'm fetching the data from here I'm just going to say let's say T1 dot start okay I will only fetch the data from this table because I know that this will get matched so I want Bangalore and Hyderabad and then here only Mumbai and Delhi will get matched so I want this and here only Chennai and Pune will get matched and here and this is what I want okay and because if I look at my input output here this is exactly what I want okay so let's go back here and I'm just going to say t1.star and I think that's all now I'm going to run this and now you can see that I have got my expected output okay and instead of doing a star I'll just tell Source comma T1 dot destination comma T1 Dot distance okay and if I run this this is exactly what I wanted so Bangalore to Hyderabad one record Mumbai to Delhi one record Chennai to Pune one record okay I hope this makes sense I hope you kind of understood my explanation and thinking behind solving this problem I'm sure you can solve it in many different ways if you have a different solution definitely share it in the comments below okay thank you so much for watching and see you soon in the next one bye
Info
Channel: techTFQ
Views: 90,826
Rating: undefined out of 5
Keywords: sql, sql queries, sql interview questions, self join, sql interview query, solving sql query, practice sql, practice sql queries, interview problem, techtfq sql, sql interview questions and answers, data analyst, interview preperation
Id: WhkNQ3g0U64
Channel Id: undefined
Length: 9min 38sec (578 seconds)
Published: Mon Sep 19 2022
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.