Normalization - 1NF, 2NF, 3NF and 4NF

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hi this video is about normalizing database from normal forms one two three and four for the example I mocked up an imaginary company that sells video game consoles some of their products are Xbox one from Microsoft PlayStation 4 from Sony and the handheld PlayStation called PlayStation Vita from Sony and they also have two newsletters one is for Xbox one new releases and one for PlayStation 4 new releases signing up for these newsletters is optional it's not tied to any purchase member can sign up for either one of these or both of these or none of these air will it doesn't matter so these are some of their sales records they currently keep track of customer name the item purchased the shipping address what you letter this person sign up for what is the supplier of the item and what is the phone number of the supplier and what is the price of the purchase so they have someone named Ellen Smith these are just imaginary items but anyway did somebody am alan smith he bought an Xbox one he lives in Miami he subscribed for the Xbox newsletter the supplier for the Xbox is Microsoft and their phone number is supplier phone numbers is toll-free this is what somebody entered in here and the total price is 415 and there's another item named Ellen Smith it's the same as this at least the name is the same this person bought a PlayStation 4 he also lives in campus Road in Boston which is the same address as for this person Roger banks he subscribed for the PlayStation news Sony by Sony and the price was 300 and the color code just ignore the quarter code there's no meaning to that it's just for visual cue so anyway these are some of their sales records and I'm going to normalize these so first normal form the criterias are each cell to be single valued so Xbox one and PlayStation Vita occur in the same cell so is this so we have to eliminate that entries in the column are the same type so such as wholesome toll free here these are not the same these are the proper values so we need some kind of entry here that matches these it would be like let's say this field would be eye color and someone put like on a blue or green and beautiful well beautiful is not a color it's an opinion so we need some kind of a actual entry we can't we can't let this be in here also Rose uniquely identifying either add a unique ID or add more columns to make the Rose unique well these rows there's no duplicates in these roles so they unique in a way but there's still a question about this Ellen Smith well this campus wrote in Boston happens to be a housing for a college so this this Roger banks rents a house while he's in college and Allen Smith also rents the same house with them they ran it together so they have the same address while they are in college so the question is now is this Allen Smith from Miami is going to college in Boston and this is the same guy it's just now this is his address or they are completely different people from this table we cannot figure that out so those are the issues so this is the table in first normal form so each cell to be single valued so the Evan wheels on these two items will separate it out to two separate rows one for Xbox one and one for PlayStation Vita notice that this has four roles here and this has now five because he made two purchases in one transaction so that takes care of that entry in the column are the same type well wholesale and toll free was replaced with the proper values so now we don't allow entry of just random values in here they have to be filled out properly and the rows must be uniquely identified so these people are made to create some kind of customer ID that they can sign up sign in to this website so this person his name happens to be Allen Thomas Smith so he created user ID 80 Smith Rogers Bank has Roger 25 Evan Wilson is worse than 44 and this person his name happens to be Allen Michael Smith so his his a user IDs AM Smith so now we know for sure that these two people have nothing to do with each other and now we can uniquely identify each of these customers okay second normal form all attributes meaning non-key columns meaning these these white ones depending on the key which is this one this red so which one of these don't depend on the key well let's see how about the price so does the customer ID determine the price in other words does the price depend on who buys it and the answer is no it doesn't matter who buys an Xbox one the price will be the same when when Ellen Smith or Evan Wilson buy the price will be the same so the the key in the table doesn't determine this so that we have attributes many columns that don't depend on the key so those have to be separated out so what I did is I take the customer ID the name the shipping address and the subscription into one table and then I take the item which becomes the primary key in the other table the supplier supplier phone number and the price into another table and obviously I'm only going to list one item one time so now the price the supplier and the supplier phone and the supplier depend on this because Xbox one is Microsoft's supply this is their phone number and this is the price and this customer has this is his name this shipping address and this is what is subscribed to obviously this person subscribe to - so he's in here twice so this is great the only problem is we lost the transactions we don't know who bought what so for that there needs to be a table called the junction table which is this and basically these in this table these are both of these are a key this is called a compound key because it's made up of two columns and now these entries from a tea Smith bought an Xbox one is over here so this guy bought this a Roger 25 by the PlayStation 4 Wilson 44 bought Xbox one and the Vita and a and Smith bought a PlayStation 4 so the transactions are here and now this and this our lookup table so basically AM Smith bought an Xbox one who's a I'm sorry a t Smith who who's that guy well his name is this he lives here and he subscribed to this what's Xbox one Xbox one is supplied by Microsoft this is hi buy more of it if you run out and this is the price so this is nine second normal form because every every attribute depends on the key we separate them out great third normal form all fields meaning columns can be determined only by the key in the table in no other column okay so what can we the problem here well the problem is that Microsoft and the by Xbox phone number always goes together to Sony and the by Sony phone number always goes together it even says in here this phone number is never going to go with Microsoft and this phone number is never going to go with Sony so in the item table if the only thing we would know is Microsoft we already know that this is the phone number or if we know Sony when we know it's going to be this the phone number so obviously we need to separate it out also because these phone numbers keep repeating if I would add more items here okay camera for PlayStation a camera for Xbox controller blah blah blah you would notice that Microsoft Sony marks on Sony and their phone numbers keep repeating so if I would change if I would have to change the phone number I would have to change it in a lot of different places and that's not it's not normalized properly how about here can I figure out let's say the address from the customer name well Allen Smith is one customer name there's an address from Miami let's see there's another address from Boston so I guess there is no one-to-one relationship how about backwards let's see campus Road goes with Roger banks and goes with a so I guess there isn't a one-to-one relationship between the name and the shipping address newsletter well that's definitely not going to happen because a person can sign up to multiple newsletters and multiple people can sign up to the same newsletter so that's there's not going to be a same relationship as here we can't just separate it out just yet so what I did is basically separate this out this for now is going to remain unchanged but this is separated out so I would remove the the phone number column in here so now the item a is supplied by Microsoft and it's 250 but the phone number is listed here so now if I need to change the phone number for either one of these I can I can do it in one place and not there's no redundancy and obviously the supplier is now its own primary key in its own table and it's a foreign key in the item table and the foreign key simply means that in here you can only enter values that appear here if it's not in here and none in this table you cannot put it in here let's say if you want to put in supplier Revlon or L'Oreal or something like that those are not those supply cosmetic products or what none so they if you don't sell those you can just enter in here it has to be in the supplier table first before you can put it in here that's what the foreign key means okay so now all columns can be determined only by the key in the table and no other column well this side is done for sure there's no no further way to normalize it there's only this side left so fourth normal form no multivalued dependencies well what does that mean what's a multivalued dependency it's basically this this field depends on the customer ID this address depends on the customer ID and the subscription depends on the customer ID so these are all dependencies of the key however there's still a problem as you can see Evan Wilson is in here twice with the same address because he subscribed to two different newsletters so if he would change his address I would have to change it two places or what if it would unsubscribe to this thing then then what do I just put a null here or try to deke lose my database because now I have two entries or what if it answers graph or both or and then I'm going to have to knows and two of the same values or same thing if this person signs up for a new the Xbox newsletter I repeat his user ID and the customer name and shipping ad is just just basically like you either leave it as null or just randomly repeated because what what are you going to put there so the multi-value dependency is a situation where one column can have different values a different amount of values then let's say the other column for example in this database I didn't allow multiple shipping addresses per customer every customer has one shipping address but they have multiple newsletter but in real life let's say if you have an Amazon account you have multiple let's say credit-card numbers on file and the checkout at checkout you choose which one you want to use and you could have multiple shipping addresses one for home one for work so for one person you could have at let's say two shipping addresses and four credit cards well how do you make a proper table out of that the amount of rows in this column is not the same as in this column let's say for example I would have one shipping address and two credit cards and then I would add another shipping address what would I put in the credit card field just randomly select one of them you see how that's that's a situation where you have these fields depend all on the key which is called the multi value dependency but there's different amount of values that can be put into each row I mean I'm sorry in each column so obviously at this point this needs to be separated out and what I separated out is the customer ID customer name and the shipping address is going to be one table and the newsletter and the customer ID will be another table this these remain the same so as you can see now every customer only appears once Wilson 44 was eliminated twice because we don't need that so now if I need to change his address he's only in here once I don't have to change it twice and these are his description he's in here twice so if the query that returns the who what kind of newsletters need to go out will return him properly because he'll be in here twice if this person signs up to another one he'll be in here twice and the query will return it accordingly or let's see Roger unsubscribes he'll be removed and the query that returns the newsletters will not return his name but his address will be still intact here no problem these are the relationships between the tables and and also gave the tables names because now I know what they are so this means this one-to-many this means the customer appears in this table one time but in the sales invoice table he will appear multiple times an item will appear one time in the items table but will be sold multiple times supplier will appear one time but can supply multiple items and the customer will appear one time here but he the person can have multiple subscriptions or not at all none at all so now this table is fully normalized and thanks for watching
Info
Channel: channel5567
Views: 1,739,202
Rating: 4.8369255 out of 5
Keywords: 1NF, 2NF, 3NF and 4NF, Database Normalization (Literature Subject), First Normal Form, Second Normal Form, Third Normal Form, Fourth Normal Form
Id: UrYLYV7WSHM
Channel Id: undefined
Length: 19min 1sec (1141 seconds)
Published: Fri Aug 14 2015
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.