How Kafka works internally, Install & configure Kafka and Zookeeper on local machine.

Captions

Hi all, thank you so much. As you can see on my screen, today I'll be discussing Kafka installation, and other things as well: what Kafka is, how it works internally, and what ZooKeeper is. These are the key things I've noted here: broker and replication; topic, partition, segment and offset; producer and consumer; and the tools for exploring your Kafka cluster, such as the offset details, the partition details and so on. We know Kafka is a messaging system, and as of now it is widely used across organizations. At the end we'll discuss how to install Kafka, and that will be on my local machine, the Windows version. How Kafka works, I'll take up that question at the end. I've tried to put everything together so that you have some initial knowledge if you're starting to work with Kafka.

Starting with what ZooKeeper is. ZooKeeper is, you can say, a container which coordinates with your Kafka cluster. It knows when a topic was created, deleted or edited, when a broker died, and so on. So ZooKeeper is a sort of tool which manages your Kafka: it works in sync with the cluster and holds all of the Kafka cluster's information. You can install ZooKeeper on the same machine as the Kafka cluster, or, if you want, on any other machine, so it can be standalone or on the same machine.

What are broker and replication? A broker is basically a server. You may have multiple brokers in a single Kafka cluster, so this is something which goes inside your Kafka cluster; it is a server. And replication: replication is based on, I mean, as
to how many replications you have defined, if you want to keep the data replicated on any server.

Now, what is a topic? I'll not go into detail, I'm just trying to touch upon the key things. I'll try to draw a shape here. A topic is something like a folder to which you give some name; I'll put here that it's a test topic. So you have given it the name test topic, and it has, I mean, it cannot have zero, it must have one partition or more. So when you create it you may have multiple partitions, or you have at least one partition. The topic is nothing but a folder or a name, you can say a namespace, in which your multiple partitions live.

Now, a partition is, I would say, a section of your topic. As you create the topic you also provide the number of partitions you want to create in it. The partition is the area where your index file, your log file, everything will be stored. That is what a partition is.

And the segment you see below the partition. A segment has two files: a log file and an index. Basically these hold the detail of the data, the offset value and the index value. So a segment has two things, the log and the index. And the offset is what the segment ultimately comes down to when you're accessing any data: every record has its own offset value, I mean a key value, which is the ID of that particular location or record. That is what an offset is. So
an offset is basically, if I have to show it here, I'll just write something down. If this is your P0 partition and you have data one, two, three, four, five, these are the offset IDs: one, two, three, four, five. Any data coming into it will have offset value six. So the producer will write the data into it, and it goes in by appending at the end, so the data goes into it like a linked-list sort of collection. Any new data will be appended here and will have its own offset. So one, two, three, four, five are basically offset IDs.

Now, if you have any consumer, the consumer is trying to consume the data from it, and you might have multiple consumers as well: consumer one, consumer two. If your current offset is five, I mean the total data is five, and consumer one has read up to offset two while the second consumer has read up to offset four, then every consumer has its own offset ID. They will know it, they'll remember it, and the partition holds the maximum. You may have multiple consumer groups, and every consumer group will have its own name as well, so you'll be able to know which consumers are accessing your data. That is what the offset is about.

Producer and consumer are very well known: the producer is something which writes data into your Kafka topic, and the consumer is whoever accesses your data. You may have zero or more consumers, and a producer to write data into the topic.

Kafka UI tools: there are multiple tools available as of now through which you can explore the cluster. You can see the topics, you can create and delete them, and you can also see the partitions and further detail. As of now there is Offset Explorer, and it's free for, I mean, single use, I would say, not for a company or an organization. So these are the things I have
captured here. Again, just revisiting it: the partition is Kafka's unit of storage, basically, and partitions are split further into segments. A segment has two files, the log file and the index file. The log and index store the values in a form like a hash map, with key and value. So a segment has two things, the log and the index, and both have their offset IDs, which basically identify a particular record.

Okay, so I'll jump into installing this on my local machine. First of all, this is the kafka.apache.org download page, which you can visit, and from here you can download any version. I've downloaded it, uh, 0.12, and, let us see, 2.13, you can see. So you can download this zip. I've already downloaded it, so I'll go to [Music] my folder location, and I'll also let you know what changes I have made.

This is the Kafka I've downloaded. Inside data, what I have newly created is two folders, kafka and zookeeper. Because both Kafka and ZooKeeper are running on the same machine, I wanted the logs to be separated, so I created two folders. Inside config you see zookeeper.properties and server.properties. These two properties files hold the ZooKeeper and Kafka cluster configuration details. For ZooKeeper, the change I have made is just the data directory: I have given the path of the new folder that I created. Similarly, in server.properties I have given the path of the kafka folder. This is manually created; otherwise it will use the default one. The second thing: ZooKeeper has the default port 2181.

Okay, so I'll move to this one. ZooKeeper, we know, is used for distributed coordination [inaudible], and ZooKeeper has 2181 as its default port, and similarly your Kafka server is
having 9092 as its default.

Once you're done, I mean you have made those two changes: first you created the two folders inside data, and then you changed the data directory in the ZooKeeper properties and the server properties. That's all the changes we make, and then we can simply go on to starting ZooKeeper and the Kafka server.

To start ZooKeeper, this is the command, up to here, I mean zookeeper-server-start.bat. Since this is installed on my Windows operating system, I'll use the batch file; if you have Unix you can use the .sh file. You'll have the scripts for both, I mean the .bat and .sh files for Windows and Unix machines respectively. After that you give the properties file. This part is not needed, I'll just take it down. So up to here is all you need to start it: you can open cmd and run it. This is the ZooKeeper I've already started, and this is the command: the start batch file and then the properties file for the configuration.

Apache Kafka, we know, is a scalable messaging queue, as we have already discussed, and it is used for I/O operations. It can handle lakhs of records, you know, it scales horizontally, lakhs of records in very little time. We know it's open source also, and most companies are using this Apache messaging service.

To start a Kafka server you can again use cmd, and you point your cmd at that particular folder. I mean, you need to go into bin and then windows, which you see here. For Windows you have a different section, and these others are for Unix machines. So you need to go to windows, and here you have the command you can use, followed by the Kafka server.properties file. Again, this is the port I've just noted here for information; it does not need to be in your start command. So this is the command to start the Kafka cluster. And if you want to create any topic you can use the kafka-topics
batch file, again with the create command. For ZooKeeper, you need to give it the ZooKeeper details. ZooKeeper is also installed and running on my local machine, so I've given localhost and then the port. The replication factor as of now is one, because I do not have multiple brokers, and for partitions I've given one. Here you can specify the partition value. This is the name, you know, test. So for this test topic I'm creating one partition only; you can give a larger partition value here. So this will create your test topic.

If you want to know how many topics there are on a particular Kafka cluster, you can use this command. I'll try to show it here; I have another command prompt. As soon as you run kafka-topics.bat with the list command and give the ZooKeeper details, it will list out all the topics. Let me just see if it is running. Let me restart it, give me a second. Okay, so ZooKeeper got started. Similarly for the Kafka cluster, let me kill this existing instance and then I'll try to restart it. Okay, so this has started now, so now we can use it. Yeah, as soon as the ZooKeeper server instance got restarted it gave me the output. So with this command, the kafka-topics batch file, if you use the list command you will get all the topic details. These are my topics: test topic one, test, and the others.

If you want to describe any topic you can again use the kafka-topics batch file, and describe is the command. So if I use this, the kafka-topics batch file and then the describe command, for test topic one, which I want to know more about, it will give me the details of this topic. Just a second, I think it has started. We need to have ZooKeeper running. I think it's throwing some exceptions. Okay, this cluster is not getting started, so let me restart it from fresh, sorry. So let me
try to restart it again. I'll kill this instance. Yeah, now it has run; you can see the describe command has worked. So I have this topic one, and it has three partitions: zero, one and two. The replicas value is zero as of now, and the ISR also, I mean the data is not being replicated anywhere, so the leader and replicas are all zero only; I do not have any further brokers. So these are the details of this topic.

Similarly there are other commands. If you want to delete any topic you can delete it using the delete command; you need to pass the topic name and the ZooKeeper details.

So that's all I wanted to jot down here, and this is, I would say, a brief description, a definition of the important things. If you want to, you can install it on your machine, and you can write the producer and consumer to write messages into a topic and read messages from that particular topic. I've already created some videos in which I've used a Spring Boot solution to write messages into a topic and also consume the messages from there, and there are other videos as well on my YouTube channel, so you can go through them.

So yeah, that is pretty much it for this, I mean for the initial, basic information that I thought to put together in order to know Kafka in brief. Thank you so much for your presence and time. See you.
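
The offset example in the captions (records appended at the end of a partition, with each consumer group remembering its own position) can be sketched roughly as below. This is a toy in-memory model, not Kafka client code: the `Partition` and `ConsumerGroup` classes and the group names are hypothetical, and the sketch numbers offsets from 0 as Kafka does, whereas the spoken example counts from one.

```python
class Partition:
    """Toy append-only log: a record's offset is its position in the log."""

    def __init__(self):
        self.records = []

    def append(self, value):
        # New records always go to the end; the offset is assigned on append.
        self.records.append(value)
        return len(self.records) - 1

    def read(self, offset, max_records=10):
        # Reading never removes data, so several groups can read the same records.
        return self.records[offset:offset + max_records]


class ConsumerGroup:
    """Each group remembers its own position independently of other groups."""

    def __init__(self, name):
        self.name = name
        self.next_offset = 0  # offset this group will read next

    def poll(self, partition, max_records=10):
        batch = partition.read(self.next_offset, max_records)
        self.next_offset += len(batch)
        return batch


p0 = Partition()
for value in ["a", "b", "c", "d", "e"]:
    p0.append(value)

group_one = ConsumerGroup("group-one")
group_two = ConsumerGroup("group-two")
group_one.poll(p0, max_records=2)   # group one has read two records
group_two.poll(p0, max_records=4)   # group two has read four records

new_offset = p0.append("f")         # the producer appends a sixth record
print(new_offset, group_one.next_offset, group_two.next_offset)
```

The two groups end up at different positions over the same data, which is the point made in the talk: the partition only grows at the end, and each consumer group tracks where it has got to.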
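
The log and index files mentioned in the captions can be pictured with a small sketch. It assumes a simplified model in which the index holds an entry for every second record and maps an offset to a position in the log; a real Kafka segment indexes by bytes written and stores byte positions, so the `Segment` class and its `index_every` parameter are illustrative only.

```python
import bisect


class Segment:
    """Toy segment: 'log' holds the records, 'index' maps some offsets to log positions."""

    def __init__(self, base_offset, index_every=2):
        self.base_offset = base_offset
        self.index_every = index_every
        self.log = []    # stands in for the .log file: (offset, value) in append order
        self.index = []  # stands in for the .index file: (offset, position in log)

    def append(self, value):
        offset = self.base_offset + len(self.log)
        # Only every index_every-th record gets an index entry (a sparse index).
        if len(self.log) % self.index_every == 0:
            self.index.append((offset, len(self.log)))
        self.log.append((offset, value))
        return offset

    def read(self, offset):
        # Binary-search the index for the last entry at or before the offset...
        i = bisect.bisect_right(self.index, (offset, float("inf"))) - 1
        if i < 0:
            raise KeyError(offset)
        position = self.index[i][1]
        # ...then scan the log forward from that position.
        for record_offset, value in self.log[position:]:
            if record_offset == offset:
                return value
        raise KeyError(offset)


segment = Segment(base_offset=100)
for value in ["a", "b", "c", "d", "e"]:
    segment.append(value)

print(segment.index)
print(segment.read(103))
```

The index is kept small on purpose: a lookup jumps to the nearest indexed offset and then reads forward through the log, rather than keeping one entry per record.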
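
The configuration edits and commands walked through in the captions can be collected in one place. The sketch below only builds the property lines and command lines as strings; it does not run anything. The install folder `C:\kafka` and the `data\kafka` and `data\zookeeper` folder layout are assumptions, since the captions do not give the actual paths. The `--zookeeper` flag matches the releases current when the video was made; Kafka 3.0 and later removed it from `kafka-topics` in favour of `--bootstrap-server`.

```python
from pathlib import PureWindowsPath

# Hypothetical install location; the captions do not say where the zip was extracted.
KAFKA_HOME = PureWindowsPath(r"C:\kafka")
BIN = KAFKA_HOME / "bin" / "windows"   # the Unix .sh scripts sit directly in bin
CONFIG = KAFKA_HOME / "config"
ZOOKEEPER = "localhost:2181"           # ZooKeeper on the same machine, default port

# The two edits described in the talk, as property lines (folder names assumed).
# Forward slashes avoid backslash escaping in .properties files.
PROPERTY_EDITS = {
    "zookeeper.properties": f"dataDir={(KAFKA_HOME / 'data' / 'zookeeper').as_posix()}",
    "server.properties": f"log.dirs={(KAFKA_HOME / 'data' / 'kafka').as_posix()}",
}


def start_zookeeper():
    return [str(BIN / "zookeeper-server-start.bat"), str(CONFIG / "zookeeper.properties")]


def start_kafka():
    return [str(BIN / "kafka-server-start.bat"), str(CONFIG / "server.properties")]


def topics(action, topic=None, partitions=None, replication_factor=None):
    """Build a kafka-topics.bat command; action is create, list, describe or delete."""
    cmd = [str(BIN / "kafka-topics.bat"), f"--{action}", "--zookeeper", ZOOKEEPER]
    if topic is not None:
        cmd += ["--topic", topic]
    if partitions is not None:
        cmd += ["--partitions", str(partitions)]
    if replication_factor is not None:
        cmd += ["--replication-factor", str(replication_factor)]
    return cmd


for filename, line in PROPERTY_EDITS.items():
    print(f"{filename}: {line}")

for command in (
    start_zookeeper(),
    start_kafka(),
    topics("create", "test", partitions=1, replication_factor=1),
    topics("list"),
    topics("describe", "test"),
    topics("delete", "test"),
):
    print(" ".join(command))
```

ZooKeeper has to be started first and left running, then the Kafka server in a second command prompt; the topic commands only work once both are up, which is the failure seen during the demo.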

Info

Channel: Abhimanyu Kumar

Views: 461

Keywords: java, spring, springboot, ebx, mdm, jboss, tomcat, j2ee, ejb, jms, kafka, ibm mq, devops

Id: Zotv5ISprJw

Length: 18min 15sec (1095 seconds)

Published: Sat Sep 04 2021