Apache Kafka for Beginners (3+ hours long)

Captions
Hi guys, welcome to the Apache Kafka beginner's guide. In this video I'll explain the most important concepts of Apache Kafka: Kafka brokers, ZooKeeper, the Kafka cluster, Kafka topics, partitions, replication and so on. But who am I? My name is Bogdan Stashchuk, I'm a developer with multiple years of experience, and I have personally used Apache Kafka many times on different projects. I hope you'll enjoy this course, and please note that this course is part of the complete Apache Kafka Developer's Guide, where I explain how to use the different APIs and Kafka clients (Java, Node.js and others); you'll find the link to the full course under this video. All right guys, let's get started, and I wish you good luck!

Hi guys, this is a hands-on course that includes tons of practice activities dedicated to Apache Kafka, and in order to follow along and perform all the practice activities with me you need to have Apache Kafka installed on your computer. Over the next sections I will demonstrate how to do that for macOS, for Unix-like systems and for Windows. In production, however, Apache Kafka usually runs on Unix-like systems, and the most popular choice is Ubuntu. That's why, if you are a Windows user, I highly recommend that you either create a virtual private server with Ubuntu and run Apache Kafka there, or create a virtual machine on your computer and run Ubuntu inside it. Kafka installation is pretty easy: all you need to do is download an archive with the files and run the executable scripts. Let's get started, beginning with the installation of Apache Kafka on macOS and Unix-like systems.

In this section I'll demonstrate how to install Apache Kafka on macOS or any other Unix-like system using the terminal. It is pretty easy: you download the archive with the files and afterwards run the executable scripts in order to start ZooKeeper and the Kafka brokers. For macOS and Unix-like systems the scripts are the same, and they have the extension .sh. Also, if you are working on a Mac and don't want to install Apache Kafka directly on your computer, you can create a virtual machine on your macOS computer, for example using the VirtualBox software, and in this section I'll demonstrate how to do that as well; but I recommend installing Apache Kafka directly on your macOS computer. Let's get right into it.

In this lecture I'll install Apache Kafka on a Mac; if you have a Mac, please follow along. Navigate to kafka.apache.org, scroll down to the download button and click on it, and here you'll find the section dedicated to Apache Kafka downloads. In my case 2.4.0 is the latest stable release, and below I can find instructions on how to download it. Please notice that there are links for a source download and binary downloads; we need to download a binary file, which is the already compiled project. As you see from the note below, there are builds for multiple versions of Scala, and if it doesn't matter to you which version you want to use, you can download version 2.12, the recommended one. Let's open this link in a new tab, and here you'll find mirrors where you can download this file. Mirrors are located all over the world, as close to the users as possible; below you may also find links to backup sites, and you can use them in case the mirror link doesn't work properly. Let me copy the URL of the mirror site located closest to me and open up a terminal.
I'll download the binary Apache Kafka file to the downloads folder. I already have a downloads folder here, and I can use, for example, the curl command to download the remote file and save it with the -o option to a specific location and file name: I'll save it to the downloads folder and name the file kafka.tgz. Enter. The download finished, and the size of the downloaded file is 59 megabytes. Let's verify that the file is present in the downloads folder: ls -la downloads. Yes, here it is, kafka.tgz.

Now let's create a new directory called kafka and extract the contents of the archive into the created folder. Clear the terminal, mkdir kafka, cd into this folder, and now let's use the tar command: tar -xvzf, then the location of the archive (it is in the downloads folder, and the name of the file is kafka.tgz), and here I'll also use the option --strip 1. Without this option a new folder would be created inside the kafka folder and the contents of the archive would be placed there, but I want to extract all the files and folders from this archive directly into the kafka folder; that's why I need --strip 1. Press Enter. The files were extracted, and we can verify that: ls -la. Here I see several folders called bin, config, libs and site-docs, and two text documents, LICENSE and NOTICE. I can read, for example, the license file using the cat command, cat LICENSE, and here are the license details for Apache Kafka.

That's all you need to do to install Apache Kafka on a macOS computer: download the tar file and extract it. Notice that we extracted it into a newly created kafka folder. But in order to run Kafka you also need Java available, because Kafka was developed in Java, so let's check whether you have Java installed: type java -version. In my case Java is relatively old on this computer, so let's update it; and if you don't have Java available at all, please perform the same steps along with me. You can download Java either from the official site, or you can use the cask utility to download the Java binary from the terminal. Let's do it here. I'll use brew (brew is the package manager for macOS), then type install, then homebrew/cask/java, and press Enter. First Homebrew will be updated, which may take a while; after the update, the Java download will start. It may take several minutes, so you may have a break now. OK, the Java download has finished, and now I need to enter my password in order to finalize the installation; and Java was successfully installed. You can verify the installation using the command brew cask info java, and I see that in my case Java Development Kit version 13 was installed. Now I can also enter the java -version command, and in the output I see that OpenJDK 13.0.1 was successfully installed and is available for use. OK, now you have Java installed on your computer, and you also have all the files necessary to run Kafka. Let's proceed: next I will show you how to start the Kafka server and ZooKeeper. See you next, bye bye!
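For reference, here is the whole macOS installation compressed into a few shell commands. This is only a sketch: the mirror URL, version and paths are examples, so substitute the link you copied from kafka.apache.org/downloads and the folders you actually use.

    # Download the binary archive under a simple name (URL and version are examples)
    curl https://archive.apache.org/dist/kafka/2.4.0/kafka_2.12-2.4.0.tgz -o downloads/kafka.tgz

    # Extract straight into a dedicated folder, dropping the archive's top-level directory
    # (--strip-components is the long form of the --strip 1 option used above)
    mkdir kafka && cd kafka
    tar -xvzf ../downloads/kafka.tgz --strip-components 1

    # Install Java with Homebrew if it is missing, then verify
    brew install homebrew/cask/java
    java -version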
In this lecture I will show you how to get Ubuntu up and running on your macOS computer. You need virtualization software that allows you to run other operating systems, like Windows or Ubuntu, on your macOS, and there are a few choices for that. For example, there is VMware Fusion, which is paid software and goes up to $100 if I'm right, and there is also open-source (meaning free) software called VirtualBox. VirtualBox is supplied by Oracle, and it is available for Windows, OS X and Linux. Basically, if you are already running Linux you don't need to install VirtualBox; you can run Apache Kafka directly on your computer. I'll show you how to install VirtualBox and how to run Ubuntu inside it. Let me download VirtualBox for OS X by clicking on this link; the download has started. Let's also download the official Ubuntu image; it is open source and free as well. For that, please go to ubuntu.com/download. I'll install Ubuntu Desktop; it is a fully featured operating system, and if you want you can even replace Windows or macOS with Ubuntu Desktop. Click on Ubuntu Desktop, and the latest long-term support release in my case is 18.04.3; I'll click the download button. That download has started as well. You see that the size of VirtualBox is about 100 megabytes and the size of the Ubuntu ISO image is around 2 gigabytes; let's wait until both downloads are finished.

OK, both files were downloaded. Let me open the downloads folder and first install VirtualBox. Double-click on the dmg installation file, then double-click on VirtualBox.pkg (verifying package), click Continue, Continue, Install, and enter your password. Open Security preferences and allow the application to run: click Allow here, then close this, and close this as well. Here you see that the installation has failed, but basically it should be successful; for the moment let me keep the installation disk image and try to open VirtualBox, and it opened successfully.

OK, let's now create a new virtual machine. Click the New icon, and let's give this machine a name; I'll name it Ubuntu Kafka. The type will be Linux and the version Ubuntu 64-bit. Click Continue, and here you can allocate how much RAM this virtual machine will be allowed to use; the minimum recommended size is 1 gigabyte, and let me increase it to 2 gigabytes, which I think should be enough. Click Continue. On the next step you allocate hard disk space for this virtual machine: let's give it 10 gigabytes of storage, which will be quite enough, and click Create. Here you can choose the hard disk file type; let's choose the default one, VirtualBox Disk Image (VDI), and click Continue. The size of the hard disk should be dynamically allocated, in order to free resources for other virtual machines if you use several of them; click Continue. Here you can adjust the name of the VDI file, which will be stored in the folder Users/bogdan/VirtualBox VMs/Ubuntu Kafka; I'm happy with this name, so let's click Create. The new virtual machine was successfully created.

Now I need to adjust settings and attach the ISO image with the Ubuntu installation to this virtual machine. Open Settings, go to Storage, and here under Controller: IDE choose the empty disk, click on the small blue disk icon, and choose a disk file. Navigate to the downloads folder, or any other folder that contains the ISO image with the Ubuntu installation, select the file and click Open. Now click OK, and let's actually start this virtual machine by clicking the Start icon.
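As a side note, everything we just clicked through can also be done from the macOS terminal with the VBoxManage tool that ships with VirtualBox. This is only a sketch: the VM name, disk path and ISO file name are examples, and flags can vary between VirtualBox versions.

    # Create and register the VM, and give it 2 GB of RAM
    VBoxManage createvm --name "Ubuntu Kafka" --ostype Ubuntu_64 --register
    VBoxManage modifyvm "Ubuntu Kafka" --memory 2048

    # Create a 10 GB dynamically allocated disk and attach it together with the ISO
    VBoxManage createmedium disk --filename ~/UbuntuKafka.vdi --size 10240
    VBoxManage storagectl "Ubuntu Kafka" --name "SATA" --add sata
    VBoxManage storageattach "Ubuntu Kafka" --storagectl "SATA" \
        --port 0 --device 0 --type hdd --medium ~/UbuntuKafka.vdi
    VBoxManage storageattach "Ubuntu Kafka" --storagectl "SATA" \
        --port 1 --device 0 --type dvddrive --medium ~/Downloads/ubuntu-18.04.3-desktop-amd64.iso

    # Boot the machine
    VBoxManage startvm "Ubuntu Kafka"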
macOS will also ask for accessibility permissions: open System Preferences, unlock the settings with your password, scroll down, find the VirtualBox application, check its checkbox and close the settings. Here you'll see that VirtualBox has detected that there is an optical disk with a file attached, and I'm happy with that; I want to create the new virtual machine using this ISO image. Click Start, and here I see a pretty small desktop. To make it larger, click on the desktop icon, select Virtual Screen 1 and scale to, for example, 250%; now it is larger. Let me close those notifications, and yes, allow access to the microphone.

The installation is running. Click Install Ubuntu; English is selected as the language by default. Install Ubuntu again, leave English as the default language, and click Continue. I'll select the default normal installation and check the checkbox to download updates while installing Ubuntu, then click Continue. On the next step you'll see that this computer currently has no detected operating systems, and the default option is "Erase disk and install Ubuntu". I'll leave this option selected, and please note that your disk will not be erased: only the virtual disk allocated to this specific virtual machine will be erased, and the new Ubuntu operating system will be installed there. You can also enable encryption of your installation if you want, but I will not do that. Click Install Now, then Continue. On the next step you can select your location; I'm happy with the default, New York, so click Continue. Now you are prompted to enter your name and pick a username. I'll enter my name, and the username will also be bogdan, and here I enter my password. Notice that at this step a new user (bogdan in my case) will be created, and this user will have sudo privileges. I'd also like to select the option to log in automatically, in order to speed up the process of logging into this operating system. Now click Continue, and the final installation process starts; it usually takes up to 5 minutes, depending on the resources of your MacBook.

OK, the installation is complete, and now I need to restart the computer in order to use the new installation. Click the Restart Now button (it seems I need to press Enter here), close this notification, and finally Ubuntu booted up successfully. Notice that I wasn't prompted to enter my password for the user bogdan. You can read what is new in Ubuntu and click Next a couple of times; let me just skip this tutorial.

Now let's try the following: expand this window to full-screen mode by clicking on this button. I see a message that I will be able to get back to windowed mode by pressing Command+F again. Let's switch to full screen, and here you see this fancy small desktop that wasn't actually stretched to full screen. The reason is that by default there are no so-called Guest Additions installed on this virtual machine, and we need to install them manually. You can do so by going to the Devices menu and selecting "Insert Guest Additions CD image"; click on it (and choose "remind me later" about updates). After that you should see that a new CD was inserted into the virtual machine, and this CD contains software intended to be started automatically; if you see this message, simply click Run. Let me enter my password and authenticate. The VirtualBox Guest Additions are being installed, and notice that once they are installed, my screen will be stretched to full screen as well. Press Return to close this window.
The installation completed. Notice that if you didn't see this CD attached to your virtual machine, you need to do the following: make this window smaller, go to the Machine menu and select ACPI Shutdown, which shuts down the virtual machine. After that go to Settings, click on the yellow icon, go to Storage, and here you need to create a new virtual CD drive. In my case the installation was successful, but if in your case you saw a message that the CD isn't available, perform these steps: click on the blue icon and create a new optical drive, leave it empty, and you should see a new IDE controller attached to your virtual machine. After that, boot your virtual machine once again and retry the installation of the VirtualBox Guest Additions; this time the installation should succeed. Basically I don't need this new virtual CD, so I'll remove it for now, and I'll also eject the installation ISO image from the virtual drive: click on this menu item, and now the CD controller is empty.

OK, let me now boot up the virtual machine again; click Start. It booted up. Let's try to make the virtual machine full screen once again ("don't show this message again", switch), and now, after the successful installation of the Guest Additions, the desktop of this virtual machine is stretched to full screen. Now let me show you how to adjust the display resolution of this virtual machine. Exit full-screen mode by pressing Command+F, and here at the bottom, in the row with different settings, click on the desktop icon and go to View, Virtual Screen 1. Remember that I had set the scale to 250%, because I'm using a Retina display on my MacBook Pro; let me change the scale to 200% and go to full-screen mode once again. Now the resolution of the virtual machine's desktop was adjusted and I have more space than before. You can check the actual resolution in the settings: click on the drop-down icon, then on the settings icon, select Devices here, and you'll see the resolution of your display. In my case I'm happy with this resolution: not too large and not too small. OK, now we have Ubuntu running on macOS using VirtualBox. Let's proceed, and I will see you in the next lecture, bye bye!

Are you still here? Enjoying this tutorial? Then smash the like button, leave a comment, and don't forget to share this video with your friends and co-workers. Let's go on!

Apache Kafka is a server-side service, and in most cases it runs in production on Unix-like systems; the most popular choice is Ubuntu. Very often Apache Kafka runs not just on a single server: instead it runs on multiple servers, and that is called an Apache Kafka cluster. It is also possible to have multiple clusters around the world. In this section I will demonstrate how you can create a virtual private server with Ubuntu and install Apache Kafka inside it. After creating the remote virtual private server, you'll need to connect to it remotely from your computer, and for that you'll use the SSH protocol; but please note that when you need to start ZooKeeper and several brokers, you need to open several SSH connections from your computer. Let's get right into it.

I have told you before that the most popular and recommended way is to run Apache Kafka on a Unix-like system, and the most popular choice is the Ubuntu operating system.
If you don't have access to an Ubuntu server and don't want to create any virtual machines on your computer, you can always create a virtual private server. There are many companies that offer such services: if you enter "ubuntu vps server" in the search bar, you'll find a bunch of offers where you can rent a virtual private server starting from five dollars per month. You could also buy a virtual private server from larger hosting services such as Amazon Web Services, Google Cloud or Microsoft Azure, but I will now demonstrate how to buy a VPS at DigitalOcean; it is very fast and easy to create a virtual machine there.

In order to create one you need an account: click Sign Up, or sign in if you already have an account. I'll sign in, because I already have one. Here you should create a new project, and then create a new droplet; a droplet is basically a server. To create a new droplet, click Create and select Droplets. The first default option chosen is Ubuntu 18.04 LTS (LTS stands for long-term support release), and this version is the 64-bit version. Next you choose a plan. I'll choose the standard plan, and here scroll to the left and choose the smallest server possible. It will cost you $5 per month, but please note that if you use this droplet only for several days, you'll pay only for those days; here you see the price per hour. This server will have 1 gigabyte of RAM, one virtual CPU, 25 gigabytes of SSD disk storage and up to one terabyte of network data transfer. OK, I am happy with this plan. Scroll down, and here you can choose a data center where you would like to host your virtual private server; I'll choose the one closest to me, located in Amsterdam. Below the location you see the number of the data center; for example, in New York there are two data centers available. Scrolling down, you could add additional services (I don't require any of them at the moment), and here you choose the type of authentication. If you are going to use the droplet for a long period of time, you could select SSH key authentication; in that case you would need to set up SSH locally on your computer, create an SSH key and import it into the configuration of this droplet. But for now I will use the one-time password option: a root password will be created, and you will be asked to enter it when connecting to this server remotely. And of course we will connect to the server from our local computer. I need one droplet, and here you can set a host name for it; I'll name it ubuntu-kafka, but you can set any name you want. You could also add some tags: if you have many droplets, tags may help you find the needed droplet faster. I also see that this droplet will be created in my project; you can rename projects, create new ones and so on. You could also enable backups, which costs $1 per month per droplet, but I don't want to create backups at the moment. Let's click Create Droplet; creation of the droplet takes some time, usually up to two minutes, so let's wait a bit.

OK, the droplet was created, and here you can find the public IP address of your server; let's copy it. Let me expand the settings of this droplet: here you'll find the option Access Console. If you click on it, a web-browser-based console for your server will be opened. Please note that this server is supplied without any graphical user interface, and you will need to perform all tasks from the CLI; and that is normal practice for production-ready applications, which you always manage via the CLI. You could log in here, or you could use SSH from your computer and log in remotely. Let me close this web-based console and connect remotely instead.
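The remote login itself is a single command. A sketch: the IP address below is a placeholder, so use the one copied from the DigitalOcean control panel.

    # Connect to the droplet as root (203.0.113.10 is an example address)
    ssh root@203.0.113.10
    # On the first connection, confirm the host fingerprint with "yes",
    # then paste the root password that was emailed to you.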
Open up a terminal and type ssh; SSH is built in on a Mac. If you are using Windows, you can open the built-in PowerShell terminal and use the same ssh command as I will use now, since it is built into Windows 10; but if you are using older versions of Windows where SSH is not available out of the box, you can use a program called PuTTY, a really convenient and lightweight SSH client for making connections to remote servers. Next, type root@ and paste the copied IP address of the remote server, then press Enter. Here you can verify the fingerprint of the remote server; type yes, and now you need to enter the root password. Please notice that after the creation of the droplet this password was emailed to you, so check your mailbox. I have checked mine and copied the root password there, so let me paste it here. Immediately after the first login you need to change this root password: paste the root password once again, type a new one and repeat it. Notice that you can always regenerate the root password by going to your droplet, opening Access and resetting the password there. OK, let's go back to the terminal and clear it. Basically, I now have a working virtual private server with Ubuntu 18.04. That's all for this part, and next I'll explain how to install Apache Kafka on this virtual private server. See you!

Now let me show you how to install Apache Kafka on this Ubuntu server. Please notice that I created this server remotely at a hosting service, in my case DigitalOcean; you could use any other hosting provider. The main idea is that now we will do everything in the CLI, in the terminal. So let's first verify whether Java is installed on this server: type java -version. As expected, Java is not included in this installation out of the box; that's why we need to install it manually, and you should install either version 8 or version 11. I'll install OpenJDK version 11, using the apt install command. First let's update apt with the command sudo apt-get update. Updating... update completed. Clear the terminal, and now install OpenJDK 11: sudo apt install openjdk-11-jdk. Fetching... here is a notice that this installation will take around 600 megabytes of disk space; let's proceed, press yes. Working... this process may take one or two minutes, let's wait a bit. OK, done. Let's verify that Java is now available: clear the terminal and type java -version, and now Java is available, version 11.0.5. Great.
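Summing up the Java part, these are the three commands we just ran on the server (version numbers in the output will vary):

    sudo apt-get update                  # refresh the package index
    sudo apt install openjdk-11-jdk      # install OpenJDK 11 (~600 MB on disk)
    java -version                        # expect something like: openjdk version "11.0.5"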
Now let's download the archive with all the files necessary to run Apache Kafka and unarchive it into a kafka folder, which will basically complete the installation. Clear the terminal and let's create a folder called downloads; first verify that it is absent (yes, it is), then create it with mkdir downloads. Now let's download Apache Kafka from the remote URL. To find the necessary URL, go to a web browser and type "apache kafka download", click on the first link, and here you will find the links to the source download and the binary downloads. Notice that you need to download one of the binary files; it is the already compiled project, ready to be used. If the version of Scala matters for you, you can choose between different versions; if it doesn't matter, please choose version 2.12. Click on this link, and here you'll find a link to one of the mirror sites where you can download the tar.gz file. Notice that in your case this link may be completely different, because there are many different mirrors all over the world, serving users as close to them as possible; you can also use any of the backup sites, either in the EU or in the US. So let me copy this URL and go back to the terminal.

In the terminal I'll use the curl utility to download the remote archive and place it into the downloads folder. Paste the copied URL; I want to save this file into the downloads folder, and the downloaded file should be named simply kafka.tar.gz, without the version. To achieve that I use the -o option and type downloads/kafka.tar.gz. In this case the file will be downloaded from the remote URL and placed into the downloads folder as the file kafka.tar.gz. Let's execute this, and notice that if you see a warning that the curl command isn't available, you should install it using apt: sudo apt install curl. In my case it is already installed. OK, the file was downloaded; let's list the contents of the downloads folder, and yes, here is the file.

Now let's create a new folder called kafka here in the root of the user's home folder: clear the terminal, mkdir kafka, and cd into this new folder. Now let's extract the contents of the downloaded archive directly into this folder. For that you use the tar command, another utility built into any Unix-like system, with the options -xvzf, then the absolute path to the downloaded file, where the file name is kafka.tar.gz (notice that I use Tab in order to autocomplete file names), and one more option: --strip 1. This option is needed because the archive contains a folder named kafka_ followed by the version of Kafka, and inside that folder there are other folders and files; I need to extract those second-level folders and files directly into the kafka folder, and that's why I need the --strip 1 option. OK, let's execute this. Enter. Extracting... now list the files in this folder with ls, and I see several folders (bin, config, libs, site-docs) and the two text files.
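Put together, the download-and-extract procedure on the server looks like this. Again a sketch, with an example mirror URL that you should replace with the one you copied:

    # Download the archive under a simple name (URL and version are examples)
    curl https://archive.apache.org/dist/kafka/2.4.0/kafka_2.12-2.4.0.tgz \
        -o downloads/kafka.tar.gz
    # If curl is missing on a clean Ubuntu: sudo apt install curl

    # Extract the second-level files straight into ~/kafka
    mkdir kafka && cd kafka
    tar -xvzf ~/downloads/kafka.tar.gz --strip 1
    ls    # bin  config  libs  site-docs  LICENSE  NOTICE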
That's basically all you need to do to install Apache Kafka on this computer. If I list the contents of the bin folder, ls bin, you'll see such files as kafka-configs.sh, kafka-acls.sh, kafka-server-stop.sh, kafka-server-start.sh, and all those files are basically executable scripts. All you need to do in order to start, for example, the Kafka server is execute the corresponding script and supply a specific configuration file as an argument. The default configuration files are located in the config folder; list its contents with ls config, and here you'll find files with the extension .properties. For example, here is the file with the default settings for the Kafka server.

OK, this completes the installation of Apache Kafka on this virtual private server, and we have done it in several steps: we created a virtual private server at DigitalOcean (you could use any other hosting provider), we connected to this remote server via SSH, next we installed Java on this server, and we downloaded the Apache Kafka archive and extracted it. That's all. Also, please notice that this setup is a production-ready solution: we don't use any local settings; we use a remote virtual private server that could run forever if you need it. OK, that's all for this lecture, and next I'll explain how to start ZooKeeper and the Kafka server. See you next, bye bye!

If you run Windows, this section is for you. As I have noted before, Apache Kafka is almost never run on Windows in production; that's why I don't recommend installing Apache Kafka directly on your Windows computer, but if you want to do so, in this section I'll explain how. The only difference between running Apache Kafka on Windows and on Unix-like systems is that on Windows you need to execute scripts with the extension .bat; on Unix-like systems the corresponding scripts have the extension .sh. In this section I'll also demonstrate how to create a virtual machine on your Windows computer, which is basically the preferable way of doing the exercises throughout this course; inside that virtual machine you can install Ubuntu and run Apache Kafka. Let's get into it, starting with the installation of Apache Kafka directly on your Windows computer.

Now let me explain how to install Apache Kafka on Windows, but please know that I highly recommend against doing that, because even Apache says that Windows is not a fully supported platform; you should run Apache Kafka on a Unix-like operating system. So, if you are able to, please create a virtual machine using VirtualBox and run, for example, Ubuntu there, or create a virtual machine on paid hosting, which costs around five to seven dollars per month, and use it during this course; in one of the previous lectures I demonstrated how to create a new remote server and install Apache Kafka there. But anyway, if you have decided to install Apache Kafka on Windows, it is possible of course, and now I will explain how to do it.
Please navigate to kafka.apache.org/downloads, and here you'll find the links where you can download Apache Kafka. Please notice that this page is common for all operating systems, independent of whether you are running Windows, macOS or a Unix-like system; the reason is simple: the archive that you download in order to run Kafka is the same for all operating systems. In my case the latest release is 2.4.0; scroll to the binary downloads section and choose one of the archives. As you see, there are different binary downloads for different versions of Scala, but if the Scala version doesn't matter for you, please choose version 2.12. Click on this link, and here you'll find the mirror sites where you can download the Apache Kafka archive. It is a compressed archive with the extension .tar.gz, and you will need an unarchiver in order to extract its contents. Also note that you may see a different link here, because there are many different mirrors all over the world, located as close to each user as possible; below you can also find backup sites in the EU and in the US where you can download the Apache Kafka archive as well, but please use them only when the mirror site doesn't work properly. OK, let's click on this link and download the file; its size is around 60 megabytes.

The file was downloaded, and if I open the folder where it is located, I see the archive; but for now there are no programs that can actually extract the contents of this archive, so I recommend installing, for example, the 7-Zip program, or any other such as WinRAR or WinZip, which can extract contents from archives. Let me download 7-Zip (I'll download the 64-bit version), then run it: click Yes, Install, and Close. OK, 7-Zip was installed. Go back to the archive, right-click on the downloaded archive, choose 7-Zip and select Open Archive. Inside you'll find one more archive; let's simply drag it out, and now let's extract the contents of this inner archive: again choose 7-Zip, Open Archive, and here I see the folder kafka_ followed by the version. Let's extract this folder and place it in the root of disk C: click Extract, and in the path field remove the entire path, leaving just C: and a backslash. OK, extracting... close this window, and this one as well.

Now navigate to disk C, and here you should find the folder that we have just extracted. Please rename this folder to simply kafka, without any version. Go into this folder: here you'll find four subfolders and two text files, and these are basically all the files that are required to run Kafka on any computer, whether Windows, macOS or a Unix-like system. Let me adjust the size of the icons and go, for example, into the bin folder: here you'll find a bunch of files with the extension .sh, and you'll also find a windows folder with files with the extension .bat, and these are exactly the files you will need to use in order to start Kafka and ZooKeeper on Windows; those files are basically executable scripts for Windows. Let's go back to the kafka folder.
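If you prefer the terminal over clicking, the two-step extraction we just performed can also be scripted with 7-Zip's command-line tool. A sketch only: it assumes 7z.exe is on your PATH, and the file names are examples.

    # Step 1: unpack the .tgz wrapper to get the inner .tar
    7z x kafka_2.12-2.4.0.tgz
    # Step 2: unpack the .tar into the root of disk C
    7z x kafka_2.12-2.4.0.tar -oC:\
    # Rename the versioned folder to simply "kafka", as done above
    ren C:\kafka_2.12-2.4.0 kafka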
In order to run Kafka you need to have Java installed on your computer, and it is recommended to install either version 8 or version 11; so let's go and install version 11. Type "Java JDK 11" in the search bar, and here is the first link that leads to the Oracle website, where you can download the corresponding Java Development Kit. Find the file for Windows, accept the license agreement and click on the download link. Here you will be prompted to enter an Oracle username and password; if you don't have an Oracle account, please create one using this link. I already have my account, so let me enter it and click Sign In, and the download of the JDK starts; let's wait a bit. Now run the installation: click Yes, Next, Next, OK, and Java Development Kit version 11 was successfully installed; let's close the installer.

After the successful installation of Java you also need to modify the Path variable in order to make the java command available to all scripts. Open File Explorer, right-click This PC, choose Properties, go to Advanced System Settings, and here on the Advanced tab click Environment Variables. In the System Variables section go to Path and click Edit, and here we need to add the path to the executable Java files. You can click Browse and navigate: This PC, disk C, then Program Files, then Java, then the JDK folder, and inside it find the bin folder and click on it. Click OK, and you'll see the new path added; in my case it is C:\Program Files\Java\jdk-11.0.5\bin. Please notice that you need to add the path to the bin folder. Click OK, OK again, and close this.

Great, now we should have Java available for use. Let's open a terminal: type PowerShell (it is the built-in terminal for Windows) and run the application. Here type java -version, and I see that Java was correctly installed and configured, and is now available for use. That's basically all you need to do in order to be able to run Kafka on Windows: you need to install Java, and you need to download the Kafka archive and unzip it; we unzipped it into a new folder called kafka, located on disk C. Let me cd to this folder, cd C:\kafka, and here is the folder with several subfolders and two files.
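The Path change can also be made from an elevated PowerShell instead of the GUI dialogs. A sketch only: the JDK folder below is an example, so point it at the directory you actually installed.

    # Run PowerShell as Administrator; /M writes the machine-wide Path
    # (note: setx truncates values longer than 1024 characters)
    setx PATH "$env:Path;C:\Program Files\Java\jdk-11.0.5\bin" /M
    # Open a NEW PowerShell window afterwards, then verify:
    java -version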
That's all for this part; next I'll explain how to adjust the configuration of the Apache Kafka server and ZooKeeper and how to start them. This process is a bit different from Unix-like systems, because on Windows we need to adjust the location of the log files for both the Kafka server and ZooKeeper. I'll see you in the next lecture, bye bye!

Are you still here? Enjoying this tutorial? Then smash the like button, leave a comment, and don't forget to share this video with your friends and co-workers. Let's go on!

In the previous lecture I demonstrated how to download Apache Kafka and unzip it on Windows, so now you should have a kafka folder located on disk C; we also installed Oracle Java and verified that it works. Now, in this lecture, I will explain how to adjust the configuration of the Apache Kafka server and the ZooKeeper server and start them. We will edit a couple of configuration files, and for editing the files I suggest you install Visual Studio Code; it is a free editor available for Windows, Linux and macOS. If you have not done that before, please install it; I'll install it as well. The download finished: open the exe file, accept the agreement, click Next, Next, Install, launch Visual Studio Code, and Finish. Close the release notes and make it full screen. Go to File, Open Folder, go to disk C, select the folder called kafka and click Select Folder. Now you should see the contents of the kafka folder here in the left pane: several folders and two text files. If I click the LICENSE file, you'll see its contents in the right pane.

Now we need to adjust the configuration files for the Apache Kafka server and the ZooKeeper server; they are located in the config folder. Let's go there, and first find the server.properties file, this one, and scroll down to the section where the log path is defined; yes, here it is, and we need to modify this path. First let's create a kafka-logs folder here in the kafka folder: collapse the tree, click the New Folder icon and name the folder kafka-logs. Notice that there is already a folder called logs: in that folder Kafka stores system logs, while in the kafka-logs folder Kafka will store the messages that are sent by producers and consumed by consumers. So let's adjust this path: type c:/kafka and remove the tmp part, so the location of the folder for the logs is c:/kafka/kafka-logs. Let me make the pane a bit larger so it is fully visible; that's all for the adjustment of the server.properties file. Save and close it. Next, go back to config and find the zookeeper.properties file, this one, and here we need to adjust the dataDir line: instead of /tmp/zookeeper it will be c:/kafka/zookeeper. And of course we need to create the zookeeper folder here in the kafka folder. Save the file, collapse config, and create a new folder called zookeeper in the root of the kafka folder. Now there are two new folders, kafka-logs and zookeeper, and we are good to go: we can start both ZooKeeper and the Kafka server.

Let's start ZooKeeper first. Minimize Visual Studio Code, close this window as well, and here in PowerShell clear the terminal with cls. Now let's start the ZooKeeper server using the updated zookeeper.properties configuration file. ZooKeeper is started like so: type .\bin\windows\zookeeper-server-start.bat (press Tab to autocomplete), then config\zookeeper.properties (you can press Tab here as well), and press Enter. The ZooKeeper server should start. You'll see an information pop-up here; check the checkbox and allow access, and ZooKeeper was started. It is now running on port 2181. Great. Now minimize this window and open a new PowerShell window to start the Apache Kafka server there. In the new window, navigate to C:\kafka (let me adjust the size of this window), and here let's start another script called kafka-server-start, using the other configuration file that we edited, server.properties: type .\bin\windows\kafka-server-start.bat (you can press Tab here), then specify the path to the configuration file, config\server.properties (Tab works here as well), and execute the command. The Kafka server with ID 0 started successfully.

Let's go to Visual Studio Code and observe whether something has appeared in the newly created kafka-logs and zookeeper folders. Examining the kafka-logs folder, you'll find a bunch of new files that were created during the startup of the Kafka server.
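So the complete Windows startup boils down to two commands in two separate PowerShell windows, both run from C:\kafka:

    # Window 1 - start ZooKeeper (listens on port 2181)
    .\bin\windows\zookeeper-server-start.bat config\zookeeper.properties

    # Window 2 - start the Kafka broker
    .\bin\windows\kafka-server-start.bat config\server.properties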
If I expand the zookeeper folder, you'll find several files there as well, for example snapshot files. OK, great: now you have ZooKeeper and the Kafka server up and running on a Windows computer. That's all for this lecture. In the next lectures I will explain how to start a Kafka consumer and a Kafka producer, but I'll do that only for Unix-like systems and will not repeat those steps for Windows users. If you proceed with Apache Kafka on Windows, please keep in mind that instead of the .sh scripts located in the bin folder, you need to use the .bat scripts located in the windows subfolder of the bin folder. That's all for this lecture, and I will see you next, bye bye!

On Windows you could run virtual machines under, for example, VMware Workstation Pro or VMware Player, but in this course I suggest you try VirtualBox if you have not used it before. VirtualBox is supplied by Oracle, and it is open-source software, which is why it is free. VirtualBox is available for Windows, macOS, Linux and Solaris. Click on the "Windows hosts" link, and the VirtualBox download starts immediately; it is around 100 megabytes. While the download is in progress, navigate to ubuntu.com/download and download the Ubuntu Desktop ISO image from the official Ubuntu website. You are also able to download Ubuntu Server, but for testing and educational purposes let's install Ubuntu Desktop. Click on it: I see that the latest long-term support release is 18.04.3 (in your case you may see a more recent version), and below you can read the recommended system requirements for a computer that will run Ubuntu Desktop. Again, notice that we will install VirtualBox and run Ubuntu Desktop as a virtual machine. OK, let's download the installation file for Ubuntu Desktop; the download has started, and the size of this ISO image is around 2 gigabytes.

The VirtualBox download has finished, so let's install it meanwhile, while the ISO image is being downloaded. You'll see the VirtualBox setup wizard: click Next, Next, leave all the checkboxes checked, Next, Next. I see a warning that network interfaces will be temporarily disconnected from the network during the installation of VirtualBox; I am happy with that, press Yes, and finally click Install, then Yes. The installation process completes pretty fast; click Install here as well, let VirtualBox start after installation, and click Finish. OK, now you see VirtualBox Manager, where you can manage all your virtual machines. Going back to Chrome, it seems that the ISO image download was interrupted, because the network interfaces were disconnected for a moment during the installation of VirtualBox; let's resume the download and wait a bit until it finishes.

OK, the Ubuntu ISO image was successfully downloaded, and now we can create a new virtual machine based on this image. Minimize this window, and here in VirtualBox Manager simply click the New icon to create a new virtual machine. Let me specify a name for this virtual machine; I'll name it Ubuntu Kafka, and you'll notice that the type of the virtual machine is automatically switched to Linux and the version to Ubuntu 64-bit. Click Next, and here we can allocate RAM to this virtual machine: the default RAM size is 1 gigabyte, and let me adjust it to 2 gigabytes. Click Next, and here you can choose the size of the disk for this virtual machine; 10 gigabytes is quite enough, because we will use this virtual machine just for testing and educational purposes. OK, let's create the disk.
On the next step you can adjust the hard disk file type; the default one is VDI, VirtualBox Disk Image, and I am happy with that, so click Next. On one of the previous steps we allocated 10 gigabytes of storage to this virtual machine, and with the option "dynamically allocated" the virtual machine will not claim all that space immediately; instead it will fill physical disk space up to the fixed size specified before. It means that this virtual machine may use less than 10 gigabytes of storage, and in that case the remaining free space can be used by other virtual machines. I'm happy with "dynamically allocated", so let me click Next. Here you can adjust the name of the file where the virtual machine will be stored; you see that by default the name of the VDI file matches the name of the virtual machine given before, and if you're not happy with that, you can adjust the name. Again you see that 10 gigabytes of virtual hard disk will be allocated to this virtual machine; you can change that here as well. OK, let's finally create the virtual machine.

The virtual machine was created, but now it is powered off, and we need to attach the downloaded ISO image to the CD drive of this virtual machine. In order to do that, click Settings and go to Storage, and here you'll see that there are actually two disks: a SATA disk, which is the hard disk of this virtual machine, and a CD/DVD drive that is currently empty. We need to attach to this CD/DVD drive the ISO image that we downloaded from the official Ubuntu website: click on the blue disk icon, choose "Choose a disk file", navigate to the downloads folder (in my case) where the ISO image with Ubuntu Desktop is located, select the file, and click Open. OK, now the ISO image was successfully attached to this virtual machine, and we can boot from it. Click OK, and start the virtual machine by clicking the Start icon. Here we need to choose a startup disk, and we need to boot from the ISO image: select it from the drop-down menu and click Start. The virtual machine starts, but in a very small window; in order to make it larger, right-click the monitor icon at the bottom, choose View, Virtual Screen 1, and adjust the size to "Scale to 200%". Now the window is larger; let me close those notifications.

Finally, after the system has booted from the ISO image, you will see a welcome screen where you are asked to install the Ubuntu system. Here you choose the installation language; I'll leave English, and click Install Ubuntu. On the next step, select the keyboard language; I'll leave English as well, and click Continue. On the next step you'll see that the virtual disk will be erased and the new Ubuntu will be installed there; you can also encrypt this new Ubuntu installation if you want, but I am happy with the default settings, so click Install Now. Finally, you'll see that the hard drive will be formatted and attached to the guest operating system as an ext4 drive; OK, click Continue. On the next step you are asked about your physical location, and I am happy with the default one, New York; click Continue. Finally, you are prompted to enter your user name for this virtual machine. I'll enter my name, Bogdan (you can enter yours); after entering my name, the computer's name and the user name are filled in automatically. I need to enter a password as well; let me do that, repeat the password, and select the option to log in automatically.
In this case you will not be prompted to enter a password while logging in to the system. Please note that this user will have sudo privileges, which we will require for installation tasks in the next lectures. OK, click Continue, and finally the installation process starts; it will take up to five minutes, depending on the resources of your machine. Let's wait. Finally the installation was completed; in my case it took around ten minutes, and the reason is that I'm running this Windows machine itself as a virtual machine on my macOS computer, which is probably why it was so slow. OK, click the Restart Now button; to restart, press Enter. Finally Ubuntu booted up, and notice that I didn't enter my password; that's because I selected the option to log in automatically. Here you can click Next to familiarize yourself with the Ubuntu OS: Next, Next once again, and Done.

OK, there was one more step left. Notice that the picture now looks a bit pixelated, and if I try to make this window full screen, notice what happens: click on this button, and you'll see that the size of the desktop remains unchanged. In order to fix that we need to install the so-called Guest Additions. To do that, go to Devices and click "Insert Guest Additions CD image"; after this, VirtualBox will automatically insert a new CD into the CD drive of this virtual machine, and you'll see a notice that this CD contains software intended to be started automatically. Click Run here, enter my password, click Authenticate, and the installation of the VirtualBox Guest Additions starts. This package is similar to the VMware Tools that you would install if you were using VMware Workstation or VMware Fusion; it enables a lot of different features that make the process of interacting with the virtual machine much smoother. OK, finally it completed; press Return. You may also be offered to install some updates for your operating system; this is not related to the installation of the Guest Additions. Let me install them now; again, enter my password, and it seems the software update started in the background.

OK, now, after the installation of the Guest Additions, we need to reboot the virtual machine. Let's do that. You can reboot the virtual machine either from inside (click on the icon, then the power switch, and click Restart), or you can always do that from the VirtualBox menu: choose Machine and then ACPI Shutdown. Enter the password, click Authenticate, and the virtual machine was powered off. Start it again, and now let's try to maximize this window and make it full screen once again; and now, after the installation of the Guest Additions, the desktop of the virtual machine is stretched to the full size of the window. We are also able to make it truly full screen: go to View and choose Full-screen Mode. Here are instructions on how to exit from full-screen mode: the key combination is Right Ctrl+F. Switch to full screen, and now I see a full-sized screen for this virtual machine. Let's quickly eject the CD, since we don't need it anymore: Eject. Now we are all set, and we have an Ubuntu virtual machine running on a Windows computer. Notice the panel here at the bottom that allows you to exit from full-screen mode or perform other actions with this virtual machine; let's, for example, exit from full-screen mode and go back to windowed mode. OK, that's the end of this long lecture dedicated to the installation of Ubuntu on a Windows computer, and I hope you were able to complete it successfully. I'll see you next, bye bye!
Now I will demonstrate how to install Apache Kafka on the Ubuntu system; basically the process is the same for any Linux-like OS. You should navigate to kafka.apache.org/downloads, or simply type "kafka download" in the search, and you'll land on this page as well. So here is the link that leads to the page where you can download the latest stable Apache Kafka release, in my case 2.4.0, and I'll download it. Please note that you'll find links to the source download and the binary downloads here; you need a binary download, and in my case I see basically three different options. If the version of Scala matters for you, you can choose the appropriate version; if it doesn't matter, version 2.12 is recommended. Open this link in a new tab, and on this page you'll find the download mirror sites where you can download Apache Kafka. Notice that in your case you may see a completely different link, because there are many different mirrors available throughout the world, located as close to every user as possible. Below you can also find backup sites located in the EU and the US where you can download those files as well, but downloading from a mirror is preferable. So let me copy this link, then go to the terminal; let me minimize this window.

The plan is the following: using the copied URL, we will download the Apache Kafka archive from the remote server and place the file into the downloads folder; afterwards we will create a new folder called kafka and unzip the contents of the archive into that folder, and that's all you need to do in order to install Kafka. Let's do that. First verify whether the downloads folder is present: ls, and yes, I have such a folder. If such a folder is absent in your case, please create it using the command mkdir downloads; if I try to do that, I get the error "cannot create directory: file exists".

OK, now let's use the curl command to download the remote archive and place it into the downloads folder. Paste the copied URL, then use the lowercase -o option and type downloads/kafka.tar.gz; in this case the remote archive will be downloaded to the downloads folder and saved as the file kafka.tar.gz, which will simplify the unarchiving because the file name is pretty simple. So let's press Enter now, and you may see the same error as I see: I'm using a basically clean Ubuntu installation, which is why the curl utility is absent out of the box, and here you see the command that you should use to install curl on this computer. Let's do that: sudo apt install curl, enter the user's password, press yes. Fetched, unpacked, and curl was installed successfully. Clear the terminal and repeat the command (you can always navigate to previous commands using the up arrow), press Enter, and now the download has started. The size of this archive is around 60 megabytes, so let's wait a bit.

OK, the file was downloaded, and it is now located in the downloads folder. Now let's create a new folder called kafka, mkdir kafka, and move into this folder: cd kafka. Now let's unarchive the downloaded file into this kafka folder. In order to perform such tasks you use the tar command with the options -xvzf, then the absolute path to the downloaded file in downloads, where the file name is kafka.tar.gz, and next let's use the --strip 1 option. This option means that the contents of the archive will be extracted directly into this folder; after extraction you should see the files and folders themselves here, not just a single wrapping folder.
OK, let's run this command. Extracting... and now, if I use ls, I see several folders and the two files here in this folder, and this basically completes the installation of Kafka on your computer. Next, all you need to do is run the executable shell scripts located in the bin folder. I can list the files in the bin folder, and here you see a bunch of files with the extension .sh; every file here is simply an executable script. But we will talk about running the services in the next lectures; for now, you should have Kafka installed on your computer.

There is one more step left, though: in order to be able to run Kafka on this computer we need to have Java installed and running. Let's verify the Java version here in the terminal: java -version, and you see that in my case Java is not found. I'll install the default Java Runtime Environment using the command sudo apt install default-jre. Enter the user's password; Java will require around 200 megabytes of space, so let's proceed and wait until it completes. OK, the installation was completed; let's verify whether or not Java is now installed: clear the terminal and type java -version, and now I see that OpenJDK version 10 was successfully installed. OK, wonderful: now we are all set, and we are ready to move on and start the Kafka server and the ZooKeeper server, but we will do that in the next lectures. Bye bye!

In the previous sections I demonstrated how to install Apache Kafka on your operating system; we also installed Java, because a Java Runtime Environment is required in order to start any Apache Kafka service. In this section we will explore the contents of the Apache Kafka installation folder. I'll tell you what a script is, what a configuration file is, and how to start any Apache Kafka service. We will also first try to start the Apache Kafka broker, and you'll see that you are not able to start any Apache Kafka broker without a connection to ZooKeeper; ZooKeeper is basically required for any broker. That's why we will first start the Apache Kafka ZooKeeper and afterwards try to start the broker once again. Then we will explore what actually happens under the hood when you start ZooKeeper and a broker, and examine some Apache Kafka logs. Let's get right into it.

By now you should have, first, Java up and running on your computer, and second, the Kafka archive downloaded and unzipped into a specific folder. In my case I created a folder called kafka, and if I list the contents of this folder I see five folders, called bin, libs, site-docs, config and logs, and two text files, LICENSE and NOTICE. Let's have a look at the contents of the bin folder: ls bin. In this folder you'll find a bunch of files with the extension .sh, and you'll also find the subfolder windows; if I list the contents of the windows folder, I'll see a bunch of .bat files. Basically, every file here, whether in the windows folder or directly in the root of the bin folder, is an executable script. Files with the extension .sh are executable files for Unix-like systems like macOS, Ubuntu and so on, and the files in the windows folder are executable on Windows. Basically, in order to run a specific Kafka service you simply run a specific script, nothing else; pretty simple. You are also able to run every script multiple times, and that means that you can start, for example, multiple Kafka brokers on a single computer. We will of course try that later on.
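As a preview of the next lectures, the pattern for every service is "script plus properties file"; on a Unix-like system, the two services we will start look like this (run from the kafka folder, each in its own terminal):

    # Terminal 1 - ZooKeeper must be running before any broker
    bin/zookeeper-server-start.sh config/zookeeper.properties

    # Terminal 2 - the Kafka broker
    bin/kafka-server-start.sh config/server.properties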
For now, please understand that the installation of Kafka is simply a set of executable files and configuration files, nothing else. OK, let's clear the terminal and again have a look at the contents of this folder. There is also a folder called config; let's examine the files there. Here you'll find files with the extension .properties, and those files contain the default configurations for the specific services that can be started with the executable files: for example, the file zookeeper.properties is used to launch the ZooKeeper service, the file producer.properties is used to start a producer, this one — server.properties — is used to start the Kafka server (the Kafka broker), and so on. Inside the site-docs folder — let's have a look at it — you'll find an archive with the extension .tgz; you can extract this archive, and it contains the documentation for the specific version of Kafka installed on your computer. There is also a folder called logs; this folder contains log files related to the services started by the different executables — for example, here you see a log file related to the start of the Kafka server (Kafka broker). You can read any file in any of those folders: let's clear the terminal, and here you can use, for example, the cat command. Let's read the file config/server.properties — it is the configuration file for the Kafka broker — and if I press Enter you see a file that contains a set of default configuration parameters like log.segment.bytes, log.retention.check.interval.ms and so on: simply the default parameters for a server start. Let's also have a look at one of the files in the bin folder. At any moment you can press Tab to see all the options available on the current line; let's press Tab, and, for example, let's look at the contents of zookeeper-shell.sh (you can start typing and press Tab again to autocomplete). This is an executable file for Unix-like systems, and every file with the extension .sh starts with a shebang line: it looks like a commented-out line of the shell script, and it indicates which program on a Unix-like computer should be used to execute this file. I told you this script is a shell script, and that's why the sh program located in the system bin folder is used to execute it. Below the shebang you see a bunch of comments, and after the comments the body of the shell script — these lines do the actual work. Let me clear the terminal and show you the location of the sh application itself: cd to the root of the file system and then into the /bin folder — it is a system folder — and list the files. Here you'll find a file called sh, and that's exactly the application used to execute these scripts; you'll also find a bunch of other binaries like mv and mkdir — this one is used to create a new folder, and this one, ls, we just used. That's where the sh application is actually located on Unix-like systems. OK, let's go back to the kafka folder — cd to kafka — and clear the terminal. Let me also show you how to read files in a more convenient way, because reading text files in a terminal is not the best experience, and editing them there is a nightmare; so in the next lecture I'll quickly show you how to install the open-source text editor Visual Studio Code.
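Before moving on, the inspection commands from this lecture in one place; head is my shorthand here, not used in the video, but it prints the script's shebang line directly:

  # print the default broker configuration
  cat config/server.properties

  # show the interpreter line (shebang) of one of the scripts
  head -1 bin/zookeeper-shell.sh

  # see where the sh interpreter itself lives
  ls -l /bin/sh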
I'll show you how to open the contents of this folder there, read the different files, and edit them too. I'll see you in the next one, bye bye. In the previous lecture we explored the contents of the kafka folder — there are several subfolders and two text files — and we tried to read the contents of several text documents. Of course you can do that using the terminal and the command line, but it is not the best way, and that's why in this lecture I'll show you how to install a graphical text editor called Visual Studio Code. It is open source, free, and available for Mac, Windows and Unix-like systems, and I'll demonstrate how to read the contents of any file using VS Code and how to edit those files as well. So please download VS Code if it is not yet installed on your computer; I'll download it for Mac, where it is around 100 megabytes. In this course we will use VS Code not just for editing Kafka files: we will also use it to do some coding when we create consumers and producers for Kafka. OK, the archive was downloaded; let's open it up. Here I find the .app file, and all I need to do is drag it into the Applications folder — but if I do that I get an error, because I already have VS Code installed on this computer. If you don't, just move the .app file into the Applications folder and then run Visual Studio Code. I'll stop the operation, go to Spotlight, type Visual Studio Code, and VS Code opens; I see some release notes, let me close them. Now let's open the kafka folder here in VS Code: go to File, click Open, and find the location of the kafka folder on this computer. Go to the drop-down and select your username — in my case it's bogdan — and here in the home folder for this particular user, along with folders like Applications, Desktop, Documents and Downloads, I find the folder called kafka, because we created this folder in the home directory of this local user. Select this folder and click Open, and the contents of the folder open in Visual Studio Code. You should agree that reading the contents of this folder is much easier and nicer here than in the terminal: for example, I can easily expand the bin folder and read any of the files listed there. Let's click on the file kafka-configs.sh — you see at the bottom that VS Code automatically detects the type of the file (it is a shell script) and highlights the corresponding sections: comments are in blue, for instance, and executable lines get different colors. Let's also open, for example, connect-mirror-maker.sh — the same structure is there. Let me collapse the bin folder and go to the config folder; here I find the same bunch of files with the .properties extension. Let's open server.properties — this file is used to start the Kafka broker. Again I see some comments, and below them configuration parameters like broker.id, then num.network.threads, num.io.threads and so on — a bunch of different configuration parameters. For example, on this line I see the location of the folder where Kafka will store its logs, and note the difference between the logs that will be stored there and the logs stored in the logs folder of the installation: in the installation's logs folder Kafka stores the logs related to the startup and operation of its services,
while in the folder configured here Kafka will store the messages that are sent by producers and consumed by consumers. OK, that's how you can easily and nicely read the contents of every file using Visual Studio Code — but more than that, you can also edit those files quickly and easily. For example, I can change this folder name from kafka-logs to simply logs, and now I see that the file was edited: you see this dot here, and there is one unsaved file. I can save it using the key combination Cmd+S, or go to File and click Save, and now the file is saved. I can prove that by going to the terminal and using the cat command again to read the file we just edited — it is located at config/server.properties — and scrolling up a bit to the log-file location; notice how inconvenient it is to do that in a terminal. Let me search for tmp/logs — and yes, here is the line, which proves that the contents of the file were edited. OK, let me go back to Visual Studio Code, edit this value back to kafka-logs, and save the file again by pressing Cmd+S. That's how you can read and edit files quickly and easily using Visual Studio Code, and that's what we will do later on in this course. You can also click on LICENSE and read the license terms for Apache Kafka, and click on the NOTICE file and read another text document. Notice that not all lines are visible on the screen and I need to use the horizontal scroll bar to see them; if you want to see all lines, press the key combination Option+Z (Alt+Z), and now all lines wrap and are visible on the screen. You can also hide the left pane by clicking on this icon, and show it again by clicking once more. VS Code is a very powerful editor, and if you haven't used it before I highly recommend it; I won't explain all of its features — all you need to know for the moment is how to read files and how to edit files. That's all for this lecture, and in the next one let's finally start a Kafka service. I'll see you in the next lecture, bye bye. By now all of you should have Kafka installed on your computer; all you needed to do was, first, install Java, and second, download the Kafka archive and unzip it. For Windows users, I highly recommend not installing Kafka on Windows: use, for example, a virtual machine on Windows running Ubuntu, or create a remote virtual private server and install Kafka there. From this point on I'll use commands for Unix-like systems — macOS is also a Unix-like system, and the commands here don't differ from the commands you would enter on, for example, an Ubuntu system. Again, in this video let's finally start the Apache Kafka server. You know that the Kafka installation is basically a set of executable scripts located in the bin folder; let's first cd to the kafka folder — it is located in the root of the user's home directory — and list the files in bin. You see the bunch of executable files with the extension .sh; you can start the Kafka server by running the script kafka-server-start.sh. This script also requires a configuration file, and there is a default one located in the config folder: let's list the files in config, and here it is — server.properties. OK, now let's try to start the Kafka server — it is also called a Kafka broker, and on every single computer you are able to run several Kafka brokers; we will try that later in the course.
For now, let's try to start just a single Kafka broker. Let's clear the terminal, and here type the following: the path to the executable file — it is located in the bin folder, so bin/ followed by kafka-server-start.sh — and after this path to the executable script we need to specify the file name of the configuration file that will be used for this server. In Unix-like systems, arguments for scripts are specified simply after a space, so let's add a space and then type the path to the configuration file: it is located in the config folder, and the name of the file is server.properties (I could use Tab to autocomplete the file name). Now let's press Enter to start the Kafka server. We see a bunch of messages, and finally the infamous messages "INFO shutting down" and "INFO shut down completed", which means the start of the server was unsuccessful. Let's try to find the reason. Let me scroll to the very beginning of this output — here it is — and you see that after the command we entered, Kafka tries to connect to ZooKeeper: here is the message "Connecting to zookeeper on localhost:2181". If I scroll down to the end of the output, I find a message like "Timed out waiting for connection while in state: CONNECTING". It means the Kafka server tried to connect to a ZooKeeper that should be running on localhost at port 2181, and this connection eventually timed out, because we have not yet started ZooKeeper. Here we come to the conclusion that the Kafka server is not able to start without a connection to ZooKeeper: ZooKeeper is a mandatory part of the Kafka ecosystem, and without ZooKeeper no brokers can run. To summarize: we just tried to start the Kafka server, and this attempt failed; next let's start ZooKeeper, and afterwards try to start the Kafka server once again. See you in the next lecture, bye. We have just tried to start the Kafka server and the attempt was unsuccessful, and the reason is simple: there was no ZooKeeper available, and a Kafka server cannot run without a connection to ZooKeeper. Now let me quickly show you how to observe the logs of the Kafka server. Go to Visual Studio Code and open the kafka folder: go to File, Open (on Windows, choose File, Open Folder) and select the kafka folder — in my case it is in the root of my user directory — select it and click Open. Next, navigate to the folder called logs — this one — and here you see a bunch of log files that were created automatically by Kafka after the start of the Kafka server. Let's observe the log file called server.log; let me hide the left panel. In this file you basically see the same messages as we saw in the terminal, but these messages are colored and, of course, more readable. You see the same information here: after its start, the Kafka server tries to connect to ZooKeeper on localhost at port 2181, the connection attempt finally times out — we see that information here as well — and afterwards the server shuts down. You'll also see other log files in this folder, but for now we're interested only in this file, server.log. OK, let's proceed: next, let's start ZooKeeper. See you next.
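For reference, a sketch of this failed first attempt, with the key log lines abbreviated (ZooKeeper is not running yet at this point):

  cd ~/kafka
  bin/kafka-server-start.sh config/server.properties
  # INFO Connecting to zookeeper on localhost:2181
  # ... Timed out waiting for connection while in state: CONNECTING
  # INFO shutting down ... shut down completed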
Now you know how to read the log files located in the logs folder of the kafka folder; let's start ZooKeeper by executing the corresponding script. First, let's list the files in the bin folder and find the file we need to execute: here it is, zookeeper-server-start.sh. Same as the Kafka server, it needs a configuration file, and the default one is located in the config folder; let's list the contents of config, and here is the file: zookeeper.properties. Let's start ZooKeeper: clear the terminal and type bin/zookeeper-server-start.sh (I can use Tab here), and next comes the path to the configuration file, config/zookeeper.properties (Tab again), and press Enter. ZooKeeper started successfully, and you see the infamous message "INFO binding to port 0.0.0.0/0.0.0.0:2181": it means the ZooKeeper server now listens on port 2181 on all IP addresses of this localhost, this computer. Now we could try to start the Kafka server, but before doing that, let's try to find the logs of the ZooKeeper server — first, the location where its log files are stored. Go to Visual Studio Code and open the configuration file of ZooKeeper: go to config and open the zookeeper.properties file, scroll down, and here you find the path /tmp/zookeeper — that is the default path where ZooKeeper stores its data. In your case you may find another path here, if this default configuration file was modified. OK, back to the terminal; let's navigate to that folder, and please note that we need to keep ZooKeeper running, which means you need to open a new tab in order to do other things. Let's open a new tab (I can use the shortcut Cmd+T), cd to /tmp and then to zookeeper (Tab works here as well), press Enter, and list the files. I see just a single folder, version-2; cd into version-2 and list the files, and there is a single file called snapshot.0. Please note that while ZooKeeper is running and accepting connections from the Kafka server, it will add log files here, to this /tmp/zookeeper folder; for now there is just the single snapshot.0 file. Don't try to read the contents of this snapshot file — it is not human-readable. OK, let's close this tab and proceed: next, let's try to start the Kafka server once again while the ZooKeeper server is still running. See you next, bye bye. Now we have the ZooKeeper server up and running: it listens on port 2181 and accepts connections on any IP address of this localhost — that's what this record means. OK, now let's try to start the Kafka server once again, and please note that we need to keep the ZooKeeper server up and running. Let me make this window smaller — half the screen size — move it here, open a new tab and drag it here, below this one, and in this tab let's start the Kafka server the same way as in one of the previous lectures: go to the kafka folder and type bin/kafka-server-start.sh — this is the executable file which starts the Kafka server — and next, after a space, specify the path to the configuration file for the Kafka server. We will use the default configuration file located in config, file name server.properties (you can use Tab to see all the options; press Tab once again), and this command will start the Kafka server with the configuration file server.properties. OK, let's press Enter, and now the Kafka server started successfully and is up and running.
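The working startup sequence, side by side (two terminals, default configuration files, ZooKeeper first):

  # terminal 1 - start ZooKeeper
  bin/zookeeper-server-start.sh config/zookeeper.properties
  # INFO binding to port 0.0.0.0/0.0.0.0:2181

  # terminal 2 - then start the broker
  bin/kafka-server-start.sh config/server.properties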
Let's now go to Visual Studio Code and observe the server logs: go to the logs folder and find the file called server.log, and scroll to the very end of this file — you see a bunch of logs here; let me hide the side panel. At the end of this log file you see the message "INFO Kafka server with ID 0 started", and that means the Kafka server is now up and running and we are able to use it for messages. Let me now scroll up a bit and show you other meaningful messages. The first one is here: broker ID 0. Every broker (Kafka server) has a unique ID, and this ID must be unique across the Kafka cluster; the default ID is 0, and this ID is specified in the configuration file of the Kafka server. Let me quickly go back to that configuration file, server.properties: if I scroll up here in this file you'll find the broker.id parameter, and the default is 0 — that's why here in the server.log file we see broker ID 0. OK, let's scroll and find another line, this one: "Awaiting socket connections" on this IP address and this port. It means the Kafka server (Kafka broker) is now waiting for connections on port 9092, and again you see the four zeros, which means it accepts connections on any IP address of your computer. The next message I want to show you: "Log directory /tmp/kafka-logs not found, creating it". This folder will be used for the storage of Kafka messages; recall that I explained the difference between the logs folder located inside the kafka installation folder and this one: in the installation's logs folder Kafka stores system logs like the server.log file, while in /tmp/kafka-logs — the default — Kafka stores the messages that are sent by producers and consumed by consumers. No worries, we will examine the contents of this folder in the next lecture. Now let me scroll up a bit again: here you see the bunch of parameters that were actually used for the start of this Kafka broker, and those are the default parameters. Let's scroll up to the beginning of those configuration parameters — notice that in the right-hand section (the minimap) you can also scroll; you see the text small there and can navigate more easily. We need to find the logs related to the connection to ZooKeeper, and they start here: "Connecting to zookeeper on localhost:2181". Now ZooKeeper is up and running, and that's why the Kafka server initializes a new session to this address, localhost:2181 — this port is used by ZooKeeper — and below you see that the ZooKeeper connection is successful: here is the message "ZooKeeper client: Kafka server connected", an indication that the Kafka server (broker) has successfully connected to the ZooKeeper server. This message above, "Creating new log file", actually comes from ZooKeeper, and we saw the same message in the ZooKeeper terminal window as its last message: "Creating new log file: log.1". If you navigate to /tmp/zookeeper/version-2 you'll find this log.1 file there. That's basically all I wanted to show you in this log file; please notice that the file also contains all the previous logs — for example, here are the logs related to the earlier, unsuccessful start of the Kafka server, when ZooKeeper wasn't available. If you want to clean this log file up, simply delete it and Kafka will recreate it from scratch.
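If you prefer staying in the terminal instead of VS Code, something like the following surfaces the same startup lines (these grep patterns are mine, not from the video):

  tail -n 50 logs/server.log
  grep -E "started|Awaiting socket connections|Connecting to zookeeper" logs/server.log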
OK, now the ZooKeeper server and the Kafka server are both up and running, and we are able to start using them and send some messages; but before doing that, let's observe the contents of the log directory that the Kafka server will use for the storage of messages. See you next, bye bye. Now we have the Kafka server and ZooKeeper up and running — in my case in these two windows: here is ZooKeeper and here is the Kafka server — and while we are doing basically nothing, you see some logs appearing in the Kafka server window, like "Removed 0 expired offsets in 0 milliseconds", which means the Kafka server is performing some background jobs. OK, in this lecture let's examine the contents of the log directory that was created by the Kafka server; notice that we saw this log message — "Log directory /tmp/kafka-logs not found, creating it" — so let's now observe the contents of that folder. Open a new terminal window — I'll drag it to a new window and make it fullscreen — cd to /tmp/kafka-logs, and list the files. In this folder you see that Kafka has automatically created several files related to the processing of Kafka messages; but recall that for now there are still no messages at all, because we have not yet created any topics and have not yet sent any messages to the Kafka server. For now there are some files like meta.properties or cleaner-offset-checkpoint. We can quickly observe the contents of the meta.properties file, and you see contents like broker.id=0, version=0 and a cluster.id — this one. Let's also quickly look at the contents of cleaner-offset-checkpoint — cat cleaner-offset-checkpoint — and it is basically empty; and log-start-offset-checkpoint is empty as well. The reason is again simple: we have not yet sent any messages to the Kafka server, and these files were created here in advance. OK, let's clear the terminal and quickly go to the ZooKeeper data directory: cd one level up, then cd to zookeeper and list the files — there is still the one folder, version-2; cd into this folder and list the files, and now you see a new file that appeared after the successful connection from the Kafka server to ZooKeeper — this one. Let's try to read it: cat log.1. You see partially unreadable text, but you can distinguish this part and this part — they are information about the first broker that successfully connected to this ZooKeeper; recall that our first Kafka server is running at this port on localhost, and its ID is 0. But you may ask me: where does this port come from, and how did the Kafka server decide which port it should run on? Let me show you quickly. You can close this window — we don't need it anymore — and go back to Visual Studio Code, and open the server.properties file (recall that it is located in the config folder; here is the file). This is the default configuration file for the Kafka server (Kafka broker). First scroll up, and again I show you this line: broker.id=0, the default ID of the Kafka broker. Below it you'll find a commented-out listeners line with port 9092, which means the default port for the Kafka server is 9092; if you want to change it, you can uncomment this line and change the port, for example to 9093, and we will do that later in the course when we try to start several Kafka brokers on the same computer.
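For orientation, the relevant lines of config/server.properties look roughly like this with the 2.4.0 defaults (the listeners line ships commented out):

  broker.id=0
  #listeners=PLAINTEXT://:9092
  log.dirs=/tmp/kafka-logs
  num.partitions=1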
For now, let's put that line back into its commented-out state. So now you know that the default port for the Kafka server is 9092 and the default broker ID is 0; there is also the default log path, /tmp/kafka-logs, where Kafka will store the messages and the files related to their processing. OK, that's all for this short overview of the logs folder and the current picture of our setup: we are all set and could start sending messages to the Kafka server. But in order to send a message to the Kafka server, it must be sent to a specific topic, and we can create multiple topics — so let's create our very first topic of this course in the next lecture. See you in a bit, bye bye. Now there is an Apache Kafka cluster running on your computer, and it contains one ZooKeeper and one Kafka broker. In this section I'll explain how to create your first Apache Kafka topic and how to explore the contents of that topic; you'll also understand what happens under the hood when you create any Apache Kafka topic. Let's get started. [Music] At the moment we have a ZooKeeper server and a Kafka server (broker) up and running, and please note somewhere that ZooKeeper is running on localhost at port 2181 and the Kafka server (broker) is running on localhost at port 9092. Again, these ports are the defaults, and of course you can change them if you need to: if you want to run multiple brokers on a single computer, you need to run every server on a separate, unique port, and certainly ports such as 2181 and 9092 shouldn't be occupied by any other services on your computer if you want to run these services there. Why am I showing you this information and why is it important? At the moment we have a single Kafka cluster with a single broker — but it doesn't matter; it is a single Kafka cluster with a ZooKeeper and a Kafka server — and if you want to send any messages to the Kafka cluster from a producer, for example, or consume messages with a consumer, you need to specify connection details for the Kafka cluster. Usually connections are made to one of the Kafka servers inside the cluster, which means that if you want to connect to this cluster — with this ZooKeeper and this Kafka server — you need to connect to localhost:9092; if there were other Kafka servers (brokers) inside the cluster, you would be able to use any of those servers to make the initial connection to the Kafka cluster. Keeping this information in mind, next we will create the first Kafka topic, called cities, and we will create it using another Kafka script called kafka-topics.sh. This script is located, again, in the bin folder — let me go there; here it is, and if you scroll down you'll find this script, kafka-topics.sh. Using this script, in a separate window, we will connect to the Kafka cluster and create a new topic there. Let's make it real next, bye bye. Now we will create the first Kafka topic, and for that we will use the script kafka-topics.sh. Ensure that ZooKeeper and the Kafka server are up and running, and please open a new terminal window — I'll go to Shell, click New Window, and drag this window to another desktop, in order to be able to switch back to the ZooKeeper and Kafka server windows and observe what happens there. Here in the new terminal window, cd to the kafka folder, clear the terminal, and run the script located in the bin folder, named kafka-topics.sh.
Let's first try to run the script without any arguments: press Enter, and I see this output with a long list of different options, which means we are not able to run this script without any arguments — we need to use some options. You see here at the top that this script is used for the creation, deletion and changing of topics, and our goal now is to create the first topic in our Kafka cluster. Notice that this terminal window is completely separate from those two windows: here ZooKeeper is running, and here the Kafka server is running — the ZooKeeper and the server have formed the Kafka cluster — and now, from this script, we somehow need to connect to that cluster and create a new topic there. We will use several named options, and the first one is --create. Press Enter now and see what happens: we see an exception in thread "main" (java.lang...), and this exception tells us that we need to specify either a bootstrap server or ZooKeeper — which is what I told you in the previous lecture. There is a Kafka cluster; ZooKeeper is running on its default port, and the Kafka server (a single broker) is running on its port, and in order to connect to this Kafka cluster you should specify either a so-called bootstrap server — it could be any of the servers in the cluster — or a ZooKeeper IP address and port, to connect via ZooKeeper. Let's use the bootstrap-server option for now: after the --create option, add another named option — again with two dashes — --bootstrap-server, and after a space comes the IP address and port of the bootstrap server. In our case the server is running at localhost, port 9092; you could type either localhost:9092 or the IP address of localhost, 127.0.0.1 — let me type localhost for simplicity, and next comes the port of the bootstrap server, 9092. Now let's try to create a new topic: Enter. And now I again see the list of help options; let me scroll up to the very beginning of this list, and here I see another error: missing required argument "[topic]". It means that in order to create a new topic we need to specify its name, using another named argument called --topic. Let's do that: clear the screen, go back through the history, and add the named argument --topic, and after a space specify the name of the new topic — let's create a topic called cities, which will be used for exchanging messages with the names of different cities. Press Enter, and now there is no error — a good sign for sure. Now we can go back to the Kafka server window — this one — and observe what happened there: if you scroll a bit you'll see information like "INFO Partition cities-0 broker=0", which means a new partition for the new topic cities was created. Let's also go to Visual Studio Code and observe the logs: go to server.log, hide this section, and here you see a message like "creating topic cities with configuration {}" — an empty object — meaning the new topic cities was created with the default configuration. Next, let's observe what happened in the Kafka cluster after the creation of the new topic, after this command. Bye. We have just created the very first topic in our Kafka cluster, and the name of this topic is cities. For its creation we used the kafka-topics script with several named arguments.
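The finished command from this lecture in one line, assuming the default single-broker setup:

  bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --topic cities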
Notice that in the shell you specify named arguments with the -- prefix, like so; for example, here we specified the connection string for the bootstrap server — one of the servers inside the cluster. For now we have only a single server inside the cluster, and of course each cluster must also have at least one ZooKeeper server — that's what we basically have. And with the named argument --topic we specified the name of the topic we wanted to create. Now, in this lecture, let me show you what exactly happened in the Kafka cluster when we entered this command. For that, let me go to the folder where Kafka stores the logs related to Kafka messages: go one level up, cd to /tmp/kafka-logs, and list the files. Recall that we observed the contents of this folder before and there were only five files; but now there is a new folder called cities-0. Let's go into this folder — cd cities-0 — and list the files: as you see, this folder contains several files, for example this file with the extension .index and this file with the extension .log. Let's try to read the .index file: let me copy its name, cat, and paste the name — and this file is empty. Let's try to read the .log file the same way: copy its name, cat, and paste — this file is empty as well. The reason is that we have not yet sent any messages to the cities topic; as soon as we do, Kafka will store the new messages inside these files in the cities-0 folder. But why does this folder have such a weird name, cities-0? Let me explain. Go to Visual Studio Code and open the server.properties file, and scroll down a bit to this line: num.partitions=1. This line controls how many partitions are created by default for every topic. But what is a partition? Inside every topic, messages can be spread among several partitions, and every partition is basically just a separate folder, like the one we see here. In our case Kafka created just a single partition, because by default, as you see, the number of partitions is set to 1; if you changed this default to, say, 2, you would see two folders here, named cities-0 and cities-1. Again, by default Kafka creates just a single partition per topic, and that's why in this folder name we see cities-0: it is the partition number. There are no other folders that Kafka created in the kafka-logs folder — there was just the single folder cities-0 — and Kafka did this due to the configuration parameter num.partitions=1. OK, let's proceed; next I'll show you how to read information about the existing topics in the Kafka cluster. See you next, bye bye. Now you know what happened in the Kafka cluster under the hood when we entered the command that created a new topic. Let's now try to read the details about this cities topic. For that, cd again to the kafka folder, clear the terminal, and run the same kafka-topics script, but with another option, --list: bin/kafka-topics.sh, then the option --list, and afterwards we need to specify, same as before, either a bootstrap server or a ZooKeeper server. Let's try the connection string of ZooKeeper this time: use the --zookeeper option, and here comes the address for the connection to the ZooKeeper server — recall that ZooKeeper is running at port 2181, so type localhost:2181 — and press Enter.
You see the list of the topics in the Kafka cluster, and for now there is just the single topic called cities — that's what this command does. You can always use the --list option in order to list all the topics that exist in the Kafka cluster. Also notice that we just used the --zookeeper option instead of --bootstrap-server, which means you are able to connect to the Kafka cluster either via ZooKeeper or via any of the Kafka servers. OK, let's now read the details about this topic called cities; for that let's use another option, --describe. Type the same command — bin/kafka-topics.sh — then --describe, then, same as before, either ZooKeeper or a bootstrap server (let's again specify --zookeeper localhost:2181), and next comes another named argument: the name of the topic you want to read details about. Add --topic and the name of the topic, cities, and press Enter. Here we now see the configuration details for the cities topic; let me make the window a bit smaller to fit it to the screen. We see such information as the topic name, cities; next comes the partition count, 1 — and you already know what that is: the quantity of different folders where the messages of this topic will be stored, which by default is 1. Next comes a parameter that is new for you: replication factor, 1. What does this parameter mean? At the moment our Kafka cluster has only a single Kafka broker, and if that server becomes unavailable for some reason, nobody will be able to store the new messages arriving from producers, and no consumers will be able to read data from the server. That's why the replication factor is used: if there are multiple brokers in the cluster, you are able to replicate every message that arrives in a topic — in our case the cities topic — and the replication factor tells how many times each message will be replicated. In our case this parameter is set to the default value 1, because we have only one server, and only one server will save every single message in the topic; but if there were, for example, 3 servers in this cluster, you could set the replication factor to 3, and every message in this cities topic would be stored on every one of the 3 servers — for backup purposes, of course. OK, that's the replication factor. Below, after this line, you see a line with details for every partition: here comes the partition number — and you know that partition numbers start from 0: 0, 1, 2 and so on — then information about the leader, the replicas, and the in-sync replicas (ISR), and everywhere you see the number 0. It is basically the number of the broker, and in our case we have only a single broker with ID 0 — recall the configuration parameter broker.id=0, this one — and that is exactly the number you see in these fields. Every partition must have exactly one leader broker; in our case, again, we have only a single broker, and that's why it is the leader for this partition, partition number 0. We will get back to this later, for sure, but for now you should remember: with the --describe argument you can read the details about each topic (in our case we have only a single topic), and with the --list option you can list all the topics in the Kafka cluster.
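Both inspection commands side by side, with output roughly as described above (the exact spacing may differ in your version):

  bin/kafka-topics.sh --list --zookeeper localhost:2181
  # cities

  bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic cities
  # Topic: cities  PartitionCount: 1  ReplicationFactor: 1  Configs:
  #   Topic: cities  Partition: 0  Leader: 0  Replicas: 0  Isr: 0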
OK guys, that's all for this one; next we'll try to send the very first message to the cities topic, and for that we need a producer. Luckily, Kafka ships with a built-in console producer and console consumer, located in the bin folder — they are kafka-console-producer and kafka-console-consumer — and we can easily use them to produce and consume Kafka messages. Let's produce some messages to the cities topic in the next lecture, bye bye. [Music] Now you have an Apache Kafka cluster on your computer and one topic created, and now is a good time to start the first producer and produce some messages to the new topic. In this section I'll explain what the built-in Kafka console producer is, what the Kafka console consumer is, and how to start them; they basically ship along with the Kafka installation, and you can get them up and running out of the box. We will also start multiple producers in parallel, and multiple consumers, and see what happens. Let's get into it. [Music] We have just successfully created the first topic in the Kafka cluster, named cities, and you can use the kafka-topics.sh script with the named option --list to list all the topics available in the cluster. Now it's a good time to start sending some messages to this cities topic, and for that we could use any producer. There is a built-in producer that ships with the Kafka installation, and its name is kafka-console-producer; it is located in the bin folder, and here it is: kafka-console-producer.sh. Let's use the script: clear the terminal, type bin/kafka-console-producer.sh, and try to execute it just like that. As usual, we get a list of help options, which means that in order to start this console producer we need to specify some required arguments. The first required argument is topic — you see here that it is required — and the second one is located at the beginning of this list: broker-list. In this argument we basically need to specify the list of brokers, and if multiple brokers are available you should separate them with commas. OK, let's add those two arguments, broker-list and topic: clear the terminal and type bin/kafka-console-producer.sh, then the --broker-list argument with the connection string for our single broker, localhost:9092, and then the second required argument, --topic, with the name of the topic we want to produce messages to — the name of our first topic is cities. Press Enter and see what happens: we see a prompt, which means the Kafka console producer has successfully connected to the Kafka broker, and now we can start sending some messages. The topic we created is called cities, so let's send the names of some cities: let's start with New York — press Enter — the next one will be Berlin, the next, let's say, Paris, one more, and one more city name. OK, we have just successfully sent several messages using the Kafka console producer, but now there are several questions. First: were those messages successfully received by the Kafka cluster or not? Second: where were those messages stored in the Kafka cluster? And last: how can we consume those messages from the Kafka cluster? Let me answer the last question first: next we'll launch the built-in Kafka console consumer and try to consume the messages we just sent with the producer. Let's try it next, bye bye. We have just launched the Kafka console producer, connected to one of the servers in the Kafka cluster, and sent some messages to the topic cities — here are those messages.
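The producer invocation in one line, plus a few of the messages typed at its prompt:

  bin/kafka-console-producer.sh --broker-list localhost:9092 --topic cities
  > New York
  > Berlin
  > Paris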
Now this console producer is still up and running and we could send more messages, but let's try to consume messages from the Kafka cluster. For that we need to launch a consumer that will consume messages from the cities topic, and luckily the Kafka console consumer is available in Kafka out of the box. Let's use it: open a new window, make it smaller, go to the kafka folder, and launch the consumer with the help of the built-in script called kafka-console-consumer.sh — this one. Press Enter, as usual, without any arguments; if you scroll up to the beginning of the options list, you'll find that there is a required argument called bootstrap-server. This argument is already familiar to you: it supplies the list of Kafka servers we want to connect to. We have just a single server, so we specify it as localhost:9092. OK, copy this argument name, clear the terminal, type bin/kafka-console-consumer.sh, paste the argument, type localhost:9092, and press Enter. Again I see the list of help options, which means there is another argument that is required; go back to the beginning of the output, and here you see the message "exactly one of whitelist/topic is required". It means that in order to successfully launch the Kafka console consumer, we need to specify either the whitelist argument or the topic argument: using the whitelist argument you can connect to a set of topics and read messages from multiple topics using wildcard syntax, but our goal for now is to read messages from just the single topic called cities, so let's specify the --topic argument with the topic name, cities. Clear the terminal again, go back to the command, add the --topic argument with the name cities, and press Enter. It seems the script is now up and running, but we don't see any messages in the console. Let's do the following: go to the Kafka console producer and produce one more message — let's send another city name, for example Delhi — press Enter, and now I get Delhi here in the Kafka console consumer. That's a good sign: it means the Kafka console consumer is running correctly, and it now receives all the messages being sent by the producer. Let's send another message, for example Dubai, and this message is successfully received by the consumer as well. But now you may ask me a question: what about the messages that were sent by the producer earlier, here, from New York to Sydney? Let me answer that question in the next lecture, bye bye. Now the Kafka producer and consumer are up and running, and after the launch of the console consumer we saw that it starts receiving only the new messages that arrive from producers to the Kafka cluster. Now let me answer the question of how you can read all messages from the beginning — from this first one, in our case. Let me stop this Kafka console consumer (Ctrl+C), clear the terminal, and launch it again, but with another option, called --from-beginning. Press Enter, and now I got all the messages, starting from the very first one, New York. OK, that's how you read the messages of a specific topic from the very beginning: for that you use the option --from-beginning. I can terminate this console consumer and launch it again with the same option, and I get the same result — all messages are consumed. And now we can draw a very important conclusion: the Apache Kafka cluster stores messages even if they were already consumed.
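The two consumer invocations from this lecture, side by side:

  # receives only messages produced after it starts
  bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic cities

  # replays the topic from the very first message, then keeps listening
  bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic cities --from-beginning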
This means that we are able to read Kafka messages from the Kafka cluster several times, by different consumers and at different moments in time. Next, let's launch another Kafka consumer and see how those two consumers will receive the messages sent by this producer. I'll see you next, bye bye. Now please keep this producer and this consumer up and running, and let's start one more consumer. For that, copy the consumer command without the --from-beginning option, open a new window (let's make this window smaller, open a new one, and adjust its size so it is visible in this corner), go to the kafka folder, clear the terminal, paste the copied command, and press Enter. If you see that the cursor has jumped to the next line, it's a good sign: it means this console consumer is up and running, and now there are two consumers — this one and this one. Let's now try to produce one more message here, for example Amsterdam; press Enter, and now you see Amsterdam in both consumers, here and here. That means you are able to read messages from the Kafka server with multiple consumers in parallel, and that's another important conclusion: having just a single Kafka cluster between producers and consumers, you provide a single point of storage for all messages, and all producers and consumers work with just this point of storage. Next, let's try to launch one more producer and produce messages from different producers. Let's try that next, bye. Now there are two consumers up and running here, and we have just verified that both of them receive new messages — but we have just a single producer. Let's now try to run one more producer by copying this command and running it in a new terminal window: make this window smaller — it will be producer number one — open a new window, make it smaller and place it somewhere here, cd to the kafka folder, clear the terminal, and start the new producer with the same command we used to start the first one. Press Enter, and now we are able to produce some messages from this producer too: let's write another city name, for example Barcelona, press Enter — the message was sent, and it was successfully received by both Kafka consumers. And here is another important outcome: neither consumers nor producers know about each other. This producer doesn't know anything about the existence of those consumers, or about the other producer; it just performs its own job, nothing else. The same goes for the other producer: it produces messages without knowing about other producers or consumers. And consumers don't care which producers produced which particular messages; they just consume messages from one centralized storage, from the Kafka cluster. That is the beauty of the Kafka architecture: producers and consumers may appear and may go, but as long as messages are stored in the centralized storage — the Kafka cluster — they may be consumed by consumers, and new producers may send new messages to this Kafka cluster. For example, let's try to stop this Kafka producer: it was stopped, but this action didn't affect those consumers or the other producer in any way — they are still up and running, and I can produce another message, for example here, by typing, let's say, Madrid, and this message is again received by both consumers. I can also start the stopped producer once again.
Sending another city name from it, for example Toronto, it is consumed, as before, by both consumers, and again those consumers don't care about the presence of producers: they do just a simple job — get new messages from the Kafka cluster — that's it. OK, at this moment you should have a really nice understanding of what Kafka is and what it actually does; you also understand what a producer is and what a consumer is: producers send messages to the Kafka cluster, and consumers consume those messages. Now let's proceed, and next let's answer some other questions, like where Kafka actually stores all those messages — because, as we saw before, you are able to read the messages from the very beginning, which means that somewhere Kafka stores all the messages that were sent to a particular topic. Let's explore that next, bye bye. In this lecture I'll answer the question of where Kafka stores the messages that are sent by producers and consumed by consumers. We found out that you can start multiple consumers and multiple producers, and that consumers are able to read messages from the very first one in each topic — in our case we read the messages from the beginning in the topic cities. Recall that I told you before that Kafka stores all messages in the /tmp/kafka-logs folder by default, and that this is a parameter set in the server configuration file; let me show you quickly: go to VS Code, open the config folder, open the server.properties file, and here on this line you find the path where Kafka stores all messages. Let's open a new terminal window and go to this folder: a new window, dragged to a new desktop, and cd to /tmp (don't forget the slash) and then kafka-logs. Now let's list the files, and recall that after the creation of the new topic cities we explored this folder, and there were those files plus the new cities-0 folder; but now there are also a bunch of new folders with the prefix __consumer_offsets — there are 50 such folders here. For now let me just say that when we started sending messages to the cities topic, Kafka automatically created a new topic called __consumer_offsets, and this topic has 50 partitions by default; you may notice that the folder names start with __consumer_offsets-0 and the last one is __consumer_offsets-49 — simply 50 different folders for the 50 partitions of this new system topic called __consumer_offsets. Now we are interested in the contents of the cities-0 folder: recall that the cities topic has just a single partition, and that's why we see only one folder with this name, cities-0. Let's go into this folder — cd cities-0 — and list the files, and, same as before, we see here only four files. Recall that after the creation of the topic we examined the contents of two of those files and they were empty; let's have a look at them now. Explore the contents of the .index file first: copy its name, use the cat command, paste the name, and we see that this file is still empty. But what about the file with the extension .log? Copy its name, cat, paste — and this file is not empty anymore. The file is partially readable, and you can distinguish the messages that were sent to the cities topic: the first message was New York — if I go back to the producer, here you find that first message — and the last one is Toronto, and if I go back here, you find Toronto at the end of this file. That means Kafka stored all the messages in this log file, located in the /tmp/kafka-logs folder — it is that simple.
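A sketch of the same on-disk inspection; the segment file name below, 00000000000000000000.log, follows Kafka's standard base-offset naming, though your folder may show slightly different files:

  cd /tmp/kafka-logs
  ls    # cities-0, __consumer_offsets-0 ... __consumer_offsets-49, checkpoint files
  cat cities-0/00000000000000000000.log    # partially readable; the city names are visible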
Of course, every message also carries some additional system information, and we will get back to that later in the course; for now you need to understand that Kafka simply stores all messages inside files, nothing else — it works like a file system, and that's why, thanks to those files, you are able to read the messages from the Kafka cluster at any time. But certainly every message has its lifetime: after a specific amount of time Kafka deletes old messages in order to make room for new ones, and consumers that didn't read messages in time will not be able to read them again — this is, of course, done for optimization. For now, remember that all Kafka does after it receives a message from a producer is append it to the log file in the specific partition folder; in our case the folder name is cities-0. OK, now let's go to VS Code and try to read the log files of the Kafka broker itself and see what was happening there. Go to VS Code, go to the logs folder — this one — and recall that we examined the server.log file there. You may notice that this file is automatically cleaned up: Kafka stores the previous logs in files such as server.log followed by a date, and if there were enough logs it archives them every single hour. So please try to find the log file whose date is closest to your current time — it should be at the end of this list; in my case this file is here. Let me open it and close the side pane. Because this file doesn't have the extension .log, VS Code won't highlight every line with different colors; to make it happen, click on "Plain Text" here, and in the Select Language Mode box type "log" and select Log from the drop-down — now the type of the file is set to Log and we can read it more easily. In this file you see a bunch of different messages, some of them related to the creation of the system topic called __consumer_offsets — I told you about it before — and you can find the message related to the creation of this new system topic. For that, press Cmd+F (or Edit, Find), and here in the search bar type "50" followed by a space, and you should find a single log message where 50 appears — this one, related to the creation of the topic __consumer_offsets with 50 partitions and replication factor 1; the creation was successful. That's what I told you: the Kafka cluster automatically created the new system topic __consumer_offsets with 50 partitions. This is the first meaningful log here. Also, if you scroll down, you'll find that all the partitions of this system topic were created on the broker with ID 0, and the reason is that we have just a single broker in our cluster — that's why all partitions of this topic were created there. If more brokers were available, the partitions would be spread among all of them; we will try that later in the course, when we create our own topics, and you'll see the partitions spread among all available brokers. Also, if you search for "producer", you see some logs related to producers, but they basically don't tell us much. Let's try to find the messages related to consumers: type "consumer" here, and you'll find messages like "rebalance group console-consumer-18980" or "stabilizing group console-consumer-" with the same number; you'll also find similar messages related to the other console consumer, with another number — below you may find another console consumer group name, with the suffix 4166 in my case.
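If you'd rather search from the terminal, grepping over the broker log files should surface the same entries (the patterns are mine, not from the video):

  grep "__consumer_offsets" logs/server.log*
  grep "console-consumer" logs/server.log*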
In your case those numbers will be completely different. Recall that for consuming the messages we launched two different consumers — here they are, still running — and Kafka automatically created new consumer groups with random numbers; that's because every consumer must belong to a specific consumer group. Later on we'll try to start consumers with custom consumer group names, but for now you see random names like this one. Also, let's try to find the name of our topic — recall that the name of the topic in our case is cities — and you won't find any results here: that's because the logs related to writing to or reading from a specific topic (in our case cities) are very low level, and they are not shown here. Also, please search for the word "offsets", like this, and you'll find a bunch of messages related to offsets. An offset is basically the number of each message that is stored in a Kafka topic: every message has a unique offset number, and consumers basically start reading from a specific offset. When we started the consumer command with the argument --from-beginning, the consumer told Kafka that it wants to read the messages in the cities topic from the very beginning — from the first one, which has offset number 0. So the first message, in our case the message New York, has offset 0. And now it's a good time to end this introductory practice section, where you were able to launch producers and consumers, generate some messages, and consume them with consumers; at the beginning of the next section we'll shortly summarize what you have learned so far — what a Kafka cluster is, what a broker is, what ZooKeeper is — and afterwards we'll talk in detail about such concepts as offsets, consumer groups, leaders and so on. So let's go on to the next section, bye bye. If you have done all the practice activities of the previous sections along with me, you already understand a lot about Apache Kafka: you know what a broker is, what ZooKeeper is, what a topic is, what a producer and a consumer are, and how all of that actually works. In this section I'll take some time to explain all of those concepts using diagrams — we won't do any practice activities; I'll use just diagrams and explain the different concepts. I'll explain in detail what an Apache Kafka broker is, what ZooKeeper is, what a ZooKeeper ensemble is, and what a cluster of brokers is; you'll understand what a topic is, what partitions are, and how partitions are spread across the different brokers inside the cluster. You'll also understand the structure of a message, and how messages are actually sent by producers and consumed by consumers; I'll also explain how producers are able to produce messages to different brokers, to different partitions, and how consumers may read messages from multiple brokers. You'll also understand what the controller is and what its role in the Kafka cluster is, and, most importantly, you'll understand how to create a fault-tolerant, resilient, reliable solution with Apache Kafka: I'll explain what the replication factor is and how to store multiple copies of every message on different brokers, so that if any of those brokers fails, other brokers will take over and continue serving producers and consumers. Let's get started. [Music] Hi guys, in this section I'm going to explain the most important concepts of Apache Kafka, and we won't do any practice activities; I'll just try to give you a short overview of what Apache Kafka is and which components are included in it. Basically, in the previous section we talked about most of those concepts and explored all of them in practice, and I did that on purpose.
And now it's a good time to end this introductory practice section, where you were able to launch producers and consumers, generate some messages and consume them with consumers. At the beginning of the next section I'll shortly summarize what you have learned so far: what a Kafka cluster is, what a broker is, what ZooKeeper is. Afterwards we will talk in detail about such concepts as the offset, the consumer group, the leader and so on. So let's go on to the next section. Bye-bye!

If you have done all the practice activities in the previous sections along with me, you already understand a lot about Apache Kafka: you know what a broker is, what ZooKeeper is, what a topic is, what a producer and a consumer are, and how all of that actually works. In this section I'll take some time to explain all of those concepts using diagrams. We won't do any practice activities; I'll use just diagrams and explain the different concepts. I'll explain to you in detail what an Apache Kafka broker is, what ZooKeeper is, what a ZooKeeper ensemble is and what a cluster of brokers is. You'll understand what a topic is, what partitions are and how partitions are spread across different brokers inside the cluster. You'll also understand the structure of a message and how messages are actually sent by producers and consumed by consumers. I'll also explain how producers are able to produce messages to different brokers and different partitions, and how consumers may read messages from multiple brokers. You'll also understand what the controller is and what its role in a Kafka cluster is. And most importantly, you'll understand how to create a fault-tolerant, resilient, reliable solution with Apache Kafka: I'll explain what the replication factor is and how to store multiple copies of every message on different brokers, so that if any of those brokers fails, other brokers will take over and continue serving producers and consumers. Let's get started!

[Music]

Hi guys! In this section I'm going to explain the most important concepts of Apache Kafka. We won't do any practice activities; I'll just try to give you a short overview of what Apache Kafka is and which components are included in it. In the previous section we talked about most of those concepts and explored all of them in practice, and I did that on purpose, in order to make you quickly dive into Apache Kafka and get your hands dirty with some practice activities. Now it's time to explain every one of those concepts in greater detail, and afterwards, in the next section, we will continue with practice activities.

So I'm going to start with the definition of what Apache Kafka basically is, and I'll define it as follows: Apache Kafka is a distributed publish/subscribe messaging system. Let me talk about every part of this definition in detail, starting with publish/subscribe. A publish/subscribe system means that there are publishers and there are subscribers: publishers publish some information, and subscribers subscribe to that information. Let me first give you an example of a publish/subscribe system. You may be surprised, but YouTube is such a system. On YouTube there are a lot of publishers (they are called creators) that may create videos at any moment of time and publish them to the YouTube platform. On the other hand, there are a lot of subscribers, consumers, who are able to watch any video at any moment of time. YouTube, as a publish/subscribe system, stores every single video, and any video that was published to the platform may be watched at any moment of time by any quantity of consumers, or let's say subscribers. That's why YouTube is a great example of a publish/subscribe system. The key points here are that publishers may publish videos at any moment of time, they don't know anything about subscribers or consumers, and they don't know when consumers will watch those videos. YouTube, as a centralized system, is responsible for the storage of videos and for making them available for watching by subscribers; and subscribers, independent of other subscribers and independent of publishers, watch those videos at any moment of time. That's a publish/subscribe system.

The next word I want to explain in this definition is messaging. Messaging means that in Apache Kafka publishers and subscribers exchange messages, not videos like on YouTube, and every message is simply a sequence of bytes, nothing else. The responsibility of Apache Kafka is to store the messages that were sent by publishers and to supply those messages to subscribers whenever they ask for them.

And the last word in this definition is distributed. Distributed means that Apache Kafka is a fault-tolerant, resilient system with the ability to create large clusters with many, many different servers. Whenever any of the servers fails, or even multiple servers fail, other servers will continue operation and keep serving publishers and subscribers, and if everything is set up correctly, not even a single message will be lost. Okay, that is the definition of Apache Kafka; it is, again, a distributed publish/subscribe messaging system. Let's now proceed, and next let me explain what an Apache Kafka broker is. See you next!

We have just defined what Apache Kafka is: a distributed publish/subscribe messaging system. In every publish/subscribe system messages should be stored somewhere; publishers should be able to send messages somewhere, and subscribers should be able to read them from somewhere. In Apache Kafka, brokers are responsible for all of those operations: they store messages and they serve publishers and subscribers. Publishers in Apache Kafka are called producers (they produce messages to Kafka brokers), and subscribers in Apache Kafka are called consumers: they consume messages from Kafka brokers.
And what's most interesting, on every single physical or virtual server you may run multiple Kafka brokers, and they will be completely independent; you are able to create a Kafka cluster even on a single computer. So that's a Kafka broker. Again, its main responsibilities are the following: first, receive messages from producers; second, store those messages; and third, give consumers the ability to read those messages. That's all. The Kafka broker simply stores messages in files on a hard drive: producers are able to append messages to those files, and consumers are able to read from those files. It's that simple.

It is also possible to have multiple producers and multiple consumers, as on this diagram: multiple producers are able to simultaneously produce messages to a Kafka broker, and multiple consumers are able to simultaneously read messages from it. And of course messages may be produced and consumed asynchronously, at different moments of time. But notice that there is a single weak point on this diagram, and it is the broker: if this broker fails, nobody will be able to serve producers and consumers. That's why usually nobody runs Apache Kafka like that, with just a single broker; instead, a broker cluster is created. Let's discuss what a broker cluster is in the next lecture. Bye!

Hey, are you still here? Do you like this tutorial? If so, please smash the like button and share this video with your friends and co-workers. Let's go on!

Now you know that in Apache Kafka the broker is responsible for the storage of messages sent by producers and consumed by consumers, in a centralized point of storage. Of course, if there is just a single broker and this broker fails, nobody will be able to serve our producers and consumers, and all processes will simply stop. That's why usually a cluster of brokers is created. You are able to create very small clusters with a few servers, but you are also able to scale them and create a cluster that contains thousands of servers; large companies like LinkedIn or Netflix, who utilize Apache Kafka, have such large Kafka clusters. So that's a Kafka cluster. If there is a Kafka cluster and there are multiple producers and multiple consumers, they are able to interact with different brokers inside the cluster. What I mean is, for example, that every single producer may send messages to different Kafka brokers, and every Kafka broker will store part of the messages; it means that all messages from producers will be spread among different servers. On the other hand, Kafka consumers may read messages from different Kafka brokers. And if one of the brokers in a Kafka cluster fails, other brokers will take over and continue the operation of the entire cluster.

But now you may ask me a question: what if there are hundreds of brokers in a Kafka cluster? How do those brokers synchronize with each other, how do they talk to each other, how do they agree on how to distribute the workload, and so on? That's where ZooKeeper comes in. Let's discuss what ZooKeeper is and what it is responsible for in the next lecture. See you next, bye-bye!
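As a quick reminder of how we started this single-broker setup in the practice section; a minimal sketch, assuming the default configuration files that ship with Kafka:

    # ZooKeeper must be running before the broker starts
    bin/zookeeper-server-start.sh config/zookeeper.properties
    # then, in a second terminal, start the Kafka broker
    bin/kafka-server-start.sh config/server.properties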
Now you understand why Apache Kafka is called a distributed system: because you are able to create a cluster of brokers with multiple brokers. But if there are multiple brokers inside the cluster, it becomes unclear how they maintain the state of the entire cluster and how they communicate with each other, and that's where ZooKeeper comes in. ZooKeeper was also developed by Apache (it's basically called Apache ZooKeeper), and it is used not just in conjunction with Apache Kafka: it is also used with Apache Hadoop, for example, or Apache Solr. A Kafka broker will not even start without an active connection to ZooKeeper; that's why ZooKeeper is a mandatory part of the Apache Kafka ecosystem.

The main responsibilities of ZooKeeper, when it is used with Apache Kafka, are the following. First of all, it maintains the list of active brokers: at any moment of time it knows which brokers are alive and which have failed. Also, ZooKeeper elects the controller. We will talk about what the controller is a bit later, but the controller is elected among the brokers in a cluster, and there is just a single controller in every Kafka cluster. Also, ZooKeeper manages the configuration of topics and partitions: when you create any topic in a Kafka cluster, it is basically created at ZooKeeper, and ZooKeeper distributes this configuration to all brokers in the cluster. That's why ZooKeeper is needed, and that's what it does in the Apache Kafka ecosystem.

But now you may ask me a question: what if this single ZooKeeper server fails? How will brokers, producers and consumers continue their operations? You are able to create a cluster of ZooKeepers, and let me talk about it in the next lecture. Bye!

If the cluster of brokers in your setup is very, very large and contains hundreds of brokers, of course it's not safe to have just a single ZooKeeper. That's why usually a cluster of ZooKeepers is created, and it is usually called a ZooKeeper ensemble. It is recommended to have an odd number of servers in the ZooKeeper ensemble. Why is that? In every ZooKeeper cluster there is a so-called quorum: the quorum is the minimum quantity of servers that should be up and running in order to form an operational cluster. If there are fewer servers than the quorum, the ZooKeeper cluster is considered down, and all brokers that were connected to this cluster will be down as well. Let me give you an example: in this diagram there are three ZooKeeper servers, and the quorum is two servers. If any one of those three servers fails, the ZooKeeper cluster will still be up and operational; but if another server fails and only one server is left, the ZooKeeper ensemble is considered down, and the entire Apache Kafka cluster will be down as well.

Now let me quickly explain why you should have an odd number of servers in a ZooKeeper ensemble and how you should choose the quorum size. Let's suppose you have four servers in the ZooKeeper ensemble and the quorum is set to two. Suppose two of those ZooKeeper servers are located, for example, in the USA and the two others are in Europe, and there is a network outage between the two continents. If the quorum is set to two, the two servers in the USA will still think that the ZooKeeper ensemble is up, running and operational, because the quorum is two and there are two operational servers; and the same relates to the two ZooKeeper servers in Europe, which will both think that the ZooKeeper cluster is up and running. You'll end up in a situation with two different clusters, two servers in each, and every cluster will serve Kafka brokers, Kafka producers and Kafka consumers. That's really, really bad: in such a case you will get discrepancies in messages and you'll end up with a mess. That's why it is recommended to set the quorum to half the quantity of ZooKeeper servers plus one. What I mean is, for example: if there were nine servers in the ZooKeeper ensemble, then the quorum should be set to five.
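This is roughly what a three-server ensemble configuration could look like; a sketch, assuming three hypothetical hosts zk1, zk2 and zk3 (ZooKeeper then requires a majority, two of three here, to stay operational):

    # zoo.cfg, the same file on every server of the ensemble
    tickTime=2000
    initLimit=10
    syncLimit=5
    dataDir=/var/lib/zookeeper
    clientPort=2181
    server.1=zk1:2888:3888
    server.2=zk2:2888:3888
    server.3=zk3:2888:3888
    # each server also needs a "myid" file in dataDir containing its number (1, 2 or 3)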
If, for example, there are fifteen ZooKeeper servers, then the quorum should be set to (15 + 1) / 2, so the quorum will be equal to eight. In such a case, even if seven servers of the fifteen fail, the quorum is still reached: we will still have eight servers in the cluster, and the cluster will still be up and operational. Okay, that's all about the ZooKeeper cluster, or ensemble. The most important thing you should keep in mind is that the quantity of servers in a ZooKeeper cluster should be odd: one, three, five, seven and so on. Okay, let's proceed, and next let me explain what you should do if your company is international and spread across different continents, where it is really hard to maintain just a single Kafka cluster across all continents. See you next, bye!

Let's proceed, and now let's talk about different clusters. If your company is really large and spread all over the world, it is a really nice idea to create multiple clusters: in different countries, on different continents, or something like that. Each cluster will actually be a separate entity, with its own brokers and its own producers and consumers; ZooKeeper is also required in every cluster, and you may create as many clusters as you need. But now you may ask me a question: what about data synchronization between different clusters? Of course it is possible if you need it: you are able to set up mirroring between different clusters, and you are even able to accomplish a scenario where clusters are fully in sync with each other. But what is the idea behind the creation of different clusters? Let me explain. For example, there are two clusters, one in the USA and one in Europe. Producers located close to the cluster in the USA will produce messages to that cluster; afterwards those messages will be synced to the cluster located in Europe, and if there is a consumer located in Europe, it will connect to the closest cluster, the one located in Europe. In such a case a message that was produced in one cluster will be read in another cluster, thanks to the mirroring between the two clusters. And of course such separation, the creation of two different clusters, will reduce the total load and increase efficiency. If you don't require any mirroring, you are able to create completely separate clusters in different regions. That's all I wanted to tell you about clusters. Please keep in mind that every cluster is a separate entity, and it requires separate brokers and separate ZooKeepers; and if you need to, producers may produce messages even to different clusters, and consumers may consume messages from different clusters as well. Okay, let's proceed, and next let me explain which ports are used by default when you start ZooKeeper and a Kafka broker on a computer. See you next, bye!

When you install Kafka on any computer, it basically comes as a set of files, and those files are mostly executable scripts, with the extension .sh, or .bat scripts for Windows. There are also default configuration files for ZooKeeper and for the Kafka broker, and here on this diagram you see the default ports that are used in those configuration files: if you start ZooKeeper, it will use port 2181, and the Kafka broker, or server, will use port 9092.
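These defaults come straight from the configuration files shipped with Kafka; a sketch of the relevant lines, assuming an unmodified installation:

    # config/zookeeper.properties
    clientPort=2181

    # config/server.properties
    zookeeper.connect=localhost:2181
    # the broker listens on 9092 unless the commented-out listeners line is changed:
    #listeners=PLAINTEXT://:9092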
Here is an important notice: if you are going to launch, for example, multiple ZooKeepers on the same computer, you need to use different ports, because every port can be occupied by only a single process on a single computer. That's why, if you want to run, for example, three ZooKeeper servers on the same computer, you need to adjust the configuration files and create a separate configuration file for every ZooKeeper instance, with different ports; it is also a good idea to create separate log folders for every instance. For example, you could use port 2181 for the first ZooKeeper, 2182 for the second one and 2183 for the third one. The same relates to Kafka brokers: it is possible to run multiple brokers on the same computer, and you also need to create a separate configuration file for every broker instance. You need to adjust at least the ports, making them, for example, 9092, 9093, 9094 and so on, depending on the quantity of brokers you are going to run on your computer; and you should also adjust the folders where each broker will store its messages, or logs. If you are going to run, for example, ZooKeeper instances on different physical or virtual servers, then there is no need to adjust the ports: you are able to run all of them on the same port, but on different computers. The same relates to Kafka brokers: you could use the default port 9092 on all computers, or, if you wish, you could adjust it to any custom port you want.

And one more important notice regarding clusters of Kafka brokers: if you create a cluster with different Kafka brokers, and those brokers are located on different computers, even in different data centers, and those brokers need to communicate via public IP addresses or domain names, you need to adjust the advertised name in the configuration file of each broker. The advertised name is the host name plus port that is communicated by the Kafka broker to ZooKeeper; when a producer or a consumer connects to the Kafka cluster, ZooKeeper will supply this host name plus port that it received from the Kafka broker. For example, if a Kafka broker advertised the host name localhost and the port 9092, then of course a producer located somewhere else in the world won't be able to connect to such a broker, because localhost is the local address of the computer itself. That's why please be careful and adjust the advertised name and port on every broker if you are going to make them public.
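For instance, a second broker on the same computer could get its own copy of the configuration file; a sketch with hypothetical values, assuming the copy is named config/server-1.properties:

    # config/server-1.properties
    broker.id=1                        # must be unique per broker in the cluster
    listeners=PLAINTEXT://:9093        # different port, since 9092 is taken
    log.dirs=/tmp/kafka-logs-1         # separate message/log folder
    # only needed when the broker must be reachable via a public name:
    advertised.listeners=PLAINTEXT://broker1.example.com:9093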
Okay, that's all I wanted to cover in this lecture; just make a note of the ports that are used by default by ZooKeeper and the Kafka broker. Also, please note that I don't want to dive deeper into the installation and launch of ZooKeeper and the Kafka broker, because we performed all of that in a previous section. Now let's move on, and next let's talk about topics. See you next, bye-bye!

So far we have talked about the architecture of Kafka clusters, and you know that every cluster consists of Kafka brokers and ZooKeeper servers; there are also producers and consumers. But where are messages actually stored at every broker? For that there is a special entity called the topic. Every topic has its own name, and the name must be unique across the Kafka cluster. Every message inside a topic has a specific number called an offset, and this number is assigned to every message when it arrives at a specific broker. As time goes by, every producer may append messages only to the end of the log: for example, here you see that new messages may be added after the message with offset number 10, and you are not able to insert a message, for example, between messages 2 and 3; that's not possible. Every log record in a broker is immutable, and you are not able to change it. On the other side, the broker may delete all log records that have expired. By default this period, called the log retention period, is set to 168 hours, or 7 days, and that means that, by default, every broker will delete all messages that are older than 7 days. Of course, this retention policy is subject to change if you need.

So the short summary is the following: every topic has its own unique name; every message has its own number, called an offset; newly arrived messages can be appended only after existing messages, and you are not able to insert any message before previous messages; and old messages may be deleted automatically if the retention period has expired. One more important notice: a Kafka broker does not care about consumers. It stores messages independently of the consumers that may or may not read them; it doesn't matter for the Kafka broker at all. Its job is just to receive new messages and append them at the end of the previous messages. That's all for this lecture, and next let me quickly explain the structure of each message. See you next!

Do you want more tutorials like this? If so, please subscribe!

From the previous lecture you know that every message stored at a Kafka broker is immutable: you are not able to change the contents of a message that was already stored at a Kafka broker. On the other hand, every message has its own structure, so let me explain it now. Every message has a timestamp, and this timestamp can be assigned to the message either by the Kafka broker or by the producer; it is configurable. Every message also has an offset number, and recall that in the previous lecture I told you that this number must be unique across the topic; actually, it must be unique only across a partition. We will get back to partitions pretty soon, in one of the next lectures, but for now just remember that the offset number is unique only across a partition in a specific topic. So that's how the timestamp and the offset are assigned to the message. The message itself may contain a key, which is optional, and, finally, a value. The value of every message is simply a sequence of bytes, nothing else, and the Kafka broker does not care what is actually stored inside the message; it simply stores a sequence of bytes. That means that, using Apache Kafka as a centralized transport messaging system, you are able to exchange different kinds of data: for example, you could send objects, or strings, or numbers; more than that, you are able to send files. Everything must simply be encoded into bytes; Kafka stores just sequences of bytes. But please keep in mind that the idea of the Kafka messaging system is to keep messages as small as possible, so please don't try to send entire movies in a single message; that's not the purpose of Apache Kafka. Its purpose is to quickly deliver small messages between producers and consumers.

Now about the key: it is optional, and you are able to use it as an additional grouping mechanism for messages inside a topic. For example, if there are multiple stores that send messages to a centralized Apache Kafka service, say, about sales of every product, you are able to set the name of each store as the key of the messages it sends, and using this key you are able to distinguish which shop sent each update message. One more note about keys: they are created on producers and sent to Kafka brokers, and if several messages have the same key, they will be sent to the same partition. We will get back to that later on, but for now just keep in mind that using a key you are able to direct messages to a specific partition.
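With the console producer you can try keys yourself; a sketch, assuming the cities topic and a broker at localhost:9092 (the key/value separator here is a hypothetical choice):

    bin/kafka-console-producer.sh --broker-list localhost:9092 --topic cities \
      --property parse.key=true --property key.separator=:
    # now every line you type is split into key and value, e.g.:
    # store-1:New York

Messages typed with the same key (store-1 here) will end up in the same partition.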
So that's all about message structure. To summarize: every message has an offset number that is unique across a partition; every message has a timestamp; and every message has a body, the value, plus an optional key. And again, please keep every message as small as possible to achieve maximum efficiency of your Apache Kafka cluster. That's all for this lecture, and next let me explain how topics are spread among different brokers, what a partition actually is and how partitions are spread among different brokers. See you next, bye-bye!

Okay, now you know the structure of the messages that are stored in Kafka brokers, and now it's a good time to go on and talk about how topics are actually stored on brokers, and what a partition is. Every topic may exist on different brokers that are included in the Kafka cluster. For example, here on this diagram you see that topic A is present at broker 0 and at broker 1, topic B is present on broker 1, broker 2 and broker 3, and another topic, topic C, is present only at broker 2. Notice that on this diagram I have removed ZooKeeper for simplicity, but it is always there; without ZooKeeper a Kafka broker is not able to operate.

You may ask me a question: why do we need the same topic on different brokers? Why not just create a single topic on a single broker? The answer is pretty simple: for fault tolerance. In this example, if broker 2 fails, all messages in topic C will be lost, and no one will be able to produce new messages to topic C or consume messages from it, because the broker that was serving all messages inside this topic is gone. But when we talk about topic A: if, for example, broker 0 fails, broker 1 will still be able to save new messages that arrive to topic A, and it will be able to serve read requests from consumers that ask for data from this topic.

But how are messages actually spread among different brokers when the same topic is present on several of them? For that, Kafka uses partitions. Here in this example you see that topic A has two partitions, partition 0 and partition 1: one partition is located at broker 0 and the second one at broker 1. Topic B has three partitions in this example, starting from partition number 0: that partition is located at broker 1, partition 1 is located at another broker, broker 2, and partition 2 is located at the last broker, broker 3. Topic C has only a single partition in this example, partition 0, and it is located on a single broker. Also on this diagram you see one more topic, topic D, that has two partitions, partition 0 and partition 1, both located on one broker, broker 0; that is also possible. And you are able to create topics with hundreds of partitions that will be spread among multiple brokers.

But what is the idea behind the creation of those partitions? Why not just create a topic with a single partition, as shown here in this example? Because, as you know, every message is written to a file, and the file is stored on the hard drive of a computer. If there are a lot of messages produced simultaneously by many producers, a single computer simply won't be able to write such an amount of data to the hard drive quickly enough. That's why, if there are multiple partitions spread among different computers, they will perform write operations much, much quicker; and the same relates to read operations.
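Creating a topic with several partitions looks like this; a sketch, assuming a running cluster with a broker reachable at localhost:9092 (the topic name is hypothetical):

    bin/kafka-topics.sh --create --bootstrap-server localhost:9092 \
      --topic sales --partitions 3 --replication-factor 1
    # the 3 partitions will be spread across the available brokers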
If there are many consumers, and if, for example, 1000 consumers try to read data from topic C, partition 0, that broker may simply go down due to a lack of resources. But if the topic has, for example, hundreds of partitions spread among different brokers, this job of supplying data to consumers becomes much, much easier. So that is one reason for the creation of different partitions that live on different brokers. It also makes the topic fault-tolerant: in this example, if broker 0 fails, topic A will still be present on broker 1, and it will still accept new messages and new read requests from consumers. That's why partitions are needed. Also, I have shown topic D on this diagram on purpose, in order to make you understand that multiple partitions of the same topic may exist on the same broker; it is possible. Okay, let's proceed, and next let me explain how messages are spread among different partitions when they are produced by producers. See you next!

I have just explained to you the concept of a partition in a topic. Partitions increase the performance of the entire Kafka cluster and optimize read and write operations, because those operations are now spread among different servers; they also increase fault tolerance, because when some servers fail, others continue processing the messages of a specific topic. Now let me explain how messages are stored inside partitions. Recall that in the previous practice section we created a topic called cities, and we didn't specify the quantity of partitions we wanted to create for that topic; by default, in such a case, only a single partition is created. And you have seen that the Kafka broker created a folder named cities-0 that stores all messages for this particular partition. That means that every partition is simply a separate folder with files, nothing else, and if there are multiple partitions spread among different brokers, then every broker may have one or several folders, one for every partition, like cities-0, cities-1, cities-2 and so on, if you create a topic with multiple partitions.
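You can inspect such a partition folder on disk; a sketch, assuming the default log.dirs=/tmp/kafka-logs from the shipped server.properties and the single-partition cities topic:

    ls /tmp/kafka-logs/cities-0
    # 00000000000000000000.log        <- the messages themselves
    # 00000000000000000000.index      <- offset index for fast lookups
    # 00000000000000000000.timeindex  <- timestamp index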
And if there are multiple partitions on different computers, producers may write messages to different partitions. For example, in this case there is topic A with three partitions: partitions 0, 1 and 2. Notice that the number of the first partition is always 0, and you are not able to adjust that; it is the default behavior of a Kafka cluster: it creates partitions starting from number 0. For example, if you create a topic with 10 partitions, then the number of the last partition will be 9 and the number of the first partition will be 0. So in this case there are three partitions, 0, 1 and 2, and here you see how those partitions are distributed among the brokers. Notice that the offset numbers inside every partition are unique and start from 0. For example, here in this partition there are only two messages, and the offsets of those messages are 0 and 1: when the first message arrives at an empty partition it gets offset 0, the next one will get offset 1, and so on: 2, 3, 4 and so on. Offset numbers must be unique across a partition, not across a topic: here in the first partition you see that it also has messages with offset numbers 0 and 1, the same as in that partition, but those messages are completely different; they are simply different messages. And the same you see here in partition 2: it currently has 4 messages, with offsets from 0 to 3. The main idea I wanted to communicate to you with this diagram is that every partition must have unique offset numbers across all messages inside of it; that's very, very important. But offset numbers of messages across the entire topic need not be unique: in this case you see that this message, this one and this one are different messages that have the same offset number but are located in different partitions.

So producers may write messages to different partitions: for example, producer number 1 may write messages to partition 0, producer number 2 may write messages to partition 0 and partition 1, and producer number 3 may write messages only to partition 2. Every producer decides which partition to write to; this is very, very important: the producer decides that. And every message that arrives at a specific partition, let's say we send a new message to partition 1, will be appended after the last message that already exists in that partition, and the new message in partition 1 will get offset number 3. So from this diagram you see that using different partitions we achieve parallel writing and reading: we may write to different partitions in parallel on different brokers, and the same relates to read operations.

But now you may ask me a question: what if broker 1 fails, and partition 1, with all the messages inside of it, simply disappears? In such a case those messages would simply be lost; I am talking about messages 0, 1 and 2 inside partition 1, and you would not be able to consume those messages anymore, because they were stored in only a single place, on a single broker. There is of course a solution to this problem: you are able to replicate the messages inside every partition. If you want, you are able to replicate every single message inside a topic multiple times, for example five or ten times, and in such a case copies of the same message will be stored on different brokers; if one of those brokers fails, others will be able to serve the same messages to consumers. That's the idea behind replication. Now let's discuss replication in greater detail and talk about the leader and the followers. Bye!

So now you understand that every topic may have multiple partitions, those partitions are spread among different brokers, messages inside every partition must have unique offset numbers, and offset numbers start from zero: the first message inside every partition has offset number 0, and new messages are appended at the end of the existing messages inside every partition. But there is a problem: if any of the brokers fails, its partitions simply disappear, and the messages that were stored in those partitions disappear as well. That's why it is possible to create replicas of partitions, and in this lecture let me talk about that. Here you see a diagram with three brokers, broker 0, broker 1 and broker 2, and there is topic A. Notice that in this topic there is partition 0; let's suppose there is just a single partition so far. You see the same partition, partition 0, created on broker 0 and broker 2, but those partitions are created as follower partitions: broker 0 and broker 2 are followers, and broker 1 is the leader for this partition. The job of the followers is simply to get new messages from the leader and write them into their copy of the partition, that's all; they don't accept any write requests for this partition from producers, and they don't serve consumers. Producers and consumers communicate only with broker 1 when they want to write to or read from partition 0.
That's why partition 0 on broker 1 is called the leader partition, or broker 1 is called the leader for this particular partition. Again, when, for example, a new message arrives for partition 0, broker 1 accepts the write request from the producer, creates the new message, message number 2 for example, in its partition 0, and replicates this same message to partition 0 at broker 0 and to partition 0 at broker 2. The same message, message number 2, is created here, here and here, and now there are two replicas of the same message. If broker 1 fails, one of those brokers, broker 0 or broker 2, becomes the leader for the same partition, partition 0. Broker 0, for example, will then continue operation and accept new write requests to this partition and read requests from consumers. That's how you are able to achieve fault tolerance using replication.

Again, the main idea behind this replication is that the leader performs most operations: it communicates with producers and consumers, and it also sends copies of every message to all followers. The followers simply sit, relax and await new messages, and as soon as a new message arrives from the leader, their job is to write this message to the partition; that's all, nothing else. That's why you need to plan the resources of the brokers accordingly, because if there are multiple replicas for every partition of every topic, the load on the brokers that carry the leader role will increase, hugely increase; so please keep all of that in mind.

It is also recommended to create at least two replicas. I mean, if you want to create such a fault-tolerant architecture, you need to have multiple brokers: if you want to create two copies of every message, you need to have at least three brokers. And you need to configure the so-called replication factor on the topic level: it means that you are not able to set up a replication factor for a specific partition inside a topic; instead, you must configure it on a per-topic basis. By default the replication factor is set to one, and that means that every message is stored only once, on one broker. If you want to achieve the architecture where every message from a leader is replicated twice, to two different brokers, then the replication factor must be set to three, which means that every message will be stored three times, on three different brokers. That is actually the recommended number for production environments; you should not go further and set the replication factor to four or five and so on. Three is completely enough, and this number will tolerate the shutdown of two of the three brokers that store replicas of every message: in this case broker 1 and broker 2 may fail, and broker 0 will still continue operation and keep all messages in partition 0. And if I get back to the diagram where we had three partitions, partitions 0, 1 and 2, and you set the replication factor to three, then basically we will have nine partitions instead of three, but six of them will be, let's say, passive: they will be created as follower partitions, and leaders will replicate messages to those follower partitions. As you see, with a replication factor in place, the quantity of partitions basically multiplies by the replication factor: here we had three partitions, and with replication factor three we need nine partitions.
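Setting the replication factor happens at topic creation time; a sketch, assuming a cluster of at least three brokers with one of them reachable at localhost:9092 (the topic name is hypothetical):

    bin/kafka-topics.sh --create --bootstrap-server localhost:9092 \
      --topic orders --partitions 3 --replication-factor 3
    bin/kafka-topics.sh --describe --bootstrap-server localhost:9092 --topic orders
    # describe prints, for every partition, its leader broker,
    # the replica brokers and the in-sync replicas (ISR)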
Okay, that's all about the leader of a partition, and that's all about replication; I hope it is clear. If you use multiple brokers, multiple partitions in each topic and a replication factor, you can achieve really nice results and build a really fault-tolerant, resilient Kafka cluster. That's all for this topic, and next we're going to talk about the controller: I'll explain what the controller is and what its job is. See you next!

Just checking: did you smash the like button? Yeah? Okay, you are good!

Let's suppose that there is topic A with multiple partitions, all partitions are spread among different brokers, and every partition already holds some messages. When a specific producer connects to the Kafka cluster and wants to produce messages, it can write messages to different partitions. On this diagram you see that the producer writes a message with offset number 2 to partition 0, then it writes a message with offset number 1 to partition 1, and it writes one more message, with offset number 2, to partition 2; and that's perfectly acceptable: every producer may produce messages to different partitions of a specific topic. You can also set it up in such a way that the producer writes only to a specific partition, in case you use a key with all messages: usually all messages with the same key are written to the same partition. But if you don't supply a key, then messages will be spread in a round-robin fashion across all partitions of the topic. The main idea here is that new messages are added only after existing messages: for example, if the producer produces one more message to partition 1, this message will get offset number 2, and so on. The existing messages with numbers 0 and 1 are immutable, and no producer is able to somehow change those messages; they are, again, immutable. That's the main idea behind the creation of messages inside every partition, and again, notice that every partition must have unique offset numbers for all of its messages, and the first offset number is 0; so in partition 0 the first message has offset number 0, and so on. That's the idea behind writing messages to different partitions, and of course you may have multiple producers that write messages to the same topic and the same partitions in parallel. Okay, let's now proceed, and next let's review how consumers may consume messages from a specific topic. See you next!

Now let's discuss how consumers may consume messages from a specific topic. Usually every consumer is connected to just one topic, but it is possible to connect to multiple topics as well and consume messages from all of them. On this diagram you see a consumer that is connected to topic A and consumes messages from that topic; you are able to consume messages either from all partitions, from just one partition, or from several partitions if you want. And you are able to consume messages from the beginning: in such a case you tell the Kafka cluster that you want to consume messages from the beginning, and messages starting from offset number 0 will be delivered to that consumer. Of course, if some messages were already deleted due to the retention period, for example if messages 0 and 1 were deleted by the broker, then the consumer will start consumption from offset number 2. You are also able to start consuming from the current state: that means that the consumer will receive only new messages that arrive at the specific partitions; it will just wait for new messages and receive them as soon as they are produced by producers. It is also possible to create multiple consumers that consume messages from a single topic.
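Those consumer options, discussed next, look roughly like this in the console consumer; a sketch, assuming the cities topic, a broker at localhost:9092, and a hypothetical group name:

    # two terminals running this command form one consumer group;
    # every message is then delivered to only one of the two consumers
    bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
      --topic cities --group my-app

    # start reading a specific partition at a specific offset
    # (--offset requires --partition)
    bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
      --topic cities --partition 0 --offset 2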
In such a case those multiple consumers will belong to a consumer group, and every single message may be consumed by only one of the consumers in the consumer group. Also, when a consumer connects to a Kafka cluster, you are able to specify the offset number you want to start reading messages from: for example, you could say that you want to read messages starting from offset number 2, and in such a case you will consume this message, this message and this one, that's all, and after that you will wait for new messages. It is also important to understand that messages inside the brokers remain there independent of the read status of any message by a specific consumer. What I mean is: let's say all those messages were already consumed by this consumer; that doesn't impact other consumers. If someone else wants to read the same messages, the Kafka broker will allow that, no problem at all, because Kafka is a publish/subscribe system, and any quantity of subscribers is able to read the messages stored in a Kafka broker. It's that simple. Okay, that's all about reading messages by consumers, and of course you are able to have multiple consumers that read data in parallel; no problem at all.

And that's all for this section. Here I tried to give just a short and quick overview of the main concepts of the Kafka infrastructure, and I hope that you now understand all of them, so we are able to move on. In the next section we will get back to practice activities and launch a Kafka cluster with multiple brokers; I'll demonstrate how a leader is elected, how different partitions are assigned to different brokers and so on. So in the next section we will try many of the things we have discussed here, and we will do it in practice. See you in the next section!

Sad to say, but this tutorial is over. If you liked it, please smash the like button, leave a comment below and subscribe to my channel. Also don't forget to share this video with anyone you know, and I hope to see you in the next tutorials. Bye-bye!

[Music]
Info
Channel: Bogdan Stashchuk
Views: 24,164
Keywords: apache kafka, apache kafka tutorial, kafka apache, kafka tutorial java, kafka tutorial python, kafka tutorial for beginners in java, kafka tutorial with example
Id: CU44hKLMg7k
Length: 194min 21sec (11661 seconds)
Published: Thu May 07 2020