Golang Microservices: Events Streaming using Apache Kafka

Video Statistics and Information

Captions
Hello, my name is Mario. Welcome to another video. In today's episode I will share with you another tip for building microservices in Go, and specifically I will be discussing how to implement and use Apache Kafka.

So what is Apache Kafka? Kafka is an event streaming platform, so let's focus on events first. An event means that something happened, and in Kafka an event has three important parts: a key, which in the context of our to-do microservice, where we work with tasks, will be a UUID; a value, which describes what happened, for example "a task was created with this description, this priority, and this due date"; and a timestamp. All of these matter because Kafka uses them to order the data coming in (there is a small producer sketch at the end of this section that shows these three parts).

So what is Kafka used for? It is used for publishing and subscribing to events, storing events, and processing events, and it does all of this thanks to the event streaming platform running behind the scenes. Kafka is not only for sending data; like I said, it also lets you produce, write, process, store, publish, and subscribe, and there are different APIs for each of those. In this video we are going to focus only on publishing and subscribing; in future videos I will cover Kafka in much more detail.

So how does it work? There is a concept called a topic, and a topic can define multiple partitions. When a producer sends an event to a topic, the topic, depending on how it is configured, distributes those events across the partitions, and within each partition the events keep their order. Then the consumers come into place, and they are grouped by a thing called a group ID; if you think about it, a group ID is a way of saying "this group reads the data in its own way," and a group is usually made up of multiple instances working together to consume the events. So a consumer does not mean one instance but a combination of N instances, or multiple instances. Depending on how the consumers are configured, the instances that are part of a group can divide up the reading in different ways, however all of them will always keep the order of the events.

This is the important thing to notice: even if consumer one, with group ID abc, has already finished reading those three events, a new consumer two with group ID xyz can come in and still read those events, because they are still available in the stream in Kafka. That is the important, or rather the cool, thing about Kafka: when you are sending data to Kafka everything is stored, and although there is a retention period after which the events expire, you can still add new consumers and they can still consume events that were produced, I don't know, a few days ago, a few hours ago, whatnot.

Okay, that's enough; let's go discuss the code and see how this works in real life.
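To make the key, value, and timestamp idea concrete, here is a minimal producer sketch using Confluent's confluent-kafka-go package (the commercially supported package mentioned later in the video). The broker address, the "tasks" topic name, and the task fields are assumptions for illustration, not the exact code from the repository.

```go
package main

import (
	"encoding/json"
	"log"

	"github.com/confluentinc/confluent-kafka-go/kafka"
)

// task is a hypothetical payload: the to-do service in the video stores a
// description, a priority, and a due date alongside a UUID key.
type task struct {
	ID          string `json:"id"`
	Description string `json:"description"`
	Priority    int    `json:"priority"`
	DueDate     string `json:"due_date"`
}

func main() {
	// "localhost:9092" is an assumed broker address for this sketch.
	p, err := kafka.NewProducer(&kafka.ConfigMap{"bootstrap.servers": "localhost:9092"})
	if err != nil {
		log.Fatalf("creating producer: %v", err)
	}
	defer p.Close()

	t := task{ID: "7f0c2d9e-hypothetical-uuid", Description: "new task", Priority: 1, DueDate: "2021-05-30"}
	value, err := json.Marshal(t)
	if err != nil {
		log.Fatalf("encoding task: %v", err)
	}

	topic := "tasks" // assumed topic name
	// The key routes related events to the same partition, which is what
	// preserves their order; the timestamp is filled in when the record is
	// written, so it is not set here.
	err = p.Produce(&kafka.Message{
		TopicPartition: kafka.TopicPartition{Topic: &topic, Partition: kafka.PartitionAny},
		Key:            []byte(t.ID),
		Value:          value,
	}, nil)
	if err != nil {
		log.Fatalf("producing event: %v", err)
	}

	p.Flush(5 * 1000) // wait up to five seconds for outstanding delivery reports
}
```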
As usual, the link to the code will be in the description; there is a repository on GitHub that you can clone, so feel free to check it out. I have these two applications already running; one is the Elasticsearch indexer. The idea of what we are going to be doing is something like this: we are going to create a task, let's say "new task." Most of this reuses things we already built in previous videos, so feel free to check those out. When I create a task, what is going to happen? I want you to pay attention to this window: it will be receiving a created event that indicates "hey, an event was created, an event was submitted to Kafka," and then the consumer has to do something about it. In the context of this implementation I am doing something similar to what I did in the previous video with RabbitMQ: I receive the event and I literally index it. Because I am indexing into Elasticsearch, to demonstrate what just happened I am going to search for what was just created. This query actually returns a bunch of different values, but if you scroll down to what was posted right here, you will notice that this is the one that was created just now, whose ID starts with 7f followed by the rest of the UUID; if I look at the results you will notice there is that same 7f entry, and the same happens with all the other events.

I will show you how this works in the code. So if I take a task and change it again, setting the due date and is_done to true... oh, what happened? There is a mistake right here: if I run the update again it executes an invalid request, which is obviously incorrect; if I fix it and execute, it is okay, and you will notice there is an updated event right here. Again, if I go back to the Elasticsearch endpoint that we implemented previously and search by "change," you will see the record where I used the word "change."

So how does this look in the implementation? There is a new repository that does what I was just describing: after the service receives those calls, it sends, or publishes, the events to Kafka. If you remember, all of this is sort of similar to what we did before with RabbitMQ, but now instead of calling RabbitMQ we are calling Kafka. The biggest difference between the two implementations, which I will show you, is that in Kafka, instead of using encoding/gob like we used before, I am actually using JSON, and this is a way of defining an encoding format shared between the different services you are using. There are a bunch of ways to encode messages: there is Avro, there is Protocol Buffers, there is JSON, there are binary formats, and you can literally come up with your own. I am using JSON here because I want to show a different way to do this; in future videos I will show you how I like doing this for versioning purposes, but for now we are using JSON. So I am defining a new event type, which is right here: this event is literally a struct that includes a type and a value, and I am just sending that to Kafka (a sketch of such a struct follows below).
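What follows is a minimal sketch of what that event wrapper and its JSON encoding might look like, not the repository's exact definitions; the event type names, the Task fields, and the Repository helper are assumptions for illustration.

```go
package kafkarepo

import (
	"encoding/json"

	"github.com/confluentinc/confluent-kafka-go/kafka"
)

// Task mirrors the to-do record used throughout the series; the fields here
// are assumptions for this sketch.
type Task struct {
	ID          string `json:"id"`
	Description string `json:"description"`
	Priority    int    `json:"priority"`
	DueDate     string `json:"due_date"`
}

// event is the message format shared between services: a type plus a value.
// JSON is used instead of encoding/gob so that non-Go consumers can decode it
// too; Avro, Protocol Buffers, or a custom binary format would work as well.
type event struct {
	Type  string `json:"type"` // e.g. "tasks.event.created", "tasks.event.updated", "tasks.event.deleted"
	Value Task   `json:"value"`
}

// Repository publishes task events to Kafka; the field names are hypothetical.
type Repository struct {
	producer *kafka.Producer
	topic    string
}

// Created wraps the task in an event, encodes it as JSON, and hands it to the
// same Produce call shown in the earlier producer sketch.
func (r *Repository) Created(t Task) error {
	b, err := json.Marshal(event{Type: "tasks.event.created", Value: t})
	if err != nil {
		return err
	}

	return r.producer.Produce(&kafka.Message{
		TopicPartition: kafka.TopicPartition{Topic: &r.topic, Partition: kafka.PartitionAny},
		Key:            []byte(t.ID),
		Value:          b,
	}, nil)
}
```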
The consumer is fairly similar to what we did before with the Elasticsearch indexer for RabbitMQ. This one is, again, one of those programs that uses graceful shutdown and all of that fun stuff, so please check out the code and play with it. If we look at the actual code, you will notice there is again sort of an infinite loop that receives an event by polling the Kafka consumer: it waits up to 150 milliseconds to pull values from the consumer, and then, if the value is valid according to our rules, it does what it is supposed to do, which is indexing or deleting the records (there is a sketch of this loop below). Again, it goes back to the same process we defined previously when we used RabbitMQ.

Now, like I was telling you, what is the biggest difference between using, for example, RabbitMQ or Kafka? If we look at what we have right here, we already processed three events. The cool thing is that, because the events remain available after they were pushed to Kafka, we can add new clients that can replay, or re-consume, those events after they happened, while that data is still in storage, and that is the biggest difference between RabbitMQ and Kafka. So if I run a consumer using the default tooling (if you go to the Apache Kafka quick start there is a download with a few different tools that are useful for interacting with the Kafka service) and I execute the console consumer, you will notice that I receive all the events that were produced previously, even though I am literally a different client: the "kafka rules", "hello world", "new task", and "can i change it" records are all there. If you notice, I have the updated one right here, the created ones, and one that I created before. So how cool is this? Depending on what problem you are trying to solve, perhaps using Kafka makes more sense, or maybe it makes sense to use RabbitMQ, or perhaps it makes sense to not use any of this because it is overly complicated. Again, it is one of those things you need to consider when building microservices and whatnot.
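Here is a minimal sketch of that consumer loop using the confluent-kafka-go package. The group ID, topic name, and event type names are assumptions, and the actual indexing and deleting against Elasticsearch is omitted; note that a new group.id combined with auto.offset.reset set to "earliest" is what lets a late-arriving consumer replay events that are still within the topic's retention period.

```go
package main

import (
	"encoding/json"
	"log"

	"github.com/confluentinc/confluent-kafka-go/kafka"
)

// event mirrors the JSON wrapper published by the service; the value is kept
// raw here because this sketch only routes on the type.
type event struct {
	Type  string          `json:"type"`
	Value json.RawMessage `json:"value"`
}

func main() {
	// The broker address, group.id, and topic are assumptions. A consumer that
	// joins with a brand-new group.id and "earliest" re-reads the events that
	// are still retained in the topic.
	c, err := kafka.NewConsumer(&kafka.ConfigMap{
		"bootstrap.servers": "localhost:9092",
		"group.id":          "elasticsearch-indexer",
		"auto.offset.reset": "earliest",
	})
	if err != nil {
		log.Fatalf("creating consumer: %v", err)
	}
	defer c.Close()

	if err := c.SubscribeTopics([]string{"tasks"}, nil); err != nil {
		log.Fatalf("subscribing: %v", err)
	}

	for {
		// Poll waits up to 150ms for the next event, mirroring the loop
		// described in the video.
		switch ev := c.Poll(150).(type) {
		case *kafka.Message:
			var evt event
			if err := json.Unmarshal(ev.Value, &evt); err != nil {
				log.Printf("skipping invalid event: %v", err)
				continue
			}
			switch evt.Type {
			case "tasks.event.created", "tasks.event.updated":
				// index the task in Elasticsearch (omitted in this sketch)
			case "tasks.event.deleted":
				// delete the task from the index (omitted in this sketch)
			}
		case kafka.Error:
			log.Printf("kafka error: %v", ev)
		case nil:
			// no event within the timeout; keep polling
		}
	}
}
```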
So let's jump into the conclusions and discuss what you should be doing next. When should we use Kafka in the first place? If you are planning to build something highly distributed, or rather a system that consists of multiple microservices that communicate with each other asynchronously via messages, and those services need to replay messages from time to time, then perhaps it makes sense to use Kafka. One thing I can tell you right now is that Kafka is a highly difficult thing to manage, and maintaining it manually is a really difficult thing to do, so if you are planning to use Kafka I highly encourage you to use something that is managed; for example, Amazon has a managed Kafka service, MSK if I recall correctly, and with that one Amazon will literally handle all of it. If you are planning to maintain it on your own, you are going to have, not problems, but it is going to be a little more difficult than you expected.

Other than that, the support in Go is nice. Confluent, which is the company behind commercially maintaining and supporting Kafka, already has a package, and there are a few other alternatives. I personally like using the commercially supported packages because, obviously, the company supporting those packages wants to keep its customers happy, so that would be my recommendation for Go. Again, it is one of those things: because the package uses cgo, perhaps the performance is not as good as it would be if you were using C or C++, but again, it is not that bad. The last thing to consider is that there is a sort of big learning curve, and it could be problematic when trying to version different things, because a few things that work in other languages are not supported in Go; for example, the schema registry for events encoded in Avro is, at the moment, not supported in Go. So those are the few different things to consider when trying to use Kafka in Go in the first place.

So should you use Kafka? I guess it depends on your problem and the thing you are trying to solve, but in the end, you know, think about it, and if you have any questions just let me know. I will talk to you next time. Okay, take care and be safe. See you.
Info
Channel: Mario Carrion
Views: 3,599
Rating: 4.9649124 out of 5
Keywords: golang, golang kafka, golang kafka microservices, golang kafka tutorial, golang kafka microservice tutorial, golang kafka microservice, golang tutorial produce kafka, golang kafka events, golang kafka confluent, golang kafka sarama, golang kakfa shopify, golang event streaming, golang event sourcing, golang protobuf, golang avro
Id: jr7OULxYm0A
Length: 12min 23sec (743 seconds)
Published: Sat May 15 2021