Streaming with Spring Cloud Stream and Apache Kafka

Captions
My name is Oleg Zhurakousky, and Soby, who will introduce himself in a second, and I will be talking to you about Spring Cloud Stream, specifically within the scope of Kafka and Kafka Streams. To introduce myself again: I'm the project lead for Spring Cloud Stream and Spring Cloud Function; you may have been at my presentations yesterday and the day before. Let me have Soby introduce himself.

[Soby] I'm Soby Chacko. I'm on the Spring Cloud Stream / Spring Cloud Data Flow team, primarily working on Spring Cloud Stream things, committing to the Kafka and Kafka Streams binders and a few other things within the Data Flow umbrella of projects.

[Oleg] He's actually the project lead for the Spring Cloud Stream binder for Kafka. For those of you who were at my presentations yesterday and the day before, the next ten minutes may seem a bit repetitive, so forgive me for that. For those who just came in: we're going to cover a little bit about what Spring Cloud Stream is and what its goal is, and then I'll turn it over to Soby and he'll talk to you about Kafka, because I don't know much about Kafka myself.

So, why Spring Cloud Stream? Spring Cloud Stream is a general-purpose framework that allows you to build microservices, but a simpler way of explaining it is that it's basically a binding framework: it binds your piece of code to remote destinations exposed by your broker. As you may have seen at my previous presentations, you write your code — for example, as a function — and by simply bringing in Spring Cloud Stream as a dependency, that piece of code is activated based on messages arriving at those remote destinations. That's done through the binders we provide within the Spring Cloud Stream framework. As the Spring team we provide the Kafka, Rabbit, and Kafka Streams binders, and then there are a lot of community binders — for Solace, for Azure Event Hubs, for Kinesis — and there's a Kinesis person sitting somewhere out there, up front.

Continuing the overview: Spring Cloud Stream leverages a lot of the native features provided by the underlying broker. For example, I mentioned features such as partitioning and consumer groups: a broker such as Kafka provides those natively, while a broker such as RabbitMQ does not. This is the additional value Spring Cloud Stream provides, because we either fall back on the native features or implement them on top of the broker, to make sure these features are compatible and exposed regardless of which binder you're using. And of course there's a flexible programming model, which we hope to demonstrate in a few minutes — functional, or annotation-based with @StreamListener, but we don't like that one anymore.

Here's one view of the Spring Cloud Stream application model: you have your application core, which relies on Spring Boot, and there are additional frameworks involved — Spring Integration, Kafka Streams, Spring Kafka, Spring Cloud Stream. If you look at Spring Cloud Stream right now, within the scope of Kafka, that's pretty much what you'll see underneath, even though we try to limit the exposure of those things to you and say: you only worry about your functional code, and we will take care of the rest, regardless of what we're using — we're just going to bind your code to your destinations.

The features are as we discussed before. Type conversion: even though messages travel in some kind of raw format — most of the time just a byte array — you don't have to write your code to accept or to send byte arrays; we'll do that for you. In other words, there's a concept of type conversion bundled within the framework, and we support a lot of different types. Most importantly, that part of the framework is an extension point, which means that if there's something we don't support — meaning we don't provide a message converter for it — you can implement your own, throw it in as a bean in your configuration, and we'll pick it up. There's support for error handling, basically DLQ. There's provisioning of destinations: as you've seen from the demos, and as you may see in one of our demos today, we take care of auto-provisioning. In production this feature is kind of meaningless, because in production you already have your destinations; but when you're developing, you usually just want to write code and have something automatically provision your destinations — so we'll take care of that for you as well. Health indicators — that's a whole topic of its own, so we'll skip it — and a lot of other things.

The Java-based functional programming model is my favorite topic, and it's probably the most important part of this year's Spring Cloud Stream and Spring Cloud Function story. We are not deprecating @StreamListener — unfortunately we're going to keep it, because a lot of people are still using it — but if you're coming new to Spring Cloud Stream, don't use it, and if you have a chance to switch from @StreamListener to the functional programming model, you're basically going to remove code; that's all it is. There's reactive programming model support using Project Reactor, so the functional programming model gives you many more features, because it natively supports reactive programming: you can now support different styles of event processing, such as simple events, streaming events, and complex events. And, like I said, the framework also provides the annotation-based programming model, but that implies @StreamListener, which I kind of have to mention but would prefer not to.

The thing is — and this isn't even specific to Spring Cloud Stream; I've asked this question before — when you write code, you typically do one of three things: you either produce data, consume data, or consume and produce data. Can you think of anything else? Effectively, the Java 8 functional programming model gives you the three interfaces you'll ever need: Supplier, Function, and Consumer. The concepts of sources, sinks, and processors describe essentially the same thing: a source is a supplier, a sink is a consumer, and a processor is a function. Conceptually you can mix those terms in conversation, because they mean the same thing.
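To make that mapping concrete, here is a minimal, purely illustrative sketch (not from the talk) of the three Java 8 interfaces that the source/processor/sink concepts map onto:

```java
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Illustrative only: the three functional interfaces behind source, processor, and sink.
public class ThreeStyles {

    public static void main(String[] args) {
        Supplier<String> source = () -> "hello";                   // source    = supplier (produces data)
        Function<String, String> processor = String::toUpperCase;  // processor = function (consumes and produces)
        Consumer<String> sink = s -> System.out.println(s);        // sink      = consumer (consumes data)

        sink.accept(processor.apply(source.get()));                // prints HELLO
    }
}
```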
And that's exactly the type of application we support within Spring Cloud Stream. This is a fully functional Spring Cloud Stream application, and as you can see it's just a Spring Boot app. What gives it the nature of a Spring Cloud Stream app is what you put on the classpath: you add a Spring Cloud Stream binder to the classpath, which brings in Spring Cloud Stream core and whatever else needs to be brought in. At this point we're embracing the Boot mentality: if you have a configuration with a function and you put Spring Cloud Stream on the classpath, you probably want to use that function as a message handler within the context of Spring Cloud Stream. And, as with anything in Boot, if that was not your intention, we provide a way to back out.

Here's an example of a supplier and a consumer — just plain, raw Java Supplier, Function, or Consumer — and we will treat them as such. The interesting thing is this: look at this piece of code and think about how much information we can extract from it. That matters within the context of Spring Cloud Stream, because we try to limit your configuration. So let's decipher it. We have a function, which falls into the concept of a processor, because it's going to have an input and an output — that's one piece of information. I know the types of the input and the output, because if I reflectively introspect the signature of the method, that's what I get: String to String. So when it comes to type conversion, when you send me a byte array, I know which converter to apply to convert it to a String, because I know that's what you want. I also have the name of the function, which allows me, for provisioning purposes, to create the binding — which in this particular case is going to be called toUppercase-in or toUppercase-out — and so on and so forth. You can do mapping, and there are additional properties you can use if you want to override the default behavior, but the point is: you write this code, you click run, and as long as your Rabbit or Kafka or whatever binder you're using is running, everything is going to work; you don't need any additional configuration. Contrast that with @StreamListener: you'd have to create a method that you think is going to be a handler, and then you have to tell the framework that it's going to receive input from this channel and send output to that channel. All of a sudden you're exposed to all these internals of the framework for absolutely no reason — it's information we could have known ahead of time.
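The slide itself isn't reproduced in the transcript; a hedged sketch of the kind of "fully functional" application being described (class and bean names are assumptions):

```java
import java.util.function.Function;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;

@SpringBootApplication
public class UppercaseApplication {

    public static void main(String[] args) {
        SpringApplication.run(UppercaseApplication.class, args);
    }

    // A plain java.util.function.Function. With a binder on the classpath, Spring Cloud Stream
    // binds it to destinations derived from the bean name: toUppercase-in-0 and toUppercase-out-0.
    @Bean
    public Function<String, String> toUppercase() {
        return String::toUpperCase;
    }
}
```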
And that's the signal that I'm on my last slide, and then I'll bring Soby on stage to take over his part of the presentation. This is my favorite slide, because it's really how I see what Spring Cloud Stream is: you have destination binders, which take care of binding to remote destinations, and you just slide your application, with its function, into the binder, and it knows what to do. We've kept having these debates throughout the last three days, and I'm going to push to change our front page and update our documentation: Spring Cloud Stream is the binding framework that binds the piece of code in your IDE to destinations. That's all it is. It's not some kind of wrapper over Spring Integration, which some people presume it to be — yes, we use Spring Integration underneath in some binders, but other binders don't, so that's already a broken paradigm. Spring Cloud Stream is the binding framework that binds a piece of code to a remote destination; that's it. So I'm going to turn it over to Soby, and he's going to go deeper into the Kafka aspect, because what I was showing was on Rabbit — Soby is going to do the same with Kafka.

[Soby] Let's look at a quick demo of a simple function — an uppercase one — so you can see exactly what a fully functional Spring Cloud Stream application looks like. This is what Oleg was talking about: a full-blown Spring Boot application that uses Spring Cloud Stream behind the scenes. As you can see, there's a java.util.function.Function defined as a bean here; all we're doing is returning a lambda expression from the method. If we look at the pom of this project, the way it becomes a Spring Cloud Stream application is that we have the Spring Cloud Stream Kafka binder on the classpath. The moment you add that to the project pom, you're essentially making it a Spring Cloud Stream application, so when you run it, the Spring Cloud Stream infrastructure comes in and does the binding, the provisioning, and all those things. That's the important piece: you import the binder dependency onto your classpath, in your pom, it's automatically detected by the framework, and everything is provisioned for you.

At this point it's pretty self-explanatory what this code is doing, so I'm just going to run it. By default there are certain things going on here: if you look at the configuration for this project, it's pretty much empty — there's nothing in it. So you might ask: this is a Kafka-binder-based application — where are my topics? Where is my input topic, where is my output topic? There are certain opinions Spring Cloud Stream applies here. By default, if you have a function bean like this, whose bean name is toUppercase, we expect you to have an input topic named toUppercase-in-0 — that will be your default input topic — and similarly, on the outbound side, the default output topic is toUppercase-out-0. But what if I have existing topics called "input" and "output" and I want to bind to those instead? There are properties that we provide in the binder, and in Spring Cloud Stream core, with which you can override that, so if you have a predefined topic out there, you can tell the application to use it instead of the default.

[Oleg] That's what I was describing earlier: by default we'll create the destination for you, or bind to it if it already exists, and that's great for early development stages, demos, and so on. But in real life you most definitely will have existing destinations, and at that point you provide additional configuration to say: I want this function to be bound to this destination or that destination.
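A hedged sketch (not shown verbatim in the talk) of the kind of override being described, assuming the bean name toUppercase and existing topics named input and output:

```properties
# Bind the function's input and output to pre-existing topics instead of the defaults.
spring.cloud.stream.bindings.toUppercase-in-0.destination=input
spring.cloud.stream.bindings.toUppercase-out-0.destination=output
```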
Quickly, about the naming convention: we were trying to come up with something intuitive. It's the function name, then -in or -out — those are the only two options you can have — and then the index. The index is not relevant for a single input or output — it makes no difference there — but it becomes relevant for multiple inputs and outputs, because you can have toUppercase-in-0, -1, -2, -3 and so on.

[Soby] So I'm just going to run this application; it starts up like a normal Boot app. As you can see, the Kafka binder picked up all the information and prints out the relevant details. Now the application is started, so let's send some data to the input topic and see whether we get data on the outbound side. I'm using a utility here called kafkacat. This is the producer — it sends data into the input topic, toUppercase-in-0 — and this is the output topic; I'm just going to copy the command I prepared. Is that big enough? I'll just send some data here... there we go. Let's make it real time... alright. So that's the regular Kafka binder in action — the pub/sub-based binder: we publish into a topic and you consume the data on the other end.
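The exact commands aren't captured in the transcript; a typical kafkacat (kcat) invocation against the assumed default topics would look roughly like this:

```
# Produce lines typed on stdin to the function's input topic
kafkacat -b localhost:9092 -t toUppercase-in-0 -P

# Consume and print the uppercased records from the output topic
kafkacat -b localhost:9092 -t toUppercase-out-0 -C
```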
Any questions so far? [Q] Could you say that again? [A] The question was whether the broker is started with the app. No — it's not embedded Kafka; for testing we use embedded Kafka, but here the broker is really running out there. Pretend you're in production: you have a Kafka broker, or Rabbit, or whatever broker, running, and you're basically using that. Any other questions?

[Q] How does it know which topic to use? [A] If I hear the question correctly, you're asking how it picks the topic. As I just explained: the name of the function, then -in or -out, then the index. By default, if you don't define anything, the topic uses the name of the function. There are two benefits to that. The first one is obvious: it happens automatically and gives you a sensible default. The second is that, because it uses the function name, when you have multiple functions and multiple topics you can always correlate which function corresponds to which destination — especially useful in the early development stages.

[Q] Can you use this with Redis Streams? [A] Let me answer that question differently: yes, provided we have a binder. We used to have a Redis binder; we don't have it anymore, but Mark Pollack actually approached me yesterday — there is another push to have a Redis binder. So that's the answer: it's the binder that provides this functionality. We need a binder — not just Kafka. We said spring-cloud-stream-binder-kafka; if we had spring-cloud-stream-binder-redis, or binder-foo, that's really all it takes. In other words, somebody needs to implement the binder.

[Q] Can you use Kafka-specific features? [Soby] Sure — yes, Serdes, for example; I'll be talking about those. You can definitely use Kafka-specific features. That's what Oleg was mentioning before: the things Kafka brings to the table are abstracted behind the framework. For example, Kafka gives you native support for consumer groups and partitioning, and all of that is built into the framework, so if you're using Spring Cloud Stream you get those Kafka benefits by default.

[Oleg] And since Rabbit doesn't provide consumer groups and partitioning, we actually do some work behind the scenes to make it look like it does. From your perspective you still define the same properties and still use consumer groups and partitions, but internally it goes through our code. On the Kafka binder side, since Kafka natively provides these, we just fall back on its features. And I think you may be showing this later — things like type conversion: Kafka allows you to define native serialization, versus using our type conversion, and there's a property that basically says "don't use Spring Cloud Stream's type conversion; fall back on the native conversion." In other words, you can provide a full-blown Kafka configuration as part of the application configuration for the feature you want to use.

[Soby] That's an important point. By default, for these message-channel-based binders, Spring Cloud Stream does all the message conversion behind the scenes — the content negotiation and the conversion itself. In the case of Kafka, if you want to use the natively provided serialization, there's a flag the framework exposes: you set it, the message conversion by the framework is bypassed, and Kafka carries on with its native serialization.
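For reference, a hedged sketch of that opt-out (the binding name toUppercase is an assumption, and the actual Serde configuration would still need to be supplied separately):

```properties
# Bypass the framework's message conversion and let Kafka's own serializers/deserializers apply.
spring.cloud.stream.bindings.toUppercase-out-0.producer.use-native-encoding=true
spring.cloud.stream.bindings.toUppercase-in-0.consumer.use-native-decoding=true
```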
OK, let's move on — we'll make another pause for questions later. At this point we're going to segue into a different topic altogether. Up until now we were talking about Spring Cloud Stream from the pub/sub, message-channel-based Kafka binder perspective, which means you basically get data through a Kafka topic, a consumer consumes that data, and you do some operation on it — maybe using an imperative model, maybe Project Reactor or things like that — and do the processing. For that, the regular Kafka binder is fine.

But Kafka also provides — actually, before we get there, let's talk about Apache Kafka itself. I don't think I need to introduce Kafka here; I'm sure everybody is familiar with its benefits. It's based on the log data structure — an append-only log — which provides a lot of features that many enterprises find attractive. Instead of writing your data into multiple different systems, they basically use Apache Kafka as the single source of truth: your databases, your search engines, your other enterprise systems, your legacy systems — everything writes to Kafka, and then you consume from Kafka. Kafka is very good on the consumption side, and it provides a lot of other things: it comes with the producer and consumer APIs and the Kafka Streams API, it provides transactional capabilities, and recently they also added support for exactly-once semantics, which is a big thing — a major accomplishment.

We briefly touched on the producer and consumer APIs in the previous section, when we did the demo during Oleg's part. Kafka also provides another library called Kafka Streams; if you download Apache Kafka, it comes with the distribution. This is a stream-processing library used for writing stream-processing applications. If you're familiar with Apache Flink, Spark Streaming, or Google's Beam, it's in the same ballpark: you essentially get data as a stream of events, perform stream processing on it, and write the data out to a destination topic or through some other downstream mechanism.

[Oleg] To align this with the three paradigms of event processing: we have simple event processing, we have complex event processing, and somewhere in between there's what's called stream event processing. Outside the world of Kafka Streams, we have Project Reactor with the Flux and Mono type structures. What Kafka Streams provides — and Soby will talk about the details — is its own native API directly into the broker, which effectively gives you the same type of constructs in the form of a KStream.

[Soby] I think one big benefit of using Kafka Streams for stream processing is that you don't really need a separate, dedicated processing cluster. If you use Apache Flink or Spark Streaming, as you know, that involves its own operational issues, and you need a separately dedicated processing cluster. The benefit of Kafka Streams is that if your enterprise is already tied into Apache Kafka, you don't need a separate cluster for stream processing: you essentially write a Java application, like a normal Kafka consumer, and that is your stream-processing application. From an operational standpoint that takes away a lot of constraints, and I see that as a big benefit — if you're already in the Kafka world, you don't need to bring in other libraries for stream processing.

[Q] So you're saying that if I have the same topics, I can choose whether I want stream processing or event processing? [A] Exactly. You can use the regular Kafka binder for normal pub/sub-style event processing, and you can use Kafka Streams for stream-processing types of workloads. Another thing is that with Kafka Streams, everything Kafka provides — fault tolerance, replayability, and so on — is already built in, and all those guarantees apply here as well. Also, Kafka Streams is per-record processing: you get a Kafka record, you process it, you're done with it, and you get the next one. That is the programming model it supports. But when it comes to state management — that is where this library shines the most. Using the regular Kafka binder you could probably do everything that Kafka Streams does, but there's a lot involved: the moment you have to maintain state — for example, if you have to maintain things in time windows and you want to stop your application and then restart from where you left off — that immediately brings up the necessity of maintaining state; you need to pull the state from somewhere.
Sure, you can build your own state management into your system, but that makes things very complex — normal application development already involves a lot. This library already provides that feature: stateful state management is built into the library, and that's a big benefit of using it.

So, as Oleg said, Spring Cloud Stream provides two binders in the world of Kafka. One is the regular Kafka binder we just talked about; there's a separate, dedicated binder implementation for Kafka Streams, and in that implementation we deal only with this library. As I mentioned, Spring Cloud Stream is basically a framework for binding — it does binding and then a few additional things — and binding is at the top level. When you use the Kafka Streams binder, it essentially binds to the various types that Kafka Streams exposes. Say Kafka Streams exposes a type XYZ: the binder will bind to that type, and that type is available in your function bean, so you can immediately start processing with it.

This is the stack: it's a normal Boot application, and inside it you have Kafka Streams at the top; then the Kafka Streams binder comes into play; the Kafka Streams binder in Spring Cloud Stream is essentially built on top of Spring Kafka — Spring Kafka is the lower layer there and provides all the necessary bells and whistles for building the binder — and of course it communicates with the Kafka broker. That's the basic stack for writing a Kafka Streams application with the binder.

These are the three major types that Kafka Streams exposes. KStream is where you get your data as a stream of events. I don't know if you attended Mark Fisher's keynote yesterday — he was showing that the new paradigm is that you get events, like a product update or somebody purchasing something, and you immediately do some operations on them. That is represented through the KStream type, so that's one of the major types Kafka Streams exposes. [Q] What would this be — an equivalent of Flux? [A] Exactly, it's pretty much like Flux, but they are two different things: Flux is based on Reactive Streams, and Kafka Streams is a completely different programming model. So that's KStream. Another major type is KTable. With a KTable you get the same data, but behind the scenes you can think of it as a database table: you only get the latest update for a particular key. In that regard you can do joins and lookups and things like that. And in Kafka you can run multiple instances of a consumer — say you have one topic with 50 partitions and you want to run ten instances of a consumer. The way Kafka Streams is designed, if you define your type as a KTable, each instance only gets data from a certain number of partitions. They also expose another type called GlobalKTable, with which you get all the data — for example, if you want all instances to have access to something like a product catalog, you may want to consider that.
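To make the three types concrete, here is a hedged, illustrative sketch (not from the talk) of function-bean signatures using them with the Kafka Streams binder; the bean names and domain types are assumptions:

```java
import java.util.function.BiFunction;
import java.util.function.Consumer;
import java.util.function.Function;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class StreamTypeSignatures {

    // KStream in, KStream out: an event-by-event processor.
    @Bean
    public Function<KStream<String, String>, KStream<String, String>> passThrough() {
        return input -> input;
    }

    // A KStream joined against a KTable (the latest value per key).
    @Bean
    public BiFunction<KStream<String, String>, KTable<String, String>, KStream<String, String>> enrich() {
        return (events, lookup) -> events.join(lookup, (event, ref) -> event + ":" + ref);
    }

    // A GlobalKTable gives every instance access to the full data set (e.g. a product catalog).
    @Bean
    public BiFunction<KStream<String, String>, GlobalKTable<String, String>, KStream<String, String>> enrichGlobal() {
        return (events, catalog) -> events.join(catalog,
                (key, value) -> key,                // selects the lookup key into the global table
                (event, ref) -> event + ":" + ref); // combines the event with the looked-up value
    }

    // A Consumer of a KStream: a terminal sink.
    @Bean
    public Consumer<KStream<String, String>> logAll() {
        return input -> input.foreach((key, value) -> System.out.println(key + " -> " + value));
    }
}
```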
One other important thing I want to point out: unlike the other binders Oleg mentioned — the Kafka binder, the RabbitMQ binder — the Kafka Streams binder is not portable. That probably connects to the earlier question: you cannot easily take a Kafka Streams binder application and port it to a different streaming technology; this is strictly specific to the Kafka Streams library. [Oleg] It's basically type-specific, and the types are defined by Kafka. [Soby] Exactly. [Oleg] So it's not specific to an event-processing model: in the pure event-based binder, with functions and reactive support, you can choose which model you want — a streaming model, by defining your function as a function of Flux, or a simple event-processing model, by defining a function over simple types. In this particular case it's a specialized binder that is specific to the types exposed by Kafka — and obviously a KStream means absolutely nothing outside of Kafka.

[Soby] In stream processing there are various concepts: you can get your data as a stream and immediately convert it to a table, and you can do the reverse. Martin Kleppmann's Designing Data-Intensive Applications talks about these concepts in detail — you get events as a stream, you convert that to a table, then you take the table and convert it back to a stream, and so on. Common use cases are time-based windowing — you want to find aggregates, counts, and that kind of information — different kinds of joins, between streams or between a stream and a table, aggregation, and stateless operations like map and filter, all using the library.

But as I mentioned before, Kafka Streams provides built-in capabilities for stateful stream processing. As soon as you do window-based operations like count or aggregate, you need to keep that data in some state store. By default Kafka Streams uses an embedded database called RocksDB — disk-based tables, not purely in-memory. If your application has memory or disk concerns and things like that, you can bring your own state store — if you want to use some other external technology as the state store, that's quite possible — but the majority of the time you'll be fine using the built-in RocksDB as the state store. One other thing to note: when state is involved, the way Kafka guarantees fault tolerance is that when you put something into RocksDB, or into your own state store, it also publishes that same information to a changelog topic in Kafka. That way, even if you lose access to the state store, the data can be retrieved from the changelog topic, where it is backed up. There are different ways to get around the concern of "what happens if I restart my application and I have thousands and thousands of keys in my state store — how do I restore them?" If your application stops, it's going to block until it restores the current state; check out the Kafka Streams docs — they provide a lot of options for dealing with such performance issues. From a Spring Cloud Stream angle, we provide touch points for interactively querying the state store.
Once you have data in your state store, instead of looking up a database — in other models you'd have some repository, and you'd go through that repository to pull the data from the database — here, if your real-time information is in the state store, you can get it by running an interactive query against it. That's another pattern we're observing in the enterprise these days: you want to build dashboards — real-time inventory information, real-time information about some use case — and this is a very convenient way to build such dashboard-style applications. You don't have to go to the database to get the data; you use the Kafka Streams state store to retrieve it.

The programming model is pretty much the same as what Oleg talked about before. We support @StreamListener — that is the model available in the currently GA'd versions of Spring Cloud Stream — but in 3.0, which is going to be released pretty soon, the Kafka Streams binder is fully aligned with the functional model. You can essentially write your Kafka Streams application as a function bean, provide your business logic as a lambda expression, and there's your Kafka Streams application. Another thing I want to point out is that this is not a trivial library — there's a lot going on here, and writing stream-processing applications is not easy. I don't want you to think that by using Spring Cloud Stream writing stream-processing applications becomes very easy or anything like that; there's still a lot involved from an application-development perspective, and you really need to know the library.

[Oleg] Remember how I said yesterday that from Spring Cloud Stream and the layers above, we only care about the signature of your function? What you put inside your function is outside of our control — that's application code, that's where mistakes can happen, and we can't do much about them other than rely on the error-handling mechanisms exposed by the framework. So it's not a framework that simplifies Kafka Streams in the way some other Spring frameworks layer on top of an API — like Spring JDBC simplifying the JDBC API. Here we just help you with connecting, activation, error handling, conversion, things of that nature; what you put inside your function is yours. [Soby] Yes, absolutely.

In the Kafka Streams binder we only support java.util.function.Function or Consumer — the Supplier model doesn't really fit this binder. You can receive data from one or more topics and produce data — that's your Function — or you can just receive data and take some action on it — that's your Consumer. Those two models fit very well. And we support multiple inputs and outputs in this binder: if you have two inputs and one output, you can use Java's BiFunction support, and if you only have two inputs and no output, just use a BiConsumer. Anything beyond that is where it probably diverges from the other binders, which use tuple support for multiple inputs and outputs; the Kafka Streams binder uses the function-currying support available in Java — partial functions, partially applied — as in the sketch below.
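A hedged sketch (the bean name, topics, and domain types are assumptions) of what a curried, three-input function can look like with the Kafka Streams binder:

```java
import java.util.function.Function;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class CurriedFunctionConfig {

    // Three inputs expressed through currying: each input gets its own binding
    // (e.g. enrichOrders-in-0, -in-1, -in-2) and the innermost KStream is bound as the output.
    @Bean
    public Function<KStream<String, String>,
           Function<KTable<String, String>,
           Function<KTable<String, String>, KStream<String, String>>>> enrichOrders() {
        return orders -> customers -> products ->
                orders.join(customers, (order, customer) -> order + "|" + customer)
                      .join(products, (enriched, product) -> enriched + "|" + product);
    }
}
```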
[Oleg] I want to stop right here, because this was a long debate that Soby and I had, and many other people were involved, about how to represent functions with multiple inputs and multiple outputs. This is where the specifics of not only KStream but also the community behind KStream had to be taken into account. I kept asking Soby: what do you think — you're the one interacting with the community — how do developers today, the KStream developers, the functional developers who live in the world of Kafka, deal with this? What's more natural for them: a literal Tuple2 or Tuple3, or currying — a function that takes something and returns a function that takes something and returns a function? To me personally that may not be the most readable form, especially once you go beyond two or three inputs or outputs. We had a lot of debates with the community, and for Kafka Streams we chose what seemed the more natural approach, which is function currying. That said, we're still in the process of providing flexible programming models, so this doesn't mean it's the only way you'll ever be able to do it. We'll support various ways of declaring it, and you'll pick whichever works for you, because as long as we can bring it internally to a canonical model — as long as we can understand that this is, say, a three-input, two-output function — we can turn it into exactly what we need through post-processing during initialization, which is something we've done in Spring since its inception.

[Soby] When it comes to multiple outputs, Kafka Streams provides a specific way to do that: it exposes an API method called branching. When you branch, you essentially end up with an array of KStreams; behind the scenes the binder detects that, finds out how many KStreams you're producing in that array, and provides a binding for each of those outputs. That is the only way multiple outputs are supported in the binder — a sketch follows.
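A hedged sketch of a multi-output function using branching (the bean name and predicates are assumptions; newer Kafka Streams versions supersede branch() with split(), but the array-returning form is what is being described here):

```java
import java.util.function.Function;
import org.apache.kafka.streams.kstream.KStream;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class BranchingConfig {

    // One input, multiple outputs: the returned KStream[] gives the binder one binding per branch
    // (e.g. routeByLevel-out-0, routeByLevel-out-1, routeByLevel-out-2).
    @Bean
    public Function<KStream<String, String>, KStream<String, String>[]> routeByLevel() {
        return input -> input.branch(
                (key, value) -> value.startsWith("ERROR"),
                (key, value) -> value.startsWith("WARN"),
                (key, value) -> true);   // everything else
    }
}
```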
Alright — so this is essentially a Kafka Streams function. You write it as a bean, and the method's return type is a java.util.function.Function: the argument is your input — a KStream coming in — you're producing a KStream, and in the return statement you provide your lambda expression. That's a basic Kafka Streams function in a nutshell. I'm going to talk about Serdes when we do the demo, so I'll skip that for now. I mentioned interactive queries: that's an abstraction we provide on top of the state stores built into Kafka Streams, and if we have time I'll also demo a few things about interactive queries at the end.

The Kafka Streams library provides two different APIs. One is the higher-level DSL, in which you chain functions in a normal functional way: you get a KStream, you apply a map, then a filter, then a count, aggregate, reduce, and so on — pretty much the Reactor-like style of programming. If you don't want to do that, you can also go against the lower-level Processor API, which is much more powerful: you can do all kinds of things in the low-level processors, and it's an imperative model you're working in there. The binder doesn't support the low-level Processor API out of the box, but you can mix the high-level DSL and the low-level processors: you can use KStream#process or KStream#transform, which take a supplier of a Processor or a Transformer, and in there you can use the low-level processors. So, using the binder, the only way to use the lower-level Processor API is by mixing the higher-level DSL with it.

Kafka Streams provides a lot of metrics as part of the base library, and the binder provides support for the INFO-level metrics through Micrometer. If you have Micrometer on your classpath, we export all the available metrics through it, and you can then take that and use your visualization tools — export it to something like Prometheus and things like that. If you include Spring Boot Actuator, that becomes part of your application.

The Kafka Streams binder also supports error handling. By default, Kafka Streams allows you to either log the error and skip the record, or log and fail — and by default, I think, it logs and then fails, so your application crashes if there are any deserialization exceptions. The majority of the time, errors in a Kafka Streams application happen during deserialization, when a record cannot be deserialized. Spring Cloud Stream adds another deserialization exception handler: a DLQ handler. It is strictly based on the support that Spring Kafka provides — Spring Kafka has support for a deserialization exception handler that delegates to a dead-letter publishing recoverer (DeadLetterPublishingRecoverer, I believe, is the name of that class) — and we use that class in the binder to send that information to a DLQ, if you choose. This is not an out-of-the-box feature; you have to opt in and say you want this deserialization error handler, and if you do, the binder will send any records that it failed to deserialize to that DLQ destination. The way records are sent to the DLQ can be highly configured and tuned, and you can use all kinds of producer properties there.
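As a hedged illustration of that opt-in (the exact property name and values depend on the binder version; the binder documentation is the authority here):

```properties
# Opt into routing deserialization failures to a DLQ instead of failing the application.
# Other documented options include logging-and-continuing or logging-and-failing.
spring.cloud.stream.kafka.streams.binder.deserialization-exception-handler=sendToDlq
```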
At this point I think I'm going to show a basic Kafka Streams demo, and then we'll take it from there. Any questions so far?

[Q] What is the DLQ topic called? [A] By default the DLQ name is built from the actual inbound topic name, an "error" segment, and the application ID of the application — it's documented — and if you want to change it to something else, that's perfectly possible too. [Oleg] And remember, we're binding — this is the important part, and it's what Soby was getting at; it's extremely important to understand. This binder kind of lives in its own world, because we're not necessarily binding to a destination: what we bind to here is the actual type exposed by Kafka. It's a highly specialized binder. [Q] Could you do this on top of Reactor? [A] No — though that's an interesting use case if you could somehow combine them. But the real question is why — and at the moment it's rhetorical whether there's actually a use case, whether the KStream folks would embrace it or not. Having a uniform programming model obviously has value, but Flux only gives you roughly an equivalent of a KStream, and as Soby mentioned there are two other types, KTable and GlobalKTable, and there's no equivalent to those in the Reactor API, as far as I understand — I may be wrong.

[Soby] So this is the dependency you have to bring in: spring-cloud-stream-binder-kafka-streams. That essentially gives the signal that this is a Kafka Streams application. And this is the bean we saw before: a function bean whose input is a KStream of Strings and whose output is a custom domain type. The binder does all the serialization and deserialization on the edges. By that I mean: when you consume data from the input topic, the binder does a lot of inference and infers the right Serde. A Serde is an abstraction on top of the serializer and deserializer that Kafka provides — if you look at the Serde interface, all it provides is a serializer and a deserializer. The binder detects that the type is a String and automatically delegates to the String Serde. Similarly, on the outbound side, we're producing a KStream of WordCount objects — this is, by the way, the canonical word-count example that Confluent provides. In this case the binder looks at that type, realizes it's not one of the out-of-the-box Serdes that Kafka Streams provides, and defaults to the JsonSerde provided by Spring Kafka — Spring Kafka provides its own JsonSerde, so the binder defaults to that. In that case it has to be a JSON-friendly object, which it is here.

Let's look at the code. As I said before, all it's doing is returning a lambda expression from the method, and it's a full-blown Kafka Streams application. Normally, when you write a Kafka Streams application, you have to write a lot of setup: you have to build your StreamsBuilder, set the configuration properties in code, start the KafkaStreams object, and maintain its lifecycle. Spring Cloud Stream and the Kafka Streams binder handle all of that for you. [Oleg] Leaving you to worry only about the actual function. [Soby] Exactly — the value here is that the application developer can simply focus on the logic, which is what really matters. In this case it's the high-level DSL I mentioned before: it takes the input, does a flatMap on it, then a map, then groups by the key, and then we want the count of each word that occurs within a 30-second window, and we send the data back to the output topic. [Oleg] And what Soby just described this code as doing is, from the binder's and the framework's perspective, irrelevant — that's the code you write; we only want to make sure you provide us with enough information to properly invoke it and to properly deal with the output you produce.

[Soby] This is the configuration for this application — this is what we were talking about before. By default the input binding is named process-in-0, but I want to use a different topic instead of the default: I want to consume from the topic "words" and produce to the topic "counts", and this is how you override that. The documentation provides the details of all the other configuration you can set. So let me run this, and then we'll take it from there.
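The code and configuration aren't reproduced in the transcript; a hedged sketch of the kind of windowed word-count function and topic overrides being described (the WordCount type and the exact operations are assumptions based on the canonical example):

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.function.Function;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class WordCountConfig {

    // Count each word seen within a 30-second window and emit a WordCount record per word/window.
    @Bean
    public Function<KStream<Object, String>, KStream<String, WordCount>> process() {
        return input -> input
                .flatMapValues(value -> Arrays.asList(value.toLowerCase().split("\\W+")))
                .map((key, word) -> new KeyValue<>(word, word))
                .groupByKey()
                .windowedBy(TimeWindows.of(Duration.ofSeconds(30)))
                .count()
                .toStream()
                .map((windowedKey, count) ->
                        new KeyValue<>(windowedKey.key(), new WordCount(windowedKey.key(), count)));
    }

    // Assumed JSON-friendly domain type for the outbound stream.
    public static class WordCount {
        private String word;
        private long count;

        public WordCount() {
        }

        public WordCount(String word, long count) {
            this.word = word;
            this.count = count;
        }

        public String getWord() { return word; }
        public void setWord(String word) { this.word = word; }
        public long getCount() { return count; }
        public void setCount(long count) { this.count = count; }
    }
}
```

And the assumed overrides matching the demo topics:

```properties
# Consume from "words" and produce to "counts" instead of the default binding-derived topics.
spring.cloud.stream.bindings.process-in-0.destination=words
spring.cloud.stream.bindings.process-out-0.destination=counts
```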
While it's starting — any questions? [Q] ...? [A] Yes, it prints out the whole topology as it starts up. Alright, the application is started; let me look at my cheat sheet so I don't make the same mistake as last time. As you can see, as I type a word it prints out the count for that particular word: on the right-hand side we're consuming from the output topic using that utility, and you can see the count increasing. So that's a quick demo of this application: essentially, you send data to the inbound topic, the stream processing happens, and the result is written to the output topic.

Some quick notes about state management. This looks like a very simple application, but it demonstrates a lot of concepts from a stateful angle. The moment you introduce windows and counts, as on lines 49 and 50, state is involved: Kafka Streams stores that information behind the scenes in a state store. So this is not a stateless application; it's actually a completely stateful application we're looking at here.

I also want to show how you can get the metrics that Kafka Streams provides. I have Spring Boot Actuator on the classpath, so I enabled the metrics endpoint. First I want to see all the metrics available through Micrometer, so I can use that endpoint — these are all the available metrics that Kafka Streams provides. If I want just one particular metric — say, the current commit total for the stream thread — I can fetch just that, and it tells me that so far 18 commits have been applied in this consumer application, and so on.

So that's one demo, and I have one more. Five minutes? We'll try to make it real quick. The use case is this — it's also taken from Confluent's examples. In the era of social media, say you have users in different continents, and a company wants to track where the clicks are coming from — click impressions. That's the use case. Let me first show you the code, and then we'll talk. This is an example of a BiFunction — let me stop the previous application — so essentially you have two inputs and one output. The first input is a stream: the stream is the click information — users are clicking on some websites or links and things like that. And then you have a KTable: the KTable is where you get the user information, such as which continent each user is from. Normally this is where something like CDC comes into play: you have information coming from a database and you want to bring it into a Kafka topic, so you'd most likely use some CDC mechanism to move that data, and once that data is coming out of the database, it enters this Kafka Streams application as a KTable. So here you have two kinds of bindings happening — a KStream binding and a KTable binding — and once you have that data, you do some joins: you have the click stream coming in, you have the region information, you do a left join of the stream against the table, then a normal map operation, and finally you find out how many clicks are occurring from a particular continent — a sketch of that shape follows.
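A hedged sketch of the shape of that BiFunction (domain types, field names, store name, and the aggregation details are assumptions; the actual sample is linked in the resources at the end):

```java
import java.util.function.BiFunction;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ClicksPerRegionConfig {

    // Input 0: a KStream of click counts keyed by user; input 1: a KTable of user -> region
    // (typically populated from a database via CDC). Output: running total of clicks per region.
    @Bean
    public BiFunction<KStream<String, Long>, KTable<String, String>, KStream<String, Long>> clicksPerRegion() {
        return (clicks, userRegions) -> clicks
                .leftJoin(userRegions,
                        (clickCount, region) -> new RegionClicks(region == null ? "UNKNOWN" : region, clickCount))
                .map((user, regionClicks) -> new KeyValue<>(regionClicks.region, regionClicks.clicks))
                .groupByKey()
                // The reduce keeps its running totals in a named state store,
                // which can later be queried interactively.
                .reduce(Long::sum, Materialized.as("clicks-per-region-store"))
                .toStream();
    }

    // Simple holder used by the join; purely illustrative.
    static class RegionClicks {
        final String region;
        final long clicks;

        RegionClicks(String region, long clicks) {
            this.region = region;
            this.clicks = clicks;
        }
    }
}
```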
So this application has multiple function beans: one is the BiFunction, and the other one is a Consumer. All the Consumer is doing is logging, from the outbound topic, how many clicks there are. I've also provided an interactive query, so you can interactively query that state store — the reduce is what puts that data into a state store. We probably won't have time to demo that today, but we'll provide the links to the sample and you can run it on your own; we're just going to run the very basic use case here. Again, I want to look at my readme — this sample has all the readme information. First I want to send information for a user called Alice — actually, I have to run the application first.

[Oleg] I have to bail in a minute — I have another obligation — but if you have any more questions, Soby will be the better candidate for them anyway. Thank you very much. [Soby] Just one minute? OK, so I'm just going to run a producer here... thank you, Oleg. You know what — I'm going to post the links to this sample so that you can run it on your own. I think we're running out of time, so: this slide has all the resources I used for this demo; the samples are in this repository, so feel free to grab them there, and we provide a detailed readme on how to run these applications. Thank you very much — if you have any questions, I'm more than happy to answer. [Applause]
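The interactive query was not demoed for lack of time. As a hedged illustration of the touch point mentioned above, querying the named state store through the binder's InteractiveQueryService typically looks roughly like this (store name, key type, and REST endpoint are assumptions):

```java
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;
import org.springframework.cloud.stream.binder.kafka.streams.InteractiveQueryService;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ClicksQueryController {

    private final InteractiveQueryService interactiveQueryService;

    public ClicksQueryController(InteractiveQueryService interactiveQueryService) {
        this.interactiveQueryService = interactiveQueryService;
    }

    // Look up the latest click count for a region directly from the Kafka Streams state store,
    // instead of going to an external database.
    @GetMapping("/clicks/{region}")
    public Long clicksForRegion(@PathVariable String region) {
        ReadOnlyKeyValueStore<String, Long> store =
                interactiveQueryService.getQueryableStore("clicks-per-region-store",
                        QueryableStoreTypes.<String, Long>keyValueStore());
        return store.get(region);
    }
}
```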
Info
Channel: SpringDeveloper
Views: 35,190
Keywords: Web Development (Interest), spring, pivotal, Web Application (Industry), Web Application Framework (Software Genre), Java (Programming Language), Spring Framework, Software Developer (Project Role), Java (Software), Weblogic, IBM WebSphere Application Server (Software), IBM WebSphere (Software), WildFly (Software), JBoss (Venture Funded Company), cloud foundry, spring boot, spring cloud
Id: 5Mgni6AYnWg
Length: 59min 25sec (3565 seconds)
Published: Wed Oct 16 2019