Spring Batch in Spring Boot | CSV to Database | Tech Primers

Video Statistics and Information

Captions
[Music] Press the bell icon on the YouTube app and never miss any update from Tech Primers. As usual, we are on the Spring Initializr website, start.spring.io. I'm going to create a Spring Boot application with Spring Batch. We are going to read a CSV file and write its data to a database. It is similar to how we extract data from a CSV file, transform it into a different object, and then load it into a database, so it is something like an ETL process. We are going to use the three components inside a Spring Batch step, which we saw previously: the ItemReader to read the data from the CSV file, the ItemProcessor to enrich the data which we parsed from the CSV file, and finally the ItemWriter to write the data into the database. I'm going to use the group ID com.techprimers as usual. For the artifact name I'm going to give spring batch example; I'll just say example one. The main dependency I'm going to use is Batch, which is nothing but Spring Batch. I also need Spring MVC because I want to trigger this batch with a REST endpoint. I don't want to trigger it with a scheduler right now; I'll show it initially with the endpoint, which is why I'm going to use Spring MVC. Next I need a database, so I'm going to use H2 for this purpose. I have MySQL on my machine, but I'm not going to use it. I'm going to show how you can use DevTools to query H2. There is a dependency called DevTools, and in this example I'll show you what DevTools does: I'm going to show you how to query the H2 database which is in memory inside this Spring Boot application. I have never shown this tool before. Spring recently introduced DevTools, which can be used to query the H2 database, so I'll use that; that's why I'm using H2. I'll show you what data is present inside H2, because Spring Batch uses H2 as well, because it
stores the batch information: the batch runs, what time each one ran, how many batches were run. All that kind of information will be stored inside the H2 database, so I'm going to use the same H2 database for loading our final data as well. In order to load the data I'm going to use JPA; the ItemWriter is going to use JPA. I forgot to mention, we need to use Spring Boot 1.5.14. We are not there yet for Spring Boot 2.0; I have not started the tutorials for Spring Boot 2.0, so I'll use 1.5.14 for this. Let's create this project, and I'll open it in IntelliJ. So the project is open in IntelliJ, and if you see, there is only one main class, which is going to be the starting point, and we have the application.properties. That's it, we don't have anything else. I'm going to start this project on a different port, with server.port equal to 8081, because by default 8080 is occupied on my machine. Also, I'm going to use Java 8, not Java 9, so let me change that setting as well. Apart from that, I am going to create the input file. The input file is nothing but the CSV file which we need. We are going to load user data; user is the example we mostly see in all the Spring Boot videos, so I'm going to use the same example. Let's consider there are different fields like ID, name, department, and salary. ID is 1, name is Peter, and for the department, what I'm going to do is use a department code, so I will just say 001, and the salary is 12,000. ID 2 is Sam, 002, 13,000. ID 3 is Ryan, 003, let's say, and the salary is 10,000. OK, let's say there are only three rows for now, each with a department code and a salary. Now our data is ready. This is going to be the input file, and the final output is going to be inside the database. In order to create that database table we need our repository, so I will just create a repository. Also we need a model, isn't it? So the final
model with which we are going to store the data is going to be the user model, so I'm just going to call it User, and I'll mark it as @Entity because we are going to use JPA to store the data in the database. The primary key for this table is going to be id, so I'll mark the Integer field with @Id. We also have the fields name, department, and salary, and those are the getters and setters. So our entity is set. Now we need to create a repository; I will create a UserRepository. This is going to be a JPA repository: an interface which needs to extend JpaRepository, using the model User, with Integer as the data type of the primary key. That's it. So we have our input file ready, which is nothing but the CSV file, and for the writer we are going to use the JPA repository, so we have the model ready and the repository ready as well. This is how we are going to insert the data. Now we need to create our batch-specific configuration. Here in Spring Boot we are going to use Java configuration, so I'm going to create a package called config and create a Spring Batch configuration; I'll just call the file SpringBatchConfig. This is going to have all the configuration which we require for our Spring Batch application. I'll mark it as @Configuration so that Spring Boot will scan this class by default and load all the configuration. Also, I need to disable the default Spring Batch job trigger. What happens is, by default, if you have a Spring Batch application, Spring Boot will automatically trigger the batch job, so I'm going to disable that. In order to disable the default job launch, we need to set spring.batch.job.enabled equal to false. If we don't do this, Spring Boot will by default use the SimpleJobLauncher and automatically
start it based on the configuration which we are going to add here. That is what we are doing: we are just disabling that default behavior. If you remember the diagram which we saw yesterday, we had a JobLauncher. The JobLauncher is going to be the starting point, and it is going to sit in the REST controller. From the REST controller we are going to call this job, and in order to call the job we need to create a job. That's what we are going to do here: we need to create a job, and this job is going to be of type batch.core.Job, so we need to use that Job. Now, in order to create a job we need a JobBuilderFactory. The JobBuilderFactory is automatically created by Spring Boot, so we can leverage it. I'm going to use this JobBuilderFactory: you just say .get and provide a name for this job. What kind of job is this? It is going to be an ETL job, so I'm just going to call it ETL-Load. You can give any name; I'm just giving the name ETL-Load. That's it, our factory is going to create this job now. Next we can provide something called an incrementer. The incrementer is nothing but a sequence of IDs which we assign for every run. For example, I can assign an integer value saying these are my run names or run IDs; basically these are nothing but run IDs. I'm going to use the default one provided by the framework itself, which is called RunIdIncrementer. This will just increment my batch runs in sequence every time, so a run.id will be assigned. See, this is how it will be assigned: it will be assigned as run.id, and every time I run a batch it will be incremented by one. I'm going to use the default one; I don't want a customized one. If you need customized job ID names, then you can create your own custom incrementer, and you have to just
implement JobParametersIncrementer in order to do that. By default I'm leveraging this one because it just does an increment based on run.id: run.id is my key, and the values will be one, two, three, or whatever, based on the batch run. That is the incrementer. Once the incrementer is done, I need to register my step. You would have seen that under a job you can have multiple steps. If you have multiple steps, then you can use either flow or start. If you have only one step, you can just do a start, pass in the step, and then directly end it by saying build, and just create this local variable for now [unclear]. So what I will do is a build, and this will return me a Job. In fact, we can merge these two inline. If you have multiple steps, instead of start we can do a flow, then do .next and assign another step, so you can add multiple steps that way. But since I have only one step, my job is going to have start alone. That's it. Here I need to create a step, similar to how we created the job. If you remember the diagram again, a step had a reader, a processor, and a writer. In the same way we need a StepBuilderFactory, so I am going to inject that as well. We use the StepBuilderFactory, and again we need to provide a name; we are going to call it ETL-file-load. So this step is going to be called ETL-file-load, and we can process the items in chunks. A chunk is nothing but batching: I want to do this load in batches of 100, so I'm just going to say 100. Then we need to assign a reader, then a processor, and then a writer, and finally you just build this step. That's it. But we did not create any reader, we did not create any processor, and we did not create any writer, so now we need to do that. Before creating the reader, I can just inject this
ItemReader, because whatever reader we are going to create is going to be of type ItemReader, and we know what model we are going to use. The model we created here is called User, and I am going to use the same thing because we are going to load the data from the CSV file into this model type. So I am going to give it the type User and use the ItemReader interface, so I can use this ItemReader here. Done. In the same way for the ItemProcessor, my input is of type User and my output is also of type User, so I'm going to use the same. In the same way for the ItemWriter, I just say User, ItemWriter. That's it. Now let's use these objects here: the ItemProcessor and the ItemWriter. OK, this error is because in the chunk we never gave the types as User, User. So now this is all solved, and the ItemProcessor is able to detect that my input is User and my output is also User. That's it. Now we have the step ready and the job ready, but in order to use this ItemReader, processor, and writer we need to write the implementations. We have just autowired them, but where is the implementation? So now we need to create these. Let's start with the reader. Since we're going to read a CSV file, there is an inbuilt class called FlatFileItemReader, and we are going to leverage that to read this file. So we're going to use FlatFileItemReader and say I want to return the type User; I'm going to call it fileItemReader. Then my input is going to be my file. It's going to be a Resource [unclear], so I'm going to say Resource, and I'm going to autowire this using the @Value annotation and pass the property called input file; I'll just say input. I'm going to add this input property here. The input is nothing but users.csv, so that's the file. So we have added this file called users.csv. I don't know if I have to give a slash or something; let's see, it
should use that. OK, so we have given that as the input, and the Resource will get that file. Now we need to use that file here. We need to create a FlatFileItemReader instance, new FlatFileItemReader, and assign these values. I'll set the resource first; I'll assign the resource directly. Now we need to set some default values which we require for this item reader. I'm going to give the name of this reader as CSV-Reader. Also, I'm going to set the number of lines it should skip: I'll say the number of lines it needs to skip is one, because the first line will always be the header, so I'll just skip that line. Finally, I need to map the data. The data rows which we added need to be mapped to a POJO; basically, we need to map them to the User class. How do we map them? Using a line mapper. The line mapper is another object which we need to create, so let's just create a function for that and leverage it. Meanwhile, we can just return this FlatFileItemReader, so the reader is done. We can create the line mapper here. We need to create a DefaultLineMapper, and this DefaultLineMapper can accept values of type User; I'll just call it defaultLineMapper. Now our line mapper is created, but what type of file is this CSV file? It's a comma-separated file, so we need to add a tokenizer to this line mapper. There is something called DelimitedLineTokenizer; we are going to use that and say what type of delimiter it is. It's a CSV file, so we used commas, and I'm going to set the delimiter to comma. I'll also set some other configuration: for example, setStrict, I'll just say false, so I don't want it to be strict. I will also provide the names of the columns which we added. The names of the columns are id, then name, then department, and finally
salary. So these are the names which I added, and I have added them to the tokenizer. Now this tokenizer needs to be added back to the line mapper, so I've added that. Now we need to set each field on the User POJO: we need to set the value of each field in the row into this POJO. There is something called BeanWrapperFieldSetMapper which can help us in that case. I'll create this BeanWrapperFieldSetMapper and set the target type, and it will automatically map everything from the row to the POJO. The only thing is, I need to set this on the DefaultLineMapper with setFieldSetMapper; again, we just need to set it. That's it, and I'll return the DefaultLineMapper. So what we have done here is configure the reader, saying the flat file is going to be a delimited file which uses comma-separated values, these are the columns, and map these to this POJO using the BeanWrapperFieldSetMapper. Also skip the first line, because that is a header. That's what we have done in this setup. So we have created our item reader, and it will be set up with whatever we have configured here. We are leveraging the Spring framework to parse the file directly and map it to a POJO. Now that the POJO is mapped, we need to do some processing, so let's write a processor. I'll create a new package called batch so that I can add the item processor there, and I'll name the class Processor. It implements ItemProcessor with User, User, and let's implement the methods. Also, I'm going to autowire this, so I'll mark it as @Component. ItemProcessor is an interface, so I will just say implements. OK, so we get the User object directly, and we need to do some processing. What type of processing do we need here? I'm going to get the department. If you notice, in users.csv we added 001, 002, and 003 as the department codes. However, I don't want the code to be persisted into the
database, so I'm going to create a private static map here. Imagine that it's configuration; basically, it's static information where I have my codes mapped to department names. I'll map 001 to Technology, 002 to Operations, and what was the last one, 003? 003 is Accounts. That way, when I get the department, what I'll do is go to this map, get the value, and map it back. So this is the department code: I look up the department code in this map, get the department name, and set it back on the user with the department setter. That way I'm just transforming the code into the department name. That's what we are doing here. We needed to do some processing, so I took this example where we use the same field but transform it from the department code to the department name. So this is the processor, and it is going to do that job. That's it; we don't need anything else in the processor. The processor is done. Finally, we need to write the data into the database, so we need an item writer. I'll call it DBWriter, since it's going to write into the database. This is going to be a @Component, and it will implement ItemWriter. ItemWriter is the other component of Spring Batch. To write the data into the database we need to inject the UserRepository which we already created, so that's what I'm doing here. I'll use the UserRepository and just say save. That's it, I'm just going to save the users through the repository. I'll add a log statement here so that we know data is saved for the users. I'll print the users so that we know what user information is getting saved, and I'll add a toString method here so that we can see the values whenever we print them. So I'm just printing a statement saying data
saved for users, and in the processor I can add a log statement as well. I'll just add some logs. OK, so this is another log statement which I have added now. It is just going to log, saying converted from department code to department name. That's it, and it is setting the department on the user here. OK, so I think we are almost set. We have created the reader; the reader is here. The file item reader is done and will be used here; basically, I'll just call it itemReader. So this will be called here. The item processor is nothing but the Processor which we created, and the item writer is nothing but the DBWriter which we created as well. Apart from that, we now need to add the trigger point. The trigger point is nothing but our REST controller, so I'm going to add a REST controller. I'm going to call it LoadController because it's going to load the data. This is going to use Spring MVC, so I'll mark it as @RestController with the mapping /load, and I'll add a @GetMapping. For simplicity, I'm not going to do a POST or anything. Let's return void for now, but I will change it. So what are we going to do? We are going to launch the job. In order to launch the job, if you remember the diagram again, we need to trigger the JobLauncher. We need to autowire the JobLauncher so that we can use it, and we need the Job as well, because when you trigger the JobLauncher you need to provide the job. The Job is what we already created as part of the configuration; if you look at the SpringBatchConfig file, we already created the job there. The JobLauncher is created by the Spring Boot framework itself, so we are going to use that. I'm going to say jobLauncher.run, and I'm going to pass the job, and then there are some parameters. What are these parameters? They are nothing but job parameters which we can pass. Let me pass some value. What I'll do is pass the timestamp. I'll just
pass the string time as the key, and I need to create a new JobParameter with System.currentTimeMillis, so I just pass the current time in milliseconds. This is throwing some exceptions; I'll just add them to the method signature. OK, done. So we are triggering the job here. Now, this will return a JobExecution. A JobExecution is something where you can see whether the job has executed, and you can control the job. So I'm just going to say jobExecution.getStatus. In fact, we can return that itself: jobExecution.getStatus is of type BatchStatus, so I'm going to make the return type BatchStatus. So let's return the status of the batch. We can also add a while loop here to see whether the batch is completed or not: while it is running, I'll just add a log statement. Once the batch is completed, isRunning should be false, so it will come out of the loop and return whether it passed or failed. That's it. So we are going to trigger the job from here using this REST endpoint. It is going to trigger the JobLauncher using the call jobLauncher.run, get a handle on the JobExecution to see what happened to that job, and return the status back to us. I think that's it; we don't need anything else. We have added all the necessary configuration. Let's start the Spring Boot application and see what happens. Also, if you remember, I added something new called DevTools. This is what I'm going to show now: how to access the H2 database, because right now we are using H2 in this project. I'm going to show you how to access H2, which is running in memory in this Spring Boot app. We can connect to this H2 database from a web page. I will show you how to do that, and we are going to see what tables are created by default. We have not loaded anything yet. We will see
what gets created by default. See here, if you notice, there is a new endpoint called /h2-console. This is the endpoint which we are going to hit in order to access our H2 console. OK, what does it say? It says the field jobLauncher required a bean of a type that could not be found. It's not able to find it; let's see what happened there. OK, so the JobLauncher did not get loaded because we did not add the annotation called @EnableBatchProcessing. We did not even enable batch processing; that was the problem. Since we did not enable it, Spring Boot was not able to identify what type of JobLauncher to create. Let's check now after adding this annotation. So the application is up now. Let's go to the URL on 8081. Yeah, we have it ready. Now I will show you how to connect to the H2 console, so I'll just say /h2-console. Since we added DevTools, we will now be able to connect to the H2 console, and if you see here, it connects to the in-memory JDBC URL. I will just say connect. This shows what tables are currently present inside our Spring Boot application, which is running. See, these are the different tables created by the Spring Batch framework, and user is the table which we created. If you go here, you can see the columns which we created: id, department, name, salary, and so on. Meanwhile, I'll open our localhost:8081, and this is the endpoint which we are going to hit to load the data. Let me write some queries so that we can see what data is present right now. The job execution table will have the status of the complete job, and there is something called step execution, which should have the status of each step. What other things are there? There is something called execution context as well; let's see what data comes in there, and the same with the step execution context. Finally, we need our table called user
because that is the table where the data is going to get loaded, so I'll put that table first. Right now everything should be empty. Yeah, right now all the tables are empty [unclear]. OK, all the tables are empty; see that no rows are returned for all these tables. Now I'm going to hit the load endpoint. This is going to trigger the JobLauncher, which is going to run the job that reads the CSV file and loads the data into the database. The CSV file is nothing but the one which is present here. So let's trigger the load. Actually, it failed, I guess. Let's see why it failed: failed to initialize the reader. I think it's not able to read the file. So I'll just give the absolute path for now and restart the server. Meanwhile, I'll copy these queries, because when I connect again they will be gone: H2 is down now because the whole application is down, so when I reconnect it will ask me to connect again from the beginning. Yeah, the server is up; let's reconnect. I'll run the queries, and yeah, all are empty. I'm going to hit the load endpoint, which is now going to load the data. See, it is showing COMPLETED. Now if I query these tables, I have three rows: Peter, Technology; Sam, Operations; Ryan, Accounts. And there is a job execution row saying job execution ID 1, version 2, job instance ID 1, what time it got completed, what the status is, and all those things, plus the execution context and the step name. This is where the step name will be useful; when you have multiple steps, it will be very useful here. What happens if I reload? If I trigger the load again and query it, you are able to see multiple job executions. These are the tables which are used by Spring Batch, and that's why you are able to see multiple rows there. However, the user data is all intact because we have marked the ID as the primary key, so Hibernate is taking care of that, so
when we are saving, it's just doing an update. But what we can do is add a timestamp field here and update it while doing the processing. We can try that as well, so let's try it. I don't know what will happen when it tries to map; I've never tried it, but we can try. OK, so now we need to go to the processor, and I'm going to add the timestamp there: setTime, I'll just say new Date. That's it; this just adds the current timestamp to that table. Previously it was empty, but now we are going to set it. Let's restart. Once I restart, the data is gone because it's all running in memory, so let me copy these queries. Once the application is up we can try this again. Yeah, the application is up, and I'm reconnecting. So all the data is empty again. Let's load it again: COMPLETED. Now when I run the query it shows the timestamp. If you see here, it shows 145, and if I load it again, it should show 146, I think. Yeah, it should show 146. So whenever I reload, it's updating the timestamp, and all the other fields are intact because they never changed. Another thing you will have noticed: the department is now coming up with the department name, although the CSV had only the code, so our processor is also working. You can see the logs here: converted one to Technology, two to Operations, and three to Accounts, and then it also says data saved for users. So this is how Spring Batch works. I'll just summarize once again what we did. We added the input file, which is this users.csv. This is going to be our input file, and we need to load it into the database. How do we do it? We do it using the Spring Batch framework, so we add the configuration for Spring Batch. We need to create a job, and in order to create a job we need a step. You can have multiple steps or a single step. I'm
going to add only one step because I have only one operation: I'm just going to get the file and load it into the database. That's it. If you want to do multiple operations, like reading a file and loading it into the database, or reading multiple files, then you can add multiple steps as part of the same job if you need to. In this case we are using only one step, and inside a step you have three operations: reader, processor, and writer. If you don't want the processor, you can skip it and just read and write directly. But I wanted to show you the processor as well; that's why we added the processor, where we are transforming the department code into the department name. That's what we did here. We also added the timestamp so that we can see it in the database. For the reader, we used the FlatFileItemReader from Spring and leveraged the different utilities there, and it converted the rows automatically into the User POJO, so the read is completed. In the writer, we are using JPA, Spring Data JPA, to write the data into the database. We created a model marked as an entity, with id as the unique identifier, then created a repository for it and used that repository inside the DBWriter, which is nothing but the item writer. So we have created that as a job. Then we wanted a controller, which is the Spring MVC REST controller through which we can load the data. We need to autowire the JobLauncher and the Job so that we can trigger the JobLauncher whenever we hit the load endpoint. That's what we are doing here: it triggers the JobLauncher with the job and the parameters which it requires, and it returns an object of type JobExecution. We can check whether the execution is completed or not, and we return the status back to the user, saying whether it failed or completed. And Spring Batch
if you see in the back end, it uses the H2 database to store its execution details. How does it know which job completed or which job failed? It has its own database tables; these are the different tables which it has. And we are using DevTools: Spring Boot has something called Spring Boot DevTools, and we are leveraging that to connect to H2 via this H2 console. I can make a separate video on the H2 console and the other tools which are there in DevTools; if you want that, do let me know in the comment section below. I hope you understood what Spring Batch is and how we are leveraging Spring Boot to use it. Additionally, if you remember, we added a property here to disable the default batch launch. Otherwise Spring Boot will by default trigger the job when you bring the server up; that is why we have disabled it. By default it is enabled, so I am disabling it for our case, where we wanted to launch the job manually. So that's it for this video. I hope you understood what Spring Batch is and how you can use Spring Batch inside a Spring Boot application. If you want me to make further videos on Spring Batch, do let me know any specific scenario, and I can pick that up. If you liked the video, go ahead and like it; if you haven't subscribed to the channel, go ahead and subscribe. Meet you again in the next video. Thank you very much. [Music]
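The reader, processor, and writer flow summarized in the video can be sketched in plain Java, without any Spring classes. This is only an illustration under stated assumptions, not the code shown on screen: the class name EtlSketch, the simplified User class, the header row id,name,dept,salary, and the in-memory map standing in for the H2 user table are all hypothetical. It reads CSV lines while skipping the header, converts each department code to a name, and writes the results in chunks keyed by id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of the read -> process -> write flow.
// Nothing here is Spring Batch API; every name is hypothetical.
public class EtlSketch {

    // Simplified stand-in for the User entity from the video.
    public static final class User {
        public final int id;
        public final String name;
        public final String dept;
        public final int salary;

        public User(int id, String name, String dept, int salary) {
            this.id = id;
            this.name = name;
            this.dept = dept;
            this.salary = salary;
        }

        @Override
        public String toString() {
            return "User{id=" + id + ", name=" + name + ", dept=" + dept + ", salary=" + salary + "}";
        }
    }

    // Department codes mapped to names, as described for the processor.
    private static final Map<String, String> DEPT_NAMES = new HashMap<>();
    static {
        DEPT_NAMES.put("001", "Technology");
        DEPT_NAMES.put("002", "Operations");
        DEPT_NAMES.put("003", "Accounts");
    }

    // "Reader": split one comma-separated line into a User.
    public static User read(String line) {
        String[] f = line.split(",");
        return new User(Integer.parseInt(f[0].trim()), f[1].trim(),
                f[2].trim(), Integer.parseInt(f[3].trim()));
    }

    // "Processor": replace the department code with its name (null if the code is unknown).
    public static User process(User u) {
        return new User(u.id, u.name, DEPT_NAMES.get(u.dept), u.salary);
    }

    // "Writer": store one chunk, keyed by id, so a repeated id overwrites the old row.
    public static void write(List<User> chunk, Map<Integer, User> table) {
        for (User u : chunk) {
            table.put(u.id, u);
        }
    }

    // Runs the whole flow: skip the header line, then write every chunkSize items.
    public static Map<Integer, User> run(List<String> lines, int chunkSize) {
        Map<Integer, User> table = new LinkedHashMap<>(); // stand-in for the database table
        List<User> chunk = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) {          // index 0 is the header
            chunk.add(process(read(lines.get(i))));
            if (chunk.size() == chunkSize) {
                write(chunk, table);
                chunk.clear();
            }
        }
        if (!chunk.isEmpty()) {
            write(chunk, table);                          // flush the last partial chunk
        }
        return table;
    }

    public static void main(String[] args) {
        List<String> csv = Arrays.asList(
                "id,name,dept,salary",
                "1,Peter,001,12000",
                "2,Sam,002,13000",
                "3,Ryan,003,10000");
        for (User u : run(csv, 100).values()) {
            System.out.println(u);
        }
    }
}
```

Because the map is keyed by id, writing the same rows a second time overwrites them instead of duplicating them, which loosely mirrors the update behavior seen in the video when the load endpoint is hit twice. In the real application, Spring Batch's FlatFileItemReader, the Processor class, and the repository-backed DBWriter play these three roles.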
Info
Channel: Tech Primers
Views: 164,512
Rating: 4.8977537 out of 5
Keywords: techprimers, tech primers, spring batch with spring boot, spring batch tutorial, spring batch hands on, spring batch example, spring batch using step, spring batch step by step, spring batch examples with spring boot, spring batch, spring boot with spring batch, spring dev tools, spring dev tools example, spring batch explained, spring batch for begineers
Id: 1XEX-u12i0A
Length: 41min 31sec (2491 seconds)
Published: Fri Jun 15 2018