Apache NiFi : Extracting Rest API Data and persisting into file system

Video Statistics and Information

Captions
Hello everyone, welcome back. In today's video we will see a simple data pipeline that gets data from a REST API and saves it to a file system. To show the use case, I have drawn a diagram for you, which is here. This is the web server, which exposes the data through the REST API. The data is accessed through the REST API and pushed to the file system, and this whole process is orchestrated by NiFi. The necessary configuration details are the REST API URL and the file system path. If the REST API URL is HTTPS, that is, an SSL-secured API, then we need the SSL certificate for it. If it is not SSL-secured, which is plain HTTP, then we can just use the HTTP URL as it is.

To show you the example, I have found a dummy link that uses HTTPS, which is JSONPlaceholder, so it has an SSL certificate. I will post this URL in the description. The file system I have chosen is my own folder, so this is the destination folder where we will persist the file.

To build the pipeline we can use the processors in NiFi, so let's go to the processors and see which ones we might need. The first part is extracting the data from a REST API. It's an HTTPS or HTTP API, so we can use GetHTTP or InvokeHTTP. Let's use InvokeHTTP, because we can customize it as we want. To put the data into a file system, let's see which processors we can use: PutFile. This processor writes the content of the flow file. If you search for a processor here, you can see a description of what it does.

We can create the pipeline by connecting these processors like this. These are the relationships, which show which state of the pipeline flows through this connection. For a simple use case, I am going to use all the relationships in one connection. For example, if a failure happens it will go into this one, and for No Retry, Original, or Response it will also go to this one. We could use different pipelines for the different states, which would be useful in production.

Now let's see what to configure in the InvokeHTTP processor. The properties we are going to change, the selected ones, are HTTP Method, Remote URL, and SSL Context Service. In HTTP Method we give the type of method we are going to use. In our case we are getting data from the REST API, so we use GET. The Remote URL is the API we have chosen, the dummy one, so we are going to use that here. Then there is the SSL Context Service.

Before configuring the SSL Context Service, let's go to a diagram I have created for you. In the previous diagram we saw that NiFi controls the pipeline from the REST API to the file system, and this diagram is a more detailed design. The REST API in our case is HTTPS, so it has an SSL certificate. NiFi runs on top of Java, so in order to trust this certificate we need to import it into Java so that the NiFi REST processor can access the data. So what we need to do now is import this certificate into the Java installation where NiFi is running.

To import the certificate we need the certificate itself, so how can we get it from the API? We can simply go to the API in the browser, click on the lock symbol, go to the certificate, go to Details, and select Copy to File. This opens the Certificate Export Wizard, and we start with Next. We can choose any of these formats. I am going to choose DER encoded binary X.509, which is the default one in our case, so I click Next. I have already saved it in my folder here, so it is going to overwrite it, and the export was successful.

Now that the export is successful, let's go back to the diagram. Next we are going to import the certificate into our keystore. Here is an example: the keystore on my machine is in this folder, which is in the Java folder on Windows, and the default keystore I am using is cacerts. You can either use this command to import the certificate file, or you can use the KeyStore Explorer tool. I am going to show you both.

This is KeyStore Explorer, which provides a UI to manage your certificates. It is available for Windows, Linux, and macOS as well. Before opening the existing keystore, I would like to say: open KeyStore Explorer as an administrator so that you can update the keystore. So let's start. Let's browse to our own keystore, which comes with Java. This is the Java version I am using to start NiFi, and in the security folder I will find our keystore, the truststore. Generally the default password is changeit, and if you have already changed it to a new password, you can use that one. Let's import the new certificate, which I have stored here. You can give it an alias name; I am giving a random alias name right now. Successful. If you look, there is a "demo" entry now, and we can save it. So the import is successful.

Here is the command that will import the certificate, which is at this path, into our keystore. Let's copy this command, paste it into an administrator command prompt, and give the password. "Certificate was added to keystore." So this is it. Since we have successfully imported the certificate into the keystore, we are going to restart the NiFi instance.

Now let's start configuring the SSL Context Service. Let's go to the StandardSSLContextService. Here we need to configure the truststore, which in our case is cacerts. Let's go to the cacerts folder and get the file path and name, then the password. The type is a Java keystore, which is why we are using JKS. Apply. It will validate for a while. After validating, it ends up in a disabled state. To enable it we need to click on this link and click the Enable button. Now it is enabled.

Now let's configure the schedule. We need data every minute, so this is the 60-second scheduling. Let's start InvokeHTTP. Right now we should get one flow file in the queue.

Now that we have one JSON response, let's see the content by clicking into List queue. This is a dashboard where you can see how many flow files are in the queue. To see the content of the flow file we can click on this "i" button. This shows us the JSON response of that particular REST call. As I mentioned earlier, the content is wrapped in a flow file, and to see the flow file's attributes we can click this information button. It shows us the response attributes that come with the HTTP call, for example Connection, Content-Type, which was application/json in our case, Date, which is the timestamp at which the REST API call was invoked, and all the other server-related details. In the next video I will show you how to customize, update, or add these attributes.

Let's go to the next part, where we will persist the file to the file system. Let's configure the PutFile processor. In our case the directory is the destination folder. There is no data in this destination folder as of now, so let's copy this path and paste it into Directory. There are other configuration options as well. We will change the Conflict Resolution Strategy to replace, which means it will replace the file if two files have the same name. Apply. Let's start PutFile now. As we see, PutFile has started and the data is in the folder. This data is the JSON response we got from the HTTP call.

In this video so far we have seen how to get data from a REST API, whether it is HTTP or HTTPS, and persist the data into a file system. In the subsequent videos we are going to see how to transform the data on the fly. So don't forget to subscribe to the channel for new videos, and have a nice day.
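
As a rough sketch of what the two steps of this flow do, outside of NiFi, the Python below fetches a response body with a GET request and writes it to a destination folder, replacing any existing file of the same name. It is not how NiFi implements InvokeHTTP or PutFile. The function names `fetch` and `persist`, the `ca_file` parameter, and the file name `response.json` are assumptions made for illustration. Note that Python expects a PEM file of trusted certificates, not the JKS truststore used in the video.

```python
import ssl
import urllib.request
from pathlib import Path
from typing import Optional


def fetch(url: str, ca_file: Optional[str] = None) -> bytes:
    """GET the URL and return the raw response body (roughly the InvokeHTTP step)."""
    # For https URLs the server certificate is verified against ca_file
    # (a PEM file, hypothetical here) or the system defaults when it is None.
    context = ssl.create_default_context(cafile=ca_file)
    with urllib.request.urlopen(url, context=context, timeout=30) as response:
        return response.read()


def persist(body: bytes, directory: str, filename: str, replace: bool = True) -> Path:
    """Write body to directory/filename (roughly the PutFile step)."""
    target = Path(directory) / filename
    # replace=True mirrors the "replace" conflict strategy chosen in the video;
    # replace=False refuses to overwrite an existing file.
    if target.exists() and not replace:
        raise FileExistsError(f"{target} already exists")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(body)
    return target
```

A call such as `persist(fetch(url), destination_folder, "response.json")` would chain the two steps, where `url` and `destination_folder` stand in for the REST API URL and folder path configured in the video. Running it once a minute would be left to an external scheduler.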
Info
Channel: Talk Tech With Santosh
Views: 5,240
Keywords: apache nifi, extract data from https, Data pipeline, Apache nifi https, Apache nifi put file, InvokeHttp
Id: Jk7H8w3evN0
Length: 12min 13sec (733 seconds)
Published: Sun Dec 20 2020