Demo: How to Enrich Data Using a Key-Value Lookup in Apache NiFi

Captions
Hello and welcome back to a new session where I explain NiFi basics, or big data and stream processing basics in general. Last time we built a very simple data flow in NiFi, using the QueryDatabaseTableRecord processor and the PutFile processor to get data from a MySQL database and write it into a local file, because my NiFi instance runs locally.

Today we want to add a new processing step. We take the data that we have, which looks like this: two columns, ID and name, and three rows at the moment. From this table we take these ID/name key-value pairs and enrich them with the name of a Pokemon based on the ID, so that a record looks something like this. We want to use the public PokeAPI for that. As you can see, it's a URL, a RESTful service that we can use to look up Pokemon based on their ID, for example number six, and we get all the abilities, names, versions, the games they appeared in, and much more information. For this demo we want to use only the name, and enrich the ID and the name of a person with the Pokemon name. So again, a sample record then looks like this.

For this we only need to add one more processor, which is called the LookupRecord processor. Before we put anything into a file, we put it through the LookupRecord processor, and once this is successful we write it into a file or anywhere else; putting it into a file is just an example.

First we need to configure the LookupRecord processor. We know that the data from the previous processor is Avro, so we read it as Avro with all the defaults and without using the schema registry. We want to write it as Avro again, which is totally fine; I've already created an Avro writer with defaults and enabled it, nothing fancy. The main thing we are configuring here is the lookup service. I've already created a Pokemon lookup service; to get one, you just create a new service, one of those many services, based on the REST lookup service. As you can see, we could look up data from any key-value store: HBase, Elasticsearch, CSV files, the IP lookup service, the simple key-value lookup service. We're going to use the REST lookup service. As I said, I've already created one, so I'm not going to create a new REST lookup service but use the one that I've created and configure it. It's not yet configured and not yet enabled; we're going to look into that in a few seconds.

We only want to use the Pokemon name, and I want to insert the name into a new field that's called Pokemon, so that the record once again looks like this, with the Pokemon name. That's why we are configuring the result record path. As the routing strategy we route to success, which is fine, and we're going to insert the entire record, which is totally fine for now.

Now we're going to configure the Pokemon lookup service. It's currently enabled, so let's disable it so that we can change the configuration. This is the URL I showed you, and this is the parameter we are using as the ID. It's not yet in the LookupRecord processor, but we will add it later on. It's called poke ID, and we use it as a variable in here. Since the response of this REST call is JSON, we're going to use a default JsonTreeReader as the record reader here, and we want to use only the name from that JSON, nothing else, just the name of that Pokemon as the result. Since it's HTTPS and not HTTP, we also need to use an SSL context service. It's super simple; a default one is already up and running and configured. I've created a truststore here, added the certificate of this website, set the truststore password and the truststore type, JKS, and enabled it.

So let's have a final look at the configuration. There's nothing more that we need to do here; this should work as it is configured, and therefore we enable it. Once it's enabled, we go back into the LookupRecord processor. As I said, we need to tell the lookup service somehow what the poke ID is that it should look up. So what we do here is set the poke ID key to the value of the ID of the incoming flow file. The value in the incoming flow file is basically this ID; we then map it to the variable called poke ID and use the poke ID to do the REST call. We still have the failure relationship, which we need to terminate. We don't care about failures; error handling is something we don't use for this demo.

So we are set to go. Now I'm starting only the QueryDatabaseTableRecord processor so that we can have a look at what we got. List queue, and we should have all the records in one flow file, which is the case. So, starting the LookupRecord processor, it should work and give us as output one flow file, as Avro, where each of the records has a Pokemon assigned based on its ID. That's great, exactly what I wanted.

As a test, I'll just insert a new row into our database, one with poke ID maybe 21 and a new name, Thomas. When we refresh the UI, we should have a second flow file in the queue. It's already nine seconds old: number two, as an Avro file, with number 21, Thomas, and the Pokemon Spearow assigned.

You can use any kind of lookup service, not only the REST lookup service. You can use this to do any kind of enrichment on any kind of flow files, record-based flow files in our case, and enrich or join data that you have based on certain keys and parameters.

One of the advanced features, just before I stop this video, is that we can of course add as many parameters as we want here. So if you have ten parameters or a hundred fields in your base record, you can define them here, and they can then be used in a query in this URL. Maybe you don't only have Pokemon, but maybe you also want to have an enrichment of the abilities, so you would then parameterize the ability and maybe the game ID and whatever you want to look up, based on multiple parameters that come with your records. I hope that was useful.
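
The enrichment step walked through in the captions can be sketched outside NiFi in plain Python, as a rough illustration of what the lookup does to each record. This is not NiFi code: the URL template, the placeholder name `poke_id`, the function names, and the sample rows (other than ID 21 with the name Thomas) are assumptions made for illustration. The `rest_lookup` function has not been verified against the live service, and the per-record failure list is only a loose analogy to the failure relationship, since NiFi routes whole flow files.

```python
import json
from urllib.request import urlopen

# Assumed URL template for the public PokeAPI; "poke_id" is a hypothetical
# placeholder name standing in for the variable configured in the video.
URL_TEMPLATE = "https://pokeapi.co/api/v2/pokemon/{poke_id}"


def rest_lookup(coordinates):
    """Call the REST service and return only the 'name' field of the JSON response."""
    url = URL_TEMPLATE.format(**coordinates)
    with urlopen(url, timeout=10) as response:
        return json.load(response)["name"]


def lookup_record(records, lookup, key_field="id", result_field="pokemon"):
    """Add the lookup result to each record; collect records whose lookup fails."""
    success, failure = [], []
    for record in records:
        try:
            # Map the record's ID field to the lookup key, as in the demo.
            value = lookup({"poke_id": record[key_field]})
        except Exception:
            failure.append(record)
            continue
        # Keep the original fields and insert the result into a new field.
        success.append({**record, result_field: value})
    return success, failure


def stub_lookup(coordinates):
    """Offline stand-in for rest_lookup so the sketch runs without network access."""
    names = {6: "charizard", 21: "spearow"}
    return names[coordinates["poke_id"]]


# Made-up sample rows; only the (21, Thomas) pair comes from the demo.
rows = [
    {"id": 6, "name": "Anna"},
    {"id": 21, "name": "Thomas"},
    {"id": -1, "name": "Nobody"},
]
enriched, failed = lookup_record(rows, stub_lookup)
print(enriched)
print(failed)
```

Passing `rest_lookup` instead of `stub_lookup` would perform the actual HTTPS calls. Because the lookup receives a dictionary, more keys (for example an ability or a game ID) could be added and referenced in the URL template, which mirrors the multi-parameter idea mentioned at the end of the video.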
Info
Channel: Stefan Dunkler
Views: 3,397
Keywords: demo, Apache NiFi, NiFi, LookupRecord, Enrichment, Data Enrichment, KeyValue, Data Flow, Data Integration
Id: bSJ5reO8AA4
Length: 10min 8sec (608 seconds)
Published: Sat Nov 09 2019