Real-time vehicle telemetry analysis with Kafka Streams

Captions
Hi, my name is Alex Woolford. I'm a systems engineer, and in this short video I want to show you how you can take telemetry data and analyze it. In this case I've got data from RTD, the bus company in Denver. They publish the data as a protobuf file that you can download; you're allowed to download it every 30 seconds, and it's kept up to date. Inside this protobuf file I'm parsing out the positions of all the buses. I'm doing that in Java with Spring (that's what this icon here is), and this is an example of the message that I'm pulling out: it has an ID, which is an identifier for the bus, along with a timestamp, a longitude, and a latitude. I'm writing these into Kafka.

Once I've got that in Kafka, I can use Kafka Streams to analyze it. What I want to do is calculate speed. The idea I had in mind was to monitor the speed of all the vehicles traveling around Denver and see how that changes over time, that sort of thing. So I've written a Kafka Streams app. Now, in order to calculate speed, you need two positions and a difference in time: speed is distance over time. To keep the previous position and time around, I'm using RocksDB. RocksDB came out of Facebook, and it's basically a library that's almost a drop-in replacement for something like Redis, except there's no server required to run it. You just drop a dependency into your jar, and you have a screaming-fast key-value store capable of doing millions of transactions per second. It's crazy fast.

From the Kafka Streams job that's talking to RocksDB, I now have an enriched feed. You can see I've added miles per hour to it, and I'm writing that back to Kafka. Finally, I want to sink this into Elasticsearch, which is going to let me slice and dice it over time and do visualizations, that sort of thing. Elasticsearch needed the geolocations formatted in a slightly different way.
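As a concrete illustration of that reformatting step: Elasticsearch can ingest locations as GeoJSON points, where the coordinates array is [longitude, latitude], in that order. Here is a minimal, dependency-free sketch; the class and method names are mine, not necessarily those in the video's repo:

```java
import java.util.Locale;

public class GeoJsonPoint {
    // Convert a longitude/latitude pair into a GeoJSON Point string, the shape
    // Elasticsearch's geo mappings can ingest. Note that GeoJSON puts longitude
    // FIRST, which is an easy thing to get backwards.
    public static String toGeoJson(double lon, double lat) {
        return String.format(Locale.US,
                "{\"type\":\"Point\",\"coordinates\":[%.6f,%.6f]}", lon, lat);
    }

    public static void main(String[] args) {
        // A point in downtown Denver.
        System.out.println(toGeoJson(-104.9903, 39.7392));
    }
}
```

The real app uses Jackson for serialization; the hand-built string above is just to keep the sketch self-contained.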
So I've got another little snippet in my Kafka Streams app that converts the longitude and latitude into GeoJSON format.

Let's take a quick look at the feed itself. If you go to rtd-denver.com, you can see that you can sign up for this feed, and you'll get a username and password. When you log in, you'll be able to pull down this vehicle-position protobuf file. I've put all the code for this in the comments; it's in a project called rtd-denver.

To me, this is kind of the money shot: this shows the average speeds of the buses traveling around. I've only been running this for about a day or so, so there's not that much data in there. I also added a geohash. I wanted to do some kind of anomaly detection: if a vehicle is going particularly fast through a point where it normally can't, that might be an indication of reckless driving or something like that. I don't have enough data to do that yet, but I added the geohash to my data set, so that's there for later. The geohash has about 150 meters of resolution; each cell is about 150 meters wide, and it'll track the speeds of all the vehicles going through each 150-meter cell over time.

Anyway, you can see that if you go down I-25 there's a pretty good average speed: 67, 62 miles an hour, like that. This is E-470, a toll road, very fast. I love going down E-470, especially to the airport; you can really haul ass down there. And, as you might expect, downtown Denver is a bit of a disaster: loads of stop signs and traffic lights and all that kind of stuff. And here's Highway 36, pretty fast down there. So this is kind of what I would expect, really. This is all visualized in Kibana, obviously.
Now we'll just have a quick look at the code; there are a couple of things that I want to point out that I think are interesting. First of all, the Spring app. There are two parts to this. One is the RTD feed, which gets the data from the RTD website: it's basically logging in with a username and password, taking that protobuf file, iterating through it, and writing each record to a Kafka topic called rtd-bus-position. That's how that works, and that's really it. The thing that I think is a little bit cool about this is the scheduler. It's using the @Scheduled annotation, which is part of Spring: you add @EnableScheduling, and then @Scheduled, and that's it. It's really easy to use and kind of a no-brainer, really; I use it all the time. So that's the Spring part; not much to it.

Let's pop over to the Kafka Streams app. This is a little bit more tricky. I'm not going to beat it to death; there are just a couple of things that I want to point out. We have this RTD streamer app, and you'll notice here it says getRocksDB. I have a little function in here that creates a local RocksDB instance: see setCreateIfMissing. It's going to put this in the temp folder, and that's where it's going to read and write its RocksDB data. That's all there is to creating a local RocksDB instance. If we pop over to the pom, you'll see this is what you need to add to your pom file in order to do it, but there's really nothing to it: RocksDB is a piece of cake to stick in a Java app.

OK, so now we've got our RocksDB in there. There are a couple of streams here, and they both work in the same way: here's a stream transformation, and there's another one just underneath it. Let's just take a quick look at this one: calculate speed based on the distance and time from the last measurement. We're consuming this topic, and I'm treating it as a string; it's really JSON, and that's fine. You can see this mapValues here.
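The getRocksDB setup described above comes down to a few lines once the rocksdbjni dependency from the pom is on the classpath. A sketch, with my own class and path names rather than the ones in the video's repo:

```java
import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

public class LocalRocksDb {
    public static RocksDB getRocksDB(String path) throws RocksDBException {
        RocksDB.loadLibrary();
        // setCreateIfMissing mirrors the call mentioned in the video:
        // create the database files on first run if they don't exist yet.
        Options options = new Options().setCreateIfMissing(true);
        return RocksDB.open(options, path);
    }

    public static void main(String[] args) throws RocksDBException {
        // Keep the state files in the temp folder, as in the video.
        String path = System.getProperty("java.io.tmpdir") + "/rtd-rocksdb";
        try (RocksDB db = getRocksDB(path)) {
            // The put/get round trip below mirrors how the streams app keeps
            // the previous position per bus ID (value format is illustrative).
            db.put("bus-1234".getBytes(), "39.7392,-104.9903,1563200000".getBytes());
            byte[] previous = db.get("bus-1234".getBytes());
            System.out.println(new String(previous));
        }
    }
}
```

No server, no network: the whole store lives in that local directory, which is what makes it such an easy fit inside a Kafka Streams job.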
The reason I didn't use the plain map command is that map might re-key your feed, and that means writing data to and from Kafka, which is kind of an expensive thing to do; use mapValues if you can. The key for this is the bus ID. The reason I chose the bus ID is that it means all the transactions for the same bus go to the same Kafka partition.

So we have this little function here, enrichBusPosition. If I just drop down to it: I don't want to get into too much detail, but I do want to point out the RocksDB bits. There's a little bit of Jackson here where I'm mapping the longitude, latitude, and all that to a BusPosition object. Then I'm grabbing the previous record: I take this bus ID, turn it into a byte array, and get the previous record from RocksDB with db.get. That's all there is to pulling data out of RocksDB. It comes back as a byte array, and then I need to deserialize it, which is what I'm doing here with this new String. So this is basically taking the previous record out of RocksDB, and once I've done that, I put the current record back into RocksDB, so I've got the latest lat/long and time for next time.

At this point I've got the current and the previous records, and I can use this thing called the haversine distance. This is a very common way to calculate the distance between two points on a sphere; it's a very well-known formula, and I basically just cut and pasted it off the interwebs. It returns the distance in kilometers between two lat/long points. Once I've done that, speed is distance over time: I take the difference between the current timestamp and the previous timestamp, and from those two things I can get the miles per hour. I've also got my geohash; the geohash is that tile thing I was talking about earlier, and I don't really have a use for it just yet.
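The haversine distance and the miles-per-hour arithmetic just described work out to the following. This is a textbook haversine implementation with the usual mean-Earth-radius constant, not code copied from the video's repo:

```java
public class Haversine {
    static final double EARTH_RADIUS_KM = 6371.0;

    // Great-circle distance in kilometers between two lat/long points.
    public static double haversineKm(double lat1, double lon1,
                                     double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                   * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return EARTH_RADIUS_KM * 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
    }

    // Speed is distance over time: kilometers covered between two pings,
    // divided by the elapsed hours, converted to miles per hour.
    public static double mph(double km, long millisElapsed) {
        double hours = millisElapsed / 3_600_000.0;
        return (km / hours) * 0.621371;
    }

    public static void main(String[] args) {
        // One degree of latitude is roughly 111 km, a handy sanity check.
        System.out.println(haversineKm(39.0, -105.0, 40.0, -105.0));
    }
}
```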
But I think it's going to be handy in the future. The number of characters here is the precision of the geohash: seven characters is about 150 meters, something like that. Anyway, then I'm writing this back to a topic.

If we go back to this diagram, you can see what we just did: we took data like this, we used RocksDB to store the previous values, we enriched it with the haversine formula, and then we wrote it back to Kafka. The last bit is just basically reformatting this a tiny little bit so it displays nicely in Elasticsearch. As far as Elasticsearch goes, I'm using Kafka Connect to write that, and these are the properties that I had to set. Basically, all I need to do is give it a topic, an Elasticsearch URL, and a couple of other properties, which are here. And that's it; that's all you need to do to set up this thing that tracks the speed of buses around Denver.

So that was an end-to-end vehicle telemetry example. I built this in a few hours, and I thought it was pretty fun and pretty interesting, and I hope you did too. Thanks so much for watching!
Info
Channel: Alex Woolford
Views: 5,278
Rating: 5 out of 5
Keywords: vehicle telemetry, elasticsearch, kafka streams, spring, kafka, confluent, elastic, rtd, json, rocksdb
Id: yIFOCYy7Wmc
Length: 11min 3sec (663 seconds)
Published: Mon Jul 15 2019