IoT Demo: Azure IoT Hub Data Egress - Routing

>> Thanks for tuning in. My name is Robert Eichenseer. I work as a Senior Services Engineer at Microsoft. Today's session is all about IoT Hub and how to get data out of IoT Hub, so how to egress data from an IoT Hub instance. When it comes to IoT Hub, quite often you're already familiar with how to get data into IoT Hub, how to connect the devices, how to send data from the devices to IoT Hub. But quite often there is a question: okay, hey Robert, if I have all the data in IoT Hub, how do I get the data out of IoT Hub? This is really the topic of today's session: how can we get data out of IoT Hub to process the valuable data further on. When we look at it from a 10,000-foot perspective, like always, there are multiple ways to retrieve and process the data from IoT Hub. Let me start here from the very beginning. If we have all the devices and the devices are now sending their telemetry data to IoT Hub, there are two, let me say, fundamentally different ways to get this data out of IoT Hub. There is one way which is called in our documentation routing or events. There is another way which is called in our documentation service endpoint. The question is, what is really the difference between those two ways? Well, the primary difference is that routing or events are pushing information from IoT Hub into other services. In this case, it's not the chore of an application to ask IoT Hub, please give me the latest and greatest information; IoT Hub is really routing this information into this Azure service. As you can see on the screen, the service endpoint is just the other way around. The service endpoint provides the data inside of IoT Hub for other applications to pull it from IoT Hub. In this session, we will focus on the push of the telemetry data from an IoT Hub to another service. This brings me already to the next slide.
When we say we push information from IoT Hub to another Azure service, what services are available there? As you can see on the screen, we have right at the moment four services where IoT Hub can really push or route the telemetry messages which have been sent by devices to. When we look at them, we have storage, so we have not just Blob storage, we also have data lake; we have Event Hubs; we have Service Bus queues and topics; and we also have Cosmos DB. I'll go a little bit deeper and give you some best practices on when to use which service when it comes to processing the messages. But the real question is, how is IoT Hub now doing this? Think about the multiple data sources that you have in IoT Hub. It's not just the telemetry information that devices are sending to IoT Hub. You also have in IoT Hub data sources like device events, something like a device twin has been changed, or device connection state events, something like, hey, a device has connected or a device has disconnected. Or if you are connected using the MQTT protocol, you also get MQTT broker events. This is what I have here in the slide as data sources. This information can now be routed to the Azure services using the so-called endpoints. When it comes to the endpoints, you can have up to 10 endpoints per paid IoT Hub installation or IoT Hub instance, and per endpoint, you can have a dedicated service where IoT Hub routes or pushes the information to. In this example, I say, hey, I have three routes: one is pushing information to storage, another one to Cosmos, and the third is pushing the information to a Service Bus queue or to a Service Bus topic. But that's not all. IoT Hub provides additional functionality, and this additional functionality is quite often very handy when you have to implement specific scenarios. One of the functionalities is really query.
IoT Hub can look into the message which you sent to IoT Hub and can then decide, based on a query, to which endpoint this message should be sent. What is the typical way of using this functionality? Think, for example, about a multi-tenant solution where you are providing a single IoT Hub installation inside of a multi-tenant environment. Now the trick is, when messages come in, maybe you want to process the incoming messages with a different SLA for Customer A and Customer B, because Customer A is maybe your gold customer who pays a higher fee and therefore also expects a different SLA. That means when you have this IoT Hub used in a multi-tenant application, you might want to route the incoming messages to a different endpoint with higher performance, maybe with more resources associated to it, and IoT Hub can be your friend here. IoT Hub can really help you to look into the message, we will see this later, and then decide based on the content of the message where this message should be routed to. You have here a query functionality, and on top of the querying functionality, you can also enrich incoming messages. Well, what does enrich mean and how could you use it? Again, think about a scenario, maybe a multi-tenant scenario, where a message comes in and you just have the device ID, for example, of this message, more or less the ID from where the message was sent. But to process this message further, you might need, for example, a department ID, or if you want to do cross-charging of the cost, you might need a cost center, and IoT Hub can help you here with this as well. IoT Hub can take the incoming message, then look for some specific criteria and say, hey, I want to add additional information to this message, enrich this message, and then forward it to the endpoint. This is really, from a high-level perspective, the routing functionality. I want to show you also some examples.
Before we go into the example, before we do the demo, let's talk a little bit about the different endpoints and when to use them. Let me start here with Event Hubs. Event Hubs is really the service of choice when you have to do high-volume and high-throughput scenarios. When you have to implement a scenario where latency matters, where you say, hey, I'm getting a lot of information and I want to process it, I want to store it, I want to forward it for a reasonable price, then Event Hubs really is your friend. Event Hubs, as you know, is a messaging solution. On top of Event Hubs, we as Microsoft also provide you a messaging solution which is called Service Bus, and inside of Service Bus, we have two flavors, queues and topics. When we now look at those two routing endpoints, an Event Hubs endpoint or a Service Bus queue or topic endpoint, what is the difference and when to use which endpoint? Again, from a high-level perspective, Event Hubs gives you this high-volume and high-throughput functionality. Service Bus, on the other side, gives you enhanced functionality for your messages when you want to process those messages. To give you just an example, Service Bus has a time to live on a message. How can you use that functionality? Again, think about a scenario where you, for example, receive an alert and you have an SLA with your customers where you say this alert will be processed within a given time period, and if it is not processed within this given time period, another alerting needs to be done. If you now forward this message to a Service Bus queue or topic and you have a time to live defined, Service Bus helps you, after the time to live for this message has expired, to route it maybe to a different queue where you can have a different listener, and then implement the alerting functionality or the escalation alerting functionality based on that Service Bus queue or topic.
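The time-to-live escalation pattern described above can be sketched in a few lines of Python. This is a toy in-memory model, not the Service Bus SDK: the class name `AlertQueue`, the `escalation` list and the 300-second TTL are assumptions made for illustration, and in the real service moving expired messages to a separate (dead-letter) queue is, to my knowledge, a setting you have to enable.

```python
from dataclasses import dataclass, field


@dataclass
class AlertQueue:
    """Toy in-memory queue with a time to live (hypothetical, not the Service Bus SDK)."""
    ttl_s: float
    active: list = field(default_factory=list)      # (enqueued_at, alert) pairs
    escalation: list = field(default_factory=list)  # alerts whose TTL ran out

    def send(self, alert: str, now: float) -> None:
        """Enqueue an alert at time `now` (seconds)."""
        self.active.append((now, alert))

    def sweep(self, now: float) -> None:
        """Move every alert older than the TTL to the escalation list."""
        fresh = []
        for enqueued_at, alert in self.active:
            if now - enqueued_at >= self.ttl_s:
                self.escalation.append(alert)
            else:
                fresh.append((enqueued_at, alert))
        self.active = fresh

    def receive(self, now: float):
        """Return the oldest unexpired alert, or None if nothing is waiting."""
        self.sweep(now)
        if not self.active:
            return None
        return self.active.pop(0)[1]
```

A second listener would read from `escalation` and raise the follow-up alert; the primary listener only ever sees alerts that are still within the agreed time window.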
>> Storage for sure is a prime example to archive raw data. There are multiple reasons why you want to archive raw data. Think about compliance requirements, where you have to prove maybe in a year that you have processed the message correctly or that a message has arrived and so on, or think about a scenario where you just want to train artificial intelligence models, where you really need the big data to train your models. This is definitely a use case where you want to go into the storage area. As already mentioned, we have two ways to route data to storage. One is really into standard Blob storage. The other way is to route the data into a data lake, and a data lake, for example, has a lot of advantages when it comes to processing, when it comes to big data, compared to a storage account. But if your intention is to just store the raw data at the lowest possible cost, then a Blob store might be the right thing for your scenario. The fourth option that we have, which is currently in preview, is really our Cosmos database, where we say all the messages are directly routed into a database for immediate processing. For example, if you have a dashboard and you want to have, I wouldn't call it near real time, but really something where you want to access data as it flows in, then you can route it to a Cosmos database for immediate processing. This brings me now to the demo. What do I want to do within this demo? I want to start with a basic setup. By the way, you can do the whole demo on your own. There is a link, aka.ms/IoTHubEgressPush. With this link, you can follow the whole demo. There is a script, and there is step-by-step guidance if you want to follow it. But what I want to show here is really a basic setup where we will emulate a device.
This device will then send data to IoT Hub, and in the first example, we will implement a simple endpoint with a simple route to really send all the messages which are flying into this IoT Hub to a storage account. That is a super simple setup. In the second example, I want to take the same environment and then enrich the incoming message. I want to enrich the incoming message in a, from my perspective, very cool way. Because when the device now sends a message, let's assume that this device just sends, let's say, its device ID. But to process it further, you might need, as already mentioned, maybe a cost center or a department ID, and what I will do here is update the device twin, which is unique for every device, with information about a cost center and with information about a department. Then when a message from this device flies into IoT Hub, IoT Hub will take this message, will look into the device twin, and will then put the information about the cost center and about the department into the message, so that at the end, when you store it somewhere in the storage account, you really have this additional information inside of the message. You might wonder, what sources can I use to enrich the incoming messages? We have here three possibilities. From my perspective, what I see most often used is really the device twin, because the device twin is unique to a device and you can store all the necessary information for that specific device there. You can also have a fixed literal, where you just have, let me say, a string which will be added to every message, or you can also have the name of the IoT Hub where this message was initially sent to. This is something I want to show you in code. Then the last thing I want to do is show you how you can filter incoming messages and then route the incoming messages based on the filter criteria to a custom endpoint.
Again, the scenario here is: think about your IoT Hub in a multi-tenant environment where you say, hey, messages coming in from, let's say, this device or with this payload should be routed to a different endpoint, because I have to do the high-frequency processing here, because it's a gold account with a high SLA instead of a free-of-charge account that is maybe just for test purposes. This is really what I wanted to show you today. Now let me switch to my development machine. What you see here is a PowerShell script, and you can find this PowerShell script under the link that was provided in the beginning of the session. As mentioned, the first thing we want to do together is really the basic setup. You can see here in the script, I'm using the Azure CLI, and what I'm doing here is the login. This small line of code here might become handy to you. If you are in a situation where you have to deal with multiple subscriptions and you log in to your account, quite often you have to select which subscription you want to work against. Here this handy script just picks the subscription which is marked as your default subscription and then sets your account to this subscription. The stuff that I'm doing here is pretty straightforward. I'm creating a resource group, then I'm creating an IoT Hub. Here you can see that I'm creating an IoT Hub. I also create a device inside of IoT Hub. Remember, I want to simulate the message flow from a device, and therefore I take the connection string from IoT Hub for the newly created device and I just simply store the connection string in the variable hub device connection string. Then again, to have the simple example, remember I want to forward all those messages to a storage account. If I want to forward them to a storage account, I really have to create the storage account first. This is what I'm doing here. I create the storage account with the resource group and so on.
Inside of the storage account, I create the container, and inside of this container, all the incoming messages should be stored. That's pretty straightforward. The next thing we have to do is create the IoT Hub endpoint. The IoT Hub endpoint, if you look back to the sketch, is really this endpoint where we can route information from IoT Hub to, and the endpoint forwards that information, in this specific case, to a storage account. As you can imagine, every endpoint, depending on where it points to, has different parameters that we have to provide. Because I'm routing information to a storage account endpoint here, I have to provide an endpoint file format: how should IoT Hub name the file where all the incoming messages will be stored? I also have two interesting parameters. There is an endpoint batch frequency, and I've set this to 60. That means when IoT Hub is routing information to a storage endpoint, more or less to a file, it does not write every incoming message to that specific file; it really batches the incoming messages. Here I have two parameters where I can control when this batch is written to the file. Here I can say, hey, the batch frequency is every 60 seconds, and on top, the endpoint chunk size is 20 messages max. If a minute has passed, then IoT Hub dumps all the information to the file, or if you have reached the chunk size of 20 messages in a file, then it also dumps it. Here you really have to play with the numbers according to your scenario. Those figures are good for a demo scenario, but in the real world, please make those figures a little bit higher so that you can really cope with the incoming message frequency and with the incoming message sizes. That's all that I'm doing. I'm creating the storage account, then I'm also saving the storage connection string in a variable. Now comes the interesting stuff. Here, I really create the endpoint inside of IoT Hub. I'm using the Azure CLI again.
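The "write when the time window elapses or the chunk limit is reached, whichever comes first" rule described above can be sketched as a small Python model. This is only an illustration and not how IoT Hub implements it; the class and field names are made up. One assumption to flag: the session describes the chunk size as a message count, while the Azure CLI help, as far as I know, documents `--chunk-size` in megabytes, so the sketch measures the buffer in bytes.

```python
from dataclasses import dataclass, field


@dataclass
class BatchWriter:
    """Toy model of time-or-size batching (hypothetical, not IoT Hub's implementation)."""
    batch_frequency_s: float = 60.0           # flush at least this often
    chunk_size_bytes: int = 20 * 1024 * 1024  # flush once this much is buffered
    buffer: list = field(default_factory=list)
    buffered_bytes: int = 0
    window_start: float = 0.0
    blobs: list = field(default_factory=list)  # each entry stands for one written file

    def add(self, payload: bytes, now: float) -> None:
        """Buffer one message received at time `now` (seconds)."""
        if not self.buffer:
            self.window_start = now  # the window opens with the first buffered message
        self.buffer.append(payload)
        self.buffered_bytes += len(payload)
        if self.buffered_bytes >= self.chunk_size_bytes:
            self.flush()

    def tick(self, now: float) -> None:
        """Called periodically; flushes when the time window has elapsed."""
        if self.buffer and now - self.window_start >= self.batch_frequency_s:
            self.flush()

    def flush(self) -> None:
        """Write the buffered messages out as one file and start over."""
        self.blobs.append(list(self.buffer))
        self.buffer.clear()
        self.buffered_bytes = 0
```

With small demo traffic the size limit is never reached, so files appear roughly once per batch window; that matches the wait before the files show up in the container later in the demo.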
I say IoT Hub routing endpoint create and then I provide all the necessary parameters. Let me just call out a few of the parameters. Here's the connection string to the storage account. >> Here's the encoding, how it should be encoded, and the parameters with the file format, the batch frequency, and the chunk size. This is all that I have to do. If I execute this, this endpoint will be created. I've executed this already. Stay with me a little bit, and I will show you how it really works after all this stuff is created. After I have created the endpoint, I also have to create a route. Because remember, the endpoint links directly to another Azure service where all the messages are forwarded to. The route tells IoT Hub what information should be routed to this endpoint. Here, I just say the source is the HubRouteSource, and the HubRouteSource is really what information I want to send to this endpoint. Let me say here, that was not what I wanted to do, if I say here $hubRouteSource, you see, I have here device messages, really telemetry messages coming from the devices. Other options here would be lifecycle events from the device and so on. The next thing, as I mentioned, is to create this route, and then you have everything you need for this straightforward, simple example. Now, let's test this out. What I have here is again, and you will find this also in the GitHub account, a very simple JSON format of the data I want to send to IoT Hub. More or less, I'm simulating the device here, and I have here the device ID, the device category, telemetry ID, pressure, energy consumption, and the telemetry timestamp. That's the very simplified JSON that I want to send to IoT Hub. Here I'm just replacing some of the information with the real device ID that I have created. Let me stretch this line here as well, and you see here telemetry property.
I'm sending the payload of the message, and then with telemetry property, I'm adding to the payload, and when I say payload I mean this JSON format, an additional property where I say error equals no. This will become interesting in one of the other demos, where we now want to route information from IoT Hub to another Azure service but where we also want to filter, where we say all the messages where error equals no should go to this endpoint, and maybe messages where the property or the header is set to error equals yes should be processed differently, maybe with Service Bus, maybe with another service. Let me just execute that. I say I want to send 21 messages here, and let me execute. Now this PowerShell script is executed and 21 messages will be sent to IoT Hub. While they are sent, let me go back to PowerPoint just to say, what have we done so far? What are we really doing? We have created an IoT Hub with the script, we have created a storage account with the script, we have created a device inside of IoT Hub, and then we are using this device to send telemetry, really the JSON that you have seen in the PowerShell script, to IoT Hub. Because we have created a route and we have also created an endpoint which points to an Azure storage account, all the information that we're sending to IoT Hub should now automatically be pushed to the storage account. Let's see if that really worked out. That means we are going back to the development screen. Here we are. We see that it is still sending the 21 messages. It will take a little bit until all the 21 messages are sent. Then remember, we have this threshold of a minute or 20 messages before IoT Hub dumps all these messages into the storage account. What I have done before is, I have executed that script already, and we can now look into this storage account.
I'm here in the Azure portal, and let me go into the storage account and here into the containers. Here we see the container that we have created with the PowerShell script. If I click on the "Container", I already see a folder. This folder was created by IoT Hub, and it is really the name of the IoT Hub instance I have created. Now when I look inside of this folder, I already see some messages which have been dumped into this storage account. Let us look into these messages. Here we see the messages. We see here, for example, this additional property that I have created in the PowerShell script, we see a lot of system properties, and we also see here at the end the payload. This is a super straightforward example of sending or pushing information from IoT Hub directly to another service. Here we really use the magic of IoT Hub to create endpoints and to create routes, and then IoT Hub will forward the messages to the endpoint and then to the Azure service which is linked to this endpoint. Now let me go back to the deck. We have seen a simple route. Now, let's switch gears and ask, what do we have to do to enrich an incoming message? The scenario here is that the device is not sending specific information, let's say about a cost center, a tenant ID, a company ID; it's just sending its device ID. But we are dumping all that information into a storage account to process it. Let's take the example of artificial intelligence training data. In this training data, we also want to have a cost center and maybe a department ID. What do we have to do now to let IoT Hub do this for us?
Here in the example that I have prepared, I want to use the functionality of a device twin to store some information about the device in its device twin, which is stored inside of IoT Hub, and then let IoT Hub do the necessary work to retrieve this information and add it to the messages which are then routed to the endpoint and finally to the storage account. Let me again go back to my developer screen. By the way, if you look at the GitHub account, I have here a simple CLI command to check if the files have been created. When it is executed, and if you were patient enough to wait for the batch time, then you will see that the files are created. But let us go back to the message enrichment example. As I have stated, I want to enrich the incoming messages with a department ID, and here the department ID, I made it super simple, it's again a simplified example, I called it dept01, and the CostCenter is cost01. This applies to messages which are sent by a specific device, and in this specific case, really the device that I've created and where I'm simulating the message ingestion, the sending of the telemetry data. Whenever messages from this device arrive, IoT Hub should automatically add the department ID, dept01, and the CostCenter, cost01. The first thing I have to do is update the device twin for that specific device. This is what I'm doing right at the moment, so let me execute this line. That will take a couple of seconds. Let's wait until it is updated. Here it is. Now we have, for that specific device, the information about the department ID and the cost center stored in the device twin. That's the first step. The next step is we really have to create the message enrichment. Because we want to enrich the incoming message with two values, we have to create two enrichments.
The first is really the enrichment where I say, I want to have in the messages which are coming in the value from something which is stored in $twin; this is a reference to the device twin. Then in the properties, in the desired, ownership, department ID, and if we go back, that is exactly the JSON format that I have created here in the device twin. That's all that I have to do. Let me execute this as well, and let me execute it also for the cost center; it follows the same principle. Now, I'm executing this, and this takes again a couple of seconds. What we are doing here is we are creating this enrichment rule in IoT Hub, and this enrichment rule will then be used by IoT Hub, when a message from this device flies in, to enrich the incoming message. As mentioned, that takes a couple of seconds. >> Now we have created those two rules for IoT Hub, these message enrichment rules, and that's all. Again, that is all we have to do, and now we can send again a batch of 21 messages, and IoT Hub will now receive those messages. IoT Hub will look into the enrichment rules which have been created and will then enrich incoming messages with the two pieces of information about the department and the cost center, and we will have this in all the messages which are dumped to the storage account. Pretty cool, pretty straightforward. While we are ingesting this data here, let me also add something on top. If you want to do this functionality on your own, you can do this, not a problem at all. You just spin up your own logic, you take the message coming from IoT Hub and then do a lookup to say, hey, this message is coming from device ID, whatever, and then you enrich it with this information. From my perspective, the good news here is that when you do it using the IoT Hub functionality, you will introduce for sure some latency, but this latency will be under 500 milliseconds.
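The enrichment step described above, including the three value sources mentioned earlier (a device twin path, a fixed literal, the hub name), can be modeled in a short Python sketch. This is an illustration of the idea, not IoT Hub's code. The twin layout, the key names `departmentId` and `costCenter`, and the hub name are hypothetical, since the exact twin path used in the demo script cannot be recovered from the transcript. Returning the unresolved path unchanged when the twin has no such entry is also an assumption.

```python
def resolve_enrichment(value: str, twin: dict, hub_name: str) -> str:
    """Resolve one enrichment value: $iothubname, a $twin path, or a fixed literal."""
    if value == "$iothubname":
        return hub_name
    if value.startswith("$twin."):
        node = twin
        for key in value[len("$twin."):].split("."):
            if not isinstance(node, dict) or key not in node:
                return value  # assumption: an unresolved path is stamped as-is
            node = node[key]
        return str(node)
    return value  # anything else is treated as a fixed literal


def enrich(message: dict, rules: dict, twin: dict, hub_name: str) -> dict:
    """Return a copy of the message with each rule added as an application property."""
    props = dict(message.get("properties", {}))
    for key, value in rules.items():
        props[key] = resolve_enrichment(value, twin, hub_name)
    return {**message, "properties": props}


# Hypothetical twin content and rules mirroring the two enrichments in the demo.
twin = {"properties": {"desired": {"departmentId": "dept01", "costCenter": "cost01"}}}
rules = {
    "departmentId": "$twin.properties.desired.departmentId",
    "costCenter": "$twin.properties.desired.costCenter",
}
enriched = enrich({"body": b"{}", "properties": {"error": "no"}}, rules, twin, "demo-hub")
```

The payload itself is left untouched; the added values travel next to it as message properties, which is why they show up alongside the `error` property in the stored messages.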
If you do this with your own functionality, you might add a little bit more latency, or, what I see in customer engagements, most often it is not just a little bit more latency, it is way more latency than the under 500 milliseconds that IoT Hub will use here. That's one thing. Now let's go back to the deck, because we have now seen how enrichment can be done. Now let's look into how we can filter and how we can query information coming from a device and then route it accordingly, and this is really the next thing I want to show you. Let me go back again to the development screen, and I'm back here in my Visual Studio with the PowerShell script, and we see it's still ingesting data. That is cool, that's not a problem. But let's focus now on the filtering, on the querying of incoming messages, and then route them accordingly. The most important thing here is the line that I've highlighted, and it's the route condition, and here I say, hey, my route condition is error equals no. That means IoT Hub will look into the incoming message, and remember, this error was not part of the payload. Error was a property or a header of the message. Because I'm writing here just error equals no, IoT Hub will look into the properties of the incoming message. After I have created this route condition, I just have to update a route, and in this case, I'm updating the route which is already forwarding messages to the storage account, with the endpoint name and with all the information that we have already created; I'm just adding this route condition. Let me stop the ingestion here, which is still working in the background, and let me execute the route condition. This again will take a couple of seconds. I think we do not have to wait until it is executed, because I want to show you also a different route condition.
When you look now at this route condition, it says $body device category equals, and here in this specific case, multi-center. Let's compare this to the previous route condition. Let me scroll up. Here, we have the initial route condition, and the initial route condition is just saying error equals no. There is no dollar in front of it. But if we go here to the second route condition, we say here, $body device category equals multi-center. I think you can guess what it means. Here we are instructing IoT Hub: hey, IoT Hub, don't look into the properties or the headers of an incoming message. Here we instruct IoT Hub to really look into the body of the incoming messages, and here's a caveat. That functionality works very well, and as you can see here, I'm doing just the telemetry ingest that you have seen in the previous example. But I've marked it here with an attention: this data will not be routed to the storage account. The data will be routed to the IoT Hub default endpoint. Why will it not be routed to the storage account? We have done everything correctly, and the reason why it will not be routed to the storage account is that when we instruct IoT Hub to look into the payload of the incoming message, the payload needs to have a certain format so that IoT Hub can really execute the condition that we have provided. In this specific case, the message needs to be UTF-8 encoded, and at the time of the recording of this session, we do not have a parameter here in the CLI command az iot device send-d2c-message where we say, hey, CLI command, please send it UTF-8 encoded. Therefore, I have created here another small project. In this case it's a C# project. Let me show you the project. It's really a super simple, straightforward, simplified source.
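The difference between the two route conditions above, and the caveat that a body condition only works when the message is marked as UTF-8 encoded JSON, can be illustrated with a small Python model. This is a simplified sketch, not IoT Hub's query engine: conditions are reduced to a single equality check, and the field name `deviceCategory`, the endpoint name `storage` and the dictionary layout of a message are assumptions made for the example.

```python
import json


def matches(message: dict, scope: str, key: str, expected: str) -> bool:
    """Evaluate one simplified route condition against a message."""
    if scope == "property":
        # Conditions without a $body prefix look only at properties (headers).
        return message.get("properties", {}).get(key) == expected
    if scope == "body":
        # The body is only inspected when it is declared as UTF-8 encoded JSON.
        is_json = (message.get("content_type") or "").lower() == "application/json"
        is_utf8 = (message.get("content_encoding") or "").lower() == "utf-8"
        if not (is_json and is_utf8):
            return False
        try:
            body = json.loads(message["body"].decode("utf-8"))
        except (UnicodeDecodeError, ValueError):
            return False
        return isinstance(body, dict) and body.get(key) == expected
    raise ValueError(f"unknown scope: {scope}")


def route(message: dict, routes: list, fallback: str = "default") -> list:
    """Return the endpoints whose condition matches, or the fallback endpoint."""
    targets = [endpoint for endpoint, cond in routes if matches(message, *cond)]
    return targets or [fallback]


payload = json.dumps({"deviceCategory": "multi-center"}).encode("utf-8")
untagged = {"body": payload, "properties": {"error": "no"}}
tagged = {**untagged, "content_type": "application/json", "content_encoding": "utf-8"}
body_routes = [("storage", ("body", "deviceCategory", "multi-center"))]
```

Calling `route(untagged, body_routes)` falls through to the default endpoint even though the bytes are valid JSON, which mirrors the behavior shown in the demo; `route(tagged, body_routes)` reaches the storage endpoint.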
Here I say, hey, I'm using the device client to connect to IoT Hub; more or less, I'm again simulating a device here, and the connection string is provided. This is the same connection string that I'm using for the CLI tooling. Then I say, hey, here's the telemetry data. But I'm adding here that the content type is application JSON and the content encoding is UTF-8, and all the SDKs are using this information and are sending the message really UTF-8 encoded. Whenever you have routing enabled in your applications and you have the feeling that the messages might not end up at the route that you have configured, please check the content encoding to really enable IoT Hub to look into the payload. Because for IoT Hub it's just bytes which are flying in, and we really have to tell it, hey, these are UTF-8 encoded messages, so that IoT Hub can really execute on the criteria that you have provided. With that said, let me go back to the PowerPoint deck and let me summarize what we have seen so far. What we talked about is all about data egress and pushing data from IoT Hub to other Azure services. We have seen how endpoints can be created to link to a specific Azure service. In our example, it was a storage account. But remember, it could be an Event Hub, it could be a Cosmos database, or it could be a Service Bus topic or subscription. We have also seen how to enrich events, where we can use built-in IoT Hub functionality to add additional information from a device twin, which is unique to the device, into the incoming messages. We have seen how you can filter or query incoming messages based on the payload inside of the message, or based on properties or headers which you add on top of the payload of a message sent to IoT Hub. As I already mentioned, you can see the whole sample under this link. It links you to a GitHub account.
You have step-by-step guidance there to set up the environment and then do the testing on your own. Stay tuned, this is just the first session about how to get data out of IoT Hub. There will be two additional ones. Another one will focus more on pulling information from IoT Hub. From my perspective, this is a very interesting topic, and when I work with ISVs or partners, quite often the pulling of information is still not always accurate, so quite often you need an exactly-once processing pattern, and then you really have to invest in how to pull information from IoT Hub. Another session will focus on Event Grid and how you can use Event Grid to again egress data from IoT Hub. With that said, thank you for your attention. If there are any questions, just shoot me an email, and I'm looking forward to seeing you in the next sessions.
Info
Channel: Microsoft IoT Developers
Views: 4,125
Keywords: IoT, Microsoft IoT, Azure IoT Hub, Routing, Industrial IoT, Telemetry
Id: EBpzLrEx5gg
Length: 40min 4sec (2404 seconds)
Published: Wed Jan 04 2023