>> Thanks for tuning in. My
name is Robert Eichenseer. I work as a Senior Services
Engineer at Microsoft. Today's session is all about IoT Hub and how to get
data out of IoT Hub, so how to egress data
from an IoT Hub instance. When it comes to IoT
Hub, quite often, you're already familiar with
how to get data into IoT Hub, how to connect the devices, how to send data from
the devices to IoT Hub. But quite often there is a
question, okay, hey Robert, if I have all the data in IoT Hub, how do I get the
data out of IoT Hub? This is really the topic
of today's session. How can we get data out of IoT Hub to process the
valuable data further on. This will be the topic of today. When we look at it from
a 10,000-foot perspective, like always, there
are multiple ways to retrieve and process
the data from IoT Hub. Let me start here from
the very beginning. If we have all the devices
and the devices are now sending their telemetry
data to IoT Hub, there are two, let me say, fundamentally different ways to
get this data out of IoT Hub. There is one way which is called in our documentation routing or events. There is another way
which is called in our documentation service endpoint. The question is, what is really the difference
between those two ways? Well, the primary difference
is that routing or events are pushing information from
IoT Hub into other services. In this case, it's not the
chore of an application to ask IoT Hub please give me the latest and
greatest information; IoT Hub is really routing this information into
this Azure service. As you can see on the screen, the service endpoint is
just the other way around. The service endpoint
provides the data inside of IoT Hub for other applications
to pull it from IoT Hub. In this session, we will definitely
focus here on the push of the information of
the telemetry data from an IoT Hub to another service. This brings me already
to the next slide. When we say we push information from IoT Hub to another Azure service, what services are available there? As you can see on the screen, we have right at the moment four services where IoT
Hub can really push or route information,
telemetry messages which have been sent by devices, to. When we look at them,
we have storage, so we have not just Blob storage. We also have data lake,
we have Event Hubs, we have Service Bus
queues and topics, and we also have Cosmos DB. I'll go a little bit
deeper and give you some best practices on when to use which service when it comes
to processing the messages. But the real question is, how is IoT Hub now doing this? Think about the multiple data
sources that you have in IoT Hub. It's not just the
telemetry information that devices are sending to IoT Hub, you also have in IoT Hub data
sources like device events. Something like a
device twin has been changed, or device
connection state events, something like, hey, a device has connected or a device
has disconnected, or if you are connected
using the MQTT Protocol, you also get MQTT broker events. This is what I have here in
the slide as data sources. This information now
can be routed to the Azure services using
the so-called endpoints. When it comes to the endpoints, you can have up to 10 endpoints per paid IoT Hub installation
or IoT Hub instance, and per endpoint, you can have a dedicated service where IoT Hub routes or pushes
the information to. In this example, I say, hey, I have three routes, one is pushing
information to storage, another one to Cosmos, and the third is pushing
the information to a Service Bus Queue or
to a Service Bus Topic. But that's not all. IoT Hub provides
additional functionality, and this additional functionality
is quite often very handy when you have to
implement specific scenarios. One of the functionalities
is really query. IoT Hub can look into the
message which you sent to IoT Hub and then can decide based on a query to which endpoint
this information, this message should be sent to. What is the typical way of
using this functionality? Think, for example, about a multi-tenant solution
where you are providing a single IoT Hub installation
inside of a multi-tenant
environment. Now the trick is when
messages come in, maybe you want to process
the incoming messages with a different SLA for
Customer A and Customer B, because Customer A is maybe your gold
customer who pays a higher fee and therefore
also expects a different SLA. That means when you have this multi-tenant IoT Hub, or the IoT Hub used in a
multi-tenant application, you might want to route
the incoming messages to a different endpoint
with higher performance, maybe with more resources
associated with it, and here IoT Hub can be your friend. IoT Hub can really help you
to look into the message. We will see this later, and then decide based on
the content of the message where this message
should be routed to. You have here a query functionality, and on top of the
querying functionality, you can also enrich
incoming messages. Well, what does enrich mean
and how could you use it? Again, think about a scenario, maybe a multi-tenant scenario, where a message comes in and
you just have the device ID, for example, of this message, more or less the ID from
where the message was sent. But to process this message further, you might need, for example, a department ID, or if you want to
do cross-charging of the cost, you might need a cost center, and IoT Hub can help you
here with this as well. IoT Hub can take the
incoming message, then look for some specific
criteria and say, hey, I want to add additional
information to this message and enrich this message and then
forward it to the endpoint. This is really from a
high-level perspective, the routing functionality. I want to show you
also some examples. Before we go into the example, before we do the demo, let's talk a little bit about the different endpoints and when
to use the different endpoints. Let me start here
with the Event Hubs. Event Hubs is really the service of choice when you have to do high volume and high
throughput scenarios. When you have to implement
really a scenario where latency matters,
where you say, hey, I'm getting a lot of information and I want to process them
at want to store them, I want to forward them
for a reasonable price, then Event Hub really
is your friend. Event Hub, as you know, is a messaging solution. On top of Event Hub, we as Microsoft also provide you a messaging solution which
is called Service Bus, and inside of Service Bus, we have two flavors,
Queues and Topics. When we now look at those two routing endpoints, an Event Hub or a Service
Bus Queue or Topic endpoint, what is the difference and
when do you use which endpoint? Again, from a
high-level perspective, Event Hubs gives you
this high-volume and high-throughput functionality. Service Bus, on the other side, gives you enhanced functionality for your messages when you want
to process those messages. To give you just an example, Service Bus has a time
to live on a message. How can you use that functionality? Again, think about a scenario
where you, for example, receive an alert and you have an
SLA with your customers where you say this alert will be processed
within a given time period, and if it is not processed within
this given time period, I will do another alerting or
another alerting needs to be done. If you now forward this message
to a Service Bus Queue or Topic and you have
a time to live defined, Service Bus helps you,
after the time to live for this message has expired,
to route it maybe to a different queue
where you can have a different listener
and then implement the alerting functionality
or the escalation alerting functionality based on
that Service Bus Queue or Topic. Storage for sure is a prime
example to archive raw data. There are multiple reasons
why you want to archive raw data. Think about
compliance requirements, where you have to prove maybe in a year that you have
processed the message correctly or that a message
has arrived and so on, or think about a
scenario where you just want to train artificial
intelligence models, where you really need the big
data to train your models. This is definitely a use case where you want to go
into the storage area. As already mentioned, we have two
ways to route data to storage. One is really into a
standard Blob storage. The other way is really to route
the data into a data lake, and a data lake, for example, has a lot of advantages
when it comes to processing, when it comes to big data,
compared to a storage account. But if your intention
is to just store the raw data at the
lowest possible cost, then a Blob store might be the
right thing for your scenario. The fourth option that we have, this is currently in preview, is really our Cosmos
database where we say all the messages are directly routed into a database for
immediate processing. For example if you have a dashboard
and you want to really have, I wouldn't call it near real time, but really something where you want
to access data as it flows in, then you can route it to a Cosmos Database for
immediate processing. This brings me now to really a demo. What do I want to do
within this demo? What I want to do is I want to
really start with a basic setup. By the way, you can do the
whole demo on your own. There is a link,
aka.ms/IoTHubEgressPush. With this link, you can
follow the whole demo. There is a script, there is step-by-step guidance if
you want to follow it. But what I want to
show here is really a basic setup where we
will emulate a device. This device will then
send data to IoT Hub, and in the first example, we will implement a simple endpoint with a simple route to really send all the messages which are flying into this IoT Hub to
a storage account. That is a super simple setup. In the second example, I want to take the same environment and then I want to enrich the incoming message. I want to enrich the
incoming message in a, from my perspective, very cool way. Because when the device now
really sends a message, let's assume that this
device just sends, let's say, its device ID. But to process it further, you might need, as
already mentioned, maybe a cost center
or a department ID, and what I will do here is, I will update the device twin, which is unique for
every device with information about a cost center and with information
about a department. Then when a message from this
device flies into IoT Hub, IoT Hub will take this message, will look into the device twin and then will put
the information about the cost center and about the department into the
message so that at the end, when you store it somewhere
in the storage account, then you really have this
additional information inside of the message. You might wonder,
what resources can I use to really enrich
the incoming messages? We have here three possibilities. From my perspective,
what I see most often used is really
the device twin, because the device twin is
unique to a device and you can store all the
necessary information for that specific device there. You can also have a fixed
literal where you just have, let me say, a string which
will be added to every message, or you can also have the name of the IoT Hub where this message
was initially sent to. This is something that I
want to show you in code. Then the last thing that I
want to do is I want to show you how you can filter
incoming messages and then route the
incoming messages based on the filter criteria
to a custom endpoint. Again, the scenario
here is, think about your IoT Hub in a multi-tenant
environment where you say, hey, messages coming in from, let's say, this device or with this
payload inside of the message should be routed to a
different endpoint, because I have to do the high-frequency processing here,
because it's a gold account with a high SLA instead of a free-of-charge account, maybe just for
test purposes. This is really what I
wanted to show you today. Now let me switch to my
development machine. What you see here is
really a PowerShell script and you can find this
PowerShell script under the link that was provided at
the beginning of the session. As mentioned, the first
thing that we want to do together is really
the basic setup. You can see here in the script, I'm using the Azure CLI and
what I'm doing here is the login. This might come in handy for you: this small line of code here, if you are in a situation
where you have to deal with multiple subscriptions and
you log in to your account, quite often you have to select which subscription you
want to work against. Here this handy script just picks the subscription
which is marked as your default subscription and then really sets your account
to this subscription. The stuff that I'm doing here
is pretty straightforward. I'm creating a resource group, then I'm creating an IoT Hub. Here you can see that
I'm creating an IoT Hub. I also create a device
inside of IoT Hub. Remember, I want to really simulate the message flow from a
device, and therefore, I take the connection string from IoT Hub for the newly created device and I just simply store the connection string in the variable hub device
connection string. Then again, to have the
simple example and remember I want to forward all those
messages to a storage account. If I want to forward it
to a storage account, I really have to create
the storage account first. This is what I'm doing here. I create the storage account with
the resource group and so on. Inside of the storage account, I create the container, and inside of this container, all the incoming messages
should be stored. That's pretty straightforward.
The next thing that we have to do is, we have to create the
IoT Hub endpoint. IoT Hub endpoint means, if you look back to the sketch, really this endpoint where
we can route information from IoT Hub to, and the endpoint forwards that information, in that specific
case, to a storage account. As you can imagine, every endpoint, depending on where it points to, has different parameters
that we have to provide. Because I'm here
routing information to a storage account endpoint, I have to provide an endpoint file format: how should IoT Hub name the
file where all this information, all the incoming
messages, will be stored? I also have two
interesting parameters here, an endpoint batch frequency,
and I've set this to 60. That means when IoT Hub is routing information to a storage
endpoint, more or less to a file, it does not write every incoming
message to that specific file; it really batches the
incoming messages. Then here I have two
parameters where I can control when this batch is
written to the file. Here I can say, hey, the batch frequency is every
60 seconds, and on top, the endpoint chunk size
is 20 messages max. If a minute has passed, then IoT Hub dumps all the
information to the file, or if you have reached the chunk
size of 20 messages in a file, then it also dumps it. Here you really have to play, according to your scenario,
with the numbers, those figures are good
for a demo scenario, but in the real world, please make those
figures a little bit higher so that you
can really cope with incoming message frequency and
with incoming message sizes. That's all that I'm doing. I'm
creating the storage account, then I'm also saving the storage
connection string in a variable. Now comes the interesting stuff. Here, I really create the
endpoint inside of IoT Hub. I'm using again the Azure CLI. I say IoT Hub routing endpoint create and then I provide
all the necessary parameters. Let me just call out here
a few of the parameters. Here's the connection string
to the storage account. Here's the encoding, how it should be encoded, and the parameters with
the file format, the batch frequency,
and the chunk size. This is all that I have to do. If I execute this, then this endpoint
will be created. I've executed this already. Stay with me a little bit; I will show you how it really works after all this
stuff is created. After I have created the endpoint, I also have to create a route, because remember, the endpoint links directly to another Azure service where all the messages
are forwarded to. The route gives IoT Hub
the information about what should be
routed to this endpoint. Here, I just say the source
is the HubRouteSource, and that means the
HubRouteSource is really what information I want
to send to this endpoint. Let me say here, that was not what I wanted to do, if I say here $hubRouteSource, you see, I have here
device messages, really telemetry messages
coming from the devices. Other options here would be lifecycle events from
the device and so on. The next thing, as I mentioned, is to create this route, and then you have everything
you need to really have this straightforward
simple example. Now, let's test this out. What I have here is again, and you will find this also
in the GitHub account, a very
simple JSON format of the data I want
to send to IoT Hub. More or less I'm simulating
the device here, and I have here the device ID, the device category, telemetry ID, pressure, energy consumption
and the telemetry timestamp. That's the very simplified JSON
that I want to send to IoT Hub. Here I'm just replacing some of the information with the real
device ID that I have created. Let me stress this line here as well, where I say telemetry property. I'm really sending the payload of the message, and then with
telemetry property, I'm adding to the payload, and when I say payload I
mean this JSON format, I'm adding to this payload
an additional property, where I say error equals no. This will become
interesting in one of the other demos where
we really now want to route information from IoT
Hub to another Azure service, but where we also want
to filter, where we say, all the messages
where error equals no should go to this endpoint, and maybe messages where the property or the header is set to
error equals yes should be
processed differently, maybe with Service Bus, maybe with another service. Let me just execute that. I say I want to send 21
messages and let me execute here. Now this PowerShell
script is executed and 21 messages will
be sent to IoT Hub. While they are being sent, let me go back to PowerPoint
just to say, what have we done so far? What are we really doing? What we have done is we have
created here an IoT Hub. Here we have created an
IoT Hub with the script, we have created a storage
account with the script, we have created a device
inside of IoT Hub, and then we are using this
device to send telemetry, really the JSON
that you have seen in the PowerShell script, to IoT Hub. Because we have created
a route and we have also created an endpoint which points to an
Azure storage account, now all the information
that we're sending to IoT Hub should automatically be
pushed to the storage account. Let's see if that really worked out. That means we are going back to the development screen. Here we are. We see that it is still
sending the 21 messages. It will take a little bit until
all the 21 messages are sent. Then remember, we have this
threshold of a minute or 20 messages before IoT Hub dumps all these messages
into the storage account. What I have done before is, I have executed that script
already and we can now look into this storage account. I'm here in the Azure
portal and let me go in the Azure portal into
the storage account, and let me go here
into the containers. Here we see the container that we have created with
the PowerShell script. If I click on the "Container", I already see now a folder. This folder was created by IoT Hub and really has the name of the IoT Hub instance I have created. Now when I look inside
of this folder, I already see some
messages which have been dumped into this
storage account. Let us look into these messages. Here we see the messages. We see here, for example, this additional property that I have created in the
PowerShell script, we see a lot of system properties, and we also
see here at the end the payload. This is a super
straightforward example of sending or pushing information from IoT Hub directly
to another service. Here we really use the magic from
IoT Hub to create endpoints, to create routes and then IoT
Hub will forward the messages to the endpoint and then to the Azure services or
to the Azure service, which is linked to this endpoint. Now let me go back to the deck. We have seen a simple route. Now, let's switch
gears and let's say, what do we have to do to
enrich an incoming message. The scenario here is really
the device is not sending specific information, let's
say about a cost center, a tenant ID, a company ID; it's just sending its device ID. But we have to process this information, and we are dumping all that
information into a storage account. Let's take the example of artificial intelligence
training data. In this training data, we also want to have a cost
center and maybe a department ID. What do we have to do now to let IoT Hub do this
functionality for us? Here in the example
that I have prepared, I really want to use the
functionality of a device twin to store some information about
the device in its device twin, which is stored inside of IoT Hub, and then let IoT Hub do the necessary functionality
to retrieve this information and add it to the messages which are then routed to the endpoint and finally routed to
the storage account. Let me again go back to
my developer screen. By the way, if you look at the GitHub account, I have here a simple CLI command to check if the files have been
created, and when it is executed, and if you were patient enough
to wait for the batch time, then you will see that
the files are created. But let us go back to the
message enrichment example. As I have stated, I want to enrich the incoming
messages with a department ID, and here the department ID, I made it super simple, it's again a simplified example, I called it dept01, and the CostCenter is cost01. Meaning messages which
are sent by a specific device, and in this specific case, really the device that I've created and where I'm simulating
the message ingestion, the sending of the telemetry data: whenever messages from
this device arrive, IoT Hub should automatically
add the department ID dept01, and the CostCenter, cost01. The first thing I have
to do is I have to update the device twin
for that specific device. This is what I'm doing
right at the moment, so let me execute this line. That will take a couple of seconds. Let's wait until it is
updated. Here it is. Now, we have for that
specific device, really the information
about the department ID and a cost center stored in the device
twin for the specific device. That's the first step.
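To make the idea concrete, here is a minimal Python sketch of what such an enrichment does conceptually. It is not IoT Hub's implementation. It assumes a message is a plain dictionary with application properties, and that an enrichment value is either a fixed literal, the hub name reference `$iothubname`, or a `$twin.` path into the device twin. The twin layout (an `ownership` object under the desired properties), the key names, and the hub name `demo-hub` are placeholders chosen for illustration. Skipping an enrichment whose twin path is missing is a simplification of this sketch, not a statement about how IoT Hub behaves.

```python
from typing import Any, Dict, Optional


def resolve_enrichment(value: str, twin: Dict[str, Any], hub_name: str) -> Optional[str]:
    """Resolve one enrichment value: hub name, twin path, or fixed literal."""
    if value == "$iothubname":
        return hub_name
    if value.startswith("$twin."):
        node: Any = twin
        # Walk the dotted path, e.g. properties -> desired -> ownership -> departmentId.
        for part in value[len("$twin."):].split("."):
            if not isinstance(node, dict) or part not in node:
                return None  # sketch-only choice: skip paths that do not exist
            node = node[part]
        return str(node)
    return value  # anything else is treated as a fixed literal


def enrich(message: Dict[str, Any], enrichments: Dict[str, str],
           twin: Dict[str, Any], hub_name: str) -> Dict[str, Any]:
    """Return a copy of the message with resolved enrichments added as properties."""
    properties = dict(message["properties"])
    for key, value in enrichments.items():
        resolved = resolve_enrichment(value, twin, hub_name)
        if resolved is not None:
            properties[key] = resolved
    return {**message, "properties": properties}


# Hypothetical twin content and enrichment rules, mirroring the demo values.
twin = {"properties": {"desired": {"ownership": {
    "departmentId": "dept01", "costCenter": "cost01"}}}}
enrichments = {
    "departmentId": "$twin.properties.desired.ownership.departmentId",
    "costCenter": "$twin.properties.desired.ownership.costCenter",
    "hubName": "$iothubname",
    "missing": "$twin.tags.doesNotExist",
}
message = {"properties": {"error": "no"}, "body": {"deviceId": "device01"}}

enriched = enrich(message, enrichments, twin, "demo-hub")
print(enriched["properties"])
```

The device itself still only sends its device ID; the department and cost center travel with the message as additional properties, which is why a downstream consumer does not need its own lookup.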
The next step is we really have to create
the message enrichment. Because we want to
update two values, we want to enrich the incoming
message with two values, we have to create two enrichments. The first is really the
enrichment where I say, I want to have in the
messages which are coming in, the value from something
which is stored: $twin, this is a reference
to the device twin. Then in the properties, in desired, ownership, department
ID, and if we go back, that is exactly the JSON format that I have created here
in the device twin. That's all that I have to do. Let me execute this as well, and let me execute it
also for the cost center; it follows the same principle. Now, I'm executing this command and this takes
again a couple of seconds. What we are doing here is,
we have updated in IoT Hub the device twin for that specific device with
two additional values, and we are creating here
this enrichment rule, and this enrichment rule will
then be used by IoT Hub when a message from this device flies in to enrich the incoming message. As mentioned, that takes
a couple of seconds. Now we have created
those two rules for IoT Hub, so these message
enrichment rules, and that's all we have
to do. Now we can send again a batch of 21 messages, and IoT Hub will now
receive those messages. IoT Hub will look into the enrichment rules
which have been created, and will then enrich incoming messages with the two
pieces of information about the department and the cost center, and
we will have this in all the messages which are
dumped to the storage account. Pretty cool, pretty straightforward. While we are ingesting
this data here, let me also add something on top. If you want to do this
functionality on your own, you can do this, not
a problem at all. You just spin up your own logic, you take the message coming from IoT Hub and then do
a lookup to say, hey, this message is coming
from device ID whatever, and then you enrich
it with this information. From my perspective,
the good news here is that when you do it using the IoT Hub functionality, you will introduce for
sure some latency, but this latency will be
under 500 milliseconds. If you do this with
your own functionality, you might add a little bit
more latency, or what I see in customer engagements is that it's most often not just a little
bit more latency, but way more latency than the under 500 milliseconds
that IoT Hub will use here. That's one thing. Now let's go back to the deck, because we have
now seen how enrichment can be done. Now let's look into
how we can filter and how we can query information coming from a device
and then route it accordingly and this is really the next thing
I want to show to you. Let me go back again here to the
development screen and I'm back here in my Visual Studio
with the PowerShell script and we see it's
still ingesting data. That is cool, that's not a problem. But let's focus now
really on the filtering, on the querying of
incoming messages, and then route them accordingly. The most important thing
here is the line that I've highlighted and it's the route
condition, and here I say, hey, my route condition is
error equals no. That means IoT Hub will look into the incoming
message, and remember, this error was not
part of the payload. Error was a property or a header of the message. Because I'm writing
here just error equals no, IoT Hub will look into the
properties of the incoming message. After I have created
this route condition, I just have to update a
route and in this case, I'm updating the route which is
already forwarding messages to the storage account with the endpoint name and with all the information that
we have already created, I'm just adding this
route condition. Let me stop here the ingestion
which is still working in the background and let me execute
here the route condition. This again will take
a couple of seconds. I think we do not have to
wait until it has executed, because I
also want to show you a different route condition. When you look now at
this route condition, it says $body, device
category, equals, and here in this specific
case, multi-sensor. Let's compare this to the
previous route condition. Let me scroll up. Here, we have the initial route condition, and the initial route condition is
just saying error equals no. There is no dollar sign in front of it, but if we go here to the
second route condition, we say here $body, device
category, equals multi-sensor. I think you can guess what it means. Here we are instructing IoT Hub: hey, IoT Hub, don't look into the properties or the
headers of an incoming message. Here we instruct IoT
Hub to really look into the body of the incoming
messages and here's a caveat. That functionality works very
well, and as you can see here, I'm doing just the
telemetry ingest that you have seen in
the previous example. But I've marked it here
with an attention note: this data will not be routed
to the storage account. Data will be routed to the
IoT Hub default endpoint. Why will it not be routed
to the storage account? We have done everything correctly, and the reason why
it will not be routed to the storage account is
because when we instruct IoT Hub to look into the payload
of the incoming message, the payload needs to have
a certain format so that IoT Hub really can
execute the criteria, the condition that we have provided. In that specific case, the message needs to be UTF-8 encoded, and at the time of the
recording of this session, we do not have a property
here in the CLI command. We do not have a parameter here for az iot device send-d2c-message
where we say, hey, CLI command, please send it UTF-8 encoded. Therefore, I have created
here another small project. In this case it's a C# project. Let me show you the project. It's really a super-simple,
straightforward, simplified source, where I say, hey, I'm using the device client to connect
to IoT Hub; more or less, I'm again simulating a device here, and the connection
string is provided. This is the same
connection string that I'm using for the CLI tooling. Then I say, hey, here's the telemetry data. But I'm adding here that the content type is
application/json and the content encoding is UTF-8, and the SDK is using this
information and is sending the message really UTF-8 encoded. Whenever you have routing
enabled in your applications, and you have the feeling that the messages might not end at the
route that you have configured, please check the content
encoding to really allow or enable IoT Hub to look
into the payload. Because for IoT Hub it's just bytes which are flying in and we
really have to tell it, hey, these are UTF-8 encoded
messages, so that IoT Hub really can execute on the
criteria that you have provided. Having said that, let me go back to the PowerPoint deck and let me
summarize what we have seen so far. Really what we talked about is all about data egress and pushing data from IoT Hub
to other Azure services. We have seen how endpoints can be created to link really to
a specific Azure service. In our example, it was
a storage account. But remember, it could
be an Event Hub, it could be a Cosmos database, or it could be a
Service Bus Queue or Topic, and we also have
seen how to enrich events, where we really can use built-in IoT Hub functionality to add additional information
from a device twin, which is unique to the device
into the incoming messages. We have seen how we can
filter or how you can query incoming messages based on
payload inside of the message, or maybe based on
properties or headers, which you add on top to the payload
to a message sent to IoT Hub. I already mentioned, you can see
the whole sample under this link. It links you to a GitHub account. You have step-by-step
guidance there to set up the environment and then do
the testing on your own. Stay tuned, this is just the first session about how
to get data out of IoT Hub. There will be two additional ones. One will focus more on
pulling information from IoT Hub. From my perspective, this is
a very interesting topic, and when I work with
ISVs or partners, quite often the pulling of information is still not always done accurately. Quite often
you need an exactly-once processing pattern, and then you really have to invest in how to
pull information from IoT Hub. Another session will
pull information from IoT Hub, and another session will
focus on Event Grid and how you can use Event Grid to again
egress data from IoT Hub. Having said that, thank
you for your attention. If there are any questions, just shoot me an
email, and I'm looking forward to seeing you in
the next sessions.
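
As a recap of the two route conditions from the demo, the following Python sketch models the difference between filtering on message properties and filtering on the message body. It is only an illustration under stated assumptions: the conditions are plain Python functions rather than IoT Hub's query language, the endpoint names and the category value `demo-category` are made up, and `events` merely stands in for the default endpoint. The point it tries to show is the caveat from the session: a body condition can only match when the message is declared as UTF-8 encoded JSON; otherwise the message falls through to the default endpoint.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

DEFAULT_ENDPOINT = "events"  # stand-in name for the default endpoint


@dataclass
class Message:
    body: bytes
    properties: Dict[str, str] = field(default_factory=dict)
    content_type: Optional[str] = None
    content_encoding: Optional[str] = None


def parsed_body(msg: Message) -> Optional[dict]:
    """Return the body as a dict only if it is declared and valid UTF-8 JSON."""
    if msg.content_type != "application/json":
        return None
    if (msg.content_encoding or "").lower() != "utf-8":
        return None
    try:
        doc = json.loads(msg.body.decode("utf-8"))
    except ValueError:  # covers both decode and JSON parse errors
        return None
    return doc if isinstance(doc, dict) else None


@dataclass
class Route:
    endpoint: str
    on_properties: Optional[Callable[[Dict[str, str]], bool]] = None
    on_body: Optional[Callable[[dict], bool]] = None

    def matches(self, msg: Message) -> bool:
        if self.on_properties is not None and not self.on_properties(msg.properties):
            return False
        if self.on_body is not None:
            body = parsed_body(msg)
            if body is None or not self.on_body(body):
                return False
        return True


def route(msg: Message, routes: List[Route]) -> List[str]:
    """Return all matching endpoints, or the default endpoint if none match."""
    targets = [r.endpoint for r in routes if r.matches(msg)]
    return targets or [DEFAULT_ENDPOINT]


routes = [
    Route("storage-by-property", on_properties=lambda p: p.get("error") == "no"),
    Route("storage-by-body", on_body=lambda b: b.get("deviceCategory") == "demo-category"),
]
payload = json.dumps({"deviceId": "device01", "deviceCategory": "demo-category"}).encode("utf-8")

undeclared = Message(body=payload, properties={"error": "yes"})
declared = Message(body=payload, properties={"error": "yes"},
                   content_type="application/json", content_encoding="utf-8")
by_property = Message(body=payload, properties={"error": "no"})

print(route(undeclared, routes))
print(route(declared, routes))
print(route(by_property, routes))
```

The same payload bytes are routed differently depending only on whether the content type and content encoding are set, which is the first thing to check when messages do not arrive at the configured route.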