Home automation systems have to run all the
time. Today we will make sure you get an alarm on
your smartphone if something goes wrong with your system. But how can we create an alarm if the server
is down? Let's have a closer look. Grüezi YouTubers. Here is the guy with the Swiss accent, with a new episode and fresh ideas around
sensors and microcontrollers. Remember: If you subscribe, you will always
sit in the first row. Critical systems often use heartbeats or watchdogs
for monitoring. We will use both to supervise our private
servers like the Raspberry Pi, connections, sensors, and actuators. The principle is simple: If something does
not work as expected, we create an alarm via the Telegram app on our smartphone. But, as said before, we cannot create an alarm
on a server that no longer works. So we have to use a watchdog as a supervisor. This watchdog alerts us if something is wrong with
the server. So, in this video, we will:
- Understand how we can build end-to-end monitoring
- Learn what a single point of failure is and how we have to deal with it
- Create Node-Red workflows to create alarms if a sensor no longer delivers values or an actuator dies or loses its connection
- Create an expensive watchdog using a Raspberry Pi
- Create a cheapo watchdog
- And as usual, you will learn some tricks from my lab

Let's start with the overview of many of our
systems at home: We have sensors that connect, often via Wi-Fi, to a server. Then the messages go to either Node-Red, Home
Assistant, or another software to display results and act upon the sensor values. Messages are sent to actuators, and they create
physical actions like switching a light off or on. Let's use my awning controller as an example. It has to extend the awning if there is a
lot of sun, the temperature is above 22 °C, and there is no wind. In all other cases, the awning retracts. Frequent viewers know that I made this controller
for my wife. So you can imagine: This is the most critical
system in our home. If my Harley does not work, or if my lab lighting
does not go on when I enter, who cares? But if the awning does not extend and the
house gets hot... Well.
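
As a rough illustration of the awning rule described here (not the actual controller code), the decision could be sketched in Python as follows. Only the 22 °C limit comes from the description; the sun and wind thresholds, the units, and all names such as `Weather` and `awning_should_extend` are assumptions made for this example.

```python
from dataclasses import dataclass

# Only the temperature limit is taken from the video.
TEMP_THRESHOLD_C = 22.0
# Assumed values: the video only says "a lot of sun" and "no wind".
SUN_THRESHOLD_LUX = 30_000.0
WIND_LIMIT_KMH = 5.0


@dataclass
class Weather:
    """One set of readings from the (hypothetical) sensors."""
    sun_lux: float
    temperature_c: float
    wind_kmh: float


def awning_should_extend(w: Weather) -> bool:
    """Extend only if all three conditions hold; otherwise retract."""
    sunny = w.sun_lux >= SUN_THRESHOLD_LUX
    warm = w.temperature_c > TEMP_THRESHOLD_C
    calm = w.wind_kmh <= WIND_LIMIT_KMH
    return sunny and warm and calm
```

Because the three conditions are combined with a logical AND, a single missing or wrong reading is enough to keep the awning retracted, which is one reason why each sensor needs its own supervision.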
For this controller, end-to-end is from physical parameters like sunlight, temperature, or
windspeed to the physical extension of the awning. If we want to supervise this system from end
to end, we have to compare these three parameters with the awning position, which would mean
that we need a second system in parallel as a supervisor. Or we build the same system in parallel, which
takes over if the first fails, as in airplanes. This is called redundancy. NAS systems, for example, also often use RAID
controllers for that purpose. These systems use additional disks, which
are only needed if a primary disk fails. But what happens if the two systems report different
signals? Which one is right? To decide, we have to include a third system. Now they can decide two-to-one. And still, bad accidents can happen, as the
example of the 737 MAX showed. In my case, Wi-Fi, for example, clearly is
a single point of failure. If it fails, nothing works. Or my Raspberry Pi. If it crashes, my controllers no longer work. Or, of course, mains power is a single point
of failure, too. As a first step, we have to decide what happens
if a system fails and how much effort we want to invest in preventing it. For airplanes, fortunately, they invest a
lot. For other systems, we trust that they do not
fail, or if they do, the effect is minimal. This is the case with mains power in Switzerland. The chance of a power outage is minimal. If it happens, we meet with the neighbors
and drink a beer. Because the fridge is still cold, and power
usually will be restored before we have finished our beers. In the worst case, we have to open a second
or third bottle. Wi-Fi is already less stable in my case, especially
with sensors. They sometimes lose connection. None of my home automation systems is life-critical. So I decided to have no redundancy, just an
alarm. In most cases, the system can be manually
fixed in minutes. Therefore, I want to be alerted if Ethernet
and the internet still work, but a sensor, a gateway, the server, or an actuator
stops. How can we build such a system? If devices deliver sensor readings in a defined
interval, we have no problem. As soon as a sensor stops delivering, we know
something is wrong, and we have to create an alarm. This can quickly be done in Node-Red, for
example. We use the timeout node, connect it to the
sensor reading we want to supervise, and set its countdown to 2.1 times the expected interval
of the sensor. Then, after two lost sensor readings, this
node creates an alarm message and sends it via Telegram to my smartphone. Each sensor in my home has such a timeout
node and an alarm. If you want to know how to enable Telegram
with Node-Red, please watch video #270. This setup supervises more than just the sensor. For example, this sensor transmits its values
to a satellite, and I get them via the internet. So you can imagine how much can go wrong. The whole chain is covered by this simple timeout node. Here we would set the timeout to 2.1 days,
for example, because we expect at least one value per day. But what if our device has no sensor or does
not regularly deliver readings like my awning, which works on 433MHz? Frequent viewers remember that I hacked it
in video #209 and video #242. Now it is controlled by an ESP, which does
not create regular sensor readings. So I had to implement a heartbeat that sends
"I am OK" every minute. But what happens if this Raspberry crashes? Once it has crashed, it cannot send an alarm. So we need a second Raspberry that supervises
the main Pi. We call it a watchdog. It listens to regular events created by the
main Pi. For example, the MQTT broker on this Raspberry
creates a lot of messages. If no message arrives for a minute, we
know that the system is dead. Without such MQTT messages, we could create
a HeartbeatHUB that is sent every minute. On this watchdog Pi, I install only Node-Red. Of course, I use IOTstack to save time. You can also install it barefoot if you prefer. This flow supervises the events of the first
Pi using our standard procedure. After one minute without events, a Telegram
message is created. Cool. Now I am alarmed if my main Raspberry stops
working. But what happens if the watchdog Pi crashes? I would not discover the fact until too late. So we need a third Raspberry? And to supervise the third, a fourth? Maybe this is why they invented the Raspberry
clusters! Fortunately, this is not needed. We also can create a heartbeat on the second
Pi and use the first for supervision. Now the only cases in which I get no alarm are when both go
down simultaneously or when the network or mains is down. That is enough safety for me. Of course, you find a link to the Node-Red
flows in the video description. But, as usual on this channel, we want more. I do not want to use a Raspberry Pi running
the whole year just producing a heartbeat and, very rarely, an alarm. What is the cheapest possibility to get the
same effect? An ESP8266-01. The supervisor does not need any display or
outside connection. So I wrote a small sketch consisting of merged
example files from two libraries: PubSubClient to create a heartbeat and listen to the Raspberry's
incoming MQTT messages, and the Universal Telegram Bot Library to create the alarm messages. The logic is elementary. The loop creates a heartbeat MQTT message
and listens to the heartbeat from the main Pi. As soon as no heartbeat is heard, an alarm
is created. All we need. For a few cents. Let's put it into production. I could use a 5 V USB power brick, but I use
a simpler method: a Tenstar 3.3 V power supply and a 1000 µF solid polymer capacitor
to ensure we have no problem with the current peaks produced by Wi-Fi. Solid polymer capacitors are used in quality
power supplies because of their low equivalent series resistance, or ESR. Just a tiny tip: When comparing different
capacitors in my lab, I discovered that these Sanyo caps are fake. These tantalum caps, as well as the solid
polymer caps, have the rated capacitance, while these fake ones only have 50% or less. So pay attention and check your caps with
a simple transistor tester when you get them. I soldered the power supply as well as a header
for the ESP-01 on a standard experimental board. Because the ESP-01 has no pull-up resistor
for the enable pin, I added one as well as a small capacitor. This way, it boots once Vcc has been applied. If you use an ESP-12 board, these components
are already there. But how do I program my ESP-01s? I use this small board where I added a button
switch between GPIO0 and GND. If I want to program the chip, I press the
button while I insert it into the USB connector. If I want to run it, I just insert it without
pressing the button. Simple and cheap. Today, you even get boards with a built-in
button for no additional charge. As the last thing, I adapt the dimensions
of the configurable box presented in video #258, create a hole for the cable, and print
it. My watchdog is ready. You can place it wherever you want. It will work as long as it has Wi-Fi and power.

What do we have to remember?
- Ideally, we create end-to-end monitoring for all our essential systems
- After assessing the effects of a failure, we decide which single points of failure we can accept. If we cannot accept them, we have to invest in redundancy. If we can accept a failure, we only have to invest in a supervision system. Or we just accept it as I did with the mains power
- We make sure that our sensors regularly transmit values, or we create regular heartbeats for all other systems
- Using Node-Red, we can create simple workflows to create alarms for missing values using a timeout and a Telegram node. If a sensor no longer delivers values, or an actuator dies or loses its connection, an alarm is sent to our smartphone
- Using a second Raspberry, we can create an expensive watchdog to supervise our productive server
- To save space, energy, and money, we can create a similar watchdog using an ESP-01
- The ESP-01 can be programmed using a simple and cheap board
- Test your electrolytic capacitors when you get them. They may be fakes
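
The timeout rule from the points above (raise an alarm once about two expected messages are missing) can be sketched in a few lines of Python. This is not the Node-Red flow or the ESP sketch from the video, only an illustration of the logic. The class name `TimeoutMonitor`, the `on_alarm` callback, and the injectable `clock` are assumptions made for this example; the factor 2.1 is the one mentioned in the video.

```python
import time
from typing import Callable


class TimeoutMonitor:
    """Raises one alarm when no message arrived for factor * interval."""

    def __init__(self, expected_interval_s: float,
                 on_alarm: Callable[[str], None],
                 factor: float = 2.1,
                 clock: Callable[[], float] = time.monotonic) -> None:
        # 2.1 times the interval means: two readings in a row were lost.
        self.timeout_s = expected_interval_s * factor
        self.on_alarm = on_alarm      # e.g. a function sending a Telegram message
        self.clock = clock
        self.last_seen = clock()
        self.alarmed = False

    def feed(self) -> None:
        """Call this for every sensor reading or heartbeat received."""
        self.last_seen = self.clock()
        self.alarmed = False

    def check(self) -> bool:
        """Call this periodically; returns True while the source is overdue."""
        elapsed = self.clock() - self.last_seen
        overdue = elapsed > self.timeout_s
        if overdue and not self.alarmed:
            # Alarm only once per outage to avoid flooding the phone.
            self.alarmed = True
            self.on_alarm(f"No message for {elapsed:.0f} s")
        return overdue
```

The same class works for a sensor that reports once a day (interval of 86400 seconds) and for a one-minute heartbeat from the main Pi, because only the expected interval changes. To supervise two machines mutually, each one would run such a monitor fed by the heartbeat of the other.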
One last thing: You could use an internet service like AWS
or Google instead of the ESP-01. This would allow us to monitor our mains power
and our internet connection in addition to the rest. Maybe somebody can show us how this can be done? Or even create a service for that? As always, you find all the relevant links
in the description. I hope this video was useful or at least interesting
for you. If so, please consider supporting the channel
to secure its future existence. Thank you! Bye