How to Design a Network Messaging Protocol!

Captions

Here at hofor we suffer from a debilitating condition known as not-invented-here syndrome, which is why today I'm going to be enabling that for all you guys by teaching you how to design your own custom network protocol, so that you can write daemons that can do this, and this, and this.

The big picture for the protocol we're going to be designing today is that it is essentially just a structured way for the applications you write to talk to each other over some kind of transport. If you're familiar with the OSI seven-layer model, this would be somewhere between layers 6 and 7. What I mean when I say transport is something like a network socket or Unix socket in the desktop computing realm, or something like USART, SPI or Bluetooth Low Energy in the embedded realm. So essentially what we're going to be doing is taking that transport, that method of transferring data, and building an application-layer protocol on top of it.

As I said before, we all suffer from not-invented-here syndrome, so what's going to happen is you guys are going to design your protocol based on what's happening in this video, then you're going to develop applications and libraries for it, and then you're going to slowly expand the scope of your protocol because you've realized that you need to do that to add some functionality, and the cycle will repeat until you've essentially recreated HTTP. Let's dive in.

Hold your horses there, tiger. Before we define the packet structure that our applications are going to use to talk to each other over the network, you have to make some high-level architecture decisions up front: namely, what is the order of messages that your applications are going to send to each other, and what kind of response does the sender need from the receiver? I'll show you what I mean with an example. On the left I've got John D and on the right I've got Jenny D, and those are going to be my two programs in the chat. I've also got the message
number labeled here, and the first thing you can see is that John sends a request to Jenny. Pretty standard stuff, right? John wants Jenny to do something.

Now the first decision you need to make: what kind of response is John going to get from Jenny? Is he even going to get a response at all? There are some scenarios where you could get away with not sending a response, if your application requires low latency and is fault tolerant, meaning you can drop messages without much of a problem. I think some games do this in their netcode as well, where it doesn't matter if you miss one packet because you're sending a whole bunch really fast. You may not even bother sending back an acknowledgement at all and just leave John on read. Jenny could just fully leave John on read here. I know, RIP bozo, that's unlucky for him. But yeah, you could not send anything back and just have the request go, and John would have no idea whether the request was successful or not, and that might be fine for your application. So that's the first decision you're going to make: does he even get a response from Jenny?

Now, if you decide that he does need a response from Jenny, you need to decide on the level of response. Probably one of the most basic ones Jenny could send back is a delivery acknowledgement. John sends a request to Jenny and Jenny says, hey, I got your message and it's valid. Or you can send a negative acknowledgement here and say, no, your request was invalid.

Now you might be asking: well Matt, I know about TCP, and I know that ACKs and NACKs are in the TCP protocol, so why do we have to do them here? The answer is that in the real world, TCP sessions can sometimes be kept alive for longer than the application on the other end is responsive, in which case you might actually want acknowledgements and negative acknowledgements in your application protocol, just so you can make sure that the person on the other side
is alive, and that the TCP session is not some kind of zombie session. And besides, you might be using a transport where acknowledgements and negative acknowledgements are not part of the protocol, like SPI or UART or something like that, or UDP. You could implement it over UDP if you like. So that's one very basic thing, a delivery acknowledgement.

Now consider that John's request to Jenny is actually going to take a really long time to run. In this scenario, if you're having the acknowledgement be sent at the completion of the request, it would be sent way into the future, and that might be too long. So what you could do is have it be sent back immediately: John sends a request, Jenny sends back the acknowledgement immediately, and then as soon as Jenny's operation is complete she sends John a success or failure for the operation. Essentially, what that means is that John gets confirmation immediately that the message was correct and delivered, and then when Jenny's done doing her stuff she sends him a success or failure.

Now the fourth message I have down here is data. Maybe John's request needed Jenny to send him some data. What a lot of people will do is combine these ones right here, which is totally fine to do: John already knows that his message got delivered, and then you're just sending back the data along with the success or failure code. So essentially what we've got here is a situation where, in order, John sends something to Jenny, Jenny sends back an acknowledgement, and then Jenny sends back the data that John requested. And sometimes your application might not send back data. These are all decisions that you're going to have to make.

There's a more complex situation that we're going to delve into now, which is the situation in which John is a naughty boy and he's
cheating on Jenny with Jessica. Oh, tragic, right? He's such a bad guy. So what's going to happen here, you'll notice, is that John is actually talking to two programs at once. He sends Jenny a request and then he sends Jessica a request. Jenny sends back her acknowledgement and Jessica sends back her acknowledgement. The problem here is that Jessica is actually faster to send back her data than Jenny, so even though John's request to Jenny was sent first, he actually receives Jessica's data first, and John is going to have to be able to handle that. Or he might not have to handle that, if you're not going to be talking to multiple people at once.

So this is what I'm talking about with these high-level architecture decisions that are going to influence the design of your actual protocol. The decisions that you're going to have to make up front are: who is initiating the request, and why? What kind of response, what level of detail, do they need? Do they need multiple responses? And are they going to be talking to multiple people at once, out of sync? This kind of communication here is commonly referred to as asynchronous communication, which means that the responses are not necessarily received in the same order as the requests were sent, and not necessarily at the same time. You can see up here everything is in order, whereas down here things are kind of out of sync.

Once you've made these high-level decisions, we can actually get into the meat of building our protocol. All the transports that I mentioned earlier, by the way, so network sockets, Unix sockets, USART, SPI, Bluetooth Low Energy, they're all serial, and what that means basically is that every bit that is sent over that transport is received in the exact same order as it is sent. That's what serial means. You might have heard of parallel before. That's the complement to serial: it's where you can send the entire thing of
data at once and the receiver kind of reconstructs that, but the vast majority of transports these days use serial. I think the last one to use parallel was PCI, not even PCI Express but PCI, way back in the day.

Anyway, on a higher level, what this means is that the concept of serialization is going to be very important when you design your protocol. Essentially it means: how easy is it for your application, the sender, to take a chunk of data that you want to send and serialize it? That means putting all of the bytes in order, constructing a sequence of bytes that is then sent to the receiver. The receiver is going to do the complement of serialization, called deserialization (tongue twister, right?), where it takes the packet that was sent from the sender and deconstructs it back into the original data format. So if you have a class or a struct or something in your application that has information that you want to send, serialization is taking that object and assembling its data into a line, essentially, like this, and deserialization is taking that line and putting it back into the original format, so a struct or a class or an object or something like that.

The key thing to take away here is that whenever you're designing the structure of your protocol, make sure that it is easy to do the serialization and deserialization. If you come up with some really complex, hard-to-put-in-order structure that can change all the time, your handling logic on the sender and receiver is going to be a lot more complex, and you're going to have a bad time.

If you actually dive in and start writing a networked application, you'll realize pretty quickly that it is kind of tricky to make sure that it always reads the right number of bytes. So when you've got your serialized packet of all your data that
you want to send, how does the receiver actually know when that packet ends and another one begins? If your sender sends two packets incredibly quickly, your receiver might actually get both in one read call and assemble those two packets thinking that they're one packet. So how do you make sure that your protocol is designed in such a way that you read the right number of bytes when you do the read call?

There are two major methods that I'll talk about. I'm sure there are other ones, and you could probably think of some, but these are the two that I've come across in my time. The first is including the length of the packet in the message itself. That is, if your packet is 50 bytes, then at the start of your packet, in what's known as a header (we'll touch on headers in just a second), you would include 50, the length, so that once the receiver reads that, it knows how many more bytes it has to read to get the whole packet. IP packets do something like this.

Another common way that I see is a stop sequence. Essentially, you've got your packet, you've got your header, you've got your data, and then you have a stop sequence at the end that is explicitly defined as the end of the packet. For a very simple example, you have a byte that is denoted as the stop character, say 0xFF, which is 255 in decimal. You could say that whenever the receiver reads 255, that's the end of a packet, and from there it can say, okay, I'm going to start reading the next packet now.

There are a couple of benefits and drawbacks to both. You might be thinking, why would you ever do the stop sequence method? One of its innate drawbacks is that you lose a byte value that you're able to represent. If you have to denote one byte as the stop sequence, you can never actually include that byte explicitly in your data, because your receiver is going to think that's the end of the payload.
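The two framing methods described above can be sketched roughly as follows. This is a minimal illustration, not code from the video: the function names are hypothetical, the length prefix is assumed to be a single byte that counts only the payload, and the stop byte is the 0xFF value used as the example in the captions. Each splitter returns the complete packets found so far plus any leftover bytes, which a receiver would keep and prepend to its next read.

```python
STOP_BYTE = 0xFF  # example stop character from the captions (255 in decimal)


def frame_with_length(payload: bytes) -> bytes:
    """Prefix the payload with a one-byte length (assumed layout)."""
    if len(payload) > 255:
        raise ValueError("payload does not fit a one-byte length field")
    return bytes([len(payload)]) + payload


def split_length_prefixed(buffer: bytes) -> tuple[list[bytes], bytes]:
    """Return (complete payloads, leftover bytes) from a receive buffer."""
    packets = []
    pos = 0
    while pos < len(buffer):
        length = buffer[pos]
        end = pos + 1 + length
        if end > len(buffer):
            break  # the rest of this packet has not arrived yet
        packets.append(buffer[pos + 1:end])
        pos = end
    return packets, buffer[pos:]


def frame_with_stop(payload: bytes) -> bytes:
    """Terminate the payload with the stop byte."""
    if STOP_BYTE in payload:
        # The drawback from the captions: this value cannot appear in the data.
        raise ValueError("payload contains the stop byte")
    return payload + bytes([STOP_BYTE])


def split_stop_delimited(buffer: bytes) -> tuple[list[bytes], bytes]:
    """Return (complete payloads, leftover bytes) from a receive buffer."""
    *complete, rest = buffer.split(bytes([STOP_BYTE]))
    return complete, rest


if __name__ == "__main__":
    # Two packets sent back to back can arrive in a single read call.
    one_read = frame_with_length(b"hello") + frame_with_length(b"world")
    print(split_length_prefixed(one_read))
    print(split_stop_delimited(frame_with_stop(b"on") + frame_with_stop(b"off")))
```

With either method the receiver no longer depends on how the transport happened to chunk the bytes; it only depends on the framing rule both sides agreed on.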
So instead, with the length-of-the-message method, you can technically send arbitrary data as long as the length is correct. So why would you ever use the stop sequence method? It's because the stop sequence method is very easy to understand and very easy to implement as well, in simple communications protocols.

Remember, all of this stuff is seriously situation dependent. There are all of these different implementations, and I'm telling you about all of the options so that you can look at your specific scenario and your specific application and decide, okay, I'm going to do this method or I'm going to do that method. For example, you might have a situation where you don't need to send arbitrary payloads. If you're familiar with the ASCII character range, that's representing the English alphabet and some extra characters using one byte, and if your data is only going to be in the ASCII range, you can just have your stop byte be outside of the ASCII range and it doesn't even matter. So it's really all situation dependent, that's what I'm trying to get across here. With all of these options, you look at your situation and you apply them.

So this method of including the length in the message, I'll talk about that a little bit more. Almost every protocol has a header at the start of its packet. This is an IP header, and, as I just dubbed over because I misspoke, you'll notice right here the IP header actually includes the total length, as I mentioned before. But what you'll also notice is that the IP header is 20 bytes long. This section here, the options, can vary, but the actual main part of this header is 20 bytes long. The reason for that is so that, even though the data can vary in its length, which is specified here, there's a structure that the receiver can always draw upon to reconstruct the rest of the message
and understand it. So essentially, when you're designing your own protocol, you're going to have to think about what you want to put in the header. And technically, listen, technically you could have the header be a dynamic size as well, if, say, the first byte is the size of the header. There's so much leeway with your design. But typically your header will be a static size, and every section, every byte range in your header, will be defined as something. For example, a very simple protocol that you could design would have a message type in the header, it could have some options in the header, and it could have the offset of the data or the length of the message, something like that. That would be a very simple header. This one is a pretty complex header, because IP is a pretty involved thing, but if you're building a simple application protocol your header can be very short, and it can just contain metadata about the packet that you're going to be reading.

The next section of our packet is incredibly important, and this is the payload, or payloads, plural, depending on what you want to do. The payload of the message is literally just information in a structured format. Again, this is all about structure, guys, I can't stress that enough. As long as your receiver knows what format to expect, you can have it in whatever structure you want. You can do JSON, you can just do ASCII text, you can do YAML, you can do tokenized strings, you can do whatever you want. As long as you implement the code to deserialize that on the receiver end, your data can be in whatever format you want.

So the payload typically comes after the header and before the stop sequence, if you're going to use one. Or, if you're doing the message length method, you'd have the header with the length of the message and then just the data, and then your receiver would
read to the end of the data and go from there. So essentially what you're doing on the receiver end is deserializing twice. You take the high-level packet that you're sending, which contains the header, the payload, and the stop sequence if you've got one, and you deserialize the header and figure out what all the options are. Then you move on to deserialize the payload, which might be in a completely different format, like JSON or something like that, and from there you can decide what you want to be doing.

Now, one thing I'm going to briefly mention is checksums and CRCs. Essentially this is for error correction. If you send a message to the receiver and somewhere along the line, somewhere on the transport, that message gets corrupted, how does your receiver know that the message is corrupted, and how does it ask for a retry? This is where checksums and CRCs come in. I'm not going to get into all of the mathematical details here, but they are a mathematical representation of all of the data that is in your packet, and it's usually one way. You apply a one-way mathematical operation to the packet and you get two bytes in the case of CRC-16, or four bytes with CRC-32. It's usually very short, so you can just append it on the end. What happens is the sender calculates the CRC or checksum of the packet it's about to send and appends that to the end. The receiver receives that, calculates the checksum or CRC for the data it actually received, and then compares that to the CRC or checksum that the sender calculated and put at the end of the message. If the CRC or checksum differs, it's going to ask for a retry, and do that as many times as it takes until it gets it right.

Now, the reason why I say this is something I'll mention briefly is because some transports do this already, and if you're on this channel you're
most likely interested in network programming with Unix. TCP, IP and Ethernet already implement a whole bunch of checksums and CRC checks along the way to make sure that no errors are there. But if you're transmitting over something like UART or SPI in the embedded realm, or if you're doing something over RF, like radio frequency, you'd want to consider adding something like this, just to make sure that there's no corruption going on in your actual transport. This is something that's usually appended at the end, so if you were going to include one of these, your structure would be header, payload, CRC.

Okay, you guys have no idea how long it took me to come up with this. This is my magnum opus, this is my Mona Lisa. This is the culmination of my life's work, and it took me months, I'm talking months in a cave, no food, no water, just thinking. This is the tow protocol. It's incredible, it's revolutionary, it has a three-byte header and variable-length data. This is the tow protocol.

Okay, I just came up with it in like five minutes. It's really not that hard to come up with protocols once you know how to set things up. So this is the protocol that I came up with in five minutes. It's a three-byte header: you have a message type, you have the options for the message, and you have an end index. Then you have variable-length data, and the end of the data is specified by this byte here in the header.

So I'll give you an example. This is the true power of the tow protocol. I don't think you guys quite understand the gravity of what we've got here. You take this format, message type, options, end index and data, and you apply that in your application to mean something. Now consider a home automation
engine. I've actually built one, as I've shown you. It doesn't use the tow protocol; it uses a much more advanced one that I took a little while to design, and maybe I'll link the specification for that down in the description if you're interested. But with the tow protocol right here, let's say that our receiver and sender have decided that message type 0x01 means lights. So whenever this field right here is set to 0x01 (that's decimal one; we usually work in hex when we do these protocol things), that means I want to modify the lights. Our option here, 0xA1, we can make that mean apply immediately. You could also have an option that says apply in five minutes or whatever, but for this example we've said that we want 0xA1 to mean apply immediately, and our receiver understands that. In this section here we have our end index, or the length of the data, which is set to three, so we have a three-byte payload.

This text right here is actually just ASCII text. If you remember, way back when I was talking about protocols, I told you that the payload is simply a structured set of data, and I gave ASCII text as an example. Well, that's what we're doing right here. If you decode this hex value from ASCII, it literally just means "off".

So think about what our receiver can decode this entire message to be. The type is 0x01, which is lights. The option is 0xA1, which means apply immediately. There's a three-byte payload, and the payload is "off". So what the receiver can decode this to be is: modify the lights immediately and turn them off. This is what I've been talking about the whole time: as long as your receiver and sender agree on the structure, you can code them to do whatever they want. The sender will send this message, and once the receiver receives it and decodes it, it will understand that, okay, now that I've received this, I need to run the
routine that turns the lights off. You could apply this to so many other things. I also showed off how I got it to turn my computer off and turn it back on again. These protocol things are so exciting to me because there is so much you can do with them. It's literally just the limits of your imagination once you come up with it. And yeah, there are people that make the argument of, oh well, you don't have to make it yourself, you could always use something that someone's already made, and that's true. But I think this is a fantastic learning exercise, and I think it is so much fun to do, because you can literally say that you designed your own network protocol and wrote applications to do things, and having them do stuff is so satisfying.

So yeah, this is the power of the tow protocol. And of course you can add as many message types and option types as you want. As long as your receiver knows what they are, knows what they mean, and can decode them and then do the correct thing, you're all good. You can just go ahead and keep adding features.

So that's pretty much all I wanted to talk about today. In future, what I'm actually planning is that we're going to write some networked applications using a custom protocol. We might even use the tow protocol, who knows, I'll see what I feel like on the day. But I wanted to give an overview of designing your own protocol because it's so cool, it's so much fun, and I think it's a severely underrated learning exercise. For example, when I was doing the computer systems course at my university, they gave us the protocol that they wanted us to implement in our assignments. They came up with the protocol. But I think it's so much fun to be able to come up with your own protocol and then
implement that protocol, see what works, see what doesn't, swap it around, have fun with it. That's what we're all about: we're just goofing around and having fun, and the tow protocol. Yeah, goofing around and having fun. So that's really it for me. Hope you guys enjoyed. I have some coding stuff in the pipeline, like I said, networked applications coming very soon. So yeah, have a good one.
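The three-byte-header example from the captions (message type, options, end index, then variable-length data) could be serialized and deserialized along these lines. This is only a sketch under stated assumptions: the end-index byte is treated as the payload length, the payload is assumed to be the lowercase ASCII text "off", and every name in the code (`Message`, `serialize`, `deserialize`, `MSG_LIGHTS`, `OPT_APPLY_NOW`) is hypothetical rather than taken from the video.

```python
import struct
from dataclasses import dataclass

# Three unsigned bytes: message type, options, end index (used as data length).
HEADER = struct.Struct("!BBB")

MSG_LIGHTS = 0x01      # "modify the lights", per the captions' example
OPT_APPLY_NOW = 0xA1   # "apply immediately", per the captions' example


@dataclass
class Message:
    msg_type: int
    options: int
    data: bytes


def serialize(msg: Message) -> bytes:
    """Lay the message out as header bytes followed by the payload."""
    if len(msg.data) > 255:
        raise ValueError("payload too long for a one-byte end index")
    return HEADER.pack(msg.msg_type, msg.options, len(msg.data)) + msg.data


def deserialize(packet: bytes) -> Message:
    """Decode the fixed header first, then slice out the payload."""
    if len(packet) < HEADER.size:
        raise ValueError("packet shorter than the header")
    msg_type, options, length = HEADER.unpack_from(packet)
    data = packet[HEADER.size:HEADER.size + length]
    if len(data) != length:
        raise ValueError("packet truncated")
    return Message(msg_type, options, data)


if __name__ == "__main__":
    wire = serialize(Message(MSG_LIGHTS, OPT_APPLY_NOW, b"off"))
    print(wire.hex())  # 01a1036f6666
    msg = deserialize(wire)
    if msg.msg_type == MSG_LIGHTS and msg.options == OPT_APPLY_NOW:
        print("lights:", msg.data.decode("ascii"))
```

The receiver works in two stages, as described in the captions: the fixed-size header is decoded first, and only then is the payload interpreted, here as ASCII text.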
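The checksum step described in the captions (sender appends a CRC, receiver recomputes and compares) might look like the sketch below. It uses CRC-32 from Python's standard `zlib` module, giving the four-byte trailer mentioned for CRC-32; the function names and the big-endian trailer layout are assumptions made for illustration, and asking for a retry is left to the caller.

```python
import struct
import zlib
from typing import Optional

CRC_TRAILER = struct.Struct("!I")  # four-byte CRC-32, big-endian (assumed layout)


def add_crc(packet: bytes) -> bytes:
    """Sender side: append the CRC-32 of the packet to its end."""
    return packet + CRC_TRAILER.pack(zlib.crc32(packet))


def check_crc(frame: bytes) -> Optional[bytes]:
    """Receiver side: return the packet if the CRC matches, else None."""
    if len(frame) < CRC_TRAILER.size:
        return None
    body, trailer = frame[:-CRC_TRAILER.size], frame[-CRC_TRAILER.size:]
    (expected,) = CRC_TRAILER.unpack(trailer)
    return body if zlib.crc32(body) == expected else None


if __name__ == "__main__":
    frame = add_crc(b"\x01\xa1\x03off")
    print(check_crc(frame))                       # intact frame: packet returned
    corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]
    print(check_crc(corrupted))                   # one flipped bit: None
```

A `None` result is the point at which a receiver would send a negative acknowledgement or otherwise ask the sender to retry.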

Info

Channel: hoff._world

Views: 12,021

Keywords: linux, networking, protocols, internet, tcp, communication, embedded systems, network sockets, sockets, software engineering, programming, servers, computers, spi, usart, uart, udp, unix, unix sockets, crc, checksum, error correction, ip

Id: kH7P1ZX44DQ

Length: 24min 14sec (1454 seconds)

Published: Sat Feb 03 2024