Real-time Transport Protocol (RTP)

Captions

Welcome to e-PG Pathshala in computer science. This is a series of lectures on computer networks. We have been looking at some protocols which are useful for multimedia applications. We looked at network layer protocols and at quality of service, and then we started looking at multimedia applications, their characteristics, and what they typically demand from the network; that is, what kind of support from the network would be useful for these applications. To that end we looked at a couple of applications: what is typically required for streaming media content, either audio or video, and for interactive applications. Based on that, we will now be looking at some of the protocols that have been designed to support these kinds of applications.

Today we will be looking at RTP, the Real-time Transport Protocol. We will look at the design philosophy, the packet format, and other details of this protocol.

We said that a number of protocols have been developed to support real-time, multimedia kinds of applications. We have protocols used for stream description, like SDP, SMIL and so on. Then we have stream control protocols, which are used to remote-control a streaming session: the RTSP protocol. Then we have session handling protocols like SIP, SAP, SDP and so on. Then we have the media transport protocols, RTP and RTCP. This is a pair of protocols that are normally used together: RTP provides the data transport and RTCP is the associated control protocol, so the pair is used to send data and metadata. We will be looking at some of these protocols, and we will start today with the transport-related ones, RTP and RTCP.

If you look at the protocol stack, so far we have looked at it with respect to a general class of applications, where we normally talk about TCP and UDP running over IP, running
over some layer-2 protocols. Now, in order to support multimedia kinds of applications, a few other protocols have been developed. Some of these come under the category of signaling protocols: you have H.323, SIP (the Session Initiation Protocol) and so on, which are used to signal information between the client and the server. Then you have a streaming protocol, RTSP. Then we have RSVP; we have already looked at RSVP, the reservation protocol, which is used to provide quality of service. RSVP may run either on top of UDP or directly on top of IP itself.

In addition to that, for media transport we have RTP. RTP is typically used along with media encapsulation standards such as H.261, MPEG, or whatever: these application-level codecs work using RTP, which runs on top of UDP, and the data is carried as IP packets. In addition to RTP we again have RTCP, which aids in providing quality of service. It is not a quality-of-service protocol as such, but it carries information which the applications may use to improve, or at least to know, the quality of service. It is this pair, RTP and RTCP, that we will be looking at: first RTP, and RTCP in a subsequent session.

Moving on, the question that normally comes up is: why these additional protocols? After all, we had TCP and UDP, so why could we not just make do with them?

Why not TCP? We have already seen that in TCP, whenever there is a loss, there is a retransmission mechanism, and retransmissions cause delay; delays are not acceptable when we talk of real-time applications. There is also continuous rate fluctuation, which is not very comfortable for multimedia applications.

And why not UDP? After all, it does not have these problems. It is too simple, and typically when you have a firewall, or when you have a NAT
for instance, UDP packets may be dropped for security reasons. So UDP by itself is also not really suitable.

So what do we do? We try to come up with something additional, but which makes use of UDP and TCP, because these are two mature, time-tested protocols which have been running very successfully in the Internet. We want to make use of them, but do something over and above them, or use them for certain purposes.

What do we use them for? UDP is normally used to carry pure audio streaming, for instance in audio and voice-over-IP kinds of applications, and along with UDP we can use RTP and other protocols which take care of the requirements of the multimedia application as such. TCP can be used for streaming whenever you have a large buffer which can compensate for delay: even though you have large delays in TCP, large buffers can compensate for them. For one-way applications, for example watching video clips on YouTube, you may not mind waiting a few seconds before the video starts, so in those kinds of applications we can make do with TCP. But it is definitely not okay for interactive applications like video conferencing. So we definitely need something more, which is why we are moving towards protocols like RTP and RTCP.

So what is RTP? It is a real-time transport protocol meant for transporting real-time data. You can say it is a framing protocol for real-time applications. It definitely does not define any QoS mechanism for real-time delivery, so please do not confuse RTP with a QoS protocol; it is basically a framing protocol for real-time applications. RTCP is a companion control protocol to RTP, used for carrying control information related to the RTP session. So again, this
does not guarantee anything. It just provides feedback, in the form of reports, and using these reports the application has to take the necessary steps. So RTP and RTCP are protocols which help the application learn something about the network and adapt its behavior accordingly. That is the purpose of RTP and RTCP. Now we will look at RTP in detail.

If you take this pair of protocols, RTP and RTCP, what has been the goal, or the philosophy, behind their design?

One of the first things kept in mind was that the protocol has to be flexible. You want to provide mechanisms that cater to many different types of applications, but you do not want to dictate any algorithms. You should be able to instantiate it for H.261, MPEG-1, MPEG-2, or whatever comes, without any change to the basic protocol.

Second, it should be protocol neutral, in the sense that it should work with different lower-layer protocols: with UDP/IP, or with private ATM networks, and so on. Of course, now it is predominantly UDP/IP, but whatever other protocol comes in the future, we should be able to handle that too.

Third, it should be scalable, and it should handle unicast as well as multicast. Remember that many multimedia applications are conferencing kinds of applications, audio conferencing, video conferencing and so on, so there is definitely a need for multicast support: you have many receivers who receive a single piece of multimedia data.

What we normally also try to do is separate the control and the data, so that the control functions can be taken care of by different protocols if necessary. Separating control and data always gives us a little more control
over how the entire process takes place.

Now, coming to the requirements of the RTP protocol: what is it that we wanted to do? If we identify the requirements, then we can see whether they are all met by the information carried in the protocol format.

First, we need to take care of both types of applications: interactive multimedia applications as well as streaming applications. Interactive multimedia applications, we have seen, have strict real-time constraints; streaming applications are not so strict. Both must be handled by RTP, so we need a mechanism flexible enough for both.

Second, one of the major goals is that it should allow similar applications to talk to each other; that is, they should be able to negotiate coding schemes if necessary. The applications may be similar, but one may be connected to a low-bandwidth network and the other to a high-bandwidth network. In that case they should be able to negotiate certain parameters, such as coding schemes, so that they can still run and communicate with one another. The protocol must have the flexibility to support this kind of negotiation.

Third, it should help recipients determine the timing relations among data. When you talk of real time, obviously we are very much concerned about the timing relationship between successive pieces of data received. That means we need some kind of timestamping, some mechanism for synchronization of data, especially when multiple media are being transferred. When audio and video are being transferred together, you need things like lip synchronization, which is very important. If you do not have lip
synchronization, and you view, for instance, a program where lip synchronization is missing, you hear one thing but see something else. It is not a very nice viewing experience for the user. So we need mechanisms to support synchronization.

Further, we know that there will be packet losses. As we said, RTP runs on top of UDP, and UDP does not take care of packet losses, so the application has to handle them. We therefore need some indication to be given to the application that a packet has been lost: there must be some mechanism by which the application can identify that a loss has happened.

Next, you must provide some indication of frame boundaries. The reason is that different applications have different kinds of boundaries. For instance, in audio we talk about talk spurts: there is a time when continuous speech is happening, and then there may be silent periods. So when you are transmitting data, you have to indicate that this is the beginning of a talk spurt, or the end of a talk spurt, and so on. Similarly, with video data you may be sending different types of frames; we talked about I-frames, P-frames, B-frames and so on. Different applications have different requirements for frame boundaries, and we must be able to support them.

You must also be able to provide a generic identity for senders, independent of the IP address, because we are talking about something happening above the IP layer. So we want a generic identity for senders, independent of the IP address, using which you will be able to identify who the source is,
who the recipient is, and so on.

All of this we want to do without too much header overhead. You want to keep the header very simple so that processing can be done really fast. Remember that many of the packets may be very small: when you are sending interactive audio, for instance, the data carried in each packet may be very small, in which case too large a header makes the header overhead too high and you will not be using your bandwidth effectively.

These are the requirements that we would like to meet when we design the RTP protocol. Let us look at how they are met.

There are two key ideas used in the design of RTP. The first is called application-level framing. What this means is that you let the application figure out how it wants to frame the data, how it wants to format the data, and so on. The idea is that the application knows best what it needs, so let it decide for itself. This means different applications can have different profiles and different formats, so the protocol must be able to accommodate different profiles and formats, chosen on the basis of the application. The advantage is that you get application-specific flexibility.

The second idea is what is called the end-to-end principle. What this means is that the end systems take responsibility for providing the service, irrespective of the network's capabilities. This is in tune with the Internet principle, where we say that the network can be simple, or dumb, and all the intelligence is essentially at the end systems. Similarly, here we are saying
the intelligence will be at the applications, and RTP will just provide a transport service, with enough information carried in it that the end machines can figure out what needs to be done. With these two key ideas the entire protocol has been designed.

If you look at the features of RTP, it is specified in RFC 3550. As I said, RTP runs on the end systems, following the end-to-end principle; there is no RTP running anywhere inside the network, meaning at the routers. RTP packets are encapsulated in UDP segments; that is another decision that was taken. RTP manages the delivery of real-time data: it specifies the packet structure required for carrying audio and video data. And it provides for interoperability, which means that if two VoIP applications run RTP, they can work together and talk to each other.

Now, in order to do all this, what other features are built into RTP? If you take a look at an RTP packet, you will see that it provides payload type identification. Why do I need that? To identify what kind of data is actually being carried: whether it is audio or video, what type of audio or video, and so on. There is a source identifier; remember, we said we wanted to identify who the source is. We said there could be loss, so in order to detect losses we have a sequence numbering mechanism: by looking at the sequence numbers you can tell that something has been lost. Then we said timing recovery is important, so a timestamping mechanism is used, and all RTP packets will have
a timestamp. And there are markers for significant events in the data stream. We said that we want to have different frame boundaries, so if I want to mark something, markers are provided, and these markers can be defined by the application. So you get application-specific marking, or application-specific frame boundaries.

If you look at the functions as such, you have data functions, which are basically about labeling the content: source identification, loss detection, and resequencing. Then you have timing-related information. With it you can do intra-media synchronization, that is, synchronization within a media stream. Within a stream you could have problems like jitter; jitter, you remember, is variation in delay. Because of variation in delay you get a very unstable picture, or you do not hear the audio properly. We know that with playout buffers, by delaying the playout point, you can take care of this, but to do so you need some idea of the jitter: you must be able to calculate it and identify what is happening. RTP supports these functions by means of the timing data that it carries.

Then we can also do inter-media synchronization. When you have two different media, say audio and video, being carried together from a single source, you may want to provide lip synchronization between the audio and the video. RTP allows you to do that as well.

As we said, RTP runs on top of UDP. You have your application layer here and the IP layer here; UDP is on top of IP, and RTP sits above UDP and below the application. So you can say that with RTP the transport layer is extended: you have UDP plus RTP
providing the transport layer functions for real-time applications. This combination of RTP plus UDP gives you port numbers and IP addresses, using which you can uniquely identify the application; it gives you payload type identification; it gives you packet sequence numbering; and it gives you timestamping.

There is no fixed UDP port used for RTP; it is normally negotiated out of band. Since a companion protocol, RTCP, is used with RTP, the UDP port used for RTCP is normally the RTP port number plus one: if you are using 5000 for RTP, 5001 will be used for RTCP. RTP also has the constraint that usually only one medium is sent per RTP session, that is, per port pair: one RTP/RTCP combination carries one medium.

Now, if you look at the header format, a lot of things become clearer. First there is a version field. Then there is a P field, called the padding bit. Then there is an X field, which indicates extension headers, and a CC field, which is a count of contributing sources. Then you have an M field, which is a marker bit, and a payload type field. Then we have the sequence number, the timestamp, and the synchronization source ID, SSRC, which is the source ID that we talked about. Then you have what are called contributing source IDs: if the data of more than one source is being combined, that information is embedded here. And then there is a provision for extension headers that can be added to it.
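
The header layout described in the lecture can be sketched in code. The following is a minimal illustration rather than a complete implementation: it assumes the RFC 3550 field widths (2-bit version, 1-bit P, 1-bit X, 4-bit CC, 1-bit M, 7-bit payload type, 16-bit sequence number, 32-bit timestamp, 32-bit SSRC, then CC 32-bit CSRC entries), and the names `RtpHeader` and `parse_rtp_header` are made up for this example. It does not interpret extension headers or padding.

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    """Fields of the RTP header (illustrative container, not a library type)."""
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrcs: Tuple[int, ...]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the 12-byte fixed header and the CSRC list of an RTP packet."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header needs at least 12 bytes")
    # Network byte order: two single bytes, 16-bit seq, 32-bit ts, 32-bit SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F                      # count of contributing sources
    end = 12 + 4 * cc
    if len(packet) < end:
        raise ValueError("packet too short for its CSRC list")
    csrcs = struct.unpack("!%dI" % cc, packet[12:end])
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=cc,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrcs=csrcs,
    )


# Example: version 2, no padding/extension, no CSRCs, payload type 0.
example = struct.pack("!BBHII", 0x80, 0x00, 1, 160, 0x12345678)
header = parse_rtp_header(example)
print(header.version, header.payload_type, header.sequence_number)
```

Since RTP has no fixed port, a receiver would get the packet bytes from whichever UDP socket was negotiated for the session and pass them to a parser of this kind.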
This extension header will normally be added based on the application's requirements. Then you have the actual RTP payload header and the payload: depending on the payload type, you will have a header here and the actual payload data following it. Of these, the first three rows, each of which is four bytes, so three times four, 12 bytes, form the standard part of the RTP header; the rest is added depending on the situation and the application.

Looking at each of these fields now: the version is a version identifier. P is the padding bit, which is set whenever the RTP payload is padded, because we want the entire packet to be a multiple of four bytes. When padding is used, the last byte of the padding gives the number of bytes that were padded, so you can count back, find the extra bytes that have been added, and remove them. The X bit is set whenever an extension header is added. As we said, CC is the count of contributing sources, M is a marker bit, PT is the payload type, and the sequence number is used to detect missing and misordered packets.

M and PT are both defined by the application profile. We have what are called application profiles, which specify what the format of M and PT should be, what should be in the payload header, and so on.

The payload type, as we said, indicates the type of encoding currently being used. It is seven bits. If the sender changes the encoding during the call, the sender informs the receiver through this payload type field. This means that even during the call, if there is a drop in bandwidth, you can accommodate the change. So you have different payload types: zero
indicates PCM mu-law, three is GSM at 13 kilobits per second, and so on; you can see that different data rates are supported using the different payload types. The sequence number field is a 16-bit field. It is incremented by one for each RTP packet sent, so you can use it to detect packet loss and to restore the packet sequence. That was the first row of the header.

The second row is the timestamp. This is used, as we said, to enable different multimedia streams to be synchronized. It is actually a counter of ticks. What exactly a tick is depends on the application: the granularity of the ticks is defined by the application, so it is an application-specific timestamp. The timestamp in a particular packet refers to the time at which the first sample in that packet was generated, and the difference in timestamps can be used for synchronization purposes. That is the purpose of the timestamp field.

Then you have the synchronization source ID, which uniquely identifies the source of the RTP stream. The contributing source IDs are used when a number of RTP streams pass through a mixer: the mixer becomes the SSRC, the synchronization source, and the IDs of the contributing sources are listed here.

Coming back to the timestamp field: it is a very important field, and we need to understand clearly how it is used. The timestamp field is 32 bits long and, as I said, it gives the sampling instant of the first byte, or first sample, in that RTP data packet. Take audio, for example. Say I am sampling with an 8 kilohertz sampling clock, which means that once every
125 microseconds a sample is taken. The timestamp clock increments by one for each sampling period, that is, once every 125 microseconds. Suppose the application generates chunks of 160 encoded samples; that is, 20 milliseconds of samples are sent as one chunk, and in 20 milliseconds I get 160 encoded samples. This means the timestamp in successive packets is incremented by 160: if the previous packet had a timestamp of X, the next RTP packet carries the timestamp of its own first sample, which is X plus 160. That is the value you will find in the timestamp field, and it tells you exactly what the timing relationship between these two packets is.

The sequence number field, however, is incremented only by one. The sequence number tells you that this packet follows that packet, and the timestamp tells you how many samples are being carried. From the two you can find the timing relation between the packets and play them back at the appropriate rate.

The timestamp clock continues to increase at a constant rate even when the source is inactive. We talked about silent periods: if you have talk spurts and then a silent period, during the silent period you may not be transmitting data, but you are still sampling. The value is zero, so you do not transmit it, but the timestamp continues to be incremented. That is how the silent periods can also be replayed at the other end. That is the beauty of how the timestamp field is used.

The SSRC field that we talked about is a 32-bit field. It is used to identify the source
of the RTP stream, and each stream in an RTP session has a distinct SSRC.

Here is a small example to understand this. Say we are sending 64 kilobit per second PCM-encoded voice over RTP. The application collects the encoded data in chunks, for example every 20 milliseconds, so you have 160 bytes in a chunk. We are assuming one byte per sample: there are 160 samples at one byte each, so each chunk is 160 bytes. This 160-byte chunk plus the RTP header forms the RTP packet, which is encapsulated in a UDP segment and sent. The RTP header indicates the type of audio coding used in the packet; the sender can also change the encoding during a conference. The RTP header also contains sequence numbers and timestamps, and using these the receiver is able to replay the data. This is the overall flow.

In addition to this, there are a few other interesting things that one can do with RTP. For instance, you could multiplex different sessions and use the SSRC as the multiplexing ID to separate them out. Let us say we have a situation like this: an office at this end of a telephone network and another at that end, that is, two offices at two different locations, connected by means of a voice-over-IP network. You have an IP network through which, using voice-over-IP gateways, you want to transfer data from the office here to the office there. Each office may be connected to a public telephone network, there is a voice-over-IP gateway on either side, and there is an IP cloud, your IP network, in between. Now let us assume that these are branch
offices. Between the two branch offices you may have many phone calls. What we do is carry each of them using a different SSRC ID within the same RTP session. By using the different SSRC IDs, the gateway here is able to identify that they are different streams, separate them out, and send each on to the corresponding user on either side. So multiplexing using the SSRC becomes a very efficient way of combining many voice-over-IP calls and sending them as a single session. You have only one RTP session between the voice-over-IP gateways; though there are multiple calls to be handled, you are able to send the data with a single RTP session. This is one example of how RTP is used.

Similarly, you have other devices called RTP mixers. These are normally used when you want to combine the streams from different sources. I have different sources, so I have different SSRCs: one, two, three and so on. For instance, these may be connected on a high-bandwidth network, while the receiver may be connected to a low-bandwidth network. In that case the mixer, which comes in between, can combine the data coming from the sources and play it out at a rate that the receiver can handle. The mixer then becomes the SSRC, and SSRCs one, two and three become the contributing sources; we talked about CSRCs. So in this situation, when data is sent from the mixer to the receiver, an RTP session is established between them, and for this session the original SSRCs become the CSRCs and the mixer becomes the SSRC which is actually
handling this session.

So there are many uses of the RTP protocol, and many ways of using the information it carries to distinguish between sessions, to carry data appropriately, and so on. Of course, along with RTP we also have RTCP, and it is this combination that gives a lot of strength to the multimedia session management, or transport management, that we are talking about. We will look at RTCP later, but with this idea of RTP you should be able to see how the protocol is designed so that the required features are provided.

There are some interesting facts about the RTP packet format. You can see that there is no protocol field, so how do I identify that a packet is RTP? You have to check the version field, or check some fields across the stream of packets. You may have to do something like per-flow checking: look for a constant SSRC and increasing sequence numbers to identify a flow, and use that to validate whether it is a proper RTP packet.

To summarize what we have covered in this session: we started with the design goals of RTP and looked at the fundamental principles of its design. We looked at the protocol message format and saw how the different fields help in carrying RTP sessions, and we also saw some uses of RTP for carrying various sessions. We will be looking at RTCP later. This was a quick introduction to RTP. Thank you.
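
The timestamp and sequence number arithmetic from the lecture's audio example can be worked through in a few lines. This is a sketch under the lecture's assumptions (8 kHz sampling, one byte per sample, 20 ms of audio per packet); the function names are invented for illustration, and the modulo operations reflect the 16-bit sequence number and 32-bit timestamp field sizes.

```python
SAMPLE_RATE_HZ = 8000          # 8 kHz sampling clock: one tick every 125 microseconds
PACKET_MS = 20                 # audio carried per packet
BYTES_PER_SAMPLE = 1           # 64 kbit/s PCM: 8 bits per sample

SAMPLES_PER_PACKET = SAMPLE_RATE_HZ * PACKET_MS // 1000      # 160 samples
PAYLOAD_BYTES = SAMPLES_PER_PACKET * BYTES_PER_SAMPLE        # 160 bytes


def next_header_values(seq: int, ts: int) -> tuple:
    """Sequence number goes up by 1 per packet, timestamp by samples per packet."""
    return (seq + 1) % 2**16, (ts + SAMPLES_PER_PACKET) % 2**32


def lost_between(prev_seq: int, cur_seq: int) -> int:
    """Number of packets missing between two received sequence numbers."""
    return (cur_seq - prev_seq - 1) % 2**16


def gap_ms(prev_ts: int, cur_ts: int) -> float:
    """Playout time between the first samples of two packets, in milliseconds."""
    return ((cur_ts - prev_ts) % 2**32) * 1000 / SAMPLE_RATE_HZ


seq, ts = 100, 5000
seq2, ts2 = next_header_values(seq, ts)
print(seq2, ts2)                  # consecutive packet: seq + 1, ts + 160
print(gap_ms(ts, ts2))            # 20 ms between the two packets

# After a silent period the sequence number is still consecutive,
# but the timestamp has jumped, so the receiver can replay the silence.
print(lost_between(seq2, seq2 + 1), gap_ms(ts2, ts2 + 5 * SAMPLES_PER_PACKET))
```

A consecutive sequence number with a larger-than-usual timestamp gap indicates silence rather than loss, whereas a gap in sequence numbers indicates that packets went missing.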

Info

Channel: Vidya-mitra

Views: 20,558

Rating: 4.9538903 out of 5

Id: l71kGy69S8g

Length: 30min 30sec (1830 seconds)

Published: Wed Nov 30 2016