Understanding Session Description Protocol (SDP)

It’s impossible to truly understand SIP without understanding its cousin, Session Description Protocol (SDP).  While SIP deals with establishing, modifying, and tearing down sessions, SDP is solely concerned with the media within those sessions.  That SIP would relegate media to another protocol is not accidental.  The creators of SIP set out to make it media agnostic and this separation of church and state reinforces that.   SIP does what it does best and leaves media to SDP.

So, what is SDP?  Well, it’s exactly what its name says it is.  It’s a protocol that describes the media of a session.  It is important to realize that it doesn’t negotiate the media.  It isn’t used by SIP clients to go back and forth asking “can you do this?” before finally settling on a common codec like G.711.  Instead, one party tells the other party, “here are all the media types I can support; pick one and use it.”

SDP is made up of a series of <character>=<value> lines, where <character> is a single case-sensitive alphabetic character and <value> is structured text.  No whitespace is allowed on either side of the = sign.
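
To make the line format concrete, here is a minimal Python sketch that splits one SDP line into its type character and its value. The function name parse_sdp_line is made up for this illustration, and the check is deliberately loose: it only verifies the general shape of a line, not the rules for each individual field.

```python
def parse_sdp_line(line: str) -> tuple[str, str]:
    """Split one SDP line into (type character, value).

    Hypothetical helper for illustration; it checks only the
    <character>=<value> shape, not the per-field grammar.
    """
    line = line.rstrip("\r\n")
    # One ASCII letter, then "=", then the value (which may be empty here).
    if (
        len(line) < 2
        or line[1] != "="
        or not (line[0].isascii() and line[0].isalpha())
    ):
        raise ValueError(f"not an SDP line: {line!r}")
    return line[0], line[2:]


if __name__ == "__main__":
    print(parse_sdp_line("m=audio 49170 RTP/AVP 0 8 97"))
```

Everything after the first = belongs to the value, so an attribute such as a=rtpmap:0 PCMU/8000 comes back as the type "a" and the value "rtpmap:0 PCMU/8000".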

SDP consists of three main sections – session, timing, and media descriptions.  Each message may contain multiple timing and media descriptions, but only one session description.

The definitions of those sections and their possible contents are as follows.  It’s important to know that not every line will be present in every SDP message; the lines marked with an asterisk (*) are optional.

Session description

v=  (protocol version number, currently only 0)

o=  (originator and session identifier : username, id, version number, network address)

s=  (session name : mandatory with at least one UTF-8-encoded character)

i=* (session title or short information)

u=* (URI of description)

e=* (zero or more email addresses with optional name of contacts)

p=* (zero or more phone numbers with optional name of contacts)

c=* (connection information—not required if included in all media)

b=* (zero or more bandwidth information lines)

One or more Time descriptions (“t=” and “r=” lines; see below)

z=* (time zone adjustments)

k=* (encryption key)

a=* (zero or more session attribute lines)

Zero or more Media descriptions (each one starting with an “m=” line; see below)

Time description (mandatory)

t=  (time the session is active)

r=* (zero or more repeat times)

Media description (if present)

m=  (media name and transport address)

i=* (media title or information field)

c=* (connection information — optional if included at session level)

b=* (zero or more bandwidth information lines)

k=* (encryption key)

a=* (zero or more media attribute lines — overriding the Session attribute lines)
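
The layout above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not a full parser: the function name split_sdp is hypothetical, the time description lines are simply kept with the session-level lines, and the only rule applied is that each m= line starts a new media description that runs until the next m= line or the end of the message.

```python
def split_sdp(sdp: str) -> tuple[list[str], list[list[str]]]:
    """Return (session-level lines, list of media descriptions).

    Hypothetical helper: t= and r= lines stay with the session-level
    lines, and every m= line opens a new media description.
    """
    session: list[str] = []
    media: list[list[str]] = []
    for raw in sdp.splitlines():
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines between SDP lines
        if line.startswith("m="):
            media.append([line])      # start a new media description
        elif media:
            media[-1].append(line)    # belongs to the current media description
        else:
            session.append(line)      # everything before the first m= line
    return session, media


if __name__ == "__main__":
    sample = "v=0\ns=Demo\nt=0 0\nm=audio 49170 RTP/AVP 0\na=rtpmap:0 PCMU/8000\n"
    session_lines, media_blocks = split_sdp(sample)
    print(session_lines)
    print(media_blocks)
```

This is also why a c= or a= line can mean different things depending on where it appears: before the first m= line it applies to the whole session, and after an m= line it applies only to that media description.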

For Example

The following is an example of an actual SDP message.

v=0

o=Andrew 2890844526 2890844526 IN IP4 10.120.42.3

s=SDP Blog

c=IN IP4 10.120.42.3

t=0 0

m=audio 49170 RTP/AVP 0 8 97

a=rtpmap:0 PCMU/8000

a=rtpmap:8 PCMA/8000

a=rtpmap:97 iLBC/8000

m=video 51372 RTP/AVP 31 32

a=rtpmap:31 H261/90000

Unless you’ve been working with SIP and SDP for a while, this probably looks pretty undecipherable.  However, it’s really not that bad if you know what to look for and what you can safely ignore.  This is what I pay attention to in an SDP message.

c=  This tells me the IP address for the media: where it will come from and where it should be sent.

m=  There will be a media line for each media type.  For example, if your client can support real-time audio there will be an m=audio line.  If your client can support real-time video there will be a separate m=video line.  Each media line gives the port, the transport, and the payload type numbers of the codecs, which are then defined in attribute lines.

a=  There will typically be an attribute line for each codec advertised in the media line.

Looking at the example above I immediately see this.

The client will use IP version 4 with an address of 10.120.42.3.  It can support three audio codecs and one video codec with an attribute line.  The audio codecs are G.711 u-law (PCMU), G.711 A-law (PCMA), and iLBC.  The audio codecs will use port 49170 and all have a sample rate of 8000 Hz.  The video codec is H.261 on port 51372.  (The video line also lists payload type 32, the static payload type for MPEG video, which has no attribute line of its own here.)  99.9% of the time I can safely ignore any of the other SDP values that might be present.
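
The same reading can be done mechanically. The Python sketch below pulls out only the c=, m=, and a=rtpmap lines from the example and ignores everything else. The names EXAMPLE_SDP and summarize_sdp are invented for this illustration, and the code assumes well-formed input rather than validating it.

```python
EXAMPLE_SDP = """\
v=0
o=Andrew 2890844526 2890844526 IN IP4 10.120.42.3
s=SDP Blog
c=IN IP4 10.120.42.3
t=0 0
m=audio 49170 RTP/AVP 0 8 97
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:97 iLBC/8000
m=video 51372 RTP/AVP 31 32
a=rtpmap:31 H261/90000
"""


def summarize_sdp(sdp: str) -> dict:
    """Collect the connection address and the codecs per media line.

    Hypothetical helper; assumes well-formed SDP and skips other lines.
    """
    summary = {"address": None, "media": []}
    for raw in sdp.splitlines():
        line = raw.strip()
        if len(line) < 2 or line[1] != "=":
            continue
        kind, value = line[0], line[2:]
        if kind == "c":
            # c=<network type> <address type> <address>
            address = value.split()[2]
            if summary["media"]:
                summary["media"][-1]["address"] = address  # media-level c=
            else:
                summary["address"] = address               # session-level c=
        elif kind == "m":
            # m=<media> <port> <transport> <payload types...>
            media_type, port, transport, *payloads = value.split()
            summary["media"].append({
                "type": media_type,
                "port": int(port.split("/")[0]),  # tolerate "port/count"
                "transport": transport,
                "payloads": payloads,
                "codecs": {},
            })
        elif kind == "a" and value.startswith("rtpmap:") and summary["media"]:
            parts = value[len("rtpmap:"):].split(None, 1)
            if len(parts) == 2:
                payload, encoding = parts
                summary["media"][-1]["codecs"][payload] = encoding
    return summary


if __name__ == "__main__":
    result = summarize_sdp(EXAMPLE_SDP)
    print(result["address"])
    for m in result["media"]:
        # Payload types without an rtpmap line (32 in this example)
        # appear in "payloads" but not in "codecs".
        print(m["type"], m["port"], m["payloads"], m["codecs"])
```

Keeping the payload list and the rtpmap entries separate makes it easy to spot a payload type that is advertised on the m= line but not described by an attribute line.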

After receiving a SIP message with the above SDP in the message body, the recipient will respond with SDP of its own identifying its IP address, ports, and codec values.  The recipient will also pick from the list of the sender’s codecs which ones it will use and potentially start real-time media flows.  The unwritten rule of SDP is that if possible you use the first codec of a type listed, but you don’t have to.  If the sender says he can do something, he had better be prepared to handle media of that type no matter where in the list it appeared.
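
The "first codec listed" convention is easy to sketch. The Python below is only an illustration of that convention, not something SDP itself mandates, and the name pick_codec and the codec lists are made up for the example.

```python
from typing import Iterable, Optional


def pick_codec(offered: Iterable[str], supported: Iterable[str]) -> Optional[str]:
    """Return the first offered codec that the recipient also supports.

    Hypothetical helper: 'offered' is in the sender's order of
    preference and 'supported' is whatever the recipient can handle.
    """
    supported_set = set(supported)
    for codec in offered:
        if codec in supported_set:
            return codec
    return None  # nothing in common


if __name__ == "__main__":
    # The sender prefers PCMU, but this recipient only handles PCMA and iLBC.
    print(pick_codec(["PCMU", "PCMA", "iLBC"], ["iLBC", "PCMA"]))
```

Walking the sender's list in order, rather than the recipient's, is what honors the sender's preference whenever there is more than one codec in common.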

I hope this helps make sense of what might be seen as a difficult subject.  If possible, take some Wireshark traces of a few SIP calls and see if you can figure out how media is being described and used.

By the way, this is the 50th article I’ve written for this blog.  Congratulations to me.

110 comments

  1. Fabian Monzon

    This article was really helpful. Thanks

  2. Ravi C G

    Thank you. This article on SDP clarifies many doubts.

  3. Thank you so much. It is really helpful

  4. well explained. Thanks

  5. I really appreciate your articles. They are so easy to understand and fluent, and they have really helped in my job.

  6. Finally it makes sense

  7. Shamik Datta Choudhury

    Andrew I have been following your posts for a long time and I must say, the way you explain the topics, it makes them very easy to understand and also to retain complex logics in our mind.

    I am confused about SDP and media negotiation. For example, if A has sent a description of all the media types, codec values, and ports that it supports to B, and then B also replies with its own description, where does the media negotiation happen? I mean, how does A come to know which codec B is going to use?

    Say, for example, A sends G.711 A-law and G.722 to B. What I had assumed until now was that B would pick one (most probably the first one, i.e. G.711 A-law) from the options shared by A, and then send a SIP response (200 OK) to A with an SDP message body that would include only G.711.

    But you are saying that B would also advertise all its capabilities to A. I mean, if B supports G.711 and iLBC, and B sends all these details to A, how would A understand that B is going to use G.711?

    I hope I was able to express myself. Please pardon me if I am asking some stupid things.

    1. SDP is not a negotiation protocol. As the name implies, it is a description protocol. Unless there is some specific negotiation that the clients employ, the recipient won’t know the chosen protocol until the media arrives and the client looks at the contents of the RTP.

    2. Each party should be able to handle whatever it advertises or describes in its SDP. Usually each side picks from the top of the other side's list, which is ordered by priority. If either party starts receiving an undesired codec and wishes to change to another one from the mutually shared list, it should send a re-INVITE or UPDATE to change the SDP. I am not sure which of the two is better suited.

  8. How should the session ID and version number be used in subsequent re-INVITEs and their responses, with or without changes in the SDP?

  9. Great and very helpful article, THANK YOU Andrew.

  10. Thanks for explaining SDP in simple but explicit manner..

  11. Billy Zheng

    Thank you for explaining this in a simple way.

  12. Usman

    Is it possible for the owner info and the connection info to have different IP addresses? If so, in which scenario?

  13. Thank you for your posts, clear and useful. I appreciate them

  14. thank you – 100 times now)

  15. Thanks for the good work!

  16. Can a system respond to an SDP offer with codecs and/or PayloadType information that were NOT part of the SDP offer? I have at least seen some cases of changing the PayLoad type of DTMF RFC2833 codec. That doesn’t seem to be in line with acceptable codec “negotiations”.

    1. It’s not a negotiation protocol. It’s a description protocol. It’s fine to tell the other side what you can do even if you don’t use particular protocols in that session.

  17. Top work.

  18. Taroen Sitaldin

    Thank you for this article! It really cleared things up for me. And gratz on the 50th article!

  19. Sean Herbert

    I sent a SIP INVITE to a mobile phone. The SDP data I received included the c=* value, i.e. c = IN 123.4.5.6. However, 123.4.5.6 was not associated with the physical location of the mobile phone, but rather it was associated with (I believe) a SIP trunk server. My question is, does there exist some way to get the IP address of the mobile phone tied to its physical location using SIP/SDP?

    1. Not that I know of.

  20. Sumit saini

    Hi, I want to know: during a conference call, if any participant wants to initiate a video call, what parameters will go in the re-INVITE?

  21. Hiya Mr. Prokop. So, if I have a client device and an SBC, and the UE sends an INVITE to the SBC, it’ll have the SDP in the INVITE. But is the SDP offer from the SBC done through a non-SIP response? I don’t see any SIP traffic being sent back to the UE except a 100 Trying and later a 200 Ringing. But from what I can tell those don’t include any of the SDP offers. Is it sent on the media side or something?

    1. I am not sure what you are asking, SDP from the client can be in the INVITE or ACK.
