DTMF and RFC 2833 / 4733

Over the past couple of weeks I’ve written two installments on voice codecs (A Cornucopia of Codecs and Codecs Continued).   I mentioned some of the major characteristics of six different codecs and why you might choose one over another.

However, I failed to point out something that is of great importance when making your codec choice.  What do you do about Dual Tone Multi-Frequency (DTMF) touch tones?   This is not something you can ignore because compressed codecs such as G.729 and G.726 will not successfully pass those tones along from sender to receiver.  Does that mean that those codecs aren’t  compatible with your voice mail system and SIP phones?   Absolutely not.  Read on to learn why.

How many of you are old enough to remember rotary telephones?  For better or worse, I certainly am.  In fact, it was all I knew for the first ten or so years of my life.  Heck, we even had a party line for most of my childhood.  Rotary phones used something called pulse dialing.  You put a finger in the numbered hole in a “finger wheel,” pulled that wheel back to the “finger stop” and let go.  During the return rotation, the electrical current of the telephone line would be interrupted in accordance with the number you dialed.  The number one would interrupt the circuit once and zero would interrupt it ten times.  The central office would then translate those current interruptions into the dialed telephone number.

DTMF was introduced by AT&T back in 1963 as a way to replace pulse dialing and rotary telephones.  Now, instead of interrupting the electrical current to dial a number, the telephone produced a tone to represent the dialed number.  Actually, it was two tones blended together — thus the Dual Tone part of DTMF.
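To make the "two tones blended together" idea concrete, here is a minimal Python sketch. The frequency values are the standard DTMF keypad grid, which I haven't listed above, and the function names are my own, chosen purely for illustration. Each key picks one low frequency from its row and one high frequency from its column, and the two sine waves are simply added.

```python
import math

# Standard DTMF grid: each keypad row shares a low tone, each column a high tone (Hz).
LOW_FREQS = (697, 770, 852, 941)
HIGH_FREQS = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key):
    """Return the (low, high) frequency pair in Hz for one keypad symbol."""
    if len(key) != 1:
        raise ValueError(f"expected a single key, got {key!r}")
    for row, symbols in enumerate(KEYPAD):
        col = symbols.find(key)
        if col != -1:
            return LOW_FREQS[row], HIGH_FREQS[col]
    raise ValueError(f"not a DTMF key: {key!r}")


def dtmf_samples(key, duration_s=0.1, sample_rate=8000):
    """Return float samples in [-1.0, 1.0]: the two sine waves summed at half amplitude."""
    low, high = dtmf_frequencies(key)
    count = round(duration_s * sample_rate)
    return [
        0.5 * math.sin(2 * math.pi * low * n / sample_rate)
        + 0.5 * math.sin(2 * math.pi * high * n / sample_rate)
        for n in range(count)
    ]
```

For example, `dtmf_frequencies("5")` gives the pair for the second row and second column, and `dtmf_samples("5")` produces 100 ms of that blended tone at an 8000 Hz sample rate.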

DTMF has clearly been extended to purposes beyond simply dialing a telephone number.  Interactive Voice Response (IVR) systems prompt us for all sorts of things that we answer with button presses.  We log into our voicemail systems and retrieve our messages with DTMF.  If so inclined, you can even play Mary had a Little Lamb using DTMF.


DTMF wasn’t a problem with digital and analog telephone systems because they both use a toll quality (64 kbit/s, sampled at 8000 Hz) audio connection.  The tones and speech easily mixed with one another and tone detection hardware was able to separate the DTMF out for applications that required it.

However, with VoIP and bandwidth concerns came voice compression and different techniques to send an intelligible voice stream using as few bytes as possible.  These compression and voice encoding techniques wreak havoc on DTMF and render the tones undecipherable by the components that need to detect and act upon them.

Enter RFC 2833.  With RFC 2833 you don’t send those DTMF signals on the same connection that you send your audio conversation.  Instead, you send them out-of-band on their own stream.  This allows you to compress the heck out of the voice without altering the DTMF signals.

Note: Technically, RFC 2833 has been replaced by RFC 4733, but for the most part people still want to call it 2833, so I do, too.
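For the curious, here is a rough Python sketch of what one of those DTMF events looks like on the wire. It follows the four-byte telephone-event payload layout defined in RFC 4733 (event code, an end bit, a reserved bit, a six-bit volume, and a 16-bit duration in timestamp units). The function names and default values are my own and only illustrative, and the sketch builds just the payload, not the RTP header that would carry it.

```python
import struct


def pack_telephone_event(event, duration, volume=10, end=False):
    """Build the 4-byte telephone-event payload (no RTP header).

    event    -- event code, e.g. 0-9 for digits, 10 for *, 11 for #
    duration -- length so far in RTP timestamp units (800 = 100 ms at 8000 Hz)
    volume   -- tone power expressed as 0-63 (dBm0 with the sign dropped)
    end      -- True on the final packet(s) of the event
    """
    if not 0 <= event <= 255:
        raise ValueError("event must fit in one byte")
    if not 0 <= volume <= 63:
        raise ValueError("volume must fit in six bits")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    flags = (0x80 if end else 0x00) | volume  # reserved bit stays 0
    return struct.pack("!BBH", event, flags, duration)


def unpack_telephone_event(payload):
    """Decode the first four bytes of a telephone-event payload into a dict."""
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    return {
        "event": event,
        "end": bool(flags & 0x80),
        "volume": flags & 0x3F,
        "duration": duration,
    }
```

The point to notice is that the digit travels as a number rather than as sound, which is why the voice codec can compress as aggressively as it likes without harming it.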

Depending upon the origin of the DTMF signals, they can start out in a separate stream, or that separate stream might be created by stripping the tones out of an audio conversation.  An example of the latter would be a gateway that converts analog to SIP.  Problems can arise from this stripping that need to be considered.  The converter must “hear” the tone before stripping it out and sometimes there is  leakage where the very beginning of the tone makes its way through.  This might cause a voicemail system to “hear” two tones for a single tone.  One would come from the RFC 2833 stream and the other in the voice stream.  Fortunately, conversion hardware is getting better and better and these problems have become less common (albeit a bear to debug when they occur).

So, in terms of SIP, how is this RFC 2833 stream created and managed?  Through Session Description Protocol (SDP), of course.  SDP is used to describe the voice stream (e.g. G.729) and it’s also used to inform the recipient that RFC 2833 is available.  Specifically, it uses something called telephone-event.  Here is an example of an SDP media description that you might see in the body of an INVITE message.  Note the format of “0-16.”  This represents the ten digits (events 0 through 9) plus *, #, A, B, C, D, and Flash (events 10 through 16).

m=audio 12346 RTP/AVP 101
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
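If you want to check what a given endpoint is actually offering, a small script can pull the event list out of the SDP. The following is a minimal Python sketch under my own assumptions: the function and table names are made up for illustration, and it only handles the simple `a=rtpmap` and `a=fmtp` forms shown above, not every variation SDP allows.

```python
# Event codes 0-16 and the keys they represent.
EVENT_NAMES = {
    **{code: str(code) for code in range(10)},
    10: "*", 11: "#", 12: "A", 13: "B", 14: "C", 15: "D", 16: "Flash",
}


def parse_telephone_events(sdp):
    """Return {payload_type: sorted event codes} for each telephone-event payload."""
    payload_types = set()
    fmtp_params = {}
    for raw_line in sdp.splitlines():
        line = raw_line.strip()
        if line.startswith("a=rtpmap:"):
            pt, _, encoding = line[len("a=rtpmap:"):].partition(" ")
            if encoding.lower().startswith("telephone-event/"):
                payload_types.add(int(pt))
        elif line.startswith("a=fmtp:"):
            pt, _, params = line[len("a=fmtp:"):].partition(" ")
            fmtp_params[pt] = params

    result = {}
    for pt in payload_types:
        events = set()
        # The fmtp value is a comma-separated list of codes and ranges, e.g. "0-11,16".
        for part in fmtp_params.get(str(pt), "").split(","):
            part = part.strip()
            if not part:
                continue
            first, _, last = part.partition("-")
            events.update(range(int(first), int(last or first) + 1))
        result[pt] = sorted(events)
    return result
```

Running it against the description above yields payload type 101 with events 0 through 16. Feeding it an offer with `a=fmtp:101 0-15` instead would show that event 16 (Flash) is missing, which is exactly the kind of mismatch that causes interoperability headaches.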

That’s probably about as much as you really need to understand about RFC 2833 and how it works.  Its purpose is to create a separate stream for DTMF so that voice codecs can deal strictly with creating the best possible voice stream using the fewest bytes.  If you remember that, you will be ahead of the game.


  1. Short and informative article, thanks Andrew

    1. Thanks! I do these in my spare time and I am happy to hear that they are making a difference.

  2. Is the SIP INFO message used for DTMF?

    1. That approach is frowned upon. There may be a few holdouts, but just about everyone has moved to RFC 2833/4733.

  3. The mention “a=fmtp:101 0-15” does not include the Flash event. I just ran into this while tracking some interoperability problems between Asterisk (0-16 with Flash) and Cisco (0-15).

    1. You are correct and I fixed the text. Thanks for noticing it and commenting.

  4. Sayam

    Hi Andrew,

    When we dial the Verizon conferencing bridge number and then enter the passcode, it detects two tones for a single tone. Please comment.

    1. It’s hard for me to know what is happening here. Are you using SIP trunks? It could be a problem on your end, inside the carrier, or inside the conference bridge itself.

  5. David

    Great article Andrew, and praise to your effort in your blog – it has much to admire. I was wondering if your recollection of a rotary phone is accurate? Doesn’t dialing zero invoke ten pulses? 🙂 Anyways keep up the great work!

    1. You are correct, David, and I fixed the article. I clearly wasn’t thinking when I typed that!

  6. Thank you for the explanation. Very well written.

  7. Hi Andrew, good basic outline. It would be great if you covered DTMF interworking. You deal with it a lot when working with carriers. The capabilities of the SBCs also pose a challenge when we do interworking.

    1. I’ll put it on the list. 🙂

  8. RiverIntoSea

    Thanks for the great work Andrew !

  9. Thomas Brahe

    Hi Andrew
    I want to clarify whether this DTMF issue is related to the carrier or the PBX.
    When calls come from cellphones on LTE/4G networks and reach the IVR on the PBX, I can see from the Wireshark traces that they use HD Voice at 16000 Hz, and what I learned from Google is that it simply kills the DTMF tones. Calls from 2G networks and landlines come in at 8000 Hz and DTMF works perfectly.
    So from my point of view, it looks like the SIP carrier just chooses to use HD Voice on all calls that come from their own network.
    The PBX uses RFC 2833 and the carrier also uses RFC 2833. But in my opinion it is the carrier that needs to change the format. Agree?

    1. So, the tones are stripped, but it doesn’t send a 2833 stream containing the digits? If so, that’s not right.

      1. This could be a simple routing change to fix. I would love to solve it. PCAPs could be relevant; it is important to know whether the event is heard but not seen.
        Does a routing change make any difference? Likely it will.
        Can the caller hear the sound of the tone when they press the button?

  10. I would like to know the difference between rtpmap: 127 telephone-event and rtpmap: 101 telephone-event and which of the two is in band.


    1. Both are dynamic payload types and both are out-of-band. Are you seeing those types on the same system or different systems?

      1. I’m seeing them on different systems, on a call from a SIP trunk to the PSTN.

        What happens is that calls with rtpmap: 127 telephone-event are not completed, while calls with rtpmap: 101 telephone-event are completed.

        Could you help me understand what the difference would be?

  11. Vineeth

    If DtmfRelay is enabled, does it mean that RFC 2833 is enforced?

    1. As far as I know, they are not related.

  12. Mark N

    Hello Andrew, I’m always confused when people say 2833 is OOB. Correct me if I’m wrong but the DTMF is technically in-band, but the DTMF payload is out of band? So the DTMF is audible but it is just using a different payload type that is outside of the dynamically negotiated UDP/RTP ports the rest of the audio uses.

    To me SIP INFO would be a true OOB protocol. I believe if you listen to the audio with Wireshark you will just hear pauses because the DTMF is signaled between the two endpoints and not passed through the RTP stream like 2833.

    Am I correct in my thinking?

    1. With 2833, the DTMF will be stripped out of the audio stream. It will only be represented by the separate RTP stream.
