SIP vs. H.323

On a number of occasions I’ve mentioned how I prefer SIP over H.323.  Some of you, rightfully so, are probably wondering why?  Just because SIP is newer doesn’t make it better.  Well, here are a number of reasons why I’ll take SIP over H.323 any day of the week.

First, let’s look at who controls each protocol.  H.323 was developed by the International Telecommunication Union (ITU), an organization that dates back to before Alexander Graham Bell invented the telephone.  The ITU has been instrumental in building the Public Switched Telephone Network (PSTN) as it existed for the first 100 years or so.  In fact, H.323 is basically an IP wrapper around Q.931, which is the call setup protocol of ISDN.  In other words, H.323 is clearly rooted in traditional circuit-switched telephony.

SIP is controlled by the Internet Engineering Task Force (IETF), which is responsible for the protocols that make the Internet work.  The same folks who make enhancements to Hypertext Transfer Protocol (HTTP) and the Domain Name System (DNS) also make enhancements to SIP.  While H.323 takes its cues from the TDM world, SIP looks towards the Internet and the web for its guiding light.

In H.323, endpoints are dumb.  They are what I like to refer to as stimulus devices.  The phone tells the call server that a button has been pushed.  The call server tells the phone to write something to its display.  Without an intelligent network element telling an H.323 telephone exactly what to do, it’s little more than an expensive paperweight.

SIP endpoints are smart.  A lot of processing occurs in a SIP phone, and it’s possible to create and manage all sorts of communication flows without the assistance of any network components.

H.323 is a byte-based, hexadecimal protocol and as such is hard to troubleshoot and debug.  SIP is text based and anyone familiar with the protocol can easily look at a SIP trace to understand what is happening.
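
To make the text-based point concrete, here is a minimal Python sketch that pulls the method, request URI, and headers out of a SIP request. The INVITE below is made up for illustration (the addresses, tags, and Call-ID are hypothetical), and the parser is deliberately simplified: it ignores header folding, compact header names, repeated headers, and message bodies.

```python
# A made-up SIP INVITE for illustration; addresses and tags are hypothetical.
RAW_INVITE = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP pc33.example.org;branch=z9hG4bK776asdhds\r\n"
    "Max-Forwards: 70\r\n"
    "To: Bob <sip:bob@example.com>\r\n"
    "From: Alice <sip:alice@example.org>;tag=1928301774\r\n"
    "Call-ID: a84b4c76e66710@pc33.example.org\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Contact: <sip:alice@pc33.example.org>\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)


def parse_sip_request(raw):
    """Split a SIP request into (method, request_uri, headers).

    Simplified sketch: a later header with the same name overwrites
    an earlier one, and any message body is discarded.
    """
    head, _, _body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # The first line of a request is "METHOD request-uri SIP/2.0".
    method, request_uri, version = lines[0].split(" ", 2)
    if version != "SIP/2.0":
        raise ValueError("not a SIP/2.0 request")
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only; values such as URIs contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, request_uri, headers


method, uri, headers = parse_sip_request(RAW_INVITE)
print(method, uri)
print(headers["call-id"])
```

Nothing here needs a protocol decoder: the same fields are readable directly in a raw trace.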

Despite being a standard, H.323 has evolved into being a very proprietary protocol.  You cannot take an Avaya H.323 phone and expect it to work on a non-Avaya system.  However, despite some of the liberties that vendors have taken with SIP, there is a high probability that you can run anyone’s SIP phone on any SIP-compliant system.  You may not get a particular vendor’s proprietary call control features, but you can expect to make calls, hold calls, transfer calls, create conferences, and use most of the other common telephony functions.

SIP allows you to embed information about the call within the SIP messages.  For instance, you can add a customer’s account number to a SIP message and have the number stay with the call no matter how many times it is transferred around a contact center.  H.323 does not have a mechanism to easily attach data to a call, which leads to a fully parallel Computer Telephony Integration (CTI) network.
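
As a rough illustration of attaching data to a call, the Python sketch below adds an account number to a SIP request as an extra header and reads it back. The header name `X-Account-Number`, the account value, and the sample INVITE are all hypothetical, chosen only for this example. Whether such a header actually follows a call through transfers depends on every element in the path passing it along, so this shows only the attach and read steps.

```python
ACCOUNT_HEADER = "X-Account-Number"  # hypothetical header name, not a standard


def add_header(raw_message, name, value):
    """Insert 'name: value' as the last header, before the blank line."""
    head, sep, body = raw_message.partition("\r\n\r\n")
    if not sep:
        raise ValueError("message has no header/body separator")
    return head + "\r\n" + name + ": " + value + sep + body


def get_header(raw_message, name):
    """Return the first value of the named header, or None if absent."""
    head = raw_message.partition("\r\n\r\n")[0]
    for line in head.split("\r\n")[1:]:  # skip the request line
        key, _, value = line.partition(":")
        if key.strip().lower() == name.lower():
            return value.strip()
    return None


# Made-up INVITE for illustration only.
invite = (
    "INVITE sip:agent@example.com SIP/2.0\r\n"
    "Call-ID: 12345@example.org\r\n"
    "CSeq: 1 INVITE\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

tagged = add_header(invite, ACCOUNT_HEADER, "4417-0098")
print(get_header(tagged, ACCOUNT_HEADER))
```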

H.323 works well for voice and video, but it hasn’t been extended to support other media types.  On the other hand, SIP is media agnostic and can be used for instant messaging, presence, file transfer, etc.  In fact, when I was working for Nortel developing the MCS 5200 soft phone, we used SIP to play chess across the Internet.

In summary, H.323 is not a bad protocol.  It does what it does and it does it well.  However, as communications have evolved, H.323 has pretty much stayed where it was 10 years ago.  SIP not only addresses the media types that H.323 ignores, it has absorbed the properties that have made the Internet as pervasive as it is today.  SIP is clearly the future and its relevance will proportionally rise as H.323’s relevance falls.


  1. Andrew, I think you’re a bit misguided here. I’ll first preface what I’m going to say with the fact that SIP and H.323 are both old protocols by any measure today, both dating back to 1996! It’s hardly even worth debating the merits of one versus the other these days, but I can’t help but correct some of the misstatements here.

    H.323 wasn’t “developed by the ITU” any more than SIP was “developed by the IETF”. You’re correct that the work was done within those respective standards organizations, but the people doing the work were employees of companies interested in making voice/video work over IP. Many were young engineers in their 20s, not at all the way you make it sound, with a bunch of old cronies carrying forward a long history of old-world technology. More importantly, H.323’s main focus was making videoconferencing calls, not old-world voice phone calls.

    That said, H.323 did make use of Q.931 (ISDN signaling). Why? It was immediately available, well-understood, and made interworking with circuit-switched systems easier. While you turn your nose up at that, just keep in mind that the IETF is still actively working to reach feature parity with H.323, even in basic things like Q.931. As a relatively recent example, look at RFC 6432. There are numerous examples where SIP was extended to support various capabilities from the circuit-switched world. It has been extended so much, in fact, that it is now starting to be used as the replacement for ISDN in many carrier networks. SIP is used by 3GPP as a means of making phone calls over mobile phones (you won’t see that signaling, as it’s behind the scenes running on LTE) and for interconnecting voice calls between traditional phone companies. They’ve been working on that since 1999, but it’s starting to be utilized.

    So what we have today is, in fact, two systems with two points of origin that are both dated and converging to have a similar set of capabilities. The underlying protocols are different, but much of the same information is conveyed and they try to perform the same function.

    The binary vs. text argument never ends. The truth is that H.323 is binary and SIP is text. Do people debug binary H.323? Absolutely not. The binary messages are always presented in textual form for debugging purposes. Thus, the argument really carries no weight whatsoever. H.323 is not more difficult to debug than SIP.

    H.323 endpoints are not dumb and they’re not stimulus-based. There is an Annex to H.323 that defines how one might add stimulus signaling support to H.323, but what you described about H.323 is totally wrong. H.323 and SIP are both “smart” endpoints that can operate autonomously to make calls point-to-point.

    H.323 also has not “evolved into being a proprietary protocol”. Avaya does have some models of phones that implement a subset of H.323 and will generally only work with Avaya PBXes. Cisco also has SIP phones that only work with Cisco PBXes. So your argument is what? There are vanilla H.323 and SIP devices out there that will “generally” work with any system.

    H.323 is actually an interoperable protocol used widely for videoconferencing systems made by Cisco, Avaya, Polycom, Lifesize, Spranto, etc. and with cloud services like BlueJeans, Pexip, etc. Interoperability events are still held each year and vendors do try to work to ensure that systems can interwork.

    SIP also strives for interoperability, but it has been observed by many in the industry that, since SIP does not define a system, but merely the base protocol, it leaves a lot open to interpretation. As such, SIP has been called the “Subject to Interpretation” protocol. Others have said that “there are as many variants of SIP as there are implementations”. Those claims might be harsh, but the point is that SIP is not without its interoperability issues.

    SIP and H.323 both have a means of “attaching” information, as you described it. Within H.323, there are actually fields dedicated for the purpose of inserting whatever kind of non-standard data you might want to insert. Not only that, but the type of data can be identified using standard identifiers (or manufacturer-defined values). Basically, both systems are equally extensible and support addition of non-standard data.

    H.323 works well for voice and video, but you apparently didn’t know that H.323 supported data applications from the outset. Those include fax, modem, whiteboarding, application sharing, file transfer, and text. SIP does the same, of course. H.323 never defined instant messaging, as experts felt it was more appropriate to use a protocol like XMPP than to re-invent the wheel.

    You might have heard of the forthcoming work called “WebRTC”. H.323 is being updated right now to support the WebRTC data channel so as to allow H.323 devices to interwork with WebRTC clients coming in the future. This speaks to your last point. H.323 has not stayed in the same place for 10 years. It was most recently updated in 2009 and it’s scheduled for another update in 2015 to include expanded Telepresence capabilities and WebRTC support, among other things.

    However, it can’t be ignored that both of these protocols are now old by every definition. When the work first got started on SIP, for example, it was commented in an IETF meeting or on one of the mailing lists that the old SS7 network was “15 years old!” I laugh now, because the IP protocols of today are older than that.

    It’s actually time to start looking toward the future and try to figure out what comes next. H.323 and SIP are both “monolithic” systems. Basically, if you want some kind of capability, it has to be built into the endpoint. If you buy an IP phone and want file transfer capability, you’re out of luck unless you can get the vendor to implement it and roll out a new version of the firmware. And what if you make a phone call from your mobile phone, but want to transfer a file sitting on your PC or tablet to the other person? Can’t do that today.

    However, the “old” ITU is working to enable just that kind of capability with a next-generation multimedia system. It’s been progressing for a number of years, but it’s inching closer to reality. It will be a protocol that uses JSON as the transfer syntax, making it much more uniform and easier to develop with than either H.323 or SIP. It will be designed from the outset to allow distributed applications (like file transfer on the PC and voice on the mobile phone) to work harmoniously. Also, it can serve as the signaling syntax used in WebRTC, since WebRTC doesn’t define a signaling protocol. It’s been slow to progress, because the desire was to create something revolutionary to lead the industry forward, not to merely be a replacement for the same monolithic systems of today.

  2. Thank you for the thoughtful reply. I agree with you on some of your points (e.g. both protocols becoming overburdened, although I’m not fond of the term monolithic when it comes to SIP), but I stand by what I wrote.

    I am also well aware of WebRTC. Look through my blog articles and you will find four distinct articles and a number of separate references. You will even see why I don’t think that WebRTC should define signaling:

    I do appreciate your taking the time to express your thoughts so thoroughly. I would be interested in what others have to say on the subject.

    In any case, H.323 has clearly lost the protocol war. The world is moving to SIP and away from H.323 and I don’t see that changing.

    1. Andrew, I’m not sure one can say, after 18 years of parallel development and continued use, that either “won” a protocol war. If you’re referring to telecom operator adoption, I certainly agree that SIP owns that space now. Basically, the protocol is serving as a basic replacement of legacy TDM voice service. Meanwhile, H.323 still dominates in its core area of videoconferencing.

      That’s the issue I take with the current state of affairs. Whereas the industry could have more fully exploited IP in order to enable really cool, revolutionary modes of communication, what do most people still use? It’s a basic voice phone. It might be IP on the backend, but that’s it. In the enterprise, there might be video and, if you’re really lucky, it might even work when calling outside the enterprise, but not likely if interconnecting via SIP to a carrier.

      In the consumer space, there is some innovation with the likes of WeChat, Google Hangouts, etc. However, those are all proprietary solutions, so the world is left fragmented. The only “common” capability is the PSTN. That “PSTN” might be a basic voice SIP “trunk”, but nothing more than that.

      It’s discouraging and frustrating at the same time. I’d really like to see that change and see the world move on to other standards that do allow for richer communications, support for multiple devices, etc. I think it’s time for the world to move on to new architectures and protocols. Neither H.323 nor SIP is really the right tool to carry us forward. They’re both really old now.

  3. Again, thank you for your thoughtful comments. I appreciate your insight and honesty.

    My “winning the protocol war” remark is based on personal observations. Every major telephony applications vendor is moving their products to SIP. There are lots of cheap (or even free) SIP soft-phones for mobile devices. The hard-phone manufacturers are all moving to SIP. Avaya, the largest supplier of enterprise communications, is gung-ho on SIP. Microsoft has entered the telephony world powered by SIP. Even the video folks (LifeSize, Polycom, Tandberg, Radvision) are working with SIP. I cannot think of anything new and innovative that is based on H.323. From what I’ve seen, that ship has sailed.
