Thursday, April 23, 2009

The Nightmare Before TR-069

The open source project that I'm currently working on is named LACS. It's an implementation of the TR-069 standard. The standard is a way of configuring, upgrading, and diagnosing customer premises equipment (CPE) in a telecommunications network.

At this point, most of you are probably scratching your heads. What is a CPE, and why do I care? You probably don't, but management of network equipment is what I do for my day job. I work on the OAMP team at SOMA Networks, where we create the interfaces that are used to manage our equipment. These interfaces are used to set up the equipment and ensure it is operating correctly.

How does this relate to you? Well, think about the DSL or cable modem you have at home sitting behind your wireless 802.11b/g/n router. That modem is a CPE in network management speak. The network operator (e.g., Rogers, Bell Canada, Verizon) has to manage these CPEs. They need to ensure they are configured correctly, and that they get upgraded when new firmware versions become available to fix bugs or add new features.

Think about how many of these they have to keep track of. In a large network, you are potentially talking about millions of CPEs. OK, well that's not so bad since Rogers always hands out the same type of cable modem, right? Wrong. They probably have several types of modems out there; I'm sure the modem I bought six years ago, still pushing bits around in 2009, is not what they are currently using.

So what can you do to manage all these different CPEs? One option is to create different applications that know how to talk to a specific brand of modem from a specific vendor (think Motorola, Cisco, etc.). This application is built by the CPE vendor, and may be provided for free (or not, as the case may be). We'll assume that this application gets upgraded on a regular basis to handle new firmware versions as they become available. If we are really lucky, the application knows how to manage multiple types of CPEs by the same vendor.

At first glance this doesn't seem so bad, but there are some big issues with this approach. Think of it this way. I need to update all of the CPEs in my network to introduce a new feature. I have five types of CPEs out there, and two of them have to be upgraded with new firmware before they can have that new feature enabled. I need to go to two separate applications and tell them to upgrade all of the affected CPEs. Assuming all goes well and all of the CPEs are upgraded over the course of a week (yes, a week or potentially longer), I can start configuring the new feature. I have to go to five different applications and figure out how to switch the feature on. In one application I need to interact with a Windows GUI, and I have to click through three different dialogs before I can start the process. The next application is a basic web app that works only with Internet Explorer 6, and the server is a Sun box. Another is a full-blown Web 2.0 Ajaxy thingamajig that works only with IE 7 and Firefox 3, and the server side runs on IBM hardware only. I have to dig through two menus and five screens to get to the right spot.

You can see the obvious problem; I have to figure out how to do the same thing at least five different ways. And that's just the start. I have to be sure that I can track which CPEs have successfully updated and which haven't. I have to keep from overloading my network by trying to configure too many CPEs at the same time; this is a bigger deal when the last mile is a radio link as in WiMAX, rather than cable or fiber. And so on. I have to coordinate everything between five different applications to make sure everything runs smoothly. To boot, a single application can support only so many CPEs. Let's say our Windows GUI application has a back-end that can manage 100,000 CPEs from one vendor. I have 250,000 of these CPEs in my network, so I need three instances of that application to manage them. Now do the same thing for the other four applications. To put it mildly, life is getting complicated.
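To make the instance math concrete, here is a minimal sketch in Python. The function name is made up for illustration; the only real numbers are the 100,000 capacity and the 250,000 CPEs from the example above.

```python
def instances_needed(cpe_count: int, capacity_per_instance: int) -> int:
    """Smallest number of application instances that covers cpe_count CPEs."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    # Integer ceiling division: a partly filled instance still counts as one.
    return -(-cpe_count // capacity_per_instance)


if __name__ == "__main__":
    # Figures from the example in the text: 250,000 CPEs, 100,000 per instance.
    print(instances_needed(250_000, 100_000))  # 3
```

Repeat that for each of the five vendor applications and the number of servers to babysit grows quickly.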

Here's another approach. You get the CPE vendors to provide you specifications on their provisioning, upgrade, and configuration interfaces. You create a single server application that provides a single interface for provisioning, configuring, and upgrading the CPEs in your network. This server has a "mediation layer" that knows how to translate operations to the "language" that the CPE understands. This sounds a lot better. I have one place to go to manage the firmware images for my CPEs; one place to go to configure, upgrade, and diagnose them. I still have the problem that I can support only so many CPEs per instance of the application, but the pain factor is cut down significantly.
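A rough sketch of what such a mediation layer might look like, in Python. Everything here is hypothetical: the two vendor adapters, their command formats, and the class names are invented for illustration and do not correspond to any real vendor interface.

```python
from abc import ABC, abstractmethod


class VendorAdapter(ABC):
    """Translates a generic operation into one vendor's native 'language'."""

    @abstractmethod
    def set_feature(self, cpe_id: str, feature: str, enabled: bool) -> str:
        """Return the vendor-specific command that would be sent to the CPE."""


class CliStyleAdapter(VendorAdapter):
    # Hypothetical vendor whose CPEs accept text commands.
    def set_feature(self, cpe_id: str, feature: str, enabled: bool) -> str:
        state = "enable" if enabled else "disable"
        return f"{state} {feature} on {cpe_id}"


class KeyValueStyleAdapter(VendorAdapter):
    # Hypothetical vendor whose CPEs accept key=value settings.
    def set_feature(self, cpe_id: str, feature: str, enabled: bool) -> str:
        return f"{cpe_id}: {feature}={int(enabled)}"


class MediationLayer:
    """Single entry point that routes each request to the right adapter."""

    def __init__(self) -> None:
        self._adapters = {}

    def register(self, vendor: str, adapter: VendorAdapter) -> None:
        self._adapters[vendor] = adapter

    def set_feature(self, vendor: str, cpe_id: str, feature: str, enabled: bool) -> str:
        if vendor not in self._adapters:
            raise LookupError(f"no adapter registered for vendor {vendor!r}")
        return self._adapters[vendor].set_feature(cpe_id, feature, enabled)


if __name__ == "__main__":
    layer = MediationLayer()
    layer.register("vendor-a", CliStyleAdapter())
    layer.register("vendor-b", KeyValueStyleAdapter())
    print(layer.set_feature("vendor-a", "cpe-1", "wifi", True))
    print(layer.set_feature("vendor-b", "cpe-2", "wifi", False))
```

The operator only ever calls `MediationLayer.set_feature`; supporting a new vendor means writing one more adapter, which is exactly the custom integration work this approach requires.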

The biggest drawback is that I have to do a lot of custom integration development to tie everything all together. It becomes a really big deal deciding whether or not you really want to use that new vendor's CPE. Do we really want to sink the time and money into supporting it? It could take months to do design, development, testing, and then deployment into the live system. There are lots of factors in that kind of decision, and determining the real cost of integrating a new CPE into your server application is just one of them.

So what's a poor network operator to do? In comes TR-069 from the Broadband Forum, which defines the CPE WAN Management Protocol (CWMP). In short, it's an application-level protocol for remote management of CPEs. It provides a standard way to provision, configure, upgrade, and diagnose your cable modem. It describes an auto-configuration server (ACS) that is used to manage the CPEs using SOAP over HTTP. There is a well-defined set of messages that the ACS and CPE exchange to get and set configuration properties, indicate that the CPE needs to get a new firmware image and upgrade, and so on. Keeping it simple, the configuration model is a set of key-value pairs. The keys are a dot-separated tree, for example InternetGatewayDevice.LANDevice.1.Hosts.Host.1.IPAddress. TR-069 defines a set of standard parameters, and there are other standards that provide additional parameters for various types of CPE devices.
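To give a feel for the key-value model, here is a small Python sketch of a parameter store with a lookup loosely modeled on the GetParameterValues message. It is only an illustration: the real exchange is SOAP over HTTP, the values below are made up, and the function is my own simplification. The one rule it borrows from the standard is that a name ending in a dot is a partial path that matches everything underneath it.

```python
from typing import Dict, List

# Sample parameter values are invented for illustration.
CPE_PARAMS: Dict[str, str] = {
    "InternetGatewayDevice.DeviceInfo.SoftwareVersion": "1.0.3",
    "InternetGatewayDevice.LANDevice.1.Hosts.Host.1.IPAddress": "192.168.1.10",
    "InternetGatewayDevice.LANDevice.1.Hosts.Host.1.HostName": "laptop",
}


def get_parameter_values(params: Dict[str, str], names: List[str]) -> Dict[str, str]:
    """Look up full parameter names, or whole subtrees via partial paths."""
    result: Dict[str, str] = {}
    for name in names:
        if name == "" or name.endswith("."):
            # Partial path: collect every parameter below this node.
            result.update({k: v for k, v in params.items() if k.startswith(name)})
        elif name in params:
            result[name] = params[name]
        else:
            raise KeyError(name)
    return result


if __name__ == "__main__":
    subtree = "InternetGatewayDevice.LANDevice.1.Hosts.Host.1."
    for key, value in sorted(get_parameter_values(CPE_PARAMS, [subtree]).items()):
        print(key, "=", value)
```

Because every vendor exposes the same kind of tree, the ACS can use one lookup like this for any compliant CPE instead of one per vendor.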

Why is this a good thing? It means that if a CPE vendor supports TR-069, your ACS stands a good chance of being able to integrate with that CPE with a minimum of fuss. And you don't even have to write the ACS, since the standard describes how the ACS is going to work; you can just buy one. The ACS still has to be integrated into your larger network management system through its northbound interface, which varies from vendor to vendor, but you were going to have to do that anyway with your custom server. So I get a standardized CPE management server, and I get a vastly simplified integration of new CPEs. Sounds pretty good overall.

What's the downside to CWMP? Well, ACS products are expensive. We are talking from $500,000 USD to $1,000,000 USD for the server, and then $2-5 USD per CPE. That rapidly adds up to several million dollars for a medium to large scale deployment. In a small scale deployment, it's probably cheaper to build a stripped-down ACS yourself that handles just upgrades and basic configuration.
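A quick back-of-the-envelope calculation in Python shows how the pricing adds up. The price ranges are the ones quoted above; the deployment size of one million CPEs is an assumption picked for illustration.

```python
def acs_cost(cpe_count: int, server_price: int, per_cpe_price: int) -> int:
    """Total cost in USD: one-time server price plus a per-CPE charge."""
    return server_price + cpe_count * per_cpe_price


if __name__ == "__main__":
    cpe_count = 1_000_000  # assumed deployment size, not from a real network
    low = acs_cost(cpe_count, 500_000, 2)
    high = acs_cost(cpe_count, 1_000_000, 5)
    print(f"${low:,} to ${high:,}")  # $2,500,000 to $6,000,000
```

At that size the per-CPE charge dominates the server price, which is why the total climbs so quickly as the network grows.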

The LACS project intends to create the whole TR-069 ecosystem: the ACS with a RESTful northbound interface and SOAP/HTTP southbound interface, and a CPE "stack" with the SOAP/HTTP northbound interface and a RESTful southbound interface. The project was originally going to be just the ACS, but the simple fact is that I'll need a CWMP client (CPE) to test with. Might as well do the whole thing; in for a penny... Also, the RESTful interfaces to the north of the ACS and south of the CPE stack are my decision; they're not part of the standard.

Why am I doing this project? Well, for a couple of reasons. First, I understand the domain pretty well. I've built two different configuration management systems for network elements, and this will be the third. I don't have a conceptual hurdle to get over, meaning that in theory I can be productive sooner rather than later. Second, it gives me the opportunity to explore some technology I'm not as familiar with. I haven't done much, if any, network programming. By that I mean pushing and pulling bytes over a network using sockets. Also, I want to use hierarchical statecharts and active objects for the implementation. (I'll probably discuss what these are in a future blog post. If you are curious, check out the Quantum Leaps website.) While I've been a fan of the event-driven approach, I haven't really had an opportunity to put it into practice. This is my big chance. RESTful interfaces are the new "big thing" in web services, and I think they have a lot of promise. They certainly can't be any more complicated than SOAP and the host of WS-* standards bolted around it. Third, ... well, I guess I just like the challenge of building something for free that others are charging an arm, leg, stomach, chest, and head for.

So, TR-069 is the answer for CPE management in a broadband network. It's a stable, mature standard with wide acceptance; you can find ACS and CPE stack vendors by the dozens now. It doesn't answer the question of the ACS northbound interface, nor how the CPE is supposed to use the configuration, but I'd argue that is actually a good thing. LACS will eventually be an open source implementation of TR-069, available to anyone who wants to use it.

11 comments:

  1. Any engage on your Open Source project?

  2. Hi,

    Do you see any substantive difference between managing fixed line CPE and wireless CPE (e.g., mobile phones)? I know that there is an OMA-DM standard for mobile. Do you have a view as to how this compares with TR-069/LACS?

    thanks

    Tim Joyce
    tim.joyce@wdsglobal.com

  3. Hi Glenn,
    I'm currently here because I - as a home broadband user - am trying to see if there's a way of monitoring the current state of my broadband connection (As you may have already concluded, mine is poor, and I want to gather as much information I can before contacting my telephone company).
    By looking into the management section of my modem, I came to this TR-069 protocol and understood that it can be the way to that monitoring I want.
    My question is, from a home user standpoint, is there a way of implementing an ACS and a client in my network? Are there already any tools - preferably for free?
    Thanks,
    Gustavo - Brazil

    PS.: Nice kid!

  4. Glenn,
    Forgot to tell you my email, which is glapido@gmail.com
    Please feel free to send me your answer to my question above.
    Thanks,
    Best regards

  5. Hi Glenn,

    We are setting up a similar project here in Europe. If you want to discuss/work together on topics please let us know.

    Sandra @ info@comsysco.nl

  6. Hello Glenn,

    We are looking to collaborate on a wireless management project, and are based in California. If you wish to discuss the opportunity, please do let me know.

    Srinivas
    advaita1@gmail.com

  7. So for those of you who occasionally ask, unfortunately I never actually got anywhere with the open source project. Just too many demands on my time, professional and personal, to let me work on it. No real surprise there. It's gotten to the point where I've asked Google Code to remove the project since I don't think it's appropriate to have a project up that just has some wiki pages and is effectively squatting on a name.

    Thank you for the interest, but I'm afraid I'm not going to be able to work on this type of project in the future. I'm currently working at Proofpoint doing email archiving, and it's a totally different ballgame. :)

  8. Plus there are more players on the market now, like http://www.acslite.com, who are bringing the cost of the ACS down to just a few thousand dollars if you only have a small number of devices.

  9. This comment has been removed by the author.

  10. This comment has been removed by the author.

  11. EasyCwmp is an open source TR-069 CWMP client which contains all required RPCs and fully conforms with the Broadband Forum standard.

    For more detailed information, visit the website: http://www.easycwmp.org/
