Thursday, May 6, 2010

Mercurial and Managing Third Party Code in Your Project

Overview

Managing third-party code in your source repository can be a tricky business. Many projects simply choose not to address the problem at all; you are required to download all the dependencies and build them yourself, or perform the equivalent installation of binaries using your distribution's package manager. It's typically up to you to ensure that you are using the correct versions.

What if you want to ensure you know exactly what's going into your application so you don't get broken by something that changes in the package you rely on? What if you know that you are going to have local customizations to those packages that are needed for your application, but may not be accepted back upstream?

The approach that I've seen used successfully is to track most if not all of the dependencies you have in your source tree. At this point several people are saying "What the hell? That will bloat my source tree significantly!" Yes it can and almost certainly will, but there are definite advantages:
  • When users download your application source, it's just all there; they don't have to get each and every dependency individually.
  • You can include local changes to those packages and not have to invent complicated workarounds for bugs.

The simplest way to manage this is with subrepositories, which aren't particularly well supported by most SCM tools at the moment. The usual workaround is to use a wrapper script that knows how to work with multiple repositories at the same time. The drawback to that approach is that people using your package now have to get the wrapper before they can correctly download everything. They also have to remember to use your wrapper for certain operations, for example cloning, pushing, and pulling. Mercurial does have some support for subrepos, but it isn't a "fully baked" feature at this time.

There is a compromise approach that will work with standard Mercurial and not require any fancy scripts. It's not perfect, but it does cover most of the bases. In essence, you will keep pure copies of the third party source in separate clones of your repository and merge them back into your main repository. Local changes will be kept in the main repository. When you want to upgrade the third party code, you update the pure repository and merge it back into your main repository again. Your local changes are in their own changesets, and you will hopefully be able to merge them cleanly with the updated code. Without having those local changesets, and changesets for the updates to the pure code, it can be very difficult to track which change is which.

Again at this point, some people are likely freaking out and saying "Mercurial doesn't require a master repository, this violates one of the main DVCS principles!" Yes and no. Typically most projects do have a main repository that everyone gets their initial clones from and is considered the "golden master." Even for those that do not, this technique will still work reasonably well.

Note that the MQ extension to Mercurial will often do everything most people want when managing third-party packages and local patches. The technique described here is an alternative approach if you feel MQ doesn't quite fit your needs or your way of doing things.

Before you begin

One of the bigger drawbacks to this approach is that you need to work from a "clean" repository.  If you already have a bunch of code, this isn't going to work very well.

At this point, you are probably saying, "But if I'm just starting out, how do I know what third party packages I'm going to need?"  Fair enough, but I'm betting that you are going to have at least one already in mind, and maybe more. To get us started, let's assume we are starting a new project and that we are going to use Lua for our scripting and configuration file needs.  This is a non-trivial package that is quite common in a large number of projects.

Basic How To

The basic series of tasks will look like:
  1. Create your main repository.
  2. Set up your pure "starter" repository called 'pure-stem' from the main repo.
  3. Create a pure package repository from the pure-stem repo.
  4. Import the third party code into the pure repo and tag it.
  5. Pull and merge the pure repo into your main repo.

Create your main repository

This will depend on your host provider, but let's assume you are managing everything locally. You need to add a simple file into the root of the repository and do a single commit. If you don't, the pure-stem repo you will create later could have a different initial changeset identifier, which is what Mercurial uses to identify whether repositories are related. If the repos aren't related, this technique doesn't work.

$ hg init myproject
$ cd myproject/
$ echo "Let's take over the world." > README
$ hg commit --addremove --message "Adding initial project README."
adding README

Set up your pure-stem repository

We need a repository that all of our third-party pure repos will derive from. This really needs to be done before pretty much anything else, as it needs to be completely empty of non-third-party source.

$ cd ..
$ ls
myproject
$ hg clone myproject pure-stem
updating to branch default
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
$ ls
myproject  pure-stem
$ cd pure-stem
$ mkdir -p src/packages
$ echo "All third-party source packages are located here." > src/packages/README
$ hg commit --addremove --message "Setup of pure-stem repository."
adding src/packages/README

To make life a little easier later, let's pull the contents of the pure-stem repo into our main repo:

$ cd ../myproject/
$ hg incoming ../pure-stem 
comparing with ../pure-stem
searching for changes
changeset:   1:2b2ed8ef6cba
tag:         tip
user:        Glenn McAllister 
date:        Thu May 06 15:57:39 2010 -0400
summary:     Setup of pure-stem repository.
$ hg pull ../pure-stem 
pulling from ../pure-stem
searching for changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 1 changes to 1 files
(run 'hg update' to get a working copy)
$ hg update
1 files updated, 0 files merged, 0 files removed, 0 files unresolved

Create and Populate the Third Party Pure Repo

Our first third-party package to import will be Lua. For the sake of further examples later, we are going to use Lua 5.1.3 to start. Later on, we'll upgrade to Lua 5.1.4. Assuming we have lua-5.1.3.tar.gz downloaded already:

$ cd ..
$ hg clone pure-stem pure-lua
updating to branch default
2 files updated, 0 files merged, 0 files removed, 0 files unresolved
$ cd pure-lua/src/packages
$ tar zxf /tmp/lua-5.1.3.tar.gz --transform s/lua-5.1.3/lua/
$ ls
lua  README
$ hg commit --addremove --message "Initial import of Lua 5.1.3"
... lots of files were added ...
$ hg tag lua-5.1.3

Note that we tag the repo with the Lua release information so we know which version we have at a given point in time.

Pull the Third Party Repo into the Main Repo

At this point, we want to pull the new pure-lua repo into our main repo:

$ cd ../../../myproject/
$ hg incoming ../pure-lua 
comparing with ../pure-lua
searching for changes
changeset:   2:ec222e7c1372
tag:         lua-5.1.3
user:        Glenn McAllister 
date:        Thu May 06 16:09:04 2010 -0400
summary:     Initial import of Lua 5.1.3

changeset:   3:3dc42e785d44
tag:         tip
user:        Glenn McAllister 
date:        Thu May 06 16:09:39 2010 -0400
summary:     Added tag lua-5.1.3 for changeset ec222e7c1372

$ hg pull ../pure-lua 
pulling from ../pure-lua
searching for changes
adding changesets
adding manifests
adding file changes
added 2 changesets with 104 changes to 104 files
(run 'hg update' to get a working copy)
$ hg update
104 files updated, 0 files merged, 0 files removed, 0 files unresolved

At this point, you now have the Lua source code in your main repository.

Making Local Changes


Now that we have some third party source, let's make some changes to it. This isn't going to be a 100% realistic example; it's just enough to show the principle. Let's change the Lua README file to add our own comment:

$ cd src/packages/lua/
$ echo "Local change in main repository." >> README
$ hg status
M src/packages/lua/README
$ hg commit --message "Local change to Lua README file."

And that's it. In theory, you can make as many changes as you want to the package. In practice, you won't. You will typically try to make the minimum changes necessary, if any, to get your project to work.

Upgrading the Third Party Package

Lua's current stable release is actually 5.1.4, not the version we are using. We've looked at the release notes, bug reports, etc. and decided that upgrading to the current version is a good idea. The basic steps we want to follow are:

  1. Blow away the existing files.
  2. Untar/unzip the update into the repo.
  3. Commit the changes and tag them.
  4. Merge the changes back into the main repo.

Rather than go into the same level of detail as in the above sections, here is the whole sequence in one go:

$ cd ../../../../pure-lua/
$ hg locate -0 --include src/packages/lua | xargs -0 rm
$ cd src/packages/
$ tar zxf /tmp/lua-5.1.4.tar.gz --transform s/lua-5.1.4/lua/
$ hg commit --addremove --message "Import Lua 5.1.4."
$ hg tag lua-5.1.4
$ hg tags
tip                                5:a71637075ea6
lua-5.1.4                          4:7ada377f7533
lua-5.1.3                          2:ec222e7c1372
$ cd ../../../myproject/
$ hg incoming ../pure-lua
comparing with ../pure-lua
searching for changes
changeset:   4:7ada377f7533
tag:         lua-5.1.4
user:        Glenn McAllister 
date:        Thu May 06 16:30:34 2010 -0400
summary:     Import Lua 5.1.4.

changeset:   5:a71637075ea6
tag:         tip
user:        Glenn McAllister 
date:        Thu May 06 16:30:53 2010 -0400
summary:     Added tag lua-5.1.4 for changeset 7ada377f7533

$ hg pull ../pure-lua
pulling from ../pure-lua
searching for changes
adding changesets
adding manifests
adding file changes
added 2 changesets with 15 changes to 15 files (+1 heads)
(run 'hg heads' to see heads, 'hg merge' to merge)
$ hg heads
changeset:   6:a71637075ea6
tag:         tip
user:        Glenn McAllister 
date:        Thu May 06 16:30:53 2010 -0400
summary:     Added tag lua-5.1.4 for changeset 7ada377f7533

changeset:   4:cf8dc98b5be3
user:        Glenn McAllister 
date:        Thu May 06 16:21:40 2010 -0400
summary:     Local change to Lua README file.

$ hg merge
15 files updated, 0 files merged, 0 files removed, 0 files unresolved
(branch merge, don't forget to commit)
$ hg status
M .hgtags
M src/packages/lua/Makefile
M src/packages/lua/doc/manual.html
M src/packages/lua/doc/readme.html
M src/packages/lua/etc/lua.pc
M src/packages/lua/src/lapi.c
M src/packages/lua/src/lbaselib.c
M src/packages/lua/src/ldebug.c
M src/packages/lua/src/loadlib.c
M src/packages/lua/src/lobject.h
M src/packages/lua/src/lstrlib.c
M src/packages/lua/src/ltablib.c
M src/packages/lua/src/lua.h
M src/packages/lua/src/luaconf.h
M src/packages/lua/src/lundump.c
$ hg commit --message "Pull in Lua 5.1.4"

The use of the hg locate command above is worth commenting on. We use it to list the files that Mercurial tracks in the repo so that we can delete them, but we restrict it with --include to the Lua-related files so that src/packages/README is left alone. Combined with the --addremove option on the commit, this lets Mercurial do the work of figuring out which files have been added, removed, and changed.
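To see why wiping the old files first matters, here is a small conceptual sketch in Python. It is not how Mercurial implements --addremove; it only models the idea of comparing the old set of tracked files with what is on disk after the new tarball is unpacked. The function name and the file names in the example are made up for illustration.

```python
def classify_changes(old_files, new_files):
    """Compare two snapshots of a package tree.

    Both arguments map a relative path to that file's content.
    Returns three sorted lists: (added, removed, modified).
    """
    old_paths = set(old_files)
    new_paths = set(new_files)
    added = sorted(new_paths - old_paths)
    removed = sorted(old_paths - new_paths)
    modified = sorted(
        path for path in old_paths & new_paths
        if old_files[path] != new_files[path]
    )
    return added, removed, modified


# Hypothetical snapshots: the old release ships a file the new one drops.
old_release = {
    "lua/README": "readme",
    "lua/src/lapi.c": "old api",
    "lua/etc/dropped.txt": "gone upstream",
}
new_release = {
    "lua/README": "readme",
    "lua/src/lapi.c": "new api",
    "lua/doc/extra.html": "new doc",
}

added, removed, modified = classify_changes(old_release, new_release)
print("added:", added)
print("removed:", removed)
print("modified:", modified)
```

If the old files were not deleted before unpacking, anything the upstream project dropped would still be sitting on disk, and it would never show up as removed.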

Typically your merges are going to be more complicated. Also, when you have multiple third-party repositories, merging the .hgtags file gets to be a pain. Just remember, when merging that file, take the lines from both versions.
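The "take the lines in both files" rule for .hgtags can be sketched as a small union merge. This is only an illustration in Python, not a Mercurial feature: the function name is made up, the 40-character hashes are placeholders, and "otherpkg-1.0" is a hypothetical second package. It keeps every distinct line once, in first-seen order.

```python
def merge_hgtags(ours, theirs):
    """Return the union of two .hgtags file contents.

    Each distinct non-empty line is kept once, in the order it is
    first seen (lines from `ours` first, then new lines from `theirs`).
    """
    seen = set()
    merged = []
    for line in ours.splitlines() + theirs.splitlines():
        entry = line.strip()
        if not entry or entry in seen:
            continue
        seen.add(entry)
        merged.append(entry)
    if not merged:
        return ""
    return "\n".join(merged) + "\n"


# Placeholder hashes; real .hgtags lines hold full changeset hashes.
HASH_A = "a" * 40
HASH_B = "b" * 40
HASH_C = "c" * 40

main_side = f"{HASH_A} lua-5.1.3\n{HASH_B} otherpkg-1.0\n"
pure_side = f"{HASH_A} lua-5.1.3\n{HASH_C} lua-5.1.4\n"

print(merge_hgtags(main_side, pure_side), end="")
```

In practice you would paste the result into .hgtags while resolving the merge, then commit as usual.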

Summary

This technique allows you to manage third-party packages in a controlled manner in your main repository, and ensures your local changes will be managed correctly. It has drawbacks compared to other methods, but this is often a good solution from an end-user's perspective, as they don't have to do a bunch of work to get your project's dependencies.

Tuesday, May 12, 2009

In the room in front of me I can see...


The following is the result of a writing prompt from the tool I use to keep my novel writing together. I'm finding that I need to work myself back into writing my novel, so I decided to do a writing exercise to warm up. I was given the prompt: Write about the following, "In the room in front of me I can see...". So I did. This is basically a first-draft quality bit, written in about a half hour with some minor editing. It's probably more of an essay than a bit of creative writing. Nevertheless, I'm throwing it out into the wild in the hopes of getting some constructive criticism. Some of you are going to hate it. Some will like it. Most will likely be indifferent. All I ask is that whatever comments you leave, you tell me what you liked or didn't like about it. What pissed you off? What struck a chord? What fell hard on your ears? What left you cold and indifferent? Thanks for taking the time to read.

In the room in front of me I can see the cafeteria at Sick Kids hospital. A place where people can take a short break from the dreary sameness of their hospital room, their desk, their lab. A room full of tables and chairs, where people have lunch or dinner, drink coffee, study, read, and even sleep. Where parents can take their children to look out the windows and see the cars passing by, some impatient and damn close to getting into an accident. To see the old sandstone brick building across the street that is going to be demolished and replaced with a YMCA center. To see the thin trees lining the sidewalk, where people are walking hand in hand, without a care in the world. To see the buses lined up down the street, just around the corner from the bus station. To see the homeless person stagger by, drunk, maybe with fatigue or fear, clutching their mismatched clothes tight to themselves.

It's now time for visitors to go home. But not all parents will leave. They will stay by their children for as long as they can. They will be asked to leave, and most will simply go, promising to return as soon as they can the next day. The child is sad, but if they are old enough, they understand that Mom and Dad can't always be there. If they can, the parents will be back at their child's bedside first thing in the morning. Not everyone can. Some have jobs they have to go back to; they have to make money, just like everyone else. So they work their day, constantly worried about what is happening when they aren't there. Is she OK? What about his tests today? Will they call me with the results?

I can only just imagine what it would be like to have a child in Sick Kids. The fear, the uncertainty. I think it's the not knowing that would get to me the most. Until we really know what is wrong, you feel so helpless. You are there to protect your child. It is your job to keep them safe, to light the way home. When they are sick, all you can do is stand by them, try to keep them happy and not afraid. To understand what the doctor is trying to say, and to make the best choices you can for your child.

What do you do when you hear that it's something you can do very little about? Cancer. Leukemia. Some blood disease that has no cure. I don't really want to know. I'm sure that we'd do our best, keep a bright face in front of our child. We'd be strong. But we'd be weak too. When we are alone. When it's OK to cry. When it's OK to rage at God for what has been done to your child. When it's OK to feel despair, because we all do. It's only human.

Thank God for Sick Kids. For any hospital for that matter. Because they try to make the world OK again. They try to help. And we need help, because we aren't all heroes. Most parents are just people, trying to do right by their children and by themselves. They want their children to be happy. And with some help they will be; you'd be surprised at how much joy a child can find in the world, even when their parents can't.

In the room in front of me I can see boundless love. I can see struggle, and resignation, and despair. Most of all, I can see hope. The hope that gets people through the day, minute by minute, second by endless second as they wait. Wait for good news, and bad.

Thursday, April 23, 2009

The Nightmare Before TR-069

The open source project that I'm currently working on is named LACS. It's an open source implementation of the TR-069 standard. The standard is a way of configuring, upgrading, and diagnosing customer premises equipment (CPE) in a telecommunications network.

At this point, most of you are probably scratching your heads. What is a CPE, and why do I care? You probably don't, but management of network equipment is what I do for my day job. I work on the OAMP team at SOMA Networks, where we create the interfaces that are used to manage our equipment. These interfaces are used to set up the equipment and ensure it is operating correctly.

How does this relate to you? Well, think about the DSL or cable modem you have at home sitting behind your wireless 802.11b/g/n router. That modem is a CPE in network management speak. The network operator (e.g., Rogers, Bell Canada, Verizon) has to manage these CPEs. They need to ensure they are configured correctly, and that they get upgraded when new firmware versions become available to fix bugs or add new features.

Think about how many of these they have to keep track of. In a large network, you are potentially talking millions of CPEs. OK, well that's not so bad since Rogers always hands out the same type of cable modem, right? Wrong. They probably have several types of modems out there; I'm sure the modem I bought six years ago, still pushing bits around in 2009, is not what they are currently using.

So what can you do to manage all these different CPEs? One option is to create different applications that know how to talk to a specific brand of modem from a specific vendor (think Motorola, Cisco, etc.). This application is built by the CPE vendor, and may be provided for free (or not, as the case may be). We'll assume that this application gets upgraded on a regular basis to handle new firmware versions as they become available. If we are really lucky, the application knows how to manage multiple types of CPEs by the same vendor.

At first glance this doesn't seem so bad, but there are some big issues with this approach. Think of it this way. I need to update all of the CPEs in my network to introduce a new feature. I have five types of CPEs out there, and two of them have to be upgraded with new firmware before they can have that new feature enabled. I need to go to two separate applications and tell them to upgrade all of the affected CPEs. Assuming all goes well and all of the CPEs are upgraded over the course of a week (yes, a week or potentially longer), we can start configuring the new feature. I have to go to five different applications and figure out how to switch the feature on. In one application you need to interact with a Windows GUI, and you have to interact with 3 different dialogs before you can start the process. In the next application, it's a basic web app that works only with Internet Explorer 6, and the server is a Sun box. In another, it's a full-blown Web 2.0 Ajaxy thingamajig that works only with IE 7 and Firefox 3, and the server side runs on IBM hardware only. I have to dig through two menus and five screens to get to the right spot.

You can see the obvious problem; I have to figure out how to do the same thing at least five different ways. And that's just the start. I have to be sure that I can track which CPEs have successfully updated and which haven't. I have to keep from overloading my network by trying to configure too many CPEs at the same time; this is a bigger deal when the last mile is a radio link as in WiMAX, rather than cable or fiber. And so on. I have to coordinate everything between five different applications to make sure everything runs smoothly. To boot, a single application can support only so many CPEs. Let's say our Windows GUI application has a back-end that can manage 100,000 CPEs from one vendor. You have 250,000 of these CPEs in your network. So I need 3 instances of that application to manage them. Now do the same thing for the other 4 applications. To put it mildly, life is getting complicated.
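The instance count above is just a ceiling division, which is easy to get wrong by rounding down. A minimal Python sketch, with a made-up function name, using the numbers from the example:

```python
def instances_needed(cpe_count, capacity_per_instance):
    """Return how many application instances are needed to cover
    cpe_count devices when each instance handles at most
    capacity_per_instance of them (ceiling division)."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    if cpe_count < 0:
        raise ValueError("cpe_count must not be negative")
    # -(-a // b) rounds up without going through floating point.
    return -(-cpe_count // capacity_per_instance)


# 250,000 CPEs at 100,000 per instance: two instances are not enough.
print(instances_needed(250_000, 100_000))
```

Repeat that calculation for each of the five vendor applications and the total number of servers to run and coordinate grows quickly.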

Here's another approach. You get the CPE vendors to provide you specifications on their provisioning, upgrade, and configuration interfaces. You create a single server application that provides a single interface for provisioning, configuring, and upgrading the CPEs in your network. This server has a "mediation layer" that knows how to translate operations to the "language" that the CPE understands. This sounds a lot better. I have one place to go to manage the firmware images for my CPEs; one place to go to configure, upgrade, and diagnose them. I still have the problem that I can support only so many CPEs per instance of the application, but the pain factor is cut down significantly.

The biggest drawback is that I have to do a lot of custom integration development to tie everything all together. It becomes a really big deal deciding whether or not you really want to use that new vendor's CPE. Do we really want to sink the time and money into supporting it? It could take months to do design, development, testing, and then deployment into the live system. There are lots of factors in that kind of decision, and determining the real cost of integrating a new CPE into your server application is just one of them.

So what's a poor network operator to do? In comes TR-069 from the Broadband Forum, which defines the CPE WAN Management Protocol (CWMP). In short, it's an application level protocol for remote management of CPEs. It provides a standard way to provision, configure, upgrade, and diagnose your cable modem. It describes an auto-configuration server (ACS) that is used to manage the CPEs using SOAP over HTTP. There is a well defined set of messages that the ACS and CPE exchange to get and set configuration properties, indicate that the CPE needs to get a new firmware image and upgrade, and so on. Keeping it simple, the configuration model is a set of key value pairs. The keys are a dot separated tree, for example InternetGatewayDevice.LANDevice.1.Hosts.Host.1.IPAddress. TR-069 defines a set of standard parameters, and there are other standards that provide additional parameters for various types of CPE devices.
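To make the "dot separated tree" idea concrete, here is a rough Python sketch that turns flat key value pairs into a nested structure. It is only an illustration of the data model, not part of the standard or of LACS; the function name is made up, and the parameter values (and the second, HostName parameter) are assumed for the example.

```python
def build_parameter_tree(params):
    """Turn a flat mapping of dot separated parameter names into
    nested dictionaries, one level per path component."""
    root = {}
    for name, value in params.items():
        *parents, leaf = name.split(".")
        node = root
        for part in parents:
            # Create intermediate levels on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return root


# Values here are invented for illustration.
flat = {
    "InternetGatewayDevice.LANDevice.1.Hosts.Host.1.IPAddress": "192.168.1.10",
    "InternetGatewayDevice.LANDevice.1.Hosts.Host.1.HostName": "laptop",
}

tree = build_parameter_tree(flat)
host = tree["InternetGatewayDevice"]["LANDevice"]["1"]["Hosts"]["Host"]["1"]
print(host)
```

Note that the numeric components such as "1" are instance numbers, so they are kept as ordinary string keys here rather than list indexes.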

Why is this a good thing? It means that if a CPE vendor supports TR-069, your ACS stands a good chance of being able to integrate with that CPE with a minimum of fuss. And you don't even have to write the ACS, since the standard describes how the ACS is going to work; you can just buy one. The ACS still has to be integrated into your larger network management system through its northbound interface, which varies from vendor to vendor, but you were going to have to do that anyway with your custom server. So I get a standardized CPE management server, and I get a vastly simplified integration of new CPEs. Sounds pretty good overall.

What's the downside to CWMP? Well, the ACS servers are expensive. We are talking from $500,000USD to $1,000,000USD for the server, and then $2-5USD per CPE. That rapidly adds up to several million dollars for a medium to large scale deployment. In a small scale deployment, it's probably cheaper to build a stripped down ACS server yourself that handles just upgrades and basic configuration.
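As a back-of-the-envelope check on those figures, here is a small Python sketch that combines the server price range with the per-CPE fee. The function name is made up, the prices are the rough ranges quoted above rather than real vendor quotes, and the deployment sizes are assumed for illustration.

```python
def acs_cost_range(cpe_count):
    """Return a (low, high) cost estimate in USD for a commercial ACS,
    using the rough figures quoted in the text: a server costing
    $500,000 to $1,000,000 plus $2 to $5 per managed CPE."""
    if cpe_count < 0:
        raise ValueError("cpe_count must not be negative")
    low = 500_000 + 2 * cpe_count
    high = 1_000_000 + 5 * cpe_count
    return low, high


# Assumed deployment sizes, purely for illustration.
for count in (250_000, 1_000_000):
    low, high = acs_cost_range(count)
    print(f"{count:>9,} CPEs: ${low:,} to ${high:,}")
```

Even at the low end the per-device fee ends up matching or exceeding the server price once the network reaches a few hundred thousand CPEs.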

The LACS project intends to create the whole TR-069 ecosystem: the ACS with a RESTful northbound interface and SOAP/HTTP southbound interface, and a CPE "stack" with the SOAP/HTTP northbound interface and a RESTful southbound interface. The project was originally going to be just the ACS, but the simple fact is that I'll need a CWMP client (CPE) to test with. Might as well do the whole thing; in for a penny... Also, the RESTful interfaces to the north of the ACS and south of the CPE stack are my decision; they're not part of the standard.

Why am I doing this project? Well, for a couple of reasons. First, I understand the domain pretty well. I've built two different configuration management systems for network elements, and this will be the third. I don't have a conceptual hurdle to get over, meaning that in theory I can be productive sooner rather than later. Second, it gives me the opportunity to explore some technology I'm not as familiar with. I haven't done much, if any, network programming. By that I mean pushing and pulling bytes over a network using sockets. Also, I want to use hierarchical statecharts and active objects for the implementation. (I'll probably discuss what these are in a future blog post. If you are curious, check out the Quantum Leaps website.) While I've been a fan of the event driven approach I haven't really had an opportunity to put it into practice. This is my big chance. RESTful interfaces are the new "big thing" in web services, and I think they have a lot of promise. They certainly can't be any more complicated than SOAP and the host of WS-* standards bolted around it. Third, ... well, I guess I just like the challenge of building something for free that others are charging an arm, leg, stomach, chest, and head for.

So, TR-069 is the answer for CPE management in a broadband network. It's a stable, mature standard with wide acceptance; you can find ACS and CPE stack vendors by the dozens now. It doesn't answer the question of the ACS northbound interface, nor how the CPE is supposed to use the configuration, but I'd argue that is actually a good thing. LACS will eventually be an open source implementation of TR-069, available to anyone who wants to use it.

Monday, March 23, 2009

My First Experience with Open Source

I first got into open source through a project that I developed while at IBM on the VisualAge for Java technical writing team. Specifically, they needed a build system. Why do tech writers need a build system, you ask? Because they had a hell of a lot of documentation. Specifically, they had over 10,000 HTML files to manage. In English. Then toss in the eleven translations of the product and you are talking about 120,000 files that have to get processed and packaged correctly for use in the VAJ help system. I was tasked to create this system because of my background in software development.

After a couple of false starts at building my own web-based mod-perl system on AIX from scratch, I stumbled across Ant. If you've ever done any Java development, you've almost certainly used Ant at least once. It's a Java-based build system which uses XML files to describe how a build is executed. The original author, James Duncan Davidson, had written it while working on Java projects at Sun. He was tired of the cross-platform build headaches he was facing, so he set out to build a system using Java as the cross-platform source. And yes, he really did write the core of it on a plane. It was, as you probably know, a huge success.

Aside: While writing this I went to look up JDD's latest info, and it turns out he's now a full time professional photographer. He's the guy who took the picture of Bill Gates releasing a jar of mosquitoes on-stage at TED09. Cool.

I picked up Ant and started to use it for our project. It wasn't the fabulous web based system that I had originally envisioned, but it had one major thing going for it that my other attempts didn't: it worked. At the end of the day, that's what mattered the most. That is, it mostly worked.

I tripped across the usual mismatches between the documentation and what had actually been done. Not to mention bugs. Let's face it folks, every bit of software out there has bugs. We are used to it, and we either work around it or fix it. And that's one of the things I really like about open source: the code is open and you can fix it yourself. So I did.

But that was where I stopped. I fixed the problems in my local copy of Ant. Then the next version came out, and many of the bugs were fixed. Yay! Except they weren't fixed exactly the same way I did, so I had to spend a bunch of time manually merging my changes (I wasn't using the CVS tree at the time, just the release tarball). Boo.

By this point I had been monitoring the Ant mailing lists pretty avidly. It was great to see this collegial atmosphere, where people could ask questions and get well thought out answers by the people who actually wrote the product. Most of the people on the list were welcoming of newcomers, and were willing to steer people in the right direction. That is, most people were, and a couple were complete jackasses. At the time, these people really got my back up, but they were developers on the project, and I was just a lurker. What did I know? A fair bit, as it turns out, but I didn't necessarily think so at the time.

The list also pointed out that not everyone on the list was strictly equal. That is, there were committers, and everyone else. In Apache parlance, committers are the people who have write access to the source code repository of the project. There's more to the role than that, but it's enough to get us going. In an Apache project, each committer has a veto vote. That is, if they don't like a source code change, they can vote to have it undone. The voting is also used to determine what new features are going to be added, their design, and details of the actual implementation.

I had noticed that people were sending in patches to the list to fix bugs or add enhancements. This was often described as "scratching their own itch," since it fixed something about the project that the submitter had a personal interest in. After having gone through a frustrating time of merging my changes with theirs, I decided to try posting my own patches in the hopes that they would be accepted, cutting down on my overall work. 'Cause at the end of the day, I'm a lazy, lazy man.

I dug into CVS, learned how to check out a copy of the repository, and started hacking. Once I fixed a problem, I figured out how to create a unified diff. From that, I sent in my first patch. As I recall, I checked and rechecked the list a couple of dozen times waiting breathlessly for someone to comment. At some point a committer said "Thanks, we'll look at it later." Crushed, I returned to my build, convinced I had failed.

After a while, however, my patch did get picked up. Yay! No longer crushed. So I submitted another, and it too was accepted. I started to participate more in the email discussions, offering my humble opinion. And it was humble, as I was learning the system and realized that many of the developers on the list had a much better grasp of how things had come to be the way they were than I did. I learned to search the archives of the list before asking a question, to make sure I wasn't covering old ground.

I'm not trying to say I was the perfect citizen on the mailing list. I'm more than sure I made a few gaffes and came off like an idiot at least once. Who doesn't? That being said, I like to think I learned from my mistakes reasonably quickly. Others, however, did not. They were the help vampires, and they started to take up more and more of everyone's time with useless emails. If I recall correctly, this is when someone posted a link to the indispensable essay on how to ask smart questions in an attempt to stop the tide of inanity. If you haven't read it before, do it now. Really, go read it, it's worth the time.

At this point JDD had to leave the project for a while. He had other commitments that kept him from being able to dedicate enough time to the project, so he announced that he would be stepping away for some time and would return when he could give Ant the attention it deserved. I didn't really know who JDD was at the time, so I didn't actually pay much attention to his departure.

Also around this time, I was invited to become a committer on Ant. I was thrilled! I was being asked to take a very active role in the project, with direct access to the project source code. I was still developing my VAJ help system build, so I was actively working with Ant on a daily basis.

Things were going along pretty well for a few months, and then JDD came back on the scene. He had a vision for Ant 2.0. He laid out that vision for us (sorry, I don't remember the details), and we said "no." JDD was metaphorically stunned. Who were we to reject his view of the project he had created? Technically, we committers were the stewards of the project. According to the Apache rules, we each had an equal say in the direction of Ant. Just because James was the one to start it didn't mean he had any more influence. He was not "first among equals." I think that point came as a shock to more people than just JDD. At this point, JDD left Ant and started a new Java build system named Amber.

After a few months, I moved on to another project at IBM, working on the Eclipse help system (this was before Eclipse was open-sourced). I was no longer working on Ant builds, and my participation in the project waned. I left IBM shortly after that to work at a startup, and my Ant involvement dropped even more. I made an effort to get back into it, but working at a startup tends to not leave you a lot of free time. In the end I quietly withdrew from participating on the project, and became an emeritus committer.

So if you want to participate in an open source project, the morals of the story are:
  • you need to read the email discussions regularly,
  • be respectful of people's time by crafting short, readable emails,
  • ask smart questions,
  • don't be a help vampire,
  • scratch your own itch,
  • people working on an open-source project as an offshoot of their day jobs aren't necessarily the best long term participants,
  • be patient, as people are often working on the project in their spare time, so don't expect instant answers to your questions, and the really big one
  • abide by the rules of the community
I can't stress that last point enough. Each open source project has its own community, with its own set of established rules. You need to ensure you understand those rules and are willing to abide by them if you want to be taken seriously. If you can't abide by them, then it's probably not the project for you.

Participating in an open-source project can be a very rewarding and enriching experience. You can gain a lot of experience in the reality of distributed development even if that isn't part of your day-to-day job. It can also offer a way to get those extra skills that your current job just can't offer you. It's a commitment, but a very worthwhile one.

Wednesday, March 18, 2009

First Post

God I hate those words. As if being the first person to add a comment to a blog or article is a great accomplishment. But they are appropriate, given that this is my first blog post. That and I couldn't come up with something witty.

I've wanted to be a writer for a long time, and at one point I was a technical writer at IBM on Visual Age for Java (the precursor to Eclipse) for a couple of years. Unfortunately, technical writing didn't live up to my dreams. Well, my vague musings anyway. I didn't really understand what technical writing was about at the time, so I didn't really have all that many expectations. It was a pretty good job, and I learned a lot about writing in general, not just technical writing. That being said, I missed doing software development and left my dreams of being a writer behind.

About a year or so later in 2001, I wrote an article about Apache Tomcat for Linux Magazine, and landed a short-lived job as their new Java columnist. I wrote three or four articles for them (I think three actually made it to print) before differences in opinion between myself and the editor were a little more than either of us cared for. Since then, I haven't done any professional writing.

But what I really wanted to do was write fiction. Yep, I have artistic aspirations. Not particularly lofty ones though. I'd like to write genre fiction, specifically fantasy. I'm not looking to write the next great Canadian novel, I've just got a story or two inside me that I'd like to tell. I've read a lot about the craft of writing, and I attended a couple of creative writing classes at the University of Toronto (taught by Lee Gowan, from whom I learned a lot) back in 2002 or so. I did manage to get a first chapter written, but not much more. It's amazing how often you can end up re-writing the same chapter, tweaking something here and there.

And where did my career go from there? Well, not in writing. I ended up working lots of very long hours at a startup company and didn't have much in the way of free time to write. After a while, working long hours became a habit and an excuse not to write. An excuse not to fail at something I cared about, truth be told.

I did, and am doing, very well in my career. I actually enjoy software development and that's a lot of what this blog will be about. I've done all aspects of the job, from lowly code monkey to module design, project lead, team lead (think project management), technical lead, and architect. Each part of the job has its appeal. Well, being a code monkey with a crappy assignment wasn't a lot of fun, but it had its moments. And I still try to code a bit every day; I'm a firm believer that staying close to the code ensures you don't forget the practical aspects of trying to implement a design. I've seen my share of elegant-on-paper but really friggin' hard to implement designs and architectures; I've tried hard to avoid making those same mistakes.

A few months ago I was feeling pretty restless. Something was bothering me, but I didn't know what. My job didn't have the same sparkle it used to have. Luckily for me I had a whack of vacation saved up that I was going to lose if I didn't use it, so I basically took off the entire month of December to work on writing. It was invigorating! I hadn't had so much fun in a very long time. I got a fairly detailed outline done in about a week, and then started in writing after that. It got easier each day, and I was able to get more pages written in a day as time went on. I topped out at about five to eight pages. Given some unavoidable interruptions, including Christmas in Ottawa with my parents, wife, and two wonderful kids, I managed to get about 30 pages of the manuscript finished.

No, I'm not going to tell you what the story is about. It's my baby. Mine! Seriously, until I finish a complete first draft no one is likely to see it. Even my wife seeing it is only a 50/50 proposition.

After that first rush, my enthusiasm for writing admittedly waned. Not having several hours of solid writing time, which I spent mostly in coffee shops or at Sick Kids, can be hard. My daughter is almost five years old, and our son is one. The sheer noise level can be overwhelming at times, and getting a solid chunk of alone time with no distractions to write is challenging to say the least. Also, the longer I've been away from writing, the harder it seems to be to get back to it. There always seems to be a reason not to do it.

All that being said, to be a writer you have to write. So, I'm trying some writing that is less personal to help warm me up for those more personal and creative writing sessions. Obviously, that writing is this blog. I'll be working on writing about stuff in my working life of software development. The idea is that the more I write, the more of a habit it is. And hopefully some of that time I dedicate to writing the blog can be transformed into creative writing time.

I probably won't be talking about creative writing all that often here, so if the above hasn't been your cup of tea don't worry. I won't be doing it often. For the foreseeable future I'll be talking about all the stages of software development, particularly how I'm applying what I know to an open source project I've started. Yep, I've got a blog, I'm working on an open-source project, I want to do fiction writing, I have a full-time job doing software development, and I've got a family to fit in there. I think I'm doomed as well, but I'm hoping it will be a fun ride.