Overview
Managing third-party code in your source repository can be a tricky business. Many projects simply choose not to address the problem at all; you are required to download all the dependencies and build them yourself, or perform the equivalent installation of binaries using your distribution's package manager. It's typically up to you to ensure that you are using the correct versions.
What if you want to ensure you know exactly what's going into your application so you don't get broken by something that changes in a package you rely on? What if you know that you are going to have local customizations to those packages that are needed for your application, but may not be accepted back upstream?
The approach that I've seen used successfully is to track most, if not all, of your dependencies in your source tree. At this point several people are saying "What the hell? That will bloat my source tree significantly!" Yes it can and almost certainly will, but there are definite advantages:
- When users download your application source, it's just all there; they don't have to get each and every dependency individually.
- You can include local changes to those packages and not have to invent complicated workarounds for bugs.
The simplest way to manage this is with subrepositories, which aren't particularly well supported by most SCM tools at the moment. The usual workaround is to use a wrapper script that knows how to work with multiple repositories at the same time. The drawback to that approach is that people using your package now have to get the wrapper before they can correctly download everything. They also have to remember to use your wrapper for certain operations, for example cloning, pushing, and pulling. Mercurial does have some support for subrepos, but it isn't a "fully baked" feature at this time.
There is a compromise approach that will work with standard Mercurial and not require any fancy scripts. It's not perfect, but it does cover most of the bases. In essence, you will keep pure copies of the third party source in separate clones of your repository and merge them back into your main repository. Local changes will be kept in the main repository. When you want to upgrade the third party code, you update the pure repository and merge it back into your main repository again. Your local changes are in their own changesets, and you will hopefully be able to merge them cleanly with the updated code. Without having those local changesets, and changesets for the updates to the pure code, it can be very difficult to track which change is which.
Again at this point, some people are likely freaking out and saying "Mercurial doesn't require a master repository, this violates one of the main DVCS principles!" Yes and no. Typically most projects do have a main repository that everyone gets their initial clones from and is considered the "golden master." Even for those that do not, this technique will still work reasonably well.
Note that the MQ extension to Mercurial will often do everything most people want when managing third-party packages and local patches. The approach described here is an alternative if you feel MQ doesn't quite fit your needs or your way of doing things.
Before you begin
One of the bigger drawbacks to this approach is that you need to work from a "clean" repository. If you already have a bunch of code, this isn't going to work very well.
At this point, you are probably saying, "But if I'm just starting out, how do I know what third party packages I'm going to need?" Fair enough, but I'm betting that you are going to have at least one already in mind, and maybe more. To get us started, let's assume we are starting a new project and that we are going to use Lua for our scripting and configuration file needs. This is a non-trivial package that is quite common in a large number of projects.
Basic How To
The basic series of tasks will look like:
- Create your main repository.
- Set up your pure "starter" repository called 'pure-stem' from the main repo.
- Create a pure package repository from the pure-stem repo.
- Import the third party code into the pure repo and tag it.
- Pull and merge the pure repo into your main repo.
Create your main repository
This will depend on your host provider, but let's assume you are managing everything locally. You need to add a simple file into the root of the repository and do a single commit. If you don't, the pure-stem repo you will create later could have a different initial changeset identifier, which is what Mercurial uses to identify whether repositories are related. If the repos aren't related, this technique doesn't work.
    $ hg init myproject
    $ cd myproject/
    $ echo "Let's take over the world." > README
    $ hg commit --addremove --message "Adding initial project README."
    adding README
Set up your pure-stem repository
We need a repository that all of our third-party pure repos will derive from. This really needs to be done before pretty much anything else, as it needs to be completely empty of non-third-party source.
    $ cd ..
    $ ls
    myproject
    $ hg clone myproject pure-stem
    updating to branch default
    1 files updated, 0 files merged, 0 files removed, 0 files unresolved
    $ ls
    myproject  pure-stem
    $ cd pure-stem
    $ mkdir -p src/packages
    $ echo "All third-party source packages are located here." > src/packages/README
    $ hg commit --addremove --message "Setup of pure-stem repository."
    adding src/packages/README
To make life a little easier later, let's pull the contents of the pure-stem repo into our main repo:
    $ cd ../myproject/
    $ hg incoming ../pure-stem
    comparing with ../pure-stem
    searching for changes
    changeset:   1:2b2ed8ef6cba
    tag:         tip
    user:        Glenn McAllister
    date:        Thu May 06 15:57:39 2010 -0400
    summary:     Setup of pure-stem repository.
    $ hg pull ../pure-stem
    pulling from ../pure-stem
    searching for changes
    adding changesets
    adding manifests
    adding file changes
    added 1 changesets with 1 changes to 1 files
    (run 'hg update' to get a working copy)
    $ hg update
    1 files updated, 0 files merged, 0 files removed, 0 files unresolved
Create and Populate the Third Party Pure Repo
Our first third-party package to import will be Lua. For the sake of further examples later, we are going to use Lua 5.1.3 to start. Later on, we'll upgrade to Lua 5.1.4. Assuming we have already downloaded lua-5.1.3.tar.gz to /tmp:
    $ cd ..
    $ hg clone pure-stem pure-lua
    updating to branch default
    2 files updated, 0 files merged, 0 files removed, 0 files unresolved
    $ cd pure-lua/src/packages
    $ tar zxf /tmp/lua-5.1.3.tar.gz --transform s/lua-5.1.3/lua/
    $ ls
    lua  README
    $ hg commit --addremove --message "Initial import of Lua 5.1.3"
    ... lots of files were added ...
    $ hg tag lua-5.1.3
Note that we tag the repo with the Lua release information so we know which version we have at a given point in time.
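The --transform option used above is specific to GNU tar. As a hedged alternative, the sketch below gets the same directory layout by stripping the archive's top-level directory while extracting into a directory you name. It relies on --strip-components, which GNU tar supports and which, as far as I know, bsdtar supports as well. The function name unpack_as is hypothetical.

```shell
# Unpack gzipped tarball $1 into directory $2, dropping the archive's
# top-level directory (for example lua-5.1.3/) from every path.
unpack_as() {
    mkdir -p "$2" && tar -xzf "$1" -C "$2" --strip-components=1
}

# Example use, from pure-lua/src/packages:
#   unpack_as /tmp/lua-5.1.3.tar.gz lua
```

Either way, the point is that the package always lands in a directory without a version number, so an upgrade shows up as changes to the same files rather than as a brand new tree.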
Pull the Third Party Repo into the Main Repo
At this point, we want to pull the new pure-lua repo into our main repo:
    $ cd ../../../myproject/
    $ hg incoming ../pure-lua
    comparing with ../pure-lua
    searching for changes
    changeset:   2:ec222e7c1372
    tag:         lua-5.1.3
    user:        Glenn McAllister
    date:        Thu May 06 16:09:04 2010 -0400
    summary:     Initial import of Lua 5.1.3
    changeset:   3:3dc42e785d44
    tag:         tip
    user:        Glenn McAllister
    date:        Thu May 06 16:09:39 2010 -0400
    summary:     Added tag lua-5.1.3 for changeset ec222e7c1372
    $ hg pull ../pure-lua
    pulling from ../pure-lua
    searching for changes
    adding changesets
    adding manifests
    adding file changes
    added 2 changesets with 104 changes to 104 files
    (run 'hg update' to get a working copy)
    $ hg update
    104 files updated, 0 files merged, 0 files removed, 0 files unresolved
At this point, you now have the Lua source code in your main repository.
Making Local Changes
Now that we have some third party source, let's make some changes to it. This isn't going to be a 100% realistic example; it's just enough to show the principle. Let's change the Lua README file to add our own comment:
    $ cd src/packages/lua/
    $ echo "Local change in main repository." >> README
    $ hg status
    M src/packages/lua/README
    $ hg commit --message "Local change to Lua README file."
And that's it. In theory, you can make as many changes as you want to the package. In practice, you won't. You will typically try to make the minimum changes necessary, if any, to get your project to work.
Upgrading the Third Party Package
Lua's current stable release is actually 5.1.4, not the version we are using. We've looked at the release notes, bug reports, etc. and decided that upgrading to the current version is a good idea. The basic steps we want to follow are:
- Blow away the existing files.
- Untar/unzip the update into the repo.
- Commit the changes and tag them.
- Merge the changes back into the main repo.
Rather than go into the same level of detail as in the sections above, here is the whole session:
    $ cd ../../../../pure-lua/
    $ hg locate -0 --include src/packages/lua | xargs -0 rm
    $ cd src/packages/
    $ tar zxf /tmp/lua-5.1.4.tar.gz --transform s/lua-5.1.4/lua/
    $ hg commit --addremove --message "Import Lua 5.1.4."
    $ hg tag lua-5.1.4
    $ hg tags
    tip                                5:a71637075ea6
    lua-5.1.4                          4:7ada377f7533
    lua-5.1.3                          2:ec222e7c1372
    $ cd ../../../myproject/
    $ hg incoming ../pure-lua
    comparing with ../pure-lua
    searching for changes
    changeset:   4:7ada377f7533
    tag:         lua-5.1.4
    user:        Glenn McAllister
    date:        Thu May 06 16:30:34 2010 -0400
    summary:     Import Lua 5.1.4.
    changeset:   5:a71637075ea6
    tag:         tip
    user:        Glenn McAllister
    date:        Thu May 06 16:30:53 2010 -0400
    summary:     Added tag lua-5.1.4 for changeset 7ada377f7533
    $ hg pull ../pure-lua
    pulling from ../pure-lua
    searching for changes
    adding changesets
    adding manifests
    adding file changes
    added 2 changesets with 15 changes to 15 files (+1 heads)
    (run 'hg heads' to see heads, 'hg merge' to merge)
    $ hg heads
    changeset:   6:a71637075ea6
    tag:         tip
    user:        Glenn McAllister
    date:        Thu May 06 16:30:53 2010 -0400
    summary:     Added tag lua-5.1.4 for changeset 7ada377f7533
    changeset:   4:cf8dc98b5be3
    user:        Glenn McAllister
    date:        Thu May 06 16:21:40 2010 -0400
    summary:     Local change to Lua README file.
    $ hg merge
    15 files updated, 0 files merged, 0 files removed, 0 files unresolved
    (branch merge, don't forget to commit)
    $ hg status
    M .hgtags
    M src/packages/lua/Makefile
    M src/packages/lua/doc/manual.html
    M src/packages/lua/doc/readme.html
    M src/packages/lua/etc/lua.pc
    M src/packages/lua/src/lapi.c
    M src/packages/lua/src/lbaselib.c
    M src/packages/lua/src/ldebug.c
    M src/packages/lua/src/loadlib.c
    M src/packages/lua/src/lobject.h
    M src/packages/lua/src/lstrlib.c
    M src/packages/lua/src/ltablib.c
    M src/packages/lua/src/lua.h
    M src/packages/lua/src/luaconf.h
    M src/packages/lua/src/lundump.c
    $ hg commit --message "Pull in Lua 5.1.4"
The use of the hg locate command above is worth commenting on. We use it to list the files that Mercurial tracks in that repo so that they can be removed. However, we want to remove only the Lua-related files, not the src/packages/README file, so we restrict the listing with --include src/packages/lua. Combined with the --addremove option to the commit, this lets Mercurial do the work of figuring out which files have been added, removed, and changed.
Typically your merges are going to be more complicated. Also, when you have multiple third-party repositories, merging the .hgtags file gets to be a pain. Just remember, when merging that file, take the lines from both files.
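Taking the lines from both files can be done mechanically. The sketch below is one way to do it, under the assumption that you have saved each side's version of .hgtags to its own file (hg cat --rev REV .hgtags is one way to get them). The function name union_lines and the file names in the example are hypothetical.

```shell
# Print every distinct line found in the given files, in first-seen order.
union_lines() {
    awk '!seen[$0]++' "$@"
}

# Example use (file names are placeholders for the two sides of the merge):
#   union_lines hgtags.mine hgtags.theirs > .hgtags
```

The lines are deliberately left in their original order rather than sorted, so the result stays as close as possible to what each repository already had.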