December 22, 2008


No More Holy Cows!

Debian and Ubuntu have great packaging tools and a great build process. If you have never been involved in packaging or distro development, but just care about the distro itself and its quality, here’s why they are so great:

  • We have one build process that applies to all kinds of open source projects, be it C or C++ autotools projects, Python modules using distutils, a set of PHP scripts, something involving CMake or using scons, etc. Basically: a meta-build process that our tools work with.
  • We have our own little version control world that works for us no matter how often or in which form Upstream chooses to release their source code, be it code that is never released but available in some revision control system, or a project that just ships .zip files every now and then. It works no matter whether that code includes binary files (which we won’t ship) or stuff we can’t ship because it’s non-free. Basically: we do some sort of meta-versioning that gives us the flexibility to do that.

Distribution-wise we can deal with every problem in the open source world. We have rules that are transparent and that you can understand if you have some time to spare. It’s a very interesting world of its own and an intellectual masterpiece that has evolved over many years.

We have lots and lots of flexibility: we have numerous tools which make it easy for us to adjust to almost any need that comes up during our daily task of making the world a better place. We have a bunch of toolkits on top of said build process, and we even have a bunch of patch systems to add patches on top of the meta-versioning we use internally. Also, there’s a lot of automagic that happens behind the scenes.

Sounds great so far? So why the controversial title of this post? Because in my 4+ years with Ubuntu I’ve reviewed several hundred packages and answered several thousand packaging-related questions, and lately I’ve been wondering about the nature of my answers on IRC, in Bug and Packaging Jams, in email or when writing documentation. With every answer I gave, I could feel the speed bump for new contributors (no matter what their background) more strongly, to the point where I was almost feeling it physically.

To us it all makes sense; it’s all there for a reason. I’m challenging that. Wherever we have difficulty explaining what we’re doing, or need to explain three or four different concepts before we can even say what a tool is for, we should make a mental note to bring it up with somebody else and maybe do something about it. Legacy should not be the only argument for keeping something.

In addition, I have seen so many package uploads in the last four years whose changes were bookkeeping only and had no effect on what my mother uses her computer for. Add to that months’ worth of discussion that gave newcomers the impression that the bookkeeping is exactly what matters when working on Ubuntu or Debian.

So what are we good at? What is the actual worth of a distribution?

We’re incredibly good at integrating things. We’re good at adding value to Upstream code by gluing it together, by making it shine, by making it more prominent, by adding documentation. I work with lots and lots of developers every day and I have mad respect for each and every one of you. Staying on top of what our users want and what happens upstream and in other distros is a great job. Very often it’s the job of a very technical ambassador. If you like working with people, you’re good at communicating, have a knack for making things work again and love technology, you should definitely think about getting roped in.

So what can we do to help us concentrate on what we’re good at and what brings value to our users?

That’s the million dollar question and I’m not going to give sufficient answers to it.

There are a few things that we desperately need to make easier in order to speed up development and get more people (new developers and upstream developers alike) involved.

In my ideal world, I’d not bother with figuring out where to get source code from. I’d use my favourite revision control system for everything - we’re in the 21st century!

Let’s say gcalctool upstream tells me to get the latest patch from their stable release because it fixes a bug for which they have already received 256 duplicate reports. Here’s what I want to be able to do:

myrcs init-repo gcalctool; cd gcalctool
myrcs get ubuntu/latest/gcalctool ubuntu-latest
myrcs get upstream/latest-stable/gcalctool upstream-latest
cd ubuntu-latest; myrcs merge ../upstream-latest

Then I’d do whatever building, testing, diff-ing I like and just run

myrcs push ubuntu/latest/gcalctool

afterwards.

Why? Because I don’t want to

  • spend time looking up where to get source from.
  • bother converting the source into yet another source format that another tool can deal with.
  • write a bookkeeping changelog entry for something that is already well-documented by upstream.
  • invent a version number for something that has sufficient versioning information already.
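The first point, not having to look up where source lives, boils down to a lookup from human-readable names to locations. Here is a minimal sketch of that idea in Python. Everything in it is an assumption made for illustration: `myrcs` does not exist, the `REGISTRY` table and the `resolve` helper are hypothetical, and the URLs deliberately use the reserved `.invalid` domain so they cannot be mistaken for real hosts.

```python
from typing import Dict

# Hypothetical registry: human-readable branch names -> source locations.
# The URLs are placeholders on the reserved ".invalid" domain, not real hosts.
REGISTRY: Dict[str, str] = {
    "ubuntu/latest/gcalctool": "https://code.ubuntu.invalid/gcalctool/latest",
    "upstream/latest-stable/gcalctool": "https://code.upstream.invalid/gcalctool/stable",
}


def resolve(name: str) -> str:
    """Return the location registered for a human-readable branch name."""
    try:
        return REGISTRY[name]
    except KeyError:
        # Fail loudly instead of guessing where the source might live.
        raise LookupError(f"no known source location for {name!r}") from None


if __name__ == "__main__":
    print(resolve("ubuntu/latest/gcalctool"))
```

The point of the sketch is only that the name is all a contributor should need to remember; maintaining the table is the distribution's job, not the newcomer's.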

What do I do today in 2008?

  • Remember where to get upstream code from.
  • Get their stuff using one specific revision control tool, find out what the latest-stable branch is.
  • Get the source package from the Ubuntu archive. (If I’m not an Ubuntu developer I download a .orig.tar.gz, a .diff.gz and a .dsc file from somewhere that is not easy to discover.)
  • Use the one specific revision control tool to extract whatever the last commit was, write it to a patch file.
  • Find out if the source package uses a patch system and somehow mangle the patch into it.
  • Write a debian/changelog entry that documents what upstream has documented in their ChangeLog (and the revision control log) already.
  • Convert it into a source package.
  • Upload it to the build daemon.

Even if we use revision control for everything and have a great dictionary that knows where all the source code lives and attaches human-readable names to those locations, we’re still going to run into something like:

$ myrcs pull ../upstream-latest/
ERROR: These branches have diverged. Use the merge command to reconcile them.
$ myrcs merge ../upstream-latest/
ERROR: Branches have no common ancestor, and no merge base revision was specified.

Why is that? Because the history of distro packages is very different from the upstream history. In a simple case, you’d have something like: 1.0-0ubuntu1, 1.0-0ubuntu2, 1.1-0ubuntu1 vs. r101, r102, r103, r104, r105, r105.1, r105.2, etc.
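To make the "no common ancestor" error concrete, here is a small Python sketch, not how any particular revision control tool implements it. It models each history as a mapping from a revision to its parents and looks for revisions reachable from both tips. The revision names come from the example above; the parent links and the `common_ancestors` helper are assumptions made for illustration.

```python
from typing import Dict, List, Set

# A history maps each revision to the list of its parent revisions.
History = Dict[str, List[str]]

# Distro packaging history: one upload after the other (assumed to be linear).
distro: History = {
    "1.0-0ubuntu1": [],
    "1.0-0ubuntu2": ["1.0-0ubuntu1"],
    "1.1-0ubuntu1": ["1.0-0ubuntu2"],
}

# Upstream history: one commit after the other (assumed to be linear).
upstream: History = {
    "r101": [],
    "r102": ["r101"],
    "r103": ["r102"],
}


def ancestors(history: History, tip: str) -> Set[str]:
    """Return the tip plus every revision reachable through parent links."""
    seen: Set[str] = set()
    stack = [tip]
    while stack:
        rev = stack.pop()
        if rev not in seen:
            seen.add(rev)
            stack.extend(history.get(rev, []))
    return seen


def common_ancestors(history: History, tip_a: str, tip_b: str) -> Set[str]:
    """Return the revisions that both tips descend from (possibly empty)."""
    return ancestors(history, tip_a) & ancestors(history, tip_b)


if __name__ == "__main__":
    # Two unrelated histories thrown together: nothing to base a merge on.
    unrelated = {**distro, **upstream}
    print(common_ancestors(unrelated, "1.1-0ubuntu1", "r103"))  # set()

    # If the first upload had been recorded as a child of r101,
    # a merge base would exist.
    related = {**upstream, "1.0-0ubuntu1": ["r101"]}
    print(common_ancestors(related, "1.0-0ubuntu1", "r103"))  # {'r101'}
```

A merge needs at least one shared revision to work out what each side changed. Package uploads are recorded as a chain of their own, so that shared revision simply is not there.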

Even if we had a magic tool that would merge two sets of history into one we’d have the trouble of files that turn up in one of the branches and not the other, like automatically generated files in the release tarball (configure script, Makefiles, etc.).
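The generated-files problem is easy to see once both trees are reduced to sets of paths. The Python sketch below is only an illustration: the file names are typical of an autotools project but made up for this example, and `compare_trees` is a hypothetical helper, not part of any existing tool.

```python
from typing import Set, Tuple


def compare_trees(tarball: Set[str], checkout: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (paths only in the release tarball, paths only in the checkout)."""
    return tarball - checkout, checkout - tarball


# Illustrative file lists; a real project would have many more entries.
release_tarball = {"configure", "Makefile.in", "configure.ac", "Makefile.am", "src/main.c"}
vcs_checkout = {"autogen.sh", "configure.ac", "Makefile.am", "src/main.c"}

if __name__ == "__main__":
    only_tarball, only_checkout = compare_trees(release_tarball, vcs_checkout)
    print(sorted(only_tarball))   # ['Makefile.in', 'configure']
    print(sorted(only_checkout))  # ['autogen.sh']
```

Every path in the first set would show up as an addition or a deletion in a naive merge, even though nobody actually edited those files by hand.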

The next use-case should work exactly the same: somebody tells me that Fedora has an interesting patch for a bug that our users complain about. One golden day in the future I want to run:

myrcs init-repo gcalctool; cd gcalctool
myrcs get ubuntu/latest/gcalctool ubuntu-latest
myrcs get fedora/latest-stable/gcalctool fedora-latest
cd ubuntu-latest; myrcs diff . ../fedora-latest

to find out how we can fix the issue.

Today we have too many speed bumps and too much frictional loss. It’s simply no fun to

  • stay on top of what’s happening in various very separate worlds.
  • try to improve things as an outsider.
  • do bookkeeping.

Have a great day and a happy new year!


My 5 today: #310334 (meta-gnome2), #294604 (zabbix), #310290 (cups-pdf), #309529 (epiphany-browser, gnome-desktop-sharp2, avant-window-navigator, icewm, go-home-applet, nautilus-open-terminal, gnome-launch-box, gnome-main-menu, quick-lounge-applet, awn-extras-applets, netbook-launcher, galeon), #303264 (cheese) Do 5 a day - every day! https://wiki.ubuntu.com/5-A-Day
