


Version Control Systems

by Ron Murawski

© 2004



Introduction

Version control systems enable developers to maintain a historical record of every file in a project. The project record is usually referred to as a repository. It might seem that the repository would quickly grow enormous as changes are made, but in practice a diff program is used to record only the changes. Using diff, the revision control system needs to keep just one full copy of each file (the latest version or, in some designs, the original) plus a database of the small diff-generated delta files containing all the changes.
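
The delta idea can be illustrated with a minimal Python sketch. It is not how RCS, CVS, or Subversion actually store revisions; the function names make_delta and apply_delta are made up for this example, and the standard-library difflib module stands in for the diff program. The delta records which line ranges to copy from the old version and which new lines to insert, so unchanged text is never stored twice.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Describe `new` as line ranges copied from `old` plus inserted lines."""
    delta = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            delta.append(("insert", new[j1:j2]))
        # A "delete" needs no entry: those old lines are simply never copied.
    return delta


def apply_delta(old, delta):
    """Rebuild the newer version from the older one and a delta."""
    rebuilt = []
    for op in delta:
        if op[0] == "copy":
            rebuilt.extend(old[op[1]:op[2]])
        else:
            rebuilt.extend(op[1])
    return rebuilt


# Two hypothetical revisions of the same file, one line added in the second.
v1 = ["int main() {", "  return 0;", "}"]
v2 = ["int main() {", "  puts(\"hello\");", "  return 0;", "}"]
delta = make_delta(v1, v2)
assert apply_delta(v1, delta) == v2
```

Here the delta holds one inserted line and two copy ranges, so only the single new line of text has to be stored alongside the full copy of v1.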

For programmers new to version control systems the major purposes are:

Version control systems support the concept of branches, where one programmer can branch off into an experimental line of development while others concurrently work on the main branch (trunk) or on other branches. These branches can later be merged back into the trunk, but where edits conflict the merge usually requires a human to interpret and edit the conflict comments that the versioning system generates automatically.



Older Version Control Packages

The grandfather of version control is RCS (Revision Control System). I have not used RCS for many years and I believe that no one is doing any further development on it. Its main audience was small teams of disciplined programmers. Usage was cumbersome in that files had to be "checked out" in order to work on them. If one programmer on a team checked out a file, then others could not work on it until the file was returned (committed) back into the repository. RCS's fatal flaw was its annoying habit of versioning each file separately; there was no concept of a group of changes crossing file or directory boundaries.

If RCS is the grandfather of version control software, then CVS (Concurrent Versions System) is the father. In theory CVS is great: it allows multiple programmers to edit the same files and then automates the interleaving of the changes. When two programmers edit the same line of code, the software flags both edited lines and inserts a warning comment. CVS's biggest problem is that it is based on RCS. Because it inherits RCS's per-file versioning, it is not capable of reverting, as a single unit, a change that crossed file boundaries.



Important Features For Version Control Systems

For large projects, the most important feature for a modern versioning system is that committed changes be atomic. Put simply, a group of editing changes across disparate files and directories should get committed as a single unit. Once commits are handled atomically, it becomes possible to revert the whole project to a previous version. Except for RCS and CVS, all of the below-listed version control systems have atomic commits.
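
A toy illustration of atomic commits, assuming a purely in-memory repository: the class ToyRepository and the file paths are hypothetical and do not reflect how any of the listed systems are implemented. The point it tries to show is that a multi-file change either becomes one new project-wide revision or leaves no trace at all.

```python
class ToyRepository:
    """Hypothetical in-memory repository where each commit is all-or-nothing."""

    def __init__(self):
        self.revisions = [{}]  # revision 0 is the empty project

    def commit(self, changes):
        """Apply {path: new_text, or None to delete} as one unit.

        Returns the new revision number. If any single change is invalid,
        an exception is raised and no new revision is recorded.
        """
        tree = dict(self.revisions[-1])  # edit a private copy first
        for path, text in changes.items():
            if text is None:
                del tree[path]  # raises KeyError if the file does not exist
            else:
                tree[path] = text
        self.revisions.append(tree)  # publish only after every change applied
        return len(self.revisions) - 1

    def checkout(self, revision):
        """Return the whole project as it stood at `revision`."""
        return dict(self.revisions[revision])


repo = ToyRepository()
r1 = repo.commit({"src/search.c": "v1", "docs/notes.txt": "v1"})
r2 = repo.commit({"src/search.c": "v2", "docs/notes.txt": "v2"})
try:
    repo.commit({"src/search.c": "v3", "missing.txt": None})
except KeyError:
    pass  # the half-applied change to src/search.c was discarded
```

Because each revision number covers the entire project, repo.checkout(r1) returns both files as they were together, which is exactly what per-file versioning cannot offer.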

Other versioning features to look for are proper handling of files and directories that have been renamed, deleted, or moved.



Free Open Source Version Control Packages



Comparison of Version Control Systems

I have not tested all of the above-listed systems. What I did find was a version control systems comparison. For me, of all the reviewed systems, it was Subversion that stood out the most due to its full feature set, its ability to restrict access to just one directory per user, and its great documentation. This is a major project that supports deleting, moving, and renaming files and directories. It maintains backward compatibility by using almost the same command set as the venerable and still-popular CVS. There are pre-built Subversion packages available for download for Windows, RedHat, SuSE, NetBSD, OpenBSD, FreeBSD, Linux, Solaris, and Mac OS X. Better yet, there are cross-platform GUI front-ends for the Subversion revision system available from RapidSVN. For Windows-based programmers there is also TortoiseSVN, which is a simple-to-use Subversion client implemented as a Windows shell extension. Best of all from my vantage point is the fact that there is a forthcoming O'Reilly book:

The book is tentatively scheduled for release in June 2004 but the Subversion on-line book is always available at the website.

For the dedicated, there is svk, a Perl add-on. Svk enables Subversion to support replication with automatic propagation of changes to the parent repository. The GUI clients for Subversion also work for svk. The downside to svk is that it takes quite a bit of configuration to get it going, and documentation is very sparse.



Getting Started with Subversion

Setting up the Subversion server was not too difficult, but configuration was finicky. Make one small mistake and nothing will happen when you try to create or query the repository. I'm still having trouble setting up a password-protected repository and I don't know if the problem is my own mistake setting up the server or if it is a Windows 98-related issue. Be advised that the server itself cannot run on Windows 98 because the backend database, Berkeley DB, refuses to run on Win98 systems.

I am currently using Subversion to maintain the Horizon chess engine codebase, some other small projects, and all the files for this web site. I am reporting on my results as they happen.

Windows 98 and Subversion Passwords

Subversion clients in general have trouble running on Windows 98. The two that I found to work are Rapid 0.4.0 and Tortoise 1.0.3. Rapid 0.4.0 runs fine; you can browse the repository, check out directories and make commits. Tortoise 1.0.3 seems almost trouble-free and has some unique diff capabilities.

At one time, my greatest problem was the issue of username-password access to the repository. Many of the GUI client developers say Subversion's design is at fault because, when password protection is enabled and anonymous access is denied, Subversion exits with an error instead of asking for a username/password when anonymous access is attempted. According to the Tortoise mailing list: "Subversion always tries first with the default username (UID or User ID). But there's something wrong with the API function which fetches the UID on Win98. That wouldn't be a problem if Subversion would simply ask for a username in that case (as it does if the UID is not accepted by the server), but it exits with an error." Win98 also gets some of the blame because users do not log on, so the Windows system call that some of the GUI clients use to fetch the name of the currently logged-in user fails. The best workaround for enabling password support seems to be allowing anonymous read access to the repository. With anonymous read access enabled, Rapid 0.4.0 can commit changes for logged-in users. The downside to this approach is that the repository contents are publicly available to anyone with a Subversion GUI client and the proper URL path.

The Tortoise mailing list has workarounds listed for password access but, for a long time, I was not able to make them work. It was a bit frustrating that trying to bring up Tortoise's settings crashed it, since that's where the username/password is set! But, one day, Tortoise started working as promised. The username/password somehow got stored permanently and now everything works wonderfully well. I have no idea what I did to get it to go, but Tortoise has been trouble-free ever since.



TortoiseSVN

A great choice for Win98 is TortoiseSVN. Other than the password problem I experienced in the beginning, I find it to be free of bugs. As I go along I'm discovering some really powerful features. There is a program called TortoiseMerge, which is a high-end diff program. You can input up to three versions of a file and it will highlight all differences and can merge all of these changes into one file. If there is a conflict it allows you to decide which change to use. If three files sounds excessive, just imagine that you have been editing some files. Then, you receive an email with someone's suggested changes. There you are with three versions: your own edited copy, the other emailed version, and the base file in the repository (if the base copy has changed since the last time you looked, then you will have four files!). TortoiseMerge is the solution.
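
The decision a three-way merge makes for each line can be sketched in a few lines of Python. This is only an illustration of the general rule, not TortoiseMerge's algorithm: the function merge_three_way is hypothetical, and it assumes the simplest possible case where all three versions have the same number of lines (edits replace lines but never add or remove them).

```python
def merge_three_way(base, mine, theirs):
    """Merge two edited copies of `base` line by line.

    Assumes all three versions have the same number of lines. Returns
    (merged_lines, conflicts), where conflicts is a list of
    (line_number, my_line, their_line) tuples.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles equal-length versions")
    merged, conflicts = [], []
    for number, (b, m, t) in enumerate(zip(base, mine, theirs), start=1):
        if m == t:          # both sides agree (or neither changed the line)
            merged.append(m)
        elif m == b:        # only the other side changed it
            merged.append(t)
        elif t == b:        # only I changed it
            merged.append(m)
        else:               # both changed it differently: a person must choose
            merged.append(m)  # keep my line provisionally
            conflicts.append((number, m, t))
    return merged, conflicts


# Hypothetical settings file: the repository copy and two edited copies.
base = ["depth = 4", "hash = 16", "book = on"]
mine = ["depth = 6", "hash = 16", "book = small"]
theirs = ["depth = 4", "hash = 32", "book = off"]
merged, conflicts = merge_three_way(base, mine, theirs)
```

The first two lines merge automatically because only one side touched each of them; the third is reported as a conflict, which is the point where a tool like TortoiseMerge asks you which change to use.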

While it's nice to have a high-end diff program available it would seem that it wouldn't be of much practical use. Well, imagine my surprise when I found out that TortoiseMerge is exactly what I needed! Here's the scenario: You are about to commit a directory containing your coding changes. You are about to fill out the little text box that documents your changes... and then you scratch your head because you don't remember exactly just what those changes were. Just double-click on any of the changed files and TortoiseMerge pops up showing you all changed lines side-by-side. So you start to fill in your description and then you double-click on the next changed file, see what changes you made and expand your description a little more. Nice and painless and always accurate. I like that! What could use a bit of work is the default highlighting color scheme. It strikes me as garish and in need of toning down.

RapidSVN

Compared to Tortoise, Rapid is a more standard sort of program. The simple menu system and shortcut bar make it easy to navigate and manipulate both your external repository and your local working copy. It also handles password-protected repositories. But I miss having TortoiseMerge built into the commit logic to help me remember all of my changes.

GUI Client Techniques

When you check out your Subversion repository (a one-time event) into your Windows filespace, creating a working copy, both Tortoise and Rapid create identical .svn subdirectories under every directory of the working copy. A quick look through these files tells me that these are journals of committed changes containing items such as revision dates and checksums, plus copies of the original committed files. These files should only be changed by Tortoise or Rapid. It seems to me that editing them would be disastrous. I would suggest to both Rapid and Tortoise that making these directories hidden would help ensure that users do not modify them.
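
Why a client might keep checksums at all can be shown with a small Python sketch. This is a guess at the general technique, not Subversion's actual bookkeeping or file format: the function names are hypothetical, SHA-256 is chosen arbitrarily, and file contents are held in dictionaries instead of being read from disk.

```python
import hashlib


def checksum(data):
    """Return a hex digest for a file's bytes (SHA-256 chosen arbitrarily)."""
    return hashlib.sha256(data).hexdigest()


def record_pristine(files):
    """Bookkeeping taken at checkout or commit time: path -> checksum."""
    return {path: checksum(data) for path, data in files.items()}


def locally_modified(pristine, working):
    """List paths whose current contents no longer match the recorded checksum."""
    return sorted(
        path for path, data in working.items()
        if path in pristine and checksum(data) != pristine[path]
    )


# Hypothetical working copy: one file edited since the last commit.
committed = {"engine.c": b"int search(void);\n", "README": b"Horizon\n"}
pristine = record_pristine(committed)
working = dict(committed)
working["engine.c"] = b"int search(int depth);\n"
changed = locally_modified(pristine, working)
```

With records like these a client can tell which files need committing without contacting the server, which is also why hand-editing the bookkeeping files would confuse it.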

I am using Tortoise to do all of my commits. I let TortoiseMerge show me the exact changes and use them to build up an accurate description. When I do the actual commit I have a nice Tortoise-assisted description of all my changes.

The speed of commits to the repository is truly astounding. As it says on the Subversion webpage: "In general, the time required for a Subversion operation is proportional to the size of the changes resulting from that operation, not to the absolute size of the project in which the changes are taking place. This is a property of the Subversion repository model." I find Subversion to be substantially faster than CVS on commits; at least 10 times faster -- and maybe as much as 100 times faster!



Subversion GUI Client Comparison

Using Windows 98SE I found the following annoyances:

And here is the good news:

Final Report Card



If you find an error, a bad link, or just want to comment, please email Ron Murawski