19 Jun 2008
Using Git for Local Version Control
The Problem
As I work, I like to be able to do very regular checkins to our Subversion server (ensuring the tests pass beforehand, of course). This gives me an easy fallback position if I make some change that causes test breakages. Sometimes it’s just easier to roll back and start again than to try and figure out how I broke it. This is especially the case when refactoring.
We use Crucible for our code reviews, connected to our Subversion server. The problem is that when I want somebody to review a new feature or bug fix, the reviews might need to span a bunch of these fine-grained changesets. Crucible allows you to have multiple changesets in a review, but I don’t want people to have to review every little changeset – I want them to be able to just look at the difference between the start point and the end point.
A Solution
To get around this, I’ve been using Git to track my local changes, a custom script to bundle my changes into a single changeset, then git-svn to commit them to my Subversion server when I am ready.
I’m not going to give you a tutorial on Git here, but here are the basic steps involved:
- Get Git and git-svn.
- Clone your remote SVN repository into a local Git repository. If your SVN repository has a lot of history, this may take a while – but this is a one-off event. Go get a coffee whilst you wait.
- Once it’s done, start coding. At this point it might be worth reading some of the introductory Git documentation to get an understanding of how it works. Commit your changes to your local Git repository as per normal Git usage.
- Periodically rebase to the remote SVN repository using
git svn rebase
This will ensure you’ve got the latest changes in the SVN repository.
- Repeat steps 3 and 4 until you’re happy with your changes. Create branches if you want, but don’t merge between them (we’ll discuss why later).
- When you’re ready to check-in to SVN, first use a custom bash script that was written by my colleague Tom Lee. Why do this? Well, to commit from Git to Subversion, you’d normally just use:
git svn dcommit
However, the problem with this is that it’ll commit a new changeset to SVN for each Git commit you’ve done. We don’t want this. Instead, we want to bundle everything up into a single commit first.
Update: As suggested by one of the commenters, a better way to do this is to use
git rebase -i git-svn
See My git-svn workflow for further information.
The way to bundle everything up is to create a temporary Git branch and merge all of your changes into it. This will squash them into a single changeset, which you can then commit to SVN in one hit. Whilst you can do this manually if you want, Tom’s script does this for you automatically.
Put the script in the same directory as your other Git executables and make it executable too. You can then run it as follows:
git-prepare-svn-commit -m 'Added some new feature'
It’ll bundle everything up into a single changeset with the provided commit message. Note that this script has only been tested with Ubuntu – your mileage may vary.
- Now you can check it in:
git svn dcommit
and it’ll just commit a single bundled changeset to your Subversion repository.
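The manual version of the squash step can be sketched roughly as follows. This is not Tom’s script, just a minimal illustration in a throwaway repository using plain Git: the branch names `trunk`, `work` and `svn-commit` are made up for the example, and `trunk` stands in for whatever ref git-svn tracks in a real clone. In the real workflow, the last step would be followed by git svn dcommit.

```shell
set -e

# Throwaway repository; 'trunk' is a stand-in for the ref that git-svn tracks.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "Existing upstream revision"

# Fine-grained local commits on a work branch.
git checkout -q -b work
for n in 1 2 3; do
    echo "change $n" >> file.txt
    git commit -q -a -m "Small step $n"
done

# Temporary branch starting at the upstream point; squash all the work into it.
git checkout -q -b svn-commit trunk
git merge -q --squash work
git commit -q -m "Added some new feature"

# List what the temporary branch adds on top of trunk.
git log --oneline trunk..svn-commit
```

The temporary branch ends up with a single commit whose contents match the tip of the work branch, while the work branch keeps its fine-grained history.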
Additional Benefits of using Git
Git has a number of features that are very appealing. One that stood out for me is the ability to very quickly and easily create branches locally and then instantly switch between them. Often I would have a number of branches going at the same time, each one for different code enhancements or defect fixes.
In theory I could probably do all of this using Subversion branches, but Git offered one additional advantage – I don’t require a network connection to use it. I regularly work with it offline, only having to find a network connection when I’m ready to upload my changes to Subversion. I also have the added bonus that I don’t end up with a bazillion branches hanging around on the Subversion server.
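As a rough illustration of how cheap local branching is, here is a small sketch in a throwaway repository. The branch and file names are invented for the example, and nothing in it touches the network.

```shell
set -e

# Throwaway repository; all names below are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk
git config user.name "Example"
git config user.email "example@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial"

# One branch per piece of work.
git checkout -q -b enhancement
echo "new feature" >> app.txt
git commit -q -a -m "Start enhancement"

git checkout -q -b defect-fix trunk
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix defect"

# Switching back swaps the working tree to match the other branch.
git checkout -q enhancement
git branch
```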
It’s worth noting that I’ve never merged between local Git branches that originated from the Subversion server – indeed, the git-svn documentation recommends against it (see the ‘Caveats’ section of the git-svn Manual Page). However, this wasn’t ever a problem for me; by committing finished work to Subversion, switching to a Git branch containing unfinished work and then rebasing against the Subversion repository, code could easily be transmitted between branches.
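The rebase-instead-of-merge idea can be sketched like this. It is a simplified model using plain Git only: the `trunk` branch stands in for the Subversion side, and committing directly to it stands in for a git svn dcommit followed by git svn rebase. Branch and file names are invented for the example.

```shell
set -e

# Throwaway repository; 'trunk' plays the role of the Subversion history.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

# Unfinished work lives on its own branch.
git checkout -q -b unfinished
echo "wip" > wip.txt
git add wip.txt
git commit -q -m "Work in progress"

# Finished work lands upstream.
git checkout -q trunk
echo "done" > finished.txt
git add finished.txt
git commit -q -m "Finished feature"

# Pick up the finished work by rebasing the unfinished branch, not by merging.
git checkout -q unfinished
git rebase -q trunk
git log --oneline
```

After the rebase, the unfinished branch contains the finished work and its history stays linear, with no merge commits.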
Drawbacks of using Git
I found Git to have a steep learning curve. In particular, its two-stage approach to commits took a little getting used to. This problem was exacerbated by the fact that I was using git-svn as well. The best example of this was the extreme bewilderment I experienced when I first encountered merge conflicts during a rebase against the Subversion repository.
The best piece of advice I can give you if you encounter such a conflict during a rebase is to do what Git tells you. Git will provide a number of ‘what you can do next’ instructions when it encounters a conflict that it needs you to resolve. Read these instructions carefully and follow them. I tried to short-cut them and ended up hopelessly tangled up.
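For a feel of what following the instructions looks like, here is a minimal sketch that provokes a conflict in a throwaway repository and then resolves it. It uses a plain git rebase against a local `trunk` branch rather than git svn rebase, and the file and branch names are invented for the example. The resolution itself (keeping both lines) is arbitrary.

```shell
set -e

# Throwaway repository with two branches that change the same line.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk
git config user.name "Example"
git config user.email "example@example.com"

echo "original" > notes.txt
git add notes.txt
git commit -q -m "Base"

git checkout -q -b work
echo "my change" > notes.txt
git commit -q -a -m "Local change"

git checkout -q trunk
echo "their change" > notes.txt
git commit -q -a -m "Upstream change"

git checkout -q work

# The rebase stops at the conflict and prints what you can do next.
if ! git rebase trunk; then
    # Resolve the conflict by editing the file, then mark it as resolved.
    printf 'their change\nmy change\n' > notes.txt
    git add notes.txt
    # GIT_EDITOR=true accepts the existing commit message without opening an editor.
    GIT_EDITOR=true git rebase --continue
fi

# To back out instead of resolving, 'git rebase --abort' restores the
# branch to its state before the rebase started.
git log --oneline
```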
Finally, there’s not much Git tool support out there at the moment, so you’re pretty much gonna be doing it all from the command-line.
Credits
Thanks to Tom Lee for breaking a trail to this solution – I knew what I wanted and thought Git might provide a solution, but Tom was the one who actually figured it out and put it all together. He also pulled me out of a few holes that I dug myself into during the learning process.
Andrew Binstock
Posted at 16:59h, 20 June
Good, helpful post. Thanks!
Interesting to note is that SVN’s structure specifically prevents you being able to use a local and remote SVN repository simultaneously, so you’re basically forced to use a second SCM tool to accomplish this. I asked the SVN folks about this in person and they had no solution except something of the form you’ve described.
Dustin
Posted at 17:46h, 20 June
You’d probably be a lot happier if you replaced that shell script with “git rebase -i git-svn”. With this, you’ll be presented with a list of your changes in your text editor and be allowed to squash multiple of them into the changesets you actually want to go upstream. You can also reorder them, edit them, etc…
Chad
Posted at 19:11h, 20 June
Hi Ben or anyone else,
I’ve been using git-svn as my Subversion client for the same reasons you have, and I was wondering if you knew a way to efficiently check out an entire Subversion repository, including branches and tags. Our Subversion repository is around 30 GB, containing nearly 50k revisions. Checking out trunk with git-svn is fine*, but if I try to include all of the branches, git-svn downloads the entire history of the repository for every branch. I let this run on the actual Subversion server (a Core 2 Duo with 2 GB of RAM) for about five days before giving up. Do you know a way to have it follow branches without redownloading the shared history for each one?
Thanks,
Chad
* Checking out trunk takes about 24 hours, and each git command thenceforth pegs the cpu for several minutes, so it clearly could be better… but it’s kind of usable.
thomasl
Posted at 20:43h, 20 June
Andrew: too right. I think it’s a big reason why many open source projects are switching over to DVCS. Funny how times change: very rarely would anyone choose a distributed SCM two or three years ago.
Dustin: reading more about this here:
http://blog.madism.org/index.php/2007/09/09/138-git-awsome-ness-git-rebase-interactive
Sounds like a nice alternative to my nasty shell script. 🙂
Chad: I’ve run into similar problems trying to clone a fairly large, remote repository (the Python codebase, to be precise). Unfortunately I don’t know of anything to remedy this, and the docs aren’t much help. I guess this is the price we pay for relying on the git-svn bridge rather than making a full switch to git. I hope somebody reading this can help you out.
Screwtape
Posted at 20:50h, 20 June
@Chad:
After you checked out the trunk, did you run “git gc”? Unless you’re trying to work with a Git repository on a network-mounted drive across the Internet, or with less than 32MB of RAM or something similarly bizarre, Git shouldn’t be that slow.
Also, you might want to post your questions to the git mailing-list – they’re very friendly, and are quite interested in learning about Git scalability issues like yours.
Pingback:Thomas Lee » Taking the Pain Out of Complex Forms in Rails
Posted at 23:53h, 20 July
[…] other day I was discussing Rails’ form processing behavior with Ben, when the topic of editing multiple associations in a single form came up. Effectively, he needed […]
Pete Moore
Posted at 12:29h, 02 February
Hi Ben,
We are making Crucible a lot smarter wrt what to include in a review for the next major release. But right now there are two things that might make your life easier.
1. Select the commits you want included in your review oldest to newest when creating your review. Your review will then include a diff from previous to oldest revision to the newest of the changeset you selected. Note order is important, it (stupidly) doesn’t work if you select commits newest to oldest.
2. You can tweak the “diff to” version one file at a time to be from the selected version. This is obviously not cool for large changes.
All that said about half the crucible team use git locally. We will also have an alpha of a git plugin for crucible available very soon.
Cheers,
Pete.
Pingback:My git-svn workflow | Shine Technologies Blog
Posted at 10:50h, 17 March
[…] covered my motivations for using git for local version control in a previous post, so I won’t repeat them here. Nor am I going to provide a detailed tutorial on git and […]
joez
Posted at 17:43h, 14 January
That is really what I am looking for, thanks.
The git-svn solution helps a lot with the process of switching from svn to git.
max
Posted at 11:32h, 25 May
Hi there! I’ve been doing the same thing recently, although in my case the motivation was the need to split up an old and crufty Subversion repository into multiple repositories. svnsync won’t work for this purpose since it only works with the root of the repository. svnadmin dump | svndumpfilter | svnadmin load doesn’t work either, because this suite of tools falls over hard if there have been moves/renames into or out of your selected subtree.
Rather than editing the config file, you can also run: git config --remove-section svn-remote.svn
Gil
Posted at 09:05h, 16 May
Old thread but never mind…
You are describing the way version control tools such as Mercurial have been working for years “out of the box”.
So, for people looking at such a solution, please consider changing to a version control tool that really allows this by design instead of using such tricks.
(This article is good though; I’m just saying this approach is not clean, since this is featured by a number of other tools.)