We have moved on to git, why haven't you?

11 April 2011

Stéphane Épardaud

by Stéphane Épardaud

We have moved from a historical mix of CVS and SVN to git several months ago. The whole company. Why did we switch and what have we gained? This article will give you a first-person overview of that change and how I feel about it. The huge spoiler is that I feel damn good about it.

[[Wehavemovedontogit%2Cwhyhaven%27tyou%3F-Theprequel]] == The prequel

We have always used Revision Control Systems in Lunatech, for as long as I can remember. And I can remember I started using CVS when I started in Lunatech back in 1998. Over the years we moved our CVS repository once or twice to a new server, leaving old projects behind in old repositories.

Then one day we moved to subversion (SVN) for some reason. I never really understood why, or even exactly what the difference is between the two. I know for instance that CVS was using RCS to manage each individual file’s revisions, and SVN used a single revision number for the entire project. Of course we left the old projects in CVS and only added the new ones to SVN, to make the whole thing more interesting.

But really in the end I used tags in the same way, avoided branches if I could, and made huge commits with tons of unrelated things in there simply because I’m the sort of developer that likes to attain a working state in the project before committing. And that often means it’s too late to split that state into individual smaller commits that show up as more precise commit history and patches.

While I was working on my thesis I came across my first Distributed Version Control System: Mercurial. At the time I was not feeling SVN was lacking in any way, and I had a hard time understanding Mercurial, so I left it at that.

[[Wehavemovedontogit%2Cwhyhaven%27tyou%3F-Earlygitadopters]] == Early git adopters

Later, back at Lunatech while we were still using SVN, and the occasional branching, some guys from the team started using git, and tried to tell us how great it was, tried to sell it to us.

We saw articles on our wiki about how to do this and that in git, and really it looked very complicated, especially compared to SVN. Most of us looked at those pages with mockery: why would we want to waste time doing those complicated things when SVN is so simple to use? We even had diagrams with workflows of how things should be done in git. With arrows, loops and all, I kid you not.

It’s really hard to sell something to people who are happy with their current solution. Especially when the arguments were not clear, or not good, and the customers are not really aware of what they are missing out on.

[[Wehavemovedontogit%2Cwhyhaven%27tyou%3F-WhyshouldIevenbotherwithanewCVS%3F]] == Why should I even bother with a new CVS?

I wasn’t sold on the idea of git, but then I’m often the last person to adopt some new technologies when I feel I am at peak productivity with some technology and learning the new tech would cost me days and doesn’t look much better. I mean, I was a pro with Makefiles, I could write self-generating recursive ones with eval and macros and all that stuff, and then I had to use the severely limited ant? Only to later have to learn Maven? Same thing with IDEs really: IDEs, build systems and CVS are tools we use a lot, we depend on them. And it takes us years to achieve our peak productivity with them. Use them badly and they will slow you down.

Then one day I wanted to install a SVN server on my own server, for some private projects, with remote access. I did that years ago already so I should know how to do it, right? After a few hours trying to make sense of this installation madness, I decided something radical: try git, because with git making repositories should be simple.

And you know what? It is.

$ git init --bare project.git

Hell yeah, now that’s what I’m talkin' about!

[[Wehavemovedontogit%2Cwhyhaven%27tyou%3F-Firststepswithgit]] == First steps with git

So having fixed the server side of using git, I learned about how to import my project in git, and there again a big surprise: you need to add the files you want in there first, and only then commit them. Well, this solves my biggest issue with new projects in SVN: every time I import a new project the second commit is always to remove unwanted files that got dragged along automatically. Also, once you have imported your files to git, the current folder is under revision control, unlike SVN where you need to then check it out somewhere else.

At that point I started using git and used it basically like SVN, nothing more: add, remove, move files, commit, push/pull to/from the remote repo, make the occasional tag, nothing fancy.

Then I got to do things I never even dreamed about with SVN:

[[Wehavemovedontogit%2Cwhyhaven%27tyou%3F-Uselocalbranchesforeveryprojectissue]] === Use local branches for every project issue

Whenever I’m working on a new feature or bug report, I frequently get interrupted by some other issue more important, and I have to switch away from my current unstable state to a stable release and do some hacking. This was hard with SVN, and with git I started with git stash which allows you to remove all your uncommitted changes in some pile somewhere, to get back to a clean repo. But what if you get interrupted twice? At that point I stopped using that strategy and went for local branches.

It’s not only possible to create a local branch (one that is never committed upstream, yet is still versioned and first-class) in git, it’s also ridiculously easy. I create a new branch for every issue, with the name of the issue. Whenever I need to switch to a new issue, I switch branch. If you start the work, then later figure out you need a branch, git doesn’t get in the way and let’s you create the new branch and commit your changes on that new branch as if you didn’t write them initially on a different branch. It just works.

Once you are done with your issue, you merge your branch back in your master branch and are ready to push it. It keeps all the commit messages and shows nothing of the local merge. If you come back to a branch that was already merged, git just knows how to merge it back, taking only the new commits.

[[Wehavemovedontogit%2Cwhyhaven%27tyou%3F-Extractpartofaprojecttoitsownrepository]] === Extract part of a project to its own repository

It happens every so often that just like plants, projects grow too big and we have to trim it down or split it in parts. What we usually do is extract parts of it into modules, especially if they can be reused by other projects. With past-gen version control systems, we just create a new project and import the code we extracted, thus losing all VCS history in the process.

With git I wondered if something smarter was possible, and it turns out there is much better. With git you can extract pieces of a project to a new project, while keeping all history and only the relevant parts of that history. The process takes more than one command, but it works. Just the fact that this is even possible with git is a huge bonus for git.

[[Wehavemovedontogit%2Cwhyhaven%27tyou%3F-Commitonlypartsofafile]] === Commit only parts of a file

If like me you like to only commit stuff when you attain a working stable state in your project, you will often regret that with past-gen VCS you would commit files which contain several changes which might be not directly related, or could be detailed better by a special commit message. Actually in my case I often want to commit first parts of A, B and C, then other parts of A, B and D in another commit.

With git, this is not only doable, but easy. Just use http://www.kernel.org/pub/software/scm/git/docs/git-add.html#interactive_mode[git add -p] to add parts of a file to the _index (the part that will be committed). There is an interactive menu that shows you every change in that file in diff format. If the part is too big you can split it into smaller chunks. If the part is not exactly how you want it, you can edit the diff. Yes, it’s that powerful.

Once you have taken pieces of changes in all the files that make a coherent commit, you can then commit a meaningful change set. Magic.

[[Wehavemovedontogit%2Cwhyhaven%27tyou%3F-Continueworkonanothercomputerwithoutcommittingupstream]] === Continue work on an other computer without committing upstream

I frequently switch between two computer for work. I start working on a feature/bug branch (as described above), then before it is ready I have to continue that work on another computer, and possibly come back again on the first computer before finally merging to master and pushing it upstream.

With past-gen VCS I’d normally use rsync, which does the job, but not fine-grained, because you overwrite not only the branch you’re working on but every other branch as well. With git, you just add a new remote, so that you have origin (upstream) on both computers, as well as a remote for the other computer. So for instance on Computer A I have origin and Computer B as remotes, while on Computer B I have origin and Computer A as remotes.

When I want to move from Computer A to Computer B I push my branch to Computer B, then continue work on the new computer. If I want to move back to Computer A before I’m ready to push upstream, then I can push from Computer B to Computer A and continue there again.

No one else in the world has to know I used local branches to get the work done, and that I used two computers. I can just keep using git any way I want in my workflow before I send my changes upstream, and my workflow is not visible/polluting upstream.

[[Wehavemovedontogit%2Cwhyhaven%27tyou%3F-Rewritehistory]] === Rewrite history

Suppose you’re working on a new feature branch, and between your feature commits you also have bug fixes that you happened to commit as you found them while working on the feature. Now your commit history has a mix of feature and fixes and you want to reorder then so that all the feature commits are applied in sequence, after all fixes. With git you can reorder commits, with git rebase -i.

Rebasing also allows you to do quite incredible things before you push your changes upstream, like merging commits, splitting them or changing them. Suppose you do a first commit with some incomplete changes, then another to finish them. Well, you can merge them. It also makes sense to merge commits if the second commit fixes the first one, so that nobody has to wonder why you introduced a bug only to remove it later on.

Now suppose you look at your commit log before you push them upstream, and you find that some commits are not split up enough into distinct commits. You rebase up to that point and edit the offending commit, which essentially works like a time machine and brings you back at the time you were going to commit the offending commit. At that point you can essentially redo you commit while splitting it up nicely (see above), or even make changes to the code you’re committing if needed, then resume the rebase operation to get back to the current state (by applying back all the commits past the commit you just edited).

Oh, and of course with rebase you can fix the commit messages as well, before you push upstream.

[[Wehavemovedontogit%2Cwhyhaven%27tyou%3F-Extendinggit]] === Extending git

Git has a brilliant plugin system. Brilliantly simple. You just define a command (in Perl, Python or Shell) in your path that is named git-foo and magic: git foo is available and will call your method.

Unfortunately there is little documentation for this feature, aside from a little for Shell extensions, but even with just that I managed to add two commands that integrate with my issue tracker:

  • git jirabranch: creates a new branch for a given JIRA issue, marks this issue as in progress and fetches the name of the issue to pre-fill all commit messages on that branch, so that the issue is always mentioned in there.

  • git jirafix: merges a JIRA branch into master, and marks the corresponding issue as resolved.

You can download those commands if you want to use them.

[[Wehavemovedontogit%2Cwhyhaven%27tyou%3F-Thesingularity]] == The singularity

At some point in 2010 we all were experimenting with git on new projects, teaching ourselves and one another how to do things, and to be fair, very often ending up in an IRC git colloquium with our five git experts discussing how to do something with the git newbies. I have to admit those discussions were often long and frequently lead to no consensus on how to do it in a unique way. But find a way to solve our problem we always did, though more often we found several ways, and couldn’t agree on the best one.

But we reached a point where everyone was convinced that git was a good solution. Although some will argue it is not ideal, we all agree it is usually better and more powerful than SVN and for some, at least not worse. Not because anyone forced anyone to use tools they didn’t want, or weren’t ready to want. But because we all came to realise using our own path (and the friendly help of others) that this was the way to go, and that transition could be gradual (first only use simple workflows, then master the new concepts) and that git didn’t get in the way, but helped a lot.

Then at some point a brave soul decided to install Gitorious for those private projects we didn’t want to host on Github (unlike most of our open-source projects). Gitorious is an open-source git repository with a decent web interface for creating and managing new repositories, most of the features we love on Github. Installing Gitorious was the singularity, really. We reached a point where we could make new git repositories with a few clicks, set permissions on those repositories, get our browsers to look at the code and its history, and we had all the tools needed for it to happen.

So it happened. Not only did we move to git, but this time we moved all our old SVN and ancient CVS projects to git, using the importing facilities that git provides out of the box. Yes that’s right, we kept all our branches, history, tags, you name it, from all our old projects, and they’re all now using git.

[[Wehavemovedontogit%2Cwhyhaven%27tyou%3F-Conclusion]] == Conclusion

I looked at my SVN mailbox this morning, the one where all the SVN commit mails end up, and I realised that since the change in January, not ONE code change has been committed to SVN (or CVS). We all went git-only and until we find a better tool, this is likely not to change. I personally am not looking back one bit.

I have many use cases that are not common, and I won’t claim that they are relevant for everyone, but not everyone works in the same way, and the fact that git makes it possible for us to cover those uncommon use cases is a testament to its power and versatility.

It so happens that we’re using git and not any other DVCS (yes there are others), but don’t get me wrong, I’m sure the other ones are just as powerful as git, but they don’t have any feature that would justify their adoption instead of git for us.

In my opinion, git is like Perl: incredibly powerful, and always more than one way to do things. But the inconvenient truth is that just when you thought you learned all there was to learn about VCS you need to learn a lot more new concepts with git, because all that power requires new skills.

Once you master those new skills, you feel like a better programmer. And hopefully you really are.

We have moved on to git, why haven’t you?