As a team, we developers at Emma try to stick to agile development principles. We’re also big fans of Git and its distributed way of handling source code version control. Because Git is so flexible, it lends itself well to our agile, scrummy way of getting things done. If you are working on a team of more than two or three developers and you’re pondering switching to Git or changing your Git workflow, let me pique your curiosity with how we use Git at Emma to collaborate and maintain the large codebase of the Emma web application.
Remote and local repositories
Let’s start from the center. We have an internal server that hosts our remote Git repositories, with a web interface provided by cgit so we can browse change history easily. Lately there’s been talk of having GitHub host our remotes, but for now, this works pretty well. On the remote server, we have one mainline repository where the production code lives. Each developer also has their own repository where their own changes to the production code are stored.
To make a change, I start by pulling the latest version of the application code from the mainline production repository down to my local repository in a branch called prod. Each of us has a prod branch, designated for this purpose. Any time I start a new project or bug fix, I make sure prod is up-to-date, then create a new branch from it for the new feature. This new feature branch, given a simple name to describe its contents, is where I make my changes.
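Here is a rough sketch of those two steps on the command line. It builds a throwaway sandbox so it can run anywhere; the remote name mainline, the branch name fix-token-replacement and the file app.txt are placeholders I made up for illustration, not our actual names.

```shell
#!/bin/sh
set -e

# Throwaway sandbox; mainline.git stands in for the mainline repository
# on the internal server.
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/mainline.git"
git --git-dir="$sandbox/mainline.git" symbolic-ref HEAD refs/heads/prod

# Seed the mainline with one production commit.
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/prod
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "version 1" > app.txt
git add app.txt
git commit -q -m "Initial production code"
git push -q "$sandbox/mainline.git" prod

# My local repository, with the mainline registered as a remote.
git clone -q -o mainline "$sandbox/mainline.git" "$sandbox/work"
cd "$sandbox/work"

# Step 1: make sure the local prod branch is up to date.
git checkout -q prod
git pull -q --ff-only mainline prod

# Step 2: create the feature branch from prod and switch to it.
git checkout -q -b fix-token-replacement

git branch
```

In day-to-day use only the two numbered steps matter; everything above them exists just to give the sketch something to pull from.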
Committing and pushing changes
As I make changes to the code, I will take occasional snapshots by committing changes locally as I finish parts of the project. If the change is addressing a bug fix or set of bug fixes, I generally commit after each fix. When my code changes are complete, tested and committed — or I just want to make a quick remote backup of the current state — I push that feature branch back up to my personal remote.
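As a minimal sketch of that rhythm, assuming a personal remote registered under the made-up name personal: the script below commits after each fix and then pushes the feature branch. The second commit message and the file contents are invented for the example.

```shell
#!/bin/sh
set -e

# Throwaway sandbox; personal.git stands in for my personal remote.
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/personal.git"
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/prod
git config user.name "Example Dev"
git config user.email "dev@example.com"
git remote add personal "$sandbox/personal.git"
echo "version 1" > app.txt
git add app.txt
git commit -q -m "Initial production code"
git checkout -q -b fix-token-replacement

# Commit locally after each fix.
echo "replace personalization tokens" >> app.txt
git commit -q -a -m "Fixed bug where campaign personalization tokens weren't being replaced"
echo "escape token values" >> app.txt
git commit -q -a -m "Fixed bug where token values weren't being escaped"  # invented second fix

# Push the feature branch to the personal remote. The same command
# doubles as a quick remote backup of work in progress.
git push -q personal fix-token-replacement

# List the commits that exist on the feature branch but not on prod.
git log --oneline prod..fix-token-replacement
```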
We also make sure every commit has a useful message describing the change. This makes browsing the source history less of a hassle. The trick is to clearly describe the change within the 80-character single-line limit. (Of course, you can have a multi-line commit message, but most commit history viewers only show those first 80 characters.) Rather than something general like “Bug fix for personalization weirdness” (one of my own transgressions), the message should say exactly what was done, like “Fixed bug where campaign personalization tokens weren’t being replaced.”
We also use Trac to do most of our project and bug tracking. So, if the current commit addresses a particular Trac ticket, we prefix the commit message with the ticket number, like so: “(#172) Fixed bug where campaign personalization tokens weren’t being replaced.” Because Trac’s version control integration can be pointed at a Git repository, this ties the two together quite well. At the very least, it’s an easy and useful way to refer to more complete details about the change.
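These two conventions are easy to check mechanically. The helpers below are hypothetical (we rely on habit, not on a script): check_subject tests the first line against the 80-character limit, and ticket_of extracts a leading Trac ticket number when one is present.

```shell
#!/bin/sh

# check_subject MESSAGE
# Prints "ok" if the first line of MESSAGE fits in 80 characters;
# otherwise prints the reason and returns non-zero.
check_subject() {
    subject=$(printf '%s\n' "$1" | head -n 1)
    if [ "${#subject}" -gt 80 ]; then
        echo "subject is ${#subject} characters, limit is 80"
        return 1
    fi
    echo "ok"
}

# ticket_of MESSAGE
# Prints the ticket number if the first line starts with "(#NNN) ",
# and prints nothing otherwise.
ticket_of() {
    printf '%s\n' "$1" | head -n 1 | sed -n 's/^(#\([0-9][0-9]*\)) .*/\1/p'
}

message="(#172) Fixed bug where campaign personalization tokens weren't being replaced."
check_subject "$message"   # prints: ok
ticket_of "$message"       # prints: 172
```

Something along these lines could be wired into a commit-msg hook, though whether a hard check is worth it over a team habit is a judgment call.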
Once my change is complete and pushed back to my remote, I let our QA team know I have new code to test, and I tell them the name of the feature branch in my remote. We have an internally built deploy tool (which we’ll talk more about in a later post) that QA uses to deploy any branch from any repository to a test server. Typically I’ll go back and forth with QA as we find and fix issues in that branch. As they find issues, I continue to work inside the same branch, pushing changes back up to the remote for them to redeploy and test.
Occasionally when we’re testing a code change, or I’m still working on a feature, it’s necessary to update that feature branch to include changes to prod that have launched since I created the branch. In this case, Git’s rebase functionality comes in handy. Rebasing works much like merging, except that it makes your repository look like you just created your branch from the newest prod commit rather than creating an ugly pile of merges, deviations and more merges as you catch your branch up with prod. Aside from that, rebasing makes merging launch-ready code into prod much easier and less prone to merge conflicts. The occasional git rebase prod while on my feature branch, plus resolving any small merge conflicts that occur, is all it takes.
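To make the effect concrete, here is a small sandboxed sketch. The branch name, file names and commit messages are placeholders, and the launch is simulated with a direct commit to prod.

```shell
#!/bin/sh
set -e

sandbox=$(mktemp -d)
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/prod
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "version 1" > app.txt
git add app.txt
git commit -q -m "Initial production code"

# Feature work starts from the current prod.
git checkout -q -b fix-token-replacement
echo "token fix" > tokens.txt
git add tokens.txt
git commit -q -m "Fixed bug where campaign personalization tokens weren't being replaced"

# Meanwhile another change launches (simulated with a commit on prod).
git checkout -q prod
echo "version 2" > app.txt
git commit -q -a -m "Weekly launch"

# Replay the feature branch on top of the newest prod commit.
git checkout -q fix-token-replacement
git rebase -q prod

# The history is now a straight line: the feature commit sits directly
# on top of the launch commit, with no merge commit in between.
git log --oneline --graph
```

Because rebasing rewrites the feature branch's commits, a branch that was already pushed to a remote generally needs a forced push afterward.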
Merging and deploying
When everyone is happy with the changes, that feature branch is added to a list of branches ready to be launched. When it’s time to launch a round of changes to the site — typically once a week — one developer will collect those branches, merge them into the mainline production code repository and resolve any conflicts as needed. During large projects like our current PHP-to-Python conversion, merge conflicts are inevitable, but Git’s merge functionality is a magical and powerful thing, so usually this process is fairly straightforward.
Finally, when all the merging is complete and the “release candidate” production branch is tested, we have a set of updates ready to launch! The developer who merged together the changes for the week will tag the launch-ready branch with a version number and send it off to be deployed to the production servers.
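A simplified sketch of launch day might look like the following. In reality the branches come from each developer's remote and the merged result is tested before tagging; here everything lives in one sandbox repository, and the branch names and the version number v1.0.0 are made up.

```shell
#!/bin/sh
set -e

sandbox=$(mktemp -d)
git init -q "$sandbox/release"
cd "$sandbox/release"
git symbolic-ref HEAD refs/heads/prod
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "version 1" > app.txt
git add app.txt
git commit -q -m "Initial production code"

# Two launch-ready feature branches, each touching its own file.
for branch in fix-token-replacement new-signup-form; do
    git checkout -q -b "$branch" prod
    echo "$branch" > "$branch.txt"
    git add "$branch.txt"
    git commit -q -m "Work for $branch"
done

# Merge every branch on the launch list into prod.
git checkout -q prod
for branch in fix-token-replacement new-signup-form; do
    git merge -q --no-edit "$branch"
done

# Tag the launch-ready result with a version number.
git tag -a v1.0.0 -m "Weekly launch"

git log --oneline --graph
```

The branches here never touch the same file, so no conflicts come up; in practice this loop is where the merging developer stops to resolve them.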
Now that a new development cycle has started, when we need to create a new feature branch or rebase a current branch, we can pull the mainline prod branch down to our local prod whenever we need it. It’s not so much something we enforce as it is a good idea that avoids frustration. Plus, it’s a lot easier to trust each other to do good work than it is to get pushy about rules. It’s no fun working on outdated code anyway.
While we usually split up projects into small enough pieces for a single developer, there are times when we need to collaborate on an idea. This is a fairly simple process, since we all have access to every developer’s remote repository. We’ll coordinate who is working on what and, as code is written and pushed up to each developer’s remote, any developer can pull down those branches and merge them into their own work. Once again, the power of Git’s merge capabilities generally makes this a straightforward process.
When collaborating, those involved are communicating changes as they’re being made and pushed up. In practice, we could be pushing and merging changes from each other’s branches a few times in a day or once every few days, depending on what we’re doing. Good communication and frequent merges ensure that work isn’t repeated, and that changes by one developer aren’t interfering with the work of another.
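Pulling a colleague's branch into your own work could be sketched like this. The colleague name alex, the branch name python-conversion and the commit messages are all invented for the example.

```shell
#!/bin/sh
set -e

sandbox=$(mktemp -d)

# My local repository, with a shared starting point on prod.
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/prod
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "version 1" > app.txt
git add app.txt
git commit -q -m "Initial production code"

# A colleague's remote repository, plus their working copy, where they
# commit their half of the project and push it up.
git clone -q --bare "$sandbox/work" "$sandbox/alex.git"
git clone -q "$sandbox/alex.git" "$sandbox/alex-work"
cd "$sandbox/alex-work"
git config user.name "Alex"
git config user.email "alex@example.com"
git checkout -q -b python-conversion
echo "their part" > theirs.txt
git add theirs.txt
git commit -q -m "Converted campaign views to Python"
git push -q origin python-conversion

# Back in my repository: my half of the project.
cd "$sandbox/work"
git checkout -q -b python-conversion
echo "my part" > mine.txt
git add mine.txt
git commit -q -m "Converted mailing views to Python"

# Register the colleague's remote, fetch their branch and merge it in.
git remote add alex "$sandbox/alex.git"
git fetch -q alex
git merge -q --no-edit alex/python-conversion

git log --oneline --graph
```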
Issues we’ve faced
Every so often, we run into issues when working on multiple projects that depend on a lot of the same code. When one project branches from another feature branch rather than from prod, things can get sticky if the new branch is ready to launch before the first, because it still contains the unfinished work from the first branch. This recently came up in our PHP-to-Python project, delaying the release of a project that branched from another that still had known bugs. The best way to prevent this is to plan well ahead of time what will be done in which branches. But, in our commitment to agile development practices, it’s sometimes difficult or impossible to plan that far ahead or know which parts will be ready to launch first.
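The problem is easy to see by asking Git what a merge would bring into prod. In this sandboxed sketch (branch names and commit messages are placeholders), second-project was branched from first-project, so it drags the unfinished commit along with it.

```shell
#!/bin/sh
set -e

sandbox=$(mktemp -d)
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/prod
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "version 1" > app.txt
git add app.txt
git commit -q -m "Initial production code"

# First project: still has known bugs, not ready to launch.
git checkout -q -b first-project
echo "unfinished" > first.txt
git add first.txt
git commit -q -m "First project, work in progress"

# Second project branches from the first instead of from prod.
git checkout -q -b second-project
echo "done" > second.txt
git add second.txt
git commit -q -m "Second project, ready to launch"

# Everything that merging second-project would add to prod. The list
# includes the work-in-progress commit from first-project.
git log --oneline prod..second-project
```

Running that log command before adding a branch to the launch list is a cheap way to spot a hidden dependency like this one.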
Of course, the main Emma web application is only one of many codebases we maintain. For smaller repositories like a server-side utility script or a database configuration, it would be overkill for each developer to have their own remote version. In those cases, it’s generally up to the developer to decide how to work on that project. Sometimes feature branches are made and merged into the master branch. Sometimes we’ll work right on the master branch, using Git to keep a more Subversion-like history.
Given Git’s proven power and flexibility, there are still a lot of things we are looking to incorporate into our development cycle. With the ability to look at any developer’s work, we’ve considered doing code reviews of each other’s work. Right now they’re not a normal part of our development cycle, but they may become valuable as the application and team grow.
We are also working toward making our application more unit-test-friendly and, along with that, adopting a continuous integration workflow that may make us more agile. If we were to do this, we could use one shared remote Git repository where we each push and merge our changes often (preferably daily, as some continuous integration experts suggest). With each push and merge, we would run a whole series of automated unit tests and, assuming all tests pass, we could have production code ready on a more rapid schedule.
Hand in hand with continuous integration would be automated deployment. Our weekly deploys are fairly automated as it is, but we dream of a future where a continuous integration server like Jenkins watches for pushes to a shared remote and, when they pass all unit tests, automatically deploys those changes where QA can test them and where users brave enough for beta software can try out what we’re working on.
All these things are just ideas right now, but the more we grow, the more room we have to experiment with things like this. And using Git makes adapting and trying out these ideas that much easier.
Emma is the first place I’ve worked that uses Git. And, much like it completely changed the workflow here when it was adopted, it has changed the way I look at code changes and version control. It gives us a lot of freedom and flexibility when doing our work, making the idea of agile development a reality while still maintaining a sturdy codebase.