Code reviews are one of the most valuable exercises we conduct in software engineering. They're also among the most painful. Last week, I looked at different approaches and the circumstances where each can be most effective. This week, we'll get into source code management tools, and how the branching or commit mechanisms can affect how easy — or difficult — it is to isolate code for review.
Source Code and Branching
Code reviews are about units of code, usually one or more commits. Using feature branches, or at least review-level branches, makes the review process much easier. It's possible to review at the commit level, but that gets unwieldy fast. My tips for using a source-control system effectively:
- Review one item per branch: This implies using feature branches or other transient branching mechanisms. Using a branch per reviewable item means that any modifications from the review can be corrected on that same branch. It preserves the atomic nature of the branch across implementation and polish.
- If possible, choose a source-control system that facilitates reviewing: Git and Mercurial are good choices here. They're distributed, with sophisticated branching and merging support that simplifies transient branching; Perforce, though centralized, also has solid branching support. Other source-control systems can be used, though they'll make reviewing off transient branches more difficult.
- Use small commits: Smaller sets of code are easier to review, so use small commits to help keep reviews manageable. In addition, make sure commits have a single purpose. For example, don't mix adding a new field to a form with changing an existing notification mechanism. One commit, one purpose.
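The branch-per-review flow described above can be sketched in plain Git. This is a minimal sketch, assuming Git is installed; the branch and file names are invented for illustration:

```shell
#!/bin/sh
# Sketch: one reviewable item per branch, built from small single-purpose
# commits. Branch and file names are illustrative.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
trunk=$(git symbolic-ref --short HEAD)   # default branch name varies
echo "base form" > form.txt
git add form.txt
git commit -qm "Initial commit"

# One reviewable item per branch: a small, single-purpose change.
git checkout -q -b review/add-email-field
echo "email field" >> form.txt
git commit -qam "Add email field to profile form"

# Fixes from review land on the same branch, keeping it atomic.
echo "validate email field" >> form.txt
git commit -qam "Validate email input per review feedback"

# After the review passes, merge the whole reviewed unit back.
git checkout -q "$trunk"
git merge -q --no-ff -m "Merge reviewed: add email field" review/add-email-field
git log --oneline --graph
```

The `--no-ff` merge keeps the reviewed unit visible as a single merge in history, so "what was in that review" stays answerable later.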
Code Commenting Systems
Of course, code reviews generate comments. Sometimes they're feedback: "Refactor this to use the existing utility class NotificationFactory." Sometimes they're questions: "Why do we skip the first element here?" Sometimes they're notes from the author: "I'm not happy with this section. Any suggestions on a better implementation?" These comments are communicated in one of two ways: in a meeting or in a code-commenting system. Code-commenting systems allow the team to prepare for meetings or to perform entire reviews asynchronously. People can comment as they prepare, making any code review meeting more productive or obviating the need for a meeting at all. Examples of code-commenting systems include GitHub's pull-request comments, Gerrit, Phabricator and the like. A good code-commenting tool:
- Allows for per-line and general comments.
- Is Web-based or otherwise accessible to all team members.
- Lets the code under review be updated as comments are addressed.
Atomic Units of Review
The set of code that we review is an identifiable piece. That means it needs to be something that source control can understand is atomic: a commit or a branch. To make code reviewable, then, a team needs to agree on the relative size of commits or branches. As a general rule of thumb, a single reviewable unit ought to be something that can be reviewed in under half an hour. Don't measure it by lines of code — a complex five-line change might take longer than a straightforward 100-line change — but by complexity. Whether your team does this by commits or branches depends on your committing strategy. If your team espouses many tiny commits, then do code reviews by branches. If they tend to hold commits for atomic units of functionality, then review by commits. To measure the appropriate size of a commit or branch, consider these:
- Scale size to complexity: The more complex the change, the smaller the commit or branch should be. Code that's mostly scaffolding can run long; code that changes your core algorithm should be much smaller. An easy gauge is to ask the developer how long the change took to write relative to its size. "A long time" means the commit should be smaller.
- Ensure it's about one thing: If someone asks the developer what's in the commit or branch, the developer should be able to answer without using the word "and." This practice helps to reduce dependencies. Perfectly fine change A doesn't get tangled up with problematic change B. It also makes it easier for reviewers to understand what the code is trying to accomplish when that code is all trying to accomplish one thing.
- Keep it quick: This is particularly true when using branches rather than commits. The longer a branch survives, the further it deviates from the trunk (which continues to get other changes, presumably). As time goes by, the code in that branch gets less likely to simply work when it gets reviewed. Reviewing late leads to comments like, "Well, that was valid, but things have changed, so please redo this section to fit the new way we're approaching things."
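The "one thing" rule is easy to enforce at commit time by staging paths separately. This is a minimal sketch with invented file names, showing two unrelated edits being split into two single-purpose commits:

```shell
#!/bin/sh
# Sketch: untangling two unrelated edits into single-purpose commits by
# staging paths separately. File names are invented for illustration.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
echo "form v1" > form.txt
echo "notify v1" > notify.txt
git add .
git commit -qm "Initial commit"

# The working tree now mixes two unrelated changes...
echo "form v2: new field" > form.txt
echo "notify v2: new channel" > notify.txt

# ...so stage and commit them separately: one commit, one purpose.
git add form.txt
git commit -qm "Add new field to form"
git add notify.txt
git commit -qm "Change notification channel"
git log --oneline
```

When the unrelated edits live in the same file, `git add -p` lets you stage hunks interactively to achieve the same split.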
Automated Tools
Of course, reviews aren't the only tools in our arsenal. Static code-analysis tools can leave us free to focus on the more human aspects of the code review. For example, a code-analysis tool can look for unused variables, leaving the people reviewing the code free to look for logic flaws. Let the tools do the easy work, and let the humans do the thoughtful work. Other automated tools, such as build systems, test runners and coverage calculators, can ensure that tests exist and meet your desired level of coverage. They can also check that the new code didn't break any existing functionality, as measured by your automated tests. To use automated tools effectively in code reviews, make sure you:
- Use the tools before the review: These tools produce information that informs a review, and most things they show as problems ought to be fixed before addressing deeper issues that people will find. Think of it as tidying up before guests come to visit your code.
- Don't overlap tool work and human work: Let the tools do what they're good at, and let humans do what the tools can't. For example, if you're using a tool to look for unused variables or memory leaks, then take it off the code review checklist for humans to perform.
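A pre-review gate can be as simple as a script that refuses to declare code review-ready until the tools pass. The two functions below are stand-ins, not real tools; substitute your project's actual linter and test runner:

```shell
#!/bin/sh
# Sketch of a pre-review gate. The two functions are placeholders for
# whatever static-analysis and test tooling your project actually uses.
set -e
run_linter() {            # stand-in for a static-analysis step
  echo "lint: 0 unused variables, 0 style issues"
}
run_tests() {             # stand-in for a test + coverage step
  echo "tests: passed"
}
# set -e aborts the script if either step fails, so the final message
# only prints when all automated checks succeed.
run_linter
run_tests
echo "automated checks passed: ready for human review"
```

Because the tool output is resolved before anyone reads the code, reviewers never spend their time flagging issues a machine already found.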
Work Tracking
As with any unit of work, the development team must understand what code to review and when. Many teams use a backlog or feature list; code reviews need an analog: a place where the team can track what's up for review, what's been reviewed and needs changes, and what's complete. For most teams, using the same work-tracking system they use for developing new features is just fine. For example, it's easy to set a Jira status of "needs review" and work off that list. Source-control mechanisms like pull requests can accomplish the same thing, and even a whiteboard or a shared chat channel can be effective on very small teams. A work-tracking system needs to:
- Track status of code reviews: What is ready, what is done? What needs re-review and what needs fixing? The tool should show this at a glance.
- Notify the appropriate people: The tool needs to provide notification to the group of reviewers. Code reviews are intermittent, so push notifications are more effective than relying on people to check something periodically.
- Be accessible to the team: Effective code reviews should be handled entirely within the engineering team. The work-tracking system should focus on being accessible and useful for the team, not on reporting to management. While that might happen, it should be considered a beneficial side effect, not the tool's purpose.
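As one example of tracking reviews through source control itself, a team could adopt a branch-naming convention and list the queue straight from Git. The review/ prefix here is an invented convention, not a Git feature; most teams would use pull requests or their existing tracker instead:

```shell
#!/bin/sh
# Sketch: a minimal review queue derived from source control alone.
# Convention (invented for illustration): branches prefixed "review/"
# are awaiting review.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
echo "base" > app.txt
git add app.txt
git commit -qm "Initial commit"
git branch review/add-email-field
git branch review/fix-notification-bug

# The review queue: every branch matching the convention.
git for-each-ref --format='%(refname:short)' refs/heads/review/
```

This keeps the "what's up for review" list accessible to the whole team with no extra tooling, though it can't track re-review or completion states the way a work-tracking system can.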
Code Review Meetings
When it comes to the actual reviewing of the code, there are two schools of thought: One says it should happen in meetings; the other says it should be done individually.
The basic idea of code review meetings is that they let the group discuss and come to consensus. They also ensure that the reviews actually happen by giving the team a set time and place to conduct them. Individual reviews are often preferred by teams that emphasize quiet reflection. This approach allows the developer conducting the review to take the time to really dig into the code. It also helps avoid groupthink and keeps people's calendars free.
In reality, the best approach is a compromise between these two extremes. Individual reviews are great for most changes: While a second pair of eyes provides much benefit, a third or fourth pair doesn't add a lot of value, and individual reviews tend to happen more quickly. Teams that conduct individual reviews often designate a specific reviewer each week; for that person, reviews become a priority work task during that time. However, for particularly complex code or changes with wide-ranging ramifications, getting a larger group together can be beneficial. If the code under review touches four components, for example, it's helpful to have someone who knows each of them look at it, even if that means four different people reviewing the change.
Code reviews are rapidly becoming a common software engineering practice. As adoption increases, the practice of conducting them is getting easier. Using a few simple approaches and tools, you can transform a sometimes painful process into one that is simply part of everyday software development.