Code reviews are hard but worth it. However, this equation changes when changesets grow too large. Research confirms this: the larger the change, the less effective the review.
So, we should aim to review small code changes.
In this blog post, I will show you a technique, also referred to as stacked pull requests, that I teach in my code review workshops. This technique helps engineers split code changes into smaller and more coherent pieces.
In return, reviewers have an easier time understanding the code under change, and therefore give better feedback in a shorter time.
So, are you ready? Good!
Let’s talk about stacked pull requests
Stacked code reviews work with most of the common tools that are built on Git and allow you to review code, like GitHub, GitLab, Bitbucket.
To start, let’s imagine we are working on a new feature. We want to add the functionality to add and delete items to and from a shopping card.
Let’s also imagine that normally, we use a workflow based on feature-branches for development. This means that we branch off from the main branch to do our work in isolation on a dedicated feature branch. After we finished, we merge this work back into the main branch.
The most straight forward approach now is to start development on the feature branch. We commit frequently and finally open a pull request comprising all our changes.
But, as you can imagine, this is a rather large code change. Most likely, the change will comprise more than 400 lines of code. And 400 lines are known to be an upper bound for the maximum amount of code review change size.
So, if we open a pull request for this change, the only option the reviewer has is to use our commit history to review it in smaller chunks.
Using the commits to review the code, the reviewer can follow our development process, and also reviews smaller changes. But, commits reflect the developer’s perspective, and most of the time aren’t a great way to help the reviewer understand the code.
So, we are not going to do that. Instead, we will use stacked pull requests. You find a comparison to commit based reviews at the end.
So, what are stacked pull requests?
When we use stacked pull requests, we create several individual pull requests based on chunks of the work that make for great reviewable units.
So, instead of coupling our code reviews to our commits, we take on the perspective of the reviewer. To do so, we first think about how we could break up our larger feature into more coherent and easier to understand work items.
After we identified how to split the feature into several work items, we can then create dedicated branches for each of the work items, and stack them on top of each other. Similarly, our pull requests will only highlight the changes that happened on each of the stacked branches.
This way, the reviewer reviews smaller changes, and can easily identify how the work items built upon each other. This approach also helps to keep track of the bigger picture, ie., the whole feature.
Let’s look at an example:
To split our features “adding and removing items from and to a shopping card” into several work items, we can, for example, decide to go with an approach reflecting our system’s architecture.
Based on this decision, we start by making changes to the database layer. Then, we implement the business logic. And finally, we make the necessary changes to the UI.
Working with Stacked Pull Requests
We begin our work by creating a feature branch for the first set of changes (let’s call it
fb-database, for indicating that we work on this branch on the database layer for the feature).
The git command we need for this is the git command to create a branch and check it out:
git checkout -b fb-database
As you can see from the screenshot, we are on the
development branch before we create the
fb-database branch. With this command, we now create the
fb-database branch and also switch to it.
Then, we do our work and make several, frequent commits until we reach a state where we think, the database layer is in a good shape.
git add [somefilesthatchanged anotherfile yetanotherfile]
Now, we push create a remote branch for
fb-database and push our changes to it.
git push --set-upstream origin fb-database
Then, we open a pull request just for those coherent database-changes.
Tip: An even better approach is to open a “Work in progress” pull request as soon as we create the branch. This way, our colleagues can follow our progress and get already a heads-up on our work.
Opening this pull request will look exactly like we would do for any pull request. We open a pull request comparing the code of branch
fb-database to the code on the
This means that reviewers see the changes that we made in comparison to what happened in the development branch. Right now, there is no difference to the example before (except that there are fewer changes that happened in this pull request).
Creating the first stacked pull request
After we finished our first work item, we continue to work on the next work item for our feature. But instead of creating another feature branch based on the development branch, we create a branch that is based on the previous branch fb-database.
This means that all the changes we made on the branch
fb-database, are already part of the code on this new feature branch, i.e.
fb-businesslogic. It also means that we can continue working without being blocked, while others have the time to review our previous changes.
So, now, we work on the changes to the business logic. And again we open a pull request.
But now, this pull request isn’t based on the development branch. Instead, it is based on the branch
This highlights that those changes are connected, and, as a result, the reviewer sees what changed between the
fb-database and the
So, now we implement the UI changes. And once again, we branch off our previous feature branch (
fb-businesslogic) to make those changes. We also open another pull request. Again, the PR points to the previous feature branch
fb-businesslogic, instead of to the
Reviewing Stacked Pull Requests
So, how do we review those stacked pull requests?
The idea is that the reviewer starts with the pull request that was opened first and compares the code on the
development branch with the code on the
fb-database branch. These are also the first changes we made and all other changes build upon those.
As a next step, the reviewer reviews the pull request containing the business logic. And finally, the reviewer reviews the pull request with the UI changes.
So the review direction is:
fb-database -> fb-businesslogic -> fb-ui
Merging stacked Pull Requests
It’s important that reviewers do not merge code changes right after they reviewed them. All stacked pull requests should be treated as Work In Progress pull requests until all stacked pull requests have been reviewed. This is because, we must start merging from the top, i.e., the last opened pull request. Doing otherwise will result in a state where we cannot merge our pull requests anymore.
In our example, once the reviewers reviewed all pull requests, we would merge the UI changes into the business logic changes. Then, we merge the business logic changes into the database changes. Finally, we merge the changes on the branch
fb-database (which now comprise all changes) into the
So, our merge direction is:
fb-ui -> fb-businesslogic -> fb- database
Handling Feedback and Keeping Branches in Sync
One drawback of the stacked pull request approach is the overhead of branches that we need to keep in sync.
For example, if a reviewer comments on our pull request handling with the database changes, we would make the requested changes on branch
fb-database. But this means that we have to update and sync the
fb-businesslogic and the
Let’s imagine we made changes to the
fb-database branch after receiving some feedback. Now, we have to update the
fb-businesslogic branch. We have at least two options to sync our branches:
- merge the two branches (resulting in 1 merge commit on branch
- rebasing the
fb-businesslogicbranch on the changes of branch
For both options, we first have to commit and push the changes to the
fb-database branch. Then, we switch to the
fb-businesslogic branch, which we want to “know” of the changes on
git checkout fb-businesslogic
Sync using git merge:
fb-bussinesslogic branch, we use the merge command. Then, we push the merge commit to the remote
fb-database branch so it’s visible in the pull-request:
git merge fb-database git push
fb-businesslogic are in sync again. Now, we can do the same for
fb-ui. If we have merge conflicts, we have to resolve them first, and then commit and push our changes.
Sync using the git rebase:
We can achieve a similar result with using the git rebase command. So, after we applied our changes to
fb-database and switch to the branch
fb-businesslogic, we use rebase to sync the changes.
If there is no merge conflict, we are fine. Otherwise, we have to resolve the merge conflict, and then continue the rebase using the command:
git rebase fb-database [git rebase --continue] #optinal if you have merge conflicts git push
We looked at a technique called stacked pull requests that can help code reviewers understand larger code changes more easily.
Even if the process looked a bit tedious or scary to you, I hope you take the time to try it out on your own. Only that way you can see whether this technique makes sense for you.
It might be that you need to do it a couple of times (2-3 pull requests) before you get the hang of it. But after that, you will be used to this workflow and have integrated it into your daily routine.
Now, before I let you go, let’s compare stacked pull-requests to commit-based code reviews.
Comparison to commit based code reviews
Most tools, such as GitHub, GitLab, and Bitbucket, provide you with a list of commits that happened within a pull request.
This is handy because reviewers can click on each of the commits, and see what changed just within that commit.
In theory, reviewing the code changes based on commits means that the reviewers review smaller code changes, and also that they can follow the progress of the developer.
The problem of commit-based reviews
Unfortunately, most of the time commits represent the developer’s progress and aren’t meant to ease the review. This means that in some cases the reviewer might look at changes that are undone in a subsequent commit.
Some commits might not even compile, or they aren’t tested. Well, from experience I can say that many developers have a hard time committing coherent, self-contained, and working commits. Commits often just represent progress points of the developer (like breakpoints before leaving the office or going for lunch).
Still, reviewing based on commits is frequently done, because when we face large pull requests we might clutch any straw we can get.
But even if our commits represent great reviewable items, like contain all the changes we made on each individual branch, commit-based reviews still have a main drawback to the stacked pull-request approach:
Reviewers cannot approve, reject, or ask for changes on the level of individual commits. Even commenting on code changes of a singular commit is handled differently from commenting on the whole pull request. Leaving comments on a particular commit might be hard to trace for the code author and reviewer, as they do not show up in the pull request discussion.
Well, I hope I gave you a good summary of the stacked pull request approach. This should serve as another mental model and workflow to deal with code reviews within GitHub, GitLab, or Bitbucket.
Finally, I want to highlight some other blog posts that discuss the topic of stacked pull requests:
Let’s start with an excellent blog post about stacked pull requests from Grayson Koonce. In comparison to what I showed you, Grayson highlights a workflow in which the developer isn’t working on several branches as they do the work.
Instead, they create the stacked pull requests in retrospective by
unstaging all committed changes from a single branch and using
git add --patch command to divide the work.
Tip: When you have large code reviews, it can be a great practice exercise, that I also sometimes use in my workshops, to break up large code change with this technique.
Note though that it is an error-prone and a bit daunting task to manually separate the changes. Still, you can learn a lot about single code coherence, the single responsibility princle, and in general about your habit of mingling and tangling unrelated changes together.
git add patch is a wonderful tool when you are dealing with relatively small and local changes that you want to separate. I use it on a regular basis.
A very similar blog post describes this technique when working with the Phabricator code review tool.
BTW, next to my code review workshops, I also run a newsletter providing you with awesome code review tips every other week. If that sounds interesting get your bi-weekly dose of code review best practices below!