Thoughts for Pull Requests Reviewers
Here is what I think when reviewing code. This covers aspects from how to review, how to comment and how to influence keeping a healthy PR review cycle.
How often do you review code? I think it is a lot of PRs that you review. If you don’t then you should when possible. It is usually mutually beneficial for all levels of engineers. This article is a collection of thoughts on code reviewing PRs. Why do we need to review code in the first place? There are many reasons, most important of which is making sure some human/s took a look and conclude no direct business harming code is getting through. A business harming code could be unsafe code, storing sensitive data on the browser, not encrypting transmission of data and so on… Other reasons to review are ensuring good enough quality code is getting through, the code does what it is supposed to do, readable code, follows correct styling and conventions and etc… In other words, it is a gate keeping mechanism. Some of those reasons listed above can be automated easily especially with the big rise of CI (Continuous Integration), it is very easy and cheap to automate the styling and static analysis aspect of code to ensure a very consistent code. While the other reasons require a human in the loop to evaluate qualitatively whether a PR is worthy of getting pushed. We went through the reasoning briefly. Requiring engineers to wait for code approval certainly slows down the pace of code getting merged to `master` but improves knowledge transfer and quality of code overtime. PRs are written by humans. Being really conscious of how you read, comment and approve code should be your number one priority because at the end of the day, the person who wrote the code you’re reviewing put an effort to solving the problem at hand. Any comment you add should be candid and not abusive in any form. Be considerate of how that person would react to your comments. That does not mean to not comment on anything, please comment when you’re in doubt but also respect that individual’s code. We’ve established that we need to be wary of the person receiving the comments, let’s look into how to review a PR. A PR typically has 2 parts, a summary and a diff. The summary is where you get the context and expected changes present in the PR. It should be read first. I advise against running into the diff right away, code without context is confusing code. If you find that the PR has no summary to denote what the diff is about, please ask the code author to update it with some summary. I usually stop there if the diff is is non-trivial and come back to review only when some context got added. The reason I do it is because it is not worth it to review some code I don’t know why it is there in the first place and the author hadn’t put time to explain it, it just adds to my confusion. Though on rare cases, the title of a PR is enough to convey the changes but that’s the exception not the rule. Cool. As a reviewer, I should try to understand the context to draw a mental model of what am expecting in my head. After you’ve done that. Then jump into the diff to see the code. Now that you’re reviewing, you’ve stumbled across a part that doesn’t make sense. This is where you exercise your probing questions skills. Ask the author in a comment what the code does and how it relates to existing code. Generally we don’t understand code because it is not named well, there is an underlying side effect or is out of context. Please don’t hesitate to ask when in doubt as it maybe a valid doubt that the author may not know the reason/answer for. This way you’ve uncovered some strange thing and can improve it from there. Seldom, not understanding part of the code blocks you from continuing the review and yes don’t continue if you don’t understand because chances are any other reviewer will have the same doubt. This is usually a point of improvement for the author. If you see that it is a pattern from the author, I believe it is better to bring it up with him than keeping it quiet because it could also be irritating for the author that his code is slow to merge without a clue why. Show the author good examples of PRs that don’t have similar issue/s and inspire him to be more documenting/clear/intentional of code. When you stumble upon a code that can be improved, please don't ignore it, I believe it is a duty to mentor others through ways to improve the code to increase the overall health of the codebase. Also don’t just comment the better approach . Show the reasoning behind why your suggestion is better for the situation. Instead of saying: ``` Do it this way …. ``` Say: ``` Here is a better approach to achieve the same outcome. The reason it is better is because we are leveraging this other code that has been well tested over many use cases doing the same as what is intended here. ``` It is up to you to decide how much details you want to give. Depending on the author, more details may convince the author to use your suggestion because you have clearly thought about it. As long as there is details backing up your suggestion then it should be fine. It is a balancing act, different people respond to comments differently. Also be very intentional with your comment, if you wish to block the author’s code from getting through then make that clear. If it is a nice to have then clearly state it. The reason we go through all the hassle of providing details is because you want the other person to be informed that there is some valid reasons to comments and not just because you would implement the code that way. You also want to avoid back and forth having to explain your side of story, back and forth is annoying to both parties causing plenty of context switching as a result. Sweet. If the code is hard to understand because of naming. Suggest better names than just asking the author to name it better. Provide examples of better names that you think would make the code easier to understand. To me, the most important part of clean code is easily understandable code based off well named constructs. So far we’ve covered a lot of ground. Next we cover whether you review just code or logic. Reviewing just code is the easiest way of reviewing where you care mostly about styling and whether the code looks good. This is a shallow form of reviewing, it is very handy for when you want to review as much as possible but I always try to avoid doing that. Logic reviewing in my opinion is more important especially if the code you’re reviewing is code you will touch soon or expected to code review more of in the future. But what’s logic reviewing? It is a term I made up that basically means reviewing not just the code but also how it interconnect, whether it makes sense that the code behaves as is and whether can be improved. Logic reviewing is also a core reason of why we want to review code in the first place. It is for every developer's benefit to have a simple codebase. To you the reviewer, it allows you to understand the patterns used. This can inspire you in your future work and allows you to see pitfalls and advantages. In other words, code review is not just about ensuring good quality code is going through, it is also about learning. Everyone benefits as a result of learning. You learn about the progress of the codebase which is valuable to see what trends are forming and what anti-patterns are getting added. Understanding the logic also helps you see whether the entire diff is atomic or not. Sometimes more code is included than actually needed. This is where you flag it to the author that parts of the diff can be cut off and put in separate PRs. This links to the idea that a PR should be easy to understand as one unit and easily revert-able if goes wrong when merged. Keep in mind the size of the diff. Bigger diffs are harder to understand as a whole and require a lot of time to review. Ensure that you communicate this to the author. You will probably see bigger PRs lingering around more than smaller ones because the size of the diff reduces engagement. Don’t get me wrong, pressure to deliver code justifies the need for big PRs but generally if you divide up your problem into smaller pieces you can do a lot with smaller PRs but not always. I think boilerplate code PRs are fine to be big. Out of all of what was covered, don’t pressure the author to change code unless it can be justifiable and don’t go for complete PR rewrites unless you think it is 100% better. That’s all time spent, try to optimise it for the author. Some authors will feel bad as a result and end up refactoring the entire PRs to fit your needs, try to be clear about what you want changed and not. If you know of something that the reviewer lacks knowledge of, bring it up to the reviewer. This is generally good when asking authors to use a better design pattern, approach and/or utility for future PRs. PR summaries are generally not well detailed. Try to reduce that by spreading the word that better summaries provide better clarity and may make code reviewing less boring encouraging more participation. At the end of the day, it is a crucial business process that we add the code review gate keeping mechanism to our codebase. Make sure to be clear in your communication with the code author and have respect for their time and effort. This a general framework that applies to a lot of situations. Other things that can be considered when code reviewing is whether you need to review all PRs equally, how to approach team PRs vs others and how to allocate time to review PRs. More articles will be written on this subject in the near future.