Code Reviews are Changes Too

Consider the code review. It's part of many people's daily development practice. Easily available code review tooling has allowed us to review code far more often, and has helped us improve and maintain the quality of our software.

But it could be so much better.

# A typical review-based development process

Consider a typical code review workflow.

  1. The author of a change submits a patch.
  2. A reviewer makes some comments about the change. A discussion ensues, concluding in a few additional changes being made to the patch.
  3. In parallel, a CI system builds and tests the change, and puts a green checkmark next to the patch for everybody to see.
  4. The reviewer observes that CI passed, and that all problems were resolved to their satisfaction.
  5. Someone applies the patch to the main codebase.

This is implemented in many forms, but is often mail based (like the Linux kernel) or web based (like Github). Everything here is assuming some git-based workflow.

# What have we wrought?

Having gone through all of this, what do you end up with?

  • A high quality update to the codebase
  • A merge commit bearing the name of the person who applied the patch
  • If you care enough (we do), digital signatures on all the source and merge commits.

But it's missing quite a lot.

  • The code review correspondence is sitting over in another system. Perhaps it's a mailing list archive, or perhaps it's a database run by a SaaS tool vendor. Regardless, it's not directly represented in your history, and it's not signed.
  • The test results and artifacts are elsewhere as well, perhaps in a different system.
  • In order to find these things, you have to know where to look. The merge commit does not actually reference everything that went in to it.

# A merge commit is an attestation

A signed merge commit should be a statement to the world: "I certify this change is a good one." What's currently missing from that is any notion of how you knew it was a good change. All the information generated as part of the review process should be treated as information contributing to the merge. In git terms, it's all relevant content, and should have an impact on the merge commit's SHA.

For code review, the messages (the correspondence of the review) and context of the discussion (the part of the code they are talking about and which revision) should be preserved in the same way as code changes. All comments in the code review should be signed by their authors.

The same principles can be applied to tests as well, but with more moving parts. The definitions of the tests and the test results are all content, of the same class as review comments. They should be signed in some way, with a key that is accessible to the CI infrastructure. A more thorough setup might consider the content hash of the toolchain used to build the software as content as well, or perhaps even a hash of the entire CI system image.

All of this information should be visible as part of the merge commit, allowing the merger to say "THIS is how I know the change was good", and to apply their digital signature to that evidence.

# A straw-man technical approach

This idea is currently little more than that: an idea. But I have a simple approach that is fairly sound, and has some legs.

Consider the requirements laid out above: it is clear that some of the artifacts we want in the history of a merge commit really don't belong in regular source control. Or at least not in source-control proper. But there's no reason they couldn't live in a different git repository.

A simple, general way to implement this could be to introduce a standard way to talk about cross-repository links as a commit message footer. It should contain the SHA of the commit on the other end, ensuring the content-dependency. It should also contain a tag describing what kind of content is found on the other end -- is it test evidence, a review, or something else entirely?

My Great Merge Commit

Link: Kind=TestResults Url=https://github.com/auxoncorp/proj-test-results.git commit=034c7bd80f323e8cc1e354e6d4c44057a64a7042
Link: Kind=Review Url=https://github.com/auxoncorp/proj-reviews.git commit=5fd25c35b3ef5a7750704c4ee506e94334cdcf57

The scheme is fairly generic. It also paves the way for a simple verification tool, which can walk the dependencies, check signatures, and perhaps check that required tags are present based on pre-configured rules.

# Certified Development

With a system like this, you can audit the history of a Git repository to see if a particular development process was followed. All the required information is available in the commit log. This compares very well with typical documentation-based approaches, which amount to nothing more than a piece of paper saying that the process was followed.

It is also clearly much better than typical workflow-based process enforcers in terms of the resulting artifact. At the end of development, you have a git repository that fully represents everything that went into the code. This is where the information belongs, not in a third party database.

This technique can scale beyond direct implementation development, and into requirements management and design processes as well. If you can manage your requirements and design documents in the same way as source code (and you REALLY should, if you aren't already), then you can track revisions to those documents in the same fully traceable and verifiable way. A verification procedure can check that you have performed any required design updates that may have been required by a requirements change; this kind of process is required by functional safety standards.

# Summary

By bringing your requirements, designs, code, tests, and reviews into Git's content addressable storage system, you get many benefits:

  • Full revision control of all documents, with digitally signed change tracking
  • A uniform, traceable, and signed review process
  • A canonical record of all changes to your software along with everything that went into them
  • The ability to mechanically verify that required processes were followed along the way

This is a vision, really, a description of what could be. I hope that it spurs some thinking about your own development process, or perhaps even helps a future tool-maker to consider their goals a little bit differently.