New Fun Blog – Scott Bilas

Take what you want, and leave the rest (just like your salad bar).

Archive for the ‘code reviews’ Category

Peer Code Reviews: How Did We Do?

with 6 comments

This is the sixth in a series of posts on our peer code review process at Loose Cannon Studios. Here’s the full series so far:

Time to analyze how the process worked out. Did we get the benefits we expected? Did we run into the problems we predicted we would? What new things came up that surprised us? Or did it all just devolve into a nit-picky passive-aggressive waste of time?

I’ll compare the last twelve months, when we were doing reviews as described in my “success” post, with the six months before that, when almost no reviews were being done and it was the wild west described in the “first attempts” post.

Code Reviews Strongly Recommended

Given the number of posts I’m doing on this subject, it should be obvious that I consider our peer code review process to be one of the most vital and successful things we’re doing at Loose Cannon. In fact, I’ve come to believe that code reviews should be a part of every studio’s process.

Not the way we’re doing it, necessarily. Our process was the result of planning by senior team members plus a lot of ongoing feedback from the team to keep it relevant and effective. So it’s necessarily specific to our culture, personalities, and projects. Another studio may need different tools. Or they may ditch the primary reviewer group, or go with pair programming instead.

But I think every studio needs some kind of code review process. I’d do it even with a team size of two.

Revisiting Our Goals

First things first. Let’s revisit our original goals and see if we hit them. Did code reviewing live up to its promises? The short answer is a definite “yes”. Here’s the list from the first post:

  • Share Knowledge
  • Catch And Correct System Misuse
  • Raise The General Quality Of Code
  • Mentor Junior Engineers
  • Educate About And Enforce Standards

In all of these areas, we had significant improvement.

Share Knowledge

This was a clear win, but there are tradeoffs that highlight the differences between using a real-world conversation and a virtual-world tool. You could say that, ideally, sharing knowledge is best done face-to-face. In person, you get a tight feedback loop that quickly cuts off irrelevant discussion. And obviously speaking and listening are a lot faster than typing and reading, so you can cover a lot more ground.

Yet I already covered how, in our first code review attempts, face-to-face reviews had logistical issues. But more importantly, in the context of knowledge sharing, the spoken word doesn’t get a permalink. None of it gets captured. It can’t be referenced or linked. It can’t be adapted into a wiki page. It has no recorded history and can’t be searched. It’s inherently private, not public, so we get a lot of duplication, on top of the warping that happens when it turns into the “telephone game”. And it gets more difficult and time-consuming the more people are involved. Lots of problems.

Once the conversation is over, it can only live on as tribal knowledge.

With a code review tool, every comment gets a permalink and is publicly viewable and searchable. Everything gets captured by design. And it’s captured directly in context, right with the lines of code that are being discussed.

I saw many examples of knowledge sharing in our comments. Most of it was casual – throwing out a bit of info about how a function works or what a particular order of calls should be. I gave some examples of these in my post on Good Commenting Practices. But often it was more structured. The reviewer knew that there was an opportunity to explain in detail some nuanced part of a system and took the time to write a fuller description. A frequent reply I saw was “oh! I didn’t know you could do that!” Cool.

Reviewees often pre-commented their reviews with questions. Even though they already had a working, tested change, they wanted to know more about the systems they were using. This is great – so often I see engineers not caring much about how a system works, just wanting to hit “.” and flip through the Intellisense list to look for a relevant function to call. In a review they’re able to ask questions when it’s already on their minds, in context. This happens much less often in casual conversation!

Sharing went the other way as well. As I reviewed systems under development I learned more about how they worked. Through the small snapshot of a single change I wouldn’t usually understand much. But over time, as I watched things evolving, and reviewed different parts, I eventually got a pretty good idea of how parts of the game worked that I normally would have little contact with. I also learned how different engineers tended to do things, and learned some new techniques along the way. When it came time for me to go into any given system to help debug it at the end of the project, I usually had some familiarity with it and knew what I was getting into.

Catch And Correct System Misuse

This is a special case of Share Knowledge, and was clearly improved with the new process. In particular, the ability to easily add multiple extra reviewers for their domain-specific knowledge was a key advantage. There are many examples in our review database of a domain expert chiming in on something like where to put new system-wide Nant properties, or how a particular change to a state machine isn’t taking into account some dependency. People also got comfortable bringing in new reviewers with domain expertise and directing them to some extra-grognardy part of their change for a focused review.

It was also a good way to detect where we had bad API design. This seems obvious in retrospect, but it wasn’t something I was expecting. If people keep getting confused and doing the wrong thing across multiple reviews when using a given system, it means there’s something very wrong with it.

For example, a couple years ago I wrote a set of four array classes, each meant to be used in a very specific way. After doing a lot of reviews involving usage of these classes, it seems that most people still don’t understand the difference among them despite heavy documentation and regular reminders in reviews. Time to rewrite. I get a much stronger message that way than I would hear in casual conversation.

Raise The General Quality Of Code

This is an area where I saw huge improvement. It really was the wild west before we started doing reviews again. Everybody had a different style and used different patterns, and the code looked like a mess. Parts are still a mess now, of course, but I feel like we’re mostly rowing the boat in the same general direction.

Beyond just stylistic improvement, the overall quality of changes went up. I seriously think this is just peer pressure at work. If you know that someone is going to be seriously looking over your code, and not just complaining when it crashes the game, you will take more time to get it right before considering it ready to go. As a user of code, I’d normally only care about its public interface and performance characteristics. As long as it is self-documenting and works as it should, I shouldn’t care a whole lot about what’s going on inside. But as a reviewer, I’m looking deep inside the implementation, and the reviewee knows it.

This also lets me more easily become a maintainer of the code in the future if necessary, like if the owner goes on vacation. Towards the end of a project, we all tend to jump from system to system, fixing all kinds of random bugs wherever we can help out. Being suddenly dumped into working on someone’s crazy implementation code that I previously hadn’t been caring about is frustrating. It’s so much better to have been familiar with it already through regular reviews.

Mentor Junior Engineers

I mentioned earlier in this article that knowledge sharing may be done best face-to-face, but in the case of mentoring junior engineers I think the written form is probably better. At minimum, a set of written comments should be the start of a conversation. It lets you clearly define the new technique that you’re trying to show them, and it’s recorded so they (or anybody else) can refer back to it whenever they need. Having it in text form lets the junior engineer really take their time absorbing the knowledge and thinking about it. They can try out the unfamiliar technique and get to know it a little before responding to ask for more information, or perhaps to argue the point.

Speaking of which, being written makes it easier for engineers to challenge their mentors in a less confrontational way. As we’ve all learned from the Internet, people are willing to say things in text that they wouldn’t in person. So if someone doesn’t buy an argument, it’s a lot easier to state that as a reply to a comment in a review than by directly challenging them on it. I often prefer the direct-challenge approach myself, but lots of folks aren’t like this. You could expect this to lead to passive-aggressive discussions, but from what I’ve seen so far in our reviews this hasn’t been a significant issue.

Educate About And Enforce Standards

Given that we didn’t have coding standards until the new review process was in place, I can’t compare this with the pre-review period. But with the new process, I noticed a few things:

  • New code written by different people started to all look similar. People were adhering to standards. This was awesome. During the wild west time, we had at least four drastically different styles in play.
  • New engineers who came on board and had trouble adjusting came around a lot quicker than I’ve seen in the past, thanks to the regular reminders they got in reviews.
  • Weak or ambiguous points in the coding standards were found very quickly. Reviewees often pointed these out, which then got escalated and either resolved or postponed for later discussion.
  • Areas where we needed to decide on new standards were found quickly. And those new rules got out to the rest of the team quicker and more effectively through reviews than just by email.

We also realized that we needed best practices on top of basic coding standards. What’s the standard naming for resource allocation/release functions? When and how is it appropriate to assert state? How should we lay out platform-common and platform-specific code for a given class or system? Recently, this has expanded into big discussions about our overall engineering design process. I plan to post on this topic in the future as we figure things out.

Comments related to standards compliance varied a lot among reviews and reviewers. Some reviewers spotted violations right away, and others would miss 95% of them. It didn’t matter when things were missed; eventually, with enough reviews, people would start coding to the standard. I think the mix of attention levels helped keep things from getting too nit-picky.

Towards the end of the project we altered our process to just stop bothering with comments about coding standards. None of that code (usually certification-related) was going to be brought forward to the next game and we wanted to really focus on getting the game shipped.

Version 1 Problems Fixed?

Back in the first attempt, I identified three critical problems that contributed to the death of the original process. So how did we fare on these in the final process?

Mini-Cliques

This was primarily a physical location issue that disappeared when we switched to using Crucible and started working mostly in the virtual world.

At one point, though, one of the primary reviewers noticed that they were never getting reviews from a couple of engineers on the team. These engineers were overriding crucreate’s random automatic choice and always going with the same reviewer. In both cases, it was a misunderstanding that was easily resolved.

I ran some stats to see what the spread was for each engineer, and everybody else was fine, spreading their reviews out pretty well. But it did bring up some enhancements I’d like to make to crucreate’s chooser dialog when showing potential reviewers. Just a little more data to help the random chooser and give more information to the reviewee:

  • What percent of your reviews has each reviewer been on in the last 30 days?
    To prevent favoring a reviewer too much by accident.
  • What is the current review load of each reviewer?
    To prevent overloading a reviewer who already has a big queue.
  • Is the reviewer active at their workstation?
    To prevent people forgetting when someone’s taken vacation or is out at a doctor’s appointment or whatever. We can detect this from their idle status on the Jabber server.

It would be fun to implement those features, but it’s admittedly low priority stuff.

Lack of Simultaneous Availability

This is another physical issue that was mostly solved by doing reviews in the virtual world. I say “mostly” because obviously it doesn’t change the situation when you need to ask questions and discuss a change in person.

In fact, it’s sometimes worse in these cases. Unless someone is in a rush to check in, they usually don’t pay any attention to whether a reviewer is available at that moment. There’s no need to, and in fact not needing to is a major benefit of the process. So nobody’s thinking about timing all that much.

The problem comes up when a reviewer needs to discuss a change in person, and the reviewee just took off. Now you have to do the scheduling-time thing to get together. It’s not a big deal; all that happens is the review is delayed. But as a reviewer it’s always frustrating when I’m deep in the middle of a big review, have a small question, and the reviewee is nowhere to be found.

I’ve dealt with this a couple of ways. As a reviewer, I can spend a minute skimming the review to get an idea of its complexity. If it’s going to be a tough one, I can check on the reviewee’s availability over IM before digging in: “This is going to be a tricky review – you around for the next hour or so in case I need to ask you anything?” Or as a reviewee, I usually know in advance whether a review will need discussion. If it will, I’ll avoid sending it out if I’m about to leave the office. Or I’ll add a general comment at the top saying something like “this might be tricky, we may need to discuss before you get too deep, let me know”.

The Blind Leading the Blind

We handled this in a couple of ways that I think worked: the primary reviewers group, and adding multiple reviewers.

Requiring a primary reviewer means you’re guaranteed a more senior person. And the ability to have multiple reviewers working on a review simultaneously means it’s easy to add in people when you don’t know what you’re doing in a particular area. This helps out significantly with the blind leading the blind.

There’s also accountability with the system, so you can trace problems back and get them resolved. With in-person reviews, the entire conversation is lost because it’s all verbal. But with our process, every change has an associated review. It’s easy enough to find out who reviewed a change that caused big problems and “review the reviewer”.

Special Thanks

I want to give some extra props to Atlassian. It took us many months to really settle on Crucible as our tool of choice and then pay for it and get a real license. We had a few false starts and it took a while to give it a real shot. During this time, Atlassian had no problem continuing to renew our evaluation licenses (as they did previously with Jira and Confluence).

This was great because it let us get the experimentation and testing we needed, not to mention the budget. 30 days is a really short time to evaluate a workflow tool, particularly one of this magnitude for us. I wish everybody had such a liberal policy with their evals. Not to mention using a simple license key rather than some obnoxious DRM licensing server or annoying things like IP address locking, like so much other software we use. So hats off to them.

Series

My full series on code reviews:

Written by Scott

August 26th, 2009 at 4:13 pm

About Our Crucible-Perforce Bridge

with 9 comments

This is the fifth in a series of posts on our peer code review process at Loose Cannon. In the third, I briefly mentioned a tool I created to bridge the gap between Perforce pending changelists and Crucible. Two great tastes that taste great together. A key part of our review process. In this post, I’ll talk about how it works in excruciating detail!

Caution: this might be the most boring post ever.

About The Tool

The “tool” is actually a command crucreate that is part of a Perforce-based tool I wrote called P4X for adding small utility functions. P4X is built on top of p4.net, which is itself built on top of a native lower layer. Unfortunately, p4.net is rather weak on supporting specific P4 “forms”, preferring only to give general form access. I assume this is to prevent issues with changes to forms across P4 versions. So of course I had to make yet a third layer that wraps up p4.net to build real classes instead of generic forms and arrays, like ChangelistSpec and FileSpec.
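
To give a feel for what that third layer looks like, here’s a stripped-down sketch. It assumes the generic form arrives as a flat key/value map (the real p4.net types are different), and the classes are illustrative rather than the actual P4X code:

using System;
using System.Collections.Generic;
using System.Linq;

public sealed class FileSpec
{
    public string DepotPath { get; private set; }
    public string Action { get; private set; }      // add, edit, delete, ...

    public FileSpec(string depotPath, string action)
    {
        DepotPath = depotPath;
        Action = action;
    }
}

public sealed class ChangelistSpec
{
    public int Number { get; private set; }          // 0 = default changelist
    public string Description { get; private set; }
    public IList<FileSpec> Files { get; private set; }

    // Build from a generic form, assumed here to be a flat key/value map with
    // keys like "Change", "Description", "Files0", "Files1", ...
    public static ChangelistSpec FromForm(IDictionary<string, string> form)
    {
        return new ChangelistSpec
        {
            Number = form["Change"] == "new" ? 0 : int.Parse(form["Change"]),
            Description = form.ContainsKey("Description") ? form["Description"].Trim() : "",
            Files = form.Where(kv => kv.Key.StartsWith("Files"))
                        .Select(kv => ParseFileLine(kv.Value))
                        .ToList(),
        };
    }

    // A form file line looks like "//depot/path/file.cpp    # edit".
    static FileSpec ParseFileLine(string line)
    {
        string[] parts = line.Split(new[] { '#' }, 2);
        return new FileSpec(parts[0].Trim(), parts.Length > 1 ? parts[1].Trim() : "");
    }
}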

P4X comes in two flavors: p4x.exe, the command line tool (uses stdin/out/err), and p4xwin.exe, the gui tool (uses Windows dialogs). The command line tool is meant to be used from scripts and console apps including a command shell. It behaves similarly to p4.exe, mimicking P4’s simple command line format for consistency (prefix params for user/pass etc., command name + command args, ‘help’ for each, etc.).

The gui tool is meant to be hooked into P4V/P4Win, taking advantage of context-sensitive features, or as a simple external tool from editors and so on. For example, it includes a “blame” feature that receives a filename and line number from the editor, which in turn opens up FishEye to the given file and line (FishEye does a much better job than P4 annotate or P4Web IMO).

Installing P4X

Installing support for P4X commands is easy. For P4Win, you double-click a .reg file we have checked into our depot. And for P4V, you import custom tools from a .xml file (also in the depot). This adds in a variety of commands, including the one we care about:

Name = Create Crucible Review from Changelist…
Command = p4xwin.exe
Arguments = -c $c -p $p -u $u crucreate %c
AddToContext = true

This will install a “Create Crucible Review from Changelist…” command to the Tools menu, but it will only be enabled if the user currently has a changelist focused. And by using AddToContext = true, we can right-click on any changelist and the command will show up as enabled. Running it passes in the changelist number in place of %c (or 0 if default). Note that it also passes in the current user/password/port settings. The crucreate command can’t ever assume a global set of connection settings. If these aren’t passed in, then P4X just inherits whatever the environment’s settings are.

It’s really convenient to be able to just right-click a pending changelist and get a review. My personal workflow goes like this for any given change:

  1. Code and test.
  2. Play Geo Defense.
  3. Move bare minimum of files involved in the change to a new pending changelist if not there already.
  4. Select all the files and diff to look for stupid things like #if xyzzy.
  5. Play Geo Defense again.
  6. As I go through each diff, I update the changelist’s description with why I did what I did.
  7. Once done, proceed with the process detailed in the third post.
  8. Play more Geo Defense.

Note that after creating the review, I will additionally go through the diffs again in the new review in Crucible to make any necessary comments inline.

How It Talks To Crucible

The Crucible Web API

Crucible uses an XML REST API for communications. This is a recent change, Atlassian having removed the previous SOAP interface. I understand that SOAP has all kinds of annoying problems, I really do. But nothing is simpler than telling Visual Studio to add a “web reference” to the WSDL’s URL, then suddenly being able to get full Intellisense on your new Crucible SOAP object as you’re writing code. I miss that.

Hopefully the industry will settle on a WSDL-type thing for REST (there are several serious proposals getting attention, so it should be soon), Atlassian will add support for it, and then I will be able to use that! For now, though, I’m maintaining a C# wrapper to talk to Crucible’s REST API. Intellisense is something I just can’t live without. APIs should be explorable and obvious enough to use without having to resort to opening some HTML file and copying names into code. I heart static code analysis from generated files.

It’s not a huge problem in general, although for every new release of Crucible I download, I do need to review my usage of the API to deal with any changes. Atlassian is good about backwards-compatibility but, without a generated spec that I can type-check against statically, I can’t guarantee that there aren’t typos (on either side of the equation).

Again, not a huge problem, but worth mentioning.

Authentication Needs Improvement

Ok, well, there is another frustrating issue with the API. I don’t have a way to reuse the user’s Crucible credentials from their web browser or Windows login. I have to send the user/pass in plaintext through a GET to auth-v1/login. Not cool. We use LDAP hosted by our Active Directory for all our services, and the whole point of that is to not ask the user for their credentials over and over. I don’t want to do a “save password” checkbox that stores it in the registry either. It’s just a bad practice.

So instead, we have a “cruciblerobot” user with a password hard coded into P4X that is used for all API calls. It needs the “Edit Review Details” permission to do its job, which is normally granted only to admins, moderators, and creators.

I’m assuming this situation has been improved in 2.0, but haven’t had a chance to look at it yet. Ok, well, we, um, don’t have the money yet to renew our maintenance. But I’d really like a way to either reuse the browser’s cookie, or tell Crucible to trust the user’s existing domain login token somehow. Perhaps Crowd could help here, but we’re very happy building everything on top of AD (which is already paid for).
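
For reference, here’s a minimal sketch of what the wrapper’s login-and-fetch path boils down to. The URL prefix and the parsing of the login response are simplified, so treat it as a sketch rather than the actual P4X code:

using System;
using System.IO;
using System.Linq;
using System.Net;
using System.Xml.Linq;

public sealed class CrucibleClient
{
    readonly string _baseUrl;                          // e.g. "http://crucible"
    readonly CookieContainer _cookies = new CookieContainer();

    public CrucibleClient(string baseUrl) { _baseUrl = baseUrl.TrimEnd('/'); }

    // GET auth-v1/login with user/pass (plaintext, as lamented above), then
    // remember the resulting token as a cookie for all later calls.
    // The /rest-service prefix and the <token> element name are simplified
    // and may need adjusting.
    public void Login(string userName, string password)
    {
        XDocument doc = GetXml("/rest-service/auth-v1/login?userName="
            + Uri.EscapeDataString(userName)
            + "&password=" + Uri.EscapeDataString(password));

        string token = doc.Descendants("token").First().Value;
        _cookies.Add(new Cookie("remember", token, "/", new Uri(_baseUrl).Host));
    }

    // A typed-ish accessor: fetch a review's details as XML.
    public XDocument GetReviewDetails(string reviewKey)
    {
        return GetXml("/rest-service/reviews-v1/" + reviewKey + "/details");
    }

    XDocument GetXml(string path)
    {
        var request = (HttpWebRequest)WebRequest.Create(_baseUrl + path);
        request.CookieContainer = _cookies;            // same container every time
        using (WebResponse response = request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
            return XDocument.Parse(reader.ReadToEnd());
    }
}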

How It Works

What follows is a detailed doc on how the current version of crucreate works. I have a mile long list of improvements and bug fixes I’d like to make to it, but it works well enough now so it’s been a low priority to get past “good enough”.

Creating Reviews From Submitted Changelists

When P4X crucreate is run, it first checks to see if it was given a submitted changelist. If that’s the case, it takes a cheap shortcut and just has Crucible create it itself using a POST to cru/create. [Note that this may not work if crucreate was run after the changelist was submitted but before Crucible has noticed. The user gets an error and has to wait a few minutes for the scanner to pick it up, then try again.]

It would be better if it handled pending and submitted changelists the same. Unfortunately, it would have been more work to implement, and it’s rare enough that we need to do this that it wasn’t worth the time. Even in emergencies where a checkin has to happen immediately, the review is still always required to be created before the checkin.

Creating Reviews From Pending Changelists

If it’s a pending changelist, here’s what crucreate automatically does next:

  1. Log into Crucible.
    • Do a GET to auth-v1/login with the user/pass for cruciblerobot.
    • Set a cookie “remember” with the resulting login token.
    • Make sure to use the cookie container on any web request from now on.
  2. Check if the changelist already has a review by scanning its description for a review tag using a regex (there’s a sketch of this after the list).
    • We need to know if we should offer to do an incremental review.
    • We also want to query Crucible for who the reviewers are on that review, so it can pre-select them in the reviewer chooser dialog.
  3. Query Crucible for the list of possible reviewers.
    • Do a GET to reviews-v1/[review]/reviewers.
    • The user submitting the review is removed from the list as it would be redundant.
    • Fake users like cruciblerobot are removed as well.
    • In 1.x (maybe fixed in 2.0), there is no way to query group membership, so for now I have the primary reviewer group hard coded. It rarely changes so I haven’t even bothered putting the list in the XML file.
  4. A dialog comes up with available reviewers.
    • For totally new reviews, a primary reviewer is pre-selected at random.
      New review dialog
    • For existing reviews (identified in step 2), the reviewers from the most recent review are pre-selected instead. The user is also given the option of creating a new review or adding onto the current one as an incremental update.
      Existing review dialog
    • Note that the Go button isn’t enabled until at least one primary reviewer is selected.
      • I know, it’s more usable UI to leave the button enabled and show an error on click instead, but I haven’t gotten around to fixing that.
  5. If the user hits Go, we get to work. This is where we split down a couple code paths.
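
Step 2’s scan boils down to a regex over the changelist description, looking for the tag that crucreate writes back into the changelist when it creates a review (step 5 under “Creating New Reviews” below). Something like this:

using System.Text.RegularExpressions;

static class ReviewTag
{
    // Matches the tag crucreate writes into the changelist description,
    // e.g. [Review: TAC-1234 ( http://crucible/cru/TAC-1234 )]
    static readonly Regex TagPattern = new Regex(
        @"\[Review:\s*(?<key>[A-Z][A-Z0-9]*-\d+)", RegexOptions.IgnoreCase);

    // Returns the review key (e.g. "TAC-1234"), or null if the changelist
    // hasn't been put up for review yet.
    public static string FindReviewKey(string changelistDescription)
    {
        Match m = TagPattern.Match(changelistDescription ?? "");
        return m.Success ? m.Groups["key"].Value : null;
    }
}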

Creating New Reviews

If the user has chosen to create a new review, this is what crucreate does next:

  1. Create an empty file to hold the patch.
  2. For every file in the changelist…
    • Skip if binary file type.
    • Check if the file is scheduled for resolve and error out if that’s the case, telling the user to resolve them. We want a clean set of files to work with.
    • Create two tempfiles – a “before” and an “after”.
    • Print the file the user is sync’d to (the “have”, not the “head”) from P4 into a temp file, if it is not an ‘add’ action.
      • We want the “have” because we only want the changes the engineer made. If we use the head revision then changes from other people will get mixed in, which will ruin the review.
      • If an engineer wants to review their changes from the head revision, then they need to sync and merge, then run crucreate.
    • Copy the local file into another temp file, if it is not a ‘delete’ action.
    • Run diff.exe (there’s a sketch of this after the list).
      • Use these options:
        --unified=1000000
        --text
        --strip-trailing-cr
        --minimal
        <tempfile1> <tempfile2>
      • Note that the million lines of context ensures that Crucible has the full contents of each file. This is important! Crucible will still only show a few lines of context or whatever according to per-user prefs, but we need the --unified=1000000 so Crucible has what it needs in case the user wants to expand the diffs to full.
    • Skip if result is empty. It means there was no diff and there’s no point adding it to the review.
      • Note that for an add or a delete, the diff against a 0-length file will force every line to be different. This is great because you can see the full contents of the file being added or deleted. It’s a lot more useful than the simple “file was deleted/added” tag because you can make per-line comments like any other file.
      • This is an area where I think our solution is better than Crucible’s creation of post-submit reviews (as well as FishEye’s browsing of changelists), which don’t let you expand the file inline.
    • Replace the first line of the diff with something that will make more sense to the reviewer than a couple of temp filenames. We use a format like “--- //depot/path/to/file.txt\trev. #123”.
    • Append the contents of the diff to the patch file we’re building, plus an extra \n just to be sure the diff was terminated.
      • Note that a patch file is just one or more diff outputs stuck together.
      • I could submit a patch per file, but it’s simpler if I have a single patch per changelist to send to Crucible. Less clutter and easier incremental updates later on.
  3. Skip entire review and put up a message to user if the patch file is empty. It means every file in the changelist is either binary or did not change from the depot version.
  4. Create a Crucible review via a POST to reviews-v1, remembering the ID we get back for the new review.
    • Creator = author = moderator = the current user.
      • Annoyingly, Crucible apparently has case-sensitive usernames, at least through certain APIs (regardless of the server setting for lowercasing usernames). We have to ensure the username is lowercase here or Crucible will create a new user.
      • This seems to be a common problem with cross-platform tools. Perforce has historically had a lot of issues with case-sensitivity too. It’s the “unix way”. People can’t even spell things right – expecting them to get the case correct too is just out of the question. I have a policy of lowercasing as much as I can, enforcing if possible through policy (branch names, client names, folder names..).
    • Description = the pending changelist description from P4.
    • Name = the trimmed, truncated first line of the changelist. People typically put some kind of overall summary in the description so this works out pretty well as a review name.
    • Project Key = currently hard coded. I need to have a way to associate metadata to a project. I’m thinking maybe just a simple projectinfo.txt in the client root (Jira name, Crucible name, Confluence home, etc.).
    • Allow Reviewers To Join = sure, why not. Not sure why you’d want to limit this.
    • Patch = attach the patch file generated previously.
  5. Update the P4 pending changelist with the ID of the new review and an http link to it. Ours looks like “[Review: TAC-1234 ( http://crucible/cru/TAC-1234 )]”. Note that I had to put spaces around the URL otherwise some scanners pick up the trailing parenthesis as part of the link.
  6. Open the web page of the new review!
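
Here’s a rough sketch of the per-file work in step 2 above. It shells out to p4.exe and diff.exe directly for brevity and skips the binary/resolve checks and the cleanup described under “Special Diff Hacks” below; the real crucreate goes through p4.net:

using System;
using System.Diagnostics;
using System.IO;

static class PatchBuilder
{
    // Run a command line and capture stdout (diff exits with 1 when the
    // files differ, so the exit code is ignored here).
    static string Run(string exe, string args)
    {
        var psi = new ProcessStartInfo(exe, args)
        {
            UseShellExecute = false,
            RedirectStandardOutput = true,
            CreateNoWindow = true,
        };
        using (var p = Process.Start(psi))
        {
            string output = p.StandardOutput.ReadToEnd();
            p.WaitForExit();
            return output;
        }
    }

    // Append one file's diff to the patch; returns false if nothing changed.
    public static bool AppendFileDiff(string depotPath, int haveRev,
                                      string localPath, string action,
                                      TextWriter patch)
    {
        string before = Path.GetTempFileName();
        string after = Path.GetTempFileName();
        try
        {
            // "have", not "head": we only want this engineer's changes.
            // For an add there's no depot side, so we diff against empty.
            if (action != "add")
                File.WriteAllText(before,
                    Run("p4.exe", "print -q \"" + depotPath + "#" + haveRev + "\""));
            if (action != "delete")
                File.Copy(localPath, after, true);

            string diff = Run("diff.exe",
                "--unified=1000000 --text --strip-trailing-cr --minimal " +
                "\"" + before + "\" \"" + after + "\"");
            if (diff.Length == 0)
                return false;                       // no textual change; skip

            // Replace the temp-filename header with something readable.
            string[] lines = diff.Split('\n');
            lines[0] = "--- " + depotPath + "\trev. #" + haveRev;
            patch.Write(string.Join("\n", lines));
            patch.Write("\n");                      // make sure the diff is terminated
            return true;
        }
        finally
        {
            File.Delete(before);
            File.Delete(after);
        }
    }
}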

So with a few clicks, and all the above automation, the review is in draft mode. The user can now proceed with their review process. Not much can be automated past this point!

Creating Incremental Reviews

Assuming a changelist was already in review, and the user has chosen to update it with an incremental via the review dialog, we go down this route instead. In a way there’s less work to do, but it’s a lot more complicated.

In building the new patch file, we have to reconstruct the state of each file as it existed before the most recent changes, diff that with its current state into a patch, and upload that to the Crucible review. Because multiple incrementals can be tacked onto a review, we need to run each file through each existing patch to progressively get to the n-1 state.

It works like this:

  1. Create an empty file to hold the incremental patch.
  2. Fetch the review from a GET to reviews-v1/[review]/details.
  3. Fetch all the patches from the review via each patchUrl in reviewItems.
    • If this doesn’t work, make sure you’re using the cookie container from the original login. Took me a while to figure this out.
    • I store the patches in a table mapping URL to patch data. Useful for what comes next.
  4. For each patch…
    • Break the patch down into its individual file diffs with a regex.
    • Look up the matching P4 filespec from the diff’s first line “header”.
    • Skip diffs for files that aren’t in the pending changelist. Someone may have reverted a file since the patch was created.
    • Store in a table mapping by filespec to the diff data.
  5. For every file in the changelist, do the same thing as in “Creating New Reviews” Step 2 above, except we need to rebuild the “old” version of the data. So if the filespec we’re working on exists in the patch/diff table…
    • Create a temp directory to work in so we can name the files the same as its name in the patch file and not worry about collisions. This is so patch.exe works.
    • Print out the “have” state for the filespec from Perforce to a file in the newly created directory. If it didn’t exist, print out a zero-length file instead.
      • It’s important to check that the “have” hasn’t changed from the last time a diff was done. If the user has done a sync-and-merge, it breaks the whole incremental patch chain.
      • I’m, um, not checking for this at present. This results in a lot of screwy, inexplicable results from crucreate’s incremental feature, making people not trust it. Oops. Must fix.
    • Follow the patch chain. For each patch that is associated with this filespec, in the same order added to the review…
      • Call patch.exe with these options (there’s a sketch of this below):
        -t
        -g0
        (stdin) = stream in the patch data (make sure the last line is \n terminated!)
        (working dir) = temp directory (patch works directly on the file there)
      • Note that the -g0 is required because we need to disable the weird automatic Perforce support in patch.exe. Took me a while to figure out what was going on here. WTF GNU?
    • Take the final patched file and use it as the “from” in the diff, doing the diff the same as with a new review.
    • When replacing the first line in the diff as with new reviews, append “UPDATE 1” or “UPDATE 2” etc. depending on how many incrementals we’ve done so far. Without this, there’s no way to tell the difference between the original and the incrementals in the review.
    • Take the resulting diff and append to the patch file that we’re building for the update.
  6. If the filespec didn’t exist in the patch/diff table, then the user must have added/edited/deleted a file new to the changelist since the last review. Do the same as if it was in a new review.
    • When doing the first-line diff rename, still append the UPDATE # with the filename so it is considered part of the overall incremental update.
  7. Skip incremental update and put up a message to user if the patch file is empty. It means every file in the changelist is either binary or did not change from when the review or last incremental was made.
  8. Add the patch to the review using a POST to reviews-v1/[review]/addPatch.

The review is now updated to have an additional patch attached. Reviewers will see the files changed since the review (new and old) in the file list with everything else, except with “UPDATE” appended to their names to show the incremental progression.
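
The heart of step 5 is replaying the stored diffs, in order, on top of the “have” revision to reconstruct the file the last review saw. Here’s a sketch of that replay, shelling out to patch.exe (the helper names are illustrative, not the actual P4X code):

using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

static class IncrementalState
{
    // diffsForFile: the diffs previously extracted from the review's patches
    // for this one file, in the order they were added to the review.
    // The file name inside tempDir needs to match the name used in the patch
    // headers so patch.exe finds its target.
    public static string RebuildPreviousVersion(
        string tempDir, string fileNameInPatch, string haveContents,
        IEnumerable<string> diffsForFile)
    {
        string target = Path.Combine(tempDir, fileNameInPatch);
        File.WriteAllText(target, haveContents);     // zero-length if it didn't exist

        foreach (string diff in diffsForFile)
        {
            var psi = new ProcessStartInfo("patch.exe", "-t -g0")
            {
                UseShellExecute = false,
                RedirectStandardInput = true,
                CreateNoWindow = true,
                WorkingDirectory = tempDir,          // patch edits the file in place
            };
            using (var p = Process.Start(psi))
            {
                // Stream the diff in; patch is picky about a trailing newline.
                string data = diff.EndsWith("\n") ? diff : diff + "\n";
                p.StandardInput.Write(data);
                p.StandardInput.Close();
                p.WaitForExit();
            }
        }

        // The caller now diffs this reconstructed file against the local copy,
        // exactly like a new review, tagging the header with "UPDATE n".
        return target;
    }
}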

Now, at this point, the reviewee has to notify everyone on the review that they added an incremental. Ideally I’d have crucreate make every reviewer un-complete the review (which would also trigger some notification emails), but when I do this I get an exception from Crucible.

I suppose if this isn’t fixed in v2 I’ll update crucreate to just send the notification emails itself. But I’d be really surprised if it isn’t fixed, or at least improved, given all the changes I’ve read about in previews of the v2 features!

Special Diff Hacks

One of the things on my list is to find something better than GNU diff and patch. They feel incredibly old and hacky, particularly the weird Perforce support. There are so many better diff algorithms out there, too. In particular, I wish I could use the truly excellent diff in Beyond Compare, but only maybe half the team uses that… maybe I will run a “diff server”.

Anyway, in order to make diffs work right for Crucible via GNU, I have to do the following (there’s a rough sketch below)…

  • Check diff/patch versions.
    • Command line options vary across versions and I want to make sure people don’t accidentally have some other diff.exe/patch.exe in their path before our standard tools that are sync’d via P4 (or they’re out of sync).
    • Just run the exe with ‘-v’ and check against a hard coded version. We’re supporting 2.8.7 for diff and 2.5.9 for patch.
  • When fetching a file from P4, do it as a binary. Don’t want P4 doing any of its screwy translations.
  • Convert to ASCII. GNU diff apparently doesn’t know what a byte order mark is, or anything about Unicode or code pages.
  • Split by line.
  • Trim trailing whitespace. Don’t want this stuff cluttering the diff.
  • Replace the contents of RCS keywords ($Id, $Header, $Date, $DateTime, $Change, $Revision) with “<ignored>”. This is generated code and we don’t want it cluttering every single diff.
  • If a file ends in .lua, then prefix every line with a dot (“.”). I don’t know how to work around this problem, but Lua comments start with dash-dash (“--”) and that confuses the hell out of diff/patch. The leading dot is a little ugly, but getting an empty diff is worse.
  • Rejoin by line and write out. This ensures 100% consistent line endings on both sides.

Maybe I’m reading the (rather awful) docs wrong, and maybe diff/patch can do what I need. But I can’t figure out how to make it do these things on its own.
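
For what it’s worth, here’s roughly what that cleanup pass looks like. It’s a sketch of the steps above, not the actual P4X code:

using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

static class DiffPrep
{
    // Matches expanded RCS keywords like "$Id: ... $" or bare "$Change$".
    static readonly Regex RcsKeyword = new Regex(
        @"\$(Id|Header|Date|DateTime|Change|Revision)(:[^$]*)?\$");

    public static void WriteNormalized(byte[] rawBytes, bool isLua, string outPath)
    {
        // Decode, drop any byte order mark, and force down to 7-bit ASCII
        // (GNU diff can't cope with BOMs, Unicode, or code pages anyway).
        string text = Encoding.UTF8.GetString(rawBytes).TrimStart('\uFEFF');
        text = Encoding.ASCII.GetString(Encoding.ASCII.GetBytes(text));

        var output = new StringBuilder();
        foreach (string rawLine in text.Split(new[] { "\r\n", "\r", "\n" },
                                              StringSplitOptions.None))
        {
            string line = rawLine.TrimEnd();                 // trailing whitespace
            line = RcsKeyword.Replace(line, "<ignored>");    // generated noise
            if (isLua)
                line = "." + line;   // keep Lua's "--" comments from confusing diff/patch
            output.Append(line).Append('\n');                // consistent line endings
        }
        File.WriteAllText(outPath, output.ToString());
    }
}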

Final Thoughts

This tool works pretty well. The process looks big and nasty and complicated, and a lot of it is, but the end result to the users comes down to three or four clicks to make a review. 99% automated. With Crucible 2.0 I’ll make some changes, remove duplicate functionality and so on. But it feels like it will work fine for us, well into the future.

Still, there are a lot of things I plan to work on as I find time on the side:

  • Upload the source to a site like SourceForge or CodePlex. Lots of cleanup to do first, not to mention seeing if anyone is remotely interested in using it in their studio (this is part of the reason I’m posting this).
  • Enhancements to the dialog to query each of the primary reviewers to see what their workload is like. Number of outstanding reviews and so on, so the random choice and reviewee can balance out the load better.
  • Rebuild the guts on top of PowerShell cmdlets so we can reuse it in other ways. In particular, the patch and diff management stuff was a pain to get right and would be useful in implementing a ‘stash’ feature in P4X. I heart PowerShell.
  • Lots and lots of bug fixes and little enhancements.

It’s my hope that Atlassian will eventually build such a bridge on their own and eliminate the need for me to keep maintaining something like this. Although, now that we have it built, we can continue to tune it to meet our exact needs.

I’m in Brisbane for just a few more days, then it’s back to Seattle where my life can finally return to a normal pace. Sushi in the Park Saturday the 15th at Cal Anderson!

Series

My full series on code reviews:

Written by Scott

August 5th, 2009 at 9:00 am

Peer Code Reviews: Good Commenting Practices

with one comment

This is the fourth in a series of posts on our peer code review process at Loose Cannon.

Somewhere in the middle of the third post, I started to talk about the “make comments” part of the process, but it’s a big subject, deserving its own entire post. So here we go.

Comments in a review are where the real goods are delivered. This is where we get all the benefits that I had talked about way back in the first post.

What We Don’t Expect From Reviewers

First let’s talk about what reviewers aren’t expected to do when they’re reviewing a change.

Reviewers aren’t expected to catch everything.

It’s impractical and arguably a waste of time. Knowledge-spreading, mentoring, and so on are more of a “seeping” process than a hard core lesson plan. The idea is that, eventually, with enough reviews and shuffling of reviewers, the knowledge will spread throughout the entire team. There’s just no need to focus on catching every single thing in every single review.

Reviewers aren’t expected to catch deep or systemic design problems.

A changelist is a snapshot of a small part of the game. It’s really hard to try to see the big picture through a pinhole. Reviewers will often open up their editor and browse around in code outside of the change during a review, to get more context. FishEye’s browsers and search (and blame!), and Perforce’s time lapse view help here a lot too. But this only goes so far.

At some point, you’ve got to throw up your hands and say “we’ve got to talk, I can’t see what’s going on here”, head to the whiteboard, and discuss it in person.

The reviewer is not (necessarily) the boss.

Reviewers do not necessarily have the authority to enforce changes, nor should they have extra responsibility for the quality of code they didn’t write. The reviewee maintains responsibility for their own changes.

We are tapping into reviewers’ brains and schedules to help make the entire project better. This is a service they provide, not an opportunity for dictatorship. While it is a requirement of our process that all reviewers’ comments must be resolved, this does not necessarily mean “do what the reviewer says”. Ultimate responsibility and authority remains with the reviewee’s lead.

Now, it’s pretty easy to end up in dictator-speak mode when you’re in the zone, ripping through reviews, making comments. It helps to soften more subjective comments with phrases like “I suggest”, “are you sure this is the best way?”, and “this is totally optional and my opinion, but…”.

What Reviewers Seek

Ok, now on to what reviewers are actually commenting on. Reviewers are looking for the following kinds of things, in no particular order of priority.

Are There Architectural and Domain-Specific Issues?

Every project has experts in different problem domains. You want these people reviewing changes in areas in which they have expertise. Graphics, scripting, debugging, architecture, assembly, tuning, you name it.

Here are some examples I picked out at random from our reviews.

[Screenshots: domain expertise + standards; domain expertise; adding a reviewer for domain knowledge; domain expertise in VGM]

All we’re doing here is looking for “gotchas”: hidden rules in systems the reviewer knows well, which are especially important to catch. Or perhaps better, more efficient ways to do things. Or how new code should fit into old code and interoperate with other systems.

This isn’t just for immediate course correction. With each review comment in a specific problem domain, the reviewee learns how to do things better in the future, so we don’t hit this again. And, often, the reviewee will run through their other code where the same mistake was made (but not caught yet) and fix it.

This is a wonderful way to spread knowledge while improving the code in the immediate changelist.

Are They Following Best Practices?

Here we are drawing heavily on the unique career experience of the reviewers, who are looking for things like:

  • Good comments, well-placed, relevant, etc.
  • Good naming of variables – descriptive, not named after the type…
  • Good flow in a function
  • Avoiding code duplication
  • Hoisting common code out to utilities/systems
  • Calling out when someone is being lazy (in a bad way)
  • Making unnecessary changes that are just personal taste
  • General readability concerns

A lot of this can be really subjective and often results in spirited debate. But this is a good thing – everybody learns something! And often we’ll agree to disagree and move on. However, if the same issue comes up again and again, then we can add the lead to the review and ask them to make a call to resolve it.

Here’s a sample clipping with a best-practices discussion that will lead into an offline meeting. [Screenshot: best practice (naming) discussion]

Are There Any Opportunities to Mentor?

Teams are often made up of people with a wide range of experience levels. Review comments can be a great place to mentor a more junior engineer. If they’re sharp and fearless, they’ll challenge you on comments they don’t agree with or understand. Instead of getting mad and replying with a “just do it, this is the best way” – take advantage of the opportunity!

If you instead take the time to really give them a good explanation of why you made the comment, a couple things may happen. First, they will learn something. Great. But another possibility (this happens to me a lot) is that by forcing yourself to explain why it must be done that way, you find out that you actually don’t have a good reason. Maybe your reasoning is based on religion, or outdated techniques, or wasn’t completely thought-out. I like when this happens because it makes me a better engineer. Make sure the team knows that as a reviewer you expect to be challenged.

Comments in Crucible (our code review tool of choice) have permalinks as well, so it’s easy enough to link to the discussion from the team’s wiki for spreading the word.

This has been one of the more successful parts of our code review process. People will ask questions and say things in text form that would never happen in person. It just doesn’t come up as often in casual conversation to talk about a lot of seemingly minor things like why a particular naming convention exists. In text, when you’re directly reviewing code, it’s a natural part of the process to say “why does it need to be this way?” and can easily be done in a non-confrontational manner.

Are They Adhering to Our Coding Standards?

Not long after we started the new code review process at Loose Cannon, we sat down to hammer out an initial set of coding standards. We were planning on having coding standards anyway, but it became clear right away when we started reviews again that we needed standards immediately. It’s very hard to review code where every person has their own weird style and habits.

Comments about coding standards are simple and easy, and should be short. We have a Confluence doc with our standards, so as you notice things that don’t adhere, flag them. This is a good way to teach new engineers our existing standards, as well as updating everyone on the occasional new standard.

On our current game, we backed off from most of this when we were very close to shipping. At that point you’re writing a lot of junk (particularly certification compliance) just to get the game done that you don’t intend to carry forward to future games. More nitpicky stuff like coding standards is just not a priority in the final weeks.

Did They Write A Good Changelist Description?

This is a relatively new thing we added to reviews. It is really a best practice, but I wanted to call it out as a special item because good changelist descriptions are so important when debugging problems with unfamiliar code. And so many people do it wrong. [I need to write up a post on how to write good changelist notes. People always make the mistake of commenting on the ‘what’ they changed, and missing the ‘why’. Any fool can do a diff and find out what changed, but six months later remembering why it was changed? That needs good changelist notes.]

When our crucreate tool tells Crucible to create a review, it automatically sets the Objectives of the review to the contents of the Perforce changelist description. This is important for a couple reasons.

First, obviously, it helps guide the reviewers by answering questions in advance about the point of the change. Reviewers should always check the objectives to get background on what they’re about to review. Saves a lot of time asking questions in comments that are already answered in the objectives.

And second, as it is a part of the review, it can and should be commented on (with a general review comment). No more lazy, useless changelist descriptions!

It’s for this reason that we modified our Crucible installation to pre-expand the objectives (normally collapsed by default) any time a review is viewed. Otherwise people never saw the objectives because they never bothered to expand them. [Thanks to the nice Atlassian support folks for their help here.]

The “Final Comment”

The last comment a reviewer makes, as documented in the last post, is to say what should be done with the review. Here are some samples I picked at random.

The most common final comment is something like “looks good, check in”. [Screenshots: the common case]
This one is a little more complicated. Obviously some in-person discussion has happened in between, as well as an incremental UPDATE 1 that was attached. [Screenshot: a more complicated, rare case]
Sometimes the last comment is the first comment as well. I found a great example of sharing knowledge using reviews. Here, a reviewer got added specifically so they could learn about JSFL in Flash. [Screenshot: adding reviewers to share knowledge]

And that’s a wrap! Just in time too – I’m getting on a plane in a couple hours (the first of three) to leave Quito and head on over to Sydney.

Series

My full series on code reviews:

Written by Scott

July 26th, 2009 at 11:00 am