Archive for the ‘loose cannon studios’ Category
Peer Code Reviews At Loose Cannon
Now that the game is basically done (though weirdly still being kept under wraps by Konami), I plan to write about some of the things we did, both good and bad. I’ve learned a lot from my experience at Loose Cannon so far and need to write about this while it’s still fresh!
I’ll start out with something I think went really well, all things considered: Peer Code Reviews.
Now, we haven’t had our postmortem meetings yet to discuss things like this. And I know that people on the team have issues with the process and our standards, but I feel like overall this has been one of the best things we’ve done at our studio.
Before Loose Cannon, I had never worked anywhere with a review process of any kind. We never even had a coding standard in place at any of those. When I joined, Matt already had reviews going and wanted to add coding standards too. I was skeptical at first, mainly out of ignorance, but Matt convinced me to try it. After a little time optimizing the process and building and writing some tools, I became a believer and evangelist.
The next couple posts are about what I learned, what we implemented, and how it all works.
What Is A Code Review?
A code review is simple: get at least one other person to thoroughly review your changes before you check in. They need to (a) understand the code and (b) make sure anybody else could understand it too. Like, in six months when nobody remembers how it works any more.
Simple in concept. We’ll get to the details later.
Notice that I didn’t say “find bugs” above. Our reviewers usually do look for possible bugs, and they’ll often find some, but this isn’t an important goal for us. Bugs are simple things, really. Architectural problems, API design problems, lack of safety, and so on – those are the real dragons to slay.
What follows are the rough goals of our code review process, in descending order of importance.
Share Knowledge
If the graphics engineer gets hit by an SUV on their way to work (or quits the company, takes a sabbatical, etc.), you don’t want to be scrambling to figure out how all their stuff works.
Code reviews really help with this. In an individual review you won’t grok an entire system. It’s not really necessary to try, either, because over time, after many reviews, you not only get a feel for what’s going on in a system, but also how that particular engineer thinks. This is invaluable when you come back later to work on that system yourself.
I’ve experienced this directly recently as I’ve been bouncing around the project in the final hours, trying to make myself useful, investigating bugs in systems I’ve barely touched but have often reviewed. It’s a weird feeling, feeling things come into focus, wandering through familiar functions…
Anyway, this means that reviewers need to try to understand the code they’re reviewing. Not just skim through it and look for nitpicky easy things. If they don’t understand it, they need to comment with questions. If it’s too big for a simple answer, then they need to stop and find the reviewee, bring them over to a computer, and get them to explain it.
The more eyes on your code, the better.
Catch And Correct System Misuse
Every system tends to have a maintainer, even if you have a “no code owners” policy at the studio. There’s always that one person who works with it more than anybody else and knows all the hidden rules about it. Even if it’s heavily documented, there will always be knowledge that only exists in that persons head, and you definitely want them to be included if you modify “their” systems. That way they are made aware of what’s being changed, and can think about possible repercussions.
They can comment about possible issues, and in the process not only do we get corrections made, but everybody learns something. Not just about what’s being changed, but about rules in the system being modified or used. “Don’t assume that the game object coming back is valid, could be deferred-add” and so on.
In every nontrivial review, the reviewer and reviewee should learn something.
I especially like this aspect of code reviews. In past projects, I’ve subscribed to Perforce review emails for changes to systems I’ve worked with. And as I got the checkin notices I’d run a diff tool and review everything. All after the fact, all optional, all dropped when things got rough. And only in the areas I was paying attention. As a result, I’d miss a lot of things that I couldn’t afford to miss.
In our code review process, I’m often included on things that need my attention, and I have an opportunity to provide pre-checkin feedback to make course corrections on things I think are important. I love this aspect.
Raise The General Quality Of Code
This one is easy. When you know that somebody on the team will be reviewing your code, you’re a little more careful about what you write. You’re a little more deliberate in your choices, a little less hacky, a little more likely to think things through. Yes, we should all be professionals 100% of the time. It’s what we get paid to do. Well, code reviews help to reinforce that.
Peer pressure is a great tool for improving behavior. If you think someone will make fun of you for some embarrassing code that you’ve written, maybe you’ll think twice about doing it. And, well, when it turns out that you don’t think twice, and your ridiculous embarrassing code is caught in review, you’ll have to fix it anyway. Then we make fun of you.
Want to know the pet peeve I’m thinking of right now? Code turds: when someone just comments out some code. Doesn’t explain why it’s commented out, just leaves it in. This has all but disappeared with this new review process. I love you, Mr. Review Process.
Mentoring Junior Engineers
At Loose Cannon, like most studios, we have a wide range of experience in our engineering staff. For me, at past jobs we’ve always thrown the junior folks to the wolves, hoped for the best, and forgotten about them until ship time. And of course, this usually leads to some bad results. Game Developer magazine postmortems are filled with stories about this. (By the way, does anyone read Game Developer any more? I totally forgot about that magazine until just now..)
The problem is that we hire the junior engineers with the best of intentions. They seem sharp but green, we figure we’ll mentor them, and they’ll learn and deliver. A good deal! But what always happens is the seniors get busy, and they just don’t put in the time. The juniors slowly create monstrosities and we manage to ship the game anyway.
With a code review process, more senior members of the team are regularly reviewing junior-level code. There’s just no way around it. They can make course corrections more often, as well as do some mentoring right there. “You should do it like X” “Why?” “Well because…” is such an easy a conversation to have in the context of a review. Outside of a review you have…brown bag lunch events? Regular meetings? Send them to training? Never happens. Code reviews are perfect places to spread knowledge and best practices.
And in some cases, reviews are where we’d identify people that just aren’t getting it, and need some more att
ention (or, possibly, to find another place to work). You’ll find out about it a lot sooner if you’ve got some people paying close attention to their code. You’d think that this would always be the case with junior level people assigned large tasks, but I have yet to see it in any of my jobs before Loose Cannon.
Educate About And Enforce Standards
We have a coding standard at Loose Cannon. Not everyone is happy with it, and most people have a problem with at least some part of it. We’ll be revising it. But warts and all, it is something that has helped. I remember how things were before the standards and it really was a mess.
But I don’t want to discuss coding standards here yet. Future post, perhaps. It’s a controversial topic, no doubt. How far to go with it? Tab/space settings? Curly brace placement? Naming of temporaries? Block headers in cpp/h files? Do we have a standard for Lua scripts too? What about batch files? And so on.
Suffice it to say I’m a big believer in coding standards as tools for increasing readability and comprehensibility. I also believe that even a bad one is better than none at all. It’s important to us that we check for standards compliance in our code reviews. As a small side effect, the consistency makes the code review process a ton easier.
As a simple example, consider a standard that bool-returning functions should “sound” like a boolean operation. So DebugMode() and LoadLevel() are incorrect (they sound like mutable actions, don’t they?), but IsDebugMode() and ShouldLoadLevel() are correct – they are clearly boolean queries. Well, this would get caught in a code review. This ends up helping to standardize API design. Anyone calling an API, looking at the functions involved, can quickly get a rough idea of how the thing works, based on the way the functions are named.
A review is also a good place to ask people to comment or rename things to be more clear. I tend to add a lot of “what’s this member variable for? needs better name” and “this load routine is complicated, needs an overview comment on how it works” type comments to reviews.
Yeah, ideally all code is self-documenting, yadda yadda. But when there’s a particular order of functions you have to call to get Bink to shut down and then the audio system in order to go to the Wii’s home menu, it needs a comment.
“Big Deal, We Do This In Agile”
Agile people are probably thinking “we already do this, what’s the big deal?” Pair programming inherently includes peer review, after all.
Well, we don’t do Agile development at Loose Cannon. Not yet, anyway. I haven’t really looked into it yet, actually. I was waiting for the hype to die down, and that now seems to be the case. Now that teams I respect a lot (like the good folks at Atlassian) are pushing it I’m starting to look into it more.
We don’t do a lot of things that maybe we should. We don’t do unit testing. We don’t have a continuous integration build server. We have information stored in 10 different places and it’s not searchable. So, basically the same as everywhere else I’ve worked.
We have a lot of things to improve. Agile development is just one more thing on the list of things to research and consider integrating into our process.
But overall, I’m very happy that we’ve gotten Peer Code Reviews as an integral part of our daily process.
Up Next: The Details
So I’ve talked about why we do this. Lots of generalities, fuzzy discussion. But how we do it, specifically? Next post.
Series
My full series on code reviews:
- Part 1: Peer Code Reviews At Loose Cannon
- Part 2: Peer Code Reviews: First Attempts
- Part 3: Peer Code Reviews: Success!
- Part 4: Peer Code Reviews: Good Commenting Practices
- Part 5: About Our Crucible-Perforce Bridge
- Part 6: Peer Code Reviews: How Did We Do?
Working Remotely
This has been on my mind a lot lately so I wanted to take a detour and write it up.
Ally and I decided to move to Peru to live for six months (her blog talks about who/what/when/where/why/how). I would have preferred a year but Ally’s got to go to school in the fall, so six months it is.
My boss Matt at Loose Cannon is super awesome and instead of firing my ass he decided to try to make it work. He gave me a very long term mostly research and prototyping project to create our next gen animation system. Intimidating as hell but cool. That means I don’t absolutely need to be online all the time.
Matt does want me to continue doing code reviews on our game if possible. This requires a fast turnaround to avoid blocking checkins for too long, especially as we get close to shipping. So it’s preferable if I can be online during normal work hours or at least a window of time during each weekday. Peru is 3 hours ahead of Seattle so that means I can go see pyramids in the morning and still get online and working by normal Seattle core office hours.
Anyway, I’ll be working remotely for six months. We’ve already passed through four hostels in three towns, and have finally gotten a real apartment in the comfy Yanahuara district of gorgeous Arequipa. We’ll do extended-weekend trips every couple weeks (which means working on buses to/from) and then move up to Trujillo in a few months. Lots of mobility required and it can’t interfere with my work. I really do have to keep a regular schedule.
This brings up several super important requirements, which I’ve spent a lot of time thinking about and working on.
- Data and hardware must both be secure. If I can’t work because my laptop was stolen or I had a hard drive crash, then I lose a lot of time.
- Our personal data must also be secure. As we’re managing all of our finances online now, we can’t afford passwords and credit card info and such to get out.
- I also have to figure out a way to work with Ally’s, um, let’s say “lack of interest” in security.
- I need at least periodic access to the internet so I can check in at work through the VPN. This means stealing wi-fi, buying cell-based broadband, plugging in direct, or using sketchy public computers at locutorios.
- For longer term, I need a comfortable work environment. It’s still 40-50 hours a week, remote or not. Can’t work hunched over every day at the carpal tunnel festival. But I’m not bringing a whole office with me, it’s got to be small and light.
I’ll probably talk about #2 and #3 at some point but next I want to talk about travelling securely. At least, what has been working well for us.
Posts in this series:
Back Up (Some Of) That Local Data!
I previously posted about using the local workstation as a data store. I concluded that local storage for some important data is an inevitability, and that’s ok as long as it’s backed up.
And I have yet to work at a studio where more than a few workstations had a formal local backup process in place. I believe this is mostly due to how rarely we lose work to catastrophic failures. There’s usually plenty of advance warning from a hard drive before it barfs, giving you time to move over to a new drive. If it ain’t broke often, no reason to fix it in advance. Even in the case of a total sudden failure, it only takes a day or two of re-imaging the OS and reinstalling apps to get back to being productive.
I tend to agree. I don’t think the occasional hard drive failure is worth a general studio-wide full workstation backup policy. It’s just too much work, storage, and time, and is going to cost far more than it will save. Instead, let’s have a more targeted approach. For most people in the studio, I prefer one or more of the options I’m going to enumerate below.
Local-Based Backups
This is the method I use. Even though I just said it’s not a huge deal to have a total failure, some people (like me) are ultra paranoid about losing even the smallest amount of work or time. And for that, nothing beats a local full-drive USB backup.
How?
Big fat USB drives are cheap. You can trade size and price for the performance you don’t need. I’d also get one around twice the size of the main workstation drive array if possible. That way two or three backup rotations plus incrementals can be kept going at once (assuming backup compression).
For software I’d get something that uses the shadow volume service to do copies so it can pick up the in-use files. I particularly like Ghost even though I loathe most things Symantec makes. It’s important to set up frequent (two to four times a day) incremental backups. Set them to low priority to avoid bothering you while you work. And set up automatic replacement of old backups when the drive is full. The system has to be fully automated and easy to use so you can forget about it until you need it when something goes wrong. Ghost even has a nice feature to let you use Google Desktop to search your various backups, a little reminiscent of Apple’s Time Machine.
I’d pick this over a mirrored RAID setup any day because you get incremental backups, and you get an easy way to restore data if your workstation dies and you need to move to a new machine quickly.
But Why?
Why am I so paranoid about this? A couple reasons. The less important one actually is the total loss issue that I mentioned earlier. The main reason I have frequent incremental full machine backups going is that I frequently make incremental boneheaded mistakes in my work.
The most common scenario is where I’m working on some small task, and in the middle it explodes into a giant task. Before I know it I’m four days in with 50 files checked out and crap I just accidentally reverted a bunch because I thought I was working on my other client for that quick bug fix someone needed. Or a day later I confidently go down some new mental path and redo a bunch of work without realizing that this is the wrong way, and now I need to undo back to the previous day. This is especially useful when coding a little drunk after lunch. But if the incremental is only half a day old at worst, you’re not in too much trouble.
This problem is so common for me that I’ve started building some tools to fix it on top of the incremental backup. Right now I’m eyeing git for ideas because it has features like stash and bisect. I’m also considering adding on a continuous backup system (like Norton GoBack or perhaps I’ll mess with Previous Versions a bit more). Like I said, I’m paranoid.
Incidentally, this would be a lot less of a problem for me if P4 had a local repository concept like git and mercurial and other more more modern source control systems have. Then I could check in as often as I wanted for backups and revision history as I worked, and get all my favorite source control tools to help develop.
The P4 way to fake local repositories is to do private branches, but on a large project it’s surprisingly time-consuming to manage. Perforce does not provide any tools to help out with this so I’ll have to roll my own here as well. I’ll probably write about this in a future post. It’s getting increasingly frustrating working with P4. Especially considering its obnoxious price and how much time I’ve put into building my own tools for it.
Avoid the Network
For full-drive backups, I’d stick with a nice simple local USB drive instead of a fancy server-based backup over the network. Doing it from a master server means you have to deal with the IT department every time you want a file back, and the network will get saturated at unpredictable times. This a lot of the benefit of using this system in the first place.
Now, the IT department should at least remotely monitor all of the workstations using this method to make sure that backups are working properly. Someone could easily have a USB hub failure or accidentally kick the connector out and not necessarily notice that backups were failing. People can’t be trusted to pay attention to balloon popups in the tray, they’ll just ignore them forever!
But like I said, this method is for the truly paranoid. Not many of those on the team, and so most can use the next method: server-based backups.
Server-Based Backups
Ok so you’ve got a tape backup and you’re not afraid to use it. You want to roll some local data into the server based backup. But you don’t want to overwhelm the network and your tape storage with enormous full drive backups. Here are some other options. They aren’t mutually exclusive, either.
User Folders on a Server Share
Give each user a public and/or private folder on the server (where the private folder has permissions only for them), and map it to a local drive or folder using a domain login script, perhaps P: for public and X: for private. Tell people to keep, copy, or sync files there that they want backed up. Simple, and easy to manage.
There are a couple big problems with this system. First is that it puts the burden on the user. They have to manage their files that are going to be backed up or not. So people may forget or not want to bother with this. Or once they start they may not want to keep it up and then you have outdated files sitting in the store (though this can have its own slight advantages). The other problem is that people tend to get lazy and copy large folder trees over (temporary .obj’s and so on included) and not worry about size or duplication much.
Ultimately certain types of users will use the system effectively for local workstation backup and others will not. It’s partly a matter of education but also a reason why (ideally) we want to provide multiple options. But overall this is the main (and usually only) method of local data backup I’ve seen in use at studios I’ve worked at.
Remapped User Folders to the Server
This is a variation on the public/private folder idea. Windows lets you remap local folders (such as Documents, Images, Contacts, etc.) to a remote server share. You can just right-click on the folder, go to Properties, and change the folder’s Location to point wherever you want, like a private share on the server, set up per-user.
This is the method we use at home, actually. Ally mapped her Documents folder to our ReadyNAS in the closet. She doesn’t have to do anything special, and her stuff is mirrored, snapshotted every two hours, and backed up offsite. Works great. You can also manually set up NTFS junction points if the OS shell doesn’t permit remapping a particular folder (such as the desktop).
The down side of course is that all these files are mapped directly over the network. On our home network, which is very fast over the wire and only has two people using it, it’s not noticeable. But at a game studio with typical minimal IT investment and lots of people simultaneously working with large files, this could be a big problem. Another possible problem is that if the server goes down, people can’t work with these files locally because copies are not kept locally.
Offline files are a potential option here that I haven’t explored much. I haven’t heard good things about the sync performance, but it’s only anecdotal. Also, it sounds like you need Vista to make it work well, and our industry is really dragging its feet on moving to Vista from XP (sounds like we’re all just going to skip it and jump straight to Windows 7). Other options like rsync are available but we might as well use one of the other options below instead.
Server Pull of Specific Folders
This option is where the backup server periodically reaches into each workstation and backs up files from a standard set of folders. Perhaps the Desktop and Documents folders, or maybe from a special “Backed Up” folder kept on the desktop of C:\ of each machine (or all of the above). With this option people continue to work with local files so performance for them is high.
As with the Remapped User Folders option, educate the team about which folders on their machine get backed up. Then they will know that anything they copy to the desktop or save to their documents folder or whatever (which is the default on so many programs) will be safe.
Now, the big downside with this option is that the server has to back up a lot of files through a relatively slow network. It’s more targeted than a full drive backup but still is going to be quite a lot of data. With all of the other options, where the primary data is kept on the server, the overall network cost is minimal because users are accessing files on demand. The server can back up things locally through ultra high speed links in the server room. But with server-pull from workstations, the server must access every file that is a candidate for backup over the studio’s more ordinary network, which takes a lot longer.
Also, it must deal with individual workstation problems. Backups are typically a serial process, so if one workstation is having issues (perhaps it’s running super slow due to an overnight render or batch process it’s doing) then the whole system is bottlenecked.
Exclusions
With all of the above methods, we will have a big problem with users dumping things in their Documents folder that they really don’t need backing up such as movie trailers, music, game demos and so on. You can nag people but it’s easier to tell the server to exclude large junk file types like exe, pdb, mov, avi, mp3, m4a, iso, and so on from any of the local workstation data it backs up.
Roaming profiles
A quick note on roaming profiles. Don’t use them.
Roaming profiles keep the user’s profile on the server. This includes their entire Users\Username folder minus local-only settings and temp and cache files. The profile is synced with the server on user login and logout. We used these at Loose Cannon before I pushed hard to get them killed.
Roaming profiles are interesting in theory, but in practice, they are a terrible, awful idea. I admit, it’s tempting to have a system that shares all your settings so that you can log into any machine on the domain and have the same setup. You can log into Bob’s machine and all your Visual Studio keyboard shortcuts come with you. Doesn’t that sound nice? But the reality is that sync is ultra slow and makes logging in/out or rebooting frustrating beyond belief. Why is it slow? Well, profiles tend to get really amazingly huge. Everything goes into the user’s profile. And if it hurts to log out/in then it will be even harder to get people to keep their systems patched, which is already a big problem.
Roaming profiles are bad enough with the latest versions of Vista and Windows Server. But they’re even worse on a Linux-managed “domain” (Samba), which is what we had way at Loose Cannon way back. Any bugs or compatibility problems on the way the sync happens get propagated down to your local machine, potentially damaging your profile forever. Our Linux-based server was apparently incompatible with Vista and I got my roaming profile horribly mangled a couple of times until I just forced it to go local. Since then, we’ve been on a nice, easy to admin Windows Server with local profiles only and haven’t looked back.
We may have been able to get roaming profiles working right after a lot of work and tuning, but is it really worth the trouble? Is it really all that useful to log into someone else’s machine and have all your settings be the same? Nope! Depends on some studio workflow policies (pair programming, standards on supported editors, etc.) but I still say no! Windows is simply not any good at this, and gets worse at it every year. Perhaps with a massive investment of custom tools, but…bah. Better things to work on.
Ok let’s move on. Next up: back to the series. Data Store 2: Server Shares.

