New Fun Blog – Scott Bilas

Take what you want, and leave the rest (just like your salad bar).

Data Store 1: The Local Workstation

Overview

In this upcoming series of posts, I’m going to catalog most of the types of data stores we use at Loose Cannon, along with features that would make you want to choose one versus another.

Sometimes the lines are blurred a bit, because data stores continue to add interesting features each year. For example, modern wikis are starting to get some decent revision control features for binary data. And some revision control systems such as Subversion have the ability to instrument the file system with metadata, like a database.

Sometimes tools can bridge the gap between these data stores as well. For example, a file system has no ability to notify via email of changes, but you can pretty easily write a file scanner that watches for changes and emails team members based on a spec. Overall, though, while tools like these are useful and can enhance the underlying data store, they can’t really change its basic nature. So I’ll be reviewing each with that in mind.
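To make the file scanner idea concrete, here is a minimal sketch in Python. It is not a tool we actually use; the function names, the shape of the "spec" (a list of glob pattern and address pairs), and the example addresses are all assumptions made up for illustration. It snapshots a folder, diffs two snapshots, and builds the email for whoever the spec says cares about the changed files.

```python
import fnmatch
import os
from email.message import EmailMessage


def snapshot(root):
    """Map every file path under root to its (mtime, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished between listing and stat
            state[path] = (st.st_mtime_ns, st.st_size)
    return state


def diff_snapshots(old, new):
    """Return sorted lists of (added, removed, changed) paths."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(p for p in new if p in old and new[p] != old[p])
    return added, removed, changed


def recipients_for(path, spec):
    """spec is a hypothetical list of (glob pattern, [addresses]) pairs."""
    found = []
    for pattern, addresses in spec:
        if fnmatch.fnmatch(os.path.basename(path), pattern):
            for address in addresses:
                if address not in found:
                    found.append(address)
    return found


def build_notification(sender, recipients, added, removed, changed):
    """Build (but do not send) an email summarizing the changes."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    total = len(added) + len(removed) + len(changed)
    msg["Subject"] = f"{total} file change(s) detected"
    lines = [f"added:   {p}" for p in added]
    lines += [f"removed: {p}" for p in removed]
    lines += [f"changed: {p}" for p in changed]
    msg.set_content("\n".join(lines))
    return msg
```

In practice you would call `snapshot` on a timer, diff against the previous result, and hand the message to something like `smtplib.SMTP(host).send_message(msg)`. Polling is the simplest approach; it is slower than OS-level change notifications but has no dependencies.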

This overview of data stores is probably going to be very basic for most, and you could even question why I’m bothering with such obvious stuff. Well, that’s what Ctrl-Yawn-W is for. But this is how I think, and so I can’t help it. I always start with the basic foundation and work my way up to the top. It’s slow, but regularly revisiting old assumptions is critical. As I write these upcoming articles I’ll probably end up reconsidering some of the things I do now and will make notes for changes to the next version of the tool chain for late 2009.

Anyway, back to the post. What we’re trying to do is answer the question “where does this go?” We’ll start with the first, most obvious option.

The Local Workstation Hard Drive

This is the fastest and easiest to use data store. Everyone stores at least some of their data on their local workstation’s hard drive.

For files that need to be “shared”, you can copy or email them around to the people you work with. There are services that make it easier to do this for distributed groups (Groove comes to mind), but at that point we’re outside the scope of local storage and into the realm of services. Let’s stick with local.

What Gets Stored Locally?

I see local storage used a lot with spreadsheets, concept art, in-progress design docs, audio samples, task lists, quickie test projects and scripts, and so on. Private and secure data is often stored locally as well: budgets with salary information, employee reviews, private emails, and so on. I myself keep loads of private notes, tasks, and research progress in OneNote and, occasionally, email (in recent years I’ve found email to be increasingly painful and hardly use it much for work).

Ownership and management of local data is clear and simple: you own the files and you organize them however you want! Nobody messes with them unless you explicitly share a location out and, even then, you control the permissions. You can send updates to whomever you want when you want. Operating system shells are pretty good at making this as simple and powerful as possible. You can add metadata and tags, create virtual search folders, organize using apps like Picasa and iTunes, and so on. It’s great!

Those same nice shells often store irritating hidden files like Thumbs.db, Picasa.ini, .DS_Store, and so on – files that are not meant to be shared and often clutter up shared data stores. We have a special trigger in Perforce to prevent people checking in these files by accident (there’s also .user, .bak, .~, .svn* and a hundred other siblings in this temporary family).
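The filtering half of a trigger like that is simple enough to sketch. What follows is not our actual trigger, and it leaves out the Perforce wiring entirely (how the trigger receives the list of files in a changelist). The pattern list is my reading of the file names above turned into globs, and `is_clutter` and `rejected_files` are names made up for this example.

```python
import fnmatch

# Shell droppings and temp files, lowercased for case-insensitive matching.
# The glob forms ("*.~*" in particular) are an assumed interpretation.
BLOCKED_FILE_PATTERNS = [
    "thumbs.db",
    "picasa.ini",
    ".ds_store",
    "*.user",
    "*.bak",
    "*.~*",
]

# Any path component matching this is rejected, so files inside
# .svn folders are caught as well.
BLOCKED_ANYWHERE_PATTERNS = [".svn*"]


def is_clutter(path):
    """True if the path looks like a file that should never be checked in."""
    parts = [p.lower() for p in path.replace("\\", "/").split("/") if p]
    if not parts:
        return False
    filename = parts[-1]
    if any(fnmatch.fnmatchcase(filename, pat) for pat in BLOCKED_FILE_PATTERNS):
        return True
    return any(
        fnmatch.fnmatchcase(part, pat)
        for part in parts
        for pat in BLOCKED_ANYWHERE_PATTERNS
    )


def rejected_files(paths):
    """Return the subset of paths that the trigger should refuse."""
    return [p for p in paths if is_clutter(p)]
```

A trigger built on this would fail the submit whenever `rejected_files` returns anything, and print the offending paths so the person knows what to remove from the changelist.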

Local Storage Can Hurt

All of this lovely freedom can make things easy for you and a freaky mess for everyone else on the team.

Say someone leaves the company. Now you have a computer filled with files and nobody knows where anything is. Their desktop is invariably going to be chaotic, with files like “sword_large_02.psd”, “sword-marketing2008.psd”, “Copy of” files and “New Folder (2)” folders and so on. And more files stored in C:\ and My Documents and other seemingly random locations, maybe even burned to piled-up CDs. I’ve known too many people who work like this. They run their hard drives down to 0 bytes free and then just delete random stuff to free up space. It’s Concentrated Crazy! Yet they still manage to kick out final PNGs to the right spot in the Perforce depot, so nobody notices the mess until it’s too late.

At Sierra we had a policy of burning to CD the full hard drive of anyone who left the team. With the turnover we had on Gabriel Knight 3, of course, that gave us a mighty big stack of CDs. Once or twice some poor intern had to go through it to (fail to) find this or that odd piece of marketing art we needed. I think all the CDs ended up in an offsite vault eventually. All that effort archiving, storing, and searching that mess was such a ridiculous waste of resources.

Beyond the hopefully rare “developer leaves team” story, local files are simply the opposite of good communication. They’re by definition private. Even if you share a folder out, it’s still your machine. If you want to send an update of a file to someone, you have to send an email or copy it to a share and IM them about it. And that’s private too, so you have to know who might need to know about these updates. This also assumes you even remember to do the update, and send the right file, and all that. I’ve lost track of the number of times I’ve gotten emails from remote workers who attached the wrong version of the file. Oops! Sorry about that dude, let me re-send!

It’s a lot of seemingly optional responsibility to attach to a person who is probably too busy to keep it organized. At least, organized as much as the rest of the team might need.

Don’t Mess With the Crazy

I used to strongly believe that if someone leaves the team you should be able to flatten their machine and give it to someone else without worrying about losing important data. Nobody should have to decode the Crazy. So of course I was often frustrated by people on my team who insisted on working this way.

Well, I still believe this, I’ve just given in to reality. Apps are written with the assumption of either working locally, or working through some kind of custom sharing service (like Office + SharePoint, and even that isn’t done very well). It’s an uphill battle to try to enforce a structure and process on what amounts to an inherently personalized and optimized development experience. So many things are just easier locally. Like working with folders! You can rename them, or move them wherever you like, instantly. In contrast, doing the same through source control is a tedious pain.

There’s nothing necessarily wrong with using local storage for ad-hoc, brainstorm-y, temporary, or private work. The trick is knowing when to move things off the local drive to another, more team-friendly data store. One where people can subscribe to changes made, get their own copy, get a history, and so on.

So how do you know where it goes and when to start maintaining it there instead? In writing this I’m trying to figure out some basic rules for what can stay local and what should get promoted. But it depends so much on the team, discipline, type of data, and so on. Some things are easy and absolute: all source required to build the game and its content goes into source control. Some things are more gray: where does that intermediate concept art go? Where do those little prototype gameplay modules go?

For the easy ones, look to the upcoming articles. For the more gray issues, I’m going to fall back on the “make sure it’s backed up” solution. Hope for the best, and if things go bad, go to backup for recovery. But we want to avoid the expensive problem of backing everything up and having a big disorganized mess of Crazy, right?

The solution to that, I think, is in a detour article I’m posting next on backup options. One great way to have your team members separate out the signal from the noise on their hard drive is through a partial backup solution. More on this next!

December 9th, 2008 at 10:00 pm

3 Responses to 'Data Store 1: The Local Workstation'

  1. When dealing with the Crazy, I make a snapshot or backup of the entire hard drive, or even just take the hdd out and set it on a shelf. That way 9 months later when someone goes “oh it was on so-and-so’s machine” it’s possible to look like the hero sometimes. Granted this doesn’t scale well, but for a game studio it works great.

    Treat local storage like a black box when approaching it from the outside.

    Aaron Bockelie

    3 Jan 09 at 11:16 pm

  2. Yeah HDD’s are pretty cheap. And a year or two later when you’re certain you don’t need their data, stick the drive in some other machine that needs an auxiliary (like the DJ for more music!)

    Scott

    4 Jan 09 at 12:52 pm
