New Fun Blog – Scott Bilas

Take what you want, and leave the rest (just like your salad bar).

Some Notes on Dungeon Siege

without comments

I get emails on occasion from people asking about this or that GDC talk I did. Recently someone asked me a few questions about my Continuous World paper, so I figured I’d reprint it here.

Q: Could you describe some of the ” countless optimizations ” based on the world node graph?

It’s been a very long time since I worked on DS so I’ve forgotten most of the optimizations. We had stuff for physics, shooting arrows, pathfinding, all of which I didn’t work on much unfortunately. One big one I do remember is the “AIQuery” system. Most AI-capable objects would query periodically to ask what’s near them and what they’re doing. So a shaman may ask for dead krugs near it so it can consider resurrecting them. Or the hero waits until an enemy enters a certain radius and auto attacks if so. This can be ridiculously expensive, worst case being n^2 distance calcs per frame (where n = # active game objects). In DS’s case it’s even more expensive due to the node/world transformations required to get the distance.

A normal game just does a space partitioning scheme like quadtrees (DS was essentially a 2D game, we usually dropped the ‘z’ component). In DS the whole game is already partitioned based on these siege nodes, so we did tricks with that, where a database would track who’s in which node, watch when transitions were made, and cache queries. So it may have a stationary object asking for enemies every frame. This would get cached as a query that returned instantly, only getting invalidated as either the object moves past a certain distance, or if the membership state of any affected nodes changes. Or of course if a new node enters the world or a current one leaves it, based on world streaming.

I think some more of the optimizations we did were trying to work in local space as long as possible, and then getting notified when the focused (camera) node changed. Looking back, that’s absolutely how we should have done everything, assuming we kept the node system (which many on the team didn’t think was worth the cost). I would have wrapped up the concept of a position in a class owned by a manager, and updated for the new transform on entering a new node, everybody all at once. Then have individual objects check in/out positions from the list as needed for real-time updates. But I really would try to do everything in my power to discourage using a node-based system like that again. Very interesting engine but that feature cost us a couple million dollars at least.

Q: Could you explain the curious “spherical traits-based occupants queries” used by AI? Some kind of walk through the graph?

Ah, that’s the AIQuery system. A few where spherical but most were just radius based. I think we always referred to them as spheres, because that’s how the debug engine drew them. Anyway, occupants were members of siege nodes. So for example, a call might look like this:

gAIQuery.GetOccupantsInSphere(
  GetGo()->GetPlacement()->GetPosition(),
  r,
  GetGo(),
  NULL,
  output ? 0 : 1,
  OF_PARTY_MEMBERS | OF_ALIVE,
  output,
  !IsRidingElevator() );

This says to do a query based on the current component’s owner Go’s position, of a certain radius, with a source used in the filter part, up to 1 member, where we only want alive party members, and only cache it if we’re not riding an elevator. So this one finds the closest alive party member. The filter (OF_*) checks against traits in the Go set at load-time and updated as needed during runtime. In this case the cache would probably be a sorted list of Go’s relative to the current Go out to the max radius asked ‘recently’. So on a cache hit it just has to walk through that short list, testing max radii and the filter to see who to include in the output. Cached queries would expire after too much had changed (distance, occupancy, etc.) or after a certain amount of time (around a second or two I believe).

It was a difficult system to get right, had a lot of bugs until the very end. I would never do it this way again. Uncached queries using normal space partitioning (like quad trees) is a much better way.

Q: Could you describe some of the “self-healing” features used by the logic? I imagine things like event checking to suspend execution etc.

I think I used ‘self-healing’ with a couple meanings.

First, it would repair bad content. Stuff like values out of range, handling where someone typed in the class of an object wrong, or parse errors in the spec files. My attitude was that the game should never ever crash. It should complain loudly and then try as best it can to keep going. The most important thing is to keep the content developers moving. They’re making the game, and try to keep out of their way. A crash is going to stop them completely. Then log the errors and deal with them separately.

And second, and I think this is the more interesting one, it would gracefully handle severe failures in the game related to objects going missing. In DS, an object you’re working with – shooting an arrow at it, have a “daze” spell cast on it, whatever – may simply disappear next frame. Happened all the time, due to world streaming. We did some tricks with putting objects in purgatory and deferring deletes and such but we needed to release these things to get the memory back. In DS, programmers were getting the rug pulled out from under them pretty much constantly. And it is pretty painful to write code where you are checking if an object exists or not all the time. In script code it’s easy – I could detect a NULL and just abort the script. Simple. In C++ code it was more work, and honestly most of the time we did just end up testing for IsValid on everything. But most of the game’s AI was done in script so this handled it pretty well.

Q: Could you provide an overview of the loading process and some of the optimizations and special cases you discovered?

If I remember right it went something like this. We’d check to see if one of the world frustums had moved (there were up to 8, one for each party member) and if so, see if that encompassed any new siege nodes. If so, we queued the geometry and textures etc. to load on another thread. Once those work orders were complete we’d populate them with game objects based on the SCID’s in the level’s .gas files. Again, on another thread. It was nuts – everything was multithreaded in this game, which is easy to screw up. We’d run script, do initial pose animations, whatever. Game objects would calculate parameterized content (such as their starting inventory) on this thread too, and then recursively load them. On the main thread we’d also get orders from the server in multiplayer games to construct Go’s.

There really wasn’t much we could do to optimize the loading process beyond the usual stuff you do in a game – reduce the size of the data to transfer, reduce how long calls are blocked due to fighting over resources controlled by critical sections, write a little assembly, use different types of specialized allocators, etc. Memory usage was a huge concern, so I had to do things like making a string database that stored unique strings, where things like the template name would reference off that.

After all that, we’d figure out which siege nodes were no longer in any world frustum, and queue them for deletion, which would in turn queue their occupants for deletion. Not super interesting but this is where the self-healing code came in. Old references were still often valid for about 15 seconds if I remember right, before they’d finally end up colliding and point at a real (but the wrong) object instead. That had more to do with the size of the magic # I used in the ID (5 bits I think) than any timer.

I remember on the siege node loader we had a lot of pathological cases to deal with, from the way the worlds were constructed. Every major area in DS was very different. We couldn’t optimize for a single type of level, such as the farmhouse to Stonebridge route. That had nodes that were a couple meters square on average, and they were all rectangular pretty much. The castle on the other hand had enormous nodes of weird shapes, which was another set of special cases we had to do (I remember lighting info being one of them, just storage space). And then there were these really long crazy high up nodes in the dragon cave that had yet another use pattern. This game was the opposite of simple.

Well that’s about the contents of my brain, everything else has fallen out!

Scott

December 10th, 2007 at 3:50 am

Leave a Reply