When you were first learning about how .NET finalization works, did it just feel wrong to you? It sure did for me. The mechanisms and rules involved with finalizers always felt painfully over-complicated, hard to get right, and hacky. Here we have this clean managed-memory paradigm that feels great to use. And it’s got a big gnarly barnacle named Finalize growing out of its side that we’re supposed to use to deal with unmanaged resources.
I learned .NET when the beta of 1.0 was out, and at the time Microsoft was putting a lot of effort into educating programmers about the difference between native and managed programming to help them make the switch. “In C++ you’d do this, but in .NET you do this” kind of thing. So practically every article on the CLR talked about finalizers. They often made analogies to C++ destructors. C# even added some ill-advised sugar that converts ~Class() to Finalize() to make us grognards feel more at home.
Finalizers were obviously very important, and so we learned about them. We evolved base classes to help us do the boilerplate work and wished that the .NET languages supported mixins. But it never felt right to me.
The Problem With Finalizers
Recently, after a couple conversations at work, I figured out what’s been bothering me about finalizers. The problem is that we’re using a commodity manager to manage non-commodity resources.
In .NET, raw memory can be treated as a commodity, like a gas tank. We can use it fluidly and at any granularity, and treat it all roughly the same. This is a simplification of course, and there are performance considerations, but they closely map onto the underlying OS and are familiar to everyone.
Yet non-memory resources are not commodities and cannot be treated with a one-size-fits-all pattern. They have semantics and effects far beyond incrementally draining and refilling that gas tank. Each situation, each class, is different, and has different implications that we have to remember. Because of this, we can’t treat these resources in a nondeterministic way without potential hazards.
For example, take a file handle managed by a finalizer. Off the top of my head, there are three adverse effects of an unused handle not finalizing in time:
- If there are pending writes to the file, then they are lost if the app exits without forcing finalizers to be called.
- If the file was opened with restrictive sharing, then other applications cannot manipulate it until the GC gets around to finalizing the file object.
- The OS isn’t designed to treat file handles as a commodity, and using them that way can negatively affect performance. We could even run out of file handles because the GC has no idea when there’s pressure in this space (all finalizers are equal).
Worse yet, in diagnosing any of the above issues, we also must roll our own tools. System tools such as LockHunter and Process Explorer have no way of distinguishing which handles are actually in use and will just give us a noisy, useless mess.
And that’s just a simple file handle example. The situation is obviously worse when more limited and complicated resources are involved, such as DirectX surfaces or database connections.
Dispose: Only A Partial Solution
You might wonder why I’m making a fuss. There’s an obvious answer to the non-determinism, right? Microsoft recognized this problem early, in .NET 1.0, and gave us the Dispose pattern. It is a standard way of managing resources the old-fashioned way: ‘new’ allocates the resource, and Dispose() frees it. We even have some special syntax that helps automate this:
using (var textureMgr = new TextureManager())
using (var texture = textureMgr.AllocTexture())
{
    // use texture here
}
The above is roughly equivalent to:
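var textureMgr = new TextureManager();
try
{
    var texture = textureMgr.AllocTexture();
    try { /* use texture here */ }
    finally { if (texture != null) texture.Dispose(); }
}
finally { if (textureMgr != null) textureMgr.Dispose(); }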
This gives us a rough, though somewhat tedious, approximation of the RAII pattern used universally in C++.
In Microsoft’s .NET classes that wrap unmanaged resources, both patterns are typically used. There is a Dispose method that closes the handle, and a safety Finalize that cleans up if that was not already done via Dispose.
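Here is a minimal sketch of that combination, not taken from any particular Microsoft class. NativeBuffer is a hypothetical name I made up for illustration; it wraps a block of unmanaged memory obtained from Marshal.AllocHGlobal. Dispose is the deterministic path, and the finalizer is only the safety net.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around an unmanaged allocation (illustration only).
public sealed class NativeBuffer : IDisposable
{
    private IntPtr _memory;

    public NativeBuffer(int byteCount)
    {
        _memory = Marshal.AllocHGlobal(byteCount);
    }

    public bool IsDisposed
    {
        get { return _memory == IntPtr.Zero; }
    }

    // Deterministic path: release now, then tell the GC that the
    // finalizer no longer needs to run for this instance.
    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);
    }

    // Safety net: only runs if Dispose was never called.
    ~NativeBuffer()
    {
        Release();
    }

    // Safe to call more than once; the second call does nothing.
    private void Release()
    {
        if (_memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_memory);
            _memory = IntPtr.Zero;
        }
    }
}
```

The class is sealed to keep the sketch short. An inheritable type would normally use the longer protected Dispose(bool) form instead.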
So what’s the big deal? Well, the problem is in (a) knowing when it is necessary to call Dispose, (b) remembering to call it, and (c) updating dependent code when an existing type adds a new IDisposable implementation. It’s easy to miss something and let bugs creep in. For every single ‘new’ call, we must check the type to see if it or one of its parent classes implements IDisposable. If it does, then we must manage the instance directly. This means wrapping allocation in the ‘using’ construct for local temporaries, or implementing IDisposable and forwarding Dispose if contained as a member in another class.
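For the member case, the forwarding looks something like the sketch below. ReportWriter is a hypothetical class invented for this example; the only point is that owning a StreamWriter forces the owner to become IDisposable too, and every caller of the owner now inherits the same obligation.

```csharp
using System;
using System.IO;

// Hypothetical class that owns a disposable member (illustration only).
public sealed class ReportWriter : IDisposable
{
    private readonly StreamWriter _writer;

    public ReportWriter(Stream destination)
    {
        _writer = new StreamWriter(destination);
    }

    public void WriteLine(string text)
    {
        _writer.WriteLine(text);
    }

    // Forward Dispose to the owned member. Disposing the StreamWriter
    // flushes buffered text and closes the underlying stream.
    public void Dispose()
    {
        _writer.Dispose();
    }
}
```

There is no finalizer here because the class holds no unmanaged resource directly; it only has to pass the call along.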
Forgetting to dispose an unmanaged resource can lead to some of the most frustrating and hard-to-track-down bugs. And the nondeterminism in the underlying system pretty much squares that problem. It’s guaranteed to behave differently in the field than on a development machine.
Every new .NET programmer figures this out quickly when they write their first command line app that opens a file, does something to it, and writes out a new file. They find that, apparently randomly, sometimes the end of the new file is cut off. Forgot to call Dispose to flush and close it, eh?
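The cut-off happens because StreamWriter buffers text and only pushes it to the underlying stream on a flush. A small sketch of the effect follows; TruncationDemo is a made-up name, and a MemoryStream stands in for the output file so the byte counts can be inspected.

```csharp
using System;
using System.IO;

// Hypothetical demo (illustration only): a MemoryStream stands in for a file.
public static class TruncationDemo
{
    // Returns how many bytes had reached the stream before and after Dispose.
    public static (int before, int after) Run()
    {
        var stream = new MemoryStream();
        var writer = new StreamWriter(stream);

        // A short write stays in the writer's internal buffer.
        writer.Write("last line of the file");
        int before = stream.ToArray().Length;

        // Dispose flushes the buffer, then closes the stream.
        writer.Dispose();
        int after = stream.ToArray().Length;

        return (before, after);
    }
}
```

If the process exits between those two measurements, whatever was still sitting in the buffer never makes it to disk.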
Most Strategies Not 100% Guaranteed
It’s not hopeless. We’ve come up with strategies to deal with this. First is building up a knowledge base. Over time, we get a sense for what types of objects tend to implement IDisposable. File handles, database connections and so on, those are easy. But what about systems written by someone else on your team? Does the source control connection have to get disposed when you’re done with it because it wraps a COM object? For those cases, it’s better to be safe and check for IDisposable until the foreign classes become familiar.
But these strategies still aren’t enough for me. I want absolute certainty that Dispose is called on all unmanaged resources, and not left to the nondeterministic finalizer system to cause latent timing problems.
So I want to propose an additional strategy. It’s pretty simple, actually: consider the finalizer to be a last, and illegal, resort. Microsoft’s guidelines say that you should not assume Dispose has been called. I propose the reverse: if the finalizer is called, then there is a bug in the code.
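To make the idea concrete, here is one possible shape for it. This is only a sketch under my own assumptions: StrictDisposable and CountingResource are hypothetical names, and Debug.Fail is just one way to make noise in a debug build.

```csharp
using System;
using System.Diagnostics;

// Hypothetical base class (illustration only): treats finalization as a bug.
public abstract class StrictDisposable : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        if (IsDisposed) return;
        IsDisposed = true;
        ReleaseResources();
        // The expected path: the finalizer never needs to run.
        GC.SuppressFinalize(this);
    }

    ~StrictDisposable()
    {
        // Reaching this point means someone forgot to call Dispose.
        // Debug.Fail is compiled out of release builds.
        Debug.Fail(GetType().Name + " was finalized without being disposed.");
        ReleaseResources();
    }

    // Must only touch resources that are safe to release from the
    // finalizer thread.
    protected abstract void ReleaseResources();
}

// Hypothetical subclass that counts releases so the behavior is observable.
public sealed class CountingResource : StrictDisposable
{
    public int ReleaseCount { get; private set; }

    protected override void ReleaseResources()
    {
        ReleaseCount++;
    }
}
```

With this in place, a correctly written program never reaches the finalizer, and a leaked instance announces itself during development instead of causing a timing bug in the field.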
More on this in the next post.