New Fun Blog – Scott Bilas

Take what you want, and leave the rest (just like your salad bar).

Assertive Finalizers

In my previous post, I talked about why I have stopped using finalizers for unmanaged resource collection. I want this to be done through the disposable pattern instead, forcing the programmer to manage resources manually.

Ironically, finalizers are a great way to verify this.

Sander van Rossen quickly figured out where I was going with this and proposed in a comment that we can just assert in a finalizer. We just need a couple more things:

  1. We need to track the source of the object to figure out where the leak is.
  2. We need to ensure that finalizers run on shutdown, or our assert will never get hit.

This is familiar territory to C++ programmers. Most of us use memory management libraries that provide leak detection and reporting. Let’s do something similar in C#.

DisposableBase

If you look around online, you’ll find some full-featured IDisposable base classes intended to deal with the full finalization model. We can eliminate most of that as either too complicated or unnecessary. We just need a few things:

  • A stack trace grabbed at construction time
    This will be used for the error report. Because this is expensive to gather, we need to control it with an #if that is off by default, only turned on if needed to help an investigation. Most of the time a leaked disposable will be easy to find by inspection, but in the 5% case we will need a lot more context.
  • A finalizer that reports the problem
    It could throw an exception, fire an assert, or route to an error reporter. Depends on the application. In my example I just have it output to the debug window for a demo.
  • Disposal helper methods
    These wrap up disposal a bit, so inheritors only need to implement OnDispose. The most important feature, though, is that Dispose will call GC.SuppressFinalize when our object is disposed. This eliminates the performance cost of having a finalizer in the normal case when clients are disposing this class properly. This is why the finalizer has no “if” in it – if it ever gets called, then we have a bug.

This is what I am currently using as my base class for handling unmanaged resources:

// comment out unless diagnosing a leak
#define DEBUG_DISPOSE

using System;
using System.Diagnostics;

public abstract class DisposableBase : IDisposable
{
    // store stack at point of construction for possible later use
#	if DEBUG_DISPOSE
    StackTrace _trace = new StackTrace(true);
#	endif

    // finalizer will not be called if object was properly disposed
    ~DisposableBase()
    {
        string message = "!! Forgot to dispose a " + GetType().FullName;
#		if DEBUG_DISPOSE
        message += "\n\nStack at construction:\n\n" + _trace + "!!";
#		endif
        Debug.WriteLine(message);
    }

    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        ThrowIfDisposed();
        IsDisposed = true;

        try { OnDispose(); }
        finally { GC.SuppressFinalize(this); }
    }

    protected abstract void OnDispose();

    protected void ThrowIfDisposed()
    {
        if (IsDisposed)
            throw new ObjectDisposedException(GetType().FullName);
    }
}
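
The finalizer above writes to the debug window, but as mentioned it could just as well fire an assert or route to an error reporter. Here is a minimal sketch of one way to keep that choice out of the base class. It is an illustration under assumptions, not part of the class above: LeakReporter and ReportingDisposableBase are hypothetical names, and the default handler simply keeps the Debug.WriteLine behavior.

```csharp
using System;
using System.Diagnostics;

// Hypothetical hook: decides what happens when a leak is detected.
public static class LeakReporter
{
    static readonly Action<string> DefaultHandler =
        message => Debug.WriteLine(message);

    static Action<string> _handler = DefaultHandler;

    // Assigning null restores the default debug-window behavior.
    public static Action<string> Handler
    {
        get { return _handler; }
        set { _handler = value ?? DefaultHandler; }
    }

    // Called from finalizers, so the handler runs on the finalizer thread.
    public static void Report(string message)
    {
        _handler(message);
    }
}

// Hypothetical variant of DisposableBase that reports through the hook.
public abstract class ReportingDisposableBase : IDisposable
{
    // Only runs if the object was never disposed.
    ~ReportingDisposableBase()
    {
        LeakReporter.Report("!! Forgot to dispose a " + GetType().FullName);
    }

    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        if (IsDisposed)
            throw new ObjectDisposedException(GetType().FullName);
        IsDisposed = true;

        try { OnDispose(); }
        finally { GC.SuppressFinalize(this); }
    }

    protected abstract void OnDispose();
}
```

An application would set LeakReporter.Handler once at startup, for example to append to a log. Keep in mind that the handler runs on the finalizer thread, and an unhandled exception there normally terminates the process, so a throwing handler is a deliberate "crash on leak" choice.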

Demonstration

Here is a simple test app that shows what happens if we forget to dispose an instance.

public class DatabaseConnection : DisposableBase
{
    protected override void OnDispose()
        { Debug.WriteLine("Disposing"); }
}

public class Program
{
    static void Main(string[] args)
    {
        using (var remembered0 = new DatabaseConnection())
        using (var remembered1 = new DatabaseConnection())
        {
        }

        var forgotten = new DatabaseConnection();

        GC.Collect();
        GC.WaitForPendingFinalizers();
    }
}

The first two instances dispose fine, but the third is leaked and so we get a log to the output window:

Disposing
Disposing
!! Forgot to dispose a DatabaseConnection

Stack at construction:

   at DisposableBase..ctor() in C:\Users\Scott\Cloud\Proj\tests\BlogSamples\ConsoleApplication1\Finalizers.cs:line 18
   at DatabaseConnection..ctor()
   at Program.Main(String[] args) in C:\Users\Scott\Cloud\Proj\tests\BlogSamples\ConsoleApplication1\Finalizers.cs:line 89
   at System.AppDomain._nExecuteAssembly(RuntimeAssembly assembly, String[] args)
   at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args)
   at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
   at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean ignoreSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()
!!

We can easily zero in on the exact spot where the leaked resource was allocated and wrap it in a ‘using’ to resolve.

Note the use of GC.Collect and GC.WaitForPendingFinalizers right before the test application exits. This is necessary in order to force all finalizing objects to be collected and reported before process shutdown. Otherwise they are simply dropped by the system when the process’s memory is released.

These two calls can also be made while the app is running, for leak testing without needing to wait for shutdown. This would be useful in a live service with a periodic leak for which we want an in-session nonfatal log of leaks.
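
As a rough sketch of that in-session check, the two calls can be wrapped in a helper and driven from a timer. LeakCheck and StartPeriodic are hypothetical names chosen for illustration, and the interval is up to the application. A forced full collection is not cheap, so this is something to switch on while chasing a leak rather than leave running.

```csharp
using System;
using System.Threading;

// Hypothetical helper: forces pending finalizers to run so that any
// "forgot to dispose" reports show up now instead of at some later GC.
public static class LeakCheck
{
    public static void Run()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
    }

    // Repeats the check on a thread pool timer.
    // Dispose the returned timer to stop the checks.
    public static Timer StartPeriodic(TimeSpan interval)
    {
        return new Timer(_ => Run(), null, interval, interval);
    }
}
```

A service could hold the timer from StartPeriodic(TimeSpan.FromMinutes(5)) in a using block around its main loop, so the checks stop when the service does.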

What About External Classes?

This takes care of our own classes, where we have full control. But what about system or third party classes that are using finalizers? With those it’s back to square one.

Well, I suppose we could write a helper class:

public class SafeDisposer<T> : DisposableBase where T : IDisposable
{
    public SafeDisposer(T disposable) { Obj = disposable; }

    public T Obj { get; private set; }

    protected override void OnDispose()
    {
        Obj.Dispose();
        Obj = default(T);
    }
}

public static class SafeDisposer
{
    public static SafeDisposer<T> Wrap<T>(T disposable) where T : IDisposable
        { return new SafeDisposer<T>(disposable); }
}

And equivalent updates in the demo code:

public class Program
{
    static void Main(string[] args)
    {
        using (var remembered2 = SafeDisposer.Wrap(new StringReader("foo")))
        using (var remembered3 = SafeDisposer.Wrap(new StringReader("poo")))
        {
            remembered2.Obj.Peek();
        }

        var forgotten2 = SafeDisposer.Wrap(new StringReader("boo"));

        GC.Collect();
        GC.WaitForPendingFinalizers();
    }
}

And output:

!! Forgot to dispose a SafeDisposer`1[[System.IO.StringReader, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]]

Stack at construction:

   at DisposableBase..ctor() in C:\Users\Scott\Cloud\Proj\tests\BlogSamples\ConsoleApplication1\Finalizers.cs:line 18
   at SafeDisposer`1..ctor(T disposable) in C:\Users\Scott\Cloud\Proj\tests\BlogSamples\ConsoleApplication1\Finalizers.cs:line 53
   at SafeDisposer.Wrap[T](T disposable) in C:\Users\Scott\Cloud\Proj\tests\BlogSamples\ConsoleApplication1\Finalizers.cs:line 67
   at Program.Main(String[] args) in C:\Users\Scott\Cloud\Proj\tests\BlogSamples\ConsoleApplication1\Finalizers.cs:line 98
   at System.AppDomain._nExecuteAssembly(RuntimeAssembly assembly, String[] args)
   at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args)
   at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
   at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean ignoreSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()
!!

So to use, we just wrap a disposable at the point of creation in SafeDisposer.Wrap() and access it through “.Obj”. Not bad, but not great either. I’m ok with the wrapper function, but the required access via the Obj member is a pain, and also means that converting an unsafe disposable into a safe one requires updating a lot of code.

Another option is (maybe) to use a dynamic proxy system to inject the functionality we need, letting client code remain unchanged except at the point of creation. Or perhaps a run-time system that patches system assemblies to do the injection.

I’ll leave this as an exercise for the reader because I think we’re probably getting seriously diminishing returns at this point. Most disposable objects in a large system will be classes that we have full control over. The few objects not under our control will likely be low level primitives that will be wrapped up by our own foundation classes anyway.

March 8th, 2010 at 6:04 pm

Posted in .net, programming

Finalizers: An Incomplete Pattern

When you were first learning about how .NET finalization works, did it just feel wrong to you? It sure did for me. The mechanisms and rules involved with finalizers always felt painfully over-complicated, hard to get right, and hacky. Here we have this clean managed-memory paradigm that feels great to use. And it’s got a big gnarly barnacle named Finalize growing out of its side that we’re supposed to use to deal with unmanaged resources.

I learned .NET when the beta of 1.0 was out, and at the time Microsoft was putting a lot of effort into educating programmers about the difference between native and managed programming to help them make the switch. “In C++ you’d do this, but in .NET you do this” kind of thing. So practically every article on the CLR talked about finalizers. They often made analogies to C++ destructors. C# even added some ill-advised sugar that converts ~Class() to Finalize() to make us grognards feel more at home.

Finalizers were obviously very important, and so we learned about them. We evolved base classes to help us do the boilerplate work and wished that the .NET languages supported mixins. But it never felt right to me.

The Problem With Finalizers

Recently, after a couple conversations at work, I figured out what’s been bothering me about finalizers. The problem is that we’re using a commodity manager to manage non-commodity resources.

In .NET, raw memory can be treated as a commodity, like a gas tank. We can use it fluidly and at any granularity, and treat it all roughly the same. This is a simplification of course, and there are performance considerations, but they closely map onto the underlying OS and are familiar to everyone.

Yet non-memory resources are not commodities and cannot be treated with a one-size-fits-all pattern. They have semantics and effects far beyond incrementally draining and refilling that gas tank. Each situation, each class, is different, and has different implications that we have to remember. Because of this, we can’t treat these resources in a nondeterministic way without potential hazards.

For example, take a file handle managed by a finalizer. Off the top of my head, there are three adverse effects of an unused handle not finalizing in time:

  1. If there are pending writes to the file, then they are lost if the app exits without forcing finalizers to be called.
  2. If the file was opened with restrictive sharing, then other applications cannot manipulate it until the GC gets around to finalizing the file object.
  3. The OS isn’t designed to treat file handles as a commodity, and using them that way can negatively affect performance. We could even run out of file handles because the GC has no idea when there’s pressure in this space (all finalizers are equal).

Worse yet, in diagnosing any of the above issues, we also must roll our own tools. System tools such as LockHunter and Process Explorer have no way of distinguishing which handles are actually in use and will just give us a noisy, useless mess.

And that’s just a simple file handle example. The situation is obviously worse as more limited and complicated resources are involved like DirectX surfaces or database connections.

Dispose: Only A Partial Solution

You might wonder why I’m making a fuss. There’s an obvious answer to the non-determinism, right? Microsoft recognized this problem early in .NET 1.0 and gave us the disposable pattern. It is a standard way of managing resources the old-fashioned way: ‘new’ allocates the resource, and Dispose() frees it. We even have some special syntax that helps automate this:

using (var textureMgr = new TextureManager())
using (var texture = textureMgr.AllocTexture())
{
    texture.Fill(Color.White);
}

The above is roughly equivalent to:

var textureMgr = new TextureManager();
try
{
	var texture = textureMgr.AllocTexture();
	try
	{
		texture.Fill(Color.White);
	}
	finally
	{
		texture.Dispose();
	}
}
finally
{
	textureMgr.Dispose();
}

This gives us a rough, though somewhat tedious, approximation of the RAII pattern used universally in C++.

In Microsoft’s .NET classes that wrap unmanaged resources, both patterns are typically used. There is a Dispose system that closes the handle, and a safety Finalize that cleans up if that was not already done via Dispose.

So what’s the big deal? Well, the problem is in (a) knowing when it is necessary to call Dispose, (b) remembering to call it, and (c) updating dependent code when an existing type adds a new IDisposable implementation. It’s easy to miss something and let bugs creep in. For every single ‘new’ call, we must check the type to see if it or one of its parent classes implements IDisposable. If it does, then we must manage the instance directly. This means wrapping allocation in the ‘using’ construct for local temporaries, or implementing IDisposable and forwarding Dispose if contained as a member in another class.
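
To make the "contained as a member" case concrete, here is a minimal sketch of a class that owns a disposable and so has to forward Dispose to it. ReportWriter is a hypothetical class invented for this illustration, and the sketch assumes the class owns the TextWriter it is given.

```csharp
using System;
using System.IO;

// Hypothetical class that owns a disposable member and therefore
// has to become disposable itself.
public sealed class ReportWriter : IDisposable
{
    readonly TextWriter _writer;   // owned: we are responsible for disposing it
    bool _disposed;

    public ReportWriter(TextWriter writer)
    {
        if (writer == null) throw new ArgumentNullException("writer");
        _writer = writer;
    }

    public void WriteLine(string text)
    {
        if (_disposed) throw new ObjectDisposedException(GetType().FullName);
        _writer.WriteLine(text);
    }

    // Forward disposal to the member we own.
    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;
        _writer.Dispose();
    }
}
```

Note how the obligation spreads: anything that holds a ReportWriter as a member now has to do the same thing, which is exactly the kind of dependent-code update described in (c).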

Forgetting to dispose an unmanaged resource can lead to some of the most frustrating and difficult-to-track-down bugs. And the nondeterminism in the underlying system pretty much squares that problem. It’s guaranteed to behave differently in the field than on a development machine.

Every new .NET programmer figures this out quickly when they write their first command line app that opens a file, does something to it, and writes out a new file. They find that, apparently randomly, sometimes the end of the new file is cut off. Forgot to call Dispose to flush and close it, eh?
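
The buffering behind that truncated file is easy to see in isolation. The sketch below uses a MemoryStream as a stand-in for the file so that it stays self-contained. BufferedWriteDemo is a hypothetical name, and the example assumes the default StreamWriter settings (no auto-flush, UTF-8 without a byte order mark).

```csharp
using System;
using System.IO;

public static class BufferedWriteDemo
{
    // Returns the number of bytes that reached the underlying stream.
    public static long BytesWritten(bool dispose)
    {
        var stream = new MemoryStream();
        var writer = new StreamWriter(stream);
        writer.Write("hello");      // sits in the writer's internal buffer

        if (dispose)
            writer.Dispose();       // flushes the buffer, then closes the stream

        // ToArray still works after the MemoryStream has been closed.
        return stream.ToArray().Length;
    }
}
```

BytesWritten(false) returns 0 and BytesWritten(true) returns 5. StreamWriter has no finalizer of its own, so when the writer is simply abandoned the buffered text never reaches the stream at all.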

Most Strategies Not 100% Guaranteed

It’s not hopeless. We’ve come up with strategies to deal with this. First is our knowledge base. Over time, we get a sense for what types of objects tend to implement IDisposable. File handles, database connections and so on, those are easy. Though what about systems written by someone else on your team? Does the source control connection have to get disposed when you’re done with it because it wraps a COM object? For those cases, it’s better to be safe and check for IDisposable until the foreign classes become familiar.

There are also tools that help out. FxCop can find problems that are discoverable via static analysis. CodeRush will draw graphical markup for types it detects as IDisposable.

But these strategies still aren’t enough for me. I want absolute certainty that Dispose is called on all unmanaged resources, and not left to the nondeterministic finalizer system to cause latent timing problems.

So I want to propose an additional strategy. It’s pretty simple, actually: consider the finalizer to be a last, and illegal, resort. Microsoft’s guidelines say that you should not assume Dispose has been called. I propose the reverse: if the finalizer is called, then there is a bug in the code.

More on this in the next post.

March 1st, 2010 at 8:45 pm

Posted in .net, programming

Bracketing Operations With IDisposable

I learned a fun little .NET trick today at work, figured I’d pass it along. This is probably a well-known pattern, but I don’t know its name.

The Problem

Say you have a pair of functions that bracket other operations you want to do. For example, you must first call BeginRender() before you call RenderModel(), and after you’re all done, you need to call EndRender(). Another example would be an EnterLock() on a critical section, that must be accompanied by a LeaveLock(). I’ll call these bracketing functions simply “BeginX” and “EndX” for the purposes of this article.

The problem is that the caller must not only remember to call EndX, but must do so even in case of an exception, to avoid blocking the system.

The Solution: IDisposable

What we’re doing here, really, is acquiring and releasing a resource. The .NET Framework already has a pattern for that, and it’s called IDisposable. Ok, that’s not much of a trick in itself. What I learned today was a little twist on the concept.

The trick is simple:

  1. Create a private inner class that implements IDisposable.
  2. Return a new’d instance of this class in BeginX.
  3. In the instance’s Dispose(), have it call the private EndX.
  4. Callers can simply wrap the ‘handle’ in a using statement for cleanup.

It’s obvious in retrospect, and it’s clean. The ‘using’ guarantees calling EndX even in case of an exception. And in leveraging IDisposable we can build on top of the C# programmer’s instinct to wrap usages of IDisposable objects in ‘using’ constructs. Failing that, we can rely on tools like CodeRush to catch it as we type (or FxCop as a fallback).

Implementation

First, we need a fun little helper base class.

public abstract class DisposerBase<T> : IDisposable
{
    bool _disposed;
    T _owner;

    public DisposerBase(T owner)
        { _owner = owner; }

    public void Dispose()
    {
        Debug.Assert(!_disposed);
        _disposed = true;
        OnDispose(_owner);
    }

    protected abstract void OnDispose(T owner);
}

Here’s a sample class that demonstrates the IDisposable-bracket pattern. Note the inner class that has access to private methods, which we use to call EndX. This method is private so that the only way to close the bracket is to call Dispose() on the handle. Simple, straightforward, consistent.

public class TransactionManager
{
    bool _inTransaction;

    public IDisposable BeginTransaction()
    {
        Debug.Assert(!_inTransaction);
        Console.WriteLine("Begin Transaction");

        _inTransaction = true;

        // ... do other transaction setup here ...

        // caller must dispose this when done
        return new TransactionDisposer(this);
    }

    private void EndTransaction(TransactionDisposer state)
    {
        Debug.Assert(_inTransaction);
        Console.WriteLine("End Transaction");

        // ... do transaction cleanup here ...
        // (state could provide more context if needed)

        _inTransaction = false;
    }

    #region TransactionDisposer

    class TransactionDisposer : DisposerBase<TransactionManager>
    {
        // obviously some state could go here
        // (say, for more transactions in flight at once)

        public TransactionDisposer(TransactionManager owner)
            : base(owner) { }

        protected override void OnDispose(TransactionManager owner)
            { owner.EndTransaction(this); }
    }

    #endregion

    public void DoSomeAction(string actionName)
    {
        Debug.Assert(_inTransaction);

        // ... do the action ...
        Console.WriteLine("  Action performed: " + actionName);
    }
}

Note that this is a very simple example that only permits one transaction in flight at once. If we wanted to do more, we could have a Dictionary that maps outstanding transactions to pending state, or we could just put the state directly into the TransactionDisposer. I’ve noted this in the comments.
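
Here is a minimal sketch of the second option, with the per-transaction state stored directly in the disposer. MultiTransactionManager, Handle and OpenCount are hypothetical names, and the integer id stands in for whatever state a real transaction would carry. To keep the sketch self-contained it implements IDisposable directly rather than deriving from DisposerBase.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical variant that allows several transactions in flight at once.
public class MultiTransactionManager
{
    readonly HashSet<int> _open = new HashSet<int>();
    int _nextId;

    public int OpenCount { get { return _open.Count; } }

    public IDisposable BeginTransaction()
    {
        int id = _nextId++;
        _open.Add(id);

        // caller must dispose this when done
        return new Handle(this, id);
    }

    void EndTransaction(int id)
    {
        _open.Remove(id);
    }

    // Private inner class: the only route to EndTransaction.
    sealed class Handle : IDisposable
    {
        readonly MultiTransactionManager _owner;
        readonly int _id;       // per-transaction state lives here
        bool _disposed;

        public Handle(MultiTransactionManager owner, int id)
        {
            _owner = owner;
            _id = id;
        }

        public void Dispose()
        {
            if (_disposed) return;
            _disposed = true;
            _owner.EndTransaction(_id);
        }
    }
}
```

One design difference from DisposerBase: this handle quietly ignores a second Dispose instead of asserting. Which behavior is better depends on whether a double dispose should count as a bug in your codebase.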

Now, it’s a bit of typing to implement this disposer, but that’s what snippets are for, right? Bracketed code like this isn’t overwhelmingly common either, so it shouldn’t be a big deal. Still, I have to say I miss my old friend #define from the C world sometimes. Not often. I’m looking forward to a future version of the C# compiler that lets us plug into the AST directly and codegen as we go.

Anyway, here is an example usage of this pattern:

var transactionManager = new TransactionManager();
using (var handle = transactionManager.BeginTransaction())
{
    transactionManager.DoSomeAction("action 1");
}

Implicit Chaining

There is one interesting benefit that comes from this: implicit chaining.

Say you have a higher level class that provides additional functionality within its own BeginX. Perhaps it calls BeginRender() and then immediately sets some default states. Rather than having to create a new inner class to manage the bracketing, you can simply return the result of BeginRender, and the caller will not know the difference.

Let’s extend the above example in this way.

public class RequestManager
{
    TransactionManager _transactionManager;

    public RequestManager(TransactionManager transactionManager)
        { _transactionManager = transactionManager; }

    public IDisposable BeginRequest(string initialActionName)
    {
        var handle = _transactionManager.BeginTransaction();
        _transactionManager.DoSomeAction(initialActionName);
        return handle;
    }
}

See how it chains the TransactionManager’s bracketed calls through itself? We could have gone through the trouble of making an inner RequestDisposer class and a private EndRequest and all that, using the IDisposable from TransactionManager as the state, but why bother? Just chain it along. When the caller disposes the handle, it routes straight through to the TransactionManager, which is all we needed.

Some sample usage:

var requestManager = new RequestManager(transactionManager);
using (var handle = requestManager.BeginRequest("action 2"))
{
    transactionManager.DoSomeAction("action 3");
}

Of course, if RequestManager has to do anything special in its EndX, then we’ll have to go and add that inner class after all. It’s a nice shortcut otherwise.
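
For that case, a small reusable wrapper can stand in for a dedicated inner class. The sketch below swaps in a delegate-based helper instead of the inner-class approach described above. ChainedDisposer is a hypothetical name, and the callback represents whatever a private EndRequest would need to do.

```csharp
using System;

// Hypothetical helper: runs extra cleanup, then disposes the inner handle.
public sealed class ChainedDisposer : IDisposable
{
    readonly IDisposable _inner;
    readonly Action _onEnd;
    bool _disposed;

    public ChainedDisposer(IDisposable inner, Action onEnd)
    {
        if (inner == null) throw new ArgumentNullException("inner");
        if (onEnd == null) throw new ArgumentNullException("onEnd");
        _inner = inner;
        _onEnd = onEnd;
    }

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;

        // Close the outer bracket first, then the inner one,
        // even if the outer cleanup throws.
        try { _onEnd(); }
        finally { _inner.Dispose(); }
    }
}
```

BeginRequest would then end with something like "return new ChainedDisposer(handle, EndRequest);" where EndRequest is a private method on RequestManager. The caller still sees a plain IDisposable and still cannot close the bracket any other way.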

The output from both examples looks like this:

Begin Transaction
  Action performed: action 1
End Transaction
Begin Transaction
  Action performed: action 2
  Action performed: action 3
End Transaction

Easy, huh?

Performance Implications

It’s worth asking what the costs of this method are. How does this compare to simply calling BeginX and EndX?

Let’s consider this sample usage again:

using (var handle = transactionManager.BeginTransaction())
{
    transactionManager.DoSomeAction("action 1");
}

This is syntactic sugar (I loathe this term but it is unfortunately apt) that the compiler translates into something like this:

IDisposable handle = transactionManager.BeginTransaction();
try
{
    transactionManager.DoSomeAction("action 1");
}
finally
{
    if (handle != null)
        handle.Dispose();
}
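
Two consequences of that expansion can be checked with a small sketch: Dispose runs even when the body throws, and a null resource is simply skipped. UsingDemo and Flag are hypothetical names used only for this illustration.

```csharp
using System;

public static class UsingDemo
{
    sealed class Flag : IDisposable
    {
        public bool Disposed;
        public void Dispose() { Disposed = true; }
    }

    // Returns true if Dispose ran even though the body threw.
    public static bool DisposedOnException()
    {
        var flag = new Flag();
        try
        {
            using (flag)
            {
                throw new InvalidOperationException("boom");
            }
        }
        catch (InvalidOperationException) { }
        return flag.Disposed;
    }

    // A null resource is legal: the generated null check skips Dispose.
    public static bool NullResourceIsSkipped()
    {
        IDisposable nothing = null;
        using (nothing) { }
        return true;    // reached only if no NullReferenceException was thrown
    }
}
```

Both methods return true.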

Note that the runtime does not do anything special with IDisposable types. They are created and collected exactly the same as any other type. And as a result, this is almost identical to what we’d have to do for a manual BeginX/EndX:

transactionManager.BeginTransaction();
try
{
    transactionManager.DoSomeAction("action 1");
}
finally
{
    transactionManager.EndTransaction();
}

Now, this is assuming no state needs to be maintained in the handle. If it did, then we’d have to store it in a local and pass it into EndTransaction, and the BeginX/EndX pattern would almost exactly match the disposer pattern.

So: the cost of the disposer pattern over BeginX/EndX is that an extra object gets new’d up, with a bit of glue logic that runs. Insignificant, not worth worrying about.

Even if it did have a real cost, I’d still not worry about it. This is intended to be used in bracket patterns where resources are being allocated and freed. The cost of the underlying resource management is going to mask whatever minor cost the disposer adds.

The cleaner, more maintainable, and statically-analyzable code that results is an easy win.

February 9th, 2010 at 9:45 pm

Posted in .net, trick
