Archive for March, 2010
Quickie: Adding Utility Functions To Interfaces
Working in C#, one thing I miss from C++ is being able to implement interfaces via a rich base class. C#’s lack of multiple inheritance (a good decision on balance) just about prevents this.
I have two things in mind when I say “rich base class”. First is for mixing in object functionality. I’ve spoken about this before. There really is no way to do this nicely in C#. However, today’s post is about the other thing C++ gives you with multiple inheritance: utility methods.
Consider this interface and implementing class, paying attention to the overloads:
interface ILogger
{
void Log(string message);
void LogLine(string message);
void Log(string format, params object[] args);
void LogLine(string format, params object[] args);
}
class DebugLogger : ILogger
{
void ILogger.Log(string message)
{ Debug.Write(message); }
void ILogger.LogLine(string message)
{ Debug.WriteLine(message); }
void ILogger.Log(string format, params object[] args)
{ Debug.Write(string.Format(format, args)); }
void ILogger.LogLine(string format, params object[] args)
{ Debug.WriteLine(string.Format(format, args)); }
}
The more versions of the log function you want, the more work you have to do in any class implementing the interface. You could create a LoggerBase that does all of this, but that is severely limiting.
In C++ it’s easy of course. Make a mixin class.
class Logger
{
public:
void Log(const char* message)
{ OnLog(message); }
void LogLine(const char* message)
{ OnLog(message); OnLog("\n"); }
void LogF(const char* format, ...)
{
char buffer[2000];
va_list args;
va_start(args, format);
vsprintf_s(buffer, format, args);
va_end(args);
Log(buffer);
}
void LogLineF(const char* format, ...)
{
char buffer[2000];
va_list args;
va_start(args, format);
vsprintf_s(buffer, format, args);
va_end(args);
LogLine(buffer);
}
protected:
virtual void OnLog(const char* message) = 0;
};
class DebugLogger : public Logger
{
virtual void OnLog(const char* message)
{
OutputDebugStringA(message); // explicit ANSI variant, so this also compiles when UNICODE is defined
}
};
Exactly one virtual method is required in the derived class. To add new overloads, you just add them to the base, and have them call the virtual. All end up with dynamic behavior. With interfaces in C#, you’re forced to implement each overload. This ends up with a lot of duplication everywhere you implement this same interface. Worse, if you want to add more utility functions to the interface, you break every implementing class, which must now implement that function as well.
The other day, it hit me that there’s an easy and perhaps obvious solution to this in C#: extension methods.
interface ILogger
{
void Log(string message);
}
static partial class Extensions
{
public static void LogLine(this ILogger logger, string message)
{
logger.Log(message);
logger.Log("\n");
}
public static void LogFormat(this ILogger logger, string format, params object[] args)
{
logger.Log(string.Format(format, args));
}
public static void LogLineFormat(this ILogger logger, string format, params object[] args)
{
logger.LogFormat(format, args);
logger.Log("\n");
}
}
class DebugLogger : ILogger
{
void ILogger.Log(string message)
{
Debug.Write(message);
}
}
This will do what I want. I can implement a minimal interface, and easily add new utility functions. It’s not quite as good as C++, because I can’t store any data in my extension methods, and I must work solely through the published interface, but it’s good enough for 80% of cases.
In one very small way it’s actually better than C++. In C++, that base class isn’t always something that can be changed. Perhaps it was provided by a standard library or a third party. Yet in C#, anybody can create an extension class to add functionality to any other class. So you can add all the overloads you like.
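As a rough illustration of both points, here is a minimal sketch. The names LoggerExtensions, TaggedExtensions, LogTagged and StringLogger are made up for this example and are not part of the code above; the idea is only that an implementer writes one method, and that a second party can bolt on another helper without touching the interface or the first extension class.

```csharp
using System;
using System.Text;

// Minimal interface from the post: one method to implement.
public interface ILogger
{
    void Log(string message);
}

// Utility methods layered on top of ILogger.Log, as in the post.
public static class LoggerExtensions
{
    public static void LogLine(this ILogger logger, string message)
    {
        logger.Log(message);
        logger.Log("\n");
    }

    public static void LogFormat(this ILogger logger, string format, params object[] args)
    {
        logger.Log(string.Format(format, args));
    }
}

// Hypothetical add-on written by someone who cannot edit ILogger
// or LoggerExtensions. It only uses the published surface.
public static class TaggedExtensions
{
    public static void LogTagged(this ILogger logger, string tag, string message)
    {
        logger.LogFormat("[{0}] {1}", tag, message);
        logger.Log("\n");
    }
}

// Hypothetical implementation that captures output in memory.
public class StringLogger : ILogger
{
    private readonly StringBuilder _text = new StringBuilder();

    public string Text { get { return _text.ToString(); } }

    // The only method an implementer has to write.
    public void Log(string message) { _text.Append(message); }
}
```

Calling LogLine("start"), then LogFormat("{0}+{1}", 1, 2), then LogTagged("net", "up") on a StringLogger leaves "start\n1+2[net] up\n" in Text. Every helper funnels through the single Log implementation, so the per-class behavior is still dynamic even though the helpers themselves are bound statically.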
Assertive Finalizers
In my previous post, I talked about why I have stopped using finalizers for unmanaged resource collection. I want this to be done through the disposable pattern instead, forcing the programmer to manage resources manually.
Ironically, finalizers are a great way to verify this.
Sander van Rossen quickly figured out where I was going with this and proposed in a comment that we can just assert in a finalizer. We just need a couple more things:
- We need to track the source of the object to figure out where the leak is.
- We need to ensure that finalizers run on shutdown, or our assert will never get hit.
This is familiar territory to C++ programmers. Most of us use memory management libraries that provide leak detection and reporting. Let’s do something similar in C#.
DisposableBase
If you look around online, you’ll find some full-featured IDisposable base classes intended to deal with the full finalization model. We can eliminate most of that as either too complicated or unnecessary. We just need a few things:
- A stack trace grabbed at construction time
This will be used for the error report. Because this is expensive to gather, we need to control it with an #if that is off by default, only turned on if needed to help an investigation. Most of the time a leaked disposable will be easy to find by inspection, but in the 5% case we will need a lot more context.
- A finalizer that reports the problem
It could throw an exception, fire an assert, or route to an error reporter. Depends on the application. In my example I just have it output to the debug window for a demo.
- Disposal helper methods
These wrap up disposal a bit, so inheritors only need to implement OnDispose. The most important feature, though, is that Dispose will call GC.SuppressFinalize when our object is disposed. This eliminates the performance cost of having a finalizer in the normal case when clients are disposing this class properly. This is why the finalizer has no “if” in it – if it ever gets called, then we have a bug.
This is what I am currently using as my base class for handling unmanaged resources:
// comment out unless diagnosing a leak
#define DEBUG_DISPOSE
public abstract class DisposableBase : IDisposable
{
// store stack at point of construction for possible later use
# if DEBUG_DISPOSE
StackTrace _trace = new StackTrace(true);
# endif
// finalizer will not be called if object was properly disposed
~DisposableBase()
{
string message = "!! Forgot to dispose a " + GetType().FullName;
# if DEBUG_DISPOSE
message += "\n\nStack at construction:\n\n" + _trace + "!!";
# endif
Debug.WriteLine(message);
}
public bool IsDisposed { get; private set; }
public void Dispose()
{
ThrowIfDisposed();
IsDisposed = true;
try { OnDispose(); }
finally { GC.SuppressFinalize(this); }
}
protected abstract void OnDispose();
protected void ThrowIfDisposed()
{
if (IsDisposed)
throw new ObjectDisposedException(GetType().FullName);
}
}
Demonstration
Here is a simple test app that shows what happens if we forget to dispose an instance.
public class DatabaseConnection : DisposableBase
{
protected override void OnDispose()
{ Debug.WriteLine("Disposing"); }
}
public class Program
{
static void Main(string[] args)
{
using (var remembered0 = new DatabaseConnection())
using (var remembered1 = new DatabaseConnection())
{
}
var forgotten = new DatabaseConnection();
GC.Collect();
GC.WaitForPendingFinalizers();
}
}
The first two instances dispose fine, but the third is leaked and so we get a log to the output window:
Disposing
Disposing
!! Forgot to dispose a DatabaseConnection
Stack at construction:
   at DisposableBase..ctor() in C:\Users\Scott\Cloud\Proj\tests\BlogSamples\ConsoleApplication1\Finalizers.cs:line 18
   at DatabaseConnection..ctor()
   at Program.Main(String[] args) in C:\Users\Scott\Cloud\Proj\tests\BlogSamples\ConsoleApplication1\Finalizers.cs:line 89
   at System.AppDomain._nExecuteAssembly(RuntimeAssembly assembly, String[] args)
   at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args)
   at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
   at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean ignoreSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()
!!
We can easily zero in on the exact spot where the leaked resource was allocated and wrap it in a ‘using’ to resolve.
Note the use of GC.Collect and GC.WaitForPendingFinalizers right before the test application exits. This is necessary in order to force all finalizing objects to be collected and reported before process shutdown, otherwise they are simply dropped by the system when the process’s memory is released.
These two can even be called during normal app run as well for leak testing without needing to wait for shutdown. This would be useful in a live service with a periodic leak for which we want an in-session nonfatal log of leaks.
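For that in-session case, something along these lines might work. This is only a sketch under assumptions: LeakLog, Report and Drain are hypothetical names, and it assumes the finalizer in DisposableBase is changed to call LeakLog.Report(message) instead of Debug.WriteLine(message).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sink: finalizers call Report, the host app calls Drain.
public static class LeakLog
{
    private static readonly object _lock = new object();
    private static readonly List<string> _pending = new List<string>();

    // Called from finalizers, which run on the runtime's finalizer
    // thread, so access to the list has to be synchronized.
    public static void Report(string message)
    {
        lock (_lock) { _pending.Add(message); }
    }

    // Force a collection, wait for finalizers to finish, then return
    // and clear whatever they reported since the last call.
    public static List<string> Drain()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        lock (_lock)
        {
            var result = new List<string>(_pending);
            _pending.Clear();
            return result;
        }
    }
}
```

A service could call LeakLog.Drain() from a timer or an admin command and write the returned messages to its normal log. Note that a forced collection only finalizes objects the runtime already considers unreachable; in unoptimized debug builds, for example, a local variable can keep an object alive until its method returns, so a leak may show up one pass later than expected.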
What About External Classes?
This takes care of our own classes, where we have full control. But what about system or third party classes that are using finalizers? With those it’s back to square one.
Well, I suppose we could write a helper class:
public class SafeDisposer<T> : DisposableBase where T : IDisposable
{
public SafeDisposer(T disposable) { Obj = disposable; }
public T Obj { get; private set; }
protected override void OnDispose()
{
Obj.Dispose();
Obj = default(T);
}
}
public static class SafeDisposer
{
public static SafeDisposer<T> Wrap<T>(T disposable) where T : IDisposable
{ return new SafeDisposer<T>(disposable); }
}
And equivalent updates in the demo code:
public class Program
{
static void Main(string[] args)
{
using (var remembered2 = SafeDisposer.Wrap(new StringReader("foo")))
using (var remembered3 = SafeDisposer.Wrap(new StringReader("poo")))
{
remembered2.Obj.Peek();
}
var forgotten2 = SafeDisposer.Wrap(new StringReader("boo"));
GC.Collect();
GC.WaitForPendingFinalizers();
}
}
And output:
!! Forgot to dispose a SafeDisposer`1[[System.IO.StringReader, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]]
Stack at construction:
   at DisposableBase..ctor() in C:\Users\Scott\Cloud\Proj\tests\BlogSamples\ConsoleApplication1\Finalizers.cs:line 18
   at SafeDisposer`1..ctor(T disposable) in C:\Users\Scott\Cloud\Proj\tests\BlogSamples\ConsoleApplication1\Finalizers.cs:line 53
   at SafeDisposer.Wrap[T](T disposable) in C:\Users\Scott\Cloud\Proj\tests\BlogSamples\ConsoleApplication1\Finalizers.cs:line 67
   at Program.Main(String[] args) in C:\Users\Scott\Cloud\Proj\tests\BlogSamples\ConsoleApplication1\Finalizers.cs:line 98
   at System.AppDomain._nExecuteAssembly(RuntimeAssembly assembly, String[] args)
   at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args)
   at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
   at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean ignoreSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()
!!
So to use, we just wrap a disposable at the point of creation in SafeDisposer.Wrap() and access it through “.Obj”. Not bad, but not great either. I’m ok with the wrapper function, but the required access via the Obj member is a pain, and also means that converting an unsafe disposable into a safe one requires updating a lot of code.
Another option is (maybe) to use a dynamic proxy system to inject the functionality we need, letting client code remain unchanged except at the point of creation. Or perhaps a run-time system that patches system assemblies to do the injection.
I’ll leave this as an exercise for the reader because I think we’re probably getting seriously diminishing returns at this point. Most disposable objects in a large system will be classes that we have full control over. The few objects not under our control will likely be low level primitives that will be wrapped up by our own foundation classes anyway.
Finalizers: An Incomplete Pattern
When you were first learning about how .NET finalization works, did it just feel wrong to you? It sure did for me. The mechanisms and rules involved with finalizers always felt painfully over-complicated, hard to get right, and hacky. Here we have this clean managed-memory paradigm that feels great to use. And it’s got a big gnarly barnacle named Finalize growing out of its side that we’re supposed to use to deal with unmanaged resources.
I learned .NET when the beta of 1.0 was out, and at the time Microsoft was putting a lot of effort into educating programmers on the difference between native and managed programming to help them make the switch. “In C++ you’d do this, but in .NET you do this” kind of thing. So practically every article on the CLR talked about finalizers. They often made analogies to C++ destructors. C# even added some ill-advised sugar that converts ~Class() to Finalize() to make us grognards feel more at home.
Finalizers were obviously very important, and so we learned about them. We evolved base classes to help us do the boilerplate work and wished that the .NET languages supported mixins. But it never felt right to me.
The Problem With Finalizers
Recently, after a couple conversations at work, I figured out what’s been bothering me about finalizers. The problem is that we’re using a commodity manager to manage non-commodity resources.
In .NET, raw memory can be treated as a commodity, like a gas tank. We can use it fluidly and at any granularity, and treat it all roughly the same. This is a simplification of course, and there are performance considerations, but they closely map onto the underlying OS and are familiar to everyone.
Yet non-memory resources are not commodities and cannot be treated with a one-size-fits-all pattern. They have semantics and effects far beyond incrementally draining and refilling that gas tank. Each situation, each class, is different, and has different implications that we have to remember. Because of this, we can’t treat these resources in a nondeterministic way without potential hazards.
For example, take a file handle managed by a finalizer. Off the top of my head, there are three adverse effects of an unused handle not finalizing in time:
- If there are pending writes to the file, then they are lost if the app exits without forcing finalizers to be called.
- If the file was opened with restrictive sharing, then other applications cannot manipulate it until the GC gets around to finalizing the file object.
- The OS isn’t designed to treat file handles as a commodity, and using them that way can negatively affect performance. We could even run out of file handles because the GC has no idea when there’s pressure in this space (all finalizers are equal).
Worse yet, in diagnosing any of the above issues, we also must roll our own tools. System tools such as LockHunter and Process Explorer have no way of distinguishing which handles are actually in use and will just give us a noisy, useless mess.
And that’s just a simple file handle example. The situation is obviously worse as more limited and complicated resources are involved like DirectX surfaces or database connections.
Dispose: Only A Partial Solution
You might wonder why I’m making a fuss. There’s an obvious answer to the non-determinism, right? Microsoft recognized this problem early in .NET 1.0 and gave us the disposable pattern. It is a standard way of managing resources the old-fashioned way: ‘new’ allocates the resource, and Dispose() frees it. We even have some special syntax that helps automate this:
using (var textureMgr = new TextureManager())
using (var texture = textureMgr.AllocTexture())
{
texture.Fill(Color.White);
}
The above is roughly equivalent to:
var textureMgr = new TextureManager();
try
{
var texture = textureMgr.AllocTexture();
try
{
texture.Fill(Color.White);
}
finally
{
texture.Dispose();
}
}
finally
{
textureMgr.Dispose();
}
This gives us a rough, though somewhat tedious approximation of the RAII pattern used universally in C++.
In Microsoft’s .NET classes that wrap unmanaged resources, both patterns are typically used. There is a Dispose system that closes the handle, and a safety Finalize that cleans up if that has not already been done via Dispose.
So what’s the big deal? Well, the problem is in (a) knowing when it is necessary to call Dispose, (b) remembering to call it, and (c) updating dependent code when an existing type adds a new IDisposable implementation. It’s easy to miss something and let bugs creep in. For every single ‘new’ call, we must check the type to see if it or one of its parent classes implements IDisposable. If it does, then we must manage the instance directly. This means wrapping allocation in the ‘using’ construct for local temporaries, or implementing IDisposable and forwarding Dispose if contained as a member in another class.
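The "contained as a member" case is the one that is easiest to overlook, so here is a minimal sketch of it. ConfigParser is a hypothetical class invented for this example; the point is only that holding a disposable member makes the owner disposable too, and that the owner's Dispose forwards to the member.

```csharp
using System;
using System.IO;

// Hypothetical owner of a disposable member. Because it holds a
// StringReader, it takes on IDisposable itself and forwards the call.
public sealed class ConfigParser : IDisposable
{
    private StringReader _reader;

    public ConfigParser(string text) { _reader = new StringReader(text); }

    public bool IsDisposed { get { return _reader == null; } }

    public string NextLine()
    {
        if (_reader == null)
            throw new ObjectDisposedException(GetType().FullName);
        return _reader.ReadLine();
    }

    public void Dispose()
    {
        if (_reader == null) return;   // tolerate repeated calls
        _reader.Dispose();             // forward to the owned member
        _reader = null;
    }
}
```

Anything that creates a ConfigParser now has the same obligation in turn: wrap it in ‘using’ if it is a local, or forward Dispose again if it is stored as a member. That chain is exactly what makes point (c) painful when an existing type becomes disposable after the fact.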
Forgetting to dispose an unmanaged resource can lead to some of the most frustrating and difficult-to-track-down bugs. And the nondeterminism in the underlying system pretty much squares that problem. It’s guaranteed to behave differently in the field than on a development machine.
Every new .NET programmer figures this out quickly when they write their first command line app that opens a file, does something to it, and writes out a new file. They find that, apparently randomly, sometimes the end of the new file is cut off. Forgot to call Dispose to flush and close it eh?
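The truncation comes from StreamWriter buffering text internally: if neither Dispose nor Flush is called, the last buffer may never reach the file. A minimal sketch of the fixed version follows; FileWriteDemo and WriteLines are names made up for this example.

```csharp
using System;
using System.IO;

public static class FileWriteDemo
{
    // Writes lineCount numbered lines to path. The using block
    // guarantees Dispose runs, which flushes StreamWriter's internal
    // buffer and closes the file, even if an exception is thrown.
    public static void WriteLines(string path, int lineCount)
    {
        using (var writer = new StreamWriter(path))
        {
            for (int i = 0; i < lineCount; i++)
                writer.WriteLine("line " + i);
        }
    }
}
```

Dropping the ‘using’ and letting the writer go out of scope is what produces the apparently random behavior: whether the tail of the file survives depends on how much text happened to be sitting in the buffer when the process exited.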
Most Strategies Not 100% Guaranteed
It’s not hopeless. We’ve come up with strategies to deal with this. First is knowledge base. Over time, we get a sense for what types of objects tend to implement IDisposable. File handles, database connections and so on, those are easy. Though what about systems written by someone else on your team? Does the source control connection have to get disposed when you’re done with it because it wraps a COM object? For those cases, it’s better to be safe and check for IDisposable until the foreign classes become familiar.
There are also tools that help out. FxCop can find problems that are discoverable via static analysis. CodeRush will draw graphical markup for types it detects as IDisposable.
But these strategies still aren’t enough for me. I want absolute certainty that Dispose is called on all unmanaged resources, and not left to the nondeterministic finalizer system to cause latent timing problems.
So I want to propose an additional strategy. It’s pretty simple, actually: consider the finalizer to be a last, and illegal, resort. Microsoft’s guidelines say that you should not assume Dispose has been called. I propose the reverse: if the finalizer is called, then there is a bug in the code.
More on this in the next post.

