New Fun Blog – Scott Bilas

Gem: A Generic Function Binding Interface

From Game Programming Gems
Hardcover, Charles River Media, August 2000
Edited by Mark DeLoura

Introduction

Scripting engines and network messaging have an important requirement in common: they must be able to interface with the game’s functionality in a type safe, efficient, and convenient way. This gem provides a method for exporting functions and then binding to them dynamically at runtime. It does so without sacrificing runtime speed or convenience.

Requirements

The basic requirement for our scripting engine is that we can call a function and possibly pass it parameters. For this we’ll need to know its name, its location in memory, and the types that it takes. The types for these parameters must be types that we support directly in the scripting engine as part of the language. Let’s assume we support bool, float, int, string, and void.

The basic requirement for our network remote procedure calls (RPC’s) is that we can call a function on a remote machine and possibly pass it parameters. Given that our machines will probably be running the code at different memory addresses, we can’t pass function pointers over the network and must instead convert them into a token that both sides recognize. For this we’ll use a serial ID that can be converted back and forth to an actual function pointer very quickly. Also, for the parameters, we’ll need to know how to recognize strings and memory pointers in the parameters so that the data they point to can be packed at the end of the RPC chunk for hand-off to the network transport.

For convenience, we should be able to simply call an RPC-capable function without having to do any explicit parameter packing from the caller’s code. If the call is meant for another machine, the called function should automatically send its parameters and serial ID to the network transport, then return immediately. If meant for local execution, it would just execute the code directly. The dispatcher on the remote machine would look up functions based on the serial ID and then call them directly after resolving to a function pointer.

Platform Concerns

This is a good place to point out that the sample code provided with this gem is very specific to a particular platform: Visual C++ 6.0 running on an x86 version of Win32. In particular: (1) there’s a little bit of assembly code in here that is obviously x86 specific, (2) the name mangling and unmangling and how calling conventions work is specific to Visual C++ 6.0, and (3) I make use of the specific way that Win32 image (DLL/EXE) exports work.

The concepts, if not the implementation, should still be portable to other platforms. All the x86 assembly code can be converted to any other instruction set, though you’ll need to know that platform’s calling conventions for it to work. Dynamic link libraries are hardly unique to Win32 – all this gem needs is a table that maps exported function names to memory addresses. And finally, it should be possible to figure out how other compilers (especially open source compilers such as GCC) mangle and unmangle names.

Attempt #1

Let’s get back to the task at hand. We are trying to find a way to export game functionality in a generic way so that it can be called from scripts or passed over the network as RPC’s. Here is a really simple solution:

[code lang="cpp"]
void Foo( void );
void Bar( void );
// …

enum eFunction
{
    FUNCTION_FOO,
    FUNCTION_BAR,
    // …
};

struct Function
{
    typedef void (*Proc)( void );

    const char* m_Name;
    Proc        m_Proc;
    eFunction   m_Function;
};

Function g_Functions[] =
{
    { "Foo", Foo, FUNCTION_FOO, },
    { "Bar", Bar, FUNCTION_BAR, },
    // …
};
[/code]

The eFunction enumeration provides a serialized list of unique ID’s for all available functions. The Function structure maps a text name onto a function pointer and unique ID. And finally the g_Functions array is the set of all published functions in the system. Our example function exports are of course Foo and Bar.

Our imaginary scripting engine can search through the g_Functions array when it’s compiling a script to resolve function calls by name and then call the procedure directly once found – hopefully this lookup would be done through an index for speed. Our imaginary network messaging system could convert function calls into their eFunction ID and use that to resolve the remote procedure call on the other machine. Easy and simple.
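To make the "lookup through an index" idea concrete, here is a minimal sketch in portable C++. It repeats the table from the listing above so that it stands alone. The helper names BuildNameIndex(), FindByName() and FindByID(), the FUNCTION_COUNT enumerator, and the g_FooCalls counter are all assumptions made for illustration; they are not part of the gem's sample code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Stand-ins for the exported functions in the listing above.
static int g_FooCalls = 0;
void Foo( void ) { ++g_FooCalls; }
void Bar( void ) {}

// FUNCTION_COUNT is an added (hypothetical) enumerator used for bounds checks.
enum eFunction { FUNCTION_FOO, FUNCTION_BAR, FUNCTION_COUNT };

struct Function
{
    typedef void (*Proc)( void );

    const char* m_Name;
    Proc        m_Proc;
    eFunction   m_Function;
};

Function g_Functions[] =
{
    { "Foo", Foo, FUNCTION_FOO },
    { "Bar", Bar, FUNCTION_BAR },
};

// Hypothetical index: name -> table entry, built once so that script
// compilation does not scan the whole array for every call it resolves.
typedef std::map<std::string, const Function*> NameIndex;

NameIndex BuildNameIndex( void )
{
    NameIndex index;
    for ( std::size_t i = 0; i < FUNCTION_COUNT; ++i )
    {
        index[ g_Functions[ i ].m_Name ] = &g_Functions[ i ];
    }
    return index;
}

// Returns the entry published under 'name', or nullptr if there is none.
const Function* FindByName( const NameIndex& index, const std::string& name )
{
    NameIndex::const_iterator found = index.find( name );
    return ( found != index.end() ) ? found->second : nullptr;
}

// The network side resolves an ID by indexing the array directly. This
// relies on the array being declared in the same order as the enum.
const Function* FindByID( eFunction id )
{
    assert( id >= 0 && id < FUNCTION_COUNT );
    assert( g_Functions[ id ].m_Function == id );
    return &g_Functions[ id ];
}
```

The ID lookup is a plain array index, which is what makes the serial ID cheap to convert back into a function pointer on the receiving machine.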

This solution would work fairly well but suffers from a critical drawback: all functions must have the same signature – they must all take no parameters and return void. We could change the Function::Proc type so that the functions could at least return a value and take some parameters. However, this is not an acceptable solution, as it’s highly unlikely that all published functions will have identical signatures. Besides, it’s a very inconvenient limitation, considering the large and varied function sets required of modern games.

One way to work around this would then be to cast parameters back and forth from their real types to the common types required by Function::Proc. We could, for example, have each function pass two or three unsigned integers, and pack our real parameters into them. This is a common and efficient technique used by API’s for callbacks such as window procedures. However, it’s unsafe and can’t be supported by a general purpose scripting language very well. It would also be impossible to figure out which of the generic parameters are pointers, which makes passing the parameters over the network for RPC’s very difficult. Hacks are on the horizon. Let’s try something else.

Attempt #2

A common partial fix to the problems of solution #1 is to provide a package class that stores the parameters in an internal buffer and provides add and extract methods to serialize data in and out of the object:

[code lang="cpp"]
#include <string>
#include <vector>

struct Parameters
{
    std::vector <unsigned char> m_Data;

    bool        ExtractBool  ( void );
    int         ExtractInt   ( void );
    float       ExtractFloat ( void );
    const char* ExtractString( void );

    void AddBool  ( bool );
    void AddInt   ( int );
    void AddFloat ( float );
    void AddString( const char* );
};

void Foo( Parameters& params )
{
    int   param1 = params.ExtractInt  ();
    float param2 = params.ExtractFloat();
    // use param1, param2…
}

void Bar( Parameters& params );
// …

enum eFunction
{
    FUNCTION_FOO,
    FUNCTION_BAR,
};

struct Function
{
    typedef void (*Proc)( Parameters& );

    std::string m_Name;
    Proc        m_Proc;
    eFunction   m_Function;
};

Function g_Functions[] =
{
    { "Foo", Foo, FUNCTION_FOO, },
    { "Bar", Bar, FUNCTION_BAR, },
    // …
};
[/code]

Now we can pass generic parameters to any function, a big improvement! This method, however, has its own set of drawbacks, some of which it shares with the first attempt.

First, this solution is inherently unsafe because of its add/extract functions. The C++ compiler cannot check the types at compile time because it doesn’t know what’s supposed to go into a Parameters object – by its very definition it can hold anything. The best we can do is provide some basic runtime checking by storing a type each time an Add method is called, and then checking those types from the called function each time an Extract method is called. This isn’t very efficient, and can be error-prone. Also, any time the function parameters change, every call to that function must be searched for and updated to match. The compiler can’t detect changes like this, and the manual search-and-replace is another error-prone process. Missing one changed call by accident could introduce latent and difficult-to-find bugs.
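The runtime checking described above might look something like the following sketch. It assumes a one-byte type tag written in front of each value and reports mismatches by throwing; the class name CheckedParameters, the tag values, and the choice of exceptions are illustrative assumptions, not the gem's implementation. Only int and float are shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <vector>

// Sketch only: every value in the buffer is preceded by a one-byte type tag.
class CheckedParameters
{
public:
    CheckedParameters() : m_ReadPos( 0 ) {}

    void AddInt  ( int value )   { Add( TAG_INT,   &value, sizeof( value ) ); }
    void AddFloat( float value ) { Add( TAG_FLOAT, &value, sizeof( value ) ); }

    int ExtractInt( void )
    {
        int value = 0;
        Extract( TAG_INT, &value, sizeof( value ) );
        return value;
    }

    float ExtractFloat( void )
    {
        float value = 0.0f;
        Extract( TAG_FLOAT, &value, sizeof( value ) );
        return value;
    }

private:
    enum eTag { TAG_INT = 1, TAG_FLOAT = 2 };

    // Append the tag, then the raw bytes of the value.
    void Add( eTag tag, const void* data, std::size_t size )
    {
        const unsigned char* bytes = static_cast<const unsigned char*>( data );
        m_Data.push_back( static_cast<unsigned char>( tag ) );
        m_Data.insert( m_Data.end(), bytes, bytes + size );
    }

    // Verify the stored tag before copying the value out. This is the
    // per-call overhead the text refers to.
    void Extract( eTag tag, void* out, std::size_t size )
    {
        assert( m_ReadPos <= m_Data.size() );
        if ( m_ReadPos + 1 + size > m_Data.size() )
        {
            throw std::runtime_error( "parameter buffer exhausted" );
        }
        if ( m_Data[ m_ReadPos ] != tag )
        {
            throw std::runtime_error( "parameter type mismatch" );
        }
        std::memcpy( out, &m_Data[ m_ReadPos + 1 ], size );
        m_ReadPos += 1 + size;
    }

    std::vector<unsigned char> m_Data;
    std::size_t                m_ReadPos;
};
```

Note that the mismatch is only discovered when the offending call actually runs, which is exactly why this kind of checking is weaker than what the compiler could provide.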

Calling functions in this way is also tedious and inefficient. The add/extract process adds a lot of memory copying and verification overhead. It also has serious engineering time overhead. A simple function can no longer be added to an export list – it must now change its function signature and have a prologue that converts a Parameters object into local variables. Likewise, callers must construct the Parameters object to begin with, though this can be made a little easier through some clever template work. Still, there must be a better way.
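As an aside, the "clever template work" for callers could be sketched roughly as follows. This version uses C++11 variadic templates, which postdate the compiler this gem targets, so treat it as a modern illustration rather than what would have been written at the time. The names AddParam(), AddParams() and MakeParameters(), and the cut-down Parameters class (add methods only), are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Cut-down stand-in for the Parameters class above (add methods only).
struct Parameters
{
    std::vector<unsigned char> m_Data;

    void AddBool  ( bool value )  { AddRaw( &value, sizeof( value ) ); }
    void AddInt   ( int value )   { AddRaw( &value, sizeof( value ) ); }
    void AddFloat ( float value ) { AddRaw( &value, sizeof( value ) ); }
    void AddString( const char* value )
    {
        assert( value != nullptr );
        AddRaw( value, std::strlen( value ) + 1 ); // keep the terminator
    }

private:
    void AddRaw( const void* data, std::size_t size )
    {
        const unsigned char* bytes = static_cast<const unsigned char*>( data );
        m_Data.insert( m_Data.end(), bytes, bytes + size );
    }
};

// Overloads let the compiler pick the matching Add method for each argument.
inline void AddParam( Parameters& params, bool value )        { params.AddBool( value ); }
inline void AddParam( Parameters& params, int value )         { params.AddInt( value ); }
inline void AddParam( Parameters& params, float value )       { params.AddFloat( value ); }
inline void AddParam( Parameters& params, const char* value ) { params.AddString( value ); }

// End of recursion: nothing left to add.
inline void AddParams( Parameters& ) {}

// Add the first argument, then recurse on the remainder.
template <typename First, typename... Rest>
void AddParams( Parameters& params, First first, Rest... rest )
{
    AddParam( params, first );
    AddParams( params, rest... );
}

// Builds a Parameters object from an ordinary argument list.
template <typename... Args>
Parameters MakeParameters( Args... args )
{
    Parameters params;
    AddParams( params, args... );
    return params;
}
```

A caller can then write something like `Foo( MakeParameters( 42, 1.5f ) )` after binding the result to a named object. This only tidies up the calling side; the called function still has to extract its parameters by hand, and nothing checks that the two sides agree.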

Half of the Solution

Let’s start at the end and work back to the beginning for the solution. What we’re really looking for here is a function specification table that gives us everything we need to know about how to call a particular function in a completely generic way. We need to be able to set up the stack with a chunk of memory (i.e. push the parameters), jump directly to the function for the call, and then retrieve the return value to pass back to the original caller. For this, we need to know the function’s name, location in memory, return type, parameter types, and calling convention:

[code lang="cpp"]
#include <string>
#include <vector>

// function specification
struct Function
{
    // simple variable spec
    enum eVarType
    {
        VAR_VOID, VAR_BOOL, VAR_INT, VAR_FLOAT, VAR_STRING,
    };

    // possible calling conventions
    enum eCallType
    {
        CALL_CDECL, CALL_FASTCALL, CALL_STDCALL, CALL_THISCALL,
    };

    typedef std::vector <eVarType> ParamVec;

    std::string  m_Name;
    void*        m_Proc;
    unsigned int m_SerialID;
    eVarType     m_ReturnType;
    ParamVec     m_ParamTypes;
    eCallType    m_CallType;
};

typedef std::vector <Function> FunctionVec;

// the global set of specifications for exported functions
FunctionVec g_Functions;
[/code]

Assume for the moment that we have a way to fill g_Functions with specifications for all our exported functions (I’ll explain how to do that a little later). Now how can we use this information to actually call functions? First we must know how our platform’s various calling conventions work.

Calling Conventions

You can check your compiler’s documentation to see how its calling conventions work. On Visual C++ for x86 Win32, all function calls have certain things in common:

  1. The stack grows downward, and all parameters are pushed from right to left. In effect, parameters go from left to right on the stack for increasing memory addresses.
  2. The stack pointer (esp) always points to the lowest memory address of the stack, which unfortunately has the name of “top”. It must be dword (4-byte) aligned, so each parameter pushed must be likewise aligned to a dword. The push instruction decrements esp first, then stores the data. The pop instruction loads data first, then increments esp.
  3. Parameters passed by value are pushed on the stack in their entirety. Doubles (8-byte) and user-defined types are just copied onto the stack. The memory addresses contained by references and pointers are pushed on the stack directly.
  4. Simple non-float return values like integers and pointers are stored in the eax register. 8-byte structures are returned in edx and eax as a pair. Floats and doubles are returned through the FPU in ST0. Return values for user-defined types have their addresses pushed onto the stack last, but will also be returned in eax.
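To illustrate the layout these rules produce, here is a small portable sketch that builds a parameter buffer the way the arguments would sit on the x86 stack: first argument at the lowest address, each value occupying whole dword slots. The ArgBuffer class and its method names are hypothetical, and it assumes 32-bit IEEE floats and 64-bit doubles. It only builds the buffer; it does not call anything.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper. Arguments are added in left-to-right order, which
// matches increasing addresses after a right-to-left push sequence.
class ArgBuffer
{
public:
    void PushInt( std::int32_t value )
    {
        PushDword( static_cast<std::uint32_t>( value ) );
    }

    // A bool still takes up a full dword slot.
    void PushBool( bool value )
    {
        PushDword( value ? 1u : 0u );
    }

    void PushFloat( float value )
    {
        static_assert( sizeof( float ) == 4, "sketch assumes 32-bit floats" );
        std::uint32_t bits = 0;
        std::memcpy( &bits, &value, sizeof( bits ) );
        PushDword( bits );
    }

    // An 8-byte value is copied in its entirety and occupies two slots,
    // in the same byte order it has in memory.
    void PushDouble( double value )
    {
        static_assert( sizeof( double ) == 8, "sketch assumes 64-bit doubles" );
        std::uint32_t halves[ 2 ] = { 0, 0 };
        std::memcpy( halves, &value, sizeof( halves ) );
        PushDword( halves[ 0 ] );
        PushDword( halves[ 1 ] );
    }

    // Start of the buffer: the address of the first (leftmost) argument.
    const void* Data( void ) const { return m_Slots.data(); }

    // Always a multiple of 4, as the stack alignment rule requires.
    std::size_t SizeInBytes( void ) const
    {
        return m_Slots.size() * sizeof( std::uint32_t );
    }

private:
    void PushDword( std::uint32_t value ) { m_Slots.push_back( value ); }

    std::vector<std::uint32_t> m_Slots;
};
```

A buffer in this shape, together with its size in bytes, is the kind of input a generic call routine needs in order to copy the arguments onto the real stack in one block.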

Here are the two calling conventions that we’ll be supporting:

  • __cdecl
    The caller cleans up the stack, meaning that it is responsible for popping its own arguments off the stack after the call completes. This convention is required for variable argument functions because the called function doesn’t necessarily have the information it needs to pop the correct number of arguments. This is the default calling convention for static and global functions in C and C++.
  • __stdcall
    The called function cleans up the stack. This is the standard convention used for Win32 API calls, probably because it is more efficient in terms of client code size.

Support for the other three calling conventions (__fastcall and the two “thiscall” variants) is beyond the scope of this gem, but may be worth looking into and supporting, depending on the application.

Now we have enough information to do generic function calls with these two conventions. We’ll also need a function to retrieve a floating-point value from the FPU’s ST0 register (as is convention) to be stored in a generic return value. Here are some functions that do the dirty work:

[code lang="cpp"]
#include <windows.h> // DWORD

DWORD Call_cdecl( const void* args, size_t sz, DWORD func )
{
    DWORD rc; // here’s our return value…
    __asm
    {
        mov ecx, sz     // get size of buffer
        mov esi, args   // get buffer
        sub esp, ecx    // allocate stack space
        mov edi, esp    // start of destination stack frame
        shr ecx, 2      // make it dwords
        rep movsd       // copy params to real stack
        call [func]     // call the function
        mov rc, eax     // save the return value
        add esp, sz     // caller cleans up: restore the stack pointer
    }
    return ( rc );
}

DWORD Call_stdcall( const void* args, size_t sz, DWORD func )
{
    DWORD rc; // here’s our return value…
    __asm
    {
        mov ecx, sz     // get size of buffer
        mov esi, args   // get buffer
        sub esp, ecx    // allocate stack space
        mov edi, esp    // start of destination stack frame
        shr ecx, 2      // make it dwords
        rep movsd       // copy it
        call [func]     // call the function (it pops its own parameters)
        mov rc, eax     // save the return value
    }
    return ( rc );
}

__declspec ( naked ) DWORD GetST0( void )
{
    // a naked function has no stack frame of its own, so the temporary
    // lives directly on the stack instead of in a C++ local variable
    __asm
    {
        sub esp, 4              // reserve a temporary dword
        fstp dword ptr [esp]    // pop ST0 into the temporary
        pop eax                 // load it into eax and release the temporary
        ret                     // done
    }
}
[/code]

Now, given a function’s address and some parameters stored in a memory buffer, we can call a function in an almost completely generic way.

Calling the Function

Before making the actual call, our client subsystem (scripting engine, network RPC’s, etc.) will need to do a little preliminary work. First it will look up the instance of the Function structure within g_Functions that corresponds to the function it will be calling. For the scripting engine, we’ll want to verify that the function’s specification matches up with what we’re expecting – check and convert any parameters if necessary, or give an error if it’s a mismatch. This procedure may be expensive and should be done during the script compilation phase, and not in real-time. Looking up the Function instance for network RPC’s is a little more complicated. A good way to set this up is to intercept the call from within the function that is destined to be called over the network. Look in g_Functions for the Function instance with the highest m_Proc value that is less than the current instruction pointer (eip) to figure out which function is currently being called. Here is an example:

[code lang="cpp"]
__declspec ( naked ) DWORD GetEIP( void )
{
    __asm
    {
        mov eax, dword ptr [esp]
        ret
    }
}

// sample RPC’able function
void NetFoo( bool send, int i )
{
    // FindFunction() should look in g_Functions for highest ‘m_Proc’
    // less than ‘ip’ and return it
    static const Function* sFunction = FindFunction( GetEIP() );
    if ( send )
    {
        // RouteFunction() should pack up the parameters and send the
        // request over the network.
        RouteFunction( sFunction, (BYTE*)&send + 4 );
        return;
    }

    // … normal execution of NetFoo
    printf("i = %d\n", i );
}
[/code]
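FindFunction() is left to the reader in the sample above. One possible way to do the reverse lookup, sketched here in portable C++, is to keep the table sorted by address and binary-search it. The reduced Function struct (with m_Proc stored as an integer), the explicit table parameter, and the helper names are assumptions made so the sketch stands alone.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Reduced stand-in for the Function specification. The entry point is kept
// as an integer here so that addresses can be compared portably.
struct Function
{
    std::string    m_Name;
    std::uintptr_t m_Proc;
};

typedef std::vector<Function> FunctionVec;

// Orders table entries by entry-point address.
static bool ProcLess( const Function& left, const Function& right )
{
    return left.m_Proc < right.m_Proc;
}

// Comparison for upper_bound: is the address below this entry point?
static bool AddressBeforeProc( std::uintptr_t ip, const Function& function )
{
    return ip < function.m_Proc;
}

// Sort once after the table has been filled.
void SortByAddress( FunctionVec& functions )
{
    std::sort( functions.begin(), functions.end(), ProcLess );
}

// Returns the entry with the highest m_Proc that does not exceed 'ip',
// or nullptr if 'ip' lies below every entry. 'sorted' must be in address order.
const Function* FindFunction( const FunctionVec& sorted, std::uintptr_t ip )
{
    assert( std::is_sorted( sorted.begin(), sorted.end(), ProcLess ) );

    // First entry that starts after 'ip'; the one before it contains 'ip'.
    FunctionVec::const_iterator next =
        std::upper_bound( sorted.begin(), sorted.end(), ip, AddressBeforeProc );
    if ( next == sorted.begin() )
    {
        return nullptr;
    }
    return &*( next - 1 );
}
```

One caveat with this approach: the table only knows where functions start, not where they end, so an address inside a function that is not in the table will be attributed to the nearest entry below it. That is harmless as long as the lookup is only ever made from inside exported functions.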

The next step is to construct the parameter buffer to pass to the function. For a scripting engine based on a virtual machine, this is easy – all our parameters are already on a dword-aligned virtual stack. We can just take the address of the start of the parameters and pass it along. For network RPC’s it will be a little more difficult. We can’t pass pointers generically over the network, but we can make a special case for strings, so analyze the m_ParamTypes for VAR_STRING types and append the contents of the string to the end of the buffer that gets sent to the network transport. On the receiving end, resolve the pointers to point to the appended data, and then use the start of the chunk as the beginning of the parameter buffer.
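The string special case could be sketched like this. To keep the example runnable on 64-bit machines as well, each parameter occupies one machine word (Slot) rather than a dword, and both ends are assumed to share the same word size and byte order. PackRPC(), ResolveRPC() and Slot are hypothetical names, and the enum is a standalone copy of the relevant Function::eVarType values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

enum eVarType { VAR_BOOL, VAR_INT, VAR_FLOAT, VAR_STRING };

// One parameter slot: a machine word (a dword on 32-bit x86).
typedef std::uintptr_t Slot;

// Sender: copy the parameter slots, replace each string pointer with the
// offset of its characters from the start of the chunk, and append the
// characters (with terminator) to the end of the chunk.
std::vector<unsigned char> PackRPC( const std::vector<eVarType>& types,
                                    const Slot* params )
{
    const std::size_t headerSize = types.size() * sizeof( Slot );
    std::vector<unsigned char> chunk( headerSize );

    for ( std::size_t i = 0; i < types.size(); ++i )
    {
        Slot slot = params[ i ];
        if ( types[ i ] == VAR_STRING )
        {
            const char* text = reinterpret_cast<const char*>( slot );
            assert( text != nullptr );
            slot = chunk.size(); // where the string is about to be stored
            chunk.insert( chunk.end(), text, text + std::strlen( text ) + 1 );
        }
        std::memcpy( &chunk[ i * sizeof( Slot ) ], &slot, sizeof( slot ) );
    }
    return chunk;
}

// Receiver: turn string offsets back into pointers into the chunk. Returns
// false if the chunk is malformed. On success, the start of the chunk can
// be used as the parameter buffer.
bool ResolveRPC( const std::vector<eVarType>& types,
                 std::vector<unsigned char>& chunk )
{
    const std::size_t headerSize = types.size() * sizeof( Slot );
    if ( chunk.size() < headerSize )
    {
        return false;
    }

    for ( std::size_t i = 0; i < types.size(); ++i )
    {
        if ( types[ i ] != VAR_STRING )
        {
            continue;
        }

        Slot slot = 0;
        std::memcpy( &slot, &chunk[ i * sizeof( Slot ) ], sizeof( slot ) );

        // The data came off the network, so check the offset and make sure
        // the string is terminated inside the chunk.
        if ( slot < headerSize || slot >= chunk.size() )
        {
            return false;
        }
        if ( std::memchr( &chunk[ slot ], 0, chunk.size() - slot ) == nullptr )
        {
            return false;
        }

        slot = reinterpret_cast<Slot>( &chunk[ slot ] );
        std::memcpy( &chunk[ i * sizeof( Slot ) ], &slot, sizeof( slot ) );
    }
    return true;
}
```

The resolved pointers refer to memory owned by the chunk, so the chunk has to stay alive (and must not be resized) until the called function returns.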

Now that we have the Function instance and our parameter buffer, we call either Call_cdecl() or Call_stdcall() depending on m_CallType, passing in the parameter buffer and m_Proc. Then we can either use the return value directly or, if m_ReturnType is VAR_FLOAT, call GetST0() right away to fetch the result from the FPU before any other floating-point code disturbs it. And that’s all there is to calling a function generically!

Completing the Solution

Up until now we’ve been assuming that the g_Functions array has already been set up. Let’s go back and fill in this hole now. There are several ways to fill out the g_Functions array. Perhaps the easiest to implement but least safe to use is to apply macros or a function to set it up:

[code lang="cpp"]
float Foo( int, const char* );
int Bar( void );

void SetupFunctionExports( void )
{
    {
        Function function;

        function.m_Name       = "Foo";
        function.m_Proc       = reinterpret_cast <void*> ( Foo );
        function.m_SerialID   = g_Functions.size();
        function.m_ReturnType = Function::VAR_FLOAT;
        function.m_ParamTypes.push_back( Function::VAR_INT );
        function.m_ParamTypes.push_back( Function::VAR_STRING );
        function.m_CallType   = Function::CALL_CDECL;

        g_Functions.push_back( function );
    }

    {
        Function function;

        function.m_Name       = "Bar";
        function.m_Proc       = reinterpret_cast <void*> ( Bar );
        function.m_SerialID   = g_Functions.size();
        function.m_ReturnType = Function::VAR_INT;
        function.m_CallType   = Function::CALL_CDECL;

        g_Functions.push_back( function );
    }
}
[/code]

This example is illustrative but not exactly optimal. It could be improved with some helper functions and macros to make it easier to add new functions to the table. However, it will always be unsafe and inconvenient. Adding a new function to the table means that someone has to write some code that specifies its types, name, and calling convention. Changing a function (adding a parameter, for example) without updating the table could introduce some nasty and hard-to-debug problems. It will be a lot of work to keep the function specifications in sync with the actual function prototypes.

We need a way to build this table automatically and safely to eliminate these problems. Fortunately, the C++ compiler already has all the information we need. While parsing the function’s prototype, the compiler builds an internal representation of the function – its return type, parameters, calling convention, etc – exactly what is required to construct a function specification! Unfortunately, we don’t have access to this information from within the code, and besides, all that information gets thrown away when the linker constructs the final EXE. We could probably find a way to use the PDB (debug symbols database) to query for what we need, but we can’t ship debug symbols with the game. Besides, we wouldn’t have an easy way to tell which functions are for export and which aren’t.

Combining the export table functionality of a Win32 image file with the C++ language’s name mangling facility gives us the information we require. If we tag a function for exporting using the __declspec( dllexport ) keywords, that function’s name and address will appear in the EXE (or DLL) export table. And because this is a C++ application, those names will be mangled to support type safety and overloaded name resolution. Mangled names are encoded with all the information we require, so all we need is to decode the names into a form we can understand and then use that to build the Function entry to add to g_Functions.

The name-mangling format is completely implementation specific, undocumented, and even changes from release to release of Visual C++, so attempting to reverse-engineer it is probably not a good idea. It’s also unnecessary – Microsoft exported a name-unmangling function called UnDecorateSymbolName() from both ImageHlp.dll and DbgHelp.dll that does exactly this. So if we were to take our Foo() function from the last sample and DLL-export it, the entry ?Foo@@YAMHPBD@Z would appear in the EXE’s export table. If we unmangle the name, here’s what we will get back: float __cdecl Foo(int,char const *). Now this is something we can easily parse and convert to a Function entry for addition to our g_Functions table.

So now our procedure for building g_Functions is:

  1. Iterate over all entries in the EXE’s export table, and retrieve each function’s address and mangled name.
  2. Unmangle each name to get a function prototype in text form.
  3. Parse the function prototype to retrieve name, type and calling convention information.
  4. Store the results in a new entry within g_Functions. Repeat for each export.
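Step 3 can be sketched as a small string parser. The version below handles only the text shape quoted earlier (return type, calling convention keyword, name, then a comma-separated parameter list) and only the types this gem supports; anything else is rejected. ParsedPrototype, ParsePrototype() and the VAR_UNKNOWN value are hypothetical, and the enums are standalone copies rather than the nested ones in Function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum eVarType  { VAR_VOID, VAR_BOOL, VAR_INT, VAR_FLOAT, VAR_STRING, VAR_UNKNOWN };
enum eCallType { CALL_CDECL, CALL_STDCALL };

struct ParsedPrototype
{
    std::string           name;
    eVarType              returnType;
    std::vector<eVarType> paramTypes;
    eCallType             callType;
};

// Strips leading and trailing spaces.
static std::string Trim( const std::string& text )
{
    const std::size_t first = text.find_first_not_of( ' ' );
    if ( first == std::string::npos ) return std::string();
    const std::size_t last = text.find_last_not_of( ' ' );
    return text.substr( first, last - first + 1 );
}

// Maps a type as spelled in the unmangled text onto a supported type.
static eVarType ParseType( const std::string& text )
{
    const std::string type = Trim( text );
    if ( type == "void" )         return VAR_VOID;
    if ( type == "bool" )         return VAR_BOOL;
    if ( type == "int" )          return VAR_INT;
    if ( type == "float" )        return VAR_FLOAT;
    if ( type == "char const *" ) return VAR_STRING;
    return VAR_UNKNOWN;
}

// Parses text such as "float __cdecl Foo(int,char const *)". Returns false
// for anything it does not understand, so the caller can skip that export.
bool ParsePrototype( const std::string& text, ParsedPrototype& out )
{
    const std::size_t open  = text.find( '(' );
    const std::size_t close = text.rfind( ')' );
    if ( open == std::string::npos || close == std::string::npos || close < open )
        return false;

    // Everything before '(' is "<return type> <convention> <name>".
    const std::string head = text.substr( 0, open );
    std::size_t conv = head.find( " __cdecl " );
    out.callType = CALL_CDECL;
    if ( conv == std::string::npos )
    {
        conv = head.find( " __stdcall " );
        out.callType = CALL_STDCALL;
    }
    if ( conv == std::string::npos ) return false;

    out.returnType = ParseType( head.substr( 0, conv ) );
    out.name       = Trim( head.substr( head.find( ' ', conv + 1 ) ) );
    if ( out.returnType == VAR_UNKNOWN || out.name.empty() ) return false;

    // Split the parameter list on commas; "(void)" means no parameters.
    out.paramTypes.clear();
    const std::string params = Trim( text.substr( open + 1, close - open - 1 ) );
    if ( params.empty() || params == "void" ) return true;

    std::size_t start = 0;
    while ( start <= params.size() )
    {
        std::size_t comma = params.find( ',', start );
        if ( comma == std::string::npos ) comma = params.size();

        const eVarType type = ParseType( params.substr( start, comma - start ) );
        if ( type == VAR_UNKNOWN || type == VAR_VOID ) return false;
        out.paramTypes.push_back( type );

        start = comma + 1;
    }
    return true;
}
```

Rejecting unknown types outright is a deliberate choice here: an export whose signature cannot be fully described is safer left out of the table than called with a guessed parameter layout.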

Iterating over the exports to get the function addresses and mangled names requires knowledge of the binary format of Win32 Portable Executable (PE) format files. A specification for this format is available from the Microsoft Developer Network Library. Search for the “.edata” section within the library entry for the “Microsoft Portable Executable and Common Object File Format Specification” to find the structure of a Win32 export table.

There’s one final little detail – the entries in the export table point to a jump table, which in turn points to the actual functions. This detail isn’t important if all you’re interested in is binding to functions and calling them generically. However, if you need to be able to do a reverse lookup and convert eip from within the called function to find its Function instance (required for RPC’s as described earlier), you’ll need to get the actual address of the function for comparison, not the address of the entry in the jump table. This is easy enough: dereference the address given by the DLL export entry to find the jump table entry. The first byte will be 0xE9 (jmp), followed by a 4-byte offset to the actual entry point of your function. Take the address given by the DLL export entry, add 5 for the full jmp instruction, add the 4-byte offset, and this will be the address of the entry point of your function. This can then be used for reverse lookup to find the Function instance from within g_Functions.
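The address arithmetic described above is small enough to show in full. In this sketch the bytes of the export entry and the address they are loaded at are passed separately, so that it can be exercised on any platform; in a real 32-bit process the two would refer to the same location. ResolveJumpThunk() is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// 'code' points at the bytes found at the export's address, and 'address'
// is that address as a 32-bit value. Returns the real entry point.
std::uint32_t ResolveJumpThunk( const unsigned char* code, std::uint32_t address )
{
    assert( code != nullptr );

    // 0xE9 is "jmp rel32". Anything else means the export already points
    // at the function itself, so there is nothing to resolve.
    if ( code[ 0 ] != 0xE9 )
    {
        return address;
    }

    // The 4-byte offset follows the opcode, stored little-endian.
    const std::uint32_t offset =
          static_cast<std::uint32_t>( code[ 1 ] )
        | static_cast<std::uint32_t>( code[ 2 ] ) << 8
        | static_cast<std::uint32_t>( code[ 3 ] ) << 16
        | static_cast<std::uint32_t>( code[ 4 ] ) << 24;

    // The offset is relative to the end of the 5-byte jmp instruction.
    // Unsigned wraparound handles backward (negative) jumps.
    return address + 5 + offset;
}
```

For example, a jump table entry at 0x00401000 containing E9 10 00 00 00 resolves to 0x00401015. Checking for the 0xE9 byte first means the same code works whether or not the linker has routed the export through a jump table.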

Conclusion

We now have everything we need to call functions in a completely generic way. In order to publish a function in the system and allow other subsystems such as scripting and network RPC’s to bind to it, we simply tag it with __declspec( dllexport ) (this verbose tag is best wrapped in a macro to reduce clutter). At runtime the function-binding publisher will iterate over the Win32 export table, and extract name, type and calling convention information from each entry. Other subsystems can look up functions by memory address, name, or serial ID and call them generically using Call_cdecl() or Call_stdcall().

This seems like quite a bit more work to implement than necessary, and for smaller projects with small export sets it probably is. Larger projects, on the other hand, will probably be changing constantly. The good news is that, once the basic work is done, adding new functions to the system is as simple as tagging them for export and they’ll immediately be available. This more than pays for itself, and is a powerful ability to give any engineer on the team. When combined with a general-purpose scripting engine it can be turned into a useful debugging tool as well as serving the content-specific needs for which it was originally written.

In the interests of space and simplicity, a lot of features have been left out of this gem. The generic function-binding concept can be taken much further in a variety of ways. It can easily be enhanced to include support for pointers and references, variable argument functions, and passing more than just strings over a network. User-defined types could be supported for RPC packaging through a serialization interface that can be detected and called directly when post processing RPC parameter buffers for outbound network buffers. Also, support for calling class member functions is a very useful tool and can be easily added. And finally, one feature that may or may not be necessary is a tool that will post process an EXE, stripping off the exports table and converting it into a native data format for direct import into g_Functions. This may be necessary either for security reasons (to prevent cheating perhaps) or to make it so that it’s not necessary to ship DbgHelp.dll with the game.

August 1st, 2000 at 12:00 pm

2 Responses to 'Gem: A Generic Function Binding Interface'

  1. Do I need an assembler like TASM in order to make it work, or is there a way to make sure the GCC compiler (used by Code::Blocks) uses the assembly code? I’m unfamiliar with assembly.

    Boro

    6 Mar 10 at 11:05 am

  2. GCC has an inline assembler language that you can use. IBM has a page I found on it that can get you started.

    http://www.ibm.com/developerworks/linux/library/l-ia.html

    Of course, it requires learning assembly language to get anywhere. But you can cheat a little by looking at the output of unoptimized assembly code in a debugger. In fact, that’s how I figured out how all the different calling conventions of the VC++ compiler worked in memory. Make a function of a particular type, call it from another, then step through the debugger in disassembly view, and you’ll be able to work out how it operates, and what the different assembly opcodes are.

    Scott

    7 Mar 10 at 8:33 pm