A tale of unexpected oddities of Windows' x64 calling convention

As a pragmatic C++ developer with a few decades of experience in computers, I cherry-pick features and concepts that make sense to me from so-called “modern” C++ and mix them with traditional, proven concepts like OOP and low-level knowledge about the architecture I’m targeting.

In my private “toolbelt” codebase, that means that I totally embrace the concept of non-owning references to things. In the old days, these were plain pointers, and people nowadays frown upon them. These days, there is a multitude of helpful abstractions that yield a lot of value:

- std::string_view for strings
- std::span for a contiguous area of memory, which I avoid for its limited API, its mix of “view” and “modification” and its support for non-dynamic extents
- array_view and span: Like std::span, but they only support dynamic extents and separate read-only (array_view) from read-write (span) access. This separation provides more clarity for me. Also, they have convenience methods to slice in a lot of common ways.
- function_view: A bit more than a simple function pointer. Type-erased access to (static) functions, lambdas and non-static member functions
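To make the read-only half of that split concrete, here is a minimal sketch of what such a view could look like. It is not the toolbelt code: the name ArrayView and the helpers first and dropFirst are made up for illustration, and clamping out-of-range slices is just one possible design choice.

```cpp
#include <cstddef>

// Hypothetical read-only, non-owning view over contiguous memory.
// It only ever has a dynamic extent: a pointer plus a size.
template <typename T>
class ArrayView {
public:
  constexpr ArrayView(const T* data, std::size_t size) noexcept
      : data_(data), size_(size) {}

  // Convenience: bind directly to a C array.
  template <std::size_t N>
  constexpr ArrayView(const T (&arr)[N]) noexcept : data_(arr), size_(N) {}

  constexpr std::size_t size() const noexcept { return size_; }
  constexpr const T& operator[](std::size_t i) const noexcept { return data_[i]; }

  // Slicing helpers (assumed names); both clamp to the available range.
  constexpr ArrayView first(std::size_t n) const noexcept {
    return {data_, n < size_ ? n : size_};
  }
  constexpr ArrayView dropFirst(std::size_t n) const noexcept {
    const std::size_t k = n < size_ ? n : size_;
    return {data_ + k, size_ - k};
  }

private:
  const T* data_;
  std::size_t size_;
};
```

A read-write counterpart would look the same with T* instead of const T*, which is all the separation amounts to.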

Since all these things are non-owning, they should be simple. That is true for the types that operate on a range of contiguous memory! They are really handy and easy to write. And then there’s all the things that C++ calls Callable. Vittorio Romeo shared his function_view and Jonathan wrote a critique on it. If you read these pieces, you get a glimpse of what writing C++ library code is like. In my humble words: It is horribly complicated if you want to do it correctly. That is because C++ is a feature-packed language — with the wrong defaults everywhere, by the way.

But back to function_view: What I want is just a glorified function pointer with a bit more functionality, so that I can also reference a stateful lambda, for example. So my concept stores a function pointer and a context pointer. This is how it has been done since the C days, and it is exactly the role the “this” pointer plays in member functions. Historically, the “this” pointer is also the implicit first argument of any member function in C++. It doesn’t take a genius to try to make use of this and unify all these things in a function_view, so that you can bind it to almost anything: a global function, a stateful lambda, a pragmatic member function call.
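The “function pointer plus context pointer” idea can be sketched in a few lines. This is an illustration only, not the real function_view: the class name FnRef and the factory names fromFn and fromCallable are assumptions, and it only covers free functions and callable objects such as lambdas.

```cpp
#include <utility>

template <typename Signature>
class FnRef;  // hypothetical name; only the specialization below is defined

template <typename Ret, typename... Args>
class FnRef<Ret(Args...)> {
public:
  // Bind to a free function known at compile time; no context is needed.
  template <Ret (*Fn)(Args...)>
  static FnRef fromFn() noexcept {
    return FnRef(nullptr, [](void*, Args... args) -> Ret {
      return Fn(std::forward<Args>(args)...);
    });
  }

  // Bind to any callable object (e.g. a stateful lambda) without owning it.
  template <typename Callable>
  static FnRef fromCallable(Callable& callable) noexcept {
    return FnRef(&callable, [](void* ctx, Args... args) -> Ret {
      return (*static_cast<Callable*>(ctx))(std::forward<Args>(args)...);
    });
  }

  Ret operator()(Args... args) const {
    return call_(context_, std::forward<Args>(args)...);
  }

private:
  using Thunk = Ret (*)(void*, Args...);  // context first, like "this"
  FnRef(void* context, Thunk call) noexcept : context_(context), call_(call) {}

  void* context_;
  Thunk call_;
};

inline int twice(int x) { return 2 * x; }

inline int demoFnRef() {
  int offset = 10;
  auto addOffset = [&offset](int x) { return x + offset; };
  auto viewA = FnRef<int(int)>::fromFn<&twice>();
  auto viewB = FnRef<int(int)>::fromCallable(addOffset);
  offset = 100;                 // the view sees the change: it owns nothing
  return viewA(4) + viewB(1);   // 8 + 101
}
```

Both views have the same size and the same call path, regardless of what they are bound to.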

Member function pointers are a special beast, because they are of variable size, depending on the platform and whether you talk about a virtual method and so on. If you want to be correct, this is really hard: While an object is being constructed or destroyed, and the classes in its inheritance hierarchy override the same virtual method with different implementations, the target of a virtual call to that method depends on when you make the call. While the innermost (base) class is being constructed, the call resolves to that class’s implementation. As each more derived class gets constructed, the call is redirected to that class’s override. At destruction, the order is reversed. As you see, to be correct, you have to walk the vtable every time you call the method (unless you can prove that you’re not in the ctor/dtor phase).
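The construction-time behaviour is easy to observe. The following small example uses made-up class names Base and Derived; it records which override a virtual call reaches while the base class constructor is still running.

```cpp
#include <string>

struct Base {
  // While Base's constructor runs, the object is "only a Base",
  // so this virtual call reaches Base::name, not Derived::name.
  Base() { seenInCtor = name(); }
  virtual ~Base() = default;
  virtual std::string name() const { return "Base"; }

  std::string seenInCtor;
};

struct Derived : Base {
  std::string name() const override { return "Derived"; }
};
```

After construction has finished, the very same call on the very same object reaches Derived::name. A function pointer resolved once and cached cannot follow that switch.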

But being a pragmatic developer in my leisure time (at work, we do things differently), I take my glasses off, ignore this fine print and say: Well, what about: Just don’t use function_view on a half-destroyed object. That allows us to now simplify the problem: Walking the vtable, we find the one true function pointer to the member function that is being used and that’s the function pointer we put into our function_view. The context pointer is the this object, adjusted for the part of the object that the member function expects. Neato, we reached our goal! A simple function_view that supports most Callables. That’s it, the story could end here. Well…

The fine print empire strikes back

Today, I changed the signature of a function_view’ed parameter:

struct FancyInt {
  FancyInt() = default;
  FancyInt(uint64_t value) : member(value) {}
  
  // ... a ton of operators ...

private:
  uint64_t member = 0;
};

using OldCallbackType = function_view<bool(int, const NonTrivial&)>;
using NewCallbackType = function_view<FancyInt(int, const NonTrivial&)>;

and my project crashed with pSomeObject: vtable is null in some method like this:

class SomeClass {
public:
  FancyInt someMemberFunc(int foo, const NonTrivial&) {
    pSomeObject->someVirtualMethod(foo);
    return {};
  }
};

That crash occurred when the method was called a second time. Considering that the vtable is at the start of SomeClass, like FancyInt::member, could it be that somehow, the two parameters were confused? :o Well, no, because types like FancyInt can be passed via register. Or not? Hmm. Let’s read it up:

User-defined types can be returned by value from global functions and static member functions. To return a user-defined type by value in RAX, it must have a length of 1, 2, 4, 8, 16, 32, or 64 bits. It must also have no user-defined constructor, destructor, or copy assignment operator; no private or protected non-static data members; no non-static data members of reference type; no base classes; no virtual functions; and no data members that do not also meet these requirements. […] Otherwise, the caller assumes the responsibility of allocating memory and passing a pointer for the return value as the first argument.

We can check off most things, might wonder about the user-defined constructor, and then stumble over the phrase “global functions and static member functions” in the first sentence: it omits a crucial category of functions, namely non-static member functions, the thing we’re dealing with.

Just to repeat: Member functions pass their return value differently than static member functions or global functions. This was one of the times where I thought to myself: So that’s one of these things why people hate to develop on the platform. Compared to that, the System V AMD64 calling convention is more regular.
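One plausible reading of the crash can be simulated in portable code by spelling out the hidden parameters by hand. The sketch below assumes that a member function takes “this” in the first slot and the hidden return buffer in the second, while a caller that thinks in terms of global functions puts the return buffer first and the context second. All names (FakeObject, memberStyle, callGlobalStyle) are made up, and nothing here depends on the real ABI.

```cpp
#include <cstdint>
#include <cstring>

// Stands in for the real object; imagine firstField is a pointer member.
struct FakeObject {
  std::uint64_t firstField = 0xABCD;
};

using LoweredFn = void (*)(void* slot1, void* slot2, int foo);

// Member-function view of the slots: slot1 = this, slot2 = return buffer.
inline void memberStyle(void* thisPtr, void* returnBuffer, int foo) {
  (void)thisPtr;
  (void)foo;
  const std::uint64_t result = 0;  // models "return {};"
  std::memcpy(returnBuffer, &result, sizeof result);
}

// Caller's view of the slots: slot1 = return buffer, slot2 = context.
inline std::uint64_t callGlobalStyle(LoweredFn fn, void* context, int foo) {
  std::uint64_t returnBuffer = 0xFFFF;
  fn(&returnBuffer, context, foo);
  return returnBuffer;
}
```

Calling callGlobalStyle(&memberStyle, &obj, 1) leaves the caller's return buffer untouched and writes the zero result over the start of obj instead. That matches the symptom: the first call silently corrupts the object, and the second call trips over it.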

I tried to figure out the rules of the game, but in the end there is more than one case in which the global function world and the member function world differ and C++ <type_traits> is a poor fit for the criteria outlined in the quoted paragraph. That’s why I went for the easy way and generated a small shim to adapt between the two worlds (my caller side plays in the global-function world, regardless of who it’s calling), for every creation of a function_view that’s bound to a member function which doesn’t return void or an integral type.
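To show why the standard traits do not line up with the quoted rules, here is what they report for FancyInt (C++17 assumed). The variable template needsMemberShim is a hypothetical name for the “not void, not integral” check described above, not the actual toolbelt code.

```cpp
#include <cstdint>
#include <type_traits>

struct FancyInt {
  FancyInt() = default;
  FancyInt(std::uint64_t value) : member(value) {}

private:
  std::uint64_t member = 0;
};

// The traits that sound relevant say "simple type" ...
static_assert(std::is_trivially_copyable_v<FancyInt>);
static_assert(std::is_standard_layout_v<FancyInt>);
// ... while others say the opposite, and none of them asks about
// "user-defined constructor" or "private data members" directly.
static_assert(!std::is_aggregate_v<FancyInt>);
static_assert(!std::is_trivially_default_constructible_v<FancyInt>);

// Hypothetical conservative rule: shim everything except void and integers.
template <typename Ret>
inline constexpr bool needsMemberShim =
    !(std::is_void_v<Ret> || std::is_integral_v<Ret>);

static_assert(!needsMemberShim<void>);
static_assert(!needsMemberShim<bool>);
static_assert(needsMemberShim<FancyInt>);
```

The rule is deliberately broader than strictly necessary; an unneeded shim costs one extra call, a missing one costs a corrupted object.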

This new shim also provided me with a “safe member function call” facility that works during object construction/destruction for free. So in the end, I learnt a lot about member function pointers and Windows ABI.

The shim is ugly to call (function_view<type>::fromMemFnSafe<Class, &Class::memberFunc>(pObject)), but easy to read:

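template <typename Signature>
class function_view;  // only the specialization for function signatures is defined

template <typename RetType, typename... Args>
class function_view<RetType(Args...)> {
public:
  // (constructor taking the context and function pointers omitted)

  template <typename Class, RetType (Class::*MemberFunc)(Args...)>
  static function_view fromMemFnSafe(Class* pObject) noexcept {
    // A captureless lambda converts to a plain function pointer with the
    // "global function" signature: context first, then the real arguments.
    using RealFuncType = RetType (*)(void*, Args...);
    const auto pTrampoline = (void*)(RealFuncType)[](void* pThis, Args... args)->RetType {
      return (((Class*)pThis)->*MemberFunc)(std::forward<Args>(args)...);
    };
    return {(void*)pObject, pTrampoline};
  }
};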

It has the proper signature of a global function with the context as first argument and leaves all the annoyance of adapting between this calling convention and the member function call to the compiler. I wish I had gone that way hours earlier. So, maybe I’m still working on my pragmatism. ;)
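For readers who want to try the trampoline idea in isolation, here is a self-contained sketch. It is a simplified stand-in, not the real function_view: the names MiniFunctionView, Boxed and Counter are made up, and the function pointer is stored with its real type instead of being cast to void*.

```cpp
#include <cstdint>
#include <utility>

template <typename Signature>
class MiniFunctionView;  // hypothetical, reduced to the member-function case

template <typename Ret, typename... Args>
class MiniFunctionView<Ret(Args...)> {
public:
  using Thunk = Ret (*)(void*, Args...);

  MiniFunctionView(void* context, Thunk thunk) noexcept
      : context_(context), thunk_(thunk) {}

  template <typename Class, Ret (Class::*MemberFunc)(Args...)>
  static MiniFunctionView fromMemFn(Class* pObject) noexcept {
    // The lambda is an ordinary function to the compiler, so the compiler
    // generates whatever adaptation the member call needs on this platform.
    Thunk thunk = [](void* pThis, Args... args) -> Ret {
      return (static_cast<Class*>(pThis)->*MemberFunc)(std::forward<Args>(args)...);
    };
    return {pObject, thunk};
  }

  Ret operator()(Args... args) const {
    return thunk_(context_, std::forward<Args>(args)...);
  }

private:
  void* context_;
  Thunk thunk_;
};

// A small class type as return value, the case that went wrong above.
struct Boxed {
  std::uint64_t value;
};

struct Counter {
  std::uint64_t total = 0;
  Boxed add(int amount) {
    total += static_cast<std::uint64_t>(amount);
    return Boxed{total};
  }
};
```

Usage would look like MiniFunctionView<Boxed(int)>::fromMemFn<Counter, &Counter::add>(&counter). Calling the view twice accumulates into the same Counter and leaves the object intact.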