Avoid virtual functions in C++ (when possible)

Xiahua Liu August 27, 2024 #C++

Virtual functions are a core feature of C++, present since the very beginning to enable run-time polymorphism. This gives C++ a capability that C lacks natively: the ability to treat objects of different types through a single abstract base type.

However, many developers overlook the "dark side" of this feature. It comes with hidden costs in both memory and performance.

The Cost of Runtime Polymorphism

Let's say you are writing a database application. You decide that every data type must have a print() function that returns a std::string.

The classic OOP approach is to define an abstract base class with a pure virtual print() function. Later, you implement this for specific types, like Int1.

(You can find the example code HERE)
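For readers without the linked code at hand, here is a minimal sketch of what such a hierarchy might look like. The member name value, the virtual destructor, and the example() helper are assumptions for illustration, not the original example.

```cpp
#include <cassert>
#include <string>

// Abstract interface: every data type must be printable.
struct BaseData {
    virtual ~BaseData() = default;
    virtual std::string print() const = 0;
};

// One concrete type holding a single int (member name is an assumption).
struct Int1 : BaseData {
    int value;
    explicit Int1(int v) : value(v) {}
    std::string print() const override { return std::to_string(value); }
};

// Hypothetical usage helper: the caller only sees BaseData.
void example() {
    Int1 x(42);
    const BaseData& base = x;      // static type BaseData, dynamic type Int1
    assert(base.print() == "42");  // dispatched to Int1::print() at run time
}
```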

Memory Overhead

The first issue is size. Even though Int1 only holds a single 4-byte int member variable, sizeof(Int1) is 16 bytes on a 64-bit machine.

In contrast, Int2 (which has no virtual functions) takes up only 4 bytes.
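The size difference can be checked directly. The sketch below assumes definitions similar to the ones described above (the exact members of the original Int1 and Int2 may differ), and the 16-byte figure applies to typical 64-bit ABIs rather than being guaranteed by the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct BaseData {
    virtual ~BaseData() = default;
    virtual std::string print() const = 0;
};

struct Int1 : BaseData {   // polymorphic: carries a hidden vptr
    int value = 0;
    std::string print() const override { return std::to_string(value); }
};

struct Int2 {              // plain struct: just the int
    int value = 0;
    std::string print() const { return std::to_string(value); }
};

// Int2 is exactly one int wide.
static_assert(sizeof(Int2) == sizeof(int), "Int2 has no hidden members");

// Extra bytes paid per object for run-time polymorphism.
// Typical 64-bit ABI: 8 (vptr) + 4 (int) + 4 (padding) - 4 = 12.
std::size_t overhead_per_object() { return sizeof(Int1) - sizeof(Int2); }

void example() {
    // The overhead is at least one pointer on mainstream compilers.
    assert(overhead_per_object() >= sizeof(void*));
}
```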

Where does this size difference come from?

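If you check the assembly in the constructor of Int1, you'll see the program storing an additional value: vtable for Int1+16. This value is written into the vptr (virtual pointer), a hidden 8-byte member at the start of the object that points into the vtable in memory. The 8-byte vptr plus the 4-byte int, padded to 8-byte alignment, adds up to 16 bytes.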

mov     edx, OFFSET FLAT:vtable for Int1+16
mov     rax, QWORD PTR [rbp-8]
mov     QWORD PTR [rax], rdx    # Write 8 bytes (vtable address)

The Int2 constructor, however, simply writes the integer value and returns.

We need this pointer because the static type of a pointer or reference does not tell the program which type the object really is at run time, so that information has to travel with the object itself. If you static_cast an Int1* to a BaseData*, the object still points to Int1's vtable. This is how the program knows to call Int1::print() even when holding a BaseData pointer.

But if you are storing billions of these integers in a database, quadrupling your memory usage (4 bytes -> 16 bytes) is a disaster.

Performance Overhead

The second issue is speed. Calling a virtual function requires a "pointer chase":

  1. Follow the object's vptr to the vtable.
  2. Look up the correct function address in the table.
  3. Jump to that address.
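To make the three steps concrete, here is a hand-rolled imitation of the mechanism using plain function pointers. This is only a sketch of the idea: all names (VTable, Object, call_print) are hypothetical, and a real compiler's vtable layout is defined by the platform ABI, not by this code.

```cpp
#include <cassert>
#include <string>

struct Object;  // forward declaration

// A hand-written "vtable": one slot per virtual function.
struct VTable {
    std::string (*print)(const Object*);
};

// Every object starts with a pointer to its type's table (the "vptr").
struct Object {
    const VTable* vptr;
    int value;
};

std::string int_print(const Object* self) {
    return std::to_string(self->value);
}

const VTable int_vtable = { &int_print };

std::string call_print(const Object* obj) {
    const VTable* table = obj->vptr;  // 1. follow the object's vptr
    auto fn = table->print;           // 2. load the function address
    return fn(obj);                   // 3. indirect call through the pointer
}

void example() {
    Object o{ &int_vtable, 42 };
    assert(call_print(&o) == "42");
}
```

Steps 1 and 2 are dependent loads, and step 3 is an indirect call whose target the compiler generally cannot see at the call site.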

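This has several side effects: each call costs extra dependent memory loads before the jump, and because the target is only known at run time, the compiler usually cannot inline the call. Even the final keyword, which lets the compiler devirtualize calls made through that type, is not a guaranteed fix, and it does not remove the vptr from the object.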

The Solution: Static Polymorphism (CRTP)

If we want the flexibility of an interface without the runtime cost, we can use the Curiously Recurring Template Pattern (CRTP).

Here is the optimized example code

In the CRTP example, Int3 inherits from BaseData<Int3>. The base class casts this to the derived type at compile time to call the implementation:

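#include <string>

// T is the derived type; it must provide user_print().
template<class T>
struct BaseData {
    std::string print() const {
        // Resolved at compile time: no vptr, no vtable lookup.
        return static_cast<const T*>(this)->user_print();
    }
};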

The assembly code for Int3 is now identical to Int2. It consumes only 4 bytes, because the empty base class adds no storage, and involves no pointer chasing.
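A complete version of this pattern might look like the sketch below. The definition of Int3 (member name, constructor) is an assumption, since only the base class is shown in the article.

```cpp
#include <cassert>
#include <string>

template<class T>
struct BaseData {
    std::string print() const {
        return static_cast<const T*>(this)->user_print();
    }
};

// Int3 passes itself as the template argument (the "curious recursion").
struct Int3 : BaseData<Int3> {
    int value;
    explicit Int3(int v) : value(v) {}
    std::string user_print() const { return std::to_string(value); }
};

// The empty base adds no storage on mainstream compilers
// (empty base optimization), so Int3 is as small as a bare int.
static_assert(sizeof(Int3) == sizeof(int), "no vptr in Int3");

void example() {
    Int3 x(42);
    assert(x.print() == "42");  // a direct, non-virtual call
}
```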

A Warning on Recursive Calls

You might notice that I named the implementation function user_print() instead of print().

In CRTP, it is best practice to distinguish the interface (in the base class) from the implementation (in the derived class).

If the derived class fails to implement user_print(), you get a clear compile-time error. However, if both functions were named print(), and the derived class forgot to implement it, the base class print() would essentially call itself (since static_cast<Derived*>(this)->print() would resolve back to BaseData::print() via inheritance). This causes infinite recursion, leading to a stack overflow/segmentation fault.
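If you want an even more explicit error message, one option is to check for user_print() yourself. The sketch below is an optional hardening step, not part of the original example: the helper detect_user_print and the types Good and Forgot are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Chosen when `const T` has a callable user_print() member.
template<class T>
auto detect_user_print(int)
    -> decltype(void(std::declval<const T&>().user_print()), std::true_type{});
// Fallback when the expression above is ill-formed.
template<class T>
std::false_type detect_user_print(...);

template<class T>
struct BaseData {
    std::string print() const {
        // Checked only when print() is actually used, so T is complete here.
        static_assert(decltype(detect_user_print<T>(0))::value,
                      "Derived type must implement user_print()");
        return static_cast<const T*>(this)->user_print();
    }
};

struct Good : BaseData<Good> {
    std::string user_print() const { return "good"; }
};

struct Forgot : BaseData<Forgot> {};  // no user_print()

void example() {
    assert(Good{}.print() == "good");
    // Forgot{}.print();  // would fail to compile instead of recursing
    static_assert(!decltype(detect_user_print<Forgot>(0))::value,
                  "Forgot has no user_print()");
}
```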

Generic Programming

CRTP fits naturally into generic programming in C++, which is built on templates. Instead of relying on a common runtime type, we rely on a common interface.

We can write a function that accepts any type inheriting from BaseData:

template<class T>
std::string print_object(const BaseData<T>& a) {
  return a.print();
}

Here, the compiler determines the type T at compile time, so the call is resolved statically and can be inlined. We get the safety and structure of an interface without the vptr or the indirect call.
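As a usage sketch, print_object can be called with any CRTP-derived type. Int3 is defined here as assumed earlier, and Text is a hypothetical second type added only to show that the same function template works for both.

```cpp
#include <cassert>
#include <string>
#include <utility>

template<class T>
struct BaseData {
    std::string print() const {
        return static_cast<const T*>(this)->user_print();
    }
};

struct Int3 : BaseData<Int3> {
    int value;
    explicit Int3(int v) : value(v) {}
    std::string user_print() const { return std::to_string(value); }
};

// Hypothetical second type, for illustration only.
struct Text : BaseData<Text> {
    std::string text;
    explicit Text(std::string s) : text(std::move(s)) {}
    std::string user_print() const { return text; }
};

template<class T>
std::string print_object(const BaseData<T>& a) {
    return a.print();
}

void example() {
    // T is deduced as Int3 and Text respectively; no common runtime base.
    assert(print_object(Int3(42)) == "42");
    assert(print_object(Text("hello")) == "hello");
}
```

Note that Int3 and Text do not share a base class: BaseData<Int3> and BaseData<Text> are unrelated types, so they cannot be stored together in one container the way BaseData pointers can with virtual functions.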