Abstract: we describe how C++ operator dynamic_cast<> can be implemented. We also present "operator" guid_cast<>, which can be implemented by hand (I use it all the time!).

C++ has an operator dynamic_cast<>, which can be used to safely downcast or even cast to an unrelated type (in other words, traverse inheritance hierarchy down or even across). For example, given the following code:

class A {
	virtual void enable_rtti(){}
};
class B : public A {
};

A *a = ...;
B *b = dynamic_cast<B*>(a);

If variable "a" really points to an object of type "B", then result of dynamic_cast<> will be a valid pointer to "B". Otherwise, NULL is returned. Magic. How does this work? First, dynamic_cast<> should be enabled as a compiler option (it's usually called RTTI for RunTime Type Information).Whenever a class has at least 1 virtual function, it is called "polymorphic", and RTTI is enabled for it by the compiler. An extra virtual function, named say CastTo() is introduced. This function has the following prototype:

virtual void *CastTo(type_info &ty){
}

The implementation of CastTo typically looks like this:

virtual void *CastTo(type_info &ty){ 
	if (typeid(*this)==ty) return this;
	for (P in each parent class) {
		if (P has virtual function table) {
			void *test=P::CastTo(ty); if (test) return test;
		} else {
			test typeid against all classes in this non-polymorphic 
			branch of the tree.
		}
	}
	return NULL;
}

(note the there are two ways to write CastTo: 1. to translate hierarchy of classes into code, 2. to translate it into data, and use an algorithm for traversing that data (like a tree). In this example, case #1 is shown. Case #2 will trades speed for flexibility - you can do much more with hierarchy data than traverse it in one specific way. For example, you could print this hierarchy.

type_info is a class that stores a type's mangled name. Mangled name is a string uniquely identifying a class; it is usually some ugly concatenation of class name (including namespace name), all parent class names, and all template parameter type names. One type_info object is generated for each C++ class in the program, polymorphic or not. The type_info's operator == usually first compares the addresses. If addresses are equal, it returns true; otherwise, it compares mangled names.

The reason why dynamic_cast<> can go anywhere in the inheritance hierarchy is simple: when an object is instantiated, it is at the bottom of inheritance hierarchy. C++ compiler's function CastTo overrides all CastTo's in all base (parent) classes. So no matter through which base class dynamic_cast<> is called, the final object's CastTo is called. This CastTo executes a depth-first traversal of the inheritance tree, thus searching all parent classes for a match (as illustrated in the pseudo-code above).

The inefficiency and unsafety of dynamic_cast<> is in the mangling scheme: If there are two disjoint class hierarchies in two separate libraries, which happen to name their classes the same, then it will be possible to dynamic_cast<> between them. Mangled names rely only on class names, and hence are not unique.

A better solution would be to generate a globally unique identifier (GUID) and paste it right into the source code. Then this GUID would be used everywhere as a unique label for the class, thus eliminating any chance of confusing it with another. typeid::operator ==() would be significantly more efficient, because it would only have to compare 16 bytes (typical length of a GUID) instead of a mangled name, which can be extremely long (easily 1000 bytes in length). Using a GUID would also save space in the object file.

A programmer can easily implement the CastTo() scheme him/herself:

struct WithCastTo {
    virtual void *CastTo(const GUID &a)=0;
};

#define IMPL_GET_IID(P1,P2,P3,P4,P5,P6,P7,P8,P9,P10,P11)\
    static const GUID &GetIID() {\
    static const GUID guid = {P1,P2,P3,P4,P5,P6,P7,P8,P9,P10,P11};\
    return guid;\
}

inline bool Matches(const GUID &a, const GUID &b) {
    return &a==&b || memcmp(&a,&b,sizeof(GUID))==0;
}

// Depth-first search of inheritance tree
#define TRY_PARENT_CASTTO(Type,guid)\
{void *_result = Type::CastTo(guid); if (_result) return _result;}

#define TRY_DOWNCAST(To,guid)\
if (Matches(To::GetIID(),guid)) return static_cast<To*>(this);

template<class From,class To> inline To *guid_cast(From *t) {
    return t ? (To*)t->CastTo(To::GetIID()) : 0;
}

Usage:

struct A : public WithCastTo {
	// {6642D2E4-E60E-4ace-92B9-42072A323EF4}
	IMPL_GET_IID( 0x6642d2e4, 0xe60e, 0x4ace, 0x92, 0xb9, 0x42, 0x7, 0x2a, 0x32, 0x3e, 0xf4 );
};

struct C {
	// {DDDAE12B-3D08-4aa2-A4AB-0D7BF60A2B4C}
	IMPL_GET_IID( 0xdddae12b, 0x3d08, 0x4aa2, 0xa4, 0xab, 0xd, 0x7b, 0xf6, 0xa, 0x2b, 0x4c );
};
struct B : public A, public C {
	// {CAE563C3-CA3F-411f-BB97-9523B33F04B4}
	IMPL_GET_IID( 0xcae563c3, 0xca3f, 0x411f, 0xbb, 0x97, 0x95, 0x23, 0xb3, 0x3f, 0x4, 0xb4 );
	virtual void *CastTo(const GUID &guid) {
		TRY_DOWNCAST(B,guid);
		TRY_DOWNCAST(A,guid);
		TRY_DOWNCAST(C,guid);
		return 0;
	}
};

A *a = ...;
C *c = guid_cast<C>(a);	// works just like dynamic_cast<>

Notice that the class "To" has to have a static function GetIID() that returns a GUID. This is not always possible -- sometimes To is a class from some system library that we do not want to edit. We can extend guid_cast<> easily by introducing a template class GuidOf<T> that by default calls T::GetIID(), but can be specialized to return a GUID on behalf of the target:

template<class T> struct GuidOf {
	static const GUID &GetIID() {
		return T::GetIID();

	}
};
struct GuidOf<SystemClass> {// specialization for class that doesn't have GetIID()
	IMPL_GET_IID(...);
};

There is an important difference between guid_cast<> and dynamic_cast<>. Operator guid_cast<> does not imply traversal of inheritance hierarchy. It is more general than that: it is a request for a piece of information based on GUID. GUIDs can be used to uniquely label anything, including all atoms in the universe, one by one.

It is not always possible to apply guid_cast<> to the result of another guid_cast<>. The reason is that the target of guid_cast<> is not required to have a virtual table. This behavior is true for dynamic_cast<> as well. The above example guid_cast<C>(a) demonstrates this.

Since C++ has more information at compile time than we do, it disallows applying dynamic_cast<> to classes that have no virtual function tables (such classes are called polymorphic). guid_cast<> can always be called if the class has a function CastTo, regardless of whether it's virtual or not (In practice this is not a problem).

 

www.self-similar.com
e-mail: alexei at self-similar.com
© 2002 Alexei Lebedev