Digging Deep on Casting Generics in .NET
Posted on 6/26/2007
Things don't always work as you might expect when you go about casting one .NET data type to another. This is especially true with the generic collections because they are strongly-typed. When you cast a generic collection, you're asking the compiler to assert that the outer type (the container) is a match and that the inner type (the values contained within the collection) also matches. In this article, we'll use various debugging tools to take a deep look at what at first glance appears to be a simple casting problem. To demonstrate this problem, suppose you implement an interface called ITalk with a class called Talker as follows:
interface ITalk {
    float VoiceQuality { get; }
}
public class Talker : ITalk {
    private float voiceQuality;
    public float VoiceQuality {
        get { return voiceQuality; }
    }
}
At first glance, you might think that in .NET you could assign an instance of a List<Talker> to an IList<ITalk> but you can't. Consider the following code:
// The following line of code will generate a compiler error
// saying something like: Cannot implicitly convert type
// 'List<Talker>' to 'IList<ITalk>'. An explicit conversion
// exists (are you missing a cast?)
IList<ITalk> speakers = new List<Talker>();
What's wrong with the assumptions being made by the programmer here? After all, the generic List implements IList and the Talker implements ITalk as we saw above. So why can't the C# compiler figure this out? To understand the problem, one has to look at the second half of the error message. The compiler says, "An explicit conversion exists" and then asks, "Are you missing a cast?" It's not as if the compiler doesn't know what we're asking it to do here. It knows that an explicit conversion from the concrete type to another exists. But which other type does it mean? Let's try using a cast to the target type we’re interested in as the compiler suggests:
// Cast the new object to the target type. This compiles
// just fine, but, unfortunately, it also does not work.
IList<ITalk> speakers = new List<Talker>() as IList<ITalk>;
Using C#'s as operator as shown above is often cited as the "safe" way to perform cast operations because it uses the Microsoft Intermediate Language (MSIL) instruction ISINST to check that the object being cast IS an INSTance of the target type, returning null if the types don't match. If you wrote an old C-language style cast instead, the compiler would emit the CASTCLASS instruction, which does throw an exception (an InvalidCastException) when the types don't match. However, the new code above is not very safe if we don't bother to check the speakers reference for null before attempting to use it. We know that Talker implements ITalk and we know that a generic List implements IList. And the compiler allows the cast operation to compile. In fact, it wouldn't balk at a C-style cast to the target type either. Both forms are checked at run time: the "safe" as cast returns null while the "unsafe" C-style cast throws an exception. Either way, we don't get a reference to a usable list. This is not at all what we want or expect.
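The difference between the two cast styles can be observed directly. The following is a minimal sketch, not the original program: the CastDemo class and its two helper methods are hypothetical names introduced for illustration, and Talker is simplified to return a constant.

```csharp
using System;
using System.Collections.Generic;

interface ITalk { float VoiceQuality { get; } }

class Talker : ITalk
{
    // Simplified for the sketch: a constant instead of a backing field.
    public float VoiceQuality { get { return 0f; } }
}

static class CastDemo
{
    // The as operator (isinst): yields null when the type check fails.
    public static IList<ITalk> TryAs(object source)
    {
        return source as IList<ITalk>;
    }

    // The C-style cast (castclass): throws when the type check fails.
    public static bool CastThrows(object source)
    {
        try
        {
            IList<ITalk> list = (IList<ITalk>)source;
            GC.KeepAlive(list);
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    static void Main()
    {
        object talkers = new List<Talker>();
        Console.WriteLine(TryAs(talkers) == null); // prints True
        Console.WriteLine(CastThrows(talkers));    // prints True
    }
}
```

Either way the caller ends up without a usable IList<ITalk>; the only choice is between a null reference and an exception.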
So, what's going wrong in the example code above? Let's use the MSIL Disassembler (ILDASM) to do a bit of investigation. First, run the ILDASM tool and open the compiled assembly. You can usually find this tool in the {Visual Studio .NET Root}\SDK\{Version}\bin folder. Within the ILDASM GUI, find the class that contained the method with the code above and double-click the method name to see the disassembly. The MSIL will look something like the next code block. Don't be frightened by what you see. I'm going to explain all of it to you, line by line:
.locals init ([0] class
[mscorlib]System.Collections.Generic.IList`1<class
CopyToInterfaceExample.ITalk> speakers)
IL_0000: nop
IL_0001: newobj instance void class
[mscorlib]System.Collections.Generic.List`1<class
CopyToInterfaceExample.Talker>::.ctor()
IL_0006: isinst class
[mscorlib]System.Collections.Generic.IList`1<class
CopyToInterfaceExample.ITalk>
IL_000b: stloc.0
On the first line, we can see a local variable called speakers being allocated. Notice the [0] syntax up front? That means the speakers local variable will be referred to as variable zero (0) in the code that follows. If you pick through the notation a bit, you can see that this variable is of type IList<ITalk>. That certainly matches what we wrote in C#. Now the "line numbers" start. On line IL_0000, we see a nop (No Operation) instruction. The compiler sometimes inserts these for special circumstances or when it needs to pad things to get an instruction to line up a certain way in memory. On line IL_0001, the newobj (New Object) instruction is used. Again, this matches what we wrote in C# with the new operator. If you scan across that line, you can see that the object being created is of type List<Talker> which also nicely matches the C# code. At the very end of the line, you can see that the constructor is being invoked (using the .ctor syntax). It's pretty easy to observe how your C# code looks from the compiler's perspective, huh?
Now we come to line IL_0006 which implements the cast operation as an isinst instruction. Why does line IL_0006 follow line IL_0001? Well, it's because they aren't really line numbers at all. These notations represent byte offsets in the current method. The nop instruction is only one (1) byte wide. This is why the compiler uses it to perform padding operations: the nop instruction is small, odd-sized and changes nothing, simply perfect for adding fluff when you need it. The newobj instruction and its operand at IL_0001 must be five (5) bytes wide. So, if we look ahead, the isinst instruction (and its operand) on IL_0006 must also be five (5) bytes wide because the stloc.0 (Store Local Variable 0) that follows is marked with offset IL_000b.
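The offset arithmetic can be checked with the opcode metadata that ships in System.Reflection.Emit. This is only a sketch: OffsetDemo and its methods are hypothetical names, and the size helper handles just the operand kinds that occur in this listing (no operand, or a four-byte metadata token).

```csharp
using System;
using System.Reflection.Emit;

static class OffsetDemo
{
    // Instruction size in bytes: the opcode itself plus its operand.
    public static int InstructionSize(OpCode op)
    {
        int operand;
        switch (op.OperandType)
        {
            case OperandType.InlineNone:      // nop, stloc.0
                operand = 0;
                break;
            case OperandType.InlineMethod:    // newobj: 4-byte metadata token
            case OperandType.InlineType:      // isinst: 4-byte metadata token
                operand = 4;
                break;
            default:
                throw new NotSupportedException(op.Name);
        }
        return op.Size + operand;
    }

    // Running byte offset of each instruction in the sequence.
    public static int[] Offsets(OpCode[] ops)
    {
        int[] offsets = new int[ops.Length];
        int current = 0;
        for (int i = 0; i < ops.Length; i++)
        {
            offsets[i] = current;
            current += InstructionSize(ops[i]);
        }
        return offsets;
    }

    static void Main()
    {
        OpCode[] ops = { OpCodes.Nop, OpCodes.Newobj, OpCodes.Isinst, OpCodes.Stloc_0 };
        int[] offsets = Offsets(ops);
        for (int i = 0; i < ops.Length; i++)
        {
            Console.WriteLine("IL_{0}: {1}", offsets[i].ToString("x4"), ops[i].Name);
        }
        // Prints:
        // IL_0000: nop
        // IL_0001: newobj
        // IL_0006: isinst
        // IL_000b: stloc.0
    }
}
```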
Let me throw in a couple of other facts about MSIL to help you understand what's happening in this code. First of all, MSIL is stack-based. There are a lot of implied stack operations in any MSIL code you read. In this short bit of code, I count four (4) stack operations. On IL_0001, the newobj instruction creates a new object and places the reference to it on the stack. Then, on IL_0006, the isinst instruction pulls the reference to the new object from the stack and evaluates it. If it is a direct match with the type IList<ITalk> or one of its subclasses, isinst pushes the original reference back onto the stack. If there is no match, it pushes a null reference onto the stack instead. Finally, the stloc.0 instruction pops the last item pushed onto the stack off and places it in local variable zero (0), which we know by the name speakers.
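To make the four stack operations concrete, here is a toy model of the sequence. It is a sketch only: a Stack<object> stands in for the MSIL evaluation stack, the C# as operator stands in for isinst, and StackDemo is a hypothetical name. It says nothing about how the CLR implements any of this.

```csharp
using System;
using System.Collections.Generic;

interface ITalk { float VoiceQuality { get; } }

class Talker : ITalk
{
    public float VoiceQuality { get { return 0f; } }
}

static class StackDemo
{
    // Replays newobj / isinst / stloc.0 against a toy evaluation stack
    // and reports how many pushes and pops took place.
    public static object Run(out int stackOperations)
    {
        Stack<object> stack = new Stack<object>();
        stackOperations = 0;

        // newobj: create the list and push its reference.
        stack.Push(new List<Talker>());
        stackOperations++;

        // isinst: pop the reference and test it...
        object candidate = stack.Pop();
        stackOperations++;

        // ...then push the same reference on a match, or null otherwise.
        stack.Push(candidate as IList<ITalk>);
        stackOperations++;

        // stloc.0: pop the result into local variable zero (speakers).
        object local0 = stack.Pop();
        stackOperations++;

        return local0;
    }

    static void Main()
    {
        int operations;
        object speakers = Run(out operations);
        Console.WriteLine(operations);       // prints 4
        Console.WriteLine(speakers == null); // prints True
    }
}
```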
OK, we've looked at the MSIL that represents the C# code above. It’s pretty easy to follow, don’t you think? We learned that the MSIL is stack-oriented. We also learned that the newobj instruction in MSIL is used to implement C#'s new keyword. Finally, we've seen how the isinst instruction is used to implement C#'s as operator, used for casting one type to another. What we don’t yet understand is that whatever isinst is doing internally, it doesn't believe that a List<Talker> is a subclass of an IList<ITalk>, despite what we programmers think. Another interesting view of the program that could shed some light on the subject is to run the code in a debugger and look at the Intel x86 disassembly of the same code. The Intel assembly language is generated when the Just-In-Time (JIT) compiler converts the MSIL byte code into something that the underlying machine can actually execute. The easiest way to get to the disassembly of a .NET program in many of the Microsoft debuggers is to load the program and set a standard breakpoint where you would like to stop. When the debugger stops at the breakpoint, right-click in the code window and click the "Go to disassembly" item on the pop-up menu. You will see something like the following in the disassembly window for the code we're interested in:
IList<ITalk> speakers = new List<Talker>() as IList<ITalk>;
00000049 mov ecx,919794h
0000004e call FFCA0E64
00000053 mov esi,eax
00000055 mov ecx,esi
00000057 call 7873D568
0000005c mov edx,esi
0000005e mov ecx,919ACCh
00000063 call 7923D88E
00000068 mov dword ptr [ebp-3Ch],eax
Again, don't panic. I will walk you through the Intel x86 assembly language and correlate what you see here with the MSIL disassembly we looked at earlier. Remember the IL_0001 offset in the MSIL disassembly above? That was the line that contained the newobj instruction to allocate a new List<Talker>. And if you scan to the end of that line, you'll remember the call to the constructor using the .ctor notation. Well, both of those steps show up in the Intel assembly language. The call instruction at offset 4E in the x86 disassembly branches to some internal Common Language Runtime (CLR) code to allocate space on the heap for the list. The hexadecimal value 919794h that's moved into the ECX register just before the call is significant. This is a numeric type handle that refers to the type we know as List<Talker>. It makes sense that .NET refers to types by numbers rather than names when possible. Numbers are much faster to process than strings are.
The mapping between the .NET types loaded into your program and their metadata is a very dynamic thing. Run the program once and a List<Talker> might be known by the type handle 919794h as it is above. Run the same program again and the List<Talker> type may have a completely different type handle. That's OK because the type handles are inserted during JIT compilation so they can afford to be very dynamic. If you want to inspect the type handle for one of your classes at runtime, that's pretty easy. Consider the following QuickWatch expression used in the Visual Studio .NET debugger to see the type handle for the List<Talker> type:
((System.RuntimeType)(typeof(List<Talker>))).TypeHandle.Value
By casting the type object for a known type to a System.RuntimeType and dereferencing the Value of the TypeHandle property, we can see the metadata descriptor that the CLR will use to refer to that type internally. You can replace List<Talker> in the QuickWatch expression above with any type to get the unique number that refers to that type within the current running program. OK, back to the code. We know that the call at offset 4E uses the List<Talker>’s type handle to allocate memory from the heap. The next instruction at offset 57 calls the new object's constructor.
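The same value can also be read from code rather than from QuickWatch, because Type exposes a public TypeHandle property. The sketch below assumes nothing beyond that property; TypeHandleDemo and HandleOf are hypothetical names, and the numbers printed will differ from machine to machine and from run to run.

```csharp
using System;
using System.Collections.Generic;

interface ITalk { float VoiceQuality { get; } }

class Talker : ITalk
{
    public float VoiceQuality { get { return 0f; } }
}

static class TypeHandleDemo
{
    // The numeric handle the runtime uses for a loaded type.
    public static IntPtr HandleOf(Type type)
    {
        return type.TypeHandle.Value;
    }

    static void Main()
    {
        // No expected output is shown: the values are assigned at run time.
        Console.WriteLine("List<Talker>: {0:X}h",
            HandleOf(typeof(List<Talker>)).ToInt64());
        Console.WriteLine("IList<ITalk>: {0:X}h",
            HandleOf(typeof(IList<ITalk>)).ToInt64());
    }
}
```

Within a single run, asking twice for the same type gives the same handle, and the two types above always have different handles.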
Now's a good time to go back to that stack-based conversation from earlier. You can skip this paragraph if you don't care to understand how the MSIL stack is implemented. For those of you who know Intel x86 assembly language, you may be wondering where the push and pop instructions have gone in the code above. Or at a minimum, we should see a lot of move operations dereferencing the EBP register, which holds the base pointer of the current Intel stack frame. You should understand that when I say that MSIL is stack-based, the actual stack is a pure MSIL construct. Whether the JIT compiler has a CPU-assisted stack that can be used or not is an implementation detail. Rather than using the Intel stack to manage everything related to the data that MSIL manipulates, the Intel JIT compiler uses a combination of CPU registers and other memory to create the concept of the stack as MSIL defines it. CPU registers are typically much faster than other types of memory anyway.
OK, so far we've seen the allocation of the List<Talker> collection and the call to its constructor. All that remains is the cast operation and the assignment of the result to the variable known as speakers. As you might have guessed, the x86 call instruction at offset 63 is the actual cast operation implemented in C# using the as operator and implemented in MSIL using the isinst instruction. Now look at the mov instruction at offset 5E, just before the last call. Can you guess what that number 919ACCh being moved into the ECX register is? That's right. It's the type handle for an IList<ITalk>. The code that's being invoked in the last call instruction is somehow going to use a reference to the list we created and this type handle to determine if a List<Talker> IS an INSTance of an IList<ITalk>. Finally, the assignment of the result is represented by the last line of the assembler code at offset 68 which moves the result of the cast operation onto the x86 stack where the speakers variable is stored.
Well, we know the outcome from the C# code from which this assembly language code was derived. Running the program proves that the speakers value will be assigned a null reference. Unfortunately, all of my attempts to use a debugger to step into the last call at offset 63 have been thwarted. I used four different Windows debuggers, setting various options to strip JIT optimizations and to allow stepping into unmanaged code. Nothing I tried would let me inspect how the isinst instruction operates. Microsoft has protected that code pretty well. I suppose if I still had an In-Circuit Emulator (ICE) as I did in the old days, I could step in. If someone reading this article can show me how to step into the isinst implementation using one of my debuggers, I would be grateful. However isinst does its work, it's clear that it does not think that a List<Talker> is a subclass of an IList<ITalk>. So, I'm at the end of my investigation, right?
Well, not quite yet. There's one more thing I'd like to try. What if I could simply blot out the isinst instruction in the MSIL as if it never happened? I wonder if I could use those nop instructions we saw earlier to overwrite it. The reference to the new object would then be stored back into the local variable called speakers without having gone through the type check that the cast operation performs. After all, we're programmers so we're way smarter than the CLR, right? We know that it's safe to make this assignment so we would like to hack our way around this obvious shortcoming. If we could do this, the modified MSIL might look like this:
.locals init ([0] class
[mscorlib]System.Collections.Generic.IList`1<class
CopyToInterfaceExample.ITalk> speakers)
IL_0000: nop
IL_0001: newobj instance void class
[mscorlib]System.Collections.Generic.List`1<class
CopyToInterfaceExample.Talker>::.ctor()
IL_0006: nop
IL_0007: nop
IL_0008: nop
IL_0009: nop
IL_000a: nop
IL_000b: stloc.0
This is pretty simple to do actually. Suppose that the name of my assembly that contains the MSIL byte code is CopyToInterfaceExample.exe. To generate the MSIL representation of this program, I can run ILDASM in command line mode using this command:
ildasm CopyToInterfaceExample.exe /out=CopyToInterfaceExample.il
This will produce a text file with an IL extension containing the MSIL representation of the whole program. Next, I open the IL text file in my editor and modify the code, inserting five (5) nop instructions as shown above to replace the five (5) bytes previously consumed by the isinst instruction. I could simply remove the isinst instruction rather than replacing it byte-for-byte with nop instructions. However, if I did that, I'd have to be careful not to violate the CLR’s rules about instruction alignment and such. It's a lot easier to just insert the nop instructions to "blot out" the offensive code. The last step is to assemble the MSIL back into a working program. This can be done with ILDASM's cousin known as the MSIL Assembler, or ILASM. Here's the command line to assemble my hacked MSIL file back into an executable program.
ilasm /exe /debug CopyToInterfaceExample.il /output=CopyToInterfaceExample2.exe
Notice that I output the new executable assembly to a new name (with a number appended), so as to not overwrite the original executable with the assembler. This is so I can try other disassembly experiments with the original program without having to recompile it each time. When you run the hacked program, you'll notice that it runs just fine until you try to dereference the speakers variable. Let's say, for example, that you call the Add() member of the speakers list to add a new Talker. When that happens in the hacked program, the CLR encounters a very nasty problem. It generates an internal exception saying that something is very wrong. It may even tell you that it has detected an illegal program before shutting it down. Maybe the CLR was smart not to allow the cast operation after all.
Why did hacking the program to assign an instance of a List<Talker> to a reference for an IList<ITalk> crash the program? The answer lies in the implementation of the List<T> generic type which implements six (6) different interfaces: IList<T>, ICollection<T>, IEnumerable<T>, IList, ICollection and IEnumerable. Back in the bad old days of COM, Don Box (whom I think the world of for suffering through the COM years as the de facto standards-bearer) used to talk about using interfaces to shear into the virtual method table (called the vtable) for an object instance. It was lovely imagery, in my opinion. If you think of the vtable as a list of pointers to methods that are supported by a running object instance, with its interfaces sort of stacked in groups, we can imagine shifting (or shearing in Box-speak) into the table to find the block of methods that represent an individual interface. Of course, the languages that we used to write COM objects didn’t really support interfaces the way .NET does. Interfaces were supported in the Interface Description Language (IDL) that we used to build COM type libraries but we had to fool our other languages into supporting the idea. In C++, we used so-called pure virtual base classes which were essentially abstract classes full of nothing but abstract methods. That’s sort of what an interface is if you think about it: all promise and no implementation. Thankfully, the .NET CLR supports interfaces intrinsically. And we can see in the definition of the List<T> type that six (6) separate interfaces are implemented.
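The interface list can be inspected with reflection instead of reading the type definition. This is a sketch under a few assumptions: InterfaceDemo and Implements are hypothetical names, and the number of interfaces reported depends on the framework version in use (later releases of .NET added interfaces to List<T>, so more than six may be printed).

```csharp
using System;
using System.Collections.Generic;

interface ITalk { float VoiceQuality { get; } }

class Talker : ITalk
{
    public float VoiceQuality { get { return 0f; } }
}

static class InterfaceDemo
{
    // True when a reference of type 'target' may legally point at a 'source' instance.
    public static bool Implements(Type target, Type source)
    {
        return target.IsAssignableFrom(source);
    }

    static void Main()
    {
        // Every interface the closed type List<Talker> reports at run time.
        foreach (Type implemented in typeof(List<Talker>).GetInterfaces())
        {
            Console.WriteLine(implemented);
        }

        // IList<Talker> is in that list; IList<ITalk> is not.
        Console.WriteLine(Implements(typeof(IList<Talker>), typeof(List<Talker>))); // prints True
        Console.WriteLine(Implements(typeof(IList<ITalk>), typeof(List<Talker>)));  // prints False
    }
}
```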
The problem with the trickery we attempted above is that the specialized type IList<Talker>, which List<Talker> implements, is only one (1) of six (6) interfaces that the List<Talker> implements. When we create a List<Talker> and then store its reference in an IList<ITalk>, the vtable for the running object is a complete mismatch, especially since we’re trying to treat a List<Talker> as the even more abstract IList<ITalk>. At runtime, the generic IList<ITalk> that we’re seeking is essentially a pre-computed shearing into the contained objects. The compiler and the CLR are cooperating to make some assumptions about the virtual methods available on the container and the contained objects through the strongly-typed generic class. Much of the dynamic code that would be implemented for Visual Basic or IronRuby to find the interface by name or ID at runtime can be avoided entirely. Strong-typing helps to add raw speed to the program at the expense of some flexibility. The CLR knows this kind of stuff, which is why casting a generic collection of a concrete type to a generic collection of a single interface supported by the concrete type is forbidden. The CLR just knows it won’t work. This is why purely dynamic languages like Ruby, which allow very loose and late-bound typing, are gaining momentum in the .NET world. However, there will always be a need for balancing speed and type-safety, among other constraints, which is why languages like C# and C++ will always have their place.
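If what you actually need is an IList<ITalk> holding the same Talker objects, one option that stays within the rules is to copy the references into a new List<ITalk>. The sketch below is one way to do that, not the only one: it uses List<T>.ConvertAll with a lambda (which needs C# 3.0 or later), and the Robot class, CopyDemo and ToSpeakers are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;

interface ITalk { float VoiceQuality { get; } }

class Talker : ITalk
{
    public float VoiceQuality { get { return 0f; } }
}

// A second, unrelated implementation of ITalk (hypothetical).
class Robot : ITalk
{
    public float VoiceQuality { get { return 1f; } }
}

static class CopyDemo
{
    // Builds a genuine List<ITalk> that refers to the same Talker objects.
    public static List<ITalk> ToSpeakers(List<Talker> talkers)
    {
        return talkers.ConvertAll<ITalk>(t => t);
    }

    static void Main()
    {
        List<Talker> talkers = new List<Talker>();
        talkers.Add(new Talker());
        talkers.Add(new Talker());

        IList<ITalk> speakers = ToSpeakers(talkers);

        // Legal on the copy; the original List<Talker> could never hold a Robot.
        speakers.Add(new Robot());

        Console.WriteLine(speakers.Count); // prints 3
        Console.WriteLine(talkers.Count);  // prints 2
    }
}
```

The copy is a separate list, so later changes to one are not seen through the other. That is the price of getting a container whose element type really is ITalk.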
While experimenting with this code, I watched my hacked program crash in multiple ways. The clue that led me to understand what was happening was that when I called the Add() method of the IList<ITalk> interface, the program sometimes crashed inside a method like GetEnumerator() which is part of another interface implemented by List<T> called IEnumerable. If I called IList<T>::Add(), how in the world could I end up inside the IEnumerable::GetEnumerator() method? I disassembled the Add() method to make sure it wasn’t invoking GetEnumerator() for some odd reason and it clearly was not doing that. So, the only plausible way to end up in the list’s GetEnumerator() method by calling Add() was that the dispatching mechanism that was supposed to be routing my program to the Add() method had somehow become corrupted, sending the CPU careening down the wrong path.
The dispatching mechanism for virtual methods of running objects in .NET is the same sort of vtable outlined for COM above. How could the vtable have become corrupted? Technically, it hadn’t been corrupted at all. It’s just that the shape of the vtable from a List<T>’s perspective is quite different from the shape of a vtable for an IList<T>. It’s like wearing someone else’s glasses to correct my vision. Through someone else’s glasses, the world might seem odd to me, wholly inaccurate. And it’s more than a conceptual problem. If I wear the wrong glasses, I can’t drive safely, I will most likely trip over things as I misestimate the proximity and shape of obstacles and eventually I will crash. In the same way, having the wrong view of an object’s implementation can make the CLR trip and crash, too. The lesson here is that we should always be curious about how our compilers and runtime engines work. And we shouldn’t make too many assumptions about the implementation of our ideas in code. The next time your compiler or the CLR does something you don’t expect, dig a bit deeper using the tools you learned about here to find out how it all works. In discovering the methods that your ideas are implemented with, you may gain a much deeper understanding.