A Look under the hood of the .NET Framework

2. Introduction

Well over a year ago, when I was still pretty new to .NET and programming in general, I attended a lecture by Bart de Smet titled "Behind the Scenes of 10 C# Language Features". I was unfamiliar with most of the features, I was unfamiliar with C# (I'm a VB programmer by trade), I had never even heard of Intermediate Language (which was abundant!) and even for seasoned C# veterans this lecture was tough. Needless to say I had a great time! Even though I hardly understood anything Bart said that day, it did inspire me. It inspired me to look beyond the code I was writing and it inspired me to write this article. And when I watched the lecture again on Channel9 just last week, I understood what Bart was saying.

Here is a link to the video of that lecture: Bart de Smet - Behind the Scenes of 10 C# Language Features

Throughout this article I will refer to this video when I think it could help to better understand the topic I am discussing. I encourage you to watch the entire video, although it is not necessary to understand this article. Is this article the written version of the video? Most certainly not! Although the video and this article have some overlap, I hope they mostly complement each other.

Another source of inspiration and knowledge is a book I recently reviewed for Manning Publications, called Metaprogramming in .NET by Kevin Hazzard and Jason Bock. Unfortunately the book has not been published yet, but it will be in the fall of this year. The introductory chapter is available for free, so I recommend you read it. Once again, reading this book is not necessary for understanding this article, but I might tell you to read a certain chapter when I think it could help to better understand the topic I am discussing (so I won't have to edit my article once the book is released).

So what IS necessary to understand this article? For starters a bit of persistence and good will. This article is BIG and I realize that. On top of that the topics discussed in this article are not easy, but I'll make sure you'll get there. A little understanding of Intermediate Language comes in handy. If you have never heard of Intermediate Language (or IL), or you do know it and think it is really very scary (which I would totally understand), don't turn away yet. I'll explain it in a bit. Furthermore we'll see some .NET constructs you may or may not yet be familiar with, such as Auto-Properties, Anonymous Types, Lambda Expressions and Iterator Methods. Again, don't sweat it. It sounds harder than it is.

I would advise you not to read this article all at once. Take a break every now and then, put it in your bookmarks and read on tomorrow evening. Let the newfound knowledge settle into your brain before continuing. I wish you lots of pleasure reading this. So are we ready? Let's go!

3. An introduction to IL

So as I said we will look at IL (Intermediate Language). What is this IL? You could say sweet dreams are made of these! All the .NET code you will ever write is compiled into IL. This means the code you write in any .NET language (C#, VB, F#...) is transformed into this language. This IL is then turned into machine code by the just-in-time (JIT) compiler so your software does what it does. Did you click that link there? Notice that IL refers to a broader concept that is not unique to .NET? That is because when I say IL I actually mean CIL or MSIL! So it's actually this Common IL or Microsoft IL that we are going to look at. So why are we looking at this IL? We can see what happens from looking at our VB or C# code, right? Sure, but what is REALLY going on is only visible at the IL level (or even at the assembly level, but let's not go there). For example, For Each (VB) or foreach (C#) has no literal counterpart in IL. What really happens are some function calls on the IEnumerable and IEnumerator Interfaces, as we will see in an example later in this article.

Before we look at an example of IL let me tell you why you would want to learn IL. First of all, learning a new language can be good fun and any new language you learn will make the next language easier to learn. Second, IL is not like any higher level language such as VB or C#. Learning how things are handled in other languages makes you think about how you code in your own language. Whether you use this newfound knowledge or not is up to you. At least you know the alternatives. Another good reason to learn IL specifically is that it gives you a better understanding of how .NET works under the hood. Whether you value such knowledge or not is up to you, but at least you can brag about it to your colleagues. A more practical reason to learn IL is that you can write, compile and execute your own IL at runtime using Reflection.Emit. Doing this might be useful because IL lets you use language constructs that are not available in VB or C#. As a bonus, code generated with Reflection.Emit is generally among the fastest kinds of dynamically generated code you will find, since it ends up as regular JIT-compiled IL. We will see an example of this at the end of this article. I hear you think you never needed this before, so why would you need it now? The truth is that you probably won't, but it is good to know the option is open to you.

So you must now be eager to see some IL! Let's first look at a simple Hello World example (yes, really). Open up Visual Studio and create a new Console Application, either in VB or C#. Paste the following code into the (parameterless) Main method.

static void Main()
{
   string s = "Hello IL!";
   Console.WriteLine(s);
   Console.ReadKey();
}

Whether you can see it or not, IL was emitted when you built this. You can look at the IL of an assembly by using the Intermediate Language Disassembler, or ILDASM for short. It is included in the Windows SDK, so if you have Visual Studio 2010 installed you should have no trouble finding it (you can simply use the 'Find' tool to search for ILDASM). In case you can't find it, you can download an older version of ILDASM right here. So let's start it up and you should get a window looking something like this:

Now try opening the Console Application you just created. Go to File -> Open and select the ConsoleApplication (make sure you saved and built your project). It should be available in the bin\debug folder. You should now get a tree view containing Namespaces, Classes, and Methods.

You can double-click on any Method to see its IL. Whether you have created your Console Application in VB or C# does not matter. The IL will be mostly the same. The part of the IL we will be looking at is the following, which should be the same for VB and C#.

  .locals init ([0] string s)
  IL_0000:  nop
  IL_0001:  ldstr      "Hello IL!"
  IL_0006:  stloc.0
  IL_0007:  ldloc.0
  IL_0008:  call       void [mscorlib]System.Console::WriteLine(string)
  IL_000d:  nop
  IL_000e:  call       valuetype [mscorlib]System.ConsoleKeyInfo [mscorlib]System.Console::ReadKey()
  IL_0013:  pop
  IL_0014:  ret
} // end of method Program::Main

Yikes! Now that is pretty scary code! No, it isn't. We will look at it one line at a time. But before we do there is something you should know about IL. IL is a stack-based language. That means values are pushed onto a stack (just literally think of it as a stack of values) and are 'consumed' from the top, so the value that was pushed last is used first. So let's look at the code sample. In the first line we see .locals init ([0] string s). Does this even need explanation? This is simply the IL declaration of the string s we declared in our program. The next line says nop, which is a pretty accurate description of what it does: No OPeration, so nothing at all. We will ignore any nop opcodes we run across. Did I say opcodes? Yes I did, because what we see here is an opcode, an OPeration CODE, that tells the machine what to do. Basically every instruction you see in IL is an opcode, possibly followed by an operand, so it's a lot less scary than it sounds. Let's continue with the next line of code. This is where it gets interesting! ldstr "Hello IL!". The ldstr opcode means that a string should be pushed on the stack. In this case that string is "Hello IL!". The next line, stloc.0, stores this string in local variable 0 (basically it pops the top item off the stack, which is the string "Hello IL!", and puts it into local variable 0). So what is local variable 0? Take a look at the first line again, .locals init ([0] string s). There is your answer, the string s.

The next line says ldloc.0. Can you guess what it does? It loads the value of local variable 0 onto the stack. You will see a lot of opcodes starting with st or ld. It is safe to say these always mean STore and LoaD; you will do well to remember that. So what will IL do now that "Hello IL!" is back on the stack? It makes a call to System.Console.WriteLine, which takes a string as argument. At this point the string that is on the stack is consumed and the stack is empty again. Whatever happens in Console.WriteLine is unknown to us (if you wish to look it up, be my guest though). When the code returns from Console.WriteLine, IL makes a call to System.Console.ReadKey, which returns the valuetype System.ConsoleKeyInfo and puts it on the stack. Since we are not using this ConsoleKeyInfo the next opcode is pop. Pop simply removes the top value from the stack and discards it. That concludes the example, and ret returns to the calling code.

Was that so hard? I don't think so. There are more opcodes you will see in this article, but you have seen the basics of IL, a stack-based language.
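
To make the stack behaviour a bit more tangible, here is a minimal C# sketch that emits almost the same opcodes at runtime with Reflection.Emit. It is only an illustration under a few assumptions: the names HelloIlSketch and BuildHello are made up, and instead of calling Console.WriteLine the generated method simply returns the string that is on top of the stack, which keeps the sketch short.

```csharp
using System;
using System.Reflection.Emit;

public static class HelloIlSketch
{
    // Builds a method at runtime whose body mirrors the disassembly above:
    // ldstr, stloc.0, ldloc.0 and finally ret.
    public static Func<string> BuildHello()
    {
        var method = new DynamicMethod("Hello", typeof(string), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();

        il.DeclareLocal(typeof(string));      // .locals init ([0] string)
        il.Emit(OpCodes.Ldstr, "Hello IL!");  // push the string on the stack
        il.Emit(OpCodes.Stloc_0);             // pop it into local variable 0
        il.Emit(OpCodes.Ldloc_0);             // push local variable 0 again
        il.Emit(OpCodes.Ret);                 // return the value on top of the stack

        return (Func<string>)method.CreateDelegate(typeof(Func<string>));
    }

    public static void Main()
    {
        Func<string> hello = BuildHello();
        Console.WriteLine(hello()); // prints "Hello IL!"
    }
}
```

If you remove the Ldloc_0 line the stack is empty when ret executes, and the runtime should reject the method when it is first invoked. That is a nice way to convince yourself that every value really has to be on the stack before it can be used.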

Further reading:
ILDASM.exe Tutorial
A list of OpCodes from the OpCodes Class
The first part of chapter 5 of the book Metaprogramming in .NET
Purpose of the nop opcode?

4. Under the hood of everyday .NET code

So now that you have seen a simple Hello World program let's look at some more interesting IL. Actually, let's look at some IL you may not have expected from looking at your code. Open one of the sample applications that can be downloaded at the top of this article. Either UnderTheHoodVB or UnderTheHoodCSharp will do. You should leave TheCuriousCaseOfFSharp alone for now. Once you open the solution you will see two projects. One project contains some Windows Forms and the other contains some Classes, some of which are used by the WinForms in the other project. The truth is that we are not going to run some of the code; it just sits there for theoretical analysis. The code that we will be running is mostly there to show that the code really does what I say it does. You might as well run ILDASM and open either UnderTheHoodVB.Examples.dll or UnderTheHoodCSharp.Examples.dll, since that is what we will mostly be looking at. The DLLs can be found in the bin\debug folder of their respective project folders. So, are you set? Let's look at our first IL example!

4.1. The case of Properties

A question I have seen in the QA section of CP quite often is "What is the difference between a Public field and a Property?" or "Why would I use a Property instead of get and set functions?". Let's look at the first question first. In your solution open up the AutoPropertyClass, the PropertyClass and the GetterSetterClass (all three found under the PropertyExample folder). You will find the following code:

Public Class AutoPropertyClass
   Public Property Text As String
   Public Property Number As Integer
End Class 
 
Public Class PropertyClass
 
   Private _Text As String
   Public Property Text() As String
      Get
         Return _Text
      End Get
      Set(ByVal value As String)
         _Text = value
      End Set
   End Property
 
   Private _Number As Integer
   Public Property Number() As Integer
      Get
         Return _Number
      End Get
      Set(ByVal value As Integer)
         _Number = value
      End Set
   End Property
 
End Class
 
Public Class GetterSetterClass
 
   Private _Text As String
   Private _Number As Integer
 
   Public Function get_Text() As String
      Return _Text
   End Function
 
   Public Sub set_Text(ByVal value As String)
      _Text = value
   End Sub
 
   Public Function get_Number() As Integer
      Return _Number
   End Function
 
   Public Sub set_Number(ByVal value As Integer)
      _Number = value
   End Sub
 
End Class

Three seemingly completely different Classes. However, if you open up ILDASM and look at the code that was generated you will find that the classes are actually pretty much the same! The AutoPropertyClass and the PropertyClass are even exactly the same. The compiler has generated backing fields for the auto-properties, as well as get and set functions, the same ones you see written out in the GetterSetterClass. What else do we see? A Property is nothing more than a wrapper for a get and a set function (Properties are the red triangles in ILDASM; they are not present in the GetterSetterClass's IL).

So what happens when we get or set the value of a Property? This can be seen in the PropertyUser class. It contains three methods, one that gets and sets the Properties in the AutoPropertyClass, one that does the same for the PropertyClass and one that calls the get and set functions in the GetterSetterClass. The generated IL? It's exactly the same for all three methods!

  .locals init ([0] int32 n,
           [1] class UnderTheHoodVB.Examples.PropertyExample.PropertyClass p,
           [2] string t)
  IL_0000:  nop
  IL_0001:  newobj     instance void UnderTheHoodVB.Examples.PropertyExample.PropertyClass::.ctor()
  IL_0006:  stloc.1
  IL_0007:  ldloc.1
  IL_0008:  ldstr      "Hello"
  IL_000d:  callvirt   instance void UnderTheHoodVB.Examples.PropertyExample.PropertyClass::set_Text(string)
  IL_0012:  nop
  IL_0013:  ldloc.1
  IL_0014:  ldc.i4.s   42
  IL_0016:  callvirt   instance void UnderTheHoodVB.Examples.PropertyExample.PropertyClass::set_Number(int32)
  IL_001b:  nop
  IL_001c:  ldloc.1
  IL_001d:  callvirt   instance string UnderTheHoodVB.Examples.PropertyExample.PropertyClass::get_Text()
  IL_0022:  stloc.2
  IL_0023:  ldloc.1
  IL_0024:  callvirt   instance int32 UnderTheHoodVB.Examples.PropertyExample.PropertyClass::get_Number()
  IL_0029:  stloc.0
  IL_002a:  nop
  IL_002b:  ret
} // end of method PropertyUser::UseTheProperties

After the Hello World example you should actually be able to read this pretty well. We see some new opcodes such as newobj, which is pretty self-explanatory. ldc.i4.s 42 might need some explanation. ldc.i4 pushes a supplied Int32 on the stack. The .s means this is the short form of the opcode: the supplied value is encoded in a single byte (an Int8) and then pushed on the stack as an Int32, which saves a few bytes and works here because 42 fits in one byte. What about the callvirt opcode? This is used to call overridable functions in a polymorphic manner. That is, callvirt looks at the runtime type of the object, so an override in a derived class is called even if the design time type of the object is its base class. On top of that, callvirt checks that the object is not null, which is why compilers typically emit it for ordinary instance methods as well. Sounds difficult? Don't worry about it. In this context just assume callvirt does the same as call. So what do we see in the IL above? No such thing as a Property is called, they are all get and set methods! So why would we still use Properties? For starters they provide an intuitive API when coding. Instead of looking for the correct function to get or set some value we simply use one Property to get or set the same value. Why not use a Public field? Well, I hope that's pretty obvious. Properties, through get and set methods, provide encapsulation and allow you to run extra code when a Property's value is read or written.
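
You don't even need ILDASM to see those get and set methods. The following C# sketch uses reflection to ask a Property for its accessor methods and then calls the setter by name. AutoPropertySketch and PropertyAccessorDemo are hypothetical names I made up for this illustration; they are not part of the sample solution.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for the article's AutoPropertyClass.
public class AutoPropertySketch
{
    public string Text { get; set; }
}

public static class PropertyAccessorDemo
{
    // Returns the names of the accessor methods the compiler generated.
    public static string[] AccessorNames()
    {
        PropertyInfo property = typeof(AutoPropertySketch).GetProperty("Text");
        return new[] { property.GetGetMethod().Name, property.GetSetMethod().Name };
    }

    public static void Main()
    {
        var instance = new AutoPropertySketch();

        // The setter is an ordinary public method, so it can be found by name.
        MethodInfo setter = typeof(AutoPropertySketch).GetMethod("set_Text");
        setter.Invoke(instance, new object[] { "Hello" });

        Console.WriteLine(string.Join(", ", AccessorNames())); // get_Text, set_Text
        Console.WriteLine(instance.Text);                      // Hello
    }
}
```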

That wraps up our example on Properties. Was it what you expected it to be? Let's look at another VB and C# construct and see what IL makes of it.

4.2. The case of With

Have you ever wondered what happens when you use the With keyword in VB? It allows you to set some Properties right when you initialize an Object, without repeating the variable name for every Property. C# knows this same construct (the object initializer), but does not have a keyword for it like VB does. Let's look at the code. You can find it under the WithExample folder.

Public Sub ExampleWithoutWith()
   Dim p As New Person
   p.FirstName = "Fu"
   p.LastName = "Bar"
   p.Age = 50
End Sub
 
Public Sub ExampleUsingWith()
   Dim p As New Person With {.FirstName = "Fu", .LastName = "Bar", .Age = 50}
End Sub

I think by now you can guess what the IL of ExampleWithoutWith looks like, so I'm not going to discuss it. But what happens when we use that With keyword? Here is the IL:

  .locals init ([0] class UnderTheHoodVB.Examples.Person p,
           [1] class UnderTheHoodVB.Examples.Person VB$t_ref$S0)
  IL_0000:  nop
  IL_0001:  newobj     instance void UnderTheHoodVB.Examples.Person::.ctor()
  IL_0006:  stloc.1
  IL_0007:  ldloc.1
  IL_0008:  ldstr      "Fu"
  IL_000d:  callvirt   instance void UnderTheHoodVB.Examples.Person::set_FirstName(string)
  IL_0012:  nop
  IL_0013:  ldloc.1
  IL_0014:  ldstr      "Bar"
  IL_0019:  callvirt   instance void UnderTheHoodVB.Examples.Person::set_LastName(string)
  IL_001e:  nop
  IL_001f:  ldloc.1
  IL_0020:  ldc.i4.s   50
  IL_0022:  callvirt   instance void UnderTheHoodVB.Examples.Person::set_Age(int32)
  IL_0027:  nop
  IL_0028:  ldloc.1
  IL_0029:  stloc.0
  IL_002a:  nop
  IL_002b:  ret
} // end of method WithExample::ExampleUsingWith

The first thing you should see is the extra local variable that is declared. It has some weird name (by which you can see I'm using the VB generated IL) that is not valid in regular VB or C# code. What we then see is that a new Person is created, but it is not assigned to Person p, but to the extra, weird variable. From then on everything is pretty normal: all the Properties are set on the extra variable. After that, at IL_0028, the weird variable is assigned to our own variable p. If you are looking at the C# generated IL you will see exactly the same, except that the extra variable has a different name. See how the compiler is playing tricks on us again?
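
For the C# readers, here is a rough sketch of the same trick written out by hand. The Person class is a hypothetical stand-in for the one in the sample solution, and the name temp is mine; the real compiler-generated variable has a name you cannot type.

```csharp
using System;

// Hypothetical stand-in for the Person class in the sample solution.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Age { get; set; }
}

public static class InitializerDemo
{
    // What you write: an object initializer.
    public static Person UsingInitializer()
    {
        Person p = new Person { FirstName = "Fu", LastName = "Bar", Age = 50 };
        return p;
    }

    // Roughly what the compiler makes of it: the Properties are set on a
    // hidden temporary, and only then is the temporary assigned to p.
    public static Person Expanded()
    {
        Person temp = new Person();
        temp.FirstName = "Fu";
        temp.LastName = "Bar";
        temp.Age = 50;
        Person p = temp;
        return p;
    }

    public static void Main()
    {
        Person a = UsingInitializer();
        Person b = Expanded();
        Console.WriteLine(a.FirstName == b.FirstName && a.Age == b.Age); // True
    }
}
```

The temporary is not just noise. Because p is only assigned at the very end, p never refers to a half-initialized Person, not even when one of the setters throws an Exception.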

Let's look at another, pretty easy example before we start off with the more difficult stuff.

4.3. The case of For Each

One of my favorite, yet easy, compiler tricks is how it handles the For Each... Next Statement (foreach in C#). You can open up the example in the ForEachExamples folder. It contains a single Class with four methods: two methods for an iteration on an IEnumerable and two methods for an iteration on an IEnumerable(Of T) (IEnumerable<T> in C#). In each pair the first method simply uses the For Each keyword, while the second uses the code as it is emitted by the compiler (as seen in IL). So let's look at the code for the IEnumerable which uses For Each.

Public Shared Sub ForEach(ByVal l As IEnumerable, ByVal handler As Action(Of Object))
   For Each obj As Object In l
      handler.Invoke(obj)
   Next
End Sub

This method looks pretty straightforward and it's probably something you have done a million times before. The IL code that is generated is slightly different for C# and VB. So let's look at the IL code for VB. Have a look at the IL of C# at your own leisure.

.method public static void  ForEach(class [mscorlib]System.Collections.IEnumerable l,
                                    class [mscorlib]System.Action`1<object> 'handler') cil managed
{
  // Code size       82 (0x52)
  .maxstack  2
  .locals init ([0] object obj,
           [1] class [mscorlib]System.Collections.IEnumerator VB$t_ref$L0,
           [2] bool VB$CG$t_bool$S0)
  IL_0000:  nop
  IL_0001:  nop
  .try
  {
    IL_0002:  ldarg.0
    IL_0003:  callvirt   instance class [mscorlib]System.Collections.IEnumerator [mscorlib]System.Collections.IEnumerable::GetEnumerator()
    IL_0008:  stloc.1
    IL_0009:  br.s       IL_0025
    IL_000b:  ldloc.1
    IL_000c:  callvirt   instance object [mscorlib]System.Collections.IEnumerator::get_Current()
    IL_0011:  call       object [mscorlib]System.Runtime.CompilerServices.RuntimeHelpers::GetObjectValue(object)
    IL_0016:  stloc.0
    IL_0017:  ldarg.1
    IL_0018:  ldloc.0
    IL_0019:  call       object [mscorlib]System.Runtime.CompilerServices.RuntimeHelpers::GetObjectValue(object)
    IL_001e:  callvirt   instance void class [mscorlib]System.Action`1<object>::Invoke(!0)
    IL_0023:  nop
    IL_0024:  nop
    IL_0025:  ldloc.1
    IL_0026:  callvirt   instance bool [mscorlib]System.Collections.IEnumerator::MoveNext()
    IL_002b:  stloc.2
    IL_002c:  ldloc.2
    IL_002d:  brtrue.s   IL_000b
    IL_002f:  nop
    IL_0030:  leave.s    IL_0050
  }  // end .try
  finally
  {
    IL_0032:  ldloc.1
    IL_0033:  isinst     [mscorlib]System.IDisposable
    IL_0038:  ldnull
    IL_0039:  ceq
    IL_003b:  ldc.i4.0
    IL_003c:  ceq
    IL_003e:  stloc.2
    IL_003f:  ldloc.2
    IL_0040:  brfalse.s  IL_004e
    IL_0042:  ldloc.1
    IL_0043:  isinst     [mscorlib]System.IDisposable
    IL_0048:  callvirt   instance void [mscorlib]System.IDisposable::Dispose()
    IL_004d:  nop
    IL_004e:  nop
    IL_004f:  endfinally
  }  // end handler
  IL_0050:  nop
  IL_0051:  ret
} // end of method ForEachExamples::ForEach

Wow! That is a lot of IL for such a short piece of code! I have pasted the entire method this time, including its signature, because it takes two arguments that the IL refers to. You see no less than two extra local variables that were created by the compiler: an IEnumerator and a Boolean (bool in IL and C#). Also, we see a Try Finally block that I really didn't put there in my code. The first thing that happens is a call to the GetEnumerator method on the IEnumerable, which is pushed on the stack by ldarg.0 (or 'load argument 0', where argument means a parameter that was passed to the method). We then see a weird opcode, br.s. Whenever you see an opcode that starts with br this usually means BRanch. It is followed by an address such as IL_0025. br.s means that the code will continue executing at the specified address (a jump or GoTo if you will). If we follow this path we see that a call to MoveNext is made on the enumerator. The result, a Boolean, is stored in local variable 2. We should now be able to guess what brtrue.s means: BRanch if TRUE to the specified address. We seek out the address IL_000b and end up right after the br.s we started from. A call to get_Current (a Property!) is made. We are going to ignore the next line, the call to GetObjectValue, which returns a copy of the Object if it is a boxed value type. VB generated IL differs from C# generated IL on this point; C# never makes the call to GetObjectValue. Next we call Invoke on the delegate we passed to the method (the Object from the call to get_Current is on the stack and is passed to the Invoke method). MoveNext is called again and the loop starts again.

When MoveNext returns false, leave.s exits the try block and the finally block runs. Here the IEnumerator is checked for type IDisposable. The isinst opcode tests whether an Object is an instance of the specified type: it pushes the Object cast to that type or, if the cast is not possible, null. If the IEnumerator implements IDisposable then Dispose is called (to clean up resources) and the method is finished executing.

Now that we have stepped through the IL line by line we should actually be able to translate that IL back into VB or C#! That is exactly what I have done. Take a look at the following code and also notice the slight difference between VB and C#.

Public Shared Sub ForEachRewritten(ByVal l As IEnumerable, ByVal handler As Action(Of Object))
   Dim e As IEnumerator
   Try
      Dim obj As Object
      e = l.GetEnumerator
      Do While e.MoveNext
         obj = e.Current
         handler.Invoke(obj)
      Loop
   Finally
      If TryCast(e, IDisposable) IsNot Nothing Then
         DirectCast(e, IDisposable).Dispose()
      End If
   End Try
End Sub

Now compare the IL that was generated by function ForEach and by function ForEachRewritten, they are exactly the same! That's pretty neat, isn't it? There is quite some stuff going on that you didn't know about! Who would have thunk it?
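
For comparison, this is roughly how the same rewrite could look in C#. Treat it as my own sketch rather than the code from the download: the class name ForEachSketch is made up, and the placement of GetEnumerator before the try block reflects how, as far as I know, the C# compiler expands foreach.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class ForEachSketch
{
    // The version you would normally write.
    public static void ForEach(IEnumerable items, Action<object> handler)
    {
        foreach (object item in items)
        {
            handler(item);
        }
    }

    // A hand-written approximation of what the compiler turns it into.
    public static void ForEachRewritten(IEnumerable items, Action<object> handler)
    {
        IEnumerator e = items.GetEnumerator();
        try
        {
            while (e.MoveNext())
            {
                object item = e.Current;
                handler(item);
            }
        }
        finally
        {
            // Comparable to the isinst check in the IL: null if e is not IDisposable.
            IDisposable disposable = e as IDisposable;
            if (disposable != null)
            {
                disposable.Dispose();
            }
        }
    }

    public static void Main()
    {
        var seen = new List<object>();
        ForEachRewritten(new ArrayList { 1, "two", 3.0 }, seen.Add);
        Console.WriteLine(seen.Count); // 3
    }
}
```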

I have done the same for GenericForEach and GenericForEachRewritten, which use an IEnumerable(Of T) (IEnumerable<T> in C#). You may explore their respective IL at your own leisure. You can check if the different functions really have the same output by starting the application and clicking the 'For each'-button. You now get a Form with four buttons which each execute one of the ForEach functions and print their output to the TextBoxes.

4.4. The case of Lambda Expressions

4.4.1. An easy level example

Now you should have gotten the hang of it. Let's look at another example of how the compiler generates something pretty different from what you typed. Lambda Expressions (a sort of anonymous function) are a great example of what the compiler can do! Open up the LambdaExamples folder in the solution and look for the EasyLambdaButtonFactory. What we are going to do is create a set of Buttons and assign a lambda expression to the Button.Click Event. Take a look at the following code.

Public Function GenerateButtons() As System.Collections.Generic.IEnumerable(Of System.Windows.Forms.Button) Implements IButtonFactory.GenerateButtons
   Dim list As New List(Of Button)
   For i As Integer = 1 To 10
      Dim btn As New Button
      btn.Text = "1"
      AddHandler btn.Click,
         Sub(sender, e)
            Dim senderBtn As Button = DirectCast(sender, Button)
            senderBtn.Text = (Convert.ToInt32(senderBtn.Text) + 1).ToString
         End Sub
      list.Add(btn)
   Next
   Return list
End Function

So what do we see here? In a loop we create ten Buttons. We assign the string value "1" to each Button's Text Property. Whenever the Button is clicked we cast the sender to a Button, convert the Button's Text Property to an Integer, add 1 to it and assign the new value to the Button's Text Property. The result should be that each time you click a Button its Text is incremented by one. You can see this for yourself on the Easy lambda form by starting the application.
Let's look at the IL again. Actually we can see some weird stuff has happened just by looking at the Class in ILDASM. It got an extra method that we did not implement!

Now where did that extra Shared (static in C#) function come from? That is our lambda expression! Just look at the IL of that thing.

.method private specialname static void  _Lambda$__4(object sender,
                                                     class [mscorlib]System.EventArgs e) cil managed
{
  .custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = ( 01 00 00 00 ) 
  // Code size       38 (0x26)
  .maxstack  3
  .locals init ([0] class [System.Windows.Forms]System.Windows.Forms.Button senderBtn,
           [1] int32 VB$t_i4$S0)
  IL_0000:  nop
  IL_0001:  ldarg.0
  IL_0002:  castclass  [System.Windows.Forms]System.Windows.Forms.Button
  IL_0007:  stloc.0
  IL_0008:  ldloc.0
  IL_0009:  ldloc.0
  IL_000a:  callvirt   instance string [System.Windows.Forms]System.Windows.Forms.ButtonBase::get_Text()
  IL_000f:  call       int32 [mscorlib]System.Convert::ToInt32(string)
  IL_0014:  ldc.i4.1
  IL_0015:  add.ovf
  IL_0016:  stloc.1
  IL_0017:  ldloca.s   VB$t_i4$S0
  IL_0019:  call       instance string [mscorlib]System.Int32::ToString()
  IL_001e:  callvirt   instance void [System.Windows.Forms]System.Windows.Forms.ButtonBase::set_Text(string)
  IL_0023:  nop
  IL_0024:  nop
  IL_0025:  ret
} // end of method EasyLambdaButtonFactory::_Lambda$__4

It gets the arguments Object sender and EventArgs e, it casts the sender to a Button, gets the Text and converts it to an Integer, adds one to it (using the add.ovf opcode, which also checks for overflow) and assigns the result to the Text Property of the Button. That can't be a coincidence!
So how is this thing called in the function we DID implement?
Well, here you have it.

IL_001d:  ldftn      void UnderTheHoodVB.Examples.LambdaExamples.EasyLambdaButtonFactory::_Lambda$__4(object,
                                                                                                        class [mscorlib]System.EventArgs)
IL_0023:  newobj     instance void [mscorlib]System.EventHandler::.ctor(object,
                                                                          native int)
IL_0028:  callvirt   instance void [System.Windows.Forms]System.Windows.Forms.Control::add_Click(class [mscorlib]System.EventHandler)

A pointer to the generated function is pushed onto the stack (using the ldftn opcode). A new instance of an EventHandler delegate is created and the pointer is passed to the constructor. The EventHandler is added to the list of listeners for the Button's Click Event. We might as well have coded our own Shared (static in C#) function and used AddressOf (+= operator in C#).
So why don't we? Well, first of all it is not very readable to have a lot of Shared functions sitting around in our Class that are used in just one place, and second because we can do very nifty stuff using lambda expressions, as we will see in the next example.
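
To show that the two really come down to the same thing, here is a small C# sketch that wires up one handler as a lambda and one as a hand-written static function. FakeButton is a hypothetical, minimal stand-in for the WinForms Button so that the sketch runs without a reference to System.Windows.Forms; IncrementText plays the role of the generated _Lambda$__4.

```csharp
using System;

// Hypothetical minimal stand-in for a WinForms Button.
public class FakeButton
{
    public string Text { get; set; }
    public event EventHandler Click;

    public void PerformClick()
    {
        EventHandler handler = Click;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}

public static class EasyLambdaSketch
{
    // Hand-written counterpart of the compiler-generated lambda method.
    private static void IncrementText(object sender, EventArgs e)
    {
        FakeButton senderBtn = (FakeButton)sender;
        senderBtn.Text = (Convert.ToInt32(senderBtn.Text) + 1).ToString();
    }

    public static FakeButton WithLambda()
    {
        var btn = new FakeButton { Text = "1" };
        btn.Click += (sender, e) =>
        {
            FakeButton senderBtn = (FakeButton)sender;
            senderBtn.Text = (Convert.ToInt32(senderBtn.Text) + 1).ToString();
        };
        return btn;
    }

    public static FakeButton WithNamedMethod()
    {
        var btn = new FakeButton { Text = "1" };
        btn.Click += new EventHandler(IncrementText);
        return btn;
    }

    public static void Main()
    {
        FakeButton a = WithLambda();
        FakeButton b = WithNamedMethod();
        a.PerformClick();
        b.PerformClick();
        Console.WriteLine(a.Text + " " + b.Text); // 2 2
    }
}
```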

By the way, you might have noticed that the C# compiler also created a field called something like CachedAnonymousDelegate. Want to know what's up with that? This is a small performance optimization. In order to prevent creating multiple delegates the C# compiler creates one, stores it in that field and re-uses it instead of creating a new delegate every time.

4.4.2. A medium level example

So it is time to open up the MediumLambdaButtonFactory. This lambda is just slightly different from the first one. Let's see.

AddHandler btn.Click,
   Sub(sender, e)
      btn.Text = (Convert.ToInt32(btn.Text) + 1).ToString
   End Sub

What is the trick? We have used the btn variable in the handler, even though it is declared outside the lambda expression (had we created a Shared function we would not have had access to the btn variable)! So let's check ILDASM again.

Holy cow! The compiler created an entire new type called _Closure$__5! What's that? It holds the Button that was outside the scope of the function as a field and has a function called _Lambda$__9. I don't think it's necessary to take a look at the IL of that _Lambda$__9 function. It simply does what the previous lambda example did, except this time it doesn't cast the sender to a Button; instead it uses the btn field. What is interesting is to look at the IL of GenerateButtons. Unfortunately I can't post it here since the emitted lines become too wide for the average monitor. However, what we see in the IL is that a new instance of _Closure$__5 is created and that the Button is assigned to _Closure$__5.$VB$Local_btn. To assign the function to the Button.Click Event the same code is emitted as in the example above, except the delegate now takes a pointer to _Closure$__5._Lambda$__9.
So why did the compiler create an inner type for this function? As I said, the btn variable is declared outside the function's scope. In this case the btn variable would actually go out of scope as soon as the next iteration of the For loop starts, but the function that is created by our lambda expression stays alive for as long as the Button that the btn variable points to does, or until the Click Handler is removed. So the compiler must find a way to keep that btn variable alive for as long as the delegate is alive. It does this by wrapping the btn variable in a new Type and keeping a reference to an instance of that Type through the Button.Click Event. So while this example does exactly the same as the previous example (you can check it in the medium lambda form) the emitted IL is quite a bit different!
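
If you would write such a closure type yourself it could look something like the C# sketch below. The names are hypothetical: FakeButton is a minimal stand-in for the WinForms Button, ButtonClosure plays the role of _Closure$__5 and OnClick the role of _Lambda$__9.

```csharp
using System;

// Hypothetical minimal stand-in for a WinForms Button.
public class FakeButton
{
    public string Text { get; set; }
    public event EventHandler Click;

    public void PerformClick()
    {
        EventHandler handler = Click;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}

// Hand-written counterpart of the compiler-generated closure type.
public class ButtonClosure
{
    public FakeButton btn; // the captured variable lives on as a field

    public void OnClick(object sender, EventArgs e)
    {
        btn.Text = (Convert.ToInt32(btn.Text) + 1).ToString();
    }
}

public static class MediumLambdaSketch
{
    public static FakeButton[] GenerateButtons(int count)
    {
        var buttons = new FakeButton[count];
        for (int i = 0; i < count; i++)
        {
            // One closure instance per iteration, because every iteration
            // has its own btn variable.
            var closure = new ButtonClosure();
            closure.btn = new FakeButton { Text = "1" };
            // The delegate keeps the closure (and thereby btn) alive.
            closure.btn.Click += new EventHandler(closure.OnClick);
            buttons[i] = closure.btn;
        }
        return buttons;
    }

    public static void Main()
    {
        FakeButton[] buttons = GenerateButtons(2);
        buttons[0].PerformClick();
        Console.WriteLine(buttons[0].Text + " " + buttons[1].Text); // 2 1
    }
}
```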

4.4.3. A hard level example

You might have guessed, but what happens if variables from multiple scopes are used in the same lambda expression? The compiler actually creates an inner type for each level of scope! So let's look at the next example, in which we are going to introduce a counter outside the For Loop that is shared among all Button Click Events.

Public Function GenerateButtons() As System.Collections.Generic.IEnumerable(Of System.Windows.Forms.Button) Implements IButtonFactory.GenerateButtons
   Dim list As New List(Of Button)
   Dim counter As Integer = 1
   For i As Integer = 1 To 10
      Dim btn As New Button
      btn.Text = counter.ToString
      AddHandler btn.Click,
         Sub(sender, e)
            counter += 1
            btn.Text = counter.ToString
         End Sub
      list.Add(btn)
   Next
   Return list
End Function

So what we see here is that the btn variable is still within the scope of the For Loop, but the counter is outside the scope of the For Loop and thus shared by all Buttons. This means that if you click one button its Text changes to "2" and if you then click another button its Text changes to "3" (because the first button already incremented the counter). You can see this effect in the hard lambda form.
Once again we will take a look at ILDASM to see what was created for us by the compiler.

That's a type inside a type inside a type that was created for you... The innermost type (_Closure$__2) holds a reference to its outer type (_Closure$__1). Why is that? Well, the lambda needs a reference to a btn variable, which is unique for each Event Handler, and a reference to the counter variable, which is shared between all Event Handlers. So each Button's Click Event Handler will have a reference to a unique instance of _Closure$__2, and all of those instances hold a reference to the same instance of _Closure$__1, which holds the counter variable. If you look at ILDASM with the C# project you will see that there is no inner-inner type, just two inner types. Besides that small difference everything else still holds true for C#. Once again I won't show any IL code, because it wouldn't fit the page. You can look at it yourself. It's quite a bit, but don't be discouraged! Simply read it line by line and you will get it. We will look at the VB and C# equivalents in a minute, so if it isn't clear to you now it will be after the next example.
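
The difference between the two scopes can also be observed without any Buttons. The following console sketch is an assumption-laden miniature of the example (SharedCounterSketch, BuildHandlers and the label variable are made-up names): counter is declared once outside the loop, label is declared anew in every iteration, and both are used in the same lambda.

```vb
Option Strict On
Option Infer On
Imports System
Imports System.Collections.Generic

Module SharedCounterSketch

   Public Function BuildHandlers() As List(Of Func(Of String))
      Dim handlers As New List(Of Func(Of String))
      ' One copy for the whole function: shared by every lambda.
      Dim counter As Integer = 1
      For i As Integer = 1 To 3
         ' One copy per iteration: unique for each lambda.
         Dim label As String = "Button" & i.ToString()
         handlers.Add(Function()
                         counter += 1
                         Return label & ": " & counter.ToString()
                      End Function)
      Next
      Return handlers
   End Function

   Sub Main()
      Dim handlers = BuildHandlers()
      Console.WriteLine(handlers(0).Invoke())   ' Button1: 2
      Console.WriteLine(handlers(2).Invoke())   ' Button3: 3
      Console.WriteLine(handlers(0).Invoke())   ' Button1: 4
   End Sub

End Module
```

Every handler remembers its own label, yet they all increment the very same counter, which is why the compiler needs one closure instance per iteration plus one shared closure instance for the outer scope.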

4.4.4. An insane level example
Don't let that title scare you off. The only thing that makes this example slightly more difficult than the previous one is that a new level of scope was added to the lambda. I have created an additional counter called _outerCounter as a field in the InsaneLambdaButtonFactory.

Private _outerCounter As Integer = 1
 
Public Function GenerateButtons() As System.Collections.Generic.IEnumerable(Of System.Windows.Forms.Button) Implements IButtonFactory.GenerateButtons
   Dim list As New List(Of Button)
   Dim counter As Integer = 1
   For i As Integer = 1 To 10
      Dim btn As New Button
      btn.Text = _outerCounter.ToString + " - " + counter.ToString
      AddHandler btn.Click,
         Sub(sender, e)
            counter += 1
            If counter Mod 10 = 0 Then
               _outerCounter += 1
            End If
            btn.Text = _outerCounter.ToString + " - " + counter.ToString
         End Sub
      list.Add(btn)
   Next
   Return list
End Function

So as you see the _outerCounter variable is used inside the lambda and is shared by all Buttons (much like the counter variable). There is a difference though. _outerCounter might be changed by something other than a button click. So another scope, another inner type? Nope! In this case the first inner type holds a reference to the instance of the object that created it.

Pretty smart, eh? The function in _Closure$__4 now has access to the instance of the ButtonFactory that created it through the reference to _Closure$__3. This way the ButtonFactory and the lambda function both look at the same _outerCounter. The _outerCounter is incremented by one if the counter that is shared by just the Buttons is incremented ten times. If you want to see how it works go ahead and open up the insane lambda form and click twenty times on whichever buttons you want.

So that is pretty neat, but wouldn't it be clearer if you could see some of this in VB or C#? Well, it's your lucky day! I have studied the IL for the insane lambda example and made a Class that creates the exact same IL (save for some name changes). Go take a look at InsaneLambdaButtonFactoryRewritten and compare the emitted IL to that of the InsaneLambdaButtonFactory. Also compare the VB variant to the C# variant to spot some minor differences.

Public Class InsaneLambdaButtonFactoryRewritten
   Implements IButtonFactory
 
   Private outerCounter As Integer = 1
 
   Public Function GenerateButtons() As _
          System.Collections.Generic.IEnumerable(Of System.Windows.Forms.Button) _
          Implements IButtonFactory.GenerateButtons
      Dim iLambda As New InnerLambda
      iLambda.Field_Me = Me
      Dim list As New List(Of Button)
      Dim iILambda As InnerLambda.InnerInnerLambda
      iLambda.Local_counter = 1
      For i As Integer = 1 To 10
         iILambda = New InnerLambda.InnerInnerLambda(iILambda)
         iILambda.NonLocal_Inner_InnerLambda = iLambda
         iILambda.Local_btn = New Button
         iILambda.Local_btn.Text = outerCounter.ToString + " - " + iLambda.Local_counter.ToString
         AddHandler iILambda.Local_btn.Click, AddressOf iILambda.EventHandler
         list.Add(iILambda.Local_btn)
      Next
      Return list
   End Function
 
   Public Class InnerLambda
 
      Public Local_counter As Integer
      Public Field_Me As InsaneLambdaButtonFactoryRewritten
 
      Public Sub New()
      End Sub
 
      Public Sub New(ByVal innerLambda As InnerLambda)
         If Not innerLambda Is Nothing Then
            Field_Me = innerLambda.Field_Me
            Local_counter = innerLambda.Local_counter
         End If
      End Sub
 
      Public Class InnerInnerLambda
 
         Public Local_btn As Windows.Forms.Button
         Public NonLocal_Inner_InnerLambda As InnerLambda
 
         Public Sub New()
         End Sub
 
         Public Sub New(ByVal innerInnerLambda As InnerInnerLambda)
            If Not innerInnerLambda Is Nothing Then
               Local_btn = innerInnerLambda.Local_btn
            End If
         End Sub
 
         Public Sub EventHandler(ByVal sender As Object, ByVal e As EventArgs)
            NonLocal_Inner_InnerLambda.Local_counter = NonLocal_Inner_InnerLambda.Local_counter + 1
            If NonLocal_Inner_InnerLambda.Local_counter Mod 10 = 0 Then
               NonLocal_Inner_InnerLambda.Field_Me.outerCounter = _
                        NonLocal_Inner_InnerLambda.Field_Me.outerCounter + 1
            End If
            Local_btn.Text = NonLocal_Inner_InnerLambda.Field_Me.outerCounter.ToString + _
                             " - " + NonLocal_Inner_InnerLambda.Local_counter.ToString
         End Sub
 
      End Class
 
   End Class
 
End Class

As you can see there is no sign of a lambda expression. You should be able to debug this code and see what it does. The lambda part was moved to the EventHandler method in the InnerInnerLambda (InnerLambda2 in C#). This Class also holds the btn variable and a reference to InnerLambda (or InnerLambda1 in C#). The InnerLambda takes care of the counter variable and holds a reference to the ButtonFactory for the outerCounter field. All of the variables are set in the original function that creates the buttons. In fact you can see that the captured variables have been removed from this function and replaced by fields on InnerLambda and InnerInnerLambda.

You can check that the InsaneLambdaButtonFactoryRewritten really does the same as the InsaneLambdaButtonFactory by running the application and opening the insane lambda rewritten Form.

You can also experiment with this yourself. Try nesting even further using nested For Each loops and If Then Else Statements. You now know how it works!

Further reading:
Lambda Expressions (Visual Basic) on MSDN
Anonymous functions (C# Programming Guide) on MSDN

A couple of blogs on the subject:
Anonymous Methods, Part 1 of ?
Anonymous methods as event handlers - Part 1
The implementation of anonymous methods in C# and its consequences (part 1)

I now encourage you to take a look at the lecture of Bart de Smet. He also takes a look at lambda expressions and additionally explains how they could cause memory leaks if you are not careful. It is actually the first topic he talks about so you can just start the video, sit back and relax.
Bart de Smet - Behind the Scenes of 10 C# Language Features

4.5. The case of Anonymous Types

Let's move on to the next VB and C# construct I have prepared for you: anonymous types. You can find the examples in the AnonymousTypeExamples folder in your solution. What we are going to do is create a collection of Person objects and select a subset of Properties, which will create a so-called anonymous type. So let's look at the first example. Look at either the FormalNamePeopleFactory or the NickNamePeopleFactory. It does not matter which one we look at first, so I'll go with the FormalNamePeopleFactory. Here is the code for it.

Public Function GeneratePeople() As System.Collections.IList Implements IPeopleFactory.GeneratePeople
   Return PeopleHelper.GetPeople.Select(Function(p) New With {.FullName = p.LastName + ", " + p.FirstName, .Age = p.Age}).ToList
End Function

That's not a lot of code, but a lot is going on that you don't know about (but will know about in a few moments). First, let's see what this code actually does. PeopleHelper.GetPeople simply creates a collection of Person objects. We then call the Select function, which is an Extension Method on IEnumerable(Of T) (IEnumerable<T> in C#). You can see we are creating a new Object because both the VB and C# example have the New Keyword. However, instead of naming a Type, such as New Person, we are using that With Keyword again (see the earlier With example in this article). We then define a set of Properties that do not exist on any Type yet and assign a value to them.

Let's look at the IL that was generated for this function.

.method private specialname static class VB$AnonymousType_0`2<string,int32> 
        _Lambda$__2(class UnderTheHoodVB.Examples.Person p) cil managed
{
  .custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = ( 01 00 00 00 ) 
  // Code size       38 (0x26)
  .maxstack  3
  .locals init ([0] class VB$AnonymousType_0`2<string,int32> _Lambda$__2,
           [1] class VB$AnonymousType_0`2<string,int32> VB$t_ref$S0)
  IL_0000:  ldarg.0
  IL_0001:  callvirt   instance string UnderTheHoodVB.Examples.Person::get_LastName()
  IL_0006:  ldstr      ", "
  IL_000b:  ldarg.0
  IL_000c:  callvirt   instance string UnderTheHoodVB.Examples.Person::get_FirstName()
  IL_0011:  call       string [mscorlib]System.String::Concat(string,
                                                              string,
                                                              string)
  IL_0016:  ldarg.0
  IL_0017:  callvirt   instance int32 UnderTheHoodVB.Examples.Person::get_Age()
  IL_001c:  newobj     instance void class VB$AnonymousType_0`2<string,int32>::.ctor(!0,
                                                                                     !1)
  IL_0021:  stloc.0
  IL_0022:  br.s       IL_0024
  IL_0024:  ldloc.0
  IL_0025:  ret
} // end of method FormalNamePeopleFactory::_Lambda$__2

Of course we have used a lambda expression, so we should look at the IL in the generated _Lambda$__2 function. As you can see this is a function that returns a VB$AnonymousType_0`2<string, int32> (it's in the topmost line). We see that the parts of the full name (p.LastName, ", " and p.FirstName) are pushed on the stack and concatenated. Then p.Age is pushed on the stack as well and a new instance of an AnonymousType(Of T1, T2) (AnonymousType<T1, T2> in C#) is created where T1 is a string (the FullName Property) and T2 is an int32 (the Age Property). So where did this AnonymousType(Of T1, T2) come from and why is it Generic?
When you check ILDASM you can actually see the AnonymousType sitting in the Global Namespace.

So the compiler actually creates a new type for you (making the anonymous type a lot less anonymous under the hood). So that explains where the AnonymousType came from, but not why it is Generic or why it is sitting in the Global Namespace and not just right next to the function where it is used (perhaps even as another inner type).
That second part can be explained by looking at the other example, NickNamePeopleFactory. So let's look at the code.

Public Function GeneratePeople() As System.Collections.IList Implements IPeopleFactory.GeneratePeople
   Return PeopleHelper.GetPeople.Select(Function(p) New With {.FullName = p.FirstName + " " + p.LastName, .Age = p.Age}).ToList
End Function

As you can see this function does almost exactly the same, except the FullName Property is formatted slightly differently. Since this is another function in another Class you would expect the compiler to simply create another anonymous type (after all, it does that for lambdas too). This is not the case however. When we look at the IL code of this function we can see the following.

  .locals init ([0] class VB$AnonymousType_0`2<string,int32> _Lambda$__3,
           [1] class VB$AnonymousType_0`2<string,int32> VB$t_ref$S0)
  IL_0000:  ldarg.0
  IL_0001:  callvirt   instance string UnderTheHoodVB.Examples.Person::get_FirstName()
  IL_0006:  ldstr      " "
  IL_000b:  ldarg.0
  IL_000c:  callvirt   instance string UnderTheHoodVB.Examples.Person::get_LastName()
  IL_0011:  call       string [mscorlib]System.String::Concat(string,
                                                              string,
                                                              string)
  IL_0016:  ldarg.0
  IL_0017:  callvirt   instance int32 UnderTheHoodVB.Examples.Person::get_Age()
  IL_001c:  newobj     instance void class VB$AnonymousType_0`2<string,int32>::.ctor(!0,
                                                                                     !1)
  IL_0021:  stloc.0
  IL_0022:  br.s       IL_0024
  IL_0024:  ldloc.0
  IL_0025:  ret
} // end of method NickNamePeopleFactory::_Lambda$__3

It looks much like the IL that the previous method generated, although we can see a difference in the formatting of the FullName Property. But that's really the only difference we see! The same anonymous type is used for this function! Now suppose the anonymous type had been generated as an inner type of the Class that uses it, and that Class was a Private Inner Class. Then obviously the NickNamePeopleFactory would not have access to it and a new AnonymousType would have to be generated. Apparently it is cheaper for the compiler to reuse an already existing AnonymousType than to generate a new one.

So why, then, is it Generic? Because the AnonymousType is in the Global Namespace it does not have access to any Private Types, but because the AnonymousType is Generic it never actually references any Private Types and as such it could be reused with any Type you could possibly think of. Let's look at the BuggedPeopleFactory.

Public Function GeneratePeople() As System.Collections.IList Implements IPeopleFactory.GeneratePeople
   Return PeopleHelper.GetPeople.Select(Function(p) New With {.FullName = p.Age, .Age = p.FirstName}).ToList
End Function

As you can see some nitwit programmer (in this case me) switched Age and FullName so Age now displays the first name and FullName displays the age! That means FullName is no longer a string and Age is no longer an int32. Yet when we look at the generated IL we can see that the same AnonymousType is used.

.method private specialname static class VB$AnonymousType_0`2<int32,string> 
        _Lambda$__1(class UnderTheHoodVB.Examples.Person p) cil managed
{
  .custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = ( 01 00 00 00 ) 
  // Code size       22 (0x16)
  .maxstack  2
  .locals init ([0] class VB$AnonymousType_0`2<int32,string> _Lambda$__1,
           [1] class VB$AnonymousType_0`2<int32,string> VB$t_ref$S0)
  IL_0000:  ldarg.0
  IL_0001:  callvirt   instance int32 UnderTheHoodVB.Examples.Person::get_Age()
  IL_0006:  ldarg.0
  IL_0007:  callvirt   instance string UnderTheHoodVB.Examples.Person::get_FirstName()
  IL_000c:  newobj     instance void class VB$AnonymousType_0`2<int32,string>::.ctor(!0,
                                                                                     !1)
  IL_0011:  stloc.0
  IL_0012:  br.s       IL_0014
  IL_0014:  ldloc.0
  IL_0015:  ret
} // end of method BuggedPeopleFactory::_Lambda$__1

How about that? It simply returns the same AnonymousType, but with different Generic parameters. And what if we had used a Private Type that another method using the same AnonymousType does not have access to? It can still use the same AnonymousType, just with other Generic parameters! It really is a work of beauty!

But why and when are anonymous types reused anyway? They are reused when the number, name and order of the Properties on the anonymous type are the same. For example, try switching FullName and Age around in one of the functions and you will see a second anonymous type being created in ILDASM. You could also spell FulName with a single L in one of the functions and you will likewise see a new anonymous type being generated. The reason they are reused is so you can have two lists of anonymous types that represent the same Object and still compare them (if they had been entirely different Types a comparison would always return False). Do note that in VB only Properties marked with the Key Keyword take part in such a comparison, whereas in C# all Properties do.
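
These reuse rules can be checked at runtime by comparing the Type objects of a few anonymous instances. The sketch below is my own small illustration and not part of the sample solution (AnonymousTypeSketch and CompareTypes are hypothetical names). It assumes all instances are created in the same assembly, as they are here.

```vb
Option Strict On
Option Infer On
Imports System

Module AnonymousTypeSketch

   ' Returns the four comparisons as text so they are easy to print or check.
   Public Function CompareTypes() As String
      Dim formal = New With {.FullName = "Doe, John", .Age = 30}
      Dim nick = New With {.FullName = "John Doe", .Age = 30}
      Dim swapped = New With {.Age = 30, .FullName = "Doe, John"}
      Dim bugged = New With {.FullName = 30, .Age = "John"}

      ' Same names, same order, same Property types: the very same Type.
      Dim sameShape As Boolean = formal.GetType() Is nick.GetType()
      ' Same names in another order: a second anonymous type.
      Dim sameWhenSwapped As Boolean = formal.GetType() Is swapped.GetType()
      ' Same names and order, other Property types: other Generic parameters...
      Dim sameClosedType As Boolean = formal.GetType() Is bugged.GetType()
      ' ...but still the same Generic type definition underneath.
      Dim sameDefinition As Boolean =
         formal.GetType().GetGenericTypeDefinition() Is bugged.GetType().GetGenericTypeDefinition()

      Return String.Format("{0},{1},{2},{3}", sameShape, sameWhenSwapped, sameClosedType, sameDefinition)
   End Function

   Sub Main()
      Console.WriteLine(CompareTypes())   ' True,False,False,True
   End Sub

End Module
```

The last two results mirror what the BuggedPeopleFactory showed in ILDASM: the constructed types differ, but both are built from one and the same Generic AnonymousType.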

By the way, did you notice all the functions in the example return an IList? That is so I can bind to the anonymous type that is returned by the functions. You can see this in action in the Forms that are in the GroupBox labeled 'Anonymous type examples'.

Further reading:
Anonymous Types (Visual Basic) on MSDN
Anonymous Types (C# Programming Guide) on MSDN
Why are anonymous types generic?

And once again I also want to point you at the lecture by Bart de Smet. He explains a thing or two about anonymous types at around 45:30 mins.
Bart de Smet - Behind the Scenes of 10 C# Language Features

4.6. The case of Cases

The next thing I want to talk about is the Select Case Statement (switch Statement in C#). There is some magic going on here which is explained very well by Bart de Smet at around 13:40 mins. I recommend you watch this part before continuing. At this point I have some sad news for the C#ers who are reading this article. The next section is VB only (but of course you're very welcome to read it too). Why VB only? VB has a special kind of Select Case, one where things are turned around quite a bit. A regular Select Case compares a value to other values and executes the Case where the two values are the same (or the Case Else). In this Select Case, however, the first Case whose expression evaluates to True is executed. Let's first look at an example of a regular case in VB. You can find the code under the SelectCaseExample in the VB solution.

Public Sub DoACase()
   Dim i As Integer = 10
 
   Select Case i
      Case 1
         Console.WriteLine("i = 1")
      Case 2
         Console.WriteLine("i = 2")
      Case 3
         Console.WriteLine("i = 3")
      Case Else
         Console.WriteLine("i is something else.")
   End Select
End Sub

As you can see i is compared to 1, 2 and 3 and the code inside a Case is executed only when that comparison returns True. Now let's turn things around.

Public Sub DoATrueCase()
   Dim i As Integer = 10
 
   Select Case True
      Case i = 1
         Console.WriteLine("i = 1")
      Case i = 2
         Console.WriteLine("i = 2")
      Case i = 3
         Console.WriteLine("i = 3")
      Case Else
         Console.WriteLine("i is something else.")
   End Select
End Sub

As you can see in this Select Case we test a couple of expressions (which could be anything as long as they return a Boolean) and compare their outcomes to True. However, let's look at their respective generated IL.

.method public instance void  DoACase() cil managed
{
  // Code size       86 (0x56)
  .maxstack  2
  .locals init ([0] int32 i,
           [1] int32 VB$t_i4$L0,
           [2] int32 VB$CG$t_i4$S0)
  IL_0000:  nop
  IL_0001:  ldc.i4.s   10
  IL_0003:  stloc.0
  IL_0004:  nop
  IL_0005:  ldloc.0
  IL_0006:  ldc.i4.1
  IL_0007:  sub
  IL_0008:  stloc.2
  IL_0009:  ldloc.2
  IL_000a:  switch     ( 
                        IL_001d,
                        IL_002b,
                        IL_0039)
  IL_001b:  br.s       IL_0047
  IL_001d:  nop
  IL_001e:  ldstr      "i = 1"
  IL_0023:  call       void [mscorlib]System.Console::WriteLine(string)
  IL_0028:  nop
  IL_0029:  br.s       IL_0053
  IL_002b:  nop
  IL_002c:  ldstr      "i = 2"
  IL_0031:  call       void [mscorlib]System.Console::WriteLine(string)
  IL_0036:  nop
  IL_0037:  br.s       IL_0053
  IL_0039:  nop
  IL_003a:  ldstr      "i = 3"
  IL_003f:  call       void [mscorlib]System.Console::WriteLine(string)
  IL_0044:  nop
  IL_0045:  br.s       IL_0053
  IL_0047:  nop
  IL_0048:  ldstr      "i is something else."
  IL_004d:  call       void [mscorlib]System.Console::WriteLine(string)
  IL_0052:  nop
  IL_0053:  nop
  IL_0054:  nop
  IL_0055:  ret
} // end of method SelectCaseExample::DoACase

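You can clearly see a Select Case being executed here (it's the switch opcode). Notice that 1 is subtracted from i first, because the switch opcode uses a zero-based jump table. So let's look at the second example where the Select Case does not compare values, but checks whether a given expression returns True.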

.method public instance void  DoATrueCase() cil managed
{
  // Code size       97 (0x61)
  .maxstack  3
  .locals init ([0] int32 i,
           [1] bool VB$t_bool$L0,
           [2] bool VB$CG$t_bool$S0)
  IL_0000:  nop
  IL_0001:  ldc.i4.s   10
  IL_0003:  stloc.0
  IL_0004:  nop
  IL_0005:  ldc.i4.1
  IL_0006:  stloc.1
  IL_0007:  nop
  IL_0008:  ldloc.1
  IL_0009:  ldloc.0
  IL_000a:  ldc.i4.1
  IL_000b:  ceq
  IL_000d:  ceq
  IL_000f:  stloc.2
  IL_0010:  ldloc.2
  IL_0011:  brfalse.s  IL_0020
  IL_0013:  ldstr      "i = 1"
  IL_0018:  call       void [mscorlib]System.Console::WriteLine(string)
  IL_001d:  nop
  IL_001e:  br.s       IL_005e
  IL_0020:  nop
  IL_0021:  ldloc.1
  IL_0022:  ldloc.0
  IL_0023:  ldc.i4.2
  IL_0024:  ceq
  IL_0026:  ceq
  IL_0028:  stloc.2
  IL_0029:  ldloc.2
  IL_002a:  brfalse.s  IL_0039
  IL_002c:  ldstr      "i = 2"
  IL_0031:  call       void [mscorlib]System.Console::WriteLine(string)
  IL_0036:  nop
  IL_0037:  br.s       IL_005e
  IL_0039:  nop
  IL_003a:  ldloc.1
  IL_003b:  ldloc.0
  IL_003c:  ldc.i4.3
  IL_003d:  ceq
  IL_003f:  ceq
  IL_0041:  stloc.2
  IL_0042:  ldloc.2
  IL_0043:  brfalse.s  IL_0052
  IL_0045:  ldstr      "i = 3"
  IL_004a:  call       void [mscorlib]System.Console::WriteLine(string)
  IL_004f:  nop
  IL_0050:  br.s       IL_005e
  IL_0052:  nop
  IL_0053:  ldstr      "i is something else."
  IL_0058:  call       void [mscorlib]System.Console::WriteLine(string)
  IL_005d:  nop
  IL_005e:  nop
  IL_005f:  nop
  IL_0060:  ret
} // end of method SelectCaseExample::DoATrueCase

Well, well, well! Not a single switch opcode can be found! What we have here is a lot of comparisons and branch (br) opcodes. What is actually being generated looks like an If Then ElseIf ElseIf Else Statement. Compare the IL of the previous example with the IL of the DoAnIfThenElseIf method. You will see some similarities. You might also see why C# does not support it. It is not a switch Statement, but it is also not as concise as If Then ElseIf Else. As for readability, I'll leave that up to you.
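
Because each Case in a Select Case True is just a Boolean expression, it can test things a regular Select Case on a single value cannot mix, such as ranges on different variables. The sketch below puts such a Select Case True next to the If chain it corresponds to. TrueCaseSketch, WithTrueCase and WithIfChain are hypothetical names of my own and the categories are arbitrary.

```vb
Option Strict On
Option Infer On
Imports System

Module TrueCaseSketch

   ' The first Case whose expression is True wins.
   Public Function WithTrueCase(ByVal i As Integer) As String
      Select Case True
         Case i < 0
            Return "negative"
         Case i = 0
            Return "zero"
         Case i < 10
            Return "small"
         Case Else
            Return "large"
      End Select
   End Function

   ' The same decisions, in the same order, as an If chain.
   Public Function WithIfChain(ByVal i As Integer) As String
      If i < 0 Then
         Return "negative"
      ElseIf i = 0 Then
         Return "zero"
      ElseIf i < 10 Then
         Return "small"
      Else
         Return "large"
      End If
   End Function

   Sub Main()
      For Each value In New Integer() {-5, 0, 7, 42}
         Console.WriteLine("{0}: {1} / {2}", value, WithTrueCase(value), WithIfChain(value))
      Next
   End Sub

End Module
```

Both functions always agree, and the order of the Cases matters just as the order of the ElseIf branches does.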

4.7. The case of Iterators

While the previous example was for VB readers, this example is actually for C# people. Iterator methods have been featured in C# for a while now. They are featured in VB in the VS Async CTP release and will be featured in VB11 by default (or so I was told). I will not explain this part in much detail since Bart de Smet does a great job at explaining it too. I am simply going to point out some stuff.
Let's first look at a code example using an Iterator. It can be found in the IteratorExample folder in your C# solution.

public static string UseTheIterator()
{
   StringBuilder sb = new StringBuilder();
   foreach (string s in EnumeratorFunction())
   {
      sb.AppendLine(s);
   }
   return sb.ToString();
}
 
private static IEnumerable<string> EnumeratorFunction()
{
   string hello = "Hello";
   yield return hello;
   hello += " people!";
   yield return "Iterator!";
}

What do you think the StringBuilder in UseTheIterator will return? "Hello people!" followed by "Iterator!"? On the Main form press the Iterator button and find out. You can see the returned text is "Hello" followed by "Iterator!". This is strange, because that would mean the hello variable was already returned to the calling method before " people!" was appended to it, yet "Iterator!" was returned as well. This is exactly what an Iterator does. The yield Keyword tells the function to return a value to the calling method and to continue executing where it left off when the caller asks for the next value. Just how does it do this? ILDASM has the answer.

Sweet mother of IL! The C# compiler generated a Type that implements both IEnumerable<string> and IEnumerator<string>! If we look at the IL of the original EnumeratorFunction we can see that it simply returns an instance of this generated type.

  .locals init ([0] class UnderTheHoodCSharp.Examples.IteratorExample.IteratorExample/'<EnumeratorFunction>d__0' V_0,
           [1] class [mscorlib]System.Collections.Generic.IEnumerable`1<string> V_1)
  IL_0000:  ldc.i4.s   -2
  IL_0002:  newobj     instance void UnderTheHoodCSharp.Examples.IteratorExample.IteratorExample/'<EnumeratorFunction>d__0'::.ctor(int32)
  IL_0007:  stloc.0
  IL_0008:  ldloc.0
  IL_0009:  stloc.1
  IL_000a:  br.s       IL_000c
  IL_000c:  ldloc.1
  IL_000d:  ret
} // end of method IteratorExample::EnumeratorFunction

So where is all the logic to return "Hello", append " people!" etc.? You can find it all in the MoveNext method of the generated type.

.method private hidebysig newslot virtual final 
        instance bool  MoveNext() cil managed
{
  .override [mscorlib]System.Collections.IEnumerator::MoveNext
  // Code size       142 (0x8e)
  .maxstack  3
  .locals init ([0] bool CS$1$0000,
           [1] int32 CS$4$0001)
  IL_0000:  ldarg.0
  IL_0001:  ldfld      int32 UnderTheHoodCSharp.Examples.IteratorExample.IteratorExample/'<EnumeratorFunction>d__0'::'<>1__state'
  IL_0006:  stloc.1
  IL_0007:  ldloc.1
  IL_0008:  switch     ( 
                        IL_001f,
                        IL_001b,
                        IL_001d)
  IL_0019:  br.s       IL_0021
  IL_001b:  br.s       IL_004d
  IL_001d:  br.s       IL_0080
  IL_001f:  br.s       IL_0023
  IL_0021:  br.s       IL_0088
  IL_0023:  ldarg.0
  IL_0024:  ldc.i4.m1
  IL_0025:  stfld      int32 UnderTheHoodCSharp.Examples.IteratorExample.IteratorExample/'<EnumeratorFunction>d__0'::'<>1__state'
  IL_002a:  nop
  IL_002b:  ldarg.0
  IL_002c:  ldstr      "Hello"
  IL_0031:  stfld      string UnderTheHoodCSharp.Examples.IteratorExample.IteratorExample/'<EnumeratorFunction>d__0'::'<hello>5__1'
  IL_0036:  ldarg.0
  IL_0037:  ldarg.0
  IL_0038:  ldfld      string UnderTheHoodCSharp.Examples.IteratorExample.IteratorExample/'<EnumeratorFunction>d__0'::'<hello>5__1'
  IL_003d:  stfld      string UnderTheHoodCSharp.Examples.IteratorExample.IteratorExample/'<EnumeratorFunction>d__0'::'<>2__current'
 // Etc...
  IL_0080:  ldarg.0
  IL_0081:  ldc.i4.m1
  IL_0082:  stfld      int32 UnderTheHoodCSharp.Examples.IteratorExample.IteratorExample/'<EnumeratorFunction>d__0'::'<>1__state'

So what can we see here? Each time MoveNext is called it switches on the <>1__state field, runs the piece of code that belongs to that state and stores a new state for the next call. Feels kind of 'dirty', doesn't it? Anyway, the compiler really does an excellent job in keeping such difficult stuff hidden from the programmer. It really is a piece of art!
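
To show the principle in readable code, here is a strongly simplified, hand-written state machine for the EnumeratorFunction, written in VB like most examples in this article. It is a sketch under my own assumptions and not a decompilation: the names IteratorSketch, HelloStateMachine and UseTheStateMachine are made up, the state numbers are chosen for readability, and for brevity the class does not implement IEnumerable(Of String) or IEnumerator(Of String) as the generated type does.

```vb
Option Strict On
Option Infer On
Imports System
Imports System.Text

Module IteratorSketch

   ' Hypothetical, simplified stand-in for the generated iterator type.
   Public Class HelloStateMachine
      Private _state As Integer = 0
      Private _current As String
      ' The local variable of the iterator method, hoisted to a field.
      Private _hello As String

      Public ReadOnly Property Current As String
         Get
            Return _current
         End Get
      End Property

      Public Function MoveNext() As Boolean
         Select Case _state
            Case 0
               ' Code up to the first yield return.
               _hello = "Hello"
               _current = _hello
               _state = 1
               Return True
            Case 1
               ' Code between the first and the second yield return.
               _hello &= " people!"
               _current = "Iterator!"
               _state = 2
               Return True
            Case Else
               ' Nothing left to yield.
               _state = -1
               Return False
         End Select
      End Function
   End Class

   ' Plays the role of the foreach loop in UseTheIterator.
   Public Function UseTheStateMachine() As String
      Dim sb As New StringBuilder()
      Dim machine As New HelloStateMachine()
      While machine.MoveNext()
         sb.AppendLine(machine.Current)
      End While
      Return sb.ToString()
   End Function

   Sub Main()
      Console.Write(UseTheStateMachine())
   End Sub

End Module
```

Running it prints "Hello" and "Iterator!" on separate lines. The " people!" part is appended only during the second MoveNext call, after "Hello" has already been handed to the caller, which is why it never shows up.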

As I said, you should really check out Bart de Smet's talk on Iterators. He starts about Iterators after about 36:30 mins.
Bart de Smet - Behind the Scenes of 10 C# Language Features

5. Emitting IL using VB or C#

As I already mentioned we can emit our own opcodes and generate IL on the fly using VB or C#! That is exactly what we are going to do here. However, we are not going to generate just any method, we are going to generate a method that makes use of a Try... Fault Block! This feature is not available in VB or C#, but it is available in IL. The Try... Fault Block looks like a Try... Catch with the difference that a Fault Block does not actually catch the Exception. It simply executes some code, but only when an Exception is thrown (and ALWAYS if an Exception is thrown, much like the Try... Finally Block). This is not as hard as it sounds, really. Open up the TypeFactory Class in the EmitExamples folder of your solution. When you open it you see a Public Shared (static in C#) function that returns a Type. The Type, however, is generated when the function is called for the first time. Let's see how the Type is created.
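
Before building the Type, it may help to see what a Fault Block roughly behaves like. Since VB has no Fault Block, the sketch below only approximates it with a Catch that rethrows the Exception using Throw. It is an illustration under that assumption and not what the TypeFactory emits. FaultSketch, Run and the logged strings are hypothetical names. One difference to keep in mind is that a Catch Block is a real Exception handler to the runtime, whereas a true Fault Block is not.

```vb
Option Strict On
Option Infer On
Imports System
Imports System.Collections.Generic

Module FaultSketch

   ' Returns the order in which the blocks ran, separated by commas.
   Public Function Run(ByVal shouldThrow As Boolean) As String
      Dim log As New List(Of String)
      Try
         Try
            log.Add("try")
            If shouldThrow Then Throw New InvalidOperationException("boom")
         Catch
            ' Approximation of a Fault Block: runs only when something went wrong...
            log.Add("fault")
            ' ...and lets the Exception travel on instead of swallowing it.
            Throw
         Finally
            ' Runs in both cases.
            log.Add("finally")
         End Try
      Catch ex As InvalidOperationException
         ' The caller still sees the Exception.
         log.Add("caught outside")
      End Try
      Return String.Join(",", log.ToArray())
   End Function

   Sub Main()
      Console.WriteLine(Run(False))   ' try,finally
      Console.WriteLine(Run(True))    ' try,fault,finally,caught outside
   End Sub

End Module
```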

First we must create an Assembly (or dll) to hold the Type. We can do this using an AssemblyBuilder. After that we create a Module, using a ModuleBuilder, that actually holds the Type we are going to create. With the ModuleBuilder we can get a TypeBuilder, which builds the Type and gives us access to MethodBuilders to define new methods on the Type. That is all fairly simple, right? Let's take a look at the code and it will become clear to you.

' Create a new Assembly using an AssemblyBuilder.
Dim domain As AppDomain = System.Threading.Thread.GetDomain()
Dim assmName As New AssemblyName("DynamicAssembly")
Dim dynamicAssmBuilder As AssemblyBuilder = domain.DefineDynamicAssembly(assmName, AssemblyBuilderAccess.RunAndSave)
 
' Create a new Module using a ModuleBuilder.
Dim dynamicModule As ModuleBuilder = dynamicAssmBuilder.DefineDynamicModule("DynamicModule", "DynamicModule.dll")
 
' Create a new type using a TypeBuilder.
Dim dynamicTypeBuilder As TypeBuilder = dynamicModule.DefineType("DynamicType", TypeAttributes.Public)
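
For readers who want to try this chain of builders in isolation first, here is a tiny self-contained sketch that goes all the way from an AssemblyBuilder to calling an emitted method. It is a side example of my own and not the TypeFactory from the solution: EmitSketch, BuildAndCallAdd, SketchAssembly, SketchModule and SketchType are hypothetical names. I also assume the Shared AssemblyBuilder.DefineDynamicAssembly method with AssemblyBuilderAccess.Run here, rather than the AppDomain call used above, so nothing is saved to disk.

```vb
Option Strict On
Option Infer On
Imports System
Imports System.Reflection
Imports System.Reflection.Emit

Module EmitSketch

   ' Emits a Public Shared Add(Integer, Integer) method and calls it.
   Public Function BuildAndCallAdd(ByVal a As Integer, ByVal b As Integer) As Integer
      ' Assembly -> Module -> Type, the same chain as in the article.
      Dim assmName As New AssemblyName("SketchAssembly")
      Dim assmBuilder As AssemblyBuilder =
         AssemblyBuilder.DefineDynamicAssembly(assmName, AssemblyBuilderAccess.Run)
      Dim modBuilder As ModuleBuilder = assmBuilder.DefineDynamicModule("SketchModule")
      Dim tb As TypeBuilder = modBuilder.DefineType("SketchType", TypeAttributes.Public)

      ' Define the method: name, attributes, return type and parameter types.
      Dim mb As MethodBuilder = tb.DefineMethod("Add",
         MethodAttributes.Public Or MethodAttributes.Static,
         GetType(Integer), New Type() {GetType(Integer), GetType(Integer)})

      ' Give the method a body: push both arguments, add them, return the sum.
      Dim il As ILGenerator = mb.GetILGenerator()
      il.Emit(OpCodes.Ldarg_0)
      il.Emit(OpCodes.Ldarg_1)
      il.Emit(OpCodes.Add)
      il.Emit(OpCodes.Ret)

      ' Bake the Type and invoke the method through reflection.
      Dim t As Type = tb.CreateType()
      Dim result As Object = t.GetMethod("Add").Invoke(Nothing, New Object() {a, b})
      Return CInt(result)
   End Function

   Sub Main()
      Console.WriteLine(BuildAndCallAdd(2, 3))   ' 5
   End Sub

End Module
```

The TypeFactory in the solution follows the same steps, only with a more interesting method body.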

At this point we have done practically everything that is needed to implement our own methods on a type. That was really only a few lines of code! So how do we get our methods? We can see an example of that in the GenerateILMethod method.

Dim internalILMethod As MethodBuilder = typeBuilder.DefineMethod("InternalILMethod", _
      MethodAttributes.Private Or MethodAttributes.Static, Nothing, methodParams)

As you can see we call the DefineMethod method on the TypeBuilder, which returns a MethodBuilder. The method will have the name "InternalILMethod" and it will be Private and Shared (static in C#). It also requires two parameters, in this case an Action(Of String) (Action<string> in C#) and a Boolean. Now that we have a MethodBuilder we want to give it some body. We want to emit some opcodes so the method actually does something when we call it later. We do this by calling GetILGenerator on the MethodBuilder. GetILGenerator returns an ILGenerator Object for the current method. We can then use the ILGenerator to emit opcodes. This is actually pretty easy as you will see.

' Use an ILGenerator to emit opcodes.
Dim internalILGen As ILGenerator = internalILMethod.GetILGenerator()
 
' Define a label, which marks a position in the IL that we can branch to.
Dim skipThrow As Label = internalILGen.DefineLabel
' Begin a try-finally block.
internalILGen.BeginExceptionBlock()
' Begin a try-fault block.
internalILGen.BeginExceptionBlock()
' Load the first argument (index 0), this is an Action(Of String).
' Be aware that if this were not Shared the first argument would be
' the object on which the method was called.
internalILGen.Emit(OpCodes.Ldarg_0)
' Push a string on the stack.
internalILGen.Emit(OpCodes.Ldstr, "Entered Emit method.")
' Invoke invoke on the Action(Of String) (it is first on the stack)
' and pass the string as a parameter (which is second on the stack).
internalILGen.Emit(OpCodes.Call, invokeInfo)
' Push the second parameter (index 1) on the stack, this is a boolean.
internalILGen.Emit(OpCodes.Ldarg_1)
' Check if the boolean is true by pushing a 1 on the stack and comparing it.
internalILGen.Emit(OpCodes.Ldc_I4_1)
' Compare the two values.
internalILGen.Emit(OpCodes.Ceq)
' If the boolean is false (does not equal 1) go to skipThrow.
internalILGen.Emit(OpCodes.Brfalse, skipThrow)
' If the boolean was not equal to 0 push a string (exception message) on the stack.
internalILGen.Emit(OpCodes.Ldstr, "Well, that's it for you!")
' Create a new Exception. The exception message is on the stack and will be passed to the constructor.
internalILGen.Emit(OpCodes.Newobj, exType.GetConstructor(stringType))
' Throw the Exception.
internalILGen.Emit(OpCodes.Throw)
' If the second parameter to the method was False (0) then the code will skip the previous lines and continue here.
internalILGen.MarkLabel(skipThrow)
' Load the first argument (the Action(Of String)) again.
internalILGen.Emit(OpCodes.Ldarg_0)
' Push a string on the stack.
internalILGen.Emit(OpCodes.Ldstr, "Emit method finished successfully.")
' Invoke the invoke method on the Action(Of String) and pass the string as a parameter.
internalILGen.Emit(OpCodes.Call, invokeInfo)
 
' Begin the elusive Fault block!
' This code ONLY runs when an Exception was thrown, but does NOT catch the Exception.
internalILGen.BeginFaultBlock()
' Load the Action(Of String) again.
internalILGen.Emit(OpCodes.Ldarg_0)
' Push another string on the stack.
internalILGen.Emit(OpCodes.Ldstr, "Emit method finished unsuccessfully.")
' Again invoke the invoke method and pass the string (on the stack) as a parameter.
internalILGen.Emit(OpCodes.Call, invokeInfo)
' End the try-fault block.
internalILGen.EndExceptionBlock()
 
' Begin a finally block.
internalILGen.BeginFinallyBlock()
' Once again invoke the invoke method on the Action(Of String) (first parameter)
' with the string that was pushed onto the stack.
internalILGen.Emit(OpCodes.Ldarg_0)
internalILGen.Emit(OpCodes.Ldstr, "Leaving Emit method.")
internalILGen.Emit(OpCodes.Call, invokeInfo)
' End the try-finally block.
internalILGen.EndExceptionBlock()
 
' Return.
internalILGen.Emit(OpCodes.Ret)

So we now have a method that would look like the following:

Private Shared Sub InternalILMethod(ByVal a As Action(Of String), ByVal b As Boolean)
   Try
      a.Invoke("Entered Emit method.")
      If b = True Then
         Throw New Exception("Well, that's it for you!")
      End If
      a.Invoke("Emit method finished successfully.")
   Fault ' Not possible in VB!
      a.Invoke("Emit method finished unsuccessfully.")
   Finally
      a.Invoke("Leaving Emit method.")
   End Try
End Sub
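
To see the ILGenerator mechanics in isolation, here is a much smaller sketch. It is not part of the sample project. It assumes a DynamicMethod instead of a MethodBuilder, simply so that no assembly, module or type has to be defined first, and the names EmitSketch and BuildAdd are made up for this illustration.

```vb
Imports System
Imports System.Reflection.Emit

Module EmitSketch
    ' Hypothetical helper: builds a delegate whose body is emitted opcode by opcode.
    ' A DynamicMethod is used (an assumption of this sketch) so that
    ' no AssemblyBuilder, ModuleBuilder or TypeBuilder is needed.
    Public Function BuildAdd() As Func(Of Integer, Integer, Integer)
        Dim addMethod As New DynamicMethod("Add", GetType(Integer), _
            New Type() {GetType(Integer), GetType(Integer)})
        Dim il As ILGenerator = addMethod.GetILGenerator()
        il.Emit(OpCodes.Ldarg_0) ' Push the first argument.
        il.Emit(OpCodes.Ldarg_1) ' Push the second argument.
        il.Emit(OpCodes.Add)     ' Pop both values and push their sum.
        il.Emit(OpCodes.Ret)     ' Return the value on top of the stack.
        Return DirectCast(addMethod.CreateDelegate(GetType(Func(Of Integer, Integer, Integer))), _
            Func(Of Integer, Integer, Integer))
    End Function

    Sub Main()
        Console.WriteLine(BuildAdd().Invoke(2, 3)) ' Prints 5.
    End Sub
End Module
```

The pattern is the same as in InternalILMethod: get an ILGenerator, push what you need on the stack, emit the opcode that consumes it and finish with Ret.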

There is a little trick you should remember should you ever need to emit your own opcodes like this. First write and build the code you actually want to emit and then open the compiled assembly (in ILDASM, for example) to see the IL that was emitted. Doing this will greatly simplify coding IL like this.

So we now have our method, which uses a delegate to pass some messages to its caller and may throw an Exception if the supplied Boolean is True. If an Exception is thrown the code will step into the Fault block and send the message "Emit method finished unsuccessfully.". The code will always execute the part in the Finally block. We have a little problem now though. I will be calling this method dynamically, and the Exception will not be caught by me, but by the dynamic caller, which will wrap it in the InnerException of another Exception and then throw that one back to me. I prefer to dynamically call a method that does not throw an Exception. So what we are going to do is call the method we just created from another method we are going to create. This extra method will catch the Exception for us and pass the Exception Message and StackTrace to the delegate.

' Create another method using a MethodBuilder.
' This method calls the previous method.
' If the previous method throws an Exception this method will Catch it and write the message using a delegate.
' If the Exception is not caught in this code then the InvokeMember method
' will wrap the Exception in a TargetInvocationException.
Dim execILMethod As MethodBuilder = typeBuilder.DefineMethod("ExecuteILMethod", _
                 MethodAttributes.Public Or MethodAttributes.Static, Nothing, methodParams)
' Use an ILGenerator to emit opcodes.
Dim execILGen As ILGenerator = execILMethod.GetILGenerator
 
' Begin a try-catch block.
execILGen.BeginExceptionBlock()
' Load the Action(Of String) on the stack.
execILGen.Emit(OpCodes.Ldarg_0)
' Load the Boolean on the stack.
execILGen.Emit(OpCodes.Ldarg_1)
' Invoke the InternalILMethod method on the current
' type and pass the delegate and Boolean as parameters.
execILGen.Emit(OpCodes.Call, internalILMethod)
 
' Begin a catch block.
execILGen.BeginCatchBlock(exType)
' Create a local variable, which will hold the caught Exception.
Dim ex As LocalBuilder = execILGen.DeclareLocal(exType)
' At this point the Exception is on the stack.
' Store it in the local variable we just created.
execILGen.Emit(OpCodes.Stloc_0)
' Load the delegate.
execILGen.Emit(OpCodes.Ldarg_0)
' Load the Exception.
execILGen.Emit(OpCodes.Ldloc_0)
' Get the getter Function of the Message Property on the Exception.
execILGen.Emit(OpCodes.Call, exType.GetProperty("Message").GetGetMethod)
' Pass the exception message to the Invoke method on the delegate.
execILGen.Emit(OpCodes.Call, invokeInfo)
 
' Load the delegate again.
execILGen.Emit(OpCodes.Ldarg_0)
' Load the Exception again.
execILGen.Emit(OpCodes.Ldloc_0)
' Get the getter Function of the StackTrace Property on the Exception.
execILGen.Emit(OpCodes.Call, exType.GetProperty("StackTrace").GetGetMethod)
' Pass the stacktrace to the Invoke method on the delegate.
execILGen.Emit(OpCodes.Call, invokeInfo)
 
' End the try-catch block.
execILGen.EndExceptionBlock()
 
' Return.
execILGen.Emit(OpCodes.Ret)

Phew, that was a lot of code for a method that does so little! Well, such is the nature of IL. One thing you should notice is that I am using MethodInfo Objects to make calls to methods on Objects that are on the stack. In the call to the InternalILMethod we just created I can simply pass the MethodBuilder for that method as an argument (a MethodBuilder is a MethodInfo too). Because the method is Shared (static in C#) I do not need a reference to the current Object.
So now that we have implemented two methods, one which calls the other, we have to actually create the Type to be able to use it. Luckily this is very easy. We just call CreateType on the TypeBuilder.

' This creates the type and closes it for any further modifications.
Dim dynamicType As Type = dynamicTypeBuilder.CreateType
' Save the assembly to disk so we can check out the emitted IL.
dynamicAssmBuilder.Save("DynamicModule.dll")

Now all I have to do is return the Type to the caller and call the method we just generated. You can see how that's done in the EmitForm. You can also open the Emit form from the Main form to see what happens when the method is called with and without throwing an Exception. You can actually see in the StackTrace that we really created a new method that calls another method in our dynamically created Type! Ain't that something!?
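
The wrapping behaviour that made the extra catching method necessary can be shown without any emitted code. The sketch below is not taken from the sample project. It assumes a plain MethodInfo.Invoke call rather than InvokeMember, and the names InvocationSketch, AlwaysThrows and DescribeFailure are hypothetical.

```vb
Imports System
Imports System.Reflection

Module InvocationSketch
    ' Hypothetical method that always fails.
    Public Sub AlwaysThrows()
        Throw New InvalidOperationException("Well, that's it for you!")
    End Sub

    ' Calls AlwaysThrows through reflection and describes what was caught.
    Public Function DescribeFailure() As String
        Dim method As MethodInfo = GetType(InvocationSketch).GetMethod("AlwaysThrows")
        Try
            ' Shared method without parameters: no target object and no arguments.
            method.Invoke(Nothing, Nothing)
            Return "no exception"
        Catch ex As TargetInvocationException
            ' The original Exception is available through InnerException.
            Return ex.GetType().Name & " wrapping " & ex.InnerException.GetType().Name
        End Try
    End Function

    Sub Main()
        ' Prints: TargetInvocationException wrapping InvalidOperationException
        Console.WriteLine(DescribeFailure())
    End Sub
End Module
```

A method that catches its own Exceptions, like ExecuteILMethod, spares the caller from unwrapping the InnerException.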

6. Generating IL using Expression Trees

Luckily there is a shorter way to emit IL using the .NET Framework. It's called Expression Trees. You might have noticed that we actually created three methods on our dynamic Type. Two using IL and another one using Expression Trees. What exactly is an Expression Tree? It is a representation of code in the form of data. That sounds pretty abstract, but believe me, it's not. Expression Trees revolve around the System.Linq.Expressions.Expression Type. All Expressions Inherit from this base class and all Expressions can be created using Shared (static in C#) factory methods on this type. Let's look at the code that created the InternalILMethod above, but this time using Expression Trees.

Private Shared Function GenerateInnerExpressionTree(ByVal actionParam As ParameterExpression, _
        ByVal throwExParam As ParameterExpression, ByVal invokeInfo As MethodInfo) As Expression
Return Expression.TryFinally(
   Expression.TryFault(
      Expression.Block(
         Expression.Call(actionParam, invokeInfo, Expression.Constant("Entered Expression Tree method.")),
         Expression.IfThen(Expression.Equal(throwExParam, Expression.Constant(True)),
            Expression.Throw(Expression.[New](GetType(Exception).GetConstructor({GetType(String)}),
               Expression.Constant("Well, that's it for you!")))),
         Expression.Call(actionParam, invokeInfo, Expression.Constant("Expression Tree method finished successfully."))),
      Expression.Call(actionParam, invokeInfo, Expression.Constant("Expression Tree method finished unsuccessfully."))),
   Expression.Call(actionParam, invokeInfo, Expression.Constant("Leaving Expression Tree method.")))
End Function

Does that look easy? Not exactly, mostly because I nested every Expression in the containing Expression. You should read it as follows: We declare a TryFinally Expression, which requires an Expression that makes up the body of the Try block and an Expression that makes up the body of the Finally block. As body of the Try block we create a TryFault Expression which, as you can guess, again needs an Expression for the Try block and an Expression for the Fault block. So for the Try block we create a Block of Expressions, starting with a Call Expression which invokes Invoke on the Action(Of String) (Action<string> in C#) parameter and passes the Constant Expression "Entered Expression Tree method." as an argument. The next Expression in our Block Expression is an IfThen Expression, which of course needs an If and a Then Expression. So for the If we create an Equal Expression and compare the Boolean parameter to the Constant value True. In the Then block we put a Throw Expression in which we put a New Expression which creates the Exception. We are now out of the IfThen Expression and into the Block Expression again, where we put in a final Expression, being another call to the Invoke method of the delegate input parameter. That concludes the Try block and we are now in the Fault block, where we again do a call to Invoke. We are then out of the Fault block and in the Finally block where we make a call to Invoke one last time. That makes up the entire Expression. I admit it takes some time to get used to, but once you get the hang of it, it actually makes sense.
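
If the nesting is hard to follow, the same kind of tree can be written out one Expression at a time. The following is only a sketch of the IfThen part, not code from the sample project. The names StepwiseSketch and BuildThrower are made up, and it uses Compile to get a delegate right away instead of emitting into a MethodBuilder.

```vb
Imports System
Imports System.Linq.Expressions

Module StepwiseSketch
    ' Hypothetical helper: builds "If throwEx = True Then Throw New Exception(...)"
    ' piece by piece and compiles it to a delegate.
    Public Function BuildThrower() As Action(Of Boolean)
        ' The Boolean input parameter.
        Dim throwExParam As ParameterExpression = Expression.Parameter(GetType(Boolean), "throwEx")
        ' The If part: throwEx = True.
        Dim condition As Expression = Expression.Equal(throwExParam, Expression.Constant(True))
        ' The New Expression that creates the Exception.
        Dim newException As Expression = Expression.[New]( _
            GetType(Exception).GetConstructor(New Type() {GetType(String)}), _
            Expression.Constant("Well, that's it for you!"))
        ' The Then part wraps the New Expression in a Throw Expression.
        Dim throwIfAsked As Expression = Expression.IfThen(condition, Expression.Throw(newException))
        Return Expression.Lambda(Of Action(Of Boolean))(throwIfAsked, throwExParam).Compile()
    End Function

    Sub Main()
        Dim thrower As Action(Of Boolean) = BuildThrower()
        thrower.Invoke(False) ' Nothing happens.
        Try
            thrower.Invoke(True)
        Catch ex As Exception
            Console.WriteLine(ex.Message) ' Prints the exception message.
        End Try
    End Sub
End Module
```

Each local variable corresponds to one nested argument in GenerateInnerExpressionTree, which makes it easier to inspect the pieces in the debugger.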

So that was the inner method, now let's take a look at the method which catches the Exception. And here we have a problem... When using Expression Trees it is not possible to use a MethodBuilder like we did when using the ILGenerator. The reason, apparently, is that a MethodBuilder is still able to change. Why this restriction only applies to Expression Trees I don't know, but it's a fact we have to live with. So instead of passing in a method I simply call the function that created the Expression Tree and the Expression Tree is neatly combined with the outer Expression Tree when creating the method. So let's see what that looks like in code.

Private Shared Sub GenerateExpressionTreeMethod(ByVal typeBuilder As TypeBuilder, ByVal methodParams As Type())
   Dim invokeInfo As MethodInfo = GetType(Action(Of String)).GetMethod("Invoke", {GetType(String)})
 
   ' Input parameters.
   Dim actionParam As ParameterExpression = Expression.Parameter(GetType(Action(Of String)), "action")
   Dim throwExParam As ParameterExpression = Expression.Parameter(GetType(Boolean), "throwEx")
 
   ' Local variable.
   Dim exParam As ParameterExpression = Expression.Parameter(GetType(Exception), "ex")
 
   Dim exp As Expression = Expression.TryCatch(
      TypeFactory.GenerateInnerExpressionTree(actionParam, throwExParam, invokeInfo),
         Expression.Catch(exParam,
            Expression.Block(
               Expression.Call(actionParam, invokeInfo,
                  Expression.Property(exParam, "Message")),
               Expression.Call(actionParam, invokeInfo,
                  Expression.Property(exParam, "StackTrace")))))
 
   Dim expTreeMethod As MethodBuilder = typeBuilder.DefineMethod("ExecuteExpressionTreeMethod", MethodAttributes.Public Or MethodAttributes.Static, Nothing, methodParams)
   Expression.Lambda(Of Action(Of Action(Of String), Boolean))(exp, actionParam, throwExParam).CompileToMethod(expTreeMethod)
End Sub

Perhaps this piece of code is a bit easier than the other function. What's notable in this example is that the Expression Tree is wrapped in a LambdaExpression which can be compiled into the newly created method. Actually there are two options for compiling Expression Trees. One is CompileToMethod, which emits IL into the MethodBuilder argument. Another way to compile an Expression Tree is to call the Compile method, which returns a delegate that can be Invoked right away. However, since we are using a TryFault block this will throw a NotSupportedException. As far as I can tell that is because Compile produces a dynamic method, and dynamic methods do not support Fault blocks (VB and C# have no Fault block either, but that is a language limitation).
Again, you can see how this method performs by opening the Expression tree form from the Main form. As you can now see in the StackTrace of the Exception only one method was created.
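
The Compile route mentioned above can be tried on a tree that avoids TryFault. This is only a sketch under that assumption and it is not part of the sample project. The names CompileSketch and BuildLogger are hypothetical.

```vb
Imports System
Imports System.Linq.Expressions
Imports System.Reflection

Module CompileSketch
    ' Hypothetical helper: wraps a TryFinally Expression in a lambda and
    ' compiles it straight to a delegate, without a MethodBuilder.
    Public Function BuildLogger() As Action(Of Action(Of String))
        Dim invokeInfo As MethodInfo = GetType(Action(Of String)).GetMethod("Invoke")
        Dim actionParam As ParameterExpression = _
            Expression.Parameter(GetType(Action(Of String)), "action")
        ' Try: action("Inside Try.")  Finally: action("Inside Finally.")
        Dim body As Expression = Expression.TryFinally( _
            Expression.Call(actionParam, invokeInfo, Expression.Constant("Inside Try.")), _
            Expression.Call(actionParam, invokeInfo, Expression.Constant("Inside Finally.")))
        Return Expression.Lambda(Of Action(Of Action(Of String)))(body, actionParam).Compile()
    End Function

    Sub Main()
        ' Prints "Inside Try." followed by "Inside Finally."
        BuildLogger().Invoke(AddressOf Console.WriteLine)
    End Sub
End Module
```

Compile is convenient when the delegate is all you need. CompileToMethod is the one to use when the method has to end up in a Type you are building.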

Now remember that we saved the created assembly to disk? You can find it in the bin folder of the startup project. Open it using ILDASM and check the emitted IL for both methods. It's almost exactly the same! Apart from the message strings, the only difference is that the Expression Tree compiler emits callvirt where I emitted call. Here is the IL for the TryFault block for both the Emit and the Expression Trees example.

    .try
    {
      IL_0000:  ldarg.0
      IL_0001:  ldstr      "Entered Emit method."
      IL_0006:  call       instance void class [mscorlib]System.Action`1<string>::Invoke(!0)
      IL_000b:  ldarg.1
      IL_000c:  ldc.i4.1
      IL_000d:  ceq
      IL_000f:  brfalse    IL_001f
      IL_0014:  ldstr      "Well, that's it for you!"
      IL_0019:  newobj     instance void [mscorlib]System.Exception::.ctor(string)
      IL_001e:  throw
      IL_001f:  ldarg.0
      IL_0020:  ldstr      "Emit method finished successfully."
      IL_0025:  call       instance void class [mscorlib]System.Action`1<string>::Invoke(!0)
      IL_002a:  leave      IL_003b
    }  // end .try
    fault
    {
      IL_002f:  ldarg.0
      IL_0030:  ldstr      "Emit method finished unsuccessfully."
      IL_0035:  call       instance void class [mscorlib]System.Action`1<string>::Invoke(!0)
      IL_003a:  endfinally
    }  // end handler
    IL_003b:  leave      IL_004c
  }  // end .try

      .try
      {
        IL_0000:  ldarg.0
        IL_0001:  ldstr      "Entered Expression Tree method."
        IL_0006:  callvirt   instance void class [mscorlib]System.Action`1<string>::Invoke(!0)
        IL_000b:  ldarg.1
        IL_000c:  ldc.i4.1
        IL_000d:  ceq
        IL_000f:  brfalse    IL_001f
        IL_0014:  ldstr      "Well, that's it for you!"
        IL_0019:  newobj     instance void [mscorlib]System.Exception::.ctor(string)
        IL_001e:  throw
        IL_001f:  ldarg.0
        IL_0020:  ldstr      "Expression Tree method finished successfully."
        IL_0025:  callvirt   instance void class [mscorlib]System.Action`1<string>::Invoke(!0)
        IL_002a:  leave      IL_003b
      }  // end .try
      fault
      {
        IL_002f:  ldarg.0
        IL_0030:  ldstr      "Expression Tree method finished unsuccessfully."
        IL_0035:  callvirt   instance void class [mscorlib]System.Action`1<string>::Invoke(!0)
        IL_003a:  endfinally
      }  // end handler
      IL_003b:  leave      IL_004c
    }  // end .try

That does look pretty similar! So IL can be emitted using Reflection.Emit opcodes or Expression Trees. Both methods have their pros and cons. The con to Reflection.Emit is obviously that you need lots of code to get things done and debugging is quite hard (but possible). The pro is that the sky is the limit, there is virtually nothing that can't be done with Emit! The pro to Expression Trees is that they are easier to understand and debug, especially when you write them out piece by piece (although you get nice IntelliSense support when nesting them). The con is that they are actually not very well documented. Most pages on Expression Classes on MSDN have no examples or even descriptions! Also, Expression Trees have some limitations. As we experienced, we could not make a call to a method that did not yet exist. Another limitation is that with Expression Trees we can only generate Shared (static) methods. This may or may not be a problem of course.

Further reading:
While I have no real documentation on Expression Trees you could, like always, check out MSDN.
What you should read is chapter 6 of the book Metaprogramming in .NET

7. The curious case of F#

There are two more things I would like to take a look at: functions as first class citizens and tail calls. Both are features of Microsoft's functional programming language, F#.

7.1. The case of Functions as First Class Citizens

F# (and other functional languages) treats functions as first class citizens, which means that they can be passed as arguments to other functions, returned by functions, and stored in variables. Basically first-class functions are really just treated like any other value, such as an Integer or a String. VB and C# can sort of mimic this behaviour through delegates, but it's not quite the same. You can download the TheCuriousCaseOfFSharp sample project at the top of this article. Open the solution and take a look at the code. The first thing you will see is the following.

// Functions as 'first class citizens'.
// This function returns a new function which takes a function as an argument.
let SomeFunc a b c d =
    let newFunc f = f (a, b) + f (c, d)
    newFunc
 
// Call SomeFunc passing in four integers and passing in an anonymous function
// that adds two integers to the function that is returned by SomeFunc.
let resultAdd = SomeFunc 1 2 3 4 (fun (a, b) -> a + b)
// Do the same as above, but multiply the integers.
let resultMult = SomeFunc 1 2 3 4 (fun (a, b) -> a * b)
 
// Print the results.
printfn "The result of adding: %d" resultAdd
printfn "The result of multiplying: %d" resultMult

Well, that goes pretty easy indeed! A function that returns a function that needs a function as argument. Notice the lambdas that are passed to the function that is returned from SomeFunc. We already know these from VB and C#, but they borrowed them from functional languages like F#. So what do you think? Does the compiler simply create Shared (static) functions like we saw in the lambda examples earlier?

As you can see a new Type is created for each function. The new types actually inherit from FSharpFunc(Of T, U) (FSharpFunc<T, U> in C#). Now whenever an FSharpFunc is 'used as a value' the compiler generates a call to the Invoke method of the FSharpFunc. I can't show the generated IL in here, because it would not fit the screen, however you can take a look for yourself. This is where you will want to look.

And this is what you should be looking out for.

IL_0074:  callvirt   instance !1 class [FSharp.Core]Microsoft.FSharp.Core.FSharpFunc`2<int32,class [FSharp.Core]Microsoft.FSharp.Core.Unit>::Invoke(!0)

I never called Invoke in my code, so this is what the compiler does for me. And there you have it in a nutshell. Functions as first class citizens!
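
For comparison, this is roughly how the delegate-based mimicry in VB mentioned earlier could look. It is a sketch and not part of the sample project. The module name FirstClassSketch is made up, and the tupled F# arguments are replaced by a Func with two Integer parameters.

```vb
Imports System

Module FirstClassSketch
    ' Returns a delegate that itself takes a delegate as its argument,
    ' mirroring SomeFunc from the F# example.
    Public Function SomeFunc(a As Integer, b As Integer, c As Integer, d As Integer) _
            As Func(Of Func(Of Integer, Integer, Integer), Integer)
        Return Function(f As Func(Of Integer, Integer, Integer)) f(a, b) + f(c, d)
    End Function

    Sub Main()
        ' Here Invoke has to be called explicitly (or through the f(...) shorthand).
        Dim resultAdd As Integer = SomeFunc(1, 2, 3, 4).Invoke(Function(x, y) x + y)
        Dim resultMult As Integer = SomeFunc(1, 2, 3, 4).Invoke(Function(x, y) x * y)
        Console.WriteLine("The result of adding: {0}", resultAdd)       ' 10
        Console.WriteLine("The result of multiplying: {0}", resultMult) ' 14
    End Sub
End Module
```

It works, but the delegate types have to be spelled out in full, which is exactly the ceremony F# spares you.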

7.2. The case of Tail Calls

So let's take a look at tail calls. Functional languages excel in recursion, that is, functions that call themselves. If you don't watch out you will get a StackOverflowException! That happens when the nested calls use up all the stack space of the thread. I'm not sure exactly how many calls that takes (it depends on the size of the stack and on how much each call puts on it), but it happens. Take a look at the following VB code.

Public Function GetTenMillion(ByVal i As Integer) As Integer
   If i < 10000000 Then
      Return GetTenMillion(i + 1)
   ElseIf i > 10000000 Then
      Return GetTenMillion(i - 1)
   Else
      Return i
   End If
End Function

Any experienced programmer knows this function will cause a StackOverflowException if it is called with an input that is not close to 10000000. So if we call it with input 1 our application will most likely crash. Well here's the deal, it won't in F#! Here is the F# code for the same function together with the calling code.

// Recursive 'tail call'. Eliminates the stack before calling a method.
let rec GetTenMillion i =
    if i < 10000000
    then GetTenMillion (i + 1)
    elif i > 10000000
    then GetTenMillion (i - 1)
    else i
 
// This would throw a StackOverflowException in VB or C#!
GetTenMillion 1 |> printfn "GetTenMillion from 1: %d"

So what is this 'tail call'? Well, whenever the call to a recursive function is the last thing that function does, the current stack frame is no longer needed and can be discarded before the call is made. So in this case we can see that if i is smaller than ten million we call GetTenMillion again with i + 1. In this example i + 1 is evaluated before the call to GetTenMillion and nothing happens after that, so the call stack does not have to grow. If, for example, we would add 1 AFTER the function returned, it would be compiled as just another call on the stack and a StackOverflowException may be thrown. So let's see some IL for this function.

.method public static int32  GetTenMillion(int32 i) cil managed
{
  // Code size       41 (0x29)
  .maxstack  4
  IL_0000:  nop
  IL_0001:  ldarg.0
  IL_0002:  ldc.i4     0x989680
  IL_0007:  bge.s      IL_000b
  IL_0009:  br.s       IL_000d
  IL_000b:  br.s       IL_0014
  IL_000d:  ldarg.0
  IL_000e:  ldc.i4.1
  IL_000f:  add
  IL_0010:  starg.s    i
  IL_0012:  br.s       IL_0000
  IL_0014:  ldarg.0
  IL_0015:  ldc.i4     0x989680
  IL_001a:  ble.s      IL_001e
  IL_001c:  br.s       IL_0020
  IL_001e:  br.s       IL_0027
  IL_0020:  ldarg.0
  IL_0021:  ldc.i4.1
  IL_0022:  sub
  IL_0023:  starg.s    i
  IL_0025:  br.s       IL_0000
  IL_0027:  ldarg.0
  IL_0028:  ret
} // end of method Program::GetTenMillion

Do you see that? Not a single call opcode was emitted! What happens instead? If i is smaller than ten million 1 is added to i and we simply branch back to the beginning of the function. The same happens if i is bigger than ten million, except 1 is subtracted from i. Simple, but quite effective! Using Reflection.Emit you could use this trick to make your own very deep recursive functions.
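
Written back as VB, the IL above corresponds roughly to a loop. The following is my own reconstruction, not decompiler output, and the module name TailCallSketch is made up.

```vb
Imports System

Module TailCallSketch
    ' Loop equivalent of the recursive GetTenMillion: instead of calling itself
    ' the function overwrites its argument and jumps back to the start.
    Public Function GetTenMillion(ByVal i As Integer) As Integer
        Do
            If i < 10000000 Then
                i += 1 ' starg.s i, then branch back to IL_0000.
            ElseIf i > 10000000 Then
                i -= 1 ' Same, but subtracting 1.
            Else
                Return i
            End If
        Loop
    End Function

    Sub Main()
        Console.WriteLine("GetTenMillion from 1: {0}", GetTenMillion(1))
    End Sub
End Module
```

Because no call is made, the stack does not grow no matter how far the input is from ten million.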

You can run the F# application and really see that 10000000 is printed and no StackOverflowException is thrown.

Further reading: A book that greatly helped me to understand at least some of F# is Expert F# 2.0 from Apress.

8. Afterword

Well, that certainly was A LOT of writing (for me) and reading (for you). As I said in the introduction this is not an easy subject. I hope I have made it as easy as possible and that you have enjoyed reading it as much as I've enjoyed writing it. Most of the stuff I've written down was new to me before I started writing so I can say I've learned A LOT and I hope you can say the same.

I would be happy to answer any questions or comments.

Happy coding!

