Ever needed to return an IEnumerable<T> from a method? Did you create a List<T> instance that you populated and returned? There is a better way, with a smaller memory footprint and often better performance.
The yield return statement is one of the more mysterious, yet very useful, constructs in C#. With yield return it is possible to return an IEnumerable<T> without creating a collection of any sort (no array, no list, no anything):
    public static IEnumerable<int> Numbers(int max)
    {
        for (int i = 0; i < max; i++)
        {
            yield return i;
        }
    }
yield return adds one item to the returned IEnumerable<T> each time it is executed, but it does not end the function as a normal return would. The function ends when the flow of control reaches the end of the function body.
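To illustrate that yield return hands back one item without ending the function, here is a minimal sketch with several yield return statements in a row. The class name YieldDemo and the method ThreeSteps are made up for this example.

```csharp
using System;
using System.Collections.Generic;

public static class YieldDemo
{
    // Hypothetical example method: each yield return hands one item to the
    // caller, and execution continues on the next line when the caller asks
    // for another item.
    public static IEnumerable<string> ThreeSteps()
    {
        yield return "first";
        yield return "second";
        yield return "third";
        // Reaching the end of the body ends the sequence.
    }

    public static void Main()
    {
        foreach (string step in ThreeSteps())
        {
            Console.WriteLine(step);
        }
    }
}
```

The loop prints the three strings in order. A plain return in the same place would not compile in an iterator method, which is why the two keywords always appear together.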
Using yield return makes the code shorter than creating and populating, for example, a list, but that’s only part of its strength. The real power lies in lazy evaluation.
Lazy Evaluation
Let’s extend the Numbers function to print out a message each time it returns a number:
    public static IEnumerable<int> Numbers(int max)
    {
        for (int i = 0; i < max; i++)
        {
            Console.WriteLine("Returning {0}", i);
            yield return i;
        }
    }
Then we use the IEnumerable<int> returned from Numbers in a loop, but only take the first three elements.
    foreach (int i in Numbers(10).Take(3))
    {
        Console.WriteLine("Number {0}", i);
    }
The output shows that each number is returned just before it is needed. Only three numbers are returned, so the loop in Numbers is evidently halted after three iterations.
    Returning 0
    Number 0
    Returning 1
    Number 1
    Returning 2
    Number 2
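For contrast, here is a sketch of what the list-based approach from the introduction would look like. NumbersEager and EagerDemo are hypothetical names, not something from the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EagerDemo
{
    // Hypothetical eager counterpart to Numbers: the whole list is filled
    // before anything is handed back to the caller.
    public static IEnumerable<int> NumbersEager(int max)
    {
        var result = new List<int>();
        for (int i = 0; i < max; i++)
        {
            Console.WriteLine("Returning {0}", i);
            result.Add(i);
        }
        return result;
    }

    public static void Main()
    {
        foreach (int i in NumbersEager(10).Take(3))
        {
            Console.WriteLine("Number {0}", i);
        }
    }
}
```

Here all ten "Returning" lines are printed before the first "Number" line, because the list is fully populated before Take(3) gets to see it. Seven of the ten elements are created for nothing.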
Under the Hood
There is certainly something special about the yield return construct. The Numbers function is only allowed to run until it returns an item, then it is paused until another item is needed. When no more items are needed it is abandoned and never runs to the end.
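The pausing becomes easier to see when the enumeration is driven by hand instead of by foreach. The following is a rough sketch of that; TakeByHand and ManualDemo are names invented for the example, and Numbers is the same function as above.

```csharp
using System;
using System.Collections.Generic;

public static class ManualDemo
{
    public static IEnumerable<int> Numbers(int max)
    {
        for (int i = 0; i < max; i++)
        {
            Console.WriteLine("Returning {0}", i);
            yield return i;
        }
    }

    // Hypothetical helper: pulls at most 'count' items by hand.
    public static List<int> TakeByHand(IEnumerable<int> source, int count)
    {
        var taken = new List<int>();
        // The using block disposes the enumerator when we stop early.
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            // Each MoveNext call runs Numbers up to its next yield return.
            while (taken.Count < count && e.MoveNext())
            {
                Console.WriteLine("Number {0}", e.Current);
                taken.Add(e.Current);
            }
        }
        return taken;
    }

    public static void Main()
    {
        TakeByHand(Numbers(10), 3);
    }
}
```

Nothing inside Numbers runs until the first MoveNext call, and after the third item MoveNext is simply never called again, so the remaining seven iterations never happen.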
How is this accomplished? The MSDN docs give a hint:
The compiler generates a class to implement the behavior that is expressed in the iterator block.
I used Reflector to have a look at the generated class (the code is further down in this post). The class implements both IEnumerable and IEnumerator at the same time. These are different concepts. IEnumerable is something that can be enumerated. IEnumerator is a handle used when enumerating. It points at the current element and can be moved forward to the next element. Implementing them both in the same class makes the implementation shorter, but also more confusing.
The first method in the class that will be called is IEnumerable<int>.GetEnumerator(). If the call is coming from the same thread that instantiated the class, and the instance has not been handed out as an enumerator before (its state is still -2), it will reset the state to 0 and return this. Otherwise it creates a new instance. The next thing the calling code would do is to step the enumerator forward through IEnumerator<int>.MoveNext(). The two most important variables are <>1__state and <i>5__1. <>1__state keeps track of the current state of the enumerator. 0 means freshly created with a position before the first item. 1 means currently pointing at an item, with <i>5__1 keeping track of where we are in the loop. Finally, a state of -1 means we’re past the last item in the enumeration.
The for loop has been translated to a while loop in IEnumerator<int>.MoveNext(). Using the label Label_005C in combination with goto gives a way to continue in the middle of the loop. It is all very elegant, but hard to understand. I don’t want to think about how a more complex method with several yield return statements would be translated…
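For readers who find the goto hard to follow, here is a hand-written sketch of the same idea in ordinary C#. It is not what the compiler emits: it splits the enumerable and the enumerator into two classes, skips the thread check, and all the names (NumbersEnumerable, NumbersEnumerator, StateMachineDemo) are invented for the example. The state values follow the meaning described above.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hand-written sketch, not the compiler's actual output.
public sealed class NumbersEnumerable : IEnumerable<int>
{
    private readonly int max;
    public NumbersEnumerable(int max) { this.max = max; }

    public IEnumerator<int> GetEnumerator() { return new NumbersEnumerator(max); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public sealed class NumbersEnumerator : IEnumerator<int>
{
    private readonly int max;
    private int state;   // 0 = not started, 1 = paused at an item, -1 = finished
    private int i;       // the loop variable, kept alive between calls
    private int current;

    public NumbersEnumerator(int max) { this.max = max; }

    public int Current { get { return current; } }
    object IEnumerator.Current { get { return current; } }

    public bool MoveNext()
    {
        switch (state)
        {
            case 0:
                i = 0;   // loop initializer, runs on the first call only
                break;
            case 1:
                i++;     // loop increment, resuming after the previous item
                break;
            default:
                return false;
        }

        if (i < max)
        {
            Console.WriteLine("Returning {0}", i);
            current = i;
            state = 1;
            return true;
        }

        state = -1;
        return false;
    }

    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { }
}

public static class StateMachineDemo
{
    public static void Main()
    {
        foreach (int n in new NumbersEnumerable(3))
        {
            Console.WriteLine("Number {0}", n);
        }
    }
}
```

Keeping the two interfaces in separate classes costs one extra allocation per enumeration, which is presumably why the compiler merges them, but it makes the roles easier to tell apart.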
The good news is that the compiler handles all of this for us. We can just continue to ignorantly use yield return and know that it will produce efficient code that only creates the needed elements.
    [CompilerGenerated]
    private sealed class <Numbers>d__0 : IEnumerable<int>, IEnumerable, IEnumerator<int>, IEnumerator, IDisposable
    {
        // Fields
        private int <>1__state;
        private int <>2__current;
        public int <>3__max;
        private int <>l__initialThreadId;
        public int <i>5__1;
        public int max;

        // Methods
        [DebuggerHidden]
        public <Numbers>d__0(int <>1__state)
        {
            this.<>1__state = <>1__state;
            this.<>l__initialThreadId = Thread.CurrentThread.ManagedThreadId;
        }

        private bool MoveNext()
        {
            switch (this.<>1__state)
            {
                case 0:
                    this.<>1__state = -1;
                    this.<i>5__1 = 0;
                    while (this.<i>5__1 < this.max)
                    {
                        Console.WriteLine("Returning {0}", this.<i>5__1);
                        this.<>2__current = this.<i>5__1;
                        this.<>1__state = 1;
                        return true;
                    Label_005C:
                        this.<>1__state = -1;
                        this.<i>5__1++;
                    }
                    break;

                case 1:
                    goto Label_005C;
            }
            return false;
        }

        [DebuggerHidden]
        IEnumerator<int> IEnumerable<int>.GetEnumerator()
        {
            YieldReturn.<Numbers>d__0 d__;
            if ((Thread.CurrentThread.ManagedThreadId == this.<>l__initialThreadId) && (this.<>1__state == -2))
            {
                this.<>1__state = 0;
                d__ = this;
            }
            else
            {
                d__ = new YieldReturn.<Numbers>d__0(0);
            }
            d__.max = this.<>3__max;
            return d__;
        }

        [DebuggerHidden]
        IEnumerator IEnumerable.GetEnumerator()
        {
            return this.IEnumerable<int>.GetEnumerator();
        }

        [DebuggerHidden]
        void IEnumerator.Reset()
        {
            throw new NotSupportedException();
        }

        void IDisposable.Dispose()
        {
        }

        // Properties
        int IEnumerator<int>.Current
        {
            [DebuggerHidden]
            get
            {
                return this.<>2__current;
            }
        }

        object IEnumerator.Current
        {
            [DebuggerHidden]
            get
            {
                return this.<>2__current;
            }
        }
    }
Since Reflector has long been a commercial product, I would recommend the open source alternative ILSpy.