Using C#'s yield to Return an Enumerable Collection
- Iterating Collections and Creating Sub-lists
- Switching to yield Is Much More Efficient
- Mixing in a Predicate<T> Affords Maximum Flexibility
- Knowing yield's Basic Rules and Limitations
- Summary
Throughout the years I have said that the CodeDOM would offer some pretty cool capabilities. Well, Microsoft stuck a new one in the seemingly innocuous phrases yield return and yield break. In this article, you learn how this new word pairing works and how the inline state machine that yield creates, together with dynamic iterators, can save you a lot of time and effort and eliminate much of the need for copying sub-lists.
Iterating Collections and Creating Sub-lists
One of the most common programming fragments is code that loops over an array or collection of data. Loops are everywhere. In fact, looping code is so common that a behavior pattern—the Iterator—was invented just to cut down on some of the tedium.
In .NET, IEnumerator and IEnumerable implement the Iterator behavior pattern. IEnumerable means that something is enumerable: its GetEnumerator method returns an IEnumerator, and IEnumerator supplies the plumbing (MoveNext, Current, and Reset) that makes iterative code work so well in C#. It is this capability that makes binding collections to controls and the foreach keyword work.
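To make that plumbing concrete, here is a minimal sketch of roughly what a foreach loop does on your behalf. The class and method names (ForeachSketch, SumWithForeach, SumByHand) are made up for illustration, and the hand-written version is only an approximation of what the compiler generates.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class; the names are illustrative only.
public static class ForeachSketch
{
    // Sums the values with foreach; the compiler supplies the iterator plumbing.
    public static int SumWithForeach(IEnumerable<int> values)
    {
        int total = 0;
        foreach (int value in values)
            total += value;
        return total;
    }

    // Roughly the same loop written by hand: ask the IEnumerable for an
    // IEnumerator, then call MoveNext and read Current until MoveNext
    // returns false. The using block disposes the enumerator afterward.
    public static int SumByHand(IEnumerable<int> values)
    {
        int total = 0;
        using (IEnumerator<int> e = values.GetEnumerator())
        {
            while (e.MoveNext())
                total += e.Current;
        }
        return total;
    }
}
```

Both methods walk the sequence the same way; the foreach version simply hides the GetEnumerator, MoveNext, and Current calls.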
Because .NET implements the Iterator pattern for us, we individual programmers do not have to implement it ourselves for every application we write (although I find implementing patterns informative and fun). Iterator itself is pretty straightforward and a small convenience, but collectively it saves us programmers from writing a lot of tedious code.
Another small invention that will save tons of code over your career is the yield keyword. Let's look at how we used to use foreach to copy lists and sub-lists, and then let's look at how yield makes this even easier.
A very common programming task is to take a big something and find the small something within. For example, out of all of our customers we may only want those customers in the 48846 zip code on any given day. So some programmer might write code to search the customers by zip code. The same is true for any collection.
The code in Listing 1 contains a custom class Event, and Listing 2 shows the classic way to create a sub-list using a List<T> of Event objects.
Listing 1 A simple custom event class.
using System;

public class Event
{
    /// <summary>
    /// Initializes a new instance of the Event class.
    /// </summary>
    /// <param name="occurs">When the event occurs.</param>
    /// <param name="description">A short description of the event.</param>
    public Event(DateTime occurs, string description)
    {
        this.occurs = occurs;
        this.description = description;
    }

    private DateTime occurs;
    public DateTime Occurs
    {
        get { return occurs; }
        set { occurs = value; }
    }

    private string description;
    public string Description
    {
        get { return description; }
        set { description = value; }
    }

    public override string ToString()
    {
        return string.Format("{0} occurs on {1}", description, occurs);
    }
}
Listing 2 A classic foreach statement that returns a sub-list of items.
using System;
using System.Collections.Generic;
using System.Text;

namespace YieldReturn
{
    class Program
    {
        static void Main(string[] args)
        {
            List<Event> events = new List<Event>();
            events.Add(new Event(new DateTime(2007, 4, 6), "Rent"));
            events.Add(new Event(new DateTime(2007, 6, 15), "Chicago"));
            events.Add(new Event(new DateTime(2007, 6, 15), "The Fray"));
            events.Add(new Event(new DateTime(2007, 4, 1), "Wrestlemania"));

            List<Event> aprilEvents = GetAprilEvents(events);
            foreach(Event ev in aprilEvents)
                Console.WriteLine(ev);
            Console.ReadLine();
        }

        // 97 bytes of MSIL
        public static List<Event> GetAprilEvents(List<Event> all)
        {
            List<Event> april = new List<Event>();
            foreach(Event ev in all)
                if(ev.Occurs.Month == 4)
                    april.Add(ev);
            return april;
        }
    }
}
The code in Listing 2 works and has been written and re-written thousands, maybe millions, of times. If you open ILDASM (Intermediate Language Disassembler), you will see that the GetAprilEvents function takes up about 97 bytes of MSIL (Microsoft Intermediate Language), as shown in Figure 1.
Figure 1 Creating a new sub-list, plus an iterator and a test to populate the list, takes about 97 bytes of code.
Of course, one problem in addition to size is that code like this needs to be manually written and re-written for all of the variations you need. (We could simplify and reuse this code by passing in the date, but we'd still have to modify and add new versions if we wanted to search by description.) Thus the code isn't tiny, and even though it works it isn't very flexible.
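As a rough illustration of that inflexibility, here is a sketch of the "pass in the date" variation next to a search by description. EventFilters, GetEventsInMonth, and GetEventsByDescription are hypothetical names that do not appear in the listings, and the Event class is a condensed stand-in for Listing 1 so the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;

// Condensed stand-in for the Event class in Listing 1.
public class Event
{
    private DateTime occurs;
    private string description;

    public Event(DateTime occurs, string description)
    {
        this.occurs = occurs;
        this.description = description;
    }

    public DateTime Occurs { get { return occurs; } }
    public string Description { get { return description; } }
}

// Hypothetical helper class; the method names are illustrative only.
public static class EventFilters
{
    // GetAprilEvents with the month passed in instead of hard-coded.
    public static List<Event> GetEventsInMonth(List<Event> all, int month)
    {
        List<Event> matches = new List<Event>();
        foreach (Event ev in all)
            if (ev.Occurs.Month == month)
                matches.Add(ev);
        return matches;
    }

    // A different search criterion still needs its own near-identical method:
    // same new list, same loop, only the test changes.
    public static List<Event> GetEventsByDescription(List<Event> all, string description)
    {
        List<Event> matches = new List<Event>();
        foreach (Event ev in all)
            if (ev.Description == description)
                matches.Add(ev);
        return matches;
    }
}
```

The two methods differ only in the if test, yet each one allocates and fills its own copy of the matching items.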