LINQ GroupBy Explanations and Examples

I’m a big fan of LINQ, and one of my favorite extension methods is GroupBy. The more I use it, the more useful I find it. The issue, though, is the sheer number of overloads GroupBy has. Let’s review the different method signatures of GroupBy:

GroupBy<TSource, TKey>(IEnumerable<TSource>, Func<TSource, TKey>)
GroupBy<TSource, TKey>(IEnumerable<TSource>, Func<TSource, TKey>, IEqualityComparer<TKey>)
GroupBy<TSource, TKey, TElement>(IEnumerable<TSource>, Func<TSource, TKey>, Func<TSource, TElement>)
GroupBy<TSource, TKey, TResult>(IEnumerable<TSource>, Func<TSource, TKey>, Func<TKey, IEnumerable<TSource>, TResult>)
GroupBy<TSource, TKey, TElement>(IEnumerable<TSource>, Func<TSource, TKey>, Func<TSource, TElement>, IEqualityComparer<TKey>)
GroupBy<TSource, TKey, TElement, TResult>(IEnumerable<TSource>, Func<TSource, TKey>, Func<TSource, TElement>, 
    Func<TKey, IEnumerable<TElement>, TResult>)
GroupBy<TSource, TKey, TResult>(IEnumerable<TSource>, Func<TSource, TKey>, Func<TKey, IEnumerable<TSource>, TResult>, 
    IEqualityComparer<TKey>)
GroupBy<TSource, TKey, TElement, TResult>(IEnumerable<TSource>, Func<TSource, TKey>, Func<TSource, TElement>, Func<TKey, 
    IEnumerable<TElement>, TResult>, IEqualityComparer<TKey>)

Wow, that’s confusing. No wonder people get overwhelmed when seeing all these overloads, especially in a little Intellisense window. Once you break it down, though, it’s really not that bad. For now I’m going to cover the two GroupBy overloads I use most.

First, let’s look at the easiest of all the overloads, GroupBy<TSource, TKey>(IEnumerable<TSource>, Func<TSource, TKey>). Let’s break down the parameters:

  • IEnumerable<TSource> is the source collection
  • Func<TSource, TKey> is the function to extract the key for each element in the collection

A Func is a delegate that takes a fixed number of parameters and returns a value. In this case there is one parameter of type TSource and a return type of TKey.
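To make the delegate idea concrete, here is a minimal sketch of a Func<string, int> used as a key selector. The names keySelector and words are made up for illustration, and the snippet assumes C# 9 or later so it can use top-level statements.

```csharp
using System;
using System.Linq;

// Func<string, int>: one parameter of type string, return type int.
// keySelector and words are illustrative names, not part of LINQ.
Func<string, int> keySelector = word => word.Length;

string[] words = { "red", "blue", "tan", "gold" };

// The delegate can be invoked directly like a method...
int length = keySelector("green"); // 5

// ...or handed to GroupBy, which calls it once per element to get the key.
var byLength = words.GroupBy(keySelector);

foreach (var wordGroup in byLength)
{
    Console.WriteLine(wordGroup.Key + ": " + string.Join(", ", wordGroup));
}
// Prints:
// 3: red, tan
// 4: blue, gold
```

Writing the lambda inline, as in words.GroupBy(word => word.Length), is equivalent; the compiler builds the same Func<string, int> for you.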

Let’s look at an example:

List<string> colors = new List<string>();
colors.Add("green");
colors.Add("blue");
colors.Add("yellow");
colors.Add("green");
colors.Add("yellow");
IEnumerable<IGrouping<string, string>> groupedColors = colors.GroupBy(c => c);

The GroupBy method returns an IEnumerable<IGrouping<TKey, TSource>>: a sequence of groups, where each group exposes its Key and is itself a collection of the elements that share that key. The result of the previous grouping is:

Key Collection
green { green, green }
blue { blue }
yellow { yellow, yellow }
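As a rough sketch, the table above can be reproduced by reading each group's Key and then enumerating the group itself. The snippet assumes C# 9 or later for top-level statements and uses a collection initializer in place of the Add calls.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var colors = new List<string> { "green", "blue", "yellow", "green", "yellow" };

// Each IGrouping<string, string> has a Key and is also an
// IEnumerable<string> of the elements that share that key.
IEnumerable<IGrouping<string, string>> groupedColors = colors.GroupBy(c => c);

// Groups come back in the order their keys first appear in the source.
foreach (IGrouping<string, string> colorGroup in groupedColors)
{
    Console.WriteLine(colorGroup.Key + " { " + string.Join(", ", colorGroup) + " }");
}
// Prints:
// green { green, green }
// blue { blue }
// yellow { yellow, yellow }
```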

Now, that’s great, but it doesn’t look all that useful yet. We have a grouping with a color as the key and any number of the same colors as the collection. Let’s look at another example with a little bit more complexity:

public class Car
{
    public string Make { get; set; }
    public string Model { get; set; }
    public string Color { get; set; }
}

List<Car> cars = new List<Car>();
cars.Add(new Car { Make = "Honda", Model = "Accord", Color = "blue" });
cars.Add(new Car { Make = "Dodge", Model = "Caravan", Color = "green" });
cars.Add(new Car { Make = "Ford", Model = "Crown Victoria", Color = "red" });
cars.Add(new Car { Make = "Honda", Model = "Civic", Color = "blue" });
cars.Add(new Car { Make = "Dodge", Model = "Stratus", Color = "blue" });
cars.Add(new Car { Make = "Honda", Model = "Pilot", Color = "red" });

IEnumerable<IGrouping<string, Car>> carGroups =
    cars.GroupBy(c => c.Make);

The result of this grouping is:

Key Collection
Honda { Car { Make = "Honda", Model = "Accord", Color = "blue" }, Car { Make = "Honda", Model = "Civic", Color = "blue" }, Car { Make = "Honda", Model = "Pilot", Color = "red" } }
Dodge { Car { Make = "Dodge", Model = "Caravan", Color = "green" }, Car { Make = "Dodge", Model = "Stratus", Color = "blue" } }
Ford { Car { Make = "Ford", Model = "Crown Victoria", Color = "red" } }

Now that’s a more useful result. We have a collection of cars grouped by their make. To use it, we can loop through and, for instance, print it out to the console:

foreach(IGrouping<string, Car> carGroup in carGroups)
{
    // The key is the make shared by every car in this group.
    Console.WriteLine(string.Format("Key (Make): {0}", carGroup.Key));
    foreach(var car in carGroup)
    {
        // Indent each model with a tab under its make.
        Console.WriteLine(string.Format("\tModel: {0}", car.Model));
    }
}
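Because each group is itself a sequence of cars, other LINQ operators can be applied to it inside the loop. The following is a minimal sketch under a few assumptions: anonymous objects stand in for the Car class so the snippet is self-contained, and top-level statements require C# 9 or later.

```csharp
using System;
using System.Linq;

// Anonymous objects stand in for the Car class from the article.
var cars = new[]
{
    new { Make = "Honda", Model = "Accord", Color = "blue" },
    new { Make = "Dodge", Model = "Caravan", Color = "green" },
    new { Make = "Ford", Model = "Crown Victoria", Color = "red" },
    new { Make = "Honda", Model = "Civic", Color = "blue" },
    new { Make = "Dodge", Model = "Stratus", Color = "blue" },
    new { Make = "Honda", Model = "Pilot", Color = "red" },
};

var carGroups = cars.GroupBy(c => c.Make);

foreach (var carGroup in carGroups)
{
    // Count and Select/Distinct work on a group like on any other sequence.
    int carCount = carGroup.Count();
    int colorCount = carGroup.Select(c => c.Color).Distinct().Count();
    Console.WriteLine(carGroup.Key + ": " + carCount + " car(s), " + colorCount + " color(s)");
}
// Prints:
// Honda: 3 car(s), 2 color(s)
// Dodge: 2 car(s), 2 color(s)
// Ford: 1 car(s), 1 color(s)
```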

Let’s look at the other overload I use the most:

GroupBy<TSource, TKey, TResult>(IEnumerable<TSource>, Func<TSource, TKey>, Func<TKey, IEnumerable<TSource>, TResult>)

This overload lets you pass a result selector function that is called once per group and turns each group into whatever result you want. Here’s a breakdown of the resultSelector Func:

  • TKey is the key of the grouping
  • IEnumerable<TSource> is the sequence of elements that share that key
  • TResult is the return type of the selector
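To show how those three type parameters map onto concrete types, here is a small sketch that declares the result selector as an explicitly typed Func before passing it to GroupBy. The names words and describeGroup are hypothetical, and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

string[] words = { "apple", "avocado", "banana", "blueberry", "cherry" };

// TKey = char, IEnumerable<TSource> = IEnumerable<string>, TResult = string.
// describeGroup is an illustrative name.
Func<char, IEnumerable<string>, string> describeGroup =
    (key, items) => key + ": " + string.Join(", ", items);

// Group by first letter, then let describeGroup turn each group into one string.
List<string> lines = words.GroupBy(w => w[0], describeGroup).ToList();

foreach (string line in lines)
{
    Console.WriteLine(line);
}
// Prints:
// a: apple, avocado
// b: banana, blueberry
// c: cherry
```

In everyday code the selector is usually written inline as a lambda; spelling out the Func type here is only meant to make the roles of TKey, TSource, and TResult visible.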

Let’s look at an example using our colors list:

List<string> colors = new List<string>();
colors.Add("green");
colors.Add("blue");
colors.Add("yellow");
colors.Add("green");
colors.Add("yellow");
// The result selector returns an anonymous type, so the result must be declared with var.
var groupedColors =
    colors.GroupBy(
        c => c,
        (key, result) => new { Color = key, Count = result.Count() }
    );

foreach(var group in groupedColors)
{
    Console.WriteLine(string.Format("Key (Color): {0}\tCount: {1}", group.Color, group.Count));
}

As you can see, we’re grouping the colors by name, and then using a result selector to return an anonymous object with properties for the color and the count of each color. In the example, the output of our console application would be:

Key (Color): green    Count: 2
Key (Color): blue     Count: 1
Key (Color): yellow   Count: 2

We can easily see how many of each color we have without having to write some loop code with a counter and a Dictionary.
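For comparison, here is a sketch of what the hand-written loop might look like next to the GroupBy version. The variable names are illustrative, the ToDictionary call is added only to make the two results easy to compare, and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var colors = new List<string> { "green", "blue", "yellow", "green", "yellow" };

// Manual version: a Dictionary holding one counter per color.
var manualCounts = new Dictionary<string, int>();
foreach (string color in colors)
{
    if (manualCounts.TryGetValue(color, out int current))
    {
        manualCounts[color] = current + 1;
    }
    else
    {
        manualCounts[color] = 1;
    }
}

// GroupBy version: the same counts from a single expression.
var groupedCounts = colors
    .GroupBy(c => c, (key, items) => new { Color = key, Count = items.Count() })
    .ToDictionary(x => x.Color, x => x.Count);

// Both approaches agree: green = 2, blue = 1, yellow = 2.
foreach (string color in colors.Distinct())
{
    Console.WriteLine(color + ": manual = " + manualCounts[color] + ", GroupBy = " + groupedCounts[color]);
}
```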

I have found that GroupBy comes in handy in many, many cases, and is a staple in my LINQ toolbag.