Monday, October 19, 2009

Efficient checking whether IEnumerable contains data

Have you ever needed to check - in a LINQ context, or otherwise - whether an IEnumerable<T> (or plain IEnumerable) contained any elements?

Of course a check using the Count<T>() extension method to check for an element count of zero works here, but this would be... unfortunate... for sequences yielding large numbers of elements because of the O(n) linear behavior.

Maybe the following source code (or the LINQ Any() method, see comments) could be of use to you in those cases from now on:

/// <summary>Sequence contains at least 1 item?</summary>
/// <typeparam name="T">Type of elements</typeparam>
/// <param name="sequence">Sequence to check</param>
/// <returns>true/false</returns>
public static bool NotEmpty<T>(this IEnumerable<T> sequence)
{
  return sequence.GetEnumerator().MoveNext();
}

/// <summary>Sequence contains at least 1 item?</summary>
/// <param name="sequence">Sequence to check</param>
/// <returns>true/false</returns>
public static bool NotEmpty(this IEnumerable sequence)
{
  return sequence.GetEnumerator().MoveNext();
}

These extension methods check whether the given sequence contains elements or not, but does so taking only O(1) constant time. In other words: the entire sequence is not fully evaluated, but the minimal work is being done to check if the sequence contains at least one element.

6 comments:

  1. Nice Article... Although using Extension Methods are good, but if somebody doesn't want to do it, they could check the condition right in the if block..

    if(sequence.GetEnumerator().MoveNext())
    {
    // Process if sequence has data
    }

    --
    Preetham Reddy Chamakura
    www.preetham-reddy.com

    ReplyDelete
  2. That almost goes without saying, yes.

    However, I prefer the method (extension or normal) since it makes the code self-describing. Many developers would not immediately know what "sequence.GetEnumerator().MoveNext()" meant, I think. And that would be bad.

    Also, there is the advantage of encapsulation: what if your semantics of "not empty" need to functionally change in your application for whatever reason? Then you would be stuck with finding/remembering all those individual ifs - without missing a single one - while all I'd have to do is change my NotEmpty() method.

    ReplyDelete
  3. Maarten Oosterhoff23 November, 2009 09:43

    Doesn't IEnumerable<T> already implement IEnumerable? Seems to me you only need one method.

    ReplyDelete
  4. Or, I should use Any() when using LINQ: http://www.kodefuguru.com/post/2009/12/07/Any-versus-Count.aspx

    ReplyDelete
  5. Here's the Any() implementation:

    public static bool Any(this IEnumerable source)
    {
    if (source == null)
    {
    throw Error.ArgumentNull("source");
    }
    using (IEnumerator enumerator = source.GetEnumerator())
    {
    if (enumerator.MoveNext())
    {
    return true;
    }
    }
    return false;
    }

    ReplyDelete
  6. Great that this Any() implementation includes the parameter checking and using statement that I left as "an exercise to the reader" in my original post. ;-)

    ReplyDelete