Best linq questions in January 2011

How to combine 2 lists using LINQ?

25 votes

Env.: .NET4 C#

Hi All,

I want to combine these 2 lists : { "A", "B", "C", "D" } and { "1", "2", "3" }

into this one:

{ "A1", "A2", "A3", "B1", "B2", "B3", "C1", "C2", "C3", "D1", "D2", "D3" }

Obviously, i could use nested loops. But I wonder if LINQ can help. As far as I understand, Zip() is not my friend in this case, right?

TIA,

Essentially, you want to generate a cartesian product and then concatenate the elements of each 2-tuple. This is easiest to do in query-syntax:

var cartesianConcat = from a in seq1
                      from b in seq2
                      select a + b;

Bad implementation of Enumerable.Single?

24 votes

I came across this implementation in Enumerable.cs by reflector.

public static TSource Single<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
{
    //check parameters
    TSource local = default(TSource);
    long num = 0L;
    foreach (TSource local2 in source)
    {
        if (predicate(local2))
        {
            local = local2;
            num += 1L;
            //I think they should do something here like:
            //if (num >= 2L) throw Error.MoreThanOneMatch();
            //no necessary to continue
        }
    }
    //return different results by num's value
}

I think they should break the loop if there are more than 2 items meets the condition, why they always loop through the whole collection? In case of that reflector disassembles the dll incorrectly, I write a simple test:

class DataItem
{
   private int _num;
   public DataItem(int num)
   {
      _num = num;
   }

   public int Num
   {
      get{ Console.WriteLine("getting "+_num); return _num;}
   }
} 
var source = Enumerable.Range(1,10).Select( x => new DataItem(x));
var result = source.Single(x => x.Num < 5);

For this test case, I think it will print "getting 0, getting 1" and then throw an exception. But the truth is, it keeps "getting 0... getting 10" and throws an exception. Is there any algorithmic reason they implement this method like this?

EDIT Some of you thought it's because of side effects of the predicate expression, after a deep thought and some test cases, I have a conclusion that side effects doesn't matter in this case. Please provide an example if you disagree with this conclusion.

Yes, I do find it slightly strange especially because the overload that doesn't take a predicate (i.e. works on just the sequence) does seem to have the quick-throw 'optimization'.


In the BCL's defence however, I would say that the InvalidOperation exception that Single throws is a boneheaded exception that shouldn't normally be used for control-flow. It's not necessary for such cases to be optimized by the library.

Code that uses Single where zero or multiple matches is a perfectly valid possibility, such as:

try
{
     var item = source.Single(predicate);
     DoSomething(item);
}

catch(InvalidOperationException)
{
     DoSomethingElseUnexceptional();    
}

should be refactored to code that doesn't use the exception for control-flow, such as (only a sample; this can be implemented more efficiently):

var firstTwo = source.Where(predicate).Take(2).ToArray();

if(firstTwo.Length == 1) 
{
    // Note that this won't fail. If it does, this code has a bug.
    DoSomething(firstTwo.Single()); 
}
else
{
    DoSomethingElseUnexceptional();
}

In other words, we should leave the use of Single to cases when we expect the sequence to contain only one match. It should behave identically to Firstbut with the additional run-time assertion that the sequence doesn't contain multiple matches. Like any other assertion, failure, i.e. cases when Single throws, should be used to represent bugs in the program (either in the method running the query or in the arguments passed to it by the caller).

This leaves us with two cases:

  1. The assertion holds: There is a single match. In this case, we want Single to consume the entire sequence anyway to assert our claim. There's no benefit to the 'optimization'. In fact, one could argue that the sample implementation of the 'optimization' provided by the OP will actually be slower because of the check on every iteration of the loop.
  2. The assertion fails: There are zero or multiple matches. In this case, we do throw later than we could, but this isn't such a big deal since the exception is boneheaded: it is indicative of a bug that must be fixed.

To sum up, if the 'poor implementation' is biting you performance-wise in production, either:

  1. You are using Single incorrectly.
  2. You have a bug in your program. Once the bug is fixed, this particular performance problem will go away.

EDIT: Clarified my point.

EDIT: Here's a valid use of Single, where failure indicates bugs in the calling code (bad argument):

public static User GetUserById(this IEnumerable<User> users, string id)
{
     if(users == null)
        throw new ArgumentNullException("users");

     // Perfectly fine if documented that a failure in the query
     // is treated as an exceptional circumstance. Caller's job 
     // to guarantee pre-condition.        
     return users.Single(user => user.Id == id);    
}