Best LINQ questions in September 2011

Does the order of LINQ functions matter?

50 votes

Basically, as the question states... does the order of LINQ functions matter in terms of performance? Obviously the results would have to be identical still...

Example:

myCollection.OrderBy(item => item.CreatedDate).Where(item => item.Code > 3);
myCollection.Where(item => item.Code > 3).OrderBy(item => item.CreatedDate);

Both return the same results, but apply the LINQ operators in a different order. I realize that reordering some operators will produce different results, and I'm not concerned about those cases. My main concern is whether, when the results are the same, the ordering can affect performance. And not just for the two LINQ calls I made (OrderBy, Where), but for any LINQ calls.

It will depend on the LINQ provider in use. For LINQ to Objects, that could certainly make a huge difference. Assume we've actually got:

var query = myCollection.OrderBy(item => item.CreatedDate)
                        .Where(item => item.Code > 3);

var result = query.Last();

That requires the whole collection to be sorted and then filtered. If we had a million items, only one of which had a code greater than 3, we'd be wasting a lot of time ordering results which would be thrown away.

Compare that with the reversed operation, filtering first:

var query = myCollection.Where(item => item.Code > 3)
                        .OrderBy(item => item.CreatedDate);

var result = query.Last();

This time we're only ordering the filtered results, which in the sample case of "just a single item matching the filter" will be a lot more efficient - both in time and space.
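One rough way to see the difference in LINQ to Objects is to count how often the OrderBy key selector runs in each ordering. The sketch below is only an illustration: the `Item` class, the `OperatorOrderDemo` name and the test data are assumptions, not part of the original question.

```csharp
using System;
using System.Linq;

public static class OperatorOrderDemo
{
    // Hypothetical item type standing in for the question's collection elements.
    public sealed class Item
    {
        public int Code { get; set; }
        public DateTime CreatedDate { get; set; }
    }

    // Returns how many times the OrderBy key selector ran for each ordering
    // when exactly one of `size` items passes the filter (size must be >= 1).
    public static (int SortFirst, int FilterFirst) CountKeySelectorCalls(int size)
    {
        var start = new DateTime(2011, 9, 1);
        var items = Enumerable.Range(0, size)
            .Select(i => new Item
            {
                Code = i == size - 1 ? 4 : 0,   // only the final item has Code > 3
                CreatedDate = start.AddMinutes(i)
            })
            .ToList();

        int sortFirst = 0;
        int filterFirst = 0;

        // Sort everything, then filter: the key selector sees every item.
        items.OrderBy(item => { sortFirst++; return item.CreatedDate; })
             .Where(item => item.Code > 3)
             .ToList();

        // Filter, then sort: the key selector only sees the items that survived.
        items.Where(item => item.Code > 3)
             .OrderBy(item => { filterFirst++; return item.CreatedDate; })
             .ToList();

        return (sortFirst, filterFirst);
    }
}
```

Calling `OperatorOrderDemo.CountKeySelectorCalls(1000000)` should report far fewer key selector calls for the filter-first query, and the sort itself has correspondingly less work to do.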

It also could make a difference in whether the query executes correctly or not. Consider:

var query = myCollection.Where(item => item.Code != 0)
                        .OrderBy(item => 10 / item.Code);

var result = query.Last();

That's fine - we know we'll never be dividing by 0. But if we perform the ordering before the filtering, the query will throw an exception.
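A small sketch of that failure mode, assuming `Code` is an `int` and using a plain `int[]` in place of the collection (the `DivideOrderDemo` class and its method names are made up for illustration):

```csharp
using System;
using System.Linq;

public static class DivideOrderDemo
{
    // Filter out zeros before the key selector divides by the code.
    public static int FilterThenOrder(int[] codes)
    {
        return codes.Where(code => code != 0)
                    .OrderBy(code => 10 / code)
                    .Last();
    }

    // Same operators, reversed: the key selector now runs on the zero as well.
    // Returns true if enumerating the query throws DivideByZeroException.
    public static bool OrderThenFilterThrows(int[] codes)
    {
        try
        {
            codes.OrderBy(code => 10 / code)
                 .Where(code => code != 0)
                 .Last();
            return false;
        }
        catch (DivideByZeroException)
        {
            return true;
        }
    }
}
```

With `new[] { 2, 0, 5 }`, `FilterThenOrder` returns 2 (the element with the largest key, 10 / 2), while `OrderThenFilterThrows` returns true. Note that the exception only surfaces when the query is enumerated, here by `Last()`, not when the query is built.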

How do I do a left outer join with Dynamic Linq?

19 votes

I am trying to mimic the left outer join here but using Dynamic LINQ extension methods. What I have:

public static IQueryable SelectMany(this IQueryable source, string selector, 
    string resultsSelector, params object[] values)
{
    if (source == null) throw new ArgumentNullException("source");
    if (selector == null) throw new ArgumentNullException("selector");

    // Parse the lambda 
    LambdaExpression lambda = DynamicExpression.ParseLambda(
        source.ElementType, null, selector, values);

    // Fix lambda by recreating to be of correct Func<> type in case
    // the expression parsed to something other than IEnumerable<T>.
    // For instance, an expression evaluating to List<T> would result
    // in a lambda of type Func<T, List<T>> when we need one of type
    // Func<T, IEnumerable<T>> in order to call SelectMany().
    Type inputType = source.Expression.Type.GetGenericArguments()[0];
    Type resultType = lambda.Body.Type.GetGenericArguments()[0];
    Type enumerableType = typeof(IEnumerable<>).MakeGenericType(resultType);
    Type delegateType = typeof(Func<,>).MakeGenericType(inputType, 
        enumerableType);
    lambda = Expression.Lambda(delegateType, lambda.Body, lambda.Parameters);

    ParameterExpression[] parameters = new ParameterExpression[] { 
        Expression.Parameter(source.ElementType, "outer"), 
        Expression.Parameter(resultType, "inner") 
    };

    LambdaExpression resultsSelectorLambda = DynamicExpression.ParseLambda(
        parameters, null, resultsSelector, values);

    // Create the new query 
    return source.Provider.CreateQuery(Expression.Call(typeof(Queryable), 
        "SelectMany", new Type[] { 
            source.ElementType, 
            resultType, 
            resultsSelectorLambda.Body.Type 
        }, source.Expression, Expression.Quote(lambda), 
        Expression.Quote(resultsSelectorLambda)));            
}

and:

public static IQueryable GroupJoin(this IQueryable outer, IEnumerable inner,
    string outerKeySelector, string innerKeySelector, string resultSelector, 
    params object[] values)
{
    Type innerElementType = inner.AsQueryable().ElementType;

    var outerParameter = Expression.Parameter(outer.ElementType, "outer");
    var innerParameter = Expression.Parameter(innerElementType, "inner");
    var groupParameter = Expression.Parameter(typeof(IEnumerable<>)
        .MakeGenericType(innerElementType), "group");

    var outerLambda = DynamicExpression.ParseLambda(new[] { outerParameter },
        null, outerKeySelector, values);
    var innerLambda = DynamicExpression.ParseLambda(new[] { innerParameter },
        outerLambda.Body.Type, innerKeySelector, values);
    var resultLambda = DynamicExpression.ParseLambda(new[] { 
        outerParameter, groupParameter }, null, resultSelector, values);

    return outer.Provider.CreateQuery(Expression.Call(typeof(Queryable), 
        "GroupJoin", new[] { outer.ElementType, innerElementType, 
        outerLambda.Body.Type, resultLambda.Body.Type },
        outer.Expression, Expression.Constant(inner),
        Expression.Quote(outerLambda), Expression.Quote(innerLambda),
        Expression.Quote(resultLambda)));
}

However, where I fall down is with the DefaultIfEmpty within the SelectMany.

Add void DefaultIfEmpty(); to the IEnumerableSignatures interface.

Then use:

public static object DefaultIfEmpty(this IQueryable source)
{
    if (source == null) throw new ArgumentNullException("source");
    return source.Provider.Execute(
        Expression.Call(
            typeof(Queryable), "DefaultIfEmpty",
            new Type[] { source.ElementType },
            source.Expression));
}

Then you have a call like:

var qry = Foo
    .GroupJoin(Bar, "outer.Id", "inner.Id",
        "new(outer.Id as Foo, group as Bars)")
    .SelectMany("Bars.DefaultIfEmpty()",
        "new(outer.Foo as Foo, inner as Bar)");
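For comparison, the statically typed shape that this dynamic call is imitating might look roughly like the sketch below. It runs against in-memory sequences only. The `Foo` and `Bar` classes, the `Name` property and the `LeftJoinDemo` helper are hypothetical stand-ins, not taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LeftJoinDemo
{
    // Hypothetical element types standing in for Foo and Bar.
    public sealed class Foo { public int Id { get; set; } }
    public sealed class Bar { public int Id { get; set; } public string Name { get; set; } }

    // Returns one (Foo id, Bar name) pair per match, plus a pair with a null
    // name for every Foo that has no matching Bar (the "left outer" rows).
    public static List<Tuple<int, string>> LeftJoin(
        IEnumerable<Foo> foos, IEnumerable<Bar> bars)
    {
        return foos
            // Pair each outer element with the group of matching inner elements.
            .GroupJoin(bars, outer => outer.Id, inner => inner.Id,
                       (outer, matches) => new { Foo = outer.Id, Bars = matches })
            // DefaultIfEmpty yields a single null when a group has no matches,
            // so the outer element still produces a row.
            .SelectMany(x => x.Bars.DefaultIfEmpty(),
                        (x, inner) => Tuple.Create(x.Foo, inner == null ? null : inner.Name))
            .ToList();
    }
}
```

Joining Foos with Ids 1 and 2 against a single Bar with Id 1 yields two rows: one carrying the matching Bar's name and one carrying null for the unmatched Foo.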

Get current enumerator (iterator ?) in LINQ query. Like a current index in for loops

4 votes

Is it possible to get the current enumerator (...or iterator? I don't know which term is the correct one) in a LINQ query?

For example, I am trying to create an XML output (via LINQ to XML) of all currently loaded assemblies.

Dim xmldoc As XDocument = New XDocument(
        New XElement("Types",
        Assembly.GetExecutingAssembly().GetReferencedAssemblies() _
                .Select(Function(name) Assembly.Load(name)) _
                .SelectMany(Function(assembly) assembly.GetTypes()) _
                .Select(Function(type) New XElement("Type", type.FullName))))

The output looks like this:

<Types>
  <Type>System.Object</Type>
  <Type>FXAssembly</Type>
  <Type>ThisAssembly</Type>
  <Type>AssemblyRef</Type>
  <Type>System.Runtime.Serialization.ISerializable</Type>
  <Type>System.Runtime.InteropServices._Exception</Type>
  .....
</Types>

Is it possible to somehow get the current "index" (counter?) from LINQ's Selects? I would like to use it in the XML:

<Types>
  <Type ID="1">System.Object</Type>
  <Type ID="2">FXAssembly</Type>
  <Type ID="3">ThisAssembly</Type>
  <Type ID="4">AssemblyRef</Type>
  <Type ID="or-some-other-unique-id-#5">System.Runtime.Serialization.ISerializable</Type>
      .....
</Types>

Yup - you just need to use the overload of Select which takes a Func<TSource, int, TResult>; the int is the zero-based index of each element. So in C# it would be something like:

XDocument doc = new XDocument(new XElement("Types",
    Assembly.GetExecutingAssembly().GetReferencedAssemblies()
        .Select(name => Assembly.Load(name))
        .SelectMany(assembly => assembly.GetTypes())
        .Select((type, index) => new XElement("Type",
                                              new XAttribute("ID", index + 1), 
                                              type.FullName))));

Sorry it's not in VB, but it's more likely to work this way - hopefully you can work out the translation :)
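To try the indexed overload without loading any assemblies, a minimal self-contained sketch over a plain list of names could look like this (the `IndexedSelectDemo` class and `BuildTypesElement` method are illustrative names only):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class IndexedSelectDemo
{
    // Builds <Types> with one <Type ID="n"> child per name, numbering from 1.
    public static XElement BuildTypesElement(IEnumerable<string> typeNames)
    {
        return new XElement("Types",
            typeNames.Select((name, index) =>
                new XElement("Type",
                             new XAttribute("ID", index + 1), // index is zero-based
                             name)));
    }
}
```

For example, `IndexedSelectDemo.BuildTypesElement(new[] { "System.Object", "FXAssembly" })` produces a `Types` element whose two children carry ID attributes 1 and 2.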