Best LINQ questions in January 2012

LINQ + Foreach vs Foreach + If

21 votes

I need to iterate over a List of objects, doing something only for the objects that have a boolean property set to true. I'm debating between this code

foreach (RouteParameter parameter in parameters.Where(p => p.Condition))
{
  //do something
}

and this code

foreach (RouteParameter parameter in parameters)
{ 
  if (!parameter.Condition)
    continue;
  //do something
}

The first code is obviously cleaner, but I suspect it's going to loop over the list twice - once for the query and once for the foreach. This won't be a huge list so I'm not overly concerned about performance, but the idea of looping twice just bugs me.

Question: Is there a clean/pretty way to write this without looping twice?

Jon Skeet sometimes does a live-action LINQ demo to explain how this works. Imagine you have three people on stage. On the left we have one guy who has a shuffled deck of cards. In the middle we have one guy who only passes along red cards, and on the right, we have a guy who wants cards.

The guy on the right pokes the guy in the middle. The guy in the middle pokes the guy on the left. The guy on the left hands the guy in the middle a card. If it is black, the guy in the middle throws it on the floor and pokes again until he gets a red card, which he then hands to the guy on the right. Then the guy on the right pokes the guy in the middle again.

This continues until the guy on the left runs out of cards.

The deck was not gone through from start to finish more than once. However, both the guy on the left and the guy in the middle handled 52 cards, and the guy on the right handled 26 cards. There were a total of 52 + 52 + 26 operations on cards, but the deck was only looped through once.

Your "LINQ" version and the "continue" version are the same thing; if you had

foreach(var card in deck)
{
    if (card.IsBlack) continue;
    // ... use card ...
}

then there are 52 operations that fetch each card from the deck, 52 operations that test to see if each card is black, and 26 operations that act on the red card. Same thing exactly.
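The single pass can be checked directly. The following is only an illustrative sketch, not part of the original answer: `SinglePassDemo`, `Deck`, `Fetched` and `CountRed` are made-up names, and treating even-numbered cards as "red" is an assumption made purely for the demo. It counts how many cards are pulled from the source while `Where` filters them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SinglePassDemo
{
    // Hypothetical counter: how many cards the "guy on the left" hands over.
    public static int Fetched;

    // Yields each card once and records every fetch.
    public static IEnumerable<int> Deck(int size)
    {
        for (int card = 0; card < size; card++)
        {
            Fetched++;
            yield return card;
        }
    }

    // Runs the filtered loop and returns how many cards reached the loop body.
    public static int CountRed(int size)
    {
        Fetched = 0;
        int used = 0;
        // Assumption for the demo: even-numbered cards are "red".
        foreach (int card in Deck(size).Where(c => c % 2 == 0))
        {
            used++;
        }
        return used;
    }

    public static void Main()
    {
        int used = CountRed(52);
        Console.WriteLine($"fetched={Fetched}, used={used}");
    }
}
```

For a 52-card deck this reports 52 fetches and 26 cards used: the source is walked once, not twice.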

Get the 4 largest numbers from a List<int> using a lambda expression

13 votes

This is my list:

List<int> numbers = new List<int> { 12, 5, -8, 4, 7, 28, 3, 22 };

How can I get the 4 largest numbers using a lambda? I need these: {28, 22, 12, 7}

Use:

var result = numbers.OrderByDescending(n => n).Take(4);

Should the UI layer be able to pass lambda expressions into the service layer instead of calling a specific method?

6 votes

The ASP.NET project I am working on has 3 layers: UI, BLL, and DAL. I wanted to know whether it is acceptable for the UI to pass a lambda expression to the BLL, or whether the UI should pass parameters and the service method should use those parameters to construct the lambda expression. Here is an example class showing both scenarios.

public class JobService 
{
    IRepository<Job> _repository;

    public JobService(IRepository<Job> repository) 
    {
        _repository = repository;
    }

    public Job GetJob(int jobID)
    {
        return _repository.Get(x => x.JobID == jobID).FirstOrDefault();
    }

    public IEnumerable<Job> Get(Expression<Func<Job, bool>> predicate)
    {
        return _repository.Get(predicate);
    }
}

For the above class is it acceptable for the UI to call the following:

JobService jobService = new JobService(new Repository<Job>());
Job job = jobService.Get(x => x.JobID == 1).FirstOrDefault();

or should it only be allowed to call GetJob(int jobID)?

This is a simple example; my general question is: should the UI layer be able to pass lambda expressions into the service layer instead of calling a specific method?

This is a judgement call based on the situation. It's not necessarily wrong to pass in a predicate like this. I think it should be considered a minor bad smell though.

If the passing in of a lambda expression allows you to reduce 6 methods down to 1, then it might be a good move. On the other hand, if you can just as easily pass in a simple type, then lambda syntax is a needless complication.

In the above example, not knowing the context, my preference would be to use a simple integer parameter. There should usually be a basic method that just gets a record by its ID, and maybe one or two other such methods that are used repeatedly throughout your application. Then, perhaps, a general-purpose method that takes a lambda.

You should also consider what some would suggest should be a rule: that you not have any methods with lambda-specified predicates between your UI and your business layer. (And some believe, with reason, that your repositories shouldn't even have such methods!) I don't believe this should be an iron-clad rule, but there are good reasons for it. Your business and data layers, between them, should keep dangerous queries from happening. If you allow lambdas to be passed in, it's very easy for a junior developer in the UI layer to specify queries that could really hose your database. (For example, they'll run huge queries against non-indexed fields, and/or filter the result set using LINQ-to-Objects, and not realize how inefficient that is.)

Like many other good practices, this will depend somewhat on scope. In my recent large application, I have no passing of lambda syntax from the UI layer to the business layer. My plan was to invest heavily in the business layer, to make it very smart. It has all the needed methods with simple types. In fact, it typically gives you what you need through simple domain object properties, with no parameters at all. My interface assures that the UI can only cause efficient queries to happen, with perhaps just minor LINQ-to-Objects predicates in the UI layer. (This goes even for "search" pages. My business layer accepts a criteria object with constrained possibilities, and ensures an efficient query.)
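A criteria object of the kind described above might look roughly like the following. This is a hypothetical sketch, not the answerer's actual code: `JobSearchCriteria`, `JobSearchService`, the `Status` property and the result cap of 100 are all assumptions introduced for illustration. The point is that the UI fills in a small set of constrained fields, and the service alone decides how they become a query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical domain type; the original post only shows JobID.
public class Job
{
    public int JobID { get; set; }
    public string Status { get; set; }
}

// Hypothetical criteria object: the UI can only set the fields offered here.
public class JobSearchCriteria
{
    public string Status { get; set; }
    public int MaxResults { get; set; } = 50;
}

public class JobSearchService
{
    private readonly IQueryable<Job> _jobs;

    public JobSearchService(IQueryable<Job> jobs)
    {
        _jobs = jobs;
    }

    // The service, not the UI, decides how criteria become a query.
    public List<Job> Search(JobSearchCriteria criteria)
    {
        IQueryable<Job> query = _jobs;
        if (!string.IsNullOrEmpty(criteria.Status))
        {
            query = query.Where(j => j.Status == criteria.Status);
        }
        // Assumed cap: callers can never request more than 100 rows.
        int limit = Math.Min(Math.Max(criteria.MaxResults, 1), 100);
        return query.OrderBy(j => j.JobID).Take(limit).ToList();
    }
}
```

The UI would then call something like `service.Search(new JobSearchCriteria { Status = "Open" })` and never see an `Expression<Func<Job, bool>>`.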

Now, you said "layers" rather than "tiers". So are these all in the same assembly? Another disadvantage of lambdas is that they're (currently) difficult to serialize, so you'd regret them if you had to separate your tiers.

Remove specific nodes under the XML root?

5 votes

My XML is below:

<XML ID="Microsoft Search Thesaurus">
 <thesaurus xmlns="x-schema:tsSchema.xml">
   <diacritics_sensitive>1</diacritics_sensitive>
   <expansion>
     <sub>Internet Explorer</sub>
     <sub>IE</sub>
     <sub>IE5</sub>
   </expansion>
   <expansion>
     <sub>run</sub>
     <sub>jog</sub>
   </expansion>
 </thesaurus>
</XML>

I want to remove the "expansion" nodes from the XML. After removal, it should look like this:

<XML ID="Microsoft Search Thesaurus">
 <thesaurus xmlns="x-schema:tsSchema.xml">

 </thesaurus>
</XML>

My code is below:

XDocument tseng = XDocument.Load("C:\\tseng.xml");
XElement root = tseng.Element("XML").Element("thesaurus");
root.Remove();
tseng.Save("C:\\tseng.xml");

I got an error "Object reference not set to an instance of an object." for line "root.Remove()". How can I remove the "expansion" nodes from XML file? Thanks.

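The thesaurus element is in the "x-schema:tsSchema.xml" namespace, so Element("thesaurus") without a namespace finds nothing and returns null; that is where the exception comes from. Qualify the names with an XNamespace.

This removes only the expansion elements: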

XNamespace ns = "x-schema:tsSchema.xml";
tseng.Root.Element(ns + "thesaurus")
    .Elements(ns + "expansion").Remove();

This removes all children of thesaurus:

XNamespace ns = "x-schema:tsSchema.xml";
tseng.Root.Element(ns + "thesaurus").Elements().Remove();
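For a version that can be run without the file on disk, here is a small sketch that parses a shortened copy of the XML from a string. `ThesaurusCleaner` and `RemoveExpansions` are names invented for this example; the calls themselves (`XDocument.Parse`, `Element`, `Elements`, and the `Remove` extension on a sequence of nodes) are standard LINQ to XML.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ThesaurusCleaner
{
    // Removes every <expansion> element under <thesaurus>, leaving other children.
    public static XDocument RemoveExpansions(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        XNamespace ns = "x-schema:tsSchema.xml";
        // <XML> itself has no namespace; <thesaurus> and its children do.
        doc.Root.Element(ns + "thesaurus").Elements(ns + "expansion").Remove();
        return doc;
    }

    public static void Main()
    {
        string xml =
            "<XML ID=\"Microsoft Search Thesaurus\">" +
            "<thesaurus xmlns=\"x-schema:tsSchema.xml\">" +
            "<diacritics_sensitive>1</diacritics_sensitive>" +
            "<expansion><sub>run</sub><sub>jog</sub></expansion>" +
            "</thesaurus></XML>";
        Console.WriteLine(RemoveExpansions(xml));
    }
}
```

To apply the change to the original file, load it with XDocument.Load and call Save afterwards, as in the question.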

Substring of a DateTime with Linq to Sql extensions

5 votes

I'm trying to get a list of dates from my table, which contains a number of DateTime values in a column called StartTime. My predecessor was using the following SQL:

SELECT DISTINCT SUBSTRING(CONVERT(VARCHAR(50), StartTime, 120), 1, 10)

This results in a distinct list of dates in "yyyy-MM-dd" format for each row in the table. I'm trying to convert this to Linq-to-SQL by doing the following:

query.Select(o => o.StartTime.ToString("yyyy-MM-dd")).Distinct()

However this results in an error "Method 'System.String ToString(System.String)' has no supported translation to SQL."

How can I do this Substring/Convert using Linq-to-SQL?

Thanks!

Instead of relying on string processing, you can handle this via DateTime properties supported in LINQ to SQL:

var results = query.Select(o => o.StartTime.Date).Distinct();

If you later want to view these as strings, you can use LINQ to Objects to convert the results:

var stringResults = results.AsEnumerable().Select(d => d.ToString("yyyy-MM-dd"));
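The same pipeline can be tried without a database. In the sketch below an in-memory list wrapped with AsQueryable stands in for the table (an assumption for illustration; with LINQ to SQL the source would be the mapped table, and everything before AsEnumerable would be translated to SQL). `DistinctDates` and `Format` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class DistinctDates
{
    // Returns the distinct calendar dates in startTimes as "yyyy-MM-dd" strings.
    public static List<string> Format(IQueryable<DateTime> startTimes)
    {
        return startTimes
            .Select(t => t.Date)   // drop the time-of-day component
            .Distinct()
            .AsEnumerable()        // from here on, LINQ to Objects does the work
            .Select(d => d.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture))
            .ToList();
    }

    public static void Main()
    {
        // In-memory stand-in for the StartTime column.
        var times = new List<DateTime>
        {
            new DateTime(2012, 1, 5, 9, 0, 0),
            new DateTime(2012, 1, 5, 17, 30, 0),
            new DateTime(2012, 1, 6, 8, 0, 0)
        };
        foreach (string date in Format(times.AsQueryable()))
        {
            Console.WriteLine(date);
        }
    }
}
```

Keeping the string formatting after AsEnumerable is the design choice here: the provider only has to translate the Date property and Distinct, both of which the answer above relies on.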