Best .net questions in December 2011

Is it a good practice to have logger as a singleton?

40 votes

I had a habit of passing the logger to the constructor, like:

public class OrderService : IOrderService {
     public OrderService(ILogger logger) {
     }
}

But that is quite annoying, so I've used a property like this for some time:

private ILogger logger = NullLogger.Instance;
public ILogger Logger
{
    get { return logger; }
    set { logger = value; }
}

This is getting annoying too - it is not DRY, I need to repeat this in every class. I could use a base class, but then again - I'm using the Form class, so I would need FormBase, etc. So I think, what would be the downside of having a singleton with ILogger exposed, so everyone would know where to get the logger:

    Infrastructure.Logger.Info("blabla");

UPDATE: As Merlyn correctly noticed, I should have mentioned that in the first and second examples I am using DI.

This is getting annoying too - it is not DRY

That's true. But there is only so much you can do for a cross-cutting concern that pervades every type you have. You have to use the logger everywhere, so you must have the property on those types.

So let's see what we can do about it.

Singleton

Singletons are terrible <flame-suit-on>.

I recommend sticking with property injection as you've done with your second example. This is the best factoring you can do without resorting to magic. It is better to have an explicit dependency than to hide it via a singleton.

But if singletons save you significant time, including all refactoring you will ever have to do (crystal ball time!), I suppose you might be able to live with them. If ever there were a use for a Singleton, this might be it. Keep in mind the cost if you ever want to change your mind will be about as high as it gets.

If you do this, check out other people's answers using the Registry pattern (see the description), and those registering a (resettable) singleton factory rather than a singleton logger instance.
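A minimal sketch of the resettable factory idea, assuming the ILogger and NullLogger shapes from the question. LoggerRegistry, SetFactory, Reset and GetLogger are hypothetical names invented for illustration, not part of any logging library, and the sketch makes no attempt at thread-safe swapping:

```csharp
using System;

// Stand-ins for the question's logging types (assumed shapes, not a real library).
public interface ILogger
{
    void Info(string message);
}

public sealed class NullLogger : ILogger
{
    public static readonly NullLogger Instance = new NullLogger();
    private NullLogger() { }
    public void Info(string message) { /* intentionally does nothing */ }
}

// Hypothetical registry: callers ask for a logger per type, and the factory
// behind it can be swapped (for example in tests) and reset afterwards.
public static class LoggerRegistry
{
    private static readonly Func<Type, ILogger> DefaultFactory = t => NullLogger.Instance;
    private static Func<Type, ILogger> factory = DefaultFactory;

    public static void SetFactory(Func<Type, ILogger> newFactory)
    {
        if (newFactory == null) throw new ArgumentNullException("newFactory");
        factory = newFactory;
    }

    // Restores the default (null-logger) factory.
    public static void Reset()
    {
        factory = DefaultFactory;
    }

    public static ILogger GetLogger(Type owner)
    {
        return factory(owner);
    }
}
```

A class would then call something like LoggerRegistry.GetLogger(typeof(OrderService)).Info("blabla"). The dependency is still hidden, which is the cost described above, but at least a test can replace the factory and reset it afterwards.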

There are other alternatives that might work just as well without as much compromise, so you should check them out first.

Visual Studio code snippets

You could use Visual Studio code snippets to speed up the entry of that repetitive code. You will be able to type something like logger followed by Tab, and the code will magically appear for you.

Using AOP to DRY off

You could eliminate a little bit of that property injection code by using an Aspect Oriented Programming (AOP) framework like PostSharp to auto-generate some of it.

It might look something like this when you're done:

[InjectedLogger]
public ILogger Logger { get; set; }

You could also use their method tracing sample code to automatically trace method entry and exit, which might eliminate the need to add some of the logger properties altogether. You could apply the attribute at a class level, or namespace wide:

[Trace]
public class MyClass
{
    // ...
}

// or

#if DEBUG
[assembly: Trace( AttributeTargetTypes = "MyNamespace.*",
    AttributeTargetTypeAttributes = MulticastAttributes.Public,
    AttributeTargetMemberAttributes = MulticastAttributes.Public )]
#endif

Difference between the implementation of var in Javascript and C#

27 votes

I would like to ask a theoretical question. If I have, for example, the following C# code in Page_load:

cars = new carsModel.carsEntities();

var mftQuery = from mft in cars.Manufacturers 
               where mft.StockHeaders.Any(sh=> sh.StockCount>0) 
               orderby mft.CompanyName 
               select new {mft.CompanyID, mft.CompanyName};
               // ...

Questions:

  1. This code uses the var keyword. What is the benefit of this construct?
  2. What is the key difference between the implementation of var in Javascript and C#?

JavaScript is a dynamically typed language, while C# is (usually) a statically typed language. As a result, comparisons like this will always be problematic. But:

JavaScript's var keyword is somewhat similar to C#'s dynamic keyword. Both create a variable whose type will not be known until runtime, and whose misuse will not be discovered until runtime. This is the way JavaScript always is, but this behavior is brand new to C# 4.

dynamic foo = new DateTime();
foo.bar();  //compiles fine but blows up at runtime.

JavaScript has nothing to match C#'s var, since JavaScript is a dynamically typed language, and C#'s var, despite popular misconception, creates a variable whose type is known at compile time. C#'s var serves two purposes: to declare variables whose type is a pain to write out, and to create variables that are of an anonymous type, and therefore have no type that can be written out by the developer.

For an example of the first:

var conn = new System.Data.SqlClient.SqlConnection("....");

Anonymous type projections from Linq-to-Sql or Entity Framework are a good example of the second:

var results = context.People.Where(p => p.Name == "Adam")
                            .Select(p => new { p.Name, p.Address });

Here results is of type IQueryable<SomeTypeTheCompilerCreatedOnTheFly>. No matter how much you might like to write out the actual type of results, instead of just writing var, there's no way to, since you have no knowledge of the type that the compiler is creating under the covers for your anonymous type. Hence the terminology: this type is anonymous.

In both cases the type is known at compile time, and in both cases, subsequently saying either

conn = new DateTime();

or

results = new DateTime();

would result in a compiler error, since you're setting conn and results to a type that's not compatible with what they were declared as.
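To make the contrast concrete, here is a small sketch that puts the two keywords side by side. The class and method names are made up for illustration; RuntimeBinderException is the exception the C# runtime binder throws when a dynamic member lookup fails:

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

public static class VarVersusDynamic
{
    // var: the compiler infers string here, so text.Length is checked at compile time.
    public static int StaticallyTyped()
    {
        var text = "hello";
        return text.Length;
    }

    // dynamic: the call to bar() is only resolved at runtime, and fails there
    // because DateTime has no such member.
    public static bool DynamicCallFailsAtRuntime()
    {
        dynamic foo = new DateTime();
        try
        {
            foo.bar();
            return false;
        }
        catch (RuntimeBinderException)
        {
            return true;
        }
    }
}
```

Writing text.bar() in the first method would not compile at all, which is the whole difference: var only saves typing, while dynamic defers the check.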

C# Why can equal decimals produce unequal hash values?

26 votes

We ran into a magic decimal number that broke our hashtable. I boiled it down to the following minimal case:

decimal d0 = 295.50000000000000000000000000m;
decimal d1 = 295.5m;

Console.WriteLine("{0} == {1} : {2}", d0, d1, (d0 == d1));
Console.WriteLine("0x{0:X8} == 0x{1:X8} : {2}", d0.GetHashCode(), d1.GetHashCode()
                  , (d0.GetHashCode() == d1.GetHashCode()));

Giving the following output:

295.50000000000000000000000000 == 295.5 : True
0xBF8D880F == 0x40727800 : False

What is really peculiar: change, add or remove any of the digits in d0 and the problem goes away. Even adding or removing one of the trailing zeros! The sign doesn't seem to matter though.

Our fix is to divide the value to get rid of the trailing zeroes, like so:

decimal d0 = 295.50000000000000000000000000m / 1.000000000000000000000000000000000m;

But my question is, how is C# doing this wrong?

To start with, C# isn't doing anything wrong at all. This is a framework bug.

It does indeed look like a bug though - basically whatever normalization is involved in comparing for equality ought to be used in the same way for hash code computation. I've checked and can reproduce it too (using .NET 4) including checking the Equals(decimal) and Equals(object) methods as well as the == operator.

It definitely looks like it's the d0 value which is the problem, as adding trailing 0s to d1 doesn't change the results (until it's the same as d0 of course). I suspect there's some corner case tripped by the exact bit representation there.

I'm surprised it isn't (and as you say, it works most of the time), but you should report the bug on Connect.
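Until the framework is fixed, one possible workaround is to give the hashtable a comparer that hashes a normalized copy of the value. This is only a sketch: NormalizedDecimalComparer is a hypothetical name, and it relies on the same division trick the question uses to strip trailing zeros, which I am assuming behaves the same way for other values.

```csharp
using System.Collections.Generic;

// Hypothetical comparer: hashes a normalized copy of the value so that
// equal decimals with different scales produce the same hash code.
public sealed class NormalizedDecimalComparer : IEqualityComparer<decimal>
{
    // 1 followed by 28 decimal zeros; dividing by it is the same idea as the
    // division workaround shown in the question.
    private const decimal One = 1.0000000000000000000000000000m;

    public bool Equals(decimal x, decimal y)
    {
        return x == y;
    }

    public int GetHashCode(decimal value)
    {
        return (value / One).GetHashCode();
    }
}
```

It would be passed to the collection's constructor, for example new Dictionary<decimal, string>(new NormalizedDecimalComparer()), so the call sites that create keys do not need to remember to normalize.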

Storing data into list with class

21 votes

I have the following class:

public class EmailData
{
    public string FirstName{ set; get; }
    public string LastName { set; get; }
    public string Location{ set; get; }
}

I then did the following, but it was not working properly:

List<EmailData> lstemail = new List<EmailData>(); 
lstemail.Add("JOhn","Smith","Los Angeles");

I get a message that says no overload for method takes 3 arguments.

If you want to instantiate and add in the same line, you'd have to do something like this:

lstemail.Add(new EmailData { FirstName = "JOhn", LastName = "Smith", Location = "Los Angeles" });

or just instantiate the object prior, and add it directly in:

EmailData data = new EmailData();
data.FirstName = "JOhn";
data.LastName = "Smith";
data.Location = "Los Angeles";

lstemail.Add(data);
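If you would rather keep the call site close to what you originally wrote, another option is to give the class a constructor. A sketch, where the three-argument constructor and the EmailDataExample helper are additions for illustration and not part of your original class:

```csharp
using System.Collections.Generic;

public class EmailData
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Location { get; set; }

    // Parameterless constructor keeps the object-initializer style working.
    public EmailData() { }

    // Hypothetical convenience constructor, not part of the original class.
    public EmailData(string firstName, string lastName, string location)
    {
        FirstName = firstName;
        LastName = lastName;
        Location = location;
    }
}

public static class EmailDataExample
{
    public static List<EmailData> Build()
    {
        // Collection initializer: each element is constructed, then added.
        return new List<EmailData>
        {
            new EmailData("John", "Smith", "Los Angeles")
        };
    }
}
```

Either way, List<EmailData>.Add takes exactly one EmailData, which is why passing three strings fails.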

Is it possible to observe a partially-constructed object from another thread?

20 votes

I've often heard that in the .NET 2.0 memory model, writes always use release fences. Is this true? Does this mean that even without explicit memory-barriers or locks, it is impossible to observe a partially-constructed object (considering reference-types only) on a thread different from the one on which it is created? I'm obviously excluding cases where the constructor leaks the this reference.

For example, let's say we had the immutable reference type:

public class Person
{
    public string Name { get; private set; }
    public int Age { get; private set; }

    public Person(string name, int age)
    {
        Name = name;
        Age = age;
    }
}

Would it be possible with the following code to observe any output other than "John 20" and "Jack 21", say "null 20" or "Jack 0" ?

// We could make this volatile to freshen the read, but I don't want
// to complicate the core of the question.
private Person person;

private void Thread1()
{
    while (true)
    {
        var personCopy = person;

        if (personCopy != null)
            Console.WriteLine(personCopy.Name + " " + personCopy.Age);
    }
}

private void Thread2()
{
    var random = new Random();

    while (true)
    {
        person = random.Next(2) == 0
            ? new Person("John", 20)
            : new Person("Jack", 21);
    }
}

Does this also mean that I can make all shared fields of deeply-immutable reference-types volatile and (in most cases) just get on with my work?

I've often heard that in the .NET 2.0 memory model, writes always use release fences. Is this true?

It depends on what model you are referring to.

First, let us precisely define a release-fence barrier. Release semantics stipulate that no other read or write appearing before the barrier in the instruction sequence is allowed to move after that barrier.

  • The ECMA specification has a relaxed model in which writes do not provide this guarantee.
  • It has been cited somewhere that the CLR implementation provided by Microsoft strengthens the model by making writes have release-fence semantics.
  • The x86 and x64 architectures strengthen the model by making writes release-fence barriers and reads acquire-fence barriers.

So it is possible that another implementation of the CLI (such as Mono) running on an esoteric architecture (like ARM which Windows 8 will now target) would not provide release-fence semantics on writes. Notice that I said it is possible, but not certain. But, between all of the memory models in play, such as the different software and hardware layers, you have to code for the weakest model if you want your code to be truly portable. That means coding against the ECMA model and not making any assumptions.

We should make a list of the memory model layers in play just to be explicit.

  • Compiler: The C# (or VB.NET or whatever) compiler can move instructions.
  • Runtime: Obviously the CLI runtime via the JIT compiler can move instructions.
  • Hardware: And of course the CPU and memory architecture comes into play as well.

Does this mean that even without explicit memory-barriers or locks, it is impossible to observe a partially-constructed object (considering reference-types only) on a thread different from the one on which it is created?

Yes (qualified): If the environment in which the application is running is obscure enough then it might be possible for a partially constructed instance to be observed from another thread. This is one reason why the double-checked locking pattern would be unsafe without using volatile. In reality, however, I doubt you would ever run into this, mostly because Microsoft's implementation of the CLI will not reorder instructions in this manner.

Would it be possible with the following code to observe any output other than "John 20" and "Jack 21", say "null 20" or "Jack 0" ?

Again, that is a qualified yes. But for the same reason as above I doubt you will ever observe such behavior.

Though, I should point out that because person is not marked as volatile it could be possible that nothing is printed at all because the reading thread may always see it as null. In reality, however, I bet that the Console.WriteLine call will cause the C# and JIT compilers to avoid the lifting operation that might otherwise move the read of person outside the loop. I suspect you are already well aware of this nuance.

Does this also mean that I can just make all shared fields of deeply-immutable reference-types volatile and (in most cases) get on with my work?

I do not know. That is a pretty loaded question. I am not comfortable answering either way without a better understanding of the context behind it. What I can say is that I typically avoid using volatile in favor of more explicit memory instructions such as the Interlocked operations, Thread.VolatileRead, Thread.VolatileWrite, and Thread.MemoryBarrier. Then again, I also try to avoid no-lock code altogether in favor of the higher level synchronization mechanisms such as lock.

Update:

One way I like to visualize things is to assume that the C# compiler, JITer, etc. will optimize as aggressively as possible. That means that Person.ctor might be a candidate for inlining (since it is simple) which would yield the following pseudocode.

Person ref = allocate space for Person
ref.Name = name;
ref.Age = age;
person = ref;
DoSomething(person);

And because writes have no release-fence semantics in the ECMA specification then the other reads & writes could "float" down past the assignment to person yielding the following valid sequence of instructions.

Person ref = allocate space for Person
person = ref;
person.Name = name;
person.Age = age;
DoSomething(person);

So in this case you can see that person gets assigned before it is initialized. This is valid because from the perspective of the executing thread the logical sequence remains consistent with the physical sequence. There are no unintended side-effects. But, for reasons that should be obvious, this sequence would be disastrous to another thread.
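For completeness, here is a sketch of publishing the object through a volatile field, which is the portable way to rule out that reordering. The C# specification gives volatile writes release semantics and volatile reads acquire semantics. PersonPublisher, Publish and Describe are hypothetical names, and Person is rewritten with readonly fields purely for illustration:

```csharp
using System;

public sealed class Person
{
    // readonly fields can only be assigned during construction.
    private readonly string name;
    private readonly int age;

    public Person(string name, int age)
    {
        this.name = name;
        this.age = age;
    }

    public string Name { get { return name; } }
    public int Age { get { return age; } }
}

public sealed class PersonPublisher
{
    // volatile: the write below is a release, so the constructor's writes
    // cannot be moved after the assignment that publishes the reference.
    private volatile Person person;

    public void Publish(string name, int age)
    {
        person = new Person(name, age);
    }

    public string Describe()
    {
        Person copy = person; // single volatile (acquire) read
        return copy == null ? null : copy.Name + " " + copy.Age;
    }
}
```

A reader that sees a non-null reference through Describe is then guaranteed to see the fully initialized Name and Age, on any conforming implementation and not just on x86.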

C# & .NET: stackalloc

19 votes

I have a few questions about the functionality of the stackalloc operator.

  1. How does it actually allocate? I thought it does something like:

    void* stackalloc(int sizeInBytes)
    {
        void* p = StackPointer (esp);
        StackPointer += sizeInBytes;
        if(StackPointer exceeds stack size)
            throw new StackOverflowException(...);
        return p;
    }
    

    But I have done a few tests, and I'm not sure that's how it works. We can't know exactly what it does and how it does it, but I want to know the basics.

  2. I thought that stack allocation (Well, I am actually sure about it) is faster than heap allocation. So why does this example:

     class Program
     {
         static void Main(string[] args)
         {
             Stopwatch sw1 = new Stopwatch();
             sw1.Start();
             StackAllocation();
             Console.WriteLine(sw1.ElapsedTicks);
    
             Stopwatch sw2 = new Stopwatch();
             sw2.Start();
             HeapAllocation();
             Console.WriteLine(sw2.ElapsedTicks);
         }
         static unsafe void StackAllocation()
         {
             for (int i = 0; i < 100; i++)
             {
                 int* p = stackalloc int[100];
             }
         }
         static void HeapAllocation()
         {
             for (int i = 0; i < 100; i++)
             {
                 int[] a = new int[100];
             }
         }
     }
    

give average results of ~280 ticks for stack allocation, and usually 0-1 ticks for heap allocation? (On my personal computer, Intel Core i7).

On the computer I am using now (Intel Core 2 Duo), the results make more sense than the previous ones (probably because Optimize code was not checked in VS): ~460 ticks for stack allocation, and about 380 ticks for heap allocation.

But this still doesn't make sense. Why is it so? I guess that the CLR notices that we don't use the array, so maybe it doesn't even allocate it?

A case where stackalloc is faster:

 private static volatile int _dummy; // just to avoid any optimisations
                                     // that have us measuring the wrong
                                     // thing. Especially since the difference
                                     // is more noticeable in a release build
                                     // (also more noticeable on a multi-core
                                     // machine than single- or dual-core).
 static void Main(string[] args)
 {
     System.Diagnostics.Stopwatch sw1 = new System.Diagnostics.Stopwatch();
     Thread[] threads = new Thread[20];
     sw1.Start();
     for(int t = 0; t != 20; ++t)
     {
        threads[t] = new Thread(DoSA);
        threads[t].Start();
     }
     for(int t = 0; t != 20; ++t)
        threads[t].Join();
     Console.WriteLine(sw1.ElapsedTicks);

     System.Diagnostics.Stopwatch sw2 = new System.Diagnostics.Stopwatch();
     threads = new Thread[20];
     sw2.Start();
     for(int t = 0; t != 20; ++t)
     {
        threads[t] = new Thread(DoHA);
        threads[t].Start();
     }
     for(int t = 0; t != 20; ++t)
        threads[t].Join();
     Console.WriteLine(sw2.ElapsedTicks);
     Console.Read();
 }
 private static void DoSA()
 {
    Random rnd = new Random(1);
    for(int i = 0; i != 100000; ++i)
        StackAllocation(rnd);
 }
 static unsafe void StackAllocation(Random rnd)
 {
    int size = rnd.Next(1024, 131072);
    int* p = stackalloc int[size];
    _dummy = *(p + rnd.Next(0, size));
 }
 private static void DoHA()
 {
    Random rnd = new Random(1);
    for(int i = 0; i != 100000; ++i)
        HeapAllocation(rnd);
 }
 static void HeapAllocation(Random rnd)
 {
    int size = rnd.Next(1024, 131072);
    int[] a = new int[size];
    _dummy = a[rnd.Next(0, size)];
 }

Important differences between this code and that in the question:

  1. We have several threads running. With stack allocation, they are allocating in their own stack. With heap allocation, they are allocating from a heap shared with other threads.

  2. Larger sizes allocated.

  3. Different sizes allocated each time (though I seeded the random generator to make the tests more deterministic). This makes heap fragmentation more likely to happen, making heap allocation less efficient than with identical allocations each time.

As well as this, it's also worth noting that stackalloc would often be used as an alternative to using fixed to pin an array on the heap. Pinning arrays is bad for heap performance (not just for that code, but also for other threads using the same heap), so the performance impact would be even greater then, if the claimed memory would be in use for any reasonable length of time.

While my code demonstrates a case where stackalloc gives a performance benefit, that in the question is probably closer to most cases where someone might eagerly "optimise" by using it. Hopefully the two pieces of code together show that while stackalloc can give a boost, it can also hurt performance a lot too.

Generally, you shouldn't even consider stackalloc unless you are going to need to use pinned memory for interacting with unmanaged code anyway, and it should be considered an alternative to fixed rather than an alternative to general heap allocation. Use in this case still requires caution, forethought before you start, and profiling after you finish.

Use in other cases could give a benefit, but it should be far down the list of performance improvements you would try.

Edit:

To answer part 1 of the question: stackalloc is conceptually much as you describe. It obtains a chunk of the stack memory, and then returns a pointer to that chunk. It doesn't check that the memory will fit as such, but rather if it attempts to obtain memory into the end of the stack - which is protected by .NET on thread creation - then this will cause the OS to return an exception to the runtime, which it then turns into a .NET managed exception. Much the same happens if you just allocate a single byte in a method with infinite recursion - unless the call got optimised to avoid that stack allocation (sometimes possible), then a single byte will eventually add up to enough to trigger the stack overflow exception.

Why is LINQ .Where(predicate).First() faster than .First(predicate)?

18 votes

I am doing some performance tests and noticed that a LINQ expression like

result = list.First(f => f.Id == i).Property

is slower than

result = list.Where(f => f.Id == i).First().Property

This seems counter intuitive. I would have thought that the first expression would be faster because it can stop iterating over the list as soon as the predicate is satisfied, whereas I would have thought that the .Where() expression might iterate over the whole list before calling .First() on the resulting subset. Even if the latter does short circuit it should not be faster than using First directly, but it is.

Below are two really simple unit tests that illustrate this. When compiled with optimisation on, TestWhereAndFirst is about 30% faster than TestFirstOnly on .NET and Silverlight 4. I have tried making the predicate return more results but the performance difference is the same.

Can anyone explain why .First(fn) is slower than .Where(fn).First()? I see a similar counter intuitive result with .Count(fn) compared to .Where(fn).Count().

private const int Range = 50000;

private class Simple
{
   public int Id { get; set; }
   public int Value { get; set; }
}

[TestMethod()]
public void TestFirstOnly()
{
   List<Simple> list = new List<Simple>(Range);
   for (int i = Range - 1; i >= 0; --i)
   {
      list.Add(new Simple { Id = i, Value = 10 });
   }

   int result = 0;
   for (int i = 0; i < Range; ++i)
   {
      result += list.First(f => f.Id == i).Value;
   }

   Assert.IsTrue(result > 0);
}

[TestMethod()]
public void TestWhereAndFirst()
{
   List<Simple> list = new List<Simple>(Range);
   for (int i = Range - 1; i >= 0; --i)
   {
      list.Add(new Simple { Id = i, Value = 10 });
   }

   int result = 0;
   for (int i = 0; i < Range; ++i)
   {
      result += list.Where(f => f.Id == i).First().Value;
   }

   Assert.IsTrue(result > 0);
}

I got the same results: where+first was quicker than first.

As Jon noted, Linq uses lazy evaluation so the performance should be (and is) broadly similar for both methods.

Looking in Reflector, First uses a simple foreach loop to iterate through the collection but Where has a variety of iterators specialised for different collection types (arrays, lists, etc.). Presumably this is what gives Where the small advantage.
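A quick way to convince yourself that both forms stop at the first match, so the difference really is only in how the iteration is done, is to count predicate calls. A small sketch; the class and method names are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstVersusWhereFirst
{
    // Number of predicate invocations before First(predicate) finds target.
    public static int CountCallsFirst(List<int> list, int target)
    {
        int calls = 0;
        list.First(x => { calls++; return x == target; });
        return calls;
    }

    // Same measurement for Where(predicate).First().
    public static int CountCallsWhereFirst(List<int> list, int target)
    {
        int calls = 0;
        list.Where(x => { calls++; return x == target; }).First();
        return calls;
    }
}
```

For a list containing 4, 3, 2, 1, 0 and a target of 2, both methods invoke the predicate three times, so neither form walks the rest of the list.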

Why is concurrent modification of arrays so slow?

16 votes

I was writing a program to illustrate the effects of cache contention in multithreaded programs. My first cut was to create an array of long and show how modifying adjacent items causes contention. Here's the program.

const long maxCount = 500000000;
const int numThreads = 4;
const int Multiplier = 1;
static void DoIt()
{
    long[] c = new long[Multiplier * numThreads];
    var threads = new Thread[numThreads];

    // Create the threads
    for (int i = 0; i < numThreads; ++i)
    {
        threads[i] = new Thread((s) =>
            {
                int x = (int)s;
                while (c[x] > 0)
                {
                    --c[x];
                }
            });
    }

    // start threads
    var sw = Stopwatch.StartNew();
    for (int i = 0; i < numThreads; ++i)
    {
        int z = Multiplier * i;
        c[z] = maxCount;
        threads[i].Start(z);
    }
    // Wait for 500 ms and then access the counters.
    // This just proves that the threads are actually updating the counters.
    Thread.Sleep(500);
    for (int i = 0; i < numThreads; ++i)
    {
        Console.WriteLine(c[Multiplier * i]);
    }

    // Wait for threads to stop
    for (int i = 0; i < numThreads; ++i)
    {
        threads[i].Join();
    }
    sw.Stop();
    Console.WriteLine();
    Console.WriteLine("Elapsed time = {0:N0} ms", sw.ElapsedMilliseconds);
}

I'm running Visual Studio 2010, program compiled in Release mode, .NET 4.0 target, "Any CPU", and executed in the 64-bit runtime without the debugger attached (Ctrl+F5).

That program runs in about 1,700 ms on my system, with a single thread. With two threads, it takes over 25 seconds. Figuring that the difference was cache contention, I set Multiplier = 8 and ran again. The result is 12 seconds, so contention was at least part of the problem.

Increasing Multiplier beyond 8 doesn't improve performance.

For comparison, a similar program that doesn't use an array takes only about 2,200 ms with two threads when the variables are adjacent. When I separate the variables, the two thread version runs in the same amount of time as the single-threaded version.

If the problem was array indexing overhead, you'd expect it to show up in the single-threaded version. It looks to me like there's some kind of mutual exclusion going on when modifying the array, but I don't know what it is.

Looking at the generated IL isn't very enlightening. Nor was viewing the disassembly. The disassembly does show a couple of calls to (I think) the runtime library, but I wasn't able to step into them.

I'm not proficient with windbg or other low-level debugging tools these days. It's been a really long time since I needed them. So I'm stumped.

My only hypothesis right now is that the runtime code is setting a "dirty" flag on every write. It seems like something like that would be required in order to support throwing an exception if the array is modified while it's being enumerated. But I readily admit that I have no direct evidence to back up that hypothesis.

Can anybody tell me what is causing this big slowdown?

You've got false sharing. I wrote an article about it here
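As a rough sketch of how the counters could be spread out, the helper below keeps every counter at least one cache line away from the others and from the start of the array. It rests on two assumptions that are mine rather than anything stated above: that cache lines are 64 bytes, and that the array's length field (read by every bounds check) sits directly in front of element 0. PaddedCounters, SlotIndex and Allocate are hypothetical names.

```csharp
public static class PaddedCounters
{
    // Assumption: 64-byte cache lines, i.e. 8 longs per line.
    public const int LongsPerCacheLine = 8;

    // Skips the first 8 longs (nearest to the array's length field) and then
    // places logical counters 64 bytes apart, so no two share a cache line.
    public static int SlotIndex(int counter)
    {
        return (counter + 1) * LongsPerCacheLine;
    }

    // Allocates enough room for the leading padding plus one line per counter.
    public static long[] Allocate(int counters)
    {
        return new long[(counters + 1) * LongsPerCacheLine];
    }
}
```

In the program from the question this would mean allocating c with Allocate(numThreads) and passing SlotIndex(i) to each thread instead of Multiplier * i. Whether that recovers the single-threaded timing is something to measure rather than assume.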

Are there risks to optimizing code in C#?

14 votes

In the build settings panel of VS2010 Pro, there is a CheckBox with the label "optimize code"... of course, I want to check it... but being unusually cautious, I asked my brother about it and he said that it is unchecked for debugging and that in C++ it can potentially do things that would break or bug the code... but he doesn't know about C#.

So my question is, can I check this box for my release build without worrying about it breaking my code? Second, if it can break code, when and why? Links to explanations welcome.

The CheckBox I am talking about.

You would normally use this option in a release build. It's safe and mainstream to do so. There's no reason to be afraid of releasing code with optimizations enabled. Enabling optimization can interfere with debugging which is a good reason to disable it for debug builds.

Boxing and Unboxing in String.Format(...) ... is the following rationalized?

14 votes

I was doing some reading regarding boxing/unboxing, and it turns out that if you do an ordinary String.Format() where you have a value type in your list of object[] arguments, it will cause a boxing operation. For instance, if you're trying to print out the value of an integer and do string.Format("My value is {0}",myVal), it will stick your myVal int in a box and run the ToString function on it.

Browsing around, I found this article.

It appears you can avoid the boxing penalty simply by doing the .ToString on the value type before handing it on to the string.Format function: string.Format("My value is {0}",myVal.ToString())

  1. Is this really true? I'm inclined to believe the author's evidence.
  2. If this is true, why doesn't the compiler simply do this for you? Maybe it's changed since 2006? Does anybody know? (I don't have the time/experience to do the whole IL analysis)

The compiler doesn't do this for you because string.Format takes a params Object[]. The boxing happens because of the conversion to Object.

I don't think the compiler tends to special case methods, so it won't remove boxing in cases like this.

Yes, in many cases it is true that the compiler won't do boxing if you call ToString() first. If it uses the implementation from Object I think it would still have to box.

Ultimately the string.Format parsing of the format string itself is going to be much slower than any boxing operation, so the overhead is negligible.
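A small sketch of what is going on, with made-up names. The first method shows that each conversion of an int to object produces a distinct box; the second shows that pre-formatting with ToString() gives the same text while handing string.Format a string instead of a boxed int:

```csharp
using System;

public static class BoxingDemo
{
    // Each conversion of an int to object allocates a separate box,
    // so the two references are never the same object.
    public static bool EachConversionBoxes(int value)
    {
        object first = value;
        object second = value;
        return !ReferenceEquals(first, second);
    }

    // Both calls produce the same text; the second passes a string,
    // so no int is boxed at the call site.
    public static bool SameOutput(int value)
    {
        string boxed = string.Format("My value is {0}", value);
        string preformatted = string.Format("My value is {0}", value.ToString());
        return boxed == preformatted;
    }
}
```

The output is identical either way, so this is purely an allocation question, and as said above it is rarely the part worth optimising.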

Does code-signing without strong-naming leave your app open to abuse?

13 votes

Trying to get my head around authenticode code-signing and strong-naming.

Am I right in thinking that if I code-sign an exe that references a few dlls (not strong named) that a malicious user could replace my DLLs and distribute the app in a way that appears as if it's signed by me, but is running their code?

Assuming that's true, it seems like you wouldn't really want to sign a .NET app without strong-naming the whole thing, otherwise you're giving people the ability to execute code under the guise of an app you wrote?

The reason I'm unsure, is that none of the articles I found online (including the MSDN doc about using SN+Authenticode) seem to mention this, and it seems like a fairly important point to understand (if I've understood correctly)?

Am I right in thinking that if I code-sign an exe that references a few dlls (not strong named) that a malicious user could replace my DLLs and distribute the app in a way that appears as if it's signed by me, but is running their code?

Yes, if the remainder of the DLLs are only signed and not strong named they can be replaced without .NET raising an exception. You could, inside the exe, verify the DLLs are signed by the same key as the exe. None of these approaches prevent someone from replacing your DLLs or the EXE.
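One reading of that in-exe check, sketched for strong-name keys only: compare the public key token recorded in each assembly's name. StrongNameCheck and HaveSamePublicKeyToken are hypothetical names. This compares metadata and does not re-validate any signature, and anyone able to replace files could patch the check out as well.

```csharp
using System.Linq;
using System.Reflection;

public static class StrongNameCheck
{
    // True only when both assemblies carry a non-empty, identical public key token.
    public static bool HaveSamePublicKeyToken(Assembly first, Assembly second)
    {
        byte[] a = first.GetName().GetPublicKeyToken();
        byte[] b = second.GetName().GetPublicKeyToken();

        // Assemblies without a strong name report a null or empty token.
        if (a == null || b == null || a.Length == 0)
        {
            return false;
        }

        return a.SequenceEqual(b);
    }
}
```

The exe could call it with Assembly.GetExecutingAssembly() and the assembly of a type it depends on, for example typeof(SomeDependencyType).Assembly, where SomeDependencyType stands for any type from the referenced DLL.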

Assuming that's true, it seems like you wouldn't really want to sign a .NET app without strong-naming the whole thing, otherwise you're giving people the ability to execute code under the guise of an app you wrote?

Generally I suppose that is the 'best practice', but again you have not prevented anything. Once a user has the rights to change files on the local system there is not much you can do to stop them from malicious activity.

There are several obfuscation technologies that build complete .NET projects into a single exe. This might make for the 'most secure' approach, but it can still be tampered with.

The real question is what are you trying to prevent them from doing? I mean to say, why would someone be interested in replacing your dll? What would they hope to achieve, what is their goal? If you're trying to prevent someone from reading sensitive information from the process, you are in for a long hard road of disappointment. Assume a malicious party has complete access to your source code and every piece of information used by your process, because they do. Assume they can replace all or part of your code at will, because they can.

Updated

So binding redirect will only work with assemblies strong-named with the same key, and therefore does protect you from DLLs being changed?

Correct, with the noted exception that code injection can still be done in numerous ways.

... and back to the original question, does code-signing without strong-naming kinda undermine the point of code-signing?

Not really. Code signing (not strong naming) has two distinct purposes:

  1. Authentication. Verifying who the author of the software is.
  2. Integrity. Verifying that the software hasn’t been tampered with since it was signed.

Often this is only authenticated and validated during installation. This is why we sign our setup.exe, to ensure that the customer has received the unmodified installer from us. They are prompted with the "Do you trust XXXX Company" and are thereby granting authorization to the authenticated/signed installer. Once installed however there is little built-in use of code signing by the OS (except for drivers and some other obscure cases).

Strong naming, on the other hand, has a completely different purpose for its existence. It's entirely focused on the 'integrity' of the application. There is no certificate, no certificate authority (CA) to verify it against, no user-displayed information for them to confirm, and nothing the OS can verify about the executable it's going to run.

The .NET framework uses strong names for many things, all of which I loosely categorize as application integrity:

  1. The contents of the dll/exe have a signed hash so that tampering can be detected.
  2. Each reference must be strong named and verified when loading the dependency.
  3. Assemblies can be registered in the GAC and publisher policies can be used.
  4. Native images can be ngen'd to produce a compiled image of the assembly's IL.

I'm sure there are other things I'm missing here, but these are primary uses I'm aware of.
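
As a rough illustration of the strong-name side, the sketch below reports whether a loaded assembly carries a public key token, which is what marks it as strong-named. The names StrongNameCheck and IsStrongNamed are made up for this example, and the check only detects the presence of a strong name; it does not verify the signature or say anything about code signing.

```csharp
using System;
using System.Reflection;

public static class StrongNameCheck
{
    // Hypothetical helper: true when the assembly identity includes a public key token.
    // This does not validate the signature, it only looks at the assembly name.
    public static bool IsStrongNamed(Assembly assembly)
    {
        byte[] token = assembly.GetName().GetPublicKeyToken();
        return token != null && token.Length > 0;
    }
}
```

For example, StrongNameCheck.IsStrongNamed(typeof(object).Assembly) should be true, since the framework's core library is strong-named.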

Best practices for Signing and Strong-naming

  • Use a signed installer
  • Use a code-signed executable
  • Use a strong-named executable
  • Strong name all dependencies and references to them
  • Code signing dependencies is not generally required*
  • Consider GAC registering assemblies at install time

*Note: Code signing can be useful in some cases for a DLL. For example, COM objects marked 'safe' and embedded into a browser should be signed and strong-named as if they were executables. Code signing can also be useful for externally verifying dependencies without loading the assembly or reflecting its attributes.

Why doesn't $ in .NET multiline regular expressions match CRLF?

13 votes

I have noticed the following:

var b1 = Regex.IsMatch("Line1\nLine2", "Line1$", RegexOptions.Multiline);   // true
var b2 = Regex.IsMatch("Line1\r\nLine2", "Line1$", RegexOptions.Multiline); // false

I'm confused. The documentation of RegexOptions says:

Multiline: Multiline mode. Changes the meaning of ^ and $ so they match at the beginning and end, respectively, of any line, and not just the beginning and end of the entire string.

Since C# and VB.NET are mainly used in the Windows world, I would guess that most files processed by .NET applications use CRLF linebreaks (\r\n) rather than LF linebreaks (\n). Still, it seems that the .NET regular expression parser does not recognize a CRLF linebreak as an end of line.

I know that I could work around this, for example, by matching Line1\r?$, but it still strikes me as strange. Is this really the intended behaviour of the .NET regexp parser or did I miss some hidden UseWindowsLinebreaks option?

From MSDN:

By default, $ matches only the end of the input string. If you specify the RegexOptions.Multiline option, it matches either the newline character (\n) or the end of the input string. It does not, however, match the carriage return/line feed character combination. To successfully match them, use the subexpression \r?$ instead of just $.

http://msdn.microsoft.com/en-us/library/yd1hzczs.aspx#Multiline

So I can't say why (compatibility with regular expressions from other languages?), but at the very least it's intended.
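
A minimal sketch of the \r?$ workaround from the MSDN quote, applied to the question's own inputs. The class and method names (LineEndDemo, EndsLine) are only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineEndDemo
{
    // Checks whether "Line1" sits at the end of a line, for LF and CRLF input alike.
    // The optional \r consumes the carriage return that a bare $ does not skip.
    public static bool EndsLine(string input)
    {
        return Regex.IsMatch(input, @"Line1\r?$", RegexOptions.Multiline);
    }
}
```

With this pattern, both "Line1\nLine2" and "Line1\r\nLine2" match.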

Why are collection initializers on re-assignments not allowed?

12 votes

I always thought it worked fine both ways. Then I ran this test and realized it's not allowed on re-assignments:

int[] a = {0, 2, 4, 6, 8};

works fine but not:

int[] a;
a = { 0, 2, 4, 6, 8 };

Any technical reason for this? I thought I would ask about it here, because this behavior was what I expected intuitively.

First off, let's get the terms correct. That's not a collection initializer. That's an array initializer. A collection initializer always follows a constructor for a collection type. An array initializer is only legal in a local or field declaration initializer, or in an array creation expression.

You are completely correct to note that this is an odd rule. Let me characterize its weirdness precisely:

Suppose you have a method M that takes an array of ints. All these are legal:

int[] x = new[] { 10, 20, 30 };
int[] y = new int[] { 10, 20, 30 };
int[] z = new int[3] { 10, 20, 30 };
M(new[] { 10, 20, 30 });
M(new int[] { 10, 20, 30 });
M(new int[3] { 10, 20, 30 });

But

int[] q = {10, 20, 30}; // legal!
M( { 10, 20, 30 } ); // illegal!

It seems like either the "lone" array initializer ought to be legal everywhere that the "decorated" one is, or nowhere. It's weird that there is this pseudo-expression that is valid only in an initializer, not anywhere else that an expression is legal.

Before I both criticize and defend this choice, I want to say that first and foremost, this discrepancy is a historical accident. There's no compellingly good reason for it. If we could get rid of it without breaking code, we would. But we can't. Were we designing C# from scratch again today I think odds are good that the "lone" array initializer without "new" would not be a valid syntax.

So, let me first give some reasons why array initializers should NOT be allowed as expressions and should be allowed in local variable initializers. Then I'll give some reasons for the opposite.

Reasons why array initializers should not be allowed as expressions:

Array initializers violate the nice property that { always means introduction of a new block of code. The error-recovery parser in the IDE that parses as you are typing likes to use braces as a convenient way to tell when a statement is incomplete; if you see:

if (x == M(
{ 
   Q(

Then it is pretty easy for the code editor to guess that you are missing )) before the {. The editor will assume that Q( is the beginning of a statement and it is missing its end.

But if array initializers are legal expressions then it could be that what is missing is )})){} following the Q.

Second, array initializers as expressions violate the nice principle that all heap allocations have "new" in them somewhere.

Reasons why array initializers should be allowed in field and local initializers:

Remember that array initializers were added to the language in v1.0, before implicitly typed locals, anonymous types, or type inference on arrays. Back in the day we did not have the pleasant "new[] { 10, 20, 30}" syntax, so without array initializers you'd have to say:

int[] x = new int[] { 10, 20, 30 };

which seems very redundant! I can see why they wanted to get that "new int[]" out of there.

When you say

int[] x = { 10, 20, 30 };

it is not syntactically ambiguous; the parser knows that this is an array initializer and not the beginning of a code block (unlike the case I mentioned above.) Nor is it type-ambiguous; it is clear that the initializer is an array of ints from the context.

So that argument justifies why in C# 1.0 array initializers were allowed in local and field initializers but not in expression contexts.

But that's not the world we're in today. Were we designing this from scratch today we probably would not have array initializers that do not have "new". Nowadays of course we realize that the better solution is:

var x = new[] { 10, 20, 30 };

and that expression is valid in any context. You can explicitly type it on either the "declaration" side or the "initializer" side of the = if you see fit, or you can let the compiler infer the types of either side or both.

So, summing up, yes, you are right that it is inconsistent that array initializers can be only in local and field declarations but not in expression contexts. There was a good reason for that ten years ago, but in the modern world with type inference, there's no longer much of a good reason for it. It's just a historical accident at this point.
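
Applied to the re-assignment from the question, this works out roughly as follows. The wrapper class ArrayReassignDemo and the method Build exist only to make the snippet self-contained.

```csharp
public static class ArrayReassignDemo
{
    public static int[] Build()
    {
        int[] a;
        // a = { 0, 2, 4, 6, 8 };    // does not compile: a bare array initializer is not an expression
        a = new[] { 0, 2, 4, 6, 8 };  // compiles: array creation expression, element type inferred
        a = new int[] { 1, 3, 5 };    // compiles: element type stated explicitly
        return a;
    }
}
```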

Is .NET Stopwatch standby/sleep/hibernate aware?

12 votes

Does System.Diagnostics.Stopwatch count time during computer stand by?

Yes it does.

Looking at the code in Reflector shows that when Stopwatch.IsHighResolution is false, it falls back to the tick count (in my environment the value was false, so it uses DateTime.UtcNow.Ticks):

public void Start()
{
    if (!this.isRunning)
    {
        this.startTimeStamp = GetTimestamp();
        this.isRunning = true;
    }
}



public static long GetTimestamp()
{
    if (IsHighResolution)
    {
        long num = 0L;
        SafeNativeMethods.QueryPerformanceCounter(out num);
        return num;
    }
    return DateTime.UtcNow.Ticks;
}
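
To see which of the two branches applies on a given machine, something like the following sketch can be used. StopwatchInfo and Describe are hypothetical names; the wording about the fallback assumes the decompiled code shown above.

```csharp
using System;
using System.Diagnostics;

public static class StopwatchInfo
{
    // Reports which time source Stopwatch uses on this machine,
    // based on the public IsHighResolution and Frequency fields.
    public static string Describe()
    {
        string source = Stopwatch.IsHighResolution
            ? "high-resolution performance counter"
            : "DateTime.UtcNow.Ticks fallback";
        return source + ", " + Stopwatch.Frequency + " ticks per second";
    }
}
```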

Why do most exceptions omit instance-specific information?

11 votes

I've noticed that most exception messages don't include instance-specific details like the value that caused the exception. They generally only tell you the "category" of the error.

For example, when attempting to serialize an object with a third-party library, I got a MissingMethodException with the message:

"No parameterless constructor defined for this object."

In many cases this is enough, but often (typically during development) a message like

"No parameterless constructor defined for this object of type 'Foo'."

can save a lot of time by directing you straight to the cause of the error.

ArgumentException is another example: it usually tells you the name of the argument but not its value. This seems to be the case for most framework-raised exceptions, but also for third-party libraries.

Is this done on purpose?

Is there a security implication in exposing an internal state like the "faulty" value of a variable?

Two reasons I can think of:

Firstly, the value that triggered the exception may be a processed form of the one that was passed to the public interface. Such a value may not make sense to the caller, and making it meaningful would require the expense of catching the exception and rethrowing a different one that is going to be the same in most regards anyway.

Secondly, and more importantly, there can indeed be a security risk, one that can be very hard to second-guess (if I'm writing a general-purpose container, I don't know what contexts it will be used in). We don't want "Credit-Card: 5555444455554444" appearing in an error message if we can help it.

Ultimately, just what debug information is most useful will vary according to the error anyway. If the type, method and (when possible) file and line number aren't enough, it's time to write some debug code that traps just what you do want to know, rather than complaining that it isn't already trapped, when next time you might want yet different information (field state of instances can be just as likely to be useful as parameters).
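
As a sketch of that kind of debug code, the wrapper below adds the type name to the "no parameterless constructor" case from the question. Factory and Create are hypothetical names, and wrapping in InvalidOperationException is just one possible choice. The type name is safe to expose here because the caller supplied it.

```csharp
using System;

public static class Factory
{
    // Creates an instance through the parameterless constructor.
    // On failure, rethrows with the offending type name in the message
    // and keeps the original exception as InnerException.
    public static object Create(Type type)
    {
        try
        {
            return Activator.CreateInstance(type);
        }
        catch (MissingMethodException ex)
        {
            throw new InvalidOperationException(
                "No parameterless constructor defined for type '" + type.FullName + "'.", ex);
        }
    }
}
```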

Easiest way in C# to find out if an app is running from a network drive?

11 votes

I want to programmatically find out if my application is running from a network drive. What is the simplest way of doing that? It should support both UNC paths (\\127.0.0.1\d$) and mapped network drives (Z:).

This is my current method of doing this, but it feels like there should be a better way.

private bool IsRunningFromNetworkDrive()
{
    var dir = AppDomain.CurrentDomain.BaseDirectory;
    var driveLetter = dir.First();
    if (!Char.IsLetter(driveLetter))
        return true;
    if (new DriveInfo(driveLetter.ToString()).DriveType == DriveType.Network)
        return true;
    return false;
}

How was the hash collision issue in ASP.NET fixed (MS11-100)?

11 votes

As reported by Slashdot, MS issued an update to ASP.NET to fix the hash collision attack today. (Listed as "Collisions in HashTable May Cause DoS Vulnerability - CVE-2011-3414" on the linked Technet page.)

The problem is that the POST data are converted into a hash table that uses a known hashing algorithm. An attacker can exploit this by crafting a request that contains lots of collisions, which easily causes a Denial of Service.

Does anyone know how exactly that update fixes the issue?

The update is not a complete fix, but rather a workaround. It limits the number of POST parameters accepted.

How to follow a .lnk file programmatically

6 votes

We have a network drive full of shortcuts (.lnk files) that point to folders and I need to traverse them programmatically in a C# Winforms app.

What practical options do I have?

Add IWshRuntimeLibrary as a reference to your project: Add Reference, COM tab, Windows Script Host Object Model.

Here is how I get the properties of a shortcut:

IWshRuntimeLibrary.IWshShell wsh = new IWshRuntimeLibrary.WshShellClass();
IWshRuntimeLibrary.IWshShortcut sc = (IWshRuntimeLibrary.IWshShortcut)wsh.CreateShortcut(filename);

The shortcut object "sc" has a TargetPath property.

Is it possible to run a Windows app within a console window?

5 votes

I have a Windows app that I've written in C# 4 and WPF; I've now been asked if I can add a commandline parameter (e.g. /console) that would force it to run as a console app so it can be run by a task scheduler.

Is this possible with modern apps? Or do I need to create a separate console app?

UPDATE: can I just emphasise that this is a WPF application. There is no convenient static void Main(string[] args) entry point to hook into. But the PM would still like the app to have the ability to run from the commandline...

FINAL UPDATE: the trick, as pointed out by @RodH257, is that the WPF app codegens the expected static void Main. You can add your own class with a method of the same name and in the project build properties, set it as the startup object for the executable. You'll also need the [STAThread] attribute on the method so that WPF will run properly too.

You can either turn it into a console application and manually show the first WPF dialog, just as you would if you were creating a DLL and starting a WPF window, as shown below:

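using System;

static class Program
{
    [STAThread] // WPF needs a single-threaded apartment, as noted in the question's final update
    static void Main(string[] args)
    {
        if (args.Length > 0)
        {
            // Command-line mode: do the work here without showing any UI.
            return;
        }

        // No arguments: show the main window as usual.
        Window1 window = new Window1();
        window.ShowDialog();
    }
}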

Or, have a look at this link for editing the entry point of your current WPF project and acting on the arguments: How to write custom Main method for a WPF application?

EDIT: Updated the post to make it more clear for the exact situation described.

What is the correct SQL type to store a .Net Timespan with values > 24:00:00?

5 votes

I am trying to store a .Net TimeSpan in SQL Server 2008 R2.

EF Code First seems to be suggesting it should be stored as a Time(7) in SQL.

However, TimeSpan in .Net can handle periods longer than 24 hours.

What is the best way to handle storing a .Net TimeSpan in SQL Server?

I'd store it in the database as a BIGINT holding the number of ticks (i.e. the TimeSpan.Ticks property).

That way, if I wanted to get a TimeSpan object when I retrieve it, I could just do TimeSpan.FromTicks(value) which would be easy.
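
A minimal sketch of that round trip, with the database access itself left out. TimeSpanStorage, ToColumn and FromColumn are illustrative names, not part of any framework.

```csharp
using System;

public static class TimeSpanStorage
{
    // Value to write to the BIGINT column.
    public static long ToColumn(TimeSpan span)
    {
        return span.Ticks;
    }

    // Rebuilds the TimeSpan from the value read back from the BIGINT column.
    public static TimeSpan FromColumn(long ticks)
    {
        return TimeSpan.FromTicks(ticks);
    }
}
```

A 30-hour span, which Time(7) cannot hold, survives the round trip unchanged: TimeSpanStorage.FromColumn(TimeSpanStorage.ToColumn(TimeSpan.FromHours(30))) equals TimeSpan.FromHours(30).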