Best .net questions in May 2012

Why is this valid C# code?

54 votes

This is valid C# code

var bob = "abc" + null + null + null + "123";  // abc123

This is not valid C# code

var wtf = null.ToString(); // compiler error

Why is the first statement valid?

Why the first one works:

From MSDN:

In string concatenation operations, the C# compiler treats a null string the same as an empty string, but it does not convert the value of the original null string.

More information on the + binary operator:

The binary + operator performs string concatenation when one or both operands are of type string.

If an operand of string concatenation is null, an empty string is substituted. Otherwise, any non-string argument is converted to its string representation by invoking the virtual ToString method inherited from type object.

If ToString returns null, an empty string is substituted.

Why the second one is an error:

null (C# Reference) - The null keyword is a literal that represents a null reference, one that does not refer to any object. null is the default value of reference-type variables.
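The concatenation rule quoted above can be checked with a short sketch. The class and variable names here are illustrative only and are not part of the original question; the commented-out line is the statement the compiler rejects.

```csharp
using System;

static class NullConcatDemo
{
    // Concatenates string literals with a null string and a null object reference.
    public static string Concat()
    {
        string nothing = null;      // a null reference typed as string
        object alsoNothing = null;  // a null reference typed as object
        // Each null operand is treated as an empty string during concatenation.
        return "abc" + nothing + alsoNothing + "123";
    }

    static void Main()
    {
        Console.WriteLine(Concat());   // prints "abc123"
        // var wtf = null.ToString();  // compiler error: member access on the bare null literal
    }
}
```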

String.Format - how it works and how to implement custom formatstrings

45 votes

With String.Format() it is possible to format for example DateTime objects in many different ways. Every time I am looking for a desired format I need to search around on Internet. Almost always I find an example I can use. For example:

String.Format("{0:MM/dd/yyyy}", DateTime.Now);          // "09/05/2012"

But I don't have any clue how it works and which classes support these 'magic' additional strings.

So my questions are:

  1. How does String.Format map the additional information MM/dd/yyyy to a string result?
  2. Do all Microsoft objects support this feature?
    Is this documented somewhere?
  3. Is it possible to do something like this:
    String.Format("{0:MyCustomFormat}", new MyOwnClass())

String.Format matches each of the tokens inside the string ({0} etc) against the corresponding object: http://msdn.microsoft.com/en-us/library/system.string.format.aspx

A format string is optionally provided:

{ index[,alignment][ : formatString] }

If formatString is provided, the corresponding object must implement IFormattable and specifically the ToString method that accepts formatString and returns the corresponding formatted string: http://msdn.microsoft.com/en-us/library/system.iformattable.tostring.aspx

An IFormatProvider may also be used to capture basic formatting standards, defaults, and so on. Examples here and here.

So the answers to your questions in order:

  1. It uses the IFormattable interface's ToString() method on the DateTime object and passes that the MM/dd/yyyy format string. It is that implementation which returns the correct string.

  2. Any object that implements IFormattable supports this feature. You can even write your own!

  3. Yes, see above.
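As a minimal sketch of answer 3, the type below implements IFormattable with two made-up format strings. The Temperature class and its "C" and "F" formats are hypothetical names invented for this illustration; only IFormattable, IFormatProvider and String.Format come from the framework.

```csharp
using System;
using System.Globalization;

// Hypothetical type used only for illustration.
public sealed class Temperature : IFormattable
{
    private readonly double _celsius;

    public Temperature(double celsius) { _celsius = celsius; }

    public override string ToString()
    {
        return ToString("C", CultureInfo.InvariantCulture);
    }

    // String.Format calls this overload and passes whatever follows the colon in "{0:...}".
    public string ToString(string format, IFormatProvider formatProvider)
    {
        if (string.IsNullOrEmpty(format)) format = "C";   // "{0}" arrives here with a null format
        switch (format)
        {
            case "C": return _celsius.ToString("F1", formatProvider) + " C";
            case "F": return (_celsius * 9 / 5 + 32).ToString("F1", formatProvider) + " F";
            default: throw new FormatException("Unknown format: " + format);
        }
    }
}

static class FormatDemo
{
    static void Main()
    {
        var t = new Temperature(21.5);
        // prints "21.5 C is 70.7 F"
        Console.WriteLine(String.Format(CultureInfo.InvariantCulture, "{0:C} is {0:F}", t));
    }
}
```

Passing an explicit culture keeps the decimal separator predictable; with the default overload the current culture is handed to ToString instead.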

Why does JIT order affect performance?

22 votes

Why does the order in which C# methods in .NET 4.0 are just-in-time compiled affect how quickly they execute? For example, consider two equivalent methods:

public static void SingleLineTest()
{
    Stopwatch stopwatch = new Stopwatch();
    stopwatch.Start();
    int count = 0;
    for (uint i = 0; i < 1000000000; ++i) {
        count += i % 16 == 0 ? 1 : 0;
    }
    stopwatch.Stop();
    Console.WriteLine("Single-line test --> Count: {0}, Time: {1}", count, stopwatch.ElapsedMilliseconds);
}

public static void MultiLineTest()
{
    Stopwatch stopwatch = new Stopwatch();
    stopwatch.Start();
    int count = 0;
    for (uint i = 0; i < 1000000000; ++i) {
        var isMultipleOf16 = i % 16 == 0;
        count += isMultipleOf16 ? 1 : 0;
    }
    stopwatch.Stop();
    Console.WriteLine("Multi-line test  --> Count: {0}, Time: {1}", count, stopwatch.ElapsedMilliseconds);
}

The only difference is the introduction of a local variable, which affects the assembly code generated and the loop performance. Why that is the case is a question in its own right.

Possibly even stranger is that on x86 (but not x64), the order that the methods are invoked has around a 20% impact on performance. Invoke the methods like this...

static void Main()
{
    SingleLineTest();
    MultiLineTest();
}

...and SingleLineTest is faster. (Compile using the x86 Release configuration, ensuring that the "Optimize code" setting is enabled, and run the test from outside VS2010.) But reverse the order...

static void Main()
{
    MultiLineTest();
    SingleLineTest();
}

...and both methods take the same time (almost, but not quite, as long as MultiLineTest before). (When running this test, it's useful to add some additional calls to SingleLineTest and MultiLineTest to get additional samples. How many and what order doesn't matter, except for which method is called first.)

Finally, to demonstrate that JIT order is important, leave MultiLineTest first, but force SingleLineTest to be JITed first...

static void Main()
{
    RuntimeHelpers.PrepareMethod(typeof(Program).GetMethod("SingleLineTest").MethodHandle);
    MultiLineTest();
    SingleLineTest();
}

Now, SingleLineTest is faster again.

If you turn off "Suppress JIT optimization on module load" in VS2010, you can put a breakpoint in SingleLineTest and see that the assembly code in the loop is the same regardless of JIT order; however, the assembly code at the beginning of the method varies. But how this matters when the bulk of the time is spent in the loop is perplexing.

A sample project demonstrating this behavior is on github.

It's not clear how this behavior affects real-world applications. One concern is that it can make performance tuning volatile, depending on the order methods happen to be first called. Problems of this sort would be difficult to detect with a profiler. Once you found the hotspots and optimized their algorithms, it would be hard to know without a lot of guess and check whether additional speedup is possible by JITing methods early.

Update: See also the Microsoft Connect entry for this issue.

Please note that I do not trust the "Suppress JIT optimization on module load" option; I spawn the process without debugging and attach my debugger after the JIT has run.

In the version where single-line runs faster, this is Main:

        SingleLineTest();
00000000  push        ebp 
00000001  mov         ebp,esp 
00000003  call        dword ptr ds:[0019380Ch] 
            MultiLineTest();
00000009  call        dword ptr ds:[00193818h] 
            SingleLineTest();
0000000f  call        dword ptr ds:[0019380Ch] 
            MultiLineTest();
00000015  call        dword ptr ds:[00193818h] 
            SingleLineTest();
0000001b  call        dword ptr ds:[0019380Ch] 
            MultiLineTest();
00000021  call        dword ptr ds:[00193818h] 
00000027  pop         ebp 
        }
00000028  ret 

Note that MultiLineTest has been placed on an 8 byte boundary, and SingleLineTest on a 4 byte boundary.

Here's Main for the version where both run at the same speed:

            MultiLineTest();
00000000  push        ebp 
00000001  mov         ebp,esp 
00000003  call        dword ptr ds:[00153818h] 

            SingleLineTest();
00000009  call        dword ptr ds:[0015380Ch] 
            MultiLineTest();
0000000f  call        dword ptr ds:[00153818h] 
            SingleLineTest();
00000015  call        dword ptr ds:[0015380Ch] 
            MultiLineTest();
0000001b  call        dword ptr ds:[00153818h] 
            SingleLineTest();
00000021  call        dword ptr ds:[0015380Ch] 
            MultiLineTest();
00000027  call        dword ptr ds:[00153818h] 
0000002d  pop         ebp 
        }
0000002e  ret 

Amazingly, the addresses chosen by the JIT are identical in the last 4 digits, even though it allegedly processed them in the opposite order. Not sure I believe that any more.

More digging is necessary. I think it was mentioned that the code before the loop wasn't exactly the same in both versions? Going to investigate.

Here's the "slow" version of SingleLineTest (and I checked, the last digits of the function address haven't changed).

            Stopwatch stopwatch = new Stopwatch();
00000000  push        ebp 
00000001  mov         ebp,esp 
00000003  push        edi 
00000004  push        esi 
00000005  push        ebx 
00000006  mov         ecx,7A5A2C68h 
0000000b  call        FFF91EA0 
00000010  mov         esi,eax 
00000012  mov         dword ptr [esi+4],0 
00000019  mov         dword ptr [esi+8],0 
00000020  mov         byte ptr [esi+14h],0 
00000024  mov         dword ptr [esi+0Ch],0 
0000002b  mov         dword ptr [esi+10h],0 
            stopwatch.Start();
00000032  cmp         byte ptr [esi+14h],0 
00000036  jne         00000047 
00000038  call        7A22B314 
0000003d  mov         dword ptr [esi+0Ch],eax 
00000040  mov         dword ptr [esi+10h],edx 
00000043  mov         byte ptr [esi+14h],1 
            int count = 0;
00000047  xor         edi,edi 
            for (uint i = 0; i < 1000000000; ++i) {
00000049  xor         edx,edx 
                count += i % 16 == 0 ? 1 : 0;
0000004b  mov         eax,edx 
0000004d  and         eax,0Fh 
00000050  test        eax,eax 
00000052  je          00000058 
00000054  xor         eax,eax 
00000056  jmp         0000005D 
00000058  mov         eax,1 
0000005d  add         edi,eax 
            for (uint i = 0; i < 1000000000; ++i) {
0000005f  inc         edx 
00000060  cmp         edx,3B9ACA00h 
00000066  jb          0000004B 
            }
            stopwatch.Stop();
00000068  mov         ecx,esi 
0000006a  call        7A23F2C0 
            Console.WriteLine("Single-line test --> Count: {0}, Time: {1}", count, stopwatch.ElapsedMilliseconds);
0000006f  mov         ecx,797C29B4h 
00000074  call        FFF91EA0 
00000079  mov         ecx,eax 
0000007b  mov         dword ptr [ecx+4],edi 
0000007e  mov         ebx,ecx 
00000080  mov         ecx,797BA240h 
00000085  call        FFF91EA0 
0000008a  mov         edi,eax 
0000008c  mov         ecx,esi 
0000008e  call        7A23ABE8 
00000093  push        edx 
00000094  push        eax 
00000095  push        0 
00000097  push        2710h 
0000009c  call        783247EC 
000000a1  mov         dword ptr [edi+4],eax 
000000a4  mov         dword ptr [edi+8],edx 
000000a7  mov         esi,edi 
000000a9  call        793C6F40 
000000ae  push        ebx 
000000af  push        esi 
000000b0  mov         ecx,eax 
000000b2  mov         edx,dword ptr ds:[03392034h] 
000000b8  mov         eax,dword ptr [ecx] 
000000ba  mov         eax,dword ptr [eax+3Ch] 
000000bd  call        dword ptr [eax+1Ch] 
000000c0  pop         ebx 
        }
000000c1  pop         esi 
000000c2  pop         edi 
000000c3  pop         ebp 
000000c4  ret 

And the "fast" version:

            Stopwatch stopwatch = new Stopwatch();
00000000  push        ebp 
00000001  mov         ebp,esp 
00000003  push        edi 
00000004  push        esi 
00000005  push        ebx 
00000006  mov         ecx,7A5A2C68h 
0000000b  call        FFE11F70 
00000010  mov         esi,eax 
00000012  mov         ecx,esi 
00000014  call        7A1068BC 
            stopwatch.Start();
00000019  cmp         byte ptr [esi+14h],0 
0000001d  jne         0000002E 
0000001f  call        7A12B3E4 
00000024  mov         dword ptr [esi+0Ch],eax 
00000027  mov         dword ptr [esi+10h],edx 
0000002a  mov         byte ptr [esi+14h],1 
            int count = 0;
0000002e  xor         edi,edi 
            for (uint i = 0; i < 1000000000; ++i) {
00000030  xor         edx,edx 
                count += i % 16 == 0 ? 1 : 0;
00000032  mov         eax,edx 
00000034  and         eax,0Fh 
00000037  test        eax,eax 
00000039  je          0000003F 
0000003b  xor         eax,eax 
0000003d  jmp         00000044 
0000003f  mov         eax,1 
00000044  add         edi,eax 
            for (uint i = 0; i < 1000000000; ++i) {
00000046  inc         edx 
00000047  cmp         edx,3B9ACA00h 
0000004d  jb          00000032 
            }
            stopwatch.Stop();
0000004f  mov         ecx,esi 
00000051  call        7A13F390 
            Console.WriteLine("Single-line test --> Count: {0}, Time: {1}", count, stopwatch.ElapsedMilliseconds);
00000056  mov         ecx,797C29B4h 
0000005b  call        FFE11F70 
00000060  mov         ecx,eax 
00000062  mov         dword ptr [ecx+4],edi 
00000065  mov         ebx,ecx 
00000067  mov         ecx,797BA240h 
0000006c  call        FFE11F70 
00000071  mov         edi,eax 
00000073  mov         ecx,esi 
00000075  call        7A13ACB8 
0000007a  push        edx 
0000007b  push        eax 
0000007c  push        0 
0000007e  push        2710h 
00000083  call        782248BC 
00000088  mov         dword ptr [edi+4],eax 
0000008b  mov         dword ptr [edi+8],edx 
0000008e  mov         esi,edi 
00000090  call        792C7010 
00000095  push        ebx 
00000096  push        esi 
00000097  mov         ecx,eax 
00000099  mov         edx,dword ptr ds:[03562030h] 
0000009f  mov         eax,dword ptr [ecx] 
000000a1  mov         eax,dword ptr [eax+3Ch] 
000000a4  call        dword ptr [eax+1Ch] 
000000a7  pop         ebx 
        }
000000a8  pop         esi 
000000a9  pop         edi 
000000aa  pop         ebp 
000000ab  ret 

Just the loops, fast on the left, slow on the right:

00000030  xor         edx,edx                 00000049  xor         edx,edx 
00000032  mov         eax,edx                 0000004b  mov         eax,edx 
00000034  and         eax,0Fh                 0000004d  and         eax,0Fh 
00000037  test        eax,eax                 00000050  test        eax,eax 
00000039  je          0000003F                00000052  je          00000058 
0000003b  xor         eax,eax                 00000054  xor         eax,eax 
0000003d  jmp         00000044                00000056  jmp         0000005D 
0000003f  mov         eax,1                   00000058  mov         eax,1 
00000044  add         edi,eax                 0000005d  add         edi,eax 
00000046  inc         edx                     0000005f  inc         edx 
00000047  cmp         edx,3B9ACA00h           00000060  cmp         edx,3B9ACA00h 
0000004d  jb          00000032                00000066  jb          0000004B 

The instructions are identical (being relative jumps, the machine code is identical even though the disassembly shows different addresses), but the alignment is different. There are three jumps. The je loading a constant 1 is aligned in the slow version and not in the fast version, but it hardly matters, since that jump is only taken 1/16 of the time. The other two jumps (jmp after loading a constant zero, and jb repeating the entire loop) are taken millions more times, and are aligned in the "fast" version.

I think this is the smoking gun.

this == null inside .NET instance method - why is that possible?

21 votes

I've always thought that it's impossible for this to be null inside an instance method body. The following simple program demonstrates that it is possible. Is this some documented behaviour?

class Foo
{
    public void Bar()
    {
        Debug.Assert(this == null);
    }
}

public static void Test()
{            
    var action = (Action)Delegate.CreateDelegate(typeof (Action), null, typeof(Foo).GetMethod("Bar"));
    action();
}

UPDATE

I agree with the answers saying that it's how this method is documented. However, I don't really understand this behaviour. Especially because it's not how C# is designed.

We had gotten a report from somebody (likely one of the .NET groups using C# (thought it wasn't yet named C# at that time)) who had written code that called a method on a null pointer, but they didn’t get an exception because the method didn’t access any fields (ie “this” was null, but nothing in the method used it). That method then called another method which did use the this point and threw an exception, and a bit of head-scratching ensued. After they figured it out, they sent us a note about it. We thought that being able to call a method on a null instance was a bit weird. Peter Golde did some testing to see what the perf impact was of always using callvirt, and it was small enough that we decided to make the change.

http://blogs.msdn.com/b/ericgu/archive/2008/07/02/why-does-c-always-use-callvirt.aspx

Because you're passing null into the firstArgument of Delegate.CreateDelegate

So you're calling an instance method on a null object.

http://msdn.microsoft.com/en-us/library/74x8f551.aspx

If firstArgument is a null reference and method is an instance method, the result depends on the signatures of the delegate type type and of method:

If the signature of type explicitly includes the hidden first parameter of method, the delegate is said to represent an open instance method. When the delegate is invoked, the first argument in the argument list is passed to the hidden instance parameter of method.

If the signatures of method and type match (that is, all parameter types are compatible), then the delegate is said to be closed over a null reference. Invoking the delegate is like calling an instance method on a null instance, which is not a particularly useful thing to do.
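The two cases described in the documentation excerpt can be sketched side by side. The method name ThisIsNull and the DelegateDemo helper class are illustrative assumptions; Delegate.CreateDelegate is called with the same (Type, object, MethodInfo) overload the question uses.

```csharp
using System;
using System.Reflection;

class Foo
{
    public bool ThisIsNull() { return this == null; }
}

static class DelegateDemo
{
    static readonly MethodInfo Method = typeof(Foo).GetMethod("ThisIsNull");

    // Delegate signature matches the method exactly: closed over a null reference.
    public static Func<bool> ClosedOverNull()
    {
        return (Func<bool>)Delegate.CreateDelegate(typeof(Func<bool>), null, Method);
    }

    // Delegate signature exposes the hidden first parameter: open instance delegate.
    public static Func<Foo, bool> OpenInstance()
    {
        return (Func<Foo, bool>)Delegate.CreateDelegate(typeof(Func<Foo, bool>), null, Method);
    }

    static void Main()
    {
        // Open delegate: the instance is supplied at call time, so 'this' is a real object.
        Console.WriteLine(OpenInstance()(new Foo()));
        // Closed delegate: 'this' is the null reference the delegate was created with.
        Console.WriteLine(ClosedOverNull()());
    }
}
```

The only difference between the two calls is the delegate type, which is what decides whether the null first argument means "no instance yet" or "the instance is null".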

Is there a race condition in this common pattern used to prevent NullReferenceException?

20 votes

I asked this question and got this interesting (and a little disconcerting) answer.

Daniel states in his answer (unless I'm reading it incorrectly) that the ECMA-335 CLI specification could allow a compiler to generate code that throws a NullReferenceException from the following DoCallback method.

class MyClass {
    private Action _Callback;
    public Action Callback { 
        get { return _Callback; }
        set { _Callback = value; }
    }
    public void DoCallback() {
        Action local;
        local = Callback;
        if (local == null)
            local = new Action(() => { });
        local();
    }
}

He says that, in order to guarantee a NullReferenceException is not thrown, the volatile keyword should be used on _Callback or a lock should be used around the line local = Callback;.

Can anyone corroborate that? And, if it's true, is there a difference in behavior between Mono and .NET compilers regarding this issue?

Edit
Here is a link to the standard.

Update
I think this is the pertinent part of the spec (12.6.4):

Conforming implementations of the CLI are free to execute programs using any technology that guarantees, within a single thread of execution, that side-effects and exceptions generated by a thread are visible in the order specified by the CIL. For this purpose only volatile operations (including volatile reads) constitute visible side-effects. (Note that while only volatile operations constitute visible side-effects, volatile operations also affect the visibility of non-volatile references.) Volatile operations are specified in §12.6.7. There are no ordering guarantees relative to exceptions injected into a thread by another thread (such exceptions are sometimes called "asynchronous exceptions" (e.g., System.Threading.ThreadAbortException).

[Rationale: An optimizing compiler is free to reorder side-effects and synchronous exceptions to the extent that this reordering does not change any observable program behavior. end rationale]

[Note: An implementation of the CLI is permitted to use an optimizing compiler, for example, to convert CIL to native machine code provided the compiler maintains (within each single thread of execution) the same order of side-effects and synchronous exceptions.

So... I'm curious as to whether or not this statement allows a compiler to optimize the Callback property (which accesses a simple field) and the local variable to produce the following, which has the same behavior in a single thread of execution:

if (_Callback != null) _Callback();
else new Action(() => { })();

The 12.6.7 section on the volatile keyword seems to offer a solution for programmers wishing to avoid the optimization:

A volatile read has "acquire semantics" meaning that the read is guaranteed to occur prior to any references to memory that occur after the read instruction in the CIL instruction sequence. A volatile write has "release semantics" meaning that the write is guaranteed to happen after any memory references prior to the write instruction in the CIL instruction sequence. A conforming implementation of the CLI shall guarantee this semantics of volatile operations. This ensures that all threads will observe volatile writes performed by any other thread in the order they were performed. But a conforming implementation is not required to provide a single total ordering of volatile writes as seen from all threads of execution. An optimizing compiler that converts CIL to native code shall not remove any volatile operation, nor shall it coalesce multiple volatile operations into a single operation.

In CLR via C# (pp. 264–265), Jeffrey Richter discusses this specific problem, and acknowledges that it is possible for the local variable to be swapped out:

[T]his code could be optimized by the compiler to remove the local […] variable entirely. If this happens, this version of the code is identical to the [version that references the event/callback directly twice], so a NullReferenceException is still possible.

Richter suggests the use of Interlocked.CompareExchange<T> to definitively resolve this issue:

public void DoCallback() 
{
    Action local = Interlocked.CompareExchange(ref _Callback, null, null);
    if (local != null)
        local();
}

However, Richter acknowledges that Microsoft’s just-in-time (JIT) compiler does not optimize away the local variable; and, although this could, in theory, change, it almost certainly never will because it would cause too many applications to break as a result.
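For comparison, the volatile option mentioned in the question could look like the sketch below. It is only an illustration of that suggestion under the assumption that a volatile read of the field may not be removed or merged, as the quoted section 12.6.7 states; the field name _callback and the demo class are choices made here, not taken from any of the answers.

```csharp
using System;

class MyClass
{
    // volatile: reads of this field are volatile reads with acquire semantics.
    private volatile Action _callback;

    public Action Callback
    {
        get { return _callback; }
        set { _callback = value; }
    }

    public void DoCallback()
    {
        Action local = _callback;   // one volatile read, copied into a local
        if (local != null)
            local();                // invokes the copy, never the field
    }
}

static class VolatileCallbackDemo
{
    static void Main()
    {
        var c = new MyClass();
        c.DoCallback();                                  // no callback set: nothing happens
        c.Callback = () => Console.WriteLine("called");
        c.DoCallback();                                  // prints "called"
    }
}
```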

This question has already been asked and answered at length in “Allowed C# Compiler optimization on local variables and refetching value from memory”. Make sure to read the answer by xanatox and the “Understand the Impact of Low-Lock Techniques in Multithreaded Apps” article it cites. Since you asked specifically about Mono, you should pay attention to the referenced “[Mono-dev] Memory Model?” mailing list message:

Right now we provide loose semantics close to ecma backed by the architecture you're running.

Volatile Violates its main job?

18 votes

According to MSDN:

The volatile keyword indicates that a field might be modified by multiple threads that are executing at the same time. Fields that are declared volatile are not subject to compiler optimizations that assume access by a single thread. This ensures that the most up-to-date value is present in the field at all times.

Please notice the last sentence:

This ensures that the most up-to-date value is present in the field at all times.

However, there's a problem with this keyword.

I've read that it can change order of instructions:

First instruction    Second instruction    Can they be swapped?
Read                 Read                  No
Read                 Write                 No
Write                Write                 No
Write                Read                  Yes! <----

This means that if John sets a value to a volatile field and Paul later reads the field, Paul may get the old value!

What is going on here? Isn't that its main job?

I know there are other solutions, but my question is about the volatile keyword.

Should I (as a programmer) avoid using this keyword because of such weird behavior?

The MSDN documentation is wrong. That is most certainly not what volatile does. The C# specification tells you exactly what volatile does and getting a "fresh read" or a "committed write" is not one of them. The specification is correct. volatile only guarantees acquire-fences on reads and release-fences on writes. These are defined as below.

  • acquire-fence: A memory barrier in which other reads and writes are not allowed to move before the fence.
  • release-fence: A memory barrier in which other reads and writes are not allowed to move after the fence.

I will try to explain the table using my arrow notation. A ↓ arrow will mark a volatile read and a ↑ arrow will mark a volatile write. No instruction can move through the arrowhead. Think of the arrowhead as pushing everything away.

In the following analysis I will use two variables, x and y. I will also assume that they are marked as volatile.

Case #1

Notice how the placement of the arrow after the read of x prevents the read of y from moving up. Also notice that the volatility of y is irrelevant in this case.

var localx = x;
↓
var localy = y;
↓

Case #2

Notice how the placement of the arrow after the read of x prevents the write to y from moving up. Also notice that the volatility of either of x or y, but not both, could have been omitted in this case.

var localx = x;
↓
↑
y = 1;

Case #3

Notice how the placement of the arrow before the write to y prevents the write to x from moving down. Notice that the volatility of x is irrelevant in this case.

↑
x = 1;
↑
y = 2;

Case #4

Notice that there is no barrier between the write to x and the read of y. Because of this, either the write to x can float down or the read of y can float up. Either movement is valid. This is why the instructions in the write-read case can be swapped.

↑
x = 1;
var localy = y;
↓
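Case #4 is the classic store-load situation, and a full fence is the usual way to rule it out. The sketch below is an illustration added under that assumption, not part of the original answer: two threads each write one volatile field and then read the other, and Thread.MemoryBarrier() (a full fence) is optionally placed between the write and the read. The class, field and method names are made up for the example.

```csharp
using System;
using System.Threading;

static class StoreLoadDemo
{
    static volatile int x, y;
    static int r1, r2;

    // Runs one round and reports whether both threads read 0, which would mean
    // each read was effectively performed before the other thread's write.
    public static bool RunOnce(bool useFullFence)
    {
        x = 0; y = 0; r1 = -1; r2 = -1;

        var t1 = new Thread(() =>
        {
            x = 1;                                      // volatile write
            if (useFullFence) Thread.MemoryBarrier();   // nothing may cross in either direction
            r1 = y;                                     // volatile read
        });
        var t2 = new Thread(() =>
        {
            y = 1;
            if (useFullFence) Thread.MemoryBarrier();
            r2 = x;
        });

        t1.Start(); t2.Start();
        t1.Join(); t2.Join();                           // Join makes r1 and r2 visible here
        return r1 == 0 && r2 == 0;
    }

    static void Main()
    {
        int bothZero = 0;
        for (int i = 0; i < 1000; i++)
            if (RunOnce(true)) bothZero++;
        // With the full fence in place this count stays 0.
        Console.WriteLine("Both reads saw 0 with a full fence: {0}", bothZero);
    }
}
```

Calling RunOnce(false) leaves only the volatile semantics in place; whether the swapped outcome is ever observed then depends on timing and hardware, so the sketch makes no claim about that count.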

Notable Mentions

It is also important to note that:

  • x86 hardware has volatile semantics on writes.
  • Microsoft's implementation of the CLI (and, I suspect, Mono's as well) has volatile semantics on writes.
  • The ECMA specification does not have volatile semantics on writes.

Redundant to inherit from Object in C#?

17 votes

As stated above, is it redundant to inherit from Object in C#? Do both sets of code below result in equivalent objects being defined?

class TestClassUno : Object
{
    // Stuff
}

vs.

class TestClassDos
{
    // Stuff
}

I snooped around on MSDN but wasn't able to find anything perfectly conclusive.

If left unspecified every class definition will implicitly inherit from System.Object hence the two definitions are equivalent.
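The equivalence can be confirmed with reflection. This is a small sketch using the two class names from the question; the BaseTypeDemo wrapper is an assumption added for the example.

```csharp
using System;

class TestClassUno : Object
{
}

class TestClassDos
{
}

static class BaseTypeDemo
{
    // Returns true when both classes report the same base type.
    public static bool SameBase()
    {
        return typeof(TestClassUno).BaseType == typeof(TestClassDos).BaseType;
    }

    static void Main()
    {
        Console.WriteLine(typeof(TestClassUno).BaseType);   // prints "System.Object"
        Console.WriteLine(typeof(TestClassDos).BaseType);   // prints "System.Object"
        Console.WriteLine(SameBase());                      // prints "True"
    }
}
```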

The only time these two would be different is if someone actually defined another Object type in the same namespace. In this case the local definition of Object would take precedence and change the base class:

namespace Example {
  class Object { } 
  class C : Object { } 
}

Very much a corner case, but I wouldn't point it out if I hadn't seen it before.

Note that the same is not true if you used object instead of Object. The C# keyword object is a type alias for System.Object and hence it wouldn't match Example.Object.

namespace Example2 { 
  class Object { } 
  class C : Object { } // Uses Example2.Object
  class D : object { } // Uses System.Object
}

Of course, if you have a truly evil developer, you could still cause confusion with object:

namespace System { 
  class Object { 
    private Object() { } 
  }
}

namespace Example3 {
  // This will properly fail to compile since it can't bind to the private
  // Object constructor.  This demonstrates that we are using our definition
  // of Object instead of mscorlib's 
  class C : object { } // Uses our System.Object
}

Infinite loop in release mode

17 votes

When I run the following code in debug mode, it'll successfully finish and exit. However, if I run the following code in release mode, it'll get stuck in an infinite loop and never finish.

static void Main(string[] args)
{
    bool stop = false;

    new Thread(() =>
    {
        Thread.Sleep(1000);
        stop = true;
        Console.WriteLine("Set \"stop\" to true.");

    }).Start();

    Console.WriteLine("Entering loop.");

    while (!stop)
    {
    }

    Console.WriteLine("Done.");
}

Which optimization is causing it to get stuck in an infinite loop?

My guess would be processor caching of the stop variable on the main thread. In debug mode the memory model is stricter because the debugger needs to be able to provide a sensible view of the variable's state across all threads.

Try making a field and marking it as volatile:

static volatile bool stop = false;  // must be static to be used from the static Main

static void Main(string[] args)
{

    new Thread(() =>
    {
        Thread.Sleep(1000);
        stop = true;
        Console.WriteLine("Set \"stop\" to true.");

    }).Start();

    Console.WriteLine("Entering loop.");

    while (!stop)
    {
    }

    Console.WriteLine("Done.");
}

Why isn't string.Normalize consistent depending on the context?

16 votes

I have the following code:

string input = "ç";
string normalized = input.Normalize(NormalizationForm.FormD);
char[] chars = normalized.ToCharArray();

I build this code with Visual Studio 2010, .NET 4, on 64-bit Windows 7.

I run it in a unit test project (platform: Any CPU) in three contexts and check the content of chars:

  • Visual Studio unit tests : chars contains { 231 }.
  • ReSharper : chars contains { 231 }.
  • NCrunch : chars contains { 99, 807 }.

In the msdn documentation, I could not find any information presenting different behaviors.

So, why do I get different behaviors? For me the NCrunch behavior is the expected one, but I would expect the same for others.

Edit: I switched back to .NET 3.5 and still have the same issue.

The String.Normalize(NormalizationForm) documentation says that the

binary representation is in the normalization form specified by the normalizationForm parameter.

which means you'd be using FormD normalization in both cases, so CurrentCulture and such should not really matter.

The only thing I can think of that could change, then, is the "ç" character itself. That character is interpreted according to the character encoding that is either assumed or configured for Visual Studio source code files. In short, I think NCrunch is assuming a different source file encoding than the others.

Based on quick searching on NCrunch forum, there was a mention of some UTF-8 -> UTF-16 conversion, so I would check that.
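One way to take the source file encoding out of the picture is to write the character as a Unicode escape. The sketch below is an added illustration, with a made-up NormalizeDemo class, that decomposes U+00E7 and lists the resulting UTF-16 code units.

```csharp
using System;
using System.Linq;
using System.Text;

static class NormalizeDemo
{
    // Returns the UTF-16 code units of the FormD-normalized string.
    public static int[] CodeUnits(string s)
    {
        return s.Normalize(NormalizationForm.FormD).Select(c => (int)c).ToArray();
    }

    static void Main()
    {
        // "\u00E7" is ç written as an escape, so the file's encoding cannot alter it.
        string[] units = CodeUnits("\u00E7").Select(n => n.ToString()).ToArray();
        Console.WriteLine(string.Join(", ", units));   // prints "99, 807"
    }
}
```

If a test runner still reports { 231 } for the escaped form, the encoding explanation does not apply and the difference lies elsewhere.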

Can delegates cause a memory leak? GC.GetTotalMemory(true) seems to indicate so

13 votes

Code

using System;
internal static class Test
{
    private static void Main()
    {
        try
        {
            Console.WriteLine("{0,10}: Start point", GC.GetTotalMemory(true));
            Action simpleDelegate = SimpleDelegate;
            Console.WriteLine("{0,10}: Simple delegate created", GC.GetTotalMemory(true));
            Action simpleCombinedDelegate = simpleDelegate + simpleDelegate + simpleDelegate;
            Console.WriteLine("{0,10}: Simple combined delegate created", GC.GetTotalMemory(true));
            byte[] bigManagedResource = new byte[100000000];
            Console.WriteLine("{0,10}: Big managed resource created", GC.GetTotalMemory(true));
            Action bigManagedResourceDelegate = bigManagedResource.BigManagedResourceDelegate;
            Console.WriteLine("{0,10}: Big managed resource delegate created", GC.GetTotalMemory(true));
            Action bigCombinedDelegate = simpleCombinedDelegate + bigManagedResourceDelegate;
            Console.WriteLine("{0,10}: Big combined delegate created", GC.GetTotalMemory(true));
            GC.KeepAlive(bigManagedResource);
            bigManagedResource = null;
            GC.KeepAlive(bigManagedResourceDelegate);
            bigManagedResourceDelegate = null;
            GC.KeepAlive(bigCombinedDelegate);
            bigCombinedDelegate = null;
            Console.WriteLine("{0,10}: Big managed resource, big managed resource delegate and big combined delegate removed, but memory not freed", GC.GetTotalMemory(true));
            GC.KeepAlive(simpleCombinedDelegate);
            simpleCombinedDelegate = null;
            Console.WriteLine("{0,10}: Simple combined delegate removed, memory freed, at last", GC.GetTotalMemory(true));
            GC.KeepAlive(simpleDelegate);
            simpleDelegate = null;
            Console.WriteLine("{0,10}: Simple delegate removed", GC.GetTotalMemory(true));
        }
        catch (Exception e)
        {
            Console.WriteLine(e);
        }
        Console.ReadKey(true);
    }
    private static void SimpleDelegate() { }
    private static void BigManagedResourceDelegate(this byte[] array) { }
}

Output

GC.TotalMemory(true)
    105776: Start point
    191264: Simple delegate created
    191328: Simple combined delegate created
 100191344: Big managed resource created
 100191780: Big managed resource delegate created
 100191812: Big combined delegate created
 100191780: Big managed resource, big managed resource delegate and big combined delegate removed, but memory not freed
    191668: Simple combined delegate removed, memory freed, at last
    191636: Simple delegate removed

Interesting case. Here is the solution:


Combining delegates is observationally pure: from the outside, delegates look immutable. Internally, however, existing delegates are modified. Under certain conditions they share the same _invocationList for performance reasons (optimizing for the scenario where a few delegates are hooked up to the same event). Unfortunately, the _invocationList of simpleCombinedDelegate references bigManagedResourceDelegate, which keeps the memory alive.
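
The "observationally pure" part can be checked from user code. The following is a minimal sketch (the class and method names are made up for illustration); it only shows the public behaviour of combining delegates. The sharing of _invocationList is a runtime implementation detail that this code does not observe.

```csharp
using System;

public static class DelegateCombineSketch
{
    private static void Noop() { }

    // Returns the invocation-list lengths of the original and the combined delegate.
    public static int[] InvocationCounts()
    {
        Action single = Noop;
        Action combined = single + single + single; // yields a separate delegate object

        return new[]
        {
            single.GetInvocationList().Length,   // the original still reports one target
            combined.GetInvocationList().Length  // the combined delegate reports three
        };
    }

    // True when the combined delegate is a different object from the original.
    public static bool CombineReturnsNewInstance()
    {
        Action single = Noop;
        Action combined = single + single;
        return !ReferenceEquals(single, combined);
    }
}
```

Calling DelegateCombineSketch.InvocationCounts() gives 1 for the original delegate and 3 for the combined one, so as far as the caller can tell the original was never touched.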

IEnumerable Extension

12 votes

I want to make an IEnumerable<TSource> extension that can convert itself to an IEnumerable<SelectListItem>. So far I have been trying to do it this way:

    public static 
      IEnumerable<SelectListItem> ToSelectItemList<TSource, TKey>(this 
      IEnumerable<TSource> enumerable, Func<TSource, TKey> text, 
                                       Func<TSource, TKey> value)
    {
        List<SelectListItem> selectList = new List<SelectListItem>();

        foreach (TSource model in enumerable)
            selectList.Add(new SelectListItem() { Text = ?, Value = ?});

        return selectList;
    }

Is this the right way to go about doing it? If so, how do I get the appropriate values out of the Func<TSource, TKey> parameters?

You just need to use the two functions you supply as parameters to extract the text and the value. Assuming both text and value are strings, you don't need the TKey type parameter. There is also no need to create a list in the extension method: an iterator block using yield return is preferable, and it is how similar extension methods in LINQ are built.

public static IEnumerable<SelectListItem> ToSelectItemList<TSource>(
  this IEnumerable<TSource> enumerable,
  Func<TSource, string> text,
  Func<TSource, string> value)
{ 
  foreach (TSource model in enumerable) 
    yield return new SelectListItem { Text = text(model), Value = value(model) };
}

You can use it like this (you need to supply the two lambdas):

var selectedItems = items.ToSelectItemList(x => ..., x => ...);

However, you could just as well use Enumerable.Select:

var selectedItems = items.Select(x => new SelectListItem { Text = ..., Value = ... });
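
For a version that compiles outside an MVC project, here is a minimal sketch. SelectListItem below is a stand-in for System.Web.Mvc.SelectListItem, and Country and SelectListDemo are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for System.Web.Mvc.SelectListItem so the sketch compiles without ASP.NET MVC.
public class SelectListItem
{
    public string Text { get; set; }
    public string Value { get; set; }
}

// Hypothetical model type, used only for illustration.
public class Country
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class SelectListExtensions
{
    // Lazily projects each element using the two selector functions.
    public static IEnumerable<SelectListItem> ToSelectItemList<TSource>(
        this IEnumerable<TSource> enumerable,
        Func<TSource, string> text,
        Func<TSource, string> value)
    {
        foreach (TSource model in enumerable)
            yield return new SelectListItem { Text = text(model), Value = value(model) };
    }
}

public static class SelectListDemo
{
    // Builds a small list to show the two lambdas in use.
    public static List<SelectListItem> Build()
    {
        var countries = new[]
        {
            new Country { Id = 1, Name = "Norway" },
            new Country { Id = 2, Name = "Chile" }
        };
        return countries.ToSelectItemList(c => c.Name, c => c.Id.ToString()).ToList();
    }
}
```

Because the extension method is an iterator, nothing is enumerated until the caller iterates the result (here, via ToList()).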

Is there a GUID alternative for distributed key generation?

10 votes

My situation is :

  1. I have a number of client applications, each using a local DB (MS SQL, MS Access; sorry, this is an enterprise system and I have to support legacy...)
  2. I can't predict how the number of clients will grow: now it's ~10, but it may be ~100 in a year.
  3. Data from those tables comes to my central server and is put into one common table
  4. Sometimes existing (client) data is changed - I have to perform update/delete operations
  5. I don't want to use GUIDs (.NET type System.Guid): they are hard to implement and support on MS Access. Besides, they are not good for performance
  6. I need a fast search on that common table, so it would be nice to use int or long int as a PK

So, I want:

  1. Something unique to avoid collisions (it will be used as a PK)
  2. It should hopefully be int or long int
  3. Must be assignable client-side before being inserted

My current solution is to take the CRC from a concatenation of:

  • ProcessorID
  • Bios date
  • User name (strings, hardware\user related data)
  • DateTime.Now (UTC)

Currently it works for me, but maybe there is a better approach to achieve my goals? Any comments, suggestions, examples, or experience of your own?

UPDATE: synchronization between client and server is a periodic action, so it can occur 2-3 times per day (it's a config variable)

If data from multiple client tables flows into one central table and you need to apply changes to those records, my suggestion is to use two columns as the PK of your central table. One column could be the identity field from the client (not unique on its own) and the other a client code (not unique on its own) that you assign to each client app. Together, the ID and the client code form your PK.

This solution has the advantage of not requiring any changes to the client-side apps (except perhaps an identity code to send to your central server, which you could also use as a security measure). Of course, if the customer base grows (hopefully), you need to keep a centralized table of the codes assigned to each client. Searching the central table should not be a problem because you are using two numbers (or a short string for the identity code).
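
To make the two-column key concrete, here is a minimal in-memory sketch. CentralKey and CentralTableSketch are hypothetical names, and the dictionary merely stands in for the central table; in the database itself this would be a composite primary key over the two columns.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical composite key: the centrally assigned client code plus the client's local identity value.
public struct CentralKey : IEquatable<CentralKey>
{
    public readonly int ClientCode;
    public readonly long LocalId;

    public CentralKey(int clientCode, long localId)
    {
        ClientCode = clientCode;
        LocalId = localId;
    }

    public bool Equals(CentralKey other)
    {
        return ClientCode == other.ClientCode && LocalId == other.LocalId;
    }

    public override bool Equals(object obj)
    {
        return obj is CentralKey && Equals((CentralKey)obj);
    }

    public override int GetHashCode()
    {
        unchecked { return (ClientCode * 397) ^ LocalId.GetHashCode(); }
    }
}

public static class CentralTableSketch
{
    // In-memory stand-in for the central table, keyed by (ClientCode, LocalId).
    public static Dictionary<CentralKey, string> Merge()
    {
        var central = new Dictionary<CentralKey, string>();
        central[new CentralKey(1, 42)] = "row 42 from client 1";
        central[new CentralKey(2, 42)] = "row 42 from client 2";           // same local id, different client: no collision
        central[new CentralKey(1, 42)] = "row 42 from client 1 (updated)"; // a later sync updates the existing row
        return central;
    }
}
```

Both parts of the key are plain integers, so lookups stay cheap and the clients keep generating their identity values exactly as they do today.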

How is everyone storing connectionstrings?

9 votes

I was wondering if people could post their solution to the ongoing problem of local databases and different connectionstrings among many developers in one project within source control?

More specifically, I'm talking about a project that is in source control and has many developers, each with a local database. Each developer has their own connection string (named instance, default instance, machine name, username, password, etc). Every check-in overrides the previous version, and pulling the latest version results in using someone else's connection string.

So, people, which solution are you using for this problem? Extra points for explanations of why your solution works, plus its pros and cons.

EDIT Keep in mind that this answer shouldn't be targeted only to an enterprise environment where you have full control of the setup. The right solution should work for everyone: enterprise, startup, and open source devs.

Thanks!

To me, your question seems to imply one of two outcomes:

  1. A connection string is specified in the Web.config file that is generic enough to work for all local versions of the database. You've indicated that this isn't an ideal setup in environments where you don't have complete control.
  2. Each developer is required to supply his or her own connection string that is never checked into source control.

A few others have already covered the first scenario. Use localhost and follow a convention for the database name. For option 2, I'd recommend specifying a config source that doesn't get checked into source control:

<configuration>
  <connectionStrings configSource="connectionStrings.config"/>
</configuration>

EDIT:

connectionStrings.config

<connectionStrings>
  <add name="Name" 
    providerName="System.Data.ProviderName" 
    connectionString="Valid Connection String;" />
</connectionStrings>

From: http://msdn.microsoft.com/en-us/library/ms254494(v=vs.80).aspx

connectionStrings.config would be a file in the root of the project that you specifically excluded from source control. Each developer would be required to provide this file when working locally. Your production connection string could be substituted via a Web.config transformation on build / deployment.

Why InitializeComponent is public

9 votes

The public interface of my WPF user control contains the autogenerated InitializeComponent method (which is contained in a partial class). This surprised me, as I expected such internal stuff to be private.

Is there any way to remove InitializeComponent from the user control's public interface?

InitializeComponent is a method defined on the interface System.Windows.Markup.IComponentConnector and is used for loading the compiled page of a component.

See the MSDN excerpt below (the linked page has more information):

IComponentConnector is used internally by Baml2006Reader.

Implementations of InitializeComponent are widely observable as part of the infrastructure provided by frameworks or technologies that use XAML combined with application and programming models. For example, whenever you look at the generated classes for XAML root elements in WPF pages and applications, you will see InitializeComponent defined in the output. That method also exists in the compiled assembly and plays a role in the WPF application model of loading the XAML UI content at XAML parse time (and I suppose hence InitializeComponent has to be in an interface and be public so that other outside WPF related assemblies can make use of it).

To explain this further, go to the definition of the InitializeComponent() method in, say, the Window1.g.cs file of a project called WPFProject, and change its access from public to private.

(Keep the .g.cs file open in your project; otherwise the build process overwrites the file and you won't be able to see the error.)

Now, when you compile your WPF project, it fails with the following compile error:

Error 22 'WPFProject.Window1' does not implement interface member 'System.Windows.Markup.IComponentConnector.InitializeComponent()'. 'WPFProject.Window1.InitializeComponent()' cannot implement an interface member because it is not public.

Additionally, InitializeComponent() is marked with the [System.Diagnostics.DebuggerNonUserCodeAttribute()] attribute so you can't step into this method while debugging.

There is another SO Q&A discussion that explains this in more detail.
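
The compile error comes from a general C# rule: a method that implicitly implements an interface member must be public. The sketch below illustrates only that language rule. IConnector is a hypothetical stand-in for System.Windows.Markup.IComponentConnector, the class names are made up, and it is not a recipe for editing the generated .g.cs file, which the build regenerates anyway.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for System.Windows.Markup.IComponentConnector.
public interface IConnector
{
    void InitializeComponent();
}

// Implicit implementation: the method has to be public to satisfy the interface.
public class ImplicitWindow : IConnector
{
    public void InitializeComponent() { }
}

// Explicit implementation: callable only through an IConnector reference.
public class ExplicitWindow : IConnector
{
    void IConnector.InitializeComponent() { }
}

public static class InterfaceVisibilitySketch
{
    // True if the type exposes a public instance method named InitializeComponent.
    public static bool HasPublicInitializeComponent(Type type)
    {
        return type.GetMethod("InitializeComponent",
            BindingFlags.Public | BindingFlags.Instance) != null;
    }
}
```

HasPublicInitializeComponent returns true for ImplicitWindow and false for ExplicitWindow, which is the distinction the error message above is pointing at.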

Why does WPF not use IDisposable, and what are the ramifications?

8 votes

So, I've never done a sizable WPF project before now, but this one has me wondering: what's up with WPF classes not implementing IDisposable? By comparison, all of the UI elements in Windows Forms implement IDisposable, to ensure they get rid of the underlying handles and such.

I think the same Windows objects are under the covers there, and those resources have to be released; so, how is WPF doing that?

Is there anything I need to be doing with my WPF Window objects beyond Close()ing them?

WinForms controls have handles because they are wrappers around Win32 controls. WPF controls are not (well, windows are, but the controls they host are not). The reason is that, after all, a WPF window is just a DirectX rendering context, and all controls are just a bunch of triangles. So, they don't need to be actually registered to the OS, and therefore, they don't have handles (except windows and anything that inherits from HwndHost, of course, which are Win32 objects).

That's why there's such a sizeable interop layer between WPF and WinForms: WPF controls simply aren't Windows objects.

.net localization for non-strings

8 votes

I am localizing a WPF application using .resx files. I created copies of the main Resources file, such as Resources.en-US.resx and Resources.cs-CZ.resx. This works well for strings. However, I can't figure out how to localize other files, like images or documents, in resource files.

When I add a new image to a Resources file (either Resources.en-US.resx or Resources.cs-CZ.resx), a copy of the file is always placed in the /Resources directory. So there cannot be multiple versions of one file for multiple languages, because a directory can contain only one file with a given name.

The ideal solution would be for images from localized resources to be copied into subdirectories like /Resources/en-US. As things stand, I am unable to localize images and documents using .resx files. Any ideas how I can achieve this? Thank you.

The following MSDN post Resources and Localization in ASP.NET 2.0 - Displaying Localized Images states:

While ASP.NET 2.0 doesn't directly support localizing image files, it doesn't require too much custom code to achieve the desired effect.

It provides the following workaround:

You can start by adding the localized versions of an image file to localized versions of a global resource file. For example, the English version of LitwareSlogan.png has been added to the global resource file named Litware.resx while the French version of LitwareSlogan.fr.png has been added to Litware.fr.resx. The resources in both resource files have been given the same name of LitwareSlogan.

Complete sample code is provided at the site.

Is Mapper.Map in AutoMapper thread-safe?

7 votes

I'm looking through the AutoMapper code now (evaluating it for one of the projects I'm working on), and, frankly speaking, I'm quite surprised:

  • The library API is based on a single static access point (Mapper type), so generally any of its methods must be thread safe
  • But I didn't find ANY evidence of this in code.

All I was able to find is this issue, but even the statement made there seems incorrect: if Map doesn't use thread-safe data structures internally, it can't be considered thread-safe either when I call CreateMap in a non-concurrent context but concurrently with Map.

I.e. the only possible usage pattern of AutoMapper in, e.g., an ASP.NET MVC application is:

lock (mapperLock) {
    ... Mapper.AnyMethod(...) ...
}

Obviously, if I'm correct, that's a huge shortcoming.

So I have two questions:

  • Am I correct?
  • If yes, what's the best alternative to AutoMapper that doesn't have this issue?

The linked issue more or less answers your questions:

Mapper.CreateMap is not threadsafe, nor will it ever be. However, Mapper.Map is thread-safe. The Mapper static class is just a thin wrapper on top of the MappingEngine and Configuration objects.

So only use Mapper.CreateMap if you do your configuration in one central place in a threadsafe manner.

Your comment was:

I'm asking this because I'd like to configure AutoMapper in-place, i.e. right before usage. I planned to configure it in non-concurrent context, i.e. ~ lock (mapperConfigLock) { Mapper.CreateMap()....; }, and I fear this is not enough now.

If you are doing in-place configuration, just don't use the static Mapper class. As the comment on the GitHub issue suggests, use the mapping engine directly:

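// Configuration and engine are local to the calling method, so no static state is shared.
var config = 
    new ConfigurationStore(new TypeMapFactory(), MapperRegistry.AllMappers());
config.CreateMap<Source, Destination>();
var engine = new MappingEngine(config);

var source = new Source();
// The destination type cannot be inferred, so the type arguments are spelled out.
var dest = engine.Map<Source, Destination>(source);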

It's a little more code, but you can create your own helpers around it. Everything is local to a given method, so there is no shared state and no need to worry about thread safety.

Is there a need to secure connection string in web.config?

6 votes

So I am using connection strings in my web.config using SQL authentication.

Of course people say this could be a vulnerability as you are storing password in plaintext.

However, from what I know, IIS never serves web.config, and web.config should only have read access to administrators and IIS anyway. So if the hacker has gained access to the webserver, then it won't matter what encryption I use because the private key will be on the webserver.

Wouldn't encrypting the connection string be classified as security through obscurity?

Is it worth encrypting web.config connection string and storing the private key on the webserver?

Further, of course if I don't use SSL, I am transmitting connection string over HTTP in plaintext. If I use SSL then this problem should be mitigated as well.

I wouldn't say that storing a plaintext password in Web.config is a security vulnerability, in and of itself. But encrypting the password is a useful defense-in-depth measure, not just security through obscurity:

  1. What if IIS is misconfigured to serve Web.config?
  2. What if a security vulnerability is discovered in ASP.NET (like the padding oracle vulnerability) that allows anyone to download Web.config?
  3. There are varying degrees of access to the Web server, from full administrative privileges to server-side code injection. If an attacker can only manage to do the latter, he might be able to read Web.config but might not be able to access the machine keys, especially if your application is running under partial trust.

In the end, it's up to you to decide if the risk of storing plaintext passwords in Web.config is acceptable. Of course, if Windows authentication is an option, then you may want to consider using that instead of SQL authentication.

UPDATE: When talking about security, it's a good idea to identify the assets and the threats. In this case, the asset is sensitive data in the database (if the data is unimportant, then why bother protecting it with a password?), and the threat is the possibility of an attacker somehow gaining access to Web.config and thus the database as well. A possible mitigation is to encrypt the database password in Web.config.

How much of a risk is it? Do we really have to plan for such an astronomically rare occurrence?

This mitigation has already proved its worth once: when the ASP.NET padding oracle vulnerability was discovered. Anyone who stored a plaintext password in Web.config was at risk; anyone who encrypted the password wasn't. How certain are you that another similar vulnerability in ASP.NET won't be discovered in the next few years?

Should we also encrypt source code and decrypt on run-time? Seems excessive to me.

So what if an attacker does get access to your source code? What's the asset you're protecting, and what's the threat you're concerned about? I think that in many cases, source code is much less valuable than data. (I'm thinking here about off-the-shelf commercial and open-source software which anyone can obtain.) And if your source code is valuable, maybe obfuscation is something to think about.

I feel if they already have even limited access to your box, then your host has failed or you've installed vulnerable services already.

What about security vulnerabilities in ASP.NET or your code? They do pop up from time to time.

My concern is standard practices. Is it a standard?

Microsoft has recommended encrypting connection strings.

What you should do is evaluate the risk that storing a plaintext password poses:

  • How likely is it that an attacker will be able to discover and exploit a security vulnerability that exposes Web.config? Based on past history, I'd say the likelihood is low (but not "astronomically" low).
  • How valuable or sensitive is your data? If all you're storing is pictures of your cat, then maybe it doesn't matter much whether an attacker gets your database password. But if you're storing personally identifiable information, then from a legal standpoint, I'd say you should take all possible measures to secure your application, including encrypting your connection strings.

Is there a way to make the _ appear for hotkeys without the Alt key?

5 votes

In WPF when you make a label like this:

<Label Content="_My Label"/>

Then when you run the app and press the Alt key it will show the "M" underlined.

We have our own custom hotkey Attached Property that allows us to use Ctrl as well as Alt.

Problem is that only Alt will show the underscores.

Is there a way to show the underscore when the Ctrl key is pressed?

NOTE: I do NOT want to send a programmatic Alt KeyPress in the background when Ctrl is pressed. That will just confuse my shortcut system.

OK! I have a solution that shows the _ for hotkeys when Ctrl is pressed instead of Alt.

Here is how to do it:

A small piece of code to press a keyboard key programmatically:

/// <summary>
/// Simulates a key-down event for the given keyboard key.
/// </summary>
void PressKey(Key KeyboardKey)
{
    // Build a KeyDown event for the requested key and hand it to the input manager.
    KeyEventArgs args = new KeyEventArgs(Keyboard.PrimaryDevice,
        Keyboard.PrimaryDevice.ActiveSource, 0, KeyboardKey);
    args.RoutedEvent = Keyboard.KeyDownEvent;
    InputManager.Current.ProcessInput(args);
}

Code to append and remove the HotKeyChar:

/// <summary>
/// Inserts a HotKeyChar ("_") into the Content of a control at KeyIndex.
/// </summary>
void AppendHotKeyChar(ContentControl Ctrl, int KeyIndex)
{
    string text = Ctrl.Content.ToString();
    if (text.Substring(KeyIndex, 1) != "_")
    {
        // Insert at KeyIndex so the marker lands right before the hotkey letter.
        Ctrl.Content = text.Insert(KeyIndex, "_");
    }
}
/// <summary>
/// Removes the HotKeyChar ("_") at KeyIndex from the Content of a control.
/// </summary>
void RemoveHotKeyChar(ContentControl Ctrl, int KeyIndex)
{
    string text = Ctrl.Content.ToString();
    if (text.Substring(KeyIndex, 1) == "_")
    {
        Ctrl.Content = text.Remove(KeyIndex, 1);
    }
}

XAML Code for Button Bt1 :

<Button x:Name="Bt1" Content="Button" HorizontalAlignment="Left" Margin="169,97,0,0" VerticalAlignment="Top" Width="75"/>

Code for Window.Loaded event of the MainWindow (e.g. MainWindow1_Loaded) :

PressKey(Key.LeftAlt);

Code for Window.KeyDown event of the MainWindow (e.g. MainWindow1_KeyDown) :

if (e.Key == Key.LeftCtrl)
{
    AppendHotKeyChar(Bt1, 0);
}

Code for Window.KeyUp event of the MainWindow (e.g. MainWindow1_KeyUp) :

if (e.Key == Key.LeftCtrl)
{
    RemoveHotKeyChar(Bt1, 0);
}

Now, when you start your app, Alt is pressed once programmatically.

From then on, every time you press Ctrl, a _ is inserted into your Control.Content, so the hotkey appears underlined. Note that you should create Control.Content without the HotKeyChar '_' but keep the index at which the _ will be inserted.

Keep in mind that if Alt is pressed again in your app, the code stops working. You then have to press Alt once more to make it work again.

Best way to append and remove a HotKeyChar:

  • Create an instance of List<KeyValuePair<int, Control>> to store the index of the HotKeyChar and the Control.
  • In the KeyDown event, loop through the KeyValuePair<...> entries in the List<...>, inserting the _.
  • In the KeyUp event, loop through the same KeyValuePair<...> entries in the List<...>, removing the _.
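
The string handling behind those two loops can be sketched without any WPF types. HotKeyText and its methods are hypothetical helpers; in the real handlers the result would be assigned back to each control's Content.

```csharp
using System;

public static class HotKeyText
{
    // Inserts "_" at keyIndex unless one is already there.
    public static string AddMarker(string content, int keyIndex)
    {
        if (keyIndex < 0 || keyIndex > content.Length)
            throw new ArgumentOutOfRangeException("keyIndex");
        bool alreadyMarked = keyIndex < content.Length && content[keyIndex] == '_';
        return alreadyMarked ? content : content.Insert(keyIndex, "_");
    }

    // Removes the "_" at keyIndex if present; otherwise returns the text unchanged.
    public static string RemoveMarker(string content, int keyIndex)
    {
        if (keyIndex < 0 || keyIndex >= content.Length)
            return content;
        return content[keyIndex] == '_' ? content.Remove(keyIndex, 1) : content;
    }
}
```

For example, HotKeyText.AddMarker("Save As", 5) returns "Save _As", and HotKeyText.RemoveMarker("Save _As", 5) restores "Save As". Both calls are idempotent, so repeated KeyDown events while Ctrl is held do not stack up underscores.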

Hope it Helped!

How do I force compilation of ASP.NET MVC views?

5 votes

I have a Windows Azure web role that contains a web site using ASP.NET MVC. When an HTTP request arrives and a page is first loaded, the view (.aspx or .cshtml) is compiled. That takes some time, so the first time a page is served it takes notably longer than serving the same page later.

I've enabled <MvcBuildViews> (described in this answer) to enforce compile-time validation of views, but that doesn't seem to have any effect on their compilation when the site is deployed and running.

Azure web roles have so-called startup tasks and also a special OnStart() method where I can place any warm-up code, so once I know what to do, adding it to the role is not a problem.

Is there a way to force compilation of all views?

It turns out there's ASP.NET precompilation, which can be performed using ClientBuildManager.PrecompileApplication. It mimics the on-demand compilation behavior but compiles every page up front. I tried it, and the first load looks notably faster.