Best .NET questions in January 2011

AutoMapper vs ValueInjecter

67 votes

Every time I look for AutoMapper stuff on StackOverflow, I read something about ValueInjecter.

Can somebody tell me the pros and cons of each (performance, features, API usage, extensibility, testing)?

As the creator of ValueInjecter, I can tell you that I did it because I wanted something simple and very flexible.

I really don't like writing much or writing lots of monkey code like:

Prop1.Ignore, Prop2.Ignore etc.
CreateMap<Foo,Bar>(); CreateMap<Tomato, Potato>(); etc.

ValueInjecter is something like Mozilla with its plugins: you create ValueInjections and use them

there are built-in injections for flattening, unflattening, and some that are intended to be inherited

and it works more in an aspect type of way, you don't have to specify all properties 1-to-1, instead you do something like:

take all the int properties from the source whose names end with "Id", transform the value, and set each to a property in the target object that has the same name without the Id suffix and whose type inherits from Entity, stuff like that

so one obvious difference, ValueInjecter is used even in windows forms with flattening and unflattening, that's how flexible it is

(mapping from object to form controls and back)

AutoMapper is not usable in Windows Forms and has no unflattening, but it has good stuff like collection mapping, so in case you need that with ValueInjecter you just do something like:

foos.Select(o => new Bar().InjectFrom(o));

you can also use ValueInjecter to map from anonymous and dynamic objects

differences:

  • automapper create configuration for each mapping possibility CreateMap()

  • valueinjecter inject from any object to any object (there are also cases when you inject from object to valuetype)

  • automapper has flattening built in, but only for simple types or from the same type, and it doesn't have unflattening

  • valueinjecter only if you need it you do target.InjectFrom<FlatLoopValueInjection>(source); also <UnflatLoopValueInjection> and if you want from Foo.Bar.Name of type String to FooBarName of type Class1 you inherit FlatLoopValueInjection and specify this

  • automapper maps properties with same name by default and for the rest you have to specify one by one, and do stuff like Prop1.Ignore(), Prop2.Ignore() etc.

  • valueinjecter has a default injection .InjectFrom() that does the properties with the same name and type; for everything else you create your custom valueinjections with individual mapping logic/rules, more like aspects, e.g. from all props of Type Foo to all props of type Bar
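To make the "same name and type" rule concrete, here is a minimal reflection sketch of that idea. It is not ValueInjecter's implementation and does not use its API; `NaiveInjector`, `InjectSameNameAndType`, `PersonEntity` and `PersonView` are hypothetical names made up for illustration.

```csharp
using System;
using System.Reflection;

// Hypothetical sample types, only for illustration.
public class PersonEntity
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime Born { get; set; }
}

public class PersonView
{
    public string Name { get; set; }   // same name and type: copied
    public string Born { get; set; }   // same name, different type: skipped
}

public static class NaiveInjector
{
    // Copies every readable source property to a writable target property
    // that has the same name and exactly the same type.
    public static TTarget InjectSameNameAndType<TTarget>(TTarget target, object source)
        where TTarget : class
    {
        if (target == null) throw new ArgumentNullException("target");
        if (source == null) throw new ArgumentNullException("source");

        foreach (PropertyInfo sourceProp in source.GetType().GetProperties())
        {
            // Skip write-only properties and indexers.
            if (!sourceProp.CanRead || sourceProp.GetIndexParameters().Length > 0) continue;

            PropertyInfo targetProp = target.GetType().GetProperty(sourceProp.Name);
            if (targetProp == null || !targetProp.CanWrite) continue;
            if (targetProp.PropertyType != sourceProp.PropertyType) continue;

            targetProp.SetValue(target, sourceProp.GetValue(source, null), null);
        }
        return target;
    }
}
```

Because the source is typed as object, the same call would also accept an anonymous object, for example `NaiveInjector.InjectSameNameAndType(new PersonView(), new { Name = "Ann" })`.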

Why does (does it really?) List<T> implement all these interfaces, not just IList<T>?

Asked on Thu, 27 Jan 2011 by idm c# .net
31 votes

List declaration from MSDN:

public class List<T> : IList<T>, ICollection<T>, 
 IEnumerable<T>, IList, ICollection, IEnumerable

Reflector gives a similar picture. Does List<T> really implement all of these (and if so, why)? I have checked:

    interface I1 {}
    interface I2 : I1 {}
    interface I3 : I2 {}

    class A : I3 {}
    class B : I3, I2, I1 {}

    static void Main(string[] args)
    {
        var a = new A();
        var a1 = (I1)a;
        var a2 = (I2)a;
        var a3 = (I3)a;

        var b = new B();
        var b1 = (I1) b;
        var b2 = (I2)b;
        var b3 = (I3)b;
    }

it compiles.

[UPDATED]:

Guys, as I understand it, all the replies state that this:

class Program
{

    interface I1 {}
    interface I2 : I1 {}
    interface I3 : I2 {}

    class A : I3 {}
    class B : I3, I2, I1 {}

    static void I1M(I1 i1) {}
    static void I2M(I2 i2) {}
    static void I3M(I3 i3) {}

    static void Main(string[] args)
    {
        var a = new A();
        I1M(a);
        I2M(a);
        I3M(a);

        var b = new B();
        I1M(b);
        I2M(b);
        I3M(b);

        Console.ReadLine();
    }
}

would give an error, but it compiles and runs without any errors. Why?

UPDATE: This question was the basis of my blog entry for Monday April 4th 2011. Thanks for the great question.

Let me break it down into many smaller questions.

Does List<T> really implement all those interfaces?

Yes.

Why?

Because when an interface (say, IList<T>) inherits from an interface (say IEnumerable<T>) then implementers of the more derived interface are required to also implement the less derived interface. That's what interface inheritance means; if you fulfill the contract of the more derived type then you are required to also fulfill the contract of the less derived type.

So a class is required to implement all the methods of all the interfaces in the transitive closure of its base interfaces?

Exactly.

Is a class that implements a more-derived interface also required to state in its base type list that it is implementing all of those less-derived interfaces?

No.

Is the class required to NOT state it?

No.

So it's optional whether the less-derived implemented interfaces are stated in the base type list?

Yes.

Always?

Almost always:

interface I1 {}
interface I2 : I1 {}
interface I3 : I2 {} 

It is optional whether I3 states that it inherits from I1.

class B : I3 {}

Implementers of I3 are required to implement I2 and I1, but they are not required to state explicitly that they are doing so. It's optional.

class D : B {}

Derived classes are not required to re-state that they implement an interface from their base class, but are permitted to do so. (This case is special; see below for more details.)

class C<T> where T : I3
{
    public virtual void M<U>() where U : I3 {}
}

Type arguments corresponding to T and U are required to implement I2 and I1, but it is optional for the constraints on T or U to state that.

It is always optional to re-state any base interface in a partial class:

partial class E : I3 {}
partial class E {}

The second half of E is permitted to state that it implements I3 or I2 or I1, but not required to do so.

OK, I get it; it's optional. Why would anyone unnecessarily state a base interface?

Perhaps because they believe that doing so makes the code easier to understand and more self-documenting.

Or, perhaps the developer wrote the code as

interface I1 {}
interface I2 {}
interface I3 : I1, I2 {}

and then realized, oh, wait a minute, I2 should inherit from I1. Why should making that edit then require the developer to go back and change the declaration of I3 to not contain explicit mention of I1? I see no reason to force developers to remove redundant information.

Aside from being easier to read and understand, is there any technical difference between stating an interface explicitly in the base type list and leaving it unstated but implied?

Usually no, but there can be a subtle difference in one case. Suppose you have a derived class D whose base class B has implemented some interfaces. D automatically implements those interfaces via B. If you re-state the interfaces in D's base class list then the C# compiler will do an interface re-implementation. The details are a bit subtle; if you are interested in how this works then I recommend a careful reading of section 13.4.6 of the C# 4 specification.
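A small sketch of that subtle case follows. The type names (`IGreeter`, `Base`, `ImpliedDerived`, `RestatedDerived`) are hypothetical; the point is only that re-stating the interface changes which method an interface call reaches when the base method is not virtual.

```csharp
using System;

public interface IGreeter
{
    string Greet();
}

public class Base : IGreeter
{
    // Non-virtual, so derived classes can only hide it, not override it.
    public string Greet() { return "Base"; }
}

// Implements IGreeter only via Base; the interface still maps to Base.Greet.
public class ImpliedDerived : Base
{
    public new string Greet() { return "ImpliedDerived"; }
}

// Re-states IGreeter: the compiler re-implements the interface,
// so the interface now maps to RestatedDerived.Greet.
public class RestatedDerived : Base, IGreeter
{
    public new string Greet() { return "RestatedDerived"; }
}

public static class ReimplementationDemo
{
    // Calls Greet through the interface rather than through the class type.
    public static string ViaInterface(IGreeter greeter)
    {
        return greeter.Greet();
    }
}
```

Here `ReimplementationDemo.ViaInterface(new ImpliedDerived())` returns "Base", while `ReimplementationDemo.ViaInterface(new RestatedDerived())` returns "RestatedDerived".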

Does the List<T> source code actually state all those interfaces?

No. The actual source code says

public class List<T> : IList<T>, System.Collections.IList

Why does MSDN have the full interface list but the real source code does not?

Because MSDN is documentation; it's supposed to give you as much information as you might want. It is much more clear for the documentation to be complete all in one place than to make you search through ten different pages to find out what the full interface set is.

Why does Reflector show the whole list?

Reflector only has metadata to work from. Since putting in the full list is optional, Reflector has no idea whether the original source code contains the full list or not. It is better to err on the side of more information. Again, Reflector is attempting to help you by showing you more information rather than hiding information you might need.
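You can see the same metadata-level view from reflection. The sketch below is only an illustration (`InterfaceLister` and the sample types are hypothetical names): `Type.GetInterfaces()` reports every implemented interface, including ones the source never stated.

```csharp
using System;
using System.Linq;

public interface I1 {}
public interface I2 : I1 {}
public interface I3 : I2 {}

// States only I3 in its base type list.
public class A : I3 {}

public static class InterfaceLister
{
    // Returns the names of all interfaces the runtime reports for a type,
    // whether or not the source code stated them explicitly.
    public static string[] ListInterfaces(Type type)
    {
        return type.GetInterfaces()
                   .Select(i => i.Name)
                   .OrderBy(name => name, StringComparer.Ordinal)
                   .ToArray();
    }
}
```

For `typeof(A)` this returns I1, I2 and I3, even though the declaration of A mentions only I3.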

BONUS QUESTION: Why does IEnumerable<T> inherit from IEnumerable but IList<T> does not inherit from IList?

A sequence of integers can be treated as a sequence of objects, by boxing every integer as it comes out of the sequence. But a read-write list of integers cannot be treated as a read-write list of objects, because you can put a string into a read-write list of objects. An IList<T> is not required to fulfill the whole contract of IList, so it does not inherit from it.

How is it that an enum derives from System.Enum and is an integer at the same time?

30 votes



Here's what's kind of confusing me. My understanding is that if I have an enum like this...

enum Animal
{
    Dog,
    Cat
}

...what I've essentially done is defined a value type called Animal with two defined values, Dog and Cat. This type derives from the reference type System.Enum (something which value types can't normally do—at least not in C#—but which is permitted in this case), and has a facility for casting back and forth to/from int values.

If the way I just described the enum type above were true, then I would expect the following code to throw an InvalidCastException:

public class Program
{
    public static void Main(string[] args)
    {
        // Box it.
        object animal = Animal.Dog;

        // Unbox it. How are these both successful?
        int i = (int)animal;
        Enum e = (Enum)animal;

        // Prints "0".
        Console.WriteLine(i);

        // Prints "Dog".
        Console.WriteLine(e);
    }
}

Normally, you cannot unbox a value type from System.Object as anything other than its exact type. So how is the above possible? It is as if the Animal type is an int (not just convertible to int) and is an Enum (not just convertible to Enum) at the same time. Is it multiple inheritance? Does System.Enum somehow inherit from System.Int32 (something I would not have expected to be possible)?

Edit: It can't be either of the above. The following code demonstrates this (I think) conclusively:

object animal = Animal.Dog;

Console.WriteLine(animal is Enum);
Console.WriteLine(animal is int);

The above outputs:

True
False

Both the MSDN documentation on enumerations and the C# specification make use of the term "underlying type"; but I don't know what this means, nor have I ever heard it used in reference to anything other than enums. What does "underlying type" actually mean?


So, is this yet another case that gets special treatment from the CLR?

My money's on that being the case... but an answer/explanation would be nice.


Update: Damien_The_Unbeliever provided the reference to truly answer this question. The explanation can be found in Partition II of the CLI specification, in the section on enums:

For binding purposes (e.g., for locating a method definition from the method reference used to call it) enums shall be distinct from their underlying type. For all other purposes, including verification and execution of code, an unboxed enum freely interconverts with its underlying type. Enums can be boxed to a corresponding boxed instance type, but this type is not the same as the boxed type of the underlying type, so boxing does not lose the original type of the enum.

Edit (again?!): Wait, actually, I don't know that I read that right the first time. Maybe it doesn't 100% explain the specialized unboxing behavior itself (though I'm leaving Damien's answer as accepted, as it shed a great deal of light on this issue). I will continue looking into this...


Another Edit: Man, then yodaj007's answer threw me for another loop. Somehow an enum is not exactly the same as an int; yet an int can be assigned to an enum variable with no cast? Buh?

I think this is all ultimately illuminated by Hans's answer, which is why I've accepted it. (Sorry, Damien!)

Yes, special treatment. The JIT compiler is keenly aware of the way boxed value types work, which is in general what makes value types act a bit schizoid. Boxing involves creating a System.Object value that behaves exactly the same way as a value of a reference type. At that point, value type values no longer behave like values do at runtime, which makes it possible, for example, to have a virtual method like ToString(). The boxed object has a method table pointer, just like reference types do.

The JIT compiler knows the method table pointers for value types like int and bool up front. Boxing and unboxing for them is very efficient; it takes but a handful of machine code instructions. This needed to be efficient back in .NET 1.0 to make it competitive. A very important part of that is the restriction that a value type value can only be unboxed to the same type. This saves the jitter from having to generate a massive switch statement that invokes the correct conversion code. All it has to do is check the method table pointer in the object, verify that it is the expected type, and copy the value out of the object directly. Notable perhaps is that this restriction doesn't exist in VB.NET; its CType() operator does in fact generate a call to a helper function that contains this big switch statement.

The problem with Enum types is that this cannot work. Enums can have a different GetUnderlyingType() type. In other words, the unboxed value can have different sizes, so simply copying the value out of the boxed object cannot work. Keenly aware of this, the jitter no longer inlines the unboxing code; it generates a call to a helper function in the CLR.

That helper is named JIT_Unbox(); you can find its source code in the SSCLI20 source, clr/src/vm/jithelpers.cpp. You'll see it dealing with enum types specially. It is permissive: it allows unboxing from one enum type to another, but only if the underlying type is the same. You get an InvalidCastException if that's not the case.

Which is also the reason that Enum is declared as a class. Its logical behavior is that of a reference type: derived enum types can be cast from one to another, with the above noted restriction on underlying type compatibility. The values of an enum type, however, have very much the behavior of a value type value. They have copy semantics and boxing behavior.
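The permissive unboxing described above can be probed with a small helper. This is a sketch under the assumption that the runtime behaves as the answer describes; `EnumUnboxDemo`, `CanUnbox`, `Fruit` and `Flag` are hypothetical names.

```csharp
using System;

public enum Animal { Dog, Cat }       // underlying type int
public enum Fruit { Apple, Pear }     // underlying type int
public enum Flag : byte { Off, On }   // underlying type byte

public static class EnumUnboxDemo
{
    // Returns true if the boxed value can be unboxed as T,
    // false if the runtime throws InvalidCastException.
    public static bool CanUnbox<T>(object boxed) where T : struct
    {
        try
        {
            T unused = (T)boxed;
            return true;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }
}
```

With `object boxed = Animal.Dog`, the calls `CanUnbox<int>(boxed)` and `CanUnbox<Fruit>(boxed)` should succeed because the underlying type matches, while `CanUnbox<Flag>(boxed)` and `CanUnbox<byte>(boxed)` should fail. Note that `boxed is int` is still false; only the unboxing cast is relaxed.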

Is there any connection string parser in C#?

30 votes

I have a connection string and I want to be able to peek out for example "Data Source". Is there a parser, or do I have to search the string?

Yes, there's the System.Data.Common.DbConnectionStringBuilder class.

The DbConnectionStringBuilder class provides the base class from which the strongly typed connection string builders (SqlConnectionStringBuilder, OleDbConnectionStringBuilder, and so on) derive. The connection string builders let developers programmatically create syntactically correct connection strings, and parse and rebuild existing connection strings.

The subclasses of interest are:

System.Data.EntityClient.EntityConnectionStringBuilder
System.Data.Odbc.OdbcConnectionStringBuilder
System.Data.OleDb.OleDbConnectionStringBuilder
System.Data.OracleClient.OracleConnectionStringBuilder
System.Data.SqlClient.SqlConnectionStringBuilder

For example, to "peek out the Data Source" from a SQL-server connection string, you can do:

var builder = new SqlConnectionStringBuilder(connectionString);
var dataSource = builder.DataSource;
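If you don't know the provider up front, the base class can be used directly. The sketch below assumes a generic keyword lookup is enough; `ConnectionStringPeek` and `GetValue` are hypothetical helper names, while `DbConnectionStringBuilder.ConnectionString` and `TryGetValue` are the real members.

```csharp
using System;
using System.Data.Common;

public static class ConnectionStringPeek
{
    // Returns the value stored for a keyword, or null if the keyword is absent.
    public static string GetValue(string connectionString, string keyword)
    {
        var builder = new DbConnectionStringBuilder();
        builder.ConnectionString = connectionString;   // parses the key/value pairs

        object value;
        return builder.TryGetValue(keyword, out value) ? Convert.ToString(value) : null;
    }
}
```

For example, `ConnectionStringPeek.GetValue("Data Source=myServer;Initial Catalog=myDb", "Data Source")` returns "myServer". The strongly typed builders are still the better choice when the provider is known, since they validate provider-specific keywords.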

How to combine 2 lists using LINQ?

25 votes

Env.: .NET4 C#

Hi All,

I want to combine these 2 lists : { "A", "B", "C", "D" } and { "1", "2", "3" }

into this one:

{ "A1", "A2", "A3", "B1", "B2", "B3", "C1", "C2", "C3", "D1", "D2", "D3" }

Obviously, I could use nested loops. But I wonder if LINQ can help. As far as I understand, Zip() is not my friend in this case, right?

TIA,

Essentially, you want to generate a cartesian product and then concatenate the elements of each 2-tuple. This is easiest to do in query-syntax:

var cartesianConcat = from a in seq1
                      from b in seq2
                      select a + b;
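The same query can be written in method syntax with SelectMany, which is roughly what the query expression translates to. `CartesianDemo` and `Combine` are hypothetical names used for this sketch.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CartesianDemo
{
    // Pairs every element of seq1 with every element of seq2
    // and concatenates each pair.
    public static List<string> Combine(IEnumerable<string> seq1, IEnumerable<string> seq2)
    {
        return seq1.SelectMany(a => seq2, (a, b) => a + b).ToList();
    }
}
```

`CartesianDemo.Combine(new[] { "A", "B" }, new[] { "1", "2" })` yields A1, A2, B1, B2, in that order.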

Bad implementation of Enumerable.Single?

24 votes

I came across this implementation in Enumerable.cs by reflector.

public static TSource Single<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
{
    //check parameters
    TSource local = default(TSource);
    long num = 0L;
    foreach (TSource local2 in source)
    {
        if (predicate(local2))
        {
            local = local2;
            num += 1L;
            //I think they should do something here like:
            //if (num >= 2L) throw Error.MoreThanOneMatch();
            //no necessary to continue
        }
    }
    //return different results by num's value
}

I think they should break out of the loop as soon as 2 items meet the condition; why do they always loop through the whole collection? In case Reflector disassembled the DLL incorrectly, I wrote a simple test:

class DataItem
{
   private int _num;
   public DataItem(int num)
   {
      _num = num;
   }

   public int Num
   {
      get{ Console.WriteLine("getting "+_num); return _num;}
   }
} 
var source = Enumerable.Range(1,10).Select( x => new DataItem(x));
var result = source.Single(x => x.Num < 5);

For this test case, I thought it would print "getting 1, getting 2" and then throw an exception. But in fact it keeps going, "getting 1 ... getting 10", and only then throws an exception. Is there any algorithmic reason they implemented this method like this?

EDIT: Some of you thought it's because of side effects of the predicate expression. After some deep thought and some test cases, I have concluded that side effects don't matter in this case. Please provide an example if you disagree with this conclusion.

Yes, I do find it slightly strange especially because the overload that doesn't take a predicate (i.e. works on just the sequence) does seem to have the quick-throw 'optimization'.


In the BCL's defence, however, I would say that the InvalidOperationException that Single throws is a boneheaded exception that shouldn't normally be used for control flow. It's not necessary for such cases to be optimized by the library.

Code that uses Single where zero or multiple matches are a perfectly valid possibility, such as:

try
{
     var item = source.Single(predicate);
     DoSomething(item);
}

catch(InvalidOperationException)
{
     DoSomethingElseUnexceptional();    
}

should be refactored to code that doesn't use the exception for control-flow, such as (only a sample; this can be implemented more efficiently):

var firstTwo = source.Where(predicate).Take(2).ToArray();

if(firstTwo.Length == 1) 
{
    // Note that this won't fail. If it does, this code has a bug.
    DoSomething(firstTwo.Single()); 
}
else
{
    DoSomethingElseUnexceptional();
}

In other words, we should leave the use of Single to cases when we expect the sequence to contain only one match. It should behave identically to First but with the additional run-time assertion that the sequence doesn't contain multiple matches. Like any other assertion, failure, i.e. cases when Single throws, should be used to represent bugs in the program (either in the method running the query or in the arguments passed to it by the caller).

This leaves us with two cases:

  1. The assertion holds: There is a single match. In this case, we want Single to consume the entire sequence anyway to assert our claim. There's no benefit to the 'optimization'. In fact, one could argue that the sample implementation of the 'optimization' provided by the OP will actually be slower because of the check on every iteration of the loop.
  2. The assertion fails: There are zero or multiple matches. In this case, we do throw later than we could, but this isn't such a big deal since the exception is boneheaded: it is indicative of a bug that must be fixed.

To sum up, if the 'poor implementation' is biting you performance-wise in production, either:

  1. You are using Single incorrectly.
  2. You have a bug in your program. Once the bug is fixed, this particular performance problem will go away.
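For reference, the early-exit variant the question proposes could look like the sketch below. `SingleOrThrowEarly` is a hypothetical extension method, not a BCL member, and the exception messages are made up; it only differs from the disassembled code by throwing on the second match.

```csharp
using System;
using System.Collections.Generic;

public static class EarlyExitExtensions
{
    // Hypothetical variant of Single(predicate) that throws as soon as
    // a second match is seen instead of scanning the rest of the sequence.
    public static TSource SingleOrThrowEarly<TSource>(
        this IEnumerable<TSource> source, Func<TSource, bool> predicate)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (predicate == null) throw new ArgumentNullException("predicate");

        bool found = false;
        TSource match = default(TSource);
        foreach (TSource item in source)
        {
            if (!predicate(item)) continue;
            if (found)
                throw new InvalidOperationException("Sequence contains more than one matching element");
            found = true;
            match = item;
        }
        if (!found)
            throw new InvalidOperationException("Sequence contains no matching element");
        return match;
    }
}
```

When the assertion holds, this still walks the whole sequence, which is the answer's point: the early exit only helps on the failure path.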

EDIT: Clarified my point.

EDIT: Here's a valid use of Single, where failure indicates bugs in the calling code (bad argument):

public static User GetUserById(this IEnumerable<User> users, string id)
{
     if(users == null)
        throw new ArgumentNullException("users");

     // Perfectly fine if documented that a failure in the query
     // is treated as an exceptional circumstance. Caller's job 
     // to guarantee pre-condition.        
     return users.Single(user => user.Id == id);    
}

Why does TimeSpan.FromSeconds(double) round to milliseconds?

22 votes

TimeSpan.FromSeconds takes a double, and can represent values down to 100 nanoseconds, however this method inexplicably rounds the time to whole milliseconds.

Given that I've just spent half an hour to pinpoint this (documented!) behaviour, knowing why this might be the case would make it easier to put up with the wasted time.

Can anyone suggest why this seemingly counter-productive behaviour is implemented?

TimeSpan.FromSeconds(0.12345678).TotalSeconds
    // 0.123
TimeSpan.FromTicks((long)(TimeSpan.TicksPerSecond * 0.12345678)).TotalSeconds
    // 0.1234567

As you've found out yourself, it's a documented feature. It's described in the documentation of TimeSpan:

Parameters

value Type: System.Double

A number of seconds, accurate to the nearest millisecond.

The reason for this is probably that a double is not perfectly accurate. It is always a good idea to do some rounding when comparing doubles, because a value might be a tiny bit larger or smaller than you'd expect. Without rounding, that imprecision could give you some unexpected nanoseconds when you pass in whole milliseconds. I think that is why they chose to round the value to whole milliseconds and discard the smaller digits.
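If you need tick precision, one option is to go through FromTicks yourself, as the question's second snippet does. The helper below is a sketch; `TimeSpanPrecise` and `FromSecondsPrecise` are hypothetical names, and it rounds to the nearest tick instead of truncating.

```csharp
using System;

public static class TimeSpanPrecise
{
    // Converts seconds to a TimeSpan at tick (100 ns) resolution.
    // Assumes the input is finite and within the range of TimeSpan;
    // NaN, infinities and overflow are not handled here.
    public static TimeSpan FromSecondsPrecise(double seconds)
    {
        return TimeSpan.FromTicks((long)Math.Round(seconds * TimeSpan.TicksPerSecond));
    }
}
```

For example, `TimeSpanPrecise.FromSecondsPrecise(0.12345678).Ticks` is 1234568, that is 0.1234568 seconds.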

How can I test for negative zero?

22 votes

Initially I thought Math.Sign would be the proper way to go but after running a test it seems that it treats -0.0 and +0.0 the same.

Here's a grotty hack way of doing it:

private static readonly long NegativeZeroBits =
    BitConverter.DoubleToInt64Bits(-0.0);

public static bool IsNegativeZero(double x)
{
    return BitConverter.DoubleToInt64Bits(x) == NegativeZeroBits;
}

Basically that's testing for the exact bit pattern of -0.0, but without having to hardcode it.
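Another way to tell the two zeros apart, shown here only as an alternative sketch, relies on IEEE 754 division: 1.0 / -0.0 is negative infinity, while 1.0 / +0.0 is positive infinity. `NegativeZero` and `IsNegativeZeroByDivision` are hypothetical names.

```csharp
public static class NegativeZero
{
    // True only for -0.0: the value must compare equal to zero,
    // and dividing by it must produce negative infinity.
    public static bool IsNegativeZeroByDivision(double x)
    {
        return x == 0.0 && double.IsNegativeInfinity(1.0 / x);
    }
}
```

The `x == 0.0` guard keeps tiny negative numbers from being reported as negative zero.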

Is the null coalesce operator thread safe ??

19 votes

So this is the meat of the question: Can Foo.Bar ever return null? To clarify, can '_bar' be set to null after it's evaluated as non-null and before its value is returned?

    public class Foo
    {
        Object _bar;
        public Object Bar
        {
            get { return _bar ?? new Object(); }
            set { _bar = value; }
        }
    }

I know using the following get method is not safe, and can return a null value:

            get { return _bar != null ? _bar : new Object(); }

UPDATE:

Another way to look at the same problem, this example might be more clear:

        public static T GetValue<T>(ref T value) where T : class, new()
        {
            return value ?? new T();
        }

And again asking can GetValue(...) ever return null? Depending on your definition this may or may not be thread-safe... I guess the right problem statement is asking if it is an atomic operation on value... David Yaw has defined the question best by saying is the above function the equivalent to the following:

        public static T GetValue<T>(ref T value) where T : class, new()
        {
            T result = value;
            if (result != null)
                return result;
            else
                return new T();
        }

No, this is not thread safe.

The IL for the above compiles to:

.method public hidebysig specialname instance object get_Bar() cil managed
{
    .maxstack 2
    .locals init (
        [0] object CS$1$0000)
    L_0000: nop 
    L_0001: ldarg.0 
    L_0002: ldfld object ConsoleApplication1.Program/MainClass::_bar
    L_0007: dup 
    L_0008: brtrue.s L_0010
    L_000a: pop 
    L_000b: newobj instance void [mscorlib]System.Object::.ctor()
    L_0010: stloc.0 
    L_0011: br.s L_0013
    L_0013: ldloc.0 
    L_0014: ret 
}

This effectively does a load of the _bar field, then checks its existence, and jumps to the end. There is no synchronization in place, and since this is multiple IL instructions, it's possible for a secondary thread to cause a race condition, causing the returned object to differ from the one set.

It's much better to handle lazy instantiation via Lazy<T>. That provides a thread-safe, lazy instantiation pattern. Granted, the above code is not doing lazy instantiation (rather returning a new object every time until some time when _bar is set), but I suspect that's a bug, and not the intended behavior.

In addition, Lazy<T> makes setting difficult.

To duplicate the above behavior in a thread-safe manner would require explicit synchronization.
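A minimal sketch of such explicit synchronization, assuming a plain lock is acceptable, might look like this. `SynchronizedFoo` and `_gate` are hypothetical names; the behavior is otherwise the same as the original Foo.

```csharp
public class SynchronizedFoo
{
    // Private lock object so outside code cannot interfere with the locking.
    private readonly object _gate = new object();
    private object _bar;

    public object Bar
    {
        // Reads and writes of _bar are serialized through the same lock.
        get { lock (_gate) { return _bar ?? new object(); } }
        set { lock (_gate) { _bar = value; } }
    }
}
```

This serializes the getter and setter, but a caller can still observe a value that another thread replaces a moment later; the lock only guarantees each individual access is consistent.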


As to your update:

The getter for the Bar property could never return null.

Looking at the IL above, it loads _bar (via ldfld), then does a check to see if that object is not null using brtrue.s. If the object is not null, it jumps, copies the value of _bar from the execution stack to a local via stloc.0, and returns - returning _bar with a real value.

If _bar was unset, then it will pop it off the execution stack, and create a new object, which then gets stored and returned.

Either case prevents a null value from being returned. However, again, I wouldn't consider this thread-safe in general, since it's possible that a call to set happening at the same time as a call to get can cause different objects to be returned, and it's a race condition as which object instance gets returned (the set value, or a new object).

What is a NullReferenceException in .NET?

19 votes

(I'm creating this separate question and answer because every question we get on NullReferenceException is really the same)


I have some code and when it executes, it throws a NullReferenceException, saying, "Object reference not set to an instance of an object.".

What does this mean, and what can I do about it?


Note again, that this is a question meant to focus answers to the canonical "what is a NullReferenceException and why did I get one" question. I do know what a NullReferenceException is, as my answer below demonstrates.

NullReferenceException always means the same thing. You are trying to use a reference to an object, but you haven't initialized it (or it used to be initialized, but is now uninitialized).


Examples

Simple

string s1 = null;
int len = s1.Length; // s1 is null. There is no string to get a length from.

Indirect

public class Person {
    public int Age { get; set; }
}
public class Book {
    public Person Author { get; set; }
}
Book b1 = new Book();
int authorAge = b1.Author.Age; // You never initialized the Author property.
                               //  there is no Person to get an Age from.

Array

int[] numbers = null;
int n = numbers[0]; // numbers is null.  There is no array to index.

Array Elements

Person[] people = new Person[5];
people[0].Age = 20; // people[0] is null.  The array was allocated but not initialized.
                   // There is no Person to set the Age for.

Collection/List/Dictionary

Dictionary<string, int> agesForNames = null;
int age = agesForNames["Bob"]; // agesForNames is null.
                               // There is no Dictionary to perform the lookup.

Range Variable (Indirect/Deferred)

public class Person {
    public string Name { get; set; }
}
var people = new List<Person>();
people.Add(null);
var names = from p in people select p.Name;
string firstName = names.First(); // Exception is thrown here, but actually occurs
                                  // on the line above.  "p" is null because the
                                  // first element we added to the list is null.

Bad Naming Conventions:

public class Form1 {
    private Customer customer;

    private void Form1_Load(object sender, EventArgs e) {
        Customer customer = new Customer();
        customer.Name = "John";
    }

    private void Button_Click(object sender, EventArgs e) {
        MessageBox.Show(customer.Name);
    }
}

If you named fields differently from locals, you might have realized that you never initialized the field. Suggestion:

private Customer _customer;

Ways to Avoid

Explicitly check for null, and ignore null values.

void PrintName(Person p) {
    if (p != null) {
        Console.WriteLine(p.Name);
    }
}

Explicitly check for null, and provide a default value.

string GetCategory(Book b) {
    if (b == null)
        return "Unknown";
    return b.Category;
}

Explicitly check for null, and throw a more meaningful exception.

string GetCategory(string bookTitle) {
    var book = library.FindBook(bookTitle);  // This may return null
    if (book == null)
        throw new BookNotFoundException(bookTitle);  // Your custom exception
    return book.Category;
}

Use Debug.Assert if a value should never be null, to catch the problem earlier.

string GetTitle(int knownBookID) {
    var book = library.GetBook(knownBookID);  // Should never return null
    // Exception will occur on the next line instead of 3 lines down
    Debug.Assert(book != null, "Library did not return a book for known book ID.");
    // Some other code ...
    return book.Title; // Will never throw NullReferenceException in Debug mode
}

Why isn't List<T> sealed?

19 votes

This question came to mind after reading the answer to this question; which basically made the point that List<T> has no virtual methods, since it was designed to be "fast, not extensible".

If that's the design goal, why didn't the original design include sealing the class? (I know that's not possible now, seeing how that would break a lot of child classes within client code.)

There's no compelling reason to seal it. It does no harm to derive from it. I used to be of the opposite mindset - only leave things unsealed that you intend for people to derive from. But in hindsight, it makes no sense. .NET takes the position that methods are non-virtual by default but classes are unsealed by default. List<T> just follows that same practice.

Where you would want to seal a class is when it does override virtual methods but further subclassing is not easy or obvious. It can be slightly useful to derive from a collection such as Dictionary<TKey,TValue> to stick in known type parameters and avoid typing them out if used in an application. For example maybe you would have a QueryString class that derives from Dictionary<String,String>.

And since there are no virtual methods, there's really nothing to protect the class against by sealing it.
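The QueryString idea mentioned above could be sketched as follows. This is a hypothetical class written for illustration, not an existing framework type; the ToString override is an extra convenience that the answer does not describe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical example: fixes the type parameters once so callers
// can write QueryString instead of Dictionary<string, string>.
public class QueryString : Dictionary<string, string>
{
    // Renders the pairs as "key=value" joined by "&", escaping both parts.
    // Null values are rendered as empty strings.
    public override string ToString()
    {
        var parts = new List<string>();
        foreach (KeyValuePair<string, string> pair in this)
        {
            parts.Add(Uri.EscapeDataString(pair.Key) + "=" +
                      Uri.EscapeDataString(pair.Value ?? string.Empty));
        }
        return string.Join("&", parts.ToArray());
    }
}
```

A caller can then write `var query = new QueryString(); query["page"] = "2";` and still pass the object anywhere a `Dictionary<string, string>` is expected.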

C# Nullable Equality Operations, Why does null <= null resolve as false?

17 votes


Why is it that in .NET

null >= null

resolves as false, but

null == null 

resolves as true?

In other words, why isn't null >= null equivalent to null > null || null == null?

Does anyone have the official answer?

This behaviour is defined in the C# specification (ECMA-334) in section 14.2.7:

For the relational operators

< > <= >=

a lifted form of an operator exists if the operand types are both non-nullable value types and if the result type is bool. The lifted form is constructed by adding a single ? modifier to each operand type. The lifted operator produces the value false if one or both operands are null. Otherwise, the lifted operator unwraps the operands and applies the underlying operator to produce the bool result.

In particular, this means that the usual laws of relations don't hold; !(x < y) does not imply x >= y.
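The effect of the lifted operators is easy to check directly. In the sketch below, `LiftedComparisonDemo` and its method names are hypothetical; the operators themselves are the built-in lifted forms on int?.

```csharp
public static class LiftedComparisonDemo
{
    // Lifted relational operators: false whenever either operand is null.
    public static bool GreaterOrEqual(int? x, int? y) { return x >= y; }
    public static bool LessThan(int? x, int? y) { return x < y; }

    // Lifted equality: two nulls compare equal.
    public static bool AreEqual(int? x, int? y) { return x == y; }
}
```

With both arguments null, `AreEqual` returns true while `GreaterOrEqual` and `LessThan` both return false, so neither "less than" nor "greater than or equal" holds for the pair.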

Gory details

Some people have asked why the compiler decides that this is a lifted operator for int? in the first place. Let's have a look. :)

We start with 14.2.4, 'Binary operator overload resolution'. This details the steps to follow.

  1. First, the user-defined operators are examined for suitability. This is done by examining the operators defined by the types on each side of >=... which raises the question of what the type of null is! The null literal actually doesn't have any type until given one, it's simply the "null literal". By following the directions under 14.2.5 we discover there are no operators suitable here, since the null literal doesn't define any operators.

  2. This step instructs us to examine the set of predefined operators for suitability. (Enums are also excluded by this section, since neither side is an enum type.) The relevant predefined operators are listed in sections 14.9.1 to 14.9.3, and they are all operators upon primitive numeric types, along with the lifted versions of these operators (note that string operators are not included here).

  3. Finally, we must perform overload resolution using these operators and the rules in 14.4.2.

Actually performing this resolution would be extremely tedious, but luckily there is a shortcut. Under 14.2.6 there is an informative example given of the results of overload resolution, which states:

...consider the predefined implementations of the binary * operator:

int operator *(int x, int y);
uint operator *(uint x, uint y);
long operator *(long x, long y);
ulong operator *(ulong x, ulong y);
void operator *(long x, ulong y);
void operator *(ulong x, long y);
float operator *(float x, float y);
double operator *(double x, double y);
decimal operator *(decimal x, decimal y);

When overload resolution rules (§14.4.2) are applied to this set of operators, the effect is to select the first of the operators for which implicit conversions exist from the operand types.

Since both sides are null we can immediately throw out all unlifted operators. This leaves us with the lifted numeric operators on all primitive numeric types.

Then, using the previous information, we select the first of the operators for which an implicit conversion exists. Since the null literal is implicitly convertible to a nullable type, and a nullable type exists for int, we select the first operator from the list, which is int? >= int?.

List<T> readonly with a private set

17 votes

How can I expose a List<T> so that it is readonly but can be set privately?

This doesn't work:

public List<string> myList {readonly get; private set; }

Even if you do:

public List<string> myList {get; private set; }

You can still do this:

myList.Add("TEST"); //This should not be allowed

I guess you could have:

public List<string> myList {get{ return otherList;}}
private List<string> otherList {get;set;}

I think you are mixing concepts.

public List<string> myList {get; private set;}

is already "read-only". That is, outside this class, nothing can set myList to a different instance of List<string>

However, if you want a readonly list as in "I don't want people to be able to modify the list contents", then you need to expose a ReadOnlyCollection<string>. You can do this via:

private List<string> actualList = new List<string>();
public ReadOnlyCollection<string> myList
{
  get{ return actualList.AsReadOnly();}
}

Note that in the first code snippet, others can manipulate the List, but can not change what list you have. In the second snippet, others will get a read-only list that they can not modify.
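To make the difference concrete, here is a small sketch of the second approach in use. The Inventory class and its AddItem method are hypothetical names invented for the example; only AsReadOnly and ReadOnlyCollection<T> come from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical owner class: only it can change the underlying list.
public class Inventory
{
    private readonly List<string> actualList = new List<string>();

    public ReadOnlyCollection<string> MyList
    {
        get { return actualList.AsReadOnly(); }
    }

    public void AddItem(string item)
    {
        actualList.Add(item);
    }
}

public static class ReadOnlyDemo
{
    public static void Main()
    {
        var inventory = new Inventory();
        inventory.AddItem("TEST");

        ReadOnlyCollection<string> view = inventory.MyList;
        // view.Add("x");  // does not compile: there is no public Add on ReadOnlyCollection<T>

        try
        {
            // Going through the IList<T> interface compiles, but is rejected at run time.
            ((IList<string>)view).Add("x");
        }
        catch (NotSupportedException)
        {
            Console.WriteLine("Add rejected");
        }

        // The wrapper is a live view over the private list, not a copy.
        inventory.AddItem("SECOND");
        Console.WriteLine(view.Count); // 2
    }
}
```

Note that callers still see later changes made by the owning class, because AsReadOnly wraps the list rather than copying it.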

When is a method eligible to be inlined by the CLR?

16 votes

I've observed a lot of "stack-introspective" code in applications, which often implicitly rely on their containing methods not being inlined for their correctness. Such methods commonly involve calls to:

  • MethodBase.GetCurrentMethod
  • Assembly.GetCallingAssembly
  • Assembly.GetExecutingAssembly

Now, I find the information surrounding these methods to be very confusing. I've heard that the run-time will not inline a method that calls GetCurrentMethod, but I can't find any documentation to that effect. I've seen posts on StackOverflow on several occasions, such as this one, indicating the CLR does not inline cross-assembly calls, but the GetCallingAssembly documentation strongly indicates otherwise.

There's also the much-maligned [MethodImpl(MethodImplOptions.NoInlining)], but I am unsure if the CLR considers this to be a "request" or a "command."

Note that I am asking about inlining eligibility from the standpoint of contract, not about when current implementations of the JITter decline to consider methods because of implementation difficulties, or about when the JITter finally ends up choosing to inline an eligible method after assessing the trade-offs. I have read this and this, but they seem to be more focused on the last two points (there are passing mentions of MethodImplOptions.NoInlining and "exotic IL instructions", but these seem to be presented as heuristics rather than as obligations).

When is the CLR allowed to inline?

It is a jitter implementation detail; the x86 and x64 jitters have subtly different rules. This is casually documented in blog posts by team members who worked on the jitter, but the teams certainly reserve the right to alter the rules. Looks like you already found them.

Inlining methods from other assemblies is most certainly supported, a lot of the .NET classes would work quite miserably if that wasn't the case. You can see it at work when you look at the machine code generated for Console.WriteLine(), it often gets inlined when you pass a simple string. To see this for yourself, you need to switch to the Release build and change a debugger option. Tools + Options, Debugging, General, untick "Suppress JIT optimization on module load".

There is otherwise no good reason to consider MethodImplOptions.NoInlining maligned, it's pretty much why it exists in the first place. It is in fact used intentionally in the .NET framework on lots of small public methods that call an internal helper method. It makes exception stack traces easier to diagnose.
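As a minimal sketch of how the attribute is typically applied to "stack-introspective" code (the InliningDemo class and WhoAmI method are hypothetical names, and whether the attribute is strictly needed for GetCurrentMethod is not something this example settles):

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

public static class InliningDemo
{
    // Ask the JIT not to inline this method, so that it keeps its own
    // stack frame when it inspects itself.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static string WhoAmI()
    {
        return MethodBase.GetCurrentMethod().Name;
    }

    public static void Main()
    {
        Console.WriteLine(WhoAmI()); // WhoAmI
    }
}
```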

In C#, Is Expression API better than Reflection

16 votes

Nowadays, I'm exploring C# Expression APIs. So I could use some help understanding how it works, including the difference between Expression and Reflection. I also want to understand if Expressions are merely syntactic sugar, or are they indeed better than Reflection performance-wise?

Good examples as well as links to good articles would be appreciated. :-)

Regarding calling one method :

  • A direct call can't be beaten speed-wise.
  • Using the Expression API is broadly similar, speed-wise, to using Reflection.Emit or Delegate.CreateDelegate (small differences could be measured; as always, optimizing for speed without measurements and goals is useless).

    They all generate IL, and the framework will compile it to native code at some point. But you still pay the cost of one level of indirection for calling the delegate and one method call inside your delegate.

    The Expression API is more limited but an order of magnitude simpler to use, as it doesn't require you to learn IL.

  • The Dynamic Language Runtime, used either directly or via the dynamic keyword of C# 4, adds a little overhead but stays close to emitted code, as it caches most checks related to parameter types, access and the rest.

    When used via the dynamic keyword it also gets the neatest syntax, as it looks like a normal method call. But if you use dynamic you are limited to method calls, while the library is able to do a lot more (see IronPython).

  • System.Reflection.MethodInfo.Invoke is slow: in addition to what the other methods do, it needs to check access rights, argument count, argument types, ... against the MethodInfo each time you call the method.

Jon Skeet also makes some good points in this answer: Delegate.CreateDelegate vs DynamicMethod vs Expression


Some samples, the same thing done different ways.

You can already see from the line count and complexity which solutions are easy to maintain and which should be avoided from a long-term maintenance standpoint.

Most of the samples are pointless, but they demonstrate the basic code generation classes and syntaxes of C#; for more info there is always MSDN.

PS: Dump is a LINQPad method. The samples also need the System.Linq.Expressions, System.Reflection and System.Reflection.Emit namespaces imported.

public class Foo
{
    public string Bar(int value) { return value.ToString(); }
}

void Main()
{
    object foo = new Foo();

    // We have an instance of something and want to call a method with this signature on it :
    // public string Bar(int value);

    Console.WriteLine("Cast and Direct method call");
    {
        var result = ((Foo)foo).Bar(42);
        result.Dump();
    }
    Console.WriteLine("Create a lambda closing on the local scope.");
    {
        // Pointless here, but the last sample rebuilds this closure by manual IL generation.

        Func<int, string> func = i => ((Foo)foo).Bar(i);
        var result = func(42);
        result.Dump();
    }
    Console.WriteLine("Using MethodInfo.Invoke");
    {
        var method = foo.GetType().GetMethod("Bar");
        var result = (string)method.Invoke(foo, new object[] { 42 });
        result.Dump();
    }
    Console.WriteLine("Using the dynamic keyword");
    {
        var dynamicFoo = (dynamic)foo;
        var result = (string)dynamicFoo.Bar(42);
        result.Dump();
    }
    Console.WriteLine("Using CreateDelegate");
    {
        var method = foo.GetType().GetMethod("Bar");
        var func = (Func<int, string>)Delegate.CreateDelegate(typeof(Func<int, string>), foo, method);
        var result = func(42);
        result.Dump();
    }
    Console.WriteLine("Create an expression and compile it to call the delegate on one instance.");
    {
        var method = foo.GetType().GetMethod("Bar");
        var thisParam = Expression.Constant(foo);
        var valueParam = Expression.Parameter(typeof(int), "value");
        var call = Expression.Call(thisParam, method, valueParam);
        var lambda = Expression.Lambda<Func<int, string>>(call, valueParam);
        var func = lambda.Compile();
        var result = func(42);
        result.Dump();
    }
    Console.WriteLine("Create an expression and compile it to a delegate that could be called on any instance.");
    {
        // Note that in this case "Foo" must be known at compile time, obviously in this case you want
        // to do more than call a method, otherwise just call it !
        var type = foo.GetType();
        var method = type.GetMethod("Bar");
        var thisParam = Expression.Parameter(type, "this");
        var valueParam = Expression.Parameter(typeof(int), "value");
        var call = Expression.Call(thisParam, method, valueParam);
        var lambda = Expression.Lambda<Func<Foo, int, string>>(call, thisParam, valueParam);
        var func = lambda.Compile();
        var result = func((Foo)foo, 42);
        result.Dump();
    }
    Console.WriteLine("Create a DynamicMethod and compile it to a delegate that could be called on any instance.");
    {
        // Same thing as the previous expression sample. Foo needs to be known at compile time and
        // needs to be provided to the delegate.

        var type = foo.GetType();
        var method = type.GetMethod("Bar");

        var dynamicMethod = new DynamicMethod("Bar_", typeof(string), new [] { typeof(Foo), typeof(int) }, true);
        var il = dynamicMethod.GetILGenerator();
        il.DeclareLocal(typeof(string));
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(OpCodes.Call, method);
        il.Emit(OpCodes.Ret);
        var func = (Func<Foo, int, string>)dynamicMethod.CreateDelegate(typeof(Func<Foo, int, string>));
        var result = func((Foo)foo, 42);
        result.Dump();
    }
    Console.WriteLine("Simulate closure without closures and in a lot more lines...");
    {
        var type = foo.GetType();
        var method = type.GetMethod("Bar");

        // The Foo class must be public for this to work; the "skipVisibility" argument of the
        // DynamicMethod constructor can't be emulated without breaking the .NET security model.

        var assembly = AppDomain.CurrentDomain.DefineDynamicAssembly(
            new AssemblyName("MyAssembly"), AssemblyBuilderAccess.Run);
        var module = assembly.DefineDynamicModule("MyModule");
        var tb = module.DefineType("MyType", TypeAttributes.Class | TypeAttributes.Public);

        var fooField = tb.DefineField("FooInstance", type, FieldAttributes.Public);
        var barMethod = tb.DefineMethod("Bar_", MethodAttributes.Public, typeof(string), new [] { typeof(int) });
        var il = barMethod.GetILGenerator();
        il.DeclareLocal(typeof(string));
        il.Emit(OpCodes.Ldarg_0); // this
        il.Emit(OpCodes.Ldfld, fooField);
        il.Emit(OpCodes.Ldarg_1); // arg
        il.Emit(OpCodes.Call, method);
        il.Emit(OpCodes.Ret);

        var closureType = tb.CreateType();

        var instance = closureType.GetConstructors().Single().Invoke(new object[0]);

        closureType.GetField(fooField.Name).SetValue(instance, foo);

        var methodOnClosureType = closureType.GetMethod("Bar_");

        var func = (Func<int, string>)Delegate.CreateDelegate(typeof(Func<int, string>), instance,
            methodOnClosureType);
        var result = func(42);
        result.Dump();
    }
}

Why does capturing a mutable struct variable inside a closure within a using statement change its local behavior?

15 votes

Update: Well, now I've gone and done it: I filed a bug report with Microsoft about this, as I seriously doubt that it is correct behavior. That said, I'm still not 100% sure what to believe regarding this question; so I can see that what is "correct" is open to some level of interpretation.

My feeling is that either Microsoft will accept that this is a bug, or else respond that the modification of a mutable value type variable within a using statement constitutes undefined behavior.

Also, for what it's worth, I have at least a guess as to what is happening here. I suspect that the compiler is generating a class for the closure, "lifting" the local variable to an instance field of that class; and since it is within a using block, it's making the field readonly. As LukeH pointed out in a comment to the other question, this would prevent method calls such as MoveNext from modifying the field itself (they would instead affect a copy).


Note: I have shortened this question for readability, though it is still not exactly short. For the original (longer) question in its entirety, see the edit history.

I have read through what I believe are the relevant sections of the ECMA-334 and cannot seem to find a conclusive answer to this question. I will state the question first, then provide a link to some additional comments for those who are interested.

Question

If I have a mutable value type that implements IDisposable, I can (1) call a method that modifies the state of the local variable's value within a using statement and the code behaves as I expect. Once I capture the variable in question inside a closure within the using statement, however, (2) modifications to the value are no longer visible in the local scope.

This behavior is only apparent in the case where the variable is captured inside the closure and within a using statement; it is not apparent when only one (using) or the other condition (closure) is present.

Why does capturing a variable of a mutable value type inside a closure within a using statement change its local behavior?

Below are code examples illustrating items 1 and 2. Both examples will utilize the following demonstration Mutable value type:

struct Mutable : IDisposable
{
    int _value;
    public int Increment()
    {
        return _value++;
    }

    public void Dispose() { }
}

1. Mutating a value type variable within a using block

using (var x = new Mutable())
{
    Console.WriteLine(x.Increment());
    Console.WriteLine(x.Increment());
}

The above code outputs:

0
1

2. Capturing a value type variable inside a closure within a using block

using (var x = new Mutable())
{
    // x is captured inside a closure.
    Func<int> closure = () => x.Increment();

    // Now the Increment method does not appear to affect the value
    // of local variable x.
    Console.WriteLine(x.Increment());
    Console.WriteLine(x.Increment());
}

The above code outputs:

0
0

Further Comments

It has been noted that the Mono compiler provides the behavior I expect (changes to the value of the local variable are still visible in the using + closure case). Whether this behavior is correct or not is unclear to me.

For some more of my thoughts on this issue, see here.

It's a known bug; we discovered it a couple years ago. The fix would be potentially breaking, and the problem is pretty obscure; these are points against fixing it. Therefore it has never been prioritized high enough to actually fix it.

This has been in my queue of potential blog topics for a couple years now; perhaps I ought to write it up.

And incidentally, your conjecture as to the mechanism that explains the bug is completely accurate; nice psychic debugging there.

So, yes, known bug, but thanks for the report regardless!
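The mechanism conjectured in the question (the captured variable becoming a readonly field, so that method calls act on a copy) can be reproduced by hand without any closure or using statement. This is only a sketch of that general C# rule, not of the compiler's actual generated code; Counter and Holder are hypothetical names.

```csharp
using System;

// Hypothetical stand-in for the Mutable struct from the question.
public struct Counter
{
    private int _value;

    public int Increment()
    {
        return _value++;
    }
}

public class Holder
{
    public readonly Counter ReadOnlyField = new Counter();
    public Counter WritableField = new Counter();
}

public static class ReadonlyCopyDemo
{
    public static void Main()
    {
        var holder = new Holder();

        // Outside a constructor a readonly struct field is treated as a value,
        // so each call mutates a temporary copy and the field never changes.
        Console.WriteLine(holder.ReadOnlyField.Increment()); // 0
        Console.WriteLine(holder.ReadOnlyField.Increment()); // 0

        // A writable field is a variable, so the mutation sticks.
        Console.WriteLine(holder.WritableField.Increment()); // 0
        Console.WriteLine(holder.WritableField.Increment()); // 1
    }
}
```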

Is the heap actually a heap?

14 votes

Possible Duplicates:
Why are two different concepts both called “heap”?
What's the relationship between “a” heap and “the” heap?

In .NET (and Java as far as I know), the area where objects are dynamically allocated is referred to as the managed heap. However, most documentation that describes how the managed heap works depicts it as a linear data structure, such as a linked list or stack.

So, is the managed heap actually a heap, or is it implemented with some other data structure? If it actually does not use a heap data structure, it seems like a significant failure of terminology to overload the meaning of this word.

If it is in fact a heap data structure, what is the value that satisfies the heap property: the size of the allocated memory region?

No, the heap is not a heap-ordered binomial tree at all. It's not clear (to me) whose fault the terminology clash is, but both uses of heap date back decades now (mid-1970s, it appears). Some of the history is discussed in this article.

What's the best way to ensure a base class's static constructor is called?

14 votes

The documentation on static constructors in C# says:

A static constructor is used to initialize any static data, or to perform a particular action that needs to be performed once only. It is called automatically before the first instance is created or any static members are referenced.

That last part (about when it is automatically called) threw me for a loop; until reading that part I thought that by simply accessing a class in any way, I could be sure that its base class's static constructor had been called. Testing and examining the documentation have revealed that this is not the case; it seems that the static constructor for a base class is not guaranteed to run until a member of that base class specifically is accessed.

Now, I guess in most cases when you're dealing with a derived class, you would construct an instance and this would constitute an instance of the base class being created, thus the static constructor would be called. But if I'm only dealing with static members of the derived class, what then?

To make this a bit more concrete, I thought that the code below would work:

abstract class TypeBase
{
    static TypeBase()
    {
        Type<int>.Name = "int";
        Type<long>.Name = "long";
        Type<double>.Name = "double";
    }
}

class Type<T> : TypeBase
{
    public static string Name { get; internal set; }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(Type<int>.Name);
    }
}

I assumed that accessing the Type<T> class would automatically invoke the static constructor for TypeBase; but this appears not to be the case. Type<int>.Name is null, and the code above outputs the empty string.

Aside from creating some dummy member (like a static Initialize() method that does nothing), is there a better way to ensure that a base type's static constructor will be called before any of its derived types is used?

If not, then... dummy member it is!

The rules here are very complex, and between CLR 2.0 and CLR 4.0 they actually changed in subtle and interesting ways, that IMO make most "clever" approaches brittle between CLR versions. An Initialize() method also might not do the job in CLR 4.0 if it doesn't touch the fields.

I would look for an alternative design, or perhaps use regular lazy initialization in your type (i.e. check a bit or a reference (against null) to see if it has been done).
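One way the "regular lazy initialization" suggestion might look is sketched below. It swaps the generic Type<T> design for a lookup table that is built on first use, so nothing depends on when a static constructor happens to run. The TypeNames class and its members are hypothetical names, not part of the question's code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical replacement for the TypeBase / Type<T> pair.
public static class TypeNames
{
    private static readonly object Gate = new object();
    private static Dictionary<Type, string> names; // stays null until first use

    public static string Get<T>()
    {
        string name;
        return GetTable().TryGetValue(typeof(T), out name) ? name : null;
    }

    private static Dictionary<Type, string> GetTable()
    {
        lock (Gate)
        {
            // The null check is the "has it been done yet?" test from the answer.
            if (names == null)
            {
                var table = new Dictionary<Type, string>();
                table[typeof(int)] = "int";
                table[typeof(long)] = "long";
                table[typeof(double)] = "double";
                names = table;
            }
            return names;
        }
    }
}

public static class TypeNamesDemo
{
    public static void Main()
    {
        Console.WriteLine(TypeNames.Get<int>());    // int
        Console.WriteLine(TypeNames.Get<double>()); // double
    }
}
```

The lock is taken on every call to keep the sketch obviously correct; a hot path might prefer a cheaper scheme once measurements justify it.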

Why would typeof(Foo) ever return null?

14 votes

Occasionally, I see that typeof(Foo) returns null. Why would this happen?

This is in C#, .NET 3.5.

I thought it might have something to do with the assembly containing the type not yet being loaded, but a test app shows that the assembly is loaded at the start of the method where typeof is used.

Any ideas?


Update 1

  • I can't provide a reproducible sample as this happens on a huge application
  • When I say 'occasionally' I mean in the same method in my application but during various instances. Also, when it fails once when running, it'll fail every time for that instance of the application.

Update 2

The application in question uses a huuuuuge amount of memory and runs on 32bit XP. I'm thinking maybe it's a TypeLoadException or OutOfMemoryException that's somehow being swallowed (but I can't see how, as I've tried this with first-chance exceptions turned on in the debugger).


Update 3

Ran into the same issue just now. Here's the stack trace: [image: stack trace screenshot]

The code up to this point is literally just:

Type tradeType = typeof(MyTradeType);
TradeFactory.CreateTrade(tradeType);

(before, it was ..CreateTrade(typeof(MyTradeType)) so I couldn't actually tell if the typeof returned null)

So, it looks like typeof() isn't returning null but it's getting set to null by the time it ends up in the CreateTrade method.

The exception (NullReferenceException) has a HResult property of 0x80004003 (Invalid pointer). A call to System.Runtime.InteropServices.Marshal.GetLastWin32Error( ) (in the Immediate Window) returns 127 (The specified procedure could not be found). I've looked in the Modules window and the module that contains this type and method has been loaded and there doesn't look to be any loader errors.


I haven't been able to reproduce this bug but it looks like typeof(Foo) would never return null.

I'd still be interested to hear otherwise though.

Is Mono's VB.Net support ready for a production site?

9 votes

Previously, I've only used Microsoft-centric solutions, but for an upcoming ASP.Net project I'm considering using Mono and hosting it on a Linux Amazon EC2 instance. Based on the responses to my previous question, this sounds doable. However, I'm most comfortable with VB.Net and I'm wondering how well Mono supports it.

Does anyone have first-hand experience writing ASP.Net applications for Mono using VB.Net? If so, I'd like to know how it went, what kind of compatibility issues you ran into, and if you consider Mono's VB.Net support ready for use on a production site?

I know Mono's C#.Net support is very good, so that's my fall-back plan, but I'd really prefer to use VB.Net.

The VB compiler hasn't been abandoned, it's just a lack of time that is preventing the required work to update to newer VB versions.

Currently vbnc has support for VB 8 (aka Visual Studio 2005), with a few minor features from newer VB versions.

The easiest and safest would be to precompile your site on Windows, in which case you won't have to deal with any potential compiler issues (and you can use the most recent Visual Studio version). If you take this route you shouldn't run into any bugs you wouldn't hit using C# [1]

[1]: You'd be referencing one assembly more: Microsoft.VisualBasic.dll, which could be a source of bugs - but if you adhere to what is considered good programming practice for VB (turn on Option Strict) the chances that you'll hit any significant new bugs are pretty low.