Best .net questions in June 2011

Why doesn't C# support the return of references?

60 votes

I have read that .NET supports return of references, but C# doesn't. Is there a special reason?

UPDATE: This question was the subject of my blog on June 23rd 2011. There is a lot of good additional conversation on this topic there in the comments. Thanks for the great question!


You are correct; .NET does support methods that return managed references to variables. .NET also supports local variables that contain managed references to other variables. (Note however that .NET does not support fields or arrays that contain managed references to other variables because that overly complicates the garbage collection story. Also the "managed reference to variable" types are not convertible to object, and therefore may not be used as type arguments to generic types or methods.)

Commenter "RPM1984" for some reason asked for a citation for this fact. RPM1984, I encourage you to read the CLI specification Partition I Section 8.2.1.1, "Managed pointers and related types" for information about this feature of .NET.

It is entirely possible to create a version of C# which supports both these features. You could then do things like

static ref int Max(ref int x, ref int y) 
{ 
  if (x > y) 
    return ref x; 
  else 
    return ref y; 
} 

and then call it with

int a = 123;
int b = 456; 
ref int c = ref Max(ref a, ref b); 
c += 100;
Console.WriteLine(b); // 556!
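
For readers who want to experiment, the two snippets above can be combined into one self-contained program. This is only a sketch: the class name RefReturnDemo and the helper Run are made up for illustration, and it assumes a C# 7.0 or later compiler, which accepts this syntax but postdates this answer.

```csharp
using System;

static class RefReturnDemo
{
    // Returns a managed reference to whichever argument holds the larger value.
    static ref int Max(ref int x, ref int y)
    {
        if (x > y)
            return ref x;
        else
            return ref y;
    }

    // Hypothetical helper: runs the example and returns the final value of b.
    public static int Run()
    {
        int a = 123;
        int b = 456;
        ref int c = ref Max(ref a, ref b); // c is an alias for b, the larger one
        c += 100;                          // writes through the alias into b
        return b;
    }

    static void Main()
    {
        Console.WriteLine(Run()); // 556
    }
}
```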

I know empirically that it is possible to build a version of C# that supports these features because I have done so. Advanced programmers, particularly people porting unmanaged C++ code, often ask us for more C++-like ability to do things with references without having to get out the big hammer of actually using pointers and pinning memory all over the place. By using managed references you get these benefits without paying the cost of screwing up your garbage collection performance.

We have considered this feature, and actually implemented enough of it to show to other internal teams to get their feedback. However at this time based on our research we believe that the feature does not have broad enough appeal or compelling usage cases to make it into a real supported language feature. We have other higher priorities and a limited amount of time and effort available, so we're not going to do this feature any time soon.

Also, doing it properly would require some changes to the CLR. Right now the CLR treats ref-returning methods as legal but unverifiable because we do not have a detector that detects this situation:

ref int M1(ref int x)
{
    return ref x;
}

ref int M2()
{
    int y = 123;
    return ref M1(ref y); // Trouble!
}

int M3()
{
    ref int z = ref M2();
    return z;
}

M3 returns the contents of M2's local variable, but the lifetime of that variable has ended! It is possible to write a detector that determines uses of ref-returns that clearly do not violate stack safety. What we would do is write such a detector, and if the detector could not prove stack safety, then we would not allow the usage of ref returns in that part of the program. It is not a huge amount of dev work to do so, but it is a lot of burden on the testing teams to make sure that we've really got all the cases. It's just another thing that increases the cost of the feature to the point where right now the benefits do not outweigh the costs.

If you can describe for me why it is you want this feature, I would really appreciate that. The more information we have from real customers about why they want it, the more likely it will make it into the product someday. It's a cute little feature and I'd like to be able to get it to customers somehow if there is sufficient interest.

(See also related questions Is it Possible to Return a Reference to a Variable in C#? and Can I use a reference inside a C# function like C++?)

Evil code confusion, how does it even compile?

47 votes

I stumbled upon this code:

static void Main()
{
    typeof(string).GetField("Empty").SetValue(null, "evil");//from DailyWTF

    Console.WriteLine(String.Empty);//check

    //how does it behave?
    if ("evil" == String.Empty) Console.WriteLine("equal"); 

    //output: 
    //evil 
    //equal

 }

and I wonder how is it even possible to compile this piece of code. My reasoning is:

  1. According to MSDN String.Empty is read-only therefore changing it should be impossible and compiling should end with "A static readonly field cannot be assigned to" or similar error.

  2. I thought Base Class Library assemblies are somehow protected and signed and whatnot to prevent exactly this kind of attack. Next time someone may change System.Security.Cryptography or another critical class.

  3. I thought Base Class Library assemblies are compiled by NGEN after .NET installation therefore changing fields of String class should require advanced hacking and be much harder.

And yet this code compiles and works. Can somebody please explain what is wrong with my reasoning?

A static readonly field cannot be assigned to

You're not assigning to it. You're calling public functions in the System.Reflection namespace. No reason for the compiler to complain about that.

Besides, typeof(string).GetField("Empty") could use variables entered by the user instead; there's no sure way for the compiler to tell in all cases whether the argument to GetField will end up being "Empty".

I think you're wanting Reflection to see that the field is marked initonly and throw an error at runtime. I can see why you would expect that, yet for white-box testing, even writing to initonly fields has some application.

The reason NGEN has no effect is that you're not modifying any code here, only data. Data is stored in memory with .NET just as with any other language. Native programs may use readonly memory sections for things like string constants, but the pointer to the string is generally still writable and that is what is happening here.

Note that your code must be running with full trust to use reflection in this questionable way. Also, the change only affects one program; this isn't any sort of security vulnerability as you seem to think (if you're running malicious code inside your process with full trust, that design decision is the security problem, not reflection).
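
The difference between assigning to a field and calling reflection methods can be seen in a small sketch. The Settings class and its _limit field are hypothetical, invented for this illustration. The sketch uses an instance field rather than a static one like String.Empty because, as far as I know, some newer runtimes reject reflection writes to static readonly fields once the type has been initialized.

```csharp
using System;
using System.Reflection;

// Hypothetical type used only for this illustration.
sealed class Settings
{
    private readonly int _limit = 10;
    public int Limit { get { return _limit; } }
}

static class ReflectionWriteDemo
{
    public static int Run()
    {
        var settings = new Settings();

        // settings._limit = 42; // would not compile: the field is private and readonly

        // To the compiler these are ordinary method calls with a string argument,
        // so there is nothing for it to reject.
        FieldInfo field = typeof(Settings).GetField(
            "_limit", BindingFlags.Instance | BindingFlags.NonPublic);
        field.SetValue(settings, 42);

        return settings.Limit;
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```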

Why is a Dictionary "not ordered"?

28 votes

I have read this in answer to many questions on here. But what exactly does it mean?

var test = new Dictionary<int, string>();
test.Add(0, "zero");
test.Add(1, "one");
test.Add(2, "two");
test.Add(3, "three");

Assert(test.ElementAt(2).Value == "two");

The above code seems to work as expected. So in what manner is a dictionary considered unordered? Under what circumstances could the above code fail?

Well, for one thing it's not clear whether you expect this to be insertion-order or key-order. For example, what would you expect the result to be if you wrote:

var test = new Dictionary<int, string>();
test.Add(3, "three");
test.Add(2, "two");
test.Add(1, "one");
test.Add(0, "zero");

Console.WriteLine(test.ElementAt(0).Value);

Would you expect "three" or "zero"?

As it happens, I think the current implementation preserves insertion ordering so long as you never delete anything - but you must not rely on this. It's an implementation detail, and that could change in the future.

Deletions also affect this. For example, what would you expect the result of this program to be?

using System;
using System.Collections.Generic;

class Test
{ 
    static void Main() 
    {
        var test = new Dictionary<int, string>();
        test.Add(3, "three");
        test.Add(2, "two");
        test.Add(1, "one");
        test.Add(0, "zero");

        test.Remove(2);
        test.Add(5, "five");

        foreach (var pair in test)
        {
            Console.WriteLine(pair.Key);
        }
    }     
}

It's actually (on my box) 3, 5, 1, 0. The new entry for 5 has used the vacated entry previously used by 2. That's not going to be guaranteed either though.

Rehashing (when the dictionary's underlying storage needs to be expanded) could affect things... all kinds of things do.

Just don't treat it as an ordered collection. It's not designed for that. Even if it happens to work now, you're relying on undocumented behaviour which goes against the purpose of the class.
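
If an order is actually required, it is safer to pick a collection that documents one. The sketch below repeats the add/remove sequence from the program above with two alternatives: SortedDictionary for key order, and a List of pairs for insertion order. The class and method names are made up for the example.

```csharp
using System;
using System.Collections.Generic;

static class OrderedAlternatives
{
    // Key order: SortedDictionary always enumerates in ascending key order.
    public static List<int> KeysInKeyOrder()
    {
        var sorted = new SortedDictionary<int, string>();
        sorted.Add(3, "three");
        sorted.Add(2, "two");
        sorted.Add(1, "one");
        sorted.Add(0, "zero");
        sorted.Remove(2);
        sorted.Add(5, "five");
        return new List<int>(sorted.Keys); // 0, 1, 3, 5
    }

    // Insertion order: a List keeps elements in the order they were added.
    public static List<int> KeysInInsertionOrder()
    {
        var pairs = new List<KeyValuePair<int, string>>();
        pairs.Add(new KeyValuePair<int, string>(3, "three"));
        pairs.Add(new KeyValuePair<int, string>(2, "two"));
        pairs.Add(new KeyValuePair<int, string>(1, "one"));
        pairs.Add(new KeyValuePair<int, string>(0, "zero"));
        pairs.RemoveAll(p => p.Key == 2);
        pairs.Add(new KeyValuePair<int, string>(5, "five"));
        return pairs.ConvertAll(p => p.Key); // 3, 1, 0, 5
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", KeysInKeyOrder()));
        Console.WriteLine(string.Join(", ", KeysInInsertionOrder()));
    }
}
```

The list gives up fast lookup by key, so which alternative fits depends on whether ordering or lookup matters more.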

What is this Type in .NET (Reflection)

25 votes

What is this Type in .NET? I am using reflection to get a list of all the classes and this one turns up.

What is it? Where does it come from? How is the name DisplayClass1 chosen? I searched the sources and didn't see anything. What does the <> mean? What does the c__ mean? Is there a reference?


It's almost certainly a class generated by the compiler due to a lambda expression or anonymous method. For example, consider this code:

using System;

class Test
{
    static void Main()
    {
        int x = 10;
        Func<int, int> foo = y => y + x;
        Console.WriteLine(foo(x));
    }
}

That gets compiled into:

using System;

class Test
{
    static void Main()
    {
        ExtraClass extra = new ExtraClass();
        extra.x = 10;

        Func<int, int> foo = extra.DelegateMethod;
        Console.WriteLine(foo(extra.x));
    }

    private class ExtraClass
    {
        public int x;

        public int DelegateMethod(int y)
        {
            return y + x;
        }
    }
}

... except using <>c__DisplayClass1 as the name instead of ExtraClass. This is an unspeakable name in that it isn't valid C# - which means the C# compiler knows for sure that it won't appear in your own code and clash with its choice.

The exact manner of compiling anonymous functions is implementation-specific, of course - as is the choice of name for the extra class.

The compiler also generates extra classes for iterator blocks and (in C# 5) async methods and delegates.
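
One way to see the generated class for yourself is to ask a capturing lambda which type declares its method. This is a rough sketch with made-up names (ClosureInspector, GetClosureType); the printed type name is an implementation detail and will differ between compiler versions, so no particular output is promised.

```csharp
using System;
using System.Runtime.CompilerServices;

static class ClosureInspector
{
    // Returns the type that declares the method behind a capturing lambda.
    public static Type GetClosureType()
    {
        int x = 10;
        Func<int, int> foo = y => y + x; // captures x, so a helper class is needed
        return foo.Method.DeclaringType;
    }

    static void Main()
    {
        Type closureType = GetClosureType();

        // The name is compiler-chosen and not expressible in C# source.
        Console.WriteLine(closureType.Name);

        // Generated helper classes are normally tagged with this attribute.
        Console.WriteLine(Attribute.IsDefined(
            closureType, typeof(CompilerGeneratedAttribute)));
    }
}
```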

Why doesn't string.Substring share memory with the source string?

19 votes

As we all know, strings in .NET are immutable. (Well, not 100% totally immutable, but immutable by design and used as such by any reasonable person, anyway.)

This makes it basically OK that, for example, the following code just stores a reference to the same string in two variables:

string x = "shark";
string y = x.Substring(0);

// Proof:
fixed (char* c = y)
{
    c[4] = 'p';
}

Console.WriteLine(x);
Console.WriteLine(y);

The above outputs:

sharp
sharp

Clearly x and y refer to the same string object. So here's my question: why wouldn't Substring always share state with the source string? A string is essentially a char* pointer with a length, right? So it seems to me the following should at least in theory be allowed to allocate a single block of memory to hold 5 characters, with two variables simply pointing to different locations within that (immutable) block:

string x = "shark";
string y = x.Substring(1);

// Does c[0] point to the same location as x[1]?
fixed (char* c = y)
{
    c[0] = 'p';
}

// Apparently not...
Console.WriteLine(x);
Console.WriteLine(y);

The above outputs:

shark
park

For two reasons:

  • The string metadata (e.g. length) is stored in the same memory block as the characters. To allow one string to use part of the character data of another string, you would have to allocate two memory blocks for most strings instead of one. As most strings are not substrings of other strings, that extra allocation would cost more memory than you could gain by reusing parts of strings.

  • There is an extra NUL character stored after the last character of the string, to make the string also usable by system functions that expect a null terminated string. You can't put that extra NUL character after a substring that is part of another string.
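
The behaviour described in the question can also be checked without unsafe code by comparing object identity. A minimal sketch, with made-up method names; the fact that Substring(0) hands back the same instance is observed behaviour rather than something to rely on.

```csharp
using System;

static class SubstringDemo
{
    // Substring covering the whole string: the runtime can return the original.
    public static bool WholeStringIsSameObject()
    {
        string x = "shark";
        return object.ReferenceEquals(x, x.Substring(0));
    }

    // A true substring needs its own length header and terminator,
    // so it has to be a separate object with copied characters.
    public static bool TailIsSameObject()
    {
        string x = "shark";
        return object.ReferenceEquals(x, x.Substring(1));
    }

    static void Main()
    {
        Console.WriteLine(WholeStringIsSameObject()); // True
        Console.WriteLine(TailIsSameObject());        // False
    }
}
```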

Very High Memory Usage in .NET 4.0

18 votes

I have a C# Windows Service that I recently moved from .NET 3.5 to .NET 4.0. No other code changes were made.

When running on 3.5, memory utilization for a given workload was roughly 1.5 GB and throughput was 20 X per second. (The X doesn't matter in the context of this question.)

The exact same service running on 4.0 uses between 3 GB and 5 GB+ of memory, and gets less than 4 X per second. In fact, the service will typically end up stalling out as memory usage continues to climb until my system is sitting at 99% utilization and page file swapping goes nuts.

I'm not sure if this has to do with garbage collection, or what, but I'm having trouble figuring it out. My Windows service uses the "Server" GC via the config file switch seen below:

  <runtime>
    <gcServer enabled="true"/>
  </runtime>

Changing this option to false didn't seem to make a difference. Furthermore, from the reading I've done on the new GC in 4.0, the big changes only affect the workstation GC mode, not server GC mode. So perhaps GC has nothing to do with the issue.

Ideas?

Well this was an interesting one.

The root cause turns out to be a change in the behavior of SQL Server Reporting Services' LocalReport class (v2010) when running this on top of .NET 4.0.

Basically, Microsoft altered the behavior of RDLC processing so that each time a report is processed, it is done in a separate application domain. This was actually done specifically to address a memory leak caused by the inability to unload assemblies from app domains. When the LocalReport class processes an RDLC file, it actually creates an assembly on the fly and loads it into the app domain.

In my case, due to the large volume of reports I was processing, this was resulting in very large numbers of System.Runtime.Remoting.ServerIdentity objects being created. This was my tip-off to the cause, as I was confused as to why processing an RDLC required remoting.

Of course, to call a method on a class in another app domain, remoting is exactly what you use. In .NET 3.5, this wasn't necessary as, by default, the RDLC-assembly was loaded into the same app domain. In .NET 4.0, however, a new app domain is created by default.

The fix was fairly easy. First I needed to go enable legacy security policy using the following config:

  <runtime>
    <NetFx40_LegacySecurityPolicy enabled="true"/>
  </runtime>

Next, I needed to force the RDLCs to be processed in the same app domain as my service by calling the following:

myLocalReport.ExecuteReportInCurrentAppDomain(AppDomain.CurrentDomain.Evidence);

This resolved the issue.

.NET Memory issues loading ~40 images, memory not reclaimed, potentially due to LOH fragmentation

15 votes

Well, this is my first foray into memory profiling a .NET app (CPU tuning I have done) and I am hitting a bit of a wall here.

I have a view in my app which loads 40 images (max) per page, each running about 3MB. The max number of pages is 10. Seeing as I don't want to keep 400 images or 1.2GB in memory at once, I set each image to null when the page is changed.

Now, at first I thought that I must just have stale references to these images. I downloaded ANTS profiler (great tool BTW) and ran a few tests. The object lifetime graph tells me that I don't have any references to these images other than the single reference in the parent class (which is by design, also confirmed by meticulously combing through my code):


The parent class SlideViewModelBase sticks around forever in a cache, but the MacroImage property is set to null when the page is changed. I don't see any indication that these objects should be kept around longer than expected.

I next took a look at the large object heap and memory usage in general. After looking at three pages of images I have 691.9MB of unmanaged memory allocated and 442.3MB on the LOH. System.Byte[], which comes from my System.Drawing.Bitmap to BitmapImage conversion is taking pretty much all of the LOH space. Here is my conversion code:

public static BitmapSource ToBmpSrc( this Bitmap b )
{
    var bi = new BitmapImage();
    var ms = new MemoryStream();
    bi.CacheOption = BitmapCacheOption.OnLoad;
    b.Save( ms,  ImageFormat.Bmp );
    ms.Position = 0;
    bi.BeginInit();
    ms.Seek( 0, SeekOrigin.Begin );
    bi.StreamSource = ms;
    bi.EndInit();
    return bi;
}

I am having a hard time finding where all of that unmanaged memory is going. I suspected the System.Drawing.Bitmap objects at first, but ANTS doesn't show them sticking around, and I also ran a test where I made absolutely sure that all of them were disposed and it didn't make a difference. So I haven't yet figured out where all of that unmanaged memory is coming from.

My two current theories are:

  1. LOH fragmentation. If I navigate away from the paged view and click a couple of buttons about half of the ~1.5GB is reclaimed. Still too much, but interesting nonetheless.
  2. Some weird WPF binding thing. We do use databinding to display these images and I am no expert in regards to the ins and outs of how these WPF controls work.

If anyone has any theories or profiling tips I would be extremely grateful as (of course) we are on a tight deadline and I am scrambling a bit to get this final part done and working. I think I've been spoiled by tracking down memory leaks in C++ ... who woulda' thought?

If you need more info or would like me to try something else please ask. Sorry about the wall-o-text here, I tried to keep it as concise as possible.

This blog post appears to describe what you are seeing, and the proposed solution was to create an implementation of Stream that wraps another stream.

The Dispose method of this wrapper class needs to release the wrapped stream, so that it can be garbage collected. Once the BitmapImage is initialised with this wrapper stream, the wrapper stream can be disposed, releasing the underlying stream, and allowing the large byte array itself to be freed.

The BitmapImage keeps a reference to the source stream so it keeps the MemoryStream object alive. Unfortunately, even though MemoryStream.Dispose has been invoked, it doesn't release the byte array that the memory stream wraps. So, in this case, bitmap is referencing stream, which is referencing buffer, which may be taking up a lot of space on the large object heap. There isn't a true memory leak; when there are no more references to bitmap, all these objects will (eventually) be garbage collected. But since bitmap has already made its own private copy of the image (for rendering), it seems rather wasteful to have the now-unnecessary original copy of the bitmap still in memory.
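
The wrapper idea can be sketched as follows. This is not the blog post's code, just a minimal illustration of the technique described above: a Stream that forwards every call to an inner stream and drops its reference to that stream when disposed. The name WrappingStream is an assumption for this example and is not a framework type.

```csharp
using System;
using System.IO;

// Hypothetical wrapper written for this illustration.
sealed class WrappingStream : Stream
{
    private Stream _inner; // set to null on Dispose so the buffer can be collected

    public WrappingStream(Stream inner)
    {
        if (inner == null) throw new ArgumentNullException("inner");
        _inner = inner;
    }

    // Guards every forwarded call once the wrapper has been disposed.
    private Stream Inner
    {
        get
        {
            if (_inner == null) throw new ObjectDisposedException("WrappingStream");
            return _inner;
        }
    }

    public override bool CanRead { get { return _inner != null && _inner.CanRead; } }
    public override bool CanSeek { get { return _inner != null && _inner.CanSeek; } }
    public override bool CanWrite { get { return _inner != null && _inner.CanWrite; } }
    public override long Length { get { return Inner.Length; } }

    public override long Position
    {
        get { return Inner.Position; }
        set { Inner.Position = value; }
    }

    public override void Flush() { Inner.Flush(); }

    public override int Read(byte[] buffer, int offset, int count)
    {
        return Inner.Read(buffer, offset, count);
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        return Inner.Seek(offset, origin);
    }

    public override void SetLength(long value) { Inner.SetLength(value); }

    public override void Write(byte[] buffer, int offset, int count)
    {
        Inner.Write(buffer, offset, count);
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing && _inner != null)
        {
            _inner.Dispose();
            _inner = null; // dropping the reference is the whole point of the wrapper
        }
        base.Dispose(disposing);
    }
}

static class WrappingStreamDemo
{
    static void Main()
    {
        var wrapper = new WrappingStream(new MemoryStream(new byte[] { 1, 2, 3 }));
        Console.WriteLine(wrapper.Length);  // 3
        wrapper.Dispose();
        Console.WriteLine(wrapper.CanRead); // False
    }
}
```

In the conversion method from the question, the wrapper would presumably be assigned to StreamSource in place of the MemoryStream and disposed after EndInit, which relies on BitmapCacheOption.OnLoad having already copied the pixel data by then.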

Also, what version of .NET are you using? Prior to .NET 3.5 SP1, there was a known issue where a BitmapImage could cause a memory leak. The workaround was to call Freeze on the BitmapImage.

Synchronization primitives in the .NET Framework: which one is the good one?

15 votes

I have a problem concerning the System.Threading namespace in Microsoft .NET. This namespace defines many classes to help me manage threads. I have a problem, but I do not know what to use; MSDN is vague and I still haven't got a clue which classes do what. In particular, my problem concerns synchronization.

The problem

I have a certain number of threads (consider N threads). At a certain point a thread must stop and wait for at least one of the other threads to do something. Once one of the N - 1 threads has done a certain task, this thread notifies and the stopped thread will be able to proceed.

So it is just a synchronization issue: a thread must wait to be signalled, that's all.

Many classes

In System.Threading there are many classes provided in order to handle synchronization issues. There are WaitHandle(s), there are AutoResetEvent(s), there are ManualResetEvent(s) and so on...

Which one should I use?

The question

My question is: can anybody summarize which class I should use to solve my problem? Could you please describe the most important differences between these classes, or other classes?

The point is that I haven't really understood what each class is responsible for in synchronization: what is the difference, for example, between a WaitHandle and an AutoResetEvent or ManualResetEvent?

What about lock?

To handle many threading issues, .NET provides the lock statement and the Monitor class. Is this pair good for my needs?

Thank you

Albahari's book is amazing; you should really read through it some time. It's grown a lot lately!

What you want

You want an EventWaitHandle (EWH). They are nice because there is nothing to pass around; they are used for signaling threads (either in the same or in a different process) and, as the name implies, they can be waited on.

How you use it

You would open one on the thread that is doing the waiting, you open it with a given name that the other thread is going to know about. Then you wait on that wait handle.

The signaling thread will open an existing wait handle of the same name (name is a string) and call Set on it.

Differences

AutoResetEvents and ManualResetEvents both inherit from EWH and they are really just EWHs that act differently. Which one you want depends on whether you want the EWH to act as a gate or a turnstile. You only care about this if you are using the wait handle more than once or more than one thread is waiting on it. I've used wait handles a decent amount (I suppose) and I don't think I've ever used a Manual.
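
The gate versus turnstile difference can be shown in a few lines. Note the assumptions: this sketch uses unnamed, in-process events shared through a captured variable, not the named handles described above, and the class and method names are invented for the example.

```csharp
using System;
using System.Threading;

static class ResetEventDemo
{
    // Turnstile: one Set lets exactly one wait through, then the event resets itself.
    public static bool[] Turnstile()
    {
        using (var auto = new AutoResetEvent(false))
        {
            auto.Set();
            bool first = auto.WaitOne(0);  // consumes the signal
            bool second = auto.WaitOne(0); // nothing left to consume
            return new[] { first, second };
        }
    }

    // Gate: once Set, the event stays open for every wait until Reset is called.
    public static bool[] Gate()
    {
        using (var manual = new ManualResetEvent(false))
        {
            manual.Set();
            bool first = manual.WaitOne(0);
            bool second = manual.WaitOne(0);
            return new[] { first, second };
        }
    }

    // The scenario from the question: one thread blocks until another signals.
    public static int WaitForWorker()
    {
        int result = 0;
        using (var done = new AutoResetEvent(false))
        {
            var worker = new Thread(() =>
            {
                result = 42; // the "certain task"
                done.Set();  // notify the waiting thread
            });
            worker.Start();
            done.WaitOne();  // blocks until the worker calls Set
            worker.Join();
            return result;
        }
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Turnstile())); // True, False
        Console.WriteLine(string.Join(", ", Gate()));      // True, True
        Console.WriteLine(WaitForWorker());                // 42
    }
}
```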

Important to know

  • Whatever you do, don't pass wait handles around; they are meant to be opened separately by their own threads.

  • If the threads are in different processes that run in different sessions (for example a Windows service and a desktop application), then you will HAVE to prefix the name of the EWH with @"Global\", otherwise the name is only visible within the current session. Alternatively, if you are using them all within the same process, don't use the global namespace.

  • Keep in mind that EWHs can be permissioned, and if you run into issues with that I recommend that you use EventWaitHandleRights.FullControl; the EventWaitHandleRights enumeration lists the other options.

  • I like to name my EWHs with a Guid.NewGuid().ToString("N"). I typically do this when the signaling thread is created, since you can easily pass information to it at that time. So in that case, the initial thread creates the string and passes it to the signaling thread when it's created. That way both threads know the name, without having to do any fancy cross-thread passing of variables.

  • EWH implements IDisposable, so wrap it in a using block.

Race conditions

EWH's are nice because if for whatever reason the signaling thread opens and signals the wait handle before the waiting thread even creates it, everything will still work and the waiting thread will be signaled the instant it hits the wait.

Because of this, though, the thread that is waiting on it will need to have some error trapping because you will need to call OpenExisting. If you call one of the constructors and the handle already exists, you may get an UnauthorizedAccessException or a WaitHandleCannotBeOpenedException, as described under Exceptions in the constructor's documentation.

Why do .net languages vary in performance?

15 votes

I have heard that C++ .NET is fastest, C# is next, followed by VB .NET, and languages like IronPython and Boo come last in terms of performance. If all .NET languages compile to the same intermediate bytecode, why the difference in performance?

It is understandable for Boo and Python as all the types have to be evaluated at runtime. But why the difference between languages like C++ and C#?

Boo and Python perform worse because they are interpreted, not compiled. Instead of being converted to CIL (common intermediate language) before being run, they are converted at run time, which obviously will incur performance overhead. NOTE: Boo may not be interpreted, I'm not sure.

Also, since IronPython is dynamically-typed (apparently Boo is not), fewer optimizations can be made when compared to statically typed languages (which C++ and C# are).

You also have to consider the amount of effort put into making optimizations to each implementation. C# and C++.NET have huge teams at Microsoft working on making their compilers produce the fastest bytecode possible. IronPython and Boo are volunteer projects that don't have nearly as much manpower or resources, so they won't gain optimizations as quickly as something MS-funded.

Essentially, language features can have performance/memory costs at both compile-time and runtime. That is why .NET languages vary in performance; because they vary in features.

Any limit to number of properties on a .NET Class?

14 votes

Received a spec to add over 800 properties to an object. Are there any limits to the number of properties an object can have in C# (or .NET)?

Are there any performance impacts to be concerned about with objects of a class with this many properties?

Thanks!

The metadata can have up to 24-bit references/definitions per assembly. Each property needs 2 methods (a getter and a setter). Hence the limit will be 23 bits, or (1 << 23) - 1, for the entire assembly.

Update:

If they are only read-only properties, the limit would be (1 << 24) - 1.

Answer to second question:

No, there will be no performance overhead. Simple properties are likely to be inlined by the JIT.

Some thoughts:

You will never reach the above limit. Imagine having 16 million properties. That will require 16 million strings stored for the names too. Say the average name is 8 chars, then you are looking at a string table size of ~256MB (property name + method name), and then you haven't even started coding yet. Just a thought.
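
For concreteness, here are the two limits as numbers. One detail worth knowing when writing them as C# expressions: subtraction binds more tightly than the shift operator, so the parentheses matter. The constant names are made up for this sketch.

```csharp
using System;

static class MetadataLimits
{
    // Two accessor methods per property: half of the 24-bit space.
    public const int ReadWriteLimit = (1 << 23) - 1;   // 8388607

    // One accessor method per property: the full 24-bit space.
    public const int ReadOnlyLimit = (1 << 24) - 1;    // 16777215

    // Without parentheses this parses as 1 << (23 - 1), which is 1 << 22.
    public const int WithoutParentheses = 1 << 23 - 1; // 4194304

    static void Main()
    {
        Console.WriteLine(ReadWriteLimit);
        Console.WriteLine(ReadOnlyLimit);
        Console.WriteLine(WithoutParentheses);
    }
}
```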

Why not System.Void?

13 votes

Possible Duplicate:
What is System.Void?

I have no practical reason for knowing this answer, but I'm curious anyway...

In C#, trying to use System.Void will produce a compilation error:

error CS0673: System.Void cannot be used from C# -- use typeof(void) to get the void type object

As I understood it, void is simply an alias of System.Void. So, I don't understand why 'System.Void' can't be used directly as you might with 'string' for 'System.String' for example. I would love to read an explanation for this!

Incidentally, System.Void can be successfully used with the Mono compiler, instead of Microsoft's, and there it appears equivalent to using the void keyword. This must therefore be a compiler-enforced restriction rather than a CLR restriction, right?

I believe the sole purpose for this struct is to use it in reflection, whereas the other types (like System.String, System.Int32 etc.) are proper types holding data. Void carries no data and you cannot instantiate this struct from your code.

My guess about the compiler error is that it's there to enforce consistency in code. It would look weird to have methods like this:

System.Void MyMethod() { ... }

At first glance, it appears to be returning something while in reality it doesn't. In my opinion, this is a good decision by the C# team (if my speculation about it is correct).
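
The reflection use mentioned above looks like this in practice. A small sketch with invented names (VoidDemo, DoNothing): typeof(void) is the spelling C# allows, and it yields the System.Void type object that reflection reports as a method's return type.

```csharp
using System;
using System.Reflection;

static class VoidDemo
{
    static void DoNothing() { }

    // typeof(void) is allowed in C# and yields the System.Void type object.
    public static string VoidTypeName()
    {
        return typeof(void).FullName;
    }

    // Reflection reports System.Void as the return type of a void method.
    public static bool ReturnTypeIsVoid()
    {
        MethodInfo method = typeof(VoidDemo).GetMethod(
            "DoNothing", BindingFlags.Static | BindingFlags.NonPublic);
        return method.ReturnType == typeof(void);
    }

    static void Main()
    {
        Console.WriteLine(VoidTypeName());     // System.Void
        Console.WriteLine(ReturnTypeIsVoid()); // True
    }
}
```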

Why does Microsoft recommend against Empty Interfaces?

13 votes

While doing some research I stumbled on this page:

.Net 2.0 Avoid Empty Interfaces

.Net 4.0 Avoid Empty Interfaces:

Interfaces define members that provide a behavior or usage contract. The functionality that is described by the interface can be adopted by any type, regardless of where the type appears in the inheritance hierarchy. A type implements an interface by providing implementations for the members of the interface. An empty interface does not define any members. Therefore, it does not define a contract that can be implemented.

If your design includes empty interfaces that types are expected to implement, you are probably using an interface as a marker or a way to identify a group of types. If this identification will occur at run time, the correct way to accomplish this is to use a custom attribute. Use the presence or absence of the attribute, or the properties of the attribute, to identify the target types. If the identification must occur at compile time, then it is acceptable to use an empty interface.

Why would they advise this when there are plenty of valid examples of using blank interfaces?

Great question, considering that Microsoft doesn't follow its own advice.

Perfect example: IRequiresSessionState.

It says right in the documentation that it serves as a marker.

My guess is that they are trying to push the use of attributes, which more effectively represent "markers" and blend with reflection more easily as well.

It also presents a new way of doing things, which languages and companies love to do.

Edit: empty interfaces also tend to represent metadata ... which is precisely what attributes were designed to accommodate (which goes back to the previous statement I made about attributes more effectively representing markers).
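
To make the trade-off concrete, here is a sketch of both styles side by side. Every type here (IAuditable, AuditableAttribute, Invoice, Payment, MarkerDemo) is hypothetical. The attribute can carry data and is discovered at run time; the interface can be used as a compile-time generic constraint, which is the case the guideline explicitly allows.

```csharp
using System;

// Marker interface: no members, identifies a group of types.
interface IAuditable { }

// Marker attribute: same idea, but it can also carry metadata.
[AttributeUsage(AttributeTargets.Class, Inherited = false)]
sealed class AuditableAttribute : Attribute
{
    public AuditableAttribute(string category) { Category = category; }
    public string Category { get; private set; }
}

sealed class Invoice : IAuditable { }

[Auditable("finance")]
sealed class Payment { }

static class MarkerDemo
{
    // Run-time check against the interface.
    public static bool IsMarkedByInterface(object value)
    {
        return value is IAuditable;
    }

    // Run-time check against the attribute, including its data.
    public static string CategoryOf(Type type)
    {
        var attribute = (AuditableAttribute)Attribute.GetCustomAttribute(
            type, typeof(AuditableAttribute));
        return attribute == null ? null : attribute.Category;
    }

    // Compile-time check: only the interface can be used as a constraint.
    public static string Describe<T>(T item) where T : IAuditable
    {
        return typeof(T).Name;
    }

    static void Main()
    {
        Console.WriteLine(IsMarkedByInterface(new Invoice())); // True
        Console.WriteLine(CategoryOf(typeof(Payment)));        // finance
        Console.WriteLine(Describe(new Invoice()));            // Invoice
        // Describe(new Payment()); // would not compile: Payment is not IAuditable
    }
}
```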

What method is most efficient at moving objects across the wire in .NET?

13 votes

I've been using WebServices for moving data across the wire and that has served me pretty well. It excels at sending small pieces of data. As soon as you have to move deep object trees with lots of properties, the resulting XML soup takes 100k of data and turns it into 1MB.

So I've tried IIS Compression, but it left me underwhelmed. It compressed data well, but the trade off was in compression/decompression. Then I've serialized the objects via BinaryFormatter and sent that across. This was better, however, speed of encode/decode still remains.

Anyway, I am hearing that I am stuck in the 00s and now there are better ways to send data across the wire such as ProtocolBuffers, MessagePack, etc...

Can someone tell me whether these new protocols will be better suited for sending large pieces of data and whether I am missing some other efficient ways to do this?

By efficient, I mean amount of bandwidth, speed of encode/decode, speed of implementation, etc...

It depends on what's making up the bulk of your data. If you've just got lots of objects with a few fields, and it's really the cruft which is "expanding" them, then other formats like Protocol Buffers can make a huge difference. I haven't used MessagePack or Thrift, but I would expect they could have broadly similar size gains.

In terms of speed of encoding and decoding, I believe that both Marc Gravell's implementation of Protocol Buffers and my own will outperform any of the built-in serialization schemes.

Adding two .NET SqlDecimals increases precision?

12 votes

In .NET, when I add two SqlDecimals, like so:

SqlDecimal s1 = new SqlDecimal(1);
SqlDecimal s2 = new SqlDecimal(1);
SqlDecimal s3 = s1 + s2;

then s3 has precision 2, whereas both s1 and s2 have precision 1.

This seems odd, especially as the documentation states that the return value of the addition operator is "A new SqlDecimal structure whose Value property contains the sum." I.e. according to the documentation, addition should not change the precision.

Am I missing something here? Is this intended behaviour?

Cheers,

Tilman

This article (http://msdn.microsoft.com/en-us/library/ms190476.aspx) explains the behavior for the SQL types, and I assume the .NET Sql data types reflect that in their behavior.
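
The rule that SQL Server documents for adding two decimals is that the result scale is max(s1, s2) and the result precision is max(s1, s2) + max(p1 - s1, p2 - s2) + 1, which leaves room for a carry digit. The sketch below just evaluates that formula; the names are made up, the cap at 38 is simplified (SQL Server may also reduce the scale when the cap is hit), and whether SqlDecimal follows the rule exactly is the assumption made in this answer.

```csharp
using System;

static class DecimalAddRule
{
    // Scale of e1 + e2 under the SQL Server rule.
    public static int ResultScale(int s1, int s2)
    {
        return Math.Max(s1, s2);
    }

    // Precision of e1 + e2: integer digits plus scale, plus one digit for a carry.
    public static int ResultPrecision(int p1, int s1, int p2, int s2)
    {
        int precision = Math.Max(s1, s2) + Math.Max(p1 - s1, p2 - s2) + 1;
        return Math.Min(precision, 38); // simplified cap at the maximum precision
    }

    static void Main()
    {
        // Two values of precision 1 and scale 0, as in the question.
        Console.WriteLine(ResultPrecision(1, 0, 1, 0)); // 2
        Console.WriteLine(ResultScale(0, 0));           // 0
    }
}
```

Seen this way the extra digit is expected: 9 + 9 = 18 no longer fits in precision 1.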

Stopping Garbage Collection for an unmanaged Delegate

12 votes

I've recently been trying out using R.NET to get R talking to .NET and C#. It's been going very well so far, but I've hit a snag that I don't seem to be able to solve.

I've had no issues with simple, basic commands. I made a simple calculator, and something to import data into a data grid. But now I keep getting the following error:

A callback was made on a garbage collected delegate of type 'R.NET!RDotNet.Internals.blah3::Invoke'. This may cause application crashes, corruption and data loss. When passing delegates to unmanaged code, they must be kept alive by the managed application until it is guaranteed that they will never be called.

It began when I tried to repeatedly import a text file, just to test something. I've looked up this error in various ways - after hours of trawling through pages, it seems that there are a number of causes of this type of error. As time has gone on, I've been stripping back my code to more and more simple tasks to try to eliminate possibilities. I've got this now:

 public Form1()
        {
            InitializeComponent();

            REngine.SetDllDirectory(@"C:\Program Files\R\R-2.13.0\bin\i386");
            REngine.CreateInstance("RDotNet");

            using (REngine currentEngine = REngine.GetInstanceFromID("RDotNet"))
            {
                for (int i = 0; i < 1000; ++i)
                {
                    currentEngine.EagerEvaluate("test <- " + i.ToString());

                    NumericVector returned = currentEngine.GetSymbol("test").AsNumeric();

                    textBox1.Text += returned[0];

                }

            }

        }

All it does is increment a count in textBox1.Text. I had been doing it as a test with a button press incrementing the value, but this was making my finger ache after a while! It could typically manage loads of presses before throwing the error above.

At first this code seemed to be fine - so I had assumed the other stuff I had been doing was somehow the cause of the problem, as well as the cause of the error quoted above. So that's why I put in the for loop. The loop can run with no problems for several hundred runs, but then the error kicks in.

Now, I did read that this kind of error can be caused by the garbage collector getting rid of the instance I've been working with. I've tried various suggestions that I read, as best I understand them. These have included using GC.KeepAlive() (no luck), and also creating a private field in a separate class that can't be gotten rid of. Sadly this didn't work either.

Is there anything else that I can try? I'm very, very new to C# so I'd appreciate any pointers on how to get this working. I assume very much that my lack of success with the suggested methods is either something to do with (1) my own mistakes in implementing the standard fixes (this seems most likely) or (2) a quirk associated with R.NET that I haven't understood.

Any help would be greatly appreciated!

Looks like a bug in R.NET. The exception you're seeing happens when a .NET layer passes a callback to unmanaged code but then lets the delegate get garbage collected. I see no delegate usage in your repro code, hence the conclusion that it must be in R.NET.
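For reference, this is roughly what the fix looks like inside a library that hands callbacks to native code. It is a minimal sketch of the general pattern, not R.NET's actual code: the names NativeCallbacks, ProgressCallback and OnProgress are hypothetical, and no real native library is called.

```csharp
using System;
using System.Runtime.InteropServices;

static class NativeCallbacks
{
    [UnmanagedFunctionPointer(CallingConvention.Cdecl)]
    public delegate int ProgressCallback(int value);

    // A static field roots the delegate, so the GC cannot collect it while
    // native code may still call the function pointer created from it.
    private static readonly ProgressCallback s_callback = OnProgress;

    private static int OnProgress(int value)
    {
        return value + 1;
    }

    // Returns the pointer that would be handed to the unmanaged library.
    public static IntPtr GetCallbackPointer()
    {
        return Marshal.GetFunctionPointerForDelegate(s_callback);
    }

    static void Main()
    {
        IntPtr pointer = GetCallbackPointer();

        // A collection here cannot reclaim the delegate because the field still references it.
        GC.Collect();
        GC.WaitForPendingFinalizers();

        Console.WriteLine(pointer != IntPtr.Zero);
    }
}
```

The bug described above is the opposite case: the delegate is created as a temporary, passed to unmanaged code, and nothing on the managed side references it afterwards. Calling GC.KeepAlive() in application code does not help when the delegate lives inside the library.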

What is exactly happening when I spawn a new thread from .NET?

11 votes

I want to understand precisely what is happening behind the scenes when I spawn a new thread in .NET, something like this:

Thread t = new Thread(DoWork); //I am not interested in DoWork per se
t.Start();

1. What thread-related objects are created in CLR and Windows kernel?
2. Why are those objects needed?
3. How much managed/unmanaged memory (heap and stack) is allocated on x86, x64 Windows?

UPDATE
I am looking for objects such as the managed thread object, which I assume is t, but perhaps also some additional managed objects, as well as the kernel thread object, the user-mode thread environment block and the like.

Many thanks!

Win32 and Kernel memory allocated

I'm not exactly sure how the .NET part works, but if the runtime does decide to create a real thread with the OS, it would eventually call the Win32 API CreateThread in kernel32.dll, probably from mscorlib.ni.dll.

By default, new threads get 1MB of virtual address space for the stack, which is committed as needed. This can be controlled with the maxStackSize parameter of the Thread constructor. The main thread's stack size comes from a parameter in the executable file itself.
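As a small illustration of the maxStackSize parameter, the sketch below starts a thread with a smaller stack reservation through the Thread(ThreadStart, int) constructor overload. The 256 KB figure is an arbitrary value chosen for the example, and StackSizeDemo is a hypothetical name.

```csharp
using System;
using System.Threading;

static class StackSizeDemo
{
    public static int Result;

    static void DoWork()
    {
        Result = 42;
    }

    public static void RunWithSmallStack()
    {
        // Reserve 256 KB of stack address space instead of the 1MB default.
        // The runtime or OS may round or adjust this value.
        Thread t = new Thread(DoWork, 256 * 1024);
        t.Start();
        t.Join();
    }

    static void Main()
    {
        RunWithSmallStack();
        Console.WriteLine(Result);
    }
}
```

A smaller reservation only helps when a process needs a very large number of threads and their call depth is known to be shallow. A thread that recurses deeply on a small stack will fail with a stack overflow.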

In the process's address space, a TEB (thread environment block) will be allocated. Incidentally, the FS register on x86 points to this for things like thread local storage and structured exception handling (SEH). There are probably other things allocated by Win32 that are not documented.

In creating the Win32 thread, the Win32 server process (csrss.exe) is contacted. You can see that csrss has handles open to all Win32 processes and threads in Process Explorer for some kind of bookkeeping.

DLLs loaded in the process will be notified of the new thread and may allocate their own memory for tracking the thread.

The kernel will create an ETHREAD object (which embeds a KTHREAD as its first member) from kernel non-paged pool to track the thread's state. There will also be a kernel stack allocated (12k default for x86) which can be paged out (unless the thread is in a kernel mode wait state).

Why so many things need to allocate memory for a thread

Threads are the smallest preemptively scheduled unit that the OS provides and there is a lot of context connected to them. Many different components need to provide separate context for each thread because system services need to be able to deal with multiple threads doing different things all at the same time.

Some services require you to declare new threads to them explicitly but most are expected to work with new threads automatically. Sometimes this means allocating space right when the thread is started. As the thread engages other services, the amount of memory used to track the thread can increase as those services set up their own context for the thread.

How much memory is allocated

It's hard to say how much memory is allocated for a thread since it is spread across several address spaces and heaps. It will vary between Windows versions, installed components and what is loaded into the process currently.

The largest cost is generally accepted to be the 1MB of address space used by default for new threads, but even this limit can allow many hundreds to be used in a single process without running out of space.

If the design is using many more OS threads than the number of CPUs in the system, it should be reviewed. Work queues with a thread pool, or lightweight threads with user mode scheduling (fibers or another library's implementation), should be able to handle multithreading without requiring an excessive number of OS threads, making the memory cost of the threads unimportant.
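The work-queue approach can be sketched with the built-in thread pool. In this example 1000 small work items are queued, and the pool runs them on a handful of reused threads rather than creating 1000 OS threads. WorkQueueDemo and ProcessAll are hypothetical names, and the "work" is just a counter increment.

```csharp
using System;
using System.Threading;

static class WorkQueueDemo
{
    // Queues itemCount work items to the thread pool and waits for all of them.
    public static int ProcessAll(int itemCount)
    {
        int completed = 0;
        using (CountdownEvent done = new CountdownEvent(itemCount))
        {
            for (int i = 0; i < itemCount; i++)
            {
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    // Stand-in for real work; Interlocked keeps the count correct across threads.
                    Interlocked.Increment(ref completed);
                    done.Signal();
                });
            }

            // Block until every queued item has signalled completion.
            done.Wait();
        }
        return completed;
    }

    static void Main()
    {
        Console.WriteLine(ProcessAll(1000));
    }
}
```

The pool decides how many threads to use, so the per-thread stack and kernel costs described above are paid a few times rather than once per work item.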

How to calculate the total time a user spending on an application?

11 votes

I want to create an application that is able to calculate the total time the user (i.e. myself) has spent on a particular application, for example Firefox. This application should display a warning message if the user has spent a lot of time on Firefox (for example 1 hour or more).

Reason: I'm a VB.NET developer. During my working hours, my main tool is Visual Studio and I am supposed to be coding. But I occasionally need Firefox to access the internet (particularly SO and other sites) to find solutions to my programming problems. The problem is that I am addicted to SO, and SO sucks up my time for hours until I have forgotten that I am supposed to continue coding and not browse the SO site.

My question: How do I calculate the total time a user spends on an open application like Firefox?

Update:

I need to play a song as a warning message to myself if I stay too long on Firefox. My intent is to create a WinForms application or a Windows service to achieve this.

This is not a programming problem. This is a discipline problem. My advice:

  1. First of all, don't rely on an application to tell you what to do.
  2. Secondly, the application can warn you, but ultimately you can disable it or turn it off.
  3. Thirdly, my suggestion for your real problem, i.e. no discipline and poor work ethic, is to put up a small banner in front of your monitor with the text "Focus on your work", "Code it now" or "SO is evil".

How to create a new System.String type with other name?

10 votes

I try to describe my problem step by step because I do not know how to say it in correct programming terms.

When I use a System.String type, I do the following:

  1. Declare the type: Dim Str1 as String
  2. Assign its value: Str1 = "This is a string"

I want to create a new type that is just like the System.String type but with a different name. For example, I want to create a UrlString type for strings like this:

  1. Declare the type: Dim Str2 as UrlString
  2. Assign its value: Str2 = "http://www.example.com"

My question is: How do I create the UrlString type?

The reason: I want to create the UrlString type to help me to identify the value of the content. For example, UrlString type means the string is in url format, PhoneString means the string is in phone format, CreditCardString type means the string is in credit card format and so on.

UPDATE:

Thanks Marc Gravell and GSerg. Here is the solution:

Class UrlString
    Private ReadOnly value As String

    Public Sub New(ByVal value As String)
        Me.value = value
    End Sub

    Public Shared Widening Operator CType(ByVal value As String) As UrlString
        Return New UrlString(value)
    End Operator

    Public Shared Widening Operator CType(ByVal u As UrlString) As String
        Return u.value
    End Operator

    Public Overrides Function GetHashCode() As Integer
        Return If(value Is Nothing, 0, value.GetHashCode())
    End Function

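    Public Overrides Function Equals(ByVal obj As Object) As Boolean
        ' Compare against another UrlString; DirectCast(obj, String) would throw for a UrlString argument.
        Dim other As UrlString = TryCast(obj, UrlString)
        Return other IsNot Nothing AndAlso String.Equals(value, other.value)
    End Function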

    Public Overrides Function ToString() As String
        Return value
    End Function
End Class

You need to add an implicit conversion operator from string to UrlString for that to work. In C#:

class UrlString
{
    private readonly string value;
    public UrlString(string value) { this.value = value; }
    public static implicit operator UrlString(string value)
    {
        return new UrlString(value);
    }
    public override int GetHashCode()
    {
        return value == null ? 0 : value.GetHashCode();
    }
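    public override bool Equals(object obj)
    {
        // Compare against another UrlString; casting obj to string would throw for a UrlString argument.
        UrlString other = obj as UrlString;
        return !ReferenceEquals(other, null) && string.Equals(value, other.value);
    }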
    public override string ToString()
    {
        return value;
    }
}

Then:

UrlString foo = "abc";
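To show the wrapper in use in both directions, here is a compact, self-contained sketch. It assumes you also want the conversion back to string, as in the VB.NET version from the question's update. The ReferenceEquals calls are deliberate: once a conversion to string exists, writing `url == null` could bind to string comparison and call the conversion operator again.

```csharp
using System;

sealed class UrlString
{
    private readonly string value;

    public UrlString(string value) { this.value = value; }

    // string -> UrlString, so a literal can be assigned directly.
    public static implicit operator UrlString(string value)
    {
        return new UrlString(value);
    }

    // UrlString -> string, so the wrapper can be passed to APIs that expect a string.
    public static implicit operator string(UrlString url)
    {
        return ReferenceEquals(url, null) ? null : url.value;
    }

    public override bool Equals(object obj)
    {
        UrlString other = obj as UrlString;
        return !ReferenceEquals(other, null) && string.Equals(value, other.value);
    }

    public override int GetHashCode()
    {
        return value == null ? 0 : value.GetHashCode();
    }

    public override string ToString() { return value; }
}

static class UrlStringDemo
{
    static void Main()
    {
        UrlString home = "http://www.example.com"; // implicit string -> UrlString
        string raw = home;                         // implicit UrlString -> string

        Console.WriteLine(raw);
        Console.WriteLine(home.Equals(new UrlString("http://www.example.com")));
    }
}
```

Note that the implicit conversion accepts any string, so the type documents intent but does not validate the format. If validation matters, the constructor is the natural place to add it.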

How to turn off monitor using vb.net code

8 votes

How to turn off monitor using vb.net code? Ok, actually I found the C# solution. But I need the vb.net solution. I have tried an online C# to vb.net converter, but the converter complains that there are errors in the code.

Please help me to translate the following C# code to vb.net:

using System.Runtime.InteropServices; //to DllImport

public int WM_SYSCOMMAND = 0x0112;
public int SC_MONITORPOWER = 0xF170; //Using the system pre-defined MSDN constants that can be used by the SendMessage() function .


[DllImport("user32.dll")]
private static extern int SendMessage(int hWnd, int hMsg, int wParam, int lParam);
//To call a DLL function from C#, you must provide this declaration .


private void button1_Click(object sender, System.EventArgs e)
{

SendMessage( this.Handle.ToInt32() , WM_SYSCOMMAND , SC_MONITORPOWER ,2 );//DLL function
}

UPDATE:

I use this online converter

UPDATE: Here is the solution in vb.net

Imports System.Runtime.InteropServices

Public Class Form1
    Public WM_SYSCOMMAND As Integer = &H112
    Public SC_MONITORPOWER As Integer = &HF170

    <DllImport("user32.dll")> _
    Private Shared Function SendMessage(ByVal hWnd As Integer, ByVal hMsg As Integer, ByVal wParam As Integer, ByVal lParam As Integer) As Integer
    End Function

    Private Sub button1_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles Button1.Click
        SendMessage(Me.Handle.ToInt32(), WM_SYSCOMMAND, SC_MONITORPOWER, 2)
    End Sub


End Class

Try this

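' At the top of the file:
Imports System.Runtime.InteropServices

' Window message constants documented on MSDN for SendMessage.
Private Const WM_SYSCOMMAND As Integer = &H112
Private Const SC_MONITORPOWER As Integer = &HF170

' IntPtr keeps the declaration correct on both 32-bit and 64-bit Windows.
<DllImport("user32.dll")> _
Private Shared Function SendMessage(ByVal hWnd As IntPtr, ByVal msg As Integer, ByVal wParam As IntPtr, ByVal lParam As IntPtr) As IntPtr
End Function

Private Sub button1_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles Button1.Click
    ' An lParam of 2 asks the display to power off.
    SendMessage(Me.Handle, WM_SYSCOMMAND, New IntPtr(SC_MONITORPOWER), New IntPtr(2))
End Sub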

Custom MVVM implementation Vs. PRISM

8 votes

This question is inspired by this closed question -

What does Prism actually offer the developer? And is it worth it?

I have already implemented my own custom MVVM implementations in enterprise applications. I am interested in knowing -

Why should I learn PRISM (specifically PRISM, not other MVVM frameworks)? What are the benefits of PRISM over a custom MVVM implementation, and is it worth the investment in learning PRISM?

I hope this question is not subjective, and everyone, please don't get into arguments :)

As with many frameworks that do a common task for you, you get:

  1. Tested by many more eyeballs than just yourself. This (hopefully) includes unit tests, which you may or may not be doing while building your own framework.
  2. More readable for other developers: nobody else has experience with your custom MVVM framework. But if another developer joins your project, or joins your team, or joins your company, they can jump straight into Prism code.
  3. Better documentation. Along the same lines, anyone new joining likely has to learn the ropes by manually gathering the collective knowledge from your brain, and any other users on the team, and by looking at the source code. Third party frameworks have their own documentation, and tons more blog posts on the internet.
  4. Better community. You can ask questions on StackOverflow about "how do I do X with Prism?" You can't ask that with your custom framework.
  5. Likely more capable: by needing to serve more users than just you/your team, more features will have been added. If you need to do something MVVM-related that you've never done before, chances are support for it isn't built in to your own MVVM framework. But it's likely in Prism.
  6. Better structure. Let's say you wanted to do something MVVM-related but it wasn't in Prism. Very likely, there's a good reason for that! If something's not in a (reasonably-mature) framework made for working in a given domain, that's a sign that what you're trying to do is an unnatural or awkward way of approaching the problem. Working with your own framework, it's too easy to say "oh I'll add that feature," then 6 months later realize you made a huge mistake because this new feature makes your code very hard to follow or ends up being a vector for lots of bugs or whatnot.
  7. A CV line-item: I would have mixed feelings toward hiring someone who had "implemented and used custom MVVM framework." While it could mean they're smart, it could also indicate the dreaded "not invented here" syndrome. On the other hand, putting "Microsoft Prism MVVM Framework" among a huge list of technologies could be nice, but isn't a wow-er. The best of both worlds would be a longer bullet point, along the lines of "Deep understanding of the MVVM pattern, achieved by first implementing a toy MVVM framework for learning purposes before switching to Prism." Yes, the difference between these three isn't going to make or break your CV, and not-invented-here syndrome is something that would hopefully come up in an interview, but it's just worth keeping in mind, especially if you're applying for a place that gets enough resumes they can afford to throw out anything that unnerves them slightly.