Best memory-management questions in May 2011

Return large data by reference or as a function return value?

10 votes

Hi there,

At work today I had an argument with a colleague about passing large data between scopes. The claim was that passing by reference uses less memory and CPU when data moves between two scopes. We built a proof of concept to see who was right:

function by_return($dummy=null) {
    $dummy = str_repeat("1",100 * 1024 * 1024);
    return $dummy;
}

function by_reference(&$dummy) {
    $dummy = null;
    $dummy = str_repeat("1",100 * 1024 * 1024);
}
echo memory_get_usage()."/".memory_get_peak_usage()."\n";
//1 always returns: 105493696/105496656
$nagid = by_return();
echo memory_get_usage()."/".memory_get_peak_usage()."\n";
unset($nagid);
//2 always returns:  105493696/210354184 even if we comment 1st part
by_reference($dummy);
echo memory_get_usage()."/".memory_get_peak_usage()."\n";
unset($dummy);

But according to memory_get_peak_usage(), the by-reference version consumes more memory.

So, do you agree with this? Does returning the value from the function consume less memory in this case? Any enlightenment is welcome :)

This is due to the way PHP handles variables, and it is a bit counter-intuitive to anyone who has worked in C or C++.

Passing by reference to be smarter than PHP isn't advised. PHP doesn't actually make copies of data unless it needs to (i.e. you change a variable's value when there's more than 1 reference to it), an optimization strategy very similar to copy-on-write for shared memory pages.

So, let's say you have a variable that you pass by value several times in a given script. If you then take this variable and pass it by reference, you're actually duplicating the variable rather than just getting a pointer to the object.

This is because internally, a PHP zval (the data structure PHP uses to store variables) is either a reference variable or a non-reference variable, as recorded in its is_ref field. A zval that is shared by value has a ref_count greater than 1 but is not a reference variable, so it cannot simply be handed out as a reference. PHP is forced to create a new zval with its is_ref field set to true, which doubles the memory.
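The separation rule described above can be sketched as a toy model. This is Python rather than PHP, and it is not PHP's actual implementation: the class Zval and the functions pass_by_value and pass_by_reference are hypothetical names that only mirror the ref_count and is_ref fields mentioned in this answer.

```python
# Toy model only: Zval, pass_by_value and pass_by_reference are hypothetical
# names that mirror the ref_count / is_ref fields described above.

class Zval:
    def __init__(self, value, ref_count=1, is_ref=False):
        self.value = value          # the payload (a bytearray here)
        self.ref_count = ref_count  # how many variables point at this zval
        self.is_ref = is_ref        # True once the zval is a reference variable


def pass_by_value(zval):
    """Return (zval seen by the callee, bytes copied)."""
    if zval.is_ref:
        # A reference zval cannot be shared with a by-value user: copy it.
        return Zval(bytearray(zval.value)), len(zval.value)
    zval.ref_count += 1             # plain zval: just share it
    return zval, 0


def pass_by_reference(zval):
    """Return (zval seen by the callee, bytes copied)."""
    if zval.is_ref or zval.ref_count == 1:
        zval.is_ref = True          # nobody else is affected by the flag
        zval.ref_count += 1
        return zval, 0
    # Shared by value elsewhere: the flag cannot be flipped in place,
    # so the data is split off into a new reference zval.
    zval.ref_count -= 1
    return Zval(bytearray(zval.value), ref_count=2, is_ref=True), len(zval.value)


data = Zval(bytearray(b"1" * 1024))
_, copied_by_value = pass_by_value(data)          # shares, copies nothing
_, copied_by_reference = pass_by_reference(data)  # forced to duplicate
print(copied_by_value, copied_by_reference)       # prints: 0 1024
```

In this model the by-value pass is free, while the later by-reference pass pays for a full copy because the data was already shared by value.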

Tell your co-worker to stop trying to outsmart PHP. Unless it is done consistently throughout the code, passing by reference will cause a lot of overhead and can double the memory usage.

For a more detailed discussion, please see this link: http://porteightyeight.com/2008/03/18/the-truth-about-php-variables/

9 votes

I'm going through all of my documentation regarding memory management and I'm a bit confused about something.

When you use @property, it creates getters/setters for the object:

.h: @property (retain, nonatomic) NSString *myString;

.m: @synthesize myString;

I understand that, but where I get confused is the use of self. I see different syntax in different blogs and books. I've seen:

myString = [[NSString alloc] initWithString:@"Hi there"];

or

self.myString = [[NSString alloc] initWithString:@"Hi there"];

Then in dealloc I see:

self.myString = nil;

or

[myString release];

or

self.myString = nil;
[myString release];

On this site, someone stated that using self adds another increment to the retain count. Is that true? I haven't seen that anywhere.

Do the automatic getters/setters that are provided autorelease?

Which is the correct way of doing all of this?

Thanks!

If you are not using the dot syntax (or calling the accessor methods explicitly), you are not using any setter or getter; you are accessing the instance variable directly.

The next thing is, it depends on how the property has been declared.

Let's assume something like this:

@property (nonatomic, retain) Article *article;
...
@synthesize article;

Assigning something to article with

self.article = [[Article alloc] init];

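will over-retain the instance returned by alloc/init and cause a leak. The instance is already owned once when alloc/init returns it, and the setter of article retains it a second time (after releasing any previous instance for you), so a single release later is not enough to free it.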

So you could rewrite it as:

self.article = [[[Article alloc] init] autorelease];

Doing this

article = [[Article alloc] init]; 

is also ok, but could involve a leak as article may hold a reference to an instance already. So freeing the value beforehand would be needed:

[article release];
article = [[Article alloc] init]; 

Freeing memory could be done with

[article release];

or with

self.article = nil;

The first one accesses the field directly, with no setters or getters involved. The second one sets the field to nil through the setter, which releases the current instance, if there is one, before setting the field to nil.

This construct

self.myString = nil; 
[myString release];

is just too much: it actually sends release to nil, which is harmless but also needless.

You just have to mentally map that using the dot syntax is using accessor methods:

self.article = newArticle;
// is
[self setArticle:newArticle];

and

myArticle = self.article;
// is
myArticle = [self article];
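The retain-count bookkeeping described in this answer can be followed with a small toy model. It is written in Python, not Objective-C, and Obj, Owner, set_article, leaky and balanced are hypothetical names: set_article only imitates what a synthesized retain setter is described as doing above (retain the new value, release the old one), and drain_pool stands in for the autorelease pool draining later.

```python
# Toy model of manual retain/release. All names are hypothetical and
# only imitate the behaviour described in the answer above.

class Obj:
    def __init__(self):
        self.retain_count = 1      # alloc/init hands back an owned object
        self.autoreleases = 0      # releases owed when the pool drains

    def retain(self):
        self.retain_count += 1
        return self

    def release(self):
        self.retain_count -= 1

    def autorelease(self):
        self.autoreleases += 1
        return self

    def drain_pool(self):
        while self.autoreleases:
            self.autoreleases -= 1
            self.release()


class Owner:
    def __init__(self):
        self._article = None

    def set_article(self, new_article):
        # Imitates a retain setter: retain the new value, release the old one.
        if new_article is not None:
            new_article.retain()
        if self._article is not None:
            self._article.release()
        self._article = new_article


def leaky():
    owner, article = Owner(), Obj()
    owner.set_article(article)     # self.article = [[Article alloc] init];
    owner.set_article(None)        # self.article = nil; in dealloc
    return article.retain_count    # 1: the count never reaches 0


def balanced():
    owner, article = Owner(), Obj().autorelease()
    owner.set_article(article)     # self.article = [[[Article alloc] init] autorelease];
    article.drain_pool()           # the pool sends the deferred release
    owner.set_article(None)        # self.article = nil; in dealloc
    return article.retain_count    # 0: the object can be freed


print(leaky(), balanced())         # prints: 1 0
```

The only difference between the two functions is the autorelease call, which supplies the release that balances alloc/init.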

Some suggestions on reading, all official documents by Apple:

The Objective-C Programming Language

Memory Management Programming Guide

Memory limitations in a 64-bit .Net application?

9 votes

On my laptop, running 64-bit Windows 7 and with 2 GB of free memory (as reported by Task Manager), I'm able to do:

var x = new Dictionary<Guid, decimal>( 30 * 1024 *1024 );

Without having a computer with more RAM at hand, I'm wondering whether this will scale, so that on a computer with 4 GB of free memory I'll be able to allocate 60M items instead of "just" 30M, and so on.

Or are there other limitations (of .NET and/or Windows) that I'll bump into before I'm able to consume all available RAM?

Update: OK, so I'm not allowed to allocate a single object larger than 2 GB. That's important to know! But then I'm of course curious to know whether I'll be able to fully utilize all memory by allocating 2 GB chunks like this:

  var x = new List<Dictionary<Guid, decimal>>();
  for ( var i = 0 ; i < 10 ; i++ )
    x.Add( new Dictionary<Guid, decimal>( 30 * 1024 *1024 ) );

Would this work if the computer has more than 20 GB of free memory?

There is a 2 GiB limit on every object in .NET: you can never create a single object larger than 2 GiB. If you need something bigger, build it from parts that are each smaller than 2 GiB. That means you cannot have a contiguous array larger than 2 GiB, or a single string larger than about 512 MiB. I'm not entirely sure about the string limit, but in my testing I got OutOfMemoryExceptions when I tried to allocate strings bigger than 512 MiB.

These limits are also subject to heap fragmentation. The GC does try to compact the heap, but large objects (the somewhat arbitrary cutoff is 85,000 bytes) end up on the large object heap, which isn't compacted. As a side note, if you can keep short-lived allocations below this threshold, it is better for overall GC memory management and performance.
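To put the 2 GiB per-object limit into numbers for the dictionary in the question, here is a rough back-of-the-envelope calculation in Python. It rests on assumptions that are not stated in this thread: that each Dictionary<Guid, decimal> entry takes 40 bytes (a 4-byte hash code, a 4-byte next index, a 16-byte Guid and a 16-byte decimal), that the entries live in one contiguous array, and that array headers and capacity rounding can be ignored. The function names are made up for illustration.

```python
# Rough sizing sketch under the assumptions stated above; not a measurement.

GIB = 1024 ** 3
OBJECT_LIMIT = 2 * GIB            # the per-object limit from the answer
ENTRY_BYTES = 4 + 4 + 16 + 16     # assumed layout: hash, next, Guid, decimal


def entries_array_bytes(capacity):
    """Approximate size of the single contiguous entries array."""
    return capacity * ENTRY_BYTES


def fits_in_one_object(capacity):
    return entries_array_bytes(capacity) < OBJECT_LIMIT


def max_capacity_under_limit():
    return OBJECT_LIMIT // ENTRY_BYTES


print(fits_in_one_object(30 * 1024 * 1024))   # prints: True
print(fits_in_one_object(60 * 1024 * 1024))   # prints: False
print(max_capacity_under_limit())             # prints: 53687091
```

If those assumptions hold, a single dictionary with 60M preallocated entries would hit the per-object limit no matter how much RAM is free, while several 30M-entry dictionaries would each stay under it.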

Maximum size of native heap on Android?

7 votes

If I have understood correctly, an Android process has two heaps: one managed by the VM and one native.

The VM heap cannot grow beyond a fixed limit: 16 MB at minimum, though the value can be higher on some phones.

But what about the maximum size of the native heap?

The 16 MB limit doesn't seem to be a hard limit, in that an app can allocate more than 16 MB through the NDK, but the OS will start killing other processes, and possibly the foreground process as well, when a lot of memory is in use.

When does the OS start behaving this way? When the native heap plus the VM heap exceeds 16 MB?

Debug.getNativeHeapSize() gives the size of the native heap, but is there a function to check the combined native + VM heap size?

Curious to hear from someone who knows how this works!

There is no "line of death" in Android memory management. When the system needs to kill processes to reclaim memory, it considers a number of different factors, including the process's importance (determined by things like whether it is in the foreground or providing services to a foreground app) and how much memory it is using.

If your process is idle, and sitting on more memory than anything else, it's likely to be killed first.

The exact algorithm has evolved a bit over time, and the system doesn't make any guarantees about specific behavior.