Best arrays questions in December 2010

How to convert an int[] array to a List?

14 votes

I expected this code to display true:

int[] array = {1, 2};
System.out.println(Arrays.asList(array).contains(1));

The Arrays.asList(array) will result in a singleton list of an int[].

It works as you expect if you change int[] to Integer[]. Don't know if that helps you though.

What's the magic of arrays in C#

14 votes
int[] a = new int[5];
string[] b = new string[1];

The types of both a and b inherit from the abstract System.Array, but there is no real classes in the built-in library(it seems that there are some runtime types, you can't find the type defination class of an int[]). Can you tell me what happens while compiling? And why did they(the c# team) make this design(I mean why it's not something like Array<T>,instead they are using an abstract class with compiler magics)?

Trying to reason this out within the .NET type system doesn't get you very far. There is core support built into the JIT compiler and the CLR to deal with creating arrays. A statement like this:

        var arr = new int[5];

Generates this IL:

  IL_0001:  ldc.i4.5
  IL_0002:  newarr     [mscorlib]System.Int32

Which the JIT compiler then translate into this machine code:

00000035  mov         edx,5                 ; arg2 = array size
0000003a  mov         ecx,6F535F06h         ; arg1 = typeof(int)
0000003f  call        FFD52128              ; call JIT_NewArr1(type, size)

Core ingredients here are the dedicated IL opcode, newarr, instead of the usual newobj opcode that creates an instance of a class. And the simple translation to a CLR helper function that actually gets the object created. You can have a look-see at this helper function with the SSCLI20 source code, clr\src\vm\jithelpers.cpp. Too large to post here, but it is heavily optimized to make this kind of code run as fast possible, having direct access to the type internals available to CLR code.

There are two of these helpers available, JIT_NewArr1() creates one-dimensional (vector) arrays and JIT_NewMDArr() creates multi-dimensional arrays. Compare to the two overloads available for Type.MakeArrayType().

What interfaces do all arrays implement in c#?

12 votes

As a new .NET 3.0 programmer, I started to learn LINQ and I found something pretty basic that I haven't noticed before:

The book claims every array implements IEnumerable<T> (obviously, otherwise we couldn't use LINQ to objects on arrays...). When I saw this, I thought to myself that I never really thought about that, and I asked myself what else all arrays implement - so I examined System.Array using the object browser (since it's the base class for every array in the CLR) and, to my surprise, it doesn't implement IEnumerable<T>.

So my question is: where is the definition? I mean, how can I tell exactly which interfaces every array implements?

From the documentation (emphasis mine):

[...] the Array class implements the System.Collections.Generic.IList<T>, System.Collections.Generic.ICollection<T>, and System.Collections.Generic.IEnumerable<T> generic interfaces. The implementations are provided to arrays at run time, and therefore are not visible to the documentation build tools.

EDIT: as Jb Evain points out in his comment, only vectors (one-dimensional arrays) implement the generic interfaces. As to why multi-dimensional arrays don't implement the generic interfaces, I'm not quite sure since they do implement the non-generic counterparts (see the class declaration below).

The System.Array class (i.e. every array) also implements these non-generic interfaces:

public abstract class Array : ICloneable, IList, ICollection, IEnumerable, IStructuralComparable, IStructuralEquatable

Algorithms to efficiently "scale" or "resize" of an array of numbers (audio resampling)

10 votes

Doing audio processing (though it could just as well be image processing) I have a one-dimensional array of numbers. (They happen to be 16-bit signed integers representing audio samples, this question could apply to floats or integers of different sizes equally.)

In order to match audio with different frequencies (e.g. blend a 44.1kHz sample with a 22kHz sample), I need to either stretch or squash the array of values to meet a specific length.

Halving the array is simple: drop every other sample.

[231, 8143, 16341, 2000, -9352, ...] => [231, 16341, -9352, ...]

Doubling the array width is slightly less simple: double each entry in place (or optionally perform some interpolation between neighboring 'real' samples).

[231, 8143, 16341, 2000, -9352, ...] => [231, 4187, 8143, 12242, 16341, ...]

What I want is an efficient, simple algorithm that handles any scaling factor, and (ideally) optionally supports performing interpolation of one kind or another in the process.

My use case happens to be using Ruby arrays, but I'll happily take answers in most any language or pseudo-code.

This is something I threw together in a few minutes just as I was leaving work, then recreated after a glass of wine after dinner:

sample = [231, 8143, 16341, 2000, -9352]
new_sample = []
sample.zip([] * sample.size).each_cons(2) do |a,b|
  a[1] = (a[0] + b[0]).to_f / 2 # <-- simple average could be replaced with something smarter
  new_sample << a
end
new_sample.flatten!
new_sample[-1] = new_sample[-2]
new_sample # => [231, 4187.0, 8143, 12242.0, 16341, 9170.5, 2000, 2000]

I think it's a start but obviously not finished since the -9352 didn't propagate into the final array. I didn't bother converting floats to ints; I figure you know how to do that. :-)

I'd like to find a better way to iterate over each_cons. I'd rather use a map than each* but this works OK.

Here's what the loop iterates over:

asdf = sample.zip([] * sample.size).each_cons(2).to_a 
asdf # => [[[231, nil], [8143, nil]], [[8143, nil], [16341, nil]], [[16341, nil], [2000, nil]], [[2000, nil], [-9352, nil]]]

each_cons is nice because it steps through the array returning slices of it, which seemed like a useful way to build up the averages.

[0,1,2,3].each_cons(2).to_a # => [[0, 1], [1, 2], [2, 3]]

EDIT:

I like this better:

sample = [231, 8143, 16341, 2000, -9352]

samples = sample.zip([] * sample.size).each_cons(2).to_a 
new_sample = samples.map { |a,b|
  a[1] = (a[0] + b[0]).to_f / 2
  a
}.flatten
new_sample << sample[-1]
new_sample # => [231, 4187.0, 8143, 12242.0, 16341, 9170.5, 2000, -3676.0, -9352]

what is the point of heterogenous arrays?

9 votes

I know that more-dynamic-than-Java languages, like Python and Ruby, often allow you to place objects of mixed types in arrays, like so:

["hello", 120, ["world"]]

What I don't understand is why you would ever use a feature like this. If I want to store heterogenous data in Java, I'll usually create an object for it.

For example, say a User has int ID and String name. While I see that in Python/Ruby/PHP you could do something like this:

[["John Smith", 000], ["Smith John", 001], ...]

this seems a bit less safe/OO than creating a class User with attributes ID and name and then having your array:

[<User: name="John Smith", id=000>, <User: name="Smith John", id=001>, ...]

where those <User ...> things represent User objects.

Is there reason to use the former over the latter in languages that support it? Or is there some bigger reason to use heterogenous arrays?

N.B. I am not talking about arrays that include different objects that all implement the same interface or inherit from the same parent, e.g.:

class Square extends Shape
class Triangle extends Shape
[new Square(), new Triangle()]

because that is, to the programmer at least, still a homogenous array as you'll be doing the same thing with each shape (e.g., calling the draw() method), only the methods commonly defined between the two.

As katrielalex wrote: There is no reason not to support heterogeneous lists. In fact, disallowing it would require static typing, and we're back to that old debate. But let's refrain from doing so and instead answer the "why would you use that" part...

To be honest, it is not used that much -- if we make use of the exception in your last paragraph and choose a more liberal definition of "implement the same interface" than e.g. Java or C#. Nearly all of my iterable-crunching code expects all items to implement some interface. Of course it does, otheriwise it could do very little to it!

Don't get me wrong, there are absolutely valid use cases - there's rarely a good reason to write a whole class for containing some data (and even if you add some callables, functional programming sometimes comes to the rescue). A dict would be a more common choice though, and namedtuple is very neat as well. But they are less common than you seem to think, and they are used with thought and discipline, not for cowboy coding.

(Also, you "User as nested list" example is not a good one - since the inner lists are fixed-sized, you better use tuples and that makes it valid even in Haskell (type would be [(String, Integer)]))