Best arrays questions in May 2012

Why does in_array() wrongly return true with these (large numeric) strings?

37 votes

I am not getting what is wrong with this code. It's returning "Found", which it should not.

$lead = "418176000000069007";
$diff = array("418176000000069003","418176000000057001");

if (in_array($lead,$diff))
    echo "Found";
else
    echo "Not found";

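I think it is because of the limits of number storage. Without a third argument, in_array() compares loosely, so the numeric strings are converted to numbers. The values exceed PHP_INT_MAX (as they do on a build with 32-bit integers), so they become floats and lose precision in their last digits.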

Try without using the quotes, and try to echo the values of the variables. It will result in something like

$lead ---> 418176000000070000  
$diff ---> Array ( [0] => 418176000000070000 [1] => 418176000000060000 )

So in this case the in_array() result is true!
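
The rounding itself is not specific to PHP. Here is a minimal sketch in Ruby, assuming (this is the assumption to check) that PHP floats, like Ruby's Float, are IEEE 754 doubles. It only illustrates why the two values collapse once they are treated as floats; it does not reproduce in_array() itself.

```ruby
# Assumption: PHP floats and Ruby Floats are both IEEE 754 doubles,
# so the same rounding applies to both.
lead = "418176000000069007"
diff = ["418176000000069003", "418176000000057001"]

lead_f = lead.to_f
diff_f = diff.map(&:to_f)

# Near 4.18e17 adjacent doubles are 64 apart, so two integers that
# differ by only 4 can round to the same double.
puts lead_f == diff_f[0]   # the first candidate collapses onto $lead
puts lead_f == diff_f[1]   # the second one is far enough away
puts lead == diff[0]       # compared as strings, they stay distinct
```

Both 418176000000069007 and 418176000000069003 round to the double 418176000000068992, which is why a numeric comparison reports a match.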

<?php
     $lead = "418176000000069007";
     $diff = array("418176000000069003","418176000000057001");

     if(in_array($lead,$diff,true)) // third argument true: strict comparison, types are compared too
       echo "Found";
     else
       echo "Not found";
?>

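Try this. It will work, because passing true as the third argument makes in_array() compare strictly (type and value), so the two strings are compared as strings rather than as numbers.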

Java "new String[-1]" passes compilation. How come?

22 votes

While fiddling around in Java, I initialized a new String array with a negative length, i.e.:

String[] arr = new String[-1];

To my surprise, the compiler didn't complain about it. Googling didn't bring up any relevant answers. Can anyone shed some light on this matter?

Many thanks!

The reason is that the JLS allows this, and a compiler that flagged it as a compilation error would be rejecting valid Java code.

It is specified in JLS 15.10.1. Here's the relevant snippet:

"... If the value of any DimExpr expression is less than zero, then a NegativeArraySizeException is thrown."

Now if the Java compiler flagged the code as an error, then that specified behaviour could not occur ... in that specific code.

Furthermore, there's no text that I can find that "authorizes" the compiler to reject this in the "obvious mistake" cases involving compile-time constant expressions like -1. (And who is to say it really was a mistake?)


The next question, of course, is 'why does the JLS allow this?'

You'd need to ask the Java designers. However, I can think of some (mostly) plausible reasons:

  • This was originally overlooked, and there's no strong case for fixing it. (Noting that fixing it breaks source code compatibility.)

  • It was considered too unusual an edge case to be worth dealing with.

  • It would potentially cause problems for people writing source code generators. (Imagine, having to write code to evaluate compile-time constant expressions in order that you don't generate non-compilable code. With the current JLS spec, you can simply generate the code with the "bad" size, and deal with the exception (or not) if the code ever gets executed.)

  • Maybe someone had a plan to add "unarrays" to Java :-)

What are recursive arrays good for?

10 votes

Ruby supports recursive arrays (that is, self-containing arrays):

ruby-1.9.2-p180 :156 > a = []
 => [] 
ruby-1.9.2-p180 :157 > a << a
 => [[...]] 
ruby-1.9.2-p180 :158 > a.first == a
 => true 

This is intrinsically cool, but what work can you do with it?

A directed graph with undifferentiated edges could have each vertex represented simply as an array of the vertices reachable from that vertex. If the graph had cycles, you would have a 'recursive array', especially if an edge could lead back to the same vertex.

For example, this graph:
(figure: directed cyclic graph)
...could be represented in code as:

nodes = { a:[], b:[], c:[], d:[] }
nodes[:a] << nodes[:a]
nodes[:a] << nodes[:b]
nodes[:b] << nodes[:a]
nodes[:b] << nodes[:c]
p nodes
#=> {:a=>[[...], [[...], []]], :b=>[[[...], [...]], []], :c=>[], :d=>[]}

Usually the representation of each vertex would be more 'robust' (e.g. a class instance with properties for the name and array of outgoing edges), but it's not impossible to imagine a case where you wanted a very lightweight representation of your data (for very large graphs) and so needed to use a minimal representation like this.
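
To do any work with such a structure you have to guard against the cycles. Here is a minimal sketch of a traversal over the same `nodes` hash; `reachable_names` is a hypothetical helper written for this illustration, not part of Ruby. It tracks visited vertices by object identity, because two distinct vertices with no outgoing edges (like `:c` and `:d`) are both `[]` and would otherwise look equal.

```ruby
nodes = { a: [], b: [], c: [], d: [] }
nodes[:a] << nodes[:a]
nodes[:a] << nodes[:b]
nodes[:b] << nodes[:a]
nodes[:b] << nodes[:c]

# Hypothetical helper: names of all vertices reachable from `from`
# (including `from` itself), found by an iterative depth-first walk.
def reachable_names(nodes, from)
  seen = {}.compare_by_identity    # key on object identity, not on ==
  stack = [nodes.fetch(from)]
  until stack.empty?
    vertex = stack.pop
    next if seen.key?(vertex)      # already visited: this breaks the cycles
    seen[vertex] = true
    stack.concat(vertex)           # a vertex *is* its list of neighbours
  end
  nodes.select { |_name, vertex| seen.key?(vertex) }.keys
end

p reachable_names(nodes, :a)
p reachable_names(nodes, :c)
```

Starting from `:a` the walk visits `:a`, `:b` and `:c` and terminates despite the self-loop; `:d` is never reached.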

Why return an enumerable?

8 votes

I'm curious about why Ruby returns an Enumerable instead of an Array for something where an Array seems like the obvious choice. For example:

'foo'.class
# => String

Most people think of a String as an array of chars.

'foo'.chars.class
# => Enumerator

So why does String#chars return an Enumerable instead of an Array? I'm assuming somebody put a lot of thought into this and decided that Enumerable is more appropriate but I don't understand why.

This is completely in accordance with the spirit of 1.9: to return enumerators whenever possible. String#bytes, String#lines and String#codepoints, but also methods like Array#permutation, all return an enumerator when called without a block.

In Ruby 1.8, String#to_a resulted in an array of lines, but the method is gone in 1.9.
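
One practical upside of getting an enumerator back is that nothing is built until you ask for it, and an Array is still one call away. Here is a small sketch using String#each_char, which returns an Enumerator when called without a block. Using each_char rather than chars is a choice made for this illustration so that the result does not depend on what String#chars returns in a given Ruby version.

```ruby
text = "foo bar"

enum = text.each_char        # an Enumerator; no characters produced yet
puts enum.class

# Take only what is needed; iteration stops after two characters.
first_two = enum.first(2)

# Enumerators chain with other Enumerable helpers such as with_index.
labelled = enum.with_index.map { |ch, i| "#{i}:#{ch}" }.first(3)

# When an Array really is wanted, convert explicitly.
all_chars = enum.to_a

p first_two
p labelled
p all_chars.length
```

The explicit `to_a` is the only step that materialises every character; `first(2)` never looks past the second one.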