Best java questions in October 2011

What is x after "x = x++"?

69 votes

Possible Duplicate:
Is there a difference between x++ and ++x in java?
Why does this go into an infinite loop?

What happens (behind the curtains) when this is executed?

int x = 7;
x = x++;

I compiled and executed this. x is still 7 even after the entire statement. In my book, it says that x is incremented!

x = x++;

is equivalent to

int tmp = x;
x++;
x = tmp;

Efficiency of premature return in a function

61 votes

This is a situation I encounter frequently as an inexperienced programmer and am wondering about particularly for an ambitious, speed-intensive project of mine I'm trying to optimize. For the major C-like languages (C, objC, C++, Java, C#, etc) and their usual compilers, will these two functions run just as efficiently? Is there any difference in the compiled code?

void foo1(bool flag)
{
    if (flag)
    {
        //Do stuff
        return;
    }

    //Do different stuff
}

void foo2(bool flag)
{
    if (flag)
    {
        //Do stuff
    }
    else
    {
        //Do different stuff
    }
}

Basically, is there ever a direct efficiency bonus/penalty when breaking or returning early? How is the stackframe involved? Are there optimized special cases? Are there any factors (like inlining or the size of "Do stuff") that could affect this significantly?

I'm always a proponent of improved legibility over minor optimizations (I see foo1 a lot with parameter validation), but this comes up so frequently that I'd like to set aside all worry once and for all.

And I'm aware of the pitfalls of premature optimization... ugh, those are some painful memories.

EDIT: I accepted an answer, but EJP's answer explains pretty succinctly why the use of a return is practically negligible (in assembly, the return creates a 'branch' to the end of the function, which is extremely fast. The branch alters the PC register and may also affect the cache and pipeline, which is pretty minuscule.) For this case in particular, it literally makes no difference because both the if/else and the return create the same branch to the end of the function.

There is no difference at all:

=====> cat test_return.cpp
extern void something();
extern void something2();

void test(bool b)
{
    if(b)
    {
        something();
    }
    else
        something2();
}
=====> cat test_return2.cpp
extern void something();
extern void something2();

void test(bool b)
{
    if(b)
    {
        something();
        return;
    }
    something2();
}
=====> rm -f test_return.s test_return2.s
=====> g++ -S test_return.cpp 
=====> g++ -S test_return2.cpp 
=====> diff test_return.s test_return2.s
=====> rm -f test_return.s test_return2.s
=====> clang++ -S test_return.cpp 
=====> clang++ -S test_return2.cpp 
=====> diff test_return.s test_return2.s
=====> 

Meaning no difference in generated code whatsoever even without optimization in two compilers

Any idea why I need to cast an integer literal to (int) here?

53 votes

In the following example

int i = -128;
Integer i2 = (Integer) i; // compiles

Integer i3 = (Integer) -128; /*** Doesn't compile ***/

Integer i4 = (Integer) (int) -128; // compiles
Integer i4 = -128; // compiles
Integer i5 = (int) -128; // compiles
Integer i6 = (Integer) (-128); // compiles
Integer i7 = (Integer) 0-128; // compiles

I can't cast -128 with (Integer) but I can cast (int) -128.

I always thought -128 was of int type and casting it with (int) should be redundant.

The error on the line with i3 is

cannot find symbol variable Integer

I tried this with Java 6 update 29 and Java 7 update 1.

EDIT: You get the same behaviour with +128 instead of -128. It does appear to be confusion between unary and binary operators.

The compiler tries to substract 128 from (Integer) instead of casting -128 to Integer. Add () to fix it

Integer i3 = (Integer) -128; // doesn't compile
Integer i3 = (Integer) (-128); // compiles

According to BoltClock in the comments the cast to int works as intended, becaus it is a reserved word and therefore can't be interpreted as an identifier, which makes sense to me.

And Bringer128 found the JLS Reference 15.16.

 CastExpression:
    ( PrimitiveType Dimsopt ) UnaryExpression
    ( ReferenceType ) UnaryExpressionNotPlusMinus

As you can see, casting to a primitive type requires any UnaryExpression, whereas casting to a reference type requires a UnaryExpressionNotPlusMinus. These are defined just before the CastExpression at JLS 15.15.

How to approach number guessing game(with a twist) algorithm?

41 votes

I am learning programming (python and algo’s) and was trying to work on a project that I find interesting. I have created a few basic python scripts but I’m not sure how to approach a solution to a game I am trying to build.

Here’s how the game will work:

users will be given items with a value. For example

Apple = 1
Pears = 2
Oranges  = 3

They will then get a chance to choose any combo of them they like (i.e. 100 apples, 20 pears, and 1 oranges). The only output the computer gets is the total value(in this example, its currently $143). The computer will try to guess what they have. Which obviously it won’t be able to get correctly the first turn.

         Value  quantity(day1)  value(day1)
Apple    1      100             100
Pears    2      20              40
Orange   3      1               3
Total           121             143

The next turn the user can modify their numbers but no more than 5% of the total quantity (or some other percent we may chose. I’ll use 5% for example.). The prices of fruit can change(at random) so the total value may change based on that also(for simplicity I am not changing fruit prices in this example). Using the above example, on day 2 of the game, the user returns a value of $152 and $164 on day 3. Here's an example.

quantity(day2)  %change(day2)   value(day2) quantity(day3)  %change(day3)   value(day3)
104                             104         106                             106
21                              42          23                              46
2                               6           4                               12
127             4.96%           152         133             4.72%           164

*(I hope the tables show up right, I had to manually space them so hopefully its not just doing it on my screen, if it doesn't work let me know and I'll try to upload a screenshot).

I am trying to see if I can figure out what the quantities are over time(assuming the user will have the patience to keep entering numbers). I know right now my only restriction is the total value cannot be more than 5% so I cannot be within 5% accuracy right now so the user will be entering it forever.

What I have done so far

Here’s my solution so far(not much). Basically I take all the values and figure out all the possible combos of them(I am done this part). Then I take all the possible combos and put them in a database as a dictionary(so for example for $143, there could be a dictionary entry {apple:143, Pears:0, Oranges :0}..all the way to {apple:0, Pears:1, Oranges :47}. I do this each time I get a new number so I have a list of all possibilities.

Here’s where I’m stuck. Is using the rules above, how can I figure out the best possible solution? I think I’ll need a fitness function that automatically compares the two days data and removes any possibilities that have more than 5% variance of the previous days data.

Questions:

So my question with user changing the total and me having a list of all the probabilities, how should I approach this? What do I need to learn? Is there any algorithms out there or theories that I can use that are applicable? Or, to help me understand my mistake, can you suggest what rules I can add to make this goal feasible(if its not in its current state. I was thinking adding more fruits and saying they must pick at least 3,etc..)? Also, I only have a vague understanding of genetic algorithms but I thought I could use them here, if is there something I can use?

I'm very very eager to learn so any advice or tips would be greatly appreciated(just please don't tell me this game is impossible).

Thanks in advance.

UPDATE: Getting feedback that this is hard to solve. So I thought I'd add another condition to the game that won't interfere with what the player is doing(game stays the same for them) but everyday the value of the fruits change price(randomly). Would that make it easier to solve? Because within a 5% movement and certain fruit value changes, only a few combo's are probable over time. Day 1, anything is possible and getting a close enough range is almost impossible, but as the prices of fruits change and the user can only choose a 5% change, then shouldn't(over time) the range be narrow and narrow. In the above example, if prices are volatile enough I think I could brute force a solution that gave me a range to guess in, but I'm trying to figure out if there's a more elegant solution or other solutions to keep narrowing this range over time.

UPDATE2: After reading and asking around, I believe this is a hidden markov/Viterbi problem that tracks the changes in fruit prices as well as total sum(weighting the last data point the heaviest). I'm not sure how to apply the relationship though. I think this is the case and could be wrong but at the least I'm starting to suspect this is a some type of machine learning problem.

Update3: I am created a test case(with smaller numbers) and a generator to help automate the user generated data and I am trying to create a graph from it to see what's more likely. Here's the code, along with the total values and comments on what the users actually fruit quantities are.

#!/usr/bin/env python
import itertools

#Fruit price data
fruitPriceDay1 = {'Apple':1,'Pears':2,'Oranges':3}
fruitPriceDay2 = {'Apple':2,'Pears':3,'Oranges':4}
fruitPriceDay3 = {'Apple':2,'Pears':4,'Oranges':5}

#generate possibilities for testing(Warning..will not scale with large numbers)
def possibilityGenerator(target_sum, apple, pears, oranges):
    allDayPossible = {}
    counter = 1
    apple_range = range(0, target_sum + 1, apple)
    pears_range = range(0, target_sum + 1, pears)
    oranges_range = range(0, target_sum + 1, oranges)
    for i, j, k in itertools.product(apple_range, pears_range, oranges_range):
        if i + j + k == target_sum:
            currentPossible = {}
            #print counter
            #print 'Apple', ':', i/apple, ',', 'Pears', ':', j/pears, ',', 'Oranges', ':', k/oranges
            currentPossible['apple'] = i/apple
            currentPossible['pears'] = j/pears
            currentPossible['oranges'] = k/oranges
            #print currentPossible
            allDayPossible[counter] = currentPossible
            counter = counter +1
    return allDayPossible

#total sum being returned by user for value of fruits            
totalSumDay1=26 # computer does not know this but users quantities are apple: 20, pears 3, oranges 0 at the current prices of the day
totalSumDay2=51 # computer does not know this but users quantities are apple: 21, pears 3, oranges 0 at the current prices of the day
totalSumDay3=61 # computer does not know this but users quantities are apple: 20, pears 4, oranges 1 at the current prices of the day
graph = {}
graph['day1'] = possibilityGenerator(totalSumDay1, fruitPriceDay1['Apple'], fruitPriceDay1['Pears'], fruitPriceDay1['Oranges'] )
graph['day2'] = possibilityGenerator(totalSumDay2, fruitPriceDay2['Apple'], fruitPriceDay2['Pears'], fruitPriceDay2['Oranges'] )
graph['day3'] = possibilityGenerator(totalSumDay3, fruitPriceDay3['Apple'], fruitPriceDay3['Pears'], fruitPriceDay3['Oranges'] )
#sample of dict = 1 : {'oranges': 0, 'apple': 0, 'pears': 0}..70 : {'oranges': 8, 'apple': 26, 'pears': 13}
print graph

We'll combine graph-theory and probability:

On the 1st day, build a set of all feasible solutions. Lets denote the solutions set as A1={a1(1), a1(2),...,a1(n)}.

On the second day you can again build the solutions set A2.

Now, for each element in A2, you'll need to check if it can be reached from each element of A1 (given x% tolerance). If so - connect A2(n) to A1(m). If it can't be reached from any node in A1(m) - you can delete this node.

Basically we are building a connected directed acyclic graph.

All paths in the graph are equally likely. You can find an exact solution only when there is a single edge from Am to Am+1 (from a node in Am to a node in Am+1).

Sure, some nodes appear in more paths than other nodes. The probability for each node can be directly deduced based on the number of paths that contains this node.

By assigning a weight to each node, which equals to the number of paths that leads to this node, there is no need to keep all history, but only the previous day.

Also, have a look at non-negative-values linear diphantine equations - A question I asked a while ago. The accepted answer is a great way to enumarte all combos in each step.

Can 0.99999999999 be rounded to 1.0 when multiplying?

30 votes

When multiplying a floating number that is very close to 1 with an int > 0, can it ever be interpreted as 1.

That is, if Math.random() returns its highest possible result (which is 1 step below 1.0), will

(int)(Math.random() * 8)

be 8 or 7?

For a practical example, can this often-used (AFAIK) construct give an index out of bounds error:

Object element = elementArray[(int)(Math.random() * elementArray.length)];

I'm specifically interested in answers for Java and ActionScript 3, but I suppose they all use the same rules for floating point arithmetic, and answers for any platform would be useful.

Update: Though I already accepted an answer, I'd still appreciate confirmation that this can't go wrong in ActionScript 3 either, since a colleague reporting that he saw it go wrong once is what partly prompted me to ask this question.

If you multiply the greatest value below 1.0 with someInt (> 0), the result will never be someInt.

This can be exhaustively tested for integers like this:

Double greatestLessThanOne = Double.longBitsToDouble(4607182418800017407L);

// Assert that greatestLessThanOne is indeed the largest double less than 1.
//assert 1.0 == greatestLessThanOne + Math.ulp(greatestLessThanOne);

for (int i = 1; i >= 0; i++)
    if ((int) (greatestLessThanOne * i) == i)
        System.out.println("Exception found: " + i);

The snippet produces no output.

(Math.ulp returns the distance between the given double and the double value next larger in magnitude. The assertion thus ensures that greatestLessThanOne is indeed the greatest value less than 1.0.)

In other words, your line

Object element = elementArray[(int)(Math.random() * elementArray.length)];

will never give rise to an ArrayIndexOutOfBoundsException.


Furthermore, according to Mark Dickinsons comment over here, this holds also when multiplying with a double.

With IEEE 754 floating-point arithmetic in round-to-nearest mode, you can show that x * y < y for any x < 1.0 and any non-tiny positive y. (It can fail if y is either subnormal or the smallest positive normal number.)

How should I map string keys to values in Java in a memory-efficient way?

29 votes

I'm looking for a way to store a string->int mapping. A HashMap is, of course, a most obvious solution, but as I'm memory constrained and need to store 2 million pairs, 7 characters long keys, I need something that's memory efficient, the retrieval speed is a secondary parameter.

Currently I'm going along the line of:

List<Tuple<String, int>> list = new ArrayList<Tuple<String, int>>();
list.add(...); // load from file
Collections.sort(list);

and then for retrieval:

Collections.binarySearch(list, key); // log(n), acceptable

Should I perhaps go for a custom tree (each node a single character, each leaf with result), or is there an existing collection that fits this nicely? The strings are practically sequential (UK postcodes, they don't differ much), so I'm expecting nice memory savings here.

Edit: I just saw you mentioned the String were UK postcodes so I'm fairly confident you couldn't get very wrong by using a Trove TLongIntHashMap (btw Trove is a small library and it's very easy to use).

Edit 2: Lots of people seem to find this answer interesting so I'm adding some information to it.

The goal here is to use a map containing keys/values in a memory-efficient way so we'll start by looking for memory-efficient collections.

The following SO question is related (but far from identical to this one).

What is the most efficient Java Collections library?

Jon Skeet mentions that Trove is "just a library of collections from primitive types" [sic] and, that, indeed, it doesn't add much functionality. We can also see a few benchmarks (by the.duckman) about memory and speed of Trove compared to the default Collections. Here's a snippet:

                      100000 put operations      100000 contains operations 
java collections             1938 ms                        203 ms
trove                         234 ms                        125 ms
pcj                           516 ms                         94 ms

And there's also an example showing how much memory can be saved by using Trove instead of a regular Java HashMap:

java collections        oscillates between 6644536 and 7168840 bytes
trove                                      1853296 bytes
pcj                                        1866112 bytes

So even though benchmarks always need to be taken with a grain of salt, it's pretty obvious that Trove will save not only memory but will always be much faster.

So our goal now becomes to use Trove (seen that by putting millions and millions of entries in a regular HashMap, your app begins to feel unresponsive).

You mentioned 2 million pairs, 7 characters long keys and a String/int mapping.

2 million is really not that much but you'll still feel the "Object" overhead and the constant (un)boxing of primitives to Integer in a regular HashMap{String,Integer} which is why Trove makes a lot of sense here.

However, I'd point out that if you have control over the "7 characters", you could go even further: if you're using say only ASCII or ISO-8859-1 characters, your 7 characters would fit in a long (*). In that case you can dodge altogether objects creation and represent your 7 characters on a long. You'd then use a Trove TLongIntHashMap and bypass the "Java Object" overhead altogether.

You stated specifically that your keys were 7 characters long and then commented they were UK postcodes: I'd map each postcode to a long and save a tremendous amount of memory by fitting millions of keys/values pair into memory using Trove.

The advantage of Trove is basically that it is not doing constant boxing/unboxing of Objects/primitives: Trove works, in many cases, directly with primitives and primitives only.

(*) say you only have at most 256 codepoints/characters used, then it fits on 7*8 == 56 bits, which is small enough to fit in a long.

Sample method for encoding the String keys into long's (assuming ASCII characters, one byte per character for simplification - 7 bits would be enough):

long encode(final String key) {
    final int length = key.length();
    if (length > 8) {
        throw new IndexOutOfBoundsException(
                "key is longer than 8 characters");
    }
    long result = 0;
    for (int i = 0; i < length; i++) {
        result += ((long) ((byte) key.charAt(i))) << i * 8;
    }
    return result;
}

Use "this" everywhere?

23 votes

Coding style question:

Within a class' methods, is it advisable to use this.myField rather than just myField? Always? Or should this never be done, except when myField is shadowed by a method (or constructor) parameter also called myField? (usually in setters)

Is there a convention that guides this?

I tend to use this. a lot out of laziness, because that allows me to pick the field through autocompletion. But maybe there is a downside to this (no pun intended).

We prefer not to use this if it is not syntactically required and neither use m_... to mark member fields. We can do this because modern Java IDEs (like Eclipse) graphically mark the difference between local and class variables anyway, so today better readability is no longer a reason to increase typing overhead.

Transitive nature of equals method

19 votes

The contract for equals(object) method specifies 4 properties to follow: Reflexive, Symmetric, Transitive and Consistent. While I understand the danger of not following Reflexive, Symmetric and Consistent , and can definitely agree its good to follow transitive, I was wondering what harm it would bring if its violating the Transitive property?

Specifically, which of the Java library (or various third party libraries) need the dependency upon equals to be transitive to work correctly? In my understanding, the Collections framework will work if the other 3 properties are well implemented.

Assume three objects a,b,c with

a == a, b == b, c == c (reflexive)
a == b, b == a
b == c, c == b
a != c, c != a

(Pseudocode, x == y stands for x.equals(y)).

Now, let's add the objects to a set:

Set s = new HashSet(); // Set implementation doesn't matter
s.add(b); // s = [b]
s.add(a); // s doesn't change, because a == b
s.add(c); // s doesn't change, because c == b

In contrast, if we were to add them in a different order:

Set s = new HashSet();
s.add(a); // s = [a]
s.add(b); // s doesn't change, because b == a
s.add(c); // s = [a,c], because c != a

That is clearly counter-intuitive and doesn't match the behavior one would expect of a set. For example, it would mean that the union of two sets (i.e. the state of s after s.addAll(someOtherSet)) could depend on the implementation (order of elements) of someOtherSet.

Why does automatic boxing work in eclipse but not in javac?

16 votes

This code:

Integer ints[] = new Integer[]{'1', '2', '3'};

compiles just fine in eclipse, but javac (both version 1.6.0_27 and 1.7.0) gives the following error:

BoxTest.java:4: incompatible types
found   : char
required: java.lang.Integer
               Integer ints[] = new Integer[]{'1', '2', '3'};

BoxTest.java:4: incompatible types

Why?

I assume it's some kind of compiler flag, but digging trough eclipse to figure it out is not exactly straight forward.

What javac is not doing, is not autoboxing, but is an automatic cast. In javac it compiles with:

Integer ints[] = new Integer[] { (int) '1', (int) '2', (int) '3' };

The same happens with just one Integer, again in javac, I've to do the explicit cast to compile:

Integer a = (int) '1';

But here is what I found. Using Eclipse JDT batch compiler from the command line it works, even without the casts:

$ java -jar org.eclipse.jdt.core_3.7.1.v_B76_R37x.jar -classpath rt.jar \
  -1.6 Appo.java 

I've looked at the options of javac and I don't think there is any way to change this behaviour.

I've to infer that the difference is caused by the fact that Eclipse is not using javac internally, but the JDT compiler.

Pattern.matches() gives StackOverflowError

15 votes

I'm using java's Pattern.matches to match a block of data to a regex. The block of data can be a single line or multiple lines. The problem is that once my data becomes more than 15 lines (typically more than 17-18 lines), i start getting stackoverflowerror. For data less than 15 lines the regex works fine.

The Regex is of this format:
domainname -> space -> , -> space -> number -> space -> , -> space -> number -> newline

String regex = "^(([a-zA-Z0-9][a-zA-Z0-9\\-]*\\.)+([a-zA-Z]{2,})\\s*,\\s*\\d+\\s*,\\s*\\d+(\\r?\\n)?)+$";

The data block i use to test against this regex is this

abc.com, 123, 456
abc.com, 123, 456
abc.com, 123, 456
abc.com, 123, 456
abc.com, 123, 456
abc.com, 123, 456
abc.com, 123, 456
abc.com, 123, 456
abc.com, 123, 456
abc.com, 123, 456
abc.com, 123, 456
abc.com, 123, 456
abc.com, 123, 456
abc.com, 123, 456
abc.com, 123, 456
abc.com, 123, 456
abc.com, 123, 456
abc.com, 123, 456

This is the code:

String regex = "^(([a-zA-Z0-9][a-zA-Z0-9\\-]*\\.)+([a-zA-Z]{2,})\\s*,\\s*\\d+\\s*,\\s*\\d+(\\r?\\n)?)+$";
boolean valid = Pattern.matches(regex, data); //fails here

I can't tell you the reason for this error; the regex itself is fine and not subject to catastrophic backtracking or any other obvious error.

Perhaps you can reduce the number of backtracking positions the regex engine saves by using possessive quantifiers (++ instead of +, *+ instead of *, {2,}+ instead of {2,} etc.). Also, you don't need the capturing groups (thanks Thomas), so I've changed them into non-capturing ones:

"(?:(?:[a-zA-Z0-9][a-zA-Z0-9-]*+\\.)++([a-zA-Z]{2,}+)\\s*+,\\s*+\\d++\\s*+,\\s*+\\d++(\r?+\n)?+)++"

This won't change the behaviour of the regex (except for the removal of the unnecessary anchors since you're using Pattern.matches()), but perhaps it helps avoid StackOverflows. I don't have a Java SDK installed, so I can't test it myself, though.

Dynamic Programming algorithm to find Heavy integers

15 votes

Recently I was asked the following question:

A non-negative integer is called heavy if the average value of its digits in decimal representation exceeds 7. For example the number 8698 is heavy, because the average value of its digits equal to (8+6+9+8)/4 = 7.75

Given two non-negative integers A and B find the number of heavy integers in the interval [A..B] (A and B included)

A and B are integers within the range [0..200,000,000].

...which is the same problem explained in this post:

Interview Question: Optimal Solution to the problem of finding Heavy integers

Now, during the interview I used the naive solution, and later that day improved it following the idea of "skipping the numbers that are surely not-heavy".

The problem is, in my interview the requested Time Complexity was O(log(A)+log(B))...which is not met by either of these solution, if I am not wrong.

So, out of curiosity, I wanted to ask you: can you think of any way to meet that complexity? Also, following this other post about the problem: http://math.stackexchange.com/questions/47329/calculating-heavy-numbers-in-a-given-range, it looks like this problem can be modeled as Dynamic Programming.....only I can't see how....any suggestions?

I have a solution for a sub-problem with additional restrictions:

1) I only use digits 1 to 9 (0 is complicated because you don't include it in the average when it starts a number).

2) I compute the numbers of combinations for "n" digits, so A and B are fixed (for n=3, A=111 and B=999).

I haven't checked how difficult it might be to generalize my solution with 0's and arbitrary A and B. Also, as I explain in the end, it doesn't look like my solution reduces the complexity down to O(log(B)).

I define a function c(n, s) which is the number (count) of n-digits "integers" with a sum above or equal to s. For example, c(1, 9) = 1 since only the single digit integer 9 satisfies i >= 9. More generally c(1, i) = 10 - i.

For c(2, s), we want all combinations {ij: 1 <= i <= 9, 1 <= j <= 9, i + j >= s}. We can use dynamic programming since we know the values of c(1, s). We will consider the 9 cases i = 1, .., 9 and figure out how many possible values of j there are from c(1, s). Let's try for example s = 18: there is only possible combination with i = 9 and the number of possible j given by c(1, 18 - 9) = 1. So c(2, 18) = 1, which is obvious since 99 is the only solution. For a general s, the equation that gives the number of combinations with a digit sum above or equal to s is:

c(2, s) = \sum_{i=1}^9 c(1, s - i) 

And you can convince your self more generally that for any number of digits n:

c(n, s) =  \sum_{i=1}^9 c(n - 1, s - i) 

Back to your original problem: The number of n-digits heavy "integers" is given by c(5, 5 * 7). Of course in order to get that value you'll first have to compute c(1, s), ... c(4, s) for all s's (but only a small number of s's are non-zero: for c(n, s) only s = n, ..., 10^n - 1 are non-zero).

For computational complexity, at each digit d, you compute c(d, s) for s = d, ..., 10^d - 1. Each term just requires a sum of 9 terms and you have 10^d - 1 - d such terms. So since d goes from 1 to n, you get $O(n * 10^n)$. But the number of digits n is just log B (where log is in base 10), so O(log(B) * B), which is not what you wanted.

Improved solution:
There is actually no need to compute all values of s at each digit. If we go back to the basic equation c(n, s) = \sum_{i=1}^9 c(n - 1, s - i) we can see that to get c(n, s) we do no need all values of c(n - 1, t), but only 9 such values (t = s - 1, ... , s - 9). Things get a bit complicated to measure the computational complexity since if you go to c(n - 2, k), you then need roughly 2 * 9 values and roughly d * 9 values for c(n - d, m). When you sum all those operations you get a computational complexity of O(n^2) = O(log(B)^2), which is still not quite what you wanted.

There's a "pruning" of terms going on too such that you'll actually get less than d * 9 values for c(n - d, k) (for example c(1, k) only has 9 possible values of k, much less than d * 9), but I doubt it's enough to bring the complexity down to O(log(B)).

Update:
I found a solution nearly identical to mine here. It's even better since there is apparently an exact formula for the function c(n, s), which they call f(n, s). It's not exactly the same function however since I simplified the problem. I haven't checked that it is exact either.

Java breaks strong typing! Who can explain it?

15 votes

Possible Duplicate:
Varying behavior for possible loss of precision

I found an inconsistence in Java strong typing check at compile time. Please look at the following code:

int sum = 0;
sum = 1; //is is OK
sum = 0.56786; //compile error because of precision loss, and strong typing
sum = sum + 2; //it is OK
sum += 2; //it is OK
sum = sum + 0.56787; //compile error again because of automatic conversion into double, and possible precision loss
sum += 0.56787; //this line is does the same thing as the previous line, but it does not give us a compile error, and javac does not complain about precision loss etc.

Can anyone explain it to me? Is it a known bug, or desired behavior? C++ gives a warning, C# gives a compile error.

Does Java breaks strong typing? You can replace += with -= or *= - everything is acceptable by a compiler.

This behaviour is defined by the language (and is therefore OK). From the JLS:

15.26.2 Compound Assignment Operators

A compound assignment expression of the form E1 op= E2 is equivalent to E1 = (T)((E1) op (E2)), where T is the type of E1, except that E1 is evaluated only once. For example, the following code is correct:

short x = 3;
x += 4.6;

and results in x having the value 7 because it is equivalent to:

short x = 3;
x = (short)(x + 4.6);

OpenJDK's rehashing mechanism

15 votes

Found this code on http://www.docjar.com/html/api/java/util/HashMap.java.html after searching for a HashMap implementation.

  264       static int hash(int h) {
  265           // This function ensures that hashCodes that differ only by
  266           // constant multiples at each bit position have a bounded
  267           // number of collisions (approximately 8 at default load factor).
  268           h ^= (h >>> 20) ^ (h >>> 12);
  269           return h ^ (h >>> 7) ^ (h >>> 4);
  270       }

Can someone shed some light on this? The comment tells us why this code is here but I would like to understand how this improves a bad hash value and how it guarantees that the positions have bounded number of collisions. What do these magic numbers mean?

In order for it to make any sense it has to be combined with an understanding of how HashMap allocates things in to buckets. This is the trivial function by which a bucket index is chosen:

static int indexFor(int h, int length) {
    return h & (length-1);
}

So you can see, that with a default table size of 16, only the 4 least significant bits of the hash actually matter for allocating buckets! (16 - 1 = 15, which masks the hash by 1111b)

This could clearly be bad news if your hashCode function returned:

10101100110101010101111010111111

01111100010111011001111010111111

11000000010100000001111010111111
//etc etc etc

Such a hash function would not likely be "bad" in any way that is visible to its author. But if you combine it with the way the map allocates buckets, boom, MapFail(tm).

If you keep in mind that h is a 32 bit number, those are not magic numbers at all. It is systematically xoring the most significant bits of the number rightward into the least significant bits. The purpose is so that "differences" in the number that occur anywhere "across" it when viewed in binary become visible down in the least significant bits.

Collisions become bounded because the number of different numbers that have the same relevant LSBs is now significantly bounded because any differences that occur anywhere in the binary representation are compressed into the bits that matter for bucket-ing.

Merging interfaces, without merging

14 votes

I was thinking, does C++ or Java have a way to do something like this

Interface IF1{
    ....
};

Interface IF2{
    ....
};


function f(Object o : Implements IF1, IF2){
    ...
}

meaning a typesystem that allows you to require implementation of interfaces.

You can do this in Java:

public <I extends IF1 & IF2> void methodName(I i){

....

}

This way you force I to implement your two interfaces, otherwise it won't even compile.

Filling the data in an Android bitmap as quickly as possible from C

13 votes

I've managed to get an android.graphics.Bitmap created and I'm successfully filling it via the SetPixels command.

The problem is that I start off with RGBA data. I then create a jintArray. I then call SetIntArray (effectively memcpying the data into the buffer). Then, finally, I call setPixels to actually set the pixels (which presumably causes another copy).

one big issue with doing this is that whether I used R8G8B8A8 or R5G6B5 or A8 I still have to convert my pixel data to R8G8B8A8 data.

Ideally I'd like a way to fill the buffer using only one copy and allow me to do it without doing the pixel format conversion.

Is there any way to directly get at the buffer data contained in a Bitmap? I see there is a function GetDirectBufferAddress in the JNI, but the documentation I can find on it suggests its limited to a java.nio.buffer. Can I directly get at the pixel data using this function? Perhaps by getting the internal buffer used by the Bitmap class?

Is my only way of using this to create a Global Ref'd Java.nio.buffer then each time I want to update, copy my pixel data into it and then use copyPixelsFromBuffer? This still involves 2 copies but can, at least, eliminate the pixel format change. Is this going to be any more efficient than the method I'm already using?

Is there an even better way of doing it?

Btw, I AM aware of the fact I can use the functions in < android/bitmap.h > but I would really like to not lose support for Android 2.1 and Android 2.2 ...

Cheers in advance!

Here comes a dirty yet working solution, working from Android 1.5 up to 4.0. Code is in C++.

                              //decls of some partial classes from Skia library
class SkRefCnt{
public:
   virtual ~SkRefCnt(){}
private:
   mutable int fRefCnt;
};

//----------------------------

class SkPixelRef: public SkRefCnt{
public:
   virtual class Factory getFactory() const;
   virtual void flatten(class SkFlattenableWriteBuffer&) const;
protected:
   virtual void* onLockPixels(class SkColorTable**) = 0;
   virtual void onUnlockPixels() = 0;
public:
   void *GetPixels(){
      SkColorTable *ct;
      return onLockPixels(&ct);
   }
};

jobject java_bitmap;  //your Bitmap object
jclass java_bitmap_class = env.GetObjectClass(java_bitmap);
class SkBitmap;
SkBitmap *sk_bitmap = (SkBitmap*)env.CallIntMethod(java_bitmap, env.GetMethodID(java_bitmap_class, "ni", "()I"));
SkPixelRef *sk_pix_ref;
sk_pix_ref = (SkPixelRef*)((int*)sk_bitmap)[1];
// get pointer to Bitmap's pixel memory, and lenght of single line in bytes
int buffer_pitch = env.CallIntMethod(java_bitmap, env.GetMethodID(java_bitmap_class, "getRowBytes", "()I"));
void *buffer = sk_pix_ref->GetPixels();

How do I protect OAuth keys from a user decompileing my project?

11 votes

I am writing my first application to use OAuth. This is for a desktop application, not a website or a mobile device where it would be more difficult to access the binary, so I am concerned on how to protect my application key and secret. I feel it would be trivial to look at the complied file and find the string that stores the key.

Am I over reacting or is this a genuine problem (with a known solution) for desktop apps?

This project is being coded in Java but I am also a C# developer so any solutions for .NET would be appreciated too.

EDIT: I know there is no perfect solution, I am just looking for mitigating solutions.

EDIT2: I know pretty much only solution is use some form of obfuscation. Are there any free providers for .NET and Java that will do string obfuscation?

There is no good or even half good way to protect keys embedded in a binary that untrusted users can access.

There are reasons to at least put a minimum amount of effort to protect yourself.

The minimum amount of effort won't be effective. Even the maximum amount of effort won't be effective against a skilled reverse engineer / hacker with just a few hours of spare time.

If you don't want your OAuth keys to be hacked, don't put them in code that you distribute to untrusted users. Period.

Am I over reacting or is this a genuine problem (with a known solution) for desktop apps?

It is a genuine problem with no known (effective) solution. Not in Java, not in C#, not in Perl, not in C, not in anything. Think of it as if it was a Law of Physics.


Your alternatives are:

  • Force your users to use a trusted platform that will only execute crypto signed code. (Hint: this is most likely not practical for your application because current generation PC's don't work this way. And even TPS can be hacked given the right equipment.)

  • Turn your application into a service and run it on a machine / machines that you control access to. (Hint: it sounds like OAuth 2.0 might remove this requirement.)

  • Use some authentication mechanism that doesn't require permanent secret keys to be distributed.

  • Get your users to sign a legally binding contract to not reverse engineer your code, and sue them if they violate the contract. Figuring out which of your users has hacked your keys is left to your imagination ... (Hint: this won't stop hacking, but may allow you to recover damages, if the hacker has assets.)


By the way, argument by analogy is a clever rhetorical trick, but it is not logically sound. The observation that physical locks on front doors stop people stealing your stuff (to some degree) says nothing whatsoever about the technical feasibility of safely embedding private information in executables.

And ignoring the fact that argument by analogy is unsound, this particular analogy breaks down for the following reason. Physical locks are not impenetrable. The lock on your front door "works" because someone has to stand in front of your house visible from the road fiddling with your lock for a minute or so ... or banging it with a big hammer. Someone doing that is taking the risk that he / she will be observed, and the police will be called. Bank vaults "work" because the time required to penetrate them is a number of hours, and there are other alarms, security guards, etc. And so on. By contrast, a hacker can spend minutes, hours, even days trying to break your technical protection measures with effectively zero risk of being observed / detected doing it.

Java pattern matching going to infinite loop

11 votes

A friend gave me this piece of code and said there is a bug. And yes, this code runs for ever.

The answer I got is:

It runs for >10^15 years before printing anything.

public class Match {
     public static void main(String[] args) {
         Pattern p = Pattern.compile("(aa|aab?)+");
         int count = 0;
         for(String s = ""; s.length() < 200; s += "a")
             if (p.matcher(s).matches())
                 count++;
         System.out.println(count);
     }
}

I didn't really understand why am I seeing this behavior, I am new to java, do you have any suggestions?

The pattern you are using is known as an evil regex according to OWASP (they know what they're talking about most of the time):

https://www.owasp.org/index.php/Regular_expression_Denial_of_Service_-_ReDoS

It basically matches aa OR aa or aab (since the b is optional by addition of ?)

A Regex like this is vulnerable to a ReDoS or Regex Denial of Service Attack.

So yes, sort out what you want to match. I suggest in the above example you should simply match aa, no need for groups, repitition or alternation:

Pattern p = Pattern.compile("aa");

Also as someone pointed out, who now deleted his post, you should not use += to append to strings. You should use a StringBuffer instead:

public class Match {
  public static void main(String[] args) {
    Pattern p = Pattern.compile("aa");
    StringBuffer buffy = new StringBuffer(200);
    int count = 0;
    for(int i = 0; i < 200; i++){
      buffy.append("a")
      if (p.matcher(buffy.toString()).matches()){
        count++;
      }
    }
    System.out.println(count);
  }
}

Why do I need to flush the connection pool each time I redeploy?

6 votes

SOLVED

enter image description here

Made my own dual familiar with the one on Oracle.

CREATE TABLE dual 
(
    x VARCHAR(1)
);

INSERT INTO dual(x) VALUES('y');

I have successfully made a connection to a remote MySQL server and I use it for my Glassfish educational purpose application (not homework). However each time I make a change to the code or XHTML files I need to open the administrator panel of Glassfish and flush the connection pool otherwise I get this when I just refresh the page. Have anybody experienced this? I can post code or other information if it is needed.

HTTP Status 500 -

type Exception report

message

descriptionThe server encountered an internal error () that prevented it from fulfilling this request.

exception

javax.servlet.ServletException: WELD-000049 Unable to invoke [method] @PostConstruct public com.myapp.QuestionController.initialize() on com.myapp.QuestionController@4635bd2a

root cause

org.jboss.weld.exceptions.WeldException: WELD-000049 Unable to invoke [method] @PostConstruct public com.myapp.interfaces.QuestionController.initialize() on com.myapp.interfaces.QuestionController@4635bd2a

root cause

java.lang.reflect.InvocationTargetException

root cause

javax.ejb.EJBException

root cause

javax.persistence.PersistenceException: Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.3.0.v20110604-r9504): org.eclipse.persistence.exceptions.DatabaseException Internal Exception: java.sql.SQLException: Error in allocating a connection. Cause: java.lang.RuntimeException: Got exception during XAResource.start: Error Code: 0

root cause

Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.3.0.v20110604-r9504): org.eclipse.persistence.exceptions.DatabaseException Internal Exception: java.sql.SQLException: Error in allocating a connection. Cause: java.lang.RuntimeException: Got exception during XAResource.start: Error Code: 0

root cause

java.sql.SQLException: Error in allocating a connection. Cause: java.lang.RuntimeException: Got exception during XAResource.start:

root cause

javax.resource.spi.ResourceAllocationException: Error in allocating a connection. Cause: java.lang.RuntimeException: Got exception during XAResource.start:

root cause

com.sun.appserv.connectors.internal.api.PoolingException: java.lang.RuntimeException: Got exception during XAResource.start:

root cause

com.sun.appserv.connectors.internal.api.PoolingException: java.lang.RuntimeException: Got exception during XAResource.start:

root cause

java.lang.RuntimeException: Got exception during XAResource.start:

root cause

javax.transaction.xa.XAException: com.sun.appserv.connectors.internal.api.PoolingException: javax.resource.spi.LocalTransactionException: Communications link failure

The last packet successfully received from the server was 435�409 milliseconds ago. The last packet sent successfully to the server was 7 milliseconds ago.

Image of config

Persistence XML

<?xml version="1.0" encoding="UTF-8"?>
<persistence version="2.0" xmlns="http://java.sun.com/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_2_0.xsd">
    <persistence-unit name="SertifikatPU" transaction-type="JTA">
        <jta-data-source>jdbc/sertifikatdb</jta-data-source>
    </persistence-unit>
</persistence>

In the "Additional properties" in Glassfish connection pool settings I have just configured: servername, URL, user and password.

Searching on your root cause, PoolingException: javax.resource.spi.LocalTransactionException: Communications link failure

I found this bug on jira for Glassfish (check the comments tab at the bottom), which explains that you may need to refresh your invalid connections. Read the comment at the bottom by Jagadish, which says to check your connection validation type. If it is set to "autocommit" (default), jdbc-drivers may cache the connection validation data and no actual database interaction will happen during connection validation.

Another article which also explains this

To fix this, set your connection-validation-method="table" and validation-table-name="any_table_you_know_exists" and provide any existing table-name, which forces connections to talk to the database instead of the cache. If the connection is invalid, it should be dropped and recreated. You may also need to specify is-connection-validation-required="true".

EDIT: Two excellent articles I found to help with additional configuration:

Jagadish's Oracle Blog Article on this topic with even more info

Glassfish JDBC Connection Validation explained in detail

From Jagadish's Blog:

AS_INSTALL_ROOT/bin/asadmin set domain.resources.jdbc-connection-pool.DerbyPool.is-connection-validation-required=true
domain.resources.jdbc-connection-pool.DerbyPool.is-connection-validation-required = true

AS_INSTALL_ROOT/bin/asadmin set domain.resources.jdbc-connection-pool.DerbyPool.connection-validation-method=table
domain.resources.jdbc-connection-pool.DerbyPool.connection-validation-method = table

bin/asadmin set domain.resources.jdbc-connection-pool.DerbyPool.validation-table-name=sys.systables
domain.resources.jdbc-connection-pool.DerbyPool.validation-table-name = sys.systables

Note that sys.systables is a guaranteed MSSQL table and the other article references dual, which is a guaranteed ORACLE table. You will need to use a guaranteed MySQL table, and I do not know one offhand. Your best bet is to just create a table solely for validation purposes, possibly with one column with one row of data.

Returning overlapping regular expressions

5 votes

Is there a regular expression that will capture all instances of an expression, regardless of whether or not they overlap?

E.g. in /abc/def/ghi if I want to capture all strings beginning with /. The regex (/.*) only returns the entire string, but I'd want it to match on /def/ghi and /ghi as well.

Sure, match an empty string and place a look-ahead after it that captures /.* in a capturing group:

Matcher m = Pattern.compile("(?=(/.*))").matcher("/abc/def/ghi");
while(m.find()) {
  System.out.println(m.group(1));
}

would print:

/abc/def/ghi
/def/ghi
/ghi

How to split a string with whitespace chars at the beginning?

5 votes

Quick example:

public class Test {
    public static void main(String[] args) {
        String str = "   a b";
        String[] arr = str.split("\\s+");
        for (String s : arr)
            System.out.println(s);
    }
}

I want the array arr to contain 2 elements: "a" and "b", but in the result there are 3 elements: "" (empty string), "a" and "b". What should I do to get it right?

Kind of a cheat, but replace:

String str = "   a b";

with

String str = "   a b".trim();