Best thread-safety questions in June 2012

Java Synchronization Not Working as Expected

34 votes

I have a "simple" 4 class example that reliably shows unexpected behavior from java synchronization on multiple machines. As you can read below, given the contract of the java sychronized keyword, Broke Synchronization should never be printed from the class TestBuffer.

Here are the 4 classes that will reproduce the issue (at least for me). I'm not interested in how to fix this broken example, but rather why it breaks in the first place.

Sync Issue - Controller.java

Sync Issue - SyncTest.java

Sync Issue - TestBuffer.java

Sync Issue - Tuple3f.java

And here is the output I get when I run it:

java -cp . SyncTest
Before Adding
Creating a TestBuffer
Before Remove
Broke Synchronization
1365192
Broke Synchronization
1365193
Broke Synchronization
1365194
Broke Synchronization
1365195
Broke Synchronization
1365196
Done

UPDATE: @Gray has the simplest example that breaks thus far. His example can be found here: Strange JRC Race Condition

Based on the feedback I've gotten from others, it looks like the issue may occur on Java 64-bit 1.6.0_20-1.6.0_31 (unsure about newer 1.6.0's) on Windows and OSX. Nobody has been able to reproduce the issue on Java 7. It may also require a multi-core machine to reproduce the issue.

ORIGINAL QUESTION:

I have a class which provides the following methods:

  • remove - Removes the given item from the list
  • getBuffer - Iterates over all the items in the list

I've reduced the problem down to the 2 functions below, both of which are in the same object and they're both synchronized. Unless I am mistaken, "Broke Synchronization" should never be printed because insideGetBuffer should always be set back to false before remove can be entered. However, in my application it is printing "Broke Synchronization" when I have 1 thread calling remove repeatedly while the other calls getBuffer repeatedly. The symptom is that I get a ConcurrentModificationException.

See Also:

Very strange race condition which looks like a JRE issue

I think you are indeed looking at a JVM bug in the OSR. Using the simplified program from @Gray (slight modifications to print an error message) and some options to mess with/print JIT compilation, you can see what is going on with the JIT. And, you can use some options to control that to a degree that can suppress the issue, which lends a lot of evidence to this being a JVM bug.

Running as:

java -XX:+PrintCompilation -XX:CompileThreshold=10000 phil.StrangeRaceConditionTest

you can get an error condition (like others about 80% of the runs) and the compilation print somewhat like:

 68   1       java.lang.String::hashCode (64 bytes)
 97   2       sun.nio.cs.UTF_8$Decoder::decodeArrayLoop (553 bytes)
104   3       java.math.BigInteger::mulAdd (81 bytes)
106   4       java.math.BigInteger::multiplyToLen (219 bytes)
111   5       java.math.BigInteger::addOne (77 bytes)
113   6       java.math.BigInteger::squareToLen (172 bytes)
114   7       java.math.BigInteger::primitiveLeftShift (79 bytes)
116   1%      java.math.BigInteger::multiplyToLen @ 138 (219 bytes)
121   8       java.math.BigInteger::montReduce (99 bytes)
126   9       sun.security.provider.SHA::implCompress (491 bytes)
138  10       java.lang.String::charAt (33 bytes)
139  11       java.util.ArrayList::ensureCapacity (58 bytes)
139  12       java.util.ArrayList::add (29 bytes)
139   2%      phil.StrangeRaceConditionTest$Buffer::<init> @ 38 (62 bytes)
158  13       java.util.HashMap::indexFor (6 bytes)
159  14       java.util.HashMap::hash (23 bytes)
159  15       java.util.HashMap::get (79 bytes)
159  16       java.lang.Integer::valueOf (32 bytes)
168  17 s     phil.StrangeRaceConditionTest::getBuffer (66 bytes)
168  18 s     phil.StrangeRaceConditionTest::remove (10 bytes)
171  19 s     phil.StrangeRaceConditionTest$Buffer::remove (34 bytes)
172   3%      phil.StrangeRaceConditionTest::strangeRaceConditionTest @ 36 (76 bytes)
ERRORS //my little change
219  15      made not entrant  java.util.HashMap::get (79 bytes)

There are three OSR replacements (the ones with the % annotation on the compile ID). My guess is that it is the third one, which is the loop calling remove(), that is responsible for the error. This can be excluded from JIT via a .hotspot_compiler file located in the working directory with the following contents:

exclude phil/StrangeRaceConditionTest strangeRaceConditionTest

When you run the program again, you get this output:

CompilerOracle: exclude phil/StrangeRaceConditionTest.strangeRaceConditionTest
 73   1       java.lang.String::hashCode (64 bytes)
104   2       sun.nio.cs.UTF_8$Decoder::decodeArrayLoop (553 bytes)
110   3       java.math.BigInteger::mulAdd (81 bytes)
113   4       java.math.BigInteger::multiplyToLen (219 bytes)
118   5       java.math.BigInteger::addOne (77 bytes)
120   6       java.math.BigInteger::squareToLen (172 bytes)
121   7       java.math.BigInteger::primitiveLeftShift (79 bytes)
123   1%      java.math.BigInteger::multiplyToLen @ 138 (219 bytes)
128   8       java.math.BigInteger::montReduce (99 bytes)
133   9       sun.security.provider.SHA::implCompress (491 bytes)
145  10       java.lang.String::charAt (33 bytes)
145  11       java.util.ArrayList::ensureCapacity (58 bytes)
146  12       java.util.ArrayList::add (29 bytes)
146   2%      phil.StrangeRaceConditionTest$Buffer::<init> @ 38 (62 bytes)
165  13       java.util.HashMap::indexFor (6 bytes)
165  14       java.util.HashMap::hash (23 bytes)
165  15       java.util.HashMap::get (79 bytes)
166  16       java.lang.Integer::valueOf (32 bytes)
174  17 s     phil.StrangeRaceConditionTest::getBuffer (66 bytes)
174  18 s     phil.StrangeRaceConditionTest::remove (10 bytes)
### Excluding compile: phil.StrangeRaceConditionTest::strangeRaceConditionTest
177  19 s     phil.StrangeRaceConditionTest$Buffer::remove (34 bytes)
324  15      made not entrant  java.util.HashMap::get (79 bytes)

and the problem does not appear (at least not in the repeated attempts that I've made).

Also, if you change the JVM options a bit, you can cause the problem to go away. Using either of the following I cannot get the problem to appear.

java -XX:+PrintCompilation -XX:CompileThreshold=100000 phil.StrangeRaceConditionTest
java -XX:+PrintCompilation -XX:FreqInlineSize=1 phil.StrangeRaceConditionTest

Interestingly, the compilation output for both of these still show the OSR for the remove loop. My guess (and it is a big one) is that delaying the JIT via the compilation threshold or changing the FreqInlineSize cause changes to the OSR processing in these cases that bypass a bug that you are otherwise hitting.

See here for info on the JVM options.

See here and here for information on the output of -XX:+PrintCompilation and how to mess with what the JIT does.

Thread-safe circular buffer in Java

18 votes

Consider a few web server instances running in parallel. Each server holds a reference to a single shared "Status keeper", whose role is keeping the last N requests from all servers.

For example (N=3):

Server a: "Request id = ABCD"        Status keeper=["ABCD"]
Server b: "Request id = XYZZ"        Status keeper=["ABCD", "XYZZ"] 
Server c: "Request id = 1234"        Status keeper=["ABCD", "XYZZ", "1234"]
Server b: "Request id = FOO"         Status keeper=["XYZZ", "1234", "FOO"]
Server a: "Request id = BAR"         Status keeper=["1234", "FOO", "BAR"]

At any point in time, the "Status keeper" might be called from a monitoring application that reads these last N requests for an SLA report.

What's the best way to implement this producer-consumer scenario in Java, giving the web servers higher priority than the SLA report?

CircularFifoBuffer seems to be the appropriate data structure to hold the requests, but I'm not sure what's the optimal way to implement efficient concurrency.

Buffer fifo = BufferUtils.synchronizedBuffer(new CircularFifoBuffer());