Best java questions in December 2010

Good use case for Akka

34 votes

I have heard lots of raving about the Akka framework (Java/Scala service platform), but so far have not seen many actual examples of use cases it would be good for. So I would be interested in hearing about things developers have used it for successfully.

Only one limitation: please do not include the case of writing a chat server. :-) (why? since this has been overused as an example for lots of similar things)

EDIT: Since there are many good answers, my choice of accepted answer is a bit arbitrary; I thought the highest voted one seems like a reasonable choice. Thank you everyone for good answers!

I have used it so far in two real projects very successfully. Both are in the near-real-time traffic information field (traffic as in cars on highways), distributed over several nodes, integrating messages between several parties, reliable backend systems. I'm not at liberty to give specifics on clients yet; when I do get the OK, maybe it can be added as a reference.

Akka has really pulled through on those projects, even though we started when it was on version 0.7. (We are using Scala, by the way.)

One of the big advantages is the ease with which you can compose a system out of actors and messages with almost no boilerplate. It scales extremely well without all the complexities of hand-rolled threading, and you get asynchronous message passing between objects almost for free.

It is very good at modeling any type of asynchronous message handling. I would prefer to write any type of (web) services system in this style rather than any other. (Have you ever tried to write an asynchronous web service (server side) with JAX-WS? That's a lot of plumbing.) So I would say it suits any system that must not hang on one of its components because everything is implicitly called through synchronous methods and that one component is locking on something. It is very stable, and the let-it-crash + supervisor solution to failure really works well. Everything is easy to set up programmatically and not hard to unit test.

Then there are the excellent add-on modules. The Camel module really plugs in well into Akka and enables such easy development of asynchronous services with configurable endpoints.

I'm very happy with the framework and it is becoming a de facto standard for the connected systems that we build.

My java code has an obvious error. Why does it compile and run?

30 votes
public class HelloWorld {
    public static void main (String args[]){
        System.out.println ("Hello ");
        http://www.google.com
        System.out.println ("World!");
    }
}

The above code compiles and executes just fine. Why is the compiler not reporting any error?

The http: part is a label, so the statement that follows it is a labeled statement.

The //www.google.com portion is then interpreted as a // comment.
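To make the answer concrete, here is a small sketch (class and method names are made up for illustration). The first method shows what labels are normally for, namely breaking out of an outer loop; the second reproduces the "URL" trick from the question, where the label simply attaches to the next statement.

```java
final class LabelDemo {
    // Typical use of a label: leave both loops at once.
    static int countBeforeStop(int[][] grid, int stop) {
        int seen = 0;
        outer:
        for (int[] row : grid) {
            for (int v : row) {
                if (v == stop) {
                    break outer; // exits the labeled (outer) loop
                }
                seen++;
            }
        }
        return seen;
    }

    // Same shape as the question: "http:" is a label, "//www.google.com" is a comment,
    // and the label applies to the append call on the next line.
    static String hello() {
        StringBuilder sb = new StringBuilder();
        sb.append("Hello ");
        http://www.google.com
        sb.append("World!");
        return sb.toString();
    }
}
```

Note that a label can only precede a statement, so the trick would stop compiling if the next line were a local variable declaration.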

Code to read and learn from

26 votes

The best way to learn programming is by writing programs. Reading programs written by brilliant programmers is equally important. If someone asks me for source code to read and learn from, without a second thought I will point him/her to SQLite. The following merits make the SQLite source code an ideal learning ground for programmers:

  • Informative, balanced comments.
  • Good coding practices: well factored-out functions, idiomatic use of the implementation language, consistency in style, comprehensive tests etc.
  • A real world project. Probably SQLite is the world's most widely deployed database! Still it is small and amenable to a single brain.

SQLite is implemented in C. Are there open source projects written in other languages/paradigms that have all the above qualities? Personally, I would like to know about a Lisp project. Many will be interested in Java/C++ projects because millions of words have been written about designing 'maintainable object-oriented software'. It will be informative to see this wisdom in practice.

As far as Common Lisp is concerned, Edi Weitz is generally regarded as producing libraries of the highest quality. CL-PPCRE in particular is often mentioned as commendable reading.

How can I make external methods interruptable?

25 votes

The Problem

I'm running multiple invocations of some external method via an ExecutorService. I would like to be able to interrupt these methods, but unfortunately they do not check the interrupt flag by themselves. Is there any way I can force an exception to be raised from these methods?

I am aware that throwing an exception from an arbitrary location is potentially dangerous, but in my specific case I am willing to take this chance and prepared to deal with the consequences.

Details

By "external method" I mean some method(s) that come from an external library, and I cannot modify its code (well I can, but that will make it a maintenance nightmare whenever a new version is released).

The external methods are computationally expensive, not IO-bound, so they don't respond to regular interrupts and I can't forcefully close a channel or a socket or something. As I've mentioned before, they also do not check the interrupt flag.

The code is conceptually something like:

// my code
public void myMethod() {
    Object o = externalMethod(x);
}

// External code
public class ExternalLibrary {
    public Object externalMethod(Object x) {
        innerMethod1();
        innerMethod1();
        innerMethod1();
        return null; // placeholder; the real result is irrelevant here
    }

    private void innerMethod1() {
        innerMethod2();
        // computationally intensive operations
    }

    private void innerMethod2() {
        // computationally intensive operations
    }
}

What I've Tried

Thread.stop() will theoretically do what I want, but not only is it deprecated but it is also only available for actual threads, while I'm working with executor tasks (which might also share threads with future tasks, for example when working in a thread pool). Nevertheless, if no better solution is found, I will convert my code to use old-fashioned Threads instead and use this method.

Another option I've tried is to mark myMethod() and similar methods with a special "Interruptable" annotation and then use AspectJ (which I am admittedly a newbie at) for catching all method invocations there - something like:

@Before("call(* *.*(..)) && withincode(@Interruptable * *.*(..))")
public void checkInterrupt(JoinPoint thisJoinPoint) {
    if (Thread.interrupted()) throw new ForcefulInterruption();
}

But withincode isn't recursive to methods called by the matching methods, so I would have to edit this annotation into the external code.

Finally, this is similar to a previous question of mine - though a notable difference is that now I'm dealing with an external library.

I have hacked an ugly solution to my problem. It's not pretty, but it works in my case, so I'm posting it here in case it will help anyone else.

What I did was profile the library parts of my application, hoping that I could isolate a small group of methods which are called repeatedly - for example some get methods or equals() or something along these lines; and then I could insert the following code segment there:

if (Thread.interrupted()) {
    // Not really necessary, but could help if the library does check it itself in some other place:
    Thread.currentThread().interrupt();
    // Wrapping the checked InterruptedException because the signature doesn't declare it:
    throw new RuntimeException(new InterruptedException());
}

This can be inserted either manually, by editing the library's code, or automatically, by writing an appropriate aspect. Notice that if the library attempts to catch and swallow a RuntimeException, the thrown exception could be replaced with something else the library doesn't try to catch.

Luckily for me, using VisualVM, I was able to find a single method called a very high number of times during the specific usage I was making of the library. After adding the above code segment, it now properly responds to interrupts.

This is of course not maintainable, plus nothing really guarantees the library will call this method repeatedly in other scenarios; but it worked for me, and since it's relatively easy to profile other applications and insert the checks there, I consider this a generic, if ugly, solution.
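For anyone who wants to see the idea end to end, here is a minimal sketch. Everything in it is a stand-in: hotMethod() plays the role of the frequently called library method, externalMethod() is a made-up CPU-bound loop, and runAndCancel() is just a helper for the demonstration. It assumes the task is cancelled through Future.cancel(true), which interrupts the worker thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class InterruptCheckDemo {
    // Hypothetical stand-in for the hot library method found by profiling.
    static long hotMethod(long value) {
        if (Thread.interrupted()) {
            Thread.currentThread().interrupt(); // restore the flag for later checks
            throw new RuntimeException(new InterruptedException());
        }
        return value * 31 + 7; // placeholder for real work
    }

    // Hypothetical CPU-bound external method; it never returns on its own.
    static void externalMethod() {
        long acc = 1;
        while (true) {
            acc = hotMethod(acc);
        }
    }

    // Submits the task, cancels it, and reports whether the inserted check stopped it.
    static boolean runAndCancel() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean stoppedByCheck = new AtomicBoolean(false);
        Future<?> task = pool.submit(() -> {
            started.countDown();
            try {
                externalMethod();
            } catch (RuntimeException e) {
                stoppedByCheck.set(e.getCause() instanceof InterruptedException);
            }
        });
        try {
            started.await();
            task.cancel(true); // interrupts the worker thread
            pool.shutdown();
            return pool.awaitTermination(5, TimeUnit.SECONDS) && stoppedByCheck.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Because the pool thread is reused, the wrapper around the task is also the natural place to clear or handle the interrupt flag before the next task runs.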

Why should I use foreach instead of for (int i=0; i<length; i++) in loops?

23 votes

It seems like the cool way of looping in C# and Java is to use foreach instead of C style for loops.

Is there a reason why I should prefer this way over the C style?

I'm particularly interested in these two cases, but please address as many cases as you need to explain your points.

  • I wish to perform an operation on each item in a list.
  • I am searching for an item in a list, and wish to exit when that item is found.

Two major reasons I can think of are:

1) It abstracts away from the underlying container type. This means, for example, that you don't have to change the code that loops over all the items in the container when you change the container -- you're specifying the goal of "do this for every item in the container", not the means.

2) It eliminates the possibility of off-by-one errors.

In terms of performing an operation on each item in a list, it's intuitive to just say:

for(Item item: lst)
{
  op(item);
}

It perfectly expresses the intent to the reader, as opposed to manually doing stuff with iterators. Ditto for searching for items.
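For the search case, a rough sketch of both styles side by side (the class and method names are invented for this example). The foreach version exits early with return just as easily as the indexed one, and it works on any Iterable, while the indexed version is only needed when the position itself matters.

```java
import java.util.List;

final class ForEachSearchDemo {
    // foreach: no index bookkeeping, early exit as soon as a match is found.
    static String findFirstStartingWith(List<String> items, String prefix) {
        for (String item : items) {
            if (item.startsWith(prefix)) {
                return item;
            }
        }
        return null; // not found
    }

    // C-style loop: useful when the caller needs the position rather than the item.
    static int indexOfFirstStartingWith(List<String> items, String prefix) {
        for (int i = 0; i < items.size(); i++) {
            if (items.get(i).startsWith(prefix)) {
                return i;
            }
        }
        return -1; // not found
    }
}
```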

If profiler is not the answer, what other choices do we have?

22 votes

After watching the presentation "Performance Anxiety" by Joshua Bloch, I read the paper he suggested in the presentation, "Evaluating the Accuracy of Java Profilers". Quoting the conclusion:

Our results are disturbing because they indicate that profiler incorrectness is pervasive—occurring in most of our seven benchmarks and in two production JVM—-and significant—all four of the state-of-the-art profilers produce incorrect profiles. Incorrect profiles can easily cause a performance analyst to spend time optimizing cold methods that will have minimal effect on performance. We show that a proof-of-concept profiler that does not use yield points for sampling does not suffer from the above problems

The conclusion of the paper is that we cannot really believe the results of profilers. But then, what is the alternative to using profilers? Should we go back and just use our feeling to do optimization?

UPDATE: A point that seems to be missed in the discussion is the observer effect. Can we build a profiler that is really 'observer effect'-free?

Oh, man, where to begin?

First, I'm amazed that this is news. Second, the problem is not that profilers are bad, it is that some profilers are bad. The authors built one that, they feel, is good, just by avoiding some of the mistakes they found in the ones they evaluated. Mistakes are common because of some persistent myths about performance profiling.

But let's be positive. If one wants to find opportunities for speedup, it is really very simple:

  • Sampling should be uncorrelated with the state of the program.
    That means happening at a truly random time, regardless of whether the program is in I/O (except for user input), or in GC, or in a tight CPU loop, or whatever.

  • Sampling should read the function call stack,
    so as to determine which statements were "active" at the time of the sample. The reason is that every call site (point at which a function is called) has a percentage cost equal to the fraction of time it is on the stack.

  • Reporting should show percent by line (not by function).
    If a "hot" function is identified, one still has to hunt inside it for the "hot" lines of code accounting for the time. That information is in the samples! Why hide it?

An almost universal mistake (that the paper shares) is to be concerned too much with accuracy of measurement, and not enough with accuracy of location. For example, here is a case of performance tuning in which a series of performance problems were identified and fixed, resulting in a compounded speedup of 43 times. It was not essential to know precisely the size of each problem before fixing it, but to know its location. A phenomenon of performance tuning is that fixing one problem, by reducing the time, magnifies the percentages of remaining problems, so they are easier to find. As long as any problem is found and fixed, progress is made toward the goal of finding and fixing all the problems. It is not essential to fix them in decreasing size order, but it is essential to pinpoint them.

On the subject of statistical accuracy of measurement, if a call point is on the stack some percent of time F (like 20%), and N (like 100) random-time samples are taken, then the number of samples that show the call point is a binomial distribution, with mean = NF = 20, standard deviation = sqrt(NF(1-F)) = sqrt(16) = 4. So the percent of samples that show it will be 20% +/- 4%. So is that accurate? Not really, but has the problem been found? Precisely.
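The arithmetic above is easy to check in a few lines. This is only a sketch of the binomial formulas quoted in the paragraph; the class and method names are made up.

```java
final class SamplingStats {
    // Expected number of samples (out of n) that show a call point on the stack fraction f of the time.
    static double mean(int n, double f) {
        return n * f;
    }

    // Standard deviation of that count under the binomial model.
    static double stdDev(int n, double f) {
        return Math.sqrt(n * f * (1 - f));
    }
}
```

With n = 100 and f = 0.2 this gives a mean of about 20 samples and a standard deviation of about 4, matching the "20% +/- 4%" figure.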

In fact, the larger a problem is, in terms of percent, the fewer samples are needed to locate it. For example, if 3 samples are taken, and a call point shows up on 2 of them, it is highly likely to be very costly. (Specifically, it follows a beta distribution. If you generate 4 uniform(0,1) random numbers and sort them, the distribution of the 3rd one is the distribution of cost for that call point. Its mean is (2+1)/(3+2) = 0.6, so that is the expected savings, given those samples.)
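The "sort 4 uniform numbers and take the 3rd" recipe can be tried directly. Below is a small simulation sketch (names, trial count and seed are arbitrary choices for illustration); the average it returns should land close to the 0.6 quoted above.

```java
import java.util.Arrays;
import java.util.Random;

final class OrderStatisticDemo {
    // Averages the 3rd smallest of 4 uniform(0,1) draws over many trials.
    static double meanOfThirdOfFour(int trials, long seed) {
        Random rng = new Random(seed);
        double[] u = new double[4];
        double sum = 0.0;
        for (int t = 0; t < trials; t++) {
            for (int k = 0; k < u.length; k++) {
                u[k] = rng.nextDouble();
            }
            Arrays.sort(u);
            sum += u[2]; // index 2 is the 3rd smallest value
        }
        return sum / trials;
    }
}
```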

It's high time we programmers blew the cobwebs out of our heads on the subject of profiling.

Disclaimer - the paper failed to reference my article: Dunlavey, “Performance tuning with instruction-level cost derived from call-stack sampling”, ACM SIGPLAN Notices 42, 8 (August, 2007), pp. 4-8.

What are the big improvements between guava and apache equivalent libraries?

16 votes

We currently use apache collections, string utils, etc. I need to decide if we should switch from the apache foundations implementation.

The important criterion is ease of use for developers. Performance/memory usage is not yet an important issue for us. Speed of development is the key criterion at this point.

I would appreciate opinions about how the developer's life became significantly easier with guava.

First off, as javamonkey79 explained, while Google Guava and Apache Commons do share similar features, they also both have functionality that is absent from their counterpart. Thus, limiting yourself to only one library might be unwise.

That being said, if I had to choose, I'd opt to use Guava, keeping Apache Commons around for the (rare) cases where Guava does not have the needed functionality. Let me attempt to explain why.

Guava is more "modern"

Apache Commons is a really mature library, but it's also almost 10 years old, and targets Java 1.4. Guava was open sourced in 2007, targets Java 5, and thus Guava greatly benefits from the Java 5 features: generics, varargs, enums, and autoboxing.

According to the Guava developers, generics are one reason they chose to create a new library instead of improving Apache Commons (see the google-collections FAQ, under the title "Why did Google build all this, when it could have tried to improve the Apache Commons Collections instead?").

I agree with them: while often criticized (no reification, limited due to backward compatibility), Java generics are still very useful when used appropriately, like Guava does. I'd rather quit than work with non-generified collections!

(Note that Apache Commons Lang 3.0, currently in beta, targets Java 1.5+)

Guava is very well designed / documented

The code is full of best practices and useful patterns to make the API more readable, discoverable, performant, secure, thread-safe...

Having read Effective Java (awesome book BTW), I see these patterns everywhere in the code:

  • factory methods (such as ImmutableList.copyOf())
  • builder pattern (ImmutableList.builder(), Joiner, CharMatcher, Splitter, Ordering, ...)
  • immutability (immutable collections, CharMatcher, Joiner, Splitter,...)
  • implementation hiding (Predicates.xXx, ...)
  • favoring composition over inheritance (the ForwardingXXX collections)
  • null-checks
  • enum-singleton pattern
  • serialization proxies
  • well thought-out naming conventions

I could go on for hours explaining the advantages brought by these design choices (tell me if you want me to). The thing is, these patterns are not only "for the show", they have a real value: the API is a pleasure to use, easier to learn (did I forget to say how well documented it is?), more efficient, and many classes are simpler / thread-safe due to their immutability.

As a bonus point, one learns a lot by looking at the code :)

Guava is consistent

Kevin Bourrillion (Guava's lead developer) does a great job maintaining a high level of quality / consistency across the library. He is of course not alone, and a lot of great developers have contributed to Guava (even Joshua Bloch, who now works at Google!).

The core philosophies and design choices behind Guava are consistent across the library, and the developers adhere to very good (IMO) API design principles, having learned from past mistakes of the JDK APIs (not their mistakes, though).

Guava has a high power-to-weight ratio

The Guava designers resist the temptation to add too many features, limiting the API to the most useful ones. They know it's very hard to remove a feature once added, and follow Joshua Bloch's motto on API design: "When in doubt, leave it out". Also, using the @Beta annotation allows them to test some design choices without committing to a specific API.

The design choices mentioned above allow for a very compact API. Simply look at the MapMaker to see the power packed inside a "simple" builder. Other good (albeit simpler?) examples are CharMatcher, Splitter, and Ordering.

It's also very easy to compose various parts of Guava. For example, say you want to cache the result of a complex function? Feed this function to your MapMaker and BINGO, you got a thread-safe computing map/cache. Need to constrain the map/function inputs to specific Strings? No problem, wrap it inside a ConstrainedMap, using a CharMatcher to reject inappropriate Strings...

Guava is in active development

While the development of Apache Commons seems to have accelerated with the work on Commons Lang 3.0, Guava seems to pick up more steam at the moment, while Google open sources more of their internal classes.

Since Google heavily relies on it internally, I don't think it's going to disappear any time soon. Plus, open sourcing its common libraries allows Google to more easily open source other libraries that depend on it (instead of repackaging them, like Guice currently does).

Conclusion

For all the above reasons, Guava is my go-to library when starting a new project. And I am very grateful to Google and to the awesome Guava developers, who created this fantastic library.


PS: you might also want to read this other SO question

PPS: I don't own any Google stock (yet)

How to convert an int[] array to a List?

14 votes

I expected this code to display true:

int[] array = {1, 2};
System.out.println(Arrays.asList(array).contains(1));

The Arrays.asList(array) will result in a singleton list of an int[].

It works as you expect if you change int[] to Integer[]. Don't know if that helps you though.
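If the array has to stay an int[], one option is to box the elements yourself. A minimal sketch (the class and method names are invented for this example):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class IntArrayToListDemo {
    // Arrays.asList sees the int[] as ONE element, so this looks for an Integer in a List<int[]>.
    static boolean naiveContains(int[] array, int value) {
        return Arrays.asList(array).contains(value);
    }

    // Boxes each element explicitly, giving a real List<Integer>.
    static List<Integer> toList(int[] array) {
        List<Integer> list = new ArrayList<>(array.length);
        for (int v : array) {
            list.add(v);
        }
        return list;
    }
}
```

Here naiveContains(new int[] {1, 2}, 1) returns false, while toList(new int[] {1, 2}).contains(1) returns true.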

I'm maintaining a Java class that's 40K lines long.. problem?

14 votes

This may be a subjective question leading to deletion but I would really like some feedback.

Recently, I moved to another very large enterprise project where I work as a developer. I was aghast to find most classes in the project are anywhere from 8K to 50K lines long with methods that are 1K to 8K lines long. It's mostly business logic dealing with DB tables and data management, full of conditional statements to handle the use cases.

Are classes this large common in large enterprise systems? I realize without looking at the code it's hard to make a determination, but have you ever worked on a system with classes this large?

Here are the ten largest classes in JDK 6 by line count, out of 7209 .java files. These classes include a significant amount of comments, which could be longer than the code.

4495 ./javax/sql/rowset/BaseRowSet.java
4649 ./java/awt/Container.java
5025 ./javax/swing/text/JTextComponent.java
5246 ./java/util/regex/Pattern.java
5316 ./javax/swing/JTree.java
5469 ./java/lang/Character.java
5473 ./javax/swing/JComponent.java
9063 ./com/sun/corba/se/impl/logging/ORBUtilSystemException.java
9595 ./javax/swing/JTable.java
9982 ./java/awt/Component.java

I would agree one printed page is long enough for a method. There really should not be a need for classes over 10K lines long IMHO.

Is there a good book for understanding the modern internals of the JVM?

13 votes

I came across this book "Inside Java 2 Virtual Machine", which looks like an excellent guide. However it's very dated, and I was wondering if there was a similar book that is more up to date.

Not a book, but there is a wiki on the Snoracle website for the HotSpot internals

Why does POST not honor charset, but an AJAX request does? tomcat 6.

12 votes

I have a tomcat based application that needs to submit a form capable of handling utf-8 characters. When submitted via ajax, the data is returned correctly from getParameter() in utf-8. When submitting via form post, the data is returned from getParameter() in iso-8859-1.

I used fiddler, and have determined the only difference in the requests, is that charset=utf-8 is appended to the end of the Content-Type header in the ajax call (as expected, since I send the content type explicitly).

ContentType from ajax: "application/x-www-form-urlencoded; charset=utf-8"

ContentType from form: "application/x-www-form-urlencoded"

I have the following settings:

ajax post (outputs chars correctly):

$.ajax( {
  type : "POST",
  url : "blah",
  async : false,
  contentType: "application/x-www-form-urlencoded; charset=utf-8",
  data  : data,
  success : function(data) { 
  }
 });

form post (outputs chars in iso)

 <form id="leadform" enctype="application/x-www-form-urlencoded; charset=utf-8" method="post" accept-charset="utf-8" action="{//app/path}">

xml declaration:

<?xml version="1.0" encoding="utf-8"?>

Doctype:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

meta tag:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>

jvm parameters:

-Dfile.encoding=UTF-8

I have also tried using request.setCharacterEncoding("UTF-8"); but it seems as if tomcat simply ignores it. I am not using the RequestDumper valve.

From what I've read, POST data encoding is mostly dependent on the page encoding where the form is. As far as I can tell, my page is correctly encoded in utf-8.

The sample JSP from this page works correctly. It simply uses setCharacterEncoding("UTF-8"); and echoes the data you post. http://wiki.apache.org/tomcat/FAQ/CharacterEncoding

So to summarize, the post request does not send the charset as being utf-8, despite the page being in utf-8, the form parameters specifying utf-8, the xml declaration or anything else. I have spent the better part of three days on this and am running out of ideas. Can anyone help me?

form post (outputs chars in iso)

<form id="leadform" enctype="application/x-www-form-urlencoded; charset=utf-8" method="post" accept-charset="utf-8" action="{//app/path}">

You don't need to specify the charset there. The browser will use the charset which is specified in HTTP response header.

Just

<form id="leadform" method="post" action="{//app/path}">

is enough.


xml declaration:

<?xml version="1.0" encoding="utf-8"?>

Irrelevant. It's only relevant for XML parsers. Web browsers don't parse text/html as XML. This is only relevant for the server side (if you're using an XML based view technology like Facelets or JSPX; on plain JSP this is superfluous).


Doctype:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

Irrelevant. It's only relevant for HTML parsers. Besides, it doesn't specify any charset. Instead, the one in the HTTP response header will be used. If you aren't using an XML based view technology like Facelets or JSPX, this can just as well be <!DOCTYPE html>.


meta tag:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>

Irrelevant. It's only relevant when the HTML page is being viewed from local disk or is to be parsed locally. Instead, the one in the HTTP response header will be used.


jvm parameters:

-Dfile.encoding=UTF-8

Irrelevant. It's only relevant to the Sun/Oracle(!) JVM for parsing the source files.


I have also tried using request.setCharacterEncoding("UTF-8"); but it seems as if tomcat simply ignores it. I am not using the RequestDumper valve.

This will only work when the request body has not been parsed yet (i.e. you haven't called getParameter() and so on beforehand). You need to call this as early as possible. A Filter is a perfect place for this. Otherwise it will be ignored.


From what I've read, POST data encoding is mostly dependent on the page encoding where the form is. As far as I can tell, my page is correctly encoded in utf-8.

It's dependent on the HTTP response header.

All you need to do are the following three things:

  1. Add the following to top of your JSP:

    <%@page pageEncoding="UTF-8" %>
    

    This will set the response encoding to UTF-8 and set the response header to UTF-8.

  2. Create a Filter which does the following in doFilter() method:

    if (request.getCharacterEncoding() == null) {
        request.setCharacterEncoding("UTF-8");
    }
    chain.doFilter(request, response);
    

    This ensures that the POST request body is processed as UTF-8.

  3. Change the <Connector> entry in Tomcat/conf/server.xml as follows:

    <Connector (...) URIEncoding="UTF-8" />
    

    This ensures that GET query strings are processed as UTF-8.
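To see why the request encoding matters, here is a tiny standalone sketch of what goes wrong. It is not Tomcat's code: it only uses java.net.URLDecoder as a stand-in for the container's form-body parsing, and the class name and sample value are made up. The same percent-encoded bytes produce different text depending on the charset used to decode them.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class FormDecodingDemo {
    // Decodes a URL-encoded form value using the given charset name.
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }
}
```

A browser posting "café" from a UTF-8 page sends caf%C3%A9. Decoding that with "UTF-8" gives back "café"; decoding it with "ISO-8859-1" turns the two bytes into two separate characters ("cafÃ©"), which is the kind of garbling described in the question.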


How does java do modulus calculations with negative numbers?

12 votes

Am I doing modulus wrong? Because I keep getting -13 % 64 = 51, not -13 % 64 = -13.

Both definitions of modulus of negative numbers are in use - some languages use one definition and some the other.

If you want to get a negative number for negative inputs then you can use this:

int r = x % n;
if (r > 0 && x < 0)
{
    r -= n;
}

Likewise if you were using a language that returns a negative number on a negative input and you would prefer positive:

int r = x % n;
if (r < 0)
{
    r += n;
}
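For Java specifically, the % operator takes the sign of the left operand, so -13 % 64 evaluates to -13. A short sketch of both conventions (class and method names are made up; the second method assumes n > 0):

```java
final class ModuloDemo {
    // Java's built-in remainder: the result has the sign of x.
    static int truncatedMod(int x, int n) {
        return x % n;
    }

    // Always returns a value in [0, n); assumes n > 0.
    static int nonNegativeMod(int x, int n) {
        int r = x % n;
        if (r < 0) {
            r += n;
        }
        return r;
    }
}
```

On Java 8 and later, Math.floorMod(x, n) gives the same result as nonNegativeMod for positive n, e.g. Math.floorMod(-13, 64) is 51.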

In Java should I copy a volatile reference locally before I foreach it

12 votes

If I have the following

private volatile Collection<Integer> ints;

private void myMethod()
{
   for ( Integer i : ints )
   {
      ...
   }
}

The ints collection is never changed, but the entire collection may be replaced by another thread (so it's an immutable collection).

Should I be copying the ints variable locally before I iterate it? I'm not sure if it will be accessed multiple times, i.e. while iterating the collection, another thread replaces the collection, and the code continues iterating but with the new collection.

EDIT : This question is relevant for additional info on how foreach works internally.

You don't have to. Implicitly, the code will do an ints.iterator() anyway, and from that point on only use that iterator, on the old collection.
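A quick way to convince yourself is to replace the field in the middle of the loop. The sketch below does the replacement from the same thread to keep it deterministic (the class and method names are invented); the loop still visits only the elements of the original collection, because the field is read once to obtain the iterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

final class VolatileForeachDemo {
    private volatile Collection<Integer> ints = Arrays.asList(1, 2, 3);

    // Iterates over ints while swapping the field for a new collection on every step.
    List<Integer> iterateWhileReplacing() {
        List<Integer> seen = new ArrayList<>();
        for (Integer i : ints) {          // ints is read once here, to call iterator()
            seen.add(i);
            ints = Arrays.asList(10, 20); // stands in for another thread replacing the field
        }
        return seen;
    }

    Collection<Integer> current() {
        return ints;
    }
}
```

After the call, the returned list holds 1, 2, 3 and current() returns the replacement collection.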

Is it really Impossible to Protect Android apps from Reverse Engineering?

9 votes

As we know, Android apps are written in Java. In Java, no matter what you do, it is impossible to protect compiled code from decompilation or reverse-engineering, as the following thread suggests:

How to lock compiled Java Classes to prevent decompilation

My question is how one would go about protecting an app that contains algorithmic trade secrets from reverse-engineering?

By "how" I mean not only software techniques but also other creative approaches.

First stop for me would be to optimise and obfuscate the code with ProGuard, which is known to work with byte code targeted at Android's Dalvik VM (via Dex). It's a really great tool and can increase the difficulty of 'reversing' your code whilst shrinking your code's footprint (in some cases dramatically: a recent applet of mine went from ~600KB down to ~50KB).

Like others are saying, you will never get 100% security of your algorithm's details whilst its implementation is being distributed to clients. For that you'd need to keep the code on your servers alone. Attempts to get near 100% security for client code effectively amount to DRM and can make your client code fragile in the face of network outages and just generally frustrate (legitimate) users.

Android developers blog has some useful articles on the matter of 'tamper resistant' Android apps (and they recommend the use of ProGuard as part of the overall approach).


Edit: With regard to 'creative' approaches: some developers employ debugger detection techniques to prevent run-time analysis and combine this with encryption of portions of binary code (to deter static analysis), but to be honest a determined enough attacker can circumvent these, whilst it can cause legitimate user frustration as illustrated by this Windows KB article. My girlfriend's 'Learn to drive' DVD software will not run under VirtualBox for this reason, but she blames Linux of course!

OpenRCE and Wikipedia's article on obfuscated code may be good starting points if you want to look into this further. But be warned, you may lose more through overzealous use of these techniques frustrating your users than you would through loss of trade secrets by reverse engineering. Like Anton S says, maybe the most 'creative' approach lies with tweaking the business model rather than the technology.


Edit #2 : The latest Android SDK update on 6th Dec 2010 (coinciding with Android 2.3 Gingerbread release) :

Integrated ProGuard support: ProGuard is now packaged with the SDK Tools. Developers can now obfuscate their code as an integrated part of a release build.

Tired of non-semantic testing to make up for dynamic typing - suggestions?

9 votes

I used to do a lot of web programming in Rails (PHP before that) before I started studying computer engineering.

Since then, I've done a lot of school work in C, and some personal stuff in Objective-C (Mac stuff). I learnt to love static typing.

But now I'm having to do some professional web development (freelancing) and have picked up Rails once again. I'm finding it really annoying to write non-semantic type-checking tests. I was getting those for free from C and Objective-C compilers. I loved hitting Build and having the system check all my code to see that A can call B, B can call some obscure library C, etc. All I had to do was test the semantics. But with Rails, I'm the compiler. :(

Has anyone trodden this same path? Are my only options for web development ASP.NET MVC with C# and Java + x framework? Looking for some suggestions, or even some sympathy... :P

By the way, I make a specific reference to Rails rather than Ruby because I don't mind Ruby's dynamic nature for simple stuff like scripting or what not. But since Rails depends on so many gems and since one usually adds a number of other gems, the dynamic typing becomes an issue.

Thanks!

edit:

I followed up on pst's suggestion and looked into Scala. In reading the book Programming in Scala, written by the language's creator, Martin Odersky, I came across this bit of text that in many ways expresses my concerns and a bit more. Very interesting reading.

Taken from page 52 of Martin Odersky's Programming in Scala:

Scala is statically typed

A static type system classifies variables and expressions according to the kinds of values they hold and compute. Scala stands out as a language with a very advanced static type system. Starting from a system of nested class types much like Java’s, it allows you to parameterize types with generics, to combine types using intersections, and to hide details of types using abstract types. These give a strong foundation for building and composing your own types, so that you can design interfaces that are at the same time safe and flexible to use.

If you like dynamic languages such as Perl, Python, Ruby, or Groovy, you might find it a bit strange that Scala’s static type system is listed as one of its strong points. After all, the absence of a static type system has been cited by some as a major advantage of dynamic languages. The most common arguments against static types are that they make programs too verbose, prevent programmers from expressing themselves as they wish, and make impossible certain patterns of dynamic modifications of software systems.

However, often these arguments do not go against the idea of static types in general, but against specific type systems, which are perceived to be too verbose or too inflexible. For instance, Alan Kay, the inventor of the Smalltalk language, once remarked: “I’m not against types, but I don’t know of any type systems that aren’t a complete pain, so I still like dynamic typing.”

We hope to convince you in this book that Scala’s type system is far from being a “complete pain.” In fact, it addresses nicely two of the usual concerns about static typing: verbosity is avoided through type inference and flexibility is gained through pattern matching and several new ways to write and compose types. With these impediments out of the way, the classical benefits of static type systems can be better appreciated. Among the most important of these benefits are verifiable properties of program abstractions, safe refactorings, and better documentation.

Verifiable properties

Static type systems can prove the absence of certain run-time errors. For instance, they can prove properties like: booleans are never added to integers; private variables are not accessed from outside their class; functions are applied to the right number of arguments; only strings are ever added to a set of strings.

Other kinds of errors are not detected by today’s static type systems. For instance, they will usually not detect non-terminating functions, array bounds violations, or divisions by zero. They will also not detect that your program does not conform to its specification (assuming there is a spec, that is!). Static type systems have therefore been dismissed by some as not being very useful. The argument goes that since such type systems can only detect simple errors, whereas unit tests provide more extensive coverage, why bother with static types at all?

We believe that these arguments miss the point. Although a static type system certainly cannot replace unit testing, it can reduce the number of unit tests needed by taking care of some properties that would otherwise need to be tested. Likewise, unit testing cannot replace static typing. After all, as Edsger Dijkstra said, testing can only prove the presence of errors, never their absence. So the guarantees that static typing gives may be simple, but they are real guarantees of a form no amount of testing can deliver.

Safe refactorings

A static type system provides a safety net that lets you make changes to a codebase with a high degree of confidence. Consider for instance a refactoring that adds an additional parameter to a method. In a statically typed language you can do the change, re-compile your system and simply fix all lines that cause a type error. Once you have finished with this, you are sure to have found all places that need to be changed. The same holds for many other simple refactorings like changing a method name, or moving methods from one class to another. In all cases a static type check will provide enough assurance that the new system works just like the old.

Documentation

Static types are program documentation that is checked by the compiler for correctness. Unlike a normal comment, a type annotation can never be out of date (at least not if the source file that contains it has recently passed a compiler). Furthermore, compilers and integrated development environments can make use of type annotations to provide better context help. For instance, an integrated development environment can display all the members available for a selection by determining the static type of the expression on which the selection is made and looking up all members of that type.
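
To make the "only strings are ever added to a set of strings" point from the excerpt concrete, here is a minimal sketch in Java (not Scala, to match the rest of this page). The class and method names are made up for illustration. The commented-out line is the kind of mistake the compiler rejects, so no type-checking test has to be written for it:

```java
import java.util.Set;
import java.util.TreeSet;

public class TypedSetSketch {
    // The element type is part of the signature, so callers can rely on getting strings back.
    static Set<String> tags() {
        Set<String> tags = new TreeSet<String>();
        tags.add("scala");
        tags.add("java");
        // tags.add(42);  // does not compile: an int is not a String
        return tags;
    }

    public static void main(String[] args) {
        for (String tag : tags()) {
            // No cast and no instanceof check is needed before calling String methods.
            System.out.println(tag.toUpperCase());
        }
    }
}
```

What is left to test is the semantics (are these the right tags?), not whether the set might contain something other than a string.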

This is one of my "gripes" about dynamic languages. I want to test for semantics, not type errors ;-) That being said, a good testing framework/setup is really a must in all non-trivial situations and good code-coverage and tested requirements is/are important.

If you do want to go down the static-typing path on the JVM (I have), I would highly recommend looking at Scala. Coming from Ruby, it's far less painful (and actually lots of fun in different ways) than going to Java. You get to "keep" the things you take for granted -- an expression-based syntax, closures, the ability to omit types in many places (not as open as Ruby, but you do get compile-time type checking ;-), everything(*)-is-an-object OO, unified accessor methods, ability to construct DSLs easily, and sugar -- and get the benefits of a statically typed language with local type inference, pattern matching, a relatively rich collection framework, and decent integration with Java (including the numerous web-frameworks, there are some Scala-specific frameworks as well which leverage the Scala language).

C#3.0/4.0 (and .NET3.5+) isn't too shabby either (but avoid C#2.0, which is now hopefully a relic), with the introduction of LINQ/closures, basic type inference and other nice language features I find it "acceptable" for most tasks (take a guess how I would rate Java as a language ;-). However, C# is a CLR-target language (there is/was a .NET Scala port, but I am not sure of the status -- it is not the main target platform though).

Since I have mentioned Scala, I should also mention F# (now an "official" .NET language) which takes the "Functional with OO" approach being similar to OCaml -- Scala is more of the reverse and I would describe it as "OO with Functional". I have heard arguments for/against F# compared to C# w.r.t. the type system, but have no practical experience with F#. You may or may not like the paradigm shift.

Happy coding.

What is "public abstract interface" in Java?

9 votes

When I wondered about the implementation of the MenuItem.setOnMenuItemClickListener() method, I opened the class file and this is what I saw:

// Compiled from MenuItem.java (version 1.5 : 49.0, no super bit)
public abstract interface android.view.MenuItem {

  // Method descriptor #7 ()I
  public abstract int getItemId();

  // Method descriptor #7 ()I
  public abstract int getGroupId();

  // Method descriptor #7 ()I
  public abstract int getOrder();

 //...goes like that

}

As you can see, android.view.MenuItem has two qualifiers, abstract and interface. As far as I know, they have similar meanings and are mostly used for abstraction or to force the developer to implement something.

So what does it mean now?

Difference between Abstract class and interface :

Unlike interfaces, abstract classes can contain fields that are not static and final, and they can contain implemented methods. Such abstract classes are similar to interfaces, except that they provide a partial implementation, leaving it to subclasses to complete the implementation. If an abstract class contains only abstract method declarations, it should be declared as an interface instead.

Multiple interfaces can be implemented by classes anywhere in the class hierarchy, whether or not they are related to one another in any way. Think of Comparable or Cloneable, for example.

By comparison, abstract classes are most commonly subclassed to share pieces of implementation. A single abstract class is subclassed by similar classes that have a lot in common (the implemented parts of the abstract class), but also have some differences (the abstract methods).

Source : http://download.oracle.com/javase/tutorial/java/IandI/abstract.html
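
A minimal sketch of the distinction described in the quoted tutorial text, with made-up names (Named, Shape, Square, Circle are not from any library). The abstract class carries an instance field and implemented methods and leaves only area() to its subclasses, while the interfaces state a contract and nothing else:

```java
public class ShapesSketch {
    // Interface: a pure contract that unrelated classes could implement.
    interface Named {
        String name();
    }

    // Abstract class: shares state and implemented methods, leaves area() to subclasses.
    // It can also implement several interfaces at once.
    static abstract class Shape implements Named, Comparable<Shape> {
        private final String name;   // instance field, which an interface cannot declare

        Shape(String name) { this.name = name; }

        public String name() { return name; }

        abstract double area();

        // Shared implementation: order shapes by area.
        public int compareTo(Shape other) {
            return Double.compare(area(), other.area());
        }
    }

    static class Square extends Shape {
        private final double side;
        Square(double side) { super("square"); this.side = side; }
        double area() { return side * side; }
    }

    static class Circle extends Shape {
        private final double radius;
        Circle(double radius) { super("circle"); this.radius = radius; }
        double area() { return Math.PI * radius * radius; }
    }
}
```

Square and Circle only supply what differs between them; naming and comparison come from the shared abstract class.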

An interface is abstract by definition. The use of the abstract modifier here is redundant. Arguably, it shouldn't even be allowed.
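
One way to check this yourself is with reflection. In the sketch below, Plain and Explicit are hypothetical interfaces declared only for the comparison. One spells out the modifiers the way the decompiler output does and the other omits them:

```java
import java.lang.reflect.Modifier;

public class InterfaceModifiers {
    // No explicit modifiers, the way interfaces are normally written.
    interface Plain { int getItemId(); }

    // Redundant modifiers, the way a decompiler prints them.
    abstract interface Explicit { public abstract int getItemId(); }

    static boolean typeIsAbstract(Class<?> type) {
        return Modifier.isAbstract(type.getModifiers());
    }

    static boolean methodIsPublicAbstract(Class<?> type, String methodName) {
        try {
            int mods = type.getMethod(methodName).getModifiers();
            return Modifier.isPublic(mods) && Modifier.isAbstract(mods);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        // Both interfaces report the same flags, with or without the keywords.
        System.out.println(typeIsAbstract(Plain.class));
        System.out.println(typeIsAbstract(Explicit.class));
        System.out.println(methodIsPublicAbstract(Plain.class, "getItemId"));
        System.out.println(methodIsPublicAbstract(Explicit.class, "getItemId"));
    }
}
```

All four checks come out true, which is why tools that print a class file's raw flags show "public abstract interface" even when the source only says "public interface".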

Where do I sort?

9 votes

Hello all, I have a database that I query, and I'm unsure where to sort the results. So far I have the following options:

  • In the MySQL query.
  • At list level (using a LinkedList).
  • Sorting an unsorted list using comparators before showing the results (basically in the JSP).

The List is composed of ObjectDTO instances, so where would sorting be most efficient? Any ideas?

You should do the sorting in the database if at all possible.

  • The database can use indexes. If there is a suitable index available then the results can be read from disk already in sorted order, resulting in a performance increase - no extra O(n log(n)) sorting step is required.
  • If you only need the first x results you also minimize data transfer (both reduced network transfer, and also reduced disk access if there is a suitable index).
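
As a rough sketch of the two approaches side by side: the table name, column names and ObjectDTO fields below are assumptions, since the question does not show the schema. The query string is what you would hand to a PreparedStatement; the in-memory sort is the fallback for orderings that cannot be expressed in SQL:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class SortSketch {
    // Hypothetical table and columns. With a suitable index on `name` the database
    // may return rows already in order, and LIMIT keeps the transferred data small.
    static final String SORTED_QUERY =
        "SELECT id, name FROM object ORDER BY name ASC LIMIT ?";

    // Stand-in for the ObjectDTO mentioned in the question; the fields are assumed.
    static class ObjectDTO {
        final int id;
        final String name;
        ObjectDTO(int id, String name) { this.id = id; this.name = name; }
    }

    // Fallback: sort a copy in memory, leaving the caller's list untouched.
    static List<ObjectDTO> sortedByName(List<ObjectDTO> rows) {
        List<ObjectDTO> copy = new ArrayList<ObjectDTO>(rows);
        Collections.sort(copy, new Comparator<ObjectDTO>() {
            public int compare(ObjectDTO a, ObjectDTO b) {
                return a.name.compareTo(b.name);
            }
        });
        return copy;
    }
}
```

Note that the in-memory version has to load every row before it can return the first one, which is exactly the cost the ORDER BY ... LIMIT form avoids.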

what is the point of heterogenous arrays?

9 votes

I know that more-dynamic-than-Java languages, like Python and Ruby, often allow you to place objects of mixed types in arrays, like so:

["hello", 120, ["world"]]

What I don't understand is why you would ever use a feature like this. If I want to store heterogenous data in Java, I'll usually create an object for it.

For example, say a User has int ID and String name. While I see that in Python/Ruby/PHP you could do something like this:

[["John Smith", 000], ["Smith John", 001], ...]

this seems a bit less safe/OO than creating a class User with attributes ID and name and then having your array:

[<User: name="John Smith", id=000>, <User: name="Smith John", id=001>, ...]

where those <User ...> things represent User objects.

Is there reason to use the former over the latter in languages that support it? Or is there some bigger reason to use heterogenous arrays?

N.B. I am not talking about arrays that include different objects that all implement the same interface or inherit from the same parent, e.g.:

class Square extends Shape
class Triangle extends Shape
[new Square(), new Triangle()]

because that is, to the programmer at least, still a homogenous array as you'll be doing the same thing with each shape (e.g., calling the draw() method), only the methods commonly defined between the two.

As katrielalex wrote: There is no reason not to support heterogeneous lists. In fact, disallowing it would require static typing, and we're back to that old debate. But let's refrain from doing so and instead answer the "why would you use that" part...

To be honest, it is not used that much -- if we make use of the exception in your last paragraph and choose a more liberal definition of "implement the same interface" than e.g. Java or C#. Nearly all of my iterable-crunching code expects all items to implement some interface. Of course it does, otherwise it could do very little with them!

Don't get me wrong, there are absolutely valid use cases - there's rarely a good reason to write a whole class for containing some data (and even if you add some callables, functional programming sometimes comes to the rescue). A dict would be a more common choice though, and namedtuple is very neat as well. But they are less common than you seem to think, and they are used with thought and discipline, not for cowboy coding.

(Also, your "User as nested list" example is not a good one: since the inner lists are fixed-size, you had better use tuples, and that makes it valid even in Haskell (the type would be [(String, Integer)]).)
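
For readers coming from the Java side of the question, here is a small sketch of what the two styles look like there. The User class and both helper methods are made up for illustration. With a List<Object> every use needs a runtime check, while the typed list needs none:

```java
import java.util.ArrayList;
import java.util.List;

public class HeterogeneousSketch {
    // Hypothetical User type from the question: a name and an id.
    static class User {
        final String name;
        final int id;
        User(String name, int id) { this.name = name; this.id = id; }
    }

    // Heterogeneous list: the compiler knows nothing about the elements,
    // so the code has to inspect each one at run time.
    static int countStrings(List<Object> items) {
        int count = 0;
        for (Object item : items) {
            if (item instanceof String) {
                count++;
            }
        }
        return count;
    }

    // Homogeneous list: every element is known to be a User, no checks or casts.
    static List<String> names(List<User> users) {
        List<String> result = new ArrayList<String>();
        for (User user : users) {
            result.add(user.name);
        }
        return result;
    }
}
```

This is the "implement the same interface" point in practice: code that crunches a collection usually wants to do the same thing to every element, and the typed version says so in its signature.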

Fastest reliable way for Clojure (Java) and Ruby apps to communicate

7 votes

Hi There,

We have cloud-hosted (RackSpace cloud) Ruby and Java apps that will interact as follows:

  1. Ruby app sends a request to Java app. Request consists of map structure containing strings, integers, other maps, and lists (analogous to JSON).
  2. Java app analyzes data and sends reply to Ruby App.

We are interested in evaluating both messaging formats (JSON, Protocol Buffers, Thrift, etc.) as well as message transmission channels/techniques (sockets, message queues, RPC, REST, SOAP, etc.).

Our criteria:

  1. Short round-trip time.
  2. Low round-trip-time standard deviation. (We understand that garbage collection pauses and network usage spikes can affect this value).
  3. High availability.
  4. Scalability (we may want to have multiple instances of Ruby and Java app exchanging point-to-point messages in the future).
  5. Ease of debugging and profiling.
  6. Good documentation and community support.
  7. Bonus points for Clojure support.
  8. Good dynamic language support.

What combination of message format and transmission method would you recommend? Why?

I've gathered here some materials we have already collected for review:

We have decided to go with BSON over RabbitMQ.

We like BSON's support for heterogeneous collections and the lack of the need to specify the format of messages up-front. We don't mind that it has poor space usage characteristics and likely poorer serialization performance than other message formats since the messaging portion of our application is not anticipated to be the bottleneck. It doesn't look like a nice Clojure interface has been written to let you directly manipulate BSON objects, but hopefully that won't be an issue. I will revise this entry if we decide that BSON won't work out for us.

We chose RabbitMQ mainly because we already have experience with it and are using it in a system that demands high throughput and availability.

If messaging does become a bottleneck, we will look first to BERT (we rejected it because it currently does not appear to have Java support), then to MessagePack (rejected because it appears that there isn't a large community of Java developers using it), then to Avro (rejected because it requires you to define your message format up-front), then Protocol Buffers (rejected because of the extra code generation step and lack of heterogeneous collections) and then Thrift (rejected for the reasons mentioned for Protocol Buffers).

We may want to go with a plain RPC scheme rather than using a message queue since our messaging style is essentially synchronous point-to-point.

Thanks for your input everyone!

Update: Here is the project.clj and core.clj that shows how to convert Clojure maps to BSON and back:

;;;; project.clj

(defproject bson-demo "0.0.1"
  :description "BSON Demo"
  :dependencies [[org.clojure/clojure "1.2.0"]
                 [org.clojure/clojure-contrib "1.2.0"]
                 [org.mongodb/mongo-java-driver "2.1"]]
  :dev-dependencies [[swank-clojure "1.3.0-SNAPSHOT"]]
  :main core)

;;;; core.clj
(ns core
  (:gen-class)
  (:import [org.bson BasicBSONObject BSONEncoder BSONDecoder]))

(defonce *encoder* (BSONEncoder.))

(defonce *decoder* (BSONDecoder.))

;; XXX Does not accept keyword keys. Convert clojure.lang.Keyword keys in the map to java.lang.String first.
(defn map-to-bson [m]
  (->> m (BasicBSONObject.) (.encode *encoder*)))

;; b is the byte array produced by map-to-bson, hence the ^bytes hint.
(defn bson-to-map [^bytes b]
  (->> (.readObject *decoder* b) (.toMap) (into {})))

(defn -main []
  (let [m {"foo" "bar"}]
    (prn (bson-to-map (map-to-bson m)))))

Is the entire Xss (stack space) used for each Java thread?

6 votes

I am considering increasing the stack size to work around the StackOverflowError thrown by the regex library, since a fix for it does not appear to be planned.

Edit: Solution

  • Stephen C's answer is probably the best answer to the problem, even if it is not an answer to the question. Although my string size was more than 4k already, I was still likely to eventually have the problem again during the lifetime of the product
  • aioobe's answer is the best answer to the actual question, perhaps not the actual problem.
  • Chris's answer is a very good idea. Edit: JRegex worked great!

I think a better solution would be to rewrite the regex to avoid the problem. Or better still, replace it with some plain Java parsing code. Or maybe just reject strings larger than a certain length.

Bumping the stack size only puts off the problem. Now you can cope with 2000 or 4000 character input strings instead of 1000. But sooner or later you are likely to run into one that causes your expanded stacks to overflow.
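
A minimal sketch of the first and last suggestions combined. The question does not show the actual regex, so the pattern here is a stand-in: alternation inside a repetition, such as (?:a|b)*, is a commonly reported cause of StackOverflowError in java.util.regex on long inputs, and an equivalent character class avoids the per-character recursion. MAX_INPUT_LENGTH is an arbitrary illustrative limit:

```java
import java.util.regex.Pattern;

public class RegexGuard {
    // Hypothetical limit; choose it from what real inputs look like.
    static final int MAX_INPUT_LENGTH = 10000;

    // Rewritten form of a pattern like (?:a|b)*. The character class matches
    // the same strings without an alternation inside the repetition.
    static final Pattern SAFE = Pattern.compile("[ab]*");

    static boolean withinLimit(String input) {
        return input.length() <= MAX_INPUT_LENGTH;
    }

    // Rejects oversized input up front instead of relying on a bigger stack.
    static boolean matches(String input) {
        if (!withinLimit(input)) {
            throw new IllegalArgumentException("input too long: " + input.length());
        }
        return SAFE.matcher(input).matches();
    }
}
```

Either measure bounds the stack depth regardless of the -Xss setting; which rewrite applies depends on the real pattern.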