Best questions in March 2011

Regular expression to search for Gadaffi

145 votes

I'm trying to search for the word Gadaffi. What's the best regular expression to search for this?

My best attempt so far is:

\b[KG]h?add?af?fi$\b

But I still seem to be missing some journals. Any suggestions?

Update: I found a pretty extensive list here: http://www.express.be/joker/nl/platdujour/gaddafi-khadaffi-el-qadafi-kadhafy/141157.htm

The answer below matches all the 30 variants:

Gadaffi
Gadafi
Gadafy
Gaddafi
Gaddafy
Gaddhafi
Gadhafi
Gathafi
Ghadaffi
Ghadafi
Ghaddafi
Ghaddafy
Gheddafi
Kadaffi
Kadafi
Kaddafi
Kadhafi
Kazzafi
Khadaffy
Khadafy
Khaddafi
Qadafi
Qaddafi
Qadhafi
Qadhdhafi
Qadthafi
Qathafi
Quathafi
Qudhafi
Kad'afi

\b[KGQ]h?add?h?af?fi\b

Arabic transcription is (Wiki says) "Qaḏḏāfī", so maybe adding a Q. And one H ("Gadhafi", as the article (see below) mentions).

Btw, why is there a $ at the end of the regex?


Btw, nice article on the topic:

Gaddafi, Kadafi, or Qaddafi? Why is the Libyan leader’s name spelled so many different ways?.


EDIT

To match all the names in the article you've mentioned later, this should match them all. Let's just hope it won't match a lot of other stuff :D

\b(Kh?|Gh?|Qu?)[aeu](d['dt]?|t|zz|dhd)h?aff?[iy]\b
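
A quick way to check that claim is to run the pattern over the list from the question. The following is only a sketch in Java (the class and method names are made up for illustration); the pattern is the one above with backslashes doubled for a Java string literal.

```java
import java.util.List;
import java.util.regex.Pattern;

class GaddafiRegexCheck {
    // The pattern from the EDIT above, escaped for a Java string literal.
    static final Pattern NAME =
            Pattern.compile("\\b(Kh?|Gh?|Qu?)[aeu](d['dt]?|t|zz|dhd)h?aff?[iy]\\b");

    // The spellings listed in the question.
    static final List<String> VARIANTS = List.of(
            "Gadaffi", "Gadafi", "Gadafy", "Gaddafi", "Gaddafy", "Gaddhafi",
            "Gadhafi", "Gathafi", "Ghadaffi", "Ghadafi", "Ghaddafi", "Ghaddafy",
            "Gheddafi", "Kadaffi", "Kadafi", "Kaddafi", "Kadhafi", "Kazzafi",
            "Khadaffy", "Khadafy", "Khaddafi", "Qadafi", "Qaddafi", "Qadhafi",
            "Qadhdhafi", "Qadthafi", "Qathafi", "Quathafi", "Qudhafi", "Kad'afi");

    // Counts the spellings that the pattern does NOT match in full.
    static long countMisses() {
        return VARIANTS.stream().filter(v -> !NAME.matcher(v).matches()).count();
    }

    public static void main(String[] args) {
        System.out.println("variants: " + VARIANTS.size() + ", misses: " + countMisses());
    }
}
```

This only tests the positive cases; it says nothing about how many unrelated words the pattern would also match.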

Is this a JVM bug or "expected behavior"?

48 votes

I noticed some unexpected behavior (unexpected relative to my personal expectations), and I'm wondering if there is a bug in the JVM or if perhaps this is a fringe case where I don't understand some of the details of what exactly is supposed to happen. Suppose we had the following code in a main method by itself:

int i;
int count = 0;
for(i=0; i < Integer.MAX_VALUE; i+=2){
  count++;
}
System.out.println(i++);

A naive expectation would be that this would print Integer.MAX_VALUE-1, the largest even representable int. However, I believe integer arithmetic is supposed to "rollover" in Java, so adding 1 to Integer.MAX_VALUE should result in Integer.MIN_VALUE. Since Integer.MIN_VALUE is still less than Integer.MAX_VALUE, the loop would keep iterating through the negative even ints. Eventually it would get back to 0, and this process should repeat as an infinite loop.
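
The wraparound this reasoning relies on can be checked in isolation, without running the loop itself. A minimal sketch (the class and method names are hypothetical):

```java
class EvenOverflowDemo {
    // One step of the loop from the question: i += 2 with ordinary int arithmetic.
    static int step(int i) {
        return i + 2;
    }

    public static void main(String[] args) {
        int lastEven = Integer.MAX_VALUE - 1;   // largest even int
        int next = step(lastEven);              // overflows and wraps around
        System.out.println(next == Integer.MIN_VALUE);   // the wrapped value
        System.out.println(next < Integer.MAX_VALUE);    // so the loop condition still holds
    }
}
```

So by the language rules the loop condition never becomes false, which is why a terminating loop points at the JIT rather than at the arithmetic.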

When I actually run this code, I get non-deterministic results. The result that gets printed tends to be on the order of half a million, but the exact value varies. So not only is the loop terminating when I believe it should be an infinite loop, but it seems to terminate randomly. What's going on?

My guess is that this is either a bug in the JVM, or there is a lot of funky optimization going on that makes this expected behavior. Which is it?

Known bug. Related to

http://bugs.sun.com/view_bug.do?bug_id=6196102

http://bugs.sun.com/view_bug.do?bug_id=6357214

and others.

I think they're considered low-priority to fix because they don't come up in the real world.

Why are composite primary keys still around?

48 votes

I'm assigned to migrate a database to a mid-class ERP. The new system uses composite primary keys here and there, and from a pragmatic point of view, why?

Compared to autogenerated IDs, I can only see negative aspects;

  • Foreign keys become blurry
  • Harder migration or db-redesigns
  • Inflexible as the business changes. (My car has no reg.plate..)
  • Same integrity better achieved with constraints.

It's falling back to the design concept of candidate keys, which I don't see the point of either.

Is it a habit/artifact from the floppy-days (minimizing space/indexes), or am I missing something?

//edit// Just found good SO-post: Composite primary keys versus unique object ID field //

Personally I prefer the use of surrogate keys. However, in joining tables that consist only of the ids from two other tables (to create a many-to-many relationship), composite keys are the way to go, and thus taking them out would make things more difficult.

There is a school of thought that surrogate keys are always bad and that if you can't guarantee the uniqueness of a record through the use of natural keys you have a bad design. I strongly disagree with this (if you aren't storing SSN or some other unique value, I defy you to come up with a natural key for a person table, for instance). But many people feel that it is necessary for proper normalization.

Sometimes having a composite key reduces the need to join to another table. Sometimes it doesn't. So there are times when a composite key can boost performance as well as times when it can harm performance. If the key is relatively stable, you may be fine with faster performance on select queries. However, if it is something that is subject to change, like a company name, you could be in a world of hurt when company A changes its name and you have to update a million associated records.

There is no one size fits all in database design. There are times when composite keys are helpful and times when they are horrible. There are times when surrogate keys are helpful and times when they are not.

Xcode 4 Tips and Tricks for Xcode 3 users

37 votes

As most of you have probably seen, Xcode 4 has been released officially today. Now I know that plenty of devs out there have been using the preview versions, and it'd be great if people could post any great tips, tricks, or keyboard shortcuts they've learned using those versions now that they're no longer under NDA. This could be especially useful for those upgrading from Xcode 3 (like me, downloading right now).

Note: Apple have released a 'transition guide' that has plenty of stuff in about getting from version 3 to version 4, but I bet there are loads of great tricks people out there have learned that aren't in there.

I liked reading this Blog: Pilky.me - Xcode 4: the super mega awesome review.

It presents a good comparison, I especially liked his conclusion near the end.

Do redundant casts get optimized?

36 votes

I am updating some old code, and have found several instances where the same object is being cast repeatedly each time one of its properties or methods needs to be called. Example:

if (recDate != null && recDate > ((System.Windows.Forms.DateTimePicker)ctrl).MinDate)
{
    ((System.Windows.Forms.DateTimePicker)ctrl).CustomFormat = "MM/dd/yyyy";
    ((System.Windows.Forms.DateTimePicker)ctrl).Value = recDate;
}
else
{
    ((System.Windows.Forms.DateTimePicker)ctrl).CustomFormat = " ";
}
((System.Windows.Forms.DateTimePicker)ctrl).Format = DateTimePickerFormat.Custom;

My inclination is to fix this monstrosity, but given my limited time I don't want to bother with anything that's not affecting functionality or performance.

So what I'm wondering is, are these redundant casts getting optimized away by the compiler? I tried figuring it out myself by using ildasm on a simplified example, but not being familiar with IL I only ended up more confused.

UPDATE

So far, the consensus seems to be that a) no, the casts are not optimized, but b) while there may possibly be some small performance hit as a result, it is not likely significant, and c) I should consider fixing them anyway. I have come down on the side of resolving to fix these someday, if I have time. Meanwhile, I won't worry about them.

Thanks everyone!

It is not optimized away from IL in either debug or release builds.

simple C# test:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace RedundantCastTest
{
    class Program
    {
        static object get()
        { return "asdf"; }

        static void Main(string[] args)
        {
            object obj = get();
            if ((string)obj == "asdf")
                Console.WriteLine("Equal: {0}, len: {1}", obj, ((string)obj).Length);
        }
    }
}

Corresponding IL (note the multiple castclass instructions):

.method private hidebysig static void Main(string[] args) cil managed
{
    .entrypoint
    .maxstack 3
    .locals init (
        [0] object obj,
        [1] bool CS$4$0000)
    L_0000: nop 
    L_0001: call object RedundantCastTest.Program::get()
    L_0006: stloc.0 
    L_0007: ldloc.0 
    L_0008: castclass string
    L_000d: ldstr "asdf"
    L_0012: call bool [mscorlib]System.String::op_Equality(string, string)
    L_0017: ldc.i4.0 
    L_0018: ceq 
    L_001a: stloc.1 
    L_001b: ldloc.1 
    L_001c: brtrue.s L_003a
    L_001e: ldstr "Equal: {0}, len: {1}"
    L_0023: ldloc.0 
    L_0024: ldloc.0 
    L_0025: castclass string
    L_002a: callvirt instance int32 [mscorlib]System.String::get_Length()
    L_002f: box int32
    L_0034: call void [mscorlib]System.Console::WriteLine(string, object, object)
    L_0039: nop 
    L_003a: ret 
}

Neither is it optimized from the IL in the release build:

.method private hidebysig static void Main(string[] args) cil managed
{
    .entrypoint
    .maxstack 3
    .locals init (
        [0] object obj)
    L_0000: call object RedundantCastTest.Program::get()
    L_0005: stloc.0 
    L_0006: ldloc.0 
    L_0007: castclass string
    L_000c: ldstr "asdf"
    L_0011: call bool [mscorlib]System.String::op_Equality(string, string)
    L_0016: brfalse.s L_0033
    L_0018: ldstr "Equal: {0}, len: {1}"
    L_001d: ldloc.0 
    L_001e: ldloc.0 
    L_001f: castclass string
    L_0024: callvirt instance int32 [mscorlib]System.String::get_Length()
    L_0029: box int32
    L_002e: call void [mscorlib]System.Console::WriteLine(string, object, object)
    L_0033: ret 
}

Neither listing proves that the casts don't get optimized when native code is generated; for that you'd need to look at the actual machine code, e.g. by running ngen and disassembling the result. I'd be greatly surprised if they weren't optimized away.

Regardless, I'll cite The Pragmatic Programmer and the broken window theory: When you see a broken window, fix it.

What does T&& mean in C++0x?

34 votes

I've been looking into some of the new features of C++0x and one I've noticed is the double ampersand in declaring variables, like T&& var.

For a start, what is this beast called? (I wish Google would allow us to search for punctuation like this.)

What exactly does it mean?

At first glance, it appears to be a double reference (like the C-style double pointers T** var), but I'm having a hard time thinking of a use case for that.

It declares an rvalue reference (standards proposal doc).

Here's an introduction to rvalue references: http://www.artima.com/cppsource/rvalue.html.
Here's a fantastic in-depth look at rvalue references by one of Microsoft's standard library developers: http://blogs.msdn.com/b/vcblog/archive/2009/02/03/rvalue-references-c-0x-features-in-vc10-part-2.aspx.

The biggest difference between an rvalue reference and a C++03 reference (now called an lvalue reference in C++0x) is that an rvalue reference can bind to an rvalue, such as a temporary, without having to be const. Thus, this syntax is now legal:

T&& r = T();

rvalue references primarily provide for the following:

Move semantics. A move constructor and move assignment operator can now be defined that take an rvalue reference instead of the usual const-lvalue reference. A move functions like a copy, except it is not obliged to keep the source unchanged; in fact, it usually modifies the source such that it no longer owns the moved resources. This is great for eliminating extraneous copies, especially in standard library implementations.

For example, a copy constructor might look like this:

foo(foo const& other)
{
    this->length = other.length;
    this->ptr = new int[other.length];
    copy(other.ptr, other.ptr + other.length, this->ptr);
}

If this constructor was passed a temporary, the copy would be unnecessary because we know the temporary will just be destroyed; why not make use of the resources the temporary already allocated? In C++03, there's no way to prevent the copy as we cannot determine we were passed a temporary. In C++0x, we can overload a move constructor:

foo(foo&& other)
{
   this->length = other.length;
   this->ptr = other.ptr;
   other.length = 0;
   other.ptr = nullptr;
}

Notice the big difference here: the move constructor actually modifies its argument. This would effectively "move" the temporary into the object being constructed, thereby eliminating the unnecessary copy.

The move constructor would be used for temporaries and for non-const lvalues that are explicitly converted to rvalue references using the std::move function (it just performs the conversion). Both of the following lines invoke the move constructor, for f1 and f2 respectively:

foo f1((foo())); // Move a temporary into f1; temporary becomes "empty"
foo f2 = std::move(f1); // Move f1 into f2; f1 is now "empty"

Perfect forwarding. rvalue references allow us to properly forward arguments in templated functions. Take for example this factory function:

template <typename T, typename A1>
std::shared_ptr<T> factory(A1& a1)
{
    return std::shared_ptr<T>(new T(a1));
}

If we called factory<foo>(5), A1 would be deduced as int, so the parameter type would be int&, which will not bind to a literal 5, even if foo's constructor takes an int. Well, we could instead use A1 const&, but what if foo takes the constructor argument by non-const reference? To make a truly generic factory function, we would have to overload factory on A1& and on A1 const&. That might be fine if factory takes 1 parameter, but each additional parameter would multiply the necessary overload set by 2. That very quickly becomes unmaintainable.

rvalue references fix this problem by allowing the standard library to define a forward function, something like this:

template <typename T>
struct identity
{
    typedef T type;
};

template <typename T>
T&& forward(typename identity<T>::type& a)
{
    return (T&&)a;
}

Which would enable us to define the factory function like this:

template <typename T, typename A1>
std::shared_ptr<T> factory(A1&& a1)
{
    return std::shared_ptr<T>(new T(std::forward<A1>(a1)));
}

Now the argument's rvalue/lvalue-ness is preserved when passed to T's constructor. That means that if factory is called with an rvalue, T's constructor is called with an rvalue. If factory is called with an lvalue, T's constructor is called with an lvalue. The improved factory function works because of one special rule:

When the function parameter type is of the form T&& where T is a template parameter, and the function argument is an lvalue of type A, the type A& is used for template argument deduction

Thus, we can use factory like so:

auto p1 = factory<foo>(foo()); // calls foo(foo&&)
auto p2 = factory<foo>(*p1);   // calls foo(foo const&)

Important rvalue reference properties:

  • For overload resolution, lvalues prefer binding to lvalue references and rvalues prefer binding to rvalue references. Hence why temporaries prefer invoking a move constructor / move assignment operator over a copy constructor / assignment operator.
  • rvalue references will implicitly bind to rvalues and to temporaries that are the result of an implicit conversion. For example, float f = 0.0f; int&& i = f; is well formed because float is implicitly convertible to int; the reference would be to a temporary that is the result of the conversion.
  • Named rvalue references are lvalues. Unnamed rvalue references are rvalues. This is important to understand why the std::move call is necessary in: foo&& r = foo(); foo f = std::move(r);

Why is DateTime.Now a property and not a method?

33 votes

After reading this blog entry : http://wekeroad.com/post/4069048840/when-should-a-method-be-a-property,

I'm wondering why Microsoft chose in C#:

DateTime aDt = DateTime.Now;

instead of

DateTime aDt = DateTime.Now();

  • Best practices say: use a method when calling the member twice in succession produces different results
  • And DateTime.Now is a perfect example of a non-deterministic method/property.

Do you know if there is any reason for that design?
Or is it just a small mistake?

I believe in CLR via C#, Jeffrey Richter mentions that DateTime.Now is a mistake.

The System.DateTime class has a readonly Now property that returns the current date and time. Each time you query this property, it will return a different value. This is a mistake, and Microsoft wishes that they could fix the class by making Now a method instead of a property.

CLR via C# 3rd Edition - Page 243

Is this code behavior defined?

33 votes

What does the following code print to the console?

map<int,int> m;
m[0] = m.size();
printf("%d", m[0]);

Possible answers:

  1. The behavior of the code is not defined since it is not defined which statement m[0] or m.size() is being executed first by the compiler. So it could print 1 as well as 0.
  2. It prints 0 because the right hand side of the assignment operator is executed first.
  3. It prints 1 because the operator[] has the highest priority of the complete statement m[0] = m.size(). Because of this the following sequence of events occurs:

    • m[0] creates a new element in the map
    • m.size() gets called which is now 1
    • m[0] gets assigned the previously returned (by m.size()) 1
  4. The real answer?, which is unknown to me^^

Thx in advance for your answers...
Woltan

I believe it's unspecified whether 0 or 1 is stored in m[0], but it's not undefined behavior.

The LHS and the RHS can occur in either order, but they're both function calls, so they both have a sequence point at the start and end. There's no danger of the two of them, collectively, accessing the same object without an intervening sequence point.

The assignment is actual int assignment, not a function call with associated sequence points, since operator[] returns T&. That's briefly worrying, but it's not modifying an object that is accessed anywhere else in this statement, so that's safe too. It's accessed within operator[], of course, where it is initialized, but that occurs before the sequence point on return from operator[], so that's OK. If it wasn't, m[0] = 0; would be undefined too!

However, the order of evaluation of the operands of operator= is not specified by the standard, so the actual result of the call to size() might be 0 or 1 depending which order occurs.

The following would be undefined behavior, though. It doesn't make function calls and so there's nothing to prevent size being accessed (on the RHS) and modified (on the LHS) without an intervening sequence point:

int values[1];
int size = 0;

(++size, values[0] = 0) = size;
/*     fake m[0]     */  /* fake m.size() */

When will C++0x be finished?

32 votes

Ok, this is the first question I've asked and I didn't know you couldn't answer your own question.

Answer:

March 25, 2011. :-) I'm not kidding, it's official. Well, at least as far as the committee is concerned.

As Howard already said in the question, the final draft was completed on March 25, 2011.

There will now be some months of editorial changes, voting and ISO red tape before it officially becomes a standard, but on the 25th, the standards committee themselves officially signed off on it.

Sources:

https://www.ibm.com/developerworks/mydeveloperworks/blogs/5894415f-be62-4bc0-81c5-3956e82276f3/entry/the_c_0x_standard_has_been_approved_to_ship23?lang=en

http://herbsutter.com/2011/03/25/we-have-fdis-trip-report-march-2011-c-standards-meeting/

http://twitter.com/#!/sdt_intel/status/51328822066417665

and of course, Howard Hinnant, who asked the question, is on the committee as well, so he's not making it up.

(Only posting this as a "real" answer because Howard apparently was unable to answer his own question)

Final arguments in interface methods - what's the point?

31 votes

In Java, it is perfectly legal to define final arguments in interface methods and not obey that in the implementing class, e.g.:

public interface Foo {
    public void foo(int bar, final int baz);
}

public class FooImpl implements Foo {

    @Override
    public void foo(final int bar, int baz) {
        ...
    }
}

In the above example, bar and baz have the opposite final definitions in the class vs. the interface.

In the same fashion, no final restrictions are enforced when one class method overrides another, whether abstract or not.

While final has some practical value inside the class method body, is there any point specifying final for interface method parameters?

It doesn't seem like it. According to the Java Language Specification 4.12.4:

Declaring a variable final can serve as useful documentation that its value will not change and can help avoid programming errors.

However, a final modifier on a method parameter is not mentioned in the rules for matching signatures of overridden methods, and it has no effect on the caller, only within the body of an implementation.
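
To make that concrete, here is a small self-contained sketch of the situation from the question (the field `last` and the class `FinalParamDemo` are additions for illustration only). The modifier restricts only the method body in which it appears:

```java
class FinalParamDemo {
    public static void main(String[] args) {
        FooImpl impl = new FooImpl();
        Foo viaInterface = impl;
        viaInterface.foo(1, 2);          // the caller is unaffected by either 'final'
        System.out.println(impl.last);
    }
}

interface Foo {
    void foo(int bar, final int baz);    // 'final' here has no body to restrict
}

class FooImpl implements Foo {
    int last;   // hypothetical field, only there so the effect is observable

    @Override
    public void foo(final int bar, int baz) {
        baz = baz + bar;   // allowed: baz is not final in this implementation
        // bar = 0;        // would not compile: bar is final in this implementation
        last = baz;
    }
}
```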

.net string class alternative

31 votes

Since I am planning an application that will hold much of its data in memory, I would like to have some kind of 'compact' string class, at least one which will store a string in a format no larger than the zero-terminated ASCII version of the string.

Do you know of any such string class implementation - it should have some utility functions like the original string class.

EDIT:

I need to sort the strings and be able to scan through them, just to mention a few of the operations that I will use.

Ideally, it would be source compatible with System.String, so basic search&replace action would optimize application memory footprint.

NUMBERS:

I could have 100k records, each record having up to 10 strings of 30-60 characters. So:

100000x10x60=60000000=57mega characters. Why not have 60 megs of ram used for that instead of 120 megs of ram? Operations will be faster, everything will be tighter.

Trees will be used for searching, but won't be helpful in regex scans that I plan to have.

I've actually had a similar problem, but with somewhat different problem parameters. My application deals with 2 types of strings - relatively short ones measuring 60-100 chars and longer ones with 100-1000 bytes (averages around 300).

My use case also has to support unicode text, but a relatively small percentage of the strings actually have non-english chars.

In my use case I was exposing each String property as a native String, but the underlying data structure was a byte[] holding the UTF-8 encoded bytes.

My use case also requires searching and sorting through these strings, getting substrings and other common string operations. My dataset measures in the millions.

The basic implementation looks something like this:

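byte[] _myProperty;

public String MyProperty
{

   get 
   { 
        if (_myProperty== null)
            return null;

        // decode the stored bytes (there is no 'value' inside a getter)
        return Encoding.UTF8.GetString(_myProperty);
   }

   set
   { 
        // GetBytes throws on null, so a null string is stored as null
        _myProperty = value == null ? null : Encoding.UTF8.GetBytes(value);

   }

}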

The performance hit for these conversions, even when searching and sorting, was relatively small (about 10-15%).

This was fine for a while, but I wanted to reduce the overhead further. The next step was to create a merged array for all the strings in a given object (an object would hold either 1 short and 1 long string, or 4 short and 1 long string). So there would be one byte[] for each object, plus only 1 byte for each of the strings (to save their lengths, which are always < 256). Even if your strings can be longer than 256 bytes, an int is still cheaper than the 12-16 byte overhead of a separate byte[].

This reduced much of the byte[] overhead, and added a little complexity but no additional impact to performance (the encoding pass is relatively expensive compared with the array copy involved).

this implementation looks something like this:

// lengths (in bytes) of the encoded strings
byte _property1;
byte _property2;
byte _property3;

// all encoded strings, stored back to back
private byte[] _data; 

//i actually used an Enum to indicate which property, but i am sure you get the idea
private int GetStartIndex(int propertyIndex)
{  

   int result = 0;
   switch(propertyIndex)
   {
       //the fallthrough is on purpose (C# needs an explicit goto for it)
       case 2:
          result+=_property2;
          goto case 1;
       case 1:
          result+=_property1;
          break;

   }

   return result;
}

private int GetLength(int propertyIndex)
{
   switch (propertyIndex)
   {
     case 0:
        return _property1;
     case 1: 
        return _property2;
     case 2:
        return _property3;
   }
    return -1;
}

private String GetString(int propertyIndex)
{
   int startIndex = GetStartIndex(propertyIndex);
   int length = GetLength(propertyIndex);
   byte[] result = new byte[length];
   Array.Copy(_data,startIndex,result,0,length);

   return Encoding.UTF8.GetString(result);

}

so the getter looks like this:

public String Property1
{
   get{ return GetString(0);}
}

The setter is in the same spirit: copy the original data into two arrays (from 0 to startIndex, and from startIndex+length to the end), create a new array from the 3 arrays (dataAtStart+NewData+EndData), and store the new string's length in the appropriate length field.
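
The setter is only described in words, so here is a rough sketch of the same idea. It is written in Java rather than C#, uses an int[] for the lengths instead of individual byte fields, and all names are invented for illustration; it is not the original implementation.

```java
import java.nio.charset.StandardCharsets;

class PackedStrings {
    private byte[] data = new byte[0];   // all UTF-8 encoded strings, back to back
    private final int[] lengths;         // encoded length of each slot, in bytes

    PackedStrings(int slots) {
        lengths = new int[slots];
    }

    // Offset of a slot = sum of the lengths of all slots before it.
    private int startIndex(int slot) {
        int start = 0;
        for (int i = 0; i < slot; i++) {
            start += lengths[i];
        }
        return start;
    }

    String get(int slot) {
        return new String(data, startIndex(slot), lengths[slot], StandardCharsets.UTF_8);
    }

    // Rebuilds the array as dataAtStart + newData + endData.
    void set(int slot, String value) {
        byte[] newData = value.getBytes(StandardCharsets.UTF_8);
        int start = startIndex(slot);
        int end = start + lengths[slot];
        byte[] merged = new byte[data.length - lengths[slot] + newData.length];
        System.arraycopy(data, 0, merged, 0, start);
        System.arraycopy(newData, 0, merged, start, newData.length);
        System.arraycopy(data, end, merged, start + newData.length, data.length - end);
        data = merged;
        lengths[slot] = newData.length;
    }

    public static void main(String[] args) {
        PackedStrings p = new PackedStrings(3);
        p.set(0, "abc");
        p.set(2, "héllo");
        p.set(1, "xy");
        p.set(0, "longer");   // replacing a slot shifts the slots after it
        System.out.println(p.get(0) + " " + p.get(1) + " " + p.get(2));
    }
}
```

Every set allocates a new array, which matches the description above: writes are comparatively expensive, reads are a slice plus a decode.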

I was still not happy with the memory saved and the very hard labor of the manual implementation for each property, so I built an in-memory compressed paging system that uses the amazingly fast QuickLZ to compress a full page. This gave me a lot of control over the time-memory tradeoff (which is essentially the size of the page).

The compression rate for my use-case (compared with the more efficient byte[] store) approaches 50% (!). I used a page size of approx 10 strings per page and grouped similar properties together (which tend to have similar data). This added an additional overhead of 10-20% (on top of the encoding/decoding pass which is still required). The paging mechanism caches recently accessed pages up to a configurable size. Even without compression this implementation allows you to set a fixed factor on the overhead for each page. The major downside of my current implementation of the page cache is that with compression it is not thread-safe (without it there is no such problem).

If you're interested in the compressed paging mechanism let me know (I've been looking for an excuse to open source it).

How to write a code with expiration date?

29 votes

Hi - just had this idea for something that I'd love to be able to use:

Let's say I have to fix a bug and I decide to write an ugly code line that fixes the immediate problem - but only because I promise myself that I will soon find the time to perform a proper refactoring.

I want to be able to somehow mark that code line as "Expired in" and add a date - so that if the code is compiled some time after that date there will be a compilation error/warning with a proper message.

Any suggestions? It must be possible to perform - maybe using some complicated #IF or some options in visual studio? I'm using VS 2005 - mainly for C#.

Thanks!

[EDIT]: Wow - never expected this question to raise so much interest :) Thank you all for your answers and for turning this into an interesting debate. I know it's hard to justify using anything like this - and I probably won't use it - but sometimes, when you have to ship a version YESTERDAY and you find yourself compromising on a patchy fix instead - you want to force yourself to fix it in the near future.

I chose MartinStettner's suggestion as the answer because it met my needs - no error on runtime - only during compilation, no need to define new types just for this goal - and it's not limited to a scope of an entire method. Cheers!

You could write comment lines in the form

// Expires on 2011/07/01

and add a prebuild step which does a solution-wide replace of these lines by something like

#error Code expired on 2011/07/01

for all lines that contain a date before the current day. For this prebuild step you would need to write a short program (probably using regular expressions and some date comparison logic).
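
The core of such a program is a per-line rewrite. The sketch below shows only that part, in Java, assuming the exact comment format shown above; the class and method names are hypothetical, and reading and writing the solution's files is left out.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExpiryCheck {
    // Matches a whole line of the form "// Expires on yyyy/MM/dd".
    private static final Pattern MARKER =
            Pattern.compile("^\\s*// Expires on (\\d{4}/\\d{2}/\\d{2})\\s*$");
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("uuuu/MM/dd");

    // Returns the line unchanged, or an #error directive if its date is before 'today'.
    static String rewriteLine(String line, LocalDate today) {
        Matcher m = MARKER.matcher(line);
        if (m.matches() && LocalDate.parse(m.group(1), FORMAT).isBefore(today)) {
            return "#error Code expired on " + m.group(1);
        }
        return line;
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2011, 8, 1);   // fixed date so the demo is repeatable
        System.out.println(rewriteLine("// Expires on 2011/07/01", today));
        System.out.println(rewriteLine("// Expires on 2012/01/01", today));
    }
}
```

A real step would pass LocalDate.now() and would probably work on a copy of the sources, so that the expiry comments themselves are not overwritten.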

This step could also be performed by a VS macro, which allows for easier access to all files of the solution but has the disadvantage that it must be installed and run on all VS installations where your project is compiled.

What's the advantage of having public static inner classes of an interface/class?

Asked on Tue, 22 Mar 2011 by DR java
29 votes

I've noticed the following code pattern while browsing through some sources of my project's 3rd party libraries:

public interface MyInterface {
    public static class MyClass1 implements MyInterface { ... }
    public static class MyClass2 implements MyInterface { ... }
    public static class MyClass3 implements MyInterface { ... }
}

Or this one:

public class MyBaseClass {
    public static class MyClass1 extends MyBaseClass { ... }
    public static class MyClass2 extends MyBaseClass { ... }
    public static class MyClass3 extends MyBaseClass { ... }
}

Real life examples:

  • SwingX: org.jdesktop.swingx.decorator.HighlightPredicate (Source)
  • Substance: org.pushingpixels.substance.api.renderers.SubstanceDefaultTableCellRenderer (Source)

What's the advantage of having a code structure like this?

My first thought was "aggregation", but the same thing could be achieved using plain old packages. So when/why is it better to use public inner classes instead of a package?

I think the reason is aggregation; maybe the classes are also not worth making top-level classes. I do this sometimes if something is too small to justify a package (to separate it from others) but the corresponding classes should only be used within the context of the top-level class. In my opinion this is a design decision.

The decorator pattern may be a nice example: decorators can be applied to the top-level class but may be so simple that they're not worth being top-level classes themselves. You can easily show the ownership by making them inner classes.

That's not that visible at first glance with packages. You directly see the dependent class/interface.

Further it's possible to access the private fields of a class, this could be useful and is more fine-grained than the package private scope.
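
A minimal sketch of that last point (Range and Printer are invented names, not taken from SwingX or Substance): the static nested class reads private fields that another class in the same package could not.

```java
class NestedClassDemo {
    public static void main(String[] args) {
        Range r = new Range(1, 5);
        // The qualified name Range.Printer shows the ownership at the use site.
        System.out.println(new Range.Printer().print(r));
    }
}

class Range {
    private final int low;    // private: invisible to other classes in the package
    private final int high;

    Range(int low, int high) {
        this.low = low;
        this.high = high;
    }

    // Static nested class: grouped under Range and allowed to read its private fields.
    public static class Printer {
        public String print(Range r) {
            return "[" + r.low + ", " + r.high + "]";
        }
    }
}
```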

Running a Haskell program on the Android OS

27 votes

Forenote: This is an extension of the thread started on /r/haskell

Lets start with the facts:

  • Android is one awesome Operating System
  • Haskell is the best programming language on the planet

Therefore, clearly, combining them would make Android development that much better. So essentially I would just like to know how I can write Haskell programs for the Android OS. My question is:

How can I get a Haskell program to execute/run on the Android OS?

P.S. Ignore the jokes above because this is an honest question and I really would like to see this happen.

How you do it is by first getting a Haskell compiler which can target C with the android NDK which comes with a GCC port for ARM architectures. JHC can trivially do this with a very small inf style file which describes platform (word size, c-compiler, etc) I've done this with the Wii homebrew dev kit and it was quite easy. However jhc still has some stability issues with complex code such as using a monad transformer stack with IO but jhc has been improving a lot over the last 6 months. There is only one person working jhc I just wished more people could help him.

The other option is to build an "unregistered" port of GHC targeting ndk gcc, this is a lot more involved process because GHC is not a true cross-compiler at the moment and you need to understand the build system what parts you need to change. Another option is NHC which can cross-compile to C, like GHC you need to build nhc targeting a C compiler, NHC does not have many Haskell extensions like GHC.

Once you have Haskell compiler targeting NDK GCC, you will need write bindings to either android NDK JNI glue code framework (android 2.3) or you must write JNI glue code between Java-C-Haskell, the former option is the easier solution and if I remember correctly might actually be backwards compatible with Android 2.*

Once you have this you must build Haskell code as shared library or static library which gets linked into the NDK java glue code (which itself is a shared library). As far as I'm aware you can not officially run native executables on android. You could probably do it with a rooted phone, thus I assume this means you can not distribute native executables on the app store even when the NDK gcc port can generate native executables just fine. This also probably kills the option for using LLVM unless you can get the NDK JNI working with LLVM

The biggest hurdle isn't so much getting a Haskell compiler for Android (which is still a big hurdle). The biggest problem is that someone needs to write binding APIs for the NDK libraries, which is a huge task. The situation is worse if you need to write Android UI code, because there are no NDK APIs for that part of the Android SDK. If you want to do Android UI code in Haskell, somebody will have to write Haskell bindings to Java through JNI/C. Unless there is a more automated process for writing binding libraries (I know there are some, they are just not automated enough for me), the chances of someone doing it are quite low.

What good are public variables then?

27 votes

Hi everyone, I'm a total newbie with tons of ?'s in my mind and a lot to experience with C++ yet! There's something I find really confusing, and it's the use of public variables. I've seen tons of code like this:

class Foo {

private:
    int m_somePrivateVar;

public:
    void setThatPrivateVar (int const & new_val) {
        m_somePrivateVar = new_val;
    }

    int getThatPrivateVar (void) const {
        return m_somePrivateVar;
    }

};

Why would anyone hide that variable and implement accessors and mutators when nothing is done in them beyond assigning the new value just as it was received (no range checking etc.) or returning the value just as it is? Well, I've heard some reasons, and some of them are convincing in some cases, but imagine implementing a huge class in such a manner, with a lot of variables that do not need any checking and stuff! Let me ask it this way: when do you use public variables? Do you use them at all?

Thanks in advance.

By hiding the variable and adding methods now, the class designer allows for inserting arbitrary code into those methods in the future without breaking tons of code that use the attributes directly.
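As a minimal sketch of that point, assume a hypothetical `Thermostat` class (the name, the members, and the 5 to 35 range are all invented for illustration). Its setter could have started life as a bare assignment; a range check is added later, and code that calls `setTargetCelsius` compiles unchanged:

```cpp
#include <stdexcept>

// Hypothetical class for illustration; none of these names come from the question.
class Thermostat {
public:
    Thermostat() : m_targetCelsius(20) {}

    // Imagine the first version of this setter was a plain assignment.
    // The range check below was added afterwards; callers did not change.
    void setTargetCelsius(int value) {
        if (value < 5 || value > 35) {
            throw std::out_of_range("target temperature out of range");
        }
        m_targetCelsius = value;
    }

    int targetCelsius() const { return m_targetCelsius; }

private:
    int m_targetCelsius;
};
```

Had `m_targetCelsius` been public, every direct assignment in client code would have had to be found and rewritten to get the same check.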

Also note that providing a lot of accessor/mutator methods is generally a sign that your class design needs another look for possible improvement. Class methods should implement actual logic, not just provide access to each member.

I use public variables only in struct form. For example, I might have a database table that represents a string->value mapping, where value is a composite data structure. I'd just write a structure and use for example std::map<std::string, MyStruct> to represent the database table. I don't need to actually do work on the data, merely be able to look it up and make use of it when required.

As noted in a couple of comments, even structs can often benefit from judicious use of methods, for example a couple of common constructors to keep the members sanely initialized, a clear function to reuse the structure, etc.
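A rough sketch of what that might look like follows. The fields of `MyStruct`, the `Table` typedef, and the `scoreOr` helper are assumptions made up for this example; only `std::map<std::string, MyStruct>` comes from the answer itself.

```cpp
#include <map>
#include <string>

// Hypothetical row type: public data members, plus a few convenience methods.
struct MyStruct {
    int id;
    double score;
    std::string label;

    // Constructors keep the members sanely initialized.
    MyStruct() : id(0), score(0.0) {}
    MyStruct(int id_, double score_, const std::string& label_)
        : id(id_), score(score_), label(label_) {}

    // Resets every member so the object can be reused.
    void clear() { *this = MyStruct(); }
};

// In-memory stand-in for the string -> value table described above.
typedef std::map<std::string, MyStruct> Table;

// Returns the stored score for key, or fallback if the key is absent.
double scoreOr(const Table& table, const std::string& key, double fallback) {
    Table::const_iterator it = table.find(key);
    return it == table.end() ? fallback : it->second.score;
}
```

The members stay public because the struct is just a record; the methods only remove repetitive initialization and reset code from the call sites.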

How can you flip website upside down in IE ? (for the April 1st)

27 votes

We are planning an April 1st prank in our office, and want to flip our corporate website upside down for several hours tomorrow :)

My patch works everywhere except IE... Can anyone help?

<script type="text/javascript">
   document.body.style.MozTransform = 'rotate(180deg)';
   document.body.style['-webkit-transform'] = 'rotate(180deg)';
</script>

A slightly simpler version for IE (no matrix stuff):

body {
  filter: progid:DXImageTransform.Microsoft.BasicImage(rotation=2);
}

Redefinition allowed in C but not in C++?

26 votes

Why does this code work in C but not in C++?

int i = 5;
int i; // but if I write int i = 5; again I get error in C also

int main(){

  // using i
}

Tentative definitions are allowed in C but not in C++.

A tentative definition is any external data declaration that has no storage class specifier and no initializer.

C99 6.9.2/2

A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier or with the storage-class specifier static, constitutes a tentative definition. If a translation unit contains one or more tentative definitions for an identifier, and the translation unit contains no external definition for that identifier, then the behavior is exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0.

So int i is a tentative definition. The C compiler will combine all of the tentative definitions into a single definition of i.

In C++ your code is ill-formed due to the One Definition Rule (Section 3.2/1 ISO C++):

No translation unit shall contain more than one definition of any variable, function, class type, enumeration type or template.


// but if I write int i = 5; again I get error in C also

Because in that case it no longer remains a tentative definition because of the initializer (5).


Just for the sake of information

J.5.11 Multiple external definitions

There may be more than one external definition for the identifier of an object, with or without the explicit use of the keyword extern; if the definitions disagree, or more than one is initialized, the behavior is undefined (6.9.2).

Also check out this excellent post on external variables.

Why isn't LinkedList.Clear() O(1)

25 votes

I was assuming LinkedList.Clear() was O(1) on a project I'm working on, as I used a LinkedList to drain a BlockingQueue in my consumer that needs high throughput, clearing and reusing the LinkedList afterwards.

Turns out that assumption was wrong, as the (OpenJDK) code does this:

    Entry<E> e = header.next;
    while (e != header) {
        Entry<E> next = e.next;
        e.next = e.previous = null;
        e.element = null;
        e = next;
    }

This was a bit surprising. Is there any good reason LinkedList.Clear couldn't simply "forget" its header.next and header.previous members?

The source code in the version I'm looking at (build 1.7.0-ea-b84) in Eclipse has this comment above the loop:

// Clearing all of the links between nodes is "unnecessary", but:
// - helps a generational GC if the discarded nodes inhabit
//   more than one generation
// - is sure to free memory even if there is a reachable Iterator

That makes it reasonably clear why they're doing it, although I agree it's slightly alarming that it turns an O(1) operation into O(n).

Are Exceptions still undesirable in Realtime environment?

25 votes

A couple of years ago I was taught that in real-time applications such as embedded systems or (non-Linux) kernel development, C++ exceptions are undesirable. (Maybe that lesson was from before gcc-2.95.) But I also know that exception handling has become better.

So, are C++-Exceptions in the context of real-time applications in practice

  • totally unwanted?
  • even to be switched off via a compiler switch?
  • or very carefully usable?
  • or handled so well now, that one can use them almost freely, with a couple of things in mind?
  • Does C++0x change anything w.r.t. this?

Update: Does exception handling really require RTTI to be enabled (as one answerer suggested)? Are there dynamic casts involved, or similar?

Exceptions are now well handled, and the strategies used to implement them in fact make them faster than testing return codes, because their cost (in terms of speed) is virtually nil as long as you do not throw any.

However, they do have a cost: code size. Exceptions usually work hand in hand with RTTI, and unfortunately RTTI is unlike any other C++ feature, in that you either activate or deactivate it for the whole project. Once activated, it will generate supplementary code for any class that happens to have a virtual method, thus defying the "you don't pay for what you don't use" mindset.

Also, exception handling itself requires supplementary code.

Therefore the cost of exceptions should be measured not in terms of speed, but in terms of code growth.
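To make the two styles being compared concrete, here is a small sketch. The function names are invented for this example, and it measures nothing; it only shows where the success path carries an explicit test (return codes) and where it does not (exceptions).

```cpp
#include <stdexcept>
#include <string>

// Return-code style: the caller must test the result after every call,
// including on the success path.
bool parseDigitRc(char c, int& out) {
    if (c < '0' || c > '9') return false;
    out = c - '0';
    return true;
}

int sumDigitsRc(const std::string& s, bool& ok) {
    int total = 0;
    ok = true;
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        int d = 0;
        if (!parseDigitRc(s[i], d)) {  // explicit check on every iteration
            ok = false;
            return 0;
        }
        total += d;
    }
    return total;
}

// Exception style: the calling loop has no error check at all;
// a failure leaves the loop by unwinding.
int parseDigitEx(char c) {
    if (c < '0' || c > '9') throw std::invalid_argument("not a digit");
    return c - '0';
}

int sumDigitsEx(const std::string& s) {
    int total = 0;
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        total += parseDigitEx(s[i]);
    }
    return total;
}
```

Whether the second form is actually faster, and how much table and handler code it adds, depends on the compiler and target, so it is worth checking the generated binary on the real platform.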

EDIT:

From @Space_C0wb0y: This blog article gives a small overview and introduces two widespread methods for implementing exceptions: Jumps and Zero-Cost. As the name implies, good compilers now use the Zero-Cost mechanism.

The Wikipedia article on Exception Handling talks about the two mechanisms used. The Zero-Cost mechanism is the Table-Driven one.

EDIT:

From @Vlad Lazarenko, whose blog I referenced above: the presence of a thrown exception might prevent a compiler from inlining code and from keeping values in registers.

Alternative to Eclipse for C and C++ development?

25 votes

I have been using Eclipse for C and C++ development for some time. Unfortunately Eclipse has its faults (speed, the crappy integrated console, and some bugs that pop up from time to time).

For C++ development Qt Creator is a very good choice, but I need something for both C and C++.

I don't really need the integration parts of the IDE (I don't need an integrated project manager, compiler or debugger). What I need is code navigation. Eclipse provides a great feature, "callgraph for structure elements", that is unparalleled when I need to modify big crummy code bases (which is what I do most of the time).

Code completion and at least some integrated documentation (doxygen, generic comments before functions, system documentation) are an absolute necessity.

Oh, and the IDE has to be cross-platform.

Is there something other than Eclipse?

Have you tried NetBeans? There is a plugin for C/C++ development.