Best questions in December 2013

Why does CSS work with fake tags?

333 votes

In my class, I was playing around and found out that CSS works with made-up tags.

Example:

<style>
imsocool {
    color:blue;
}
</style>

<body>
    <imsocool>HELLO</imsocool>
</body>

When my professor first saw me using this, he was a bit surprised that made-up tags worked and recommended I simply change all of my made up tags to paragraphs with ID's.

Why doesn't my professor want me to use made-up tags? They work effectively.

Also, why didn't he know that made-up tags exist and work with CSS? Are they uncommon?

Why does CSS work with fake tags?

(Most) browsers are designed to be (to some degree) forward compatible with future additions to HTML. Unrecognised elements are parsed into the DOM, but have no semantics or specialised default rendering associated with them.
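For instance, a minimal sketch of that behaviour: an unknown element is parsed into the DOM and can be styled like any other, but since it has no default rendering you usually have to set display yourself:

<style>
/* unknown elements render as inline by default, so opt in to block layout */
imsocool {
    display: block;
    color: blue;
}
</style>

<imsocool>HELLO</imsocool>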

When a new element is added to the specification, sometimes CSS, JavaScript and ARIA can be used to provide the same functionality in older browsers (and the elements have to appear in the DOM for those languages to be able to manipulate them to add that functionality).

(It should be noted that work is underway to define a means of extending HTML with custom elements, but this work is in its early stages at present, so it should probably be avoided until it has matured.)

Why doesn't my professor want me to use made-up tags?

  • They are not allowed by the HTML specification
  • They might conflict with future standard elements with the same name
  • There is probably an existing HTML element that is better suited to the task

Also: why didn't he know that made-up tags existed and worked with CSS? Are they uncommon?

Yes. People don't use them because they have the above problems.

Unusual shape of a textarea?

230 votes

Usually textareas are rectangular or square, like this:

Usual textarea

But I want a custom-shaped textarea, like this, for example:

Custom-shaped textarea

How is this possible?

Introduction

First, there are many solutions proposed in other posts. I think this one is currently (in 2013) the one compatible with the largest number of browsers, because it doesn't need any CSS3 property. Be careful, though: the method will not work on browsers that don't support contenteditable.

Solution with a div contenteditable

As proposed by @Getz, you can use a div with contenteditable and then shape it with some divs on top of it. Here's an example, with two blocks floating at the upper left and the upper right of the main div:

JSFiddle of the example: http://jsfiddle.net/qgfP6/1/

The result with Firefox 28

As you can see, you have to play a little with the borders if you want the same result as you requested in your post. The main div has the blue border on every side. Next, the red blocks have to sit flush against the main div to hide its top border, and they need borders only on particular sides (bottom and left for the right block, bottom and right for the left one).

Afterwards, you can get the content via JavaScript, for example when the "Submit" button is triggered. And you can handle the rest of the CSS too (font-size, color, etc.) :)

Full code sample

(for history, if JSFiddle is out of service some day):

<div class="parent">
    <div class="block_left"></div>
    <div class="block_right"></div>
    <div class="div2" contenteditable="true">
        "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut..."
    </div>
</div>

and CSS:

.block_left{
    background-color: red;
    height:100px;
    width:100px;
    float:left;
    border-right:2px solid blue;
    border-bottom:2px solid blue;
}

.block_right{
    background-color: red;
    height:100px;
    width:100px;
    float:right;
    border-left:2px solid blue;
    border-bottom:2px solid blue;
}

.div2{
    background-color:white;
    font-size:1.5em;
    border:2px solid blue;
}

.parent{
    height: 200px;
    width:500px;
}
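For completeness, here's a sketch of reading the text back out; it assumes a hypothetical button with id="submit" added next to the markup above:

document.getElementById('submit').addEventListener('click', function () {
    // a contenteditable div exposes its contents through the DOM like any element
    var text = document.querySelector('.div2').textContent;
    console.log(text); // e.g. copy it into a hidden form field before submitting
});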

Why aren't my balls shrinking/disappearing?

196 votes

http://jsfiddle.net/goldrunt/jGL84/42/ this is from line 84 in this JS fiddle. There are 3 different effects which can be applied to the balls by uncommenting lines 141-146. The 'bounce' effect works as it should, but the 'asplode' effect does nothing. Should I include the 'shrink' function inside the asplode function?

// balls shrink and disappear if they touch
var shrink = function(p) {
    for (var i = 0; i < 100; i++) {
        p.radius -= 1;
    }
    function asplode(p) {
        setInterval(shrink(p),100);
        balls.splice(p, 1);
    }
}

Your code has a few problems.

First, in your definition:

var shrink = function(p) {
    for (var i = 0; i < 100; i++) {
        p.radius -= 1;
    }

    function asplode(p) {
         setInterval(shrink(p),100);
        balls.splice(p, 1);
    }
}

asplode is local to the scope inside shrink and is therefore not accessible to the code in update where you are attempting to call it. JavaScript scope is function-based, so update cannot see asplode, which is defined inside shrink. (In your console, you'll see an error like: Uncaught ReferenceError: asplode is not defined.)

You might first try instead moving asplode outside of shrink:

var shrink = function(p) {
    for (var i = 0; i < 100; i++) {
        p.radius -= 1;
    }
}

function asplode(p) {
     setInterval(shrink(p),100);
     balls.splice(p, 1);
}

However, your code has several more problems that are outside the scope of this question:

  • setInterval expects a function. setInterval(shrink(p), 100) causes setInterval to get the return value of the immediately-invoked shrink(p). You probably want

    setInterval(function() { shrink(p) }, 100)
    
  • Your code for (var i = 0; i < 100; i++) { p.radius -= 1; } probably does not do what you think it does. This will immediately run the decrement operation 100 times, and then visually show the result. If you want to re-render the ball at each new size, you will need to perform each individual decrement inside a separate timing callback (like a setInterval operation).

  • .splice expects a numeric index, not an object. You can get the numeric index of an object with indexOf:

    balls.splice(balls.indexOf(p), 1);
    
  • By the time your interval runs for the first time, the balls.splice statement has already happened (about 100ms earlier, to be exact). I assume that's not what you want. Instead, you should have a decrementing function that gets repeatedly called by setInterval and finally performs balls.splice(balls.indexOf(p), 1) once p.radius reaches 0, as in the sketch below.
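Putting those fixes together, a minimal sketch (assuming the global balls array from the fiddle) might look like this:

// shrink p by one unit every 100ms, then remove it once it has vanished
function asplode(p) {
    var timer = setInterval(function () {
        p.radius -= 1;
        if (p.radius <= 0) {
            clearInterval(timer);
            balls.splice(balls.indexOf(p), 1);
        }
    }, 100);
}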

What does (x ^ 0x1) != 0 mean?

157 votes

I came across the following code snippet

if( 0 != ( x ^ 0x1 ) )
     encode( x, m );

What does x ^ 0x1 mean ? Is this some standard technique ?

The XOR operation (^) inverts bit 0. So the expression effectively means: if bit 0 of x is 0, or any other bit of x is 1, then the expression is true.

Conversely, the expression is false exactly when x == 1.

So the test is the same as:

if (x != 1)

and is therefore (arguably) unnecessarily obfuscated.
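A quick check of that equivalence, for illustration:

#include <stdio.h>

int main(void) {
    /* for a few values of x, (x ^ 0x1) != 0 agrees with x != 1 */
    for (int x = 0; x < 4; x++)
        printf("x=%d: (x^0x1)!=0 -> %d, x!=1 -> %d\n",
               x, (x ^ 0x1) != 0, x != 1);
    return 0;
}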

Rails I18n validation deprecation warning

105 votes

I just updated to rails 4.0.2 and I'm getting this warning:

[deprecated] I18n.enforce_available_locales will default to true in the future. If you really want to skip validation of your locale you can set I18n.enforce_available_locales = false to avoid this message.

Is there any security issue in setting it to false?

Important: Make sure your app is not using I18n 0.6.8; it has a bug that prevents the configuration from being set correctly.


Short answer

In order to silence the warning, edit the application.rb file and include the following line inside the Rails::Application body:

config.i18n.enforce_available_locales = true

The possible values are:

  • false: if you
    • want to skip the locale validation
    • don't care about locales
  • true: if you
    • want the application to raise an error if an invalid locale is passed (or)
    • want to default to the new Rails behaviors (or)
    • care about locale validation

Note:

  • The old default behavior corresponds to false, not true.
  • If you are setting the config.i18n.default_locale configuration or other i18n settings, make sure to do it after setting the config.i18n.enforce_available_locales setting.
  • If you use third party gems that include I18n features, setting the variable through the Rails config may not have an effect. In this case, set it directly on I18n using I18n.config.enforce_available_locales (see the Caveat section below).


Example

require File.expand_path('../boot', __FILE__)

# ...

module YourApplication
  class Application < Rails::Application

    # ...

    config.i18n.enforce_available_locales = true
    # or, if one of your gems competes for pre-loading, use
    I18n.config.enforce_available_locales = true

    # ...

  end
end

Long answer

The deprecation warning is now displayed both in Rails 4 (>= 4.0.2) and Rails 3.2 (>= 3.2.14). The reason is explained in this commit.

Enforce available locales

When I18n.config.enforce_available_locales is true we'll raise an I18n::InvalidLocale exception if the passed locale is unavailable.

The default is set to nil which will display a deprecation error.

If set to false we'll skip enforcing available locales altogether (old behaviour).

This has been implemented in the following methods:

  • I18n.config.default_locale=
  • I18n.config.locale=
  • I18n.translate
  • I18n.localize
  • I18n.transliterate

Before this change, if you passed an unknown locale, Rails would silently switch to it if the locale was valid (i.e. if there was a corresponding locale file in the /config/locales folder); otherwise the locale would default to the config.i18n.default_locale configuration (which defaults to :en).

The new version of the I18n gem forces developers to be a little more conscious of locale management.

In the future the behavior will change and if a locale is invalid, the Rails app will raise an error.

In preparation for this change (which may potentially break several applications that until today were relying on silent defaults), the warning forces you to explicitly declare which validation you want to perform during the current transition period.

To restore the previous behavior, simply set the following configuration to false

config.i18n.enforce_available_locales = false

otherwise, set it to true to match the new Rails defaults, or if you want to be more rigid on locale validation and avoid switching to the default in case of an invalid locale.

config.i18n.enforce_available_locales = true

Caveat

  1. If you are setting the config.i18n.default_locale configuration or using any of the previously mentioned methods (default_locale=, locale=, translate, etc.), make sure to do it after setting the config.i18n.enforce_available_locales setting. Otherwise, the deprecation warning will keep popping up. (Thanks Fábio Batista.)

  2. If you use third party gems that include I18n features, setting the variable through the Rails configuration may not have an effect. In fact, the issue is the same as described in the previous point, just a little bit harder to debug.

    This issue is a matter of precedence. When you set the config in your Rails app, the value is not immediately assigned to the I18n gem. Rails stores each config in an internal object, loads the dependencies (Railties and third party gems) and then it passes the configuration to the target classes. If you use a gem (or Rails plugin) that calls any of the I18n methods before the config is assigned to I18n, then you'll get the warning.

    In this case, you need to skip the Rails stack and set the config immediately to the I18n gem by calling

    I18n.config.enforce_available_locales = true
    

    instead of

    config.i18n.enforce_available_locales = true
    

    The issue is easy to prove. Try to generate a new empty Rails app and you will see that setting config.i18n in the application.rb works fine.

    If in your app it does not, there is an easy way to find the culprit. Locate the i18n gem on your system, open the i18n.rb file, and edit the method enforce_available_locales! to include the statement pp caller.
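    For illustration, the temporary edit might look like this (an abridged sketch, not the gem's actual source):

    # lib/i18n.rb, inside the I18n module
    def enforce_available_locales!(locale)
      pp caller  # temporary: print the stacktrace of whoever calls this
      # ... original method body unchanged ...
    end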

    This will cause the method to print the stacktrace whenever it is invoked. By inspecting the stacktrace you will be able to determine which gem is calling it (in my case it was Authlogic).

    ["/Users/weppos/.rvm/gems/ruby-2.0.0-p247@application/gems/i18n-0.6.9/lib/i18n.rb:150:in `translate'",
     "/Users/weppos/.rvm/gems/ruby-2.0.0-p247@application/gems/authlogic-3.1.0/lib/authlogic/i18n/translator.rb:8:in `translate'",
     "/Users/weppos/.rvm/gems/ruby-2.0.0-p247@application/gems/authlogic-3.1.0/lib/authlogic/i18n.rb:79:in `translate'",
     "/Users/weppos/.rvm/gems/ruby-2.0.0-p247@application/gems/authlogic-3.1.0/lib/authlogic/acts_as_authentic/email.rb:68:in `validates_format_of_email_field_options'",
     "/Users/weppos/.rvm/gems/ruby-2.0.0-p247@application/gems/authlogic-3.1.0/lib/authlogic/acts_as_authentic/email.rb:102:in `block in included'",
     "/Users/weppos/.rvm/gems/ruby-2.0.0-p247@application/gems/authlogic-3.1.0/lib/authlogic/acts_as_authentic/email.rb:99:in `class_eval'",
     "/Users/weppos/.rvm/gems/ruby-2.0.0-p247@application/gems/authlogic-3.1.0/lib/authlogic/acts_as_authentic/email.rb:99:in `included'",
     "/Users/weppos/.rvm/gems/ruby-2.0.0-p247@application/gems/authlogic-3.1.0/lib/authlogic/acts_as_authentic/base.rb:37:in `include'",
     "/Users/weppos/.rvm/gems/ruby-2.0.0-p247@application/gems/authlogic-3.1.0/lib/authlogic/acts_as_authentic/base.rb:37:in `block in acts_as_authentic'",
     "/Users/weppos/.rvm/gems/ruby-2.0.0-p247@application/gems/authlogic-3.1.0/lib/authlogic/acts_as_authentic/base.rb:37:in `each'",
     "/Users/weppos/.rvm/gems/ruby-2.0.0-p247@application/gems/authlogic-3.1.0/lib/authlogic/acts_as_authentic/base.rb:37:in `acts_as_authentic'",
     "/Users/weppos/Projects/application/app/models/user.rb:8:in `<class:User>'",
     "/Users/weppos/Projects/application/app/models/user.rb:1:in `<top (required)>'",
    

Why does this C++ snippet compile?

105 votes

I found this in one of my libraries this morning:

static tvec4 Min(const tvec4& a, const tvec4& b, tvec4& out)
{
    tvec3::Min(a,b,out);
    out.w = min(a.w,b.w);
}

I'd expect a compiler error because this method doesn't return anything, and the return type is not void.

The only two things that come to mind are:

  • In the only place where this method is called, the return value isn't being used or stored. (This method was supposed to be void - the tvec4 return type is a copy-and-paste error)

  • a default constructed tvec4 is being created, which seems a bit unlike, oh, everything else in C++.

I haven't found the part of the C++ spec that addresses this. References (ha) are appreciated.

This is undefined behavior from the C++11 draft standard section 6.6.3 The return statement paragraph 2 which says:

[...] Flowing off the end of a function is equivalent to a return with no value; this results in undefined behavior in a value-returning function. [...]

This means that the compiler is not obligated to provide an error or even a warning, usually because this can be difficult to diagnose in every case. We can see this from the definition of undefined behavior in the draft standard, in section 1.3.24, which says:

[...]Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).[...]

Although in this case we can get both gcc and clang to generate a warning using the -Wall flag, which gives me a warning similar to this:

warning: control reaches end of non-void function [-Wreturn-type]

We can turn this particular warning into an error using the -Werror=return-type flag. I also like to use -Wextra -Wconversion -pedantic for my own personal projects.

As ComicSansMS mentions, in Visual Studio this code would generate C4716, which is an error by default; the message I see is:

error C4716: 'Min' : must return a value

and in the case where not all code paths return a value, it would generate C4715, which is a warning.
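Given the question's own diagnosis (the tvec4 return type was a copy-and-paste error), the presumable fix is simply to make the function void:

static void Min(const tvec4& a, const tvec4& b, tvec4& out)
{
    tvec3::Min(a, b, out);
    out.w = min(a.w, b.w);   // the result is delivered through the out parameter
}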

Why is argc not a constant?

86 votes
int main(const int argc, const char* const argv[])

As Effective C++ Item #3 states, "Use const whenever possible", I started thinking: why not make these 'constant' parameters const?

Is there any scenario in which the value of argc is modified in a program?

In this case, history is a factor. C defined these inputs as "not constant", and compatibility with (a good portion of) existing C code was an early goal of C++.

Some UNIX APIs, such as getopt, actually do manipulate argv[], so it can't be made const for that reason also.

(Aside: Interestingly, although getopt's prototype suggests it won't modify argv[] but may modify the strings pointed to, the Linux man page indicates that getopt permutes its arguments, and it appears they know they're being naughty. The man page at the Open Group does not mention this permutation.)
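For reference, the POSIX prototype: the elements of argv are const pointers, but the strings they point to are not.

#include <unistd.h>

int getopt(int argc, char * const argv[], const char *optstring);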

Putting const on argc and argv wouldn't buy much, and it would invalidate some old-school programming practices, such as:

// print out all the arguments:
while (--argc)
    std::cout << *++argv << std::endl;

I've written such programs in C, and I know I'm not alone. I copied the example from somewhere.

How does this print "hello world"?

73 votes

I discovered this oddity:

for (long l = 4946144450195624l; l > 0; l >>= 5)
    System.out.print((char) (((l & 31 | 64) % 95) + 32));

Output:

hello world

How does this work?

The number 4946144450195624 fits in 64 bits; its binary representation is:

 10001100100100111110111111110111101100011000010101000

The program decodes a character from every 5-bit group, from right to left:

 00100|01100|10010|01111|10111|11111|01111|01100|01100|00101|01000
   d  |  l  |  r  |  o  |  w  |     |  o  |  l  |  l  |  e  |  h

5-bit codification

With 5 bits, it is possible to represent 2⁵ = 32 characters. The English alphabet contains 26 letters, which leaves room for 32 - 26 = 6 symbols apart from letters. With this codification scheme you can have all 26 English letters (in one case) and 6 symbols (space being among them).

Algorithm description

The >>= 5 in the for-loop jumps from group to group; then the 5-bit group gets isolated by ANDing the number with the mask 31₁₀ = 11111₂ in the expression l & 31.

Now the code maps the 5-bit value to its corresponding 7-bit ASCII character. This is the tricky part; check the binary representations for the lowercase alphabet letters in the following table:

  ascii   |     ascii     |    ascii     |    algorithm
character | decimal value | binary value | 5-bit codification 
--------------------------------------------------------------
  space   |       32      |   0100000    |      11111
    a     |       97      |   1100001    |      00001
    b     |       98      |   1100010    |      00010
    c     |       99      |   1100011    |      00011
    d     |      100      |   1100100    |      00100
    e     |      101      |   1100101    |      00101
    f     |      102      |   1100110    |      00110
    g     |      103      |   1100111    |      00111
    h     |      104      |   1101000    |      01000
    i     |      105      |   1101001    |      01001
    j     |      106      |   1101010    |      01010
    k     |      107      |   1101011    |      01011
    l     |      108      |   1101100    |      01100
    m     |      109      |   1101101    |      01101
    n     |      110      |   1101110    |      01110
    o     |      111      |   1101111    |      01111
    p     |      112      |   1110000    |      10000
    q     |      113      |   1110001    |      10001
    r     |      114      |   1110010    |      10010
    s     |      115      |   1110011    |      10011
    t     |      116      |   1110100    |      10100
    u     |      117      |   1110101    |      10101
    v     |      118      |   1110110    |      10110
    w     |      119      |   1110111    |      10111
    x     |      120      |   1111000    |      11000
    y     |      121      |   1111001    |      11001
    z     |      122      |   1111010    |      11010

Here you can see that the ASCII characters we want to map begin with the 7th and 6th bits set (11xxxxx₂) (except for space, which only has the 6th bit on). You could OR the 5-bit codification with 96 (96₁₀ = 1100000₂) and that should be enough to do the mapping, but that wouldn't work for space (darn space!).

Now we know that special care has to be taken to process space at the same time as the other characters. To achieve this, the code turns the 7th bit on (but not the 6th) on the extracted 5-bit group with an OR with 64₁₀ = 1000000₂: l & 31 | 64.

So far the 5-bit group is of the form 10xxxxx₂ (space would be 1011111₂ = 95₁₀). If we can map space to 0 without affecting the other values, then we can turn the 6th bit on and that should be all. Here is where the mod 95 part comes into play: space is 1011111₂ = 95₁₀, and under the mod operation ((l & 31 | 64) % 95) only space goes back to 0. After this, the code turns the 6th bit on by adding 32₁₀ = 100000₂ to the previous result, ((l & 31 | 64) % 95) + 32, transforming the 5-bit value into a valid ASCII character.

isolates 5 bits --+            +---- takes 'space' (and only 'space') back to 0
                  |            |
                  v            v
              ((l & 31 | 64) % 95) + 32
                         ^         ^
         turns the       |         |
        7th bit on ------+         +--- turns the 6th bit on
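As a quick sanity check of the formula against the table, take 'h' (group 01000₂ = 8) and space (group 11111₂ = 31):

// 8  | 64 = 72;  72 % 95 = 72;  72 + 32 = 104 -> 'h'
// 31 | 64 = 95;  95 % 95 = 0;    0 + 32 = 32  -> ' '
System.out.println((char) (((8 & 31 | 64) % 95) + 32));  // h
System.out.println((char) (((31 & 31 | 64) % 95) + 32)); // (space)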

The following code does the inverse process: given a lowercase string (max 12 chars), it returns the 64-bit long value that could be used with the OP's code:

public class D {
    public static void main(String... args) {
        String v = "hello test";
        int len = Math.min(12, v.length());
        long res = 0L;
        for (int i = 0; i < len; i++) {
            long c = (long) v.charAt(i) & 31;
            res |= ((((31 - c) / 31) * 31) | c) << 5 * i;
        }
        System.out.println(res);
    }
}    

Use an array as a case statement in switch

69 votes

I am trying to do something like this, i.e., use an array in a switch statement. Is it possible in Java? If it isn't, please explain a possible solution.

boolean[] values = new boolean[4];

values[0] = true;
values[1] = false;
values[2] = false;
values[3] = true;

switch (values) {
    case [true, false, true, false]:
        break;
    case [false, false, true, false]:
        break;
    default:
        break;
}

Try this solution:

boolean[] values = new boolean[4];
values[0] = true;
values[1] = false;
values[2] = false;
values[3] = true;

if (ArrayUtils.isEquals(values, new boolean[] {true, false, true, false})) {
    ...
} else if (ArrayUtils.isEquals(values, new boolean[] {false, false, true, false})) {
    ...
} else {
    ...
}

See docs here.
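If you would rather avoid the Apache Commons dependency, the standard library's java.util.Arrays.equals performs the same comparison:

import java.util.Arrays;

boolean[] values = {true, false, false, true};

if (Arrays.equals(values, new boolean[] {true, false, true, false})) {
    // ...
} else if (Arrays.equals(values, new boolean[] {false, false, true, false})) {
    // ...
} else {
    // ...
}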

How to compare Unicode characters that "look alike"?

69 votes

I ran into a surprising issue.

I loaded a text file in my application, and I have some logic which compares values containing µ.

And I realized that even though the texts are the same, the comparison returns false.

 Console.WriteLine("μ".Equals("µ")); // returns false
 Console.WriteLine("µ".Equals("µ")); // returns true

In the latter line, the character µ was copy-pasted.

However, these might not be the only characters that are like this.

Is there any way in C# to compare the characters which look the same but are actually different?

In many cases, you can normalize both of the Unicode characters to a certain normalization form before comparing them, and they should be able to match. Of course, which normalization form you need to use depends on the characters themselves; just because they look alike doesn't necessarily mean they represent the same character. You also need to consider if it's appropriate for your use case — see Jukka K. Korpela's comment.

For this particular situation, if you refer to the links in Tony's answer, you'll see that the table for U+00B5 says:

Decomposition <compat> GREEK SMALL LETTER MU (U+03BC)

This means U+00B5, the second character in your original comparison, can be decomposed to U+03BC, the first character.

So you'll normalize the characters using full compatibility decomposition, with the normalization forms KC or KD. Here's a quick example I wrote up to demonstrate:

using System;
using System.Text;

class Program
{
    static void Main(string[] args)
    {
        char first = 'μ';
        char second = 'µ';

        // Technically you only need to normalize U+00B5 to obtain U+03BC, but
        // if you're unsure which character is which, you can safely normalize both
        string firstNormalized = first.ToString().Normalize(NormalizationForm.FormKD);
        string secondNormalized = second.ToString().Normalize(NormalizationForm.FormKD);

        Console.WriteLine(first.Equals(second));                     // False
        Console.WriteLine(firstNormalized.Equals(secondNormalized)); // True
    }
}

For details on Unicode normalization and the different normalization forms refer to System.Text.NormalizationForm and the Unicode spec.

A lambda's return type can be deduced by the return value, so why can't a function's?

64 votes
#include <iostream>

int main(){

    auto lambda = [] {
        return 7;
    };

    std::cout << lambda() << '\n';

}

This program compiles and prints 7.
The return type of the lambda is deduced to be int, based on the return value 7.


Why isn't this possible with ordinary functions?

#include <iostream>

auto function(){
    return 42;
}

int main(){

    std::cout << function() << '\n';
}

error: ‘function’ function uses ‘auto’ type specifier without trailing return type

C++14 will have that feature. You can test it with new versions of GCC or clang by setting the -std=c++1y flag.

Live example

In addition to that, in C++14 you can also use decltype(auto) (mirroring the decltype(auto) semantics of variables) to have a function deduce its return type using decltype semantics.

An example would be forwarding functions, for which decltype(auto) is particularly useful:

#include <utility> // std::forward

template<typename function_type, typename... arg_types>
decltype(auto) do_nothing_but_forward(function_type func, arg_types&&... args) {
    return func(std::forward<arg_types>(args)...);
}

With the use of decltype(auto), you mimic the actual return type of func when called with the specified arguments. There's no more duplication of code in the trailing return type, which was frustrating and error-prone in C++11.
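A quick usage sketch of the forwarder above (triple is a made-up example function):

int triple(int x) { return 3 * x; }

int main() {
    // the return type is deduced as int, exactly as if triple(14) were called directly
    return do_nothing_but_forward(triple, 14) == 42 ? 0 : 1;
}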

Concatenation of Strings and characters

49 votes

The following statements,

String string = "string";   

string = string +((char)65) + 5;
System.out.println(string);

produce the output stringA5.


The following however,

String string = "string";

string += ((char)65) + 5;
System.out.println(string);

produce string70.

Where is the difference?

You see this behavior as a result of the combination of operator precedence and string conversion.

JLS 15.8.1 states:

If only one operand expression is of type String, then string conversion (§5.1.11) is performed on the other operand to produce a string at run time.

Therefore the right hand operands in your first expression are implicitly converted to string: string = string + ((char)65) + 5;

For the second expression, however, string += ((char)65) + 5;, the += compound assignment operator has to be considered along with +. Since += binds more weakly than +, the + operator is evaluated first. There we have a char and an int, which results in binary numeric promotion to int. Only then is += evaluated, but by that time the result of the expression involving the + operator has already been computed.
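Spelled out, the two statements expand differently:

// first version: evaluated left to right, with string conversion at each step
// ("string" + 'A') + 5  ->  "stringA" + 5  ->  "stringA5"
String a = "string" + ((char) 65) + 5;

// second version: the whole right-hand side is evaluated first, numerically
// "string" + (((char) 65) + 5)  ->  "string" + 70  ->  "string70"
String b = "string";
b += ((char) 65) + 5;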

Different results between gcc and clang when compiling a rather simple c++11 program

47 votes

I'm trying to understand whether the different behavior exposed by gcc vs. clang in the output of this simple C++11 program is due to a bug in clang (Xcode 5.0.2, OS X 10.8.5). The code is as follows:

#include <iostream>

int main() {


    int matrix[][3]{{1,2,3}, {4,5,6}, {7,8,9}};
    auto dyn_matrix = new int[3][3]{{1,2,3}, {4,5,6}, {7,8,9}};

    std::cout << matrix[0][1] << std::endl;
    std::cout << dyn_matrix[0][1] << std::endl;

    return 0;   
}

As shown, I'm trying to use uniform initialization to initialize an anonymous (resp. named) multidimensional array of size 3x3. When compiling with gcc 4.7 from MacPorts the expected output is obtained:

$g++-mp-4.7 -std=c++11 dyn_matrix.cpp -o dyn_matrix 
$ ./dyn_matrix
2
2
$

Conversely, in case clang is used the output reads:

$ clang++ -std=c++11 -stdlib=libc++ dyn_matrix.cpp -o dyn_matrix_clang
$ ./dyn_matrix_clang 
2
4
$  

In this case the result is (apparently) wrong. clang --version reports:

Apple LLVM version 5.0 (clang-500.2.75) (based on LLVM 3.3svn)
Target: x86_64-apple-darwin12.5.0
Thread model: posix

Who's to blame? me, gcc or clang?

UPDATE Dec 11th, 2013: The bug should have been fixed in r196995. Unfortunately, we still do not know how long it will take before Apple updates the version of clang that ships with Xcode.

UPDATE Dec 9th, 2013: I submitted a bug report on the LLVM bugzilla platform. It has indeed been recognized as a bug, a patch is currently under review, see http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20131209/095099.html.

Thanks.

Update: Thanks to Faisal Vali and Richard Smith, this bug has been corrected in Clang ToT; see the test file introduced by the commit.


According to §8.5.1 [dcl.init.aggr] it appears that Clang is wrong:

11/ Braces can be elided in an initializer-list as follows. If the initializer-list begins with a left brace, then the succeeding comma-separated list of initializer-clauses initializes the members of a subaggregate; it is erroneous for there to be more initializer-clauses than members. If, however, the initializer-list for a subaggregate does not begin with a left brace, then only enough initializer-clauses from the list are taken to initialize the members of the subaggregate; any remaining initializer-clauses are left to initialize the next member of the aggregate of which the current subaggregate is a member. [ Example:

float y[4][3] = {
    { 1, 3, 5 },
    { 2, 4, 6 },
    { 3, 5, 7 },
};

is a completely-braced initialization: 1, 3, and 5 initialize the first row of the array y[0], namely y[0][0], y[0][1], and y[0][2]. Likewise the next two lines initialize y[1] and y[2]. The initializer ends early and therefore y[3]s elements are initialized as if explicitly initialized with an expression of the form float(), that is, are initialized with 0.0. In the following example, braces in the initializer-list are elided; however the initializer-list has the same effect as the completely-braced initializer-list of the above example,

float y[4][3] = {
    1, 3, 5, 2, 4, 6, 3, 5, 7
};

The initializer for y begins with a left brace, but the one for y[0] does not, therefore three elements from the list are used. Likewise the next three are taken successively for y[1] and y[2]. —end example ]

Which I think applies because of §5.3.4 [expr.new]:

15/ A new-expression that creates an object of type T initializes that object as follows:

  • If the new-initializer is omitted, the object is default-initialized (§8.5); if no initialization is performed, the object has indeterminate value.
  • Otherwise, the new-initializer is interpreted according to the initialization rules of §8.5 for direct initialization.

Java benchmarking - why is the second loop faster?

44 votes

I'm curious about this.

I wanted to check which function was faster, so I create a little code and I executed a lot of times.

public static void main(String[] args) {

        long ts;
        String c = "sgfrt34tdfg34";

        ts = System.currentTimeMillis();
        for (int k = 0; k < 10000000; k++) {
            c.getBytes();
        }
        System.out.println("t1->" + (System.currentTimeMillis() - ts));

        ts = System.currentTimeMillis();
        for (int i = 0; i < 10000000; i++) {
            Bytes.toBytes(c);
        }
        System.out.println("t2->" + (System.currentTimeMillis() - ts));

    }

The "second" loop is faster, so, I thought that Bytes class from hadoop was faster than the function from String class. Then, I changed the order of the loops and then c.getBytes() got faster. I executed many times, and my conclusion was, I don't know why, but something happen in my VM after the first code execute so that the results become faster for the second loop.

This is a classic Java benchmarking issue. HotSpot/JIT/etc. will compile your code as you use it, so it gets faster during the run.

Run around the loop at least 3000 times (10000 on a server or on 64-bit) first, then do your measurements.
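A minimal warm-up sketch along those lines (no substitute for a proper harness such as JMH):

// warm up both paths so the JIT compiles them before any timing starts
for (int i = 0; i < 10000; i++) {
    c.getBytes();
    Bytes.toBytes(c);
}

long t = System.currentTimeMillis();
for (int i = 0; i < 10000000; i++) {
    c.getBytes();
}
System.out.println("t1->" + (System.currentTimeMillis() - t));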

Why is a = (a+b) - (b=a) a bad choice for swapping two integers?

41 votes

I stumbled into this code for swapping two integers without using a temporary variable or the use of bitwise operators.

int main(){

    int a=2,b=3;
    printf("a=%d,b=%d",a,b);
    a=(a+b)-(b=a);
    printf("\na=%d,b=%d",a,b);
    return 0;
}

But I think this code has undefined behavior in the swap statement a = (a+b) - (b=a); as it does not contain any sequence points to determine the order of evaluation.

My question is: Is this an acceptable solution to swap two integers?

No. This is not acceptable. This code invokes undefined behavior, because the ordering of the operations on b is not defined. In the expression

a=(a+b)-(b=a);  

it is not certain whether b gets modified first or its value gets used in the expression (a+b), because of the lack of a sequence point.
See what the standard says:

C11: 6.5 Expressions:

If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.84)1.

Read c-faq question 3.8 and this answer for a more detailed explanation of sequence points and undefined behavior.


1. Emphasis is mine.
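The unambiguous alternative remains the boring one, a temporary variable:

#include <stdio.h>

int main(){

    int a=2,b=3;
    printf("a=%d,b=%d",a,b);
    int tmp = a;   /* each assignment is fully sequenced: no undefined behavior */
    a = b;
    b = tmp;
    printf("\na=%d,b=%d",a,b);
    return 0;
}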

sign changes when going from int to float and back

39 votes

Consider the following code, which is an SSCCE of my actual problem:

#include <iostream>

int roundtrip(int x)
{
    return int(float(x));
}

int main()
{
    int a = 2147483583;
    int b = 2147483584;
    std::cout << a << " -> " << roundtrip(a) << '\n';
    std::cout << b << " -> " << roundtrip(b) << '\n';
}

The output on my computer (Xubuntu 12.04.3 LTS) is:

2147483583 -> 2147483520
2147483584 -> -2147483648

Note how the positive number b ends up negative after the roundtrip. Is this behavior well-specified? I would have expected int-to-float round-tripping to at least preserve the sign correctly...

Hm, on ideone, the output is different:

2147483583 -> 2147483520
2147483584 -> 2147483647

Did the g++ team fix a bug in the meantime, or are both outputs perfectly valid?

Your program is invoking undefined behavior because of an overflow in the conversion from floating-point to integer. What you see is only the usual symptom on x86 processors.

The float value nearest to 2147483584 is 2³¹ exactly (the conversion from integer to floating-point usually rounds to the nearest, which can be up, and is up in this case. To be specific, the behavior when converting from integer to floating-point is implementation-defined; most implementations define rounding as being “according to the FPU rounding mode”, and the FPU's default rounding mode is round-to-nearest).

Then, while converting from the float representing 2³¹ to int, an overflow occurs. This overflow is undefined behavior. Some processors raise an exception, others saturate. The IA-32 instruction cvttsd2si typically generated by compilers happens to always return INT_MIN in case of overflow, regardless of whether the float is positive or negative.

You should not rely on this behavior even if you know you are targeting an Intel processor: when targeting x86-64, compilers can emit, for the conversion from floating-point to integer, sequences of instructions that take advantage of the undefined behavior to return results other than what you might otherwise expect for the destination integer type.
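If you need a well-defined result, one option is to range-check before converting back; this sketch saturates on overflow (a policy choice, not the only possible one):

#include <climits>

int safe_roundtrip(int x)
{
    float f = static_cast<float>(x);
    // static_cast<float>(INT_MAX) rounds up to 2^31, which is out of range for int;
    // negative values cannot overflow here since float(INT_MIN) is exactly -2^31
    if (f >= static_cast<float>(INT_MAX))
        return INT_MAX;   // saturate instead of invoking undefined behavior
    return static_cast<int>(f);
}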

Declaring type of pointers?

38 votes

I just read that we need to give the type of pointers while declaring them in C (or C++), i.e.:

int *point ;

As far as I know, pointers store the addresses of variables, and an address occupies the same amount of memory whatever the type may be. So, why do we need to declare the type?

Type-safety. If you don't know what p is supposed to point to, then there'd be nothing to prevent category errors like

*p = "Nonsense";
int i = *p;

Static type checking is a very powerful tool for preventing all kinds of errors like that.

C and C++ also support pointer arithmetic, which only works if the size of the target type is known.
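A minimal illustration of that: incrementing a pointer advances it by the size of the pointee, which the compiler can only know from the declared type.

#include <stdio.h>

int main(void) {
    int arr[3] = {10, 20, 30};
    int *p = arr;
    p++;                      /* advances by sizeof(int) bytes */
    printf("%d\n", *p);       /* prints 20 */

    char *c = (char *) arr;
    c++;                      /* advances by exactly 1 byte */
    return 0;
}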

an address occupies the same amount of memory whatever the type may be

That's true for today's popular platforms. But there have been platforms for which that wasn't the case. For example, a pointer to a multi-byte word could be smaller than a pointer to a single byte, since it doesn't need to represent the byte's offset within the word.

How to safely read a line from an std::istream?

38 votes

I want to safely read a line from an std::istream. The stream could be anything, e.g., a connection on a Web server or something processing files submitted by unknown sources. There are many answers that start with the moral equivalent of this code:

void read(std::istream& in) {
    std::string line;
    if (std::getline(in, line)) {
        // process the line
    }
}

Given the possibly dubious source of in, using the above code would lead to a vulnerability: a malicious agent may mount a denial of service attack against this code using a huge line. Thus, I would like to limit the line length to some rather high value, say 4 million chars. While a few large lines may be encountered, it isn't viable to allocate a buffer for each file and use std::istream::getline().

How can the maximum size of the line be limited, ideally without distorting the code too badly and without allocating large chunks of memory up front?

You could write your own version of std::getline with a maximum-number-of-characters parameter; call it getline_n or something.

#include <string>
#include <iostream>

template<typename CharT, typename Traits, typename Alloc>
auto getline_n(std::basic_istream<CharT, Traits>& in, std::basic_string<CharT, Traits, Alloc>& str, std::streamsize n) -> decltype(in) {
    std::ios_base::iostate state = std::ios_base::goodbit;
    bool extracted = false;
    const typename std::basic_istream<CharT, Traits>::sentry s(in, true);
    if(s) {
        try {
            str.erase();
            typename Traits::int_type ch = in.rdbuf()->sgetc();
            for(; ; ch = in.rdbuf()->snextc()) {
                if(Traits::eq_int_type(ch, Traits::eof())) {
                    // eof spotted, quit
                    state |= std::ios_base::eofbit;
                    break;
                }
                else if(Traits::eq_int_type(ch, Traits::to_int_type(in.widen('\n')))) {
                    // newline found: consume it but don't store it, quit
                    extracted = true;
                    in.rdbuf()->sbumpc();
                    break;
                }
                else if(str.size() == n) {
                    // maximum number of characters met, quit
                    extracted = true;
                    in.rdbuf()->sbumpc();
                    break;
                }
                else if(str.max_size() <= str.size()) {
                    // string too big
                    state |= std::ios_base::failbit;
                    break;
                }
                else {
                    // character valid
                    str += Traits::to_char_type(ch);
                    extracted = true;
                }
            }
        }
        catch(...) {
            in.setstate(std::ios_base::badbit);
        }
    }

    if(!extracted) {
        state |= std::ios_base::failbit;
    }

    in.setstate(state);
    return in;
}

int main() {
    std::string s;
    getline_n(std::cin, s, 10); // maximum of 10 characters
    std::cout << s << '\n';
}

Might be overkill though.

Break on NaN in JavaScript

36 votes

Is there any modern browser that raises exceptions on NaN propagation (i.e. multiplying or adding a number to NaN), or that can be configured to do so?

Silent NaN propagation is an awful and insidious source of bugs, and I'd love to be able to detect them early, even at a performance penalty.


Here's an example bug that use strict, jshint et al. wouldn't pick up:

object = new MyObject();
object.position.x = 0;
object.position.y = 10;

// ... lots of code

var newPosition = object.position + 1; // <- this is clearly an error, however it
                                       //    fails silently, rather than loudly

newPosition *= 2;                      // <- this doesn't raise any errors either

To answer the question as asked:

Is there any modern browser that raises exceptions on NaN propagation (i.e. multiplying or adding a number to NaN), or that can be configured to do so?

No. JavaScript is a very forgiving language and doesn't care if you want to multiply Math.PI by 'potato' (hint: it's NaN). It's just one of the bad parts (or good parts, depending on your perspective) of the language that we developers have to deal with.

Addressing the bug that (presumably) has you asking this question: using getters and setters on your objects is one solid way to enforce numeric values and keep you from making mistakes like this.
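For instance, a sketch of a setter that fails loudly (the class and property are made up for illustration):

function MyObject() {
    var x = 0;
    Object.defineProperty(this, 'x', {
        get: function () { return x; },
        set: function (value) {
            if (typeof value !== 'number' || isNaN(value)) {
                throw new TypeError('x must be a real number, got: ' + value);
            }
            x = value;
        }
    });
}

var o = new MyObject();
o.x = 10;     // fine
o.x = o + 1;  // throws immediately instead of silently propagating the bug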

Finding the cause of a memory leak in Ruby

35 votes

I've discovered a memory leak in my Rails code - that is to say, I've found what code leaks but not why it leaks. I've reduced it down to a testcase that doesn't require Rails:

require 'csspool'
require 'ruby-mass'

def report
    puts 'Memory ' + `ps ax -o pid,rss | grep -E "^[[:space:]]*#{$$}"`.strip.split.map(&:to_i)[1].to_s + 'KB'
    Mass.print
end

report

# note I do not store the return value here
CSSPool::CSS::Document.parse(File.new('/home/jason/big.css'))

ObjectSpace.garbage_collect
sleep 1

report

ruby-mass supposedly lets me see all the objects in memory. CSSPool is a CSS parser based on racc. /home/jason/big.css is a 1.5MB CSS file.

This outputs:

Memory 9264KB

==================================================
 Objects within [] namespace
==================================================
  String: 7261
  RubyVM::InstructionSequence: 1151
  Array: 562
  Class: 313
  Regexp: 181
  Proc: 111
  Encoding: 99
  Gem::StubSpecification: 66
  Gem::StubSpecification::StubLine: 60
  Gem::Version: 60
  Module: 31
  Hash: 29
  Gem::Requirement: 25
  RubyVM::Env: 11
  Gem::Specification: 8
  Float: 7
  Gem::Dependency: 7
  Range: 4
  Bignum: 3
  IO: 3
  Mutex: 3
  Time: 3
  Object: 2
  ARGF.class: 1
  Binding: 1
  Complex: 1
  Data: 1
  Gem::PathSupport: 1
  IOError: 1
  MatchData: 1
  Monitor: 1
  NoMemoryError: 1
  Process::Status: 1
  Random: 1
  RubyVM: 1
  SystemStackError: 1
  Thread: 1
  ThreadGroup: 1
  fatal: 1
==================================================

Memory 258860KB

==================================================
 Objects within [] namespace
==================================================
  String: 7456
  RubyVM::InstructionSequence: 1151
  Array: 564
  Class: 313
  Regexp: 181
  Proc: 113
  Encoding: 99
  Gem::StubSpecification: 66
  Gem::StubSpecification::StubLine: 60
  Gem::Version: 60
  Module: 31
  Hash: 30
  Gem::Requirement: 25
  RubyVM::Env: 13
  Gem::Specification: 8
  Float: 7
  Gem::Dependency: 7
  Range: 4
  Bignum: 3
  IO: 3
  Mutex: 3
  Time: 3
  Object: 2
  ARGF.class: 1
  Binding: 1
  Complex: 1
  Data: 1
  Gem::PathSupport: 1
  IOError: 1
  MatchData: 1
  Monitor: 1
  NoMemoryError: 1
  Process::Status: 1
  Random: 1
  RubyVM: 1
  SystemStackError: 1
  Thread: 1
  ThreadGroup: 1
  fatal: 1
==================================================

You can see the memory going way up. Some of the counters go up, but no objects specific to CSSPool are present. I used ruby-mass's "index" method to inspect the objects that have references like so:

Mass.index.each do |k,v|
    v.each do |id|
        refs = Mass.references(Mass[id])
        puts refs if !refs.empty?
    end
end

But again, this doesn't give me anything related to CSSPool, just gem info and such.

I've also tried outputting "GC.stat"...

puts GC.stat
CSSPool::CSS::Document.parse(File.new('/home/jason/big.css'))
ObjectSpace.garbage_collect
sleep 1
puts GC.stat

Result:

{:count=>4, :heap_used=>126, :heap_length=>138, :heap_increment=>12, :heap_live_num=>50924, :heap_free_num=>24595, :heap_final_num=>0, :total_allocated_object=>86030, :total_freed_object=>35106}
{:count=>16, :heap_used=>6039, :heap_length=>12933, :heap_increment=>3841, :heap_live_num=>13369, :heap_free_num=>2443302, :heap_final_num=>0, :total_allocated_object=>3771675, :total_freed_object=>3758306}

As I understand it, if an object is not referenced and garbage collection happens, then that object should be cleared from memory. But that doesn't seem to be what's happening here.

I've also read about C-level memory leaks, and since CSSPool uses Racc which uses C code, I think this is a possibility. I've run my code through Valgrind:

valgrind --partial-loads-ok=yes --undef-value-errors=no --leak-check=full --fullpath-after= ruby leak.rb 2> valgrind.txt

Results are here. I'm not sure if this confirms a C-level leak, as I've also read that Ruby does things with memory that Valgrind doesn't understand.

Versions used:

  • Ruby 2.0.0-p247 (this is what my Rails app runs)
  • Ruby 1.9.3-p392-ref (for testing with ruby-mass)
  • ruby-mass 0.1.3
  • CSSPool 4.0.0 from here
  • CentOS 6.4 and Ubuntu 13.10

It looks like you are entering The Lost World here. I don't think the problem is with the C bindings in racc either.

Ruby memory management is both elegant and cumbersome. It stores objects (named RVALUEs) in so-called heaps, each approximately 16KB in size. On a low level, an RVALUE is a C struct, containing a union of the different standard Ruby object representations.

So, heaps store RVALUE objects, whose size is at most 40 bytes. For objects such as String, Array, Hash, etc., this means that small objects can fit in the heap, but as soon as they reach a threshold, extra memory outside of the Ruby heaps is allocated.

This extra memory is flexible; it will be freed as soon as the object gets GC'ed. That's why your testcase with big_var shows the memory up-down behaviour:

def report
  puts 'Memory ' + `ps ax -o pid,rss | grep -E "^[[:space:]]*#{$$}"`
          .strip.split.map(&:to_i)[1].to_s + 'KB'
end
report
big_var = " " * 10000000
report
big_var = nil 
report
ObjectSpace.garbage_collect
sleep 1
report
# ⇒ Memory 11788KB
# ⇒ Memory 65188KB
# ⇒ Memory 65188KB
# ⇒ Memory 11788KB

But the heaps (see GC.stat[:heap_length]) themselves are not released back to the OS once acquired. Look, I'll make a humdrum change to your testcase:

- big_var = " " * 10000000
+ big_var = 1_000_000.times.map(&:to_s)

And, voilà:

# ⇒ Memory 11788KB
# ⇒ Memory 65188KB
# ⇒ Memory 65188KB
# ⇒ Memory 57448KB

The memory is not released back to the OS anymore, because each element of the array I introduced fits the RVALUE size and is stored in the Ruby heap.

If you examine the output of GC.stat after the GC has run, you'll find that the GC.stat[:heap_used] value has decreased as expected. Ruby now has a lot of empty heaps, ready for new objects.
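You can watch this happening directly in the stats; a small sketch using the same keys as the output above:

before = GC.stat
big_var = 1_000_000.times.map(&:to_s)
big_var = nil
ObjectSpace.garbage_collect
after = GC.stat
puts "heap_used:   #{before[:heap_used]} -> #{after[:heap_used]}"
puts "heap_length: #{before[:heap_length]} -> #{after[:heap_length]}"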

Summing up: I don't think the C code leaks. I think the problem is within the base64 representation of a huge image in your CSS. I have no clue what's happening inside the parser, but it looks like the huge string forces the Ruby heap count to increase.

Hope it helps.