Best c++ questions in November 2012

How to avoid overflow in expr. A * B - C * D

136 votes

I need to compute an expression which looks like: A*B - C*D, where their types are: signed long long int A, B, C, D; Each number can be really big (not overflowing its type). While A*B could cause overflow, at same time expression A*B - C*D can be really small. How can I compute it correctly?

For example: MAX * MAX - (MAX - 1) * (MAX + 1) == 1, where MAX = LLONG_MAX - n and n - some natural number.

This seems too trivial I guess. But A*B is the one that could overflow.

You could do the following, without losing precision

A*B - C*D = A(D+E) - (A+F)D
          = AD + AE - AD - DF
          = AE - DF
             ^smaller quantities E & F

E = B - D (hence, far smaller than B)
F = C - A (hence, far smaller than C)

This decomposition can be done further.
As @Gian pointed out, care might need to be taken during the subtraction operation if the type is unsigned long long.

For example, with the case you have in the question, it takes just one iteration,

 MAX * MAX - (MAX - 1) * (MAX + 1)
  A     B       C           D

E = B - D = -1
F = C - A = -1

AE - DF = {MAX * -1} - {(MAX + 1) * -1} = -MAX + MAX + 1 = 1

Malloc vs new -- different padding

74 votes

I'm reviewing someone else's C++ code for our project that uses MPI on high-performance computing (10^5 - 10^6 cores). The code is intended to allow for communications between potentially different machines on different architectures. He's written a comment that says something along the lines of:

We'd normally use new and delete, but here I'm using malloc and free. This is necessary because some compilers will pad the data differently when new is used, leading to errors in transferring data between different platforms. This doesn't happen with malloc.

This does not fit in with anything I know from the standard new vs malloc debate. What is the difference between new/delete and malloc/free? hints at the idea that the compiler could calculate the size of an object differently (but then what is sizeof doing differently?).

malloc & placement new vs. new is a fairly popular question but only talks about new using constructors where malloc doesn't, which isn't relevant to me.

And how does malloc understand alignment? says that memory is guaranteed to be properly aligned with either new or malloc which is what I'd previously thought.

My guess is that he's misdiagnosed his own bug some time in the past and deduced that new and malloc give different amounts of padding, which I think probably isn't true. But I can't find the answer with Google or in any previous question.

Help me, StackOverflow, you're my only hope!

IIRC there's one picky point. malloc is guaranteed to return an address aligned for any standard type. ::operator new(n) is only guaranteed to return an address aligned for any standard type no larger than n, and if T isn't a character type then new T[n] is only required to return an address aligned for T.

But this is only relevant when you're playing implementation-specific tricks like using the bottom few bits of a pointer to store flags, or otherwise relying on the address to have more alignment than it strictly needs.

It doesn't affect padding within the object, which necessarily has exactly the same layout regardless of how you allocated the memory it occupies. So it's hard to see how the difference could result in errors transferring data.

Is there any sign what the author of that comment thinks about objects on the stack or in globals, whether in his opinion they're "padded like malloc" or "padded like new"? That might give clues to where the idea came from.

Maybe he's confused, but maybe the code he's talking about is more than a straight difference between malloc(sizeof(Foo) * n) vs new Foo[n]. Maybe it's more like:

malloc((sizeof(int) + sizeof(char)) * n);


struct Foo { int a; char b; }
new Foo[n];

That is, maybe he's saying "I use malloc", but means "I manually pack the data into unaligned locations instead of using a struct". Actually malloc is not needed in order to manually pack the struct, but failing to realize that is a lesser degree of confusion. It is necessary to define the data layout sent over the wire. Different implementations will pad the data differently when the struct is used.

Could a C++ implementation, in theory, parallelise the evaluation of two function arguments?

52 votes

Given the following function call:

f(g(), h())

since the order of evaluation of function arguments is unspecified (still the case in C++11 as far as I'm aware), could an implementation theoretically execute g() and h() in parallel?

Such a parallelisation could only kick in were g and h known to be fairly trivial (in the most obvious case, accessing only data local to their bodies) so as not to introduce concurrency issues but, beyond that restriction I can't see anything to prohibit it.

So, does the standard allow it? Even if only by the as-if rule?

(In this answer, Mankarse asserts otherwise; however, he does not cite the standard, and my read-through of [] hasn't revealed any obvious wording.)

The requirement comes from [intro.execution]/15:

... When calling a function ... Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function [Footnote: In other words, function executions do not interleave with each other.].

So any execution of the body of g() must be indeterminately sequenced with (that is, not overlapping with) the evaluation of h() (because h() is an expression in the calling function).

The critical point here is that g() and h() are both function calls.

(Of course, the as-if rule means that the possibility cannot be entirely ruled out, but it should never be relevant in a correct program.)

How is "=default" different from "{}" for default constructor and destructor?

36 votes

I originally posted this as a question only about destructors, but now I'm adding consideration of the default constructor. Here's the original question:

If I want to give my class a destructor that is virtual, but is otherwise the same as what the compiler would generate, I can use =default:

class Widget {
   virtual ~Widget() = default;

But it seems that I can get the same effect with less typing using an empty definition:

class Widget {
   virtual ~Widget() {}

Is there any way in which these two definitions behave differently?

Based on the replies posted for this question, the situation for the default constructor seems similar. Given that there is almost no difference in meaning between "=default" and "{}" for destructors, is there similarly almost no difference in meaning between these options for default constructors? That is, assuming I want to create a type where the objects of that type will be both created and destroyed, why would I want to say

Widget() = default;

instead of

Widget() {}


I apologize if extending this question after its original posting is violating some SO rules. Posting an almost-identical question for default constructors struck me as the less desirable option.

This is a completely different question when asking about constructors than destructors.

If your destructor is virtual, then the difference is negligible, as Howard pointed out. However, if your destructor was non-virtual, it's a completely different story. The same is true of constructors.

Using = default syntax for special member functions (default constructor, copy/move constructors/assignment, destructors etc) means something very different from simply doing {}. With the latter, the function becomes "user-provided". And that changes everything.

This is a trivial class by C++11's definition:

struct Trivial
  int foo;

If you attempt to default construct one, the compiler will generate a default constructor automatically. Same goes for copy/movement and destructing. Because the user did not provide any of these member functions, the C++11 specification considers this a "trivial" class. It therefore legal to do this, like memcpy their contents around to initialize them and so forth.


struct NotTrivial
  int foo;

  NotTrivial() {}

As the name suggests, this is no longer trivial. It has a default construct that is user-provided. It doesn't matter if it's empty; as far as the rules of C++11 are concerned, this cannot be a trivial type.


struct Trivial2
  int foo;

  Trivial2() = default;

Again as the name suggests, this is a trivial type. Why? Because you told the compiler to automatically generate the default constructor. The constructor is therefore not "user-provided." And therefore, the type counts as trivial, since it doesn't have a user-provided default constructor.

The = default syntax is mainly there for doing things like copy constructors/assignment, when you add member functions that prevent the creation of such functions. But it also triggers special behavior from the compiler, so it's useful in default constructors/destructors too.

Why can I use auto on a private type?

35 votes

I was somehow surprised that the following code compiles and run (vc2012 & gcc4.7.2)

class Foo {
    struct Bar { int i; };
    Bar Baz() { return Bar(); }

int main() {
    Foo f;
    // Foo::Bar b = f.Baz();  // error
    auto b = f.Baz();         // ok
    std::cout << b.i;

Is it correct that this code compiles fine? And why is it correct? Why can I use auto on a private type, while I can't use its name (as expected)?

The rules for auto are, for the most part, the same as for template type deduction. The example posted works for the same reason you can pass objects of private types to template functions:

template <typename T>
void fun(T t) {}

int main() {
    Foo f;
    fun(f.Baz());         // ok

And why can we pass objects of private types to template functions, you ask? Because only the name of the type is inaccessible. The type itself is still usable, which why you can return it to client code at all.

Why is the != operator not allowed with OpenMP?

34 votes

I was trying to compiled the following code:

#pragma omp parallel shared (j)
   #pragma omp for schedule(dynamic)
   for(i = 0; i != j; i++)
    // do something

I get this error: error: invalid controlling predicate.

I check the openMP reference guide and it says that for the parallel for it "only" allows one of the following operators: < <= > >=.

I don't understand why not allow i != j. I could understand if it was the static schedule, since openMP need to pre-compute the number of iterations assigned to each thread. But i can't understand why this limitation in such case for example. Any clues ?

EDIT: even if I make for(i = 0; i != 100; i++), although I could just put "<" or "<=" .


............ I send it an email to the OpenMP developers about this subject, the answer :

For signed int, the wrap around behavior is undefined. If we allow !=, programmers may get unexpected tripcount. The problem is whether the compiler can generate code to compute a trip count for the loop.

For a simple loop, like:

for( i = 0; i < n; ++i )

the compiler can determine that there are 'n' iterations, if n>=0, and zero iterations if n < 0.

For a loop like:

for( i = 0; i != n; ++i ) 

again, a compiler should be able to determine that there are 'n' iterations, if n >= 0; if n < 0, we don't know how many iterations it has.

For a loop like:

for( i = 0; i < n; i += 2 )

the compiler can generate code to compute the trip count (loop iteration count) as floor((n+1)/2) *if n >= 0*, and 0 if n < 0.

For a loop like:

for( i = 0; i != n; i += 2 )

the compiler can't determine whether 'i' will ever hit 'n'. What if 'n' is an odd number?

For a loop like:

for( i = 0; i < n; i += k )

the compiler can generate code to compute the trip count as floor((n+k-1)/k) *if n >= 0*, and 0 if n < 0, because the compiler knows that the loop must count up; in this case, if k < 0, it's not a legal OpenMP program.

For a loop like:

for( i = 0; i != n; i += k )

the compiler doesn't even know if i is counting up or down. It doesn't know if 'i' will ever hit 'n'. It may be an infinite loop.

Why does stringstream >> change value of target on failure?

32 votes

From Stroustrup's TC++PL, 3rd Edition, Section 21.3.3:

If we try to read into a variable v and the operation fails, the value of v should be unchanged (it is unchanged if v is one of the types handled by istream or ostream member functions).

The following example appears to contradict the above quote. Based on the above quote, I was expecting the value of v to remain unchanged -- but it gets zeroed. What's the explanation for this apparent contradictory behaviour?

#include <iostream>
#include <sstream>

int main( )
    std::stringstream  ss;

    ss  << "The quick brown fox.";

    int  v = 123;

    std::cout << "Before: " << v << "\n";

    if( ss >> v )
        std::cout << "Strange -- was successful at reading a word into an int!\n";

    std::cout << "After: " << v << "\n";

    if( ss.rdstate() & std::stringstream::eofbit  ) std::cout << "state: eofbit\n";
    if( ss.rdstate() & std::stringstream::failbit ) std::cout << "state: failbit\n";
    if( ss.rdstate() & std::stringstream::badbit  ) std::cout << "state: badbit\n";

    return 1;

The output I get using x86_64-w64-mingw32-g++.exe (rubenvb-4.7.2-release) 4.7.2 is:

Before: 123
After: 0
state: failbit


From this reference:

If extraction fails (e.g. if a letter was entered where a digit is expected), value is left unmodified and failbit is set (until C++11)

If extraction fails, zero is written to value and failbit is set. If extraction results in the value too large or too small to fit in value, std::numeric_limits::max() or std::numeric_limits::min() is written and failbit flag is set. (since C++11)

It seems that your compiler is compiling in C++11 mode, which changes the behavior.

The input operator uses the locale facet std::num_get whose get function invokes do_get. For C++11 it's specified to use std::strtoll et. al. type of functions. Before C++11 it apparently used std::scanf style parsing (going by the reference, I don't have access to the C++03 specification) to extract the numbers. The change in behavior is due to this change in parsing the input.

Best way to find an intersection between two arrays?

30 votes

I faced this problem many times during various situations. It is generic to all programming languages although I am comfortable with C or Java.

Let us consider two arrays (or collections):

char[] A = {'a', 'b', 'c', 'd'};
char[] B = {'c', 'd', 'e', 'f'};

How do I get the common elements between the two arrays as a new array? In this case, the intersection of array A and B is char[] c = {'c', 'd'}.

I want to avoid the repeated iteration of one array inside the other array which will increase the execution time by (length of A times length of B) which is too much in the case of huge arrays.

Is there any way we could do a single pass in each array to get the common elements?

Since this looks to me like a string algorithm, I'll assume for a moment that its not possible to sort this sequence (hence string) then you can use Longest Common Sequence algorithm (LCS)

Assuming the input size is constant, then the problem has a complexity of O(nxm), (length of the two inputs)

Can I typically/always use std::forward instead of std::move?

22 votes

I've been watching Scott Meyers' talk on Universal References from the C++ and Beyond 2012 conference and everything makes sense so far. However, an audience member asks a question at around 50 minutes in that I was also wondering about. Meyers says that he does not care about the answer because it is non-idiomatic and would silly his mind, but I'm still interested.

The code presented is as follows:

// Typical function bodies with overloading:
void doWork(const Widget& param)   // copy
  // ops and exprs using param
void doWork(Widget&& param)        // move
  // ops and exprs using std::move(param)

// Typical function implementations with universal reference:
template <typename T>
void doWork(T&& param)             // forward => copy and move
  // ops and exprs using std::forward<T>(param)

The point being that when we take an rvalue reference, we know we have an rvalue so we should std::move it to preserve the fact that it's an rvalue. When we take a universal reference (T&& where T is a deduced type), we want to std::forward to preserve the fact that it may have been an lvalue or an rvalue.

So the question is: Since std::forward preserves whether the value passed into the function was either an lvalue or an rvalue and std::move simply casts its argument to an rvalue, could we just use std::forward everywhere? Would std::forward behave like std::move in all cases where we would use std::move or are there some important differences in behaviour that are missed out by Meyers' generalisation?

I'm not suggesting that anybody should do it because, as Meyers correctly says, it's completely non-idiomatic, but is the following also a valid use of std::move:

void doWork(Widget&& param)         // move
  // ops and exprs using std::forward<Widget>(param)

The two are very different and complementary tools.

  • std::move deduces the argument and unconditionally creates an rvalue expression. This makes sense to apply to an actual object or variable.

  • std::forward takes a mandatory template argument (you must specify this!) and magically creates an lvalue reference or an rvalue expression depending on what the type was (by virtue of adding && and the collapsing rules). This only makes sense to apply to a deduced, templated function argument.

Maybe the following examples illustrate this a bit better:

#include <utility>
#include <memory>
#include <vector>
#include "foo.hpp"

std::vector<std::unique_ptr<Foo>> v;

template <typename T, typename ...Args>
std::unique_ptr<T> make_unique(Args &&... args)
    return std::unique_ptr<T>(new T(std::forward<Args>(args)...));  // #1

int main()
        std::unique_ptr<Foo> p(new Foo('a', true, Bar(1,2,3)));
        v.push_back(std::move(p));                                  // #2

        v.push_back(make_unique<Foo>('b', false, Bar(5,6,7)));      // #3

        Bar b(4,5,6);
        char c = 'x';
        v.push_back(make_unique<Foo>(c, b.ready(), b));             // #4

In situation #2, we have an existing, concrete object p, and we want to move from it, unconditionally. Only std::move makes sense. There's nothing to "forward" here. We have a named variable and we want to move from it.

On the other hand, situation #1 accepts a list of any sort of arguments, and each argument needs to be forwarded as the same value category as it was in the original call. For example, in #3 the arguments are temporary expressions, and thus they will be forwarded as rvalues. But we could also have mixed in named objects in the constructor call, as in situation #4, and then we need forwarding as lvalues.

C++ vs Java: endless loop creating objects only crashes C++

22 votes

This was a question in one of my books (with no answer attached to it), that I've been thinking about for a few days now. Is the answer simply that the C++ code will eventually crash because it is creating a garbage memory cell after each iteration?

Consider the following Java and C++ code fragments, parts of two versions of a GUI based application which collects user preferences and use them to assemble a command and its parameters. The method/function getUserCommandSpecification() returns a string representing the command code and its parameters. The returned string is used to build the required command which is then executed.

Assume the following:

(i) After the creation in the while loop of the Command object (referred by cmd in Java case or pointed by cmd in C++ case), the reference / pointer cmd to the generated object is no more referenced or used.

(ii) The application also defines a class Command along with its method/function execute().

a. Which of the two code versions, detailed below, will eventually crash.
b. Explain why a program version crashes while the other one is not crashing.

Java code

while (true) {
   String commandSpecification = getUserCommandSpecification();
   Command cmd = new Command(commandSpecification);

C++ code

while (true) {
   string commandSpecification = getUserCommandSpecification();
   Command* cmd = new Command(commandSpecification);
   cmd -> execute();

Yes, the C++ version leaks due to new Command(...) with no delete. Of course it could have easily been coded differently to avoid that:

while (true) {
   string commandSpecification = getUserCommandSpecification();
   Command cmd(commandSpecification);
... I'm not sure the example is as instructive as they think.

Life time of an unnamed temporary constructed in if condition expression

21 votes

How Standard defines the life time of a temporary object constructed during evaluation of an if condition expression?

I looked for this information and found something similar in an example to point [10] in $1.9, page 10. (I'm referring here to Final Draft of the new specification.) Yet still it wasn't clear (enough) for me and since Visual C++ acted differently than my understanding of that example I decided to ask.

Please provide proper references to specification.

If you name the object it persists for the entire if (so both true block and false block but is destroyed before if ends).

For example:

if ( MyClass x = f() ) { /* ... */ } else { /* ... */ }

x can be used in both if blocks but gets destroyed before nextInstruction gets called.

But what if you don't name it?

if ( f() ) { /* ... */ } else { /* ... */ }

In my understanding of the referenced part of specification the value returned by f() will be destroyed before execution enters one of the blocks (either for true or for false).

However Visual C++ destroys that temporary object as if it was named. (EDIT: as Tino Didriksen pointed out Visual C++ does work well here. And indeed now I confirm that as well. I must have made a mistake when looking at initial tests results!)

This does matter in some edge cases (lets not discuss here how likely they are or whether it is good to write code that way...).

For example lets have:

class ScopedLock {
  ~ScopedLock() { if ( isLocked() ) unlock(); }

  operator bool() const { return isLocked(); }

  /* ... */


Now if we have code like:

if ( ScopedLock lock = resource.lock() ) { /* ... */ }

we can be sure that when execution enters the true block we own the resource and that it will not be unlocked before we leave that block.

But what if someone wrote it like this:

if ( resource.lock() ) { /* ... */ }

Now it is crucial at which point the destructor for the temporary ScopedLock will be called. Because it determines whether this code is correct or not (in the sens of resource usage). (Again lets skip discussing whether writing such code is bad in general. That is not the point of this question...)

As far as I can tell, Visual C++ is wrong in this regard.

Indeed, a temporary is (almost always) destroyed at the end of the full expression in which it is created:

§12.2/3: [...]Temporary objects are destroyed as the last step in evaluating the full-expression that (lexically) contains the point where they were created

Looking at the definition of a selection statement (if and switch), we can see that the condition is an expression:

  if ( condition ) statement
  if ( condition) statement else statement
  switch ( condition ) statement

  type-specifier-seq declarator = assignment-expression

So any temporaries created in the condition should be destroyed before executing the following statement (unless bound to a reference-to-const).

The behavior when introducing a new name in the condition is specified in §6.4/3:

A name introduced by a declaration in a condition [...] is in scope from its point of declaration until the end of the substatements controlled by the condition.

So in your example, x is in scope in the two branches of the if, and destroyed before evaluating nextInstruction().

Understanding the difference between f() and f(void) in C and C++ once and for all

20 votes

Possible Duplicate:
is f(void) deprecated in modern C and C++

Ok, so I have heard different opinions on this subject and just want to make sure I understand it correctly.

For C++

Declarations void f(); and void f(void); mean precisely the same thing, the function f does not take any parameters. Ditto for definitions.

For C

Declaration void f(void); means that f does not take any parameters.

Declaration void f(); means that function f may or may not have parameters, and if it does, we don't know what kind of parameters those are, or how many there is of them. Note that it is NOT the same as ellipsis, we can't use va_list.

Now here is where things get interesting.

Case 1


void f();


void f(int a, int b, float c)

Case 2


void f();


void f()


What happens at compile time in cases 1 and 2 when we call f with the correct arguments, wrong arguments and no arguments at all? What happens at run time?

Additional question:

If I declare f with arguments, but define it without them, will it make a difference? Should I be able to address the arguments from the function body?

More terminology (C, not C++): a prototype for a function declares the types of its arguments. Otherwise the function does not have a prototype.

void f();                      // Declaration, but not a prototype
void f(void);                  // Declaration and prototype
void f(int a, int b, float c); // Declaration and protoype

Declarations that aren't prototypes are holdovers from pre-ANSI C, from the days of K&R C. The only reason to use an old-style declaration is to maintain binary compatibility with old code. For example, in Gtk 2 there is a function declaration without a prototype -- it is there by accident, but it can't be removed without breaking binaries. The C99 standard comments:

6.11.6 Function declarators

The use of function declarators with empty parentheses (not prototype-format parameter type declarators) is an obsolescent feature.

Recommendation: I suggest compiling all C code in GCC/Clang with -Wstrict-prototypes and -Wmissing-prototypes, in addition to the usual -Wall -Wextra.

What happens

void f(); // declaration
void f(int a, int b, float c) { } // ERROR

The declaration disagrees with the function body! This is actually a compile time error, and it's because you can't have a float argument in a function without a prototype. The reason you can't use a float in an unprototyped function is because when you call such a function, all of the arguments get promoted using certain default promotions. Here's a fixed example:

void f();

void g()
    char a;
    int b;
    float c;
    f(a, b, c);

In this program, a is promoted to int1 and c is promoted to double. So the definition for f() has to be:

void f(int a, int b, double c)

See C99 6.7.6 paragraph 15,

If one type has a parameter type list and the other type is specified by a function declarator that is not part of a function definition and that contains an empty identifier list, the parameter list shall not have an ellipsis terminator and the type of each parameter shall be compatible with the type that results from the application of the default argument promotions.

Answer 1

What happens at compile time in cases 1 and 2 when we call f with the correct arguments, wrong arguments and no arguments at all? What happens at run time?

When you call f(), the parameters get promoted using the default promotions. If the promoted types match the actual parameter types for f(), then all is good. If they don't match, it will probably compile but you will definitely get undefined behavior.

"Undefined behavior" is spec-speak for "we make no guarantees about what will happen." Maybe your program will crash, maybe it will work fine, maybe it will invite your in-laws over for dinner.

There are two ways to get diagnostics at compile-time. If you have a sophisticated compiler with cross-module static analysis capabilities, then you will probably get an error message. You can also get messages for un-prototyped function declarations with GCC, using -Wstrict-prototypes -- which I recommend turning on in all your projects (except for files which use Gtk 2).

Answer 2

If I declare f with arguments, but define it without them, will it make a difference? Should I be able to address the arguments from the function body?

It shouldn't compile.


There are actually two cases in which function arguments are allowed to disagree with the function definition.

  1. It is okay to pass char * to a function that expects void *, and vice versa.

  2. It is okay to pass a signed integer type to a function that expects the unsigned version of that type, or vice versa, as long as the value is representable in both types (i.e., it is not negative, and not out of range of the signed type).


1: It is possible that char promotes to unsigned int, but this is very uncommon.

Is numeric_limits<int>::is_modulo logically contradictory?

18 votes

In another question, the topic of std::numeric_limits<int>::is_modulo came up. But the more I think about it, the more it seems like something is wrong with the spec or with GCC or both.

Let me start with some code:

#include <limits>
#include <iostream>

bool test(int x)
    return x+1 > x;

int main(int argc, char *argv[])
    int big = std::numeric_limits<int>::max();

    std::cout << std::numeric_limits<int>::is_modulo << " ";
    std::cout << big+1 << " ";
    std::cout << test(big) << "\n";

When I compile this with g++ -O3 -std=c++11 (x86_64 GCC 4.7.2), it produces the following output:

1 -2147483648 1

That is, is_modulo is true, one plus INT_MAX is negative, and one plus INT_MAX is greater than INT_MAX.

If you are the sort of person with any realistic chance of answering this question, you already know what happened here. The C++ specification says that integer overflow is Undefined Behavior; the compiler is allowed to assume you do not do that; therefore the argument to x+1 cannot be INT_MAX; therefore the compiler may (and will) compile the test function to return true unconditionally. So far, so good.

However, the C++11 spec also says ( paragraphs 60-61):

static constexpr is_modulo;

True if the type is modulo.222 A type is modulo if, for any operation involving +, -, or * on values of that type whose result would fall outside the range [min(),max()], the value returned differs from the true value by an integer multiple of max() - min() + 1.

On most machines, this is false for floating types, true for unsigned integers, and true for signed integers.

Note that section 5 paragraph (4) still reads, "If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined." There is no mention of is_modulo == true creating an exception.

So it looks to me like the standard is logically contradictory, because integer overflow cannot simultaneously be defined and undefined. Or at the very least, GCC is non-conforming, because it has is_modulo as true even though signed arithmetic most certainly does not wrap around.

Is the standard buggy? Is GCC non-conforming? Am I missing something?

If is_modulo is true for a signed type (e.g. int) that is unchanged by the usual arithmetic conversions, then for any arithmetic operation other than division by zero there is a single correct result in the (mathematical) integers that modulo maps to a single value in the range of the type, and so the implementation is constrained to behave as if the actual result is the true result modulo the range of the type. So if an implementation wants to retain overflow arithmetic as being undefined it must set is_modulo to false.

This was discussed ad nauseam on the gcc mailing list and then under PR 22200 with the eventual conclusion that the value of is_modulo should be false for signed types; the change was made to libstdc++ in April of this year.

Note that in C++03 the language was significantly different: numeric_limits members [lib.numeric.limits.members]

56 - [...] A type is modulo if it is possible to add two positive numbers and have a result that wraps around to a third number that is less.

Given that with undefined behaviour anything is possible, it is arguable that the previous behaviour of libstdc++ (having is_modulo as true) was correct and consistent with the behaviour of g++; the earlier discussions on the linked PR should be read with this in mind.

In C/C++, for an array a, I just learned that (void*)&a == (void*)a. How does that work?

17 votes

So, I always knew that the array "objects" that are passed around in C/C++ just contained the address of the first object in the array.

How can the pointer to the array "object" and it's contained value be the same?

Could someone point me towards more information maybe about how all that works in assembly, maybe.

Short answer: A pointer to an array is defined to have the same value as a pointer to the first element of the array. That's how arrays in C and C++ work.

Pedantic answer:

C and C++ have rvalue and lvalue expressions. An lvalue is something to which the & operator may be applied. They also have implicit conversions. An object may be converted to another type before being used. (For example, if you call sqrt( 9 ) then 9 is converted to double because sqrt( int ) is not defined.)

An lvalue of array type implicitly converts to a pointer. The implicit conversion changes array to &array[0]. This may also be written out explicitly as static_cast< int * >( array ), in C++.

Doing that is OK. Casting to void* is another story. void* is a bit ugly. And casting with the parentheses as (void*)array is also ugly. So please, avoid (void*) a in actual code.

Differences between switch statements in C# and C++

15 votes

I'm just starting out teaching myself C#, and in a tutorial on Switch statements, I read:

The behavior where the flow of execution is forbidden from flowing from one case block to the next is one area in which C# differs from C++. In C++ the processing of case statements is allowed to run from one to another.

Why does it stop after one case statement in C#? If you can use the break statement to stop at any point, is there any reason in C# vs. C++ to having it stop after a match is found? And if you wanted more than one case in C#, would you have to use another Switch statement?

C# has goto casevalue, which has all the benefits of fallthrough but is harder to do by accident.

Example on MSDN

Fastest method for screen capturing on Linux

14 votes

This question is similar to this one

Fastest method of screen capturing

but for linux/X11.

To be more specific, i need a method to capture the pixel images of one window (the programmatic equivalent of alt-print screen in windows) running on a X11 diplay.

Notes and requirements:

1) Even if a new window is placed on top of the window that is being captured, the pixel image should still point to the original application window without any occlusion

2) it is not needed that the application window to be seen by the user, i just need to store the pixel buffers/images for video purposes

other alternatives that i've explored are:

1) xvfb - it works but it does does CPU rendering, which is slow and wasteful of a good GPU

2) x11 inside many lxc - theoretically could work but is complex to setup, and i'm not sure it will scale well with many windows being captured

suggestions and ideas are welcome

This is possible using VirtualGL in a server with hardware acceleration. Basically just configure the server appropiately, then either on the same machine or on a machine in the same network, run

export DISPLAY=<your xvfb display>
vglrun <your_app>

This will have the following advantages:

1) your app will render using virtualGL, which will use the hardware

2) VirtualGL will display your 3D context inside the Xfvb display, which will only render the 2D widgets in CPU

3) configure Xvfb to render to a framebuffer

4) profit!

C++ vector array operator high computational cost?

9 votes

I have always known that the rich abstractions of C++ come with a certain computational overhead but I was under the impression that this overhead would be close to negligible once the correct compiler optimisations were applied. I was curious as to what exactly the magnitude of this overhead would be, so I wrote a simple test to determine this. The test is a templated function which takes a container variable, assigns a value to each element in the container and then sums the values across the container in a separate loop. This process is repeated for a preset number of cycles.

What I found, to my considerable unease, was that the vector implementation took nearly 3 times the standard array implementation. After permuting through a vast selection of compiler optimizations without any success, I decided to bite the bullet and eyeball the assembly code directly to try and see what was causing the time penalty. I included some assembly directives which allowed me to pinpoint exactly where the array indexing operation occurred and examined the code in detail. What I found, to my complete confusion, was that the difference between the vector implementation and the array implementation was utterly insignificant. The assembly code can be found here.

This is the command I used to build the binary:

g++ -O3 vectorArrayOp.cpp -o vectorArrayOp

This is the command I used to build the assembly:

g++ -O3 -DTAGASM vectorArrayOp.cpp -S -o vectorArrayOp.s

This is the output I observe from running the binary:

gmurphy@interloper:Reference$ ./vectorArrayOp 
Duration 0.027678
Duration 0.090212

The results are unchanged when you include the computed value in the stdout stream, I removed them for clarity. My system specifications are as follows (I've seen the same results on my AMD too):

Linux 3.2.0-32-generic x86_64 GNU/Linux
Intel(R) Xeon(R) CPU X5550  @ 2.67GH
g++ (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3

The code follows, I would appreciate if someone could provide me with some insight into why the timings are so different when the assembly is so similar.

#include <vector>
#include <iostream>
#include <sys/time.h>
#ifdef TAGASM
#define ASMTAG(X) asm(X)
#define ASMTAG(X)
enum { DataSize=1024, NumTests=(1<<16) } ;
struct ReturnValue {ReturnValue(float _d, int _t):d(_d), t(_t){} float d; int t;} ;
template <typename Container, typename Type>
ReturnValue runTest(Container &c, Type value)
    int tagValue(0);
    timeval startTime;
    gettimeofday(&startTime, NULL);
    for(int i=0; i<NumTests; i++)
        for(int j=0; j<DataSize; j++)
            c [j] = value ;
        for(int j=0; j<DataSize; j++)
            tagValue += c [j] ;
    timeval endTime;
    gettimeofday(&endTime, NULL);
    float duration((endTime.tv_sec-startTime.tv_sec)+
    //tagValue is returned in case the optimising compiler might try to remove the loops
    return ReturnValue(duration, tagValue) ;
int main()
    int *arrayData = new int [DataSize];
    std::vector <int> vectorData(DataSize, 0) ;
    ReturnValue ad = runTest(arrayData, 1);
    ReturnValue vd = runTest(vectorData, 1);
    std::cout<<"Duration "<<ad.d<<std::endl;
    std::cout<<"Duration "<<vd.d<<std::endl;
    delete [] arrayData;
    return 0 ;

% g++-4.4 -O3 vectorArrayOp.cpp -o vectorArrayOp
% ./vectorArrayOp
Duration 0.008581
Duration 0.008775
% g++-4.5 -O3 vectorArrayOp.cpp -o vectorArrayOp
% ./vectorArrayOp
Duration 0.008634
Duration 0.008588
% g++-4.6 -O3 vectorArrayOp.cpp -o vectorArrayOp
% ./vectorArrayOp
Duration 0.01731
Duration 0.081696
% g++-4.7 -O3 vectorArrayOp.cpp -o vectorArrayOp
% ./vectorArrayOp
Duration 0.008618
Duration 0.008612
% clang++ -O3 vectorArrayOp.cpp -o vectorArrayOp
% ./vectorArrayOp
Duration 0.066484
Duration 0.066435

Based on these results, this is probably a compiler specific performance regression in g++ 4.6.

Redundant static data

6 votes

This question applies to any type of static data. I'm only using int to keep the example simple.

I am reading in a large XML data file containing ints and storing them in a vector<int>. For the particular data I'm using, it's very common for the same value to be repeated consecutively many times.

<Node value="4" count="4000">

The count attribute means that the value is to be repeated x number of times:

for(int i = 0; i < 4000; i++)

It seems like a waste of memory to store the same value repeatedly when I already know that it is going to appear 4000 times in a row. However, I need to be able to index into the vector at any point.

For larger data objects, I know that I can just store a pointers but that would still involve storing 4000 identical pointers in the example above.

Is there any type of strategy to deal with an issue like this?

Use two vectors. The first vector contains the indices, the second one the actual values.

Fill in the indices vector such that the value for all indices between indices[i-1] and indices [i] is in values[i].

Then use binary search on the indices array to locate the position in the values array. Binary search is very efficient (O(log n)), and you will only use a fraction of the memory compared to the original approach.

If you assume the following data:

4000 ints with value "4"
followed by 200 ints with value "3"
followed by 5000 ints with value "10"

You would create an index vector and value vector and fill it like this:

indices = {4000, 4200, 9200}; // indices[i+1] = indices [i] + new_count or 0
values = {4,3,10};

As suggested in the other answers, you should probably wrap this in an operator[].

Concurrent access in SQLite

5 votes

Can SQLite manage concurrent access? I use SQLite with C/C++? If it doesn't support that. Is there any suggestion to support concurrent access in SQLite?

Yes it does as the documentation states here:

SQLite Version 3.0.0 introduced a new locking and journaling mechanism designed to improve concurrency over SQLite version 2 and to reduce the writer starvation problem. The new mechanism also allows atomic commits of transactions involving multiple database files.


SQLite uses POSIX advisory locks to implement locking on Unix. On Windows it uses the LockFile(), LockFileEx(), and UnlockFile() system calls.

And here:

SQLite uses filesystem locks to make sure that only one process and database connection is trying to modify the database at a time. The filesystem locking mechanism is implemented in the VFS layer and is different for every operating system. SQLite depends on this implementation being correct. If something goes wrong and two or more processes are able to write the same database file at the same time, severe damage can result.