Best c++ questions in March 2011

What does T&& mean in C++0x?

34 votes

I've been looking into some of the new features of C++0x and one I've noticed is the double ampersand in declaring variables, like T&& var.

For a start, what is this beast called? (I wish Google would allow us to search for punctuation like this.)

What exactly does it mean?

At first glance, it appears to be a double reference (like the C-style double pointers T** var), but I'm having a hard time thinking of a use case for that.

It declares an rvalue reference (standards proposal doc).

Here's an introduction to rvalue references: http://www.artima.com/cppsource/rvalue.html.
Here's a fantastic in-depth look at rvalue references by one of Microsoft's standard library developers: http://blogs.msdn.com/b/vcblog/archive/2009/02/03/rvalue-references-c-0x-features-in-vc10-part-2.aspx.

The biggest difference between a C++03 reference (now called an lvalue reference in C++0x) is that it can bind to an rvalue like a temporary without having to be const. Thus, this syntax is now legal:

T&& r = T();

rvalue references primarily provide for the following:

Move semantics. A move constructor and move assignment operator can now be defined that takes an rvalue reference instead of the usual const-lvalue reference. A move functions like a copy, except it is not obliged to keep the source unchanged; in fact, it usually modifies the source such that it no longer owns the moved resources. This is great for eliminating extraneous copies, especially in standard library implementations.

For example, a copy constructor might look like this:

foo(foo const& other)
{
    this->length = other.length;
    this->ptr = new int[other.length];
    copy(other.ptr, other.ptr + other.length, this->ptr);
}

If this constructor was passed a temporary, the copy would be unnecessary because we know the temporary will just be destroyed; why not make use of the resources the temporary already allocated? In C++03, there's no way to prevent the copy as we cannot determine we were passed a temporary. In C++0x, we can overload a move constructor:

foo(foo&& other)
{
   this->length = other.length;
   this->ptr = other.ptr;
   other.length = 0;
   other.ptr = nullptr;
}

Notice the big difference here: the move constructor actually modifies its argument. This would effectively "move" the temporary into the object being constructed, thereby eliminating the unnecessary copy.

The move constructor would be used for temporaries and for non-const lvalue references that are explicitly converted to rvalue references using the std::move function (it just performs the conversion). The following code both invoke the move constructor for f1 and f2:

foo f1((foo())); // Move a temporary into f1; temporary becomes "empty"
foo f2 = std::move(f1); // Move f1 into f2; f1 is now "empty"

Perfect forwarding. rvalue references allow to properly forward arguments for templated functions. Take for example this factory function:

template <typename T, typename A1>
std::shared_ptr<T> factory(A1& a1)
{
    return std::shared_ptr<T>(new T(a1));
}

If we called factory<foo>(5), the template argument will be deduced to be int&, which will not bind to a literal 5, even if foo's constructor takes an int. Well, we could instead use A1 const&, but what if foo takes the constructor argument by non-const reference? To make a truly generic factory function, we would have to overload factory on A1& and on A1 const&. That might be fine if factory takes 1 parameter, but each additional parameter would multiply the necessary overload set by 2. That's very quickly unmaintainable.

rvalue references fix this problem by allowing the standard library to define a forward function, something like this:

template <typename T>
struct identity
{
    typedef T type;
};

template <typename T>
T&& forward(typename identity<T>::type& a)
{
    return (T&&)a;
}

Which would enable us to define the factory function like this:

template <typename T, typename A1>
std::shared_ptr<T> factory(A1&& a1)
{
    return std::shared_ptr<T>(new T(std::forward<A1>(a1)));
}

Now the argument's rvalue/lvalue-ness is preserved when passed to T's constructor. That means that if factory is called with an rvalue, T's constructor is called with an rvalue. If factory is called with an lvalue, T's constructor is called with an lvalue. The improved factory function works because of one special rule:

When the function parameter type is of the form T&& where T is a template parameter, and the function argument is an lvalue of type A, the type A& is used for template argument deduction

Thus, we can use factory like so:

auto p1 = factory<foo>(foo()); // calls foo(foo&&)
auto p2 = factory<foo>(*p1);   // calls foo(foo const&)

Important rvalue reference properties:

  • For overload resolution, lvalues prefer binding to lvalue references and rvalues prefer binding to rvalue references. Hence why temporaries prefer invoking a move constructor / move assignment operator over a copy constructor / assignment operator.
  • rvalue references will implicitly bind to rvalues and to temporaries that are the result of an implicit conversion. i.e. float f = 0f; int&& i = f; is well formed because float is implicitly convertible to int; the reference would be to a temporary that is the result of the conversion.
  • Named rvalue references are lvalues. Unnamed rvalue references are rvalues. This is important to understand why the std::move call is necessary in: foo&& r = foo(); foo f = std::move(r);

Is this code behavior defined?

33 votes

What does the following code print to the console?

map<int,int> m;
m[0] = m.size();
printf("%d", m[0]);

Possible answers:

  1. The behavior of the code is not defined since it is not defined which statement m[0] or m.size() is being executed first by the compiler. So it could print 1 as well as 0.
  2. It prints 0 because the right hand side of the assignment operator is executed first.
  3. It prints 1 because the operator[] has the highest priority of the complete statement m[0] = m.size(). Because of this the following sequence of events occurs:

    • m[0] creates a new element in the map
    • m.size() gets called which is now 1
    • m[0] gets assigned the previously returned (by m.size()) 1
  4. The real answer?, which is unknown to me^^

Thx in advance for your answers...
Woltan

I believe it's unspecified whether 0 or 1 is stored in m[0], but it's not undefined behavior.

The LHS and the RHS can occur in either order, but they're both function calls, so they both have a sequence point at the start and end. There's no danger of the two of them, collectively, accessing the same object without an intervening sequence point.

The assignment is actual int assignment, not a function call with associated sequence points, since operator[] returns T&. That's briefly worrying, but it's not modifying an object that is accessed anywhere else in this statement, so that's safe too. It's accessed within operator[], of course, where it is initialized, but that occurs before the sequence point on return from operator[], so that's OK. If it wasn't, m[0] = 0; would be undefined too!

However, the order of evaluation of the operands of operator= is not specified by the standard, so the actual result of the call to size() might be 0 or 1 depending which order occurs.

The following would be undefined behavior, though. It doesn't make function calls and so there's nothing to prevent size being accessed (on the RHS) and modified (on the LHS) without an intervening sequence point:

int values[1];
int size = 0;

(++size, values[0] = 0) = size;
/*     fake m[0]     */  /* fake m.size() */

When will C++0x be finished?

32 votes

Ok, this is the first question I've asked and I didn't know you couldn't answer your own question.

Answer:

March 25, 2011. :-) I'm not kidding, it's official. Well, at least as far as the committee is concerned.

As Howard already said in the question, the final draft was completed on March 25, 2011.

There will now be some months of editorial changes, voting and ISO red tape before it officially becomes a standard, but on the 25th, the standards committee themselves officially signed off on it.

Sources:

https://www.ibm.com/developerworks/mydeveloperworks/blogs/5894415f-be62-4bc0-81c5-3956e82276f3/entry/the_c_0x_standard_has_been_approved_to_ship23?lang=en

http://herbsutter.com/2011/03/25/we-have-fdis-trip-report-march-2011-c-standards-meeting/

http://twitter.com/#!/sdt_intel/status/51328822066417665

and of course, Howard Hinnant, who asked the question, is on the committee as well, so he's not making it up.

(Only posting this as a "real" answer because Howard apparently was unable to answer his own question)

What good are public variables then?

27 votes

Hi everyone, I'm a total newbie with tons of ?'s in my mind and a lot to experience with C++ yet! There's been something which I find really confusing and it's the use of public variables, I've seen tons of code like this:

class Foo {

private:
    int m_somePrivateVar;

public:
    void setThatPrivateVar (int const & new_val) {
        m_somePrivateVar = new_val;
    }

    int getThatPrivateVar (void) const {
        return m_somePrivateVar;
    }

};

Why would anyone hide that variable and implement accessors and mutators when there's nothing done in them more than assigning the new value just as it got received (no range checking etc.) or returning the value without just as it is? Well I've heard some reasons and some of them are convincing in some cases, but imagine implementing a huge class in such a manner for with a lot of variables which do not need any checking and stuff! Let me ask you this way, When do you use public variables? Do you use that at all?

Thanks in advance.

By hiding the variable and adding methods now, the class designer allows for inserting arbitrary code into those methods in the future without breaking tons of code that use the attributes directly.

Also note that providing a lot of accessor/mutator methods is generally a sign that your class design needs another look for possible improvement. Class methods should implement actual logic, not just provide access to each member.

I use public variables only in struct form. For example, I might have a database table that represents a string->value mapping, where value is a composite data structure. I'd just write a structure and use for example std::map<std::string, MyStruct> to represent the database table. I don't need to actually do work on the data, merely be able to look it up and make use of it when required.

As noted in a couple comments, even structs can often benefit from judicial use of methods, for example a couple of common constructors to keep the members sanely initialized, a clear function to reuse the structure, etc.

Redefinition allowed in C but not in C++?

26 votes

Why does this code work in C but not in C++?

int i = 5;
int i; // but if I write int i = 5; again I get error in C also

int main(){

  // using i
}

Tentative definition is allowed in C but not in C++.

A tentative definition is any external data declaration that has no storage class specifier and no initializer.

C99 6.9.2/2

A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier or with the storage-class specifier static, constitutes a tentative definition. If a translation unit contains one or more tentative definitions for an identifier, and the translation unit contains no external definition for that identifier, then the behavior is exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0.

So int i is a tentative definition. The C compiler will combine all of the tentative definitions into a single definition of i.

In C++ your code is ill-formed due to the One Definition Rule (Section 3.2/1 ISO C++)

No translation unit shall contain more than one definition of any variable, function, class type, enumeration type or template.


// but if I write int i = 5; again I get error in C also

Because in that case it no longer remains a tentative definition because of the initializer (5).


Just for the sake of information

J.5.11 Multiple external definitions

There may be more than one external definition for the identifier of an object, with or without the explicit use of the keyword extern; if the definitions disagree, or more than one is initialized, the behavior is undefined (6.9.2).

Also check out this excellent post on external variables.

Are Exceptions still undesirable in Realtime environment?

25 votes

A couple of years ago I was tought, that in real-time applications such as Embedded Systems or (Non-Linux-)Kernel-development C++-Exceptions are undesirable. (Maybe that lesson was from before gcc-2.95). But I also know, that Exception Handling has become better.

So, are C++-Exceptions in the context of real-time applications in practice

  • totally unwanted?
  • even to be switched off via via compiler-switch?
  • or very carefully usable?
  • or handled so well now, that one can use them almost freely, with a couple of things in mind?
  • Does C++0x change anything w.r.t. this?

Update: Does exception handling really require RTTI to be enabled (as one answerer suggested)? Are there dynamic casts involved, or similar?

Exceptions are now well-handled, and the strategies used to implement them make them in fact faster than testing return code, because their cost (in terms of speed) is virtually null, as long as you do not throw any.

However they do cost: in code-size. Exceptions usually work hand in hand with RTTI, and unfortunately RTTI is unlike any other C++ feature, in that you either activate or deactivate it for the whole project, and once activated it will generated supplementary code for any class that happens to have a virtual method, thus defying the "you don't pay for what you don't use mindset".

Also, it does require supplementary code for its handling.

Therefore the cost of exceptions should be measured not in terms of speed, but in terms of code growth.

EDIT:

From @Space_C0wb0y: This blog article gives a small overview, and introduces two widespread methods for implementing exceptions Jumps and Zero-Cost. As the name implies, good compilers now use the Zero-Cost mechanism.

The Wikipedia article on Exception Handling talk about the two mechanisms used. The Zero-Cost mechanism is the Table-Driven one.

EDIT:

From @Vlad Lazarenko whose blog I had referenced above, the presence of exception thrown might prevent a compiler from inlining and optimizing code in registers.

Alternative to Eclipse for C and C++ development?

25 votes

I have been using Eclipse for C and C++ development for some time. Unfortunately Eclipse has it's faults (speed, the crappy integrated console, and some bugs that pop up from time to time).

For C++ development Qt Creator is a very good choice, but I need something for both C and C++.

I don't really need the integration parts of the IDE (I don't need an integrated project manager, compiler or debuger). What I need is code navigation. Eclipse provides a great feature "callgraph for structure elements" that is unparalleled when I need to modify big crummy code bases (which is what I do most of the time).

Code completion and at least some integration documentation (doxygen, generic comments before functions, system documentation) is an absolute necessity.

Oh and the IDE has to be crossplatform.

Is there something other then Eclipse?

Have you tried NetBeans? There is a plugin for C/C++ development.

Interesting Programming interview question

24 votes

The following was an interview questions that I am practicing to solve, however my solution was not correct. Did someone has a clue?

Two integer variables L and R are initially equal to 0 and 1 respectively (i.e. L = 0 and R = 1).

The values of these variables can be manipulated using the following operations: operation 'L' is the assignment L = 2*L-R, operation 'R' is the assignment R = 2*R-L.

Given an integer N we want to find what is the shortest sequence of such operations necessary to make either L or R equal to N. For example, given N = 21, the following sequence of operations:

(initially L = 0, R = 1),
L = 2*L-R (makes L = -1, R = 1),
L = 2*L-R (makes L = -3, R = 1),
R = 2*R-L (makes L = -3, R = 5),
L = 2*L-R (makes L = -11, R = 5),
R = 2*R-L (makes L = -11, R = 21)

makes one of the variables (namely R) equal to N (i.e. after the last operation N = R = 21). This sequence consists of 5 operations and there is no shorter sequence that would make either L or R equal 21.

Write a function int interval_unfold_count(int n);

that returns the length of the shortest sequence of operations resulting in either L or R being equal to N. For example, given N = 21 the function should return 5.

The function should return -1 if no such sequence exists. The function should return 0 if no operations are required.

int interval_unfold_count ( int n ) {
    long l = 0, r = 1;
    long sequence = 0, i = 0;
    if (l == n || r == n )
        return 0;
    while (true){
        i++;
        if (l == n || r == n )
            break;
        if (abs(n-abs(r)) > abs(n-abs(l)) ){
            r = 2*r - l;
            sequence++;
        }
        else if (abs(n-abs(l)) >= abs(n-abs(r))) {
            l = 2*l - r;
            sequence++;
        }
        if ( sequence > 32){ //in each sequence the absolute of l or r will always be larger than previous value
        // so at most after 32 space it will use all the 32 bit space of the integer.
        //after more than 32 iterations, it will cause overflow
            sequence = -1;
            break;
        }
    }
    return sequence; 
}

I found out that my solution logic was wrong. Does someone has any idea how to solve this? What is the correct if/else condition to determine whether we should multiply the first or the second value?

Thanks !

If the value is 0 or 1 the solution is 0.

otherwise

For positive values I think the solution is the binary length of value - 1. So:

value = 21

value - 1 = 20

binary = 10100

solution 5

You can note that, reading the binary from greatest to smallest, it's RLRLL (each 1 is R, each 0 is L). You have to subtract 1 because R is = 1 at the start. This will work for positive numbers (I think/hope) (tested 0... 100000).

For negative number do as previsiously said, but instead of value - 1 use -value. So

value = -21

-value = 21

binary = 10101

solution = 5

For negative you can't use the binary representation you calc with -value to find the right R and L combination. You have to do something different (but the solution is still right, they asked for the number of steps, not for the exact sequence :-) )

This code will find the "right" combination of L and R for a range of number, checking if my idea is right. It's C# because I'm better with C# than with C++ and, in the end, implementing the algorithm is trivial when you know what you are looking for. It WON'T return the asked length (but it's very easy, simply return bitValue.Length without checking it). For negative numbers it's based on the fact that they are represented in 2 complement in memory. The Convert.ToString converts a number in binary form. For negative values I truncate the "complemented" 1s at the beginning, until I find a 0 (-21 = 11111111111111111111111111101011, truncating the beginning 1s you get 01011)

for (int myInt = -100000; myInt < 1000000; myInt++)
{
    string binValue = myInt != 0 && myInt != 1 ? Convert.ToString(myInt - 1, 2) : String.Empty;

    var length = myInt != 0 && myInt != 1 ? Math.Ceiling(Math.Log(myInt > 0 ? myInt : -myInt + 1, 2)) : 0;

    if (myInt < 0)
    {
        binValue = binValue.Substring(binValue.IndexOf('0'));
    }

    if (length != binValue.Length)
    {
        throw new Exception("Log formula is wrong");
    }

    if (myInt < 0 && binValue.Length != Convert.ToString(-myInt, 2).Length)
    {
        throw new Exception("Negative value formula is wrong");
    }

    int l = 0, r = 1;

    for (int i = binValue.Length - 1; i >= 0; i--)
    {
        if (binValue[i] == '1')
        {
            r = r * 2 - l;
        }
        else
        {
            l = l * 2 - r;
        }
    }

    if (myInt > 0 && r != myInt || myInt <= 0 && l != myInt)
    {
        throw new Exception("General formula is wrong");
    }
}

So in the end the value to return is:

var length = myInt != 0 && myInt != 1 ? 
    Math.Ceiling(Math.Log(myInt > 0 ? myInt : -myInt + 1, 2)) : 
    0;

The reasoning

  • We know that L <= 0 an R >= 1 always (this for how they are built)
  • at the n "round", R <= 2^n (easy to demonstrate, always use the R formula)
  • if myInt = 0 or 1, the solution is trivial (0)
  • if myInt > 1, clearly you'll need at least log2(myint) rounds (rounded up). So myInt = 2 at least 1 round, myInt = 3 or 4 at least 2 rounds....
  • my solution has that number of rounds, so it's optimal (at least for positive numbers... I'll ignore negative numbers).

Now a little demonstration on the fact that my solution is correct, and that being long exactly length step, it's optimal. I'm doing it for myInt > 1, because it's easier with positive numbers. For negative numbers it's a little more complex but I'm quite sure the steps are nearly the same.

We first want to demonstrate that R(n)-L(n) = 2^n at any step

With Mathematical Induction (P(0) is true and for any P(n-1), P(n-1) => P(n), then P(n) is true)

1. R(0)-L(0) = 2^0 = 1 True!

2. If we have that R(n-1)-L(n-1) = 2^(n-1)

If we select the rule R

3a. L(n) = L(n-1)
3b. R(n) = 2R(n-1) - L(n-1) = R(n-1) + R(n-1) - L(n-1) = R(n-1) + 2^(n-1) 
3c. R(n) - L(n) = R(n-1) + 2^(n-1) - L(n-1) = 2^(n-1) + 2^(n-1) = 2^n True!
  1. is True, 2. => 3abc. so it's True.

the same can be done with rule L

Now we want to show that when we, at step n, select the rule R, R increases by 2^(n-1)

So, for example, R(5) = R(4) + 2^4

R(n) = 2R(n-1) - L(n-1) = R(n-1) + R(n-1) - L(n-1) = R(n-1) + 2^(n-1)

R(n) - R(n-1) = R(n-1) + 2^(n-1) - R(n-1) = 2^(n-1) True!

the same can be done with rule L

So now we know that if at step n we select the rule R, R will increase by 2^(n-1)

We can see that there is a relationship between a searched for number > 1, R and the binary (base 2) representation of the number.

if intNum is the "target" value of R, we need to increase R by

x = intNum - R(0) = intNum - 1

So we have to find the "right" combination of the R rules that will increase R(0) by x.

Any integer base 10 number > 0 can be represented in base 2 in this way:

[0-1] * 2^0 + [0-1] * 2^1 + [0-1] * 2^2 + [0-1] * 2^3 + ... + [0-1] * 2^n

(where [0-1] means that you can have a 0 or a 1]).

and we know that each time we select the R rule we increase R by 2^(n-1). The relationship is clear:

  • For each 1 of the binary representation of x, we select the R rule.
  • For each 0 of the binary representation of x, we select the L rule.

Done. (ok... this isn't exactly a mathematical demonstration, but I'm quite sure this is enough for you that are reading this :-) )

What are some interesting C/C++ libraries to play around with?

22 votes

I'm looking for a few new libraries and for C and C++. In the past most of the time I "accidently" stumbled across a few - and most of them found good use in projects I worked on.

Libraries should run on Mac OS X and Linux/POSIX and possibly on Windows.

  • Lua - A minimal and fast scripting engine for configuration files and basic application scripting.
  • V8 - A fast JavaScript by Google engine similar to WebKit's JavaScriptCore.
  • Cairo - A good graphcis library similar to QuickDraw/Quartz on Mac OS X.
  • ZBar - A barcode scanner library, which allows to scan photos/images/video streams for barcodes and return their value.
  • ZLib - A very compact compression library for data streams. Used zziblib and minizip, too.
  • DynaPDF - A easy-to-use PDF generation library.
  • libusb - A universal USB library which allows for portable access to USB devices (I used this to write a basic driver for a custom POS printer).
  • WebKit - This is a really nice one if you want to render HTML/Web contents and use it in applications to give your users a "richer" user experience.
  • Qt4 - The general purpose framework for all kinds of desktop (and possibly mobile) development. Spending a lot of my time with that - no idea how I could forget that. ;)

This should be marked community wiki. Please update if you have something interesting to add!

Thanks!


Update 1

I'm not looking for "productivity" libraries like Boost or STL. Instead I'm looking for "interesting new stuff" of random genres - be it graphics libraries, scripting libraries, network or even MOD/MIDI playing libraries. Sorry I didn't make that clear before.

STL and Boost are musts.

SQLite provides a completely embedded, full-featured relational database in a few 100k that you can include right into your project. It's also a highly marketable skill because of its high presence (it's included in Mozilla Firefox as well as Android and iOS).

If you're interested in creating user interfaces, look into ncurses -- it's the library that was used to create many terminal user interfaces and can be very useful for creating games and shell utilities. Qt is a good GUI framework for C++.

If you're interested in graphics or creating games, consider SDL or OpenGL (or DirectX if you don't mind only working on Windows).

Of course, there are thousands of interesting libraries. It really depends on what you're interested in.

is there a pragmatic reason to use "if (0 == p) " instead of "if (!p)"?

20 votes

I'm inclined to write if statements by using logical negation operator:

if (!p)
    some_code();

Some people around me tend to use explicit comparison, so that the code looks like:

if (FOO == p)
    some_code();

where FOO is one of false, FALSE, 0, 0.0, NULL, etc.

I prefer the short form because it is:

  • operator!= friendly
  • generic programming friendly
  • laconic (and even more beautiful, as for me)

What are the pragmatic benefits of writing this otherwise (if any)?

Some claim that the pragmatic benefit is that programmers will find it easier to understand if you explicitly compare against NULL, FALSE, 0, etc., whereas the logical operator may be confusing to people who don't understand how implicit conversions and booleans work in C/C++.

(Disclaimer: I don't share this view myself. if (p) ... and if (!p)... are the idiomatic ways to express this in C and C++, and programmers who have trouble understanding them have no business touching C or C++ code. Heath Hunnicutt's comment is dead on.)

What's the difference in c++ between new int and new (int) ?

20 votes

what's the difference between

int * num = new (int);

and

int * num = new int;

?

Is there a difference at all?

EDIT thx all. ... which one is the most correct answer?

There isn't a difference in the context of your example (using an int type). However, there is a difference if you need to create objects of compound types, where you need to use parenthesized version. i.e:

int (**fn_ptr_ok) ()  = new (int (*[10]) ()); // OK
int (**fn_ptr_err) ()  = new int (*[10]) (); // error 

What's the c++ inline class?

20 votes

I accidentally found that the Clang compiler allows :

inline class AAA
{
};

in C++. What's this?


PS. I have reported this to Clang mailing list cfe-dev@cs.uiuc.edu, and now waiting for reply. I'll update this question by I'm informed.

I got an answer from Clang mailing list. It was a bug: http://llvm.org/bugs/show_bug.cgi?id=3941

However it looks already fixed in recent build. Thanks anyway :)

Here's the conversation: http://lists.cs.uiuc.edu/pipermail/cfe-dev/2011-March/014207.html

std::string with no free store memory allocation

20 votes

I have a question very similar to

How do I allocate a std::string on the stack using glibc's string implementation?

but I think it's worth asking again.

I want an std::string with local storage that overflows into the free store. std::basic_string provides an allocator as a template parameter, so it seems like the thing to do is to write an allocator with local storage and use it to parameterize the basic_string, like so:

std::basic_string<
char, 
std::char_traits<char>, 
inline_allocator<char, 10> 
> 
x("test");

I tried to write the inline_allocator class that would work the way you'd expect: it reserves 10 bytes for storage, and if the basic_string needs more than 10 bytes, then it calls ::operator new(). I couldn't get it to work. In the course of executing the above line of code, my GCC 4.5 standard string library calls the copy constructor for inline_allocator 4 times. It's not clear to me that there's a sensible way to write the copy constructor for inline_allocator.

In the other StackOverflow thread, Eric Melski provided this link to a class in Chromium:

http://src.chromium.org/svn/trunk/src/base/stack_container.h

which is interesting, but it's not a drop-in replacement for std::string, because it wraps the std::basic_string in a container so that you have to call an overloaded operator->() to get at the std::basic_string.

I can't find any other solutions to this problem. Could it be that there is no good solution? And if that's true, then are the std::basic_string and std::allocator concepts badly flawed? I mean, it seems like this should be a very basic and simple use case for std::basic_string and std::allocator. I suppose the std::allocator concept is designed primarily for pools, but I think it ought to cover this as well.

It seems like the rvalue-reference move semantics in C++0x might make it possible to write inline_allocator, if the string library is re-written so that basic_string uses the move constructor of its allocator instead of the copy constructor. Does anyone know what the prospect is for that outcome?

My application needs to construct a million tiny ASCII strings per second, so I ended up writing my own fixed-length string class based on Boost.Array, which works fine, but this is still bothering me.

Andrei Alexandrescu, C++ programmer extraordinaire who wrote "Modern C++ Design" once wrote a great article about building different string implementations with customizable storage systems. His article (linked here) describes how you can do what you've described above as a special case of a much more general system that can handle all sorts of clever memory allocation requirements. This doesn't talk so much about std::string and focuses more on a completely customized string class, but you might want to look into it as there are some real gems in the implementation.

C++ Types Impossible to Name

19 votes

While reading Wikipedia's page on decltype, I was curious about the statement,

Its [decltype's] primary intended use is in generic programming, where it is often difficult, or even impossible, to name types that depend on template parameters.

While I can understand the difficulty part of that statement, what is an example where there is a need to name a type that cannot be named under C++03?

EDIT: My point is that since everything in C++ has a declaration of types. Why would there ever be a case where it is impossible to name a type? Furthermore, aren't trait classes designed to yield type informations? Could trait classes be an alternative to decltype?

The wikipedia page you link has a perfect example:

int& foo(int& i);
float foo(float& f);

template <class T> auto transparent_forwarder(T& t) −> decltype(foo(t)) {
  return foo(t);
}

Note that foo(int&) returns int& (a reference type) while foo(float&) returns float (a nonreference type). Without decltype, it's impossible within the template to specify a type which represents "the return type of the function foo which takes an argument t of type T".

In this example, it's not a particular concrete type which is impossible to express -- either int& or float are individually expressible -- but a higher level generic class of types.

EDIT: and to answer your comment to another answer, this example is inexpressible in C++03. You cannot have a function template which will wrap any function T1 foo(T2) and match both argument and return type of the wrapped function.

How compile time recursion works?

19 votes

Hello!

I found a code here Printing 1 to 1000 without loop or conditionals

Can someone please explain how compile time recursion works, couldn't find it in google

// compile time recursion
template<int N> void f1()
{ 
    f1<N-1>(); 
    cout << N << '\n'; 
}

template<> void f1<1>() 
{ 
    cout << 1 << '\n'; 
}


int main()
{
    f1<1000>();
}

Thank you!

It repeatedly instantiates the f1<N> template with decreasing values for N (f1<N>() calls f1<N-1> and so on). The explicit specialization for N==1 ends the recursion: as soon as N becomes 1, the compiler will pick the specialized function rather than the templated one.

f1<1000>() causes the compiler to instantiate f1<N> 999 times (not counting in the final call to f1<1>). This is the reason why it can take a while to compile code that makes heavy use of template meta-programming techniques.

The whole thing relies heavily on the compiler's optimization skills - ideally, it should remove the recursion (which only serves as hack to emulate a for loop using templates) completely.

How does the C++ runtime system know when objects go out of scope

18 votes

Hello, I was wondering how the C++ runtime system detects when an object goes out of scope so that it calls the destructor accordingly to free up the occupied memory.

Thanks.

This is known statically at compile time

{
  string s; /* ctor called here */
} /* dtor called here */

Sometimes, it's more difficult

{
  again:
  {
    string s; /* ctor called here */
    goto again; /* now dtor of s is called here */
    string q; /* ctor not called. not reached. */
  } /* dtor of s and q would be called here. but not reached */
}

Is C# really slower than say C++ ?

17 votes

I've been wondering about this issue for a while now.

Of course there are things in C# that aren't optimized for speed, so using those objects or language tweaks (like LinQ) may cause the code to be slower.

But if you don't use any of those tweaks, but just compare the same pieces of code in C# and C++ (It's easy to translate one to anther). Will it really be that much slower ?

I've seen comparisons done that show that C# might be even faster in some cases, because in theory the JIT compiler should optimize the code in real time and get better results:

Managed Or Unmanaged?

We should remember that the JIT compiler compiles the code at real time, but that's a 1-time overhead, the same code (once reached and compiled) doesn't need to be compiled again at run time.

The GC doesn't add a lot of overhead either, unless you create and destroy thousands of objects (like using String instead of StringBuilder). And doing that in C++ would also be costly.

Another point that I want to bring up is the better communication between DLLs introduced in .Net. The .Net platform communicates much better than Managed COM based DLLs.

I don't see any inherit reason why the language should be slower, and I don't really think that C# is slower than C++ (both from experience and lack of a good explanation)..

So, will a piece of the same code written in C# will be slower than the same code in C++ ?
In if so, then WHY ?

Some other reference (Which talk about that a bit, but with no explanation about WHY):

Why would you want to use C# if its slower than C++?

Warning: The question you've asked is really pretty complex -- probably much more so than you realize. As a result, this is a really long answer.

From a purely theoretical viewpoint, there's probably a simple answer to this: there's (probably) nothing about C# that truly prevents it from being as fast as C++. Despite the theory, however, there are some practical reasons that it is slower at some things under some circumstances.

I'll consider three basic areas of differences: language features, virtual machine execution, and garbage collection. The latter two often go together, but can be independent, so I'll look at them separately.

Language Features

C++ places a great deal of emphasis on templates, and features in the template system that are largely intended to allow as much as possible to be done at compile time, so from the viewpoint of the program, they're "static." Template meta-programming allows completely arbitrary computations to be carried out at compile time (I.e., the template system is Turing complete). As such, essentially anything that doesn't depend on input from the user can be computed at compile time, so at runtime it's simply a constant. Input to this can, however, include things like type information, so a great deal of what you'd do via reflection at runtime in C# is normally done at compile time via template metaprogramming in C++. There is definitely a trade-off between runtime speed and versatility though -- what templates can do, they do statically, but they simply can't do everything reflection can.

The differences in language features mean that almost any attempt at comparing the two languages simply by transliterating some C# into C++ (or vice versa) is likely to produce results somewhere between meaningless and misleading (and the same would be true for most other pairs of languages as well). The simple fact is that for anything larger than a couple lines of code or so, almost nobody is at all likely to use the languages the same way (or close enough to the same way) that such a comparison tells you anything about how those languages work in real life.

Virtual Machine

Like almost any reasonably modern VM, Microsoft's for .NET can and will do JIT (aka "dynamic") compilation. This represents a number of trade-offs though. First, almost every VM (including Microsoft's, I believe) attempts to make intelligent decisions about what to compile and what to interpret. It does this by interpreting the code the first few times it's encountered. It profiles how often it's interpreting particular code, and when it exceeds a certain threshold, figures it's likely to execute enough more that it's worth compiling it to gain execution speed. This has an obvious problem: once in a while, it can guess wrong -- you have a lot of code that executes just often enough to trigger compilation, but then never gets used again, you're losing pretty badly: nearly all your execution is via the slow, interpreted path, then you pay the price of compilation, but then you get on benefit from the compilation. In fairness, I should add that this is pretty unusual in normal code, but if you actually want to, it's usually pretty easy to trigger it.

Second, optimizing code (like most other optimization problems) is largely an NP-complete problem. For anything but a truly trivial/toy program, you're pretty nearly guaranteed you won't truly "optimize" the result (i.e., you won't find the true optimum) -- the optimizer will simply make the code better than it was previously. Quite a few optimizations that are well known, however, take a substantial amount of time (and, often, memory) to execute. With a JIT compiler, the user is waiting while the compiler runs. Especially when coupled with the first problem (possibility of deriving little or no benefit from compilation), most of the more expensive optimization techniques are ruled out. Static compilation has two advantages: first of all, if it's slow (e.g., building a large system) it's typically carried out on a server, and nobody spends time waiting for it. Second, an executable can be generated once, and used many times by many people. The first minimizes the cost of optimization; the second amortizes the much smaller cost over a much larger number of executions.

As mentioned in the original question (and many other web sites) JIT compilation does have the possibility of greater awareness of the target environment, which should (at least theoretically) offset this advantage. There's no question that this factor does offset at least part of the disadvantage of static compilation. For a few rather specific types of code and target environments, it can even outweigh the advantages of static compilation. At least in my testing and experience, however, this is fairly unusual. Target dependent optimizations mostly seem to either make fairly small differences, or can only be applied (automatically, anyway) to fairly specific types of problems.

Using a VM also has a possibility of improving cache usage. Instructions for a VM are often more compact than native machine instructions. More of them can fit into a given amount of cache memory, so you stand a better chance of any given code being in cache when needed. This can help keep interpreted execution of VM code more competitive (in terms of speed) than most people would initially expect -- you can execute a lot of instructions on a modern CPU in the time taken by one cache miss.

It's also worth mentioning that this factor isn't necessarily different between the two at all. There's nothing preventing (for example) a C++ compiler from producing output intended to run on a virtual machine (with or without JIT). In fact, Microsooft's C++/CLI is nearly that -- an (almost) conforming C++ compiler (albeit, with a lot of conforming extensions) that produces output intended to run on a virtual machine. The reverse is probably also true: in theory a C# compiler that produced native code should be possible as well.

Garbage Collection

From what I've seen, I'd say garbage collection is the poorest-understood of these three factors. Just for an obvious example, the question here mentions: "GC doesn't add a lot of overhead either, unless you create and destroy thousands of objects [...]". In reality, if you create and destroy thousands of objects, the overhead from garbage collection will generally be fairly low. .NET uses a generational scavenger, which is a variety of copying collector. The garbage collector works by starting from "places" (e.g., registers and execution stack) that pointers/references are known to be accessible. It then "chases" those pointers to objects that have been allocated on the heap. It examines those objects for further pointers/references, until it has followed all of them to the ends of any chains, and found all the objects that are (at least potentially) accessible. In the next step, it takes all of the objects that are (or at least might be) in use, and compacts the heap by copying all of them into a contiguous chunk at one end of the memory being managed in the heap. The rest of the memory is then free (modulo finalizers having to be run, but at least in well-written code, they're rare enough that I'll ignore them for the moment).

What this means is that if you create and destroy lots of objects, garbage collection adds very little overhead. The time taken by a garbage collection cycle depends almost entirely on the number of objects that have been created but not destroyed. The primary consequence of creating and destroying objects in a hurry is simply that the GC has to run more often, but each cycle will still be fast. If you create objects and don't destroy them, the GC will run more often and each cycle will be substantially slower as it spends more time chasing pointers to potentially-live objects, and it spends more time copying objects that are still in use.

To combat this, generational scavenging works on the assumption that objects that have remained "alive" for quite a while are likely to continue remaining alive for quite a while longer. Based on this, it has a system where objects that survive some number of garbage collection cycles get "tenured", and the garbage collector starts to simply assume they're still in use, so instead of copying them at every cycle, it simply leaves them alone. This is a valid assumption often enough that generational scavenging typically has considerably lower overhead than most other forms of GC.

"Manual" memory management is often just as poorly understood. Just for one example, many attempts at comparison assume that all manual memory management follows one specific model as well (e.g., best-fit allocation). This is often little (if any) closer to reality than many peoples' beliefs about garbage collection (e.g., the widespread assumption that it's normally done using reference counting).

Given the variety of strategies for both garbage collection and manual memory management, it's quite difficult to compare the two in terms of overall speed. Attempting to compare the speed of allocating and/or freeing memory (by itself) is pretty nearly guaranteed to produce results that are meaningless at best, and outright misleading at worst.

Bonus Topic: Benchmarks

Since quite a few blogs, web sites, magazine articles, etc., claim to provide "objective" evidence in one direction or another, I'll put in my two-cents worth on that subject as well.

Most of these benchmarks are a bit like teenagers deciding to race their cars, and whoever wins gets to keep both cars. The web sites differ in one crucial way though: they guy who's publishing the benchmark gets to drive both cars. By some strange chance, his car always wins, and everybody else has to settle for "trust me, I was really driving your car as fast as it would go."

It's easy to write a poor benchmark that produces results that mean next to nothing. Almost anybody with anywhere close to the skill necessary to design a benchmark the produces anything meaningful, also has the skill to produce one that will give the results he's decided he wants. In fact it's probably easier to write code intended to produce a specific result than code that will really produce meaningful results.

As my friend James Kanze put it, "never trust a benchmark you didn't falsify yourself."

Conclusion

There is no simple answer. I'm reasonably certain I could flip a coin and roll a pair of dice to pick a winner and percentage by which it would win, and write a seemingly fair benchmark in which the chosen language won by the chosen percentage.

As others have pointed out, for most code, speed is almost irrelevant. The corollary to that (which is much more often ignored) is that in the little code where speed does matter, it usually matters a lot. At least in my experience, for the code where it really does matter, C++ is almost always the winner. There are definitely factors that favor C#, but in practice they seem to be outweighed by factors that favor C++. You can certainly find benchmarks that will indicate the outcome of your choice, but when you write real code, you can almost always make it faster in C++ than in C#. It might (or might not) take more skill and/or effort to write, but it's virtually always possible.

How can I catch same effect of "cerr << (A) << endl" on c language ?

Asked on Tue, 29 Mar 2011 by fatai c++ c
17 votes

in c++,

#define DEBUG( A) cerr << (A) << endl ;

I can send everything to it, then it can print it.However, in c, I must specify its type with %d, %c or %s etc. but I dont want write all time its type, I want use fprintf as cerr. How can I do that ?

ex :in c

#define DEBUG(A)  X // X is what I will write 
,,,
// in function, when I put
DEBUG ( 5 ) ;  // I just want to print 5 
// or, with same statement, when I say 
DEBUG ('a' ) ; // output : a

You can use GNU C Language Extensions :

#define DEBUG(x)                                                 \
  ({                                                             \
    if (__builtin_types_compatible_p (typeof (x), int))          \
        fprintf(stderr,"%d\n",x);                                \
    else if (__builtin_types_compatible_p (typeof (x), char))    \
        fprintf(stderr,"%c\n",x);                                \
    else if (__builtin_types_compatible_p (typeof (x), char[]))  \
        fprintf(stderr,"%s\n",x);                                \
    else                                                         \
        fprintf(stderr,"unknown type\n");                        \

  })

These are fine:

DEBUG("hello"); //prints hello
DEBUG(11110);   //prints 11110 

But for chars, you should use it with lvalue, otherwise its type will be "int" :

char c='A';
DEBUG(c);    // prints A
DEBUG('A');  // prints 65

Incompatibility between C and C++ code

15 votes

The given C code

#include <stdio.h>
int x = 14; 
size_t check()
{
   struct x {};
   return sizeof(x); // which x
}
int main()
{
    printf("%zu",check()); 
    return 0;
}

gives 4 as output in C on my 32 bit implementation whereas in C++ the code

#include <iostream>
int x = 14;    
size_t check()
{
   struct x {};
   return sizeof(x); // which x    
}
int main()
{
    std::cout<< check(); 
    return 0;
}

outputs 1. Why such difference?

In C++ class declaration struct x {}; introduces the name x into the scope of check and hides x (previously declared as int at file scope). You get 1 as the output because size of empty class cannot be zero in C++.

In C, an inner scope declaration of a struct tag name never hides the name of an object or function in an outer scope.You need to use the tag name struct to refer to the typename x (struct). However you can't have an empty struct in C as it violates the syntactical constraints on struct(however gcc supports it as an extension).

How is the stack initialized?

13 votes

When a process requests for memory and an operating system is giving some new pages to the process, the kernel should initialize the pages (with zeros for instance) in order to avoid showing potentially confident data that another process used. The same when a process is starting and receives some memory, for example the stack segment.

When I execute the following code in Linux, the result is that the majority of allocated memory is indeed 0, but something about 3-4 kB at the bottom of the stack (the last elements of the array, the highest addresses) contains random numbers.

#include <cstdlib>
#include <iostream>
using namespace std;

int main()
{
    int * a = (int*)alloca(sizeof(int)*2000000);
    for(int i = 0; i< 2000000; ++i)
        cout << a[i] << endl;
    return 0;
}
  1. Why isn't it set to zero too?
  2. Could it be because it is being reused by the process?
  3. If yes, could it be the initialization code that had used those 3-4 kB of memory earlier?

I am pretty sure that when the OS starts your process, the stack is only zeros. What you observe is another phenomenon, I think. You seem to have compiled your program as C++. C++ does a lot of code (constructors and stuff like that) before your main starts. So what you see are the left over values of your own execution.

If you'd compile your code as C (change to "stdio.h" etc) you'd probably see a much reduced "pollution" if not even none at all. In particular if you'd link your program statically to a minimalist version of a C library.