Best C questions in January 2012

Memory leak C++

35 votes

I just wrote some C++ code that does string manipulation, but when I ran valgrind over it, it showed some possible memory leaks. Debugging the code down to a granular level, I wrote a simple C++ program that looks like this:

#include <iostream>
#include <string>
#include <cstdlib>
using namespace std;
int main()
{
        std::string myname("Is there any leaks");
        exit(0);
}

and running valgrind over it I got:

==20943== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 26 from 1)
==20943== malloc/free: in use at exit: 360,645 bytes in 12,854 blocks.
==20943== malloc/free: 65,451 allocs, 52,597 frees, 2,186,968 bytes allocated.
==20943== For counts of detected errors, rerun with: -v
==20943== searching for pointers to 12,854 not-freed blocks.
==20943== checked 424,628 bytes.
==20943== 
==20943== LEAK SUMMARY:
==20943==    definitely lost: 0 bytes in 0 blocks.
==20943==      possibly lost: 917 bytes in 6 blocks.
==20943==    still reachable: 359,728 bytes in 12,848 blocks.
==20943==         suppressed: 0 bytes in 0 blocks.
==20943== Reachable blocks (those to which a pointer was found) are not shown.
==20943== To see them, rerun with: --show-reachable=yes

Then it struck me that we are forcibly exiting (which I did in my original C++ code as well). The problem is that I need to exit from the program this way, because my older code waits for the exit status of the new code; for example, the binary a.out waits for the exit status of b.out. Is there any way to avoid the memory leaks, or should I even worry about them, given that the program is already exiting at that point?

This also raises another question for me: is code like this harmful?

#include <stdio.h>
#include <stdlib.h>
int main()
{
        char *p=(char *)malloc(sizeof(char)*1000);
        exit(0);
}

EDIT: I cannot simply use return, because other code waits for my program's exit status; that status is not always 0, and it matters in my application.

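If you insist on using exit(), put the string in its own scope so that its destructor runs before exit() is called (exit() does not unwind the stack, so destructors of automatic objects in main are skipped):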

#include <iostream>
#include <string>
#include <cstdlib>
int main(){
    {
        std::string myname("Are there any leaks?");
    }
    exit(0);
}

Also, when you return from main, the returned value becomes the application's exit code. So if you want to pass an exit code, use return exitCode; in main() instead of exit().

Regarding that part:

This also raises another question for me: is code like this harmful?

Yes, because it is a BAD programming habit.
The OS will clean up the memory you failed to release, so as long as you haven't managed to eat all the system memory and page file, you shouldn't damage the OS. However, writing sloppy or leaky code can turn into a habit, so relying on the OS to clean up your mess is a bad idea.
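The same advice can be sketched in plain C: release what was allocated, then hand the status back so that main can return it. This is only an illustration; run_job and its status values are hypothetical names, not part of the original code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical worker: returns the status that main() should pass on. */
int run_job(size_t size)
{
    char *buf = malloc(size);
    if (buf == NULL)
        return 2;                   /* allocation failed */

    memset(buf, 'x', size);         /* stand-in for the real work */
    int status = (size > 0 && buf[0] == 'x') ? 0 : 1;

    free(buf);                      /* release before leaving */
    return status;
}
```

In main, writing return run_job(1000); reports the status to the waiting parent process without leaving the buffer allocated.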

How does GCC optimize C code?

34 votes

I wrote this simple C program:

int main(){
    int i; int count = 0;
    for(i = 0; i < 2000000000; i++){
        count = count + 1;
    }
}

I wanted to see how the gcc compiler optimizes this loop (clearly, adding 1 two billion times should become "add 2000000000 once"). So:

$ gcc test.c
and then time on a.out:

real 0m7.717s  
user 0m7.710s  
sys 0m0.000s  

$ gcc -O2 test.c
and then time on a.out:

real 0m0.003s  
user 0m0.000s  
sys 0m0.000s  

Then I generated the assembly for both with gcc -S. The first one seems quite clear:

    .file "test.c"  
    .text  
.globl main
    .type   main, @function  
main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    movq    %rsp, %rbp
    .cfi_offset 6, -16
    .cfi_def_cfa_register 6
    movl    $0, -8(%rbp)
    movl    $0, -4(%rbp)
    jmp .L2
.L3:
    addl    $1, -8(%rbp)
    addl    $1, -4(%rbp)
.L2:
    cmpl    $1999999999, -4(%rbp)
    jle .L3
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"
    .section    .note.GNU-stack,"",@progbits

.L3 does the additions; .L2 compares -4(%rbp) with 1999999999 and jumps back to .L3 while i < 2000000000.

Now the optimized one:

    .file "test.c"  
    .text
    .p2align 4,,15
.globl main
    .type main, @function
main:
.LFB0:
    .cfi_startproc
    rep
    ret
    .cfi_endproc
.LFE0:
    .size main, .-main
    .ident "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"
    .section .note.GNU-stack,"",@progbits

I can't understand at all what's going on there! I've got little knowledge of assembly, but I expected something like

addl $2000000000, -8(%rbp)

I even tried gcc -c -g -Wa,-a,-ad -O2 test.c to see the C code together with the assembly it was converted to, but the result was no clearer than the previous one.

Can someone briefly explain:

  1. The gcc -S -O2 output.
  2. If the loop is optimized as I expected (one sum instead of many sums)?

The compiler is even smarter than that. :)

In fact, it realizes that you aren't using the result of the loop, so it removed the loop entirely!

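This is called Dead Code Elimination. All that is left of main in the -O2 listing is the return itself: rep ret behaves like a plain ret (the rep prefix is a padding idiom GCC emits for the benefit of some AMD branch predictors).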

A better test is to print the result:

#include <stdio.h>
int main(void) {
    int i; int count = 0;
    for(i = 0; i < 2000000000; i++){
        count = count + 1;
    }

    //  Print result to prevent Dead Code Elimination
    printf("%d\n", count);
}

EDIT : I've added the required #include <stdio.h>; the MSVC assembly listing corresponds to a version without the #include, but it should be the same.


I don't have GCC in front of me at the moment, since I'm booted into Windows. But here's the disassembly of the version with the printf() on MSVC:

EDIT : I had the wrong assembly output. Here's the correct one.

; 57   : int main(){

$LN8:
    sub rsp, 40                 ; 00000028H

; 58   : 
; 59   : 
; 60   :     int i; int count = 0;
; 61   :     for(i = 0; i < 2000000000; i++){
; 62   :         count = count + 1;
; 63   :     }
; 64   : 
; 65   :     //  Print result to prevent Dead Code Elimination
; 66   :     printf("%d\n",count);

    lea rcx, OFFSET FLAT:??_C@_03PMGGPEJJ@?$CFd?6?$AA@
    mov edx, 2000000000             ; 77359400H
    call    QWORD PTR __imp_printf

; 67   : 
; 68   : 
; 69   : 
; 70   :
; 71   :     return 0;

    xor eax, eax

; 72   : }

    add rsp, 40                 ; 00000028H
    ret 0

So yes, Visual Studio does this optimization. I'd assume GCC probably does too.

And yes, GCC performs a similar optimization. Here's an assembly listing for the same program with gcc -S -O2 test.c (gcc 4.5.2, Ubuntu 11.10, x86):

        .file   "test.c"
        .section        .rodata.str1.1,"aMS",@progbits,1
.LC0:
        .string "%d\n"
        .text
        .p2align 4,,15
.globl main
        .type   main, @function
main:
        pushl   %ebp
        movl    %esp, %ebp
        andl    $-16, %esp
        subl    $16, %esp
        movl    $2000000000, 8(%esp)
        movl    $.LC0, 4(%esp)
        movl    $1, (%esp)
        call    __printf_chk
        leave
        ret
        .size   main, .-main
        .ident  "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"
        .section        .note.GNU-stack,"",@progbits
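If the goal is to time the loop itself rather than to watch the compiler fold it away, one common approach is to make the counter volatile, so that every read and write of it has to be performed. The sketch below contrasts the two; the function names are made up for illustration, and what the plain version compiles to depends on the compiler and flags.

```c
/* Counter is volatile: each increment is an observable access,
 * so the compiler has to keep the loop. */
int count_with_volatile(int n)
{
    volatile int count = 0;
    for (int i = 0; i < n; i++) {
        count = count + 1;
    }
    return count;
}

/* Plain counter: an optimizing compiler may replace the whole loop
 * with the final value, as in the listings above. */
int count_plain(int n)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        count = count + 1;
    }
    return count;
}
```

Both functions return the same value; only the amount of work done at run time may differ.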

Why does (x += x += 1) evaluate differently in C and Javascript?

32 votes

If the value of the variable x is initially 0, the expression x += x += 1 will evaluate to 2 in C, and to 1 in Javascript.

The semantics for C seem obvious to me: x += x += 1 is interpreted as x += (x += 1) which is, in turn, equivalent to

x += 1
x += x  // where x is 1 at this point

What is the logic behind Javascript's interpretation? What specification enforces such behaviour? (It should be noted, by the way, that Java agrees with Javascript here).

Update: It turns out the expression x += x += 1 has undefined behaviour according to the C standard (thanks ouah, John Bode, DarkDust, Drew Dormann), which seems to spoil the whole point of the question for some readers. The expression can be made standards-compliant by inserting an identity function into it as follows: x += id(x += 1). The same modification can be made to the Javascript code and the question still remains as stated. Presuming that the majority of the readers can understand the point behind "non-standards-compliant" formulation I'll keep it as it is more concise.

Update 2: It turns out that according to C99 the introduction of the identity function probably does not resolve the ambiguity. In this case, dear reader, please regard the original question as pertaining to C++ rather than C99, where "+=" can most probably be safely regarded as an overloadable operator with a uniquely defined sequence of operations. That is, x += x += 1 is now equivalent to operator+=(x, operator+=(x, 1)). Sorry for the long road to standards-compliance.

JavaScript and Java have pretty much strict left-to-right evaluation rules for this expression. C does not (even in the version you provided that has the identity function intervening).

The ECMAScript spec I have (3rd Edition, which I'll admit is quite old - the current version can be found here: http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf) says that compound assignment operators are evaluated like so:

11.13.2 Compound Assignment ( op= )

The production AssignmentExpression : LeftHandSideExpression @ = AssignmentExpression, where @ represents one of the operators indicated above, is evaluated as follows:

  1. Evaluate LeftHandSideExpression.
  2. Call GetValue(Result(1)).
  3. Evaluate AssignmentExpression.
  4. Call GetValue(Result(3)).
  5. Apply operator @ to Result(2) and Result(4).
  6. Call PutValue(Result(1), Result(5)).
  7. Return Result(5)

You note that Java has the same behavior as JavaScript - I think its spec is more readable, so I'll post some snippets here (http://java.sun.com/docs/books/jls/third_edition/html/expressions.html#15.7):

15.7 Evaluation Order

The Java programming language guarantees that the operands of operators appear to be evaluated in a specific evaluation order, namely, from left to right.

It is recommended that code not rely crucially on this specification. Code is usually clearer when each expression contains at most one side effect, as its outermost operation, and when code does not depend on exactly which exception arises as a consequence of the left-to-right evaluation of expressions.

15.7.1 Evaluate Left-Hand Operand First

The left-hand operand of a binary operator appears to be fully evaluated before any part of the right-hand operand is evaluated. For example, if the left-hand operand contains an assignment to a variable and the right-hand operand contains a reference to that same variable, then the value produced by the reference will reflect the fact that the assignment occurred first.

...

If the operator is a compound-assignment operator (§15.26.2), then evaluation of the left-hand operand includes both remembering the variable that the left-hand operand denotes and fetching and saving that variable's value for use in the implied combining operation.

On the other hand, in the not-undefined-behavior example where you provide an intermediate identity function:

x += id(x += 1);

while it's not undefined behavior (since the function call provides a sequence point), it's still unspecified whether the leftmost x is evaluated before or after the function call. So, while it's not 'anything goes' undefined behavior, the C compiler is still permitted to evaluate both x variables before calling the id() function, in which case the final value stored to the variable will be 1.

For example, if x == 0 to start, the evaluation could look like:

tmp = x;    // tmp == 0
x = tmp  +  id( x = tmp + 1)
// x == 1 at this point

or it could evaluate it like so:

tmp = id( x = x + 1);   // tmp == 1, x == 1
x = x + tmp;
// x == 2 at this point

Note that unspecified behavior is subtly different than undefined behavior, but it's still not desirable behavior.
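To get one particular result in C regardless of the compiler, each order can be spelled out with explicit temporaries. The following is a small sketch of both orders; left_first and inner_first are hypothetical names, and id mirrors the identity function from the question.

```c
/* Hypothetical helper: mirrors the id() from the question. */
static int id(int v) { return v; }

/* Left operand fetched first, as Java and JavaScript require. */
int left_first(int x)
{
    int saved = x;          /* fetch the left-hand x before anything else */
    x += 1;                 /* inner assignment */
    x = saved + id(x);      /* combine with the saved value */
    return x;
}

/* Inner assignment completed first; the outer one reads the new x. */
int inner_first(int x)
{
    x += 1;                 /* inner assignment */
    int rhs = id(x);
    x = x + rhs;            /* left-hand x is read after the update */
    return x;
}
```

Starting from x == 0, left_first gives 1 (the Java/JavaScript answer) and inner_first gives 2.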

Can a compiler automatically detect pure functions without the type information about purity?

27 votes

So I'm arguing with my friend who claims that a compiler like GCC can detect a pure function automatically without any type information. I doubt that.

Languages like D or Haskell have purity in their type systems, and the programmer explicitly declares which functions are pure. A pure function has no side effects and can therefore very easily be parallelized.

So the question is: Is this all necessary or not? Could a compiler detect purity, without any meta or type information, just by assuming that anything that does IO or accesses global variables automatically is not pure?

Sure, you can detect pure functions in some cases. For instance,

int f(int x)
{
    return x*2;
}

can be detected as pure with simple static analysis. The difficulty is doing this in general, and detecting interfaces which use "internal" state but are externally pure is basically impossible.

GCC does have the warning options -Wsuggest-attribute=pure and -Wsuggest-attribute=const, which suggest functions that might be candidates for the pure and const attributes. I'm not sure whether it opts to be conservative (i.e. missing many pure functions, but never suggesting it for a non-pure function) or lets the user decide.

Note that GCC's definition of pure is "depending only on arguments and global variables":

Many functions have no effects except the return value and their return value depends only on the parameters and/or global variables. Such a function can be subject to common subexpression elimination and loop optimization just as an arithmetic operator would be. These functions should be declared with the attribute pure.

GCC manual

Strict purity, i.e. the same results for the same arguments in all circumstances, is represented by the const attribute, but such a function cannot even dereference a pointer passed to it. So the parallelisation opportunities for pure functions are limited, but far fewer functions can be const compared to the pure functions you can write in a language like Haskell.
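As a rough illustration of the difference between the two attributes, here is how they might be applied by hand. The functions are made-up examples, and the __attribute__ syntax is a GCC extension (Clang accepts it as well), so this is not portable C.

```c
#include <stddef.h>

/* const: the result depends only on the argument values. */
__attribute__((const)) int twice(int x)
{
    return x * 2;
}

/* pure: no side effects, but it reads memory through the pointer,
 * so it does not qualify for const. */
__attribute__((pure)) size_t count_char(const char *s, char c)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (*s == c)
            n++;
    }
    return n;
}
```

The compiler trusts these annotations; marking a function that does have side effects as pure or const can lead to calls being removed or merged incorrectly.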

By the way, automatically parallelising pure functions is not as easy as you might think; the hard part becomes deciding what to parallelise. Parallelise computations that are too cheap, and overhead makes it pointless. Don't parallelise enough, and you don't reap the benefits. I don't know of any practical functional language implementation that does automatic parallelisation for this reason, although libraries like repa parallelise many operations behind the scenes without explicit parallelism in the user code.

Is it a good idea to compile a language to C?

27 votes

All over the web, I am getting the feeling that writing a C backend for a compiler is not such a good idea anymore. GHC's C backend is not being actively developed anymore (this is my unsupported feeling). Compilers are targeting C-- or LLVM.

Normally, I would think that GCC is a good old mature compiler that performs well at optimizing code, so compiling to C would use the maturity of GCC to yield better and faster code. Is this not true?

I realize that the question greatly depends on the nature of the language being compiled and on other factors, such as getting more maintainable code. I am looking for a more general answer (w.r.t. the compiled language) that focuses solely on performance (disregarding code quality, etc.). I would also be really glad if the answer included an explanation of why GHC is drifting away from C and why LLVM performs better as a backend, or any other examples of compilers doing the same that I am not aware of.

While I'm not a compiler expert, I believe that it boils down to the fact that you lose something in translation to C as opposed to translating to e.g. LLVM's intermediate language.

If you think about the process of compiling to C, you create a compiler that translates to C code, then the C compiler translates to an intermediate representation (the in-memory AST), then translates that to machine code. The creators of the C compiler have probably spent a lot of time optimizing certain human-made patterns in the language, but you're not likely to be able to create a fancy enough compiler from a source language to C to emulate the way humans write code. There is a loss of fidelity going to C - the C compiler doesn't have any knowledge about your original code's structure. To get those optimizations, you're essentially back-fitting your compiler to try to generate C code that the C compiler knows how to optimize when it's building its AST. Messy.

If, however, you translate directly to LLVM's intermediate language, that's like compiling your code to a machine-independent high-level bytecode, which is akin to the C compiler giving you access to specify exactly what its AST should contain. Essentially, you cut out the middleman that parses the C code and go directly to the high-level representation, which preserves more of the characteristics of your code by requiring less translation.

Also related to performance, LLVM can do some really tricky stuff for dynamic languages like generating binary code at runtime. This is the "cool" part of just-in-time compilation: it is writing binary code to be executed at runtime, instead of being stuck with what was created at compile time.

Why is this C code faster than this C++ code? Getting biggest line in file

23 votes

I have two versions of a program that do basically the same thing: find the length of the longest line in a file. The file has about 8 thousand lines. My code in C is a little more primitive (of course!) than the code I have in C++. The C program takes about 2 seconds to run, while the C++ program takes 10 seconds (I am testing both with the same file). But why? I was expecting it to take the same amount of time or a little more, but not 8 seconds more!

My code in C:

#include <stdio.h>
#include <stdlib.h> 
#include <string.h>

#if _DEBUG
    #define DEBUG_PATH "../Debug/"
#else
    #define DEBUG_PATH ""
#endif

const char FILE_NAME[] = DEBUG_PATH "data.noun";

int main()
{   
    int sPos = 0;
    int maxCount = 0;
    int cPos = 0;
    int ch;
    FILE *in_file;              

    in_file = fopen(FILE_NAME, "r");
    if (in_file == NULL) 
    {
        printf("Cannot open %s\n", FILE_NAME);
        exit(8);
    }       

    while (1) 
    {
        ch = fgetc(in_file);
        if(ch == 0x0A || ch == EOF) // newline ('\n') or end of file
        {           
            if ((cPos - sPos) > maxCount)
                maxCount = (cPos - sPos);

            if(ch == EOF)
                break;

            sPos = cPos;
        }
        else
            cPos++;
    }

    fclose(in_file);

    printf("Max line length: %i\n",  maxCount); 

    getchar();
    return (0);
}

My code in C++:

#include <iostream>
#include <fstream>
#include <stdio.h>
#include <string>

using namespace std;

#ifdef _DEBUG
    #define FILE_PATH "../Debug/data.noun"
#else
    #define FILE_PATH "data.noun"
#endif

int main()
{
    string fileName = FILE_PATH;
    string s = "";
    ifstream file;
    int size = 0;

    file.open(fileName.c_str());
    if(!file)
    {
        printf("could not open file!");
        return 0;
    }

    while(getline(file, s) )
            size = (s.length() > size) ? s.length() : size;
    file.close();

    printf("biggest line in file: %i", size);   

    getchar();
    return 0;
}

The C++ version constantly allocates and deallocates instances of std::string. Memory allocation is a costly operation. In addition to that the constructors/destructors are executed.

The C version, however, uses constant memory and does just what is necessary: it reads single characters and, at each newline, updates the line-length counter if the new value is higher. That's it.

Odd use of curly braces in C

22 votes

Sorry for the simple question but I'm on vacation reading a book on core audio, and don't have my C or Objective C books with me...

What are the curly braces doing in this variable definition?

MyRecorder recorder = {0};

Assuming that MyRecorder is a struct, this sets every member to their respective representation of zero (0 for integers, NULL for pointers etc.).

Actually this also works on all other datatypes like int, double, pointers, arrays, nested structures, ..., everything you can imagine (thanks to pmg for pointing this out!)

UPDATE: A quote extracted from the website linked above, citing the final draft of C99:

[6.7.8.21] If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, [...] the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.
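A small sketch of what that rule means in practice is shown below. The fields of MyRecorder are not shown in the question, so the struct here is a hypothetical stand-in with a few differently typed members.

```c
#include <stddef.h>

/* Hypothetical stand-in for the book's MyRecorder; the real fields are not shown. */
typedef struct {
    int     running;
    double  sample_rate;
    void   *file;
    char    name[8];
} MyRecorder;

MyRecorder make_recorder(void)
{
    /* The first member is set to 0 explicitly; every remaining member is
     * initialized as if it had static storage duration (0, 0.0, NULL, ...). */
    MyRecorder recorder = {0};
    return recorder;
}
```

After the call, every member compares equal to zero or NULL, including all eight elements of name.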

Most accurate way to do a combined multiply-and-divide operation in 64-bit?

18 votes

What is the most accurate way I can do a multiply-and-divide operation for 64-bit integers that works in both 32-bit and 64-bit programs (in Visual C++)? (In case of overflow, I need the result mod 2^64.)

(I'm looking for something like MulDiv64, except that this one uses inline assembly, which only works in 32-bit programs.)

Obviously, casting to double and back is possible, but I'm wondering if there's a more accurate way that isn't too complicated. (i.e. I'm not looking for arbitrary-precision arithmetic libraries here!)

Since this is tagged Visual C++ I'll give a solution that abuses MSVC-specific intrinsics.

This example is fairly complicated. It's a highly simplified version of the same algorithm that is used by GMP and java.math.BigInteger for large division.

Although I have a simpler algorithm in mind, it's probably about 30x slower.

This solution has the following constraints/behavior:

  • It requires x64. It will not compile on x86.
  • The divisor must not be zero.
  • The quotient saturates if it overflows 64 bits.

Note that this is for the unsigned integer case. It's trivial to build a wrapper around this to make it work for signed cases as well. This example should also produce correctly truncated results.

This code is not fully tested. However, it has passed all the test cases that I've thrown at it.
(Even cases that I've intentionally constructed to try to break the algorithm.)

#include <intrin.h>
#include <stdint.h>
#include <iostream>
using namespace std;

uint64_t muldiv2(uint64_t a, uint64_t b, uint64_t c){
    //  Normalize divisor
    unsigned long shift;
    _BitScanReverse64(&shift,c);
    shift = 63 - shift;

    c <<= shift;

    //  Multiply
    a = _umul128(a,b,&b);
    if (((b << shift) >> shift) != b){
        cout << "Overflow" << endl;
        return 0xffffffffffffffff;
    }
    b = __shiftleft128(a,b,shift);
    a <<= shift;


    uint32_t div;
    uint32_t q0,q1;
    uint64_t t0,t1;

    //  1st Reduction
    div = (uint32_t)(c >> 32);
    t0 = b / div;
    if (t0 > 0xffffffff)
        t0 = 0xffffffff;
    q1 = (uint32_t)t0;
    while (1){
        t0 = _umul128(c,(uint64_t)q1 << 32,&t1);
        if (t1 < b || (t1 == b && t0 <= a))
            break;
        q1--;
//        cout << "correction 0" << endl;
    }
    b -= t1;
    if (t0 > a) b--;
    a -= t0;

    if (b > 0xffffffff){
        cout << "Overflow" << endl;
        return 0xffffffffffffffff;
    }

    //  2nd reduction
    t0 = ((b << 32) | (a >> 32)) / div;
    if (t0 > 0xffffffff)
        t0 = 0xffffffff;
    q0 = (uint32_t)t0;

    while (1){
        t0 = _umul128(c,q0,&t1);
        if (t1 < b || (t1 == b && t0 <= a))
            break;
        q0--;
//        cout << "correction 1" << endl;
    }

//    //  (a - t0) gives the modulus.
//    a -= t0;

    return ((uint64_t)q1 << 32) | q0;
}

Note that if you don't need a perfectly truncated result, you can remove the last loop completely. If you do this, the answer will be no more than 2 larger than the correct quotient.

Test Cases:

cout << muldiv2(4984198405165151231,6132198419878046132,9156498145135109843) << endl;
cout << muldiv2(11540173641653250113, 10150593219136339683, 13592284235543989460) << endl;
cout << muldiv2(449033535071450778, 3155170653582908051, 4945421831474875872) << endl;
cout << muldiv2(303601908757, 829267376026, 659820219978) << endl;
cout << muldiv2(449033535071450778, 829267376026, 659820219978) << endl;
cout << muldiv2(1234568, 829267376026, 1) << endl;
cout << muldiv2(6991754535226557229, 7798003721120799096, 4923601287520449332) << endl;
cout << muldiv2(9223372036854775808, 2147483648, 18446744073709551615) << endl;
cout << muldiv2(9223372032559808512, 9223372036854775807, 9223372036854775807) << endl;
cout << muldiv2(9223372032559808512, 9223372036854775807, 12) << endl;
cout << muldiv2(18446744073709551615, 18446744073709551615, 9223372036854775808) << endl;

Output:

3337967539561099935
8618095846487663363
286482625873293138
381569328444
564348969767547451
1023786965885666768
11073546515850664288
1073741824
9223372032559808512
Overflow
18446744073709551615
Overflow
18446744073709551615
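For cross-checking results like the ones above on a different toolchain, a much shorter reference version is possible where a 128-bit integer type exists. The sketch below assumes GCC or Clang on a 64-bit target, where unsigned __int128 is available as an extension; it is not available in MSVC, so it does not replace the intrinsic-based code. The name muldiv_ref is hypothetical, and it saturates on overflow to match the behavior described above.

```c
#include <stdint.h>

/* Reference multiply-then-divide using the unsigned __int128 extension
 * (GCC/Clang on 64-bit targets; not available in MSVC).
 * Saturates on overflow; c must be non-zero. */
uint64_t muldiv_ref(uint64_t a, uint64_t b, uint64_t c)
{
    unsigned __int128 q = (unsigned __int128)a * b / c;
    if (q > UINT64_MAX)
        return UINT64_MAX;      /* quotient does not fit in 64 bits */
    return (uint64_t)q;
}
```

If the result mod 2^64 is wanted instead of saturation, the overflow check can be dropped and the cast alone keeps the low 64 bits.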

Is there a good reason for always enclosing a define in parentheses in C?

18 votes

Clearly, there are times where defines must have parentheses, like so:

#define WIDTH 80+20

int a = WIDTH * 2; //expect a==200 but a==120

So I have always parenthesized, even if it's just a single number:

#define WIDTH (100)

Someone new to C asked me why I do this, so I tried to find an edge case where the absence of parentheses on a single number define causes issues, but I can't think of one.

Does such a case exist?

Yes. The preprocessor concatenation operator (##) will cause issues, for example:

#define _add_penguin(a) penguin ## a
#define add_penguin(a) _add_penguin(a)

#define WIDTH (100)
#define HEIGHT 200    

add_penguin(HEIGHT) // expands to penguin200
add_penguin(WIDTH)  // error, cannot concatenate penguin and (100) 

Same for stringization (#). Clearly this is a corner case and probably doesn't matter considering how WIDTH will presumably be used. Still, it is something to keep in mind about the preprocessor.

(The reason why adding the second penguin fails is a subtle detail of the preprocessing rules in C99 - iirc it fails because concatenating two non-placeholder preprocessing tokens must always result in a single preprocessing token - but this is irrelevant; even if the concatenation were allowed, it would still give a different result than the unbracketed #define!)

All other responses are correct only insofar that it doesn't matter from the point of view of the C++ scanner because, indeed, a number is atomic. However, to my reading of the question there is no sign that only cases with no further preprocessor expansion should be considered, so the other responses are, even though I totally agree with the advice contained therein, wrong.
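Both effects can be checked with a few lines of C. This is a minimal sketch with made-up macro and function names: the first pair shows the precedence problem from the question, and the second pair shows how the parentheses become part of the text once the macro is stringized.

```c
#include <string.h>

#define WIDTH_BARE   80+20      /* no parentheses */
#define WIDTH_PAREN  (80+20)    /* parenthesized expression */
#define PLAIN        100
#define WRAPPED      (100)

#define STR(x)   #x
#define XSTR(x)  STR(x)         /* expand the argument first, then stringize */

int bare_times_two(void)  { return WIDTH_BARE * 2; }    /* 80 + 20*2 */
int paren_times_two(void) { return WIDTH_PAREN * 2; }   /* (80+20) * 2 */

size_t plain_text_len(void)   { return strlen(XSTR(PLAIN)); }     /* "100" */
size_t wrapped_text_len(void) { return strlen(XSTR(WRAPPED)); }   /* "(100)" */
```

The arithmetic case is the reason to parenthesize expressions; the stringized case is the corner where parenthesizing a single number changes the result.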

Returning struct containing array

16 votes

The following simple code segfaults under gcc 4.4.4

#include<stdio.h>

typedef struct Foo Foo;
struct Foo {
    char f[25];
};

Foo foo(){
    Foo f = {"Hello, World!"};
    return f;
}

int main(){
    printf("%s\n", foo().f);
}

Changing the final line to

 Foo f = foo(); printf("%s\n", f.f);

works fine. Both versions work when compiled with -std=c99. Am I simply invoking undefined behavior, or has something in the standard changed which permits the code to work under C99? Why does it crash under C89?

I believe the behavior is undefined both in C89/C90 and in C99.

foo().f is an expression of array type, specifically char[25]. C99 6.3.2.1p3 says:

Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.

The problem in this particular case (an array that's an element of a structure returned by a function) is that there is no "array object". Function results are returned by value, so the result of calling foo() is a value of type struct Foo, and foo().f is a value (not an lvalue) of type char[25].

This is, as far as I know, the only case in C (up to C99) where you can have a non-lvalue expression of array type. I'd say that the behavior of attempting to access it is undefined by omission, likely because the authors of the standard (understandably IMHO) didn't think of this case. You're likely to see different behaviors at different optimization settings.

The new 2011 C standard patches this corner case by introducing objects with temporary lifetime. N1570, a late pre-C11 draft, says in 6.2.4p8:

A non-lvalue expression with structure or union type, where the structure or union contains a member with array type (including, recursively, members of all contained structures and unions) refers to an object with automatic storage duration and temporary lifetime. Its lifetime begins when the expression is evaluated and its initial value is the value of the expression. Its lifetime ends when the evaluation of the containing full expression or full declarator ends. Any attempt to modify an object with temporary lifetime results in undefined behavior.

So the program's behavior is well defined in C11. Until you're able to get a C11-conforming compiler, though, your best bet is probably to store the result of the function in a local object (assuming your goal is working code rather than breaking compilers):

[...]
int main(void ) {
    struct Foo temp = foo();
    printf("%s\n", temp.f);
}

Why is the year value in Perl's localtime function (and C's tm struct) relative to 1900?

15 votes

As all Perl programmers (hopefully) know, the year value from a call to Perl's localtime function is relative to 1900. Wondering why this was, I took a look at the perldoc for localtime, and found this interesting nugget:

All list elements are numeric and come straight out of the C `struct tm'.

Looking then at the C++ reference for the tm struct, I found that the tm_year member variable is declared as an int.

Question: Why, then, is the year value relative to 1900 and not simply the full, four-digit year? Is there some historical reason? It seems to me that, even with early memory limitations in computing, an integer is (obviously) more than sufficient for storing the full year. There must have been a good reason; I'm curious as to what that might be.

I was programming in the early 1970's, before C and Unix were invented. Two-digit years were used to save disk space, which was very, very tight; we were always trying to figure out tricks like that to save it. The first machine I worked on had two 20 megabyte drives, each the size of a washing machine.

I worked at a hospital that had a Y2K problem in 1975. The age of a patient is an important thing to know, and the date of birth only had a two digit year. Being a hospital, we obviously had some very old patients, born in the 1800's. The system assumed that anyone whose year of birth was 75 or over was born in the 1800's. This worked well for people born in 1890, but once Jan 1, 1975 hit, all heck broke loose as the newborns were deemed to be 100 years old. (It was also a major maternity hospital.) We ran around fixing that problem by moving the threshold from 75 to 80. It was also my first understanding of what a problem Y2K was going to be, and I realized I would be better off doing something else by the year 2000. I failed.

Those who think Y2K was not a real problem because nothing happened do not understand the amount of work that went into fixing stuff for the few years beforehand.
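Whatever the history, code that consumes a struct tm has to add the offset back itself. A minimal C sketch of that conversion follows; the helper names are hypothetical, while tm_year (years since 1900) and tm_mon (months since January) are the standard field meanings.

```c
#include <time.h>

/* Full calendar year from a struct tm (tm_year counts years since 1900). */
int full_year(const struct tm *t)
{
    return t->tm_year + 1900;
}

/* Calendar month 1..12 (tm_mon counts months since January, 0..11). */
int calendar_month(const struct tm *t)
{
    return t->tm_mon + 1;
}
```

A tm_year of 112 therefore means 2012, which is also the value Perl's localtime hands back in its year slot.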

Can adding 'const' to a pointer help the optimization?

15 votes

I have a pointer int* p, and do some operations in a loop. I do not modify the memory, just read. If I add const to the pointer (both cases, const int* p, and int* const p), can it help a compiler to optimize the code?

I know the other merits of const, like safety and self-documentation; I am asking about this particular case. Rephrasing the question: can const ever give the compiler any information that is useful for optimization?

While this is obviously specific to the implementation, it is hard to see how changing a pointer from int* to int const* could ever provide any additional information that the compiler would not otherwise have known.

In both cases, the value pointed to can change during the execution of the loop.

Therefore it probably will not help the compiler optimize the code.

gcc rounding difference between versions

14 votes

I'm looking into why a test case is failing

The problematic test can be reduced to doing (4.0/9.0) ** (1.0/2.6), rounding this to 6 digits and checking against a known value (as a string):

#include<stdio.h>
#include<math.h>
int main(){
    printf("%.06f\n", powf(4.0/9.0, (1.0/2.6)));
}

If I compile and run this in gcc 4.1.2 on Linux, I get:

0.732057

Python agrees, as does Wolfram|Alpha:

$ python2.7 -c 'print "%.06f" % (4.0/9.0)**(1/2.6)'
0.732057

However I get the following result on gcc 4.4.0 on Linux, and 4.2.1 on OS X:

0.732058

A double acts identically (although I didn't test this extensively)

I'm not sure how to narrow this down any further. Is this a gcc regression? A change in rounding algorithm? Me doing something silly?

Edit: Printing the result to 12 digits, the digit at the 7th place is 4 vs 5, which explains the rounding difference, but not the value difference:

gcc 4.1.2:

0.732057452202

gcc 4.4.0:

0.732057511806

Here's the gcc -S output from both versions: https://gist.github.com/1588729

Recent gcc versions are able to use MPFR to do compile-time floating point computation. My guess is that your recent gcc does that and uses a higher precision for the compile-time version. This is allowed by at least the C99 standard (I've not checked whether other versions of the standard changed it).

6.3.1.8/2 in C99

The values of floating operands and of the results of floating expressions may be represented in greater precision and range than that required by the type; the types are not changed thereby.

Edit: your gcc -S results confirm that. I haven't checked the computations, but the old one has (after substituting memory for its constant content)

movss 1053092943, %xmm1
movss 1055100473, %xmm0
call powf

calling powf with the precomputed values for 4/9.0 and 1/2.6 and then printing the result after promotion to double, while the new one just prints the float 0x3f3b681f promoted to double.

Why is this code resolved to true?

Asked on Sat, 07 Jan 2012 by Xorty c
13 votes
#include <stdio.h>

int main() {
    int a = 1;
    int b = 0;

    if (a = b || ++a == 2)
        printf("T: a=%i, b=%i", a, b);
    else
        printf("F: a=%i, b=%i", a, b);

    return 0;
}

Let's take a look at this simple code snippet. Result is: T: a=1, b=0

Why? (note a=b uses the assignment operator, not comparison)

What I understand here, is that zero is assigned to a, then a is incremented to 1. 1 is not equal to 2. So result should indeed be a=1, b=0. But why is this condition evaluated to true? Neither of (a=b) or (++a == 2) is true ... What did I miss?

Here is another short program that prints F as expected:

#include <stdio.h>

int main() {
    int a = 1;
    int b = 0;

    if (a = b) printf("T"); else printf("F");

    return 0;
}

You have confused yourself with misleading spacing.

if (a = b || ++a == 2)

is the same as:

if (a = (b || ((++a) == 2)))

This actually has undefined behavior. Although there is a sequence point between the evaluation of b and the evaluation of ((++a) == 2), there is no sequence point between the implied assignment to a in ++a and the other write to a due to the explicit = assignment.

Why int i=400*400/400 gives result 72, is datatype circular?

13 votes

I think 400*400=160000 is first converted to 28928 by starting from 0 and counting 160000 steps in a circular fashion through the range of the int type.

Then 28928 is divided by 400, the floor of which gives 72, and the result varies with the type of the variable. Is my assumption correct, or is there another explanation?

Assuming you're using a compiler old enough that int is only 16 bits, then yes, your analysis is correct.*

400 * 400 = 160000 

//  Integer overflow wrap-around.
160000 % 2^16 = 28928

//  Integer Division
28928 / 400 = 72 (rounded down)

Of course, for larger datatypes, this overflow won't happen so you'll get back 400.

*This wrap-around behavior is guaranteed only for unsigned integer types. For signed integers, it is technically undefined behavior in C and C++.

In many cases, signed integers will still exhibit the same wrap-around behavior. But you just can't count on it. (So your example with a signed 16-bit integer isn't guaranteed to hold.)


Although rare, there are cases where signed integer overflow does not wrap around as expected.

Can you detect a debugger attached to your process using Div by Zero

11 votes

Can you detect whether or not a debugger is attached to your native Windows process by using a high precision timer to time how long it takes to divide an integer by zero?

The rationale is that if no debugger is attached, you get a hard fault, which is handled by hardware and is very fast. If a debugger is attached, you instead get a soft fault, which is percolated up to the OS and eventually the debugger. This is relatively slow.

Most debuggers used by reverse engineers come with methods to remove 99% of the marks left by debuggers, and most of them provide exception filtering, meaning the speed difference would be undetectable.

It's more productive to prevent the debugger from attaching in the first place, but in the long run you'll never come out ahead unless you make the required investment of effort infeasible.

Is kill function synchronous?

10 votes

Is the kill function in Linux synchronous? Say I programmatically call the kill function to terminate a process: will it return only when the intended process has terminated, or does it just send the signal and return? If the latter, how can I make it wait for the intended process to be killed?

No. It doesn't kill anything; it only sends a signal to the process.

Depending on the signal, it can even be blocked or ignored.

You can't block kill -9, which sends SIGKILL.

To wait for the process to die:

while kill -0 PID_OF_THE_PROCESS 2>/dev/null; do sleep 1; done

Why MUST detach from tty when writing a linux daemon?

9 votes

When I tried to write a daemon under Linux using C, I was told I should add the following code after the fork code block:

/* Preparations */
...

/* Fork a new process */
pid_t cpid = fork();
if (cpid == -1){perror("fork");exit(1);}
if (cpid > 0){exit(0);}

/* WHY detach from tty ? */
int fd = open("/dev/tty", O_RDWR);
ioctl(fd, TIOCNOTTY, NULL);

/* Why set PGID as current PID ? */
setpgid(getpid(), 0);

My question is: is it really necessary to perform the above operations?

You must disassociate your daemon process from the terminal to avoid being sent signals related to the terminal's operation (like SIGHUP when the terminal session ends, as well as potentially SIGTTIN and SIGTTOU).

Note however that disassociating from the terminal using the TIOCNOTTY ioctl is largely obsolete. You should use setsid() instead.

The reason for a daemon to leave its original process group is to avoid receiving signals sent to that group. Note that setsid() also places your process in its own process group.

Android: Java, C or C++?

9 votes

I wrote some simple apps in Android using Java.
But later I found this:

It provides headers and libraries that allow you to build activities, handle user input, use hardware sensors, access application resources, and more, when programming in C or C++. (Source)

How is it related to this:

Android applications are written in the Java programming language. (Source)

Are all three languages possible?
Sorry for the dumb question.

The article you link to has good information. It also links to http://developer.android.com/sdk/ndk/overview.html which says:

The NDK will not benefit most applications. As a developer, you need to balance its benefits against its drawbacks; notably, using native code does not result in an automatic performance increase, but always increases application complexity. In general, you should only use native code if it is essential to your application, not just because you prefer to program in C/C++.

Typical good candidates for the NDK are self-contained, CPU-intensive operations that don't allocate much memory, such as signal processing, physics simulation, and so on. Simply re-coding a method to run in C usually does not result in a large performance increase. When examining whether or not you should develop in native code, think about your requirements and see if the Android framework APIs provide the functionality that you need. The NDK can, however, be an effective way to reuse a large corpus of existing C/C++ code.

Does every Objective C program get converted to C code?

8 votes

Since Objective-C is basically an extension of C, does the code get converted to pure C code before it is compiled to native code?

If so, does the conversion happen in RAM, or does the compiler create a temporary file containing C code on disk, which is then compiled by a C compiler to native code?

That Objective-C syntax is an extension of C syntax does not mean that it could not have its own compiler. C++ is the same way: its syntax is compatible with C (for the most part, anyway) but it has its own set of tools. Compilers for C, C++, and Objective-C can reuse parts of each other for preprocessing, syntactic analysis and code generation, but there is no need to run them sequentially (e.g. Objective-C ==> C ==> Target code). Compilers no longer go through human-readable assembly language, either (this has been the case for a very long time, too).