Best c questions in September 2010

Is the "struct hack" technically undefined behavior?

39 votes

What I am asking about is the well known "last member of a struct has variable length" trick. It goes something like this:

struct T {
    int len;
    char s[1];
};

struct T *p = malloc(sizeof(struct T) + 100);
p->len = 100;
strcpy(p->s, "hello world");

Because of the way that the struct is laid out in memory, we are able to overlay the struct over a larger than necessary block and treat the last member as if it were larger than the 1 char specified.

So the question is: is this technique technically undefined behavior? I would expect that it is, but I am curious what the standard says about this.

PS: I am aware of the C99 approach to this, I would like the answers to stick specifically to the version of the trick as listed above.

As the C FAQ says:

It's not clear if it's legal or portable, but it is rather popular.

and:

... an official interpretation has deemed that it is not strictly conforming with the C Standard, although it does seem to work under all known implementations. (Compilers which check array bounds carefully might issue warnings.)

The rationale behind the 'strictly conforming' bit is in the spec, section J.2 Undefined behavior, which includes in the list of undefined behavior:

  • An array subscript is out of range, even if an object is apparently accessible with the given subscript (as in the lvalue expression a[1][7] given the declaration int a[4][5]) (6.5.6).

Paragraph 8 of Section 6.5.6 Additive operators has another mention that access beyond defined array bounds is undefined:

If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
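
For contrast, the C99 approach that the question sets aside declares the last member with no size at all, so indexing past a declared bound never arises. A minimal sketch, assuming a hypothetical struct T99 and helper t99_new (neither name appears in the question):

```c
#include <stdlib.h>
#include <string.h>

/* C99 flexible array member: the last member has no declared size. */
struct T99 {
    int len;
    char s[];
};

/* Hypothetical helper: allocate a struct T99 holding a copy of text.
   Returns NULL if the allocation fails. */
struct T99 *t99_new(const char *text)
{
    size_t n = strlen(text);
    /* sizeof(struct T99) does not count the flexible member, so add
       room for the characters plus the terminating '\0'. */
    struct T99 *p = malloc(sizeof(struct T99) + n + 1);
    if (p != NULL) {
        p->len = (int)n;
        memcpy(p->s, text, n + 1);
    }
    return p;
}
```

The caller releases the whole object with a single free(p), just as with the original trick.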

Where does the word "pragma" come from?

31 votes

So I know what pragma is, and what it's used for, but what is the meaning of the word itself? I've used it many times in code, but I never really knew what the word actually means or stands for.

According to a US Government-owned(!) document describing the design of Ada: Rationale for the Design of the Ada® Programming Language :

A pragma (from the Greek word meaning action) is used to direct the actions of the compiler in particular ways, but has no effect on the semantics of a program (in general).

I like the (last caveat) there...

This cross-references well with online Greek dictionaries (e.g. as quoted by Martin York) that say pragma (πραγμα, as commented on the original question by asveikau) means:

  1. that which has been done, a deed, an accomplished fact
  2. what is done or being accomplished
    1. spec. business, a commercial transaction
  3. a matter, question, affair
    1. spec. in a forensic sense, a matter at law, case, suit
  4. that which is or exists, a thing

Seems the key to understanding is the word action rather than information.

Very fast 3D distance check?

21 votes

Is there a way to do a quick and dirty 3D distance check where the results are rough, but it is very, very fast? I need to do depth sorting and I'm using STL sort like this:

    bool sortfunc(CBox* a, CBox* b)
    {
        return a->Get3dDistance(Player.center,a->center) <
            b->Get3dDistance(Player.center,b->center);
    }

float CBox::Get3dDistance( Vec3 c1, Vec3 c2 )
{
    //(Dx*Dx+Dy*Dy+Dz*Dz)^.5 
    float dx = c2.x - c1.x;
    float dy = c2.y - c1.y;
    float dz = c2.z - c1.z;

    return sqrt((float)(dx * dx + dy * dy + dz * dz));
}

Is there possibly a way to do it without a square root or without possibly multiplication?

Thanks

You can leave out the square root because, for all non-negative numbers x and y, sqrt(x) < sqrt(y) if and only if x < y. A sum of squares of real numbers is always non-negative, so comparing the squared distances gives the same ordering as comparing the distances themselves.

You cannot eliminate the multiplication, however, without changing the algorithm. Here's a counterexample: if x is (3, 1, 1) and y is (4, 0, 0), |x| < |y| because sqrt(1*1+1*1+3*3) < sqrt(4*4+0*0+0*0) and 1*1+1*1+3*3 < 4*4+0*0+0*0, but 1+1+3 > 4+0+0.

Since modern CPUs can compute a dot product faster than they can actually load the operands from memory, it's unlikely that you would have anything to gain by eliminating the multiply anyway (I think the newest CPUs have a special instruction that can compute a dot product every 3 cycles!).

I would not consider changing the algorithm without doing some profiling first. Your choice of algorithm will heavily depend on the size of your dataset (does it fit in cache?), how often you have to run it, and what you do with the results (collision detection? proximity? occlusion?).
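
The square-root-free comparison described above can be sketched in C as follows. Vec3, Box, g_origin, dist_sq and by_distance are hypothetical stand-ins for the question's C++ types and comparator, not the asker's actual code:

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the question's Vec3 and CBox types. */
typedef struct { float x, y, z; } Vec3;
typedef struct { Vec3 center; } Box;

/* Reference point for sorting; plays the role of Player.center. */
static Vec3 g_origin;

/* Squared Euclidean distance: orders points the same way as the
   true distance, without calling sqrt. */
static float dist_sq(Vec3 a, Vec3 b)
{
    float dx = b.x - a.x;
    float dy = b.y - a.y;
    float dz = b.z - a.z;
    return dx * dx + dy * dy + dz * dz;
}

/* qsort comparator over an array of Box pointers, nearest first. */
static int by_distance(const void *pa, const void *pb)
{
    const Box *a = *(const Box *const *)pa;
    const Box *b = *(const Box *const *)pb;
    float da = dist_sq(g_origin, a->center);
    float db = dist_sq(g_origin, b->center);
    return (da > db) - (da < db);
}
```

Set g_origin to the player's position and call qsort(boxes, count, sizeof boxes[0], by_distance) on an array of Box pointers.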

Interview Hello World question

18 votes

This classic IOCCC entry is a hello world program written in C. Can anyone please provide an explanation of how it works?

Original code (syntax highlighting intentionally missing):

int i;main(){for(;i["]<i;++i){--i;}"];read('-'-'-',i+++"hell\
o, world!\n",'/'/'/'));}read(j,i,p){write(j/p+p,i---j,i/i);}

Slightly cleaner:

int i;
main()
{
  for ( ; i["]<i;++i){--i;}"]; read('-' - '-', i++ + "hello, world!\n", '/' / '/'));
}

read(j, i, p)
{
  write(j / p + p, i-- - j, i / i);
}

for loop condition

i["]<i;++i){--i;}"]

This expression takes advantage of the fact that array indexing is commutative in C. It is equivalent to:

"]<i;++i){--i;}"[i]

So the loop will terminate when the character at position i is \0, i.e., at the end of the string, which is 14 characters long (which happens to be the same length as "hello, world!\n"). So the for loop condition can be rewritten as:

i != 14
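
The commutativity can be checked directly. In this small sketch, cond, index_normal, index_swapped and iterations are names introduced only for illustration:

```c
/* The string literal used as the loop condition in the IOCCC entry. */
static const char cond[] = "]<i;++i){--i;}";

/* a[i] is defined as *(a + i), so i[a] names the same element. */
static char index_normal(int i)  { return cond[i]; }
static char index_swapped(int i) { return i[cond]; }

/* Count how many times the loop condition holds before it reaches
   the terminating '\0'. */
static int iterations(void)
{
    int i = 0;
    while (i[cond])
        ++i;
    return i;
}
```

Both accessors return the same character for every valid index, and iterations() counts the 14 characters of the literal.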

character arithmetic

read('-' - '-', i++ + "hello, world!\n", '/' / '/')

char is an integer type, and thus:

  • '-' - '-' is 0
  • '/' / '/' is 1

    read(0, i++ + "hello, world!\n", 1)


After fixing all the compiler warnings (like implicit int to pointer conversion), and simplifying the things mentioned above, the code becomes:

#include <unistd.h>

int i = 0;

void read2(int, char*, int);

int main()
{
   while (i != 14)
   {
      read2(0, i++ + "hello, world!\n", 1);
   }

   return 0;
}

void read2(int j, char* i, int p)
{
   write(j / p + p, i-- - j, 1);
}

(I renamed read to read2 to avoid conflicting with the Unix read function.)

Note that the j and p arguments to read2 are unneeded, as the function is always called with j=0 and p=1.

#include <unistd.h>

int i = 0;

void read2(char*);

int main()
{
   while (i != 14)
   {
      read2(i++ + "hello, world!\n");
   }

   return 0;
}

void read2(char* i)
{
   write(1, i--, 1);
}

The call write(1, i--, 1) writes 1 character from i to file descriptor 1 (stdout). And the postdecrement is superfluous because this i is a local variable never referenced again. So this function is equivalent to putchar(*i).

Inlining the read2 function within the main loop gives

#include <stdio.h>

int i = 0;

int main()
{
   while (i != 14)
   {
      putchar(*(i++ + "hello, world!\n"));
   }

   return 0;
}

for which the meaning is obvious.

what's the meaning of this piece of code? void (*signal(int sig, void (*func)(int)))(int);

Asked on Tue, 14 Sep 2010 by photon c
17 votes

I came across this piece of code and got completely lost trying to work out its meaning.

#include <signal.h>
void (*signal(int sig, void (*func)(int)))(int);

So would someone like to explain line 2 of the code in some detail? I know that void and int are types, that *func is a pointer to a function, and that brackets are for priority. But I still don't get the (*signal ...), the (int), and the whole thing combined together. The more detailed the better, but if you cannot provide a lot of detail, a few words are also welcome.


Thanks to all the explanations, I think I now understand the meaning and effect of this declaration. But I made a further experiment to help me understand what's going on, as below:

  1 #include <signal.h>
  2 void (*signal)(int sig, void (*func)(int));
  3 void (*signal)(int);  // then void (signal)(int) again.
  4 //void (*signal(int sig, void (*func)(int)))(int); //break this line into two lines above
  5
  6 int main(){}

In the above code, I broke void (*signal(int sig, void (*func)(int)))(int) into two lines. For line 3, I tried both void (*signal)(int) and void (signal)(int), with the same error result that indicates I was trying to redeclare signal:

TestDeclaration.c:2: error: 'signal' redeclared as different kind of symbol
/usr/include/signal.h:93: error: previous declaration of 'signal' was here
TestDeclaration.c:3: error: 'signal' redeclared as different kind of symbol
/usr/include/signal.h:93: error: previous declaration of 'signal' was here

Now I know both attempts are incorrect ways of declaring it, but why are they incorrect? Why is the original declaration NOT a redeclaration?

Well, I didn't notice that Bart van Ingen Schenau had answered this question to some extent.

It's the declaration of a function taking an int and a pointer to a function (taking int returning void) and returning a pointer to a function (taking int and returning void).


Explanation, or guide to interpretation

You can interpret by treating everything in parentheses as a single entity and then working inwards using the "declaration follows usage" rule.

void (*signal(int sig, void (*func)(int)))(int);

The entity in the brackets looks like a function taking int and returning void.

Stripping away the outer part:

*signal(int sig, void (*func)(int))

So, signal takes some parameters and returns something that can be dereferenced (due to the leading *) to form a function taking int and returning void.

This means signal is a function returning a pointer to a function (taking int and returning void).

Looking at the parameters it takes an int (i.e. sig) and void (*func)(int) which is a pointer to a function (taking int and returning void).
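
A typedef makes the same reading visible in the code itself. The sketch below uses a hypothetical function my_signal with the same shape as signal (so it does not clash with <signal.h>). The names handler_t, current_handler and on_event are also introduced only for illustration:

```c
/* A pointer to a function taking int and returning void. */
typedef void (*handler_t)(int);

/* The same function type declared two ways. The types are identical,
   so the compiler accepts both declarations side by side. */
void (*my_signal(int sig, void (*func)(int)))(int);
handler_t my_signal(int sig, handler_t func);

static handler_t current_handler;   /* last handler installed */

/* Toy definition: remember the new handler, return the previous one. */
handler_t my_signal(int sig, handler_t func)
{
    handler_t previous = current_handler;
    (void)sig;                       /* unused in this sketch */
    current_handler = func;
    return previous;
}

/* An example handler with the required signature. */
static void on_event(int sig) { (void)sig; }
```

With the typedef, the declaration reads left to right: my_signal takes an int and a handler_t, and returns a handler_t.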

Is it bad practice to use C features in C++?

16 votes

For example printf instead of cout, scanf instead of cin, using #define macros, etc?

I wouldn't say bad, as it depends on personal choice. My policy is: when a type-safe alternative is available in C++, use it, as it will reduce the errors in the code.

difference between int* i and int *i

14 votes

I'm converting a header file for a DLL written in C to Delphi so I can use the DLL.

My question is what is the difference between

int* i

and

int *i

I convert the first to

i: PInteger;

But I'm not sure what the correct conversion is for the second one in Delphi.

From my understanding, the first is a simple typed pointer and the second is a pointer variable, but I'm not sure what the difference is.

int* i and int *i are completely equivalent: whitespace around the * is not significant in a C declaration.

How to improve the performance of this Haskell program?

13 votes

I'm working through the problems in Project Euler as a way of learning Haskell, and I find that my programs are a lot slower than a comparable C version, even when compiled. What can I do to speed up my Haskell programs?

For example, my brute-force solution to Problem 14 is:

import Data.Int
import Data.Ord
import Data.List

searchTo = 1000000

nextNumber :: Int64 -> Int64
nextNumber n
    | even n    = n `div` 2
    | otherwise = 3 * n + 1

sequenceLength :: Int64 -> Int
sequenceLength 1 = 1
sequenceLength n = 1 + (sequenceLength next)
    where next = nextNumber n

longestSequence = maximumBy (comparing sequenceLength) [1..searchTo]

main = putStrLn $ show $ longestSequence

Which takes around 220 seconds, while an "equivalent" brute-force C version only takes 1.2 seconds.

#include <stdio.h>

int main(int argc, char **argv)
{
    int longest = 0;
    int terms = 0;
    int i;
    unsigned long j;

    for (i = 1; i <= 1000000; i++)
    {
        j = i;
        int this_terms = 1;

        while (j != 1)
        {
            this_terms++;

            if (this_terms > terms)
            {
                terms = this_terms;
                longest = i;
            }

            if (j % 2 == 0)
                j = j / 2;
            else
                j = 3 * j + 1;
        }
    }

    printf("%d\n", longest);
    return 0;
}

What am I doing wrong? Or am I naive to think that Haskell could even approach C's speed?

(I'm compiling the C version with gcc -O2, and the Haskell version with ghc --make -O).

For testing purposes I have just set searchTo = 100000. The time taken is 7.34s. A few modifications lead to some big improvements:

  1. Use an Integer instead of Int64. This improves the time to 1.75s.

  2. Use an accumulator (you don't need sequenceLength to be lazy, right?). 1.54s.

    seqLen2 :: Int -> Integer -> Int
    seqLen2 a 1 = a
    seqLen2 a n = seqLen2 (a+1) (nextNumber n)
    
    
    sequenceLength :: Integer -> Int
    sequenceLength = seqLen2 1
    
  3. Rewrite the nextNumber using quotRem, thus avoiding computing the division twice (once in even and once in div). 1.27s.

    nextNumber :: Integer -> Integer
    nextNumber n 
        | r == 0    = q
        | otherwise = 6*q + 4
        where (q,r) = quotRem n 2 
    
  4. Use Schwartzian transform instead of maximumBy. The problem of maximumBy . comparing is that the sequenceLength function is called more than once for each value. 0.32s.

    longestSequence = snd $ maximum [(sequenceLength a, a) | a <- [1..searchTo]]
    

Note:

  • I check the time by compiling with ghc -O and running with +RTS -s.
  • My machine is running Mac OS X 10.6. The GHC version is 6.12.2. The compiled file is in i386 architecture.
  • The C program runs in 0.078s with the corresponding parameter. It is compiled with gcc -O3 -m32.

Interruptible in-place sorting algorithm

13 votes

I need to write a sorting program in C and it would be nice if the file could be sorted in place to save disk space. The data is valuable, so I need to ensure that if the process is interrupted (ctrl-c) the file is not corrupted. I can guarantee the power cord on the machine will not be yanked.

Extra details: file is ~40GB, records are 128-bit, machine is 64-bit, OS is POSIX

Any hints on accomplishing this, or notes in general?

Thanks!

To clarify: I expect the user will want to ctrl-c the process. In this case, I want to exit gracefully and ensure that the data is safe. So this question is about handling interrupts and choosing a sort algorithm that can wrap up quickly if requested.

Install a handler for SIGINT that just sets a "process should exit soon" flag.

In your sort, check the flag after every swap of two records (or after every N swaps). If the flag is set, bail out.
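
A minimal sketch of that flag pattern follows. The in-memory selection sort is only a hypothetical stand-in for the real on-disk sort, and stop_requested, on_sigint, install_sigint_handler and interruptible_sort are names chosen for illustration:

```c
#include <signal.h>
#include <stddef.h>

/* Set by the handler. volatile sig_atomic_t is the type the C standard
   allows a signal handler to write safely. */
static volatile sig_atomic_t stop_requested = 0;

static void on_sigint(int sig)
{
    (void)sig;
    stop_requested = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int install_sigint_handler(void)
{
    return signal(SIGINT, on_sigint) == SIG_ERR ? -1 : 0;
}

/* Selection sort that checks the flag before each swap step.
   Returns 1 if it finished, 0 if it stopped early. Because it only
   ever swaps whole records, the array is always a permutation of
   its input, even when interrupted. */
static int interruptible_sort(unsigned long *v, size_t n)
{
    for (size_t i = 0; i + 1 < n; ++i) {
        size_t min = i;
        if (stop_requested)
            return 0;               /* bail out between swaps */
        for (size_t j = i + 1; j < n; ++j)
            if (v[j] < v[min])
                min = j;
        unsigned long tmp = v[i];
        v[i] = v[min];
        v[min] = tmp;
    }
    return 1;
}
```

For a file on disk, each swap would also have to reach the file completely before the flag is checked. The sketch does not cover that part.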

How to find repeating sequence of characters in given array

13 votes

My problem is to find the repeating sequence of characters in a given array, or simply, to identify the pattern in which the characters appear.

example: for the examples in the above image the output for the

First array should be "JAMESON"

Second array should be "RON"

Third array should be "SHAMIL"

Fourth array should be "CARPENTER"

How can I deal with this problem efficiently?

For your examples, my first approach would be to

  1. get the first character of the array (for your last example, that would be C)
  2. get the index of the next appearance of that character in the array (e.g. 9)
  3. if it is found, search for the next appearance of the substring between the two appearances of the character (in this case CARPENTER)
  4. if it is found, you're done (and the result is this substring).

Of course, this works only for a very limited subset of possible arrays, where the same word is repeated over and over again, starting from the beginning, without stray characters in between, and its first character is not repeated within the word. But all your examples fall into this category - and I prefer the simplest solution which could possibly work :-)

If the repeated word contains the first character multiple times (e.g. CACTUS), the algorithm can be extended to look for subsequent occurrences of that character too, not only the first one (so that it finds the whole repeated word, not only a substring of it).

Note that this extended algorithm would give a different result for your second example, namely RONRON instead of RON.
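
The basic four-step version (not the extension for words such as CACTUS) might look like this in C. repeat_length is a hypothetical name, and the sketch simplifies step 3 by requiring the candidate word to reappear immediately after its first occurrence:

```c
#include <string.h>

/* Returns the length of the repeated word at the start of text[0..n),
   or 0 if this simple heuristic finds none. */
static size_t repeat_length(const char *text, size_t n)
{
    if (n < 2)
        return 0;

    /* Steps 1 and 2: find the next appearance of the first character. */
    const char *next = memchr(text + 1, text[0], n - 1);
    if (next == NULL)
        return 0;

    size_t len = (size_t)(next - text);

    /* Step 3: the candidate word must appear again right after itself. */
    if (2 * len > n || memcmp(text, text + len, len) != 0)
        return 0;

    /* Step 4: the first len characters are the repeated word. */
    return len;
}
```

For an array that starts CARPENTERCARPENTER... the function returns 9, the length of CARPENTER.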

Culling techniques for rendering lots of cubes

13 votes

I am working on a personal learning project to make a Minecraft clone. It is working very well aside from one thing. Similar to Minecraft, my terrain has lots of cubes stacked on the Y axis so you can dig down. Although I do frustum culling, this still means that I uselessly draw all the layers of cubes below me. The cubes are X, Y and Z ordered (although only in one direction, so it's not technically Z ordered relative to the camera). Starting from the player's position, I only add pointers to cubes around the player. I then do frustum culling against these. I do not do octree subdivision. I thought of simply not rendering the layers below the player, except this does not work if the player looks down into a hole. Given this, how could I avoid rendering cubes below me that I cannot see, or cubes that are hidden by other cubes?

Thanks

void CCubeGame::SetPlayerPosition()
{
    PlayerPosition.x = Camera.x / 3;
    PlayerPosition.y = ((Camera.y - 2.9) / 3) - 1;
    PlayerPosition.z = Camera.z / 3;
}

void CCubeGame::SetCollids()
{
    SetPlayerPosition();

    int xamount = 70;
    int zamount = 70;
    int yamount = 17;

    int xamountd = xamount * 2;
    int zamountd = zamount * 2;
    int yamountd = yamount * 2;

    // Move to the corner of the region surrounding the player.
    PlayerPosition.x -= xamount;
    PlayerPosition.y -= yamount;
    PlayerPosition.z -= zamount;

    collids.clear();
    CBox* tmp;

    for(int i = 0; i < xamountd; ++i)
    {
        for(int j = yamountd; j > 0; --j)
        {
            for(int k = zamountd; k > 0; --k)
            {
                tmp = GetCube(PlayerPosition.x + i, PlayerPosition.y + j, PlayerPosition.z + k);

                // Keep only cubes that exist and pass the frustum test.
                if(tmp != 0)
                {
                    if(frustum.sphereInFrustum(tmp->center,25) != NULL)
                    {
                        collids.push_back(tmp);
                    }
                }
            }
        }
    }
}

Render front to back. To do so you don't need sorting, use octrees. The leaves won't be individual cubes, rather larger groups of those.

A mesh for each such leaf should be cached in a display list (as Bobmitch suggested) or even better in a vertex buffer (cheaper to update). When you generate this mesh don't generate all the cubes in a brute-force manner. Instead, for each cube face check if it has an opaque neighbor within the same leaf, if so you don't need to generate this face at all. You can also unify neighboring faces with the same material into a single long rectangle. You can also separate the mesh to six sets, one set for each principal direction: +/-XYZ faces. Draw only those sets of faces that may face the camera.

Rendering front to back doesn't help by itself. However you can use occlusion culling offered by modern hardware to benefit from this ordering. Before rendering an octree leaf, check if its bbox passes the occlusion query. If it doesn't pass you don't need to draw it at all.

An alternative approach to occlusion queries may be ray tracing. Ray tracing is good for rendering such environments. You can cast a sparse set of rays to approximate which leaves are visible and draw only those leaves. However, this will underestimate the visibility set.

C/C++ Checking for NULL pointer

13 votes

In a recent code review, a contributor is trying to enforce that all NULL checks on pointers be performed in the following manner:

int * some_ptr;
// ...
if( some_ptr == NULL )
{
  // handle null-pointer error
}
else
{
  // proceed
}

instead of

int * some_ptr;
// ...
if( some_ptr )
{
  // proceed
}
else
{
  //handle null-pointer error
}

I agree that his way is a little more clear in the sense that it's explicitly saying "Make sure this pointer is not NULL" but I would counter that by saying that anyone who's working on this code would understand that using a pointer variable in an if statement is implicitly checking for NULL. Also I feel the second method has a smaller chance of introducing a bug of the ilk:

if( some_ptr = NULL )

which is just an absolute pain to find and debug.

Can anyone weigh in on which way you prefer and why?

In my experience, tests of the form if (ptr) or if (!ptr) are preferred. They do not depend on the definition of the symbol NULL. They do not expose the opportunity for the accidental assignment. And they are clear and succinct.

Edit: As SoapBox points out in a comment, they are compatible with C++ classes such as auto_ptr that are objects that act as pointers and which provide a conversion to bool to enable exactly this idiom. For these objects, an explicit comparison to NULL would have to invoke a conversion to pointer which may have other semantic side effects or be more expensive than the simple existence check that the bool conversion implies.

I have a preference for code that says what it means without unneeded text. if (ptr != NULL) has the same meaning as if (ptr) but at the cost of redundant specificity. The next logical thing is to write if ((ptr != NULL) == TRUE) and that way lies madness. The C language is clear that a boolean tested by if, while or the like has a specific meaning of non-zero value is true and zero is false. Redundancy does not make it clearer.

changing const value in C

Asked on Tue, 14 Sep 2010 by Manas c++ c
11 votes

I find that in the following code snippet

const int i = 2;  
const int* ptr1= &i;  
int* ptr2 = (int*)ptr1;  
*ptr2 =3;

i's value changes to 3. What I would like to know is why this is allowed. What are the situations in which this could become helpful?

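It's allowed because you have overruled the constness of ptr1 by casting it to a non-const pointer. This is why casts can be very dangerous.

Note that the compiler accepts the cast, but writing through ptr2 is undefined behavior because i itself was defined const, so the observed change to 3 is not guaranteed. GCC can warn about casts that drop a qualifier if you compile with -Wcast-qual.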

Using printf with a non-null terminated string

11 votes

Suppose you have a string which is NOT null terminated and you know its exact size. How can you print that string with printf in C? I recall such a method but I cannot find it now...

There is a possibility with printf, it goes like this:

printf("%.*s", stringLength, pointerToString);

No need to copy anything, no need to modify the original string or buffer.
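
A short sketch of the same precision trick, using snprintf instead of printf so that the result can be inspected. format_counted is a hypothetical helper name:

```c
#include <stdio.h>

/* Format the first len bytes of buf (which need not be null terminated)
   into out. Returns the number of characters written, as snprintf does. */
static int format_counted(char *out, size_t out_size,
                          const char *buf, size_t len)
{
    /* The '*' precision consumes an int argument, hence the cast. */
    return snprintf(out, out_size, "%.*s", (int)len, buf);
}
```

The cast matters when the length is stored in a size_t: passing a size_t where printf expects an int is undefined behavior.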

is "unix" restricted keyword in C?

11 votes

This code does not compile for me on gcc version 4.3.2 (Debian 4.3.2-1.1)

main(){
  int unix;
}

I've checked the C keywords list and "unix" is not one of them. Why am I getting the following error?

unix.c:2: error: expected identifier or ‘(’ before numeric constant

Anybody?

unix is not an identifier reserved by the Standard.

If you compile with -std=c89 or -std=c99 the gcc compiler will accept the program as you expected.

From gcc manual ( http://gcc.gnu.org/onlinedocs/cpp/System_002dspecific-Predefined-Macros.html ), the emphasis is mine.

... However, historically system-specific macros have had names with no special prefix; for instance, it is common to find unix defined on Unix systems. For all such macros, GCC provides a parallel macro with two underscores added at the beginning and the end. If unix is defined, __unix__ will be defined too. There will never be more than two underscores; the parallel of _mips is __mips__.

How to detect encodings on signed integers in C?

11 votes

The ISO C standard allows three encoding methods for signed integers: two's complement, one's complement and sign/magnitude.

What's an efficient or good way to detect the encoding at runtime (or some other time if there's a better solution)? I want to know this so I can optimise a bignum library for the different possibilities.

I plan on calculating this and storing it in a variable each time the program runs so it doesn't have to be blindingly fast - I'm assuming the encoding won't change during the program run :-)

You just have to check the low-order bits of the constant -1 with something like -1 & 3. This evaluates to

  • 1 for sign and magnitude,
  • 2 for one's complement and
  • 3 for two's complement.

This should even be possible to do in a preprocessor expression inside #if #else constructs.
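
Both the run-time and the preprocessor form of the test can be sketched as below. The enum, detect_encoding and ENCODING_NAME are illustrative names, and the #if version assumes the preprocessor evaluates -1 & 3 with the same representation as the target machine:

```c
/* The three encodings, numbered by the value of -1 & 3 on each. */
enum int_encoding {
    SIGN_MAGNITUDE  = 1,
    ONES_COMPLEMENT = 2,
    TWOS_COMPLEMENT = 3
};

/* Run-time check: look at the two low-order bits of -1. */
static enum int_encoding detect_encoding(void)
{
    return (enum int_encoding)(-1 & 3);
}

/* The same test at preprocessing time. */
#if (-1 & 3) == 3
#define ENCODING_NAME "two's complement"
#elif (-1 & 3) == 2
#define ENCODING_NAME "one's complement"
#else
#define ENCODING_NAME "sign and magnitude"
#endif
```

The preprocessor form lets a bignum library select its code path at compile time instead of storing the result in a variable.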

Which file systems support splicing via Linux's splice(2)?

10 votes

The man page for the splice system call says that splice may fail and set errno to EINVAL if:

Target file system doesn't support splicing; neither of the descriptors refers to a pipe; or offset given for non-seekable device

Which file systems support splicing?

My original answer was partially incorrect; this is a major rewrite.

Linux 2.6.30.10 and below

In Linux 2.6.30.10 and older, splice returns EINVAL when the source or target filesystem does not support splicing. Here are the filesystems that do support splicing:

  • in read mode: adfs, affs, afs, bfs, btrfs, coda, ecryptfs, exofs, ext2, ext3, ext4, fat, fuse, hpfs, jffs2, jfs, minix, nfs, nilfs2, ntfs, ocfs2, omfs, qnx4, reiserfs, smbfs, sysv, ubifs, udf, ufs.
  • in write mode: exofs, ext2, ext3, ext4, jfs, ocfs2, reiserfs, ubifs.

Details follow. Support for splicing is determined in the do_splice_to() function in the "file to pipe" case and in the do_splice_from() function in the "pipe to file" case. It is done by checking whether the relevant struct file_operations contains .splice_read or .splice_write, respectively. In order to produce the above lists of filesystems, I've grepped fs/*/file.c for .splice_read and .splice_write.

Linux 2.6.31 and above

Starting with Linux 2.6.31, all the filesystems support splicing both in read and write modes.

Details follow. When a filesystem does not have .splice_read or .splice_write in its struct file_operations, a fallback function is used: default_file_splice_read and default_file_splice_write, respectively. See do_splice_to() and do_splice_from() for implementations. Note: EINVAL may still be returned for other reasons listed in the documentation.

Help me understand the assembly code

8 votes

I am trying to understand the assembly level code for a simple C program by inspecting it with gdb's disassembler.

Following is the C code:

#include <stdio.h>

void function(int a, int b, int c) {
   char buffer1[5];
   char buffer2[10];
}

void main() {
  function(1,2,3);
}

Following is the disassembly code for both main and function

(gdb) disass main
Dump of assembler code for function main:
0x08048428 <main+0>:    push   %ebp
0x08048429 <main+1>:    mov    %esp,%ebp
0x0804842b <main+3>:    and    $0xfffffff0,%esp
0x0804842e <main+6>:    sub    $0x10,%esp
0x08048431 <main+9>:    movl   $0x3,0x8(%esp)
0x08048439 <main+17>:   movl   $0x2,0x4(%esp)
0x08048441 <main+25>:   movl   $0x1,(%esp)
0x08048448 <main+32>:   call   0x8048404 <function>
0x0804844d <main+37>:   leave  
0x0804844e <main+38>:   ret
End of assembler dump.

(gdb) disass function
Dump of assembler code for function function:
0x08048404 <function+0>:    push   %ebp
0x08048405 <function+1>:    mov    %esp,%ebp
0x08048407 <function+3>:    sub    $0x28,%esp
0x0804840a <function+6>:    mov    %gs:0x14,%eax
0x08048410 <function+12>:   mov    %eax,-0xc(%ebp)
0x08048413 <function+15>:   xor    %eax,%eax
0x08048415 <function+17>:   mov    -0xc(%ebp),%eax
0x08048418 <function+20>:   xor    %gs:0x14,%eax
0x0804841f <function+27>:   je     0x8048426 <function+34>
0x08048421 <function+29>:   call   0x8048340 <__stack_chk_fail@plt>
0x08048426 <function+34>:   leave  
0x08048427 <function+35>:   ret    
End of assembler dump.

I am seeking answers for following things :

  1. how the addressing is working , I mean (main+0) , (main+1), (main+3)
  2. In the main, why is $0xfffffff0,%esp being used
  3. In the function, why is %gs:0x14,%eax , %eax,-0xc(%ebp) being used.
  4. If someone can explain , step by step happening, that will be greatly appreciated.

The reason for the "strange" addresses such as main+0, main+1, main+3, main+6 and so on, is because each instruction takes up a variable number of bytes. For example:

main+0: push %ebp

is a one-byte instruction so the next instruction is at main+1. On the other hand,

main+3: and $0xfffffff0,%esp

is a three-byte instruction so the next instruction after that is at main+6.

And, since you ask in the comments why movl seems to take a variable number of bytes, the explanation for that is as follows.

Instruction length depends not only on the opcode (such as movl) but also on the addressing modes for the operands (the things the opcode is operating on). I haven't checked specifically for your code but I suspect the

movl $0x1,(%esp)

instruction is probably shorter because there's no offset involved - it just uses esp as the address. Whereas something like:

movl $0x2,0x4(%esp)

requires everything that movl $0x1,(%esp) does, plus an extra byte for the offset 0x4.

In fact, here's a debug session showing what I mean:

Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

c:\pax> debug
-a
0B52:0100 mov word ptr [di],7
0B52:0104 mov word ptr [di+2],8
0B52:0109 mov word ptr [di+0],7
0B52:010E
-u100,10d
0B52:0100 C7050700      MOV     WORD PTR [DI],0007
0B52:0104 C745020800    MOV     WORD PTR [DI+02],0008
0B52:0109 C745000700    MOV     WORD PTR [DI+00],0007
-q
c:\pax> _

You can see that the second instruction with an offset is actually different to the first one without it. It's one byte longer (5 bytes instead of 4, to hold the offset) and actually has a different encoding c745 instead of c705.

You can also see that you can encode the first and third instruction in two different ways but they basically do the same thing.


The and $0xfffffff0,%esp instruction is a way to force esp to be on a specific boundary. This is used to ensure proper alignment of variables. Many memory accesses on modern processors will be more efficient if they follow the alignment rules (such as a 4-byte value having to be aligned to a 4-byte boundary). Some modern processors will even raise a fault if you don't follow these rules.

After this instruction, you're guaranteed that esp is both less than or equal to its previous value and aligned to a 16 byte boundary.
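
The effect of the mask is easy to reproduce in C. In this sketch, align_down_16 is a hypothetical helper that applies the same constant to a 32-bit value standing in for esp:

```c
#include <stdint.h>

/* Round an address down to a 16-byte boundary, as
   "and $0xfffffff0,%esp" does for the 32-bit stack pointer. */
static uint32_t align_down_16(uint32_t addr)
{
    /* Clearing the low four bits can only lower the value or leave
       it unchanged. */
    return addr & 0xfffffff0u;
}
```

For a made-up stack pointer of 0xbffff4bc, the result is 0xbffff4b0: not larger than the input, and a multiple of 16.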


The gs: prefix simply means to use the gs segment register to access memory rather than the default.

The instruction mov %eax,-0xc(%ebp) means to take the contents of the ebp register, subtract 12 (0xc) and then put the value of eax into that memory location.


Re the explanation of the code. Your function function is basically one big no-op. The assembly generated is limited to stack frame setup and teardown, along with some stack frame corruption checking which uses the aforementioned %gs:0x14 memory location.

It loads the value from that location (probably something like 0xdeadbeef) into the stack frame, does its job, then checks the stack to ensure it hasn't been corrupted.

Its job, in this case, is nothing. So all you see is the function administration stuff.

Stack set-up occurs between function+0 and function+12. Everything after that is setting up the return code in eax and tearing down the stack frame, including the corruption check.

Similarly, main consists of stack frame set-up, pushing the parameters for function, calling function, tearing down the stack frame and exiting.

Comments have been inserted into the code below:

0x08048428 <main+0>:    push   %ebp                 ; save previous value.
0x08048429 <main+1>:    mov    %esp,%ebp            ; create new stack frame.
0x0804842b <main+3>:    and    $0xfffffff0,%esp     ; align to boundary.
0x0804842e <main+6>:    sub    $0x10,%esp           ; make space on stack.

0x08048431 <main+9>:    movl   $0x3,0x8(%esp)       ; push values for function.
0x08048439 <main+17>:   movl   $0x2,0x4(%esp)
0x08048441 <main+25>:   movl   $0x1,(%esp)
0x08048448 <main+32>:   call   0x8048404 <function> ; and call it.

0x0804844d <main+37>:   leave                       ; tear down frame.
0x0804844e <main+38>:   ret                         ; and exit.

0x08048404 <func+0>:    push   %ebp                 ; save previous value.
0x08048405 <func+1>:    mov    %esp,%ebp            ; create new stack frame.
0x08048407 <func+3>:    sub    $0x28,%esp           ; make space on stack.
0x0804840a <func+6>:    mov    %gs:0x14,%eax        ; get sentinel value.
0x08048410 <func+12>:   mov    %eax,-0xc(%ebp)      ; put on stack.

0x08048413 <func+15>:   xor    %eax,%eax            ; set return code 0.

0x08048415 <func+17>:   mov    -0xc(%ebp),%eax      ; get sentinel from stack.
0x08048418 <func+20>:   xor    %gs:0x14,%eax        ; compare with actual.
0x0804841f <func+27>:   je     <func+34>            ; jump if okay.
0x08048421 <func+29>:   call   <_stk_chk_fl>        ; otherwise corrupted stack.
0x08048426 <func+34>:   leave                       ; tear down frame.
0x08048427 <func+35>:   ret                         ; and exit.

I think the reason for the %gs:0x14 may be evident from above but, just in case, I'll elaborate here.

It uses this value (a sentinel) to put in the current stack frame so that, should something in the function do something silly like write 1024 bytes to a 20-byte array created on the stack or, in your case:

char buffer1[5];
strcpy (buffer1, "Hello there, my name is Pax.");

then the sentinel will be overwritten and the check at the end of the function will detect that, calling the failure function to let you know, and then probably aborting so as to avoid any other problems.

If it placed 0xdeadbeef onto the stack and this was changed to something else, then an xor with 0xdeadbeef would produce a non-zero value which is detected in the code with the je instruction.

The relevant bit is paraphrased here:

          mov    %gs:0x14,%eax     ; get sentinel value.
          mov    %eax,-0xc(%ebp)   ; put on stack.

          ;; Weave your function
          ;;   magic here.

          mov    -0xc(%ebp),%eax   ; get sentinel back from stack.
          xor    %gs:0x14,%eax     ; compare with original value.
          je     stack_ok          ; zero/equal means no corruption.
          call   stack_bad         ; otherwise corrupted stack.
stack_ok: leave                    ; tear down frame.

How to prevent a Linux program from running more than once?

8 votes

What is the best way to prevent a Linux program/daemon from being executed more than once at a given time?

The most common way is to create a PID file: define a location where the file will go (inside /var/run is common). On successful startup, you'll write your PID to this file. When deciding whether to start up, read the file and check to make sure that the referenced process doesn't exist (or if it does, that it's not an instance of your daemon: on Linux, you can look at /proc/$PID/exe). On shutdown, you may remove the file but it's not strictly necessary.

There are scripts to help you do this, you may find start-stop-daemon to be useful: it can use PID files or even just check globally for the existence of an executable. It's designed precisely for this task and was written to help people get it right.
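
A rough C sketch of the PID file check follows. pidfile_in_use and write_pidfile are hypothetical helpers, kill with signal 0 is used as the existence test, and the /proc/$PID/exe comparison mentioned above is left out:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the PID file names a process that still exists,
   0 if the file is missing, unreadable, or stale. */
static int pidfile_in_use(const char *path)
{
    long pid = 0;
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return 0;
    int ok = fscanf(f, "%ld", &pid) == 1;
    fclose(f);
    if (!ok || pid <= 0)
        return 0;
    /* Signal 0 sends nothing; it only checks that the process exists.
       EPERM means it exists but belongs to another user. */
    return kill((pid_t)pid, 0) == 0 || errno == EPERM;
}

/* Record our own PID; returns 0 on success. */
static int write_pidfile(const char *path)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    fprintf(f, "%ld\n", (long)getpid());
    return fclose(f);
}
```

There is a window between the check and the write in which a second instance could start. This sketch does not close that window.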

Why mkdir fails to work with tilde (~)?

7 votes

When I write

mkdir("~/folder1" , 0777);

in Linux, it fails to create the directory. If I replace the ~ with the expanded home directory, it works fine. What is the problem with using ~?

Thanks

~ is known only to the shell and not to the mkdir system call.

But if you try:

system("mkdir ~/foo");

this works, because the string "mkdir ~/foo" is passed to a shell, and the shell expands ~ to $HOME.

If you want to make use of the $HOME with mkdir, you can make use of the getenv function as:

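#include <limits.h>    /* PATH_MAX */
#include <stdio.h>     /* snprintf */
#include <stdlib.h>    /* getenv */
#include <sys/stat.h>  /* mkdir */

char path[PATH_MAX];
char *home = getenv("HOME");
if (home != NULL) {
        snprintf(path, sizeof(path), "%s/new_dir", home);
        /* now use the expanded path in mkdir */
        mkdir(path, 0777);
}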