Best linux questions in December 2011

Why does 'main' not return 0 here?

33 votes

I was just reading ISO/IEC 9899:201x Committee Draft — April 12, 2011 in which I found under 5.1.2.2.3 Program termination:

..reaching the } that terminates the main function returns a value of 0.

It means that if you don't specify a return statement in main(), and the program runs successfully, then reaching the closing brace } of main() returns 0.

But in the following code I don't specify any return statement, yet it does not return 0.

#include<stdio.h>
int sum(int a,int b)
{
    return (a + b);
}

int main()
{
    int a=10;
    int b=5;
    int ans;
    ans=sum(a,b);
    printf("sum is %d",ans);
}

Compile:

gcc test.c
./a.out
sum is 15
echo $?
9          // Here it should be 0, but it shows 9. Why?

That rule was added in the 1999 version of the C standard. In C90, the status returned is undefined.

You can enable it by passing -std=c99 to gcc.

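As a side note, the 9 is most likely the return value of printf, which had just written 9 characters ("sum is 15") and whose result was presumably still sitting in the register used for return values when main ended.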

How can I get what my main function has returned?

31 votes

In a C program if we want to give some input from terminal then we can give it by:

int main(int argc, char *argv[])

In the same way, if we want to get return value of main() function then how can we get it?

In each main() we write return 1 or return 0; how can I know what my main() has returned at terminal?

Edit 1

I get that echo $? shows the return value of main(), but it only allows values less than 125 (on Linux) to be returned successfully; a return value larger than that cannot be received correctly through the $? variable. So:

Why is the return type of main() int? Why not make it short int?

Edit 2

Where can I find out the meaning of the error code if main() has returned a value greater than 125?

Most shells store the exit code of the previous run command in $? so you can store or display it.

$ ./a.out
$ echo $?     # note - after this command $? contains the exit code of echo!

or

$ ./a.out
$ exit_code=$?    # save the exit code in another shell variable.

Note that under Linux, although main returns an int, only the low 8 bits of the value reach the parent process, and generally only values less than 126 are safe to use. Shells such as bash reserve the higher values to report errors that occur when attempting to run a command (126 and 127) or to record which signal, if any, terminated your program (128 plus the signal number).

What is the reason for having unreserved identifiers as built-in macros in gcc?

19 votes

Today I stumbled upon a rather interesting compiler error:

int main() {
  int const unix = 0; // error-line
  return unix;
}

Gives the following message with gcc 4.3.2 (yes, ancient...):

error: expected unqualified-id before numeric constant

which is definitely quite confusing.

Fortunately, clang (3.0) is a little more helpful (as usual):

error: expected unqualified-id
  int const unix = 0
            ^
<built-in>:127:14: note: expanded from:
#define unix 1
             ^

I certainly did not expect unix, which is neither written in upper case nor begins with an underscore, to be a macro, especially a built-in one.

I checked the predefined macros in gcc and there are 2 (on my platform) that use "unreserved" symbols:

$ g++ -E -dM - < /dev/null | grep -v _
#define unix 1
#define linux 1

All the others are "well-behaved" macros with leading underscores, using the traditional reserved identifiers, sample:

#define __linux 1
#define __linux__ 1
#define __gnu_linux__ 1

#define __unix__ 1
#define __unix 1

#define __CHAR_BIT__ 8
#define __x86_64 1
#define __amd64 1
#define _LP64 1

(it's a mess and there does not seem to be any particular order...)

Furthermore, there are lots of "similar" symbols, so I guess there is an issue of backward compatibility...

So, where do the unix and linux macros come from ?

gcc does not fully conform to any C standard by default.

Invoke it with -ansi or -std=c99 and unix won't be predefined.

Quoting the GNU preprocessor documentation (info cpp, version 4.5):

The C standard requires that all system-specific macros be part of the "reserved namespace". All names which begin with two underscores, or an underscore and a capital letter, are reserved for the compiler and library to use as they wish. However, historically system-specific macros have had names with no special prefix; for instance, it is common to find `unix' defined on Unix systems. For all such macros, GCC provides a parallel macro with two underscores added at the beginning and the end. If `unix' is defined, `__unix__' will be defined too. There will never be more than two underscores; the parallel of `_mips' is `__mips__'.

When the `-ansi' option, or any `-std' option that requests strict conformance, is given to the compiler, all the system-specific predefined macros outside the reserved namespace are suppressed. The parallel macros, inside the reserved namespace, remain defined.

We are slowly phasing out all predefined macros which are outside the reserved namespace. You should never use them in new programs, and we encourage you to correct older code to use the parallel macros whenever you find it. We don't recommend you use the system-specific macros that are in the reserved namespace, either. It is better in the long run to check specifically for features you need, using a tool such as `autoconf'.

The current version of the manual is here.

open 100 files in vim

11 votes

I need to grep through tons (10k+) of files for specific words. That returns a list of files, which I then need to grep for another word.

I found that grep can do this, so I use:

grep -rl word1 *

which returns the list of files I want to check. Now, in these files (100+), I need to grep for another word, so I have to do another grep:

vim `grep word2 `grep -rl word1 *``

but that hangs and does not do anything.

Why?

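Because backquotes cannot be nested without escaping. The shell pairs the first two backquotes, so it runs grep word2 with no file arguments, and that grep waits for input on stdin, which is why it appears to hang. Use $() for the inner command substitution, and add -l so that the outer grep prints file names rather than matching lines: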

vi `grep -l 'word2' $(grep -rl 'word1' *)`

(And welcome to stackoverflow)

how can I detect whether a specific page is mapped in memory?

9 votes

I would like to detect whether or not a specific page has already been mapped in memory. The goal here is to be able to perform this check before calling mmap with a fixed memory address. The following code illustrates what happens in this case by default: mmap silently remaps the original memory pages.

#include <sys/mman.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
  int page_size;
  void *ptr;
  page_size = getpagesize();
  ptr = mmap(0, 10 * page_size, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
  if (ptr == MAP_FAILED) {
    printf ("map1 failed\n");
    return 1;
  }
  ((int *)ptr)[0] = 0xdeadbeaf;
  ptr = mmap(ptr, 2 * page_size, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, 0, 0);
  if (ptr == MAP_FAILED) {
    printf ("map2 failed\n");
    return 1;
  }
  if (((int *)ptr)[0] != 0xdeadbeaf) {
    printf ("oops, data gone !\n");
  }
  return 0;
}

I understand that I could open and parse /proc/self/maps to figure out which memory range has been allocated and infer from that if I can safely request a specific memory range with mmap but I am looking for a proper API: is there such a thing ?

msync(addr, len, 0) and checking for ENOMEM seems to work (with a fairly superficial test).

How do I create a shortcut folder on a linux web server?

8 votes

I have some folders on my web server that link to other folders.

I am wondering how this is done?

Example: I want http://www.example.com/public_html/css/ to point to http://www.example.com/public_html/wp-content/themes/theme-name/css/

The easiest way is to create a symlink like this:

cd /path/to/public_html/
ln -s /path/to/public_html/wp-content/themes/theme-name/css/ css

example:

cd ~
ln -s wp-content/themes/theme-name/css/ css

C standard I/O vs UNIX I/O basics

7 votes

Here's a very basic question I have. In my professor's lecture slides, there is an example I don't really get.

She wrote:

printf("u"); 
write(STDOUT_FILENO, "m", 1); 
printf("d\n");

...and she said the output of this code would be:

mud

I don't get it. If anyone understands why this happens, please explain it to me.

Reference for this question:

http://lagoon.cs.umd.edu/216/Lectures/lect17.pdf

(on the second-to-last slide.)

write is a system call -- it is implemented by the interface between user mode (where programs like yours run) and the operating system kernel (which handles the actual writing to disk when bytes are written to a file).

printf is a C standard library function -- it is implemented by library code loaded into your user mode program.

The C standard library output functions buffer their output. By default, stdout is line-buffered when it refers to a terminal and fully buffered otherwise. When the buffer is full or, in line-buffered mode, a newline is written, the buffered data is written to the file via a call to write from the library implementation.

Therefore, the output via printf is not sent to the operating system's write immediately. In your example, you buffer the letter 'u', then immediately write the letter 'm', then append "d\n" to the buffer, at which point the standard library makes the call write(STDOUT_FILENO, "ud\n", 3), so the terminal shows "mud".

What is kernel section mismatch?

7 votes

I tried to compile a kernel module, but I got a WARNING, and in the warning there was a note to add the compile option CONFIG_DEBUG_SECTION_MISMATCH=y. It gave me more detailed info about the issue:

WARNING: ***path to module***(.text+0x8d2): Section mismatch in reference from the function Pch_Spi_Enable_Bios_Wr() to the variable .devinit.data:ich9_pci_tbl.22939
The function Pch_Spi_Enable_Bios_Wr() references
the variable __devinitdata ich9_pci_tbl.22939.
This is often because Pch_Spi_Enable_Bios_Wr lacks a __devinitdata
annotation or the annotation of ich9_pci_tbl.22939 is wrong.

Of course I tried to google the problem, but there is no information about what exactly a kernel section mismatch is, not to mention how to go about fixing it.

I would love to learn what it actually is and what causes it.

It means that a function that is in a section with a given lifetime references something that is in a section with a different lifetime.

When the kernel binary is linked, different parts of the code and data are split up into different sections. Some of these sections are kept loaded all the time, but some others are removed once they are no longer needed (things that are only required during boot for example can be freed once boot is done - this saves memory).

If a function that is in a long-lasting section refers to data in one of the discardable sections, there is a problem - it might try to access that data when it has already been released, leading to all kinds of runtime issues.

This is not a warning you'll fix yourself, unless you wrote that code or are very familiar with it. It gets fixed by correctly annotating the function (or the data it refers to) so that it goes into the right section. The right fix can only be determined with detailed knowledge of that part of the kernel.


For a list of these sections and annotations, refer to the include/linux/init.h header in your kernel source tree:

/* These macros are used to mark some functions or 
 * initialized data (doesn't apply to uninitialized data)
 * as `initialization' functions. The kernel can take this
 * as hint that the function is used only during the initialization
 * phase and free up used memory resources after
 *
 * Usage:
 * For functions:
 * 
 * You should add __init immediately before the function name, like:
 *
 * static void __init initme(int x, int y)
 * {
 *    extern int z; z = x * y;
 * }
 *
 * If the function has a prototype somewhere, you can also add
 * __init between closing brace of the prototype and semicolon:
 *
 * extern int initialize_foobar_device(int, int, int) __init;
 *
 * For initialized data:
 * You should insert __initdata between the variable name and equal
 * sign followed by value, e.g.:
 *
 * static int init_variable __initdata = 0;
 * static const char linux_logo[] __initconst = { 0x32, 0x36, ... };
 *
 * Don't forget to initialize data not at file scope, i.e. within a function,
 * as gcc otherwise puts the data into the bss section and not into the init
 * section.
 * 
 * Also note, that this data cannot be "const".
 */

/* These are for everybody (although not all archs will actually
   discard it in modules) */
#define __init      __section(.init.text) __cold notrace
#define __initdata  __section(.init.data)
#define __initconst __section(.init.rodata)
#define __exitdata  __section(.exit.data)
#define __exit_call __used __section(.exitcall.exit)

Others follow, with more comments and explanations.

See also the help text for the CONFIG_DEBUG_SECTION_MISMATCH Kconfig symbol:

The section mismatch analysis checks if there are illegal
references from one section to another section.
Linux will during link or during runtime drop some sections
and any use of code/data previously in these sections will
most likely result in an oops.
In the code functions and variables are annotated with
__init, __devinit etc. (see full list in include/linux/init.h)
which results in the code/data being placed in specific sections.
The section mismatch analysis is always done after a full
kernel build but enabling this option will in addition
do the following:

  • Add the option -fno-inline-functions-called-once to gcc
    When inlining a function annotated __init in a non-init
    function we would lose the section information and thus
    the analysis would not catch the illegal reference.
    This option tells gcc to inline less but will also
    result in a larger kernel.
  • Run the section mismatch analysis for each module/built-in.o
    When we run the section mismatch analysis on vmlinux.o we
    lose valueble information about where the mismatch was
    introduced.
    Running the analysis for each module/built-in.o file
    will tell where the mismatch happens much closer to the
    source. The drawback is that we will report the same
    mismatch at least twice.
  • Enable verbose reporting from modpost to help solving
    the section mismatches reported.

Reading raw bytes from a serial port

7 votes

I'm trying to read raw bytes from a serial port sent by an IEC 870-5-101 win32 protocol simulator with a program written in C running on 32-bit Linux.

It's working fine for byte values like 0x00 - 0x7F. But for values beginning from 0x80 to 0xAF the high bit is wrong, e.g.:

0x7F -> 0x7F //correct
0x18 -> 0x18 //correct
0x79 -> 0x79 //correct
0x80 -> 0x00 //wrong
0xAF -> 0x2F //wrong
0xFF -> 0x7F //wrong

After digging around for two days now, I have no idea what's causing this.

This is my config of the serial port:

    cfsetispeed(&config, B9600);
    cfsetospeed(&config, B9600);

    config.c_cflag |= (CLOCAL | CREAD);

    config.c_cflag &= ~CSIZE;                               /* Mask the character size bits */
    config.c_cflag |= (PARENB | CS8);                       /* Parity bit Select 8 data bits */

    config.c_cflag &= ~(PARODD | CSTOPB);                   /* even parity, 1 stop bit */


    config.c_cflag |= CRTSCTS;                              /*enable RTS/CTS flow control - linux only supports rts/cts*/


    config.c_iflag &= ~(IXON | IXOFF | IXANY);              /*disable software flow control*/ 

    config.c_oflag &= ~OPOST;                               /* enable raw output */
    config.c_lflag &= ~(ICANON | ECHO | ECHOE | ISIG);      /* enable raw input */

    config.c_iflag &= ~(INPCK | PARMRK);                    /* DANGEROUS no parity check*/
    config.c_iflag |= ISTRIP;                               /* strip parity bits */
    config.c_iflag |= IGNPAR;                               /* DANGEROUS ignore parity errors*/

    config.c_cc[VTIME] = 1;                                 /*timeout to read a character in tenth of a second*/

I'm reading from the serial port with:

*bytesread = read((int) fd, in_buf, BytesToRead);

Right after this operation "in_buf" contains the wrong byte, so I guess there's something wrong with my config, which is a port from a win32 DCB structure.

Thanks for any ideas!

Based on your examples, only the 8th bit (the high bit) is wrong, and it's wrong by being always 0. You are setting ISTRIP in your line discipline on the Linux side, and that would cause this. ISTRIP does not, as the comment in the C code claims, strip parity bits. It strips the 8th data bit.

If ISTRIP is set, valid input bytes shall first be stripped to seven bits; otherwise, all eight bits shall be processed.

(IEEE Std 1003.1, 2004 Edition, chapter 11, General Terminal Interface)

What is this Bash (and/or other shell?) construct called?

6 votes

What is the construct in bash called where you can wrap a command that outputs to stdout, such that the output itself is treated like a stream? In case I'm not describing that so well, maybe an example will do best, and this is what I typically use it for: applying diff to output that does not come from a file, but from other commands, where

cmd 

is wrapped as

<(cmd)

By wrapping a command in such a manner, in the example below I determine that there is a difference of one between the two commands that I am running, and then I am able to determine that one precise difference. What is the construct/technique of wrapping a command as <(cmd) called? Thanks

[builder@george v6.5 html]$ git status | egrep modified | awk '{print $3}' | wc -l
51
[builder@george v6.5 html]$ git status | egrep modified | awk '{print $3}' | xargs grep -l 'Ext\.define' | wc -l
50
[builder@george v6.5 html]$ diff <(git status | egrep modified | awk '{print $3}') <(git status | egrep modified | awk '{print $3}' | xargs grep -l 'Ext\.define')
39d38
< javascript/reports/report_initiator.js

ADDENDUM: The revised command, following the advice to use git's ls-files, should be as follows (untested):

diff <(git ls-files -m) <(git ls-files -m | xargs grep -l 'Ext\.define')

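It is called process substitution. Bash replaces <(cmd) with a file name (typically something like /dev/fd/63) that, when read, yields the output of cmd, which is why it works with programs such as diff that expect file arguments.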

Signed zero linux vs windows

5 votes

I am running a program in C++ on Windows and on Linux. The output is meant to be identical. I am trying to make sure that the only differences are real differences, as opposed to working environment differences. So far I have taken care of all the differences that can be caused by \r\n, but there is one thing that I can't seem to figure out.

In the Windows output there is a 0.000 and in Linux it is -0.000.

Does anyone know what could be causing the difference?

Thanks

Since in the IEEE floating point format the sign bit is separate from the value, you have two different values of 0, a positive and a negative one. In most cases it doesn't make a difference; both zeros will compare equal, and they indeed describe the same mathematical value (mathematically, 0 and -0 are the same). Where the difference can be significant is when you have underflow and need to know whether the underflow occurred from a positive or from a negative value. Also, if you divide by 0, the sign of the infinity you get depends on the sign of the 0 (i.e. 1/+0.0 gives +Inf, but 1/-0.0 gives -Inf). In other words, most probably it won't make a difference for you.

Note however that the different output does not necessarily mean that the number itself is different. It could well be that the value in Windows is also -0.0, but the output routine on Windows doesn't distinguish between +0.0 and -0.0 (they compare equal, after all).