Best linux questions in January 2012

Returning struct containing array

16 votes

The following simple code segfaults under gcc 4.4.4

#include<stdio.h>

typedef struct Foo Foo;
struct Foo {
    char f[25];
};

Foo foo(){
    Foo f = {"Hello, World!"};
    return f;
}

int main(){
    printf("%s\n", foo().f);
}

Changing the final line to

 Foo f = foo(); printf("%s\n", f.f);

Works fine. Both versions work when compiled with -std=c99. Am I simply invoking undefined behavior, or has something in the standard changed that permits the code to work under C99? Why does it crash under C89?

I believe the behavior is undefined both in C89/C90 and in C99.

foo().f is an expression of array type, specifically char[25]. C99 6.3.2.1p3 says:

Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.

The problem in this particular case (an array that's a member of a structure returned by a function) is that there is no "array object". Function results are returned by value, so the result of calling foo() is a value of type struct Foo, and foo().f is a value (not an lvalue) of type char[25].

This is, as far as I know, the only case in C (up to C99) where you can have a non-lvalue expression of array type. I'd say that the behavior of attempting to access it is undefined by omission, likely because the authors of the standard (understandably IMHO) didn't think of this case. You're likely to see different behaviors at different optimization settings.

The 2011 C standard patches this corner case by introducing a new kind of object lifetime, "temporary lifetime". N1570 (a late pre-C11 draft) says in 6.2.4p8:

A non-lvalue expression with structure or union type, where the structure or union contains a member with array type (including, recursively, members of all contained structures and unions) refers to an object with automatic storage duration and temporary lifetime. Its lifetime begins when the expression is evaluated and its initial value is the value of the expression. Its lifetime ends when the evaluation of the containing full expression or full declarator ends. Any attempt to modify an object with temporary lifetime results in undefined behavior.

So the program's behavior is well defined in C11. Until you're able to get a C11-conforming compiler, though, your best bet is probably to store the result of the function in a local object (assuming your goal is working code rather than breaking compilers):

[...]
int main(void) {
    struct Foo temp = foo();
    printf("%s\n", temp.f);
}

Can Haskell be used to write shell scripts?

12 votes

Is it possible to write shell scripts in Haskell and if so, how do you do it? Just changing the interpreter like so?

#!/bin/ghci

Using ghci will just load the module in GHCi. To run it as a script, use runhaskell or runghc:

#!/usr/bin/env runhaskell
main = putStrLn "Hello World!"

How to monitor a complete directory tree for changes in Linux?

10 votes

How can I monitor a whole directory tree for changes in Linux (ext3 file system)?

Currently the directory contains about half a million files in about 3,000 subdirectories, organized in three directory levels.

Those are mostly small files (< 1kb, some few up to 100 kb). It's a sort of queue and I need to know when files are being created, deleted or their content modified within 5-10 seconds of that happening.

I know there is inotify and the like, but AFAIK they only monitor a single directory, which means I would need 3,000 inotify handles in my case - more than the usual 1024 handles allowed for a single process. Or am I wrong?

In case the Linux system can't tell me what I need: perhaps there is a FUSE project that simulates a file system (replicating all file accesses on a real file system) and separately logs all modifications (I couldn't find one)?

To my knowledge, there's no other way than recursively setting an inotify watch on each directory.

That said, you won't run out of file descriptors because inotify does not have to reserve an fd to watch a file or a directory (its predecessor, dnotify, did suffer from this limitation). inotify uses "watch descriptors" instead.

According to the documentation for inotifywatch, the default limit is 8192 watch descriptors, and you can increase it by writing the new value to /proc/sys/fs/inotify/max_user_watches.
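To judge whether a tree fits under that limit, one can compare the number of directories with the current value of the file named above. This is only a rough sketch: the helper name count_dirs is made up for illustration, and the value 65536 in the comment is an arbitrary example, not a recommendation.

```shell
#!/bin/bash
# count_dirs ROOT
# Hypothetical helper: prints how many directories (ROOT included) a
# recursive watcher would need one inotify watch for.
count_dirs() {
    find "$1" -type d | wc -l
}

# Current per-user watch limit, read from the file named above (Linux only).
limit_file=/proc/sys/fs/inotify/max_user_watches
if [ -r "$limit_file" ]; then
    echo "watch limit: $(cat "$limit_file")"
fi

# Raising the limit needs root; 65536 is just an example value:
#   echo 65536 > /proc/sys/fs/inotify/max_user_watches
```

Usage would be something like count_dirs /path/to/queue; if the result is well below the printed limit, one watch per directory is feasible.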

Is kill function synchronous?

10 votes

Is the kill function in Linux synchronous? Say I programmatically call the kill function to terminate a process: will it return only when the intended process has terminated, or does it just send the signal and return? If the latter, how can I make it wait for the intended process to be killed?

No. kill doesn't kill anything by itself; it only sends a signal to the process and then returns.

Depending on the signal, the target process can even block or ignore it.

The exception is SIGKILL (which is what kill -9 sends): it cannot be blocked, caught, or ignored.

To wait for the process to die:

while kill -0 PID_OF_THE_PROCESS 2>/dev/null; do sleep 1; done
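The loop above waits forever if the process never exits. A variant with a timeout could look like the sketch below; the function name wait_for_exit and its arguments are hypothetical. Note that kill -0 still succeeds for a zombie that its parent has not yet reaped, and it fails for processes you are not allowed to signal.

```shell
#!/bin/bash
# wait_for_exit PID [TIMEOUT_SECONDS]
# Hypothetical helper: polls with "kill -0" (which delivers no signal and
# only checks that the PID can be signalled) once per second.
# Returns 0 once the PID is gone, 1 if it is still there after the timeout.
wait_for_exit() {
    local pid=$1
    local timeout=${2:-10}
    local waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A caller might then write: kill "$pid"; wait_for_exit "$pid" 5 || kill -9 "$pid"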

Why MUST detach from tty when writing a linux daemon?

9 votes

When I tried to write a daemon under Linux using C, I was told I should add the following code after the fork code block:

/* Preparations */
...

/* Fork a new process */
pid_t cpid = fork();
if (cpid == -1){perror("fork");exit(1);}
if (cpid > 0){exit(0);}

/* WHY detach from tty ? */
int fd = open("/dev/tty", O_RDWR);
ioctl(fd, TIOCNOTTY, NULL);

/* Why set PGID as current PID ? */
setpgid(getpid(), 0);

My question is: are the above operations mandatory?

You must disassociate your daemon process from the terminal to avoid being sent signals related to the terminal's operation (like SIGHUP when the terminal session ends, as well as potentially SIGTTIN and SIGTTOU).

Note, however, that disassociating from the terminal using the TIOCNOTTY ioctl is largely obsolete. You should use setsid() instead.

The reason for a daemon to leave its original process group is to avoid receiving signals sent to that group. Note that setsid() also places your process in its own process group.

echoing in shell -n doesn't get printed the right thing

9 votes

I know that this is some kind of special character issue but I do not know how to solve it.

I type in console

echo "-n"

and nothing gets printed :(

I also tried with

echo -e "-n" 

to interpret the special characters (the escape sequences), but again nothing happened

how can I print "-n" ?

Here is one way:

aix@aix:~$ echo -e '\x2dn'
-n

It escapes the - as \x2d.

A more verbose way is to print the two characters separately:

aix@aix:~$ echo -n -; echo n
-n

Here, the -n instructs the first echo to not print a newline; it is not related to the -n being printed. :)
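Another option, not part of the answer above, is printf. It takes the text as an argument to a format string, so a leading dash in the argument is treated as data rather than as an option. A minimal sketch (the wrapper name say is made up):

```shell
# say TEXT...
# Hypothetical wrapper: prints its arguments literally, followed by a newline.
say() {
    printf '%s\n' "$*"
}

say -n    # prints: -n
```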

What (working) alternate toolchains exist for x86 C++ development on linux?

8 votes

To be precise, I'm restricting this question to "native" development for my x86 (64-bit) Linux box. No embedded or non-x86 architectures.

Since I'm a C++ user and there is a C++ renaissance, I'm currently using C++ for personal projects.

Right now I'm using the robust, traditional linux/gcc/make toolchain.

But through blog posts and SO questions, I recently became aware of promising new tools:

  • clang as an alternative to gcc, a lot faster, giving better error messages
  • gold as a replacement for ld, a lot faster

Those tools are less well known, and it's easy not to even know about them.

Are there other interesting, lesser-known tools that I should be aware of? For example, alternatives to gdb or the like? (I'm also using cmake.)

I'm looking for ease of development first, then development speed improvement. Any other improvement is welcome.

Free tools if possible.

You could be interested in ccache (a compiler cache that avoids useless recompilation and can be used transparently through the same g++ command, just by adding a symlink inside your $PATH).

For C (but not C++) programming, you might be interested in tinycc, which compiles very quickly (but produces slow-running binary code).

When coding, Boehm's garbage collector might be used. See this question related to using it in C++.

And also use valgrind to debug your memory leaks.

Sometimes, dynamically loading a shared object with dlopen is interesting. The dlsym-ed symbols should be extern "C" in C++. I sometimes love generating C or C++ code on the fly, compiling it, and dlopen-ing the module.

For building, consider investigating other builders, like e.g. omake.

When compiling, don't forget the -Wall (and perhaps -Wextra) flag to the compiler. The new link time optimization (with CXX=g++ -flto in your Makefile) could be interesting (but compile time suffers, for perhaps a 10% increase in speed of the executable).

If your source code files share all the same C++ header, pre-compiling that header is worthwhile.

Newer languages (arguably better than C++) exist, such as OCaml and Haskell, and also Go and D.

Use a version control system like Git even for pet projects.

Qt is a good C++ framework, notably for its graphical toolkit.

Wt lets you write web interfaces in C++ quite quickly.

Both GCC & GDB are still evolving. Don't forget to use the latest versions (e.g. 4.6 for GCC, 7.3 for GDB), which provide major improvements over earlier ones.

Consider extending or customizing your GCC compiler for your particular needs through plugins or, better yet, using MELT extensions.

Using vim, what is " '<,'>"?

8 votes

While using Vim, in visual mode, selecting text and then calling a colon command shows : '<,'> instead of just : as it would show when I do other things (such as opening a file).

What does '<,'> mean?

Using linux (debian), gnome-terminal, vim7.2

It means that the command that you type after :'<,'> will operate on the part of the file that you've selected.

For example, :'<,'>d would delete the selected block, whereas :d deletes the line under the cursor.

Similarly, :'<,'>w fragment.txt would write the selected block to the file called fragment.txt.

The two comma-separated things ('< and '>) are marks that correspond to the start and the end of the selected area. From the help pages (:help '<):

                                                       *'<* *`<*
'<  `<                  To the first line or character of the last selected
                        Visual area in the current buffer.  For block mode it
                        may also be the last character in the first line (to
                        be able to define the block).  {not in Vi}.

                                                        *'>* *`>*
'>  `>                  To the last line or character of the last selected
                        Visual area in the current buffer.  For block mode it
                        may also be the first character of the last line (to
                        be able to define the block).  Note that 'selection'
                        applies, the position may be just after the Visual
                        area.  {not in Vi}.

When used in this manner, the marks simply specify the range for the command that follows (see :help range). They can of course be mixed and matched with other line number specifiers. For example, the following command would delete all lines from the start of the selected area to the end of the file:

:'<,$d

The Vim Wiki has more information on Vim ranges.

Writing to multiple file-descriptors

8 votes

Is there any OS-level (Linux) speedup when writing one fixed byte buffer to many file-descriptors? When writing many buffers to one file-descriptor one can use writev(2), so I wonder if there's any analogue to this or it must be done by multiple sys-calls.

I know of no such syscall on Linux. The exhaustive list is given in the syscalls(2) man page.

And I wouldn't worry about it that much. For file access, the real bottleneck is the disk...

find -name "*.xyz" -o -name "*.abc" -exec to Execute on all found files, not just the last suffix specified

7 votes

I'm trying to run

find ./ -name "*.xyz" -o -name "*.abc" -exec cp {} /path/i/want/to/copy/to

In reality it's a larger list of name extensions but I don't know that matters for this example. Basically I'd like to copy all those found to another /path/i/want/to/copy/to. However it seems to only be executing the last -name test in the list.

If I remove the -exec portion all the variations of files I expect to be found are printed out.

How do I get it to pass the full complement of files found to -exec?

find works by evaluating the expressions you give it until it can determine the truth value (true or false) of the entire expression. Adjacent expressions are implicitly ANDed together, and AND binds more tightly than -o (OR), so your command is effectively parsed as:

-name "*.xyz" OR (-name "*.abc" AND -exec ...)

Quoth the man page:

GNU find searches the directory tree rooted at each given file name by evaluating the given expression from left to right, according to the rules of precedence (see section OPERATORS), until the outcome is known (the left hand side is false for and operations, true for or), at which point find moves on to the next file name.

That means that if the name matches *.xyz, it won't even try to check the latter -name test or -exec, since it's already true.

What you want to do is enforce precedence, which you can do with parentheses. Annoyingly, you also need to use backslashes to escape them from the shell:

find ./ \( -name "*.xyz" -o -name "*.abc" \) -exec cp {} /path/i/want/to/copy/to \;
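The difference can be checked in a throwaway directory. The following is a small demonstration only; the directory and file names are made up for the example.

```shell
#!/bin/bash
# Scratch area: one source directory and two copy targets outside of it.
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/ungrouped" "$demo/grouped"
touch "$demo/src/a.xyz" "$demo/src/b.abc" "$demo/src/c.txt"

# Without parentheses, -exec is attached only to the second -name test.
find "$demo/src" -name "*.xyz" -o -name "*.abc" -exec cp {} "$demo/ungrouped" \;

# With parentheses, -exec runs for every file matching either pattern.
find "$demo/src" \( -name "*.xyz" -o -name "*.abc" \) -exec cp {} "$demo/grouped" \;

ls "$demo/ungrouped"   # lists only b.abc
ls "$demo/grouped"     # lists a.xyz and b.abc

# Remove the scratch area when done: rm -rf "$demo"
```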

Prevent import of function from static library

7 votes

Say I have two static libraries that were not built by me and I have no control over their contents.

Library 1 has functions:

A()
B()
C()

Library 2 has functions:

A()
D()
E()

Both need to be linked into a calling application but the naming conflict of A() throws errors.

Is there a way to say "Ignore A() from Library 1 when linking" in linux using gcc and ld.

There are a couple of methods that I know of:

  1. You could make a copy of the library which has the relevant symbol hidden, and link against the copy. You don't need access to any of the source for the library code to do this: objcopy can do it with the --localize-symbol option. I describe how to do this with .o files in this answer to a similar question, but the same method works just as well with .a libraries.

  2. The --allow-multiple-definition option could be used. (If you're linking via a gcc command, rather than with ld directly, you'll need to specify the option as -Wl,--allow-multiple-definition.) This will cause the linker to stop caring about the multiple definition, and simply use the first one that it encounters instead - so you have to be careful what order the libraries appear in on the command line. The downside is that it's a global option, so if you have other unexpected symbol clashes, it might quietly do the wrong thing instead of telling you about it.

calculate data in linux

7 votes

I want to calculate:

  • the total points (sum)
  • the today points (sum)
  • the total points (average)
  • the today points (average)

I have no experience with bash scripting, other than knowing I need to start with: #!/bin/bash

Here's a sample of my file

#file 14516 - 2011-01-26 19:01:00 EDT#
user: xxxxxxxx@email.com / id(11451611)
lastlogin: 1295896515
total_points: 11.76 / today: 5.21
gameid: 51

user: xxxxxxxx@email.com / id(11837327)
lastlogin: 1293893041
total_points: 416.1 / today: 98.1
gameid: 49

user: xxxxxxxx@email.com / id(11451611)
lastlogin: 1294917135
total_points: 1.76 / today: 0.21
gameid: 51

You can use this:

#!/bin/bash

# $1 is the input file; quote it so paths with spaces work.
if [ ! -f "$1" ]; then
  echo "File $1 not found"
  exit 1
fi

# Count the records, then sum field 2 (total) and field 5 (today).
number=$(grep -c total_points "$1")
sumTotal=$(grep total_points "$1" | awk '{sum+=$2} END { print sum }')
sumToday=$(grep total_points "$1" | awk '{sum+=$5} END { print sum }')

echo "Total Record: $number"
echo "Total SUM: $sumTotal"
echo -n "Total AVG: "
echo "scale=5;$sumTotal/$number" | bc

echo "Today SUM: $sumToday"
echo -n "Today AVG: "
echo "scale=5;$sumToday/$number" | bc

Then save to a file like: script.sh

Change the permission to executable: chmod +x script.sh

Then run it: ./script.sh sample.txt

This will output:

Total Record: 3
Total SUM: 429.62
Total AVG: 143.20666
Today SUM: 103.52
Today AVG: 34.50666

Note: $1 is the input file.

Here's more help about the bc command, grep, awk

Finding the longest word in a text file

6 votes

I am trying to make a simple script that finds the longest word and its length in a text file using bash. I know that with awk it's simple and straightforward, but I want to try this method. Say I have a=wmememememe: to find the length I can use echo ${#a}, and to print the word I would echo ${a}. But I want to apply it to the loop below

for i in `cat so.txt` do

Where so.txt contains words, I hope it makes sense.

Normally, you'd want to use a while read loop instead of for i in $(cat), but since you want all the words to be split, in this case it would work out OK.

#!/bin/bash
longest=0
for word in $(<so.txt)
do
    len=${#word}
    if (( len > longest ))
    then
        longest=$len
        longword=$word
    fi
done
printf 'The longest word is %s and its length is %d.\n' "$longword" "$longest"

How to (trivially) parallelize with the Linux shell by starting one task per Linux core?

6 votes

Today's CPUs typically comprise several physical cores. These may also be multi-threaded, so the Linux kernel sees quite a large number of logical cores and schedules tasks on each of them. When running multiple tasks on a Linux system, the scheduler normally achieves a good distribution of the total workload across all logical cores (several of which may share a physical core).

Now, say, I have a large number of files to process with the same executable. I usually do this with the "find" command:

find <path> <option> <exec>

However, this starts just one task at a time and waits for it to complete before starting the next one. Thus, only one core is in use at any time, which leaves the majority of the cores idle (if this find command is the only task running on the system). It would be much better to launch N tasks at the same time, where N is the number of cores seen by the Linux kernel.

Is there a command that would do that?

Use find with the -print0 option. Pipe it to xargs with the -0 option. xargs also accepts the -P option to specify a number of processes. -P should be used in combination with -n or -L.

Read man xargs for more information.

An example command: find . -print0 | xargs -0 -P4 -n4 grep searchstring
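To tie the process count to the number of cores, as the question asks, the value can be taken from nproc (GNU coreutils) or getconf. The sketch below is one way to do it; the helper name run_parallel is hypothetical.

```shell
#!/bin/bash
# Number of processors currently online; fall back to getconf if nproc
# is not installed.
cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)

# run_parallel DIR COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND once per regular file under DIR
# (the file name is appended as the last argument), keeping up to
# $cores processes running at the same time.
run_parallel() {
    local dir=$1
    shift
    find "$dir" -type f -print0 | xargs -0 -n 1 -P "$cores" "$@"
}
```

For example, run_parallel /path/to/files gzip -9 would compress every file under that directory, with one gzip process per core. Output from parallel commands may be interleaved, so this suits commands that write to files rather than to the terminal.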

file not found after mysql export

6 votes

I need to export data from a table to a CSV file. I have the following structure (not really my table, but for demo purposes)

CREATE TABLE `mytable` (
  `id` int(11) DEFAULT NULL,
  `mycolumn` varchar(25) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 

with data (about 3000 records). Now I want to export some of these records (from a script I run via a cron job)

SELECT * INTO OUTFILE '/tmp/mytable.sql'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM mytable;

it shows:

Query OK, 3 rows affected (0.00 sec)

then i do:

ls: cannot access /tmp/mytable.sql: No such file or directory

where is my file?

When you use INTO OUTFILE, the data is written to a file on the machine where the MySQL server runs, not on the machine from which you are executing the query.

Example: you are on your computer (ip: 192.168.0.100) and you connect to the mysqlserver (ip: 192.168.0.101) using the mysql command: mysql -uuser -h192.168.0.101 -A database. By executing the SELECT * INTO OUTFILE the file is saved on the mysqlserver (ip: 192.168.0.101) and NOT on your computer (ip: 192.168.0.100).

Now, you can use a script that creates a CSV file (in your cronjob - you select all the data, generate the file and send via scp to the other server).

Or - you can have an NFS share mounted on /shared/, so that when you create the file there the other server automatically has it.

Or - you can simply run a mysql command in a bash script like this from your first server.

mysql -uroot test -B -e "select * from test.mytable;" | sed 's/\t/","/g;s/^/"/;s/$/"/;s/\n//g' > /tmp/filename.csv

source: http://tlug.dnho.net/node/209

X11/GLX - Fullscreen mode?

6 votes

I am trying to create a Linux application - a screensaver, in this case - and it is proving remarkably difficult to find information on the simple task of making a window full-screen. Even the code of existing screensavers makes no mention of how they manage it, and I've yet to see any obvious function like XRemoveDecoration().

After much fumbling around, I did manage to create a window that's the same size as the desktop, with this:

Window win = DefaultRootWindow(disp);
XWindowAttributes getWinAttr;
XGetWindowAttributes(disp, win, &getWinAttr);
win = XCreateWindow(disp, win, 0, 0, getWinAttr.width, getWinAttr.height, 0, vInfo->depth, InputOutput, vInfo->visual, CWBorderPixel|CWColormap|CWEventMask|CWOverrideRedirect, &winAttr );

But that doesn't do anything to get rid of the titlebar and borders. I know there's a way, obviously - but I have yet to find anything even pointing in that direction that doesn't rely on some other massive library being thrown on top (which existing screensavers are definitely not using).

EDIT: Please don't remove information from my posts. There is a very good reason I explicitly pointed out that existing screensavers aren't using optional libraries, and that is because I have been analyzing source code for most of the past day.

I have chosen the answer that most directly answers the question, and applies to applications in general.

If you have found this question researching xscreensavers... the same still applies. Yes, xscreensaver has its own API - which is complicated, and actually involves writing more lines of code (yes, seriously). If you want OpenGL in your screensaver, you'll need to go through another API (xlockmore, a competing system) and a compatibility layer that translates it to xscreensaver.

However, xscreensaver is capable of running any program that can use virtual root windows (look into vroot.h) as a screensaver. So my advice is to just do that - you'll have more control, no limiting API, and greater portability. (One example I looked at can even compile for Linux or Windows, with the same file!)

One way is to bypass the window manager:

XSetWindowAttributes wa;
wa.override_redirect = True;
/* The value mask passed to XCreateWindow must include CWOverrideRedirect. */
XCreateWindow( ..., &wa );