Best linux questions in April 2012

What is this style of syntax in C?

75 votes

From sys.c line 123:

void *sys_call_table[__NR_syscalls] = 
{
    [0 ... __NR_syscalls-1] = sys_ni_syscall,
#include <asm/unistd.h>
};

sys_call_table is an array of generic (void *) pointers, I can see that. However, what is this notation:

[0 ... __NR_syscalls-1]

What is the ...?


EDIT:
I learned another C trick here: #include <asm/unistd.h> will be preprocessed and replaced with its content, which supplies the initializers for the individual entries in [0 ... __NR_syscalls-1].

It is initialized using Designated Initializers.

The range-based initialization is a GNU extension provided by GCC.

To initialize a range of elements to the same value, write `[first ... last] = value'. This is a GNU extension. For example,

 int widths[] = { [0 ... 9] = 1, [10 ... 99] = 2, [100] = 3 };

It is not portable. Compiling with -pedantic will tell you so.

Can someone explain the performance behavior of the following memory allocating C program?

14 votes

On my machine Time A and Time B swap depending on whether A is defined or not (which changes the order in which the two callocs are called).

I initially attributed this to the paging system. Weirdly, when mmap is used instead of calloc, the situation is even more bizarre -- both the loops take the same amount of time, as expected. As can be seen with strace, the callocs ultimately result in two mmaps, so there is no return-already-allocated-memory magic going on.

I'm running Debian testing on an Intel i7.

#include <stdlib.h>
#include <stdio.h>
#include <sys/mman.h>

#include <time.h>

#define SIZE 500002816

#ifndef USE_MMAP
#define ALLOC calloc
#else
#define ALLOC(a, b) (mmap(NULL, a * b, PROT_READ | PROT_WRITE,  \
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0))
#endif

int main() {
  clock_t start, finish;
#ifdef A
  int *arr1 = ALLOC(sizeof(int), SIZE);
  int *arr2 = ALLOC(sizeof(int), SIZE);
#else
  int *arr2 = ALLOC(sizeof(int), SIZE);
  int *arr1 = ALLOC(sizeof(int), SIZE);
#endif
  int i;

  start = clock();
  {
    for (i = 0; i < SIZE; i++)
      arr1[i] = (i + 13) * 5;
  }
  finish = clock();

  printf("Time A: %.2f\n", ((double)(finish - start))/CLOCKS_PER_SEC);

  start = clock();
  {
    for (i = 0; i < SIZE; i++)
      arr2[i] = (i + 13) * 5;
  }
  finish = clock();

  printf("Time B: %.2f\n", ((double)(finish - start))/CLOCKS_PER_SEC);

  return 0;
}

The output I get:

 ~/directory $ cc -Wall -O3 bench-loop.c -o bench-loop
 ~/directory $ ./bench-loop 
Time A: 0.94
Time B: 0.34
 ~/directory $ cc -DA -Wall -O3 bench-loop.c -o bench-loop
 ~/directory $ ./bench-loop                               
Time A: 0.34
Time B: 0.90
 ~/directory $ cc -DUSE_MMAP -DA -Wall -O3 bench-loop.c -o bench-loop
 ~/directory $ ./bench-loop                                          
Time A: 0.89
Time B: 0.90
 ~/directory $ cc -DUSE_MMAP -Wall -O3 bench-loop.c -o bench-loop 
 ~/directory $ ./bench-loop                                      
Time A: 0.91
Time B: 0.92

Short Answer

The first time calloc is called, it explicitly zeroes out the memory. The next time it is called, it assumes that the memory returned from mmap is already zeroed out.

Details

Here are some of the things that I checked to come to this conclusion, which you could try yourself if you wanted:

  1. Insert a calloc call before your first ALLOC call. You will see that after this, Time A and Time B are the same.

  2. Use the clock() function to check how long each of the ALLOC calls take. In the case where they are both using calloc you will see that the first call takes much longer than the second one.

  3. Use time to time the execution time of the calloc version and the USE_MMAP version. When I did this I saw that the execution time for USE_MMAP was consistently slightly less.

  4. I ran with strace -tt -T which shows both the time of when the system call was made and how long it took. Here is part of the output:

Strace output:

21:29:06.127536 mmap(NULL, 2000015360, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fff806fd000 <0.000014>
21:29:07.778442 mmap(NULL, 2000015360, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fff093a0000 <0.000021>
21:29:07.778563 times({tms_utime=63, tms_stime=102, tms_cutime=0, tms_cstime=0}) = 4324241005 <0.000011>

You can see that the first mmap call took 0.000014 seconds, but that about 1.65 seconds elapsed before the next system call. Then the second mmap call took 0.000021 seconds, and was followed by the times call about a hundred microseconds later.

I also stepped through part of the application execution with gdb and saw that the first call to calloc resulted in numerous calls to memset, while the second call to calloc did not make any calls to memset. The source code for calloc is in glibc's malloc.c (look for __libc_calloc) if you are interested. As for why calloc does the memset on the first call but not on subsequent ones, I don't know. But I feel fairly confident that this explains the behavior you have asked about.

As for why the array that was zeroed with memset shows better performance, my guess is that it is because of values being loaded into the TLB rather than the cache, since it is a very large array. Regardless, the specific reason for the performance difference that you asked about is that the two calloc calls behave differently when they are executed.

How are interrupts handled on SMP?

12 votes

How are interrupts handled on SMP (Symmetric multiprocessor/multicore) machines? Is there only one memory management unit or more?

Say two threads, A and B running on different cores touch a memory page (at the same time) which is not there in the page table, in which case there will be a page fault and a new page is brought in from the memory.

What is the sequence of events which will happen? If there is one memory management unit, to which core is the page fault forwarded? How does the kernel handle it? Are there multiple instances of the kernel, each one running on a different core? If so, how do they synchronize on such events as page fault handling?

On multicore/multiprocessor architectures, an APIC (Advanced Programmable Interrupt Controller) is used to route interrupts to cores/processors. As the name implies, APICs can be programmed to do the routing as desired.

Regarding the synchronization of the kernel: this depends on the kernel/OS. You can either use a scheme with locking (although IPIs might be necessary on non-cache-coherent architectures), or you can use your suggested approach of running a kernel on every core with some kind of explicit inter-kernel communication.

Barrelfish is an example of an OS running multiple kernels. If you are interested in that kind of architecture, you might want to read the paper "The Multikernel: A new OS architecture for scalable multicore systems".

Bash: why is the pipe terminated?

10 votes

This is a piped command for generating a random 10-character password:

cat /dev/urandom | base64 | head -c 10

My question is: cat /dev/urandom | base64 is an infinite output stream which will not stop by itself, so why does appending head -c 10 make the whole pipe terminate? I assume cat, base64 and head are 3 separate processes; how can head terminate cat?

After base64 outputs 10 bytes, head has enough input and exits. When base64 then attempts to output more bytes, it receives a SIGPIPE signal and exits too. For the same reason, cat exits in turn.
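
The same mechanism can be observed from a short Python sketch. It assumes a Unix system with the `yes` utility, which stands in for `cat /dev/urandom | base64` as a producer that never stops on its own; the script itself plays the role of `head -c 10`. The variable names are illustrative only.

```python
import signal
import subprocess

# "yes" writes "y\n" forever unless something stops it.
producer = subprocess.Popen(["yes"], stdout=subprocess.PIPE)

# Act like `head -c 10`: read 10 bytes, then close our end of the pipe.
first_bytes = producer.stdout.read(10)
producer.stdout.close()

# The producer's next write goes to a pipe with no reader, so the kernel
# sends it SIGPIPE. Popen reports death by signal N as return code -N.
status = producer.wait()
killed_by_sigpipe = (status == -signal.SIGPIPE)
print(first_bytes, killed_by_sigpipe)
```

Nothing in the reader ever signals the writer directly; closing the read end of the pipe is enough.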

Prevent R from using virtual memory on unix/linux?

10 votes

Short version

Is there a way to prevent R from ever using any virtual memory on a unix machine? Whenever it happens it is because I screwed up and I then want to abort the computation.

Longer version

I am working with big datasets on a powerful computer shared with several other people. Sometimes I set off commands that require more RAM than is available, which causes R to start swapping and eventually freezes the whole machine. Normally I can solve this by setting a ulimit in my ~/.bashrc

ulimit -m 33554432 -v 33554432  # 32 GB RAM of the total 64 GB

which causes R to throw an error and abort when trying to allocate more memory than is available. However, if I make a mistake of this sort when parallelizing (typically using the snow package), the ulimit has no effect and the machine crashes anyway. I guess that is because snow launches the workers as separate processes that are not run in bash. If I instead try to set the ulimit in my ~/.Rprofile I just get an error:

> system("ulimit -m 33554432 -v 33554432")
ulimit: 1: too many arguments

Could someone help me figure out a way to accomplish this?

Side track

Why can I not set a ulimit of 0 virtual memory in bash?

$ ulimit -m 33554432 -v 0

If I do it quickly shuts down.

When you run system("ulimit"), it executes in a child process. The parent does not inherit the ulimit from the child. (This is analogous to doing system("cd dir") or system("export ENV_VAR=foo").)

Setting it in the shell from which you launch the environment is the correct way. The limit is not working in the parallel case most likely because it is a per-process limit, not a global system limit.
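
A small Python sketch (Python rather than R, purely for illustration) shows a limit set in a child staying in the child. It assumes a Unix system where the `resource` module is available; `LIMIT_BYTES` and `child_code` are made-up names, and the 32 GiB figure mirrors the ulimit from the question.

```python
import resource
import subprocess
import sys

LIMIT_BYTES = 32 * 1024**3  # 32 GiB (ulimit -v counts KiB, setrlimit counts bytes)

soft_before, hard = resource.getrlimit(resource.RLIMIT_AS)
# Never ask for more than the hard limit allows.
wanted = LIMIT_BYTES if hard == resource.RLIM_INFINITY else min(LIMIT_BYTES, hard)

# The child lowers its own address-space limit and reports the result.
child_code = """
import resource, sys
wanted = int(sys.argv[1])
_, hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, (wanted, hard))
print(resource.getrlimit(resource.RLIMIT_AS)[0])
"""
result = subprocess.run([sys.executable, "-c", child_code, str(wanted)],
                        capture_output=True, text=True, check=True)
child_soft = int(result.stdout)

# The parent's limit is untouched by what the child did.
soft_after, _ = resource.getrlimit(resource.RLIMIT_AS)
print("child soft limit:", child_soft)
print("parent unchanged:", soft_before == soft_after)
```

Limits flow the other way: a process started after the limit is set inherits it, which is why setting it in the launching shell works.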

On Linux you can configure strict(er) overcommit accounting, which tries to prevent the kernel from handing out an mmap request that cannot be backed by physical memory.

This is done by tuning the sysctl parameters vm.overcommit_memory and vm.overcommit_ratio. (Google about these.)

This can be an effective way to prevent thrashing situations. But the tradeoff is that you lose the benefit that overcommit provides when things are well-behaved (cramming more/larger processes into memory).

Mono on Raspberry Pi

8 votes

I've seen a lot of talk about running Mono/.NET code on the Raspberry Pi. Has anyone actually managed to run any Mono code on the Raspberry Pi? On their site, they list several Linux distros that work on the device, and some of these distros include Mono. However, none detail whether Mono works on it.

Is there a working implementation that someone can point me to?

The folks on the Raspberry Pi board are reporting that Mono does indeed work, at least for simple apps.

Need RegExp help for Linux Bash grep command to filter out lines containing square brackets

7 votes

Using the following example, I need to filter out the line containing 'ABC' only, while skipping the lines matching 'ABC' that contain square brackets:

2012-04-04 04:13:48,760~sample1~ABC[TLE 5332.233 2/13/2032 3320392]:CAST
2012-04-04 04:13:48,761~sample2~ABC
2012-04-04 04:13:48,761~sample3~XYZ[BAC.CAD.ABC.CLONE 232511]:TEST

Here is what I have, but so far I'm unable to successfully filter out the lines with square brackets:

bash-3.00$ cat Metrics.log | grep -e '[^\[\]]' | grep -i 'ABC'

Please help?

Edited based on comments:

Try grep -i 'ABC' Metrics.log | grep -v "[[]" | grep -v "ABC\w"

Input:

2012-04-04 04:13:48,760~sample1~ABC[TLE 5332.233 2/13/2032 3320392]:CAST
2012-04-04 04:13:48,761~sample2~ABC
2012-04-04 04:13:48,761~sample3~XYZ[BAC.CAD.ABC.CLONE 232511]:TEST
2012-04-04 04:13:48,761~sample4~XYZ
2012-04-04 04:13:48,761~sample5~ABCD
2012-04-04 04:13:48,761~sample6~ABC:TEST

Output:

2012-04-04 04:13:48,761~sample2~ABC
2012-04-04 04:13:48,761~sample6~ABC:TEST
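
For readers who want to check the three filters against the sample input without a shell, the pipeline can be mirrored in Python. This is only a sketch of the same logic; `keep` and `log_lines` are names made up for the example.

```python
import re

log_lines = [
    "2012-04-04 04:13:48,760~sample1~ABC[TLE 5332.233 2/13/2032 3320392]:CAST",
    "2012-04-04 04:13:48,761~sample2~ABC",
    "2012-04-04 04:13:48,761~sample3~XYZ[BAC.CAD.ABC.CLONE 232511]:TEST",
    "2012-04-04 04:13:48,761~sample4~XYZ",
    "2012-04-04 04:13:48,761~sample5~ABCD",
    "2012-04-04 04:13:48,761~sample6~ABC:TEST",
]

def keep(line):
    has_abc = re.search("ABC", line, re.IGNORECASE) is not None  # grep -i 'ABC'
    has_bracket = "[" in line                                    # grep -v "[[]"
    abc_then_word = re.search(r"ABC\w", line) is not None        # grep -v "ABC\w"
    return has_abc and not has_bracket and not abc_then_word

kept = [line for line in log_lines if keep(line)]
for line in kept:
    print(line)
```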

Using GNU/Linux system call `splice` for zero-copy Socket to Socket data transfers in Haskell

7 votes

Update: Mr. Nemo's answer helped solve the problem! The code below contains the fix! See the nb False and nb True calls below.

There is also a new Haskell package called splice, which has OS-specific and portable implementations of the best known socket-to-socket data transfer loops.

I have the following (Haskell) code:

#ifdef LINUX_SPLICE
#include <fcntl.h>
{-# LANGUAGE CPP #-}
{-# LANGUAGE ForeignFunctionInterface #-}
#endif

module Network.Socket.Splice (
    Length
  , zeroCopy
  , splice
#ifdef LINUX_SPLICE
  , c_splice
#endif
  ) where

import Data.Word
import Foreign.Ptr

import Network.Socket
import Control.Monad
import Control.Exception
import System.Posix.Types
import System.Posix.IO

#ifdef LINUX_SPLICE
import Data.Int
import Data.Bits
import Unsafe.Coerce
import Foreign.C.Types
import Foreign.C.Error
import System.Posix.Internals
#else
import System.IO
import Foreign.Marshal.Alloc
#endif


zeroCopy :: Bool
zeroCopy =
#ifdef LINUX_SPLICE
  True
#else
  False
#endif


type Length =
#ifdef LINUX_SPLICE
  (#type size_t)
#else
  Int
#endif


-- | The 'splice' function pipes data from
--   one socket to another in a loop.
--   On Linux this happens in kernel space with
--   zero copying between kernel and user spaces.
--   On other operating systems, a portable
--   implementation utilizes a user space buffer
--   allocated with 'mallocBytes'; 'hGetBufSome'
--   and 'hPut' are then used to avoid repeated 
--   tiny allocations as would happen with 'recv'
--   'sendAll' calls from the 'bytestring' package.
splice :: Length -> Socket -> Socket -> IO ()
splice l (MkSocket x _ _ _ _) (MkSocket y _ _ _ _) = do

  let e  = error "splice ended"

#ifdef LINUX_SPLICE

  (r,w) <- createPipe
  print ('+',r,w)
  let s  = Fd x -- source
  let t  = Fd y -- target
  let c  = throwErrnoIfMinus1 "Network.Socket.Splice.splice"
  let u  = unsafeCoerce :: (#type ssize_t) -> (#type size_t)
  let fs = sPLICE_F_MOVE .|. sPLICE_F_MORE
  let nb v = do setNonBlockingFD x v
                setNonBlockingFD y v
  nb False
  finally
    (forever $ do 
       b <- c $ c_splice s nullPtr w nullPtr    l  fs
       if b > 0
         then   c_splice r nullPtr t nullPtr (u b) fs
         else   e)
    (do closeFd r
        closeFd w
        nb True
        print ('-',r,w))

#else

  -- ..    

#endif


#ifdef LINUX_SPLICE
-- SPLICE

-- fcntl.h
-- ssize_t splice(
--   int          fd_in,
--   loff_t*      off_in,
--   int          fd_out,
--   loff_t*      off_out,
--   size_t       len,
--   unsigned int flags
-- );

foreign import ccall "splice"
  c_splice
  :: Fd
  -> Ptr (#type loff_t)
  -> Fd
  -> Ptr (#type loff_t)
  -> (#type size_t)
  -> Word
  -> IO (#type ssize_t)

sPLICE_F_MOVE :: Word
sPLICE_F_MOVE = (#const "SPLICE_F_MOVE")

sPLICE_F_MORE :: Word
sPLICE_F_MORE = (#const "SPLICE_F_MORE")
#endif

Note: The code above now just works! Below is no longer valid thanks to Nemo!

I call splice as defined above with two open and connected sockets (which are already used to transmit minimal amount of handshake data using either the sockets API send and recv calls or converted to handles and used with hGetLine and hPut) and I keep getting:

Network.Socket.Splice.splice: resource exhausted (Resource temporarily unavailable)

at the first c_splice call site: c_splice returns -1 and sets some errno to a value (probably EAGAIN) that reads resource exhausted | resource temporarily unavailable when looked up.

I tested calling splice with different Length values: 1024, 8192.

I don't know Haskell, but "resource temporarily unavailable" is EAGAIN.

And it looks like Haskell sets its sockets to non-blocking mode by default. So if you try to read from one when there is no data, or try to write to one when its buffer is full, you will fail with EAGAIN.

Figure out how to change the sockets to blocking mode, and I bet you will solve your problem.

[update]

Alternatively, call select or poll before attempting to read or write the socket. But you still need to handle EAGAIN, because there are rare corner cases where Linux select will indicate a socket is ready when actually it isn't.
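
The EAGAIN behaviour is not specific to Haskell or to splice. As a minimal sketch, here it is reproduced in Python on a local socket pair (an assumption made so the example needs no network; the variable names are illustrative):

```python
import errno
import select
import socket

a, b = socket.socketpair()
a.setblocking(False)  # the mode the answer suspects the Haskell sockets are in

# Reading with nothing queued fails immediately instead of waiting.
try:
    a.recv(1024)
    first_errno = None
except BlockingIOError as exc:
    first_errno = exc.errno  # EAGAIN (same value as EWOULDBLOCK on Linux)

b.sendall(b"hello")

# Wait until the socket is reported readable, but still tolerate EAGAIN.
readable, _, _ = select.select([a], [], [], 1.0)
try:
    data = a.recv(1024) if readable else b""
except BlockingIOError:
    data = b""

a.close()
b.close()
print(errno.errorcode.get(first_errno), data)
```

Switching the descriptor to blocking mode, as the `nb False` call in the question's fixed code does, removes the first failure altogether.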

About typecheck in the Linux kernel

7 votes

#define typecheck(type,x) \
({      type __dummy; \
        typeof(x) __dummy2; \
        (void)(&__dummy == &__dummy2); \
        1; \
})

The file typecheck.h contains this code. I know that it checks whether x has the same type as the parameter type, but I can't understand this part:

 (void)(&__dummy == &__dummy2);

Why does this work? How can comparing the addresses of the two variables make sense? Thanks for your answer, or tell me which points I should read up on.

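Comparing pointers with incompatible types is a constraint violation and requires the compiler to issue a diagnostic. If the type of x differs from type, then &__dummy and &__dummy2 are pointers to incompatible types, so the comparison triggers a warning at compile time. See 6.5.9 Equality operators: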

Constraints

One of the following shall hold:

  • both operands have arithmetic type;
  • both operands are pointers to qualified or unqualified versions of compatible types;
  • one operand is a pointer to an object or incomplete type and the other is a pointer to a qualified or unqualified version of void; or
  • one operand is a pointer and the other is a null pointer constant.

and 5.1.1.3 Diagnostics:

A conforming implementation shall produce at least one diagnostic message (identified in an implementation-defined manner) if a preprocessing translation unit or translation unit contains a violation of any syntax rule or constraint, even if the behavior is also explicitly specified as undefined or implementation-defined. Diagnostic messages need not be produced in other circumstances.

Is .git folder crossplatform?

6 votes

If the .git folder that was created using linux is copied to windows, will it work?

Yes, it will be okay. I work like this myself: on two computers, one with Linux and one with Windows, the .git directory is synced by Dropbox, and there are absolutely no problems at all :)

btw - .hg works equally well.

In C on Linux, popen / system to "ps all > file" truncates all lines to 80 characters

6 votes

I'm using Ubuntu 11.10. If I open a terminal and call: ps all I get the results truncated (i.e. 100 characters at most for each line) to the size of the terminal window.
If I call ps all > file The lines don't get truncated and all the information is in the file (There is a line that has ~200 characters)

In C, I am trying to achieve the same but the lines get truncated.
I've tried
int rc = system("ps all > file"); as well as variants of popen.
I assume the shell being used by system (and popen) defaults the output of each line to 80 characters, which makes sense if I were to parse it using popen, but since I am redirecting it to a file I expect it to disregard the size of the shell, as I experienced when doing it in my shell.

TL;DR
How can I make sure ps all > file doesn't truncate lines when called from C application?

As a workaround, try passing -w or possibly -ww to ps when you invoke it.

From the man page (BSD):

-w      Use 132 columns to display information, instead of the default which is your 
        window size.  If the -w option is specified more than once, ps will use as many
        columns as necessary without regard for your window size.  When output is
        not to a terminal, an unlimited number of columns are always used.

Linux:

-w      Wide output. Use this option twice for unlimited width.

Alternatively,

You might have some success doing a fork/exec/wait yourself instead of using system; omitting error handling for brevity:

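#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
  pid_t pid = fork();

  if (pid == 0) {
    /* child: point stdout at the file, then become ps */
    FILE *fp = fopen("./your-file", "w");
    dup2(fileno(fp), STDOUT_FILENO);  /* dup2 closes the old stdout itself */
    execlp("ps", "ps", "all", (char *)NULL);
    _exit(127);  /* only reached if exec fails */
  } else {
    /* parent: wait for ps to finish */
    int status;
    wait(&status);
    if (WIFEXITED(status))
      printf("ps exited with status %d\n", WEXITSTATUS(status));
  }
  return 0;
}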

mmap and memory usage

6 votes

I am writing a program that receives huge amounts of data (in pieces of different sizes) from the network, processes them and writes them to memory. Since some pieces of data can be very large, my current approach is limiting the buffer size used. If a piece is larger than the maximum buffer size, I write the data to a temporary file and later read the file in chunks for processing and permanent storage.

I'm wondering if this can be improved. I've been reading about mmap for a while but I'm not one hundred percent sure if it can help me. My idea is to use mmap for reading the temporary file. Does this help in any way? The main thing I'm concerned about is that an occasional large piece of data should not fill up my main memory causing everything else to be swapped out.

Also, do you think the approach with temporary files is useful? Should I even be doing that or, perhaps, should I trust the linux memory manager to do the job for me? Or should I do something else altogether?

Mmap can help you in some ways, I'll explain with some hypothetical examples:

First thing: Let's say you're running out of memory, and your application, which has a 100MB chunk of malloc'ed memory, gets 50% of it swapped out. That means the OS had to write 50MB to the swapfile, and if you need that data again, you will have written 50MB to the swapfile, occupied that space, and then read it all back.

If the memory was instead mmap'ed from a file, the operating system will not write that data to the swapfile (as it knows the data is identical to the file itself); instead, it will just discard the 50MB (again, supposing you have not written anything to it so far) and that's that. If you ever need that memory again, the OS will fetch the contents not from the swapfile but from the original file you've mmap'ed, so if any other program needs 50MB of swap, it is available. There is also no overhead from swapfile manipulation at all.

Let's say you read a 100MB chunk of data, and according to the initial 1MB of header data, the information that you want is located at offset 75MB, so you don't need anything between 1MB and 74.9MB! You have read it for nothing but to make your code simpler. With mmap, you will only read the data you have actually accessed (rounded to the OS page size, which is usually 4KB), so it would only read the first and the 75th MB. I think it's very hard to find a simpler and more effective way to avoid disk reads than mmap'ing files. And if for some reason you need the data at offset 37MB, you can just use it! You don't have to mmap it again, as the whole file is accessible in memory (limited, of course, by your process's address space).

All mmap'ed files are backed by themselves, not by the swapfile. The swapfile exists to back data that has no file behind it, which usually is malloc'ed data, or data that is backed by a file but was altered and [can not/shall not] be written back to it before the program actually tells the OS to do so via an msync call.

Note that you don't need to map the whole file into memory: you can map any amount (2nd arg, "size_t length") starting from any page-aligned place (6th arg, "off_t offset"). Unless your file is likely to be enormous, you can safely map 1GB of data with no fear, even if the system only has 64MB of physical memory. That holds for reading; if you plan on writing, you should be more conservative and map only what you need.

Mapping files will help make your code simpler (you already have the file contents in memory, ready to use, with much less memory overhead since it's not anonymous memory) and faster (you will only read the data that your program accesses).
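
As a rough illustration of the header-then-offset scenario, here is a Python sketch using the standard `mmap` module. The file layout (a 4-byte magic value followed by a payload at 75 MiB) and every name in it are invented for the example; a real program would map its own temporary file.

```python
import mmap
import os
import tempfile

MIB = 1024 * 1024
PAYLOAD_OFFSET = 75 * MIB  # hypothetical: where the header says the data lives

# Build a sparse stand-in for the temporary file: magic, a hole, then the payload.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"HDR1")
    tmp.seek(PAYLOAD_OFFSET)
    tmp.write(b"payload")
    path = tmp.name

with open(path, "rb") as f:
    # Map the whole file read-only; pages are only read when they are touched.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as whole:
        magic = whole[:4]
        payload = whole[PAYLOAD_OFFSET:PAYLOAD_OFFSET + 7]

    # Or map just a window; offset must be a multiple of mmap.ALLOCATIONGRANULARITY.
    with mmap.mmap(f.fileno(), 7, access=mmap.ACCESS_READ,
                   offset=PAYLOAD_OFFSET) as window:
        window_payload = window[:]

os.remove(path)
print(magic, payload, window_payload)
```

The slices copy only the bytes requested, so the region between the header and the payload is never touched.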

How to suppress Perl warnings emitted from within a loaded module's code?

6 votes

My Perl program is reading data from a serial device attached through USB. An outline of my script in pseudo-Perl:

use warnings;
use strict;

use Device::SerialPort;
my $PortObj = tie( *$handle , "Device::SerialPort" , $PortName ) or die "Cannot open serial port: $!\n";
while ( 1 ) {
  my $readLength = read( $handle , my $frameData , $frameLength );
}

All works fine, and even when I unplug the device from USB I'm able to recover from that situation, when the device file disappears and reappears. I can catch all errors raised by my own script, but the loaded module (Device::SerialPort) emits warnings too, and I don't want them to appear in my logging.

Can I add some sort of flag to my code so I don't see these specific warnings? It is important for me that only warnings from the module(s) are suppressed, not the warnings from my own script. Currently it looks like this:

[/dev/ttyUSB1]   0x0020 : 00 00 00 00 00 00 00 00 00 AA 93 82 73 68 5E 58 : ............sh^X
[/dev/ttyUSB1]   0x0030 : 55 54 52 52 4F 4E 50 51 50 00 00 00 00 00 00 00 : UTRRONPQP.......
Use of uninitialized value $count_in in addition (+) at /usr/lib/perl5/Device/SerialPort.pm line 2214.
Use of uninitialized value $string_in in concatenation (.) or string at /usr/lib/perl5/Device/SerialPort.pm line 2232.
[/dev/ttyUSB1] Restart required!
[/dev/ttyUSB1] Cannot open serial port: No such file or directory
[/dev/ttyUSB1] Cannot open serial port: No such file or directory
[/dev/ttyUSB1] Cannot open serial port: No such file or directory

[/dev/ttyUSB1]   0x0000 : 41 42 01 40 71 01 1C E4 80 99 80 80 80 80 00 00 : AB.@q...........
[/dev/ttyUSB1]   0x0010 : 00 03 00 00 83 00 01 01 00 00 00 00 00 00 00 00 : ................

It is the two Use of uninitialized value warnings that I want to get rid of. The other lines are my own logging.

  • libdevice-serialport-perl 1.04-2build1
  • perl v5.12.4

You could try to intercept the warnings and re-emit only those that do not come from the module:

$SIG{'__WARN__'} = sub { warn $_[0] unless (caller eq "Device::SerialPort"); };

Can I slow down Django?

5 votes

Simple question really

./manage.py runserver

Can I slow down localhost:8000 on my development machine so I can simulate file uploads and work on the look and feel of ajax uploading?

Depending on where you want to simulate the delay, you could simply sleep:

from time import sleep
sleep(5)  # the argument is in seconds
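
One way to keep such a delay out of the view body is a small decorator. This is only a sketch: `delayed` and `upload_view` are hypothetical names, and the plain function below stands in for a real Django view, which would take a request object and return an HttpResponse.

```python
import functools
import time

def delayed(seconds):
    """Return a decorator that sleeps for `seconds` before calling the wrapped function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            time.sleep(seconds)  # simulate a slow upload or a slow network
            return func(*args, **kwargs)
        return wrapper
    return decorator

@delayed(0.2)
def upload_view(request):
    # Hypothetical stand-in for a Django view.
    return "uploaded: " + request

started = time.monotonic()
response = upload_view("photo.jpg")
elapsed = time.monotonic() - started
print(response, round(elapsed, 1))
```

Removing the decorator line restores normal speed, so the delay is easy to take out again before deployment.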