Best linux questions in January 2011

How to stop time from running backwards on Linux?

18 votes

Here's a little test I've written to verify that time does indeed only run forwards in Linux.

#include <stdio.h>     // printf
#include <time.h>
#include <sys/time.h>  // gettimeofday

bool timeGoesForwardTest2()
{
   timeval tv1, tv2;   
   double startTime = getTimeSeconds();  // my function

   while ( getTimeSeconds() - startTime < 5 )
   {
      gettimeofday( &tv1, NULL );  
      gettimeofday( &tv2, NULL );  

      if ( tv2.tv_usec == tv1.tv_usec &&
           tv2.tv_sec == tv1.tv_sec )
      {
         continue;  // Equal times are allowed.
      }

      // tv2 should be greater than tv1
      if ( !( tv2.tv_usec>tv1.tv_usec ||
              tv2.tv_sec-1 == tv1.tv_sec ) )
      {
         printf( "tv1: %d %d\n", int( tv1.tv_sec ), int( tv1.tv_usec ) );
         printf( "tv2: %d %d\n", int( tv2.tv_sec ), int( tv2.tv_usec ) );
         return false;
      }         
   }
   return true;
}

The test fails with this result:

 tv1: 1296011067 632550
 tv2: 1296011067 632549

ummm....

Why does this happen?

Here's my setup:

Linux version 2.6.35-22-generic (buildd@rothera) (gcc version 4.4.5 (Ubuntu/Linaro 4.4.4-14ubuntu4) ) #33-Ubuntu SMP Sun Sep 19 20:34:50 UTC 2010 (Ubuntu 2.6.35-22.33-generic 2.6.35.4)
... running inside VirtualBox 3.2.12, in Windows 7.

There is an open issue at the VirtualBox Bug Tracker. They link to a blog post stating why you shouldn't use gettimeofday() to measure the passage of time:

The most portable way to measure time correctly seems to be clock_gettime(CLOCK_MONOTONIC, ...)
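A minimal sketch of that suggestion in C, assuming a POSIX system where CLOCK_MONOTONIC is available. The helper name monotonic_seconds is made up for illustration; it could stand in for a function like getTimeSeconds() in the test above:

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical helper: seconds since an arbitrary fixed starting point.
 * CLOCK_MONOTONIC does not jump when the wall-clock time is changed, so the
 * difference between two readings should never be negative.
 * Returns -1.0 if the clock cannot be read. */
double monotonic_seconds(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1.0;
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}
```

To time an interval, take one reading before and one after and subtract them. On older glibc versions you may need to link with -lrt.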

Is memory allocation in linux non-blocking?

15 votes

I am curious to know whether allocating memory with the default new operator is a non-blocking operation.

e.g.

struct Node {
    int a,b;
};

...

Node *foo = new Node();

If multiple threads tried to create a new Node and if one of them was suspended by the OS in the middle of allocation, would it block other threads from making progress?

The reason I ask is that I had a concurrent data structure that created new nodes. I then modified the algorithm to recycle the nodes. The throughput of the two algorithms was virtually identical on a 24-core machine. However, I then created an interference program that ran on all the system cores in order to create as much OS pre-emption as possible. The throughput of the algorithm that created new nodes decreased by a factor of 5 relative to the algorithm that recycled nodes.

I'm curious to know why this would occur.

Thanks.

*Edit : pointing me to the code for the c++ memory allocator for linux would be helpful as well. I tried looking before posting this question, but had trouble finding it.

It seems to me that if your interference app were using new/delete (malloc/free), then its allocations would interfere with the non-recycling test more. But I don't know how your interference test is implemented.

Depending on how you recycle (i.e. if you use pthread mutexes, god forbid), your recycle code could be slow (gcc atomic ops would be 40x faster at implementing recycle).

Malloc has, in some variation and on at least some platforms, been aware of threads for a long time. Use the compiler switches on gcc to be sure you get it. Newer algorithms maintain pools of small memory chunks for each thread, so there is little or no blocking if your thread has the small item available. I have oversimplified this, and it depends on what malloc your system is using. Plus, if you go and allocate millions of items to do a test... well, then you won't see that effect, because the small item pools are limited in size. Or maybe you will. I don't know. If you freed the item right after allocating, you would be more likely to see it. Freed small items go back into the small item lists rather than the shared heap. Although "what happens when thread B frees an item allocated by thread A" is a problem that may or may not be dealt with on your version of malloc, and may not be dealt with in a non-blocking manner. For sure, if you didn't immediately free during a large test, then the thread would have to refill its small item list many times. That can block if more than one thread tries. Finally, at some point your process's heap will ask the system for heap memory, which can obviously block.

So are you using small memory items? For your malloc I don't know what small would be, but if you are < 1k that is for sure small. Are you allocating and freeing one after the other, or allocating thousands of nodes and then freeing thousands of nodes? Was your interference app allocating? All of these things will affect the results.

How to recycle with atomic ops (CAS = compare and swap):

First add a pNextFreeNode to your node object. I used void*, you can use your type. This code is for 32 bit pointers, but works for 64 bit as well. Then make a global recycle pile.

void *_pRecycleHead; // global head of recycle list. 

Add to recycle pile:

void *Old;
while (1) { // concurrency loop
  Old = _pRecycleHead;  // copy the state of the world. We operate on the copy
  pFreedNode->pNextFreeNode = Old; // chain the new node to the current head of recycled items
  if (CAS(&_pRecycleHead, Old, pFreedNode))  // switch head of recycled items to new node
    break; // success
}

remove from pile:

void *Old;
while ((Old = _pRecycleHead)) { // concurrency loop, only look for recycled items if the head isn't null
  if (CAS(&_pRecycleHead, Old, ((Node *)Old)->pNextFreeNode))  // switch head to head->next (cast void* to your node type)
    break; // success
}
pNodeYouCanUseNow = Old;  // NULL if the recycle pile was empty

Using CAS means the operation will succeed only if the item you are changing still holds the Old value you pass in. If there is a race and another thread got there first, then the old value will be different. In real life this race happens very rarely. CAS is only slightly slower than actually setting a value, so compared to mutexes... it rocks.
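For reference, here is a self-contained sketch of the simple (unversioned) recycle pile in C. It assumes that CAS maps onto the GCC/Clang builtin __sync_bool_compare_and_swap, and the Node layout and function names are made up for illustration. It does not address the add/remove race discussed below:

```c
#include <stddef.h>

/* Assumption: CAS is implemented with the GCC/Clang __sync builtin. */
#define CAS(ptr, oldval, newval) \
    __sync_bool_compare_and_swap((ptr), (oldval), (newval))

typedef struct Node {
    int a, b;
    struct Node *pNextFreeNode;   /* link used only while the node is recycled */
} Node;

static Node *recycle_head = NULL; /* global head of the recycle pile */

/* Push a freed node onto the pile. */
void recycle_push(Node *n)
{
    Node *old;
    do {
        old = recycle_head;       /* snapshot the current head */
        n->pNextFreeNode = old;   /* chain the node in front of it */
    } while (!CAS(&recycle_head, old, n));   /* retry if the head moved */
}

/* Pop a node from the pile, or return NULL if the pile is empty. */
Node *recycle_pop(void)
{
    Node *old;
    while ((old = recycle_head) != NULL) {
        if (CAS(&recycle_head, old, old->pNextFreeNode))
            break;                /* we own `old` now */
    }
    return old;
}
```

A caller would try recycle_pop() first and fall back to new or malloc when it returns NULL.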

The remove from pile, above, has a race condition if you add and remove the same item rapidly (the classic ABA problem). We solve that by adding a version number to the CAS'able data. If you update the version number at the same time as the pointer to the head of the recycle pile, you win. Use a union. It costs nothing extra to CAS 64 bits.

union TRecycle {
  struct {
    int iVersion;
    void *pRecycleHead;
  } ;  // we can set these.  Note, i didn't name this struct.  You may have to if you want ANSI
  unsigned long long n64;  // we cas this
};

Note: you will have to go to a 128-bit struct on a 64-bit OS. So the global recycle pile now looks like this:

TRecycle _RecycleHead;

Add to recycle pile:

while (1) { // concurrency loop
  TRecycle New,Old;
  Old.n64 = _RecycleHead.n64;  // copy state
  New.n64 = Old.n64;  // new state starts as a copy
  pFreedNode->pNextFreeNode = Old.pRecycleHead;  // link item to be recycled into recycle pile
  New.pRecycleHead = pFreedNode;  // make the new state
  New.iVersion++;  // adding item to list increments the version.
  if (CAS(&_RecycleHead.n64, Old.n64, New.n64))  // now if version changed...we fail
    break; // success
}

remove from pile:

TRecycle New,Old;  // declared outside the loop because Old is used after it
while (1) { // concurrency loop
  Old.n64 = _RecycleHead.n64;  // copy state
  if (Old.pRecycleHead == NULL)
    break;  // recycle pile is empty, nothing to take
  New.n64 = Old.n64;  // new state starts as a copy
  New.pRecycleHead = ((Node *)Old.pRecycleHead)->pNextFreeNode;  // new will skip over first item in recycle list so we can have that item.
  New.iVersion++;  // taking an item off the list increments the version.
  if (CAS(&_RecycleHead.n64, Old.n64, New.n64))  // we fail if version is different.
    break; // success
}
pNodeYouCanUseNow = Old.pRecycleHead;  // NULL if the pile was empty

I bet if you recycle this way you will see a perf increase.

calling c function from assembly

14 votes

I'm trying to use a function written in assembly in a C project. The function is supposed to call a libc function, let's say printf(), but I keep getting a segmentation fault.

In the .c file I have the declaration of the function let's say

int do_shit_in_asm()

In the .asm file I have

.extern printf
.section .data
         printtext:
              .ascii "test"
.section .text
.global do_shit_in_asm
.type do_shit_in_asm, @function

do_shit_in_asm:
    pushl %ebp
    movl %esp, %ebp
    push printtext
    call printf
    movl %ebp, %esp
    pop %ebp
ret

Any pointers or comments would be appreciated.

as func.asm -o func.o

gcc prog.c func.o -o prog

Change push printtext to push $printtext.

As it is, you're loading a value from the address printtext and pushing that, rather than pushing the address. Thus, you're passing 'test' as a 32-bit number, rather than a pointer, and printf is trying to interpret that as an address and crashing.

Proving the primality of strong probable primes.

12 votes

Using the probabilistic version of the Miller-Rabin test, I have generated a list of medium-large (200-300 digit) probable primes. But probable ain't good enough! I need to know these numbers are prime. Is there a library -- preferably wrapped or wrappable in Python -- that implements one of the more efficient primality proving algorithms?

Alternatively, does anyone know where I can find a clear, detailed, and complete description of ECPP (or a similarly fast algorithm) that does not assume a great deal of prior knowledge?

Update: I've found a Java implementation of another test, APRT-CLE, that conclusively proves primality. It verified a 291-digit prime candidate in under 10 minutes on an atom processor. Still hoping for something faster, but this seems like a promising start. http://alpertron.com.ar/ECM.HTM

As an algorithm that gives a reliable polynomial primality test, consider AKS. There is an older SO article referencing implementations and presentations of the algorithm.

How to make a process aware of other processes of the same program

10 votes

I must write a program that must be aware of another instance of itself running on that machine, and communicate with it, then die. I want to know if there is a canonical way of doing that in Linux.

My first thought was to write a file containing the PID of the process somewhere, and look for that file every time the program executes, but where is the "right" place and name for that file? Is there a better, or more "correct" way?

Then I must communicate, telling the running instance that the user tried to start the program again; since there is another instance, the new process will hand over the job and exit. I thought of just sending a signal, like SIGUSR1, but that would not allow me to send more information, like the X11 display from which the user executed the second process. How can I send this info?

The program is linked against Gtk, so a solution that uses the glib is OK.

Putting the pid in a file is a common way of achieving this. For daemons ("system programs"), the common place to put such a file is /var/run/PROGRAM.pid. For user programs, put the pid file hidden in the user's homedir (if the program also has configuration files, then put both config files and the pid file in a subdir of the home dir).

Sending information to the "master" instance is most commonly achieved using Unix domain sockets, also known as local sockets. With a socket, you won't need a pid file (if no-one listens on the socket, the process knows it's master).
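The socket approach could be sketched in C roughly as follows. This is only an illustration under stated assumptions: the function name, the INSTANCE_FORWARDED constant, and the one-shot message format are all made up here, and the window between the failed connect() and bind() is not protected against two instances starting at the same moment:

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

#define INSTANCE_FORWARDED (-2)   /* hypothetical code: msg went to the master */

/* Hypothetical helper. If another instance is listening on sock_path, send it
 * msg and return INSTANCE_FORWARDED. Otherwise become the master and return a
 * listening socket descriptor. Returns -1 on error. */
int become_master_or_forward(const char *sock_path, const char *msg)
{
    struct sockaddr_un addr;
    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, sock_path, sizeof addr.sun_path - 1);

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Somebody is listening: we are the second instance, so hand over. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) == 0) {
        ssize_t n = write(fd, msg, strlen(msg));
        close(fd);
        return n < 0 ? -1 : INSTANCE_FORWARDED;
    }
    close(fd);

    /* Nobody is listening: remove a stale socket file and become the master. */
    unlink(sock_path);
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The master would then accept() connections on the returned descriptor and read the message (for example a DISPLAY value) from each one. In a Gtk program the descriptor could be watched from the main loop instead of blocking in accept().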

Is Mono's VB.Net support ready for a production site?

9 votes

Previously, I've only used Microsoft-centric solutions, but for an upcoming ASP.Net project I'm considering using Mono and hosting it on a Linux Amazon EC2 instance. Based on the responses to my previous question, this sounds doable. However, I'm most comfortable with VB.Net and I'm wondering how well Mono supports it.

Does anyone have first-hand experience writing ASP.Net applications for Mono using VB.Net? If so, I'd like to know how it went, what kind of compatibility issues you ran into, and if you consider Mono's VB.Net support ready for use on a production site?

I know Mono's C#.Net support is very good, so that's my fall-back plan, but I'd really prefer to use VB.Net.

The VB compiler hasn't been abandoned; it's just that a lack of time is preventing the work required to update it to newer VB versions.

Currently vbnc has support for VB 8 (aka Visual Studio 2005), with a few minor features from newer VB versions.

The easiest and safest route would be to precompile your site on Windows, in which case you won't have to deal with any potential compiler issues (and you can use the most recent Visual Studio version). If you take this route you shouldn't run into any bugs you wouldn't hit using C# [1].

[1]: You'd be referencing one more assembly, Microsoft.VisualBasic.dll, which could be a source of bugs - but if you adhere to what is considered good programming practice for VB (turn on Option Strict), the chances that you'll hit any significant new bugs are pretty low.

Several groups in RPM package

7 votes

Is it possible for a single RPM package to belong to several groups?

In the spec file you can set the package group:

Group: System Environment/Base

What I need is to be able to set several groups for this package (like System|Util|MyCompanyName) - they would be like tags assigned to the package.

When the package is installed I want to query it like

rpm -q --group System

or

rpm -q --group MyCompanyName

and in both cases I should see my package (and others belonging to this group)


Edit:

Many packages may belong to MyCompanyName group, but only few may be installed. I need a way to differentiate our packages from linux system packages - I was planning to do it using the group name


I tried putting several Group: lines, but it only uses the last one. Everything after Group: seems to be taken as one string and I couldn't find a way to split them.

Another solution that I could think of is putting this stuff as PROVIDES and then to query

rpm -q --whatprovides System

but I don't like it this way.

Is there another way to accomplish the requested functionality?

The correct way to specify your company name is via the Vendor tag like this:

Vendor: Yoyodyne, Inc.

To get a list of packages by vendor you can run this command:

rpm -qa --qf '%{NAME} %{VENDOR}\n' | grep Yoyodyne

An RPM can only belong to one group. Furthermore, the set of allowable groups is defined by the distribution. For example, here is the list of valid groups for Mandriva:

http://wiki.mandriva.com/en/Development/Packaging/Groups

To find the valid groups for a particular distribution you must often run the package manager for that distro and look at the list.

RPM is not nearly as well defined as the Debian package format is. There seems to be no official or thorough documentation.

http://www.rpm.org/max-rpm/s1-rpm-inside-tags.html

Change read/write permissions on a file descriptor

7 votes

I'm working on a Linux C project and I'm having trouble working with file descriptors.

I have an orphan file descriptor (the file was open()'d then unlink()'d but the fd is still good) that has write-only permission. The original backing file had full permissions (created with S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH), but alas the file was opened with O_WRONLY. Is it possible to duplicate the file descriptor and change the copy to O_RDWR?

pseudo-code:


//open orphan file
int fd = open(fname, O_WRONLY, ...)
unlink(fname)
//fd is still good, but I can't read from it

//...

//I want to be able to read from orphan file
int fd2 = dup(fd)
//----change fd2 to read/write???----

Thanks in advance! -Andrew

No, there is no POSIX function to change the open mode. You will need to open the file in read/write mode in the first place. Since you are creating a temporary file, though, I strongly recommend that you use mkstemp. That function creates the file and opens it in read/write mode; you can then unlink it right away and keep using the descriptor. Most importantly, it avoids a race condition in naming and creating the file, thereby avoiding a vulnerability in the creation of temporary files.
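A minimal sketch of the mkstemp approach in C. The helper name and the /tmp template are assumptions made for illustration. Note that mkstemp() itself leaves the name in place, so the sketch calls unlink() explicitly:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: returns a read/write descriptor for a temporary file
 * that no longer has a name in the filesystem, or -1 on error. */
int open_orphan_temp(void)
{
    char path[] = "/tmp/orphan-XXXXXX";  /* assumed location; mkstemp fills in the Xs */
    int fd = mkstemp(path);              /* creates the file and opens it O_RDWR */
    if (fd < 0)
        return -1;
    unlink(path);                        /* drop the name; the fd stays valid */
    return fd;
}
```

The returned descriptor can be written, rewound with lseek(), and read back, which is what the O_WRONLY descriptor in the question could not do.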

Why is the output of my forking program different when I pipe its output?

7 votes

I was looking at some simple code that uses fork, and decided to try it out for myself. I compiled it and ran it from inside Emacs, and got different output from what I get when running it in Bash.

#include <unistd.h>
#include <stdio.h>

int main() {
  if (fork() != 0) {
    printf("%d: X\n", getpid());
  }

  if (fork() != 0) {
    printf("%d: Y\n", getpid());
  }

  printf("%d: Z\n", getpid());
}

I compiled it with gcc, and then ran a.out from inside Emacs, as well as piping it to cat, and grep ., and got this.

2055: X
2055: Y
2055: Z
2055: X
2058: Z
2057: Y
2057: Z
2059: Z

This isn't right. Running it just from Bash I get (which I expected)

2084: X
2084: Y
2084: Z
2085: Y
2085: Z
2087: Z
2086: Z

edit - missed some newlines

What's going on?

The order in which different processes write their output is entirely unpredictable. So the only surprise is that the "X" print statement sometimes happens twice.

I believe this is because, at the second fork(), an output line including "X" is sometimes still sitting in an output buffer, waiting to be flushed. (stdout is normally line-buffered when it is a terminal, but fully buffered when it is a pipe or a file, which would explain why piping the output changes the result.) Both processes then eventually print it. Since getpid() was already called and converted into the string, both copies show the same pid.

I was able to reproduce multiple "X" lines, but if I add fflush(stdout); just before the second fork(), I always only see one "X" line and always a total of 7 lines.
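The fix can be sketched in C like this. The function name is hypothetical, and it takes the stream as a parameter only so the behaviour can be tried on a pipe as well as on stdout:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: prints one "X" line, forks, then both processes print a
 * "Z" line. Flushing before fork() keeps the child from inheriting a copy of
 * text that is still waiting in the stdio buffer. */
void fork_demo(FILE *out)
{
    fprintf(out, "%d: X\n", (int)getpid());
    fflush(out);                 /* empty the buffer before fork() copies it */

    pid_t child = fork();
    fprintf(out, "%d: Z\n", (int)getpid());
    fflush(out);

    if (child == 0)
        _exit(0);                /* child: leave without flushing stdio again */
    if (child > 0)
        waitpid(child, NULL, 0); /* parent: wait so all output has been written */
}
```

Calling fork_demo(stdout) should print the "X" line exactly once whether stdout is a terminal or a pipe. Without the first fflush() the pipe case can show it twice.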

What is up with [A-Z] meaning [A-Za-z]?

7 votes

I've noticed for a while now that, on some of the Unix-based systems I use at least, ls [A-Z]* has been giving me the results I would anticipate from ls [A-Za-z]*, leaving me unable to easily get a list of just the goddamned files that start with capital letters. I just now ran into the same thing with grep, where I could not get it to stop matching lowercase letters with [A-Z] until I eventually used grep -P to get Perl regex.

So I have some related questions:

  1. When did this idiocy start?
  2. Who is responsible and needs to be punished?
  3. WHY???
  4. Is there some reasonably simple workaround for either or both of the ls and grep cases? (Trying, for example, grep --no-ignore-case was fruitless. grep -P is not a very good workaround because of its experimental feature status.)

It's actually [A-Za-y], and it has to do with language collation. If you want to override it, set $LC_COLLATE appropriately; either C or POSIX should do.

Get number of CPUs in Linux using C

6 votes

Is there an API to get the number of CPUs available in Linux? I mean, without using /proc/cpuinfo or any other sys-node file...

I've found this implementation using sched.h:

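#define _GNU_SOURCE  // for cpu_set_t and the CPU_* macros
#include <sched.h>

int GetCPUCount()
{
 cpu_set_t cs;
 CPU_ZERO(&cs);
 sched_getaffinity(0, sizeof(cs), &cs);

 int count = 0;
 for (int i = 0; i < CPU_SETSIZE; i++)  // scan the whole set, not just the first 8 CPUs
 {
  if (CPU_ISSET(i, &cs))
   count++;
 }
 return count;
}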

But isn't there anything higher-level using common libraries? Thanks...

This code should work on both Windows and *NIX platforms.

#ifdef _WIN32
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#else
#include <unistd.h>
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>


int main() {
  long nprocs = -1;
  long nprocs_max = -1;
#ifdef _WIN32
#ifndef _SC_NPROCESSORS_ONLN
SYSTEM_INFO info;
GetSystemInfo(&info);
#define sysconf(a) info.dwNumberOfProcessors
#define _SC_NPROCESSORS_ONLN
#endif
#endif
#ifdef _SC_NPROCESSORS_ONLN
  nprocs = sysconf(_SC_NPROCESSORS_ONLN);
  if (nprocs < 1)
  {
    fprintf(stderr, "Could not determine number of CPUs online:\n%s\n", 
strerror (errno));
    exit (EXIT_FAILURE);
  }
  nprocs_max = sysconf(_SC_NPROCESSORS_CONF);
  if (nprocs_max < 1)
  {
    fprintf(stderr, "Could not determine number of CPUs configured:\n%s\n", 
strerror (errno));
    exit (EXIT_FAILURE);
  }
  printf ("%ld of %ld processors online\n",nprocs, nprocs_max);
  exit (EXIT_SUCCESS);
#else
  fprintf(stderr, "Could not determine number of CPUs");
  exit (EXIT_FAILURE);
#endif
}

How to directly write to display buffer in GTK/GDK

6 votes

I have a program that displays an animation with a fixed frame rate (say 30 fps) in a window.

Currently I use SDL, but unfortunately it lacks desktop integration (like drag & drop), so now I want to use GTK instead.

So what I want to do (assuming the window is double buffered):

1. obtain off-screen buffer
2. render my stuff to buffer
3. tell the toolkit to swap buffers
4. update window
5. repeat

I don't want to use any features except very basic GDK/GTK calls to create and manage the window and receive input events.

In the FAQ they suggest using GdkRGB. However, in the official documentation for GdkRGB they use the gdk_draw_rgb_image() function which is deprecated, in fact most GdkRGB functions are.

As an alternative they suggest using OpenGL. The problem with that is it requires some uncommon lib like GtkGLArea that is not part of GTK (I don't want to depend on any libs that aren't installed by default). Also, on my system the driver's OpenGL support is... limited.

I've looked into Cairo (which is offered as the alternative for some deprecated functions) but that would add even more boilerplate code (and I suspect additional overhead) for my relatively simple problem.

I personally would say that GdkRGB is a reasonable choice for you. It was designed to be simple, and its scope is pretty much exactly what you describe. Using deprecated code is chancy - it can be a fine choice if you want to get something working now, but if you want your code to grow and be maintained into the future, you should look at the alternatives.

Of those, Cairo is probably your best bet. There's only a little more boilerplate - you should look at http://cairographics.org/samples/ which has some pretty small and easy to understand examples. One of the potential benefits of Cairo is that it's more likely to be hardware accelerated.

[disclaimer: I'm the original author of GdkRGB. It's sad to see it deprecated now, but I suppose it served its purpose well]

High delay in RS232 communication on a PXA270

6 votes

Hi there.

I'm experiencing a long delay (1.5ms - 9.5ms) in an RS232 communication on a PXA270 RISC PC/104. I want to minimize the delay, but I'm a beginner with embedded devices and C++, so I think I'm missing something.

The delay in question is the time from when the PXA board receives a packet from the external device via RS232 (115200 baud) until it sends a custom ACK packet back to the external device. I measured the delay on the PXA board with an oscilloscope, one channel on the Rx line and the other on the Tx line.

The PXA board is running Arcom Embedded Linux (AEL). I know it's not a real-time OS, but I still think that an average delay of 4.5ms is way too high for extracting the received packet, verifying its CRC16, constructing an ACK packet (with CRC) and sending it back over the serial line. I also deliberately put the CPU under heavy load (some parallel gzip operations), but the delay didn't increase at all. The maximum size of a received packet is 30 bytes.

A C++ application (another former co-worker wrote it) handles the reception of the packets and their acknowledgement. One thread sends packets and the other receives them.

I thought that the RTC on the PXA board might have a very poor resolution, so that AEL cannot align its timing to it. But the RTC has a frequency of 32.768 kHz, so the resolution is sufficient and doesn't explain the high delay. Btw, I think the OS is using the internal PXA clock (which also has sufficient resolution) instead of the RTC for timing.

Therefore the problem must be in the C++ app or in a driver/OS setting of the RS232 interface.

The following control flags are used for the RS232 communication in the C++ application according to the Serial Programming Guide for POSIX Operating Systems:

// Open RS232 on COM1
mPhysicalComPort = open(aPort, O_RDWR | O_NOCTTY | O_NDELAY);
// Force read call to block if no data available
int f = fcntl(mPhysicalComPort, F_GETFL, 0);
f &= ~O_NONBLOCK;
fcntl(mPhysicalComPort, F_SETFL, f);
// Get the current options for the port...
tcgetattr(mPhysicalComPort, &options);
// ... and set them to the desired values
cfsetispeed(&options, baudRate);
cfsetospeed(&options, baudRate);
// no parity (8N1)
options.c_cflag &= ~PARENB;
options.c_cflag &= ~CSTOPB;
options.c_cflag &= ~CSIZE;
options.c_cflag |= CS8;
// disable hardware flow control
options.c_cflag &= ~CRTSCTS;
// raw input
options.c_lflag = 0;
// disable software flow control
options.c_iflag = 0;
// raw output
options.c_oflag = 0;
// Set byte times
options.c_cc[VMIN] = 1;
options.c_cc[VTIME] = 0;
// Set the new options for the port
tcsetattr(mPhysicalComPort, TCSAFLUSH, &options);
// Flush to put settings to work
tcflush(mPhysicalComPort, TCIOFLUSH);

I think I'm missing something very simple. I think, that if the process of the app is running under a higher priority, this will not solve the problem. There must be something, which instructs the RS232 driver to handle the requests with a higher priority to minimize the latency.

Does anyone have any ideas? Thank you very much in advance for your help.

Hi

Thank you very much for your comments.

I was able to reduce the delay to ~0.4ms. The command setserial(8) was referenced in the AEL manual. And bingo, I found the low_latency flag there with the following description:

Minimize the receive latency of the serial device at the cost of greater CPU utilization. (Normally there is an average of 5-10ms latency before characters are handed off to the line discipline to minimize overhead.) This is off by default, but certain real-time applications may find this useful.

I then executed setserial /dev/ttyS1 low_latency and the delay was reduced to ~0.4ms :-)

But I wanted to implement this behaviour in the C++ app, without setting this flag globally with setserial (this command is by default not included in all distros).

I've added the following code lines, which had the same effect as the low_latency flag from setserial:

#include <sys/ioctl.h> 
#include <linux/serial.h>
// Open RS232 on COM1
mPhysicalComPort = open(aPort, O_RDWR | O_NOCTTY | O_NDELAY);
struct serial_struct serial;
ioctl(mPhysicalComPort, TIOCGSERIAL, &serial); 
serial.flags |= ASYNC_LOW_LATENCY; // (0x2000)
ioctl(mPhysicalComPort, TIOCSSERIAL, &serial);

STL and release/debug library mess

6 votes

I'm using a 3rd-party library. I'm using its shared library version, since the library is big (~60MB) and is used by several applications.

Is there a way, at application startup, to check that the release version of the library is used with the release version of my application, and the debug version with the debug version?

Longer description

The library exposes a C++ interface. One of the API methods returns std::vector<std::string>.

The problem is that when I compile my application in debug mode, the debug version of the library must be used, and likewise for release. If the wrong version of the library is used, the application crashes.

According to gcc (see http://gcc.gnu.org/onlinedocs/libstdc++/manual/bk01pt03ch17s04.html)

but with a mixed mode standard library that could be using either debug-mode or release-mode basic_string objects, things get more complicated

P.S. 1

It looks like Timbo's proposal is a possible solution: use a different soname for the debug and release libraries. So, what should be passed to the ./configure script to change the library soname?

P.S. 2

My problem is not at link time, but rather at run time.

P.S. 3

Here is a question demonstrating the problem I am facing.

I believe that you have misread the documentation at the link you provide. In particular, you've misunderstood its purpose -- that section is entitled "Goals", and describes a number of hypothetical designs for a C++ debug library and the consequences of those designs in order to explain the actual design choices that were made. The bits of text that follow the lines you quoted are describing the chaos that would result from a hypothetical implementation that had separate designs for release-mode and debug-mode strings. It goes on to say:

For this reason we cannot easily provide safe iterators for the std::basic_string class template, as it is present throughout the C++ standard library.

(Or, rephrasing that, providing a special "debug" version of string iterators is impossible.)

...

With the design of libstdc++ debug mode, we cannot effectively hide the differences between debug and release-mode strings from the user. Failure to hide the differences may result in unpredictable behavior, and for this reason we have opted to only perform basic_string changes that do not require ABI changes. The effect on users is expected to be minimal, as there are simple alternatives (e.g., __gnu_debug::basic_string), and the usability benefit we gain from the ability to mix debug- and release-compiled translation units is enormous.

In other words, the design of the debug and release modes in GCC's libstdc++ has rejected this hypothetical implementation with separate designs for the strings, specifically in order to allow cross-mode linking of the sort that you are worrying about how to avoid.

Thus, you should not have problems with compiling your library once, without -D_GLIBCXX_DEBUG (or with it, if for some reason you prefer), and linking it with either mode of your application. If you do have problems, it is due to a bug somewhere. [But see edit below! This is specific to std::string, not other containers!]

Edit: After this answer was accepted, I followed up in answering the follow-up question at std::vector crash, and realized that the conclusion of this answer is incorrect. GCC's libstdc++ does clever things with strings to support "Per-use recompilation" (in which all uses of a given container object must be compiled with the same flags, but uses of the same container class within a program need not be compiled with the same flags), but that is not the same thing as complete "Per-unit compilation" that would provide the cross-linking ability you need. In particular, the documentation says of that cross-linking ability,

We believe that this level of recompilation is in fact not possible if we intend to supply safe iterators, leave the program semantics unchanged, and not regress in performance under release mode....

Thus, if you're passing containers across your library interface, you will need two separate libraries. Honestly, for this situation I've found that the easiest solution is just to install the two libraries into different directories (one for each variant -- and you'll want both to be separate from your main library directory). Alternately, you can rename the debug library file and then install it manually.

As a further suggestion -- you're presumably not running this in debug mode very often. It may be worth only compiling and linking the debug version statically into your application, so you don't have to worry about installing multiple dynamic libraries and keeping them straight at runtime.

Graphviz can't find any fonts

6 votes

I'm getting "Could not find/open font" errors when doing anything with graphviz. I've narrowed it down to as simple a graph as possible, in the file simplest.dot:

digraph G {
  node1
}

When running $ dot simplest.dot -Tpng -O the graph is rendered to simplest.dot.png, but I always get this error: Error: Could not find/open font, and the font used in the output isn't very pretty.

According to the graphviz faq, when this error occurs, you can tell graphviz where to look for fonts. I've been looking around for fonts on the system I'm using, and there seem to be some TrueType fonts in /usr/share/fonts, among others, the Bitstream Vera fonts, which seem to live in /usr/share/fonts/bitstream-vera.

So I've tried setting fontpath and fontname in the dot graph, to help graphviz figure things out:

digraph G {
  fontpath="/usr/share/fonts/bitstream-vera"
  fontname="Bitstream Vera Sans"
  node1
}

But I'm still getting the exact same error. I've tried several variations of the path and font name, but I can't seem to get it right. What am I doing wrong?

This might be a shot in the dark, but http://www.graphviz.org/doc/info/attrs.html#d:fontname says: "If you specify fontname=schlbk, the tool will look for a file named schlbk.ttf or schlbk.pfa or schlbk.pfb in one of the directories specified by the fontpath attribute."

So, I'd probably try

digraph G {
  fontpath="/usr/share/fonts/bitstream-vera"
  fontname="nameOfttfWITHOUTsuffix"
  node1
}

Where does top get its real-time data?

6 votes

Hi all,

Where does the top application get its data on Linux? I'm interested in real-time CPU load per PID. (I read almost all of the /proc/pid documentation in the man page, but the info isn't there.)

The process is a JBoss instance. I need the data in a lightweight form (so it can be exported easily).

Thanks, Xander

As documented in proc(5), in the file /proc/(pid)/stat you have the fields:

utime %lu

Amount of time that this process has been scheduled in user mode, measured in clock ticks (divide by sysconf(_SC_CLK_TCK)). This includes guest time, guest_time (time spent running a virtual CPU, see below), so that applications that are not aware of the guest time field do not lose that time from their calculations.

stime %lu

Amount of time that this process has been scheduled in kernel mode, measured in clock ticks (divide by sysconf(_SC_CLK_TCK)).

To get CPU usage for a specific process, use those fields. The values in /proc/(pid)/stat aggregate CPU usage over all of the process's threads; for a per-thread breakdown, you can find the individual threads in /proc/(pid)/task.
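As a rough illustration of reading those two fields, here is a minimal sketch in C. The function names parse_stat_line and read_cpu_seconds are made up for this example, and the field positions (utime is field 14, stime is field 15) are taken from proc(5). The command name in field 2 can contain spaces and parentheses, so the sketch resumes parsing after the last ')'.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: pull utime and stime (fields 14 and 15) out of one
   /proc/<pid>/stat line. Returns 0 on success, -1 on a malformed line. */
int parse_stat_line(const char *line, unsigned long *utime, unsigned long *stime)
{
    /* Field 2 (the command name) may contain spaces or ')',
       so resume parsing after the last ')'. */
    const char *p = strrchr(line, ')');
    if (p == NULL)
        return -1;

    /* Skip state (field 3) and fields 4-13, then read utime and stime. */
    int n = sscanf(p + 1,
                   " %*c %*d %*d %*d %*d %*d %*u %*lu %*lu %*lu %*lu %lu %lu",
                   utime, stime);
    return n == 2 ? 0 : -1;
}

/* Hypothetical helper: accumulated user and kernel CPU time of a process,
   converted from clock ticks to seconds. Returns 0 on success, -1 on error. */
int read_cpu_seconds(pid_t pid, double *user_s, double *kernel_s)
{
    char path[64], line[1024];
    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char *ok = fgets(line, sizeof line, f);
    fclose(f);

    unsigned long utime, stime;
    if (ok == NULL || parse_stat_line(line, &utime, &stime) != 0)
        return -1;

    long ticks = sysconf(_SC_CLK_TCK);  /* clock ticks per second */
    if (ticks <= 0)
        return -1;

    *user_s = (double)utime / (double)ticks;
    *kernel_s = (double)stime / (double)ticks;
    return 0;
}
```

These are running totals since the process started. To get a load figure, one would presumably call read_cpu_seconds twice and divide the difference by the wall-clock time between the two samples.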

If you would prefer to be notified when CPU time exceeds some threshold, you can use clock_getcpuclockid to get a handle to the process's CPU-time clock, then timer_create or timerfd to be notified when it hits a specified level. However, note that cross-process CPU-time timers are an optional feature in the POSIX specification and may not be supported (I've not tested).
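The first half of that idea, getting the CPU-time clock and reading it, might look like the sketch below. It only polls the clock with clock_gettime and does not set up a timer. The function name process_cpu_seconds is an assumption for this example, and whether another process's clock can be read depends on the system and on permissions.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: store the total CPU time consumed by `pid`
   (0 means the calling process) in *seconds.
   Returns 0 on success and a nonzero value on failure. */
int process_cpu_seconds(pid_t pid, double *seconds)
{
    clockid_t clock_id;

    /* clock_getcpuclockid returns an error number directly, not via errno. */
    int err = clock_getcpuclockid(pid, &clock_id);
    if (err != 0)
        return err;

    struct timespec ts;
    if (clock_gettime(clock_id, &ts) != 0)
        return -1;

    *seconds = (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
    return 0;
}
```

Polling this value at intervals avoids parsing /proc text, at the cost of only reporting combined user and kernel time.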

Help with understanding a very basic main() disassembly in GDB

5 votes

Heyo,

I have written this very basic main function to experiment with disassembly and also to see and hopefully understand what is going on at the lower level:

int main() {
  return 6;
}

Using gdb to disas main produces this:

0x08048374 <main+0>:    lea    0x4(%esp),%ecx
0x08048378 <main+4>:    and    $0xfffffff0,%esp
0x0804837b <main+7>:    pushl  -0x4(%ecx)
0x0804837e <main+10>:   push   %ebp
0x0804837f <main+11>:   mov    %esp,%ebp
0x08048381 <main+13>:   push   %ecx
0x08048382 <main+14>:   mov    $0x6,%eax
0x08048387 <main+19>:   pop    %ecx
0x08048388 <main+20>:   pop    %ebp
0x08048389 <main+21>:   lea    -0x4(%ecx),%esp
0x0804838c <main+24>:   ret  

Here is my best guess as to what I think is going on and what I need help with line-by-line:

lea 0x4(%esp),%ecx

Load the address of esp + 4 into ecx. Why do we add 4 to esp?

I read somewhere that this is the address of the command line arguments. But when I did x/d $ecx I get the value of argc. Where are the actual command line argument values stored?

and $0xfffffff0,%esp

Align stack

pushl -0x4(%ecx)

Push the address of where esp was originally onto the stack. What is the purpose of this?

push %ebp

Push the base pointer onto the stack

mov %esp,%ebp

Move the current stack pointer into the base pointer

push %ecx

Push the address of original esp + 4 on to stack. Why?

mov $0x6,%eax

I wanted to return 6 here, so I'm guessing the return value is stored in eax?

pop %ecx

Restore ecx to value that is on the stack. Why would we want ecx to be esp + 4 when we return?

pop %ebp

Restore ebp to value that is on the stack

lea -0x4(%ecx),%esp

Restore esp to its original value

ret

I am a n00b when it comes to assembly so any help would be great! Also if you see any false statements about what I think is going on please correct me.

Thanks a bunch! :]

You did pretty well with your interpretation. When a function is called, the return address is automatically pushed onto the stack, which is why argc, the first argument, sits at 4(%esp) on entry. argv, the second argument, sits at 8(%esp); it points to an array holding one pointer per argument, followed by a null pointer. After aligning the stack, the function pushes a copy of the return address (pushl -0x4(%ecx)) and then saves %ecx, which holds the original %esp + 4, so that %esp can be restored to its original, unaligned value on return. The value of %ecx at return doesn't matter, which is why it can serve as temporary storage for the %esp reference. Other than that, you are correct about everything.
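Two points from this discussion can be checked from plain C. This is a small sketch with made-up helper names: align_down_16 mirrors the arithmetic of and $0xfffffff0,%esp, and count_args walks an argv-style array until the terminating null pointer. It says nothing about what any particular compiler emits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: clear the low four bits of an address, which is the
   effect of `and $0xfffffff0,%esp` on a 32-bit stack pointer. */
uintptr_t align_down_16(uintptr_t addr)
{
    return addr & ~(uintptr_t)0xF;
}

/* Hypothetical helper: count the entries of an argv-style array by walking
   it until the terminating null pointer, without consulting argc. */
int count_args(char *const argv[])
{
    int n = 0;
    while (argv[n] != NULL)
        n++;
    return n;
}
```

Calling count_args(argv) from a main(int argc, char *argv[]) should give the same number as argc, since the array ends with a null pointer.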