Best C questions in May 2011

What does && mean in void *p = &&abc;

71 votes

I came across a piece of code void *p = &&abc;. What is the significance of && here? I know about rvalue references but I think && used in this context is different. What does && indicate in void *p = &&abc; ?

&& is gcc's extension to get the address of the label defined in the current function.

void *p = &&abc is illegal in standard C99 and C++.

This compiles with g++
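As a minimal sketch of the extension (the function name pick and its labels are made up for illustration), the unary && operator yields a void * that can be stored in a table and jumped to with goto *. This assumes GCC or a compatible compiler such as Clang; a strictly conforming compiler is expected to reject it.

```c
/* Returns 0 for which == 0 and 1 otherwise, dispatching through a table of
   label addresses. Requires GCC's "labels as values" extension; this is not
   standard C or C++. */
int pick(int which)
{
    /* Unary && takes the address of a label in the current function. */
    static void *targets[] = { &&zero, &&one };

    goto *targets[which != 0];   /* computed goto */

zero:
    return 0;
one:
    return 1;
}
```

Calling pick(0) jumps to the label zero, and any non-zero argument jumps to one.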

String going crazy if I don't give it a little extra room. Can anyone explain what is happening here?

29 votes

First, I'd like to say that I'm new to C / C++, I'm originally a PHP developer so I am bred to abuse variables any way I like 'em.

C is a strict country, compilers don't like me here very much, I am used to breaking the rules to get things done.

Anyway, this is my simple piece of code:

char IP[15] = "192.168.2.1";
char separator[2] = "||";   

puts( separator );

Output:

||192.168.2.1

But if I change the definition of separator to:

char separator[3] = "||";

I get the desired output:

||

So why did I need to give the man extra space, so he doesn't sleep with the man before him?

That's because you get a string that is not null-terminated when the length of separator is forced to 2.

Always remember to allocate an extra character for the null terminator. For a string of length N you need N+1 characters.

Once you violate this requirement any code that expects null-terminated strings (puts() function included) will run into undefined behavior.

Your best bet is to not force any specific length:

char separator[] = "||";

will allocate an array of exactly the right size.
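A small sketch of the difference between the array size and the string length; the helper names separator_bytes and separator_length are hypothetical and exist only to make the two numbers visible.

```c
#include <stddef.h>
#include <string.h>

/* Let the compiler size the array: two '|' characters plus the '\0'. */
static const char separator[] = "||";

/* Number of bytes the array occupies, terminator included. */
size_t separator_bytes(void)
{
    return sizeof separator;
}

/* Number of characters before the terminator. */
size_t separator_length(void)
{
    return strlen(separator);
}
```

Here separator_bytes() is one larger than separator_length(), which is exactly the N+1 rule described above.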

How does photoshop blend two images together?

18 votes

Can somebody please explain to me the image manipulation behind how Photoshop blends two images together so that I may reproduce the same effects in my application?

Photoshop blends two images together by performing a blend operation on each pixel in image A against its corresponding pixel in image B. Each pixel is a color consisting of multiple channels. Assuming we are working with RGB pixels, the channels in each pixel would be red, green and blue. To blend two pixels we blend their respective channels.

The blend operation that occurs for each blend mode in Photoshop can be summed up in the following macros:

#define ChannelBlend_Normal(A,B)     ((uint8)(A))
#define ChannelBlend_Lighten(A,B)    ((uint8)((B > A) ? B:A))
#define ChannelBlend_Darken(A,B)     ((uint8)((B > A) ? A:B))
#define ChannelBlend_Multiply(A,B)   ((uint8)((A * B) / 255))
#define ChannelBlend_Average(A,B)    ((uint8)((A + B) / 2))
#define ChannelBlend_Add(A,B)        ((uint8)(min(255, (A + B))))
#define ChannelBlend_Subtract(A,B)   ((uint8)((A + B < 255) ? 0:(A + B - 255)))
#define ChannelBlend_Difference(A,B) ((uint8)(abs(A - B)))
#define ChannelBlend_Negation(A,B)   ((uint8)(255 - abs(255 - A - B)))
#define ChannelBlend_Screen(A,B)     ((uint8)(255 - (((255 - A) * (255 - B)) >> 8)))
#define ChannelBlend_Exclusion(A,B)  ((uint8)(A + B - 2 * A * B / 255))
#define ChannelBlend_Overlay(A,B)    ((uint8)((B < 128) ? (2 * A * B / 255):(255 - 2 * (255 - A) * (255 - B) / 255)))
#define ChannelBlend_SoftLight(A,B)  ((uint8)((B < 128)?(2*((A>>1)+64))*((float)B/255):(255-(2*(255-((A>>1)+64))*(float)(255-B)/255))))
#define ChannelBlend_HardLight(A,B)  (ChannelBlend_Overlay(B,A))
#define ChannelBlend_ColorDodge(A,B) ((uint8)((B == 255) ? B:min(255, ((A << 8 ) / (255 - B)))))
#define ChannelBlend_ColorBurn(A,B)  ((uint8)((B == 0) ? B:max(0, (255 - ((255 - A) << 8 ) / B))))
#define ChannelBlend_LinearDodge(A,B)(ChannelBlend_Add(A,B))
#define ChannelBlend_LinearBurn(A,B) (ChannelBlend_Subtract(A,B))
#define ChannelBlend_LinearLight(A,B) ((uint8)((B < 128)?ChannelBlend_LinearBurn(A,(2 * B)):ChannelBlend_LinearDodge(A,(2 * (B - 128)))))
#define ChannelBlend_VividLight(A,B) ((uint8)((B < 128)?ChannelBlend_ColorBurn(A,(2 * B)):ChannelBlend_ColorDodge(A,(2 * (B - 128)))))
#define ChannelBlend_PinLight(A,B)   ((uint8)((B < 128)?ChannelBlend_Darken(A,(2 * B)):ChannelBlend_Lighten(A,(2 * (B - 128)))))
#define ChannelBlend_HardMix(A,B)    ((uint8)((ChannelBlend_VividLight(A,B) < 128) ? 0:255))
#define ChannelBlend_Reflect(A,B)    ((uint8)((B == 255) ? B:min(255, (A * A / (255 - B)))))
#define ChannelBlend_Glow(A,B)       (ChannelBlend_Reflect(B,A))
#define ChannelBlend_Phoenix(A,B)    ((uint8)(min(A,B) - max(A,B) + 255))
#define ChannelBlend_Alpha(A,B,O)    ((uint8)(O * A + (1 - O) * B))
#define ChannelBlend_AlphaF(A,B,F,O) (ChannelBlend_Alpha(F(A,B),A,O))

To blend a single RGB pixel you would do the following:

ImageTColorR = ChannelBlend_Glow(ImageAColorR, ImageBColorR);
ImageTColorG = ChannelBlend_Glow(ImageAColorG, ImageBColorG);
ImageTColorB = ChannelBlend_Glow(ImageAColorB, ImageBColorB);

ImageTColor = RGB(ImageTColorR, ImageTColorG, ImageTColorB);

If we wanted to perform a blend operation with a particular opacity, say 50%:

ImageTColorR = ChannelBlend_AlphaF(ImageAColorR, ImageBColorR, ChannelBlend_Subtract, 0.5F);
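The macros evaluate their arguments more than once, so as a hedged alternative here is a sketch of the same per-channel idea written with functions. It uses the formulas of ChannelBlend_Multiply and ChannelBlend_AlphaF from above; the type struct rgb and the names blend_multiply, blend_alpha and blend_pixel_multiply are hypothetical and not part of the original macro set.

```c
#include <stdint.h>

/* Hypothetical pixel type for this sketch: one byte per channel. */
struct rgb { uint8_t r, g, b; };

/* Multiply blend for one channel, same formula as ChannelBlend_Multiply. */
static uint8_t blend_multiply(uint8_t a, uint8_t b)
{
    return (uint8_t)((a * b) / 255);
}

/* Mix the blended value with the original channel a, as ChannelBlend_AlphaF
   does; opacity is expected to lie in [0, 1]. */
static uint8_t blend_alpha(uint8_t blended, uint8_t a, float opacity)
{
    return (uint8_t)(opacity * blended + (1.0f - opacity) * a);
}

/* Blend pixel a against pixel b, channel by channel, at the given opacity. */
struct rgb blend_pixel_multiply(struct rgb a, struct rgb b, float opacity)
{
    struct rgb t;

    t.r = blend_alpha(blend_multiply(a.r, b.r), a.r, opacity);
    t.g = blend_alpha(blend_multiply(a.g, b.g), a.g, opacity);
    t.b = blend_alpha(blend_multiply(a.b, b.b), a.b, opacity);
    return t;
}
```

With an opacity of 0 the result is pixel a unchanged, and with an opacity of 1 it is the plain multiply blend.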

If you have pointers to the image data for images A, B, and T (our target), we can simplify the blending of all three channels using this macro:

#define ColorBlend_Buffer(T,A,B,M)      (T)[0] = ChannelBlend_##M((A)[0], (B)[0]), \
                                        (T)[1] = ChannelBlend_##M((A)[1], (B)[1]), \
                                        (T)[2] = ChannelBlend_##M((A)[2], (B)[2])

And can derive the following RGB color blend macros:

#define ColorBlend_Normal(T,A,B)        (ColorBlend_Buffer(T,A,B,Normal))
#define ColorBlend_Lighten(T,A,B)       (ColorBlend_Buffer(T,A,B,Lighten))
#define ColorBlend_Darken(T,A,B)        (ColorBlend_Buffer(T,A,B,Darken))
#define ColorBlend_Multiply(T,A,B)      (ColorBlend_Buffer(T,A,B,Multiply))
#define ColorBlend_Average(T,A,B)       (ColorBlend_Buffer(T,A,B,Average))
#define ColorBlend_Add(T,A,B)           (ColorBlend_Buffer(T,A,B,Add))
#define ColorBlend_Subtract(T,A,B)      (ColorBlend_Buffer(T,A,B,Subtract))
#define ColorBlend_Difference(T,A,B)    (ColorBlend_Buffer(T,A,B,Difference))
#define ColorBlend_Negation(T,A,B)      (ColorBlend_Buffer(T,A,B,Negation))
#define ColorBlend_Screen(T,A,B)        (ColorBlend_Buffer(T,A,B,Screen))
#define ColorBlend_Exclusion(T,A,B)     (ColorBlend_Buffer(T,A,B,Exclusion))
#define ColorBlend_Overlay(T,A,B)       (ColorBlend_Buffer(T,A,B,Overlay))
#define ColorBlend_SoftLight(T,A,B)     (ColorBlend_Buffer(T,A,B,SoftLight))
#define ColorBlend_HardLight(T,A,B)     (ColorBlend_Buffer(T,A,B,HardLight))
#define ColorBlend_ColorDodge(T,A,B)    (ColorBlend_Buffer(T,A,B,ColorDodge))
#define ColorBlend_ColorBurn(T,A,B)     (ColorBlend_Buffer(T,A,B,ColorBurn))
#define ColorBlend_LinearDodge(T,A,B)   (ColorBlend_Buffer(T,A,B,LinearDodge))
#define ColorBlend_LinearBurn(T,A,B)    (ColorBlend_Buffer(T,A,B,LinearBurn))
#define ColorBlend_LinearLight(T,A,B)   (ColorBlend_Buffer(T,A,B,LinearLight))
#define ColorBlend_VividLight(T,A,B)    (ColorBlend_Buffer(T,A,B,VividLight))
#define ColorBlend_PinLight(T,A,B)      (ColorBlend_Buffer(T,A,B,PinLight))
#define ColorBlend_HardMix(T,A,B)       (ColorBlend_Buffer(T,A,B,HardMix))
#define ColorBlend_Reflect(T,A,B)       (ColorBlend_Buffer(T,A,B,Reflect))
#define ColorBlend_Glow(T,A,B)          (ColorBlend_Buffer(T,A,B,Glow))
#define ColorBlend_Phoenix(T,A,B)       (ColorBlend_Buffer(T,A,B,Phoenix))

An example would be:

ColorBlend_Glow(TargetPtr, ImageAPtr, ImageBPtr);

The remainder of the Photoshop blend modes involve converting RGB to HLS and back again.

#define ColorBlend_Hue(T,A,B)            ColorBlend_Hls(T,A,B,HueB,LuminationA,SaturationA)
#define ColorBlend_Saturation(T,A,B)     ColorBlend_Hls(T,A,B,HueA,LuminationA,SaturationB)
#define ColorBlend_Color(T,A,B)          ColorBlend_Hls(T,A,B,HueB,LuminationA,SaturationB)
#define ColorBlend_Luminosity(T,A,B)     ColorBlend_Hls(T,A,B,HueA,LuminationB,SaturationA)

#define ColorBlend_Hls(T,A,B,O1,O2,O3) { \
    float64 HueA, LuminationA, SaturationA; \
    float64 HueB, LuminationB, SaturationB; \
    Color_RgbToHls((A)[2],(A)[1],(A)[0], &HueA, &LuminationA, &SaturationA); \
    Color_RgbToHls((B)[2],(B)[1],(B)[0], &HueB, &LuminationB, &SaturationB); \
    Color_HlsToRgb(O1,O2,O3,&(T)[2],&(T)[1],&(T)[0]); \
    }

These functions will be helpful in converting RGB to HLS.

int32 Color_HueToRgb(float64 M1, float64 M2, float64 Hue, float64 *Channel)
{
    if (Hue < 0.0)
        Hue += 1.0;
    else if (Hue > 1.0)
        Hue -= 1.0;

    if ((6.0 * Hue) < 1.0)
        *Channel = (M1 + (M2 - M1) * Hue * 6.0);
    else if ((2.0 * Hue) < 1.0)
        *Channel = (M2);
    else if ((3.0 * Hue) < 2.0)
        *Channel = (M1 + (M2 - M1) * ((2.0F / 3.0F) - Hue) * 6.0);
    else
        *Channel = (M1);

    return TRUE;
}

int32 Color_RgbToHls(uint8 Red, uint8 Green, uint8 Blue, float64 *Hue, float64 *Lumination, float64 *Saturation)
{
    float64 Delta;
    float64 Max, Min;
    float64 Redf, Greenf, Bluef;

    Redf    = ((float64)Red   / 255.0F);
    Greenf  = ((float64)Green / 255.0F);
    Bluef   = ((float64)Blue  / 255.0F); 

    Max     = max(max(Redf, Greenf), Bluef);
    Min     = min(min(Redf, Greenf), Bluef);

    *Hue        = 0;
    *Lumination = (Max + Min) / 2.0F;
    *Saturation = 0;

    if (Max == Min)
        return TRUE;

    Delta = (Max - Min);

    if (*Lumination < 0.5)
        *Saturation = Delta / (Max + Min);
    else
        *Saturation = Delta / (2.0 - Max - Min);

    if (Redf == Max)
        *Hue = (Greenf - Bluef) / Delta;
    else if (Greenf == Max)
        *Hue = 2.0 + (Bluef - Redf) / Delta;
    else
        *Hue = 4.0 + (Redf - Greenf) / Delta;

    *Hue /= 6.0; 

    if (*Hue < 0.0)
        *Hue += 1.0;       

    return TRUE;
}

int32 Color_HlsToRgb(float64 Hue, float64 Lumination, float64 Saturation, uint8 *Red, uint8 *Green, uint8 *Blue)
{
    float64 M1, M2;
    float64 Redf, Greenf, Bluef;

    if (Saturation == 0)
        {
        Redf    = Lumination;
        Greenf  = Lumination;
        Bluef   = Lumination;
        }
    else
        {
        if (Lumination <= 0.5)
            M2 = Lumination * (1.0 + Saturation);
        else
            M2 = Lumination + Saturation - Lumination * Saturation;

        M1 = (2.0 * Lumination - M2);

        Color_HueToRgb(M1, M2, Hue + (1.0F / 3.0F), &Redf);
        Color_HueToRgb(M1, M2, Hue, &Greenf);
        Color_HueToRgb(M1, M2, Hue - (1.0F / 3.0F), &Bluef);
        }

    *Red    = (uint8)(Redf * 255);
    *Blue   = (uint8)(Bluef * 255);
    *Green  = (uint8)(Greenf * 255);

    return TRUE;
}

There are more resources on this topic, mainly:

  1. PegTop blend modes
  2. Forensic Photoshop
  3. Insight into Photoshop 7.0 Blend Modes
  4. SF - Basics - Blending Modes
  5. finish the blend modes
  6. Romz blog
  7. ReactOS RGB-HLS conversion functions

Difference between initialization of static variables in C and C++

17 votes

I was going through the code at http://geeksforgeeks.org/?p=10302

#include<stdio.h>
int initializer(void)
{
    return 50;
}

int main()
{
    static int i = initializer();
    printf(" value of i = %d", i);
    getchar();
    return 0;
}

This code will not compile in C because static variables need to be initialised before main() starts. That is fine. But this code will compile just fine in a C++ compiler.

My question is why it compiles in a C++ compiler when static has the same usage in both languages. Of course compilers will be different for these languages but I am not able to pin point the exact reason. If it is specified in the standard, I would love to know that.

I searched for this question on SO and found 3 similar links, but in vain: Link1 Link2 Link3

Thanks for your help.

It compiles in C++ because C++ needs to support dynamic initialization anyway, or you couldn't have local static or non-local objects with non-trivial constructors.

So since C++ has this complexity anyway, supporting that initialization like you show isn't complicated to add anymore.

In C that would be a big matter because C doesn't have any other reason to support initialization done at program startup (apart from trivial zero initialization). In C, initial values of file-scope or local static objects can always statically be put into the executable image.
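For comparison, a sketch of how the same effect is usually obtained in C: the static objects start from constant values and are filled in on first use. The accessor name get_i and the flag are hypothetical, and the code is not thread-safe as written.

```c
#include <stdbool.h>

static int initializer(void)
{
    return 50;
}

/* C stand-in for "static int i = initializer();": the static objects have
   constant (or zero) initial values that can live in the executable image,
   and the real value is computed at run time on the first call. */
int get_i(void)
{
    static bool initialized = false;  /* constant initializer, valid C */
    static int i;                     /* zero-initialized before main() */

    if (!initialized) {
        i = initializer();
        initialized = true;
    }
    return i;
}
```

Every call to get_i() returns 50, but initializer() runs only once.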

gcc -g vs not -g and strip vs not strip, performance and memory usage?

17 votes

If binary file size is not an issue, are there any drawbacks to using -g and not stripping binaries that are to be run in a performance-critical environment? I have a lot of disk space but the binary is CPU intensive and uses a lot of memory. The binary is loaded once and is alive for several hours.

EDIT:

The reason why I want to use binaries with debugging information is to generate useful core dumps in case of segmentation faults.

The ELF loader loads segments, not sections; the mapping from sections to segments is determined by the linker script used for building the executable.

The default linker script does not map debug sections to any segment, so they are not loaded.

Symbol information comes in two flavours: static symbols are processed out-of-band and never stored as section data; dynamic symbol tables are generated by the linker and added to a special segment that is loaded along with the executable, as it needs to be accessible to the dynamic linker. The strip command only removes the static symbols, which are never referenced in a segment anyway.

So, you can use full debug information through the entire process, and this will not affect the size of the executable image in RAM, as it is not loaded. This also means that the information is not included in core dumps, so this does not give you any benefit here either.

The objcopy utility has a special option to copy only the debug information, so you can generate a second ELF file containing this information and use stripped binaries; when analyzing the core dump, you can then load both files into the debugger:

objcopy --only-keep-debug myprogram myprogram.debug
strip myprogram

is i=f(); defined when f modifies i?

16 votes

Related question: Any good reason why assignment operator isn't a sequence point?

From the comp.lang.c FAQ I would infer that the program below is undefined. Strangely, it only mentions the call to f as a sequence point, between the computation of the arguments and the transfer of control to f. The transfer of control from f back to the calling expression is not listed as a sequence point.

int f(void) { i++; return 42; }
i = f();

Is it really undefined?

As an end-note that I add to many of my questions, I am interested in this in the context of static analysis. I am not writing this myself, I just want to know if I should warn about it in programs written by others.

The transfer of control from f back to the calling expression is not listed as a sequence point.

Yes it is.

at the end of the evaluation of a full expression

 

The complete expression that forms an expression statement, or one of the controlling expressions of an if, switch, while, for, or do/while statement, or the expression in an initializer or a return statement.

You have a return statement, therefore, you have a sequence point.

It doesn't even appear that

int f(void) { return i++; } // sequence point here, so I guess we're good
i = f();

is undefined. (Which to me is kind of weird.)
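A sketch of the case from the question, wrapped in a hypothetical function assign_from_f so that it can be called. The increment inside f is finished by the time f returns, and only then is the return value stored into i.

```c
static int i = 0;

/* Modifies i, then returns. The statement "i++;" ends in a sequence point,
   and so does the full expression of the return statement. */
static int f(void)
{
    i++;
    return 42;
}

/* The store to i can only happen once f has produced its value, so the
   increment is complete before i is overwritten with 42. */
int assign_from_f(void)
{
    i = f();
    return i;
}
```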

quicksort (n arrays should be treated as 1 and values remapped as needed)

Asked on Fri, 13 May 2011 by Steve c++ c
15 votes

I have a linked list of arrays (struct at bottom of post)

Each array may have values like the below example

Array1[] = {6,36,8,23};
Array2[] = {8,23,5,73};
Array3[] = {2,5,1,9};

I need to sort these so that all 3 arrays are treated as 1 large array...

I need to use quicksort so that it uses in-place processing... I am working with very large arrays and cannot afford to use additional memory..

The result should be something like this

Array1[] = {1,2,5,5};
Array2[] = {6,8,8,9};
Array3[] = {23,23,36,73};

Currently i am only able to sort each array individually... but thats not exactly what i need :(

struct iSection {
    unsigned long     Section_Count; // Total # of points in this block of memory

    int              *Section_Arr;   // Point cloud for current block of memory
    struct iSection  *Next;          // Pointer to next section
} iSection;


struct iDatabase {
    struct iSection     *First_Section;
    struct iSection     *Last_Section;
} iDatabase;

It's not that hard; it is more of an interfacing issue than an algorithmic one.

Write a wrapper container that provides an interface for accessing members and writing (say operator[] in C++) and internally it maps the size_t index argument to the right array. This wrapper class does need the size of every array though to be able to correctly map the index.

An example pseudocode operator[] would be:

int& JointDatabase::operator[](size_t index) {
    // database is an iDatabase
    iSection *cur = database.First_Section;

    while (cur != database.Last_Section && index >= cur->Section_Count) {
        index -= cur->Section_Count;
        cur = cur->Next;
    }

    return cur->Section_Arr[index];
}

Then use this wrapper class as you would use a normal container in your Quicksort algorithm.
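The same idea can be sketched in plain C. The struct layouts follow the question; joint_at, joint_swap and joint_quicksort are hypothetical names, and the partition scheme (Lomuto, last element as pivot) is just one simple choice. Each element access walks the section list, so this trades speed for the in-place property.

```c
#include <stddef.h>

struct iSection {
    unsigned long    Section_Count;  /* number of values in Section_Arr */
    int             *Section_Arr;
    struct iSection *Next;
};

struct iDatabase {
    struct iSection *First_Section;
    struct iSection *Last_Section;
};

/* Map a global index onto the section that holds it. Assumes index is
   smaller than the total number of values in the database. */
static int *joint_at(struct iDatabase *db, size_t index)
{
    struct iSection *cur = db->First_Section;

    while (cur != db->Last_Section && index >= cur->Section_Count) {
        index -= cur->Section_Count;
        cur = cur->Next;
    }
    return &cur->Section_Arr[index];
}

/* Swap the values at two global indices. */
static void joint_swap(struct iDatabase *db, size_t a, size_t b)
{
    int *pa = joint_at(db, a);
    int *pb = joint_at(db, b);
    int tmp = *pa;

    *pa = *pb;
    *pb = tmp;
}

/* In-place quicksort over the global index range [lo, hi]. */
void joint_quicksort(struct iDatabase *db, size_t lo, size_t hi)
{
    if (lo >= hi)
        return;

    int pivot = *joint_at(db, hi);
    size_t store = lo;

    /* Move everything smaller than the pivot to the front of the range. */
    for (size_t i = lo; i < hi; i++) {
        if (*joint_at(db, i) < pivot) {
            joint_swap(db, i, store);
            store++;
        }
    }
    joint_swap(db, store, hi);       /* pivot lands in its final position */

    if (store > lo)                  /* avoid size_t underflow on store - 1 */
        joint_quicksort(db, lo, store - 1);
    joint_quicksort(db, store + 1, hi);
}
```

For three sections of four values each, the whole database is sorted with joint_quicksort(&db, 0, 11).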

function with the same name as a macro

15 votes
#include<stdio.h>
void f(int a)
{
printf("%d", a);
}
#define f(a) {}

int main()
{
 /* call f : function */
}

How to call f (the function)? Writing f(3) doesn't work because it is replaced by {}

Does (f)(3); work?

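The C preprocessor doesn't expand the macro f inside ( ): a function-like macro is only expanded when the next token after its name is an opening parenthesis, and in (f)(3) the name f is followed by a closing one.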
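A small sketch that makes the difference observable. It departs from the question in two labeled ways: the function returns a value instead of printing, and the macro expands to (-1) instead of {}; call_macro and call_function are hypothetical wrapper names.

```c
/* A function and a function-like macro that share the name f. */
static int f(int a)
{
    return a * 2;
}
#define f(a) (-1)

/* f followed by '(' is the macro invocation. */
int call_macro(void)
{
    return f(3);        /* expands to (-1) */
}

/* In (f)(3) the name f is followed by ')', so the preprocessor leaves it
   alone and the function is called. */
int call_function(void)
{
    return (f)(3);
}
```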


Macros are evil?

14 votes
#if sizeof(int) != 4
/* do something */

Using sizeof inside #if doesn't work while inside #define it works, why?

#define size(x) sizeof(x)/sizeof(x[0]) /*works*/

Nothing is evil - everything can be misused, or in your case misunderstood. The sizeof operator is a compiler feature, but compiler features are not available to the preprocessor (which runs before the compiler gets involved), and so cannot be used in #if preprocessor directives.

However, when you say:

#define size(x) sizeof(x)/sizeof(x[0])

and use it:

size(a)

the preprocessor performs a textual substitution that is handed to the compiler:

sizeof(a)/sizeof(a[0])
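If the goal of the #if was to test the width of int, one common alternative is to compare the limit macros from <limits.h>, which the preprocessor can evaluate. The sketch below assumes that approach; INT_IS_32_BIT, ARRAY_SIZE and count_demo are hypothetical names, and ARRAY_SIZE is simply a parenthesized variant of size(x).

```c
#include <limits.h>
#include <stddef.h>

/* The preprocessor cannot evaluate sizeof, but it can compare the integer
   limits that <limits.h> defines as macros. */
#if INT_MAX == 2147483647
#define INT_IS_32_BIT 1
#else
#define INT_IS_32_BIT 0
#endif

/* Parenthesized version of size(x); only meaningful for real arrays,
   not for pointers. */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* The macro is replaced textually, and the compiler then evaluates sizeof. */
size_t count_demo(void)
{
    int a[10];

    return ARRAY_SIZE(a);
}
```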

What is the reason function names are prefixed with an underscore by the compiler?

13 votes

When I see the assembly code of a C app, like this:

emacs hello.c
clang -S -O hello.c -o hello.s
cat hello.s

Function names are prefixed with an underscore (e.g. callq _printf). Why is this done and what advantages does it have?


Example:

hello.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>


int main() {
  char *myString = malloc(strlen("Hello, World!") + 1);
  /* TODO: Check if malloc returns NULL.
     Why the hell doesn't Clang warn me about this when I forget this?! */
  memcpy(myString, "Hello, World!", strlen("Hello, World!") + 1);
  printf("%s", myString);
  /* TODO: Free myString. */
  return 0;
}

hello.s (without stuff that doesn't matter now, note that strlen is optimized and memcpy is not a function, but a macro)

_main:                       ; Here
Leh_func_begin0:
    pushq   %rbp
Ltmp0:
    movq    %rsp, %rbp
Ltmp1:
    movl    $14, %edi        ; strlen is optimized
    callq   _malloc          ; Here
    movabsq $6278066737626506568, %rcx
    movq    %rcx, (%rax)
    movw    $33, 12(%rax)
    movl    $1684828783, 8(%rax)
    leaq    L_.str1(%rip), %rdi
    movq    %rax, %rsi
    xorb    %al, %al
    callq   _printf          ; Here
    xorl    %eax, %eax
    popq    %rbp
    ret
Leh_func_end0:

From Linkers and Loaders:

At the time that UNIX was rewritten in C in about 1974, its authors already had extensive assembler language libraries, and it was easier to mangle the names of new C and C-compatible code than to go back and fix all the existing code. Now, 20 years later, the assembler code has all been rewritten five times, and UNIX C compilers, particularly ones that create COFF and ELF object files, no longer prepend the underscore.

Prepending an underscore in the assembly results of C compilation is just a name-mangling convention that arose as a workaround. It stuck around for (as far as I know) no particular reason, and has now made its way into Clang.

Outside of assembly, the C standard library often has implementation-defined functions prefixed with an underscore to convey notions of magicalness and don't touch this to the ordinary programmers that stumble across them.

Why does modulus division (`%`) only work with integers?

12 votes

I recently ran into an issue that would be easy to solve using modulus division, but the input was not an integer:

Given a function which repeats (ex: sin), and a computer function that can only compute it when the range is within one segment of that range (ex: -pi to pi), make a function that can handle any input.

The "obvious" solution is something like:

#include <cmath>
float sin(float x){
    return limited_sin((x + M_PI) % (2 *M_PI) - M_PI);
}

Why doesn't this work? I get this error:

error: invalid operands of types double and double to binary operator %

Interestingly, it does work in Python:

def sin(x):
    return limited_sin((x + math.pi) % (2 * math.pi) - math.pi)

Because the normal mathematical notion of "remainder" is only applicable to integer division, that is, division that is required to generate an integer quotient.

In order to extend the concept of "remainder" to real numbers you have to introduce a new kind of "hybrid" operation that generates an integer quotient for real operands. The core C language does not support such an operation, but it is provided by the standard library function fmod, as well as by the remainder function in C99.

Call go functions from C

12 votes

I am trying to create a static object written in Go to interface with a C program (say, a kernel module or something).

I have found documentation on calling C functions from Go, but I haven't found much on how to go the other way. What I've found is that it's possible, but complicated.

Here is what I found:

Blog post about callbacks between C and Go

Cgo documentation

Golang mailing list post

Does anyone have experience with this? In short, I'm trying to create a PAM module written entirely in Go.

You can call Go code from C. It is a confusing proposition, though.

The process is outlined in the blog post you linked to. But I can see how that isn't very helpful. Here is a short snippet without any unnecessary bits. It should make things a little clearer.

package foo

// extern int goCallbackHandler(int, int);
//
// static int doAdd(int a, int b) {
//     return goCallbackHandler(a, b);
// }
import "C"

//export goCallbackHandler
func goCallbackHandler(a, b C.int) C.int {
    return a + b
}

// This is the public function, callable from outside this package.
// It forwards the parameters to C.doAdd(), which in turn forwards
// them back to goCallbackHandler(). This one performs the addition
// and yields the result.
func MyAdd(a, b int) int {
   return int( C.doAdd( C.int(a), C.int(b)) )
}

The order in which everything is called is as follows:

foo.MyAdd(a, b) ->
  C.doAdd(a, b) ->
    C.goCallbackHandler(a, b) ->
      foo.goCallbackHandler(a, b)

The key to remember here is that a callback function must be marked with the //export comment on the Go side and as extern on the C side. This means that any callback you wish to use, must be defined inside your package.

In order to allow a user of your package to supply a custom callback function, we use the exact same approach as above, but we supply the user's custom handler (which is just a regular Go function) as a parameter that is passed onto the C side as void*. It is then received by the callbackhandler in our package and called.

Let's use a more advanced example I am currently working with. In this case, we have a C function that performs a pretty heavy task: It reads a list of files from a USB device. This can take a while, so we want our app to be notified of its progress. We can do this by passing in a function pointer that we defined in our program. It simply displays some progress info to the user whenever it gets called. Since it has a well known signature, we can assign it its own type:

type ProgressHandler func(current, total uint64, userdata interface{}) int

This handler takes some progress info (current # of files received and total number of files) along with an interface{} value which can hold anything the user needs it to hold.

Now we need to write the C and Go plumbing to allow us to use this handler. Luckily the C function I wish to call from the library allows us to pass in a userdata struct of type void*. This means it can hold whatever we want it to hold, no questions asked and we will get it back into the Go world as-is. To make all this work, we do not call the library function from Go directly, but we create a C wrapper for it: goGetFiles(). It is this wrapper that actually supplies our Go callback to the C library, along with a userdata object.

package foo

// #include <somelib.h>
// extern int goProgressCB(uint64_t current, uint64_t total, void* userdata);
// 
// static int goGetFiles(some_t* handle, void* userdata) {
//    return somelib_get_files(handle, goProgressCB, userdata);
// }
import "C"
import "unsafe"

Note that the goGetFiles() function does not take any function pointers for callbacks as parameters. Instead, the callback that our user has supplied is packed in a custom struct that holds both that handler and the user's userdata value. We pass this into goGetFiles() as the userdata parameter.

// Defines the signature of our user's progress handler.
type ProgressHandler func(current, total uint64, userdata interface{}) int 

// This is an internal type which will pack the users callback function and userdata.
// It is an instance of this type that we will actually be sending to the C code.
type progressRequest struct {
   f ProgressHandler  // The user's function pointer
   d interface{}      // The user's userdata.
}

//export goProgressCB
func goProgressCB(current, total C.uint64_t, userdata unsafe.Pointer) C.int {
    // This is the function called from the C world by our expensive
    // C.somelib_get_files() function. The userdata value contains an instance
    // of *progressRequest. We unpack it and use its values to call the
    // actual function that our user supplied.
    req := (*progressRequest)(userdata)

    // Call req.f with our parameters and the user's own userdata value.
    return C.int( req.f( uint64(current), uint64(total), req.d ) )
}

// This is our public function, which is called by the user and
// takes a handle to something our C lib needs, a function pointer
// and optionally some user defined data structure. Whatever it may be.
func GetFiles(h *Handle, pf ProgressHandler, userdata interface{}) int {
   // Instead of calling the external C library directly, we call our C wrapper.
   // We pass it the handle and an instance of progressRequest.

   req := unsafe.Pointer(&progressRequest{ pf, userdata })
   return int(C.goGetFiles( (*C.some_t)(h), req ))
}

That's it for our C bindings. The user's code is now very straight forward:

package main

import "foo"
import "fmt"

func main() {
    handle := SomeInitStuff()

    // We call GetFiles. Pass it our progress handler and some
    // arbitrary userdata (could just as well be nil).
    ret := foo.GetFiles( handle, myProgress, "Callbacks rock!" )

    ....
}

// This is our progress handler. Do something useful like displaying
// the progress percentage.
func myProgress(current, total uint64, userdata interface{}) int {
    fc := float64(current)
    ft := float64(total) * 0.01

    // print how far along we are.
    // eg: 500 / 1000 (50.00%)
    // For good measure, prefix it with our userdata value, which
    // we supplied as "Callbacks rock!".
    fmt.Printf("%s: %d / %d (%3.2f%%)\n", userdata.(string), current, total, fc / ft)
    return 0
}

This all looks a lot more complicated than it is. The call order is the same as in our previous example, but we get two extra calls at the end of the chain. The order in which everything is called is as follows:

foo.GetFiles(....) ->
  C.goGetFiles(...) ->
    C.somelib_get_files(..) ->
      C.goProgressCB(...) ->
        foo.goProgressCB(...) ->
           main.myProgress(...)

fwrite chokes on "<?xml version"

11 votes

When the string <?xml version is written to a file via fwrite, the subsequent writing operations become slower.

This code :

#include <cstdio>
#include <ctime>
#include <iostream>

int main()
{
    const long index(15000000); 

    clock_t start_time(clock());
    FILE*  file_stream1 = fopen("test1.txt","wb");
    fwrite("<?xml version",1,13,file_stream1);
    for(auto i = 1;i < index ;++i)
        fwrite("only 6",1,6,file_stream1);
    fclose(file_stream1);

    std::cout << "\nOperation 1 took : " 
        << static_cast<double>(clock() - start_time)/CLOCKS_PER_SEC 
        << " seconds.";


    start_time = clock();
    FILE*  file_stream2 = fopen("test2.txt","wb");
    fwrite("<?xml versioX",1,13,file_stream2);
    for(auto i = 1;i < index ;++i)
        fwrite("only 6",1,6,file_stream2);
    fclose(file_stream2);

    std::cout << "\nOperation 2 took : " 
        << static_cast<double>(clock() - start_time)/CLOCKS_PER_SEC 
        << " seconds.";


    start_time = clock();
    FILE*  file_stream3 = fopen("test3.txt","w");
    const char test_str3[] = "<?xml versioX";
    for(auto i = 1;i < index ;++i)
        fwrite(test_str3,1,13,file_stream3);
    fclose(file_stream3);

    std::cout << "\nOperation 3 took : " 
        << static_cast<double>(clock() - start_time)/CLOCKS_PER_SEC 
        << " seconds.\n";

    return 0;
}

Gives me this result :

Operation 1 took : 3.185 seconds.
Operation 2 took : 2.025 seconds.
Operation 3 took : 2.992 seconds.

That is, when we replace the string "<?xml version" (operation 1) with "<?xml versioX" (operation 2), the result is significantly faster. The third operation is as fast as the first even though it writes twice as many characters.

Can anyone reproduce this?

Windows 7, 32bit, MSVC 2010

EDIT 1

After R.. suggestion, disabling Microsoft Security Essentials restores normal behavior.

On Windows, most (all?) anti-virus software works by hooking into the file read and/or write operations to run the data being read or written against virus patterns and classify it as safe or virus. I suspect your anti-virus software, once it sees an XML header, loads up the XML-malware virus patterns and from that point on starts constantly checking to see if the XML you're writing to disk is part of a known virus.

Of course this behavior is utterly nonsensical and is part of what gives AV programs such a bad reputation with competent users, who see their performance plummet as soon as they turn on AV. The same goal could be accomplished in other ways that don't ruin performance. Here are some ideas they should be using:

  • Only scan files once at transitions between writing and reading, not after every write. Even if you did write a virus to disk, it doesn't become a threat until it subsequently gets read by some process.
  • Once a file is scanned, remember that it's safe and don't scan it again until it's modified.
  • Only scan files that are executable programs or that are detected as being used as script/program-like data by another program.

Unfortunately I don't know of any workaround until AV software makers wise up, other than turning your AV off... which is generally a bad idea on Windows.

Is explicitly clearing/zeroing sensitive variables after use sensible?

9 votes

I have noticed some programs explicitly zero sensitive memory allocations after use. For example, OpenSSL has a method to clear the memory occupied by an RSA key:

"Frees the RSA structure rsa. This function should always be used to free the RSA structure as it also frees sub-fields safely by clearing memory first."

http://www.rsa.com/products/bsafe/documentation/sslc251html/group__COMMON__RSA__KEY__FUNCS.html#aRSA_free

Where a (C/C++) program contains sensitive variables like this, should you explicitly zero the memory, as above? (Or is zeroing memory an act of paranoia, or just a safeguard?)

Also, when a program finishes, any allocated memory is eventually allocated to another program. On a Linux system, is the memory cleaned or sanitised before being allocated to another program? Or, can the second program read some of the old memory contents of the first program?

On a Linux system, is the memory cleaned or sanitised before being allocated to another program?

Yes, on any respectable desktop OS, memory is sanitised when it passes from one process to another. The cleaning step that you have observed protects against other attacks: code executing in the same address space, or code that obtains privileges allowing it to read the target process's memory.

Where any (C/C++) program contains sensitive variables like this, should you explicitly zero the memory, as above?

It's a very sensible safeguard to erase this sensitive data as soon as you don't need it any more.
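As a rough sketch of what such clearing can look like in plain C: a plain memset() on a buffer that is never read again may be removed by the optimiser as a dead store, so one common approach is to write through a volatile pointer. The names secure_zero and use_and_wipe_key below are made up for illustration and are not OpenSSL's API; where available, library routines such as OpenSSL's OPENSSL_cleanse or glibc's explicit_bzero serve the same purpose.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: overwrite n bytes at p with zeros.
 * Writing through a volatile pointer discourages the compiler from
 * dropping the stores as "dead" when the buffer is not read afterwards. */
static void secure_zero(void *p, size_t n)
{
    volatile unsigned char *vp = (volatile unsigned char *)p;
    while (n--)
        *vp++ = 0;
}

/* Hypothetical usage: copy a secret into a local buffer, use it, wipe it.
 * Returns the number of non-zero bytes left in the buffer afterwards. */
static int use_and_wipe_key(const char *secret)
{
    char key[64];
    size_t i;
    int leftover = 0;

    strncpy(key, secret, sizeof key - 1);
    key[sizeof key - 1] = '\0';

    /* ... use key here ... */

    secure_zero(key, sizeof key);   /* erase as soon as it is no longer needed */

    for (i = 0; i < sizeof key; i++)
        if (key[i] != 0)
            leftover++;
    return leftover;
}
```

Note that this only covers the buffer you wipe; copies the program made elsewhere (registers, other buffers, swap) are outside the scope of this sketch.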

Static global variable and static local variable in driver function

7 votes

Hi,

In one of my sample Linux kernel modules, I have a variable Device_Open declared static outside all functions and a static variable counter declared inside a function device_open. Inside device_open, I increment both Device_Open and counter. The module is inserted into the kernel without any errors, and I created a device file for my module, /dev/chardev.

I do cat /dev/chardev. What I see is that counter gets incremented on each invocation of cat /dev/chardev, but Device_Open always remains 0. Why do the two variables behave differently when incremented?

Below is the code snippet for understanding

static int Device_Open = 0;

static int device_open(struct inode *inode, struct file *file)
{
    static int counter = 0;

    printk(KERN_INFO "Device_Open = %d", Device_Open);
    printk(KERN_INFO "counter = %d", counter);

    if (Device_Open)
        return -EBUSY;

    Device_Open++;
    counter++;

    try_module_get(THIS_MODULE);

    return SUCCESS;
}

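I searched for "Device_Open" and found the corresponding device release function in the TLDP example. Are you sure your module doesn't have it? It decrements Device_Open every time the device file is closed, so the value is back to 0 by the next open, whereas nothing ever decrements counter.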

static int device_release(struct inode *inode, struct file *file)
{
#ifdef DEBUG
    printk(KERN_INFO "device_release(%p,%p)\n", inode, file);
#endif

    /* 
     * We're now ready for our next caller 
     */
    Device_Open--;

    module_put(THIS_MODULE);
    return SUCCESS;
}
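To make the difference visible without loading a module, here is a small user-space simulation of the same open/release pairing. It is only a sketch: sim_device_open, sim_device_release, sim_cat and SIM_EBUSY are hypothetical stand-ins for the kernel callbacks and error code, and each simulated cat is assumed to open and then close the device once.

```c
#include <stdio.h>

#define SUCCESS   0
#define SIM_EBUSY 16   /* stand-in for the kernel's EBUSY in this sketch */

static int Device_Open = 0;   /* file-scope static, shared by open and release */

/* Returns the function-local counter after the call,
 * or -SIM_EBUSY if the device is already open. */
static int sim_device_open(void)
{
    static int counter = 0;   /* only ever incremented, never reset */

    if (Device_Open)
        return -SIM_EBUSY;

    Device_Open++;
    counter++;
    return counter;
}

static int sim_device_release(void)
{
    Device_Open--;            /* ready for the next caller */
    return SUCCESS;
}

/* Simulates n runs of `cat /dev/chardev`: each run opens, then closes. */
static int sim_cat(int n)
{
    int last = 0, i;

    for (i = 0; i < n; i++) {
        printf("before open: Device_Open = %d\n", Device_Open);
        last = sim_device_open();
        if (last > 0)
            sim_device_release();
    }
    return last;
}
```

Every line printed before an open shows Device_Open as 0, because the matching release has already undone the increment, while the value returned by sim_device_open keeps growing.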

How to achieve lock-free, but blocking behavior?

6 votes

I'm implementing a lock-free single producer single consumer queue for an intensive network application. I have a bunch of worker threads receiving work in their own separate queues, which they then dequeue and process.

Removing the locks from these queues has greatly improved performance under high load, but the queues no longer block when they are empty, which in turn causes CPU usage to skyrocket.

How can I efficiently cause a thread to block until it can successfully dequeue something or is killed/interrupted?

If you're on Linux, look into using a futex. In the uncontended case it gives you the performance of a lock-free implementation, because it uses atomic operations in user space instead of making kernel calls. Should the thread need to go idle because some condition is not true (e.g. lock contention, or an empty queue), it then makes the appropriate kernel calls to put the thread to sleep and wake it up again at a future event. It's basically like a very fast semaphore.
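A minimal sketch of that idea, assuming Linux and a C11 compiler: an "items available" counter that the producer bumps after each enqueue and the consumer claims before each dequeue. The names work_signal, signal_post and signal_wait are invented for this example, and it assumes atomic_uint has the same 32-bit layout the futex call expects (true on common Linux targets). glibc has no futex() wrapper, so the raw syscall is used.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Sleeps only if *addr still equals expected; otherwise returns -1
 * with errno set to EAGAIN. May also return early (e.g. EINTR). */
static long futex_wait(atomic_uint *addr, unsigned expected)
{
    return syscall(SYS_futex, addr, FUTEX_WAIT_PRIVATE, expected, NULL, NULL, 0);
}

/* Wakes up to n threads sleeping on addr; returns how many were woken. */
static long futex_wake(atomic_uint *addr, int n)
{
    return syscall(SYS_futex, addr, FUTEX_WAKE_PRIVATE, n, NULL, NULL, 0);
}

/* Hypothetical counter of items the consumer may still dequeue. */
typedef struct { atomic_uint items; } work_signal;

/* Producer: call after pushing one item onto the lock-free queue. */
static void signal_post(work_signal *s)
{
    atomic_fetch_add(&s->items, 1);
    futex_wake(&s->items, 1);   /* real code would skip this when nobody sleeps */
}

/* Consumer: returns once one item has been claimed. */
static void signal_wait(work_signal *s)
{
    for (;;) {
        unsigned n = atomic_load(&s->items);
        if (n > 0) {
            /* Fast path: claim an item with one atomic op, no syscall. */
            if (atomic_compare_exchange_weak(&s->items, &n, n - 1))
                return;
            continue;           /* lost a race or spurious failure: retry */
        }
        /* Slow path: sleep in the kernel while the counter is still 0. */
        futex_wait(&s->items, 0);
    }
}
```

The consumer would call signal_wait before each dequeue. Since futex_wait returns with EINTR when a signal arrives, a real implementation could check a shutdown flag at the top of the loop to handle the killed/interrupted case.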

What to do when FreeLibrary API call fails?

6 votes

Question

I have a third-party DLL that throws an unhandled exception when attempting to unload it from my native C application. This results in the call to FreeLibrary failing, and the module remaining loaded in my process.

Are there any options to forcibly unload the library?

What do you do when the FreeLibrary call fails?

Additional Background

When using load-time dynamic linking this is annoying enough, but ultimately the application gets torn down by the OS. The problem comes when using run-time dynamic linking. I load up this DLL, use it, and then in some instances I need to unload it from my process's virtual address space and continue running. When I call FreeLibrary on the third-party library, it does some cleanup work (i.e. in DllMain, when it is called with DLL_PROCESS_DETACH). While it's doing its cleanup, it throws an exception which it doesn't handle, and which bubbles up as an unhandled exception to FreeLibrary. This results in the call failing and the module remaining loaded.

I've put a ticket in with the vendor so hopefully I can get a fix which will allow this specific library to unload successfully. In case I don't however, and for the general case of this issue, I'm curious as to what the options are.

If all you need is to unload the DLL from memory, you can use

UnmapViewOfFile

passing the base address of your loaded DLL as the argument.

Example:

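HINSTANCE hInst = LoadLibrary( "path_to_dll" );

/* ... use the library ... */

/* Try the normal unload first; fall back to unmapping only if it fails. */
if( hInst != NULL && !FreeLibrary( hInst ) )
{
   /* GetLastError() returns a DWORD (unsigned long), hence %lX. */
   fprintf( stderr, "Couldn't unload library. Error Code: %02lX. Attempting to unmap...\n", GetLastError() );
   if( !UnmapViewOfFile( hInst ) )
   { 
     fprintf( stderr, "Couldn't unmap the file! Error Code: %02lX\n", GetLastError( ) );
   }
}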

Or if it's a library that you didn't explicitly load (e.g. a library dependency that was loaded by a library that you loaded) and you don't have the handle, then use GetModuleHandle:

HINSTANCE hInst = GetModuleHandle( "dllname_you_didn't_load" );
if( hInst != NULL )
{
   if( !UnmapViewOfFile( hInst ) )
   { 
     fprintf( stderr, "Couldn't unmap the file! Error Code: %02lX\n", GetLastError( ) );
   }
}