Saturday, July 14, 2012

gcc name demangling

According to Wikipedia, name mangling is a "way of encoding additional information in the name of a function, structure, class or another datatype in order to pass more semantic information from the compilers to linkers".

The need for name mangling arises because the name of a symbol in an object file is very restricted. It cannot contain spaces, brackets or colons, for example. With name mangling, the linker is able to distinguish overloaded functions in C++ like
int bla();
int bla(double);
The Wikipedia article and this describe name mangling quite well.
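
To make the overload example concrete: under the Itanium C++ ABI that g++ follows, the two declarations above should end up as the symbols _Z3blav and _Z3blad (3 is the length of "bla", v stands for an empty parameter list, d for double; the return type is not encoded for ordinary functions). Here is a minimal sketch that turns such names back into readable form. It assumes g++ or clang++ with <cxxabi.h> available, and the helper name demangled is made up for illustration.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the demangled form of `mangled`,
// or the input unchanged if it cannot be demangled.
std::string demangled(const char *mangled)
{
    int status = 0;
    char *name = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string result = (status == 0 && name) ? name : mangled;
    std::free(name); // the buffer comes from malloc(); free(nullptr) is a no-op
    return result;
}
```

Calling demangled("_Z3blav") and demangled("_Z3blad") shows that the two overloads really are two different symbols for the linker.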

You'll see a lot of mangled names when you try
nm someexecutable

So I created a small C++ program that demangles the names. Save the code below in a file named mydemangle.cpp, and compile it with
g++ -o mydemangle mydemangle.cpp

Now you can try it with
mydemangle _ZN9wikipedia7article6formatEv
and the result is
wikipedia::article::format()

But mydemangle can do more: it can read input from stdin. So you might try
nm someexecutable | mydemangle 

Now mydemangle demangles all names it can demangle and leaves the rest as it is.

There's a quite handy way in C++ to demangle names via the function
char * __cxa_demangle (const char *mangled_name, char *output_buffer, size_t *length, int *status);


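Be aware that this will not be available for all processors/compilers: the function is declared in <cxxabi.h> (namespace abi) and is part of the C++ ABI support that ships with GCC (Clang provides it as well), so a compiler such as MSVC does not offer it. The returned buffer is allocated with malloc() and has to be released with free().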
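
The status value tells you why a call failed, and the result has to be freed by the caller. A small sketch of both points, assuming g++ or clang++; the names try_demangle and status_text are hypothetical helpers, not part of any library. The status values (0, -1, -2, -3) are the ones documented for __cxa_demangle.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <memory>
#include <string>

// Hypothetical helper: writes the demangled name to `out` on success and
// returns the status reported by __cxa_demangle (0 means success).
int try_demangle(const char *mangled, std::string &out)
{
    int status = 0;
    // unique_ptr with free() as deleter releases the malloc()ed buffer
    std::unique_ptr<char, void (*)(void *)> name(
        abi::__cxa_demangle(mangled, nullptr, nullptr, &status), std::free);
    if (status == 0 && name)
        out = name.get();
    return status;
}

// Hypothetical helper: human-readable text for the documented status values.
const char *status_text(int status)
{
    switch (status) {
    case 0:  return "success";
    case -1: return "memory allocation failure";
    case -2: return "not a valid mangled name";
    case -3: return "invalid argument";
    default: return "unknown status";
    }
}
```

Passing 0 for output_buffer and length, as mydemangle does, lets the function allocate a buffer of the right size itself, which is the simplest way to use it.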



Here's the code:

mydemangle.cpp

#include <iostream>
#include <cxxabi.h>
#include <stdlib.h>
#include <string>
#include <stdio.h>
using namespace std;
int main(int argc, char *argv[])
{
    int status;
    if(argc == 2)
    {
        // we have one argument: demangle it
        char *realname = abi::__cxa_demangle(argv[1], 0, 0, &status);
        if(status==0)
            cout << "\n" << realname << "\n";
        else
            cout << "\ncould not demangle " << argv[1] << "\n";
        free(realname);
    }
    else
    {
        // we have no argument: read from stdin
        // c must be an int so that EOF can be told apart from every valid char.
        // lastc starts as newline so a name at the very start of the input is found.
        int c, lastc = '\n';
        string s;
        while((c = getchar()) != EOF)
        {
            // mangled names start with _ . consider only if last sign was space or tab or newline
            if(c == '_' && (lastc==' ' || lastc == '\n' || lastc == '\t'))
            {
                s = "_";
                // add all characters to the string until space, tab or newline
                while((c = getchar()) != EOF)
                {
                    if(c == ' ' || c == '\n' || c == '\t')
                        break;
                    s += static_cast<char>(c);
                }
                // some compilers add an additional _ in front. skip it.
                const char *p = s.c_str();
                if(s.length() > 2 && s[1] == '_') p = p+1;
                char *realname = abi::__cxa_demangle(p, 0, 0, &status);
                if(status==0)
                    cout << realname; // demangling successful
                else
                    cout << s; // demangling failed, print the original string
                free(realname);
                if(c == EOF)
                    break; // input ended inside a name: no separator left to print
            }
            cout << static_cast<char>(c);
            lastc = c;
        }
    }
    return 0;
}

Of course, there is also an easy way to do it in bash. This requires demangle (part of KDE Dev Kit) and is not as fast and convenient as the method above.
#!/bin/bash
while read LINE
do
    for WORD in $LINE
    do
        echo -n "$WORD" | demangle
    done
    echo
done
