Kernel Debugging with BPF

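For a long time my understanding of BPF stopped at its role in packet capture with the tcpdump tool. In recent years BPF has become a top-level kernel subsystem, evolving from a kernel support module for packet capture into a sophisticated facility for kernel debugging and performance tracing. It debugs the kernel (and can also debug applications) dynamically, yet this does not require building and loading a kernel module. Instead, the debugging program is compiled to eBPF bytecode and loaded into the kernel through the bpf system call, where it is either interpreted by the in-kernel BPF virtual machine or compiled to machine code by the in-kernel BPF JIT and then run. Compared with traditional GDB debugging, this mechanism has lower latency and less performance overhead, and it is also more flexible. BPF does not replace GDB, however; it is better seen as an important complement for kernel debugging and tracing: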

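  • GDB is mostly used to debug applications, whereas BPF is mostly used for kernel tracing, performance analysis, and debugging;

  • GDB runs on many platforms, including Windows, whereas the BPF tooling described here runs only on Linux;

  • GDB can debug many kinds of targets, for example microcontroller firmware through a J-Link probe, whereas BPF can only debug the Linux kernel and Linux applications;

  • GDB debugging is expensive and may well affect the performance of the program being debugged, whereas BPF consumes few resources and has little impact on performance.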
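
The loading path described in the introduction (bytecode handed to the kernel through the bpf system call) can be sketched in a few lines of C. This is only an illustration under assumptions, not how libbpf does it: the names tiny_prog and load_tiny_prog are made up for this example, and the socket-filter program type is chosen merely because it is one of the simplest to load. The call fails with a negative return value if the caller lacks the required privileges or the kernel has no BPF support.

```c
#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Smallest useful eBPF program: "r0 = 0; exit".
 * tiny_prog is a hypothetical name used only in this sketch. */
static const struct bpf_insn tiny_prog[] = {
    { .code = BPF_ALU64 | BPF_MOV | BPF_K, .dst_reg = BPF_REG_0, .imm = 0 },
    { .code = BPF_JMP | BPF_EXIT },
};

/* Hand the bytecode to the kernel with bpf(BPF_PROG_LOAD, ...).
 * Returns a program file descriptor on success, -1 on failure. */
int load_tiny_prog(void)
{
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER; /* assumption: simplest type */
    attr.insn_cnt  = sizeof(tiny_prog) / sizeof(tiny_prog[0]);
    attr.insns     = (uint64_t)(uintptr_t)tiny_prog;
    attr.license   = (uint64_t)(uintptr_t)"GPL";

    /* glibc has no bpf() wrapper, so the raw syscall is used. */
    return (int)syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
}
```

Before the program is accepted, the in-kernel verifier checks the instructions; only then is the bytecode interpreted or JIT-compiled as described above.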

Kernel configuration for 64-bit ARM/Linux

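Because BPF is a Linux kernel subsystem, the first step is to enable BPF and its related options in the kernel configuration. The following is the output of the bpftool feature command on my Linux/aarch64 device: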

root@OpenWrt:/# uname -a
Linux OpenWrt 5.4.123+ #1 SMP Sat Oct 17 20:53:48 CST 2021 aarch64 GNU/Linux
root@OpenWrt:/# bpftool feature
Scanning system configuration...
bpf() syscall for unprivileged users is enabled
JIT compiler is enabled
JIT compiler hardening is disabled
JIT compiler kallsyms exports are disabled
Global memory limit for JIT compiler for unprivileged users is 33554432 bytes
CONFIG_BPF is set to y
CONFIG_BPF_SYSCALL is set to y
CONFIG_HAVE_EBPF_JIT is set to y
CONFIG_BPF_JIT is set to y
CONFIG_BPF_JIT_ALWAYS_ON is set to y
CONFIG_CGROUPS is set to y
CONFIG_CGROUP_BPF is set to y
CONFIG_CGROUP_NET_CLASSID is set to y
CONFIG_SOCK_CGROUP_DATA is set to y
CONFIG_BPF_EVENTS is set to y
CONFIG_KPROBE_EVENTS is set to y
CONFIG_UPROBE_EVENTS is set to y
CONFIG_TRACING is set to y
CONFIG_FTRACE_SYSCALLS is set to y
CONFIG_FUNCTION_ERROR_INJECTION is set to y
CONFIG_BPF_KPROBE_OVERRIDE is set to y
CONFIG_NET is set to y
CONFIG_XDP_SOCKETS is set to y
CONFIG_LWTUNNEL_BPF is set to y
CONFIG_NET_ACT_BPF is set to m
CONFIG_NET_CLS_BPF is set to m
CONFIG_NET_CLS_ACT is set to y
CONFIG_NET_SCH_INGRESS is not set
CONFIG_XFRM is not set
CONFIG_IP_ROUTE_CLASSID is not set
CONFIG_IPV6_SEG6_BPF is set to y
CONFIG_BPF_LIRC_MODE2 is not set
CONFIG_BPF_STREAM_PARSER is set to y
CONFIG_NETFILTER_XT_MATCH_BPF is set to m
CONFIG_BPFILTER is set to y
CONFIG_BPFILTER_UMH is set to m
CONFIG_TEST_BPF is not set

With the BPF-related options above enabled, the system supports the vast majority of BPF debugging and tracing features.
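
When bpftool is not yet available on the target, the same options can be checked against the kernel configuration text itself (the .config in the build tree, or the decompressed /proc/config.gz if the kernel exposes it). The helper below is a minimal sketch of such a check; config_option_state is a hypothetical name, and the function only looks at text that is already in memory.

```c
#include <string.h>

/* Look up a tristate option in kernel .config text.
 * Returns 'y' or 'm' for "NAME=y" / "NAME=m", and 0 when the option is
 * absent or appears only as "# NAME is not set".
 * config_option_state is a hypothetical helper for this sketch. */
int config_option_state(const char *text, const char *name)
{
    size_t n = strlen(name);
    const char *line = text;

    while (line && *line) {
        /* The '=' check keeps CONFIG_BPF from matching CONFIG_BPF_SYSCALL. */
        if (strncmp(line, name, n) == 0 && line[n] == '=')
            return line[n + 1];
        line = strchr(line, '\n');
        if (line)
            line++; /* step past the newline to the next line */
    }
    return 0;
}
```

For the device above, the options worth checking first would be CONFIG_BPF, CONFIG_BPF_SYSCALL, CONFIG_BPF_JIT, CONFIG_BPF_EVENTS and CONFIG_KPROBE_EVENTS.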

Porting the user-space base libraries

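Developing BPF debugging applications depends on two open-source libraries, zlib and elfutils. Both can be built with buildroot or openwrt, which is simple and direct. libbpf.so uses elfutils to parse ELF files that carry eBPF bytecode (after parsing, libbpf.so invokes the bpf system call to load the bytecode into the kernel), and elfutils in turn depends on zlib to read compressed (debug) information in ELF files. I used the zlib built by openwrt and cross-compiled elfutils by hand with the following configuration: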

cd elfutils-0.185
./configure --prefix=/opt/libbpf --build=x86_64-linux-gnu --host=aarch64-linux-gnu \
    CC=aarch64-linux-gnu-gcc CXX=aarch64-linux-gnu-g++ \
    CFLAGS='-mcpu=cortex-a53 -I/opt/libbpf/include -Wall -fPIC -O2 -D_GNU_SOURCE' \
    CXXFLAGS='-mcpu=cortex-a53 -I/opt/libbpf/include -Wall -fPIC -O2 -D_GNU_SOURCE' \
    --disable-nls --disable-rpath --disable-libdebuginfod --disable-debuginfod \
    --with-zlib LDFLAGS='-L/opt/libbpf/lib -lz'

Before running configure, the zlib headers and libraries had already been installed under /opt/libbpf.

Building libbpf.so from the Linux kernel source

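The BPF subsystem keeps evolving, and the BPF features offered by different kernel versions differ slightly. For compatibility, I recommend using the user-space support library, libbpf.so, from the same kernel version that runs on the device. Its source lives under tools/lib/bpf in the kernel tree, and building it depends on elfutils. The build is somewhat involved. First, tools/lib/bpf/Makefile needs to be patched: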

diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile
index 9758bfa59..6567fedc3 100644
--- a/tools/lib/bpf/Makefile
+++ b/tools/lib/bpf/Makefile
@@ -191,7 +191,8 @@ $(OUTPUT)libbpf.so: $(OUTPUT)libbpf.so.$(LIBBPF_VERSION)
 
 $(OUTPUT)libbpf.so.$(LIBBPF_VERSION): $(BPF_IN_SHARED)
        $(QUIET_LINK)$(CC) --shared -Wl,-soname,libbpf.so.$(LIBBPF_MAJOR_VERSION) \
-                                   -Wl,--version-script=$(VERSION_SCRIPT) $^ -lelf -o $@
+               -L$(prefix)/lib -Wl,-rpath-link=$(prefix)/lib \
+               -Wl,--version-script=$(VERSION_SCRIPT) $^ -lelf -o $@
        @ln -sf $(@F) $(OUTPUT)libbpf.so
        @ln -sf $(@F) $(OUTPUT)libbpf.so.$(LIBBPF_MAJOR_VERSION)
 
@@ -199,7 +200,8 @@ $(OUTPUT)libbpf.a: $(BPF_IN_STATIC)
        $(QUIET_LINK)$(RM) $@; $(AR) rcs $@ $^
 
 $(OUTPUT)test_libbpf: test_libbpf.cpp $(OUTPUT)libbpf.a
-       $(QUIET_LINK)$(CXX) $(INCLUDES) $^ -lelf -o $@
+       $(QUIET_LINK)$(CXX) $(INCLUDES) $^ -lelf -o $@ \
+               -L$(prefix)/lib -Wl,-rpath-link=$(prefix)/lib
 
 $(OUTPUT)libbpf.pc:
        $(QUIET_GEN)sed -e "s|@PREFIX@|$(prefix)|" \

The build and install commands are then:

alias lmake='make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-'
cd linux-5.4.123/tools/lib/bpf
lmake prefix=/opt/libbpf FEATURE_CHECK_CFLAGS-libelf='-I/opt/libbpf/include' \
    FEATURE_CHECK_LDFLAGS-libelf='-L/opt/libbpf/lib -Wl,-rpath-link=/opt/libbpf/lib' \
    EXTRA_CFLAGS='-mcpu=cortex-a53 -O2 -I/opt/libbpf/include' install

The resulting shared library libbpf.so is installed under /opt/libbpf/lib64:

yejq@ubuntu:/opt/libbpf$ ls lib64/
libbpf.a  libbpf.so  libbpf.so.0  libbpf.so.0.0.5  pkgconfig

The build output is as follows:

Auto-detecting system features:
...                        libelf: [ on  ]
...                           bpf: [ on  ]

  MKDIR    staticobjs/
  CC       staticobjs/libbpf.o
  CC       staticobjs/bpf.o
  CC       staticobjs/nlattr.o
  CC       staticobjs/btf.o
  CC       staticobjs/libbpf_errno.o
  CC       staticobjs/str_error.o
  CC       staticobjs/netlink.o
  CC       staticobjs/bpf_prog_linfo.o
  CC       staticobjs/libbpf_probes.o
  CC       staticobjs/xsk.o
  CC       staticobjs/hashmap.o
  CC       staticobjs/btf_dump.o
  LD       staticobjs/libbpf-in.o
  LINK     libbpf.a
Warning: Kernel ABI header at 'tools/include/uapi/linux/netlink.h' differs from latest version at 'include/uapi/linux/netlink.h'
  MKDIR    sharedobjs/
  CC       sharedobjs/libbpf.o
  CC       sharedobjs/bpf.o
  CC       sharedobjs/nlattr.o
  CC       sharedobjs/btf.o
  CC       sharedobjs/libbpf_errno.o
  CC       sharedobjs/str_error.o
  CC       sharedobjs/netlink.o
  CC       sharedobjs/bpf_prog_linfo.o
  CC       sharedobjs/libbpf_probes.o
  CC       sharedobjs/xsk.o
  CC       sharedobjs/hashmap.o
  CC       sharedobjs/btf_dump.o
  LD       sharedobjs/libbpf-in.o
  LINK     libbpf.so.0.0.5
  GEN      libbpf.pc
  LINK     test_libbpf
  INSTALL  libbpf.a
  INSTALL  libbpf.so.0.0.5
  INSTALL  libbpf.pc

Building bpftool

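bpftool is a tool for manipulating eBPF files and inspecting the kernel's BPF-related configuration. The simple tracing example in this article does not use it. Its source is in tools/bpf/bpftool in the kernel tree, and it links against the static library libbpf.a. I built it as follows: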

cd linux-5.4.123/tools/bpf/bpftool
lmake prefix=/opt/libbpf EXTRA_LDFLAGS='-L/opt/libbpf/lib -Wl,-rpath-link=/opt/libbpf/lib' \
    FEATURE_CHECK_CFLAGS-zlib='-I/opt/libbpf/include' \
    EXTRA_CFLAGS='-mcpu=cortex-a53 -O2 -I/opt/libbpf/include' install

The build output is:

Auto-detecting system features:
...                        libbfd: [ OFF ]
...        disassembler-four-args: [ OFF ]
...                          zlib: [ on  ]

  CC       map_perf_ring.o
  CC       xlated_dumper.o
  CC       btf.o
  CC       tracelog.o
  CC       perf.o
  CC       cfg.o
  CC       btf_dumper.o
  CC       net.o
  CC       netlink_dumper.o
  CC       common.o
  CC       cgroup.o
  CC       main.o
  CC       json_writer.o
  CC       prog.o
  CC       map.o
  CC       feature.o
  CC       disasm.o
make[1]: Entering directory '/home/yejq/program/linux-5.4.123/tools/lib/bpf'

Auto-detecting system features:
...                        libelf: [ on  ]
...                           bpf: [ on  ]

  MKDIR    staticobjs/
  CC       staticobjs/libbpf.o
  CC       staticobjs/bpf.o
  CC       staticobjs/nlattr.o
  CC       staticobjs/btf.o
  CC       staticobjs/libbpf_errno.o
  CC       staticobjs/str_error.o
  CC       staticobjs/netlink.o
  CC       staticobjs/bpf_prog_linfo.o
  CC       staticobjs/libbpf_probes.o
  CC       staticobjs/xsk.o
  CC       staticobjs/hashmap.o
  CC       staticobjs/btf_dump.o
  LD       staticobjs/libbpf-in.o
  LINK     libbpf.a
make[1]: Leaving directory '/home/yejq/program/linux-5.4.123/tools/lib/bpf'
  LINK     bpftool
  INSTALL  bpftool
install: cannot remove '/usr/share/bash-completion/completions/bpftool': Permission denied
make: *** [Makefile:138: install] Error 1

Note that the bash-completion install error can be ignored. Once the steps above finish, bpftool is installed under /opt/libbpf/sbin.

Tracing the execve system call with BPF

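With libbpf.so cross-compiled, we can write a simple BPF kernel-tracing application. I chose an entry-level demo (the source is borrowed from an existing example) that traces the Linux kernel's execve system call. The file that is compiled to BPF bytecode, bpf_program.c, contains: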

#include <linux/bpf.h>
#define SEC(NAME) __attribute__((section(NAME), used))

static int (*bpf_trace_printk)(const char *fmt, int fmt_size,
                               ...) = (void *)BPF_FUNC_trace_printk;

SEC("tracepoint/syscalls/sys_enter_execve")
int bpf_prog(void *ctx) {
  char msg[] = "Hello, BPF World!";
  bpf_trace_printk(msg, sizeof(msg));
  return 0;
}

char _license[] SEC("license") = "GPL";

The user-space loader, loader.c, contains:

#include "bpf_load.h"
#include <stdio.h>

int main(int argc, char **argv) {
  if (load_bpf_file("bpf_program.o") != 0) {
    printf("The kernel didn't load the BPF program\n");
    return -1;
  }
  read_trace_pipe();
  return 0;
}
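
The loader ends in read_trace_pipe(), which comes from samples/bpf/bpf_load.c and forwards the kernel's trace output to the terminal. As a rough sketch of that idea (not the sample's actual code), the function below copies everything from one file descriptor to another; copy_stream is a hypothetical name, and the trace_pipe path mentioned afterwards assumes the tracing filesystem is mounted under /sys/kernel/debug/tracing.

```c
#include <sys/types.h>
#include <unistd.h>

/* Copy bytes from in_fd to out_fd until end of file.
 * Returns the number of bytes copied, or -1 on a read/write error.
 * copy_stream is a hypothetical helper for this sketch. */
ssize_t copy_stream(int in_fd, int out_fd)
{
    char buf[4096];
    ssize_t total = 0;
    ssize_t n;

    while ((n = read(in_fd, buf, sizeof(buf))) > 0) {
        ssize_t off = 0;

        /* write() may accept fewer bytes than requested, so loop. */
        while (off < n) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0)
                return -1;
            off += w;
        }
        total += n;
    }
    return n < 0 ? -1 : total;
}
```

Calling it with a descriptor opened read-only on /sys/kernel/debug/tracing/trace_pipe and STDOUT_FILENO as the output gives a similar effect. Reads on trace_pipe block while no events are pending, so in that use the loop normally runs until the process is interrupted.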

The build commands are:

clang -O2 -target bpf -c bpf_program.c \
    -I/home/yejq/program/linux-5.4.123/tools/testing/selftests/bpf -o bpf_program.o

aarch64-linux-gnu-gcc -DHAVE_ATTR_TEST=0 -o monitor-exec \
    -I/home/yejq/program/linux-5.4.123/samples/bpf \
    -I/home/yejq/program/linux-5.4.123/tools/lib \
    -I/home/yejq/program/linux-5.4.123/tools/perf \
    -I/home/yejq/program/linux-5.4.123/tools/include \
    -I/home/yejq/program/linux-5.4.123/tools/include/uapi \
    -I/opt/libbpf/include \
    /home/yejq/program/linux-5.4.123/samples/bpf/bpf_load.c loader.c \
    -L/opt/libbpf/lib64 -L/opt/libbpf/lib -Wl,-rpath-link=/opt/libbpf/lib \
    -lbpf -lelf

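Note that bpf_program.c is compiled with clang on the host and produces an object file, which is then parsed and loaded into the kernel on the 64-bit ARM device. The file tool confirms that it is not an x86_64 object: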

# file monitor-exec bpf_program.o
monitor-exec:  ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, for GNU/Linux 3.7.0, BuildID[sha1]=6623307da4ac5660bc81f79b7a00d20c7601a41c, with debug_info, not stripped
bpf_program.o: ELF 64-bit LSB relocatable, eBPF, version 1 (SYSV), not stripped
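
What file reports here comes from the e_machine field of the ELF header. The sketch below reads that field from an in-memory copy of a 64-bit ELF header; elf64_machine is a hypothetical name, and the code assumes the object has the same byte order as the machine running the check (little-endian in this article's setup).

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

#ifndef EM_BPF
#define EM_BPF 247 /* Linux BPF; fallback for older elf.h headers */
#endif

/* Return the e_machine value of a 64-bit ELF image, or -1 if buf does
 * not start with a 64-bit ELF header.
 * elf64_machine is a hypothetical helper for this sketch. */
int elf64_machine(const void *buf, size_t len)
{
    Elf64_Ehdr hdr;

    if (len < sizeof(hdr))
        return -1;
    memcpy(&hdr, buf, sizeof(hdr)); /* avoid unaligned access to buf */
    if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0 ||
        hdr.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;
    return hdr.e_machine;
}
```

Applied to the first bytes of the two files above, the expected values would be EM_BPF for bpf_program.o and EM_AARCH64 for monitor-exec.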

Finally, I ran monitor-exec on the device, which loads bpf_program.o into the kernel. Logging in to the device over SSH then produced the following trace output:

root@OpenWrt:/data/user# ./monitor-exec
        dropbear-1319    [003] ....  5085.602351: 0: Hello, BPF World!
             ash-1320    [003] ....  5085.607391: 0: Hello, BPF World!
           <...>-1321    [002] ....  5085.612052: 0: Hello, BPF World!
           <...>-1323    [001] ....  5085.616872: 0: Hello, BPF World!
           <...>-1324    [000] ....  5085.617117: 0: Hello, BPF World!
           <...>-1326    [001] ....  5085.622996: 0: Hello, BPF World!
             ash-1327    [003] ....  5085.627841: 0: Hello, BPF World!

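During the Secure Shell login, executables such as dropbear and ash were loaded one after another through the kernel's execve system call, and each call triggered the BPF program. This shows how capable the BPF subsystem is for debugging and tracing.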
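
Each line of the trace output starts with the task name and PID, followed by the CPU number in brackets. To post-process such output, for example to count execve calls per process, a small parser is enough. The following is a sketch under the assumption that lines look like the ones shown above; struct trace_event and parse_trace_line are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Fields taken from the start of a trace line (hypothetical type). */
struct trace_event {
    char comm[32]; /* task name, e.g. "dropbear" */
    int  pid;
    int  cpu;
};

/* Parse "   comm-PID   [CPU] ....". Returns 0 on success, -1 otherwise. */
int parse_trace_line(const char *line, struct trace_event *ev)
{
    const char *start = line + strspn(line, " "); /* skip left padding */
    const char *bracket = strchr(start, '[');
    const char *dash;
    size_t len;

    if (!bracket)
        return -1;

    /* The PID follows the last '-' before "[CPU]"; the task name
     * itself may contain '-' characters. */
    dash = bracket;
    while (dash > start && *dash != '-')
        dash--;
    if (*dash != '-')
        return -1;

    len = (size_t)(dash - start);
    if (len >= sizeof(ev->comm))
        len = sizeof(ev->comm) - 1;
    memcpy(ev->comm, start, len);
    ev->comm[len] = '\0';
    ev->pid = atoi(dash + 1);
    ev->cpu = atoi(bracket + 1);
    return 0;
}
```

Lines whose task name is printed as <...> parse the same way; only the name is unavailable there, while the PID and CPU fields are still present.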
