What is XDP after all?

Introduction

Since XDP (eXpress Data Path) emerged as a technology in mainline Linux around 2017, I have heard a large number of misconceptions and misunderstandings about its purpose and capabilities. It's hard to blame anybody for this, as high-quality material about this amazing technology is very scarce, and that's what I would like to address.

Things XDP cannot do

To save readers time, I would like to share my most common answers regarding XDP right here:

  • XDP is not a firewall like iptables or nftables
  • XDP cannot magically stop bad traffic from reaching your server
  • XDP cannot make Linux routing magically faster
  • XDP does not magically increase the performance of your old application (sorry)

What is the main audience for XDP?

The main audience for XDP is software engineers. This is the very first concept you need to learn. It's not a technology for end users and it's not a technology for system administrators. You can certainly use XDP microcode prepared by somebody else, but you will need to read and understand what that microcode is doing, which makes you a little bit of a software engineer.

What is eBPF?

If you're familiar with the Java programming language, you can think of eBPF as a sibling of Java bytecode, which works on any platform without changes. Compared with Java bytecode, eBPF is exceptionally limited and provides only basic operations such as conditional jumps, loops, integer arithmetic and memory access.

Old Sun Microsystems office

If you're lucky enough to be unfamiliar with Java, let's talk about assembly language, which lets you write a program in a form that translates almost directly to CPU machine instructions.

Let's look at an x86_64 assembly example kindly provided by Jim Fisher:

global _start

section .text

_start:
  mov rax, 1        ; write(
  mov rdi, 1        ;   STDOUT_FILENO,
  mov rsi, msg      ;   "Hello, world!\n",
  mov rdx, msglen   ;   sizeof("Hello, world!\n")
  syscall           ; );

  mov rax, 60       ; exit(
  mov rdi, 0        ;   EXIT_SUCCESS
  syscall           ; );

section .rodata
  msg: db "Hello, world!", 10
  msglen: equ $ - msg

As you can see, assembly language uses x86_64 CPU instruction names directly.

What are the main issues with writing XDP programs in assembly and compiling them to machine code?

  • Machine code is hardware-platform specific: machine code compiled for x86_64 will not work on ARM or PowerPC
  • Machine code for x86_64 has around 1000 unique opcodes and is very challenging to implement in software
  • Machine code is impossible to verify for security (remember, we run it in kernel space with an unlimited level of access)
  • Machine code for a real CPU is relatively slow when it has to be emulated in software

Considering all these issues, XDP authors decided to use an artificial CPU architecture with a very limited set of around 100 instructions, and they called it eBPF.

eBPF is an artificial CPU architecture which runs inside the Linux kernel at astonishing speed.

Caltrain is well known to almost all early adopters of XDP

You can write eBPF programs in assembly, but C is a much better and easier choice.

We can refer to eBPF as a kind of C-based programming language, but at the same time it's a binary code format.

We can say that programs for XDP are written in eBPF and compiled to eBPF bytecode (or microcode). Programs written for XDP can be called XDP programs or, in full form, eBPF XDP programs.
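To make the "binary code format" side more concrete, here is a minimal sketch of how a single eBPF instruction is laid out in memory. The structure mirrors `struct bpf_insn` from `<linux/bpf.h>`: every instruction is 64 bits wide. The names `ebpf_insn`, `EBPF_MOV64_IMM`, `EBPF_EXIT` and `encode_return_pass` are made up for this illustration and are not part of any library.

```c
#include <stdint.h>

// Mirrors the layout of struct bpf_insn from <linux/bpf.h>.
// The name ebpf_insn is local to this sketch.
struct ebpf_insn {
    uint8_t code;        // opcode
    uint8_t dst_reg : 4; // destination register (r0..r10)
    uint8_t src_reg : 4; // source register
    int16_t off;         // signed offset for jumps and memory access
    int32_t imm;         // signed immediate constant
};

// Opcode values from the eBPF instruction set (local names)
#define EBPF_MOV64_IMM 0xb7 // BPF_ALU64 | BPF_MOV | BPF_K: dst_reg = imm
#define EBPF_EXIT      0x95 // BPF_JMP | BPF_EXIT: return the value in r0

// Hypothetical helper: fills out[0..1] with the two-instruction program
// "r0 = 2; exit". For XDP the value 2 is XDP_PASS.
void encode_return_pass(struct ebpf_insn out[2]) {
    out[0] = (struct ebpf_insn){ .code = EBPF_MOV64_IMM, .dst_reg = 0, .imm = 2 };
    out[1] = (struct ebpf_insn){ .code = EBPF_EXIT };
}
```

On a little-endian machine these two instructions occupy 16 bytes: b7 00 00 00 02 00 00 00 followed by 95 00 00 00 00 00 00 00. That is roughly what the smallest possible "pass everything" XDP program boils down to.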

What can I do with XDP?

We have finally reached this point, and I'll do my best to explain what XDP can do:

  • XDP allows you to alter incoming traffic by manipulating any field in the packet payload
  • XDP allows you to discard specific types of traffic on your Linux machine before they reach your application. It's like iptables but with unlimited capabilities and access to the whole packet payload. From this perspective it's close to iptables u32 filters but far more functional.
  • XDP can forward some types of traffic between network interfaces, so you can build a bridge or router this way
  • XDP can send a received packet back and simulate a server response this way
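As a small illustration of the last point, sending a frame back usually starts with swapping the destination and source MAC addresses. The sketch below shows only that step in plain C so it can be tried in userspace; `swap_mac_addresses` and `MAC_LENGTH` are hypothetical names. Inside a real XDP program the frame pointer would come from `ctx->data` after the usual bounds check, and the program would then return `XDP_TX` to transmit the frame from the interface it arrived on.

```c
#include <stdint.h>
#include <string.h>

#define MAC_LENGTH 6

// Hypothetical helper: swaps the destination and source MAC addresses
// of an Ethernet frame in place. The caller must guarantee that frame
// points to at least 12 readable and writable bytes.
void swap_mac_addresses(uint8_t *frame) {
    uint8_t tmp[MAC_LENGTH];

    memcpy(tmp, frame, MAC_LENGTH);                // save destination MAC
    memcpy(frame, frame + MAC_LENGTH, MAC_LENGTH); // source becomes destination
    memcpy(frame + MAC_LENGTH, tmp, MAC_LENGTH);   // old destination becomes source
}
```

A real responder would typically also swap IP addresses and ports and fix checksums; this sketch deliberately stops at the Ethernet layer.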

I would like to note that there are many more things you can do in the Linux kernel using eBPF, but not all of them are part of XDP. The main purpose of XDP is network traffic processing, and we will focus on that.

XDP works in the Linux kernel (to be more precise, it lives in the network interface driver) or directly inside the network card (you will find that such cards are very expensive).

As a consequence, XDP can easily achieve incredible performance: tens or even hundreds of millions of packets per second and hundreds of gigabits per second (100G+).

In some contexts XDP is referred to as a kernel bypass technology. This means that the Linux network stack is not involved in processing packets when XDP handles them.
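To put these numbers in perspective, here is a back-of-the-envelope sketch of the maximum packet rate of an Ethernet link. It assumes the standard 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter and inter-frame gap); the function name `line_rate_pps` is made up for this example.

```c
// Extra bytes every Ethernet frame occupies on the wire:
// 7 preamble + 1 start-of-frame delimiter + 12 inter-frame gap
#define ETHERNET_WIRE_OVERHEAD_BYTES 20

// Hypothetical helper: maximum packets per second for a link of the
// given speed (bits per second) carrying frames of frame_bytes each
// (frame size includes the 4-byte FCS, so the minimum is 64).
double line_rate_pps(double link_bits_per_second, unsigned frame_bytes) {
    double bits_on_wire = 8.0 * (frame_bytes + ETHERNET_WIRE_OVERHEAD_BYTES);
    return link_bits_per_second / bits_on_wire;
}
```

For minimum-size 64-byte frames, `line_rate_pps(100e9, 64)` gives roughly 148.8 million packets per second, which leaves a budget of under 7 nanoseconds per packet on a 100G link.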

What does a generic XDP program look like?

// SPDX-License-Identifier: GPL-2.0
#define KBUILD_MODNAME "fastnetmon"

// Fixed width integer type: uint16_t and other
#include <stdint.h>

#include <linux/types.h>

// true and false like in C++
#include <stdbool.h>

// This one requires presence of package libbpf-dev on host system
// sudo apt-get install -y libbpf-dev
#include <bpf/bpf_helpers.h>

#include <linux/bpf.h>

// Byte order conversion functions: htons and ntohs
#include <arpa/inet.h>


struct __attribute__((__packed__)) ethernet_header_t {
    uint8_t destination_mac[6];
    uint8_t source_mac[6];
    uint16_t ethertype; 
};

struct __attribute__((__packed__)) ipv4_header_t {
    uint8_t ihl : 4, version : 4; 
    uint8_t ecn : 2, dscp : 6; 

    // This is the combined length of the header and the data
    uint16_t total_length; 
    uint16_t identification; 

    // There we have plenty of fragmentation specific fields
    // encoded in ipv4_header_fragmentation_flags_t
    uint16_t fragmentation_details_as_integer; 

    uint8_t ttl;
    uint8_t protocol; 

    uint16_t checksum; 

    uint32_t source_ip; 
    uint32_t destination_ip; 
};

struct __attribute__((__packed__)) udp_header_t {
    uint16_t source_port; 
    uint16_t destination_port;
    uint16_t length;
    uint16_t checksum;
};


enum IanaEthertype : unsigned int {
    IanaEthertypeIPv4 = 2048,
    IanaEthertypeARP  = 2054,
    IanaEthertypeVLAN            = 33024,
    IanaEthertypeIPv6            = 34525
};

enum IpProtocolNumberNotTyped : unsigned int {
    IpProtocolNumberICMP             = 1,
    IpProtocolNumberTCP              = 6,
    IpProtocolNumberUDP              = 17
};


SEC("fastnetmon_xdp_filter")
int fastnetmon_filter(struct xdp_md* ctx) {
    void* data = (void *)(long)ctx->data;
    void* data_end = (void *)(long)ctx->data_end;
  
    // Data is too short to be ethernet packet
    if (data + sizeof(struct ethernet_header_t) > data_end) {
        return XDP_PASS;
    }

    const struct ethernet_header_t* ethernet = (struct ethernet_header_t*)data;

    if (ethernet->ethertype != htons(IanaEthertypeIPv4)) {
        return XDP_PASS;
    }

    // Data is too short to be IPv4 packet
    if (data + sizeof(struct ethernet_header_t) +
        sizeof(struct ipv4_header_t) > data_end) {
        return XDP_PASS;
    }

    // Yay, we have access to IPv4 packet
    struct ipv4_header_t* ipv4 = (struct ipv4_header_t*)(data + 
        sizeof(struct ethernet_header_t));

    // Convert total IP packet length (header plus data) to host byte order
    uint16_t ip_length = ntohs(ipv4->total_length);

    // We will set these variables only for UDP traffic
    uint16_t source_port = 0;
    uint16_t destination_port = 0;
    
    if (ipv4->protocol == IpProtocolNumberUDP) {
        // Data is too short to be UDP packet
        if (data + sizeof(struct ethernet_header_t) +
            4 * ipv4->ihl + sizeof(struct udp_header_t) > data_end) {
            return XDP_PASS;
        }

        struct udp_header_t* udp_header = (struct udp_header_t*)(data + 
            sizeof(struct ethernet_header_t) + 4 * ipv4->ihl);
        
        source_port = ntohs(udp_header->source_port);
        destination_port = ntohs(udp_header->destination_port);
    }


    if (ipv4->protocol == IpProtocolNumberUDP && source_port == 53) {
        return XDP_DROP;
    }

    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";

In this example we implemented an XDP program which discards all UDP packets coming from port 53. I intentionally wrote out all network data structures instead of using the ones defined in system headers, for better visibility.

As you can see, this code is very low level and takes a lot of lines for such a basic task. Your real application will very likely be far more complicated.
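For comparison, the hand-written structures in the program above have ready-made counterparts in the Linux kernel headers: `struct ethhdr`, `struct iphdr` and `struct udphdr`. The sketch below assumes the kernel userspace headers are installed (package linux-libc-dev on Debian) and checks that the layouts have the same sizes. The helper `is_udp_from_port` is hypothetical and exists only to show the different field names (`protocol`, `source`).

```c
#include <stdint.h>
#include <arpa/inet.h>      // ntohs, htons, IPPROTO_UDP
#include <linux/if_ether.h> // struct ethhdr, ETH_P_IP
#include <linux/ip.h>       // struct iphdr
#include <linux/udp.h>      // struct udphdr

// The kernel structures have the same sizes as the hand-written
// ethernet_header_t, ipv4_header_t and udp_header_t
_Static_assert(sizeof(struct ethhdr) == 14, "Ethernet header is 14 bytes");
_Static_assert(sizeof(struct iphdr) == 20, "IPv4 header without options is 20 bytes");
_Static_assert(sizeof(struct udphdr) == 8, "UDP header is 8 bytes");

// Hypothetical helper: returns non-zero when the packet is UDP and its
// source port (stored in network byte order) equals port.
int is_udp_from_port(const struct iphdr *ip, const struct udphdr *udp, uint16_t port) {
    return ip->protocol == IPPROTO_UDP && ntohs(udp->source) == port;
}
```

With these headers the ethertype check becomes a comparison of `h_proto` with `htons(ETH_P_IP)`, and the total length lives in `tot_len`. Whether to use them or your own structures is mostly a matter of taste.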

If you're scared enough, you may stop here. You did well to get this far.

Up for the challenge?

Keep reading!

I'm a software engineer and I have no idea about networks

Well, you need to learn them at the deepest level. Because this overlap of software engineering and networking skills is so unusual, XDP developers are very rare and in exceptionally high demand (career hint!).

What do you need to know as a software developer?

You will be in the best position to work with XDP if you're familiar with the following concepts of the C language:

  • Pointers
  • Pointer arithmetic
  • Memory management
  • Flow control (if, loops)
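The concepts above come together in the single most common XDP pattern: before touching a header, check that it fits between the start and the end of the packet. Here is a minimal userspace sketch of that check; `header_at` is a hypothetical helper, not a libbpf function. In a real XDP program the in-kernel verifier typically expects the comparison to be written directly against `data_end`, so treat this as an illustration of the idea rather than code to paste into the kernel.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: returns a pointer to size bytes located offset
// bytes after data, or NULL if that region would run past data_end.
const void *header_at(const void *data, const void *data_end,
                      size_t offset, size_t size) {
    // Number of bytes actually present in the packet
    size_t available = (size_t)((const uint8_t *)data_end - (const uint8_t *)data);

    // Written this way the check cannot overflow
    if (offset > available || size > available - offset) {
        return NULL;
    }

    return (const uint8_t *)data + offset;
}
```

For a 20-byte packet, asking for a 14-byte Ethernet header at offset 0 succeeds, while asking for an 8-byte UDP header at offset 14 returns NULL because only 6 bytes remain.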

A good starting point is to learn the packet formats of the most commonly used protocols.

You will need to understand how these protocols operate, and I can personally recommend the book TCP/IP Illustrated (Volume 1).

Do I need a special compiler to compile an eBPF XDP program?

In the early days of XDP we had to use bcc and custom compilers. These days you can use clang from your modern Linux distribution to compile XDP microcode as easily as:

clang -c -g -O2 -target bpf xdp_kernel.c -o xdp_kernel.o

How do I load XDP microcode?

It's very easy, and great tools for it are available in Debian 12 and other recent distros:

sudo xdp-loader load --mode skb eth0 xdp_kernel.o

Then, to unload, use the following syntax:

sudo xdp-loader unload eth0 --all

Conclusions

XDP is a great technology, but it's just a foundation for building new, exciting products which can operate at mind-blowing speeds.

Sadly, GitHub is full of half-finished XDP-based applications, and complete products are rare.

Let's build more of them together!

Questions

Please comment if you notice that I was wrong somewhere or that my explanation was not clear. XDP is a complicated technology, and if you have ideas on how it can be explained more simply, please share.

Subscribe to Pavel's blog about underlying Internet technologies
