musicale 19 hours ago

Contempt for stable kernel data structures and APIs (and forget about any sort of kernel ABI) might make things easier for certain kernel developers, but it offloads a constant maintenance burden onto many other people, such as eBPF, driver, and kernel extension developers.

This sort of asymmetry is why system modules, and platforms in general, should absorb pain in order to benefit their many clients, rather than doing the opposite.

Could be worse though - some platforms (cough, iOS) are happy to break user apps every year and offload a constant maintenance burden onto many thousands of app developers, when a more stable ABI would save developers (and users) billions of dollars in aggregate.

  • beng-nl 18 hours ago

    In Linux’s defense, the userland ABI is stable, which is no small feat in terms of absorbing pain in order to benefit its many users.

    Not sure why the trade-off consideration led to a different outcome for in-kernel APIs, but given the work done to ensure the stability of the userland ABI, I’m sure there is thought behind it.

    • DSMan195276 17 hours ago

      Well, from a certain POV it's like stabilizing APIs internal to your application - nobody else should be calling them, so "stabilizing" them just creates unnecessary maintenance work. Obviously in practice certain things like eBPF or externally-maintained drivers can break this model, but the kernel developers would rather people merge their code into the tree than do those things anyway.

    • alexjplant 5 hours ago

      > In Linux’s defense, the userland abi is stable, which is no small feat in terms of absorbing pain in order to benefit their many users..

      I understand that technically eBPF programs run on a VM in kernel space but aren't they loaded from userspace? Isn't eBPF an alternative to developing kernel modules and in-tree drivers? To a layperson like me it walks, talks, and quacks like userspace much more than the kernel. The fact that struct layout can change at the whim of kernel developers seems counterproductive. I guess this is what CO-RE is supposed to solve but having to deal with a bunch of pointer and sizeof() chicanery seems archaic (coming from a total luser/kernel nublet that hasn't written C in over a decade).

    • musicale 18 hours ago

      > userland abi is stable

      The system call interface per se is relatively stable. Then there's all that stuff that has been dumped into /proc...

  • pjmlp 2 hours ago

    Most do; the Linux kernel is the exception in the OS world.

linuxftw 20 hours ago

This is all covered in the eBPF documentation. CORE was introduced over 6 years ago.

  • mackman 19 hours ago

    CORE only works on kernels that support BTF. This post introduces one workaround, which is to generate BTF data for kernels without it. That's still only half the problem, though. You also need to write your eBPF program so that every kernel's verifier passes it, even though each kernel's eBPF verifier has different bugs, capabilities, and complexity limits. I maintain a large eBPF program that supports 4.14 through 6.14. We implemented our own version of CORE before CORE really existed. In reality, it's a lot more work than "compile once, run everywhere."

    • roblabla 18 hours ago

      Yeah, same - we maintain some eBPF probes spanning 4.11 to the latest kernel, and holy hell, it's really bad. The worst offenders are some old Red Hat kernels with half-baked backports of eBPF features, containing a bunch of weird bugs or features that aren't quite in line with what's in mainline...

      Here's a fun bug we recently had: we had to ban subtractions in our program (replacing them with an __asm__ macro) because of a bug in Linux kernels 5.7.0 through 5.10.10, which had the (indirect) effect that the verifier didn't properly track valid min/max values[0]. The worst part is, it didn't cause the verifier to reject our program outright - instead, it used that information to optimize out branches it thought were unreachable, making for a really hard-to-debug situation where the program was executing impossible control flow[1] and returning garbage to user-space.

      All this to say, CORE is really only half the problem. Supporting every kernel in existence is still a huge effort. Still worth it compared to the alternative of writing a Linux kernel driver, though!

      [0]: https://github.com/torvalds/linux/commit/bc895e8b2a64e502fbb...

      [1]: https://github.com/torvalds/linux/blob/bc895e8b2a64e502fbba7...

    • linuxftw 19 hours ago

      Kernels without BTF data are ancient at this point. BTF was added in 4.18, that was in 2018. 2018! If you're running a kernel older than that, you don't need BPF, you need a whole new operating system.

      Yes, each kernel version might have different features between then and now. You have to pick a minimum supported version and write against that.

      • roblabla 18 hours ago

        Many, many distributions didn't embed BTF information until fairly recently. openSUSE did it in Leap 15.4, released in 2022. At $WORK, we have many customers running on distros without embedded BTF - such as RHEL 7 (yes, they pay for extended maintenance).

        I really wish customers would update to a newer distro, but I also understand why they don't. So it's up to me to adapt.

        > You have to pick a minimum supported version and write against that.

        What we end up doing is progressively enabling features based on what's available in the kernel. Every eBPF we write is compiled multiple times with a couple of different flags to enable/disable certain features. It works decently well, and allows using the most capable datastructure/helpers based on the kernel version.

      • magicalhippo 2 hours ago

        We've got customers who complained when we bumped some critical dependencies and our software suddenly didn't work on Windows 2008 R2 servers any more... in 2025.

jstrong 16 hours ago

wow that sounds like a PITA to deal with

jeffrallen a day ago

Feels like yet another example of "essential complexity driven by too much churn in infrastructural code".

I wonder why no one needs to write this article about dtrace probes? Is it because they are less used? Less capable? More stable? Better engineered?

Probably all of the above, alas.

  • chronid 13 hours ago

    IIRC eBPF and DTrace are no longer solving a similar problem: eBPF has become far bigger than just tracing; it's now a way to have user-space code "driving" kernel decisions. I'm not sure they can be compared this way - and even if we do, the user base of DTrace is infinitesimally small compared to that of eBPF.

  • heinrichhartman 21 hours ago

    From my experience, most DTrace users rely on DTrace "providers" [1] and static trace points [2] rather than directly probing kernel structs. Also, these days the Solaris kernel is not moving all that much.

    [1] https://www.illumos.org/books/dtrace/chp-syscall.html#chp-sy... [2] https://www.illumos.org/books/dtrace/chp-sdt.html#chp-sdt

    • toast0 16 hours ago

      DTrace isn't limited to Solaris. Per Wikipedia, it's in FreeBSD, NetBSD, macOS (though you can't use it with SIP enabled), and Windows. And lots of userland stuff too.