How to debug kernel crashes

Note: Familiarity with How to report a Problem is assumed. This text explains how to gather further information from a kernel crash.

A typical kernel crash on OpenBSD might look like this:

    kernel: page fault trap, code=0
    Stopped at    _pf_route+0x263:        mov     0x40(%edi),%edx
    ddb>
    

The first command to run from the ddb> prompt is trace (see ddb(4) for details):

    ddb> trace
    _pf_route(e28cb7e4,e28bc978,2,1fad,d0b8b120) at _pf_route+0x263
    _pf_test(2,1f4ad,e28cb7e4,b4c1) at _pf_test+0x706
    _pf_route(e28cbb00,e28bc978,2,d0a65440,d0b8b120) at _pf_route+0x207
    _pf_test(2,d0a65440,e28cbb00,d023c282) at _pf_test+0x706
    _ip_output(d0b6a200,0,0,0,0) at _ip_output+0xb67
    _icmp_send(d0b6a200,0,1,a012) at _icmp_send+0x57
    _icmp_reflect(d0b6a200,0,1,0,3) at _icmp_reflect+0x26b
    _icmp_input(d0b6a200,14,0,0,d0b6a200) at _icmp_input+0x42c
    _ipv4_input(d0b6a200,e289f140,d0a489e0,e289f140) at _ipv4_input+0x6eb
    _ipintr(10,10,e289f140,e289f140,e28cbd38) at _ipintr+0x8d
    Bad frame pointer: 0xe28cbcac
    ddb> 
    

This tells us which function calls led to the crash, with the function that was executing when the kernel stopped listed first.

To find out the particular line of C code that caused the crash, you can do the following:

Find the source file in which the crashing function is defined. In this example, that is pf_route() in sys/net/pf.c. Recompile that source file with debug information:

    # cd /usr/src/sys/arch/$(uname -m)/compile/GENERIC/
    # rm pf.o
    # make -n pf.o | sed "s/$/ -g/" | sh -s
    

Then use objdump(1) to get the disassembly:

    # objdump --line --disassemble --reloc pf.o >pf.dis
    

In the output, grep for the function name (pf_route in our example):

    # grep "<_pf_route>:" pf.dis
    00007d88 <_pf_route>:
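
The lookup above can also be scripted. The following is a minimal sketch, assuming the disassembly uses objdump's usual "address <symbol>:" header lines; func_start is a hypothetical helper name, not part of any OpenBSD tool, and the sample file only stands in for a real pf.dis.

```sh
# Hypothetical helper: print the start address that objdump reports
# for a symbol.
#   $1 = disassembly file (e.g. pf.dis)
#   $2 = symbol name exactly as it appears between <> (e.g. _pf_route)
func_start() {
    awk -v sym="<$2>:" '$2 == sym { print $1; exit }' "$1"
}

# Example run against a one-line sample taken from the text above.
sample=$(mktemp)
printf '%s\n' '00007d88 <_pf_route>:' > "$sample"
func_start "$sample" _pf_route
rm -f "$sample"
```

On a real disassembly the call would be func_start pf.dis _pf_route. The exact match on the second field avoids also hitting lines that merely mention the symbol, such as jump targets of the form <_pf_route+0x300>.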
    

Take this first hex number and add the offset from the 'Stopped at' line: 0x7d88 + 0x263 == 0x7feb.
Scroll down to that line (the assembler instruction should match the one quoted in the 'Stopped at' line), then up to the nearest C line number:

    # more pf.dis
    /usr/src/sys/arch/i386/compile/GENERIC/../../../../net/pf.c:3872
        7fe7:       0f b7 43 02             movzwl 0x2(%ebx),%eax
        7feb:       8b 57 40                mov    0x40(%edi),%edx
        7fee:       39 d0                   cmp    %edx,%eax
        7ff0:       0f 87 92 00 00 00       ja     8088 <_pf_route+0x300>
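
The address arithmetic does not have to be done by hand. This is a small sketch, assuming a shell whose arithmetic expansion accepts 0x-prefixed hex constants (as POSIX specifies); crash_addr is a hypothetical helper name introduced here for illustration.

```sh
# Hypothetical helper: add the offset from the 'Stopped at' line to the
# function's start address and print the sum as bare hex, which is how
# objdump labels each instruction.
#   $1 = function start address, with 0x prefix (e.g. 0x7d88)
#   $2 = offset from ddb, with 0x prefix        (e.g. 0x263)
crash_addr() {
    printf '%x\n' "$(( $1 + $2 ))"
}

# Values from the example above.
crash_addr 0x7d88 0x263
```

With the values from this example the helper prints 7feb. That string can then be searched for directly, for instance with grep -B 20 '^ *7feb:' pf.dis, which shows the instruction together with the preceding lines, where the nearest source file and line number marker should appear.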
    

So, it's precisely line 3872 of pf.c that crashes:

    # cat -n pf.c | head -n 3872 | tail -n 1
    3872          if ((u_int16_t)ip->ip_len <= ifp->if_mtu) {
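
When quoting the offending line in a report, a few lines of surrounding source are often useful as well. The sketch below is one way to print them; src_context is a hypothetical helper name, and the amount of context is an arbitrary choice.

```sh
# Hypothetical helper: print one line of a source file together with
# some context on each side, numbered in the style of cat -n.
#   $1 = source file
#   $2 = line number of interest
#   $3 = number of context lines before and after
src_context() {
    awk -v line="$2" -v ctx="$3" '
        NR >= line - ctx && NR <= line + ctx { printf "%6d\t%s\n", NR, $0 }
        NR > line + ctx { exit }    # stop reading once past the window
    ' "$1"
}

# Example (path assumes the source tree lives in /usr/src):
#   src_context /usr/src/sys/net/pf.c 3872 3
```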
    
Note that the kernel that produced the crash output and the object file passed to objdump must be compiled from exactly the same source file; otherwise the offsets will not match.

Providing both the ddb> trace output and the relevant objdump section in your report is very helpful.