Bug 218339 - kernel goes unresponsive if single-stepping over an instruction which writes to an address for which a hardware read/write watchpoint has been set
Status: NEW
Alias: None
Product: Virtualization
Classification: Unclassified
Component: kvm
Hardware: All Linux
Importance: P3 normal
Assignee: virtualization_kvm
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-01-04 02:35 UTC by Anthony L. Eden
Modified: 2024-03-05 20:24 UTC
CC List: 2 users

See Also:
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:



Description Anthony L. Eden 2024-01-04 02:35:51 UTC
In a Debian QEMU/KVM virtual machine, run `gdb` on any executable (e.g. `/usr/bin/ls`). Start the program by typing `starti`. Proceed to `_dl_start` (i.e. `break _dl_start`, `continue`). When you get there, disassemble the function (i.e. `disas`). Find an instruction that is going to be executed and for which you can compute the memory address it will write to. Run the program to that instruction (i.e. `break *0xINSN`, `continue`). When you're on that instruction, set a read/write watchpoint on the address it will write to, then single-step (i.e. `stepi`), and the kernel will go unresponsive.


>(gdb) x/1i $pc
>=> 0x7ffff7fe6510 <_dl_start+48>:      mov    %rdi,-0x88(%rbp)
>(gdb) x/1wx $rbp-0x88
>0x7fffffffec28:        0x00000000
>(gdb) awatch *0x7fffffffec28
>Hardware access (read/write) watchpoint 2: *0x7fffffffec28
>(gdb) stepi


Looking with `journalctl`, I cannot find anything printed to dmesg.

The kernel of the guest inside the virtual machine is Debian 6.1.0-15-amd64. The kernel of the host running qemu-system-x86_64 is Archlinux 6.6.7-arch1-1. gdb is version 13.1.
Comment 1 Sean Christopherson 2024-01-04 16:54:10 UTC
On Thu, Jan 04, 2024, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=218339
> 
>             Bug ID: 218339
>            Summary: kernel goes unresponsive if single-stepping over an
>                     instruction which writes to an address for which a
>                     hardware read/write watchpoint has been set
>            Product: Virtualization
>            Version: unspecified
>           Hardware: All
>                 OS: Linux
>             Status: NEW
>           Severity: normal
>           Priority: P3
>          Component: kvm
>           Assignee: virtualization_kvm@kernel-bugs.osdl.org
>           Reporter: anthony.louis.eden@gmail.com
>         Regression: No
> 
> In a debian QEMU/KVM virtual machine, run `gdb` on any executable (e.g.
> `/usr/bin/ls`). Run the program by typing `starti`. Proceed to `_dl_start`
> (i.e. `break _dl_start`, `continue`). When you get there disassemble the
> function (i.e. `disas`). Find an instruction that's going to be executed for
> which you can compute the address in memory it will write to. Run the program
> to that instruction (i.e. `break *0xINSN`, `continue`). When you're on that
> instruction, set a read/write watchpoint on the address it will write to,
> then single-step (i.e. `stepi`) and the kernel will go unresponsive.

By "the kernel", I assume you mean the guest kernel?

> >(gdb) x/1i $pc
> >=> 0x7ffff7fe6510 <_dl_start+48>:      mov    %rdi,-0x88(%rbp)
> >(gdb) x/1wx $rbp-0x88
> >0x7fffffffec28:        0x00000000
> >(gdb) awatch *0x7fffffffec28
> >Hardware access (read/write) watchpoint 2: *0x7fffffffec28
> >(gdb) stepi
> 
> 
> Looking with `journalctl`, I cannot find anything printed to dmesg.
> 
> The kernel of the guest inside the virtual machine is Debian 6.1.0-15-amd64.
> The kernel of the host running qemu-system-x86_64 is Archlinux 6.6.7-arch1-1.
> gdb is version 13.1.

Is this a regression or something that has always been broken?  I.e. did this work
on previous host kernels?
Comment 2 Anthony L. Eden 2024-01-04 23:21:23 UTC
> By "the kernel", I assume you mean the guest kernel?
Yes, the guest kernel. I can no longer interact with the VM via the serial console. It is unresponsive.

I attached a debugger to qemu-system-x86_64 to see if qemu itself was in an infinite loop or something but the stacktraces all looked normal.

> Is this a regression or something that has always been broken?  I.e. did this
> work on previous host kernels?
I do not know whether this has always been broken or not.
Comment 3 Yao Yuan 2024-01-10 12:38:08 UTC
Hi,

I tried on my side but can't reproduce it; logs below. Are there any steps I missed?

(gdb) b *0x00007ffff7fe4048
Breakpoint 1 at 0x7ffff7fe4048: file ./elf/rtld.c, line 527.
(gdb) c
Continuing.

Breakpoint 1, 0x00007ffff7fe4048 in _dl_start (arg=0x7fffffffe510) at ./elf/rtld.c:527
527     in ./elf/rtld.c
(gdb) disassemble
Dump of assembler code for function _dl_start:
   0x00007ffff7fe4030 <+0>:     endbr64
   0x00007ffff7fe4034 <+4>:     push   %rbp
   0x00007ffff7fe4035 <+5>:     mov    %rsp,%rbp
   0x00007ffff7fe4038 <+8>:     push   %r15
   0x00007ffff7fe403a <+10>:    push   %r14
   0x00007ffff7fe403c <+12>:    push   %r13
   0x00007ffff7fe403e <+14>:    push   %r12
   0x00007ffff7fe4040 <+16>:    push   %rbx
   0x00007ffff7fe4041 <+17>:    sub    $0x88,%rsp
=> 0x00007ffff7fe4048 <+24>:    mov    %rdi,-0x78(%rbp)
   0x00007ffff7fe404c <+28>:    rdtsc
(gdb) x/16xb $rbp-0x78
0x7fffffffe488: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7fffffffe490: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
(gdb) awatch *0x7fffffffe488
Hardware access (read/write) watchpoint 2: *0x7fffffffe488
(gdb) stepi

Hardware access (read/write) watchpoint 2: *0x7fffffffe488

Old value = 0
New value = -6896
0x00007ffff7fe404c in rtld_timer_start (var=0x7ffff7ffcaa0 <start_time>) at ./elf/rtld.c:85
85      in ./elf/rtld.c


The guest kernel keeps running properly after the above steps inside the guest.
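
For what it's worth, the `New value = -6896` in the log is consistent with the store having completed. gdb treats `*0x7fffffffe488` as a 4-byte `int`, and the low 32 bits of `arg=0x7fffffffe510`, read as a signed value, give exactly that number. A small Python sketch of the arithmetic follows. The helper name is made up for illustration, and it assumes %rdi still holds `arg` when the `mov` executes:

```python
def low32_as_signed(value):
    """Interpret the low 32 bits of value as a two's-complement int32."""
    low = value & 0xFFFFFFFF
    # If bit 31 is set, the 32-bit pattern represents a negative number.
    return low - (1 << 32) if low & 0x80000000 else low


# Assumption: %rdi == arg == 0x7fffffffe510 at the watched instruction.
print(low32_as_signed(0x7FFFFFFFE510))  # prints -6896
```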

My configuration:
Host: stable kernel v6.6.8 commit 4c9646a796d66a2d81871a694e88e19a38b115a7
QEMU: v8.1.1 commit 6bb4a8a47a43f35a345f107227fcd6abed59e62c
Guest kernel: kvm tree tags/kvm-6.8-1 commit 1c6d984f523f67ecfad1083bb04c55d91977bb15
Comment 4 Anthony L. Eden 2024-01-10 20:21:39 UTC
> I tried on my side but can't reproduce it, logs below. Any steps I missed?

Nope, it looks like you did everything right.



I spent a little more time investigating this, since for me it's trivial to reproduce. I was able to get the guest kernel vmlinux *with* debugging information from the linux-image-6.1.0-15-amd64-dbg debian package.

After entering the final `stepi` within gdb, which is when the guest goes totally unresponsive, in htop I see qemu-system-x86_64 is taking up 100% CPU. Like I said, the thread call stacks in the qemu process look typical.

I used the qemu monitor command 'dump-guest-memory -p /root/linux.core' three separate times after the guest went unresponsive, and all three of the core file backtraces look like this:

#0  pv_native_set_debugreg (regno=7, val=0) at arch/x86/include/asm/debugreg.h:92
#1  0xffffffff81a21533 in set_debugreg (reg=7, val=0) at arch/x86/include/asm/paravirt.h:129
#2  local_db_save () at arch/x86/include/asm/debugreg.h:127
#3  exc_debug_kernel (dr6=0, regs=0xfffffe0000010f58) at arch/x86/kernel/traps.c:1038
#4  exc_debug (regs=0xfffffe0000010f58) at arch/x86/kernel/traps.c:1175
#5  0xffffffff81c00c6a in asm_exc_debug () at /build/reproducible-path/linux-6.1.66/arch/x86/include/asm/idtentry.h:606
#6  0x0000000000000000 in ?? ()



My VM was in a self-contained folder with its own run script on the host so I made a tarball of it. It is available for download here (~9 GB):

https://drive.google.com/file/d/1r3tlrw8kG17vFwXzP6ETv76ptNhbLYjt/view?usp=sharing

Usage:

$ tar xvSf deb-vm-x86_64.tar
$ cd deb-vm-x86_64/
$ ./run.sh

In another terminal,

$ screen /dev/pts/23 115200
Log in as user 'root' with password 'root'.

Once inside,

$ gdb /usr/bin/ls
(gdb) starti
...


Oh and by the way, the version of qemu-system-x86_64 on my host is 7.2.7 (Debian 1:7.2+dfsg-7+deb12u3).
Comment 5 Anthony L. Eden 2024-01-10 21:07:32 UTC
Actually upon closer inspection I'm seeing two distinct call stacks appear in the core files.

#0  pv_native_set_debugreg (regno=7, val=0) at arch/x86/include/asm/debugreg.h:92
#1  0xffffffff81a21533 in set_debugreg (reg=7, val=0) at arch/x86/include/asm/paravirt.h:129
#2  local_db_save () at arch/x86/include/asm/debugreg.h:127
#3  exc_debug_kernel (dr6=0, regs=0xfffffe0000010f58) at arch/x86/kernel/traps.c:1038
#4  exc_debug (regs=0xfffffe0000010f58) at arch/x86/kernel/traps.c:1175
#5  0xffffffff81c00c6a in asm_exc_debug () at /build/reproducible-path/linux-6.1.66/arch/x86/include/asm/idtentry.h:606
#6  0x0000000000000000 in ?? ()

#0  pv_native_set_debugreg (regno=7, val=983554) at arch/x86/include/asm/debugreg.h:92
#1  0xffffffff81a21509 in set_debugreg (reg=7, val=983554) at arch/x86/include/asm/paravirt.h:129
#2  local_db_restore (dr7=983554) at arch/x86/include/asm/debugreg.h:147
#3  exc_debug_kernel (dr6=<optimized out>, regs=0xfffffe0000010f58) at arch/x86/kernel/traps.c:1095
#4  exc_debug (regs=0xfffffe0000010f58) at arch/x86/kernel/traps.c:1175
#5  0xffffffff81c00c6a in asm_exc_debug () at /build/reproducible-path/linux-6.1.66/arch/x86/include/asm/idtentry.h:606
#6  0x0000000000000000 in ?? ()


They are quite similar except in one of them frame #2 is local_db_save() and in the other trace frame #2 is local_db_restore().
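
To make the `val=983554` in the second trace easier to read, here is a rough Python sketch that decodes it as a DR7 value. The bit layout is my understanding of DR7 from the Intel SDM (enable bits in bits 0-7, R/W and LEN fields from bit 16 up), and the function name and output format are invented for illustration:

```python
# R/W field encodings and LEN field encodings, per my reading of the DR7 layout.
RW_MODES = {0b00: "execute", 0b01: "write", 0b10: "io", 0b11: "read/write"}
LENGTHS = {0b00: 1, 0b01: 2, 0b10: 8, 0b11: 4}


def decode_dr7(dr7):
    """Return one dict per enabled breakpoint slot (DR0..DR3)."""
    slots = []
    for n in range(4):
        local = bool((dr7 >> (2 * n)) & 1)       # Ln bit
        global_ = bool((dr7 >> (2 * n + 1)) & 1)  # Gn bit
        if not (local or global_):
            continue
        rw = (dr7 >> (16 + 4 * n)) & 0b11      # R/Wn field
        length = (dr7 >> (18 + 4 * n)) & 0b11  # LENn field
        slots.append({
            "slot": n,
            "local": local,
            "global": global_,
            "type": RW_MODES[rw],
            "bytes": LENGTHS[length],
        })
    return slots


print(hex(983554))         # prints 0xf0202
print(decode_dr7(983554))
```

If that layout is right, 0xf0202 describes a single globally enabled 4-byte read/write breakpoint in slot 0 (plus the GE bit, bit 9), which matches the `awatch` set in gdb. The `val=0` in the first trace would then be DR7 being cleared in `local_db_save()`, and the second trace would be `local_db_restore()` writing the watchpoint configuration back.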

By the way this time I ran the VM under a different, newer version of qemu-system-x86_64 (8.2.0), and it appears to have made no difference.


Also, concerning the VM in the tarball I linked to: if run.sh is run as is, you will be able to ssh into the running guest with 'ssh -p 10024 root@localhost'. The kernel image *with* debug info is located at /usr/lib/debug/vmlinux-6.1.0-15-amd64.
Comment 6 Kishen Maloor 2024-03-05 20:24:56 UTC
(In reply to Anthony L. Eden from comment #0)
> In a debian QEMU/KVM virtual machine, run `gdb` on any executable (e.g.
> `/usr/bin/ls`). Run the program by typing `starti`. Proceed to `_dl_start`
> (i.e. `break _dl_start`, `continue`). When you get there disassemble the
> function (i.e. `disas`). Find an instruction that's going to be executed for
> which you can compute the address in memory it will write to. Run the
> program to that instruction (i.e. `break *0xINSN`, `continue`). When you're
> on that instruction, set a read/write watchpoint on the address it will
> write to, then single-step (i.e. `stepi`) and the kernel will go
> unresponsive.
> 
> 
> >(gdb) x/1i $pc
> >=> 0x7ffff7fe6510 <_dl_start+48>:      mov    %rdi,-0x88(%rbp)
> >(gdb) x/1wx $rbp-0x88
> >0x7fffffffec28:        0x00000000
> >(gdb) awatch *0x7fffffffec28
> >Hardware access (read/write) watchpoint 2: *0x7fffffffec28
> >(gdb) stepi
> 

I can reproduce the behavior you describe. But it seems that you're not invoking KVM at all, because when I add '-accel kvm' or '-enable-kvm' to your qemu command line the problem goes away.

There may be an issue specifically in the handling of hardware watchpoints in QEMU's emulation. If I disable hardware watchpoints in gdb using 'set can-use-hw-watchpoints 0' and then use 'watch *<ADDR>', that works.
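
In case it helps others triaging similar reports, below is a minimal Python sketch that checks whether a QEMU command line asks for KVM at all. It only looks for the options `-enable-kvm`, `-accel kvm`, and `-machine ...,accel=kvm`; the function name and the example command lines are hypothetical (they are not copied from run.sh), and the parsing is deliberately simplified:

```python
def qemu_uses_kvm(argv):
    """Best-effort check for a KVM accelerator option in a QEMU argv list."""
    for i, arg in enumerate(argv):
        if arg == "-enable-kvm":
            return True
        # The option value, if any, is the next argv element.
        value = argv[i + 1] if i + 1 < len(argv) else ""
        if arg == "-accel" and value.split(",")[0] == "kvm":
            return True
        if arg in ("-machine", "-M"):
            for opt in value.split(","):
                if opt == "accel=kvm" or opt.startswith("accel=kvm:"):
                    return True
    return False


# Hypothetical command lines; the real run.sh may differ.
print(qemu_uses_kvm(["qemu-system-x86_64", "-m", "2G", "-nographic"]))     # prints False
print(qemu_uses_kvm(["qemu-system-x86_64", "-accel", "kvm", "-m", "2G"]))  # prints True
```

When none of these options is present, QEMU falls back to its TCG emulator, so a `False` here would point at emulation rather than KVM.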
