Bug 118771 - hv_netvsc panics when given a shared skbuff with < needed_headroom headroom and
Summary: hv_netvsc panics when given a shared skbuff with < needed_headroom headroom and
Status: RESOLVED INVALID
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: drivers_network@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-05-23 20:03 UTC by Robert Collins
Modified: 2018-01-22 00:13 UTC (History)
1 user (show)

See Also:
Kernel Version: 4.4.0
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description Robert Collins 2016-05-23 20:03:05 UTC
I consider this a bug because in the netdevice header:
```
 *      @needed_headroom: Extra headroom the hardware may need, but not in all
 *                        cases can this be guaranteed
```

so it seems that hv_netvsc should indeed handle this situation.

details...

Example bt:

       PANIC: "kernel BUG at /build/linux-FvcHlK/linux-4.4.0/net/core/skbuff.c:1201!"
         PID: 11022
     COMMAND: "pkt-gen"
        TASK: ffff8801f3633700  [THREAD_INFO: ffff8801f73e4000]
         CPU: 0
       STATE: TASK_RUNNING (PANIC)

crash> bt
PID: 11022  TASK: ffff8801f3633700  CPU: 0   COMMAND: "pkt-gen"
 #0 [ffff8801f73e7168] machine_kexec at ffffffff8105befb
 #1 [ffff8801f73e71c8] crash_kexec at ffffffff8110da12
 #2 [ffff8801f73e7298] oops_end at ffffffff81031c29
 #3 [ffff8801f73e72c0] die at ffffffff810320db
 #4 [ffff8801f73e72f0] do_trap at ffffffff8102f0e1
 #5 [ffff8801f73e7340] do_error_trap at ffffffff8102f489
 #6 [ffff8801f73e7400] do_invalid_op at ffffffff8102fa40
 #7 [ffff8801f73e7410] invalid_op at ffffffff81826e0e
    [exception RIP: pskb_expand_head+579]
    RIP: ffffffff81709673  RSP: ffff8801f73e74c8  RFLAGS: 00010202
    RAX: 0000000000000002  RBX: ffff88007aba2c00  RCX: 0000000002080020
    RDX: 0000000000000fc0  RSI: 0000000000000100  RDI: ffff88007aba2c00
    RBP: ffff8801f73e7500   R8: 000000000001a000   R9: 0000000000000001
    R10: ffff8801f8130ec0  R11: 0000000000000000  R12: ffff8801f8130000
    R13: 0000000000000ec0  R14: 0000000000000000  R15: ffff88007aba2c00
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffff8801f73e7508] netvsc_start_xmit at ffffffffc00853e0 [hv_netvsc]
 #9 [ffff8801f73e7780] generic_ndo_start_xmit at ffffffffc019fc4f [netmap]
#10 [ffff8801f73e7790] dev_hard_start_xmit at ffffffff8171d5e9
#11 [ffff8801f73e7800] sch_direct_xmit at ffffffff817420ff
#12 [ffff8801f73e7850] __qdisc_run at ffffffff817422e3
#13 [ffff8801f73e78a0] __dev_queue_xmit at ffffffff8171db8d
#14 [ffff8801f73e78f0] dev_queue_xmit at ffffffff8171de50
#15 [ffff8801f73e7900] nm_os_generic_xmit_frame at ffffffffc01a0ab3 [netmap]
#16 [ffff8801f73e7940] generic_netmap_txsync at ffffffffc0199402 [netmap]


This was triggered via netmap (which is about to have a workaround added to it) which was transmitting a shared skb without enough headroom. skb_cow_head() will call into pskb_expand_headroom if it needs to expand headroom, but when the reference count is not 1, pskb_expand_headroom panics.

I'm not sure whether a) skb_cow_head is wrong (and it should copy the skb in this situation) or b) whether hv_netvsc shoudl not be using skb_cow_head here.
Comment 1 Robert Collins 2016-05-23 22:47:36 UTC
Reproduction instructions.

git clone https://github.com/luigirizzo/netmap.git
git reset --hard 8123c744470aa4562635dd139d3895ed68d45dd7
cd LINUX
./configure --no-drivers
make
sudo insmod netmap.ko
cd ../examples
make
sudo ./pkt-gen -i eth0 -f tx -d 192.168.1.2:80 -s 192.168.137.75:1023 -n 5000 -S 00:15:5d:ba:c3:01 -D 00:15:5d:ba:c3:00

This should trigger the failure.

The underlying kernel calls to do the same manually would be something like:

/* Note, there is no headroom allocated */
skb = alloc_skb(1800, GFP_ATOMIC);
skb->dev = dev;
skb_copy_to_linear_data(skb, somedata, 1800);
atomic_inc(&skb->users);
dev_queue_xmit(skb);

Note that the patched driver in master still calls skb_cow_head to ask for enough headroom - and when there is no headroom this will call pskb_reserve_head, which with users != 1 will barf.
Comment 2 Stephen Hemminger 2017-01-24 23:17:26 UTC
Bug is invalid.
Kernel bugzilla is only for upstream kernel bugs WITH NO OUT OF TREE drivers.
Comment 3 Robert Collins 2018-01-21 09:49:23 UTC
I'm confused: is it invalid because I triggered the panic using netmap?

If so fair enough, but is there not an actual bug here in the hyperv driver which *is* in-tree? As far as I can tell it is legimitate (if illadvised) to submit an skb with 2 users linear data and insufficient headroom for the hyperv bus, and that is what will panic, not the netmap code.
Comment 4 Stephen Hemminger 2018-01-21 17:32:21 UTC
Unless you can reproduce with no out of tree drivers, it is invalid.
Comment 5 Robert Collins 2018-01-22 00:13:01 UTC
That confuses me because I would have thought that latent bugs were still bugs even if the current code doesn't actually provoke it.

In this case a driver panicing the kernel if dev->needed_headroom is not provided - because it is documented as being not guaranteed.

However its your policy not mine, so I'll leave this alone at this point.

Note You need to log in before you can comment on or make changes to this bug.