Bug 78771

Summary: [REGRESSION] Loading Hyper-V network drivers is racy in 3.14+
Product: Drivers Reporter: Sitsofe Wheeler (sitsofe)
Component: NetworkAssignee: drivers_network (drivers_network)
Status: RESOLVED INVALID    
Severity: normal CC: jskier, mysqlstudent
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 3.14 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: Kernel configuration
dmesg of 3.16.0-rc4 failing to bring up networking

Description Sitsofe Wheeler 2014-06-23 14:59:45 UTC
Description of problem:
With the 3.14.2-200.fc20 kernel Hyper-V is no longer able to bring enable its networking devices in time on cloud images leading to the network devices not being brought online.

Version-Release number of selected component (if applicable):
3.14.2-200.fc20

How reproducible:
Reproducible very frequently.

Steps to Reproduce:
1. Install cloud image on Hyper-V.
2. Update to the latest kernel.
3. Reboot.

Actual results:
The journal output from a boot on a hand built 3.15.0 kernel:

Jun 23 13:56:00 localhost systemd-sysctl[408]: Overwriting earlier assignment of kernel/sysrq in file '/etc/sysctl.d/sysrq.conf'.
Jun 23 13:56:00 localhost network[381]: Bringing up loopback interface:  [  OK  ]
Jun 23 13:56:00 localhost network[381]: Bringing up interface eth0:  ERROR    : [/etc/sysconfig/network-scripts/ifup-eth] Device eth0 does not seem to be present, delaying initialization.
Jun 23 13:56:00 localhost /etc/sysconfig/network-scripts/ifup-eth[496]: Device eth0 does not seem to be present, delaying initialization.
Jun 23 13:56:00 localhost network[381]: [FAILED]
Jun 23 13:56:00 localhost network[381]: Bringing up interface eth1:  ERROR    : [/etc/sysconfig/network-scripts/ifup-eth] Device eth1 does not seem to be present, delaying initialization.
Jun 23 13:56:00 localhost /etc/sysconfig/network-scripts/ifup-eth[518]: Device eth1 does not seem to be present, delaying initialization.
Jun 23 13:56:00 localhost network[381]: [FAILED]
Jun 23 13:56:00 localhost systemd-sysctl[539]: Overwriting earlier assignment of kernel/sysrq in file '/etc/sysctl.d/sysrq.conf'.
Jun 23 13:56:00 localhost systemd[1]: network.service: control process exited, code=exited status=1
Jun 23 13:56:00 localhost systemd[1]: Failed to start LSB: Bring up/down networking.
Jun 23 13:56:00 localhost systemd[1]: Unit network.service entered failed state.

So it looks like something is racing. Booting the same VM with Fedora's kernel-3.13.10-200.fc20.x86_64 never hits this issue. The VMware ESXi VMs I have used have no issue with later kernels (and have always booted faster).

Additional info:
After things have failed logging in and manually running
ifup eth0
works.

Originally filed on https://bugzilla.redhat.com/show_bug.cgi?id=1095387 but there's been no response.

Looking through /etc/sysconfig/network-scripts/ifup-eth /etc/sysconfig/network-scripts/network-functions it looks like two checks take place for /sys/class/net/eth0 but it's not there on either occasion...

Version information:
v3.15 vanilla kernel
Fedora 20
Hypervisor is Hyper-V 2012 R2
Comment 1 Sitsofe Wheeler 2014-07-06 19:44:56 UTC
After a painful round of bisection I've narrowed this down to commit b679ef73edc251f6d200a7dd2396e9fef9e36fc3 :

# bad: [455c6fdbd219161bd09b1165f11699d6d73de11c] Linux 3.14
# good: [d8ec26d7f8287f5788a494f56e8814210f0e64be] Linux 3.13
git bisect start 'v3.14' 'v3.13'
# good: [82c477669a4665eb4e52030792051e0559ee2a36] Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 82c477669a4665eb4e52030792051e0559ee2a36
# bad: [ca2a650f3dfdc30d71d21bcbb04d2d057779f3f9] Merge branch 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma
git bisect bad ca2a650f3dfdc30d71d21bcbb04d2d057779f3f9
# bad: [205e2210daa975d92ace485a65a31ccc4077fe1a] iwlwifi: disable TX AMPDU by default for iwldvm
git bisect bad 205e2210daa975d92ace485a65a31ccc4077fe1a
# bad: [09db30805300e9ed5ad43d4d339115cf1d9c84e1] dccp: re-enable debug macro
git bisect bad 09db30805300e9ed5ad43d4d339115cf1d9c84e1
# bad: [d9120198ddef2c0b61ca6659ace41b7c1e7c8f08] clk: shmobile: rcar-gen2: Use kick bit to allow Z clock frequency change
git bisect bad d9120198ddef2c0b61ca6659ace41b7c1e7c8f08
# bad: [1b07da516ee25250f458c76c012ebe4cd677a84f] hyperv: Move state setting for link query
git bisect bad 1b07da516ee25250f458c76c012ebe4cd677a84f
# bad: [53611c0ce9f6e2fa2e31f9ab4ad8c08c512085ba] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
git bisect bad 53611c0ce9f6e2fa2e31f9ab4ad8c08c512085ba
# bad: [a34fe10750ebe524a39f97bd78ab4d232a554edb] parisc: locks: remove redundant arch_*_relax operations
git bisect bad a34fe10750ebe524a39f97bd78ab4d232a554edb
# bad: [004e5cf743086990e5fc04a14437b3966d7fa9a2] Merge branch 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes
git bisect bad 004e5cf743086990e5fc04a14437b3966d7fa9a2
# bad: [a4ecdf82f8ea49f7d3a072121dcbd0bf3a7cb93a] Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad a4ecdf82f8ea49f7d3a072121dcbd0bf3a7cb93a
# bad: [c60f7d5a8e7c639de5d9dfe07e1e91d302d506e4] Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
git bisect bad c60f7d5a8e7c639de5d9dfe07e1e91d302d506e4
# bad: [bf21d605bf7d18d2b3cdb1c19fc1b2a1549c1f11] Merge branch 'drm-fixes-3.14' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
git bisect bad bf21d605bf7d18d2b3cdb1c19fc1b2a1549c1f11
# bad: [07ae78c9798b79bad3d3adf983c94ba23fde54d4] drm/radeon/cik: stop the sdma engines in the enable() function
git bisect bad 07ae78c9798b79bad3d3adf983c94ba23fde54d4
# bad: [7848865914c6a63ead674f0f5604b77df7d3874f] drm/radeon: fix runpm disabling on non-PX harder
git bisect bad 7848865914c6a63ead674f0f5604b77df7d3874f
# bad: [e9e352e9100b98aed1a5fb9e33355c29fb07d5b1] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/olof/chrome-platform
git bisect bad e9e352e9100b98aed1a5fb9e33355c29fb07d5b1
# good: [6e1f586d31ad49063da391db12632b31c7b00d76] qlcnic: Fix SR-IOV cleanup code path
git bisect good 6e1f586d31ad49063da391db12632b31c7b00d76
# good: [562e74fefc36eb57286455c68a60f2776659a7e1] Merge tag 'cris-for-3.14' of git://jni.nu/cris
git bisect good 562e74fefc36eb57286455c68a60f2776659a7e1
# good: [f1499382f114231cbd1e3dee7e656b50ce9d8236] Merge tag 'xfs-for-linus-v3.14-rc1-2' of git://oss.sgi.com/xfs/xfs
git bisect good f1499382f114231cbd1e3dee7e656b50ce9d8236
# good: [0e47c969c65e213421450c31043353ebe3c67e0c] Merge tag 'for-linus-20140127' of git://git.infradead.org/linux-mtd
git bisect good 0e47c969c65e213421450c31043353ebe3c67e0c
# bad: [30c867eebfbd1c25310aec9f152578deaf793080] Merge tag 'blackfin-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/realmz6/blackfin-linux
git bisect bad 30c867eebfbd1c25310aec9f152578deaf793080
# bad: [c044dc2132d19d8c643cdd340f21afcec177c046] qeth: fix build of s390 allmodconfig
git bisect bad c044dc2132d19d8c643cdd340f21afcec177c046
# bad: [d922e1cb1ea17ac7f0a5c3c2be98d4bd80d055b8] net: Document promote_secondaries
git bisect bad d922e1cb1ea17ac7f0a5c3c2be98d4bd80d055b8
# good: [f2ebd477f141bc09b10fb8deb612a4d9b8999bba] bonding: restructure locking of bond_ab_arp_probe()
git bisect good f2ebd477f141bc09b10fb8deb612a4d9b8999bba
# bad: [b679ef73edc251f6d200a7dd2396e9fef9e36fc3] hyperv: Add support for physically discontinuous receive buffer
git bisect bad b679ef73edc251f6d200a7dd2396e9fef9e36fc3
# good: [a452ce345d63ddf92cd101e4196569f8718ad319] net: Fix memory leak if TPROXY used with TCP early demux
git bisect good a452ce345d63ddf92cd101e4196569f8718ad319
# good: [731073b9c99d46c6b6c01184f67ee6f75fd7a163] sky2: initialize napi before registering device
git bisect good 731073b9c99d46c6b6c01184f67ee6f75fd7a163
# first bad commit: [b679ef73edc251f6d200a7dd2396e9fef9e36fc3] hyperv: Add support for physically discontinuous receive buffer

commit b679ef73edc251f6d200a7dd2396e9fef9e36fc3
Author: Haiyang Zhang <haiyangz@microsoft.com>
Date:   Mon Jan 27 15:03:42 2014 -0800

    hyperv: Add support for physically discontinuous receive buffer
    
    This will allow us to use bigger receive buffer, and prevent allocation failure
    due to fragmented memory.
    
    Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
    Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

The problem can be intermittent (sometimes it happens rarely, sometimes it happens seemingly every boot) so I used the following script to perform a check:

#!/bin/bash
ok=1
pass=0
bootcount=$(</root/bootcount)
bootcount=$((bootcount + 1))
while [[ $ok -ne 0 ]] && [[ $pass -lt 10 ]]; do
        pass=$((pass + 1))
        ping -qc 1 kernel.org
        ok=$?
        if [[ $ok -eq 0 ]]; then
                echo $bootcount > /root/bootcount
                sync
                reboot
        fi
        sleep 1
done
echo "No network"
read

With kernels equal to or after b679ef73edc251f6d200a7dd2396e9fef9e36fc3 the system will usually stop rebooting before 20 passes but the most extreme cases were always less than 100. With a pre b679ef73edc251f6d200a7dd2396e9fef9e36fc3 kernel it did over 390 passes before I manually stopped it.
Comment 2 Sitsofe Wheeler 2014-07-06 19:51:56 UTC
Created attachment 142201 [details]
Kernel configuration
Comment 3 Sitsofe Wheeler 2014-07-07 18:10:59 UTC
Created attachment 142351 [details]
dmesg of 3.16.0-rc4 failing to bring up networking
Comment 4 Sitsofe Wheeler 2014-07-08 18:21:33 UTC
Filed on LKML and netdev mailing lists: http://thread.gmane.org/gmane.linux.drivers.driver-project.devel/53083 .
Comment 5 Alex Regan 2014-09-29 20:57:28 UTC
Is there any update on this? I'm still experiencing this problem with 3.16.3-200.fc20.x86_64 and just sysv 'network' networking.

The thread on the netdev list just ends back in July.
Comment 6 Sitsofe Wheeler 2014-09-30 03:05:51 UTC
Alex:
I *think* this is fixed by the patch posted here https://lkml.org/lkml/2014/8/28/649 although you may also want those patches posted in https://lkml.org/lkml/2014/8/26/611 that were also cc'd to stable. I'm not aware of these patches being in any tree other than the yet-to-be-released 3.17.
Comment 7 Alex Regan 2014-09-30 23:23:32 UTC
Do you know which versions are affected by this?

Any idea how I would try and reproduce this for testing?
Comment 8 Sitsofe Wheeler 2014-10-01 06:09:17 UTC
Alex:
Versions affected are 3.14 and above (see comment #1 for the bisection).

You can test for this using a script like the one described in... comment #1 :-) Set it to start on boot after the network is up. Once you've got it working if you want to stop the VM rebooting (because you haven't hit the issue) simply detach all network cards from their virtual switches in Hyper-V manager.

(You are using Hyper-V/Azure right?)
Comment 9 Sitsofe Wheeler 2015-11-21 11:37:10 UTC
This problem (boot happens so fast that the network device isn't yet ready) can't be solved in the kernel but thankfully does have workarounds/solutions.

If you have to use old-style initscript networking then you can try a workaround by upping the length of time the system will wait at boot before declaring the network interface unavailable (see https://bugzilla.redhat.com/show_bug.cgi?id=1204612 for using DEVTIMEOUT on Fedora).

A better solution is to switch to a system that can configure network devices on demand (such as NetworkManager or systemd-networkd) after the device appears (see https://bugzilla.redhat.com/show_bug.cgi?id=1194623 ). I've since moved to systemd-networkd and no longer see this issue at all with Fedora 22/23. Marking resolved.