Connman is using udev message to know a new net card created. But when the message comes, the call of ioctl(sk, SIOCGIFNAME, ifr) fails. If I use the bellow patch, it will be ok : diff --git a/net/core/dev.c b/net/core/dev.c index 6a94475..385de09 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -4783,11 +4783,6 @@ int register_netdevice(struct net_device *dev) if (dev->features & NETIF_F_SG) dev->features |= NETIF_F_GSO; - netdev_initialize_kobject(dev); - ret = netdev_register_kobject(dev); - if (ret) - goto err_uninit; - dev->reg_state = NETREG_REGISTERED; /* * Default initial state at registry is that the @@ -4800,6 +4795,13 @@ int register_netdevice(struct net_device *dev) dev_hold(dev); list_netdevice(dev); + netdev_initialize_kobject(dev); + ret = netdev_register_kobject(dev); + if (ret) + goto err_uninit; + dev->reg_state = NETREG_REGISTERED; + + /* Notify protocols, that a new device appeared. */ ret = call_netdevice_notifiers(NETDEV_REGISTER, dev); ret = notifier_to_errno(ret);
I believe the driver you used was a cfg80211 driver, I am starting to think connman picked up a udev event for the kobject created for the device that gets used internally by cfg80211 through wiphy_register(). So for example, at least mac80211 later adds an actual networking device and registers it via register_netdevice(), and this will also create a udev event, this is the udev event that connman should look out for. Are we certain that the udev event that connman uses to process a networking device is the one from register_netdevice() and not from wiphy_register()? For ath6kl we have gotten issues reported with similar issues and at first I thought the race was we were calling register_netdev() after wiphy_register() but this turns out to be OK, even though inverting this can actually help, but its the wrong fix. The next step I took was to drop using register_netdev() and instead contend on the rtnl_lock() and use dev_alloc_name() myself and later use register_netdevice(). This should expose the proper desired name for the networking device for ath6kl. If this doesn't fix it I start wondering more about the udev events that connman is starting to process and if its really for a networking device.
Created attachment 51952 [details] ath6kl: Use register_netdevice() and dev_alloc_name() This is the stuff I was talking about for ath6kl. This uses register_netdevice() and dev_alloc_name(), can you try this instead?
Oh that patch will not apply, I have a boat load of other updates on top of staging-2.6.git on the staging-next branch for which that will apply.. hmm, so I can get you all my other patches.. but that's a lot.. I think you can read the patch though and get the idea of what was intended though.
anyone?
I can reproduce this easily with flimflam on x86 box with one cpu and ath6kl. I just added this extra delay: --- a/net/core/dev.c +++ b/net/core/dev.c @@ -5428,6 +5428,9 @@ int register_netdevice(struct net_device *dev) ret = netdev_register_kobject(dev); if (ret) goto err_uninit; + + msleep(10000); + dev->reg_state = NETREG_REGISTERED; netdev_update_features(dev);
And here's a part of flimflam log when the race happens. Notice how SIOCGIFNAME fails. Apr 29 00:21:35 roska flimflamd[2598]: src/rfkill.c:rfkill_event() idx 0 type 1 op 2 soft 0 hard 0 Apr 29 00:21:35 roska kernel: [ 125.530250] ar6000_init() Got WMI @ 0xf4bb3d40. Apr 29 00:21:35 roska kernel: [ 125.541146] Target Ready: credits: 11 credit size: 1664 Apr 29 00:21:35 roska kernel: [ 125.541151] HIF-SCATTER Enabled: max scatter req : 4 entries: 16 Apr 29 00:21:35 roska kernel: [ 125.541174] AR6K: HIF layer supports scatter requests (max scatter items:16: maxlen:32768) Apr 29 00:21:35 roska kernel: [ 125.541177] AR6K: max recv: 32768 max send: 12288 Apr 29 00:21:35 roska kernel: [ 125.549883] ar6000_init() WMI is ready Apr 29 00:21:35 roska kernel: [ 125.550168] wmi_control_rx() : Unknown id 0x101e Apr 29 00:21:35 roska kernel: [ 125.550673] AR6000 Reg Code = 0x40000060 Apr 29 00:21:35 roska flimflamd[2598]: === add === Apr 29 00:21:35 roska flimflamd[2598]: DEVPATH = /devices/pci0000:00/0000:00:1e.0/0000:03:02.1/mmc_host/mmc0/mmc0:0001/mmc0:0001:1/net/wlan0 Apr 29 00:21:35 roska flimflamd[2598]: INTERFACE = wlan0 Apr 29 00:21:35 roska flimflamd[2598]: IFINDEX = 4 Apr 29 00:21:35 roska flimflamd[2598]: DEVPATH = /devices/pci0000:00/0000:00:1e.0/0000:03:02.1/mmc_host/mmc0/mmc0:0001/mmc0:0001:1 Apr 29 00:21:35 roska flimflamd[2598]: src/udev.c:add_net_device() Apr 29 00:21:35 roska flimflamd[2598]: connman_inet_ifname: SIOCGIFNAME(index 4): No such device Apr 29 00:21:45 roska flimflamd[2598]: src/rtnl.c:rtnl_message() buf 0xbfefda3c len 1004 Apr 29 00:21:45 roska flimflamd[2598]: src/rtnl.c:rtnl_message() NEWLINK len 1004 type 16 flags 0x0000 seq 0 Apr 29 00:21:45 roska flimflamd[2598]: src/ipconfig.c:__connman_ipconfig_newlink() index 4 type 1 flags 0x1002 ifname wlan0 Apr 29 00:21:45 roska flimflamd[2598]: wlan0 {create} index 4 type 1 <ETHER>
Jouni Malinen suggested another way to fix the issue: for some reason SIOCGIFNAME ioctl call doesn't take rtnl. Adding this fixes the race as well. Now the question is which way is the best way to do it? Here's a patch: --- a/net/core/dev.c +++ b/net/core/dev.c @@ -4917,8 +4917,12 @@ int dev_ioctl(struct net *net, unsigned int cmd, void __user *arg) rtnl_unlock(); return ret; } - if (cmd == SIOCGIFNAME) - return dev_ifname(net, (struct ifreq __user *)arg); + if (cmd == SIOCGIFNAME) { + rtnl_lock(); + ret = dev_ifname(net, (struct ifreq __user *)arg); + rtnl_unlock(); + return ret; + } if (copy_from_user(&ifr, arg, sizeof(struct ifreq))) return -EFAULT;
Created attachment 56402 [details] udev-ifname-race.c I can now reproduce with a stock wireless-testing kernel 2.6.39-rc5-wl (rc5 plus wireless patches) on my x86 laptop, without flimflam or any other special software. Here's how to do it: 1. Run four burnP6 instances (I have four virtual cpus) 2. ./udev-ifname-race (attached) 3. while true; do modprobe e1000e; sleep 0.5; modprobe -r e1000e; echo -n .; done Within few minutes udev-ifname-race will fail on my laptop (lenovo x201, i5, 64 bit).
Links to few discussions: http://marc.info/?l=linux-netdev&m=130409861507476&w=2 http://marc.info/?l=linux-netdev&m=130403226800848&w=2
patchset v3: http://marc.info/?l=linux-netdev&m=130555720227212&w=2