Bug 118571 - kernel: BUG: scheduling while atomic: ip/453/0x00000002
Summary: kernel: BUG: scheduling while atomic: ip/453/0x00000002
Status: RESOLVED CODE_FIX
Alias: None
Product: Networking
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assignee: Stephen Hemminger
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-05-19 21:59 UTC by James
Modified: 2016-06-09 20:54 UTC
CC List: 2 users

See Also:
Kernel Version: 4.6.0
Subsystem:
Regression: No
Bisected commit-id:


Attachments
log file - scheduling while atomic: ip (101.30 KB, application/octet-stream)
2016-05-19 21:59 UTC, James
Test Patch (2.21 KB, patch)
2016-06-08 20:52 UTC, [account disabled by administrator]

Description James 2016-05-19 21:59:48 UTC
Created attachment 216881
log file - scheduling while atomic: ip

Arch Linux 4.6-1
wpa_supplicant 1:2.5-3
Toshiba Satellite, circa 2011, with a Pentium Dual-Core Mobile
Error is not seen on other machines.

kernel: BUG: scheduling while atomic: ip/453/0x00000002

See attached.
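
For readers unfamiliar with the message format: the trailing fields in the BUG line are the task name, its PID, and the preempt count at the time of the check. The following is a minimal userspace sketch of decoding them, assuming the preempt_count bit layout of kernels from this era (8 preemption bits, 8 softirq bits, 4 hardirq bits, 1 NMI bit). The struct and function names are hypothetical helpers, not kernel APIs, and the parser assumes the task name contains no '/'.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed preempt_count layout (include/linux/preempt.h, 4.x kernels). */
#define PREEMPT_MASK 0x000000ffu	/* preempt_disable() nesting depth */
#define SOFTIRQ_MASK 0x0000ff00u	/* softirq / bottom-half state */
#define HARDIRQ_MASK 0x000f0000u	/* hardirq nesting depth */
#define NMI_MASK     0x00100000u	/* set while handling an NMI */

/* Hypothetical container for the three fields of the BUG line. */
struct atomic_bug {
	char comm[32];
	int pid;
	unsigned int preempt_count;
};

/*
 * Parse "... scheduling while atomic: <comm>/<pid>/0x<count>".
 * Returns 0 on success, -1 if the line does not match.
 */
static int parse_atomic_bug(const char *line, struct atomic_bug *out)
{
	const char *tag = "scheduling while atomic: ";
	const char *p;

	assert(line && out);
	p = strstr(line, tag);
	if (!p)
		return -1;
	p += strlen(tag);
	/* %x accepts the optional "0x" prefix. */
	if (sscanf(p, "%31[^/]/%d/%x", out->comm, &out->pid,
		   &out->preempt_count) != 3)
		return -1;
	return 0;
}

/* Field extractors for the assumed layout. */
static unsigned int preempt_depth(unsigned int pc)
{
	return pc & PREEMPT_MASK;
}

static unsigned int softirq_field(unsigned int pc)
{
	return (pc & SOFTIRQ_MASK) >> 8;
}

static unsigned int hardirq_depth(unsigned int pc)
{
	return (pc & HARDIRQ_MASK) >> 16;
}

static int in_nmi_context(unsigned int pc)
{
	return (pc & NMI_MASK) != 0;
}
```

Under that layout, 0x00000002 decodes to a preemption depth of 2 with the softirq, hardirq, and NMI fields all zero, which would be consistent with task context that has preemption disabled (for example, while holding spinlocks). The log line alone does not say which locks are involved.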
Comment 1 James 2016-05-19 23:18:34 UTC
These "scheduling while atomic" errors show for both "ip" and "wpa_supplicant".  From the wpa_supplicant/hostap mailing list, there was this:

This is a kernel bug, not a wpa_supplicant bug.  The linux-wireless mailing
list would be a more appropriate venue for this bug report.
...
[Probably due to this kernel change:

commit 49f86ec21c01b654f6ec47f2f4567f4f9ebaa26b
Author: Larry Finger <Larry.Finger@lwfinger.net>
Date:   Mon Feb 15 16:12:07 2016 -0600

    rtlwifi: Change long delays to sleeps


...apparently this function isn't in sleepable context after all.]
Comment 2 [account disabled by administrator] 2016-06-08 20:51:46 UTC
Just reverted that commit and attaching as a patch for you to try and see if it fixes your issue.
Comment 3 [account disabled by administrator] 2016-06-08 20:52:28 UTC
Created attachment 219351
Test Patch
Comment 4 James 2016-06-09 00:31:00 UTC
No, no - this has been fixed.  I'm sorry - I'm so used to having other people close my bug reports that I'm never sure what the proper etiquette is.

This was addressed by Bob Copeland on hostap@lists.infradead.org and linux-wireless@vger.kernel.org, and by Larry Finger and Kalle Valo on devel@driverdev.osuosl.org and linux-wireless@vger.kernel.org.

Summarizing:

Forward from Larry Finger: [PATCH] rtlwifi: Fix scheduling while atomic error from commit 49f86ec21c01

Commit 49f86ec21c01 ("rtlwifi: Change long delays to sleeps") was correct
for most cases; however, driver rtl8192ce calls the affected routines while
in atomic context. The kernel bug output is as follows:

BUG: scheduling while atomic: wpa_supplicant/627/0x00000002
[...]
[<ffffffff815c2b39>] __schedule+0x899/0xad0
[<ffffffff815c2dac>] schedule+0x3c/0x90
[<ffffffff815c5bb2>] schedule_hrtimeout_range_clock+0xa2/0x120
[<ffffffff810e8b80>] ? hrtimer_init+0x120/0x120
[<ffffffff815c5ba6>] ? schedule_hrtimeout_range_clock+0x96/0x120
[<ffffffff815c5c43>] schedule_hrtimeout_range+0x13/0x20
[<ffffffff815c568f>] usleep_range+0x4f/0x70
[<ffffffffa0667218>] rtl_rfreg_delay+0x38/0x50 [rtlwifi]
[<ffffffffa06dd0e7>] rtl92c_phy_config_rf_with_headerfile+0xc7/0xe0 [rtl8192ce]

To fix this bug, three of the changes from delay to sleep are reverted.
Unfortunately, one of the changes involves a delay of 50 msec. The calling
code will be modified so that this long delay can be avoided; however,
this change is being pushed now to fix the problem in kernel 4.6.0.
...

Forward from Kalle Valo:
I'm planning to queue this to 4.7.

Forward from Larry Finger:
That will be good as it will be ported to 4.6 quickly after that.

---


drivers/net/wireless/realtek/rtlwifi/core.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/core.c b/drivers/net/wireless/realtek/rtlwifi/core.c
index 0f48048..3a0faa8 100644
--- a/drivers/net/wireless/realtek/rtlwifi/core.c
+++ b/drivers/net/wireless/realtek/rtlwifi/core.c
@@ -54,7 +54,7 @@ EXPORT_SYMBOL(channel5g_80m);
 void rtl_addr_delay(u32 addr)
 {
 	if (addr == 0xfe)
-		msleep(50);
+		mdelay(50);
 	else if (addr == 0xfd)
 		msleep(5);
 	else if (addr == 0xfc)
@@ -75,7 +75,7 @@ void rtl_rfreg_delay(struct ieee80211_hw *hw, enum radio_path rfpath, u32 addr,
 		rtl_addr_delay(addr);
 	} else {
 		rtl_set_rfreg(hw, rfpath, addr, mask, data);
-		usleep_range(1, 2);
+		udelay(1);
 	}
 }
 EXPORT_SYMBOL(rtl_rfreg_delay);
@@ -86,7 +86,7 @@ void rtl_bb_delay(struct ieee80211_hw *hw, u32 addr, u32 data)
 		rtl_addr_delay(addr);
 	} else {
 		rtl_set_bbreg(hw, addr, MASKDWORD, data);
-		usleep_range(1, 2);
+		udelay(1);
 	}
 }
 EXPORT_SYMBOL(rtl_bb_delay);
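
To illustrate why swapping a sleep for a busy-wait delay silences the report, here is a rough userspace model, not kernel code. It assumes only what the patch description states: sleeping helpers such as msleep() and usleep_range() reach the scheduler, which complains when the caller is atomic, while mdelay() and udelay() spin without scheduling. Every `model_*` name is hypothetical and exists only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace model only: none of these names exist in the kernel. */
static unsigned int model_preempt_count;	/* stand-in for preempt_count() */
static unsigned int model_bug_reports;		/* times the check fired */

static void model_preempt_disable(void)
{
	model_preempt_count++;
}

static void model_preempt_enable(void)
{
	assert(model_preempt_count > 0);
	model_preempt_count--;
}

/*
 * Sleeping helpers give up the CPU, so they pass through the scheduler,
 * which reports a bug if the caller is in atomic context.
 */
static void model_sleep(unsigned int msecs)
{
	(void)msecs;
	if (model_preempt_count != 0) {
		model_bug_reports++;
		fprintf(stderr,
			"model: scheduling while atomic (count=0x%08x)\n",
			model_preempt_count);
	}
}

/* Busy-wait helpers spin on the CPU and never enter the scheduler. */
static void model_delay(unsigned int msecs)
{
	(void)msecs;
}

/* Mirrors the 0xfe branch of rtl_addr_delay() before and after the fix. */
static void model_addr_delay(unsigned int addr, bool use_sleep)
{
	if (addr != 0xfe)
		return;
	if (use_sleep)
		model_sleep(50);
	else
		model_delay(50);
}

/* Runs one call and returns how many reports it produced. */
static unsigned int model_run(bool atomic, bool use_sleep)
{
	model_bug_reports = 0;
	if (atomic)
		model_preempt_disable();
	model_addr_delay(0xfe, use_sleep);
	if (atomic)
		model_preempt_enable();
	return model_bug_reports;
}
```

In this model, `model_run(true, true)` produces one report, while `model_run(true, false)` and `model_run(false, true)` produce none. The trade-off noted in the patch description still applies: a 50 msec busy-wait keeps the CPU occupied for the whole interval, which is why the calling code was to be reworked later.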


Forward from Kalle Valo:
Thanks, 1 patch applied to wireless-drivers.git:

de26859dcf36 rtlwifi: Fix scheduling while atomic error from commit 49f86ec21c01

-------------------

The fix was effective in Arch Linux 4.6.1-2.

I don't know about the relationship between bugzilla.kernel.org and linux-wireless@vger.kernel.org, so please handle this as you think best.  There was Larry's comment about "Unfortunately, one of the changes involves a delay of 50 msec. The calling code will be modified so that this long delay can be avoided".  I don't know that that issue has been addressed yet.  And otherwise, please close this bug report if that seems appropriate.
Comment 5 [account disabled by administrator] 2016-06-09 01:57:33 UTC
I can't close this bug: I don't have the permissions, since you opened it and I am not a subsystem maintainer. Could you close it yourself?
