I've just compiled 3.8.0 for my Yeeloong laptop (mips64) and found that scanning for wireless networks in managed mode panics the kernel. I have a USB wireless card, reported by the rtl8187 driver as "RTL8187BvE V0 + rtl8225z2". If I start wpa_supplicant or run iw dev wlan0 scan, the kernel panics. This is a regression against 3.7.6. The only thing I remember enabling is transparent huge pages (which are probably not supported by the CPU anyway). Other USB traffic, like copying data from mass storage, works. Unfortunately the panic message is quite long; I can actually see three different oopses scrolling by. Until I find a way to get the panic log to a different machine, I can provide only the last screen, copied by hand:
    [former lines scrolled away]
    usb_hcd_giveback_urb
    ehci_urb_done
    qh_completitions
    ehci_irq
    usb_hcd_irq
    handle_irq_event_percpu
    handle_irq_event
    handle_level_irq
    generic_handle_irq
    do_IRQ
    ret_from_irq
    __do_soft_irq
    do_soft_irq
    irq_exit
    ret_from_irq
    die
    do_page_fault
    resume_userspace_check
    kthread_data
    wq_worker_sleeping
    __schedule
    die
    do_page_fault
    resume_userspace
    ieee80211_iface_work
    process_one_work
    worker_thread
    kthread
    ret_from_kernel_thread
    Code: fe000000 fe000008 fc830008 <fc640000> 0040202d 3c02893d 64429c80 3c01cfff 3421ffff
The last panic has some functions from the rtl8187 driver; the one quoted here does not. I'm sorry if this is not very helpful.
I recompiled the kernel without huge pages and it still panics. Now I got a better traceback; the deepest calls are:
    [<ffffffffc0406a44>] rtl8187_rx_cb+0x84/0x3f8 [rtl8187]
    [<ffffffffc00a4a60>] usb_hcd_giveback_urb+0xa0/0x180 [usbcore]
    [...]
According to reports from a few users, there has been a major bug in the rtl8187 kernel driver since Linux 3.8. We tested Linux 3.10, 3.11 and 3.12 and confirmed that the bug exists on the Loongson 8089D laptop with the vanilla kernel. It can be reliably reproduced by connecting to a WPA-encrypted access point with wpa_supplicant.
    Panic:
    ieee80211_tx_status_irqsafe+0x40/0x483 [mac80211]
    rtl8087_tx_cb+0x204/0x204 [rtl8187]
    usb_hcd_giveback_urb+0xc0/0x2c0
    ....
    loongson2_cpu_wait+0xa0/0xd8 [loongson2_cpufreq] -> Note: it still panics if I blacklist this module, so this module can't be the cause.
I can't provide a 100% accurate backtrace text, because I'm not an OCR program. Please see the photos for the full panic information. There are 3 different panics. The first one contains almost the same backtrace as Petr Pisar's.
    1. http://img.vim-cn.com/75/3d64b98b20607d6efbfc689508a7fdf53f8820.jpeg
    2. http://img.vim-cn.com/5d/d6fc985a61c7e9e9c435c68fb796f6ec9b8b9f.jpeg
    3. http://img.vim-cn.com/11/7d0014d2b19615b7e51be4f08bcc14a3491450.jpeg
There are only 3 changes to rtl8187 between versions 3.7 and 3.8:
    [stasiu@localhost linux]$ git log --pretty=oneline v3.7..v3.8 -- drivers/net/wireless/rtl818x/
    fd549f135c43d60b92aff7cfbbfdf4e79b6cc631 rtl8187: remove __dev* attributes
    fb4e899dea7ea5e540db1deb71442d37bb8100fc rtl8187: remove __dev* attributes
    f4bda337bbb6e245e2a07f344990adeb6a70ff35 mac80211: support RX_FLAG_MACTIME_END
None of them can be responsible for this bug. This seems to be a problem in another subsystem, such as USB or networking. The only way to move this bug forward is probably bisection.
Quite odd... I'll start testing and bisecting now.
While bisecting, some versions failed to build because of the same bug:
> On MIPS if SPARSEMEM is enabled we've got this:
> In file included from
> /home/kas/git/public/linux/arch/mips/include/asm/pgtable.h:552,
> from include/linux/mm.h:44,
> from arch/mips/kernel/asm-offsets.c:14:
> include/asm-generic/pgtable.h: In function ‘my_zero_pfn’:
> include/asm-generic/pgtable.h:466: error: implicit declaration of function
> ‘page_to_section’
> In file included from arch/mips/kernel/asm-offsets.c:14:
> include/linux/mm.h: At top level:
> include/linux/mm.h:738: error: conflicting types for ‘page_to_section’
> include/asm-generic/pgtable.h:466: note: previous implicit declaration of
> ‘page_to_section’ was here
> Due header files inter-dependencies, the only way I see to fix it is convert my_zero_pfn() for __HAVE_COLOR_ZERO_PAGE to macros.
I applied the 3.8-rc1-build-failure-with-MIPS-SPARSEMEM.patch to work around the issue and continue my testing. Finally, my result is:
    git bisect start
    # bad: [1800098549fc310cffffefdcb3722adaad0edda8] ARM: OMAP: Fix build breakage due to missing include in i2c.c
    git bisect bad 1800098549fc310cffffefdcb3722adaad0edda8
    # bad: [1800098549fc310cffffefdcb3722adaad0edda8] ARM: OMAP: Fix build breakage due to missing include in i2c.c
    git bisect bad 1800098549fc310cffffefdcb3722adaad0edda8
    # good: [29594404d7fe73cd80eaa4ee8c43dcc53970c60e] Linux 3.7
    git bisect good 29594404d7fe73cd80eaa4ee8c43dcc53970c60e
    # bad: [a13eea6bd9ee62ceacfc5243d54c84396bc86cb4] Merge tag 'for-3.8-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
    git bisect bad a13eea6bd9ee62ceacfc5243d54c84396bc86cb4
    # bad: [a49f0d1ea3ec94fc7cf33a7c36a16343b74bd565] Linux 3.8-rc1
    git bisect bad a49f0d1ea3ec94fc7cf33a7c36a16343b74bd565
    # good: [6be35c700f742e911ecedd07fcc43d4439922334] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
    git bisect good 6be35c700f742e911ecedd07fcc43d4439922334
    # bad: [027c0a6af42efa4f2f6034421349bd26a3ca4923] ARM: imx: Move platform-mx2-emma to arch/arm/mach-imx/devices
    git bisect bad 027c0a6af42efa4f2f6034421349bd26a3ca4923
    # bad: [027c0a6af42efa4f2f6034421349bd26a3ca4923] ARM: imx: Move platform-mx2-emma to arch/arm/mach-imx/devices
    git bisect bad 027c0a6af42efa4f2f6034421349bd26a3ca4923
    # bad: [15de0599277f3477ddd11766282587f12d214252] Merge branch 'autofs' (patches from Ian Kent)
    git bisect bad 15de0599277f3477ddd11766282587f12d214252
    # bad: [15de0599277f3477ddd11766282587f12d214252] Merge branch 'autofs' (patches from Ian Kent)
    git bisect bad 15de0599277f3477ddd11766282587f12d214252
    # good: [046e7d685bc370fd4c879ab6635ad3f69e6673d1] Merge tag 'sound-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
    git bisect good 046e7d685bc370fd4c879ab6635ad3f69e6673d1
    # good: [193c0d682525987db59ac3a24531a77e4947aa95] Merge tag 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
    git bisect good 193c0d682525987db59ac3a24531a77e4947aa95
    # good: [d2ff4fc557a4c5248b2d99b0d48e47a246d994b2] Merge branch 'for-upstream' of https://github.com/agraf/linux-2.6 into queue
    git bisect good d2ff4fc557a4c5248b2d99b0d48e47a246d994b2
    # good: [3127f23f013eabe9b58132c05061684c49146ba3] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
    git bisect good 3127f23f013eabe9b58132c05061684c49146ba3
    # good: [3127f23f013eabe9b58132c05061684c49146ba3] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
    git bisect good 3127f23f013eabe9b58132c05061684c49146ba3
    # good: [c59b9f92653f102856ca7802af551788c143a3a3] s390/pci: no msleep in potential IRQ context
    git bisect good c59b9f92653f102856ca7802af551788c143a3a3
    # good: [6a7ed405114b2a53ccd99631b0636aaeabf71b3e] Merge branch 'arm-privcmd-for-3.8' of git://xenbits.xen.org/people/ianc/linux into stable/for-linus-3.8
    git bisect good 6a7ed405114b2a53ccd99631b0636aaeabf71b3e
    # good: [66cdd0ceaf65a18996f561b770eedde1d123b019] Merge tag 'kvm-3.8-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
    git bisect good 66cdd0ceaf65a18996f561b770eedde1d123b019
    # good: [66cdd0ceaf65a18996f561b770eedde1d123b019] Merge tag 'kvm-3.8-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
    git bisect good 66cdd0ceaf65a18996f561b770eedde1d123b019
    # good: [189251705649bdfdf5e5850eb178f8cbfdac5480] ktest: Fix breakage from change of oldnoconfig to olddefconfig
    git bisect good 189251705649bdfdf5e5850eb178f8cbfdac5480
    # good: [e05a1c6397a73d09389e033b6b2c25c954d2177c] Merge tag 'ktest-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest
    git bisect good e05a1c6397a73d09389e033b6b2c25c954d2177c
    # good: [e05a1c6397a73d09389e033b6b2c25c954d2177c] Merge tag 'ktest-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest
    git bisect good e05a1c6397a73d09389e033b6b2c25c954d2177c
    # good: [0259cb02c4004d3088b0999799f8f5c8801f6b97] autofs4 - use simple_empty() for empty directory check
    git bisect good 0259cb02c4004d3088b0999799f8f5c8801f6b97
    # first bad commit: [15de0599277f3477ddd11766282587f12d214252] Merge branch 'autofs' (patches from Ian Kent)
Another interesting thing: 0259cb02c4004d3088b0999799f8f5c8801f6b97 builds without the patch, but from 15de0599277f3477ddd11766282587f12d214252 onwards I have to apply 3.8-rc1-build-failure-with-MIPS-SPARSEMEM.patch.
If some commit does not compile or has an unrelated bug, you can use git bisect skip for it. Does that help?
> # first bad commit: [15de0599277f3477ddd11766282587f12d214252] Merge branch
> 'autofs' (patches from Ian Kent)
That's strange. It seems the bisection went wrong, i.e. you marked some commit good instead of bad, or vice versa. You can verify this conclusively by compiling the kernel without CONFIG_AUTOFS4_FS.
It looks like starting from 046e7d685bc370fd4c879ab6635ad3f69e6673d1 you have only good commits, so either this one is incorrectly marked as good or 15de0599277f3477ddd11766282587f12d214252 is incorrectly marked as bad.
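A bisect log like the one above can also be checked mechanically: list the good/bad marks and count how many consecutive good marks end the log. A long run of good marks right after the commit that is finally reported as first bad is the pattern described here. The following is only a minimal Python sketch, not a git feature; the helper names parse_bisect_marks and trailing_good_run are made up for illustration, and the sample is a shortened excerpt of the log.

```python
import re


def parse_bisect_marks(log_text):
    """Return [(verdict, sha), ...] for every 'git bisect good|bad <sha>' command."""
    return re.findall(r"git bisect (good|bad) ([0-9a-f]{40})", log_text)


def trailing_good_run(marks):
    """Count the consecutive 'good' marks at the end of the log."""
    run = 0
    for verdict, _sha in reversed(marks):
        if verdict != "good":
            break
        run += 1
    return run


# Shortened excerpt of the bisect log from this report (illustrative selection).
SAMPLE_LOG = """
git bisect bad 15de0599277f3477ddd11766282587f12d214252
git bisect good 046e7d685bc370fd4c879ab6635ad3f69e6673d1
git bisect good 193c0d682525987db59ac3a24531a77e4947aa95
git bisect good 0259cb02c4004d3088b0999799f8f5c8801f6b97
"""

marks = parse_bisect_marks(SAMPLE_LOG)
print(marks[0])                  # the last commit marked bad in this excerpt
print(trailing_good_run(marks))  # number of good marks that follow it
```

If every step after a bad mark comes out good, bisect ends up blaming that bad commit, so it is worth re-testing both the bad commit and the first good one after it.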
You can also check if building without CONFIG_TRANSPARENT_HUGEPAGE and CONFIG_HUGETLBFS helps.
CONFIG_TRANSPARENT_HUGEPAGE and CONFIG_HUGETLBFS are broken on Loongson CPUs; I never use them.
Really sorry, I made some mistakes with git bisect. I ran the bisection again, and finally, I got it!
    ➜ linux git:(51d943f) ✗ git bisect good
    a16dad7763420a3b46cff1e703a9070827796cfc is the first bad commit
    commit a16dad7763420a3b46cff1e703a9070827796cfc
    Author: Ralf Baechle <ralf@linux-mips.org>
    Date: Sat Jun 9 20:48:47 2012 +0100
        MIPS: Fix potencial corruption
        Normally r4k_dma_cache_inv should only ever be called with cacheline aligned addresses. If however, it isn't there is the theoretical possibility of data corruption. There is no correct way of handling this and anyway, it should only happen if the DMA API is used incorrectly so drop
        There is a different corruption scenario with these CACHE instructions removed but again there is no way of handling this correctly and it can be triggered only through incorrect use of the DMA API. So just get rid of the complexity.
        Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
        Reported-by: James Rodriguez <jamesr@juniper.net>
    :040000 040000 01daf6e6755a15d1d96d5d0e92117af21789afbe 445b1384a39f3f1030cb0fca932f0b03c55ce826 M arch
I couldn't believe it at first, but I tested the previous versions and confirmed that this is the first bad commit. I reverted the commit on my 3.12.7 kernel, and the wireless is working again! So, is it a kernel bug, or does it look like a hardware bug in the CPU/IO of the Loongson platform?
I read the commit and found this comment in it:
    /*
     * There is no clearly documented alignment requirement
     * for the cache instruction on MIPS processors and
    @@ -644,6 +647,9 @@ static void r4k_dma_cache_inv(unsigned long addr, unsigned long size)
     * hit ops with insufficient alignment. Solved by
     * aligning the address to cache line size.
     */
I don't have any knowledge of CPU or IO subsystems, but judging from
> There is no clearly documented alignment requirement
I guess different hardware implements this differently. Maybe we need to reconsider how caching is handled on MIPS.
Good job on the bisection. The bad commit's message says:
    There is a different corruption scenario with these CACHE instructions removed but again there is no way of handling this correctly and it can be triggered only through incorrect use of the DMA API.
So it looks like the DMA API is not being used correctly. The rtl8187 driver does not use DMA directly; that is done via the USB host controller. Hence this bug should be reported to the maintainers of the USB HCD driver for the host controller in your laptop, on top of which the rtl8187 device runs. Please write an email to them (you can get the addresses using "./scripts/get_maintainer.pl -f drivers/usb/host/PROPER_HCD_DRIVER.c"). Cc Ralf, so he can explain exactly how the DMA API is misused, or maybe agree that for some CPUs the alignment instructions should be added back. You can also cc me.
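To see why an unaligned buffer is a problem for r4k_dma_cache_inv, it can help to work out which bytes outside the buffer share its first and last cache lines, since a cache operation acts on whole lines. A minimal Python sketch of that arithmetic follows; the 32-byte line size, the example addresses and the helper names are assumptions for illustration, not values taken from the kernel source.

```python
LINE_SIZE = 32  # assumed cache line size in bytes (must be a power of two)


def covering_lines(addr, size, lsize=LINE_SIZE):
    """Return (start, end) of the whole-line span that covers [addr, addr + size)."""
    start = addr & ~(lsize - 1)                     # round start down to a line
    end = (addr + size + lsize - 1) & ~(lsize - 1)  # round end up to a line
    return start, end


def spill(addr, size, lsize=LINE_SIZE):
    """Bytes before and after the buffer that live in the same cache lines."""
    start, end = covering_lines(addr, size, lsize)
    return addr - start, end - (addr + size)


# Hypothetical buffers: one starting mid-line, one ending mid-line.
print(spill(0x1010, 0x30))  # head bytes shared with whatever precedes the buffer
print(spill(0x1000, 18))    # tail bytes shared with whatever follows the buffer
```

Any non-zero value means unrelated data sits in a line that the cache operation touches, which is the situation the commit message calls incorrect use of the DMA API.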
I'm not sure where the implementation of my HCD is. The YeeLoong 8089D uses an AMD CS5536 controller:
> 00:0e.4 USB controller: Advanced Micro Devices, Inc. [AMD] CS5536 [Geode
> companion] OHC (rev 02)
> 00:0e.5 USB controller: Advanced Micro Devices, Inc. [AMD] CS5536 [Geode
> companion] EHC (rev 02)
I searched the source:
    ➜ linux git:(51d943f) ✗ find ./ | grep 5536 | grep -v "\.o"
    ./include/config/cs5536
    ./include/config/cs5536/mfgpt.h
    ./include/config/cs5536.h
    ./arch/mips/loongson/common/cs5536
    ./arch/mips/loongson/common/cs5536/cs5536_pci.c
    ./arch/mips/loongson/common/cs5536/cs5536_ide.c
    ./arch/mips/loongson/common/cs5536/Makefile
    ./arch/mips/loongson/common/cs5536/cs5536_ehci.c
    ./arch/mips/loongson/common/cs5536/cs5536_isa.c
    ./arch/mips/loongson/common/cs5536/cs5536_ohci.c
    ./arch/mips/loongson/common/cs5536/cs5536_acc.c
    ./arch/mips/loongson/common/cs5536/cs5536_mfgpt.c
    ./arch/mips/loongson/common/cs5536/modules.builtin
    ./arch/mips/include/asm/mach-loongson/cs5536
    ./arch/mips/include/asm/mach-loongson/cs5536/cs5536_mfgpt.h
    ./arch/mips/include/asm/mach-loongson/cs5536/cs5536_vsm.h
    ./arch/mips/include/asm/mach-loongson/cs5536/cs5536_pci.h
    ./arch/mips/include/asm/mach-loongson/cs5536/cs5536.h
    ./drivers/ata/pata_cs5536.c
    ./drivers/ide/cs5536.c
    ./drivers/usb/gadget/amd5536udc.h
    ./drivers/usb/gadget/amd5536udc.c
And I think none of them implements an HCD driver.
At first glance cs5536_ohci.c and cs5536_ehci.c look like the right files, but indeed they do not implement an HCD driver. They may be related to this bug anyway, as they seem to be used as a PCI abstraction for the proper HCD drivers, which are ehci-pci and ohci-pci (lsmod and/or "lsusb -t" should confirm that). In any case you can just report to the generic mailing lists: linux-mips@linux-mips.org and linux-usb@vger.kernel.org
You are right. lsusb -t shows Driver=ohci-pci and ehci-pci. I'll report the issue to the LinuxMIPS mailing list.
I can see your report did not get much attention :-( http://www.linux-mips.org/archives/linux-mips/2014-02/msg00001.html Let's try to continue debugging it here ...
Created attachment 125161 mips_dma_align_warn.patch This is a revert of the bad commit together with WARN prints, which will show a call trace when r4k_dma_cache_inv is called with unaligned addresses. This should show us where the DMA API is misused.
Created attachment 125171 mips_dma_align_warn_v2.patch Correct the end address alignment check.
Thanks for your WARNs, I got two warnings. The first one occurred in the early boot stage:
    [ 0.620000] TCP: cubic registered
    [ 0.620000] NET: Registered protocol family 17
    [ 0.620000] registered taskstats version 1
    [ 0.692000] ------------[ cut here ]------------
    [ 0.692000] WARNING: CPU: 0 PID: 16 at arch/mips/mm/c-r4k.c:677 r4k_dma_cache_inv+0x150/0x198()
    [ 0.692000] not aligned END: addr 98000000bf38ab80 size 12 lsize 20
    [ 0.692000] Modules linked in:
    [ 0.692000] CPU: 0 PID: 16 Comm: khubd Not tainted 3.13.1-e-yeeloong-gaizi #2
    [ 0.692000] Stack :
        0000000000000041 ffffffff847b0000 0000000000000006 ffffffff840755b0
        0000000000000000 000000000000000b ffffffff847adf70 ffffffff847b0000
        ffffffff846abfe0 ffffffff8472a037 ffffffff847adf70 98000000bf0cab90
        0000000000000010 0000000000000000 98000000bf346000 98000000bf38ab80
        0000000000000080 ffffffff84410a5c 98000000bf13f898 ffffffff840343d8
        98000000bf13f8a8 ffffffff84076e9c 98000000bf0ca8b0 ffffffff846abfe0
        0000000000000000 0000000000000000 0000000000000000 0000000000000000
        0000000000000000 98000000bf13f7e0 0000000000000000 ffffffff84034564
        0000000000000000 0000000000000000 00000000000002a5 98000000bf13f950
        0000000000000009 ffffffff8400c028 ffffffff846ad118 ffffffff84034564
        ...
    [ 0.692000] Call Trace:
    [ 0.692000] [<ffffffff8400c028>] show_stack+0x80/0x98
    [ 0.692000] [<ffffffff84034564>] warn_slowpath_common+0x84/0xb8
    [ 0.692000] [<ffffffff840345d0>] warn_slowpath_fmt+0x38/0x50
    [ 0.692000] [<ffffffff84023190>] r4k_dma_cache_inv+0x150/0x198
    [ 0.692000] [<ffffffff8401b178>] mips_dma_map_page+0xb0/0x178
    [ 0.692000] [<ffffffff84305898>] usb_hcd_map_urb_for_dma+0x4e8/0x580
    [ 0.692000] [<ffffffff84305c90>] usb_hcd_submit_urb+0x360/0xa50
    [ 0.692000] [<ffffffff843082d8>] usb_start_wait_urb+0x50/0x100
    [ 0.692000] [<ffffffff8430845c>] usb_control_msg+0xd4/0x138
    [ 0.692000] [<ffffffff843094a0>] usb_get_descriptor+0xa8/0x150
    [ 0.692000] [<ffffffff84309db8>] usb_get_device_descriptor+0x58/0xb8
    [ 0.692000] [<ffffffff842ffb8c>] hub_port_init+0x454/0xab0
    [ 0.692000] [<ffffffff84301588>] hub_thread+0x630/0x1410
    [ 0.692000] [<ffffffff84057e10>] kthread+0xe0/0xf8
    [ 0.692000] [<ffffffff84006bd0>] ret_from_kernel_thread+0x20/0x28
    [ 0.692000] ---[ end trace 597e23e7a88288b4 ]---
    [ 0.704000] usb 2-1: New USB device found, idVendor=0bda, idProduct=0158
    [ 0.704000] usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
    [ 0.704000] usb 2-1: Product: USB2.0-CRW
    [ 0.704000] usb 2-1: Manufacturer: Generic
    [ 0.704000] usb 2-1: SerialNumber: 20071114173400000
    [ 0.708000] usb-storage 2-1:1.0: USB Mass Storage device detected
    [ 0.712000] scsi2 : usb-storage 2-1:1.0
    [ 0.824000] usb 2-2: new high-speed USB device number 3 using ehci-pci
The second one was emitted when I loaded rtl8187 and connected to the access point:
    [ 64.532000] cfg80211: Calling CRDA to update world regulatory domain
    [ 65.120000] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
    [ 65.144000] ieee80211 phy0: hwaddr 00:17:c4:5a:11:53, RTL8187BvE V0 + rtl8225z2, rfkill mask 2
    [ 65.180000] rtl8187: Customer ID is 0x00
    [ 65.188000] rtl8187: wireless switch is off
    [ 65.188000] usbcore: registered new interface driver rtl8187
    [ 70.832000] rtl8187: wireless radio switch turned on
    [ 76.088000] ------------[ cut here ]------------
    [ 76.088000] WARNING: CPU: 0 PID: 118 at arch/mips/mm/c-r4k.c:676 r4k_dma_cache_inv+0x18c/0x198()
    [ 76.088000] not aligned START: addr 98000000bf02df70 lsize 20
    [ 76.088000] Modules linked in:
    [ 76.088000] arc4 rtl8187 eeprom_93cx6 led_class mac80211 cfg80211 rfkill psmouse loongson2_cpufreq snd_cs5535audio 8139too mii reiserfs loop snd_ac97_codec ac97_bus snd_pcm snd_page_alloc snd_timer snd soundcore ipv6
    [ 76.088000] CPU: 0 PID: 118 Comm: NetworkManager Tainted: G W 3.13.1-e-yeeloong-gaizi #2
    [ 76.088000] Stack :
        0000000000000056 ffffffff84074da4 0000000000000006 ffffffff840755b0
        0000000000000000 0000000000000009 ffffffff847adf70 ffffffff847b0000
        ffffffff846abfe0 ffffffff8472a037 ffffffff847adf70 98000000bf64d070
        0000000000000076 0000000000000000 0000000000000000 000000000000ff40
        98000000bf02d640 ffffffff84410a5c 98000000bfc1b3b8 ffffffff84034334
        ffffffff847b3270 ffffffff84076e9c 98000000bf64cd90 ffffffff846abfe0
        0000000000000000 0000000000000000 0000000000000000 0000000000000000
        0000000000000000 98000000bfc1b300 0000000000000000 ffffffff84034564
        0000000000000000 0000000000000000 00000000000002a4 98000000bfc1b470
        0000000000000009 ffffffff8400c028 ffffffff846ad118 ffffffff84034564
        ...
    [ 76.088000] Call Trace:
    [ 76.088000] [<ffffffff8400c028>] show_stack+0x80/0x98
    [ 76.088000] [<ffffffff84034564>] warn_slowpath_common+0x84/0xb8
    [ 76.088000] [<ffffffff840345d0>] warn_slowpath_fmt+0x38/0x50
    [ 76.088000] [<ffffffff840231cc>] r4k_dma_cache_inv+0x18c/0x198
    [ 76.088000] [<ffffffff8401b178>] mips_dma_map_page+0xb0/0x178
    [ 76.088000] [<ffffffff84305898>] usb_hcd_map_urb_for_dma+0x4e8/0x580
    [ 76.088000] [<ffffffff84305c90>] usb_hcd_submit_urb+0x360/0xa50
    [ 76.088000] [<ffffffffc030bbcc>] rtl8187_start+0x321c/0x3910 [rtl8187]
    [ 76.088000] [<ffffffffc0285b9c>] ieee80211_do_open+0x254/0xfa8 [mac80211]
    [ 76.088000] [<ffffffff84374980>] __dev_open+0x120/0x1a8
    [ 76.088000] [<ffffffff84374cf8>] __dev_change_flags+0xb8/0x1d8
    [ 76.088000] [<ffffffff84374e3c>] dev_change_flags+0x24/0x78
    [ 76.088000] [<ffffffff84385200>] do_setlink+0x330/0x940
    [ 76.088000] [<ffffffff84385c44>] rtnl_newlink+0x31c/0x520
    [ 76.088000] [<ffffffff84384c74>] rtnetlink_rcv_msg+0xac/0x2c0
    [ 76.088000] [<ffffffff84396334>] netlink_rcv_skb+0x10c/0x138
    [ 76.088000] [<ffffffff84384bb4>] rtnetlink_rcv+0x2c/0x40
    [ 76.088000] [<ffffffff843959bc>] netlink_unicast+0x194/0x288
    [ 76.088000] [<ffffffff84395ff0>] netlink_sendmsg+0x3f8/0x4a8
    [ 76.088000] [<ffffffff84354e44>] sock_sendmsg+0x7c/0xc8
    [ 76.092000] [<ffffffff84355c2c>] ___sys_sendmsg+0x344/0x358
    [ 76.092000] [<ffffffff84359070>] __sys_sendmsg+0x48/0xb0
    [ 76.092000] [<ffffffff84015130>] handle_sysn32+0x50/0x7c
    [ 76.092000] ---[ end trace 597e23e7a88288b5 ]---
    [ 76.200000] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
    [ 81.036000] wlan0: authenticate with 14:cf:92:17:29:2c
    [ 81.336000] wlan0: send auth to 14:cf:92:17:29:2c (try 1/3)
    [ 81.340000] wlan0: authenticated
The same kind of warning happens even without loading rtl8187, so something must be wrong in usb_hcd_map_urb_for_dma.
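The two warnings can be checked by hand. The sketch below redoes the arithmetic in Python, assuming the WARN messages print addr, size and lsize in hexadecimal (the patch's format string is not shown in this report, so that is an assumption); the function names are made up for illustration.

```python
LSIZE = 0x20  # "lsize 20" from the warnings, read as hexadecimal (assumption)


def start_offset(addr, lsize=LSIZE):
    """Offset of the buffer start within its cache line (0 means aligned)."""
    return addr % lsize


def end_offset(addr, size, lsize=LSIZE):
    """Offset of the buffer end within its cache line (0 means aligned)."""
    return (addr + size) % lsize


# First warning: "not aligned END: addr 98000000bf38ab80 size 12 lsize 20"
addr1, size1 = 0x98000000BF38AB80, 0x12
print(start_offset(addr1), end_offset(addr1, size1))

# Second warning: "not aligned START: addr 98000000bf02df70 lsize 20"
addr2 = 0x98000000BF02DF70
print(start_offset(addr2))
```

Read this way, the first buffer starts on a line boundary but is 0x12 = 18 bytes long, which is the length of a USB device descriptor and fits the usb_get_device_descriptor call in the trace. The second buffer starts 0x10 bytes into a line.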
Well, it does not align the mapped buffer to the cache line size. That is normally not required on more commodity architectures. Actually the buffer address and size come from the upper layers, i.e. rtl8187_start or usb_get_descriptor. Fixing all the cases where we pass an unaligned buffer to usb_hcd_submit_urb() is practically impossible. I'm not sure how this should be fixed; it should probably be discussed on the linux-mips and linux-usb mailing lists. For now I'll prepare a patch for rtl8187 which should fix the corruption when that driver is in use.
Created attachment 125361 rtl8187_cache_align.patch Possible fix for rtl8187. Please test it without any other patches to see whether it makes the oopses originally reported in this bug go away.
Yes, the oopses have disappeared and the kernel no longer panics.
I posted the patch: http://marc.info/?l=linux-wireless&m=139206817709511&w=2 It is still (theoretically) possible for the USB HCD or some other driver to corrupt memory on those MIPS machines. But that was always possible, even without commit a16dad7763420a3b46cff1e703a9070827796cfc. Ralf, the MIPS kernel maintainer, is aware of that. Hopefully all modern MIPS processors will come with hardware DMA coherency. I think this bug can be closed.
Finally, the bug has been fixed :) This is good news for the whole Loongson community. Thank you very much for your attention and your work on the bug.
Patch has been in Linux since 3.14 (commit b6213e413a4e0c66548153516b074df14f9d08e0). Please mark as fixed.
(In reply to Matt Turner from comment #27)
> Patch has been in Linux since 3.14 (commit
> b6213e413a4e0c66548153516b074df14f9d08e0). Please mark as fixed.
I don't have permission to do that. The original reporter has disappeared.
I did not disappear. I just thought Stanislaw would do it because he understands the fix.