Bug 54391

Summary: USB wireless driver rtl8187 on mips64 3.8.0 panics when receiving frames
Product: Drivers
Reporter: Petr Pisar (petr.pisar)
Component: network-wireless
Assignee: drivers_network-wireless (drivers_network-wireless)
Status: RESOLVED CODE_FIX
Severity: normal
CC: biergaizi2009, mattst88, oliva, stf_xl
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.8.0
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: mips_dma_align_warn.patch, mips_dma_align_warn_v2.patch, rtl8187_cache_align.patch

Description Petr Pisar 2013-02-24 17:47:01 UTC
I've just compiled 3.8.0 for my Yeeloong laptop (mips64) and found that scanning for wireless networks in managed mode panics the kernel.

I have a USB wireless card reported by the rtl8187 driver as "RTL8187BvE V0 + rtl8225z2". If I start wpa_supplicant or run iw dev wlan0 scan, the kernel panics.

This is a regression from 3.7.6. The only option I remember enabling is transparent huge pages (which are probably not supported by the CPU anyway). Other USB traffic, such as copying data from mass storage, works.

Unfortunately the panic message is quite long. I can actually see 3 different oopses scrolling by. Until I find a way to get the panic log to a different machine, I can only provide the last screen, copied by hand:

[former lines scrolled away]
usb_hcd_giveback_urb
ehci_urb_done
qh_completions
ehci_irq
usb_hcd_irq
handle_irq_event_percpu
handle_irq_event
handle_level_irq
generic_handle_irq
do_IRQ
ret_from_irq
__do_soft_irq
do_soft_irq
irq_exit
ret_from_irq
die
do_page_fault
resume_userspace_check
kthread_data
wq_worker_sleeping
__schedule
die
do_page_fault
resume_userspace
ieee80211_iface_work
process_one_work
worker_thread
kthread
ret_from_kernel_thread

Code: fe000000 fe000008 fc830008 <fc640000> 0040202d 3c02893d 64429c80 3c01cfff 3421ffff


The last panic contains some functions from the rtl8187 driver. The one quoted here does not. I'm sorry if this is not very helpful.
Comment 1 Petr Pisar 2013-02-24 20:16:38 UTC
I recompiled the kernel without huge pages and it panics as well. Now I got a better traceback, with these deepest calls:

[<ffffffffc0406a44>] rtl8187_rx_cb+0x84/0x3f8 [rtl8187]
[<ffffffffc00a4a60>] usb_hcd_giveback_urb+0xa0/0x180 [usbcore]
[...]
Comment 2 Tom Li 2014-01-28 19:20:50 UTC
According to reports from a few users, there has been a major bug in the rtl8187 kernel driver since Linux 3.8. We tested Linux 3.10, 3.11 and 3.12 and confirmed that the bug exists in the vanilla kernel on the Loongson 8089D laptop.

It can be reliably reproduced by connecting to a WPA-encrypted access point with wpa_supplicant.

Panic:
    ieee80211_tx_status_irqsafe+0x40/0x483 [mac80211]
    rtl8187_tx_cb+0x204/0x204 [rtl8187]
    usb_hcd_giveback_urb+0xc0/0x2c0
    ....
    loongson2_cpu_wait+0xa0/0xd8 [loongson2_cpufreq]  -> Note: it still panics if I blacklist this module, so the module itself is not the cause.

I can't provide a 100% accurate text of the backtraces, because I'm not an OCR program. Please see the photos for the full panic information. There are 3 different panics. The first one contains almost the same backtrace as Petr Pisar's.

    1. http://img.vim-cn.com/75/3d64b98b20607d6efbfc689508a7fdf53f8820.jpeg
    2. http://img.vim-cn.com/5d/d6fc985a61c7e9e9c435c68fb796f6ec9b8b9f.jpeg
    3. http://img.vim-cn.com/11/7d0014d2b19615b7e51be4f08bcc14a3491450.jpeg
Comment 3 Stanislaw Gruszka 2014-01-28 21:11:15 UTC
There are only 3 changes to rtl8187 between versions 3.7 and 3.8.

[stasiu@localhost linux]$ git log --pretty=oneline v3.7..v3.8 -- drivers/net/wireless/rtl818x/
fd549f135c43d60b92aff7cfbbfdf4e79b6cc631 rtl8187: remove __dev* attributes
fb4e899dea7ea5e540db1deb71442d37bb8100fc rtl8187: remove __dev* attributes
f4bda337bbb6e245e2a07f344990adeb6a70ff35 mac80211: support RX_FLAG_MACTIME_END

None of them can be responsible for this bug. This seems to be a problem in another subsystem, such as USB or networking. The only way to move this bug forward is probably bisection.
Comment 4 Tom Li 2014-01-28 22:10:02 UTC
Quite odd... I'll start testing and bisecting now.
Comment 5 Tom Li 2014-01-29 05:00:00 UTC
While bisecting, some versions failed to build because of the same bug:

> On MIPS if SPARSEMEM is enabled we've got this:

> In file included from 
> /home/kas/git/public/linux/arch/mips/include/asm/pgtable.h:552,
>                 from include/linux/mm.h:44,
>                 from arch/mips/kernel/asm-offsets.c:14:
> include/asm-generic/pgtable.h: In function ‘my_zero_pfn’:
> include/asm-generic/pgtable.h:466: error: implicit declaration of function
> ‘page_to_section’
> In file included from arch/mips/kernel/asm-offsets.c:14:
> include/linux/mm.h: At top level:
> include/linux/mm.h:738: error: conflicting types for ‘page_to_section’
> include/asm-generic/pgtable.h:466: note: previous implicit declaration of
> ‘page_to_section’ was here

> Due header files inter-dependencies, the only way I see to fix it is
> convert my_zero_pfn() for __HAVE_COLOR_ZERO_PAGE to macros.

I applied 3.8-rc1-build-failure-with-MIPS-SPARSEMEM.patch to work around the issue and continued my testing.

Finally, my result is:

git bisect start
# bad: [1800098549fc310cffffefdcb3722adaad0edda8] ARM: OMAP: Fix build breakage due to missing include in i2c.c
git bisect bad 1800098549fc310cffffefdcb3722adaad0edda8
# bad: [1800098549fc310cffffefdcb3722adaad0edda8] ARM: OMAP: Fix build breakage due to missing include in i2c.c
git bisect bad 1800098549fc310cffffefdcb3722adaad0edda8
# good: [29594404d7fe73cd80eaa4ee8c43dcc53970c60e] Linux 3.7
git bisect good 29594404d7fe73cd80eaa4ee8c43dcc53970c60e
# bad: [a13eea6bd9ee62ceacfc5243d54c84396bc86cb4] Merge tag 'for-3.8-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
git bisect bad a13eea6bd9ee62ceacfc5243d54c84396bc86cb4
# bad: [a49f0d1ea3ec94fc7cf33a7c36a16343b74bd565] Linux 3.8-rc1
git bisect bad a49f0d1ea3ec94fc7cf33a7c36a16343b74bd565
# good: [6be35c700f742e911ecedd07fcc43d4439922334] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
git bisect good 6be35c700f742e911ecedd07fcc43d4439922334
# bad: [027c0a6af42efa4f2f6034421349bd26a3ca4923] ARM: imx: Move platform-mx2-emma to arch/arm/mach-imx/devices
git bisect bad 027c0a6af42efa4f2f6034421349bd26a3ca4923
# bad: [027c0a6af42efa4f2f6034421349bd26a3ca4923] ARM: imx: Move platform-mx2-emma to arch/arm/mach-imx/devices
git bisect bad 027c0a6af42efa4f2f6034421349bd26a3ca4923
# bad: [15de0599277f3477ddd11766282587f12d214252] Merge branch 'autofs' (patches from Ian Kent)
git bisect bad 15de0599277f3477ddd11766282587f12d214252
# bad: [15de0599277f3477ddd11766282587f12d214252] Merge branch 'autofs' (patches from Ian Kent)
git bisect bad 15de0599277f3477ddd11766282587f12d214252
# good: [046e7d685bc370fd4c879ab6635ad3f69e6673d1] Merge tag 'sound-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect good 046e7d685bc370fd4c879ab6635ad3f69e6673d1
# good: [193c0d682525987db59ac3a24531a77e4947aa95] Merge tag 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
git bisect good 193c0d682525987db59ac3a24531a77e4947aa95
# good: [d2ff4fc557a4c5248b2d99b0d48e47a246d994b2] Merge branch 'for-upstream' of https://github.com/agraf/linux-2.6 into queue
git bisect good d2ff4fc557a4c5248b2d99b0d48e47a246d994b2
# good: [3127f23f013eabe9b58132c05061684c49146ba3] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
git bisect good 3127f23f013eabe9b58132c05061684c49146ba3
# good: [3127f23f013eabe9b58132c05061684c49146ba3] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
git bisect good 3127f23f013eabe9b58132c05061684c49146ba3
# good: [c59b9f92653f102856ca7802af551788c143a3a3] s390/pci: no msleep in potential IRQ context
git bisect good c59b9f92653f102856ca7802af551788c143a3a3
# good: [6a7ed405114b2a53ccd99631b0636aaeabf71b3e] Merge branch 'arm-privcmd-for-3.8' of git://xenbits.xen.org/people/ianc/linux into stable/for-linus-3.8
git bisect good 6a7ed405114b2a53ccd99631b0636aaeabf71b3e
# good: [66cdd0ceaf65a18996f561b770eedde1d123b019] Merge tag 'kvm-3.8-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
git bisect good 66cdd0ceaf65a18996f561b770eedde1d123b019
# good: [66cdd0ceaf65a18996f561b770eedde1d123b019] Merge tag 'kvm-3.8-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
git bisect good 66cdd0ceaf65a18996f561b770eedde1d123b019
# good: [189251705649bdfdf5e5850eb178f8cbfdac5480] ktest: Fix breakage from change of oldnoconfig to olddefconfig
git bisect good 189251705649bdfdf5e5850eb178f8cbfdac5480
# good: [e05a1c6397a73d09389e033b6b2c25c954d2177c] Merge tag 'ktest-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest
git bisect good e05a1c6397a73d09389e033b6b2c25c954d2177c
# good: [e05a1c6397a73d09389e033b6b2c25c954d2177c] Merge tag 'ktest-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest
git bisect good e05a1c6397a73d09389e033b6b2c25c954d2177c
# good: [0259cb02c4004d3088b0999799f8f5c8801f6b97] autofs4 - use simple_empty() for empty directory check
git bisect good 0259cb02c4004d3088b0999799f8f5c8801f6b97
# first bad commit: [15de0599277f3477ddd11766282587f12d214252] Merge branch 'autofs' (patches from Ian Kent)
Comment 6 Tom Li 2014-01-29 05:02:05 UTC
Another interesting thing: 0259cb02c4004d3088b0999799f8f5c8801f6b97 builds without the patch, but from 15de0599277f3477ddd11766282587f12d214252 onwards I have to apply 3.8-rc1-build-failure-with-MIPS-SPARSEMEM.patch.
Comment 7 Stanislaw Gruszka 2014-01-29 06:58:08 UTC
If a commit does not compile or has a bug, you can use git bisect skip for it. Does that help?
Comment 8 Stanislaw Gruszka 2014-01-29 07:12:27 UTC
> # first bad commit: [15de0599277f3477ddd11766282587f12d214252] Merge branch
> 'autofs' (patches from Ian Kent)

That's strange. It seems the bisection went wrong, i.e. you marked some commit as good instead of bad or vice versa. You can verify with 100% certainty that this result is incorrect by compiling the kernel without CONFIG_AUTOFS4_FS.
Comment 9 Stanislaw Gruszka 2014-01-29 07:15:20 UTC
Looks like starting from 046e7d685bc370fd4c879ab6635ad3f69e6673d1 you have only good commits, so this one is probably incorrectly marked as good or 15de0599277f3477ddd11766282587f12d214252 is incorrectly marked as bad.
Comment 10 Stanislaw Gruszka 2014-01-29 07:19:12 UTC
You can also check if building without CONFIG_TRANSPARENT_HUGEPAGE and CONFIG_HUGETLBFS helps.
Comment 11 Tom Li 2014-01-30 19:10:02 UTC
CONFIG_TRANSPARENT_HUGEPAGE and CONFIG_HUGETLBFS are broken on Loongson CPUs; I never use them.
Comment 12 Tom Li 2014-01-30 22:34:54 UTC
Really sorry, I made some mistakes during git bisect. I ran the bisection again, and finally I got it!!

➜  linux git:(51d943f) ✗ git bisect good
a16dad7763420a3b46cff1e703a9070827796cfc is the first bad commit
commit a16dad7763420a3b46cff1e703a9070827796cfc
Author: Ralf Baechle <ralf@linux-mips.org>
Date:   Sat Jun 9 20:48:47 2012 +0100

    MIPS: Fix potencial corruption
    
    Normally r4k_dma_cache_inv should only ever be called with cacheline
    aligned addresses.  If however, it isn't there is the theoretical
    possibility of data corruption.  There is no correct way of handling this
    and anyway, it should only happen if the DMA API is used incorrectly
    so drop
    
    There is a different corruption scenario with these CACHE instructions
    removed but again there is no way of handling this correctly and it can
    be triggered only through incorrect use of the DMA API.
    
    So just get rid of the complexity.
    
    Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    Reported-by: James Rodriguez <jamesr@juniper.net>

:040000 040000 01daf6e6755a15d1d96d5d0e92117af21789afbe 445b1384a39f3f1030cb0fca932f0b03c55ce826 M      arch

I couldn't believe it at first. But I tested the preceding commits and confirmed that this is the first bad commit. I reverted the commit on my 3.12.7 kernel, and the wireless is working again!

So, is it a kernel bug, or does it look like a hardware bug in the CPU/IO of the Loongson platform?
Comment 13 Tom Li 2014-01-30 22:43:49 UTC
I read the commit and found this comment in the code it touches:

/*
 * There is no clearly documented alignment requirement
 * for the cache instruction on MIPS processors and
@@ -644,6 +647,9 @@ static void r4k_dma_cache_inv(unsigned long addr, unsigned long size)
 * hit ops with insufficient alignment.  Solved by
 * aligning the address to cache line size.
*/

I don't have much knowledge of CPUs or I/O subsystems, but given

> There is no clearly documented alignment requirement

I guess different hardware has different implementations. Maybe we need to reconsider how caching is handled on MIPS.
Comment 14 Stanislaw Gruszka 2014-01-31 06:58:37 UTC
Good job on the bisection. The message of the bad commit says:

There is a different corruption scenario with these CACHE instructions
removed but again there is no way of handling this correctly and it can
be triggered only through incorrect use of the DMA API.

So it looks like the DMA API is not being used correctly. The rtl8187 driver does not use DMA directly; that is done by the USB host controller driver. Hence this bug should be reported to the maintainers of the USB HCD driver used by your laptop, on top of which the rtl8187 device runs. Please write an email to them (you can get the addresses using "./scripts/get_maintainer.pl -f drivers/usb/host/PROPER_HCD_DRIVER.c"). Cc Ralf, so he can explain how exactly the DMA API is misused, or perhaps agree that for some CPUs the removed alignment handling should be added back. You can also Cc me.
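
To illustrate the kind of corruption the commit message hints at, here is a toy userspace model, not kernel code. It assumes a single 32-byte cache line shared by a 16-byte DMA buffer and one unrelated byte; all names (toy_line, neighbour_after_dma and so on) are made up for this sketch. Invalidating the line without writing it back first throws away a pending CPU write to the unrelated byte.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 32u /* assumed cache line size for this toy model */

/* One line of "RAM" plus the CPU's cached copy of it (hypothetical model). */
struct toy_line {
    uint8_t ram[LINE_SIZE];
    uint8_t cache[LINE_SIZE];
    int valid; /* the cache holds a copy of the line */
    int dirty; /* the cached copy is newer than RAM */
};

/* Load the line into the cache if it is not there yet. */
static void cache_fill(struct toy_line *l)
{
    if (!l->valid) {
        memcpy(l->cache, l->ram, LINE_SIZE);
        l->valid = 1;
        l->dirty = 0;
    }
}

/* CPU store: lands in the cache only and marks the line dirty. */
static void cpu_write(struct toy_line *l, unsigned off, uint8_t v)
{
    assert(off < LINE_SIZE);
    cache_fill(l);
    l->cache[off] = v;
    l->dirty = 1;
}

/* CPU load: served from the cache. */
static uint8_t cpu_read(struct toy_line *l, unsigned off)
{
    assert(off < LINE_SIZE);
    cache_fill(l);
    return l->cache[off];
}

/* Device write: goes straight to RAM, bypassing the cache. */
static void dma_write(struct toy_line *l, unsigned off,
                      const uint8_t *src, unsigned len)
{
    assert(off + len <= LINE_SIZE);
    memcpy(l->ram + off, src, len);
}

/* Plain invalidate: any dirty data in the line is discarded. */
static void cache_invalidate(struct toy_line *l)
{
    l->valid = 0;
    l->dirty = 0;
}

/* Writeback, then invalidate: dirty data reaches RAM first. */
static void cache_writeback_invalidate(struct toy_line *l)
{
    if (l->valid && l->dirty)
        memcpy(l->ram, l->cache, LINE_SIZE);
    cache_invalidate(l);
}

/*
 * DMA buffer at offsets 0..15, unrelated byte at offset 16.
 * Returns what the CPU reads back for the unrelated byte after the transfer.
 */
static uint8_t neighbour_after_dma(int writeback_first)
{
    struct toy_line l;
    uint8_t frame[16];

    memset(&l, 0, sizeof l);
    memset(frame, 0xAA, sizeof frame);

    cpu_write(&l, 16, 0x5A); /* neighbour updated, line is now dirty */
    if (writeback_first)
        cache_writeback_invalidate(&l);
    else
        cache_invalidate(&l);
    dma_write(&l, 0, frame, sizeof frame);
    return cpu_read(&l, 16);
}
```

In this model neighbour_after_dma(1) still returns 0x5A, while neighbour_after_dma(0) returns 0 because the CPU's write never reached RAM. Whether this is exactly what happens on the Loongson is an assumption; the model only shows why a buffer that shares a cache line with other data is fragile.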
Comment 15 Tom Li 2014-01-31 16:26:44 UTC
I'm not sure where the implementation of my HCD is. The YeeLoong 8089D uses an AMD CS5536 controller:

> 00:0e.4 USB controller: Advanced Micro Devices, Inc. [AMD] CS5536 [Geode
> companion] OHC (rev 02) 
> 00:0e.5 USB controller: Advanced Micro Devices, Inc. [AMD] CS5536 [Geode
> companion] EHC (rev 02) 

I searched the source tree:

➜ linux git:(51d943f) ✗ find ./ | grep 5536 | grep -v "\.o"

./include/config/cs5536
./include/config/cs5536/mfgpt.h
./include/config/cs5536.h
./arch/mips/loongson/common/cs5536
./arch/mips/loongson/common/cs5536/cs5536_pci.c
./arch/mips/loongson/common/cs5536/cs5536_ide.c
./arch/mips/loongson/common/cs5536/Makefile
./arch/mips/loongson/common/cs5536/cs5536_ehci.c
./arch/mips/loongson/common/cs5536/cs5536_isa.c
./arch/mips/loongson/common/cs5536/cs5536_ohci.c
./arch/mips/loongson/common/cs5536/cs5536_acc.c
./arch/mips/loongson/common/cs5536/cs5536_mfgpt.c
./arch/mips/loongson/common/cs5536/modules.builtin
./arch/mips/include/asm/mach-loongson/cs5536
./arch/mips/include/asm/mach-loongson/cs5536/cs5536_mfgpt.h
./arch/mips/include/asm/mach-loongson/cs5536/cs5536_vsm.h
./arch/mips/include/asm/mach-loongson/cs5536/cs5536_pci.h
./arch/mips/include/asm/mach-loongson/cs5536/cs5536.h
./drivers/ata/pata_cs5536.c
./drivers/ide/cs5536.c
./drivers/usb/gadget/amd5536udc.h
./drivers/usb/gadget/amd5536udc.c

I think none of them implements an HCD driver.
Comment 16 Stanislaw Gruszka 2014-01-31 18:44:14 UTC
At first glance cs5536_ohci.c and cs5536_ehci.c look like the right files, but indeed they do not implement an HCD driver. They may be related to this bug anyway, as they seem to be used as a PCI abstraction for the actual HCD drivers, which are ehci-pci and ohci-pci (lsmod and/or "lsusb -t" should confirm that).

In any case you can just report to generic mailing lists:
linux-mips@linux-mips.org and linux-usb@vger.kernel.org
Comment 17 Tom Li 2014-01-31 19:41:41 UTC
You are right. lsusb -t shows Driver=ohci-pci and ehci-pci.

I'll report the issue to LinuxMIPS mailing list.
Comment 18 Stanislaw Gruszka 2014-02-07 20:18:20 UTC
I can see your report did not get much attention :-(
http://www.linux-mips.org/archives/linux-mips/2014-02/msg00001.html

Let's try to continue debugging it here ...
Comment 19 Stanislaw Gruszka 2014-02-07 20:22:21 UTC
Created attachment 125161
mips_dma_align_warn.patch

This is a revert of the bad commit together with WARN prints, which will show a call trace whenever r4k_dma_cache_inv is called with unaligned addresses. This should show us where the DMA API is misused.
Comment 20 Stanislaw Gruszka 2014-02-07 20:43:44 UTC
Created attachment 125171
mips_dma_align_warn_v2.patch

This version corrects the end-address alignment check.
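
For readers without the attachment, the kind of check described here can be sketched in plain C. This is not the attached patch: the helper name dma_range_misalignment and the flag names are hypothetical, and the sketch assumes the cache line size is a power of two. The start is tested on addr and the end on addr + size, one past the last byte.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result flags for this sketch. */
enum {
    DMA_START_UNALIGNED = 1, /* addr is not on a cache line boundary */
    DMA_END_UNALIGNED   = 2  /* addr + size is not on a cache line boundary */
};

/*
 * Report which ends of [addr, addr + size) are not cache line aligned.
 * lsize is the cache line size in bytes and must be a power of two.
 */
static int dma_range_misalignment(uint64_t addr, uint64_t size, uint64_t lsize)
{
    uint64_t mask = lsize - 1;
    int flags = 0;

    assert(lsize != 0 && (lsize & mask) == 0);

    if (addr & mask)
        flags |= DMA_START_UNALIGNED;
    if ((addr + size) & mask)
        flags |= DMA_END_UNALIGNED;
    return flags;
}
```

A range that returns 0 covers whole cache lines only, so invalidating it cannot touch anyone else's data.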
Comment 21 Tom Li 2014-02-08 11:12:40 UTC
Thanks for your WARNs, I got two warnings.

The first one occurred during early boot:

[    0.620000] TCP: cubic registered
[    0.620000] NET: Registered protocol family 17
[    0.620000] registered taskstats version 1
[    0.692000] ------------[ cut here ]------------
[    0.692000] WARNING: CPU: 0 PID: 16 at arch/mips/mm/c-r4k.c:677 r4k_dma_cache_inv+0x150/0x198()
[    0.692000] not aligned END: addr 98000000bf38ab80 size 12 lsize 20
[    0.692000] Modules linked in:

[    0.692000] CPU: 0 PID: 16 Comm: khubd Not tainted 3.13.1-e-yeeloong-gaizi #2
[    0.692000] Stack : 0000000000000041 ffffffff847b0000 0000000000000006 ffffffff840755b0
	  0000000000000000 000000000000000b ffffffff847adf70 ffffffff847b0000
	  ffffffff846abfe0 ffffffff8472a037 ffffffff847adf70 98000000bf0cab90
	  0000000000000010 0000000000000000 98000000bf346000 98000000bf38ab80
	  0000000000000080 ffffffff84410a5c 98000000bf13f898 ffffffff840343d8
	  98000000bf13f8a8 ffffffff84076e9c 98000000bf0ca8b0 ffffffff846abfe0
	  0000000000000000 0000000000000000 0000000000000000 0000000000000000
	  0000000000000000 98000000bf13f7e0 0000000000000000 ffffffff84034564
	  0000000000000000 0000000000000000 00000000000002a5 98000000bf13f950
	  0000000000000009 ffffffff8400c028 ffffffff846ad118 ffffffff84034564
	  ...
[    0.692000] Call Trace:
[    0.692000] [<ffffffff8400c028>] show_stack+0x80/0x98
[    0.692000] [<ffffffff84034564>] warn_slowpath_common+0x84/0xb8
[    0.692000] [<ffffffff840345d0>] warn_slowpath_fmt+0x38/0x50
[    0.692000] [<ffffffff84023190>] r4k_dma_cache_inv+0x150/0x198
[    0.692000] [<ffffffff8401b178>] mips_dma_map_page+0xb0/0x178
[    0.692000] [<ffffffff84305898>] usb_hcd_map_urb_for_dma+0x4e8/0x580
[    0.692000] [<ffffffff84305c90>] usb_hcd_submit_urb+0x360/0xa50
[    0.692000] [<ffffffff843082d8>] usb_start_wait_urb+0x50/0x100
[    0.692000] [<ffffffff8430845c>] usb_control_msg+0xd4/0x138
[    0.692000] [<ffffffff843094a0>] usb_get_descriptor+0xa8/0x150
[    0.692000] [<ffffffff84309db8>] usb_get_device_descriptor+0x58/0xb8
[    0.692000] [<ffffffff842ffb8c>] hub_port_init+0x454/0xab0
[    0.692000] [<ffffffff84301588>] hub_thread+0x630/0x1410
[    0.692000] [<ffffffff84057e10>] kthread+0xe0/0xf8
[    0.692000] [<ffffffff84006bd0>] ret_from_kernel_thread+0x20/0x28

[    0.692000] ---[ end trace 597e23e7a88288b4 ]---
[    0.704000] usb 2-1: New USB device found, idVendor=0bda, idProduct=0158
[    0.704000] usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[    0.704000] usb 2-1: Product: USB2.0-CRW
[    0.704000] usb 2-1: Manufacturer: Generic
[    0.704000] usb 2-1: SerialNumber: 20071114173400000
[    0.708000] usb-storage 2-1:1.0: USB Mass Storage device detected
[    0.712000] scsi2 : usb-storage 2-1:1.0
[    0.824000] usb 2-2: new high-speed USB device number 3 using ehci-pci

The second one was emitted when I loaded rtl8187 and connected to the access point:

[   64.532000] cfg80211: Calling CRDA to update world regulatory domain
[   65.120000] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[   65.144000] ieee80211 phy0: hwaddr 00:17:c4:5a:11:53, RTL8187BvE V0 + rtl8225z2, rfkill mask 2
[   65.180000] rtl8187: Customer ID is 0x00
[   65.188000] rtl8187: wireless switch is off
[   65.188000] usbcore: registered new interface driver rtl8187
[   70.832000] rtl8187: wireless radio switch turned on
[   76.088000] ------------[ cut here ]------------
[   76.088000] WARNING: CPU: 0 PID: 118 at arch/mips/mm/c-r4k.c:676 r4k_dma_cache_inv+0x18c/0x198()
[   76.088000] not aligned START: addr 98000000bf02df70 lsize 20
[   76.088000] Modules linked in:
[   76.088000]  arc4 rtl8187 eeprom_93cx6 led_class mac80211 cfg80211 rfkill psmouse loongson2_cpufreq snd_cs5535audio 8139too mii reiserfs loop snd_ac97_codec ac97_bus snd_pcm snd_page_alloc snd_timer snd soundcore ipv6
[   76.088000] CPU: 0 PID: 118 Comm: NetworkManager Tainted: G        W    3.13.1-e-yeeloong-gaizi #2
[   76.088000] Stack : 0000000000000056 ffffffff84074da4 0000000000000006 ffffffff840755b0
	  0000000000000000 0000000000000009 ffffffff847adf70 ffffffff847b0000
	  ffffffff846abfe0 ffffffff8472a037 ffffffff847adf70 98000000bf64d070
	  0000000000000076 0000000000000000 0000000000000000 000000000000ff40
	  98000000bf02d640 ffffffff84410a5c 98000000bfc1b3b8 ffffffff84034334
	  ffffffff847b3270 ffffffff84076e9c 98000000bf64cd90 ffffffff846abfe0
	  0000000000000000 0000000000000000 0000000000000000 0000000000000000
	  0000000000000000 98000000bfc1b300 0000000000000000 ffffffff84034564
	  0000000000000000 0000000000000000 00000000000002a4 98000000bfc1b470
	  0000000000000009 ffffffff8400c028 ffffffff846ad118 ffffffff84034564
	  ...
[   76.088000] Call Trace:
[   76.088000] [<ffffffff8400c028>] show_stack+0x80/0x98
[   76.088000] [<ffffffff84034564>] warn_slowpath_common+0x84/0xb8
[   76.088000] [<ffffffff840345d0>] warn_slowpath_fmt+0x38/0x50
[   76.088000] [<ffffffff840231cc>] r4k_dma_cache_inv+0x18c/0x198
[   76.088000] [<ffffffff8401b178>] mips_dma_map_page+0xb0/0x178
[   76.088000] [<ffffffff84305898>] usb_hcd_map_urb_for_dma+0x4e8/0x580
[   76.088000] [<ffffffff84305c90>] usb_hcd_submit_urb+0x360/0xa50
[   76.088000] [<ffffffffc030bbcc>] rtl8187_start+0x321c/0x3910 [rtl8187]
[   76.088000] [<ffffffffc0285b9c>] ieee80211_do_open+0x254/0xfa8 [mac80211]
[   76.088000] [<ffffffff84374980>] __dev_open+0x120/0x1a8
[   76.088000] [<ffffffff84374cf8>] __dev_change_flags+0xb8/0x1d8
[   76.088000] [<ffffffff84374e3c>] dev_change_flags+0x24/0x78
[   76.088000] [<ffffffff84385200>] do_setlink+0x330/0x940
[   76.088000] [<ffffffff84385c44>] rtnl_newlink+0x31c/0x520
[   76.088000] [<ffffffff84384c74>] rtnetlink_rcv_msg+0xac/0x2c0
[   76.088000] [<ffffffff84396334>] netlink_rcv_skb+0x10c/0x138
[   76.088000] [<ffffffff84384bb4>] rtnetlink_rcv+0x2c/0x40
[   76.088000] [<ffffffff843959bc>] netlink_unicast+0x194/0x288
[   76.088000] [<ffffffff84395ff0>] netlink_sendmsg+0x3f8/0x4a8
[   76.088000] [<ffffffff84354e44>] sock_sendmsg+0x7c/0xc8
[   76.092000] [<ffffffff84355c2c>] ___sys_sendmsg+0x344/0x358
[   76.092000] [<ffffffff84359070>] __sys_sendmsg+0x48/0xb0
[   76.092000] [<ffffffff84015130>] handle_sysn32+0x50/0x7c

[   76.092000] ---[ end trace 597e23e7a88288b5 ]---
[   76.200000] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   81.036000] wlan0: authenticate with 14:cf:92:17:29:2c
[   81.336000] wlan0: send auth to 14:cf:92:17:29:2c (try 1/3)
[   81.340000] wlan0: authenticated


The same warnings appear even without loading rtl8187, so something must be wrong in usb_hcd_map_urb_for_dma.
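
The arithmetic behind the two warnings can be worked through with a small helper. This is only a sketch: it assumes the numbers in the WARN lines are hexadecimal (size 0x12 = 18 bytes would match the length of a USB device descriptor, and lsize 0x20 = 32 bytes), and the struct and function names are made up. It computes which cache lines a buffer touches and how many bytes in the first and last line belong to something else.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical summary of how a buffer sits in its cache lines. */
struct line_sharing {
    uint64_t first_line;   /* address of the first cache line touched */
    uint64_t last_line;    /* address of the last cache line touched */
    uint64_t head_foreign; /* bytes in the first line before the buffer */
    uint64_t tail_foreign; /* bytes in the last line after the buffer */
};

/* lsize must be a power of two and size must be non-zero. */
static struct line_sharing compute_line_sharing(uint64_t addr, uint64_t size,
                                                uint64_t lsize)
{
    struct line_sharing s;
    uint64_t mask = lsize - 1;
    uint64_t end = addr + size; /* one past the last byte of the buffer */

    assert(size > 0 && lsize != 0 && (lsize & mask) == 0);

    s.first_line = addr & ~mask;
    s.last_line = (end - 1) & ~mask;
    s.head_foreign = addr - s.first_line;
    s.tail_foreign = (s.last_line + lsize) - end;
    return s;
}
```

Under those assumptions, the first warning (addr 98000000bf38ab80, size 12) describes a buffer that starts on a line boundary and leaves 14 bytes of the same line to other data. The second warning (addr 98000000bf02df70) does not print a size, so only the head can be computed: the buffer starts 16 bytes into the line at 98000000bf02df60.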
Comment 22 Stanislaw Gruszka 2014-02-08 20:12:33 UTC
Well, usb_hcd_map_urb_for_dma() does not align the mapped buffer to the cache line size. That is normally not required on more commodity architectures. The buffer address and size actually come from the upper layers, i.e. rtl8187_start or usb_get_descriptor. Fixing all the cases where an unaligned buffer is passed to usb_hcd_submit_urb() is practically impossible. I'm not sure how this should be fixed; it should probably be discussed on the linux-mips and linux-usb mailing lists.

For now I'll prepare a patch for rtl8187 which will possibly fix the corruption when that driver is in use.
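
As a general illustration of what "aligned to the cache line size" means for a buffer, here is a userspace sketch. It is not the rtl8187 patch and uses no kernel API: it relies on C11 aligned_alloc(), and the helper names are hypothetical. The buffer starts on a line boundary and its size is rounded up to a whole number of lines, so no other data shares its cache lines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Round size up to a multiple of lsize; lsize must be a power of two. */
static size_t round_up_to_line(size_t size, size_t lsize)
{
    assert(lsize != 0 && (lsize & (lsize - 1)) == 0);
    return (size + lsize - 1) & ~(lsize - 1);
}

/*
 * Allocate a buffer that owns every cache line it touches.
 * Userspace stand-in only; the caller releases it with free().
 */
static void *alloc_line_owned(size_t size, size_t lsize)
{
    assert(size > 0);
    /* aligned_alloc() wants the size to be a multiple of the alignment. */
    return aligned_alloc(lsize, round_up_to_line(size, lsize));
}
```

With a 32-byte line, an 18-byte request becomes a 32-byte allocation that starts on a line boundary, at the cost of some padding.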
Comment 23 Stanislaw Gruszka 2014-02-08 20:20:30 UTC
Created attachment 125361
rtl8187_cache_align.patch

Possible fix for rtl8187. Please test it without any other patches to see if it makes the oopses originally reported in this bug go away.
Comment 24 Tom Li 2014-02-09 15:45:50 UTC
Yes, the oopses disappeared and the kernel no longer panics.
Comment 25 Stanislaw Gruszka 2014-02-11 17:00:12 UTC
I posted the patch:
http://marc.info/?l=linux-wireless&m=139206817709511&w=2

It is still (theoretically) possible for the USB HCD or some other driver to corrupt memory on those MIPS machines. But that was always possible, even without commit a16dad7763420a3b46cff1e703a9070827796cfc.

Ralf, the kernel MIPS maintainer, is aware of that. Hopefully all modern MIPS processors will come with hardware DMA coherency. I think this bug can be closed.
Comment 26 Tom Li 2014-02-11 17:20:38 UTC
Finally, the bug has been fixed :) This is good news for the whole Loongson community.

Thanks very much for your attention and for working on the bug.
Comment 27 Matt Turner 2016-02-22 06:49:01 UTC
Patch has been in Linux since 3.14 (commit b6213e413a4e0c66548153516b074df14f9d08e0). Please mark as fixed.
Comment 28 Tom Li 2016-02-22 06:52:24 UTC
(In reply to Matt Turner from comment #27)
> Patch has been in Linux since 3.14 (commit
> b6213e413a4e0c66548153516b074df14f9d08e0). Please mark as fixed.

I don't have permission to do that. The original reporter has disappeared.
Comment 29 Petr Pisar 2016-02-22 17:26:49 UTC
I did not disappear. I just thought Stanislaw would do it because he understands the fix.