Bug 91711 - 3w-9xxx: DMA-API: device driver tries to free DMA memory it has not allocated
Summary: 3w-9xxx: DMA-API: device driver tries to free DMA memory it has not allocated
Status: NEW
Alias: None
Product: SCSI Drivers
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 high
Assignee: linux-scsi@vger.kernel.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-01-21 19:17 UTC by Orion
Modified: 2016-03-18 18:59 UTC
CC List: 3 users

See Also:
Kernel Version: 3.18.3
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Test Fix (1.53 KB, patch)
2016-03-17 20:36 UTC, nickkrause
Test Patch (3.97 KB, patch)
2016-03-18 13:34 UTC, nickkrause

Description Orion 2015-01-21 19:17:25 UTC
3w-9xxx 0000:03:03.0: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x0000000000000000] [size=0 bytes]
This can randomly cause a kernel panic.

Here is the trace:

[10205.263190] ------------[ cut here ]------------
[10205.271182] WARNING: CPU: 3 PID: 0 at lib/dma-debug.c:1080 check_unmap+0x8ea/0x9e0()
[10205.273087] 3w-9xxx 0000:03:03.0: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x0000000000000000] [size=0 bytes]
[10205.273087] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache nfnetlink_queue nfnetlink_log nfnetlink tun bridge stp llc bonding ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle ts_kmp xt_limit nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 xt_hashlimit nf_conntrack_ipv6 nf_defrag_ipv6 xt_string xt_multiport xt_conntrack nf_conntrack ip6table_filter ip6_tables iTCO_wdt ppdev iTCO_vendor_support lpc_ich mfd_core serio_raw pcspkr e752x_edac parport_pc i2c_i801 edac_core parport shpchp nfsd auth_rpcgss nfs_acl lockd grace sunrpc ata_generic pata_acpi tg3 e1000 3w_9xxx
[10205.273087] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.18.3 #1
[10205.273087] Hardware name: Supermicro X6DH8/X6DH8, BIOS 6.00 08/16/2007
[10205.273087]  0000000000000000 fa5e96fc2665aa72 ffff88022fd83b68 ffffffff817a573e
[10205.273087]  0000000000000000 ffff88022fd83bc0 ffff88022fd83ba8 ffffffff81093601
[10205.273087]  ffff88022fd83ba8 ffff880223d47a60 ffff88022fd83cb8 0000000000000000
[10205.273087] Call Trace:
[10205.273087]  <IRQ>  [<ffffffff817a573e>] dump_stack+0x4f/0x7c
[10205.273087]  [<ffffffff81093601>] warn_slowpath_common+0x81/0xa0
[10205.273087]  [<ffffffff81093675>] warn_slowpath_fmt+0x55/0x70
[10205.273087]  [<ffffffff813c68db>] ? debug_dma_mapping_error+0x7b/0x90
[10205.273087]  [<ffffffff813c8a9a>] check_unmap+0x8ea/0x9e0
[10205.273087]  [<ffffffff817ac021>] ? _raw_spin_unlock_irqrestore+0x21/0x40
[10205.273087]  [<ffffffff813c8d45>] debug_dma_unmap_sg+0x75/0x150
[10205.273087]  [<ffffffff814f73d3>] scsi_dma_unmap+0x73/0xc0
[10205.273087]  [<ffffffffa00016a5>] twa_interrupt+0x585/0x770 [3w_9xxx]
[10205.273087]  [<ffffffff810fea9b>] ? __hrtimer_start_range_ns+0x1eb/0x480
[10205.273087]  [<ffffffff810eba7e>] handle_irq_event_percpu+0x3e/0x1f0
[10205.273087]  [<ffffffff810ebc71>] handle_irq_event+0x41/0x70
[10205.273087]  [<ffffffff810eecc3>] handle_fasteoi_irq+0xc3/0x170
[10205.273087]  [<ffffffff81018752>] handle_irq+0xb2/0x1a0
[10205.273087]  [<ffffffff810b42ce>] ? atomic_notifier_call_chain+0x3e/0x50
[10205.273087]  [<ffffffff817af86d>] do_IRQ+0x5d/0x100
[10205.273087]  [<ffffffff817ad6ed>] common_interrupt+0x6d/0x6d
[10205.273087]  <EOI>  [<ffffffff8110e6b0>] ? tick_nohz_stop_sched_tick+0x2b0/0x300
[10205.273087]  [<ffffffff8105aae6>] ? native_safe_halt+0x6/0x10
[10205.273087]  [<ffffffff813b6137>] ? debug_smp_processor_id+0x17/0x20
[10205.273087]  [<ffffffff810212cb>] default_idle+0x1b/0xf0
[10205.273087]  [<ffffffff81021dcf>] arch_cpu_idle+0xf/0x20
[10205.273087]  [<ffffffff810d6d26>] cpu_startup_entry+0x456/0x510
[10205.273087]  [<ffffffff817ac021>] ? _raw_spin_unlock_irqrestore+0x21/0x40
[10205.273087]  [<ffffffff8110b44c>] ? clockevents_register_device+0xbc/0x130
[10205.273087]  [<ffffffff81049281>] start_secondary+0x1b1/0x200
[10205.273087] ---[ end trace f8b7f072834e5aba ]---
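
The warning in the trace comes from the kernel's DMA debugging code (check_unmap in lib/dma-debug.c), reached through scsi_dma_unmap from twa_interrupt, and it reports an unmap of address 0 with size 0, that is, a buffer with no recorded mapping. The following is a minimal user-space sketch of that bookkeeping and of one general way a driver can avoid this class of warning: remember per command whether a mapping was made, and unmap only then. It is a toy model, not kernel code; every toy_* name is hypothetical, and it does not claim to reproduce either attached patch or any upstream commit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Toy model only: NOT lib/dma-debug.c. All toy_* names are hypothetical. */
#define TOY_MAX_MAPPINGS 16

struct toy_mapping {
    uint64_t dev_addr;
    size_t size;
    bool in_use;
};

static struct toy_mapping toy_table[TOY_MAX_MAPPINGS];

/* Record a mapping. Returns false if the table is full. */
static bool toy_dma_map(uint64_t dev_addr, size_t size)
{
    for (size_t i = 0; i < TOY_MAX_MAPPINGS; i++) {
        if (!toy_table[i].in_use) {
            toy_table[i].dev_addr = dev_addr;
            toy_table[i].size = size;
            toy_table[i].in_use = true;
            return true;
        }
    }
    return false;
}

/* Release a mapping. Warns and returns false if no matching entry exists. */
static bool toy_dma_unmap(uint64_t dev_addr, size_t size)
{
    for (size_t i = 0; i < TOY_MAX_MAPPINGS; i++) {
        if (toy_table[i].in_use && toy_table[i].dev_addr == dev_addr &&
            toy_table[i].size == size) {
            toy_table[i].in_use = false;
            return true;
        }
    }
    fprintf(stderr,
            "toy DMA-API: driver tries to free DMA memory it has not allocated "
            "[device address=0x%016llx] [size=%zu bytes]\n",
            (unsigned long long)dev_addr, size);
    return false;
}

/* Hypothetical per-command state kept by the driver. */
struct toy_cmd {
    uint64_t dev_addr;
    size_t size;
    bool mapped; /* set only when toy_cmd_map() succeeded */
};

/* Map a buffer for a command and remember that we did so. */
static bool toy_cmd_map(struct toy_cmd *cmd, uint64_t dev_addr, size_t size)
{
    if (!toy_dma_map(dev_addr, size))
        return false;
    cmd->dev_addr = dev_addr;
    cmd->size = size;
    cmd->mapped = true;
    return true;
}

/* Completion path: unmap only what was mapped. Returns true if no warning fired. */
static bool toy_cmd_complete(struct toy_cmd *cmd)
{
    if (!cmd->mapped)
        return true; /* nothing was mapped, so there is nothing to undo */
    cmd->mapped = false;
    return toy_dma_unmap(cmd->dev_addr, cmd->size);
}
```

In this model, calling toy_dma_unmap(0, 0) directly prints a warning of the same shape as the one in the trace, while toy_cmd_complete() on a command that was never mapped returns quietly.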
Comment 1 nickkrause 2016-03-17 20:36:58 UTC
Created attachment 209671
Test Fix
Comment 2 nickkrause 2016-03-17 20:37:19 UTC
The patch I just attached may fix your issue.
Comment 3 Orion 2016-03-17 21:29:00 UTC
It seems to work now, thank you very much.
Do you think your patch will be included in 4.x kernels?
Comment 4 nickkrause 2016-03-18 13:34:36 UTC
Created attachment 209781
Test Patch
Comment 5 nickkrause 2016-03-18 13:34:53 UTC
If you are willing, please test the patch against the mainline kernel just to make sure it works there too. I also rewrote the patch with a commit log; just add your Tested-by below it.
Comment 6 Orion 2016-03-18 13:57:02 UTC
Sorry, I no longer have access to the log...
Comment 7 nickkrause 2016-03-18 14:00:31 UTC
The kernel log or the bug log?
Comment 8 nickkrause 2016-03-18 14:01:49 UTC
Sorry, do you mean the bug log or the kernel Bugzilla log?
Comment 9 Orion 2016-03-18 14:03:24 UTC
Just today my data center blocked the server where the card is,
because of an invoice dispute... a kind of conspiracy against my project...
Comment 10 nickkrause 2016-03-18 14:05:45 UTC
That's OK :). If you can, however, add a Tested-by to my patch, I would really appreciate it.
Comment 11 Orion 2016-03-18 14:12:13 UTC
I only had time yesterday to check the log and see that the errors had disappeared.
Today I have no ssh access at all. Anyhow, they have been causing me trouble for months now, so I am giving up and will ask for my money back next week.
Cheers
Comment 12 Orion 2016-03-18 14:13:46 UTC
How can I mark your patch as tested?
Comment 13 nickkrause 2016-03-18 14:16:05 UTC
Just type the line:
Tested-by: Your Full Name <email address>
Comment 14 Orion 2016-03-18 14:30:44 UTC
Tested-by: Orion admin@e-blokos.com
Comment 15 nickkrause 2016-03-18 14:34:53 UTC
And if you do get around to it, try testing on mainline. I wrote the patch against mainline, so if it applies cleanly to your kernel it probably works fine on mainline too, but please double-check if you can.
Comment 16 Orion 2016-03-18 14:38:42 UTC
OK, thanks. It would be best if someone else could test it as well.
Comment 17 Adam Radford 2016-03-18 18:24:03 UTC
I think Christoph Hellwig already fixed this issue in the upstream kernel, with these 2 upstream patches:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=118c855b5623f3e2e6204f02623d88c09e0c34de

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=15e3d5a285ab9283136dba34bbf72886d9146706

I would apply the above 2 patches to your kernel and try to reproduce.
Comment 18 nickkrause 2016-03-18 18:52:23 UTC
Those patches may fix it, but I am pretty sure they have already been backported to the 3.18.3 kernel release. Maybe I am wrong, but let Orion test those patches too, to see whether they should be backported or are already there.
Comment 19 Orion 2016-03-18 18:59:59 UTC
Unfortunately, my DC has blocked all ssh access for now, and I'm afraid it will stay that way for a long time...
