Created attachment 65422 [details]
dmesg

Hi,

With the upgrade to 2.6.39.2, I cannot copy more than ~1GB of data over the network before my input devices lock up.

Example:
Use scp on tty1 to copy a folder or file larger than 3GB over my gigabit network, and after ~1min the keyboard stops responding.
Use cp on tty1 with an nfs-3 share mounted over the same network and try to copy the same file, and the same thing happens.

I can hit ctrl+c for about 20-30 sec and eventually get an interrupt that stops the copying, then the input devices slowly regain control. If I do the same on X, there is no chance to get in between and I have to use SysRq to reboot.

In our bug report at https://bugs.gentoo.org/show_bug.cgi?id=373109 we did a bisect between 2.6.39.1 and 2.6.39.2 and found that the following patch is causing this problem:

commit 87cc4d1e3e05af38c7c51323a3d86fe2572ab033
Author: Chris Wright <chrisw@sous-sol.org>
Date: Sat May 28 13:15:04 2011 -0500

    intel-iommu: Dont cache iova above 32bit

I will also attach dmesg, current kernel config, and my bisect log (I put a uname -a into the log after each bisect) plus git bisect log.

Please let me know if you need more information.

Thanks,
Marcus
Created attachment 65432 [details] working config 2.6.39.1
Created attachment 65442 [details] bisect log I created during the process
Created attachment 65452 [details] git bisect log
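For readers who have not bisected before, here is a rough sketch of the search that git bisect automates. It is only an illustration: the commit names are placeholders and `is_bad` stands in for "build this kernel, boot it, and try the network copy". It assumes the oldest entry is known good and the newest is known bad.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, commits[0] is known
    good and commits[-1] is known bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid      # the regression is at mid or earlier
        else:
            good = mid     # the regression is after mid
    return commits[bad]


# Hypothetical history: c1..c5 are placeholders, not real commit ids.
history = ["v2.6.39.1", "c1", "c2", "c3", "c4", "c5", "v2.6.39.2"]
broken_since = history.index("c3")
culprit = first_bad(history, lambda c: history.index(c) >= broken_since)
```

Each round halves the remaining range, so only a handful of kernels need to be built and tested even when the two releases are many commits apart.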
(switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface).

On Wed, 13 Jul 2011 14:32:42 GMT bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=39312
>
> URL: https://bugs.gentoo.org/show_bug.cgi?id=373109
> Summary: intel-iommu: Dont cache iova above 32bit - network copy freezes system
> Product: Drivers
> Version: 2.5
> Kernel Version: 2.6.39.2
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Other
> AssignedTo: drivers_other@kernel-bugs.osdl.org
> ReportedBy: marcus.disi@gmail.com
> Regression: No
>
> Created an attachment (id=65422)
> --> (https://bugzilla.kernel.org/attachment.cgi?id=65422)
> dmesg

A 2.6.39.1->2.6.39.2 regression.

And, presumably, a 2.6.39->mainline regression.

That's commit 1c9fc3d11b84fbd0c4f4aa7855702c2a1f098ebb in mainline.
First-Bad-Commit : 1c9fc3d11b84fbd0c4f4aa7855702c2a1f098ebb
Mike Travis wrote:
> Interesting, I was just preparing a patch to fix this (follows)

Oops, sorry, I was mistaken. The patch I'm preparing is for a different problem. I'll look more closely at the bug report but I may need Chris's help in resolving it.

Thanks,
Mike
Interesting, I was just preparing a patch to fix this (follows)

Andrew Morton wrote:
> And, presumably, a 2.6.39->mainline regression.
>
> That's commit 1c9fc3d11b84fbd0c4f4aa7855702c2a1f098ebb in mainline.
On 13 July 2011 22:40, Chris Wright <chrisw@sous-sol.org> wrote:
> Can you send a dmesg from boot, and an lspci?

Hi,

I cannot reboot right now, I'll produce dmesg with 2.6.39.2 later.
Here is lspci for now:

disi-bigtop ~ # lspci
00:00.0 Host bridge: Intel Corporation Device 0104 (rev 09)
00:01.0 PCI bridge: Intel Corporation Device 0101 (rev 09)
00:16.0 Communication controller: Intel Corporation Cougar Point HECI Controller #1 (rev 04)
00:1a.0 USB Controller: Intel Corporation Cougar Point USB Enhanced Host Controller #2 (rev 05)
00:1b.0 Audio device: Intel Corporation Cougar Point High Definition Audio Controller (rev 05)
00:1c.0 PCI bridge: Intel Corporation Cougar Point PCI Express Root Port 1 (rev b5)
00:1c.1 PCI bridge: Intel Corporation Cougar Point PCI Express Root Port 2 (rev b5)
00:1c.2 PCI bridge: Intel Corporation Cougar Point PCI Express Root Port 3 (rev b5)
00:1c.3 PCI bridge: Intel Corporation Cougar Point PCI Express Root Port 4 (rev b5)
00:1d.0 USB Controller: Intel Corporation Cougar Point USB Enhanced Host Controller #1 (rev 05)
00:1f.0 ISA bridge: Intel Corporation Device 1c49 (rev 05)
00:1f.2 SATA controller: Intel Corporation Cougar Point 6 port SATA AHCI Controller (rev 05)
00:1f.3 SMBus: Intel Corporation Cougar Point SMBus Controller (rev 05)
01:00.0 VGA compatible controller: nVidia Corporation Device 0dd1 (rev a1)
01:00.1 Audio device: nVidia Corporation Device 0be9 (rev a1)
02:00.0 USB Controller: NEC Corporation Device 0194 (rev 03)
03:00.0 Ethernet controller: JMicron Technology Corp. JMC250 PCI Express Gigabit Ethernet Controller (rev 05)
03:00.1 System peripheral: JMicron Technology Corp. Device 2392 (rev 90)
03:00.2 SD Host controller: JMicron Technology Corp. Device 2391 (rev 90)
03:00.3 System peripheral: JMicron Technology Corp. Device 2393 (rev 90)
04:00.0 Network controller: Intel Corporation Device 0091 (rev 34)
05:00.0 FireWire (IEEE 1394): JMicron Technology Corp. IEEE 1394 Host Controller (rev 30)
On 13 July 2011 23:02, Chris Wright <chrisw@sous-sol.org> wrote:
>> 03:00.0 Ethernet controller: JMicron Technology Corp. JMC250 PCI Express Gigabit Ethernet Controller (rev 05)
>
> Is this the network device the traffic is going through? Can you send
> full lspci -vvv?

The network adapter is the jme eth0:
Ethernet controller: JMicron Technology Corp. JMC250 PCI Express Gigabit Ethernet Controller (rev 05)
Created attachment 65502 [details]
dmesg 2.6.39.2

The jme picked up 100Mbps this time; after booting back to 2.6.39.1 it picked up 1000Mbps. I usually had Gigabit on 2.6.39.2 as well:

jme 0000:03:00.0: eth0: Link is up at ANed: 1000 Mbps, Full-Duplex, MDI
Created attachment 65512 [details] lspci -vvv
> On Wed, 13 Jul 2011 14:32:42 GMT bugzilla-daemon@bugzilla.kernel.org wrote:
> > Created an attachment (id=65422)
> > --> (https://bugzilla.kernel.org/attachment.cgi?id=65422)
> > dmesg

Can you send a dmesg from boot, and an lspci?
* Marcus Becker (marcus.disi@gmail.com) wrote:
> 03:00.0 Ethernet controller: JMicron Technology Corp. JMC250 PCI Express Gigabit Ethernet Controller (rev 05)

Is this the network device the traffic is going through? Can you send full lspci -vvv?
* Chris Wright (chrisw@sous-sol.org) wrote:
> Can you send a dmesg from boot, and an lspci?

Two things worth trying:

1) boot with intel_iommu=strict (to disable batching of unmaps, should keep the number of outstanding mappings much lower)

2) boot with intel_iommu=forcedac (to disable the current behaviour which tries to map < 32bit, then if that fails, maps >32bit).

The hangs sound like the iova allocation is looping excessively under spin_lock_irqsave()
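When testing boot options like these it is easy to end up unsure which one the running kernel was actually started with. The small helper below is a sketch for checking that: it pulls the intel_iommu= values out of a kernel command line string. The function names are made up for this example, the example string (including its root= value) is invented, and it assumes several values may be given comma-separated, as in intel_iommu=on,strict.

```python
def intel_iommu_options(cmdline):
    """Collect the comma-separated values of every intel_iommu= parameter."""
    options = set()
    for word in cmdline.split():
        key, sep, value = word.partition("=")
        if key == "intel_iommu" and sep:
            options.update(v for v in value.split(",") if v)
    return options


def running_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read()


# Example command line; the root= and ro values are made up for illustration.
example = "root=/dev/sda3 ro intel_iommu=on,strict"
print(sorted(intel_iommu_options(example)))  # prints ['on', 'strict']
```

On the test machine itself, `intel_iommu_options(running_cmdline())` would report what the current boot really used.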
I am having the same issue. For me the system got frozen after about 1.3G transferred to my computer via nfs.

With intel_iommu=strict the behavior is the same as without, freeze after 1.3G

With intel_iommu=forcedac my system gets nearly frozen after about a second (~65MB transferred). The screen refresh gets very slow as well as keyboard input. Transfer rate drops down to some hundreds kb/s but I am able to Ctrl-C and the system gets back to normal after some seconds.
On 14 July 2011 00:14, Chris Wright <chrisw@sous-sol.org> wrote:
> Two things worth trying:
>
> 1) boot with intel_iommu=strict (to disable batching of unmaps, should
> keep the number of outstanding mappings much lower)
>
> 2) boot with intel_iommu=forcedac (to disable the current behaviour
> which tries to map < 32bit, then if that fails, maps >32bit).
>
> The hangs sounds like the iova allocation is looping excessively under
> spin_lock_irqsave()

Hi,

as Marc stated in the bug report, the first method gives the same behavior as before, and the second method delayed my input at first and then later locked up as with method 1.

On the first boot with method one, my external USB keyboard refused to work (funny flashing keys), so I used the laptop keyboard. I rebooted again using the command option and it worked...

Hope that helps,
Marcus
One more thing I tested: copying ~5GB from an external USB hard drive works without problems. Marc has the same network card, so it might have something to do with this?
(In reply to comment #15)
> I am having the same issue. For me the system got frozen after about 1.3G
> transferred to my computer via nfs.
>
> With intel_iommu=strict the behavior is the same as without, freeze after
> 1.3G
>
> With intel_iommu=forcedac my system gets nearly frozen after about a second
> (~65MB transferred). The screen refresh gets very slow as well as keyboard
> input.
> Transfer rate drops down to some hundreds kb/s but I am able to Ctrl-C and
> the system gets back to normal after some seconds.

Thanks (both Marc and Marcus) for testing this. The forcedac test means we always allocate from the end of the 64-bit address space. This suggests that the linear search from the end of the address space is slow, which should only happen if there are a lot of address mappings.

(In reply to comment #17)
> One more thing I tested, to copy ~5GB from an external USB hard drive works
> without problems. Marc has the same network card, it might has something to
> do with this?

Yes, seems like the jme driver is not unmapping all descriptors. I don't have access to the hardware, but if you enable CONFIG_IOMMU_DEBUG=y we can see if the iommu is filling up with mappings. The driver itself would be pretty easy to debug to discover which unmap calls aren't being made.
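To make the "linear search gets slow when there are a lot of mappings" argument concrete, here is a toy model. It is not the kernel's iova allocator: a plain sorted list stands in for the real data structure, every buffer is a single page frame, and the names `alloc_top_down` and `walk_cost` are invented for this sketch. It only shows how the cost of one top-down allocation grows when earlier mappings are never released.

```python
def alloc_top_down(allocated, limit, size=1):
    """Allocate `size` page frames as high as possible at or below `limit`.

    `allocated` is a list of (lo, hi) ranges sorted from highest to lowest.
    Returns (range, steps), where steps is the number of existing ranges
    the search had to walk past.
    """
    top = limit
    steps = 0
    for index, (lo, hi) in enumerate(allocated):
        if top - hi >= size:          # the gap above this range is big enough
            new = (top - size + 1, top)
            allocated.insert(index, new)
            return new, steps
        top = lo - 1                  # otherwise continue below this range
        steps += 1
    if top - size + 1 < 0:
        raise MemoryError("address space exhausted")
    new = (top - size + 1, top)
    allocated.append(new)
    return new, steps


def walk_cost(n_packets, leak):
    """Map one buffer per packet; unmap it again unless `leak` is set."""
    allocated = []
    steps = 0
    for _ in range(n_packets):
        new, steps = alloc_top_down(allocated, limit=2**20)
        if not leak:
            allocated.remove(new)
    return steps                      # cost of the last allocation


well_behaved = walk_cost(1000, leak=False)   # 0 ranges walked
leaky = walk_cost(1000, leak=True)           # 999 ranges walked
```

In this model a driver that unmaps every buffer always finds free space at the top immediately, while a leaking driver makes each new allocation walk past everything mapped so far. If that walk happens with interrupts disabled, it would fit the input devices stalling during a long transfer.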
CONFIG_IOMMU_DEBUG=y doesn't really show anything more than before. I had to enable AMD features to enable it; I guess that doesn't trace intel-iommu?

Maybe, if you involve Guo-Fu Tseng, you could help him solve the jme problem? He already provided a patch to only map to addresses below 32 bit, but it didn't help: https://bugs.gentoo.org/show_bug.cgi?id=373109
Not the trace, but it will keep track of all mappings (it does sanity checking on map/unmap requests). And you should see it give up because the memory available for tracking the dma mappings is exhausted: "DMA-API: debugging out of memory - disabling". Basically an indirect indication that the driver is mapping but not unmapping. I'll see if Guo-Fu Tseng can help, thanks.
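The indirect indication described above can be pictured with a small model. This is not the kernel's DMA debugging code: the class, its fixed entry budget of 64, and the driver loop are all invented for illustration, and only the message text is taken from the comment above. The point is that bookkeeping with a bounded pool runs dry exactly when mappings are created but never released.

```python
class MappingTracker:
    """Toy bookkeeping for DMA map/unmap calls with a fixed entry budget."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.live = set()
        self.enabled = True
        self.messages = []

    def map(self, addr):
        if not self.enabled:
            return
        if len(self.live) >= self.max_entries:
            # No room left to track another mapping: give up, as described.
            self.enabled = False
            self.messages.append("DMA-API: debugging out of memory - disabling")
            return
        self.live.add(addr)

    def unmap(self, addr):
        if not self.enabled:
            return
        if addr not in self.live:
            self.messages.append(f"unmap of untracked address {addr:#x}")
            return
        self.live.remove(addr)


def run_driver(tracker, n_buffers, unmap_every):
    """Map n_buffers; unmap only every `unmap_every`-th one (1 = all)."""
    for i in range(n_buffers):
        addr = 0x1000 * (i + 1)
        tracker.map(addr)
        if i % unmap_every == 0:
            tracker.unmap(addr)


good = MappingTracker(max_entries=64)
run_driver(good, 10_000, unmap_every=1)     # every mapping is released

leaky = MappingTracker(max_entries=64)
run_driver(leaky, 10_000, unmap_every=2)    # half of the mappings leak
```

The well-behaved run never holds more than one entry and stays enabled; the leaky run exhausts the budget after a few hundred buffers and logs the "out of memory" message once.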
The patch http://patchwork.ozlabs.org/patch/105878/ that Guo-Fu Tseng reported upstream works for me and two others... https://bugs.gentoo.org/process_bug.cgi
The patch got merged in v3.1-rc1:

commit 94c5b41b327e08de0ddf563237855f55080652a1
Author: Guo-Fu Tseng <cooldavid@cooldavid.org>
Date: Wed Jul 20 16:57:36 2011 +0000

    jme: Fix unmap error (Causing system freeze)