Bug 7569
| Summary: | Host disappears on 1394 bus reset | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | Robert Crocombe (rcrocomb) |
| Component: | IEEE1394 | Assignee: | Stefan Richter (stefanr) |
| Status: | REJECTED WILL_NOT_FIX | | |
| Severity: | normal | CC: | protasnb |
| Priority: | P2 | | |
| Hardware: | i386 | | |
| OS: | Linux | | |
| Kernel Version: | 2.6.18 | Subsystem: | |
| Regression: | --- | Bisected commit-id: | |
| Bug Depends on: | | | |
| Bug Blocks: | 10046 | | |
Attachments:

* Code to provoke and notice the problem
* Very slightly revised code to provoke reset problem
* [patch 2.6.20-rcX] ieee1394: ohci1394: serialize reset requests and irq handler
* Slightly more revised code to provoke reset problem
* [PATCH linux1394-2.6.git] ieee1394: ohci1394: serialize reset requests and irq handler
* New, hopefully better code for provoking reset problem
Description
Robert Crocombe
2006-11-22 16:19:44 UTC
Discussion on lkml and linux1394-devel can be found starting at: http://lkml.org/lkml/2006/11/22/122

Could you test with the patch mentioned in bug 6070 and see if you get the warning? Newer kernels will log it with "WARNING...", not "Badness...".

Stefan, I applied:
http://me.in-berlin.de/~s5r6/linux1394/work-in-progress/debug-warn-if-we-sleep-in-an-irq-for-a-long-time/debug-warn-if-we-sleep-in-an-irq-for-a-long-time.2.6.19.patch
to a 2.6.19 kernel, but got bored after ~350 resets with no warnings and no port losses. I then applied the patch to 2.6.19.1-rt15, and lost a port after 148 resets. However, the WARN_ON in the patch doesn't appear to trigger. According to linux/bug.h, I should be grepping on "BUG: warning", but "grep -i" for both "bug" and "warning" turned up nothing interesting.

Created attachment 10083 [details]
Code to provoke and notice the problem
Here's the code I now use to provoke the problem.
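(The attachment itself is not inlined in this report. For orientation, here is a minimal sketch of the kind of loop such a test program might run, assuming only the standard libraw1394 calls raw1394_new_handle, raw1394_get_port_info, raw1394_set_port, and raw1394_reset_bus. The structure, thresholds, and output format below are guesses based on the output shown later in this bug, not the actual attachment.)

/*
 * Hypothetical sketch, NOT the actual attachment: issue a bus reset on
 * every visible 1394 port roughly once per second and complain when the
 * number of ports stays below the initial count for 10 iterations.
 */
#include <stdio.h>
#include <unistd.h>
#include <libraw1394/raw1394.h>

#define MAX_PORTS  16
#define MAX_MISSES 10   /* "missing 10 consecutive times" */

int main(void)
{
    struct raw1394_portinfo info[MAX_PORTS];
    raw1394handle_t handle = raw1394_new_handle();
    int nports, misses = 0;

    if (!handle)
        return 1;
    nports = raw1394_get_port_info(handle, info, MAX_PORTS);
    raw1394_destroy_handle(handle);
    printf("There are %d 1394 interfaces.\n", nports);

    for (unsigned iteration = 0; ; iteration++) {
        handle = raw1394_new_handle();
        if (!handle)
            return 1;
        int current = raw1394_get_port_info(handle, info, MAX_PORTS);
        raw1394_destroy_handle(handle);
        printf("[%3u] Looking for %d ports, found %d\n",
               iteration, nports, current);

        if (current < nports) {
            if (++misses >= MAX_MISSES) {
                printf("A port is gone: missing %d consecutive times\n",
                       MAX_MISSES);
                return 2;
            }
        } else {
            misses = 0;
        }

        /* Reset the bus on every port that is still present. */
        for (int port = 0; port < current; port++) {
            raw1394handle_t ph = raw1394_new_handle();
            if (ph) {
                if (raw1394_set_port(ph, port) == 0)
                    raw1394_reset_bus(ph);
                raw1394_destroy_handle(ph);
            }
        }
        sleep(1);   /* roughly one reset cycle per second */
    }
}

(Build with -lraw1394; the real attachment additionally identifies which individual port went missing, as its output below shows.)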
OK, then this is more or less separate from bug 6070. I shall try your testcase on my PCs. One has a CardBus-PCI bridge, the other one a PCI-PCIe bridge. Needless to say, I never got this bug before, mostly using one or a few SBP-2 devices.

I'll see if I can get an -rt kernel set up sometime soon.

Created attachment 10085 [details]
Very slightly revised code to provoke reset problem
Okay. Note that it's not -rt only. I finally just let the test thing run and
provoked it in successive tests after 438, 371, 340, and 445 resets.
I poked at the output from my program to make it more concise and pretty. It
now looks something like:
root@spanky:read_file$ date; ./break_things ; date
Mon Jan 15 11:54:36 MST 2007
Signal handler installed.
There are 5 1394 interfaces.
[ 0] Looking for 5 ports: 0 1 2 . 4
[ 1] Looking for 5 ports: 0 1 2 . 4
[ 2] Looking for 5 ports: 0 1 2 . 4
[ 3] Looking for 5 ports: . 1 2 . 4
[ 4] Looking for 5 ports: 0 1 2 . 4
[ 5] Looking for 5 ports: 0 1 2 . 4
[ 6] Looking for 5 ports: 0 1 2 . 4
[ 7] Looking for 5 ports: 0 1 2 . 4
[ 8] Looking for 5 ports: 0 1 2 . 4
[ 9] Looking for 5 ports: 0 1 2 . 4
[ 10] Looking for 5 ports: 0 1 2 . 4
Okay, I'm pretty sure port 3 is gone: missing 10 consecutive times
Program terminating
libraw1394: raw1394_destroy_handle line 157: Calling close on fd 3
Mon Jan 15 11:54:47 MST 200
where I ran this after port 3 had previously gone out after 445 resets. I'll
reattach the code.
Do you know if the IWill H8502 contains an IOMMU?

Yes, from "dmesg | grep IOMMU":
PCI-DMA: using GART IOMMU.
and there is a BIOS entry that indicates its presence.

I currently hope that the solution is to serialize ohci1394's IRQ handler vs. ohci1394's bus reset request function (and certainly a lot of other register-accessing functions) by means of spinlocks plus the mmiowb barrier. It might be easy to do, but I don't know whether I'll manage to put this together on one of these evenings; there is a lot of other stuff going on here lately. Sorry for dragging this issue out...

Eh. We contacted Indigita to let them know about the problem, and I provided them with my code, but they have decided that the problem doesn't merit their attention. I wasn't in on the exact conversation at this point, but the email to me (from a co-worker) simply indicated that they called "uncle". Uhm, in case that doesn't translate, that is an indicator that they have given up on providing a solution at their end. This is annoying from a port-density point of view, but the approach for the future for the work here is to use Fireboard 800s in a PCI expansion chassis, which was the initial approach until someone noticed the Indigita cards. I would note that apparently Indigita does not own a 1394b bus analyzer (from email), and their first solution to us was "don't do that" (put all 4 ports on the same bus). However, I have a bunch of the cards in a desk drawer now, and will be happy to install one to test any solution you propose. But take your time: the card's functionality has become moot. Thanks for all your efforts thus far, by the way.

Created attachment 10198 [details]
[patch 2.6.20-rcX] ieee1394: ohci1394: serialize reset requests and irq handler
This is a poke in the dark and could even break more than it attempts to fix.
It only addresses behaviour with respect to bus resets that were triggered by local
software, not bus resets forced by hardware or by remote software.
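(Illustrative sketch, not the attached patch: the locking pattern described above, a single spinlock shared by the bus-reset request path and the interrupt handler, with mmiowb() ordering MMIO writes before the lock is released as per Documentation/memory-barriers.txt, might look roughly like the following. The structure, field names, and register offsets here are made up for illustration only.)

/*
 * Generic illustration of "serialize reset requests and irq handler".
 * All names and offsets below are hypothetical, not from ohci1394.c.
 */
#include <linux/spinlock.h>
#include <linux/interrupt.h>
#include <linux/types.h>
#include <linux/io.h>

struct demo_ohci {
	void __iomem *registers;
	spinlock_t event_lock;		/* serializes reset request vs. IRQ handler */
};

#define DEMO_REG_INT_EVENT_CLEAR	0x64	/* hypothetical offsets */
#define DEMO_REG_PHY_CONTROL		0xec

static void demo_request_bus_reset(struct demo_ohci *ohci, u32 phy_cmd)
{
	unsigned long flags;

	spin_lock_irqsave(&ohci->event_lock, flags);
	writel(phy_cmd, ohci->registers + DEMO_REG_PHY_CONTROL);
	mmiowb();			/* order the MMIO write before dropping the lock */
	spin_unlock_irqrestore(&ohci->event_lock, flags);
}

static irqreturn_t demo_irq_handler(int irq, void *dev_id)
{
	struct demo_ohci *ohci = dev_id;
	unsigned long flags;
	u32 event;

	spin_lock_irqsave(&ohci->event_lock, flags);
	event = readl(ohci->registers + DEMO_REG_INT_EVENT_CLEAR);
	writel(event, ohci->registers + DEMO_REG_INT_EVENT_CLEAR);	/* acknowledge */
	mmiowb();
	spin_unlock_irqrestore(&ohci->event_lock, flags);

	return event ? IRQ_HANDLED : IRQ_NONE;
}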
Well, given that even the linux1394 crew (what's left of it) is perplexed by Linux' IEEE 1394 drivers, what do you expect from the poor guys at Indigita?

PS about the patch: I need to do more research on the issue with externally triggered bus resets.

Hrm, I'm not sure it's made much difference. I've been pushed to a different machine: same H8502, same # of processors, but less RAM (10GB vs 32GB), and causing the reset problem to occur seems to take a bit longer. I ran 2.6.20 and 2.6.20-rt2 kernels both with and without the patch, and the results are like so:

2.6.20 w/out patch
1st iteration: 1264 resets, then broke
2nd iteration: 4172 resets, then broke
3rd iteration: 46,256 resets (!), then broke

2.6.20 w/ patch (I actually tried this first)
1st iteration: 1718 resets, then broke
2nd iteration: 3010 resets, then stopped by me (so I could see if the non-patched kernel was any worse)

2.6.20-rt2 w/out patch
1st iteration: 311 resets, then broke
2nd iteration: 1035 resets, then broke
3rd iteration: "10" resets, which means one interface wasn't restored via rmmod/modprobe (didn't think to record which)
4th iteration: 48 resets, then broke
5th iteration: 702 resets, then broke
6th iteration: port 5 (Indigita interfaces were 2,3,4,5) not restored (program quit after my "give it enough time" minimum of 10 resets)

At this point I ran my little reset script:

#!/bin/bash
rmmod raw1394
rmmod ohci1394
modprobe ohci1394
modprobe raw1394

and the kernel locked up hard (can't ping, keyboard seized).

2.6.20-rt2 w/ patch
1st iteration: 75 resets, then broke
2nd iteration: 496 resets, then locked up (as above); reset machine
3rd iteration: 41 resets; reset modules; immediate lockup on restarting the test program; reset machine
4th iteration: 252 resets, then lost all 4 ports simultaneously (wow!); reset modules -> lockup

So there's some ugliness in the -rt kernel, or some ugliness revealed by the -rt kernel. I'll probably poke at Ingo a little later.

The number of interrupts skyrockets right before the lockup; 'vmstat 1' follows. Normally I seem to be running around 1600-2200 interrupts a second (Hz == 250), which dips to about 400+ when I start the reset test program (you can see the amount of debug stuff that gets written) or when I do the rmmod/modprobe magic, and then picks back up when the program terminates. So 102k interrupts is a bit unusual...
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd    free   buff  cache   si   so    bi    bo     in     cs us sy id wa
 0 0 0 8714900 145420 348580 0 0 0 0 1887 17922 0 2 98 0
 0 0 0 8714776 145420 348580 0 0 0 0 1595 17254 0 3 97 0
 0 0 0 8714900 145420 348580 0 0 0 0 1703 17461 0 2 98 0
 0 0 0 8714884 145420 348580 0 0 0 0 1701 17512 0 2 98 0
 0 0 0 8714884 145420 348580 0 0 0 0 1582 17154 0 2 98 0
 0 1 0 8716808 145420 348580 0 0 0 416 1236 12329 0 1 89 10
 0 1 0 8716808 145420 348580 0 0 0 1036 395 4306 0 1 75 24
 0 1 0 8717036 145420 348588 0 0 0 976 380 4249 0 0 76 25
 0 1 0 8717036 145420 348588 0 0 0 992 385 4262 0 0 75 25
 0 1 0 8717036 145420 348596 0 0 0 1152 446 4638 0 0 76 24
 0 1 0 8717044 145420 348596 0 0 0 1008 389 4298 0 0 75 25
 0 1 0 8717048 145420 348608 0 0 0 1056 403 4384 0 0 75 25
 0 0 0 8716924 145420 348608 0 0 0 848 335 4151 0 0 79 21
 0 0 0 8715064 145420 348616 0 0 0 0 1707 14770 0 1 99 0
 0 0 0 8715064 145420 348616 0 0 0 0 1718 17723 0 3 97 0
 0 0 0 8715068 145420 348616 0 0 0 0 1700 17209 0 2 98 0
 0 0 0 8715068 145420 348616 0 0 0 0 1893 17218 0 2 98 0
 0 0 0 8714944 145420 348616 0 0 0 0 1921 17703 0 3 97 0
 0 0 0 8715068 145420 348616 0 0 0 0 1886 16980 0 2 98 0
 0 0 0 8717052 145420 348616 0 0 0 12 546 5675 0 0 99 0
 0 0 0 8715068 145420 348616 0 0 0 0 1606 11743 0 1 99 0
 0 0 0 8715200 145420 348616 0 0 0 0 1790 14243 0 2 98 0
 0 1 0 8711484 145420 348616 0 0 0 836 820 9356 1 11 72 16
 6 1 0 8711872 145420 348620 0 0 0 992 35476 4091 0 10 65 25
 6 1 0 8711872 145420 348620 0 0 0 1056 99101 3801 0 25 50 25
 6 1 0 8711872 145420 348628 0 0 0 1088 101964 3805 0 25 50 25

Created attachment 10334 [details]
Slightly more revised code to provoke reset problem
Tweaked to make it possible to ignore ports, which means they needn't be reset;
that saves a lot of SelfID debugging output. This is meant for the Unibrain cards,
which seem not to have problems.
Reviewing our old conversation, I noticed in http://lkml.org/lkml/2006/11/27/178 that there is one IRQ shared between two hosts (I would guess 1x Unibrain + 1x Indigita?) and 3 separate IRQs (3x Indigita?). The shared interrupt stays on the same CPU almost all the time; the others are quite evenly distributed between CPUs. Stuff to think about.

In one post I said something about write ops to the interrupt event mask perhaps not being flushed fast enough. The patch from comment #12 doesn't do any flushing. The patch merely preserves ordering of the MMIO in ohci_devctl/RESET_BUS versus ohci_irq_handler in the presence of SMP + bus bridges, as described in http://lxr.linux.no/source/Documentation/memory-barriers.txt?v=2.6.18#L1285 (assuming the bus bridges work correctly). I'm also trying to understand what http://lwn.net/Articles/146861/ wants to tell me.

Ah, I hadn't remembered the interrupt sharing. That is not the case on the "new" machine, which has 2 Unibrain cards (ports 0 and 1) and 1 Indigita (ports 2,3,4,5).

[root@bubba trunk]# cat /proc/interrupts
            CPU0       CPU1       CPU2       CPU3
  0:         240          0          0          0   IO-APIC-edge      timer
  1:           1          0          0          9   IO-APIC-edge      i8042
  8:           0          0          0          0   IO-APIC-edge      rtc
  9:           0          0          0          0   IO-APIC-fasteoi   acpi
 12:           0          0          0        114   IO-APIC-edge      i8042
 19:      230585          0          0         33   IO-APIC-fasteoi   eth0
 21:           0          0          0          0   IO-APIC-fasteoi   ohci_hcd:usb2
 22:           0          0          0          0   IO-APIC-fasteoi   ehci_hcd:usb1
 72:           0          0          0          1   IO-APIC-fasteoi   ohci1394
 81:           0          0          0          2   IO-APIC-fasteoi   ohci1394
 88:           0          0         15          0   IO-APIC-fasteoi   aic79xx
 89:        3203      34300      12618      32923   IO-APIC-fasteoi   aic79xx
 93:           0          0          1          0   IO-APIC-fasteoi   ohci1394
 94:           0          0          1          0   IO-APIC-fasteoi   ohci1394
 95:           0          0          1          0   IO-APIC-fasteoi   ohci1394
 96:           0          0          1          0   IO-APIC-fasteoi   ohci1394
NMI:           0          0          0          0
LOC:    60856314   60700937   60602709   60999361
ERR:           0

and for my test of isochronous transmit w/ realtime behavior (my end goal), interrupts are absolutely nailed to CPU 1. The short version of the test is 1200Hz data for 10 seconds (12k interrupts). Sure enough:

[root@bubba trunk]# cat /proc/interrupts
            CPU0       CPU1       CPU2       CPU3
  0:         240          0          0          0   IO-APIC-edge      timer
  1:           1          0          0          9   IO-APIC-edge      i8042
  8:           0          0          0          0   IO-APIC-edge      rtc
  9:           0          0          0          0   IO-APIC-fasteoi   acpi
 12:           0          0          0        114   IO-APIC-edge      i8042
 19:      231032          0          0         33   IO-APIC-fasteoi   eth0
 21:           0          0          0          0   IO-APIC-fasteoi   ohci_hcd:usb2
 22:           0          0          0          0   IO-APIC-fasteoi   ehci_hcd:usb1
 72:           0          0          0          1   IO-APIC-fasteoi   ohci1394
 81:           0      12000          0          3   IO-APIC-fasteoi   ohci1394
 88:           0          0         15          0   IO-APIC-fasteoi   aic79xx
 89:        3208      34335      12768      33755   IO-APIC-fasteoi   aic79xx
 93:           0          0          1          0   IO-APIC-fasteoi   ohci1394
 94:           0          0          1          0   IO-APIC-fasteoi   ohci1394
 95:           0          0          1          0   IO-APIC-fasteoi   ohci1394
 96:           0          0          1          0   IO-APIC-fasteoi   ohci1394
NMI:           0          0          0          0
LOC:    60883385   60728002   60629667   61026483
ERR:           0

Exactly 12k on CPU1 for IRQ 81 (port 1, Unibrain). This is actually behavior that I find desirable, since I hope to use processor affinity and nail the task to the appropriate CPU, either via cpusets or the pthread_setaffinity stuff.

Hrm, I just noticed that the interrupt type is now "IO-APIC-fasteoi" rather than "IO-APIC-level". I hope that's not a cause for concern. I guess my config has drifted a bit more than I'd thought.

FYI, this code is what I've used to cause http://bugzilla.kernel.org/show_bug.cgi?id=7979, and the config and dmesg are there, too.

Created attachment 10698 [details]
[PATCH linux1394-2.6.git] ieee1394: ohci1394: serialize reset requests and irq handler
Slightly revised patch: Now also handles the host->in_bus_reset flag
atomically. It still only addresses concurrency between local software
requesting a bus reset and the local reset IRQ handler. Therefore I don't
think it is more interesting to your setup than the previous version. I'm
posting this mostly just for reference.
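(Again an illustrative sketch with hypothetical names, not the attached patch: one common way to handle a flag like host->in_bus_reset atomically is to keep it as a bit in an unsigned long and use the kernel's atomic bitops, so that the reset-requesting path and the bus-reset IRQ handler cannot race on it.)

/*
 * Illustration only; the structure and names are made up, and whether
 * the attached patch does exactly this is not shown here.
 */
#include <linux/bitops.h>
#include <linux/types.h>

#define DEMO_IN_BUS_RESET 0	/* bit number within host->flags */

struct demo_host {
	unsigned long flags;
};

/* Returns false if a reset is already in flight, so a caller does not
 * request a second reset while the first one is still being handled. */
static bool demo_begin_bus_reset(struct demo_host *host)
{
	return !test_and_set_bit(DEMO_IN_BUS_RESET, &host->flags);
}

/* Called from the self-ID-complete / bus-reset-done path. */
static void demo_end_bus_reset(struct demo_host *host)
{
	clear_bit(DEMO_IN_BUS_RESET, &host->flags);
}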
Now that Kristian H

Could this make a difference?
Mitch Williams: Flush MSI-X table writes
http://lkml.org/lkml/2007/3/30/173

Or this:
Eric W. Biederman: x86_64 irq: handle irqs pending in IRR during irq migration
http://marc.info/?l=linux-mm-commits&m=117046481708594

I've tried the patch from Mitch Williams, and it does not appear to fix things. I hope to try Eric's patch fairly soon.

Created attachment 11133 [details]
New, hopefully better code for provoking reset problem
No longer reads /var/log/messages (yikes); the excessive debugging output is no
longer required.
I'm seeing a fair number of error messages being reported in /var/log/messages, probably hidden in the voluminous output of the earlier code. They look like the following, which occurred during a sequence where the test code terminated after 119 resets.

Apr 11 18:14:41 bubba kernel: ohci1394: fw-host5: Unrecoverable error!
Apr 11 18:14:41 bubba kernel: ohci1394: fw-host5: physical posted write error
Apr 11 18:14:41 bubba kernel: ohci1394: fw-host5: isochronous cycle too long
Apr 11 18:14:41 bubba kernel: ohci1394: fw-host5: SelfID received outside of bus reset sequence
Apr 11 18:14:41 bubba kernel: ohci1394: fw-host5: Unhandled interrupt(s) 0x00000100
Apr 11 18:16:12 bubba kernel: ohci1394: fw-host4: isochronous cycle too long
Apr 11 18:16:12 bubba kernel: ohci1394: fw-host4: SelfID received outside of bus reset sequence

I think the termination of the reset software coincided with that last SelfID message, but sometimes it can continue past them (as it is doing as I type this). The unrecoverable error doesn't seem to cause the reset stuff any heartburn, either. Note that there are no cycle starts and no isochronous traffic. Only bus resets, self IDs, and some quadlet read requests from the FireSpy as it asks nodes for the start of their config ROMs: it keeps asking the root for the start of its config ROM (16 times per reset). However, the root is a phy-only 1394b hub, so there's no reply. Kernel version is 2.6.21-rc5 with Mitch Williams' patch applied. I've put up the complete output from a run at: http://rcrocomb.googlepages.com/

I just tried 2.6.21-rc6-mm1, which I presume has Eric's patch in it: sadly, the test terminated after 504 resets. The errors reported during the entirety of the run are:

Apr 12 11:41:53 bubba kernel: ohci1394: fw-host3: physical posted write error
Apr 12 11:41:53 bubba kernel: ohci1394: fw-host3: isochronous cycle too long
Apr 12 11:41:53 bubba kernel: ohci1394: fw-host3: SelfID received outside of bus reset sequence
Apr 12 11:41:53 bubba kernel: ohci1394: fw-host3: Unhandled interrupt(s) 0x00000100
Apr 12 11:42:20 bubba kernel: ohci1394: fw-host6: physical posted write error
Apr 12 11:42:20 bubba kernel: ohci1394: fw-host6: isochronous cycle too long
Apr 12 11:42:20 bubba kernel: ohci1394: fw-host6: SelfID received outside of bus reset sequence
Apr 12 11:42:20 bubba kernel: ohci1394: fw-host6: Unhandled interrupt(s) 0x00000100
Apr 12 11:44:41 bubba kernel: ohci1394: fw-host4: Unrecoverable error!
Apr 12 11:44:41 bubba kernel: ohci1394: fw-host4: physical posted write error
Apr 12 11:44:41 bubba kernel: ohci1394: fw-host4: isochronous cycle too long
Apr 12 11:44:41 bubba kernel: ohci1394: fw-host4: SelfID received outside of bus reset sequence
Apr 12 11:44:41 bubba kernel: ohci1394: fw-host4: Unhandled interrupt(s) 0x00000100
Apr 12 11:45:50 bubba kernel: ohci1394: fw-host4: Unrecoverable error!
Apr 12 11:45:50 bubba kernel: ohci1394: fw-host4: physical posted write error
Apr 12 11:45:50 bubba kernel: ohci1394: fw-host4: isochronous cycle too long
Apr 12 11:45:50 bubba kernel: ohci1394: fw-host4: SelfID received outside of bus reset sequence
Apr 12 11:45:50 bubba kernel: ohci1394: fw-host4: Unhandled interrupt(s) 0x00000100
Apr 12 11:46:35 bubba kernel: ohci1394: fw-host6: physical posted write error
Apr 12 11:46:35 bubba kernel: ohci1394: fw-host6: isochronous cycle too long
Apr 12 11:46:35 bubba kernel: ohci1394: fw-host6: SelfID received outside of bus reset sequence
Apr 12 11:46:35 bubba kernel: ohci1394: fw-host6: Unhandled interrupt(s) 0x00000100
Apr 12 11:46:41 bubba kernel: ohci1394: fw-host4: Unrecoverable error!
Apr 12 11:46:41 bubba kernel: ohci1394: fw-host4: physical posted write error
Apr 12 11:46:41 bubba kernel: ohci1394: fw-host4: isochronous cycle too long
Apr 12 11:46:41 bubba kernel: ohci1394: fw-host4: SelfID received outside of bus reset sequence
Apr 12 11:46:41 bubba kernel: ohci1394: fw-host4: Unhandled interrupt(s) 0x00000100
Apr 12 11:48:49 bubba kernel: ohci1394: fw-host3: isochronous cycle too long
Apr 12 11:48:49 bubba kernel: ohci1394: fw-host3: SelfID received outside of bus reset sequence
Apr 12 11:49:00 bubba last message repeated 11 times

Again, there were no isochronous packets or cycle starts. The FireSpy shows:
504 resets
4032 self IDs
8064 Read Requests

I notice that this kernel has the 'juju' driver, so I will try to get Kristian's libraw thing when I get the chance.

BTW, the "physical posted write error"s mean the controller has difficulties getting access to the PCI bus. The "Unhandled interrupt(s) 0x00000100" is merely a small mistake on my part when I added the logging of posted write errors. There should be a
event &= ~OHCI1394_postedWriteErr;
in the respective if block in ohci_irq_handler. But this mistake doesn't affect the operation. I don't know, though, if or how physical posted write errors could be related to the subsequent undesired masking of interrupt events.

How is it working with the latest kernel? (I'd suggest enabling OHCI1394_DEBUG; maybe it will give more info on the event that was received. Also, how about testing on a single CPU? Sounds like a bus analyzer would help.. ;)

The FireSpy is a FireWire analyzer. This is more a case for a PCI analyzer, or actually two: one in front of and one behind the Indigita card's PCI-X/PCI bridge. Theoretically speaking. (The Unibrain card never had this problem, right?)

PS: About testing a new kernel: the only interesting update in there is the new alternative firewire stack. The updates of the old ieee1394 stack are unlikely to affect the issue.

Ah, great - is it something that Robert should try, was it mentioned in #20 I suppose?

Yes, the firewire-ohci driver has for example a different bus reset interrupt handler. It's not split into a bus reset event and a self-ID-complete event anymore. Of course the new drivers aren't as widely tested yet; probably nobody has tried them together with the -rt patch yet. And there are still some features missing in the drivers and the userland support. (I still need to compile a status-quo summary; I hope I have it ready about when 2.6.22 is released.)

Library ports to the new driver stack:
libraw1394 from git://people.freedesktop.org/~krh/libraw1394 (juju branch)
libdc1394 v2 from SVN, https://libdc1394.svn.sourceforge.net/svnroot/libdc1394

This was on my queue to try, but unfortunately I left the job before I was able to try the new stack.
I passed the bug # along to the appropriate people. One of the last things I did was to try a similar kind of card from DAP design, a 3-interface card: http://www.dapdesign.com/index.php?id=84 which lasted through ~350,000 resets without any problems (1 reset per second). I don't remember whose PCI bridge the card used, but it is 64-bit/66MHz and not PCI-X.

Per #29: we actually had PCI bus analyzers, but the card and the slot were PCI-X 100/133MHz, which means 1 device per "bus". We didn't have any way to split the slot to get the bus analyzer in front of the card, and of course putting it after the bridge would have meant some kind of craziness (I don't think you solder wires onto a 133MHz bus).

There are currently no resources to fix this in drivers/ieee1394/. Anybody who is able to (attempt to) solve this, please reopen the bug. drivers/firewire/ has a fundamentally reworked bus reset handling and _hopefully_ does not feature this problem --- if it isn't a hardware problem in the first place.

Bug with some similarities: bug 10307