I want to capture some DV data via FireWire (IEEE 1394a). Kernels 2.6.30.4 and 2.6.31- are both affected. I will attach the stack traces. I haven't resolved a Linux kernel oops for a while, so to assure you that I did it right I will also attach what I did, showing that the bzImage I booted matches the vmlinux file. The crashes behaved slightly differently: the one with 2.6.30.4 forced the machine to reboot, while the one with 2.6.31-rc5-git7 just locked up the machine, with no SysRq magic keys working. This is highly reproducible.
Created attachment 22792 [details] how I resolve the OOPS
Created attachment 22793 [details] rash-rawdv-2.6.30.4.txt Triggered by the kino(1) program: go into Capture mode. Sometimes that is already enough. If not, start capturing the stream. Mostly the machine locks up hard, with no keyboard LEDs flashing (I think). Once it rebooted the machine.
Created attachment 22794 [details] crash-rawdv-2.6.31-rc5-git7.txt
Created attachment 22795 [details] resolved-2.6.30.4.txt
Created attachment 22796 [details] resolved-2.6.31-rc5-git7.txt
Stefan, could you please take a peek at this one? It's a stack overrun, so it's hard to tell what caused it. DV capture is implicated though. Martin, if you have CONFIG_4KSTACKS enabled then you could disable that. This may well "fix" the problem. We can then use CONFIG_DEBUG_STACKOVERFLOW and CONFIG_DEBUG_STACK_USAGE to work out where things are going wrong. Or there might be support for debugging stack usage problems in the new tracer code; we can ask Steven Rostedt and co. to help out if so. Thanks.
$ grep CONFIG_4KSTACKS /usr/src/linux-2.6.30.4/.config
# CONFIG_4KSTACKS is not set
$ grep CONFIG_4KSTACKS /usr/src/linux-2.6.31-rc5-git7/.config
# CONFIG_4KSTACKS is not set
$

I will recompile the kernel to enable some more debug. So far I had:

$ grep CONFIG_DEBUG /usr/src/linux-2.6.31-rc5-git7/.config
# CONFIG_DEBUG_DRIVER is not set
# CONFIG_DEBUG_DEVRES is not set
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_KERNEL=y
# CONFIG_DEBUG_SHIRQ is not set
# CONFIG_DEBUG_OBJECTS is not set
# CONFIG_DEBUG_SLAB is not set
# CONFIG_DEBUG_KMEMLEAK is not set
# CONFIG_DEBUG_RT_MUTEXES is not set
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_MUTEXES is not set
# CONFIG_DEBUG_LOCK_ALLOC is not set
# CONFIG_DEBUG_SPINLOCK_SLEEP is not set
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
# CONFIG_DEBUG_KOBJECT is not set
# CONFIG_DEBUG_HIGHMEM is not set
# CONFIG_DEBUG_BUGVERBOSE is not set
CONFIG_DEBUG_INFO=y
# CONFIG_DEBUG_VM is not set
# CONFIG_DEBUG_VIRTUAL is not set
# CONFIG_DEBUG_WRITECOUNT is not set
# CONFIG_DEBUG_MEMORY_INIT is not set
# CONFIG_DEBUG_LIST is not set
# CONFIG_DEBUG_SG is not set
# CONFIG_DEBUG_NOTIFIERS is not set
# CONFIG_DEBUG_BLOCK_EXT_DEVT is not set
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_DEBUG_STACKOVERFLOW is not set
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_DEBUG_RODATA is not set
# CONFIG_DEBUG_NX_TEST is not set
# CONFIG_DEBUG_BOOT_PARAMS is not set

BTW, please excuse my messy report. I just forgot that I have the stack traces resolved automatically, so there is no need to run ksymoops at all. I just do not understand why ksymoops gave those tons of warnings about the System.map file not matching vmlinux, while at least judging from the file timestamps they seemed to come from the same compilation run.
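The option checks discussed above can be scripted so that the same set of options is reported for every kernel tree. What follows is only a minimal sketch and not part of the original report: the function name config_state is made up for illustration, and it assumes the usual .config line forms ("CONFIG_FOO=value" and "# CONFIG_FOO is not set").

```shell
#!/bin/sh
# Report the state of the given options in a kernel .config file.
# For each option, prints either the "OPTION=value" line, "OPTION is not set",
# or "OPTION missing" when the file does not mention the option at all.
# config_state is a hypothetical helper name, not an existing tool.
config_state() {
    cfg=$1
    shift
    for opt in "$@"; do
        if line=$(grep -E "^${opt}=" "$cfg"); then
            printf '%s\n' "$line"
        elif grep -q -E "^# ${opt} is not set\$" "$cfg"; then
            printf '%s is not set\n' "$opt"
        else
            printf '%s missing\n' "$opt"
        fi
    done
}
```

For instance, config_state /usr/src/linux-2.6.31-rc5-git7/.config CONFIG_4KSTACKS CONFIG_DEBUG_STACKOVERFLOW CONFIG_DEBUG_STACK_USAGE would print one line per option, which makes it easy to compare two trees side by side.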
Martin, which hardware do you use? (Camcorder, FireWire controller according to lspci.) Did it work for you before with an older kernel or in a different hardware combination? What do "grep OPTIMIZE .config" and "make checkstack | grep 1394" show?
Furthermore, what if you unload ohci1394 and use firewire-ohci instead? This requires you to
- enable the newer FireWire stack in the kernel config,
- use libraw1394 v2.x (latest is best; Gentoo has got it: sys-libs/libraw1394-2.0.4, revdep-rebuild if you had libraw1394 v1.x until now)
> This requires you to - chmod the respective device files or upgrade udev rules, http://ieee1394.wiki.kernel.org/index.php/Juju_Migration
I am trying to get a stack trace out of 2.6.31-rc6-git6 (current). The machine once rebooted without sending any messages to the remote ttyS0 console. On the second attempt I copied some 15000 frames over FireWire and then it locked up, but again it did not send a single character to the remote console. "echo test >> /dev/ttyS0" works. The third time I just booted up and pressed Alt+SysRq+p, and the machine locked up with the following sent to the remote console:

test
test
test
SysRq : Emergency Sync
SysRq : Emergency Sync
SysRq : Emergency Sync
SysRq : Kill All Tasks
SysRq : Show Regs
Pid: 0, comm: swapper Not tainted (2.6.31-rc6-git6 #1) System Name
EIP: 0060:[<c11c01ac>] EFLAGS: 00000282 CPU: 0
EIP is at acpi_idle_enter_simple+0x11a/0x145
EAX: c14a5f94 EBX: 00000cab ECX: 00000355 EDX: 00000031
ESI: 00000000 EDI: f70cd448 EBP: c14a5fb4 ESP: c14a5f94
DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
CR0: 8005003b CR2: b8067f02 CR3: 3692a000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
Call Trace:
[<c124cdb8>] cpuidle_idle_call+0x57/0x8b
[<c1001b6d>] cpu_idle+0x1c/0x33
[<c12f59dd>] rest_init+0x4d/0x4f
[<c14a68b8>] start_kernel+0x217/0x21c
[<c14a62d8>] i386_start_kernel+0x32/0x37
(In reply to comment #8)
> Martin,
> which hardware do you use? (Camcorder, FireWire controller according to
> lspci.)

The computer is an ASUS L3C/S laptop, P4M-based with an ICH3-M chipset. It has an onboard FireWire chip, but I do not have a cable with the mini connector on both ends. So I use a PCMCIA Kouwell 7006 card with USB 2.0 ports and FireWire (IEEE 1394).

> Did it work for you before with an older kernel or in a different hardware
> combination?

It did some 4 years ago with the same HW, yes. ;)
Created attachment 22801 [details] stacktrace-2.6.31-rc6-git6.txt Finally a stacktrace from 2.6.31-rc6-git6. The machine rebooted itself but spit out some messages. I have manually unloaded dv1394 to ensure it does not interfere. I wonder why it talks about tcp stuff. Currently, it is not physically connected to the ethernet although it is enabled (had to move to a room with a computer with the serial console so no network connection).
It is a Gentoo Linux ~x86 machine. I have installed libdc1394-2.1.0 (the wikipage you mention suggests 2.1.2) and libraw1394-2.0.4. The camera is Sony Handycam DCR-HC18E-PAL.
Created attachment 22803 [details] 2.6.31-rc6-git6/.config triggering the bug
(In reply to comment #8)
> What do "grep OPTIMIZE .config" and "make checkstack | grep 1394" show?

linux-2.6.31-rc6-git6 # grep OPTIMIZE .config
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
# CONFIG_OPTIMIZE_INLINING is not set
linux-2.6.31-rc6-git6 # make checkstack | grep 1394
0x00002d3b dma_rcv_tasklet [ohci1394]: 304
0x00000a6c video1394_ioctl [video1394]: 212
0x00007ed4 hpsb_default_host_entry [ieee1394]: 132
0x00001f05 state_connected [raw1394]: 120
0x000007a1 ether1394_data_handler [eth1394]: 116
0x00000e43 build_speed_map [ieee1394]: 108
0x000024cf ohci_iso_recv_task [ohci1394]: 104
0x00000e54 build_speed_map [ieee1394]: Dynamic (%eax)
linux-2.6.31-rc6-git6 #
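When the checkstack list gets long, it can help to keep only the frames above some size. The following is a rough sketch under the assumption that every line has the shape "address function [module]: size" as in the output above; the helper name big_frames and the threshold argument are made up for illustration.

```shell
#!/bin/sh
# Filter "make checkstack" output (read from stdin) down to functions whose
# static stack frame is at least MIN bytes, largest first.
# Entries reported as "Dynamic (...)" have no numeric size and are skipped.
# big_frames is a hypothetical helper name, not part of the kernel tree.
big_frames() {
    awk -v min="$1" '
        $NF ~ /^[0-9]+$/ && $NF + 0 >= min + 0 {
            module = $3
            sub(/:$/, "", module)      # "[ohci1394]:" -> "[ohci1394]"
            print $NF, $2, module
        }' | sort -rn
}
```

It would be used as "make checkstack | big_frames 200". Note that this only reports static frame sizes per function; it says nothing about how deep the call chain actually gets at run time.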
(In reply to comment #7)
> $ grep CONFIG_4KSTACKS /usr/src/linux-2.6.30.4/.config
> # CONFIG_4KSTACKS is not set
> $ grep CONFIG_4KSTACKS /usr/src/linux-2.6.31-rc5-git7/.config
> # CONFIG_4KSTACKS is not set
> $
>
> I will recompile the kernel to enable some more debug. So far I had:

"Inadvertently" I enabled in addition to many DEBUG variables

-# CONFIG_CC_STACKPROTECTOR is not set
+CONFIG_CC_STACKPROTECTOR_ALL=y
+CONFIG_CC_STACKPROTECTOR=y

and then both the old FireWire stack and the Juju stack drivers worked (captured some 32 min of data). Probably not surprising for you. ;)
> I wonder why it talks about tcp stuff.

Something involved in DV reception might have mistakenly written to the kernel stack, and unrelated timer interrupts trip over it. (Side note: Here is a comment from Ingo Molnar on whether it is a stack overflow or memory corruption: http://lkml.org/lkml/2009/8/21/311)

> linux-2.6.31-rc6-git6 # make checkstack | grep 1394
> [...]

OK, this is close to what I get on an x86-32 box here too.

> "Inadvertently" I enabled in addition to many DEBUG variables
> [...]
> and then both old firewire stack and the juju stack drivers worked

Could you attach the working .config as well?
Created attachment 22804 [details] 2.6.31-rc6-git6/.config NOT triggering the bug
Created attachment 22806 [details] live_kernel_stack_traces_without_the_PCMCIA_card_inserted.txt
Created attachment 22807 [details] live_kernel_stack_traces_with_the_PCMCIA_card_inserted.txt
Created attachment 22808 [details] another-stacktrace-2.6.31-rc6-git6.txt So, after disabling the gcc stack protector config option again while keeping stack tracing enabled, I got one reboot with no oops on the remote serial console. On a second attempt I got this. This is using the old FireWire stack.
Created attachment 22809 [details] yet-another-stacktrace-2.6.31-rc6-git6.txt This crash is funny. The mouse has stopped, but the sound card plays the sound recorded on the tape (faster than in real time). The status line of kino(1) says "Waiting for DV 7 ...". Neither unplugging the FireWire cable nor pulling out the PCMCIA card changed anything. The computer still makes a sound. Weird. The disk LED does not flash, and no magic keys are working. Is it replaying some files? ;-) It was a fresh, warm boot.
Created attachment 22810 [details] juju-stacktrace-2.6.31-rc6-git6.txt The Juju driver has the same problem. The machine locked up just after going into the "Capture" tab in kino(1). However, it rebooted by itself after a few seconds. Unfortunately this is only a partial stack trace.
Created attachment 22811 [details] another-juju-stacktrace-2.6.31-rc6-git6.txt Entered the "Capture" tab, saw some video frames but before even pressing "capture" machine locked and rebooted.
Created attachment 22812 [details] yet-another-juju-stacktrace-2.6.31-rc6-git6.txt I passed hpet=off acpi=off on the kernel commandline but it did not help.
Created attachment 22813 [details] juju-stacktrace-2.6.31-rc6-git6_with_highres-off.txt I added highres=off but hey, the machine rebooted again. Unfortunately ftrace_dump_on_oops is too chatty: the dump does not finish over the serial console and is thus useless.
Created attachment 22814 [details] prelast-juju-stacktrace-2.6.31-rc6-git6_with_highres-off.txt So here is the pre-last stacktrace when highres=off was passed to kernel with JuJu stack and no ftrace dump enabled.
Created attachment 22815 [details] last-juju-stacktrace-2.6.31-rc6-git6_with_highres-off_nousb.txt Although I did rmmod ehci_hcd and uhci_hcd the machine rebooted but at least the stacktrace does not complain about stack corruption (booted with highres=off as previously). Maybe for the first time? Sorry for all the spam, I will wait what you say before shooting blindly.
Since isochronous receive DMA causes panics not only through ohci1394+ieee1394+raw1394 but also through firewire-ohci+firewire-core, it is highly unlikely that there is a driver problem. Isochronous reception has been newly implemented from scratch in the latter, and it even uses a different DMA mode of the OHCI-1394 chip (different buffer management, different DMA programs).

So, either it's a hardware fault, or some other kernel component is buggy.

Try to simplify your system step by step to eliminate one component after another. E.g. one starting step could be to use dvgrab (command line DV capture tool) rather than kino. You can run it first in the same environment as you run kino (full X desktop), then with X shut down, sound drivers unloaded or unbound, networking drivers unloaded or unbound, etc.

Also build test kernels with reduced features. E.g. remove ACPI power management options. You can do so also in a bisection style, e.g. remove a whole bundle of features between two steps, and if that helps, bring back a few options. (Just keep track of what set of features you changed between any two steps.)

----

Well, you actually already did come up with a narrow bad--good pair of configs, but I can't really make much sense of it:

From comment #19:
$ diff -u "attachment 22803 [details]" "attachment 22804 [details]"
@@ -1,7 +1,7 @@
 #
 # Automatically generated make config: don't edit
 # Linux kernel version: 2.6.31-rc6-git6
-# Fri Aug 21 13:26:12 2009
+# Sat Aug 22 02:23:52 2009
 #
 # CONFIG_64BIT is not set
 CONFIG_X86_32=y
@@ -45,7 +45,6 @@
 CONFIG_GENERIC_HARDIRQS=y
 CONFIG_GENERIC_HARDIRQS_NO__DO_IRQ=y
 CONFIG_GENERIC_IRQ_PROBE=y
-CONFIG_X86_32_LAZY_GS=y
 CONFIG_KTIME_SCALAR=y
 CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
 CONFIG_CONSTRUCTORS=y

This is auto-selected dependent on CC_STACKPROTECTOR.
@@ -302,7 +301,8 @@
 # CONFIG_X86_PAT is not set
 # CONFIG_EFI is not set
 CONFIG_SECCOMP=y
-# CONFIG_CC_STACKPROTECTOR is not set
+CONFIG_CC_STACKPROTECTOR_ALL=y
+CONFIG_CC_STACKPROTECTOR=y
 # CONFIG_HZ_100 is not set
 # CONFIG_HZ_250 is not set
 # CONFIG_HZ_300 is not set

Huh? This option is supposed to /cause/ a kernel panic when the stack was corrupted, not to /suppress/ a kernel panic.

@@ -347,7 +347,8 @@
 CONFIG_ACPI_CUSTOM_DSDT_FILE=""
 # CONFIG_ACPI_CUSTOM_DSDT is not set
 CONFIG_ACPI_BLACKLIST_YEAR=0
-# CONFIG_ACPI_DEBUG is not set
+CONFIG_ACPI_DEBUG=y
+# CONFIG_ACPI_DEBUG_FUNC_TRACE is not set
 CONFIG_ACPI_PCI_SLOT=y
 CONFIG_X86_PM_TIMER=y
 CONFIG_ACPI_CONTAINER=y

Could ACPI_DEBUG modify the behaviour or timings of the kernel's ACPI subsystem so that bad things aren't happening anymore, by chance? (Even though actual debug output is only emitted if a kernel command line option is also set.)

@@ -422,7 +423,7 @@
 # CONFIG_SCx200 is not set
 # CONFIG_OLPC is not set
 CONFIG_PCCARD=y
-# CONFIG_PCMCIA_DEBUG is not set
+CONFIG_PCMCIA_DEBUG=y
 CONFIG_PCMCIA=m
 CONFIG_PCMCIA_LOAD_CIS=y
 CONFIG_PCMCIA_IOCTL=y

Ditto, except s/ACPI/CardBus/.

@@ -814,7 +815,11 @@
 #
 # See the help texts for more information.
 #
-# CONFIG_FIREWIRE is not set
+CONFIG_FIREWIRE=m
+CONFIG_FIREWIRE_OHCI=m
+CONFIG_FIREWIRE_OHCI_DEBUG=y
+CONFIG_FIREWIRE_SBP2=m
+CONFIG_FIREWIRE_NET=m
 CONFIG_IEEE1394=m
 CONFIG_IEEE1394_OHCI1394=m
 # CONFIG_IEEE1394_PCILYNX is not set

As you already found out, enabling these options and even using these drivers instead of the 1394 ones is inconsequential to the bug.
@@ -825,7 +830,7 @@
 CONFIG_IEEE1394_RAWIO=m
 CONFIG_IEEE1394_VIDEO1394=m
 CONFIG_IEEE1394_DV1394=m
-CONFIG_IEEE1394_VERBOSEDEBUG=y
+# CONFIG_IEEE1394_VERBOSEDEBUG is not set
 # CONFIG_I2O is not set
 # CONFIG_MACINTOSH_DRIVERS is not set
 CONFIG_NETDEVICES=y

IEEE1394_VERBOSEDEBUG lets the 1394 stack emit rather massive log messages, hence I presume you ran most of your tests without this option and already found the kernel panicking with as well as without this option.

@@ -2252,7 +2257,8 @@
 # CONFIG_DEBUG_OBJECTS is not set
 # CONFIG_DEBUG_SLAB is not set
 # CONFIG_DEBUG_KMEMLEAK is not set
-# CONFIG_DEBUG_RT_MUTEXES is not set
+CONFIG_DEBUG_RT_MUTEXES=y
+CONFIG_DEBUG_PI_LIST=y
 # CONFIG_RT_MUTEX_TESTER is not set
 # CONFIG_DEBUG_SPINLOCK is not set
 # CONFIG_DEBUG_MUTEXES is not set

Looks inconsequential.

@@ -2263,7 +2269,7 @@
 # CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
 # CONFIG_DEBUG_KOBJECT is not set
 # CONFIG_DEBUG_HIGHMEM is not set
-# CONFIG_DEBUG_BUGVERBOSE is not set
+CONFIG_DEBUG_BUGVERBOSE=y
 CONFIG_DEBUG_INFO=y
 # CONFIG_DEBUG_VM is not set
 # CONFIG_DEBUG_VIRTUAL is not set

Likely inconsequential, IMO.

@@ -2281,7 +2287,7 @@
 # CONFIG_DEBUG_BLOCK_EXT_DEVT is not set
 # CONFIG_FAULT_INJECTION is not set
 # CONFIG_LATENCYTOP is not set
-# CONFIG_SYSCTL_SYSCALL_CHECK is not set
+CONFIG_SYSCTL_SYSCALL_CHECK=y
 # CONFIG_DEBUG_PAGEALLOC is not set
 CONFIG_USER_STACKTRACE_SUPPORT=y
 CONFIG_HAVE_FUNCTION_TRACER=y

Looks inconsequential.

@@ -2310,6 +2316,7 @@
 # CONFIG_BLK_DEV_IO_TRACE is not set
 # CONFIG_MMIOTRACE is not set
 # CONFIG_PROVIDE_OHCI1394_DMA_INIT is not set
+# CONFIG_FIREWIRE_OHCI_REMOTE_DMA is not set
 # CONFIG_DYNAMIC_DEBUG is not set
 # CONFIG_DMA_API_DEBUG is not set
 # CONFIG_SAMPLES is not set

Inconsequential.
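For the bisection-style config narrowing described above, it may help to reduce each bad/good pair of .config files to just the options that changed, without context lines or the timestamp header. The following is a minimal sketch, not something used in this thread: config_diff is a made-up helper name, and it assumes the two arguments are ordinary .config files.

```shell
#!/bin/sh
# Print only the option lines that differ between two kernel .config files.
# Lines prefixed "<" exist only in the first file, ">" only in the second.
# config_diff is a hypothetical helper name.
config_diff() {
    old=$(mktemp) || return 1
    new=$(mktemp) || { rm -f "$old"; return 1; }
    # Keep "CONFIG_FOO=..." and "# CONFIG_FOO is not set", drop other comments.
    pattern='^(CONFIG_[A-Za-z0-9_]+=|# CONFIG_[A-Za-z0-9_]+ is not set)'
    grep -E "$pattern" "$1" | LC_ALL=C sort > "$old"
    grep -E "$pattern" "$2" | LC_ALL=C sort > "$new"
    # Sorting makes the result independent of where an option sits in the file.
    diff "$old" "$new" | grep '^[<>]'
    rm -f "$old" "$new"
}
```

Saving the output of "config_diff bad.config good.config" after every step gives a compact record of which set of features changed between any two test kernels.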
(In reply to comment #30)
> -CONFIG_X86_32_LAZY_GS=y
> This is auto-selected dependent on CC_STACKPROTECTOR.
> @@ -302,7 +301,8 @@
> # CONFIG_X86_PAT is not set
> # CONFIG_EFI is not set
> CONFIG_SECCOMP=y
> -# CONFIG_CC_STACKPROTECTOR is not set
> +CONFIG_CC_STACKPROTECTOR_ALL=y
> +CONFIG_CC_STACKPROTECTOR=y
> Huh? This option is supposed to /cause/ a kernel panic when the stack was
> corrupted, not to /suppress/ a kernel panic.

Stefan, when I reverted this change and left out the stackprotector stuff, the problem re-appeared, really. I used gcc-3.4.1; now I also have 4.4.1, but I could switch back to 3.4.1 using gcc-config if somebody wants me to (Gentoo Linux box).

The problem(s) is (are) neither caused nor avoided by the massive FireWire debug messages, nor by ACPI or other debug options.

I cannot interpret the stack traces, but unloading the USB drivers actually made the stack "uncorrupted", unlike in most previous cases. In this context I have to add that I just commented on a Gentoo Linux bug that my uhci-hcd was loaded before ehci-hcd. I was getting a Warning in dmesg(1) that it should be vice versa. Oh, here is what I did to fix the Warning, although I haven't tried the DV test case with it yet ;):

$ tail /etc/modprobe.conf
# make sure this does not happen anymore: "Warning! ehci_hcd should always be loaded before uhci_hcd and ohci_hcd, not after"
# fix from http://bugs.gentoo.org/show_bug.cgi?id=260139
install uhci-hcd /sbin/modprobe ehci-hcd ; /sbin/modprobe -i uhci-hcd
install ohci-hcd /sbin/modprobe ehci-hcd ; /sbin/modprobe -i ohci-hcd
$

The "nousb" case was not the first time the stack trace was not automagically tagged as corrupted (more below). Maybe several bugs co-exist? Please note I did not use "nousb" on the kernel command line but really unloaded the two drivers on the fly (just in case it would really matter).
Stacks obtained and NOT tagged as corrupted (the name of each crash file is followed by the relevant kernel command line used to boot the machine and by the stack_trace file contents captured before running kino or entering "Capture" mode; ACPI was compiled in statically):

another-juju-stacktrace-2.6.31-rc6-git6.txt
Kernel command line: root=/dev/sda3 console=ttyS0,115200n8 console=tty0 idebus=66 probe_mask=0x3f closcksource=acpi_pm,notsc stacktrace udev
Depth Size Location (42 entries)
----- ---- --------
0) 2388 32 getnstimeofday+0x53/0xd7
1) 2356 16 ktime_get_ts+0x25/0x49
2) 2340 32 ktime_get+0x15/0x33
3) 2308 16 sched_clock_tick+0x45/0x6e
4) 2292 24 scheduler_tick+0x19/0xcc
5) 2268 16 update_process_times+0x3e/0x49
6) 2252 48 tick_sched_timer+0x154/0x17a
7) 2204 20 __run_hrtimer+0x3a/0x5d
8) 2184 44 hrtimer_interrupt+0xe9/0x13d
9) 2140 8 timer_interrupt+0x1a/0x23
10) 2132 32 handle_IRQ_event+0x52/0xf1
11) 2100 16 handle_level_irq+0x5c/0x92
12) 2084 20 handle_irq+0x40/0x4c
13) 2064 24 do_IRQ+0x39/0x74
14) 2040 120 common_interrupt+0x29/0x30
15) 1920 432 extract_buf+0x6b/0xba
16) 1488 44 extract_entropy+0x44/0x8a
17) 1444 16 get_random_bytes+0x1a/0x1e
18) 1428 16 rt_cache_invalidate+0x1b/0x2a
19) 1412 24 rt_cache_flush+0x15/0x9c
20) 1388 32 fib_inetaddr_event+0x19d/0x1aa
21) 1356 28 notifier_call_chain+0x26/0x48
22) 1328 36 __blocking_notifier_call_chain+0x3c/0x51
23) 1292 16 blocking_notifier_call_chain+0x11/0x13
24) 1276 28 __inet_insert_ifa+0xfe/0x109
25) 1248 48 inetdev_event+0x142/0x32c
26) 1200 28 notifier_call_chain+0x26/0x48
27) 1172 16 raw_notifier_call_chain+0x11/0x13
28) 1156 8 call_netdevice_notifiers+0x16/0x18
29) 1148 16 dev_open+0xb5/0xbe
30) 1132 28 dev_change_flags+0x9b/0x14a
31) 1104 52 do_setlink+0x247/0x2e3
32) 1052 244 rtnl_newlink+0x299/0x410
33) 808 40 rtnetlink_rcv_msg+0x18d/0x1a3
34) 768 20 netlink_rcv_skb+0x35/0x7d
35) 748 12 rtnetlink_rcv+0x20/0x27
36) 736 36 netlink_unicast+0x18f/0x1ef
37) 700 64 netlink_sendmsg+0x21d/0x22a
38) 636 216 sock_sendmsg+0xcb/0xe1
39) 420 292 sys_sendmsg+0x14e/0x19b
40) 128 48 sys_socketcall+0x147/0x170
41) 80 80 sysenter_do_call+0x12/0x26

yet-another-juju-stacktrace-2.6.31-rc6-git6.txt
Kernel command line: root=/dev/sda3 console=ttyS0,115200n8 console=tty0 idebus=66 probe_mask=0x3f closcksource=acpi_pm,notsc stacktrace ftrace_dump_on_oops highres=off udev
Depth Size Location (32 entries)
----- ---- --------
0) 2192 20 check_preempt_wakeup+0x1a/0xbb
1) 2172 44 try_to_wake_up+0x131/0x145
2) 2128 8 wake_up_state+0xf/0x11
3) 2120 8 signal_wake_up+0x22/0x24
4) 2112 28 complete_signal+0x171/0x189
5) 2084 32 T.642+0x1a9/0x1bd
6) 2052 12 __group_send_sig_info+0xf/0x11
7) 2040 24 group_send_sig_info+0x41/0x4c
8) 2016 184 send_sigio+0x12c/0x179
9) 1832 20 __kill_fasync+0x43/0x52
10) 1812 8 kill_fasync+0x13/0x15
11) 1804 48 evdev_event+0x6d/0x101
12) 1756 32 input_pass_event+0x28/0x70
13) 1724 44 input_handle_event+0x371/0x37a
14) 1680 28 input_event+0x43/0x51
15) 1652 56 synaptics_process_byte+0x4ae/0x669
16) 1596 12 psmouse_handle_byte+0x11/0xc3
17) 1584 20 psmouse_interrupt+0x20d/0x21c
18) 1564 16 serio_interrupt+0x1a/0x3a
19) 1548 48 i8042_interrupt+0x1cd/0x1de
20) 1500 32 handle_IRQ_event+0x52/0xf1
21) 1468 16 handle_level_irq+0x5c/0x92
22) 1452 20 handle_irq+0x40/0x4c
23) 1432 24 do_IRQ+0x39/0x74
24) 1408 96 common_interrupt+0x29/0x30
25) 1312 24 ftrace_call+0x5/0x8
26) 1288 84 schedule_hrtimeout_range+0xb4/0xf4
27) 1204 16 poll_schedule_timeout+0x2c/0x43
28) 1188 752 do_select+0x4a9/0x4f0
29) 436 316 core_sys_select+0x1bf/0x272
30) 120 40 sys_select+0x6f/0x8b
31) 80 80 sysenter_do_call+0x12/0x26

last-juju-stacktrace-2.6.31-rc6-git6_with_highres-off_nousb.txt
Kernel command line: root=/dev/sda3 console=ttyS0,115200n8 console=tty0 idebus=66 probe_mask=0x3f closcksource=acpi_pm,notsc stacktrace highres=off udev
Depth Size Location (50 entries)
----- ---- --------
0) 2260 24 enqueue_task_fair+0x2b/0xad
1) 2236 16 T.1248+0x31/0x3c
2) 2220 8 T.1250+0x20/0x28
3) 2212 44 try_to_wake_up+0x4f/0x145
4) 2168 8 default_wake_function+0x10/0x12
5) 2160 20 autoremove_wake_function+0x14/0x34
6) 2140 36 __wake_up_common+0x36/0x5d
7) 2104 20 __wake_up+0x16/0x1f
8) 2084 32 insert_work+0x7c/0x85
9) 2052 12 queue_work_on+0x31/0x3b
10) 2040 8 queue_work+0x13/0x15
11) 2032 8 kblockd_schedule_work+0x12/0x14
12) 2024 56 cfq_completed_request+0x269/0x271
13) 1968 16 elv_completed_request+0x47/0x9e
14) 1952 20 __blk_put_request+0x25/0x94
15) 1932 28 blk_finish_request+0x147/0x14e
16) 1904 20 blk_end_bidi_request+0x2c/0x38
17) 1884 12 blk_end_request+0xf/0x11
18) 1872 60 scsi_io_completion+0x164/0x379
19) 1812 24 scsi_finish_command+0x96/0x9c
20) 1788 24 scsi_softirq_done+0xdd/0xe5
21) 1764 24 blk_done_softirq+0x57/0x64
22) 1740 36 __do_softirq+0x93/0x132
23) 1704 12 do_softirq+0x2a/0x2f
24) 1692 8 irq_exit+0x2d/0x2f
25) 1684 24 do_IRQ+0x61/0x74
26) 1660 116 common_interrupt+0x29/0x30
27) 1544 132 generic_make_request+0x365/0x3a0
28) 1412 48 submit_bio+0x88/0x8f
29) 1364 32 submit_bh+0xf1/0x111
30) 1332 12 __bread+0x54/0x82
31) 1320 36 ext3_get_branch+0x66/0xcf
32) 1284 192 ext3_get_blocks_handle+0x84/0x6f2
33) 1092 56 ext3_get_block+0x91/0xcd
34) 1036 192 do_mpage_readpage+0x28a/0x5db
35) 844 124 mpage_readpages+0x9c/0xd3
36) 720 12 ext3_readpages+0x19/0x1b
37) 708 52 __do_page_cache_readahead+0xd9/0x14f
38) 656 20 ra_submit+0x1c/0x21
39) 636 44 filemap_fault+0x15e/0x2cd
40) 592 68 __do_fault+0x40/0x2ec
41) 524 68 handle_mm_fault+0x1d9/0x402
42) 456 48 do_page_fault+0x252/0x268
43) 408 80 error_code+0x5e/0x64
44) 328 8 padzero+0x1e/0x2d
45) 320 144 load_elf_binary+0x5d0/0xfe1
46) 176 36 search_binary_handler+0x6f/0x1c4
47) 140 36 do_execve+0x1ae/0x29d
48) 104 24 sys_execve+0x2b/0x53
49) 80 80 sysenter_do_call+0x12/0x26

For completeness I am attaching the stack_trace stats from when the old FireWire stack was used. This should match the attempts in the another-stacktrace-2.6.31-rc6-git6.txt or yet-another-stacktrace-2.6.31-rc6-git6.txt cases, in which ACPI symbols were in the stack dump although the traces were corrupted.

Kernel command line: root=/dev/sda3 console=ttyS0,115200n8 console=tty0 idebus=66 probe_mask=0x3f closcksource=acpi_pm,notsc stacktrace udev
Depth Size Location (58 entries)
----- ---- --------
0) 2576 24 enqueue_task_fair+0x2b/0xad
1) 2552 16 T.1248+0x31/0x3c
2) 2536 8 T.1250+0x20/0x28
3) 2528 44 try_to_wake_up+0x4f/0x145
4) 2484 8 wake_up_state+0xf/0x11
5) 2476 8 signal_wake_up+0x22/0x24
6) 2468 28 complete_signal+0x171/0x189
7) 2440 32 T.642+0x1a9/0x1bd
8) 2408 12 __group_send_sig_info+0xf/0x11
9) 2396 24 group_send_sig_info+0x41/0x4c
10) 2372 184 send_sigio+0x12c/0x179
11) 2188 20 __kill_fasync+0x43/0x52
12) 2168 8 kill_fasync+0x13/0x15
13) 2160 48 evdev_event+0x6d/0x101
14) 2112 32 input_pass_event+0x28/0x70
15) 2080 44 input_handle_event+0x371/0x37a
16) 2036 28 input_event+0x43/0x51
17) 2008 56 synaptics_process_byte+0x4ae/0x669
18) 1952 12 psmouse_handle_byte+0x11/0xc3
19) 1940 20 psmouse_interrupt+0x20d/0x21c
20) 1920 16 serio_interrupt+0x1a/0x3a
21) 1904 48 i8042_interrupt+0x1cd/0x1de
22) 1856 32 handle_IRQ_event+0x52/0xf1
23) 1824 16 handle_level_irq+0x5c/0x92
24) 1808 20 handle_irq+0x40/0x4c
25) 1788 24 do_IRQ+0x39/0x74
26) 1764 80 common_interrupt+0x29/0x30
27) 1684 44 scsi_request_fn+0x29b/0x378
28) 1640 12 __blk_run_queue+0x3a/0x5c
29) 1628 12 blk_run_queue+0x11/0x16
30) 1616 52 scsi_run_queue+0x1a9/0x1f6
31) 1564 20 scsi_next_command+0x2d/0x39
32) 1544 60 scsi_io_completion+0x1a9/0x379
33) 1484 24 scsi_finish_command+0x96/0x9c
34) 1460 24 scsi_softirq_done+0xdd/0xe5
35) 1436 24 blk_done_softirq+0x57/0x64
36) 1412 36 __do_softirq+0x93/0x132
37) 1376 12 do_softirq+0x2a/0x2f
38) 1364 8 irq_exit+0x2d/0x2f
39) 1356 24 do_IRQ+0x61/0x74
40) 1332 116 common_interrupt+0x29/0x30
41) 1216 132 generic_make_request+0x365/0x3a0
42) 1084 48 submit_bio+0x88/0x8f
43) 1036 32 submit_bh+0xf1/0x111
44) 1004 12 __bread+0x54/0x82
45) 992 36 ext3_get_branch+0x66/0xcf
46) 956 192 ext3_get_blocks_handle+0x84/0x6f2
47) 764 56 ext3_get_block+0x91/0xcd
48) 708 192 do_mpage_readpage+0x28a/0x5db
49) 516 124 mpage_readpages+0x9c/0xd3
50) 392 12 ext3_readpages+0x19/0x1b
51) 380 52 __do_page_cache_readahead+0xd9/0x14f
52) 328 20 ra_submit+0x1c/0x21
53) 308 44 filemap_fault+0x15e/0x2cd
54) 264 68 __do_fault+0x40/0x2ec
55) 196 68 handle_mm_fault+0x1d9/0x402
56) 128 48 do_page_fault+0x252/0x268
57) 80 80 error_code+0x5e/0x64

I have now disabled the highres timer in the kernel config but haven't tested that yet. The "highres=off" on the command line apparently did not disable it, or at least not completely, as the "highres" string still appeared in the stack traces. I thought I did try noapci or acpi=off, but according to the dmesg outputs I have stored, either I did not or it got ignored.

Before I try more, could somebody please let me know whether you can gather something from the non-corrupted stack traces?
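To see at a glance which functions eat most of the stack in dumps like the ones above, the Size column can be sorted. This is only a sketch under the assumption that each entry has the shape "N) depth size location" as shown; top_frames is a made-up name and the default of five entries is arbitrary.

```shell
#!/bin/sh
# Read a stack tracer dump (the "Depth Size Location" table) on stdin and
# print the COUNT largest frames as "size location", biggest first.
# Header and separator lines do not start with "N)" and are ignored.
# top_frames is a hypothetical helper name; COUNT defaults to 5.
top_frames() {
    awk '$1 ~ /^[0-9]+\)$/ && $3 ~ /^[0-9]+$/ { print $3, $4 }' |
        sort -rn | head -n "${1:-5}"
}
```

On a running system it could be fed from the tracer's stack_trace file, e.g. "top_frames 10 < /sys/kernel/debug/tracing/stack_trace", assuming debugfs is mounted at /sys/kernel/debug. For the second dump above, this would put do_select (752 bytes) and core_sys_select (316 bytes) at the top.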
I at least have no further idea at this point.
First of all, I dug out an old 2.6.29-rc7 kernel image; the machine still rebooted during the test. Same for 2.6.29.1.

I fixed /etc/modprobe.conf so that the USB drivers get loaded in the proper order. The "Warning" message disappeared from dmesg(1), but there was no improvement regarding the FireWire issue.

I reproduced it with "dvgrab fileprefix-". So I shrank the kernel options. At one point I thought that removing dr+agp+backlight_lcd+fb+parport+i2c helped (I got a "PANIC: doublefault" message, but it did not appear on the serial console), but when re-adding all of these I could not find out which one was the cause: I got crashes until I reached exactly the same .config (as judged by diff(1)). :( But since that time the machine prints to the console (the one on tty at least) a message that a damaged FireWire packet was received.

I shrank further. I suspected that yenta and serial sharing an interrupt could be a problem. I dropped even serial, mouse, MMIO mapped packets, special init code for Ricoh bridges ... Still, the machine rebooted after spitting out the message about a defective packet. Note that HPET, HIGHRES, watchdog and cpufreq are not in my kernel anymore. At the moment I boot into runlevel 2, as dropping network, sockets etc. causes my X session to ignore my keyboard.

I will attach the .config I have and some dmesg and other info. The problem can even now be triggered by "dvgrab -status". It prints the message about a broken packet and a .dv file is created on the disk, but sometimes the machine does not reboot. As the magic keys worked, I obtained some dumps.

I haven't tried with the very last config, but a few steps ago I tried the Juju stack and got the same results: damaged packet notes and a reboot.

Questions to you:

1. The machine has an onboard FireWire chipset with two mini connectors. As I lost the proper cable, I use a PCMCIA card with 2 USB 2.0 and 2 FireWire sockets (one is mini-sized and the other is the big one, which I use). The dmesg(1) output shows that the FireWire controllers are 1.0 and 1.1 spec. Further, their max packet sizes are 1024 and 2048. Could that fool the driver? How can I disable the onboard chip?

2. The damaged packet output I could post as a .jpg file from a camera. dvgrab stopped the camera, but although it showed several packet timestamps it spoke only about a single packet?

3. Sometimes in the dmesg(1) output I saw a note:
ieee1394: Current remote IRM is not 1394a-2000 compliant, resetting...
Not always, but it might give you a clue.

4. The PCMCIA card has an optional input for external power. I do not have it.

5. The camera is powered from a battery; sorry, getting physical access to the power adapter will take months. The FireWire cable is good, thick and shielded. I use it for my external FireWire sound card without problems. I do have a few more of these cables with the big and mini connectors (but not the one with two mini connectors).
Created attachment 22827 [details] attempt45.tar.gz BTW, during my test once I did use "acpi=off" but that did not help. So, at the moment I still do have some acpi stuff in the kernel. Please advise what else I could drop from .config.
> BTW, during my test once I did use "acpi=off" but that did not help.
> So, at the moment I still do have some acpi stuff in the kernel.
> Please advise what else I could drop from .config.

Could you nevertheless do one last test with a kernel with CONFIG_ACPI=n? Besides that, I don't spot anything interesting to switch off. You did already test with dvgrab on a text console, X11 shut down, right? I.e. after /etc/init.d/xdm stop.

> I haven't tried with the very last config but few steps ago I tried
> the JuJu stack and got same results - damaged packet notes and reboot.

Your last tests with minimal kernels bring the DV reception DMA back into the picture as the culprit. Still, what's curious is that (1.) both driver stacks trigger the bug even though they use very different DMA modes of the controller and don't share code in this area, (2.) your hardware/software environment is pretty common, yet I don't remember having heard of similar crashes.

> Questions to you:
> 1. The machine has onboard firewire chipset with two mini connectors.
> As I lost the proper cable I use PCMCIA card with 2 USB2.0 and 2
> firewire sockets (one is the mini sized and the other the big thing -
> which I use). dmesg(1) outputs show that the firewire controllers are
> 1.0 and 1.1 spec. Further, their max packet sizes are 1024 and 2048.

That's quite common.

> Could that fool the driver?

No. Some people use two or even more controllers regularly and extensively, alone and together, same ones or different ones. I for one have a box with up to four different 1394 controllers present at a time, among them one out of a few CardBus cards which I have here.

> How can I disable the onboard chip?

If you don't have a BIOS menu for that, then you can either unbind it:
# lspci | grep 1394
# ls /sys/module/ohci1394/drivers/pci\:ohci1394/
# echo -n "0000:04:00.0" > /sys/module/ohci1394/drivers/pci\:ohci1394/unbind
(Use the device ID from the sysfs listing, not the one from lspci.)
Or you can hack linux/drivers/ieee1394/ohci1394.c ohci1394_pci_probe to return early with error if dev->vendor matches the vendor ID (from lspci -nn) of the built-in controller.

BTW, a 1394 controller which is idle will only do a single brief DMA right after initialization (8 or 12 bytes or so). If driven by firewire-ohci instead of ohci1394, it will also emit an interrupt event every 64 seconds (signaling the wrapping of a bus timer, causing the CPU to update a kernel variable and a chip register). IOW an idle 1394 controller is almost like absent.

> 2. The damaged packet(s) output I could post as a .jpg file from a
> camera. dvgrab stopped the camera but although it showed several packet
> timestamps it spoke only about a single packet?

I'm not familiar with dvgrab's integrity testing or how it bails out when it detects data errors.

The corrupt packet, as logged by dvgrab, could be because the camcorder sent junk, or because the controller wrote junk into memory, or because the drivers chased wrong buffer pointers after a memory corruption.

BTW, if the camcorder already sent garbage (or the cable corrupted some data), then the controller should still _not_ overwrite random memory (it is programmed by the drivers to DMA into the designated buffers only, not anywhere else); or more likely in such a case it should detect a CRC error and pass just this error status up to the application. However, if your CardBus card (or the CardBus bridge) is buggy or damaged, then all bets are off.

> 3. Sometimes in the dmesg(1) output I saw a note:
> ieee1394: Current remote IRM is not 1394a-2000 compliant, resetting...
> Not always but might give you a clue.

That's normal when a 1394 node with lesser bus management capabilities is plugged in.
Management of the 1394 bus (as a peer-to-peer or network-like bus) is voluntarily taken over by the nodes whose firmware or OS is most capable of doing so; if several such nodes are present, they have a protocol to select the one which does the work. "IRM" = isochronous resource manager.

> 4. The PCMCIA card has an optional input for external power. I do not
> have it.

No problem. This power input is only to inject bus power, it does not affect data signals. Camcorders don't need 1394 bus power.

> 5. The camera is powered from a battery, sorry, getting physical access
> to the power adapter will take months.

Wouldn't make a difference.

> The firewire cable is good, thick and shielded. I use it for my external
> firewire sound card without problems. I do have few more these cables
> with the big and mini connectors (but not the one with two mini
> connectors).

A 4-pin to 4-pin cable (or, 2nd choice, a 4-pin plug to 6-pin socket adapter) would be cool to test how well the built-in controller works in contrast to the CardBus card. Or maybe you can find a local computer shop with friendly service that lets you try out another FireWire CardBus card.
> I use it for my external firewire sound card without problems.

Do you use this sound card on the same notebook and with this CardBus card?
...and for audio reception?
(In reply to comment #36)
> > I use it for my external firewire sound card without problems.
>
> Do you use this sound card on the same notebook and with this CardBus card?

It is a Phonic Firefly box with low-latency drivers available only on MS Windows. The laptop is dual-boot. Under WinXP I recorded a lot using just the same hw and cable(s). BTW, I have an external CD-RW/DVD-RW drive on firewire as well. I could try to burn something on it; I think it used to work under Linux.

Does the dvgrab status really read the tape remotely? I do not think so, as the DV camera is silent. I bet it just talks to the chip over the wire.

Why is the massive firewire debug option useless here? I disabled it in the past as I never saw anything useful. :( Maybe it is time to retry in this minimalistic setup (yes, I am using neither X nor the framebuffer now when running dvgrab).
Created attachment 22832 [details] 2.6.31-rc6-git6-also-noacpi-nocpuidle-via-4-to-4-pin-and-pcmcia.txt

This crash, or the next one for which I did not have a stacktrace, killed my ext3fs. Something like "group counts went wrong". Inodes of files I haven't touched for a while were wrong, e.g. some attribute saying it is a compressed file although ext3fs does not allow it, some ugly huge numbers ... A recovery CD-ROM boot corrected the errors but I think I will stop inspecting this.

It only appears to be related either to the firewire chip on the pcmcia card itself or to the pcmcia driver. The on-motherboard firewire chipset works fine. Kernel crashes can be triggered by "dvgrab -showstatus" as well as by grabbing the stream. There are so many different stack traces that I doubt it makes sense to continue on this. The "damaged firewire packets" were caused by the fact that the camera was in "camera mode" instead of "play/edit mode". I just do not believe in a hardware issue as the same ports on the pcmcia card work well under Win XP SP2.
I have attached the external CD/DVD drive via the firewire cable through the pcmcia card. I managed to copy a whole CD-ROM into /dev/null. The firewire sbp2 driver of the old stack (not JuJu) works. The chip and the cables as well. So, it looks like a yenta driver issue.

I don't know why the last stacktrace talks about swap. As I said, I booted into runlevel 2, but yes, at that point swap is really enabled. I have 2 GB of RAM and the same amount of swap. I suspect I am hitting too many bugs in a row.
Created attachment 22833 [details] via-4-to-4-pin-and-pcmcia2.txt
Created attachment 22834 [details] via-4-to-4-pin-and-pcmcia3.txt
Created attachment 22835 [details] dmesg-via-4-to-4-pin-and-pcmcia4.txt A sample dmesg output when the DV camera is connected to the pcmcia firewire port with the kernel without acpi and cpu idle support. I will post .config.
Created attachment 22839 [details] .config which killed ext3fs
Created attachment 22840 [details] .config with re-enabled fw sbp2 (my current)
> The laptop is dual-boot. Under WinXP I recorded a lot using just the
> same hw and cable(s).

This means that the Linux kernel is buggy or that only the Linux OS + apps trigger a hardware fault.

> I have attached the external CD/DVD drive via the firewire cable
> through the pmcia card. I managed to copy whole cdrom into /dev/null.

(1.) asynchronous I/O (through the old or new 1394 stack) as used with storage devices,
(2.) isochronous (e.g. DV) reception through the old stack,
(3.) isochronous reception through the new stack
all use different DMA modes of the 1394 controller. Cases (2.) and (3.) are the same though WRT what's coming in over the wire. To reiterate, cases (2.) and (3.) don't share driver code. Furthermore, (2.) and (3.) cause a higher IRQ and tasklet load than (1.), because the bulk of (1.) works without involvement of the CPU.

Nevertheless it's not entirely expected that (2.) and (3.) crash while (1.) doesn't. During (2.) and (3.) there is also some interleaved asynchronous traffic for control and status of the camera, similar in nature to control and status traffic in case of (1.).

----

Anyway, despite all your additional diagnostics I have alas still no idea what causes the bug or why CONFIG_CC_STACKPROTECTOR=y prevents the bug from happening.
> Kernel crashes can be triggered by "dvgrab -showstatus" as well
> as grabbing the stream.

Do you mean by the former, that the camera not only doesn't play back, it also doesn't transmit from its camera sensor at this time? (Or more to the point, dvgrab doesn't receive & record frames in this mode?)
PS: I ask because one can also capture live video from many camcorders (in contrast to capture from tape), yet those two ways of video capture are the same from the receiving end's POV. I.e. do you have even live video off but the kernel crashes nevertheless while dvgrab is active?
(In reply to comment #47)
> > Kernel crashes can be triggered by "dvgrab -showstatus" as well
> > as grabbing the stream.
>
> Do you mean by the former, that the camera not only doesn't play back, it also
> doesn't transmit from its camera sensor at this time? (Or more to the point,
> dvgrab doesn't receive & record frames in this mode?)

The .dv files were always non-empty. I only bothered once to run 'mplayer -vo null' to ensure the format was readable.

Back to your question. I just wanted to say that the mere communication over the wire kills the machine. Taking into account that in some tests I had the camera set to camera mode instead of play mode, the camera did not send real video data but complained, and that is why dvgrab complained about damaged packets. Oddly enough, dvgrab did create (as I said) non-empty files, readable by mplayer and probably containing just black frames. This matches the picture that 'any' communication with the device kills the machine. It happens within the first, say, 5 seconds after the attempt, mostly within the first second.

I can try to capture a live video. And I will test from Win XP.

Is there a patch available so that the firewire driver would just receive data while not writing it into memory/file, whatever it really does? It might save me from possible future ext3fs problems and might be enough to stress the IRQ lines.

Regarding the ext3 crash: as I rebooted a lot and my fsck is scheduled for every 30th mount of the filesystem, I can assure you that the filesystem was still clean about 5 kernel crashes before the one that damaged it. Just to emphasize: either replaying the journal resulted in broken inode attributes and metadata, or something wrote directly over the disk area?
> I can try to capture a live video.

Not necessary. You already confirmed in comment 49 what I wanted to know: In all crashing cases, OHCI-1394 IR DMA is active.

> Is there a patch available so that the firewire driver would just
> receive data while not writing it into memory/file, whatever it
> really does? It might save my possible future ext3fs problems

No. These crashes obviously involve memory corruption (stack corruption in particular), which means that your on-disk data are in danger since the kernel may write random junk. You merely can reduce the risk somewhat by reducing regular I/O, e.g. by "dvgrab - >/dev/null". (Trailing "-" in dvgrab's command line means output to stdout.)

However, at this point I don't know what'd be left to test anyway... Hmm, perhaps one thing if you have the means: A different CardBus card, as an indicator whether the problem relates more to the card or more to the laptop's chipset, especially its CardBus bridge. But would it be worth it? So far the whole thing sounds to me as if nobody else is affected, and that no other kernel developer will pop up with an idea what to fix. IOW the only realistic way forward for this bug seems to me to be "resolved; wontfix/worksforme" for now, sorry.
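The "dvgrab - >/dev/null" suggestion can be wrapped in a small shell helper, sketched here under a few assumptions: the function name capture_discard and the DVGRAB variable are made up for illustration, and the sync call beforehand is an extra precaution not mentioned in this thread (it flushes pending writes so that less dirty data is in flight if the kernel crashes). It does not make the test safe; it only reduces regular disk I/O as described.

```shell
#!/bin/sh
# Sketch only: run a capture whose output is thrown away.
# DVGRAB names the capture program; it defaults to dvgrab.
DVGRAB="${DVGRAB:-dvgrab}"

capture_discard() {
    # Assumption (not from the thread): flush dirty pages first so that
    # less unwritten data is pending if the machine locks up.
    sync
    # Trailing "-" makes dvgrab write the stream to stdout,
    # which is redirected to /dev/null so nothing lands on disk.
    "$DVGRAB" - > /dev/null
}
```

Calling capture_discard with the camera attached then exercises the same reception path as a normal capture, and the function returns whatever exit status the capture program reports.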
I have enabled ISA BUS support in my kernel. I thought my laptop uses only PCI but it seems at least something got detected:

--- old 2009-10-01 18:38:01.000000000 +0200
+++ new 2009-10-01 18:37:51.000000000 +0200
@@ -1,13 +1,16 @@
 yenta_cardbus 0000:02:07.0: ISA IRQ mask 0x0490, PCI irq 5
-yenta_cardbus 0000:02:07.0: Socket status: 30000820
+yenta_cardbus 0000:02:07.0: Socket status: 30000006
 pci_bus 0000:02: Raising subordinate bus# of parent bus (#02) from #02 to #06
 yenta_cardbus 0000:02:07.0: pcmcia: parent PCI bridge I/O window: 0xa000 - 0xafff
+pcmcia_socket pcmcia_socket0: cs: IO port probe 0xa000-0xafff: clean.
 yenta_cardbus 0000:02:07.0: pcmcia: parent PCI bridge Memory window: 0xd6000000 - 0xd6ffffff
-yenta_cardbus 0000:02:07.0: pcmcia: parent PCI bridge Memory window: 0x80000000 - 0x87ffffff
+yenta_cardbus 0000:02:07.0: pcmcia: parent PCI bridge Memory window: 0x88000000 - 0x8fffffff
 yenta_cardbus 0000:02:07.1: CardBus bridge found [1043:1624]
 yenta_cardbus 0000:02:07.1: ISA IRQ mask 0x0490, PCI irq 11
 yenta_cardbus 0000:02:07.1: Socket status: 30000006
 pci_bus 0000:02: Raising subordinate bus# of parent bus (#02) from #06 to #0a
 yenta_cardbus 0000:02:07.1: pcmcia: parent PCI bridge I/O window: 0xa000 - 0xafff
+pcmcia_socket pcmcia_socket1: cs: IO port probe 0xa000-0xafff: clean.
 yenta_cardbus 0000:02:07.1: pcmcia: parent PCI bridge Memory window: 0xd6000000 - 0xd6ffffff
-yenta_cardbus 0000:02:07.1: pcmcia: parent PCI bridge Memory window: 0x80000000 - 0x87ffffff
+yenta_cardbus 0000:02:07.1: pcmcia: parent PCI bridge Memory window: 0x88000000 - 0x8fffffff

Could it be related and help the IO stress? I did not bother to test with the device attached.
Martin, sorry for the long period of silence. Do you still have the hardware? Is the issue still present in current kernels?
Martin responded on 2010-09-21:
> I have long-standing problem with accessing the bugzilla, I guess they
> have mysql issues there. I can lend the camera back during X-mas, and the
> PCMCIA card and the laptop I still have. I will re-try, and probably buy
> another comp with serial port.

For now I think it is justified to close this bug as not reproducible. I am not aware of directly comparable reports. You can reopen it if you continue to run into this issue.