Most recent kernel where this bug did not occur: Distribution: Debian unstable/testing Hardware Environment: Acer Extensa 3002 WLMi Notebook Software Environment: 2.6.19-rc5 (kernel.org), irda-utils 0.9.18-3 Problem Description: Link error while waiting for incoming connection via irda Steps to reproduce: Call ircp -r and try to send a file via IR from a Pocket PC acer01:~$ ircp -r Waiting for incoming connection srv_obex_event(): Link error and in /var/log/messages: Assertion failed! net/irda/af_irda.c:irda_recvmsg_stream:1413 !sock_error(sk) Irda (FIR) device is identified as NSC Vishay TFDS-6500. Resources have been identified from windows hardware manager: base 0x2F8, irq 3, dma 1. I tried to debug the problem with echo 4 > /proc/sys/net/irda/debug ircp -r echo 0 > /proc/sys/net/irda/debug Then I've extracted the kernel messages from /var/log/kern.log. See: http://www.uni-koeln.de/~a0537/irda_debug.log Sending files to the Pocket PC works as expected. Additional informations: /etc/modprobe.d/irda-utils.local alias irda0 nsc-ircc options nsc-ircc dongle_id=0x09 io=0x2f8 irq=3 dma=1 install nsc-ircc /bin/setserial /dev/ttyS0 uart none port 0 irq 0; \ /sbin/modprobe --ignore-install nsc-ircc /etc/default/irda-utils ENABLE="" DISCOVERY="true" DEVICE="irda0" DONGLE="none" /var/log/messages during boot ... localhost kernel: nsc-ircc, chip->init localhost kernel: nsc-ircc, Found chip at base=0x164e localhost kernel: nsc-ircc, driver loaded (Dag Brattli) localhost kernel: IrDA: Registered device irda0 localhost kernel: nsc-ircc, Using dongle: IBM31T1100 or Temic TFDS6000/TFDS6500 ... localhost irattach: executing: '/sbin/modprobe irda0' localhost irattach: executing: 'echo acer01 > /proc/sys/net/irda/devname' localhost irattach: executing: 'echo 1 > proc/sys/net/irda/discovery' localhost irattach: Starting device irda0 /var/log/dmesg during boot ... nsc_ircc_pnp_probe() : From PnP, found firbase 0x2F8 ; irq 3 ; dma 1. nsc-ircc, chip->init nsc-ircc, Found chip at base=0x164e nsc-ircc, driver loaded (Dag Brattli) IrDA: Registered device irda0 nsc-ircc, Using dongle: IBM31T1100 or Temic TFDS6000/TFDS6500 Regards, Berthold Cogel
I've just tried linux-2.6.20-rc1. It's still the same: Assertion failed! net/irda/af_irda.c:irda_recvmsg_stream:1413 !sock_error(sk) Regards, Berthold Cogel
Out of curiosity, could you please try the following patch ? diff --git a/net/irda/af_irda.c b/net/irda/af_irda.c index 7e1aea8..c3e45fe 100644 --- a/net/irda/af_irda.c +++ b/net/irda/af_irda.c @@ -1410,7 +1410,6 @@ static int irda_recvmsg_stream(struct kiocb *iocb, struct IRDA_DEBUG(3, "%s()\n", __FUNCTION__); IRDA_ASSERT(self != NULL, return -1;); - IRDA_ASSERT(!sock_error(sk), return -1;); if (sock->flags & __SO_ACCEPTCON) return(-EINVAL);
Berthold, I forgot to ask: Do you get this message at the end of the transfer ? From your debug output, I see some data going through...
Hello Samuel, at least none of the files I've tried to send was created. The Pocket PC tries to send the file, gives an error (without details) and retries with the same result. I don't know very much about irda and what I can do with it. I only tried to sync my Pocket PC with my laptop without an USB cable. I followed the various HowTo's I've found in the net and receiving files via 'ircp -r' was one of the steps. Are there any special tests I can do? Applying your patch to 2.6.20-rc1 leads to an 'Oops'. This s what I get in the logs: Dec 21 00:42:00 localhost kernel: irlap_change_speed(), setting speed to 115200 Dec 21 00:42:00 localhost kernel: irlmp_state_dtr(), Unknown event LM_LAP_CONNECT_CONFIRM on LSAP 0x10 Dec 21 00:42:00 localhost last message repeated 2 times Dec 21 00:42:01 localhost kernel: irda_poll(), POLLHUP Dec 21 00:42:01 localhost kernel: BUG: unable to handle kernel NULL pointer dereference at virtual address 00000004 Dec 21 00:42:01 localhost kernel: printing eip: Dec 21 00:42:01 localhost kernel: f99ab9d7 Dec 21 00:42:01 localhost kernel: *pde = 00000000 Dec 21 00:42:01 localhost kernel: Oops: 0002 [#1] Dec 21 00:42:01 localhost kernel: PREEMPT Dec 21 00:42:01 localhost kernel: Modules linked in: binfmt_misc radeon rfcomm drm l2cap bluetooth nfs lockd nfs_acl sunrpc af_packet thermal fan button sbs i2c_ec autofs4 snd_intel8x0m dm_crypt dm_mirror ipw2200 b44 mii ieee80211_crypt_tkip ieee80211_crypt_ccmp ieee80211_crypt_wep ieee80211 ieee80211_crypt cpufreq_conservative cpufreq_ondemand cpufreq_performance cpufreq_powersave acpi_cpufreq freq_table processor sg scsi_mod usbhid hid joydev pcmcia nsc_ircc snd_intel8x0 snd_ac97_codec firmware_class ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer tifm_7xx1 tifm_sd mmc_block mmc_core irda ehci_hcd uhci_hcd snd soundcore ide_cd tifm_core yenta_socket rsrc_nonstatic pcmcia_core i2c_i801 crc_ccitt psmouse usbcore snd_page_alloc rtc pcspkr cdrom ohci1394 ieee1394 intel_agp agpgart evdev unix Dec 21 00:42:01 localhost kernel: CPU: 0 Dec 21 00:42:01 localhost kernel: EIP: 0060:[<f99ab9d7>] Not tainted VLI Dec 21 00:42:01 localhost kernel: EFLAGS: 00210297 (2.6.20-rc1-vanilla #2) Dec 21 00:42:01 localhost kernel: EIP is at irda_recvmsg_stream+0xfb/0x2c8 [irda] Dec 21 00:42:01 localhost kernel: eax: 00000000 ebx: ffffffa1 ecx: 00000001 edx: f5b2ee54 Dec 21 00:42:01 localhost kernel: esi: f5b2ee00 edi: 00000000 ebp: f5b2ee54 esp: f328de2c Dec 21 00:42:01 localhost kernel: ds: 007b es: 007b ss: 0068 Dec 21 00:42:01 localhost kernel: Process ircp (pid: 3941, ti=f328c000 task=f2709a70 task.ti=f328c000) Dec 21 00:42:01 localhost kernel: Stack: f328de4c f328de4c 00000001 c015663e 00000000 f328dea4 00000000 00000000 Dec 21 00:42:01 localhost kernel: 00000001 00000000 f2709a70 c011370c 00000000 00000000 f99af4c0 f2137b40 Dec 21 00:42:01 localhost kernel: f328dee0 00000003 c0233611 00000003 00000000 00000001 f99ab1dd ffffffff Dec 21 00:42:01 localhost kernel: Call Trace: Dec 21 00:42:01 localhost kernel: [<c015663e>] core_sys_select+0x1aa/0x2bc Dec 21 00:42:01 localhost kernel: [<c011370c>] default_wake_function+0x0/0xc Dec 21 00:42:01 localhost kernel: [<c0233611>] sock_aio_read+0xc1/0xcd Dec 21 00:42:01 localhost kernel: [<f99ab1dd>] irda_accept+0x34b/0x384 [irda] Dec 21 00:42:01 localhost kernel: [<c014b8bf>] do_sync_read+0xc7/0x10a Dec 21 00:42:01 localhost kernel: [<f99a5dfc>] irttp_flush_queues+0xa3/0xac [irda] Dec 21 00:42:01 localhost kernel: [<c012603f>] autoremove_wake_function+0x0/0x35 Dec 21 00:42:01 localhost kernel: [<c014c0bb>] vfs_read+0x9c/0x134 Dec 21 00:42:01 localhost kernel: [<c014c447>] sys_read+0x41/0x67 Dec 21 00:42:01 localhost kernel: [<c0102c56>] sysenter_past_esp+0x5f/0x85 Dec 21 00:42:01 localhost kernel: ======================= Dec 21 00:42:01 localhost kernel: Code: 00 00 00 8d 6e 54 89 e8 e8 f5 bf 88 c6 89 c7 85 c0 0f 85 ba 00 00 00 8b 4c 24 20 39 4c 24 1c 0f 83 7a 01 00 00 8b 86 00 01 00 00 <0f> ba 68 04 01 8b 46 38 8d 54 24 24 e8 28 a8 77 c6 65 a1 08 00 Dec 21 00:42:01 localhost kernel: EIP: [<f99ab9d7>] irda_recvmsg_stream+0xfb/0x2c8 [irda] SS:ESP 0068:f328de2c Dec 21 00:42:03 localhost kernel: <7>irlap_change_speed(), setting speed to 9600
Hi again Berthold, I will need your help since I can not reproduce this bug myself (I don't own any PocketPC device). Also, sorry for the long delays... So, if you have time for that, could you please revert the following patch: http://sortiz.org/linux/irda/0005-IrDA-af_irda.c-cleanups.txt from your kernel tree: "patch -p1 -R < 0005-IrDA-af_irda.c-cleanups.txt" and then test again. Thanks in advance for your time. Cheers, Samuel.
Hello Samuel, I applied the patch to the latest rc-Kernel (linux-2.6.20-rc3) with no success. I still get this: acer01:~# ircp -r Waiting for incoming connection srv_obex_event(): Link error and then I get a 'int3: 0000 [#1]' message. /var/log/kern.log: Jan 7 20:03:44 localhost kernel: irlap_change_speed(), setting speed to 115200 Jan 7 20:03:44 localhost kernel: irlmp_state_dtr(), Unknown event LM_LAP_CONNECT_CONFIRM on LSAP 0x10 Jan 7 20:03:44 localhost last message repeated 2 times Jan 7 20:03:45 localhost kernel: irda_poll(), POLLHUP Jan 7 20:03:45 localhost kernel: int3: 0000 [#1] Jan 7 20:03:45 localhost kernel: PREEMPT Jan 7 20:03:45 localhost kernel: Modules linked in: binfmt_misc radeon drm rfcomm l2cap bluetooth nfs lockd nfs_acl sunrpc af_packet thermal fan button sbs i2c_ec autofs4 snd_intel8x0m dm_crypt dm_mirror ipw2200 b44 mii ieee80211_crypt_tkip ieee80211_crypt_ccmp ieee80211_crypt_wep ieee80211 ieee80211_crypt cpufreq_conservative cpufreq_ondemand cpufreq_performance cpufreq_powersave acpi_cpufreq freq_table processor sg scsi_mod usbhid hid pcmcia firmware_class joydev nsc_ircc snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss tifm_7xx1 tifm_sd mmc_block mmc_core ide_cd snd_pcm snd_timer irda yenta_socket rsrc_nonstatic pcmcia_core ohci1394 ieee1394 tifm_core cdrom i2c_i801 snd soundcore snd_page_alloc crc_ccitt psmouse ehci_hcd uhci_hcd rtc pcspkr usbcore intel_agp agpgart evdev unix Jan 7 20:03:45 localhost kernel: CPU: 0 Jan 7 20:03:45 localhost kernel: EIP: 0060:[<f7aa1945>] Not tainted VLI Jan 7 20:03:45 localhost kernel: EFLAGS: 00000007 (2.6.20-rc3-vanilla #1) Jan 7 20:03:45 localhost kernel: EIP is at 0xf7aa1945 Jan 7 20:03:45 localhost kernel: eax: f5a91e50 ebx: f5a91e50 ecx: 00000000 edx: 00000001 Jan 7 20:03:45 localhost kernel: esi: c0151d11 edi: 00000000 ebp: f5a91f14 esp: f5a91ef4 Jan 7 20:03:45 localhost kernel: ds: 007b es: 007b ss: 0068 Jan 7 20:03:45 localhost kernel: Process ircp (pid: 4047, ti=f5a90000 task=f61a3550 task.ti=f5a90000) Jan 7 20:03:45 localhost kernel: Stack: c0112a23 00000000 00000001 f26e6cd8 c018c195 f5a90000 00000000 00000292 Jan 7 20:03:45 localhost kernel: f5a91f30 c0113cc7 00000000 00000000 f468dc00 f26e6cc0 f26e6ce4 f273eea4 Jan 7 20:03:45 localhost kernel: c0236e5b 00000000 f99936e9 f26e6cc0 c0233ee0 00000000 f99aa780 f26e6cc0 Jan 7 20:03:45 localhost kernel: Call Trace: Jan 7 20:03:45 localhost kernel: [<c0112a23>] __wake_up_common+0x31/0x4f Jan 7 20:03:45 localhost kernel: [<c018c195>] ext3_permission+0x0/0xa Jan 7 20:03:45 localhost kernel: [<c0113cc7>] __wake_up+0x2a/0x4f Jan 7 20:03:45 localhost kernel: [<c0236e5b>] sock_def_wakeup+0x2c/0x45 Jan 7 20:03:45 localhost kernel: [<f99936e9>] irda_release+0x4f/0x151 [irda] Jan 7 20:03:45 localhost kernel: [<c0233ee0>] sock_fasync+0x107/0x113 Jan 7 20:03:45 localhost kernel: [<c0234b05>] sock_release+0x14/0x6f Jan 7 20:03:45 localhost kernel: [<c0234d85>] sock_close+0x2a/0x2f Jan 7 20:03:45 localhost kernel: [<c014c7ed>] __fput+0x96/0x150 Jan 7 20:03:45 localhost kernel: [<c014a3f5>] filp_close+0x51/0x58 Jan 7 20:03:45 localhost kernel: [<c014b3ec>] sys_close+0x75/0xc1 Jan 7 20:03:45 localhost kernel: [<c0102c56>] sysenter_past_esp+0x5f/0x85 Jan 7 20:03:45 localhost kernel: [<c0280033>] xfrm_hash_resize+0x1d/0x265 Jan 7 20:03:45 localhost kernel: ======================= Jan 7 20:03:45 localhost kernel: Code: ff ff ff ff ff ff 20 19 aa f7 20 19 aa f7 00 10 00 00 00 00 00 00 00 00 00 00 01 00 00 00 38 19 aa f7 38 19 aa f7 00 00 00 00 cc <44> 8a c1 a8 dc 63 f6 88 32 65 f6 f0 45 a1 f7 f0 15 aa f7 10 09 Jan 7 20:03:45 localhost kernel: EIP: [<f7aa1945>] 0xf7aa1945 SS:ESP 0068:f5a91ef4 Jan 7 20:03:45 localhost kernel: <6>note: ircp[4047] exited with preempt_count 2 Jan 7 20:03:45 localhost kernel: BUG: scheduling while atomic: ircp/0x10000002/4047 Jan 7 20:03:45 localhost kernel: [<c02825b6>] __sched_text_start+0x56/0x546 Jan 7 20:03:45 localhost kernel: [<c011d665>] run_timer_softirq+0x16e/0x176 Jan 7 20:03:45 localhost kernel: [<c0112d05>] __cond_resched+0x16/0x34 Jan 7 20:03:45 localhost kernel: [<c0282ba8>] cond_resched+0x26/0x31 Jan 7 20:03:45 localhost kernel: [<c013c95e>] unmap_vmas+0x370/0x465 Jan 7 20:03:45 localhost kernel: [<c013efe7>] exit_mmap+0x69/0xe4 Jan 7 20:03:45 localhost kernel: [<c01146a1>] mmput+0x1c/0x7c Jan 7 20:03:45 localhost kernel: [<c0118a05>] do_exit+0x1b0/0x6e5 Jan 7 20:03:45 localhost kernel: [<c0116c3b>] printk+0x1b/0x1f Jan 7 20:03:45 localhost kernel: [<c010407a>] die+0x1d1/0x1d9 Jan 7 20:03:45 localhost kernel: [<c01047a9>] do_int3+0x6e/0x74 Jan 7 20:03:45 localhost kernel: [<c0151d11>] permission+0x86/0xa2 Jan 7 20:03:45 localhost kernel: [<c02843ef>] int3+0x27/0x2c Jan 7 20:03:45 localhost kernel: [<c0151d11>] permission+0x86/0xa2 Jan 7 20:03:45 localhost kernel: [<c014007b>] arch_get_unmapped_area_topdown+0xed/0x11c Jan 7 20:03:45 localhost kernel: [<c0112a23>] __wake_up_common+0x31/0x4f Jan 7 20:03:45 localhost kernel: [<c018c195>] ext3_permission+0x0/0xa Jan 7 20:03:45 localhost kernel: [<c0113cc7>] __wake_up+0x2a/0x4f Jan 7 20:03:45 localhost kernel: [<c0236e5b>] sock_def_wakeup+0x2c/0x45 Jan 7 20:03:45 localhost kernel: [<f99936e9>] irda_release+0x4f/0x151 [irda] Jan 7 20:03:45 localhost kernel: [<c0233ee0>] sock_fasync+0x107/0x113 Jan 7 20:03:45 localhost kernel: [<c0234b05>] sock_release+0x14/0x6f Jan 7 20:03:45 localhost kernel: [<c0234d85>] sock_close+0x2a/0x2f Jan 7 20:03:45 localhost kernel: [<c014c7ed>] __fput+0x96/0x150 Jan 7 20:03:45 localhost kernel: [<c014a3f5>] filp_close+0x51/0x58 Jan 7 20:03:45 localhost kernel: [<c014b3ec>] sys_close+0x75/0xc1 Jan 7 20:03:45 localhost kernel: [<c0102c56>] sysenter_past_esp+0x5f/0x85 Jan 7 20:03:45 localhost kernel: [<c0280033>] xfrm_hash_resize+0x1d/0x265 Jan 7 20:03:45 localhost kernel: ======================= Jan 7 20:03:47 localhost kernel: irlap_change_speed(), setting speed to 9600 Regards, Berthold
Hello! Finally I found the time to check the latest stable kernel: 2.6.20.6 I get still the same error messages. But now I did something I've never tried before: Like the last time I called 'ircp -r' and then I started the transfer on my pocket pc. I got the 'Link error' and even the PDA reported an error. The last times I closed the 'Send' dialog on the PDA. This time I restarted ircp and clicked the 'resend' in the dialog on my PDA. The file was transmited. I can reproduce this behavior. acer01:/tmp# echo 4 > /proc/sys/net/irda/debug;date;ircp -r;echo 0 > /proc/sys/net/irda/debug Mi 11. Apr 22:17:45 CEST 2007 Waiting for incoming connection srv_obex_event(): Link error acer01:/tmp# echo 4 > /proc/sys/net/irda/debug;date;ircp -r;echo 0 > /proc/sys/net/irda/debug Mi 11. Apr 22:18:01 CEST 2007 Waiting for incoming connection Incoming connection Receiving SDReg.exe...done Disconnecting I'm attaching the debug log. Regards, Berthold
Created attachment 11132 [details] debug log
Hi Berthold, For some reasons, I can't read the attachment. I only get "attachment.cgi"... Could you please look at it and maybe try to attach it again ? Thanks for your time. Cheers, Samuel.
Samuel, if you save Berthold's attachment as file.bz2 and run bunzip2 on it you'll get the contents.
Yes, that seems to work. Thanks Adrian. Firefox proposes to save the attachment as attachment.cgi, so I didn't even bother to check...
Sorry, my fault. This was my first attachment in bugzilla.
Oops! I forgot to tell, that I haven't applied the patch for af_irda.c Samuel posted. I get the 'Assertion failed!...' messages in /var/log/messages and the stuff in the attachment.
It seems the Pocket PC closes the connection: Apr 11 22:17:53 localhost kernel: irda_disconnect_indication(f3844600) [...] This is signalled to the application as POLLHUP: Apr 11 22:17:53 localhost kernel: irda_poll() Apr 11 22:17:53 localhost kernel: irda_poll(), POLLHUP [...] And then the application calls recvmsg: Apr 11 22:17:53 localhost kernel: irda_recvmsg_stream() Apr 11 22:17:53 localhost kernel: Assertion failed! net/irda/af_irda.c:irda_recvmsg_stream:1413 !sock_error(sk) I think the problem is two-fold. - currently, irda_disconnect_indication will set sk->sk_err = ECONNRESET unconditionally whenever it sees a transition ESTABLISHED->CLOSED. I think this is plain wrong. You only need to return ECONNRESET if the user calls sendmsg() on a disconnected socket. - irda_recvmsg_stream should check for TCP_CLOSE, and return 0 in this case.
Created attachment 11135 [details] Possible fix This patch is entirely untested, as I don't have the hardware. Please give it a try.
Created attachment 11137 [details] debug log with patch (debug.log.bz2) Like the last time: acer01:/tmp# echo 4 > /proc/sys/net/irda/debug;date;ircp -r;echo 0 > /proc/sys/net/irda/debug Do 12. Apr 23:24:49 CEST 2007 Waiting for incoming connection Speicherzugriffsfehler Got an Oops. In /var/log/kern.log: Apr 12 23:24:54 localhost kernel: irda_poll() Apr 12 23:24:54 localhost kernel: irda_poll(), POLLHUP Apr 12 23:24:54 localhost kernel: irda_recvmsg_stream() Apr 12 23:24:54 localhost kernel: BUG: unable to handle kernel NULL pointer dereference at virtual address 00000004 Apr 12 23:24:54 localhost kernel: printing eip: Apr 12 23:24:54 localhost kernel: f99f8a28 Apr 12 23:24:54 localhost kernel: *pde = 00000000 Apr 12 23:24:54 localhost kernel: Oops: 0002 [#1] Apr 12 23:24:54 localhost kernel: PREEMPT Apr 12 23:24:54 localhost kernel: Modules linked in: nls_iso8859_1 nls_cp850 vfat fat radeon drm rfcomm l2cap bluetooth nfs lockd nfs_acl sunrpc af_packet thermal fan button sbs i2c_ec autofs4 snd_intel8x0m fuse dm_crypt md_mod dm_snapshot dm_mirror b44 mii ieee80211_crypt_tkip ieee80211_crypt_ccmp ieee80211_crypt_wep ieee80211 ieee80211_crypt cpufreq_conservative cpufreq_ondemand cpufreq_performance cpufreq_powersave acpi_cpufreq freq_table processor sg scsi_mod usbhid hid nsc_ircc snd_intel8x0 snd_ac97_codec pcmcia firmware_class ac97_bus snd_pcm_oss snd_mixer_oss joydev irda snd_pcm snd_timer snd ide_cd tifm_7xx1 tifm_sd mmc_block mmc_core rtc crc_ccitt soundcore cdrom ehci_hcd uhci_hcd yenta_socket rsrc_nonstatic pcmcia_core i2c_i801 pcspkr tifm_core ohci1394 ieee1394 psmouse usbcore snd_page_alloc intel_agp agpgart evdev unix Apr 12 23:24:54 localhost kernel: CPU: 0 Apr 12 23:24:54 localhost kernel: EIP: 0060:[<f99f8a28>] Not tainted VLI Apr 12 23:24:54 localhost kernel: EFLAGS: 00210297 (2.6.20.6-ir #2) Apr 12 23:24:54 localhost kernel: EIP is at irda_recvmsg_stream+0x132/0x2ff [irda] Apr 12 23:24:54 localhost kernel: eax: 00000000 ebx: 00000000 ecx: 00000001 edx: ea727654 Apr 12 23:24:54 localhost kernel: esi: ea727600 edi: 00000000 ebp: ea727654 esp: ef8dde2c Apr 12 23:24:54 localhost kernel: ds: 007b es: 007b ss: 0068 Apr 12 23:24:54 localhost kernel: Process ircp (pid: 10292, ti=ef8dc000 task=f7f4e550 task.ti=ef8dc000) Apr 12 23:24:54 localhost kernel: Stack: f9a00b16 f99fc6b9 00000001 c0162c96 00000000 ef8ddea4 00000000 00000000 Apr 12 23:24:54 localhost kernel: 00000001 00000000 f7f4e550 c0113473 00000000 00000000 f99fc500 ef9989c0 Apr 12 23:24:54 localhost kernel: ef8ddee0 00000003 c023ebc5 00000003 00000000 00000001 00000018 00000014 Apr 12 23:24:54 localhost kernel: Call Trace: Apr 12 23:24:54 localhost kernel: [<c0162c96>] core_sys_select+0x1aa/0x2bc Apr 12 23:24:54 localhost kernel: [<c0113473>] default_wake_function+0x0/0xc Apr 12 23:24:54 localhost kernel: [<c023ebc5>] sock_aio_read+0xc1/0xcd Apr 12 23:24:54 localhost kernel: [<c0157ef3>] do_sync_read+0xc7/0x10a Apr 12 23:24:54 localhost kernel: [<f99f2dfc>] irttp_flush_queues+0xa3/0xac [irda] Apr 12 23:24:54 localhost kernel: [<c0125e1f>] autoremove_wake_function+0x0/0x35 Apr 12 23:24:54 localhost kernel: [<c01586ef>] vfs_read+0x9c/0x134 Apr 12 23:24:54 localhost kernel: [<c0158a7b>] sys_read+0x41/0x67 Apr 12 23:24:54 localhost kernel: [<c0102c66>] sysenter_past_esp+0x5f/0x85 Apr 12 23:24:54 localhost kernel: ======================= Apr 12 23:24:54 localhost kernel: Code: 00 00 00 8d 6e 54 89 e8 e8 88 a5 84 c6 89 c7 85 c0 0f 85 ba 00 00 00 8b 4c 24 20 39 4c 24 1c 0f 83 7a 01 00 00 8b 86 00 01 00 00 <0f> ba 68 04 01 8b 46 38 8d 54 24 24 e8 b7 d5 72 c6 65 a1 08 00 Apr 12 23:24:54 localhost kernel: EIP: [<f99f8a28>] irda_recvmsg_stream+0x132/0x2ff [irda] SS:ESP 0068:ef8dde2c Apr 12 23:24:54 localhost kernel: <7>irda_release() Apr 12 23:24:54 localhost kernel: irda_destroy_socket(ea727600) And just for fun the second attempt to send the file from the Pocket PC like the last time without closing the transmit dialog on the Pocket PC: acer01:/tmp# echo 4 > /proc/sys/net/irda/debug;date;ircp -r;echo 0 > /proc/sys/net/irda/debug Do 12. Apr 23:25:14 CEST 2007 Waiting for incoming connection Incoming connection Receiving SDReg.exe...done Disconnecting I've attached the debug.log.
Hi Berthold, After spending sometime looking at your first debug log, this is my analysis: In the failure case. we see the following: 1) Once the PocketPC IAS requests succeeds, we see it connecting to the OBEX LSAP. 2) The corresponding LMP connection succeeds, but the PocketPC keeps on doing IAS requests, that fails. 3) After several failing IAS requests, the PocketPC decides to close the OBEX lSAP. In the successful case, we see: 1) same as above. 2) The LMP succeeds, but the PocketPC doesn't resend IAS requests. 3) The PocketPC seems to understand the the LMP link is OK, and start the OBEX/IrTTP transfer. So, I need to understand the following points: a) Why does the PocketPC peer keep on sending IAS requests when the IrLMP connection is established b) why does the IAS requests after the LMP connection is established keep on failing. For helping me answering those 2, I would really need to run the same test again (without Olaf's patch for now), and give us the irdadump output while running the test (make sure you start irdadump before starting the test, obviously). I don't need the kernel logs this time. Thanks a lot for your patience.
Olaf, in response to your comment #14: I think part of your patch makes sense, but what Berthold is mostly concerned about is why he can't run the OBEX transfer properly (at least not on the first try). I don't think your patch intends to fix that bug, even though it tries to fix other issues. Please correct me if I'm wrong.
Re comment #18: Indeed, I don't think my patch will fix the problem at hand. But the way af_irda currently deals with connection close is wrong (there should not be a kernel ASSERT when doing something legal on a socket). I would fix that as a first step and then see if the user space app can cope with the error condition (simply by retrying for instance). Whatever - this is something that ought to get fixed. I will readily admit that my knowledge about the IR protocols is rather sketchy. I looked at the debug traces anyway, and here's a difference that may be significant: in the non-working case, the client first looks for "OBEX:", and then for "OBEX:IrXfer". In the working case, it goes for "OBEX:IrXfer" straight away. Here's the non-working case: LM-IAS: Looking up OBEX: IrDA:TinyTP:LsapSel hashbin_find() LM-IAS: Object OBEX not found iriap_getvaluebyclass_response() iriap_getvaluebyclass_response: sending IAS_MISSING [...] The client disconnects: irlap_do_event(), event = RECV_I_CMD, state = LAP_NRM_S irlap_state_nrm_s(), event=RECV_I_CMD irlap_state_nrm_s(), event=RECV_I_CMD nr=2, vs=2, ns=2, vr=2, pf=16 irlap_validate_nr_received(), expected! irlmp_link_data_indication() irlmp_link_data_indication(), Disconnect indication! irlmp_do_lsap_event(), EVENT = LM_DISCONNECT_INDICATION, STATE = LSAP_DATA_TRANSFER_READY irlmp_state_dtr() irlmp_state_dtr(), trying to close IrLAP irlmp_do_lap_event(), EVENT = LM_LAP_DISCONNECT_REQUEST, STATE = LAP_ACTIVE irlmp_state_active() irlmp_disconnect_indication(), reason=LM_USER_REQUEST irlmp_disconnect_indication(), slsap_sel=00, dlsap_sel=03 And tries to connect to the OBEX server. But note the LSAP=0. That looks very bogus to me (in the working case, the connection will be to the LSAP of the OBEX service). irlmp_link_data_indication() irlmp_link_data_indication(), incoming connection, source LSAP=0, dest LSAP=3 irlmp_do_lsap_event(), EVENT = LM_CONNECT_INDICATION, STATE = LSAP_DISCONNECTED irlmp_state_disconnected() irlmp_do_lap_event(), EVENT = LM_LAP_CONNECT_REQUEST, STATE = LAP_ACTIVE irlmp_state_active() irlmp_state_active(), LS_CONNECT_REQUEST This may mean that the client is buggy, and fails to grok the reply we sent (indicating that the service isn't there). Unfortunately, I don't have the IrLMP spec...
Re comment #20 (stupid bugzilla and its no-mail-interface...): 1) I agree with the af_irda.c code. Note that we don't do kernel ASSERT but rather IRDA_ASSERT, which are just conditions check. Your patch was doing mostly the same thing when checking for sock_error(), except that it was forwarding the error code, which is right IMO. 2) The OBEX protocol specifies that an IrDA peer should first ask for an "OBEX" service and if that fails, then ask for an "OBEX:IrXFER" one. The "OBEX" service is for a generic OBEX service while the "OBEX:IrXFER" one is for a file oriented one. ircp correctly registers itself as "OBEX:IrXFER". The Pocket PC does not exactly that since it asks several times in a row for an "OBEX" service, but we cope with that (we send an IAS missing until it asks for "OBEX:IrXFER"). You can notice from the log that once we answer positively to the IAS request, it actually connects to the right LSAP: irlmp_link_data_indication(), incoming connection, source LSAP=26, dest LSAP=6 But then instead of starting the OBEX transfer, it goes again asking for the "OBEX" service, gets negative answers, and then decides to close all open LSAPs. I wonder if it's a PocketPC bug or if we don't correctly send the connection confirmation on LSAP 26. That's why I asked Berthold for irdadump traces. 3) LSAP 0 is the IAS assigned LSAP. It is specified on the IrLMP spec. Whenever you want to ask for a service from an IrDA peer, you establish a connection on its LSAP 0, and sent IAP commands on that link. That's what the PocketPC does, and once it gets the answer from its "OBEX" or "OBEX:IrXFER" request, it closes the connection on LSAP 0, while opening another one the the OBEX LSAP whenever it gets a positive answer. This behaviour is correct. FWIW though, my Symbian phone closes the whole connection (down to the IrLAP level) when the IAS request gets a negative answer, while the PocketPC closes only the LMP one.
Re: the reported crash in comment #16 - that's caused by irda_disconnect_indication calling sock_orphan. It dies in set_bit(SOCK_ASYNC_WAITDATA, &sk->sk_socket->flags) because sk->sk_socket is NULL. The normal usage of sock_orphan is in the release() routine. Anything else is wrong, IMHO. A user process may be sleeping in say recvmsg_stream when the disconnect arrives. It wakes up, finds that there's still no data on the receive queue, and walks into the same oops my patch triggered. I googled a bit and found your Sep 3, 2006 patch that introduced the sock_orphan call - but I don't understand why. The oops Carl-Daniel posted in http://lkml.org/lkml/2006/8/29/358 looks like sk->sk_sleep is corrupted - but how?
Ah, I think I've got the answer to my own question. The problem was this: in irda_recvmsg_stream, when waiting for data to arrive, we define a waitq, and add it to sk->sk_sleep. If we find that sock_error indicates an error, we break out of the loop and return without cleaning up sk_sleep.
Created attachment 11146 [details] Output of irdadump (irdadump.log.bz2) Hello Samuel, I hope this is what you wanted me to do: I started irdadump with: irdadump -d -x -s 256 > /tmp/irdadump.log And then the transmit test: acer01:~# echo 4 > /proc/sys/net/irda/debug;date;ircp -r;echo 0 > /proc/sys/net/irda/debug So 15. Apr 01:32:46 CEST 2007 Waiting for incoming connection srv_obex_event(): Link error acer01:~# echo 4 > /proc/sys/net/irda/debug;date;ircp -r;echo 0 > /proc/sys/net/irda/debug So 15. Apr 01:33:01 CEST 2007 Waiting for incoming connection Incoming connection Receiving SDReg.exe...done Disconnecting acer01:~# I've attached the output of irdadump. Berthold
Hi Berthold, That's exactly what I was looking for, thanks a lot. Unfortunately, I don't see anything clearly bogus: The Pocket PC peer gets our positive IAS answer, since it replies with a connection request. I can't see why it wouldn't get our connection response. I checked the LAP fields from the connection frames and they look good. One thing that look weird to me is the fact that the Pocket PC doesn't seem to accept being a secondary device. Even though we start the discovery process, it eventually sends discovery frames as if it would try to get the primary role. Moreover, on the second try (the succesful one), there is no role inversion since the PocketPC is already sending discovery frames when we show up. So, Berthold, could you try to start your PC as a secondary device (run "irattach irda0" instead of "irattach irda0 -s") and then initiate the file transfer from your Pocket PC ? Thanks again for your time.
Let's see: /etc/defaults/irda-utils: DISCOVERY="false" leads to a 'irattach irda0' instead of a 'irattach irda0 -s'. Starting 'irdadump -d -x -s 256 > /tmp/irdadump.log'. acer01:~# echo 4 > /proc/sys/net/irda/debug;date;ircp -r;echo 0 > /proc/sys/net/irda/debug Di 17. Apr 00:04:35 CEST 2007 Waiting for incoming connection Starting the transmission at the PocketPC. The PocketPC shows only 'Infrared: aligning..' until I abort the process. Aborting ircp with ^C. /tmp/irdadump.log': 0 packets received by filter /var/log/debug: Apr 17 00:04:35 localhost kernel: irda_create() Apr 17 00:04:35 localhost kernel: irda_create() : self is f3532a00 Apr 17 00:04:35 localhost kernel: irlmp_register_client() Apr 17 00:04:35 localhost kernel: hashbin_insert() Apr 17 00:04:35 localhost kernel: enqueue_first() Apr 17 00:04:35 localhost kernel: irda_bind(f3532a00) Apr 17 00:04:35 localhost kernel: irlmp_slsap_inuse() Apr 17 00:04:35 localhost kernel: irlmp_find_free_slsap(), found free lsap_sel=12 Apr 17 00:04:35 localhost kernel: hashbin_insert() Apr 17 00:04:35 localhost kernel: enqueue_first() Apr 17 00:04:35 localhost kernel: irttp_open_tsap(), stsap_sel=12 Apr 17 00:04:35 localhost kernel: hashbin_insert() Apr 17 00:04:35 localhost kernel: enqueue_first() Apr 17 00:04:35 localhost kernel: irias_new_object() Apr 17 00:04:35 localhost kernel: hashbin_insert() Apr 17 00:04:35 localhost kernel: enqueue_first() Apr 17 00:04:35 localhost kernel: hashbin_insert() Apr 17 00:04:35 localhost kernel: enqueue_first() Apr 17 00:04:35 localhost kernel: irda_setsockopt(f3532a00) Apr 17 00:04:35 localhost kernel: irlmp_register_service(), hints = 2080 Apr 17 00:04:35 localhost kernel: hashbin_insert() Apr 17 00:04:35 localhost kernel: enqueue_first() Apr 17 00:04:35 localhost kernel: irda_listen() Apr 17 00:04:35 localhost kernel: irda_poll() Apr 17 00:04:35 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01) Apr 17 00:04:36 localhost kernel: irlmp_discovery_timer_expired() Apr 17 00:04:36 localhost kernel: irlmp_expire_discoveries() Apr 17 00:04:36 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01) Apr 17 00:04:36 localhost kernel: irda_poll() Apr 17 00:04:36 localhost kernel: irda_poll() Apr 17 00:04:36 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01) Apr 17 00:04:37 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01) Apr 17 00:04:37 localhost kernel: irda_poll() Apr 17 00:04:37 localhost kernel: irda_poll() Apr 17 00:04:37 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01) Apr 17 00:04:38 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01) Apr 17 00:04:38 localhost kernel: irda_poll() Apr 17 00:04:38 localhost kernel: irda_poll() Apr 17 00:04:38 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01) Apr 17 00:04:39 localhost kernel: irlmp_discovery_timer_expired() Apr 17 00:04:39 localhost kernel: irlmp_expire_discoveries() Apr 17 00:04:39 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01) .... looping ... Apr 17 00:05:29 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01) Apr 17 00:05:30 localhost kernel: irlmp_discovery_timer_expired() Apr 17 00:05:30 localhost kernel: irlmp_expire_discoveries() Apr 17 00:05:30 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01) Apr 17 00:05:30 localhost kernel: irda_poll() Apr 17 00:05:30 localhost kernel: irda_poll() Apr 17 00:05:30 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01) Apr 17 00:05:31 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01) Apr 17 00:05:31 localhost kernel: irda_poll() Apr 17 00:05:31 localhost kernel: irda_poll() Apr 17 00:05:31 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01) Apr 17 00:05:32 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01) Apr 17 00:05:32 localhost kernel: irda_poll() Apr 17 00:05:32 localhost kernel: irda_poll() Apr 17 00:05:32 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01) Apr 17 00:05:33 localhost kernel: irlmp_discovery_timer_expired() Apr 17 00:05:33 localhost kernel: irlmp_expire_discoveries() Apr 17 00:05:33 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01) Apr 17 00:05:33 localhost kernel: irda_poll() Apr 17 00:05:33 localhost kernel: irda_poll() Apr 17 00:05:33 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01) Apr 17 00:05:34 localhost kernel: irda_poll() Apr 17 00:05:34 localhost kernel: irda_release() Apr 17 00:05:34 localhost kernel: irda_destroy_socket(f3532a00) Apr 17 00:05:34 localhost kernel: irlmp_unregister_client() Apr 17 00:05:34 localhost kernel: hashbin_find() Apr 17 00:05:34 localhost kernel: irlmp_unregister_client(), removing client! Apr 17 00:05:34 localhost kernel: hashbin_remove_this() Apr 17 00:05:34 localhost kernel: dequeue_general() Apr 17 00:05:34 localhost kernel: irlmp_unregister_service() Apr 17 00:05:34 localhost kernel: hashbin_find() Apr 17 00:05:34 localhost kernel: hashbin_remove_this() Apr 17 00:05:34 localhost kernel: dequeue_general() Apr 17 00:05:34 localhost kernel: hashbin_remove_this() Apr 17 00:05:34 localhost kernel: dequeue_general() Apr 17 00:05:34 localhost kernel: dequeue_first() Apr 17 00:05:34 localhost last message repeated 4 times Apr 17 00:05:34 localhost kernel: irias_delete_value() Apr 17 00:05:34 localhost kernel: dequeue_first() Apr 17 00:05:34 localhost last message repeated 3 times Apr 17 00:05:34 localhost kernel: irttp_disconnect_request(), already disconnected! Apr 17 00:05:34 localhost kernel: irttp_close_tsap() Apr 17 00:05:34 localhost kernel: hashbin_remove() Apr 17 00:05:34 localhost kernel: dequeue_general() Apr 17 00:05:34 localhost kernel: hashbin_remove() Apr 17 00:05:34 localhost kernel: dequeue_general() Apr 17 00:05:34 localhost kernel: __irlmp_close_lsap() Apr 17 00:05:34 localhost kernel: irttp_flush_queues() Nothing else happens. The IR interface simply doesn't seem to exist. Stopping the irda-utils, setting DISCOVERY='true' and restarting the irda-utils and the PocketPC is recognised....
Any updates on this bug? Berthold, can you confirm the problem with recent kernel please. Thanks
Just tested with 2.6.24.4 (only patched with tuxonice): acer01:~# echo 4 > /proc/sys/net/irda/debug acer01:~# ircp -r Waiting for incoming connection srv_obex_event(): Link error acer01:~# echo 0 > /proc/sys/net/irda/debug So the problem still exists.... I will attach /var/log/debug. Berthold
Created attachment 15628 [details] debug log with kernel 2.6.24.4
Closing as obsolete , if this is incorrect please re-open