Bug 7512

Summary: irda: sock_error in af_irda.c:irda_recvmsg_stream
Product: Networking Reporter: Berthold Cogel (cogel)
Component: OtherAssignee: Samuel Ortiz (samuel)
Status: CLOSED OBSOLETE    
Severity: normal CC: akpm, alan, bunk, okir, protasnb
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.24 Subsystem:
Regression: No Bisected commit-id:
Attachments: debug log
Possible fix
debug log with patch (debug.log.bz2)
Output of irdadump (irdadump.log.bz2)
debug log with kernel 2.6.24.4

Description Berthold Cogel 2006-11-13 08:41:19 UTC
Most recent kernel where this bug did not occur:
Distribution: Debian unstable/testing
Hardware Environment: Acer Extensa 3002 WLMi Notebook
Software Environment: 2.6.19-rc5 (kernel.org), irda-utils 0.9.18-3
Problem Description: Link error while waiting for incoming connection via irda

Steps to reproduce: Call ircp -r and try to send a file via IR from a Pocket PC

acer01:~$ ircp -r
Waiting for incoming connection
srv_obex_event(): Link error

and in /var/log/messages:

Assertion failed! net/irda/af_irda.c:irda_recvmsg_stream:1413 !sock_error(sk)


Irda (FIR) device is identified as NSC Vishay TFDS-6500. Resources have been
identified from windows hardware manager: base 0x2F8, irq 3, dma 1.

I tried to debug the problem with 

echo 4 > /proc/sys/net/irda/debug 
ircp -r
echo 0 > /proc/sys/net/irda/debug

Then I've extracted the kernel messages from /var/log/kern.log.
See: http://www.uni-koeln.de/~a0537/irda_debug.log

Sending files to the Pocket PC works as expected. 


Additional informations:

/etc/modprobe.d/irda-utils.local

alias irda0 nsc-ircc
options nsc-ircc dongle_id=0x09 io=0x2f8 irq=3 dma=1
install nsc-ircc /bin/setserial /dev/ttyS0 uart none port 0 irq 0; \
/sbin/modprobe --ignore-install nsc-ircc


/etc/default/irda-utils

ENABLE=""
DISCOVERY="true"
DEVICE="irda0"
DONGLE="none"


/var/log/messages during boot

...
localhost kernel: nsc-ircc, chip->init
localhost kernel: nsc-ircc, Found chip at base=0x164e
localhost kernel: nsc-ircc, driver loaded (Dag Brattli)
localhost kernel: IrDA: Registered device irda0
localhost kernel: nsc-ircc, Using dongle: IBM31T1100 or Temic TFDS6000/TFDS6500
...
localhost irattach: executing: '/sbin/modprobe irda0'
localhost irattach: executing: 'echo acer01 > /proc/sys/net/irda/devname'
localhost irattach: executing: 'echo 1 > proc/sys/net/irda/discovery'
localhost irattach: Starting device irda0


/var/log/dmesg during boot

...
nsc_ircc_pnp_probe() : From PnP, found firbase 0x2F8 ; irq 3 ; dma 1.
nsc-ircc, chip->init
nsc-ircc, Found chip at base=0x164e
nsc-ircc, driver loaded (Dag Brattli)
IrDA: Registered device irda0
nsc-ircc, Using dongle: IBM31T1100 or Temic TFDS6000/TFDS6500


Regards,
Berthold Cogel
Comment 1 Berthold Cogel 2006-12-18 15:48:59 UTC
I've just tried linux-2.6.20-rc1. It's still the same:

Assertion failed! net/irda/af_irda.c:irda_recvmsg_stream:1413 !sock_error(sk)


Regards,

Berthold Cogel
Comment 2 Samuel Ortiz 2006-12-19 15:44:45 UTC
Out of curiosity, could you please try the following patch ?

diff --git a/net/irda/af_irda.c b/net/irda/af_irda.c
index 7e1aea8..c3e45fe 100644
--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -1410,7 +1410,6 @@ static int irda_recvmsg_stream(struct kiocb *iocb, struct 
        IRDA_DEBUG(3, "%s()\n", __FUNCTION__);
 
        IRDA_ASSERT(self != NULL, return -1;);
-       IRDA_ASSERT(!sock_error(sk), return -1;);
 
        if (sock->flags & __SO_ACCEPTCON)
                return(-EINVAL);

Comment 3 Samuel Ortiz 2006-12-19 15:48:58 UTC
Berthold, I forgot to ask:

Do you get this message at the end of the transfer ? From your debug output, I
see some data going through...
Comment 4 Berthold Cogel 2006-12-20 16:00:02 UTC
Hello Samuel,

at least none of the files I've tried to send was created. The Pocket PC tries
to send the file, gives an error (without details) and retries with the same result.

I don't know very much about irda and what I can do with it. I only tried to
sync my Pocket PC with my laptop without an USB cable. I followed the various
HowTo's I've found in the net and receiving files via 'ircp -r' was one of the
steps. Are there any special tests I can do?

Applying your patch to 2.6.20-rc1 leads to an 'Oops'. This s what I get in the logs:


Dec 21 00:42:00 localhost kernel: irlap_change_speed(), setting speed to 115200
Dec 21 00:42:00 localhost kernel: irlmp_state_dtr(), Unknown event
LM_LAP_CONNECT_CONFIRM on LSAP 0x10
Dec 21 00:42:00 localhost last message repeated 2 times
Dec 21 00:42:01 localhost kernel: irda_poll(), POLLHUP
Dec 21 00:42:01 localhost kernel: BUG: unable to handle kernel NULL pointer
dereference at virtual address 00000004
Dec 21 00:42:01 localhost kernel:  printing eip:
Dec 21 00:42:01 localhost kernel: f99ab9d7
Dec 21 00:42:01 localhost kernel: *pde = 00000000
Dec 21 00:42:01 localhost kernel: Oops: 0002 [#1]
Dec 21 00:42:01 localhost kernel: PREEMPT 
Dec 21 00:42:01 localhost kernel: Modules linked in: binfmt_misc radeon rfcomm
drm l2cap bluetooth nfs lockd nfs_acl sunrpc af_packet thermal fan button sbs
i2c_ec autofs4 snd_intel8x0m dm_crypt dm_mirror ipw2200 b44 mii
ieee80211_crypt_tkip ieee80211_crypt_ccmp ieee80211_crypt_wep ieee80211
ieee80211_crypt cpufreq_conservative cpufreq_ondemand cpufreq_performance
cpufreq_powersave acpi_cpufreq freq_table processor sg scsi_mod usbhid hid
joydev pcmcia nsc_ircc snd_intel8x0 snd_ac97_codec firmware_class ac97_bus
snd_pcm_oss snd_mixer_oss snd_pcm snd_timer tifm_7xx1 tifm_sd mmc_block mmc_core
irda ehci_hcd uhci_hcd snd soundcore ide_cd tifm_core yenta_socket
rsrc_nonstatic pcmcia_core i2c_i801 crc_ccitt psmouse usbcore snd_page_alloc rtc
pcspkr cdrom ohci1394 ieee1394 intel_agp agpgart evdev unix
Dec 21 00:42:01 localhost kernel: CPU:    0
Dec 21 00:42:01 localhost kernel: EIP:    0060:[<f99ab9d7>]    Not tainted VLI
Dec 21 00:42:01 localhost kernel: EFLAGS: 00210297   (2.6.20-rc1-vanilla #2)
Dec 21 00:42:01 localhost kernel: EIP is at irda_recvmsg_stream+0xfb/0x2c8 [irda]
Dec 21 00:42:01 localhost kernel: eax: 00000000   ebx: ffffffa1   ecx: 00000001
  edx: f5b2ee54
Dec 21 00:42:01 localhost kernel: esi: f5b2ee00   edi: 00000000   ebp: f5b2ee54
  esp: f328de2c
Dec 21 00:42:01 localhost kernel: ds: 007b   es: 007b   ss: 0068
Dec 21 00:42:01 localhost kernel: Process ircp (pid: 3941, ti=f328c000
task=f2709a70 task.ti=f328c000)
Dec 21 00:42:01 localhost kernel: Stack: f328de4c f328de4c 00000001 c015663e
00000000 f328dea4 00000000 00000000 
Dec 21 00:42:01 localhost kernel:        00000001 00000000 f2709a70 c011370c
00000000 00000000 f99af4c0 f2137b40 
Dec 21 00:42:01 localhost kernel:        f328dee0 00000003 c0233611 00000003
00000000 00000001 f99ab1dd ffffffff 
Dec 21 00:42:01 localhost kernel: Call Trace:
Dec 21 00:42:01 localhost kernel:  [<c015663e>] core_sys_select+0x1aa/0x2bc
Dec 21 00:42:01 localhost kernel:  [<c011370c>] default_wake_function+0x0/0xc
Dec 21 00:42:01 localhost kernel:  [<c0233611>] sock_aio_read+0xc1/0xcd
Dec 21 00:42:01 localhost kernel:  [<f99ab1dd>] irda_accept+0x34b/0x384 [irda]
Dec 21 00:42:01 localhost kernel:  [<c014b8bf>] do_sync_read+0xc7/0x10a
Dec 21 00:42:01 localhost kernel:  [<f99a5dfc>] irttp_flush_queues+0xa3/0xac [irda]
Dec 21 00:42:01 localhost kernel:  [<c012603f>] autoremove_wake_function+0x0/0x35
Dec 21 00:42:01 localhost kernel:  [<c014c0bb>] vfs_read+0x9c/0x134
Dec 21 00:42:01 localhost kernel:  [<c014c447>] sys_read+0x41/0x67
Dec 21 00:42:01 localhost kernel:  [<c0102c56>] sysenter_past_esp+0x5f/0x85
Dec 21 00:42:01 localhost kernel:  =======================
Dec 21 00:42:01 localhost kernel: Code: 00 00 00 8d 6e 54 89 e8 e8 f5 bf 88 c6
89 c7 85 c0 0f 85 ba 00 00 00 8b 4c 24 20 39 4c 24 1c 0f 83 7a 01 00 00 8b 86 00
01 00 00 <0f> ba 68 04 01 8b 46 38 8d 54 24 24 e8 28 a8 77 c6 65 a1 08 00 
Dec 21 00:42:01 localhost kernel: EIP: [<f99ab9d7>]
irda_recvmsg_stream+0xfb/0x2c8 [irda] SS:ESP 0068:f328de2c
Dec 21 00:42:03 localhost kernel:  <7>irlap_change_speed(), setting speed to 9600


Comment 5 Samuel Ortiz 2007-01-07 05:31:29 UTC
Hi again Berthold,

I will need your help since I can not reproduce this bug myself (I don't own any
PocketPC device). Also, sorry for the long delays...

So, if you have time for that, could you please revert the following patch:
http://sortiz.org/linux/irda/0005-IrDA-af_irda.c-cleanups.txt

from your kernel tree: "patch -p1 -R < 0005-IrDA-af_irda.c-cleanups.txt"

and then test again.

Thanks in advance for your time.

Cheers,
Samuel.
Comment 6 Berthold Cogel 2007-01-07 11:09:36 UTC
Hello Samuel,

I applied the patch to the latest rc-Kernel (linux-2.6.20-rc3) with no success.
I still get this:

acer01:~# ircp -r
Waiting for incoming connection
srv_obex_event(): Link error


and then I get a 'int3: 0000 [#1]' message.

/var/log/kern.log:

Jan  7 20:03:44 localhost kernel: irlap_change_speed(), setting speed to 115200
Jan  7 20:03:44 localhost kernel: irlmp_state_dtr(), Unknown event
LM_LAP_CONNECT_CONFIRM on LSAP 0x10
Jan  7 20:03:44 localhost last message repeated 2 times
Jan  7 20:03:45 localhost kernel: irda_poll(), POLLHUP
Jan  7 20:03:45 localhost kernel: int3: 0000 [#1]
Jan  7 20:03:45 localhost kernel: PREEMPT 
Jan  7 20:03:45 localhost kernel: Modules linked in: binfmt_misc radeon drm
rfcomm l2cap bluetooth nfs lockd nfs_acl sunrpc af_packet thermal fan button sbs
i2c_ec autofs4 snd_intel8x0m dm_crypt dm_mirror ipw2200 b44 mii
ieee80211_crypt_tkip ieee80211_crypt_ccmp ieee80211_crypt_wep ieee80211
ieee80211_crypt cpufreq_conservative cpufreq_ondemand cpufreq_performance
cpufreq_powersave acpi_cpufreq freq_table processor sg scsi_mod usbhid hid
pcmcia firmware_class joydev nsc_ircc snd_intel8x0 snd_ac97_codec ac97_bus
snd_pcm_oss snd_mixer_oss tifm_7xx1 tifm_sd mmc_block mmc_core ide_cd snd_pcm
snd_timer irda yenta_socket rsrc_nonstatic pcmcia_core ohci1394 ieee1394
tifm_core cdrom i2c_i801 snd soundcore snd_page_alloc crc_ccitt psmouse ehci_hcd
uhci_hcd rtc pcspkr usbcore intel_agp agpgart evdev unix
Jan  7 20:03:45 localhost kernel: CPU:    0
Jan  7 20:03:45 localhost kernel: EIP:    0060:[<f7aa1945>]    Not tainted VLI
Jan  7 20:03:45 localhost kernel: EFLAGS: 00000007   (2.6.20-rc3-vanilla #1)
Jan  7 20:03:45 localhost kernel: EIP is at 0xf7aa1945
Jan  7 20:03:45 localhost kernel: eax: f5a91e50   ebx: f5a91e50   ecx: 00000000
  edx: 00000001
Jan  7 20:03:45 localhost kernel: esi: c0151d11   edi: 00000000   ebp: f5a91f14
  esp: f5a91ef4
Jan  7 20:03:45 localhost kernel: ds: 007b   es: 007b   ss: 0068
Jan  7 20:03:45 localhost kernel: Process ircp (pid: 4047, ti=f5a90000
task=f61a3550 task.ti=f5a90000)
Jan  7 20:03:45 localhost kernel: Stack: c0112a23 00000000 00000001 f26e6cd8
c018c195 f5a90000 00000000 00000292 
Jan  7 20:03:45 localhost kernel:        f5a91f30 c0113cc7 00000000 00000000
f468dc00 f26e6cc0 f26e6ce4 f273eea4 
Jan  7 20:03:45 localhost kernel:        c0236e5b 00000000 f99936e9 f26e6cc0
c0233ee0 00000000 f99aa780 f26e6cc0 
Jan  7 20:03:45 localhost kernel: Call Trace:
Jan  7 20:03:45 localhost kernel:  [<c0112a23>] __wake_up_common+0x31/0x4f
Jan  7 20:03:45 localhost kernel:  [<c018c195>] ext3_permission+0x0/0xa
Jan  7 20:03:45 localhost kernel:  [<c0113cc7>] __wake_up+0x2a/0x4f
Jan  7 20:03:45 localhost kernel:  [<c0236e5b>] sock_def_wakeup+0x2c/0x45
Jan  7 20:03:45 localhost kernel:  [<f99936e9>] irda_release+0x4f/0x151 [irda]
Jan  7 20:03:45 localhost kernel:  [<c0233ee0>] sock_fasync+0x107/0x113
Jan  7 20:03:45 localhost kernel:  [<c0234b05>] sock_release+0x14/0x6f
Jan  7 20:03:45 localhost kernel:  [<c0234d85>] sock_close+0x2a/0x2f
Jan  7 20:03:45 localhost kernel:  [<c014c7ed>] __fput+0x96/0x150
Jan  7 20:03:45 localhost kernel:  [<c014a3f5>] filp_close+0x51/0x58
Jan  7 20:03:45 localhost kernel:  [<c014b3ec>] sys_close+0x75/0xc1
Jan  7 20:03:45 localhost kernel:  [<c0102c56>] sysenter_past_esp+0x5f/0x85
Jan  7 20:03:45 localhost kernel:  [<c0280033>] xfrm_hash_resize+0x1d/0x265
Jan  7 20:03:45 localhost kernel:  =======================
Jan  7 20:03:45 localhost kernel: Code: ff ff ff ff ff ff 20 19 aa f7 20 19 aa
f7 00 10 00 00 00 00 00 00 00 00 00 00 01 00 00 00 38 19 aa f7 38 19 aa f7 00 00
00 00 cc <44> 8a c1 a8 dc 63 f6 88 32 65 f6 f0 45 a1 f7 f0 15 aa f7 10 09 
Jan  7 20:03:45 localhost kernel: EIP: [<f7aa1945>] 0xf7aa1945 SS:ESP 0068:f5a91ef4
Jan  7 20:03:45 localhost kernel:  <6>note: ircp[4047] exited with preempt_count 2
Jan  7 20:03:45 localhost kernel: BUG: scheduling while atomic: ircp/0x10000002/4047
Jan  7 20:03:45 localhost kernel:  [<c02825b6>] __sched_text_start+0x56/0x546
Jan  7 20:03:45 localhost kernel:  [<c011d665>] run_timer_softirq+0x16e/0x176
Jan  7 20:03:45 localhost kernel:  [<c0112d05>] __cond_resched+0x16/0x34
Jan  7 20:03:45 localhost kernel:  [<c0282ba8>] cond_resched+0x26/0x31
Jan  7 20:03:45 localhost kernel:  [<c013c95e>] unmap_vmas+0x370/0x465
Jan  7 20:03:45 localhost kernel:  [<c013efe7>] exit_mmap+0x69/0xe4
Jan  7 20:03:45 localhost kernel:  [<c01146a1>] mmput+0x1c/0x7c
Jan  7 20:03:45 localhost kernel:  [<c0118a05>] do_exit+0x1b0/0x6e5
Jan  7 20:03:45 localhost kernel:  [<c0116c3b>] printk+0x1b/0x1f
Jan  7 20:03:45 localhost kernel:  [<c010407a>] die+0x1d1/0x1d9
Jan  7 20:03:45 localhost kernel:  [<c01047a9>] do_int3+0x6e/0x74
Jan  7 20:03:45 localhost kernel:  [<c0151d11>] permission+0x86/0xa2
Jan  7 20:03:45 localhost kernel:  [<c02843ef>] int3+0x27/0x2c
Jan  7 20:03:45 localhost kernel:  [<c0151d11>] permission+0x86/0xa2
Jan  7 20:03:45 localhost kernel:  [<c014007b>]
arch_get_unmapped_area_topdown+0xed/0x11c
Jan  7 20:03:45 localhost kernel:  [<c0112a23>] __wake_up_common+0x31/0x4f
Jan  7 20:03:45 localhost kernel:  [<c018c195>] ext3_permission+0x0/0xa
Jan  7 20:03:45 localhost kernel:  [<c0113cc7>] __wake_up+0x2a/0x4f
Jan  7 20:03:45 localhost kernel:  [<c0236e5b>] sock_def_wakeup+0x2c/0x45
Jan  7 20:03:45 localhost kernel:  [<f99936e9>] irda_release+0x4f/0x151 [irda]
Jan  7 20:03:45 localhost kernel:  [<c0233ee0>] sock_fasync+0x107/0x113
Jan  7 20:03:45 localhost kernel:  [<c0234b05>] sock_release+0x14/0x6f
Jan  7 20:03:45 localhost kernel:  [<c0234d85>] sock_close+0x2a/0x2f
Jan  7 20:03:45 localhost kernel:  [<c014c7ed>] __fput+0x96/0x150
Jan  7 20:03:45 localhost kernel:  [<c014a3f5>] filp_close+0x51/0x58
Jan  7 20:03:45 localhost kernel:  [<c014b3ec>] sys_close+0x75/0xc1
Jan  7 20:03:45 localhost kernel:  [<c0102c56>] sysenter_past_esp+0x5f/0x85
Jan  7 20:03:45 localhost kernel:  [<c0280033>] xfrm_hash_resize+0x1d/0x265
Jan  7 20:03:45 localhost kernel:  =======================
Jan  7 20:03:47 localhost kernel: irlap_change_speed(), setting speed to 9600


Regards,
Berthold
Comment 7 Berthold Cogel 2007-04-11 14:15:04 UTC
Hello!

Finally I found the time to check the latest stable kernel: 2.6.20.6

I get still the same error messages. But now I did something I've never tried
before:

Like the last time I called 'ircp -r' and then I started the transfer on my
pocket pc. I got the 'Link error' and even the PDA reported an error. The last
times I closed the 'Send' dialog on the PDA. 
This time I restarted ircp and clicked the 'resend' in the dialog on my PDA. The
file was transmited.

I can reproduce this behavior.

acer01:/tmp# echo 4 > /proc/sys/net/irda/debug;date;ircp -r;echo 0 >
/proc/sys/net/irda/debug
Mi 11. Apr 22:17:45 CEST 2007
Waiting for incoming connection
srv_obex_event(): Link error
acer01:/tmp# echo 4 > /proc/sys/net/irda/debug;date;ircp -r;echo 0 >
/proc/sys/net/irda/debug
Mi 11. Apr 22:18:01 CEST 2007
Waiting for incoming connection
Incoming connection
Receiving SDReg.exe...done
Disconnecting

I'm attaching the debug log.

Regards,
Berthold 


 
Comment 8 Berthold Cogel 2007-04-11 14:16:34 UTC
Created attachment 11132 [details]
debug log
Comment 9 Samuel Ortiz 2007-04-11 15:05:46 UTC
Hi Berthold,

For some reasons, I can't read the attachment. I only get "attachment.cgi"...
Could you please look at it and maybe try to attach it again ?

Thanks for your time.

Cheers,
Samuel.
Comment 10 Adrian Bunk 2007-04-11 16:15:14 UTC
Samuel, if you save Berthold's attachment as file.bz2 and run bunzip2 on it
you'll get the contents.
Comment 11 Samuel Ortiz 2007-04-12 01:16:48 UTC
Yes, that seems to work. Thanks Adrian.
Firefox proposes to save the attachment as attachment.cgi, so I didn't even
bother to check...
Comment 12 Berthold Cogel 2007-04-12 01:29:49 UTC
Sorry, my fault. This was my first attachment in bugzilla.
Comment 13 Berthold Cogel 2007-04-12 01:37:16 UTC
Oops!

I forgot to tell, that I haven't applied the patch for af_irda.c Samuel posted.
I get the 'Assertion failed!...' messages in /var/log/messages and the stuff in
the attachment. 
Comment 14 Olaf Kirch 2007-04-12 04:10:18 UTC
It seems the Pocket PC closes the connection:

Apr 11 22:17:53 localhost kernel: irda_disconnect_indication(f3844600)
[...]

This is signalled to the application as POLLHUP:

Apr 11 22:17:53 localhost kernel: irda_poll()
Apr 11 22:17:53 localhost kernel: irda_poll(), POLLHUP

[...]

And then the application calls recvmsg:

Apr 11 22:17:53 localhost kernel: irda_recvmsg_stream()
Apr 11 22:17:53 localhost kernel: Assertion failed! 
net/irda/af_irda.c:irda_recvmsg_stream:1413 !sock_error(sk)

I think the problem is two-fold.

 -	currently, irda_disconnect_indication will set
	sk->sk_err = ECONNRESET unconditionally whenever it sees
	a transition ESTABLISHED->CLOSED. I think this is plain wrong.
	You only need to return ECONNRESET if the user calls sendmsg()
	on a disconnected socket.

 -	irda_recvmsg_stream should check for TCP_CLOSE, and return 0
	in this case.
Comment 15 Olaf Kirch 2007-04-12 04:11:40 UTC
Created attachment 11135 [details]
Possible fix

This patch is entirely untested, as I don't have the hardware. Please give
it a try.
Comment 16 Berthold Cogel 2007-04-12 15:10:11 UTC
Created attachment 11137 [details]
debug log with patch (debug.log.bz2)

Like the last time:

acer01:/tmp# echo 4 > /proc/sys/net/irda/debug;date;ircp -r;echo 0 >
/proc/sys/net/irda/debug
Do 12. Apr 23:24:49 CEST 2007
Waiting for incoming connection
Speicherzugriffsfehler

Got an Oops. In /var/log/kern.log:

Apr 12 23:24:54 localhost kernel: irda_poll()
Apr 12 23:24:54 localhost kernel: irda_poll(), POLLHUP
Apr 12 23:24:54 localhost kernel: irda_recvmsg_stream()
Apr 12 23:24:54 localhost kernel: BUG: unable to handle kernel NULL pointer
dereference at virtual address 00000004
Apr 12 23:24:54 localhost kernel:  printing eip:
Apr 12 23:24:54 localhost kernel: f99f8a28
Apr 12 23:24:54 localhost kernel: *pde = 00000000
Apr 12 23:24:54 localhost kernel: Oops: 0002 [#1]
Apr 12 23:24:54 localhost kernel: PREEMPT 
Apr 12 23:24:54 localhost kernel: Modules linked in: nls_iso8859_1 nls_cp850
vfat fat radeon drm rfcomm l2cap bluetooth nfs lockd nfs_acl sunrpc af_packet
thermal fan button sbs i2c_ec autofs4 snd_intel8x0m fuse dm_crypt md_mod
dm_snapshot dm_mirror b44 mii ieee80211_crypt_tkip ieee80211_crypt_ccmp
ieee80211_crypt_wep ieee80211 ieee80211_crypt cpufreq_conservative
cpufreq_ondemand cpufreq_performance cpufreq_powersave acpi_cpufreq freq_table
processor sg scsi_mod usbhid hid nsc_ircc snd_intel8x0 snd_ac97_codec pcmcia
firmware_class ac97_bus snd_pcm_oss snd_mixer_oss joydev irda snd_pcm snd_timer
snd ide_cd tifm_7xx1 tifm_sd mmc_block mmc_core rtc crc_ccitt soundcore cdrom
ehci_hcd uhci_hcd yenta_socket rsrc_nonstatic pcmcia_core i2c_i801 pcspkr
tifm_core ohci1394 ieee1394 psmouse usbcore snd_page_alloc intel_agp agpgart
evdev unix
Apr 12 23:24:54 localhost kernel: CPU:	  0
Apr 12 23:24:54 localhost kernel: EIP:	  0060:[<f99f8a28>]    Not tainted VLI
Apr 12 23:24:54 localhost kernel: EFLAGS: 00210297   (2.6.20.6-ir #2)
Apr 12 23:24:54 localhost kernel: EIP is at irda_recvmsg_stream+0x132/0x2ff
[irda]
Apr 12 23:24:54 localhost kernel: eax: 00000000   ebx: 00000000   ecx: 00000001
  edx: ea727654
Apr 12 23:24:54 localhost kernel: esi: ea727600   edi: 00000000   ebp: ea727654
  esp: ef8dde2c
Apr 12 23:24:54 localhost kernel: ds: 007b   es: 007b	ss: 0068
Apr 12 23:24:54 localhost kernel: Process ircp (pid: 10292, ti=ef8dc000
task=f7f4e550 task.ti=ef8dc000)
Apr 12 23:24:54 localhost kernel: Stack: f9a00b16 f99fc6b9 00000001 c0162c96
00000000 ef8ddea4 00000000 00000000 
Apr 12 23:24:54 localhost kernel:	 00000001 00000000 f7f4e550 c0113473
00000000 00000000 f99fc500 ef9989c0 
Apr 12 23:24:54 localhost kernel:	 ef8ddee0 00000003 c023ebc5 00000003
00000000 00000001 00000018 00000014 
Apr 12 23:24:54 localhost kernel: Call Trace:
Apr 12 23:24:54 localhost kernel:  [<c0162c96>] core_sys_select+0x1aa/0x2bc
Apr 12 23:24:54 localhost kernel:  [<c0113473>] default_wake_function+0x0/0xc
Apr 12 23:24:54 localhost kernel:  [<c023ebc5>] sock_aio_read+0xc1/0xcd
Apr 12 23:24:54 localhost kernel:  [<c0157ef3>] do_sync_read+0xc7/0x10a
Apr 12 23:24:54 localhost kernel:  [<f99f2dfc>] irttp_flush_queues+0xa3/0xac
[irda]
Apr 12 23:24:54 localhost kernel:  [<c0125e1f>]
autoremove_wake_function+0x0/0x35
Apr 12 23:24:54 localhost kernel:  [<c01586ef>] vfs_read+0x9c/0x134
Apr 12 23:24:54 localhost kernel:  [<c0158a7b>] sys_read+0x41/0x67
Apr 12 23:24:54 localhost kernel:  [<c0102c66>] sysenter_past_esp+0x5f/0x85
Apr 12 23:24:54 localhost kernel:  =======================
Apr 12 23:24:54 localhost kernel: Code: 00 00 00 8d 6e 54 89 e8 e8 88 a5 84 c6
89 c7 85 c0 0f 85 ba 00 00 00 8b 4c 24 20 39 4c 24 1c 0f 83 7a 01 00 00 8b 86
00 01 00 00 <0f> ba 68 04 01 8b 46 38 8d 54 24 24 e8 b7 d5 72 c6 65 a1 08 00 
Apr 12 23:24:54 localhost kernel: EIP: [<f99f8a28>]
irda_recvmsg_stream+0x132/0x2ff [irda] SS:ESP 0068:ef8dde2c
Apr 12 23:24:54 localhost kernel:  <7>irda_release()
Apr 12 23:24:54 localhost kernel: irda_destroy_socket(ea727600)

And just for fun the second attempt to send the file from the Pocket PC like
the last time without closing the transmit dialog on the Pocket PC:

acer01:/tmp# echo 4 > /proc/sys/net/irda/debug;date;ircp -r;echo 0 >
/proc/sys/net/irda/debug
Do 12. Apr 23:25:14 CEST 2007
Waiting for incoming connection
Incoming connection
Receiving SDReg.exe...done
Disconnecting

I've attached the debug.log.
Comment 17 Samuel Ortiz 2007-04-12 16:57:52 UTC
Hi Berthold,

After spending sometime looking at your first debug log, this is my analysis:

In the failure case. we see the following:

1) Once the PocketPC IAS requests succeeds, we see it connecting to the OBEX LSAP.
2) The corresponding LMP connection succeeds, but the PocketPC keeps on doing
IAS requests, that fails.
3) After several failing IAS requests, the PocketPC decides to close the OBEX lSAP.

In the successful case, we see:
1) same as above.
2) The LMP succeeds, but the PocketPC doesn't resend IAS requests. 
3) The PocketPC seems to understand the the LMP link is OK, and start the
OBEX/IrTTP transfer.

So, I need to understand the following points:
a) Why does the PocketPC peer keep on sending IAS requests when the IrLMP
connection is established
b) why does the IAS requests after the LMP connection is established keep on
failing.

For helping me answering those 2, I would really need to run the same test again
(without Olaf's patch for now), and give us the irdadump output while running
the test (make sure you start irdadump before starting the test, obviously).

I don't need the kernel logs this time.

Thanks a lot for your patience.
Comment 18 Samuel Ortiz 2007-04-12 17:00:49 UTC
Olaf, in response to your comment #14:

I think part of your patch makes sense, but what Berthold is mostly concerned
about is why he can't run the OBEX transfer properly (at least not on the first
try). I don't think your patch intends to fix that bug, even though it tries to
fix other issues. Please correct me if I'm wrong.
Comment 19 Olaf Kirch 2007-04-13 02:37:55 UTC
Re comment #18: Indeed, I don't think my patch will fix the problem at hand.
But the way af_irda currently deals with connection close is wrong (there
should not be a kernel ASSERT when doing something legal on a socket). I
would fix that as a first step and then see if the user space app can cope
with the error condition (simply by retrying for instance). Whatever - this
is something that ought to get fixed.

I will readily admit that my knowledge about the IR protocols is rather
sketchy. I looked at the debug traces anyway, and here's a difference
that may be significant: in the non-working case, the client first looks
for "OBEX:", and then for "OBEX:IrXfer". In the working case, it goes for
"OBEX:IrXfer" straight away. Here's the non-working case:

LM-IAS: Looking up OBEX: IrDA:TinyTP:LsapSel
hashbin_find()
LM-IAS: Object OBEX not found
iriap_getvaluebyclass_response()
iriap_getvaluebyclass_response: sending IAS_MISSING

[...]
The client disconnects:

irlap_do_event(), event = RECV_I_CMD, state = LAP_NRM_S
irlap_state_nrm_s(), event=RECV_I_CMD
irlap_state_nrm_s(), event=RECV_I_CMD nr=2, vs=2, ns=2, vr=2, pf=16
irlap_validate_nr_received(), expected!
irlmp_link_data_indication()
irlmp_link_data_indication(), Disconnect indication!
irlmp_do_lsap_event(), EVENT = LM_DISCONNECT_INDICATION, STATE = 
LSAP_DATA_TRANSFER_READY
irlmp_state_dtr()
irlmp_state_dtr(), trying to close IrLAP
irlmp_do_lap_event(), EVENT = LM_LAP_DISCONNECT_REQUEST, STATE = LAP_ACTIVE
irlmp_state_active()
irlmp_disconnect_indication(), reason=LM_USER_REQUEST
irlmp_disconnect_indication(), slsap_sel=00, dlsap_sel=03

And tries to connect to the OBEX server. But note the LSAP=0. That looks
very bogus to me (in the working case, the connection will be to the LSAP
of the OBEX service).

irlmp_link_data_indication()
irlmp_link_data_indication(), incoming connection, source LSAP=0, dest LSAP=3
irlmp_do_lsap_event(), EVENT = LM_CONNECT_INDICATION, STATE = LSAP_DISCONNECTED
irlmp_state_disconnected()
irlmp_do_lap_event(), EVENT = LM_LAP_CONNECT_REQUEST, STATE = LAP_ACTIVE
irlmp_state_active()
irlmp_state_active(), LS_CONNECT_REQUEST

This may mean that the client is buggy, and fails to grok the reply we
sent (indicating that the service isn't there).
Unfortunately, I don't have the IrLMP spec...
Comment 20 Samuel Ortiz 2007-04-13 03:02:48 UTC
Re comment #20 (stupid bugzilla and its no-mail-interface...):

1) I agree with the af_irda.c code. Note that we don't do kernel ASSERT but
rather IRDA_ASSERT, which are just conditions check. Your patch was doing mostly
the same thing when checking for sock_error(), except that it was forwarding the
error code, which is right IMO.

2) The OBEX protocol specifies that an IrDA peer should first ask for an "OBEX"
service and if that fails, then ask for an "OBEX:IrXFER" one. The "OBEX" service
is for a generic OBEX service while the "OBEX:IrXFER" one is for a file oriented
one. ircp correctly registers itself as "OBEX:IrXFER".
The Pocket PC does not exactly that since it asks several times in a row for an
"OBEX" service, but we cope with that (we send an IAS missing until it asks for
"OBEX:IrXFER"). You can notice from the log that once we answer positively to
the IAS request, it actually connects to the right LSAP:
  irlmp_link_data_indication(), incoming connection, source LSAP=26, dest LSAP=6
But then instead of starting the OBEX transfer, it goes again asking for the
"OBEX" service, gets negative answers, and then decides to close all open LSAPs.
I wonder if it's a PocketPC bug or if we don't correctly send the connection
confirmation on LSAP 26. That's why I asked Berthold for irdadump traces.

3) LSAP 0 is the IAS assigned LSAP. It is specified on the IrLMP spec. Whenever
you want to ask for a service from an IrDA peer, you establish a connection on
its LSAP 0, and sent IAP commands on that link. That's what the PocketPC does,
and once it gets the answer from its "OBEX" or "OBEX:IrXFER" request, it closes
the connection on LSAP 0, while opening another one the the OBEX LSAP whenever
it gets a positive answer. This behaviour is correct.
FWIW though, my Symbian phone closes the whole connection (down to the IrLAP
level) when the IAS request gets a negative answer, while the PocketPC closes
only the LMP one.
Comment 21 Olaf Kirch 2007-04-13 03:51:16 UTC
Re: the reported crash in comment #16 - that's caused by
irda_disconnect_indication calling sock_orphan. It dies in
set_bit(SOCK_ASYNC_WAITDATA, &sk->sk_socket->flags) because
sk->sk_socket is NULL.

The normal usage of sock_orphan is in the release() routine. Anything
else is wrong, IMHO. A user process may be sleeping in say recvmsg_stream
when the disconnect arrives. It wakes up, finds that there's still no
data on the receive queue, and walks into the same oops my patch triggered.

I googled a bit and found your Sep 3, 2006 patch that introduced the
sock_orphan call - but I don't understand why. The oops Carl-Daniel posted
in http://lkml.org/lkml/2006/8/29/358 looks like sk->sk_sleep is
corrupted - but how?
Comment 22 Olaf Kirch 2007-04-13 04:09:17 UTC
Ah, I think I've got the answer to my own question.

The problem was this: in irda_recvmsg_stream, when waiting for data
to arrive, we define a waitq, and add it to sk->sk_sleep. If we find
that sock_error indicates an error, we break out of the loop and return
without cleaning up sk_sleep.
Comment 23 Berthold Cogel 2007-04-14 16:41:16 UTC
Created attachment 11146 [details]
Output of irdadump (irdadump.log.bz2)

Hello Samuel,

I hope this is what you wanted me to do:

I started irdadump with:

irdadump  -d  -x  -s  256 > /tmp/irdadump.log

And then the transmit test:

acer01:~# echo 4 > /proc/sys/net/irda/debug;date;ircp -r;echo 0 >
/proc/sys/net/irda/debug
So 15. Apr 01:32:46 CEST 2007
Waiting for incoming connection
srv_obex_event(): Link error
acer01:~# echo 4 > /proc/sys/net/irda/debug;date;ircp -r;echo 0 >
/proc/sys/net/irda/debug
So 15. Apr 01:33:01 CEST 2007
Waiting for incoming connection
Incoming connection
Receiving SDReg.exe...done
Disconnecting
acer01:~#

I've attached the output of irdadump.


Berthold
Comment 24 Samuel Ortiz 2007-04-15 16:03:42 UTC
Hi Berthold,

That's exactly what I was looking for, thanks a lot.
Unfortunately, I don't see anything clearly bogus: The Pocket PC peer gets our
positive IAS answer, since it replies with a connection request. I can't see why
it wouldn't get our connection response. I checked the LAP fields from the
connection frames and they look good.
One thing that look weird to me is the fact that the Pocket PC doesn't seem to
accept being a secondary device. Even though we start the discovery process, it
eventually sends discovery frames as if it would try to get the primary role.
Moreover, on the second try (the succesful one), there is no role inversion
since the PocketPC is already sending discovery frames when we show up.

So, Berthold, could you try to start your PC as a secondary device (run
"irattach irda0" instead of "irattach irda0 -s") and then initiate the file
transfer from your Pocket PC ?

Thanks again for your time.
Comment 25 Berthold Cogel 2007-04-16 15:23:33 UTC
Let's see:

/etc/defaults/irda-utils: 

DISCOVERY="false" leads to a 'irattach irda0' instead of a 'irattach irda0 -s'.

Starting 'irdadump  -d  -x  -s  256 > /tmp/irdadump.log'.

acer01:~# echo 4 > /proc/sys/net/irda/debug;date;ircp -r;echo 0 >
/proc/sys/net/irda/debug
Di 17. Apr 00:04:35 CEST 2007
Waiting for incoming connection

Starting the transmission at the PocketPC. The PocketPC shows only 'Infrared:
aligning..' until I abort the process.

Aborting ircp with ^C.


/tmp/irdadump.log':

0 packets received by filter


/var/log/debug:

Apr 17 00:04:35 localhost kernel: irda_create()
Apr 17 00:04:35 localhost kernel: irda_create() : self is f3532a00
Apr 17 00:04:35 localhost kernel: irlmp_register_client()
Apr 17 00:04:35 localhost kernel: hashbin_insert()
Apr 17 00:04:35 localhost kernel: enqueue_first()
Apr 17 00:04:35 localhost kernel: irda_bind(f3532a00)
Apr 17 00:04:35 localhost kernel: irlmp_slsap_inuse()
Apr 17 00:04:35 localhost kernel: irlmp_find_free_slsap(), found free lsap_sel=12
Apr 17 00:04:35 localhost kernel: hashbin_insert()
Apr 17 00:04:35 localhost kernel: enqueue_first()
Apr 17 00:04:35 localhost kernel: irttp_open_tsap(), stsap_sel=12
Apr 17 00:04:35 localhost kernel: hashbin_insert()
Apr 17 00:04:35 localhost kernel: enqueue_first()
Apr 17 00:04:35 localhost kernel: irias_new_object()
Apr 17 00:04:35 localhost kernel: hashbin_insert()
Apr 17 00:04:35 localhost kernel: enqueue_first()
Apr 17 00:04:35 localhost kernel: hashbin_insert()
Apr 17 00:04:35 localhost kernel: enqueue_first()
Apr 17 00:04:35 localhost kernel: irda_setsockopt(f3532a00)
Apr 17 00:04:35 localhost kernel: irlmp_register_service(), hints = 2080
Apr 17 00:04:35 localhost kernel: hashbin_insert()
Apr 17 00:04:35 localhost kernel: enqueue_first()
Apr 17 00:04:35 localhost kernel: irda_listen()
Apr 17 00:04:35 localhost kernel: irda_poll()
Apr 17 00:04:35 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01)
Apr 17 00:04:36 localhost kernel: irlmp_discovery_timer_expired()
Apr 17 00:04:36 localhost kernel: irlmp_expire_discoveries()
Apr 17 00:04:36 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01)
Apr 17 00:04:36 localhost kernel: irda_poll()
Apr 17 00:04:36 localhost kernel: irda_poll()
Apr 17 00:04:36 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01)
Apr 17 00:04:37 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01)
Apr 17 00:04:37 localhost kernel: irda_poll()
Apr 17 00:04:37 localhost kernel: irda_poll()
Apr 17 00:04:37 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01)
Apr 17 00:04:38 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01)
Apr 17 00:04:38 localhost kernel: irda_poll()
Apr 17 00:04:38 localhost kernel: irda_poll()
Apr 17 00:04:38 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01)
Apr 17 00:04:39 localhost kernel: irlmp_discovery_timer_expired()
Apr 17 00:04:39 localhost kernel: irlmp_expire_discoveries()
Apr 17 00:04:39 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01)

.... looping ...

Apr 17 00:05:29 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01)
Apr 17 00:05:30 localhost kernel: irlmp_discovery_timer_expired()
Apr 17 00:05:30 localhost kernel: irlmp_expire_discoveries()
Apr 17 00:05:30 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01)
Apr 17 00:05:30 localhost kernel: irda_poll()
Apr 17 00:05:30 localhost kernel: irda_poll()
Apr 17 00:05:30 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01)
Apr 17 00:05:31 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01)
Apr 17 00:05:31 localhost kernel: irda_poll()
Apr 17 00:05:31 localhost kernel: irda_poll()
Apr 17 00:05:31 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01)
Apr 17 00:05:32 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01)
Apr 17 00:05:32 localhost kernel: irda_poll()
Apr 17 00:05:32 localhost kernel: irda_poll()
Apr 17 00:05:32 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01)
Apr 17 00:05:33 localhost kernel: irlmp_discovery_timer_expired()
Apr 17 00:05:33 localhost kernel: irlmp_expire_discoveries()
Apr 17 00:05:33 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01)
Apr 17 00:05:33 localhost kernel: irda_poll()
Apr 17 00:05:33 localhost kernel: irda_poll()
Apr 17 00:05:33 localhost kernel: nsc_ircc_net_ioctl(), irda0, (cmd=0x8B01)
Apr 17 00:05:34 localhost kernel: irda_poll()
Apr 17 00:05:34 localhost kernel: irda_release()
Apr 17 00:05:34 localhost kernel: irda_destroy_socket(f3532a00)
Apr 17 00:05:34 localhost kernel: irlmp_unregister_client()
Apr 17 00:05:34 localhost kernel: hashbin_find()
Apr 17 00:05:34 localhost kernel: irlmp_unregister_client(), removing client!
Apr 17 00:05:34 localhost kernel: hashbin_remove_this()
Apr 17 00:05:34 localhost kernel: dequeue_general()
Apr 17 00:05:34 localhost kernel: irlmp_unregister_service()
Apr 17 00:05:34 localhost kernel: hashbin_find()
Apr 17 00:05:34 localhost kernel: hashbin_remove_this()
Apr 17 00:05:34 localhost kernel: dequeue_general()
Apr 17 00:05:34 localhost kernel: hashbin_remove_this()
Apr 17 00:05:34 localhost kernel: dequeue_general()
Apr 17 00:05:34 localhost kernel: dequeue_first()
Apr 17 00:05:34 localhost last message repeated 4 times
Apr 17 00:05:34 localhost kernel: irias_delete_value()
Apr 17 00:05:34 localhost kernel: dequeue_first()
Apr 17 00:05:34 localhost last message repeated 3 times
Apr 17 00:05:34 localhost kernel: irttp_disconnect_request(), already disconnected!
Apr 17 00:05:34 localhost kernel: irttp_close_tsap()
Apr 17 00:05:34 localhost kernel: hashbin_remove()
Apr 17 00:05:34 localhost kernel: dequeue_general()
Apr 17 00:05:34 localhost kernel: hashbin_remove()
Apr 17 00:05:34 localhost kernel: dequeue_general()
Apr 17 00:05:34 localhost kernel: __irlmp_close_lsap()
Apr 17 00:05:34 localhost kernel: irttp_flush_queues()


Nothing else happens. The IR interface simply doesn't seem to exist.

Stopping the irda-utils, setting DISCOVERY='true' and restarting the irda-utils
and the PocketPC is recognised....
Comment 26 Natalie Protasevich 2008-04-05 02:22:36 UTC
Any updates on this bug?
Berthold, can you confirm the problem with recent kernel please.
Thanks
Comment 27 Berthold Cogel 2008-04-05 12:35:40 UTC
Just tested with 2.6.24.4 (only patched with tuxonice):

acer01:~# echo 4 > /proc/sys/net/irda/debug 
acer01:~# ircp -r
Waiting for incoming connection
srv_obex_event(): Link error
acer01:~# echo 0 > /proc/sys/net/irda/debug

So the problem still exists.... 

I will attach /var/log/debug.

Berthold
Comment 28 Berthold Cogel 2008-04-05 12:38:06 UTC
Created attachment 15628 [details]
debug log with kernel 2.6.24.4
Comment 29 Alan 2012-05-12 13:12:11 UTC
Closing as obsolete , if this is incorrect please re-open