Bug 5505
Summary: | BIND9 getting errno 14: Bad address | ||
---|---|---|---|
Product: | Networking | Reporter: | Stefan Schmidt (zaphodb) |
Component: | Other | Assignee: | Herbert Xu (herbert) |
Status: | RESOLVED CODE_FIX | ||
Severity: | normal | CC: | chutz |
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.14-rc5 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: | Patch to fix zero-length packet reception |
Description
Stefan Schmidt
2005-10-27 05:02:52 UTC
Date: Mon, 31 Oct 2005 11:04:54 +0100 From: Stefan Schmidt <zaphodb@zaphods.net> To: Ian McDonald <imcdnzl@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>, Andrew Morton <akpm@osdl.org>, Herbert Xu <herbert@gondor.apana.org.au>, Arnaldo Carvalho de Melo <acme@conectiva.com.br>, "David S. Miller" <davem@davemloft.net>, netdev@vger.kernel.org, Linus Torvalds <torvalds@osdl.org> Subject: Re: [Bug 5505] New: BIND9 getting errno 14: Bad address On Fri, Oct 28, 2005 at 12:18:27AM +0200, Stefan Schmidt wrote: > > > > Stefan, are you able to obtain the `strace -f' output from a failur= e? > > > Please. =2E.. > I will try to get hold of another machine and load it with the same > installation and zone-data in order to reproduce the problem. > Getting the same machine model will be difficult as we are low on pool > machines but i'll try to aquire a PIV Xeon HT machine. It seems my strategy to forget about the week through some pints of guinness was a full success. I just remembered i haven't sent you my results yet this very morning, sorry: I was able to reproduce the bug with on another DL360 and kernel 2.6.14-rc5 and i have a full strace for you. This is the interesting part of it: 2539 18:31:57.648544 <... futex resumed> ) =3D 0 2539 18:31:57.648624 futex(0x818ab08, FUTEX_WAKE, 1) =3D 0 2539 18:31:57.648722 gettimeofday({1130517117, 648752}, NULL) =3D 0 2539 18:31:57.648797 recvmsg(20, {msg_name(16)=3D{sa_family=3DAF_INET, sin= _port=3Dhtons(4269), sin_addr=3Dinet_addr("222.255.121.130")}, msg_iov(1)= =3D[{"+\236\0\0\0\1\0\0\0\0\0\1\rextreme-event\2de\0\0\17"..., 4096}], msg_= con trollen=3D20, {cmsg_len=3D20, cmsg_level=3DSOL_SOCKET, cmsg_type=3D0x1d /* = SCM_??? */, ...}, msg_flags=3D0}, 0) =3D 45 2539 18:31:57.648930 recvmsg(20, 0xb764e3c8, 0) =3D -1 EFAULT (Bad address) 2539 18:31:57.648993 futex(0x8186eac, FUTEX_WAKE, 2147483647) =3D 0 2539 18:31:57.649086 time([1130517117]) =3D 1130517117 2539 18:31:57.649152 open("/etc/localtime", O_RDONLY) =3D -1 ENOENT (No su= ch file or directory) 2539 18:31:57.649233 open("/etc/localtime", O_RDONLY) =3D -1 ENOENT (No su= ch file or directory) 2539 18:31:57.649305 open("/etc/localtime", O_RDONLY) =3D -1 ENOENT (No su= ch file or directory) 2539 18:31:57.649391 send(3, "<27>Oct 28 16:31:57 named[2537]:"..., 70, MS= G_NOSIGNAL) =3D 70 2539 18:31:57.649478 time([1130517117]) =3D 1130517117 2539 18:31:57.649544 open("/etc/localtime", O_RDONLY) =3D -1 ENOENT (No su= ch file or directory) 2539 18:31:57.649617 open("/etc/localtime", O_RDONLY) =3D -1 ENOENT (No su= ch file or directory) 2539 18:31:57.649688 open("/etc/localtime", O_RDONLY) =3D -1 ENOENT (No su= ch file or directory) 2539 18:31:57.649769 send(3, "<27>Oct 28 16:31:57 named[2537]:"..., 87, MS= G_NOSIGNAL) =3D 87 2539 18:31:57.648930 recvmsg(20, 0xb764e3c8, 0) =3D -1 EFAULT (Bad address) 2539 18:31:57.648993 futex(0x8186eac, FUTEX_WAKE, 2147483647) =3D 0 2539 18:31:57.649086 time([1130517117]) =3D 1130517117 2539 18:31:57.649152 open("/etc/localtime", O_RDONLY) =3D -1 ENOENT (No su= ch file or directory) 2539 18:31:57.649233 open("/etc/localtime", O_RDONLY) =3D -1 ENOENT (No su= ch file or directory) 2539 18:31:57.649305 open("/etc/localtime", O_RDONLY) =3D -1 ENOENT (No su= ch file or directory) 2539 18:31:57.649391 send(3, "<27>Oct 28 16:31:57 named[2537]:"..., 70, MS= G_NOSIGNAL) =3D 70 2539 18:31:57.649478 time([1130517117]) =3D 1130517117 2539 18:31:57.649544 open("/etc/localtime", O_RDONLY) =3D -1 ENOENT (No su= ch file or directory) 2539 18:31:57.649617 open("/etc/localtime", O_RDONLY) =3D -1 ENOENT (No su= ch file or directory) 2539 18:31:57.649688 open("/etc/localtime", O_RDONLY) =3D -1 ENOENT (No su= ch file or directory) 2539 18:31:57.649769 send(3, "<27>Oct 28 16:31:57 named[2537]:"..., 87, MS= G_NOSIGNAL) =3D 87 2539 18:31:57.649852 futex(0x818ab40, FUTEX_WAKE, 1) =3D 1 2539 18:31:57.649954 recvmsg(20, <unfinished ...> 2539 18:31:57.650021 <... recvmsg resumed> 0xb764e3c8, 0) =3D -1 EAGAIN (R= esource temporarily unavailable) 2539 18:31:57.650088 write(8, "\24\0\0\0\375\377\377\377", 8 <unfinished .= =2E.> 2539 18:31:57.650209 <... write resumed> ) =3D 8 2539 18:31:57.650459 sendmsg(20, {msg_name(16)=3D{sa_family=3DAF_INET, sin= _port=3Dhtons(4269), sin_addr=3Dinet_addr("222.255.121.130")}, msg_iov(1)= =3D[{"+\236\204\0\0\1\0\1\0\2\0\3\rextreme-event\2de\0\0\17"..., 152}], msg= _co ntrollen=3D0, msg_flags=3D0}, 0 <unfinished ...> 2539 18:31:57.650625 <... sendmsg resumed> ) =3D 152 2539 18:31:57.650764 recvmsg(20, <unfinished ...> 2539 18:31:57.650880 <... recvmsg resumed> 0xb764cdc8, 0) =3D -1 EAGAIN (R= esource temporarily unavailable) 2539 18:31:57.650964 futex(0x818ab40, FUTEX_WAIT, 3815, NULL <unfinished .= =2E.> I only provided you with the output from pid 2539 here. Find the full strace -f -tt here: -> http://www.zaphods.net/~zaphodb/bigger.named.strace.bz2 but be warned: -rw-r--r-- 1 root root 256M 2005-10-28 18:38 bigger.named.strace -rw-r--r-- 1 root root 19M 2005-10-28 18:38 bigger.named.strace.bz2 > > I would also add are you able to use git-bisect to track down the > > changeset causing this. I've used this to find obscure bugs already > > and it is easy to use. > I will google on what git-bisect is and then tell you whether or not i am > able to. ;) I will have to give back the machine soon, i'll see what i can do. Stefan PS: tcpreplay rocks. I had this error twice now on my Gentoo Linux System with - bind-9.3.1 - kernel 2.6.14 Since i upgraded both Kernel and Bind on the same day, i can't really say who's fault this is. Bind 9.3.1 has been out for quite some time, however... I will go back to 2.6.13 tonight and within a week i'll know if this has fixed it :-) Oct 30 19:04:08 [named] errno2result.c:109: unexpected error: Oct 30 19:04:08 [named] unable to convert errno to isc_result: 14: Bad address Oct 30 19:04:08 [named] UDP client handler shutting down due to fatal receive error: unexpected error Oct 31 13:44:44 [named] errno2result.c:109: unexpected error: Oct 31 13:44:44 [named] unable to convert errno to isc_result: 14: Bad address Oct 31 13:44:44 [named] UDP client handler shutting down due to fatal receive error: unexpected error Created attachment 6453 [details]
Patch to fix zero-length packet reception
This patch fixes zero-length packet reception which currently results in EFAULT
due to the rewrite of skb_copy_datagram_iovec.
|