Most recent kernel where this bug did not occur: 2.6.20
Distribution: Fedora Core 6
Hardware Environment: x86_64
Software Environment: NFS
Problem Description: When locking a file, an invalid RPCBPROC_GETVERSADDR procedure call is sent out

Steps to reproduce:
on the NFS server, run: tcpdump -xX -pni eth0 -s0 port 111 and src host theNFSclient
on a client running 2.6.22.1, run: flock -x /nfsmount/somefile ls
tcpdump will show:

17:10:01.290655 IP 141.2.15.141.34572 > 141.2.1.1.sunrpc: P 0:168(168) ack 1 win 183 <nop,nop,timestamp 966695 115079499>
        0x0000:  4500 00dc 53ad 4000 3f06 bcdc 8d02 0f8d  E...S.@.?.......
        0x0010:  8d02 0101 870c 006f cdcd 938d db00 c8a7  .......o........
        0x0020:  8018 00b7 e59d 0000 0101 080a 000e c027  ...............'
        0x0030:  06db f94b 8000 00a4 fd48 2a07 0000 0000  ...K.....H*.....
        0x0040:  0000 0002 0001 86a0 0000 0004 0000 0009  ................
        0x0050:  0000 0001 0000 0054 0000 03c6 0000 0007  .......T........
        0x0060:  6572 6964 796b 6500 0000 0000 0000 0000  eridyke.........
        0x0070:  0000 000e 0000 0000 0000 0001 0000 0002  ................
        0x0080:  0000 0003 0000 0004 0000 0005 0000 0006  ................
        0x0090:  0000 0007 0000 0007 0000 0009 0000 000a  ................
        0x00a0:  0000 000c 0000 0014 0000 001c 0000 0000  ................
        0x00b0:  0000 0000 0001 86b5 0000 0004 0000 0003  ................
        0x00c0:  7463 7000 0000 0009 3134 312e 322e 312e  tcp.....141.2.1.
        0x00d0:  3100 0000 0000 0004 7270 6362            1.......rpcb

Note the "141.2.1.1" in the output.

According to RFC 1833, you can read the following fields here:

RPCBPROC_GETVERSADDR version 4 procedure is being called
r_prog == 0x000186b5 == 100021 == nfs.lockd
r_vers == 4
r_netid == (length 3) "tcp"
r_addr == (length 9) "141.2.1.1"
r_owner == (length 4) "rpcb"

This r_addr member is supposed to contain a universal address. Although I have no source for that, the RFC clearly says a service can listen to an address, and RPCBPROC_GETVERSADDR is supposed to return a universal address too.
From this I conclude that a universal address is supposed to contain the port number, and other operating systems use the format a.b.c.d.PortHighByte.PortLowByte (like in FTP, but with . instead of ,). I interpret RFC 1833 to mean that the port number of rpcbind, that is, 111, is to be appended, so the correct value of r_addr would be "141.2.1.1.0.111". This matches other implementations.

Of course, all this sounds like nitpicking, as the callee is not even supposed to look at the port number (this argument is only there so the callee can correctly decide which interface address to use for the response). However, this omission triggers an apparently unpatched denial of service vulnerability in HP-UX 11.11's rpcbind and causes it to quit with a bus error and a core dump. With a correctly formed universal address, or the empty string, HP-UX's rpcbind does not crash.

And of course it is HP who has to fix the DoS bug, but Linux 2.6.22 triggers it through an RFC violation, which is a minor bug to be fixed on Linux's side. By the way, does anyone have the right contacts at HP to report this bug? I no longer have an HP service contract, and the only other support channel I found is their public forum, where I can of course only provide limited information about the bug (basically the same as I am posting here).

Maybe the bug is fixed in a more current kernel. I was only using Fedora's highly patched distribution kernel and compared it to the sources of the base version in the main tree, where I see the very same problem in the source code. So I do not think Fedora caused the problem, and I am assigning it to mainline.
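
For readers who want to cross-check the field values listed in the report, the last 40 bytes of the captured payload (offset 0x00b4 onward) can be decoded as an XDR-encoded rpcb structure. The Python below is only a sketch: the hex string is transcribed by hand from the dump in the report, and the helper names (read_uint, read_string, decode_rpcb) are made up for illustration and do not come from any RPC library.

```python
import struct

# Last 40 bytes of the captured packet, transcribed from the tcpdump output.
ARGS_HEX = (
    "000186b5"                              # r_prog
    "00000004"                              # r_vers
    "00000003" "74637000"                   # r_netid: length 3, "tcp" + 1 pad byte
    "00000009" "3134312e322e312e31000000"   # r_addr: length 9 + 3 pad bytes
    "00000004" "72706362"                   # r_owner: length 4, "rpcb"
)


def read_uint(buf, off):
    """Read one big-endian 32-bit XDR unsigned int; return (value, new offset)."""
    (value,) = struct.unpack_from(">I", buf, off)
    return value, off + 4


def read_string(buf, off):
    """Read one XDR string: 4-byte length, data, zero padding to a 4-byte boundary."""
    length, off = read_uint(buf, off)
    data = buf[off:off + length]
    off += (length + 3) & ~3  # skip the data plus its padding
    return data.decode("ascii"), off


def decode_rpcb(buf):
    """Decode the five rpcb fields in the order given by RFC 1833."""
    off = 0
    r_prog, off = read_uint(buf, off)
    r_vers, off = read_uint(buf, off)
    r_netid, off = read_string(buf, off)
    r_addr, off = read_string(buf, off)
    r_owner, off = read_string(buf, off)
    return {"r_prog": r_prog, "r_vers": r_vers, "r_netid": r_netid,
            "r_addr": r_addr, "r_owner": r_owner}


rpcb = decode_rpcb(bytes.fromhex(ARGS_HEX))
for name, value in rpcb.items():
    print(f"{name} = {value!r}")
```

The decoded r_addr contains only the four address octets, with no trailing port octets, which is the omission the report describes.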
Reply-To: akpm@linux-foundation.org

On Wed, 15 Aug 2007 12:22:51 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=8891
>
>            Summary: in-kernel rpc generates broken RPCBPROC_GETVERSADDR v4
>                     requests
>            Product: Networking
>            Version: 2.5
>      KernelVersion: 2.6.22.1
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: acme@ghostprotocols.net
>         ReportedBy: speidel@ueberschuss.de
>
>
> Most recent kernel where this bug did not occur: 2.6.20

Apparently a regression.
Reply-To: chuck.lever@oracle.com

Andrew Morton wrote:
> On Wed, 15 Aug 2007 12:22:51 -0700 (PDT)
> bugme-daemon@bugzilla.kernel.org wrote:
>
>> http://bugzilla.kernel.org/show_bug.cgi?id=8891
>>
>> Most recent kernel where this bug did not occur: 2.6.20
>
> Apparently a regression.

RPCBIND v4 requests are an experimental feature. They can be disabled via a kernel build option (CONFIG_SUNRPC_BIND34). I think they are left enabled in Fedora to find just this type of problem before v3 and v4 become the default.

>> This r_addr member is supposed to contain an universal address. Although I
>> have no source for that, the RFC clearly says a service can listen to an
>> address, and RPCBPROC_GETVERSADDR is supposed to return an universal address
>> too. From this I can conclude that an universal address is supposed to
>> contain the port number, and other operating systems are using the format
>> a.b.c.d.PortHighByte.PortLowByte (like in FTP, but with . instead of ,). I
>> interpret the RFC 1833 so that the port number of the rpcbind, that is, 111,
>> is to be appended, so the correct value of r_addr would be "141.2.1.1.0.111".
>> This matches other implementations.

That RFC is unclear on exactly what a universal address should look like. However, speidel's interpretation is not unreasonable, and I'm glad he was able to check this against other implementations that I don't have. My unit testing against Solaris did not reveal this issue.

I'll look into the problem.
Yes, I ran my test program, which emits the same request, against AIX and Solaris, and neither is vulnerable to this DoS. However, I had no means to verify whether an incomplete r_addr is considered at all or simply ignored, because I have no access to a multi-network-card AIX or Solaris server. I tried specifying 0.0.0.0.0.111 and 127.0.0.1.0.111 to make rpcbind pick another interface, without luck on both systems. So apparently they either do not provide this feature and ignore r_addr, or they explicitly catch such abuses and assume no address was specified when such a "bogus" one is used.

Also, I had no idea where the specification of a universal address is to be found. The RFC refers to "authorities" defining the format, but I have now done some googling and found that RFC 3530 (which is for NFS, not for RPC!) defines the term:

   For TCP over IPv4 and for UDP over IPv4, the format of r_addr is the
   US-ASCII string:

      h1.h2.h3.h4.p1.p2

   The prefix, "h1.h2.h3.h4", is the standard textual form for
   representing an IPv4 address, which is always four octets long.
   Assuming big-endian ordering, h1, h2, h3, and h4, are respectively,
   the first through fourth octets each converted to ASCII-decimal.
   Assuming big-endian ordering, p1 and p2 are, respectively, the first
   and second octets each converted to ASCII-decimal.  For example, if a
   host, in big-endian order, has an address of 0x0A010307 and there is
   a service listening on, in big endian order, port 0x020F (decimal
   527), then the complete universal address is "10.1.3.7.2.15".

   For TCP over IPv4 the value of r_netid is the string "tcp".  For UDP
   over IPv4 the value of r_netid is the string "udp".

   For TCP over IPv6 and for UDP over IPv6, the format of r_addr is the
   US-ASCII string:

      x1:x2:x3:x4:x5:x6:x7:x8.p1.p2
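
To make the IPv4 rule quoted from RFC 3530 concrete, here is a small Python sketch that builds and parses universal addresses of the form h1.h2.h3.h4.p1.p2. The function names format_uaddr and parse_uaddr are hypothetical, chosen for this illustration only; this is not the kernel's code and it does not cover the IPv6 form.

```python
import ipaddress


def format_uaddr(host, port):
    """Build an IPv4 universal address: the dotted quad plus the two port octets."""
    ipaddress.IPv4Address(host)  # raises ValueError if host is not a valid dotted quad
    if not 0 <= port <= 0xFFFF:
        raise ValueError(f"port out of range: {port}")
    return f"{host}.{port >> 8}.{port & 0xFF}"


def parse_uaddr(uaddr):
    """Split an IPv4 universal address into (host, port); reject anything else."""
    parts = uaddr.split(".")
    if len(parts) != 6:
        raise ValueError(f"expected 6 dot-separated fields, got {len(parts)}")
    host = ".".join(parts[:4])
    ipaddress.IPv4Address(host)  # validate the address part
    p1, p2 = int(parts[4]), int(parts[5])
    if not (0 <= p1 <= 255 and 0 <= p2 <= 255):
        raise ValueError("port octets must be in the range 0..255")
    return host, (p1 << 8) | p2


print(format_uaddr("10.1.3.7", 527))    # the example from the RFC text
print(format_uaddr("141.2.1.1", 111))   # the value expected in this bug report
print(parse_uaddr("141.2.1.1.0.111"))

try:
    parse_uaddr("141.2.1.1")            # the r_addr actually sent by 2.6.22.1
except ValueError as err:
    print("rejected:", err)
```

Under this reading, the address sent by the 2.6.22.1 client is rejected because it has four fields instead of six, while the form with the rpcbind port appended parses cleanly.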
Has the problem been fixed? If not - Victor, since you stated that this is a regression and the problem wasn't in 2.6.20, is it possible to use git bisect to identify what broke it? Thanks.