Bug 8891

Summary: in-kernel rpc generates broken RPCBPROC_GETVERSADDR v4 requests
Product: Networking Reporter: Viktor Speidel (speidel)
Component: Other Assignee: Arnaldo Carvalho de Melo (acme)
Status: REJECTED UNREPRODUCIBLE    
Severity: normal CC: protasnb
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.22.1 Subsystem:
Regression: Yes Bisected commit-id:

Description Viktor Speidel 2007-08-15 12:29:51 UTC
Most recent kernel where this bug did not occur: 2.6.20
Distribution: Fedora Core 6
Hardware Environment: x86_64
Software Environment: NFS
Problem Description: When locking a file, an invalid RPCBPROC_GETVERSADDR procedure call is sent out

Steps to reproduce:
on the NFS server, run: tcpdump -xX -pni eth0 -s0 port 111 and src host theNFSclient
on a client running 2.6.22.1, run: flock -x /nfsmount/somefile ls
tcpdump will show:

17:10:01.290655 IP 141.2.15.141.34572 > 141.2.1.1.sunrpc: P 0:168(168) ack 1 win 183 <nop,nop,timestamp 966695 115079499>
        0x0000:  4500 00dc 53ad 4000 3f06 bcdc 8d02 0f8d  E...S.@.?.......
        0x0010:  8d02 0101 870c 006f cdcd 938d db00 c8a7  .......o........
        0x0020:  8018 00b7 e59d 0000 0101 080a 000e c027  ...............'
        0x0030:  06db f94b 8000 00a4 fd48 2a07 0000 0000  ...K.....H*.....
        0x0040:  0000 0002 0001 86a0 0000 0004 0000 0009  ................
        0x0050:  0000 0001 0000 0054 0000 03c6 0000 0007  .......T........
        0x0060:  6572 6964 796b 6500 0000 0000 0000 0000  eridyke.........
        0x0070:  0000 000e 0000 0000 0000 0001 0000 0002  ................
        0x0080:  0000 0003 0000 0004 0000 0005 0000 0006  ................
        0x0090:  0000 0007 0000 0007 0000 0009 0000 000a  ................
        0x00a0:  0000 000c 0000 0014 0000 001c 0000 0000  ................
        0x00b0:  0000 0000 0001 86b5 0000 0004 0000 0003  ................
        0x00c0:  7463 7000 0000 0009 3134 312e 322e 312e  tcp.....141.2.1.
        0x00d0:  3100 0000 0000 0004 7270 6362            1.......rpcb

Note the "141.2.1.1" in the output.

Decoded according to RFC 1833, the request contains the following fields:

RPCBPROC_GETVERSADDR version 4 procedure is being called
r_prog == 0x000186b5 == 100021 == nfs.lockd
r_vers == 4
r_netid == (length 3) "tcp"
r_addr == (length 9) "141.2.1.1"
r_owner == (length 4) "rpcb"
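
As a cross-check, the fields listed above can be recovered mechanically from the capture. The following is only a minimal Python sketch: it assumes the RPCB arguments start at offset 0x00b4 of the dump (right after the zero-length verifier), and the helper names read_xdr_string and decode_rpcb are made up for this illustration, not kernel or library functions.

```python
import struct


def read_xdr_string(buf, offset):
    """Read one XDR string: 4-byte big-endian length, data, zero padding to 4 bytes."""
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    data = buf[start:start + length]
    padded = (length + 3) & ~3  # XDR pads opaque data to a multiple of 4 bytes
    return data.decode("ascii"), start + padded


def decode_rpcb(buf):
    """Decode the rpcb argument structure (RFC 1833) from raw XDR bytes."""
    r_prog, r_vers = struct.unpack_from(">II", buf, 0)
    offset = 8
    r_netid, offset = read_xdr_string(buf, offset)
    r_addr, offset = read_xdr_string(buf, offset)
    r_owner, offset = read_xdr_string(buf, offset)
    return {
        "r_prog": r_prog,
        "r_vers": r_vers,
        "r_netid": r_netid,
        "r_addr": r_addr,
        "r_owner": r_owner,
    }


# Bytes copied from offsets 0x00b4..0x00db of the tcpdump output above.
ARGS = bytes.fromhex(
    "000186b5" "00000004"               # r_prog, r_vers
    "00000003" "74637000"               # r_netid: length 3, "tcp" + 1 pad byte
    "00000009" "3134312e322e312e31" "000000"  # r_addr: length 9 + 3 pad bytes
    "00000004" "72706362"               # r_owner: length 4, "rpcb"
)

if __name__ == "__main__":
    print(decode_rpcb(ARGS))
```

The decoded r_addr is the nine-character string without any port component, which is the point of this report.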

This r_addr member is supposed to contain a universal address. Although I have
no source that defines the format, the RFC clearly says a service can listen on
an address, and RPCBPROC_GETVERSADDR is supposed to return a universal address
too. From this I conclude that a universal address is supposed to contain the
port number, and other operating systems use the format
a.b.c.d.PortHighByte.PortLowByte (like in FTP, but with . instead of ,). I
interpret RFC 1833 to mean that the port number of rpcbind, that is, 111, is
to be appended, so the correct value of r_addr would be "141.2.1.1.0.111". This
matches other implementations.
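
To make the expected format concrete, here is a small Python sketch that builds a universal address in the a.b.c.d.PortHighByte.PortLowByte form described above. The function name to_universal_address is hypothetical and only covers the IPv4 case; it illustrates the string format and is not the kernel's code.

```python
import ipaddress


def to_universal_address(host, port):
    """Format an IPv4 host and a port as 'a.b.c.d.PortHighByte.PortLowByte'."""
    addr = ipaddress.IPv4Address(host)  # raises ValueError for a malformed address
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port out of range: %r" % (port,))
    # The 16-bit port is split into its high and low byte, each in decimal.
    return "%s.%d.%d" % (addr, port >> 8, port & 0xFF)


if __name__ == "__main__":
    # rpcbind listens on port 111, so the high byte is 0 and the low byte is 111.
    print(to_universal_address("141.2.1.1", 111))
```

For the server in the capture this yields "141.2.1.1.0.111", the value expected in r_addr instead of the bare "141.2.1.1".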

Of course, all this sounds like nitpicking, as the callee is not even supposed
to look at the port number (this argument is only there so the callee can
decide correctly which interface address to use for the response). However,
this omission triggers an apparently unpatched denial of service vulnerability
in HP-UX 11.11's rpcbind and causes it to quit with a bus error and a core
dump. When a correctly formed universal address, or the empty string, is used,
HP-UX's rpcbind does not crash.

And of course it is HP who has to fix the DoS bug, but Linux 2.6.22 triggers
it through an RFC violation, which is a minor bug to be fixed on Linux's side.
By the way, does anyone have the right contacts at HP to report this bug? I do
not have an HP service contract any more, and the only other support channel I
found is their public forum, where I of course can only provide limited
information about the bug (basically, the same as I am posting here).

Maybe the bug is fixed in a more recent kernel. I was only using Fedora's
highly patched distribution kernel, but I compared it to the sources of the
base version in the main tree and see the very same problem there. So I do not
think Fedora caused the problem, and I am assigning it to mainline.
Comment 1 Anonymous Emailer 2007-08-15 12:36:25 UTC
Reply-To: akpm@linux-foundation.org

On Wed, 15 Aug 2007 12:22:51 -0700 (PDT)
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=8891
> 
>            Summary: in-kernel rpc generates broken RPCBPROC_GETVERSADDR v4
>                     requests
>            Product: Networking
>            Version: 2.5
>      KernelVersion: 2.6.22.1
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: acme@ghostprotocols.net
>         ReportedBy: speidel@ueberschuss.de
> 
> 
> Most recent kernel where this bug did not occur: 2.6.20

Apparently a regression.

Comment 2 Anonymous Emailer 2007-08-15 13:04:58 UTC
Reply-To: chuck.lever@oracle.com

Andrew Morton wrote:
> On Wed, 15 Aug 2007 12:22:51 -0700 (PDT)
> bugme-daemon@bugzilla.kernel.org wrote:
> 
>> http://bugzilla.kernel.org/show_bug.cgi?id=8891
>>
>>            Summary: in-kernel rpc generates broken RPCBPROC_GETVERSADDR v4
>>                     requests
>>            Product: Networking
>>            Version: 2.5
>>      KernelVersion: 2.6.22.1
>>           Platform: All
>>         OS/Version: Linux
>>               Tree: Mainline
>>             Status: NEW
>>           Severity: normal
>>           Priority: P1
>>          Component: Other
>>         AssignedTo: acme@ghostprotocols.net
>>         ReportedBy: speidel@ueberschuss.de
>>
>>
>> Most recent kernel where this bug did not occur: 2.6.20
> 
> Apparently a regression.

RPCBIND v4 requests are an experimental feature.  They can be disabled
via a kernel build option (CONFIG_SUNRPC_BIND34).  I think they are
left enabled in Fedora to find just this type of problem before v3 and v4
become the default.

>> Distribution: Fedora Core 6
>> Hardware Environment: x86_64
>> Software Environment: NFS
>> Problem Description: When locking a file, an invalid RPCBPROC_GETVERSADDR
>> procedure call is sent out
>>
>> Steps to reproduce:
>> on the NFS server, run: tcpdump -xX -pni eth0 -s0 port 111 and src host
>> theNFSclient
>> on a client running 2.6.22.1, run: flock -x /nfsmount/somefile ls
>> tcpdump will show:
>>
>> 17:10:01.290655 IP 141.2.15.141.34572 > 141.2.1.1.sunrpc: P 0:168(168) ack 1
>> win 183 <nop,nop,timestamp 966695 115079499>
>>         0x0000:  4500 00dc 53ad 4000 3f06 bcdc 8d02 0f8d  E...S.@.?.......
>>         0x0010:  8d02 0101 870c 006f cdcd 938d db00 c8a7  .......o........
>>         0x0020:  8018 00b7 e59d 0000 0101 080a 000e c027  ...............'
>>         0x0030:  06db f94b 8000 00a4 fd48 2a07 0000 0000  ...K.....H*.....
>>         0x0040:  0000 0002 0001 86a0 0000 0004 0000 0009  ................
>>         0x0050:  0000 0001 0000 0054 0000 03c6 0000 0007  .......T........
>>         0x0060:  6572 6964 796b 6500 0000 0000 0000 0000  eridyke.........
>>         0x0070:  0000 000e 0000 0000 0000 0001 0000 0002  ................
>>         0x0080:  0000 0003 0000 0004 0000 0005 0000 0006  ................
>>         0x0090:  0000 0007 0000 0007 0000 0009 0000 000a  ................
>>         0x00a0:  0000 000c 0000 0014 0000 001c 0000 0000  ................
>>         0x00b0:  0000 0000 0001 86b5 0000 0004 0000 0003  ................
>>         0x00c0:  7463 7000 0000 0009 3134 312e 322e 312e  tcp.....141.2.1.
>>         0x00d0:  3100 0000 0000 0004 7270 6362            1.......rpcb
>>
>> Note the "141.2.1.1" in the output.
>>
>> According to RFC 1833, you can read here the following fields:
>>
>> RPCBPROC_GETVERSADDR version 4 procedure is being called
>> r_prog == 0x000186b5 == 100021 == nfs.lockd
>> r_vers == 4
>> r_netid == (length 3) "tcp"
>> r_addr == (length 9) "141.2.1.1"
>> r_owner == (length 4) "rpcb"
>>
>> This r_addr member is supposed to contain an universal address. Although I
>> have no source for that, the RFC clearly says a service can listen to an
>> address, and RPCBPROC_GETVERSADDR is supposed to return an universal address
>> too. From this I can conclude that an universal address is supposed to
>> contain the port number, and other operating systems are using the format
>> a.b.c.d.PortHighByte.PortLowByte (like in FTP, but with . instead of ,). I
>> interpret the RFC 1833 so that the port number of the rpcbind, that is, 111,
>> is to be appended, so the correct value of r_addr would be
>> "141.2.1.1.0.111". This matches other implementations.

That RFC is unclear on exactly what a universal address should look
like.  However, speidel's interpretation is not unreasonable, and I'm
glad he was able to check this against other implementations that I
don't have.  My unit testing against Solaris did not reveal this issue.

I'll look into the problem.
Comment 3 Viktor Speidel 2007-08-15 13:19:24 UTC
Yes, I tried my test program that emits the same request against AIX and Solaris, and neither is vulnerable to this DoS. However, I had no means to verify whether an incomplete r_addr is considered at all or ignored, because I have no access to a multi-network-card AIX or Solaris server. I tried specifying 0.0.0.0.0.111 and 127.0.0.1.0.111 to make rpcbind pick another interface, without luck on both systems, so apparently they either do not provide this feature and ignore r_addr, or they explicitly catch such abuses and assume no address was specified if such a "bogus" one is used.

Also... no idea where the specification of a universal address is to be found. The RFC refers to "authorities" defining the format, but I have now done some googling and found that RFC 3530 (which is for NFS, not for RPC!) defines the term:

   For TCP over IPv4 and for UDP over IPv4, the format of r_addr is the
   US-ASCII string:

      h1.h2.h3.h4.p1.p2

   The prefix, "h1.h2.h3.h4", is the standard textual form for
   representing an IPv4 address, which is always four octets long.
   Assuming big-endian ordering, h1, h2, h3, and h4, are respectively,
   the first through fourth octets each converted to ASCII-decimal.
   Assuming big-endian ordering, p1 and p2 are, respectively, the first
   and second octets each converted to ASCII-decimal.  For example, if a
   host, in big-endian order, has an address of 0x0A010307 and there is
   a service listening on, in big endian order, port 0x020F (decimal
   527), then the complete universal address is "10.1.3.7.2.15".

   For TCP over IPv4 the value of r_netid is the string "tcp".  For UDP
   over IPv4 the value of r_netid is the string "udp".

   For TCP over IPv6 and for UDP over IPv6, the format of r_addr is the
   US-ASCII string:

      x1:x2:x3:x4:x5:x6:x7:x8.p1.p2
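
The IPv4 rule quoted from RFC 3530 can be turned into a small checker. The Python sketch below is only an illustration: the function name parse_universal_address is hypothetical, and it handles the h1.h2.h3.h4.p1.p2 form only, not the IPv6 variant. It accepts the RFC's own example and rejects the port-less string the kernel sent.

```python
import ipaddress


def parse_universal_address(uaddr):
    """Split 'h1.h2.h3.h4.p1.p2' into (host, port); raise ValueError if malformed."""
    parts = uaddr.rsplit(".", 2)
    if len(parts) != 3:
        raise ValueError("not a universal address: %r" % (uaddr,))
    host, p1, p2 = parts
    # Whatever precedes the two port octets must be a complete IPv4 address.
    ipaddress.IPv4Address(host)  # raises AddressValueError, a ValueError subclass
    for field in (p1, p2):
        if not (field.isascii() and field.isdigit()) or int(field) > 255:
            raise ValueError("bad port octet: %r" % (field,))
    # p1 is the high byte of the port, p2 the low byte.
    return host, (int(p1) << 8) | int(p2)


if __name__ == "__main__":
    print(parse_universal_address("10.1.3.7.2.15"))
    try:
        parse_universal_address("141.2.1.1")
    except ValueError as exc:
        print("rejected:", exc)
```

With this reading, "141.2.1.1" is rejected because only "141.2" remains as the host part once two trailing fields are taken as port octets.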
Comment 4 Natalie Protasevich 2008-02-10 19:36:01 UTC
Has the problem been fixed? If not: Viktor, since you stated that this is a regression and the problem did not occur in 2.6.20, would it be possible to use git bisect to identify what broke it?
Thanks.