Bug 6828

Summary: Bogus FH in NFS request causes DoS in file system code
Product: File System
Reporter: Laurence Withers (l)
Component: NFS
Assignee: Neil Brown (neilb)
Status: CLOSED PATCH_ALREADY_AVAILABLE    
Severity: normal    
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.14.4
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: working exploit code for this bug

Description Laurence Withers 2006-07-13 03:29:41 UTC
We found this rather surprising behaviour when debugging a
network card for one of our embedded systems. There was a
bus problem that occasionally caused the network card to
place random data in the outgoing packets. We were using
NFS root, as we hadn't written drivers for the block
devices yet, and discovered our Linux NFS servers getting
ext3 errors. It turned out that the 3com cards we have in
the servers lie about checking UDP checksums, and passed
the rubbish to knfsd where it was causing the problem. 

Here's an example: one of our widgets (dcm503) is talking
to an NFS server (dufftown).

17:28:38.535011 dcm503.guralp.local.984095109 > dufftown.guralp.local.nfs: 116 
lookup fh Unknown/1 "" (DF) (ttl 64, id 0, len 144)
                         4500 0090 0000 4000 4011 3d45 0a52 01fa
                         c0a8 3024 03ff 0801 007c 8e9c 3aa8 1985
                         0000 0000 0000 0002 0001 86a3 0000 0002
                         0000 0004 0000 0001 0000 001c 028f 5b0c
                         0000 0006 6463 6d35 3033 0000 0000 0000
                         0000 0000 0000 0000 0000 0000 0000 0000
                         0100 0001 0021 0003 3d26 3d00 4a2f ffff
                         3d00 2c08 c923 0000 0000 0000 0000 0000
                         0000 0000 000a 6d6f 756e 7470 6f69 6e74

What has happened here is that 4a2f ffff should have been
4a2f xxxx, but the network card missed the clock on the bus
and got ffff instead.

nfsd_dispatch: vers 2 proc 4
nfsd: LOOKUP   32: 01000001 03002100 003d263d ffff2f4a 082c003d 000023c9
nfsd: nfsd_lookup(fh 32: 01000001 03002100 003d263d ffff2f4a 082c003d 
000023c9, )
nfsd: fh_verify(32: 01000001 03002100 003d263d ffff2f4a 082c003d 000023c9)

So here the client does a V2 lookup with an FH which has
been corrupted by my client's network card. This is
received by my server, gets past the UDP checksum code
(thank you 3com) and ends up at knfsd.
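
Whether this packet needed a skipped checksum test to get through can be
checked from the capture itself. The following is a minimal sketch (the helper
names are made up for illustration, not taken from any library) that recomputes
the UDP checksum, pseudo-header included, over the 144 bytes shown above.
Worked through by hand, the one's-complement sum folds to 0xffff, so the
capture as printed appears to carry a consistent checksum. A 0xffff word
contributes nothing to a one's-complement sum, so corruption of this kind may
be invisible to the checksum regardless of what the card does.

```python
import struct

# The 144-byte IP packet, copied from the tcpdump output above.
CAPTURE_HEX = """
4500 0090 0000 4000 4011 3d45 0a52 01fa
c0a8 3024 03ff 0801 007c 8e9c 3aa8 1985
0000 0000 0000 0002 0001 86a3 0000 0002
0000 0004 0000 0001 0000 001c 028f 5b0c
0000 0006 6463 6d35 3033 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0100 0001 0021 0003 3d26 3d00 4a2f ffff
3d00 2c08 c923 0000 0000 0000 0000 0000
0000 0000 000a 6d6f 756e 7470 6f69 6e74
"""
packet = bytes.fromhex("".join(CAPTURE_HEX.split()))


def ones_complement_sum(data: bytes) -> int:
    """Sum big-endian 16-bit words, folding carries back in (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp_checksum_ok(ip_packet: bytes) -> bool:
    """True if the UDP checksum (with IPv4 pseudo-header) verifies."""
    ihl = (ip_packet[0] & 0x0F) * 4          # IP header length in bytes
    udp = ip_packet[ihl:]
    udp_len = struct.unpack("!H", udp[4:6])[0]
    # Pseudo-header: source address, destination address, zero, protocol 17, UDP length.
    pseudo = ip_packet[12:20] + struct.pack("!BBH", 0, 17, udp_len)
    # A valid datagram sums to 0xffff when its own checksum field is included.
    return ones_complement_sum(pseudo + udp[:udp_len]) == 0xFFFF


print("IP header checksum ok:", ones_complement_sum(packet[:20]) == 0xFFFF)
print("UDP checksum ok:", udp_checksum_ok(packet))
```

This only says something about the bytes printed in the capture; it does not
show what the card did with packets whose checksum really was wrong.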

knfsd passes this to fh_verify, which decodes it as device
hde3 and inode 4294913866 (0xffff2f4a).
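
The byte order is easy to misread when comparing the capture with the nfsd
log: the log prints the handle as 32-bit words in the server's native
(little-endian, i386) byte order, so the wire bytes 4a 2f ff ff show up as
ffff2f4a. A small sketch, assuming that reading of the log, and assuming that
bytes 4 to 7 hold the device as big-endian major/minor (an interpretation of
the handle layout, not something stated in this report), reproduces the
numbers above:

```python
import struct

# The 32 file handle bytes exactly as they appear in the packet capture.
FH_HEX = (
    "0100 0001 0021 0003 3d26 3d00 4a2f ffff "
    "3d00 2c08 c923 0000 0000 0000 0000 0000"
)
fh = bytes.fromhex("".join(FH_HEX.split()))


def decode_fh(handle: bytes) -> dict:
    """Decode the fields discussed in the report from a 32-byte handle."""
    # nfsd logs the first six 32-bit words in host (little-endian) order.
    words = struct.unpack("<6I", handle[:24])
    # Assumption: device is stored as big-endian major, then minor.
    major, minor = struct.unpack("!HH", handle[4:8])
    return {
        "log_words": [f"{w:08x}" for w in words],
        "dev_major": major,
        "dev_minor": minor,
        "inode": words[3],  # the word the corrupted ffff landed in
    }


decoded = decode_fh(fh)
print(" ".join(decoded["log_words"]))
print("device", decoded["dev_major"], decoded["dev_minor"])
print("inode", decoded["inode"], hex(decoded["inode"]))
```

The printed words match the `nfsd: LOOKUP 32:` line, and the fourth word is the
inode number that ext3 later rejects.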

That then gets passed to ext3, which then panics:

EXT3-fs error (device hde3): ext3_get_inode_block: bad inode number: 
4294913866

It marks the file system as containing an error and remounts
it read-only.

Obviously this is suboptimal, and a fairly horrid DoS,
since anyone can craft a UDP packet with a bogus FH in
it. Whilst this example is a V2 LOOKUP, it works for all of
the V2 procedures we tried.
Comment 1 Trond Myklebust 2006-07-13 05:14:52 UTC
That sounds more like an ext3 bug to me.

Why should it panic when confronted with an iget() request for an invalid inode
instead of just returning an error?
Comment 2 James McKenzie 2006-07-13 06:49:05 UTC
We don't assert that it's an NFS bug. The problem is that ext3 doesn't
distinguish where calls to iget come from. A lookup from a directory entry
containing an invalid i-node number should cause a filesystem panic; one
from some other part of the kernel shouldn't. Reiserfs silently ignores invalid
i-node numbers. It would be trivial to patch ext3 to do the same, but then one
loses some of the important consistency checking in the filesystem. It would be
better to have the code in ext3 track where the inode lookup request came from
and act appropriately. (Our temporary work-around looked at the kernel call
stack to determine whether the inode came from NFS and ignored the error if it
did. This is definitely not the right way to fix the problem though.)
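
The "track where the request came from" idea can be illustrated with a toy
model. This is not kernel code and not the fix that was eventually merged. It
is a sketch in Python in which every name (`get_inode`, `FsCorruption`, the
`trusted` flag) is hypothetical, and the constants assume the usual ext3
values of 2 for the root inode and 11 for the first non-reserved inode. An
out-of-range number from an on-disk directory entry is treated as corruption,
while the same number from an NFS file handle is simply reported as stale.

```python
import errno

ROOT_INO = 2    # assumption: ext3 root directory inode
FIRST_INO = 11  # assumption: first non-reserved inode number


class FsCorruption(Exception):
    """Stands in for ext3 flagging the file system and remounting read-only."""


def inode_in_range(ino: int, inodes_count: int) -> bool:
    """True if ino could name a real inode on a file system of this size."""
    return ino == ROOT_INO or FIRST_INO <= ino <= inodes_count


def get_inode(ino: int, inodes_count: int, trusted: bool) -> int:
    """Return ino if plausible; otherwise fail according to its origin.

    trusted=True  : number read from an on-disk directory entry
    trusted=False : number decoded from a client-supplied file handle
    """
    if inode_in_range(ino, inodes_count):
        return ino  # a real implementation would read the inode here
    if trusted:
        raise FsCorruption(f"bad inode number: {ino}")
    raise OSError(errno.ESTALE, f"stale file handle (inode {ino})")


if __name__ == "__main__":
    try:
        # inodes_count is a made-up size for the example.
        get_inode(4294913866, inodes_count=4_000_000, trusted=False)
    except OSError as exc:
        print("client gets:", errno.errorcode[exc.errno])
```

The design point is that consistency checking is kept for data the file
system wrote itself, while input from the network can only ever produce an
error return to the caller.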
Comment 3 James McKenzie 2006-07-17 05:30:52 UTC
Created attachment 8565
working exploit code for this bug
Comment 4 Ferdinand Beljaars 2006-07-25 06:57:17 UTC
Possibly superfluous: this bug is easily reproduced with the 2.6.17.6 and
2.6.17.7 kernels.
Comment 5 Rituraj 2006-08-16 08:16:04 UTC
Personally I have not run this exploit, but we have been facing a similar
problem on our file server. There are two partitions (LVM) which are repeatedly
being remounted read-only after being exported over NFS. The log messages give
the error "bad_inode_number: so-and-so", but after unmounting the partitions
and running fsck we do not find any errors.
What is the fix for this? We are running RHEL-2.6.9-22. I found from one
post on Google that this could be caused by an NFS client's cache (from when the
server was offline) sending a request with a wrong FH or non-existent inode.
Should we reboot or clear the cache on all clients?
This has been a BLOCKER for us. Please let us know if there is a fix in the
latest kernel.

Regards;
Rituraj
Comment 6 Neil Brown 2006-08-16 22:06:45 UTC
This should be fixed in the latest 2.6.17.X kernel and will be fixed in
2.6.18.
If you need a RHEL-specific fix, you should talk to Red Hat.