Bug 5897
Summary: | oops when reading battery info | ||
---|---|---|---|
Product: | File System | Reporter: | Vladimir Kondratiev (vladimir.kondratiev) |
Component: | VFS | Assignee: | Suzuki (suzuki) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | ||
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.15 | Subsystem: | |
Regression: | --- | Bisected commit-id: |
Description
Vladimir Kondratiev
2006-01-15 07:14:38 UTC
Failure caused by garbage (6b6b6b6b in my case) in 'inode' parameter, and 1-st access to 'inode' member leads to page fault. Following is audit_inode function, line where page fault occurs, marked. void audit_inode(const char *name, const struct inode *inode, unsigned flags) { int idx; struct audit_context *context = current->audit_context; if (!context->in_syscall) return; if (context->name_count && context->names[context->name_count-1].name && context->names[context->name_count-1].name == name) idx = context->name_count - 1; else if (context->name_count > 1 && context->names[context->name_count-2].name && context->names[context->name_count-2].name == name) idx = context->name_count - 2; else { /* FIXME: how much do we care about inodes that have no * associated name? */ if (context->name_count >= AUDIT_NAMES - AUDIT_NAMES_RESERVED) return; idx = context->name_count++; context->names[idx].name = NULL; #if AUDIT_DEBUG ++context->ino_count; #endif } context->names[idx].flags = flags; ===fault on the next line===> context->names[idx].ino = inode->i_ino; context->names[idx].dev = inode->i_sb->s_dev; context->names[idx].mode = inode->i_mode; context->names[idx].uid = inode->i_uid; context->names[idx].gid = inode->i_gid; context->names[idx].rdev = inode->i_rdev; } How does it relate to ACPI battery? What inode do you evaluate and where do you get it from? Point is, I am not 100% sure it is battery who is guilty; but usually after this, `cat /proc/acpi/battery/BAT0/state` gives some form of error: `cat` process stall, or got reading error, or reads empty file. But it may be that this is some general bug in procfs and mentioned above file is simply one that happens to suffer most in my environment. I will add printk's around audit_inode to see more info. Another variant of ACPI problem: after resume, the following appears in syslog: Jan 31 20:34:38 vkondra-mobl kernel: ACPI-0292: *** Error: Looking up [SERN] in namespace, AE_ALREADY_EXISTS Jan 31 20:34:38 vkondra-mobl kernel: ACPI-0508: *** Error: Method execution failed [\_SB_.PCI0.LPC_.EC__.GBIF] (Node f7f02354), AE_ALREADY_EXISTS Jan 31 20:34:41 vkondra-mobl kernel: ACPI-0213: *** Error: Method reached ma ximum reentrancy limit (255) Jan 31 20:34:41 vkondra-mobl kernel: ACPI-0508: *** Error: Method execution failed [\_SB_.PCI0.LPC_.EC__.BAT0._BIF] (Node f7f0219c), AE_AML_METHOD_LIMIT Jan 31 20:34:41 vkondra-mobl kernel: done Jan 31 20:34:44 vkondra-mobl kernel: ACPI-0213: *** Error: Method reached ma ximum reentrancy limit (255) .........etc. etc. now, /proc/acpi/battery/BAT0/state is OK but /proc/acpi/battery/BAT0/state is not: [vkondra-l@vkondra-mobl ~]$ cat /proc/acpi/battery/BAT0/state present: yes capacity state: ok charging state: charging present rate: 16330 mW remaining capacity: 61970 mWh present voltage: 12325 mV [vkondra-l@vkondra-mobl ~]$ cat /proc/acpi/battery/BAT0/info present: yes ERROR: Unable to read battery information I am also looking into a similar issue. I encountered it while running fsracer tests on SLES10 B1, which is 2.6.15-git12-6 I think the bad Inode gets in from do_lookup(). I suspect __d_lookup and real_lookup. Hi Vladimir, Could you check out the following patch ? Its against 2.6.15-rc7. The path_lookup() doesn't check whether the link_path_walk() successful or not. --- linux-2.6.15-rc7-I/fs/namei.c 2006-01-03 10:25:57.000000000 -0800 +++ linux-2.6.15-rc7-I/fs/namei.c.mod 2006-02-06 05:08:36.000000000 -0800 @@ -1090,7 +1090,7 @@ int fastcall path_lookup(const char *nam current->total_link_count = 0; retval = link_path_walk(name, nd); out: - if (unlikely(current->audit_context + if (unlikely(!retval && current->audit_context && nd && nd->dentry && nd->dentry->d_inode)) audit_inode(name, nd->dentry->d_inode, flags); return retval; Thanks. Applied. I use 2.6.16-rc2-git2 (latest for today). Patch as is fails, but idea is clear so it is easy to adjust. I run watch -n 0 cat /proc/acpi/battery/BAT0/[is]* and all applets that I know to monitor battery state; I did couple of suspend/resume cycles. So far, so good. But, to be sure I need to run it for some more time. Let me test it for, say, one week from now. I will report immediately in case it oops again. FYI.. The following patch is now in mm tree. diff -puN fs/namei.c~fix-do_path_lookup-to-add-the-check-for-error-in-link_path_walk fs/namei.c --- devel/fs/namei.c~fix-do_path_lookup-to-add-the-check-for-error-in-link_path_walk 2006-02-06 22:25:12.000000000 -0800 +++ devel-akpm/fs/namei.c 2006-02-06 22:27:45.000000000 -0800 @@ -1119,9 +1119,11 @@ static int fastcall do_path_lookup(int d current->total_link_count = 0; retval = link_path_walk(name, nd); out: - if (unlikely(current->audit_context - && nd && nd->dentry && nd->dentry->d_inode)) + if (likely(retval == 0)) { + if (unlikely(current->audit_context && nd && nd->dentry && + nd->dentry->d_inode)) audit_inode(name, nd->dentry->d_inode, flags); + } return retval; fput_unlock_fail: I would consider it fixed. During week of intensive testing I have no single failure. Closing as per Comment #10. |