Bug 14998

Summary: Caught 32-bit read from uninitialized memory in acpi_system_read_event -- 2.6.31 regression
Product: File System Reporter: Christian Casteyde (casteyde.christian)
Component: VFSAssignee: fs_vfs
Status: RESOLVED CODE_FIX    
Severity: high CC: acpi-bugzilla, error27, lenb, rjw, rui.zhang, vegard.nossum
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.32-rc4 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 7216, 14230, 56331    
Attachments: git bisect log
change sprintf to snprintf
dmesg output for 2.6.34 with snprintf patch and suspend/resume
lspci -v for 2.6.33
mark allocation as false positive

Description Christian Casteyde 2010-01-06 21:40:59 UTC
2.6.33-rc3 in 64bits mode
Distro: Bluewhite64 13 (64bits ports of Slackware 13)
Acer Aspire 1511Lmi laptop

At first suspend after a fresh reboot, I got the following warning:

WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88004cd08b1c)
00011000000000000002200000000000627574746f6e2f6c6964000000000000
 i i i i i i i i i i i i i i i i i i i i i i i i i i i u u u u u
                                                         ^

Pid: 3264, comm: hald Not tainted 2.6.33-rc3 #2 Aspire 1510/Aspire 1510
RIP: 0010:[<ffffffff812692c1>]  [<ffffffff812692c1>] acpi_bus_receive_event+0xf5/0x119
RSP: 0018:ffff88004bfb5dd8  EFLAGS: 00010286
RAX: 0000000000000038 RBX: ffff88004cd08b00 RCX: 0000000000000007
RDX: 00000000ffffffed RSI: ffff88004cd08b1c RDI: ffff88004bfb5e64
RBP: ffff88004bfb5e28 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000282
R13: ffff88004bfb5e48 R14: ffff88004bfb5f48 R15: 0000000001da4420
FS:  00007f93fe93f6f0(0000) GS:ffffffff81a1c000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffff88004f810c20 CR3: 000000004bf4d000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
 [<ffffffff8126f96a>] acpi_system_read_event+0x6b/0x115
 [<ffffffff810fd2b9>] proc_reg_read+0x79/0xb0
 [<ffffffff810b1867>] vfs_read+0xc7/0x160
 [<ffffffff810b19f0>] sys_read+0x50/0x90
 [<ffffffff81002468>] system_call_fastpath+0x16/0x1b
 [<ffffffffffffffff>] 0xffffffffffffffff
Comment 1 Rafael J. Wysocki 2010-01-11 23:24:43 UTC
Why do you think it's a regression?
Comment 2 Len Brown 2010-01-12 03:27:29 UTC
does this happen during suspend, or during resume?
is that the only time it is possible to provoke it?
Comment 3 Christian Casteyde 2010-01-13 19:03:19 UTC
I think it's a regression because I've never got this warning until 2.6.33-rc3.
Indeed, I used to lose network connectivity while suspending, but I never got an uninitialized read in acpi before now. Maybe this is just occuring now however, so it may be an existing problem that didn't occurred for whatever reason before.

The warning occurs before the system goes to sleep, when I close the lid. I think it's when the event that calls /etc/acpi/acpi_handler.sh that triggers this.

I'm testing -rc4 to see if it is still there, and will give the faulty line then.
Comment 4 Christian Casteyde 2010-01-13 19:36:19 UTC
I've forgotten to say: I confirm I get this warning only by closing the lid. No other operation seem to produce it, however I don't do many thinks while testing kernels: kmemcheck is slow and I'm not too confident with it.

I still have it with 2.6.33-rc4, the warning is the first thing that appears in dmesg when I close the lid, then it is followed by the suspension code:


WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88004bfd3b5c)
000110000000adde000220000000adde627574746f6e2f6c6964000000000000
 i i i i i i i i i i i i i i i i i i i i i i i i i i i u u u u u
                                                         ^

Pid: 3396, comm: acpid Not tainted 2.6.33-rc4 #3 Aspire 1510/Aspire 1510
RIP: 0010:[<ffffffff812694f5>]  [<ffffffff812694f5>] acpi_bus_receive_event+0xf5/0x119
RSP: 0018:ffff88004cd79dd8  EFLAGS: 00010282
RAX: 0000000000000038 RBX: ffff88004bfd3b40 RCX: 0000000000000007
RDX: 00000000ffffffed RSI: ffff88004bfd3b5c RDI: ffff88004cd79e64
RBP: ffff88004cd79e28 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000282
R13: ffff88004cd79e48 R14: ffff88004cd79f48 R15: 00000000009e0420
FS:  00007fa9666316f0(0000) GS:ffffffff81a1c000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffff88004f810c48 CR3: 00000000419ce000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
 [<ffffffff8126fb9e>] acpi_system_read_event+0x6b/0x115
 [<ffffffff810fd3f9>] proc_reg_read+0x79/0xb0
 [<ffffffff810b1917>] vfs_read+0xc7/0x160
 [<ffffffff810b1aa0>] sys_read+0x50/0x90
 [<ffffffff81002468>] system_call_fastpath+0x16/0x1b
 [<ffffffffffffffff>] 0xffffffffffffffff
eth1: deauthenticated from 00:07:cb:55:12:b0 (Reason: 7)
PM: Syncing filesystems ... done.
PM: Preparing system for mem sleep
Freezing user space processes ... (elapsed 0.01 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
PM: Entering mem sleep
Suspending console(s) (use no_console_suspend to debug)

The faulty line is:
christian@athor:~$ addr2line -e /usr/src/linux-2.6.33/vmlinux -i ffffffff8126fb9e
/usr/src/linux-2.6.33/drivers/acpi/event.c:64

That is:

static ssize_t
acpi_system_read_event(struct file *file, char __user * buffer, size_t count,
                       loff_t * ppos)
{
        int result = 0;
        struct acpi_bus_event event;
        static char str[ACPI_MAX_STRING];
        static int chars_remaining = 0;
        static char *ptr;

        if (!chars_remaining) {
                memset(&event, 0, sizeof(struct acpi_bus_event));

                if ((file->f_flags & O_NONBLOCK)
                    && (list_empty(&acpi_bus_event_list)))
                        return -EAGAIN;

                result = acpi_bus_receive_event(&event);
--->                if (result)
                        return result;

                chars_remaining = sprintf(str, "%s %s %08x %08x\n",
                                          event.device_class ? event.
                                          device_class : "<unknown>",
                                          event.bus_id ? event.
                                          bus_id : "<unknown>", event.type,
                                          event.data);

One more time, I don't understand what can trigger this, since result is assigned just the preceding line. Maybe acpi_bus_receive_events is inlined?
Comment 5 Rafael J. Wysocki 2010-01-13 20:45:35 UTC
The requested info has been provided.
Comment 6 Zhang Rui 2010-01-18 03:14:36 UTC
now can you reproduce it by closing the lid in 2.6.32?
Comment 7 Christian Casteyde 2010-01-18 23:46:07 UTC
Indeed I managed to reproduce it by closing the lid in 2.6.32.4.
Finally it doesn't seem to be a regression, even if 2.6.33-rc2 and -rc1 didn't trigger it (don't know why I don't get it).

I remove the regression flag and adjust the kernel version on which it occurs to 2.6.32.4 (it may also occurs before that version, but I havent tested previous kernels).
Comment 8 Zhang Rui 2010-01-21 08:07:05 UTC
I'm confused...
(In reply to comment #7)
> Indeed I managed to reproduce it by closing the lid in 2.6.32.4.
2.6.32.4 has this problem.

> Finally it doesn't seem to be a regression, even if 2.6.33-rc2 and -rc1
> didn't
> trigger it (don't know why I don't get it).
> 
and the problem doesn't exist in 2.6.33-rc1 and rc1.

> I remove the regression flag and adjust the kernel version on which it occurs
> to 2.6.32.4 (it may also occurs before that version, but I havent tested
> previous kernels).

so the problem is fixed in 2.6.33-rc?
it would be great if you can verify if the problem exists in older kernels releases.
Comment 9 Christian Casteyde 2010-01-24 18:24:37 UTC
I've not seen this problem with 2.6.33-rc1 and -rc2. However, that doesn't mean that these versions fix it, since I also got other problems with suspend and maybe the code path was different, or maybe I simply missed the warning by out of luck (after all, I also missed it with 2.6.32 until I retry it explicitly as said in comment #7, since I also checked 2.6.32 with kmemcheck at least up to 2.6.32-rc2).

I've also tester 2.6.31, and it didn't trigger the warning (I got dozens of schedule while atomic in acpi code however that are fixed now, but I haven't got any read uninitialized).

To sum up:

2.6.31: ok as far as this warning is concerned, didn't occurs
2.6.32.4 : nok (warning triggered)
2.6.33-rc1 and rc2: ok, but maybe I should recheck it
2.6.33-rc3 to -rc5: nok (warning triggered)

So I think the problem is indeed there since somewhere between 2.6.31 and 2.6.32.4, and I simply missed it in 33-rc1 & 2 for another reason. Maybe it was introduced lately in 2.6.32 development cycle (I remove kmemcheck options as soon as the kernel begins to work correctly on my laptop to boot more quickly).

I could bisect it if necessary, but I have only time for 1-2 kernel check per week on my spare time so it will take a long if I start this bisection.
Comment 10 Rafael J. Wysocki 2010-01-24 18:51:38 UTC
So this looks like a regression from 2.6.31.
Comment 11 Christian Casteyde 2010-01-24 21:32:56 UTC
I've done 3 more checked:

- 2.6.32-rc5 does have the warning.
- 2.6.32-rc3 is unable to suspend correctly my laptop (it freeze). However, it does not warn when I close the lid but do not suspend (by commenting out my script /etc/acpi/hibernate.sh, that is called from /etc/acpi/acpi_handler.sh on the lid close event).
- 2.6.32-rc4 does have the warning when I close the lid, even without suspending to RAM (that is, alos with /etc/acpi/hibernate.sh commented out). It seems I don't need to suspend to have the warning: only closing the lid is sufficient.

I'll have tot recheck 2.6.32-rc3, to be sure it doesn't warn, since I feel the warning is not always triggered. But I'm quite confident the warning was introduced between rc3 and rc4. The curious thing is that I've not seen any acpi related commit between these two rc.

I've also forgotten how to get a git repo to bisect so I won't bisect more tonight.
Comment 12 Christian Casteyde 2010-01-24 23:12:26 UTC
2.6.32-rc1 triggers the warning too. I missed it in rc3 apparently.
Comment 13 Rafael J. Wysocki 2010-01-26 12:36:24 UTC
On Tuesday 26 January 2010, Christian Casteyde wrote:
> It's also still present in 33-rc5.
Comment 14 Zhang Rui 2010-01-27 07:26:37 UTC
I don't have any clues about this one.
It would be great of you can run git-bisect to narrow down the problem.
Comment 15 Christian Casteyde 2010-01-31 10:52:41 UTC
It was quite long due to slow boots with kmemcheck, but I managed to bisect it.
I strictly don't understand why this commit causes the bug, more related to /proc fs or vfs than to acpi apparently:

eca6f534e61919b28fb21aafbd1c2983deae75be is first bad commit
commit eca6f534e61919b28fb21aafbd1c2983deae75be             
Author: Vegard Nossum <vegard.nossum@gmail.com>             
Date:   Fri Sep 18 13:05:45 2009 -0700                      

    fs: fix overflow in sys_mount() for in-kernel calls
                                                       
    sys_mount() reads/copies a whole page for its "type" parameter.  When
    do_mount_root() passes a kernel address that points to an object which is
    smaller than a whole page, copy_mount_options() will happily go past this
    memory object, possibly dereferencing "wild" pointers that could be in any
    state (hence the kmemcheck warning, which shows that parts of the next    
    page are not even allocated).

    (The likelihood of something going wrong here is pretty low -- first of
    all this only applies to kernel calls to sys_mount(), which are mostly
    found in the boot code.  Secondly, I guess if the page was not mapped,
    exact_copy_from_user() _would_ in fact handle it correctly because of its
    access_ok(), etc.  checks.)

    But it is much nicer to avoid the dubious reads altogether, by stopping as
    soon as we find a NUL byte.  Is there a good reason why we can't do
    something like this, using the already existing strndup_from_user()?

    [akpm@linux-foundation.org: make copy_mount_string() static]
    [AV: fix compat mount breakage, which involves undoing akpm's change above]

    Reported-by: Ingo Molnar <mingo@elte.hu>
    Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Pekka Enberg <penberg@cs.helsinki.fi>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: al <al@dizzy.pdmi.ras.ru>

:040000 040000 a89bfba8d50e5dfc6e77696d93cd772cd0ffacf3 9a3066b7678c117ea08efaf15fe558a63ccbda61 M        fs

I append the git bisect log.
Comment 16 Christian Casteyde 2010-01-31 10:54:15 UTC
Created attachment 24815 [details]
git bisect log
Comment 17 Rafael J. Wysocki 2010-01-31 12:18:21 UTC
First-Bad-Commit : eca6f534e61919b28fb21aafbd1c2983deae75be
Comment 18 Rafael J. Wysocki 2010-02-01 21:29:28 UTC
On Monday 01 February 2010, Christian Casteyde wrote:
> Yes, still applies to 2.6.32-rc6.
Comment 19 Rafael J. Wysocki 2010-02-08 21:06:45 UTC
On Monday 08 February 2010, Christian Casteyde wrote:
> Still present in 2.6.33-rc7.
> 
> Bonjour,
> Le lundi 8 février 2010, vous avez écrit :
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.31 and 2.6.32.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.31 and 2.6.32.  Please verify if it still should
> > be listed and let the tracking team know (either way).
> >
> >
> > Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=14998
> > Subject             : Caught 32-bit read from uninitialized memory in
> > acpi_system_read_event -- 2.6.31 regression Submitter       : Christian
> Casteyde
> > <casteyde.christian@free.fr>
> > Date                : 2010-01-06 21:40 (33 days old)
Comment 20 Rafael J. Wysocki 2010-02-27 19:09:09 UTC
On Saturday 27 February 2010, Christian Casteyde wrote:
> Still present in 2.6.33.
Comment 21 Christian Casteyde 2010-03-09 22:49:34 UTC
Update : still present in 2.6.34-rc1.
Comment 22 Christian Casteyde 2010-03-22 18:35:05 UTC
  
Update : still present in 2.6.34-rc2.
Comment 23 Dan Carpenter 2010-03-31 18:21:01 UTC
It's hard for me to see how eca6f534e61919 affects anything.

What about if (just for laughs) we change the sprintf() to snprintf()?

I've attached a patch.

After that could you post a full dmesg and an lspci?

(I'm getting kicked out of my internet cafe.  :P)
Comment 24 Dan Carpenter 2010-03-31 18:25:27 UTC
Created attachment 25775 [details]
change sprintf to snprintf

changes sprintf() to snprintf().
Comment 25 Christian Casteyde 2010-03-31 21:41:12 UTC
Changing to snprintf didn't changed anything on 2.6.34-rc3.
I'm appending the full dmesg log of 34-rc3, and lspci from 2.6.33 (I forgot to do it on -34).
Comment 26 Christian Casteyde 2010-03-31 21:48:36 UTC
Created attachment 25783 [details]
dmesg output for 2.6.34 with snprintf patch and suspend/resume
Comment 27 Christian Casteyde 2010-03-31 21:56:19 UTC
Created attachment 25784 [details]
lspci -v for 2.6.33
Comment 28 Dan Carpenter 2010-04-07 09:55:56 UTC
My sprintf() patch isn't right.  snprintf() returns the same number that sprintf() would have.

But I think the bug might be one of these:
https://patchwork.kernel.org/patch/86927/
http://lkml.org/lkml/2010/3/25/217

Those could easily cause a bug at the result = acpi_bus_receive_event(&event);
Comment 29 Christian Casteyde 2010-04-13 21:48:13 UTC
Update: I haven't checked those patches yet, however this bug is still present in 2.6.34-rc4.
Comment 30 Christian Casteyde 2010-04-13 22:28:17 UTC
Well, those patches doesn't fix this bug. I still have the warning of reading uninitialized memory with them.
Comment 31 Dan Carpenter 2010-04-20 05:49:10 UTC
If we take the last bit of initialized data:

000110000000adde000220000000adde627574746f6e2f6c6964000000000000
 i i i i i i i i i i i i i i i i i i i i i i i i i i i u u u u u

unsigned short a[] = {0x6275, 0x7474, 0x6f6e, 0x2f6c, 0x6964, 0x0000};
printf("%s\n", a);

That prints "ubttnol/di" or "button/lid" after we swap the characters. 
So basically there are some uninitialized bytes after the NULL terminator.

This looks like a false positive.  BTW we don't leak the uninitialized 
bytes to the user do we?  I'll post a patch that probably will get rid
of the message.
Comment 32 Dan Carpenter 2010-04-20 05:53:40 UTC
Created attachment 26055 [details]
mark allocation as false positive

This marks the allocation as a false positive so that kmemcheck doesn't complain.
Comment 33 Christian Casteyde 2010-04-20 18:58:11 UTC
Tested: With 2.6.34-rc5 + patch from comment #32, the warning doesn't occur anymore.
Comment 34 Vegard Nossum 2010-04-22 20:20:43 UTC
(In reply to comment #4)
> I've forgotten to say: I confirm I get this warning only by closing the lid.
> No
> other operation seem to produce it, however I don't do many thinks while
> testing kernels: kmemcheck is slow and I'm not too confident with it.
> 
> I still have it with 2.6.33-rc4, the warning is the first thing that appears
> in
> dmesg when I close the lid, then it is followed by the suspension code:
> 
> 
> WARNING: kmemcheck: Caught 32-bit read from uninitialized memory
> (ffff88004bfd3b5c)
> 000110000000adde000220000000adde627574746f6e2f6c6964000000000000
>  i i i i i i i i i i i i i i i i i i i i i i i i i i i u u u u u
>                                                          ^
> 
> Pid: 3396, comm: acpid Not tainted 2.6.33-rc4 #3 Aspire 1510/Aspire 1510
> RIP: 0010:[<ffffffff812694f5>]  [<ffffffff812694f5>]
> acpi_bus_receive_event+0xf5/0x119
> RSP: 0018:ffff88004cd79dd8  EFLAGS: 00010282
> RAX: 0000000000000038 RBX: ffff88004bfd3b40 RCX: 0000000000000007
> RDX: 00000000ffffffed RSI: ffff88004bfd3b5c RDI: ffff88004cd79e64
> RBP: ffff88004cd79e28 R08: 0000000000000000 R09: 0000000000000001
> R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000282
> R13: ffff88004cd79e48 R14: ffff88004cd79f48 R15: 00000000009e0420
> FS:  00007fa9666316f0(0000) GS:ffffffff81a1c000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: ffff88004f810c48 CR3: 00000000419ce000 CR4: 00000000000006f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
>  [<ffffffff8126fb9e>] acpi_system_read_event+0x6b/0x115
>  [<ffffffff810fd3f9>] proc_reg_read+0x79/0xb0
>  [<ffffffff810b1917>] vfs_read+0xc7/0x160
>  [<ffffffff810b1aa0>] sys_read+0x50/0x90
>  [<ffffffff81002468>] system_call_fastpath+0x16/0x1b
>  [<ffffffffffffffff>] 0xffffffffffffffff
> eth1: deauthenticated from 00:07:cb:55:12:b0 (Reason: 7)
> PM: Syncing filesystems ... done.
> PM: Preparing system for mem sleep
> Freezing user space processes ... (elapsed 0.01 seconds) done.
> Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
> PM: Entering mem sleep
> Suspending console(s) (use no_console_suspend to debug)
> 
> The faulty line is:
> christian@athor:~$ addr2line -e /usr/src/linux-2.6.33/vmlinux -i
> ffffffff8126fb9e
> /usr/src/linux-2.6.33/drivers/acpi/event.c:64
> 
> That is:
> 
> static ssize_t
> acpi_system_read_event(struct file *file, char __user * buffer, size_t count,
>                        loff_t * ppos)
> {
>         int result = 0;
>         struct acpi_bus_event event;
>         static char str[ACPI_MAX_STRING];
>         static int chars_remaining = 0;
>         static char *ptr;
> 
>         if (!chars_remaining) {
>                 memset(&event, 0, sizeof(struct acpi_bus_event));
> 
>                 if ((file->f_flags & O_NONBLOCK)
>                     && (list_empty(&acpi_bus_event_list)))
>                         return -EAGAIN;
> 
>                 result = acpi_bus_receive_event(&event);
> --->                if (result)
>                         return result;
> 
>                 chars_remaining = sprintf(str, "%s %s %08x %08x\n",
>                                           event.device_class ? event.
>                                           device_class : "<unknown>",
>                                           event.bus_id ? event.
>                                           bus_id : "<unknown>", event.type,
>                                           event.data);
> 
> One more time, I don't understand what can trigger this, since result is
> assigned just the preceding line. Maybe acpi_bus_receive_events is inlined?

I can explain why. You ran addr2line on the address of the first stack entry:

>  [<ffffffff8126fb9e>] acpi_system_read_event+0x6b/0x115
>  [<ffffffff810fd3f9>] proc_reg_read+0x79/0xb0

> christian@athor:~$ addr2line -e /usr/src/linux-2.6.33/vmlinux -i
> ffffffff8126fb9e
> /usr/src/linux-2.6.33/drivers/acpi/event.c:64

It would have been more appropriate to run addr2line on the RIP value, i.e. ffffffff812694f5:

> RIP: 0010:[<ffffffff812694f5>]  [<ffffffff812694f5>]

Additionally, since the stack entries are essentially return addresses, they generally refer to the instruction just after the CALL instruction, not the CALL itself.
Comment 35 Vegard Nossum 2010-04-22 21:13:00 UTC
(By the way, I forgot to say: Thanks for the testing effort and the report!)

The actual problem here is that acpi_bus_receive_event() does a full memcpy between two structures:

int acpi_bus_receive_event(struct acpi_bus_event *event)
{
        ...
        memcpy(event, entry, sizeof(struct acpi_bus_event));
        ...
}

Clearly, since kmemcheck complains about a read, at least the "entry" pointer is tracked by kmemcheck and partially uninitialized. What about the "event" pointer? It comes from the caller -- acpi_system_read_event():

acpi_system_read_event(struct file *file, char __user * buffer, size_t count,
                       loff_t * ppos)
{               
        ...
        struct acpi_bus_event event;
        ...
        result = acpi_bus_receive_event(&event);
        ...
}

So the actual problem is that "event" was allocated on the stack and is not tracked by kmemcheck, so copying something uninitialized into it will trigger the report.

I am in two minds about the proposed fix. The currently posted patch is obviously the less intrusive one in terms of both correctness and churn, but here are some alternative suggestions:

1. Copy the struct "by hand" and use strncpy() for the strings to avoid the uninitialized areas.

2. Allocate the event with kmalloc() instead of on the stack and free it after the call to acpi_bus_receive_event().

3. Allocate the event with kmalloc() and pass ownership of the structure to acpi_bus_receive_event() to avoid the copying altogether.


Vegard
Comment 36 Christian Casteyde 2010-04-22 21:51:33 UTC
Just to confirm, I redit an addr2kine with rip as you mentioned: it is indeed in memcpy that the problem occurs:

/usr/src/linux-2.6.33/arch/x86/include/asm/string_64.h:12
/usr/src/linux/drivers/acpi/bus.c:598

        spin_lock_irqsave(&acpi_bus_event_lock, flags);
        if (!list_empty(&acpi_bus_event_list)) {
                entry = list_entry(acpi_bus_event_list.next,
                                   struct acpi_bus_event, node);
                list_del(&entry->node);
        }
        spin_unlock_irqrestore(&acpi_bus_event_lock, flags);

        if (!entry)
                return -ENODEV;

here --->        memcpy(event, entry, sizeof(struct acpi_bus_event));

        kfree(entry);

        return 0;
}
Comment 37 Len Brown 2010-04-27 07:17:08 UTC
https://patchwork.kernel.org/patch/95222/

applied to acpi tree
Comment 38 Christian Casteyde 2010-04-30 08:12:25 UTC
Update: Still present in 2.6.34-rc6.
Comment 39 Dan Carpenter 2010-04-30 08:54:46 UTC
Len Brown picked up the kmalloc() => kzalloc() patch but it takes a while to filter into Linus's tree.

I don't know what Len's plan is, but there is even a possibility that the patch might get delayed until 2.6.35.  We're pretty far into the -rc cycle for 2.6.34 and we're only supposed to be merging crashing fixes now.  This isn't a crashing bug, it's just a warning and the kmemcheck option is mostly for debugging only.

Then again, I think Len has some crashing bugs still left to merge so this might get included with those.  Or maybe he'll let it sit in -next for a week and see if anyone complains.  I don't know.  I don't have to make any decisions fortunately.

For now just try ignore the message.  :)
Comment 40 Len Brown 2010-05-20 02:59:09 UTC
shipped in 2.6.34
closed.