Bug 7889 - an oops inside kmem_get_pages
Summary: an oops inside kmem_get_pages
Status: CLOSED CODE_FIX
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Page Allocator
Hardware: i386 Linux
Importance: P2 blocking
Assignee: Andrew Morton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-01-26 11:43 UTC by Pawel Sikora
Modified: 2007-02-04 01:56 UTC
CC List: 2 users

See Also:
Kernel Version: 2.6.20rc5/smp
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
smp config. (69.06 KB, text/plain)
2007-01-26 11:45 UTC, Pawel Sikora
Details
oops screenshot. (475.82 KB, image/jpeg)
2007-01-26 11:46 UTC, Pawel Sikora
Details
minimal working config for bzImage. (13.16 KB, application/octet-stream)
2007-01-31 00:17 UTC, Pawel Sikora
Details
minimal oopsing config for bzImage. (13.25 KB, application/octet-stream)
2007-01-31 00:18 UTC, Pawel Sikora
Details

Description Pawel Sikora 2007-01-26 11:43:08 UTC
Most recent kernel where this bug did *NOT* occur:

2.6.20rc5 with smp config and disabled `amd k8 cool'n'quiet' bios option.
2.6.20rc5 with non-smp config and enabled cool'n'quiet.
2.6.18.x works fine with all configurations; 2.6.19.x not tested so far.

Hardware Environment:

M/B: http://www.epox.nl/products/view.php?product_id=421

processor 
Comment 1 Pawel Sikora 2007-01-26 11:45:05 UTC
Created attachment 10199 [details]
smp config.
Comment 2 Pawel Sikora 2007-01-26 11:46:38 UTC
Created attachment 10200 [details]
oops screenshot.
Comment 3 Andrew Morton 2007-01-26 12:15:31 UTC
On Fri, 26 Jan 2007 11:51:29 -0800
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=7889
> 
>            Summary: an oops inside kmem_get_pages
>     Kernel Version: 2.6.20rc5/smp
>             Status: NEW
>           Severity: blocking
>              Owner: akpm@osdl.org
>          Submitter: pluto@pld-linux.org
> 
> 
> Most recent kernel where this bug did *NOT* occur:
> 
> 2.6.20rc5 with smp config and disabled `amd k8 cool'n'quiet' bios option.
> 2.6.20rc5 with non-smp config and enabled cool'n'quiet.
> 2.6.18.x works fine with all configurations; 2.6.19.x not tested so far.
> 
> Hardware Environment:
> 
> M/B: http://www.epox.nl/products/view.php?product_id=421
> 
> processor       : 0
> vendor_id       : AuthenticAMD
> cpu family      : 15
> model           : 55
> model name      : AMD Athlon(tm) 64 Processor 3700+
> stepping        : 2
> cpu MHz         : 2200.000
> cache size      : 1024 KB
> fpu             : yes
> fpu_exception   : yes
> cpuid level     : 1
> wp              : yes
> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm
> bogomips        : 4423.06
> TLB size        : 1024 4K pages
> clflush size    : 64
> cache_alignment : 64
> address sizes   : 40 bits physical, 48 bits virtual
> power management: ts fid vid ttp
> 
> RAM: 1GB DDR.
> 
> Software Environment:
> 
> gcc-4.1.2 with recent binutils.
> 
> Problem Description:
> 
> smp kernel oopses on uniprocessor hardware in early boot stage.
> disabling the amd/k8 cool'n'quiet feature in bios ( acpi/cpufreq related )
> helps, however this may be only a side effect of different flow control.
> disabling config_cpu_freq doesn't help when the above mentioned bios option
> is enabled.
> 
> Steps to reproduce:
> 
> just boot the bzImage.
> 

Gad.  A null pointer deref right in the main path of the page allocator.

Presumably something has gone wrong with the core MM initialisation on
that kernel.  Could you please work out the exact file-n-line of the
oops?  Build the kernel with CONFIG_DEBUG_INFO and do

gdb vmlinux

(gdb) l *<EIP where it oopsed>

You'll find this points at list_del(), so you'll need to offset the hex
address by a little bit to identify the list_del() caller.

Thanks.

Comment 4 Pawel Sikora 2007-01-26 13:44:50 UTC
here's a stack unwind chain:

0xffffffff802d27be is in __rmqueue (mm/page_alloc.c:633).
628                             continue;
629
630                     page = list_entry(area->free_list.next, struct page, lru);
631                     list_del(&page->lru);
632                     rmv_page_order(page);
633                     area->nr_free--;
634                     zone->free_pages -= 1UL << order;
635                     expand(zone, page, order, current_order, area);
636                     return page;
637             }

0xffffffff8020a52f is in get_page_from_freelist (mm/page_alloc.c:870).
865                     list_del(&page->lru);
866                     pcp->count--;
867             } else {
868                     spin_lock_irqsave(&zone->lock, flags);
869                     page = __rmqueue(zone, order);
870                     spin_unlock(&zone->lock);
871                     if (!page)
872                             goto failed;
873             }

0xffffffff8020f593 is in __alloc_pages (mm/page_alloc.c:1241).
1236                    return NULL;
1237            }
1238
1239            page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, order,
1240                                    zonelist, ALLOC_WMARK_LOW|ALLOC_CPUSET);
1241            if (page)
1242                    goto got_pg;
1243
1244            /*
1245             * GFP_THISNODE (meaning __GFP_THISNODE, __GFP_NORETRY and

0xffffffff802e68d0 is in kmem_getpages (mm/slab.c:1627).
1622
1623            page = alloc_pages_node(nodeid, flags, cachep->gfporder);
1624            if (!page)
1625                    return NULL;
1626
1627            nr_pages = (1 << cachep->gfporder);
1628            if (cachep->flags & SLAB_RECLAIM_ACCOUNT)
1629                    add_zone_page_state(page_zone(page),
1630                            NR_SLAB_RECLAIMABLE, nr_pages);
1631            else

0xffffffff80217de0 is in cache_grow (mm/slab.c:2774).
2769             * Get mem for the objs.  Attempt to allocate a physical page from
2770             * 'nodeid'.
2771             */
2772            if (!objp)
2773                    objp = kmem_getpages(cachep, flags, nodeid);
2774            if (!objp)
2775                    goto failed;
2776
2777            /* Get slab management. */
2778            slabp = alloc_slabmgmt(cachep, objp, offset,

0xffffffff80261e9b is in cache_alloc_refill (mm/slab.c:773).
768
769     static DEFINE_PER_CPU(struct delayed_work, reap_work);
770
771     static inline struct array_cache *cpu_cache_get(struct kmem_cache *cachep)
772     {
773             return cachep->array[smp_processor_id()];
774     }
775
776     static inline struct kmem_cache *__find_general_cachep(size_t size,
777                                                             gfp_t gfpflags)

0xffffffff802e7d7c is in do_tune_cpucache (mm/slab.c:3891).
3886
3887            new = kzalloc(sizeof(*new), GFP_KERNEL);
3888            if (!new)
3889                    return -ENOMEM;
3890
3891            for_each_online_cpu(i) {
3892                    new->new[i] = alloc_arraycache(cpu_to_node(i), limit,
3893                                                    batchcount);
3894                    if (!new->new[i]) {
3895                            for (i--; i >= 0; i--)

0xffffffff802e721d is in kmem_cache_zalloc (mm/slab.c:3221).
3216            /*
3217             * We may just have run out of memory on the local node.
3218             * ____cache_alloc_node() knows how to locate memory on other nodes
3219             */
3220            if (NUMA_BUILD && !objp)
3221                    objp = ____cache_alloc_node(cachep, flags, numa_node_id());
3222            local_irq_restore(save_flags);
3223            objp = cache_alloc_debugcheck_after(cachep, flags, objp,
3224                                                caller);
3225            prefetchw(objp);

0xffffffff802e7d7c is in do_tune_cpucache (mm/slab.c:3891).
3886
3887            new = kzalloc(sizeof(*new), GFP_KERNEL);
3888            if (!new)
3889                    return -ENOMEM;
3890
3891            for_each_online_cpu(i) {
3892                    new->new[i] = alloc_arraycache(cpu_to_node(i), limit,
3893                                                    batchcount);
3894                    if (!new->new[i]) {
3895                            for (i--; i >= 0; i--)

0xffffffff802e835a is in enable_cpucache (mm/slab.c:3974).
3969            if (limit > 32)
3970                    limit = 32;
3971    #endif
3972            err = do_tune_cpucache(cachep, limit, (limit + 1) / 2, shared);
3973            if (err)
3974            printk(KERN_ERR "enable_cpucache failed for %s, error %d.\n",
3975                           cachep->name, -err);
3976            return err;
3977    }

0xffffffff8062bc0a is in kmem_cache_init (mm/slab.c:1563).
1558            /* 6) resize the head arrays to their final sizes */
1559            {
1560                    struct kmem_cache *cachep;
1561                    mutex_lock(&cache_chain_mutex);
1562                    list_for_each_entry(cachep, &cache_chain, next)
1563                            if (enable_cpucache(cachep))
1564                                    BUG();
1565                    mutex_unlock(&cache_chain_mutex);
1566            }

0xffffffff8061773c is in start_kernel (init/main.c:583).
578             }
579     #endif
580             vfs_caches_init_early();
581             cpuset_init_early();
582             mem_init();
583             kmem_cache_init();
584             setup_per_cpu_pageset();
585             numa_policy_init();
586             if (late_time_init)
587                     late_time_init();

0xffffffff8061716d is in x86_64_start_kernel (arch/x86_64/kernel/head64.c:84).
79              copy_bootdata(real_mode_data);
80      #ifdef CONFIG_SMP
81              cpu_set(0, cpu_online_map);
82      #endif
83              start_kernel();
84      }
Comment 5 Andrew Morton 2007-01-26 14:09:15 UTC
On Fri, 26 Jan 2007 13:53:02 -0800
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=7889
> 
> 
> 
> 
> 
> ------- Additional Comments From pluto@pld-linux.org  2007-01-26 13:44 -------
> here's a stack unwind chain:

OK, thanks.  Please use email (reply-to-all) on this bug from now on.  I'm
hoping that someone else will look into this, as I'm not exactly brimming
with spare time at present.




Comment 6 Thomas Renninger 2007-01-28 07:36:56 UTC
git bisect should identify the bad patch...
Comment 7 Pawel Sikora 2007-01-30 13:47:49 UTC
Andrew Morton wrote:

> Gad.  A null pointer deref right in the main path of the page allocator.
>
> Presumably something has gone wrong with the core MM initialisation on
> that kernel.  Could you please work out the exact file-n-line of the
> oops?  Build the kernel with CONFIG_DEBUG_INFO and do

Hi Andrew,

I've tracked down the problem: the 2.6.19.2 SMP vanilla kernel
compiled with CONFIG_MEMORY_HOTPLUG=y causes the mentioned oops
on a uniprocessor machine.

BR,
Pawel.
Comment 8 Pawel Sikora 2007-01-31 00:17:53 UTC
Created attachment 10232 [details]
minimal working config for bzImage.
Comment 9 Pawel Sikora 2007-01-31 00:18:42 UTC
Created attachment 10233 [details]
minimal oopsing config for bzImage.
Comment 10 Pawel Sikora 2007-02-04 01:56:49 UTC
fixed by:

Revert "[PATCH] mm: micro optimise zone_watermark_ok"
author  Linus Torvalds <torvalds@woody.linux-foundation.org>
commit  6fd6b17c6d9713f56b5f20903ec3e00fa6cc435e
