Created attachment 26687 [details] dmesg Never seen this before. 2.6.35-rc1 #1 SMP Mon May 31 21:31:02 CEST 2010 x86_64 GNU/Linux [48126.787684] Xorg: page allocation failure. order:1, mode:0x50d0 [48126.787691] Pid: 1895, comm: Xorg Tainted: G W 2.6.35-rc1 #1 [48126.787694] Call Trace: [48126.787709] [<ffffffff811192f5>] __alloc_pages_nodemask+0x5f5/0x6f0 [48126.787716] [<ffffffff81148695>] alloc_pages_current+0x95/0x100 [48126.787720] [<ffffffff8114e04a>] new_slab+0x2ba/0x2c0 [48126.787724] [<ffffffff8114ed0b>] __slab_alloc+0x14b/0x4e0 [48126.787730] [<ffffffff81403f91>] ? security_vm_enough_memory_kern+0x21/0x30 [48126.787736] [<ffffffff81556e6a>] ? agp_alloc_page_array+0x5a/0x70 [48126.787740] [<ffffffff8115087f>] __kmalloc+0x11f/0x1c0 [48126.787744] [<ffffffff81556e6a>] agp_alloc_page_array+0x5a/0x70 [48126.787747] [<ffffffff81556ee4>] agp_generic_alloc_user+0x64/0x140 [48126.787750] [<ffffffff8155717a>] agp_allocate_memory+0x9a/0x140 [48126.787755] [<ffffffff8156c179>] drm_agp_allocate_memory+0x9/0x10 [48126.787758] [<ffffffff8156c1d7>] drm_agp_bind_pages+0x57/0x100 [48126.787765] [<ffffffff81627fe4>] i915_gem_object_bind_to_gtt+0x144/0x340 [48126.787768] [<ffffffff81628295>] i915_gem_object_pin+0xb5/0xd0 [48126.787772] [<ffffffff81629a4c>] i915_gem_do_execbuffer+0x6cc/0x14f0 [48126.787777] [<ffffffff81091ba0>] ? __is_ram+0x0/0x10 [48126.787783] [<ffffffff8106c76e>] ? lookup_memtype+0xce/0xe0 [48126.787787] [<ffffffff8162ab11>] ? i915_gem_execbuffer+0x91/0x390 [48126.787790] [<ffffffff8162ac55>] i915_gem_execbuffer+0x1d5/0x390 [48126.787794] [<ffffffff816255b0>] ? i915_gem_sw_finish_ioctl+0x90/0xc0 [48126.787799] [<ffffffff81565a0a>] drm_ioctl+0x32a/0x4b0 [48126.787802] [<ffffffff8162aa80>] ? i915_gem_execbuffer+0x0/0x390 [48126.787807] [<ffffffff8116c248>] vfs_ioctl+0x38/0xd0 [48126.787810] [<ffffffff8116c87a>] do_vfs_ioctl+0x8a/0x580 [48126.787814] [<ffffffff8116cdf1>] sys_ioctl+0x81/0xa0 [48126.787820] [<ffffffff8103af02>] system_call_fastpath+0x16/0x1b
(switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface). On Mon, 7 Jun 2010 17:32:04 GMT bugzilla-daemon@bugzilla.kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=16148 > > Summary: page allocation failure. order:1, mode:0x50d0 > Product: Memory Management > Version: 2.5 > Kernel Version: 2.6.35-rc1 > Platform: All > OS/Version: Linux > Tree: Mainline > Status: NEW > Severity: normal > Priority: P1 > Component: Page Allocator > AssignedTo: akpm@linux-foundation.org > ReportedBy: devnull@plzk.org > Regression: No > > > Created an attachment (id=26687) > --> (https://bugzilla.kernel.org/attachment.cgi?id=26687) > dmesg > > Never seen this before. > > 2.6.35-rc1 #1 SMP Mon May 31 21:31:02 CEST 2010 x86_64 GNU/Linux > > [48126.787684] Xorg: page allocation failure. order:1, mode:0x50d0 > [48126.787691] Pid: 1895, comm: Xorg Tainted: G W 2.6.35-rc1 #1 > [48126.787694] Call Trace: > [48126.787709] [<ffffffff811192f5>] __alloc_pages_nodemask+0x5f5/0x6f0 > [48126.787716] [<ffffffff81148695>] alloc_pages_current+0x95/0x100 > [48126.787720] [<ffffffff8114e04a>] new_slab+0x2ba/0x2c0 > [48126.787724] [<ffffffff8114ed0b>] __slab_alloc+0x14b/0x4e0 > [48126.787730] [<ffffffff81403f91>] ? > security_vm_enough_memory_kern+0x21/0x30 > [48126.787736] [<ffffffff81556e6a>] ? agp_alloc_page_array+0x5a/0x70 > [48126.787740] [<ffffffff8115087f>] __kmalloc+0x11f/0x1c0 > [48126.787744] [<ffffffff81556e6a>] agp_alloc_page_array+0x5a/0x70 > [48126.787747] [<ffffffff81556ee4>] agp_generic_alloc_user+0x64/0x140 > [48126.787750] [<ffffffff8155717a>] agp_allocate_memory+0x9a/0x140 > [48126.787755] [<ffffffff8156c179>] drm_agp_allocate_memory+0x9/0x10 > [48126.787758] [<ffffffff8156c1d7>] drm_agp_bind_pages+0x57/0x100 > [48126.787765] [<ffffffff81627fe4>] i915_gem_object_bind_to_gtt+0x144/0x340 > [48126.787768] [<ffffffff81628295>] i915_gem_object_pin+0xb5/0xd0 > [48126.787772] [<ffffffff81629a4c>] i915_gem_do_execbuffer+0x6cc/0x14f0 > [48126.787777] [<ffffffff81091ba0>] ? __is_ram+0x0/0x10 > [48126.787783] [<ffffffff8106c76e>] ? lookup_memtype+0xce/0xe0 > [48126.787787] [<ffffffff8162ab11>] ? i915_gem_execbuffer+0x91/0x390 > [48126.787790] [<ffffffff8162ac55>] i915_gem_execbuffer+0x1d5/0x390 > [48126.787794] [<ffffffff816255b0>] ? i915_gem_sw_finish_ioctl+0x90/0xc0 > [48126.787799] [<ffffffff81565a0a>] drm_ioctl+0x32a/0x4b0 > [48126.787802] [<ffffffff8162aa80>] ? i915_gem_execbuffer+0x0/0x390 > [48126.787807] [<ffffffff8116c248>] vfs_ioctl+0x38/0xd0 > [48126.787810] [<ffffffff8116c87a>] do_vfs_ioctl+0x8a/0x580 > [48126.787814] [<ffffffff8116cdf1>] sys_ioctl+0x81/0xa0 > [48126.787820] [<ffffffff8103af02>] system_call_fastpath+0x16/0x1b > David, I have a vague feeling that we've been round this loop before.. Why does agp_alloc_page_array() use __GFP_NORETRY? It's pretty unusual and it's what caused this spew. There's nothing in the changelog and the only relevant commentary appears to be "This speeds things up and also saves memory for small AGP regions", which is inscrutable. Can you please add a usable comment there? Presumably this was added in response to some observed behaviour, but what was it?? If the __GFP_NORETRY is indeed useful and legitimate and given that we have a vmalloc fallback, I'd suggest that we add __GFP_NOWARN there as well to keep the bug reports away. btw, agp_memory.vmalloc_flag can be done away with - it's conventional to use is_vmalloc_addr() for this.
Reply-To: airlied@redhat.com On Thu, 2010-06-10 at 15:38 -0700, Andrew Morton wrote: > (switched to email. Please respond via emailed reply-to-all, not via the > bugzilla web interface). > > On Mon, 7 Jun 2010 17:32:04 GMT > bugzilla-daemon@bugzilla.kernel.org wrote: > > > https://bugzilla.kernel.org/show_bug.cgi?id=16148 > > > > Summary: page allocation failure. order:1, mode:0x50d0 > > Product: Memory Management > > Version: 2.5 > > Kernel Version: 2.6.35-rc1 > > Platform: All > > OS/Version: Linux > > Tree: Mainline > > Status: NEW > > Severity: normal > > Priority: P1 > > Component: Page Allocator > > AssignedTo: akpm@linux-foundation.org > > ReportedBy: devnull@plzk.org > > Regression: No > > > > > > Created an attachment (id=26687) > > --> (https://bugzilla.kernel.org/attachment.cgi?id=26687) > > dmesg > > > > Never seen this before. > > > > 2.6.35-rc1 #1 SMP Mon May 31 21:31:02 CEST 2010 x86_64 GNU/Linux > > > > [48126.787684] Xorg: page allocation failure. order:1, mode:0x50d0 > > [48126.787691] Pid: 1895, comm: Xorg Tainted: G W 2.6.35-rc1 #1 > > [48126.787694] Call Trace: > > [48126.787709] [<ffffffff811192f5>] __alloc_pages_nodemask+0x5f5/0x6f0 > > [48126.787716] [<ffffffff81148695>] alloc_pages_current+0x95/0x100 > > [48126.787720] [<ffffffff8114e04a>] new_slab+0x2ba/0x2c0 > > [48126.787724] [<ffffffff8114ed0b>] __slab_alloc+0x14b/0x4e0 > > [48126.787730] [<ffffffff81403f91>] ? > security_vm_enough_memory_kern+0x21/0x30 > > [48126.787736] [<ffffffff81556e6a>] ? agp_alloc_page_array+0x5a/0x70 > > [48126.787740] [<ffffffff8115087f>] __kmalloc+0x11f/0x1c0 > > [48126.787744] [<ffffffff81556e6a>] agp_alloc_page_array+0x5a/0x70 > > [48126.787747] [<ffffffff81556ee4>] agp_generic_alloc_user+0x64/0x140 > > [48126.787750] [<ffffffff8155717a>] agp_allocate_memory+0x9a/0x140 > > [48126.787755] [<ffffffff8156c179>] drm_agp_allocate_memory+0x9/0x10 > > [48126.787758] [<ffffffff8156c1d7>] drm_agp_bind_pages+0x57/0x100 > > [48126.787765] [<ffffffff81627fe4>] > i915_gem_object_bind_to_gtt+0x144/0x340 > > [48126.787768] [<ffffffff81628295>] i915_gem_object_pin+0xb5/0xd0 > > [48126.787772] [<ffffffff81629a4c>] i915_gem_do_execbuffer+0x6cc/0x14f0 > > [48126.787777] [<ffffffff81091ba0>] ? __is_ram+0x0/0x10 > > [48126.787783] [<ffffffff8106c76e>] ? lookup_memtype+0xce/0xe0 > > [48126.787787] [<ffffffff8162ab11>] ? i915_gem_execbuffer+0x91/0x390 > > [48126.787790] [<ffffffff8162ac55>] i915_gem_execbuffer+0x1d5/0x390 > > [48126.787794] [<ffffffff816255b0>] ? i915_gem_sw_finish_ioctl+0x90/0xc0 > > [48126.787799] [<ffffffff81565a0a>] drm_ioctl+0x32a/0x4b0 > > [48126.787802] [<ffffffff8162aa80>] ? i915_gem_execbuffer+0x0/0x390 > > [48126.787807] [<ffffffff8116c248>] vfs_ioctl+0x38/0xd0 > > [48126.787810] [<ffffffff8116c87a>] do_vfs_ioctl+0x8a/0x580 > > [48126.787814] [<ffffffff8116cdf1>] sys_ioctl+0x81/0xa0 > > [48126.787820] [<ffffffff8103af02>] system_call_fastpath+0x16/0x1b > > > > David, I have a vague feeling that we've been round this loop before.. > > Why does agp_alloc_page_array() use __GFP_NORETRY? It's pretty unusual > and it's what caused this spew. > > There's nothing in the changelog and the only relevant commentary > appears to be "This speeds things up and also saves memory for small > AGP regions", which is inscrutable. Can you please add a usable > comment there? cc'ing Thomas, who added this, I expect we could drop the NORETRY or just add NOWARN. Though an order 1 page alloc failure isn't a pretty sight, not sure how a vmalloc fallback could save us. > > Presumably this was added in response to some observed behaviour, but > what was it?? > > > If the __GFP_NORETRY is indeed useful and legitimate and given that we > have a vmalloc fallback, I'd suggest that we add __GFP_NOWARN there as > well to keep the bug reports away. > > btw, agp_memory.vmalloc_flag can be done away with - it's conventional > to use is_vmalloc_addr() for this. Lols, conventional my ass, we wanted to add that thing years ago for this purpose and got told that would be an insane interface, then the same person added the interface a year later and never fixed AGP to use it. I'll try and write a patch. Dave.
On 06/11/2010 01:15 AM, Dave Airlie wrote: > On Thu, 2010-06-10 at 15:38 -0700, Andrew Morton wrote: > >> (switched to email. Please respond via emailed reply-to-all, not via the >> bugzilla web interface). >> >> On Mon, 7 Jun 2010 17:32:04 GMT >> bugzilla-daemon@bugzilla.kernel.org wrote: >> >> >>> https://bugzilla.kernel.org/show_bug.cgi?id=16148 >>> >>> Summary: page allocation failure. order:1, mode:0x50d0 >>> Product: Memory Management >>> Version: 2.5 >>> Kernel Version: 2.6.35-rc1 >>> Platform: All >>> OS/Version: Linux >>> Tree: Mainline >>> Status: NEW >>> Severity: normal >>> Priority: P1 >>> Component: Page Allocator >>> AssignedTo: akpm@linux-foundation.org >>> ReportedBy: devnull@plzk.org >>> Regression: No >>> >>> >>> Created an attachment (id=26687) >>> --> (https://bugzilla.kernel.org/attachment.cgi?id=26687) >>> dmesg >>> >>> Never seen this before. >>> >>> 2.6.35-rc1 #1 SMP Mon May 31 21:31:02 CEST 2010 x86_64 GNU/Linux >>> >>> [48126.787684] Xorg: page allocation failure. order:1, mode:0x50d0 >>> [48126.787691] Pid: 1895, comm: Xorg Tainted: G W 2.6.35-rc1 #1 >>> [48126.787694] Call Trace: >>> [48126.787709] [<ffffffff811192f5>] __alloc_pages_nodemask+0x5f5/0x6f0 >>> [48126.787716] [<ffffffff81148695>] alloc_pages_current+0x95/0x100 >>> [48126.787720] [<ffffffff8114e04a>] new_slab+0x2ba/0x2c0 >>> [48126.787724] [<ffffffff8114ed0b>] __slab_alloc+0x14b/0x4e0 >>> [48126.787730] [<ffffffff81403f91>] ? >>> security_vm_enough_memory_kern+0x21/0x30 >>> [48126.787736] [<ffffffff81556e6a>] ? agp_alloc_page_array+0x5a/0x70 >>> [48126.787740] [<ffffffff8115087f>] __kmalloc+0x11f/0x1c0 >>> [48126.787744] [<ffffffff81556e6a>] agp_alloc_page_array+0x5a/0x70 >>> [48126.787747] [<ffffffff81556ee4>] agp_generic_alloc_user+0x64/0x140 >>> [48126.787750] [<ffffffff8155717a>] agp_allocate_memory+0x9a/0x140 >>> [48126.787755] [<ffffffff8156c179>] drm_agp_allocate_memory+0x9/0x10 >>> [48126.787758] [<ffffffff8156c1d7>] drm_agp_bind_pages+0x57/0x100 >>> [48126.787765] [<ffffffff81627fe4>] >>> i915_gem_object_bind_to_gtt+0x144/0x340 >>> [48126.787768] [<ffffffff81628295>] i915_gem_object_pin+0xb5/0xd0 >>> [48126.787772] [<ffffffff81629a4c>] i915_gem_do_execbuffer+0x6cc/0x14f0 >>> [48126.787777] [<ffffffff81091ba0>] ? __is_ram+0x0/0x10 >>> [48126.787783] [<ffffffff8106c76e>] ? lookup_memtype+0xce/0xe0 >>> [48126.787787] [<ffffffff8162ab11>] ? i915_gem_execbuffer+0x91/0x390 >>> [48126.787790] [<ffffffff8162ac55>] i915_gem_execbuffer+0x1d5/0x390 >>> [48126.787794] [<ffffffff816255b0>] ? i915_gem_sw_finish_ioctl+0x90/0xc0 >>> [48126.787799] [<ffffffff81565a0a>] drm_ioctl+0x32a/0x4b0 >>> [48126.787802] [<ffffffff8162aa80>] ? i915_gem_execbuffer+0x0/0x390 >>> [48126.787807] [<ffffffff8116c248>] vfs_ioctl+0x38/0xd0 >>> [48126.787810] [<ffffffff8116c87a>] do_vfs_ioctl+0x8a/0x580 >>> [48126.787814] [<ffffffff8116cdf1>] sys_ioctl+0x81/0xa0 >>> [48126.787820] [<ffffffff8103af02>] system_call_fastpath+0x16/0x1b >>> >>> >> David, I have a vague feeling that we've been round this loop before.. >> >> Why does agp_alloc_page_array() use __GFP_NORETRY? It's pretty unusual >> and it's what caused this spew. >> >> There's nothing in the changelog and the only relevant commentary >> appears to be "This speeds things up and also saves memory for small >> AGP regions", which is inscrutable. Can you please add a usable >> comment there? >> > cc'ing Thomas, who added this, I expect we could drop the NORETRY or > just add NOWARN. Though an order 1 page alloc failure isn't a pretty > sight, not sure how a vmalloc fallback could save us. > > Hmm. IIRC that was an untested speed optimization back from the time when I was reading ldd3. I think the idea was to avoid slow allocations of (order > 0) if they weren't immediately available and fall back to vmalloc single page allocations. It might be that that functionality is no longer preserved and only the __GFP_NORETRY remains. I think it should be safe to remove the NORETRY if it's annoying, but it should probably be equally safe to add a NOWARN and keep the vmalloc fallback. Now if we still get a "definitive" page allocation failure in this codepath, that's not good, but hardly the AGP driver's fault. Has Intel added some kind of accounting for pinned pages yet? >> Presumably this was added in response to some observed behaviour, but >> what was it?? >> >> >> If the __GFP_NORETRY is indeed useful and legitimate and given that we >> have a vmalloc fallback, I'd suggest that we add __GFP_NOWARN there as >> well to keep the bug reports away. >> >> btw, agp_memory.vmalloc_flag can be done away with - it's conventional >> to use is_vmalloc_addr() for this. >> > Lols, conventional my ass, we wanted to add that thing years ago for > this purpose and got told that would be an insane interface, then the > same person added the interface a year later and never fixed AGP to use > it. > > Indeed. I even recall the phrase "Too ugly to live" :). > I'll try and write a patch. > > Dave. > > /Thomas
On Fri, 11 Jun 2010 10:46:07 +0200 Thomas Hellstrom <thellstrom@vmware.com> wrote: > >>> > >>> > >> David, I have a vague feeling that we've been round this loop before.. > >> > >> Why does agp_alloc_page_array() use __GFP_NORETRY? It's pretty unusual > >> and it's what caused this spew. > >> > >> There's nothing in the changelog and the only relevant commentary > >> appears to be "This speeds things up and also saves memory for small > >> AGP regions", which is inscrutable. Can you please add a usable > >> comment there? > >> > > cc'ing Thomas, who added this, I expect we could drop the NORETRY or > > just add NOWARN. Though an order 1 page alloc failure isn't a pretty > > sight, not sure how a vmalloc fallback could save us. > > > > > > Hmm. IIRC that was an untested speed optimization back from the time > when I was > reading ldd3. I think the idea was to avoid slow allocations of (order > > 0) if they weren't > immediately available and fall back to vmalloc single page allocations. > It might be that that functionality is no longer preserved and only the > __GFP_NORETRY remains. > I think it should be safe to remove the NORETRY if it's annoying, but it > should probably be equally safe to add a NOWARN and keep the vmalloc > fallback. An order-1 GFP_KERNEL allocation is a breeze - slub does them often, we use them for kernel stacks all the time. I'd say just remove the __GFP_NORETRY and be happy. In fact if the allocations are always this small I'd say we can remove the vmalloc fallback too. However if under some circumstances the allocations can be "large", say order-4 or higher then allocation failures are still a risk.
On 06/11/2010 07:24 PM, Andrew Morton wrote: > On Fri, 11 Jun 2010 10:46:07 +0200 Thomas Hellstrom<thellstrom@vmware.com> > wrote: > > >>>>> >>>>> >>>> David, I have a vague feeling that we've been round this loop before.. >>>> >>>> Why does agp_alloc_page_array() use __GFP_NORETRY? It's pretty unusual >>>> and it's what caused this spew. >>>> >>>> There's nothing in the changelog and the only relevant commentary >>>> appears to be "This speeds things up and also saves memory for small >>>> AGP regions", which is inscrutable. Can you please add a usable >>>> comment there? >>>> >>>> >>> cc'ing Thomas, who added this, I expect we could drop the NORETRY or >>> just add NOWARN. Though an order 1 page alloc failure isn't a pretty >>> sight, not sure how a vmalloc fallback could save us. >>> >>> >>> >> Hmm. IIRC that was an untested speed optimization back from the time >> when I was >> reading ldd3. I think the idea was to avoid slow allocations of (order> >> 0) if they weren't >> immediately available and fall back to vmalloc single page allocations. >> It might be that that functionality is no longer preserved and only the >> __GFP_NORETRY remains. >> I think it should be safe to remove the NORETRY if it's annoying, but it >> should probably be equally safe to add a NOWARN and keep the vmalloc >> fallback. >> > An order-1 GFP_KERNEL allocation is a breeze - slub does them often, we > use them for kernel stacks all the time. I'd say just remove the > __GFP_NORETRY and be happy. > > In fact if the allocations are always this small I'd say we can remove > the vmalloc fallback too. However if under some circumstances the > allocations can be "large", say order-4 or higher then allocation > failures are still a risk. > > Actually, At that time I was working with a SiS GPU (128MiB system), and was getting persistent failures for order 1 GFP_KERNEL page allocations (albeit not in this codepath). So while they are highly unlikely for modern systems, it might be worthwhile keeping the fallback. /Thomas
On Fri, 11 Jun 2010 22:22:28 +0200 Thomas Hellstrom <thellstrom@vmware.com> wrote: > >>> cc'ing Thomas, who added this, I expect we could drop the NORETRY or > >>> just add NOWARN. Though an order 1 page alloc failure isn't a pretty > >>> sight, not sure how a vmalloc fallback could save us. > >>> > >>> > >>> > >> Hmm. IIRC that was an untested speed optimization back from the time > >> when I was > >> reading ldd3. I think the idea was to avoid slow allocations of (order> > >> 0) if they weren't > >> immediately available and fall back to vmalloc single page allocations. > >> It might be that that functionality is no longer preserved and only the > >> __GFP_NORETRY remains. > >> I think it should be safe to remove the NORETRY if it's annoying, but it > >> should probably be equally safe to add a NOWARN and keep the vmalloc > >> fallback. > >> > > An order-1 GFP_KERNEL allocation is a breeze - slub does them often, we > > use them for kernel stacks all the time. I'd say just remove the > > __GFP_NORETRY and be happy. > > > > In fact if the allocations are always this small I'd say we can remove > > the vmalloc fallback too. However if under some circumstances the > > allocations can be "large", say order-4 or higher then allocation > > failures are still a risk. > > > > > > Actually, At that time I was working with a SiS GPU (128MiB system), and > was getting persistent failures for order 1 GFP_KERNEL page allocations > (albeit not in this codepath). So while they are highly unlikely for > modern systems, it might be worthwhile keeping the fallback. 128MB total RAM? Those were the days. Various page reclaim changes have been made in the past year or so which _should_ improve that (eg, lumpy reclaim) but yeah, it's by no means a certainty. The vmalloc fallback hardly hurts anyone. But it does mean that hardly anyone ever executes that codepath, so it won't get tested much. There was a patch recently which added an API formalising the alloc_pages-then-vmalloc fallback approach. It didn't get merged, although there weren't strong feelings either way really. One benefit of that approach is that the alloc/free code itself would get more testing coverage, but callers can still screw things up by failing to handle vmalloc memory correctly for DMA mapping purposes. Oh well, where were we? Remove that __GFP_NORETRY?
On 06/11/2010 10:37 PM, Andrew Morton wrote: > On Fri, 11 Jun 2010 22:22:28 +0200 > Thomas Hellstrom<thellstrom@vmware.com> wrote: > > >>>>> cc'ing Thomas, who added this, I expect we could drop the NORETRY or >>>>> just add NOWARN. Though an order 1 page alloc failure isn't a pretty >>>>> sight, not sure how a vmalloc fallback could save us. >>>>> >>>>> >>>>> >>>>> >>>> Hmm. IIRC that was an untested speed optimization back from the time >>>> when I was >>>> reading ldd3. I think the idea was to avoid slow allocations of (order> >>>> 0) if they weren't >>>> immediately available and fall back to vmalloc single page allocations. >>>> It might be that that functionality is no longer preserved and only the >>>> __GFP_NORETRY remains. >>>> I think it should be safe to remove the NORETRY if it's annoying, but it >>>> should probably be equally safe to add a NOWARN and keep the vmalloc >>>> fallback. >>>> >>>> >>> An order-1 GFP_KERNEL allocation is a breeze - slub does them often, we >>> use them for kernel stacks all the time. I'd say just remove the >>> __GFP_NORETRY and be happy. >>> >>> In fact if the allocations are always this small I'd say we can remove >>> the vmalloc fallback too. However if under some circumstances the >>> allocations can be "large", say order-4 or higher then allocation >>> failures are still a risk. >>> >>> >>> >> Actually, At that time I was working with a SiS GPU (128MiB system), and >> was getting persistent failures for order 1 GFP_KERNEL page allocations >> (albeit not in this codepath). So while they are highly unlikely for >> modern systems, it might be worthwhile keeping the fallback. >> > 128MB total RAM? Those were the days. > > Various page reclaim changes have been made in the past year or so > which _should_ improve that (eg, lumpy reclaim) but yeah, it's by no > means a certainty. > > The vmalloc fallback hardly hurts anyone. But it does mean that hardly > anyone ever executes that codepath, so it won't get tested much. > > There was a patch recently which added an API formalising the > alloc_pages-then-vmalloc fallback approach. It didn't get merged, > although there weren't strong feelings either way really. One benefit > of that approach is that the alloc/free code itself would get more > testing coverage, but callers can still screw things up by failing to > handle vmalloc memory correctly for DMA mapping purposes. > > Oh well, where were we? Remove that __GFP_NORETRY? > Yeah, I think that's the sanest thing to do! /Thomas
I have been getting this with 2.6.35-rc2 and rc3. Could it be the same problem? X: page allocation failure. order:0, mode:0x4 Pid: 1514, comm: X Not tainted 2.6.35-rc3 #1 Call Trace: [<ffffffff8108ce49>] ? __alloc_pages_nodemask+0x629/0x680 [<ffffffff8108c920>] ? __alloc_pages_nodemask+0x100/0x680 [<ffffffffa00db8f3>] ? ttm_get_pages+0x2c3/0x448 [ttm] [<ffffffffa00d4658>] ? __ttm_tt_get_page+0x98/0xc0 [ttm] [<ffffffffa00d4988>] ? ttm_tt_populate+0x48/0x90 [ttm] [<ffffffffa00d4a26>] ? ttm_tt_bind+0x56/0xa0 [ttm] [<ffffffffa00d5230>] ? ttm_bo_handle_move_mem+0x1d0/0x430 [ttm] [<ffffffffa00d76d6>] ? ttm_bo_move_buffer+0x166/0x180 [ttm] [<ffffffffa00b9736>] ? drm_mm_kmalloc+0x26/0xc0 [drm] [<ffffffff81030ea9>] ? get_parent_ip+0x9/0x20 [<ffffffffa00d7786>] ? ttm_bo_validate+0x96/0x130 [ttm] [<ffffffffa00d7b35>] ? ttm_bo_init+0x315/0x390 [ttm] [<ffffffffa0122eb8>] ? radeon_bo_create+0x118/0x210 [radeon] [<ffffffffa0122fb0>] ? radeon_ttm_bo_destroy+0x0/0xb0 [radeon] [<ffffffffa013704c>] ? radeon_gem_object_create+0x8c/0x110 [radeon] [<ffffffffa013711f>] ? radeon_gem_create_ioctl+0x4f/0xe0 [radeon] [<ffffffffa00b10e6>] ? drm_ioctl+0x3d6/0x470 [drm] [<ffffffffa01370d0>] ? radeon_gem_create_ioctl+0x0/0xe0 [radeon] [<ffffffff810b965f>] ? do_sync_read+0xbf/0x100 [<ffffffff810c8965>] ? vfs_ioctl+0x35/0xd0 [<ffffffff810c8b28>] ? do_vfs_ioctl+0x88/0x530 [<ffffffff81031ed7>] ? sub_preempt_count+0x87/0xb0 [<ffffffff810c9019>] ? sys_ioctl+0x49/0x80 [<ffffffff810ba4fe>] ? sys_read+0x4e/0x90 [<ffffffff810024ab>] ? system_call_fastpath+0x16/0x1b Mem-Info: DMA per-cpu: CPU 0: hi: 0, btch: 1 usd: 0 CPU 1: hi: 0, btch: 1 usd: 0 CPU 2: hi: 0, btch: 1 usd: 0 CPU 3: hi: 0, btch: 1 usd: 0 DMA32 per-cpu: CPU 0: hi: 186, btch: 31 usd: 156 CPU 1: hi: 186, btch: 31 usd: 164 CPU 2: hi: 186, btch: 31 usd: 192 CPU 3: hi: 186, btch: 31 usd: 175 Normal per-cpu: CPU 0: hi: 186, btch: 31 usd: 177 CPU 1: hi: 186, btch: 31 usd: 153 CPU 2: hi: 186, btch: 31 usd: 147 CPU 3: hi: 186, btch: 31 usd: 168 active_anon:248943 inactive_anon:70653 isolated_anon:0 active_file:231008 inactive_file:258837 isolated_file:0 unevictable:1 dirty:5 writeback:0 unstable:0 free:5083 slab_reclaimable:53052 slab_unreclaimable:16244 mapped:45401 shmem:4420 pagetables:6333 bounce:0 DMA free:13036kB min:28kB low:32kB high:40kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15704kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:8kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes lowmem_reserve[]: 0 3255 3760 3760 DMA32 free:5704kB min:6780kB low:8472kB high:10168kB active_anon:915704kB inactive_anon:184904kB active_file:789940kB inactive_file:901472kB unevictable:4kB isolated(anon):0kB isolated(file):0kB present:3333920kB mlocked:4kB dirty:4kB writeback:0kB mapped:114500kB shmem:17136kB slab_reclaimable:197620kB slab_unreclaimable:38500kB kernel_stack:776kB pagetables:12928kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no lowmem_reserve[]: 0 0 505 505 Normal free:1840kB min:1048kB low:1308kB high:1572kB active_anon:80068kB inactive_anon:97708kB active_file:134092kB inactive_file:133876kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:517120kB mlocked:0kB dirty:16kB writeback:0kB mapped:67104kB shmem:544kB slab_reclaimable:14588kB slab_unreclaimable:26468kB kernel_stack:1936kB pagetables:12404kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no lowmem_reserve[]: 0 0 0 0 DMA: 1*4kB 1*8kB 0*16kB 1*32kB 1*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 3*4096kB = 13036kB DMA32: 200*4kB 10*8kB 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 1*4096kB = 5952kB Normal: 232*4kB 2*8kB 0*16kB 2*32kB 3*64kB 1*128kB 2*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1840kB 494223 total pagecache pages 66 pages in swap cache Swap cache stats: add 475, delete 409, find 6/7 Free swap = 2094600kB Total swap = 2096476kB 983024 pages RAM 33615 pages reserved 397983 pages shared 655130 pages non-shared [TTM] Unable to allocate page. radeon 0000:01:05.0: object_init failed for (7827456, 0x00000002) [drm:radeon_gem_object_create] *ERROR* Failed to allocate GEM object (7827456, 2, 4096, -12) SysRq : HELP : loglevel(0-9) reBoot Crash terminate-all-tasks(E) memory-full-oom-kill(F) kill-all-tasks(I) thaw-filesystems(J) saK show-backtrace-all-active-cpus(L) show-memory-usage(M) nice-all-RT-tasks(N) powerOff show-registers(P) show-all-timers(Q) unRaw Sync show-task-states(T) Unmount force-fb(V) show-blocked-tasks(W) SysRq : HELP : loglevel(0-9) reBoot Crash terminate-all-tasks(E) memory-full-oom-kill(F) kill-all-tasks(I) thaw-filesystems(J) saK show-backtrace-all-active-cpus(L) show-memory-usage(M) nice-all-RT-tasks(N) powerOff show-registers(P) show-all-timers(Q) unRaw Sync show-task-states(T) Unmount force-fb(V) show-blocked-tasks(W) SysRq : HELP : loglevel(0-9) reBoot Crash terminate-all-tasks(E) memory-full-oom-kill(F) kill-all-tasks(I) thaw-filesystems(J) saK show-backtrace-all-active-cpus(L) show-memory-usage(M) nice-all-RT-tasks(N) powerOff show-registers(P) show-all-timers(Q) unRaw Sync show-task-states(T) Unmount force-fb(V) show-blocked-tasks(W) SysRq : HELP : loglevel(0-9) reBoot Crash terminate-all-tasks(E) memory-full-oom-kill(F) kill-all-tasks(I) thaw-filesystems(J) saK show-backtrace-all-active-cpus(L) show-memory-usage(M) nice-all-RT-tasks(N) powerOff show-registers(P) show-all-timers(Q) unRaw Sync show-task-states(T) Unmount force-fb(V) show-blocked-tasks(W) SysRq : HELP : loglevel(0-9) reBoot Crash terminate-all-tasks(E) memory-full-oom-kill(F) kill-all-tasks(I) thaw-filesystems(J) saK show-backtrace-all-active-cpus(L) show-memory-usage(M) nice-all-RT-tasks(N) powerOff show-registers(P) show-all-timers(Q) unRaw Sync show-task-states(T) Unmount force-fb(V) show-blocked-tasks(W) SysRq : HELP : loglevel(0-9) reBoot Crash terminate-all-tasks(E) memory-full-oom-kill(F) kill-all-tasks(I) thaw-filesystems(J) saK show-backtrace-all-active-cpus(L) show-memory-usage(M) nice-all-RT-tasks(N) powerOff show-registers(P) show-all-timers(Q) unRaw Sync show-task-states(T) Unmount force-fb(V) show-blocked-tasks(W)
(switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface). (switching back to email, actually) On Sun, 13 Jun 2010 13:01:57 GMT bugzilla-daemon@bugzilla.kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=16148 > > > Mikko C. <mikko.cal@gmail.com> changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > CC| |mikko.cal@gmail.com > > > > > --- Comment #8 from Mikko C. <mikko.cal@gmail.com> 2010-06-13 13:01:53 --- > I have been getting this with 2.6.35-rc2 and rc3. > Could it be the same problem? > > > X: page allocation failure. order:0, mode:0x4 > Pid: 1514, comm: X Not tainted 2.6.35-rc3 #1 > Call Trace: > [<ffffffff8108ce49>] ? __alloc_pages_nodemask+0x629/0x680 > [<ffffffff8108c920>] ? __alloc_pages_nodemask+0x100/0x680 > [<ffffffffa00db8f3>] ? ttm_get_pages+0x2c3/0x448 [ttm] > [<ffffffffa00d4658>] ? __ttm_tt_get_page+0x98/0xc0 [ttm] > [<ffffffffa00d4988>] ? ttm_tt_populate+0x48/0x90 [ttm] > [<ffffffffa00d4a26>] ? ttm_tt_bind+0x56/0xa0 [ttm] > [<ffffffffa00d5230>] ? ttm_bo_handle_move_mem+0x1d0/0x430 [ttm] > [<ffffffffa00d76d6>] ? ttm_bo_move_buffer+0x166/0x180 [ttm] > [<ffffffffa00b9736>] ? drm_mm_kmalloc+0x26/0xc0 [drm] > [<ffffffff81030ea9>] ? get_parent_ip+0x9/0x20 > [<ffffffffa00d7786>] ? ttm_bo_validate+0x96/0x130 [ttm] > [<ffffffffa00d7b35>] ? ttm_bo_init+0x315/0x390 [ttm] > [<ffffffffa0122eb8>] ? radeon_bo_create+0x118/0x210 [radeon] > [<ffffffffa0122fb0>] ? radeon_ttm_bo_destroy+0x0/0xb0 [radeon] > [<ffffffffa013704c>] ? radeon_gem_object_create+0x8c/0x110 [radeon] > [<ffffffffa013711f>] ? radeon_gem_create_ioctl+0x4f/0xe0 [radeon] > [<ffffffffa00b10e6>] ? drm_ioctl+0x3d6/0x470 [drm] > [<ffffffffa01370d0>] ? radeon_gem_create_ioctl+0x0/0xe0 [radeon] > [<ffffffff810b965f>] ? do_sync_read+0xbf/0x100 > [<ffffffff810c8965>] ? vfs_ioctl+0x35/0xd0 > [<ffffffff810c8b28>] ? do_vfs_ioctl+0x88/0x530 > [<ffffffff81031ed7>] ? sub_preempt_count+0x87/0xb0 > [<ffffffff810c9019>] ? sys_ioctl+0x49/0x80 > [<ffffffff810ba4fe>] ? sys_read+0x4e/0x90 > [<ffffffff810024ab>] ? system_call_fastpath+0x16/0x1b That's different. ttm_get_pages() looks pretty busted to me. It's not using __GFP_WAIT and it's not using __GFP_FS. It's using a plain GFP_DMA32 so it's using atomic allocations even though it doesn't need to. IOW, it's shooting itself in the head. Given that it will sometimes use GFP_HIGHUSER which includes __GFP_FS and __GFP_WAIT, I assume it can always include __GFP_FS and __GFP_WAIT. If so, it should very much do so. If not then the function is misdesigned and should be altered to take a gfp_t argument so the caller can tell ttm_get_pages() which is the strongest allocation mode which it may use. > [TTM] Unable to allocate page. > radeon 0000:01:05.0: object_init failed for (7827456, 0x00000002) > [drm:radeon_gem_object_create] *ERROR* Failed to allocate GEM object > (7827456, > 2, 4096, -12) This bug actually broke stuff for you. Something like this: --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c~a +++ a/drivers/gpu/drm/ttm/ttm_page_alloc.c @@ -677,7 +677,7 @@ int ttm_get_pages(struct list_head *page /* No pool for cached pages */ if (pool == NULL) { if (flags & TTM_PAGE_FLAG_DMA32) - gfp_flags |= GFP_DMA32; + gfp_flags |= GFP_KERNEL|GFP_DMA32; else gfp_flags |= GFP_HIGHUSER; _ although I wonder whether it should be using pool->gfp_flags. It's a shame that this code was developed and merged in secret :( Had we known, we could have looked at enhancing mempools to cover the requirement, or at implementing this in some generic fashion rather than hiding it down in drivers/gpu/drm.
Reply-To: airlied@redhat.com On Tue, 2010-06-15 at 15:41 -0700, Andrew Morton wrote: > (switched to email. Please respond via emailed reply-to-all, not via the > bugzilla web interface). > > (switching back to email, actually) > > On Sun, 13 Jun 2010 13:01:57 GMT > bugzilla-daemon@bugzilla.kernel.org wrote: > > > https://bugzilla.kernel.org/show_bug.cgi?id=16148 > > > > > > Mikko C. <mikko.cal@gmail.com> changed: > > > > What |Removed |Added > > > ---------------------------------------------------------------------------- > > CC| |mikko.cal@gmail.com > > > > > > > > > > --- Comment #8 from Mikko C. <mikko.cal@gmail.com> 2010-06-13 13:01:53 --- > > I have been getting this with 2.6.35-rc2 and rc3. > > Could it be the same problem? > > > > > > X: page allocation failure. order:0, mode:0x4 > > Pid: 1514, comm: X Not tainted 2.6.35-rc3 #1 > > Call Trace: > > [<ffffffff8108ce49>] ? __alloc_pages_nodemask+0x629/0x680 > > [<ffffffff8108c920>] ? __alloc_pages_nodemask+0x100/0x680 > > [<ffffffffa00db8f3>] ? ttm_get_pages+0x2c3/0x448 [ttm] > > [<ffffffffa00d4658>] ? __ttm_tt_get_page+0x98/0xc0 [ttm] > > [<ffffffffa00d4988>] ? ttm_tt_populate+0x48/0x90 [ttm] > > [<ffffffffa00d4a26>] ? ttm_tt_bind+0x56/0xa0 [ttm] > > [<ffffffffa00d5230>] ? ttm_bo_handle_move_mem+0x1d0/0x430 [ttm] > > [<ffffffffa00d76d6>] ? ttm_bo_move_buffer+0x166/0x180 [ttm] > > [<ffffffffa00b9736>] ? drm_mm_kmalloc+0x26/0xc0 [drm] > > [<ffffffff81030ea9>] ? get_parent_ip+0x9/0x20 > > [<ffffffffa00d7786>] ? ttm_bo_validate+0x96/0x130 [ttm] > > [<ffffffffa00d7b35>] ? ttm_bo_init+0x315/0x390 [ttm] > > [<ffffffffa0122eb8>] ? radeon_bo_create+0x118/0x210 [radeon] > > [<ffffffffa0122fb0>] ? radeon_ttm_bo_destroy+0x0/0xb0 [radeon] > > [<ffffffffa013704c>] ? radeon_gem_object_create+0x8c/0x110 [radeon] > > [<ffffffffa013711f>] ? radeon_gem_create_ioctl+0x4f/0xe0 [radeon] > > [<ffffffffa00b10e6>] ? drm_ioctl+0x3d6/0x470 [drm] > > [<ffffffffa01370d0>] ? radeon_gem_create_ioctl+0x0/0xe0 [radeon] > > [<ffffffff810b965f>] ? do_sync_read+0xbf/0x100 > > [<ffffffff810c8965>] ? vfs_ioctl+0x35/0xd0 > > [<ffffffff810c8b28>] ? do_vfs_ioctl+0x88/0x530 > > [<ffffffff81031ed7>] ? sub_preempt_count+0x87/0xb0 > > [<ffffffff810c9019>] ? sys_ioctl+0x49/0x80 > > [<ffffffff810ba4fe>] ? sys_read+0x4e/0x90 > > [<ffffffff810024ab>] ? system_call_fastpath+0x16/0x1b > > That's different. ttm_get_pages() looks pretty busted to me. It's not > using __GFP_WAIT and it's not using __GFP_FS. It's using a plain > GFP_DMA32 so it's using atomic allocations even though it doesn't need > to. IOW, it's shooting itself in the head. > > Given that it will sometimes use GFP_HIGHUSER which includes __GFP_FS > and __GFP_WAIT, I assume it can always include __GFP_FS and __GFP_WAIT. > If so, it should very much do so. If not then the function is > misdesigned and should be altered to take a gfp_t argument so the > caller can tell ttm_get_pages() which is the strongest allocation mode > which it may use. > > > [TTM] Unable to allocate page. > > radeon 0000:01:05.0: object_init failed for (7827456, 0x00000002) > > [drm:radeon_gem_object_create] *ERROR* Failed to allocate GEM object > (7827456, > > 2, 4096, -12) > > This bug actually broke stuff for you. > > Something like this: > > --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c~a > +++ a/drivers/gpu/drm/ttm/ttm_page_alloc.c > @@ -677,7 +677,7 @@ int ttm_get_pages(struct list_head *page > /* No pool for cached pages */ > if (pool == NULL) { > if (flags & TTM_PAGE_FLAG_DMA32) > - gfp_flags |= GFP_DMA32; > + gfp_flags |= GFP_KERNEL|GFP_DMA32; > else > gfp_flags |= GFP_HIGHUSER; > > _ > > although I wonder whether it should be using pool->gfp_flags. > > > It's a shame that this code was developed and merged in secret :( Had > we known, we could have looked at enhancing mempools to cover the > requirement, or at implementing this in some generic fashion rather > than hiding it down in drivers/gpu/drm. Its been post to lkml at least once or twice over the past few years though not as much as it was posted to dri-devel, but that was because we had never seen anyone show any interest in it outside of kernel hackers. Originally I was going to use the generic allocator stuff ia64 uses for uncached allocations but it allocates memory ranges not pages so it wasn't useful. I also suggested getting a page flag for uncached allocator stuff, I was told to go write the code in my own corner and prove it was required. So I did, and it was cleaned up and worked on by others and I merged it. So can we lay off with the "in secret", the original code is nearly 2 years old at this point and just because -mm hackers choose to ignore it isn't our fault. Patches welcome. So now back to the bug: So the page pools are setup with gfp flags, in the normal case, 4 pools, one WC GFP_HIGHUSER pages, one UC HIGHUSER pages, one WC GFP_USER| GFP_DMA32, one UC GFP_USER|GFP_DMA32, so the pools are all fine, the problem here is the same as before we added the pools, which is the normal page allocation path, which needs the GFP_USER added instead of GFP_KERNEL. That said I've noticed a lot more page allocation failure reports in 2.6.35-rcX than we've gotten for a long time, in code that hasn't changed (the AGP ones the other day for example) has something in the core MM regressed (again... didn't this happen back in 2.6.31 or something). (cc'ing Mel who tracked these down before). Dave.
Reply-To: mel@csn.ul.ie Hi Dave, On Wed, Jun 16, 2010 at 08:57:09AM +1000, Dave Airlie wrote: > > (switching back to email, actually) > > > > On Sun, 13 Jun 2010 13:01:57 GMT > > bugzilla-daemon@bugzilla.kernel.org wrote: > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=16148 > > > > > > > > > Mikko C. <mikko.cal@gmail.com> changed: > > > > > > What |Removed |Added > > > > ---------------------------------------------------------------------------- > > > CC| |mikko.cal@gmail.com > > > > > > > > > > > > > > > --- Comment #8 from Mikko C. <mikko.cal@gmail.com> 2010-06-13 13:01:53 > --- > > > I have been getting this with 2.6.35-rc2 and rc3. > > > Could it be the same problem? > > > > > > > > > X: page allocation failure. order:0, mode:0x4 > > > Pid: 1514, comm: X Not tainted 2.6.35-rc3 #1 > > > Call Trace: > > > [<ffffffff8108ce49>] ? __alloc_pages_nodemask+0x629/0x680 > > > [<ffffffff8108c920>] ? __alloc_pages_nodemask+0x100/0x680 > > > [<ffffffffa00db8f3>] ? ttm_get_pages+0x2c3/0x448 [ttm] > > > [<ffffffffa00d4658>] ? __ttm_tt_get_page+0x98/0xc0 [ttm] > > > [<ffffffffa00d4988>] ? ttm_tt_populate+0x48/0x90 [ttm] > > > [<ffffffffa00d4a26>] ? ttm_tt_bind+0x56/0xa0 [ttm] > > > [<ffffffffa00d5230>] ? ttm_bo_handle_move_mem+0x1d0/0x430 [ttm] > > > [<ffffffffa00d76d6>] ? ttm_bo_move_buffer+0x166/0x180 [ttm] > > > [<ffffffffa00b9736>] ? drm_mm_kmalloc+0x26/0xc0 [drm] > > > [<ffffffff81030ea9>] ? get_parent_ip+0x9/0x20 > > > [<ffffffffa00d7786>] ? ttm_bo_validate+0x96/0x130 [ttm] > > > [<ffffffffa00d7b35>] ? ttm_bo_init+0x315/0x390 [ttm] > > > [<ffffffffa0122eb8>] ? radeon_bo_create+0x118/0x210 [radeon] > > > [<ffffffffa0122fb0>] ? radeon_ttm_bo_destroy+0x0/0xb0 [radeon] > > > [<ffffffffa013704c>] ? radeon_gem_object_create+0x8c/0x110 [radeon] > > > [<ffffffffa013711f>] ? radeon_gem_create_ioctl+0x4f/0xe0 [radeon] > > > [<ffffffffa00b10e6>] ? drm_ioctl+0x3d6/0x470 [drm] > > > [<ffffffffa01370d0>] ? radeon_gem_create_ioctl+0x0/0xe0 [radeon] > > > [<ffffffff810b965f>] ? do_sync_read+0xbf/0x100 > > > [<ffffffff810c8965>] ? vfs_ioctl+0x35/0xd0 > > > [<ffffffff810c8b28>] ? do_vfs_ioctl+0x88/0x530 > > > [<ffffffff81031ed7>] ? sub_preempt_count+0x87/0xb0 > > > [<ffffffff810c9019>] ? sys_ioctl+0x49/0x80 > > > [<ffffffff810ba4fe>] ? sys_read+0x4e/0x90 > > > [<ffffffff810024ab>] ? system_call_fastpath+0x16/0x1b > > > > That's different. ttm_get_pages() looks pretty busted to me. It's not > > using __GFP_WAIT and it's not using __GFP_FS. It's using a plain > > GFP_DMA32 so it's using atomic allocations even though it doesn't need > > to. IOW, it's shooting itself in the head. > > This is an accurate assessment. > > Given that it will sometimes use GFP_HIGHUSER which includes __GFP_FS > > and __GFP_WAIT, I assume it can always include __GFP_FS and __GFP_WAIT. > > If so, it should very much do so. If not then the function is > > misdesigned and should be altered to take a gfp_t argument so the > > caller can tell ttm_get_pages() which is the strongest allocation mode > > which it may use. > > > > > [TTM] Unable to allocate page. > > > radeon 0000:01:05.0: object_init failed for (7827456, 0x00000002) > > > [drm:radeon_gem_object_create] *ERROR* Failed to allocate GEM object > (7827456, > > > 2, 4096, -12) > > > > This bug actually broke stuff for you. > > > > Something like this: > > > > --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c~a > > +++ a/drivers/gpu/drm/ttm/ttm_page_alloc.c > > @@ -677,7 +677,7 @@ int ttm_get_pages(struct list_head *page > > /* No pool for cached pages */ > > if (pool == NULL) { > > if (flags & TTM_PAGE_FLAG_DMA32) > > - gfp_flags |= GFP_DMA32; > > + gfp_flags |= GFP_KERNEL|GFP_DMA32; > > else > > gfp_flags |= GFP_HIGHUSER; > > > > _ > > > > although I wonder whether it should be using pool->gfp_flags. > > > > > > It's a shame that this code was developed and merged in secret :( Had > > we known, we could have looked at enhancing mempools to cover the > > requirement, or at implementing this in some generic fashion rather > > than hiding it down in drivers/gpu/drm. > > Its been post to lkml at least once or twice over the past few years > though not as much as it was posted to dri-devel, but that was because > we had never seen anyone show any interest in it outside of kernel > hackers. Originally I was going to use the generic allocator stuff ia64 > uses for uncached allocations but it allocates memory ranges not pages > so it wasn't useful. I also suggested getting a page flag for uncached > allocator stuff, I was told to go write the code in my own corner and > prove it was required. So I did, and it was cleaned up and worked on by > others and I merged it. So can we lay off with the "in secret", the > original code is nearly 2 years old at this point and just because -mm > hackers choose to ignore it isn't our fault. Patches welcome. > No comment on this aspect of things. > So now back to the bug: > > So the page pools are setup with gfp flags, in the normal case, 4 pools, > one WC GFP_HIGHUSER pages, one UC HIGHUSER pages, one WC GFP_USER| > GFP_DMA32, one UC GFP_USER|GFP_DMA32, so the pools are all fine, the > problem here is the same as before we added the pools, which is the > normal page allocation path, which needs the GFP_USER added instead of > GFP_KERNEL. > > That said I've noticed a lot more page allocation failure reports in > 2.6.35-rcX than we've gotten for a long time, in code that hasn't > changed (the AGP ones the other day for example) has something in the > core MM regressed (again... didn't this happen back in 2.6.31 or > something). > > (cc'ing Mel who tracked these down before). > Ok, so there has been very little changed in the page allocator that might have caused this. The only possibility I can think of is [6dda9d: page allocator: reduce fragmentation in buddy allocator by adding buddies that are merging to the tail of the free lists] but I would have expected the patch to help, not hinder, this situation. In the last few kernel releases when there has been a spike in high-order allocation, it has been largely due to an increased source of high-order callers from somewhere. I am 99% sure the last time I encountered this, I used the following script to debug it even though it's a bit of a hack. It uses ftrace to track high-order allocators and reports the number of high-order allocations (either atomic or normal) and the count of unique backtraces (to point the finger at bad callers) Unfortunately, I don't have access to a suitable machine to test this with right now (my laptop isn't making high-order allocations to confirm yet) but I'm pretty sure it's the right one! If you do some fixed set of tests with 2.6.34 and 2.6.35-rc2 with this script running and compare the reports, it might point the finger at a new source of frequent high-order allocations that might be leading to more failures. Do something like ./watch-highorder.pl -o highorder-2.6.34-report.txt PID=$! Do your tests # Two sigints in quick succession to make the monitor exit kill $PID kill $PID ==== CUT HERE ==== #!/usr/bin/perl # This is a tool that analyses the trace output related to page allocation, # sums up the number of high-order allocations taking place and who the # callers are # # Example usage: trace-pagealloc-highorder.pl -o output-report.txt # options # -o, --output Where to store the report # # Copyright (c) IBM Corporation 2009 # Author: Mel Gorman <mel@csn.ul.ie> use strict; use Getopt::Long; # Tracepoint events use constant MM_PAGE_ALLOC => 1; use constant EVENT_UNKNOWN => 7; use constant HIGH_NORMAL_HIGHORDER_ALLOC => 10; use constant HIGH_ATOMIC_HIGHORDER_ALLOC => 11; my $opt_output; my %stats; my %unique_events; my $last_updated = 0; $stats{HIGH_NORMAL_HIGHORDER_ALLOC} = 0; $stats{HIGH_ATOMIC_HIGHORDER_ALLOC} = 0; # Catch sigint and exit on request my $sigint_report = 0; my $sigint_exit = 0; my $sigint_pending = 0; my $sigint_received = 0; sub sigint_handler { my $current_time = time; if ($current_time - 2 > $sigint_received) { print "SIGINT received, report pending. Hit ctrl-c again to exit\n"; $sigint_report = 1; } else { if (!$sigint_exit) { print "Second SIGINT received quickly, exiting\n"; } $sigint_exit++; } if ($sigint_exit > 3) { print "Many SIGINTs received, exiting now without report\n"; exit; } $sigint_received = $current_time; $sigint_pending = 1; } $SIG{INT} = "sigint_handler"; # Parse command line options GetOptions( 'output=s' => \$opt_output, ); # Defaults for dynamically discovered regex's my $regex_pagealloc_default = 'page=([0-9a-f]*) pfn=([0-9]*) order=([-0-9]*) migratetype=([-0-9]*) gfp_flags=([A-Z_|]*)'; # Dyanically discovered regex my $regex_pagealloc; # Static regex used. Specified like this for readability and for use with /o # (process_pid) (cpus ) ( time ) (tpoint ) (details) my $regex_traceevent = '\s*([a-zA-Z0-9-]*)\s*(\[[0-9]*\])\s*([0-9.]*):\s*([a-zA-Z_]*):\s*(.*)'; my $regex_statname = '[-0-9]*\s\((.*)\).*'; my $regex_statppid = '[-0-9]*\s\(.*\)\s[A-Za-z]\s([0-9]*).*'; sub generate_traceevent_regex { my $event = shift; my $default = shift; my $regex; # Read the event format or use the default if (!open (FORMAT, "/sys/kernel/debug/tracing/events/$event/format")) { $regex = $default; } else { my $line; while (!eof(FORMAT)) { $line = <FORMAT>; if ($line =~ /^print fmt:\s"(.*)",.*/) { $regex = $1; $regex =~ s/%p/\([0-9a-f]*\)/g; $regex =~ s/%d/\([-0-9]*\)/g; $regex =~ s/%lu/\([0-9]*\)/g; $regex =~ s/%s/\([A-Z_|]*\)/g; $regex =~ s/\(REC->gfp_flags\).*/REC->gfp_flags/; $regex =~ s/\",.*//; } } } # Verify fields are in the right order my $tuple; foreach $tuple (split /\s/, $regex) { my ($key, $value) = split(/=/, $tuple); my $expected = shift; if ($key ne $expected) { print("WARNING: Format not as expected '$key' != '$expected'"); $regex =~ s/$key=\((.*)\)/$key=$1/; } } if (defined shift) { die("Fewer fields than expected in format"); } return $regex; } $regex_pagealloc = generate_traceevent_regex("kmem/mm_page_alloc", $regex_pagealloc_default, "page", "pfn", "order", "migratetype", "gfp_flags"); sub process_events { my $traceevent; my $process_pid = 0; my $cpus; my $timestamp; my $tracepoint; my $details; my $statline; my $nextline = 1; # Read each line of the event log EVENT_PROCESS: while (($traceevent = <TRACING>) && !$sigint_exit) { SKIP_LINEREAD: if ($traceevent eq "") { last EVENT_PROCSS; } if ($traceevent =~ /$regex_traceevent/o) { $process_pid = $1; $tracepoint = $4; } else { next; } # Perl Switch() sucks majorly if ($tracepoint eq "mm_page_alloc") { my ($page, $order, $gfp_flags, $type); my ($atomic); my $details = $5; if ($details !~ /$regex_pagealloc/o) { print "WARNING: Failed to parse mm_page_alloc as expected\n"; print "$details\n"; print "$regex_pagealloc\n"; print "\n"; next; } $page = $1; $order = $3; $gfp_flags = $5; # Only concerned with high-order allocs if ($order == 0) { next; } $stats{MM_PAGE_ALLOC}++; if ($gfp_flags =~ /ATOMIC/) { $stats{HIGH_ATOMIC_HIGHORDER_ALLOC}++; $type = "atomic"; } else { $stats{HIGH_NORMAL_HIGHORDER_ALLOC}++; $type = "normal"; } # Record the stack trace $traceevent = <TRACING>; if ($traceevent !~ /stack trace/) { goto SKIP_LINEREAD; } my $event = "order=$order $type gfp_flags=$gfp_flags\n";; while ($traceevent = <TRACING>) { if ($traceevent !~ /^ =>/) { $unique_events{$event}++; my $current = time; if ($current - $last_updated > 60) { $last_updated = $current; update_report(); } goto SKIP_LINEREAD; } $event .= $traceevent; } } else { $stats{EVENT_UNKNOWN}++; } if ($sigint_pending) { last EVENT_PROCESS; } } } sub update_report() { my $event; open (REPORT, ">$opt_output") || die ("Failed to open $opt_output for writing"); foreach $event (keys %unique_events) { print REPORT $unique_events{$event} . " instances $event\n"; } print REPORT "High-order normal allocations: " . $stats{HIGH_NORMAL_HIGHORDER_ALLOC} . "\n"; print REPORT "High-order atomic allocations: " . $stats{HIGH_ATOMIC_HIGHORDER_ALLOC} . "\n"; close REPORT; } sub print_report() { print "\nReport\n"; open (REPORT, $opt_output) || die ("Failed to open $opt_output for reading"); while (<REPORT>) { print $_; } close REPORT; } # Process events or signals until neither is available sub signal_loop() { my $sigint_processed; do { $sigint_processed = 0; process_events(); # Handle pending signals if any if ($sigint_pending) { my $current_time = time; if ($sigint_exit) { print "Received exit signal\n"; $sigint_pending = 0; } if ($sigint_report) { if ($current_time >= $sigint_received + 2) { update_report(); print_report(); $sigint_report = 0; $sigint_pending = 0; $sigint_processed = 1; $sigint_exit = 0; } } } } while ($sigint_pending || $sigint_processed); } sub set_traceoption($) { my $option = shift; open(TRACEOPT, ">/sys/kernel/debug/tracing/trace_options") || die("Failed to open trace_options"); print TRACEOPT $option; close TRACEOPT; } sub enable_tracing($) { my $tracing = shift; open(TRACING, ">/sys/kernel/debug/tracing/events/$tracing/enable") || die("Failed to open tracing event $tracing"); print TRACING "1"; close TRACING; } sub disable_tracing($) { my $tracing = shift; open(TRACING, ">/sys/kernel/debug/tracing/events/$tracing/enable") || die("Failed to open tracing event $tracing"); print TRACING "0"; close TRACING; } set_traceoption("stacktrace"); set_traceoption("sym-offset"); set_traceoption("sym-addr"); enable_tracing("kmem/mm_page_alloc"); open(TRACING, "/sys/kernel/debug/tracing/trace_pipe") || die("Failed to open trace_pipe"); signal_loop(); close TRACING; disable_tracing("kmem/mm_page_alloc"); update_report(); print_report();