Most recent kernel where this bug did not occur:
Distribution:
Hardware Environment:
Celeron M 1.5GHz
256MB memory
1GB swap

Software Environment:
The volume is created on RAID0.
Kernel is 2.6.17.13.
Filesystem is ext3.

Problem Description:
We create 8 snapshots of one volume and copy a 4GB file to the volume; the system then reports "out of memory" and the oom-killer kills my processes. Has anyone else hit this problem, and is there any way to solve it? Thanks!!

Steps to reproduce (a command sketch follows this report):
- use three hard disks to create a RAID0 array
- create a VG called vg0 on the RAID0 array
- create an LV called lv0 on vg0
- mke2fs -j /dev/vg0/lv0, to format it as an ext3 filesystem
- mount /dev/vg0/lv0 on /raid/data
- create a folder pub on /raid/data
- create 8 snapshots based on lv0; each snapshot LV is 22GB, and every snapshot is the same
- use the dd command to create a 4GB file on lv0:
  dd if=/dev/zero of=/raid/data/4G.bin bs=1M count=4096
- something then goes wrong on the system:
  sometimes the system crashes without any further message;
  sometimes "out of memory" appears in dmesg
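For readers trying to reproduce this, a minimal command-line sketch of the steps above. The member disks, md device name, and origin LV size are assumptions (the report does not give them; the origin size is inferred from the dmsetup table later in the thread); the snapshot size and dd command are as reported.

  # Assumed member disks; adjust to the actual hardware.
  mdadm --create /dev/md0 --level=0 --raid-devices=3 /dev/sda /dev/sdb /dev/sdc
  pvcreate /dev/md0
  vgcreate vg0 /dev/md0
  lvcreate -L 77G -n lv0 vg0       # origin size assumed
  mke2fs -j /dev/vg0/lv0           # ext3
  mkdir -p /raid/data
  mount /dev/vg0/lv0 /raid/data
  mkdir /raid/data/pub
  for i in 1 2 3 4 5 6 7 8; do     # eight 22GB snapshots of the origin
      lvcreate -s -L 22G -n snap$i /dev/vg0/lv0
  done
  dd if=/dev/zero of=/raid/data/4G.bin bs=1M count=4096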
out of memory messages:
-------------------------------------
oom-killer: gfp_mask=0x201d2, order=0
 <c012977a> out_of_memory+0x28/0x7a
 <c012a603> __alloc_pages+0x1e5/0x26a
 <c012778b> page_cache_read+0x3d/0x91
 <c0127972> filemap_nopage+0x193/0x2a8
 <c0130f17> do_no_page+0x6a/0x1c1
 <c013118b> __handle_mm_fault+0xbc/0x15a
 <c010d87f> do_page_fault+0x239/0x561
 <c010d646> do_page_fault+0x0/0x561
 <c0102e87> error_code+0x4f/0x54
Mem-info:
DMA per-cpu:
cpu 0 hot: high 0, batch 1 used:0
cpu 0 cold: high 0, batch 1 used:0
DMA32 per-cpu: empty
Normal per-cpu:
cpu 0 hot: high 90, batch 15 used:18
cpu 0 cold: high 30, batch 7 used:24
HighMem per-cpu: empty
Free pages: 24772kB (0kB HighMem)
Active:1061 inactive:17650 dirty:3602 writeback:4502 unstable:0 free:6193 slab:32454 mapped:66 pagetables:169
DMA free:2880kB min:1928kB low:2408kB high:2892kB active:68kB inactive:4640kB present:16384kB pages_scanned:4506 all_unreclaimable? yes
lowmem_reserve[]: 0 0 238 238
DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 238 238
Normal free:21892kB min:28788kB low:35984kB high:43180kB active:4176kB inactive:65960kB present:244672kB pages_scanned:73425 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0 0
HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 0*4kB 42*8kB 147*16kB 0*32kB 1*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2880kB
DMA32: empty
Normal: 1779*4kB 839*8kB 342*16kB 1*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 21892kB
HighMem: empty
Swap cache: add 5930, delete 5875, find 21346/21816, race 0+0
Free swap  = 975336kB
Total swap = 987832kB
Free swap:  975336kB
65264 pages of RAM
0 pages of HIGHMEM
1324 reserved pages
9298 pages shared
55 pages swap cached
3602 pages dirty
4502 pages writeback
66 pages mapped
32454 pages slab
169 pages pagetables

oom-killer: gfp_mask=0x200d2, order=0
 <c012977a> out_of_memory+0x28/0x7a
 <c012a603> __alloc_pages+0x1e5/0x26a
 <c0135cc7> read_swap_cache_async+0x2e/0x73
 <c0130c1a> do_swap_page+0x62/0x1ec
 <c01311c1> __handle_mm_fault+0xf2/0x15a
 <c010d87f> do_page_fault+0x239/0x561
 <c010d646> do_page_fault+0x0/0x561
 <c0102e87> error_code+0x4f/0x54
Mem-info:
DMA per-cpu:
cpu 0 hot: high 0, batch 1 used:0
cpu 0 cold: high 0, batch 1 used:0
DMA32 per-cpu: empty
Normal per-cpu:
cpu 0 hot: high 90, batch 15 used:18
cpu 0 cold: high 30, batch 7 used:24
HighMem per-cpu: empty
Free pages: 24772kB (0kB HighMem)
Active:1061 inactive:17650 dirty:3591 writeback:4513 unstable:0 free:6193 slab:32454 mapped:66 pagetables:169
DMA free:2880kB min:1928kB low:2408kB high:2892kB active:68kB inactive:4640kB present:16384kB pages_scanned:4538 all_unreclaimable? yes
lowmem_reserve[]: 0 0 238 238
DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 238 238
Normal free:21892kB min:28788kB low:35984kB high:43180kB active:4176kB inactive:65960kB present:244672kB pages_scanned:73489 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0 0
HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 0*4kB 42*8kB 147*16kB 0*32kB 1*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2880kB
DMA32: empty
Normal: 1779*4kB 839*8kB 342*16kB 1*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 21892kB
HighMem: empty
Swap cache: add 5930, delete 5875, find 21346/21816, race 0+0
Free swap  = 975336kB
Total swap = 987832kB
Free swap:  975336kB
65264 pages of RAM
0 pages of HIGHMEM
1324 reserved pages
9298 pages shared
55 pages swap cached
3591 pages dirty
4513 pages writeback
66 pages mapped
32454 pages slab
169 pages pagetables

oom-killer: gfp_mask=0x201d2, order=0
 <c012977a> out_of_memory+0x28/0x7a
 <c012a603> __alloc_pages+0x1e5/0x26a
 <c012778b> page_cache_read+0x3d/0x91
 <c0127972> filemap_nopage+0x193/0x2a8
 <c0130f17> do_no_page+0x6a/0x1c1
 <c013118b> __handle_mm_fault+0xbc/0x15a
 <c010d87f> do_page_fault+0x239/0x561
 <c010d646> do_page_fault+0x0/0x561
 <c0102e87> error_code+0x4f/0x54
Mem-info:
DMA per-cpu:
cpu 0 hot: high 0, batch 1 used:0
cpu 0 cold: high 0, batch 1 used:0
DMA32 per-cpu: empty
Normal per-cpu:
cpu 0 hot: high 90, batch 15 used:18
cpu 0 cold: high 30, batch 7 used:24
HighMem per-cpu: empty
Free pages: 24772kB (0kB HighMem)
Active:1029 inactive:17682 dirty:3591 writeback:4513 unstable:0 free:6193 slab:32454 mapped:66 pagetables:169
DMA free:2880kB min:1928kB low:2408kB high:2892kB active:68kB inactive:4640kB present:16384kB pages_scanned:4538 all_unreclaimable? yes
lowmem_reserve[]: 0 0 238 238
DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 238 238
Normal free:21892kB min:28788kB low:35984kB high:43180kB active:4048kB inactive:66088kB present:244672kB pages_scanned:73521 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0 0
HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 0*4kB 42*8kB 147*16kB 0*32kB 1*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2880kB
DMA32: empty
Normal: 1779*4kB 839*8kB 342*16kB 1*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 21892kB
HighMem: empty
Swap cache: add 5930, delete 5875, find 21346/21816, race 0+0
Free swap  = 975336kB
Total swap = 987832kB
Free swap:  975336kB
65264 pages of RAM
0 pages of HIGHMEM
1324 reserved pages
9298 pages shared
55 pages swap cached
3591 pages dirty
4513 pages writeback
66 pages mapped
32454 pages slab
169 pages pagetables
I've switched this to the mailing list - please send all replies via email (not the bugzilla web interface) and please ensure that all cc's are retained.

On Wed, 13 Sep 2006 20:20:00 -0700 bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=7158
>
>     Summary: Out of memory happen in snapshot
>     Kernel Version: 2.6.17.13
>     Status: NEW
>     Severity: blocking
>     Owner: agk@redhat.com
>     Submitter: kevin_cheng@thecus.com
>
> [...]

The oom-killer info which you've included there is ambiguous. A large amount of memory is in slab, which might indicate a slab leak. But there is also a large amount of memory on the page LRU, which one would expect to have been reclaimed before declaration of OOM.

Could you please capture the contents of /proc/slabinfo after the oom-killing and send that?
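One way to capture the requested data immediately after the OOM kill (a sketch; the output filenames are arbitrary):

  dmesg > oom-dmesg.txt                   # includes the oom-killer report
  cat /proc/slabinfo > oom-slabinfo.txt
  cat /proc/meminfo > oom-meminfo.txt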
Dear Andrew:

Thanks for your message and help.

The attached file is the slabinfo from my machine. Thanks!!

Thanks for all your advice,

Regards,
Kevin Cheng

-----Original Message-----
From: Andrew Morton [mailto:akpm@osdl.org]
Sent: Thursday, September 14, 2006 11:53 AM
To: bugme-daemon@kernel-bugs.osdl.org
Cc: dm-devel@redhat.com; agk@redhat.com; kevin_cheng@thecus.com
Subject: Re: [Bugme-new] [Bug 7158] New: Out of memory happen in snapshot

[...]
On Thu, 14 Sep 2006 13:25:52 +0800 "kevin Cheng" <kevin_cheng@thecus.com> wrote:

> Dear Andrew:
> Thanks for your message and help.
>
> The attached file is the slabinfo from my machine. Thanks!!

OK, thanks. A number of DM-related slab caches have really high object counts. So there's a good chance that this is either a mistuning/imbalance issue in DM or an outright leak.

I'd ask the DM developers to take over here please.
I have checked the slabinfo before and after the OOM killer runs. Some objects have high counts, and dm-snapshot-ex always has a high count. Is there any explanation for those values?

Kevin Cheng

-----------------------------------------------------------------------------------------
Before: kcopyd-jobs       512    525  264  15 1 : tunables  54 27 0 : slabdata   35   35 0
After : kcopyd-jobs     16524  62685  264  15 1 : tunables  54 27 0 : slabdata 4179 4179 0

Before: dm-snapshot-in    128    177   64  59 1 : tunables 120 60 0 : slabdata    3    3 0
After : dm-snapshot-in  24833  73219   64  59 1 : tunables 120 60 0 : slabdata 1241 1241 0

Before: dm-snapshot-ex 780545 780680   24 145 1 : tunables 120 60 0 : slabdata 5384 5384 0

Before: dm_tio           5124   5278   16 203 1 : tunables 120 60 0 : slabdata   26   26 0
Before: dm_io            5133   5239   20 169 1 : tunables 120 60 0 : slabdata   31   31 0
After : dm_tio          23337  32277   16 203 1 : tunables 120 60 0 : slabdata  159  159 0
After : dm_io           23339  32448   20 169 1 : tunables 120 60 0 : slabdata  192  192 0

Before: journal_head        2     72   52  72 1 : tunables 120 60 0 : slabdata    1    1 0
After : journal_head    23029  25272   52  72 1 : tunables 120 60 0 : slabdata  351  351 0

Before: biovec-4           39     59   64  59 1 : tunables 120 60 0 : slabdata    1    1 0
Before: biovec-1           76    406   16 203 1 : tunables 120 60 0 : slabdata    2    2 0
Before: bio               330    330  128  30 1 : tunables 120 60 0 : slabdata   11   11 0
After : biovec-4         2815   3717   64  59 1 : tunables 120 60 0 : slabdata   63   63 0
After : biovec-1        33507  47096   16 203 1 : tunables 120 60 0 : slabdata  232  232 0
After : bio             36390  50130  128  30 1 : tunables 120 60 0 : slabdata 1671 1671 0

Before: buffer_head     16360  16416   52  72 1 : tunables 120 60 0 : slabdata  228  228 0
After : buffer_head     24234  36000   52  72 1 : tunables 120 60 0 : slabdata  500  500 0

-----Original Message-----
From: Andrew Morton [mailto:akpm@osdl.org]
Sent: Thursday, September 14, 2006 1:37 PM
To: kevin Cheng
Cc: 'bugme-daemon@kernel-bugs.osdl.org'; dm-devel@redhat.com; agk@redhat.com
Subject: Re: [Bugme-new] [Bug 7158] New: Out of memory happen in snapshot

[...]
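A comparison like the one above can be generated from two saved copies of /proc/slabinfo; a sketch, assuming they were saved as slabinfo.before and slabinfo.after:

  # Print caches whose active-object count (column 2) grew by more than 1000.
  awk 'NR == FNR { before[$1] = $2; next }
       FNR > 2 && ($2 - before[$1]) > 1000 {
           printf "%-16s %8d -> %8d\n", $1, before[$1], $2
       }' slabinfo.before slabinfo.after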
What is the output of these three commands?

  dmsetup info -c
  dmsetup table
  dmsetup status

The current snapshot implementation does require a lot of kernel memory to store each snapshot's exception table, which grows as there is new I/O. Eight snapshots means eight times the memory needed by one snapshot.

Are you using the default chunk size? If so, try a larger one (lvcreate -c).

Alasdair
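To put a number on this, a back-of-envelope estimate for the reported workload, using the 24-byte dm-snapshot-ex object size visible in the slabinfo above and the 8KB chunk size of the original test (a rough sketch; it ignores hash-table and slab overhead):

  # 4 GiB written to the origin / 8 KiB chunks = 524288 exceptions per
  # snapshot; at 24 bytes each, times 8 snapshots:
  echo $(( (4 * 1024 * 1024 / 8) * 24 * 8 / 1024 / 1024 ))   # => 96 (MiB of lowmem)

On a 256MB machine that is a large fraction of low memory, and a bigger chunk size shrinks it proportionally, e.g. (the snapshot name is illustrative):

  lvcreate -s -c 256 -L 22G -n snap1 /dev/vg0/lv0   # 256KB chunks: ~3 MiB total for 8 snapshots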
Dear Alasdair:

Thanks for your help.

We have captured the information you asked for below. We also tried a 64KB chunk size, and it seems to help; the slabinfo from that run follows as well. But the speed is very slow: disk I/O throughput drops to just 1.68MB/s. Is there any way to solve that? Thanks!!

Kevin

---------------------------------------------------------------------------------------------
root@127.0.0.1:~# /app/dmsetup info -c
Name                             Maj Min Stat Open Targ Event UUID
vg0-2006--09--15--10--32--35     253 7   L--w 1    1    0     LVM-tyDxlOveI21xekkD63t1eYO42UhWM4VOOh53h60HLMcWIqLfCSt8vWO9B6DRhE5c
vg0-lv0-real                     253 3   L--w 9    1    0     LVM-tyDxlOveI21xekkD63t1eYO42UhWM4VOW1BfYcHj2lYmvnb1TzyRCzp9hUwHsRl1-real
vg0-2006--09--15--10--32--48     253 17  L--w 1    1    0     LVM-tyDxlOveI21xekkD63t1eYO42UhWM4VO2ryv6q3C0AxF4WxKdK0wNsgfaJW15uYA
vg0-2006--09--15--10--32--50     253 19  L--w 1    1    0     LVM-tyDxlOveI21xekkD63t1eYO42UhWM4VOdLWwX9RWix7ufflxxcU8pYJ7ugTiTqcr
vg0-2006--09--15--10--32--46-cow 253 14  L--w 1    1    0     LVM-tyDxlOveI21xekkD63t1eYO42UhWM4VOz0IxXWp1V2yB1E2GVNNuStIKS4HGDKul-cow
vg0-syslv                        253 0   L--w 1    1    0     LVM-tyDxlOveI21xekkD63t1eYO42UhWM4VOgjsjzAhmgflRymCKspDrRAWEtKTvxGC2
vg0-2006--09--15--10--32--37-cow 253 8   L--w 1    1    0     LVM-tyDxlOveI21xekkD63t1eYO42UhWM4VOXgG6ht87GUFzownX6qboGAjj3aMoDdmB-cow
vg0-2006--09--15--10--32--29     253 5   L--w 1    1    0     LVM-tyDxlOveI21xekkD63t1eYO42UhWM4VOK8ZP2bR28O99DZEXzZF5QmEIHtEyIt28
vg0-2006--09--15--10--32--46     253 15  L--w 1    1    0     LVM-tyDxlOveI21xekkD63t1eYO42UhWM4VOz0IxXWp1V2yB1E2GVNNuStIKS4HGDKul
vg0-2006--09--15--10--32--50-cow 253 18  L--w 1    1    0     LVM-tyDxlOveI21xekkD63t1eYO42UhWM4VOdLWwX9RWix7ufflxxcU8pYJ7ugTiTqcr-cow
vg0-2006--09--15--10--32--29-cow 253 4   L--w 1    1    0     LVM-tyDxlOveI21xekkD63t1eYO42UhWM4VOK8ZP2bR28O99DZEXzZF5QmEIHtEyIt28-cow
vg0-2006--09--15--10--32--44     253 13  L--w 1    1    0     LVM-tyDxlOveI21xekkD63t1eYO42UhWM4VOaSFou8q5Xb3asN9U8WN2RiXaHb9jjVT4
vg0-lv1                          253 2   L--w 1    1    0     LVM-tyDxlOveI21xekkD63t1eYO42UhWM4VO5SUSILIIEM2gLYakhI8maNb1UFI21b1h
vg0-2006--09--15--10--32--48-cow 253 16  L--w 1    1    0     LVM-tyDxlOveI21xekkD63t1eYO42UhWM4VO2ryv6q3C0AxF4WxKdK0wNsgfaJW15uYA-cow
vg0-2006--09--15--10--32--39-cow 253 10  L--w 1    1    0     LVM-tyDxlOveI21xekkD63t1eYO42UhWM4VOAR6hd59N2ljEz0QIfaEwDgpxNTFENo0B-cow
vg0-lv0                          253 1   L--w 1    1    0     LVM-tyDxlOveI21xekkD63t1eYO42UhWM4VOW1BfYcHj2lYmvnb1TzyRCzp9hUwHsRl1
vg0-2006--09--15--10--32--39     253 11  L--w 1    1    0     LVM-tyDxlOveI21xekkD63t1eYO42UhWM4VOAR6hd59N2ljEz0QIfaEwDgpxNTFENo0B
vg0-2006--09--15--10--32--37     253 9   L--w 1    1    0     LVM-tyDxlOveI21xekkD63t1eYO42UhWM4VOXgG6ht87GUFzownX6qboGAjj3aMoDdmB
vg0-2006--09--15--10--32--44-cow 253 12  L--w 1    1    0     LVM-tyDxlOveI21xekkD63t1eYO42UhWM4VOaSFou8q5Xb3asN9U8WN2RiXaHb9jjVT4-cow
vg0-2006--09--15--10--32--35-cow 253 6   L--w 1    1    0     LVM-tyDxlOveI21xekkD63t1eYO42UhWM4VOOh53h60HLMcWIqLfCSt8vWO9B6DRhE5c-cow

root@127.0.0.1:~# /app/dmsetup table
vg0-2006--09--15--10--32--35: 0 161906688 snapshot 253:3 253:6 P 128
vg0-lv0-real: 0 161906688 linear 9:0 2097536
vg0-2006--09--15--10--32--48: 0 161906688 snapshot 253:3 253:16 P 128
vg0-2006--09--15--10--32--50: 0 161906688 snapshot 253:3 253:18 P 128
vg0-2006--09--15--10--32--46-cow: 0 34603008 linear 9:0 360087936
vg0-syslv: 0 2097152 linear 9:0 384
vg0-2006--09--15--10--32--37-cow: 0 34603008 linear 9:0 256278912
vg0-2006--09--15--10--32--29: 0 161906688 snapshot 253:3 253:4 P 128
vg0-2006--09--15--10--32--46: 0 161906688 snapshot 253:3 253:14 P 128
vg0-2006--09--15--10--32--50-cow: 0 35446784 linear 9:0 429293952
vg0-2006--09--15--10--32--29-cow: 0 34603008 linear 9:0 187072896
vg0-2006--09--15--10--32--44: 0 161906688 snapshot 253:3 253:12 P 128
vg0-lv1: 0 23068672 linear 9:0 164004224
vg0-2006--09--15--10--32--48-cow: 0 34603008 linear 9:0 394690944
vg0-2006--09--15--10--32--39-cow: 0 34603008 linear 9:0 290881920
vg0-lv0: 0 161906688 snapshot-origin 253:3
vg0-2006--09--15--10--32--39: 0 161906688 snapshot 253:3 253:10 P 128
vg0-2006--09--15--10--32--37: 0 161906688 snapshot 253:3 253:8 P 128
vg0-2006--09--15--10--32--44-cow: 0 34603008 linear 9:0 325484928
vg0-2006--09--15--10--32--35-cow: 0 34603008 linear 9:0 221675904

root@127.0.0.1:~# /app/dmsetup status
vg0-2006--09--15--10--32--35: 0 161906688 snapshot 5669120/34603008
vg0-lv0-real: 0 161906688 linear
vg0-2006--09--15--10--32--48: 0 161906688 snapshot 5669120/34603008
vg0-2006--09--15--10--32--50: 0 161906688 snapshot 5669120/35446784
vg0-2006--09--15--10--32--46-cow: 0 34603008 linear
vg0-syslv: 0 2097152 linear
vg0-2006--09--15--10--32--37-cow: 0 34603008 linear
vg0-2006--09--15--10--32--29: 0 161906688 snapshot 5669120/34603008
vg0-2006--09--15--10--32--46: 0 161906688 snapshot 5669120/34603008
vg0-2006--09--15--10--32--50-cow: 0 35446784 linear
vg0-2006--09--15--10--32--29-cow: 0 34603008 linear
vg0-2006--09--15--10--32--44: 0 161906688 snapshot 5669120/34603008
vg0-lv1: 0 23068672 linear
vg0-2006--09--15--10--32--48-cow: 0 34603008 linear
vg0-2006--09--15--10--32--39-cow: 0 34603008 linear
vg0-lv0: 0 161906688 snapshot-origin
vg0-2006--09--15--10--32--39: 0 161906688 snapshot 5669120/34603008
vg0-2006--09--15--10--32--37: 0 161906688 snapshot 5669120/34603008
vg0-2006--09--15--10--32--44-cow: 0 34603008 linear
vg0-2006--09--15--10--32--35-cow: 0 34603008 linear

---------------------------------------------------------------------------------------------
Before:
kcopyd-jobs      512   525  264  15 1 : tunables  54 27 0 : slabdata  35  35 0
dm-snapshot-in   128   177   64  59 1 : tunables 120 60 0 : slabdata   3   3 0
dm-snapshot-ex     8   145   24 145 1 : tunables 120 60 0 : slabdata   1   1 0
dm_tio          5120  5278   16 203 1 : tunables 120 60 0 : slabdata  26  26 0
dm_io           5120  5239   20 169 1 : tunables 120 60 0 : slabdata  31  31 0
journal_head       1    72   52  72 1 : tunables 120 60 0 : slabdata   1   1 0
biovec-(256)      15    16 3072   2 2 : tunables  24 12 0 : slabdata   8   8 0
biovec-128        23    25 1536   5 2 : tunables  24 12 0 : slabdata   5   5 0
biovec-64         39    40  768   5 1 : tunables  54 27 0 : slabdata   8   8 0
biovec-16         39    40  192  20 1 : tunables 120 60 0 : slabdata   2   2 0
biovec-4          39    59   64  59 1 : tunables 120 60 0 : slabdata   1   1 0
biovec-1         100  1218   16 203 1 : tunables 120 60 0 : slabdata   6   6 0
bio              295  1050  128  30 1 : tunables 120 60 0 : slabdata  35  35 0
buffer_head     1419  7560   52  72 1 : tunables 120 60 0 : slabdata 105 105 0

After:
kcopyd-jobs      512  1290  264  15 1 : tunables  54 27 0 : slabdata  86  86 0
dm-snapshot-in   129  1121   64  59 1 : tunables 120 60 0 : slabdata  19  19 0
dm-snapshot-ex 97904 98020   24 145 1 : tunables 120 60 0 : slabdata 676 676 0
dm_tio          5121  8526   16 203 1 : tunables 120 60 0 : slabdata  42  42 0
dm_io           5121  6760   20 169 1 : tunables 120 60 0 : slabdata  40  40 0
journal_head      15   576   52  72 1 : tunables 120 60 0 : slabdata   8   8 0
biovec-(256)      15    16 3072   2 2 : tunables  24 12 0 : slabdata   8   8 0
biovec-128        23    25 1536   5 2 : tunables  24 12 0 : slabdata   5   5 0
biovec-64         40    70  768   5 1 : tunables  54 27 0 : slabdata  14  14 0
biovec-16         40   140  192  20 1 : tunables 120 60 0 : slabdata   5   7 0
biovec-4          40   118   64  59 1 : tunables 120 60 0 : slabdata   2   2 0
biovec-1          81  2030   16 203 1 : tunables 120 60 0 : slabdata  10  10 0
bio              318  2250  128  30 1 : tunables 120 60 0 : slabdata  75  75 0
buffer_head     1331 12960   52  72 1 : tunables 120 60 0 : slabdata 180 180 0

---------------------------------------------------------------------------------------------
-----Original Message-----
From: Alasdair G Kergon [mailto:agk@redhat.com]
Sent: Thursday, September 14, 2006 9:33 PM
To: kevin Cheng
Cc: 'Andrew Morton'; 'bugme-daemon@kernel-bugs.osdl.org'; dm-devel@redhat.com; agk@redhat.com; mbroz@redhat.com
Subject: Re: [Bugme-new] [Bug 7158] New: Out of memory happen in snapshot

[...]
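As an aside, the snapshot fill level can be read straight off that 'dmsetup status' output (used/total sectors in the fifth field); a small sketch that prints it as a percentage:

  dmsetup status | awk '$4 == "snapshot" && $5 ~ /\// {
      split($5, a, "/")
      printf "%-36s %5.1f%% full\n", $1, 100 * a[1] / a[2]
  }'
  # e.g. 5669120/34603008 -> 16.4% full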
Dear Alasdair:

When I use dd to generate an 8GB file on the LV, "out of memory" still happens. And in slabinfo, "dm-snapshot-ex" always keeps a high count, maybe as high as 1193248 or more. Is there any way to move the dm-snapshot-ex memory use to swap, or to make it use less memory? Thanks!!

Regards,
Kevin

-----Original Message-----
From: dm-devel-bounces@redhat.com [mailto:dm-devel-bounces@redhat.com] On Behalf Of kevin Cheng
Sent: Friday, September 15, 2006 11:25 AM
To: 'Alasdair G Kergon'
Cc: 'Andrew Morton'; dm-devel@redhat.com; 'bugme-daemon@kernel-bugs.osdl.org'
Subject: [dm-devel] RE: [Bugme-new] [Bug 7158] New: Out of memory happen insnapshot

[...]
I notice that the 'dmsetup table' output does not correspond to the information you gave at the beginning of the bug. (Different volume sizes - 16.5GB not 22GB - and different chunk size - 64KB not 8KB.)

Does that 'dmsetup status' output correspond to the time the system reached OOM and tie in with the slabinfo you provided?

If not, please can you try to get fresh data from the time of the failure? (Both slabinfo and 'dmsetup status' taken at approximately the same time when the system is in (or close to) OOM; it's OK to take 'dmsetup info' and 'dmsetup table' earlier in the test as they won't change during it.)

Alasdair
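One way to collect matching samples while the dd runs, so that slabinfo and 'dmsetup status' from (or near) the failure are captured together (a sketch; the interval and log name are arbitrary):

  while true; do
      date >> oom-samples.log
      dmsetup status >> oom-samples.log
      cat /proc/slabinfo >> oom-samples.log
      sleep 10
  done &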
Dear Alasdair:

Thanks for your message.

This morning I reproduced the OOM problem. The hardware environment is the same, and I used a 64KB chunk size. The attached file has the dm info and the slab info. This time we used dd to write an 8GB file to the LVM volume, and the out-of-memory happened on the second run. Thanks!!

Regards,
Kevin

-----Original Message-----
From: Alasdair G Kergon [mailto:agk@redhat.com]
Sent: Saturday, September 16, 2006 4:10 AM
To: kevin Cheng
Cc: 'device-mapper development'; 'Andrew Morton'; 'bugme-daemon@kernel-bugs.osdl.org'
Subject: Re: [dm-devel] RE: [Bugme-new] [Bug 7158] New: Out of memory happen insnapshot

[...]
Kevin, where do we stand with this bug now? Still present in recent kernels? Thanks.
Please reopen this bug if it's still present with kernel 2.6.20.