Bug 201865 - BUG: Bad rss-counter state mm:00000000d5ef1295 idx:1 val:3
Status: NEW
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Page Allocator
Hardware: PPC-64 Linux
Importance: P1 normal
Assignee: Andrew Morton
Reported: 2018-12-03 18:27 UTC by Erhard F.
Modified: 2019-01-26 20:39 UTC

Kernel Version: 4.20-rc5
Regression: No


Attachments
dmesg output (62.14 KB, text/plain)
2018-12-03 18:27 UTC, Erhard F.
kernel .config (4.20-rc5) (88.17 KB, text/plain)
2018-12-03 18:28 UTC, Erhard F.

Description Erhard F. 2018-12-03 18:27:24 UTC
Created attachment 279823
dmesg output

The kernel (4.20-rc5) tells me:

[  873.263594] BUG: Bad rss-counter state mm:00000000d5ef1295 idx:1 val:3
[  873.263605] BUG: non-zero pgtables_bytes on freeing mm: 24576

I've seen bug #196569, but I am not quite sure if this is the same problem. So filing a new bug.
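
For reference, the two messages above can be decoded mechanically. The following is a minimal sketch, not kernel code: the function name `decode_bug_line` and the returned dictionary layout are made up for illustration, and the index-to-name table assumes the counter order from include/linux/mm_types_task.h in 4.20-era kernels (MM_FILEPAGES, MM_ANONPAGES, MM_SWAPENTS, MM_SHMEMPAGES). Under that assumption, idx:1 val:3 reads as three anonymous pages still accounted to the mm when it was freed.

```python
import re

# Assumed counter order (4.20-era include/linux/mm_types_task.h).
# If the kernel under test orders the enum differently, adjust this list.
MM_COUNTER_NAMES = ["MM_FILEPAGES", "MM_ANONPAGES", "MM_SWAPENTS", "MM_SHMEMPAGES"]

RSS_RE = re.compile(
    r"BUG: Bad rss-counter state mm:(?P<mm>[0-9a-f]+) idx:(?P<idx>\d+) val:(?P<val>-?\d+)"
)
PGT_RE = re.compile(r"BUG: non-zero pgtables_bytes on freeing mm: (?P<bytes>-?\d+)")


def decode_bug_line(line):
    """Decode one dmesg line; return a dict, or None if the line is unrelated.

    Hypothetical helper for illustration only.
    """
    m = RSS_RE.search(line)
    if m:
        idx = int(m.group("idx"))
        if idx < len(MM_COUNTER_NAMES):
            name = MM_COUNTER_NAMES[idx]
        else:
            name = "unknown(%d)" % idx
        return {
            "kind": "rss",
            "mm": m.group("mm"),
            "counter": name,
            "pages": int(m.group("val")),
        }
    m = PGT_RE.search(line)
    if m:
        return {"kind": "pgtables", "bytes": int(m.group("bytes"))}
    return None


if __name__ == "__main__":
    for sample in (
        "[  873.263594] BUG: Bad rss-counter state mm:00000000d5ef1295 idx:1 val:3",
        "[  873.263605] BUG: non-zero pgtables_bytes on freeing mm: 24576",
    ):
        print(decode_bug_line(sample))
```

Feeding a saved dmesg through this line by line gives a quick list of which counters were left non-zero, which may help when comparing against other reports such as bug #196569.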

Machine is a Talos II, dual-socket NUMA 4-core POWER9:

# cat /proc/buddyinfo 
Node 0, zone      DMA 166543 151725 106193  58527  20325   4948    914    143     11      2      0      2      7 
Node 8, zone      DMA 229945 211748 103714  40310  16707   6915   1726    284     34      4      1      2     79 
# cat /proc/meminfo 
MemTotal:       32769896 kB
MemFree:        17302532 kB
MemAvailable:   25725428 kB
Buffers:           39732 kB
Cached:          8185260 kB
SwapCached:            0 kB
Active:          4121112 kB
Inactive:        3958012 kB
Active(anon):      69728 kB
Inactive(anon):      192 kB
Active(file):    4051384 kB
Inactive(file):  3957820 kB
Unevictable:      282456 kB
Mlocked:          282456 kB
SwapTotal:      35653624 kB
SwapFree:       35653624 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:        136704 kB
Mapped:           250228 kB
Shmem:             66504 kB
KReclaimable:    1855456 kB
Slab:            2187396 kB
SReclaimable:    1855456 kB
SUnreclaim:       331940 kB
KernelStack:        6848 kB
PageTables:         3408 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    52038572 kB
Committed_AS:     522416 kB
VmallocTotal:   549755813888 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
Percpu:             9344 kB
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB
DirectMap4k:           0 kB
DirectMap64k:           0 kB
DirectMap2M:     1048576 kB
DirectMap1G:    32505856 kB
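
As a sanity check on the numbers above, the /proc/buddyinfo rows can be summed into free pages per zone: each column is the count of free blocks of order 0, 1, 2, and so on, and a block of order n covers 2^n pages. This is a rough sketch; the function name `free_pages_per_zone` is hypothetical, and the 4 kB page size used in the example is an assumption, since the report does not state the page size.

```python
def free_pages_per_zone(buddyinfo_text):
    """Return {(node, zone): free_pages} parsed from /proc/buddyinfo text.

    Hypothetical helper for illustration only.
    """
    totals = {}
    for line in buddyinfo_text.splitlines():
        line = line.strip()
        if not line.startswith("Node"):
            continue
        # Row format: "Node <n>, zone <name> <order-0 count> <order-1 count> ..."
        head, rest = line.split(", zone", 1)
        node = int(head.split()[1])
        fields = rest.split()
        zone = fields[0]
        counts = [int(c) for c in fields[1:]]
        # A free block of order n spans 2**n pages.
        totals[(node, zone)] = sum(c << order for order, c in enumerate(counts))
    return totals


if __name__ == "__main__":
    sample = """\
Node 0, zone      DMA 166543 151725 106193  58527  20325   4948    914    143     11      2      0      2      7
Node 8, zone      DMA 229945 211748 103714  40310  16707   6915   1726    284     34      4      1      2     79
"""
    PAGE_KB = 4  # assumption: 4 kB base pages; not stated in the report
    pages = free_pages_per_zone(sample)
    for key in sorted(pages):
        print(key, pages[key], "pages")
    print("total free kB (assuming 4 kB pages):", sum(pages.values()) * PAGE_KB)
```

If the assumed page size is right, the total should land close to the MemFree line of /proc/meminfo; a large mismatch would suggest a different base page size rather than an accounting problem.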
Comment 1 Erhard F. 2018-12-03 18:28:06 UTC
Created attachment 279825
kernel .config (4.20-rc5)
Comment 2 Andrew Morton 2018-12-03 21:54:55 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Mon, 03 Dec 2018 18:27:24 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=201865
> 
>             Bug ID: 201865
>            Summary: BUG: Bad rss-counter state mm:00000000d5ef1295 idx:1
>                     val:3
>            Product: Memory Management
>            Version: 2.5
>     Kernel Version: 4.20-rc5
>           Hardware: PPC-64
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Page Allocator
>           Assignee: akpm@linux-foundation.org
>           Reporter: erhard_f@mailbox.org
>         Regression: No
> 
> Created attachment 279823
>   --> https://bugzilla.kernel.org/attachment.cgi?id=279823&action=edit
> dmesg output
> 
> The kernel (4.20-rc5) tells me:
> 
> [  873.263594] BUG: Bad rss-counter state mm:00000000d5ef1295 idx:1 val:3
> [  873.263605] BUG: non-zero pgtables_bytes on freeing mm: 24576
> 
> I've seen bug #196569, but I am not quite sure if this is the same problem. So filing a new bug.
> 
> Machine is a Talos II, dual-socket NUMA 4-core POWER9:
> 
> # cat /proc/buddyinfo 
> Node 0, zone      DMA 166543 151725 106193  58527  20325   4948    914    143     11      2      0      2      7
> Node 8, zone      DMA 229945 211748 103714  40310  16707   6915   1726    284     34      4      1      2     79
> # cat /proc/meminfo 
> MemTotal:       32769896 kB
> MemFree:        17302532 kB
> MemAvailable:   25725428 kB
> Buffers:           39732 kB
> Cached:          8185260 kB
> SwapCached:            0 kB
> Active:          4121112 kB
> Inactive:        3958012 kB
> Active(anon):      69728 kB
> Inactive(anon):      192 kB
> Active(file):    4051384 kB
> Inactive(file):  3957820 kB
> Unevictable:      282456 kB
> Mlocked:          282456 kB
> SwapTotal:      35653624 kB
> SwapFree:       35653624 kB
> Dirty:                 0 kB
> Writeback:             0 kB
> AnonPages:        136704 kB
> Mapped:           250228 kB
> Shmem:             66504 kB
> KReclaimable:    1855456 kB
> Slab:            2187396 kB
> SReclaimable:    1855456 kB
> SUnreclaim:       331940 kB
> KernelStack:        6848 kB
> PageTables:         3408 kB
> NFS_Unstable:          0 kB
> Bounce:                0 kB
> WritebackTmp:          0 kB
> CommitLimit:    52038572 kB
> Committed_AS:     522416 kB
> VmallocTotal:   549755813888 kB
> VmallocUsed:           0 kB
> VmallocChunk:          0 kB
> Percpu:             9344 kB
> AnonHugePages:         0 kB
> ShmemHugePages:        0 kB
> ShmemPmdMapped:        0 kB
> HugePages_Total:       0
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
> Hugetlb:               0 kB
> DirectMap4k:           0 kB
> DirectMap64k:           0 kB
> DirectMap2M:     1048576 kB
> DirectMap1G:    32505856 kB
Comment 3 H.J. Lu 2019-01-26 20:39:16 UTC
I also saw it under kernel 4.20.3 and 4.20.4 on a dual-socket NUMA
machine with 2 Intel Xeon Platinum 8180 CPUs. Kernel 4.19.xx is OK.
