Bug 201603

Summary: NULL pointer dereference when using z3fold and zswap
Product: Memory Management
Reporter: Jagannathan Tiruvallur Eachambadi (jagannathante)
Component: Page Allocator
Assignee: Andrew Morton (akpm)
Status: RESOLVED CODE_FIX
Severity: high
CC: enelar, imwellcushtymelike, oleksandr, vitalywool
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 4.18.16
Subsystem:
Regression: No
Bisected commit-id:
Attachments:
  dmesg log of crash
  Log before kernel panic and after hard reset
  attachment-22937-0.html
  attachment-27234-0.html

Description Jagannathan Tiruvallur Eachambadi 2018-11-02 10:41:46 UTC
Created attachment 279297 [details]
dmesg log of crash

This happens mostly during memory pressure but I am not sure how to trigger it reliably. I am attaching the full log.

This is the kernel commandline

>BOOT_IMAGE=../vmlinuz-linux root=UUID=57274b3a-92ab-468e-b03a-06026675c1af rw
>rd.luks.name=92b4aeb2-fb97-45c1-8a60-2816efe5d57e=home resume=/dev/mapper/home
>resume_offset=42772480 acpi_backlight=video zswap.enabled=1 zswap.zpool=z3fold
>zswap.max_pool_percent=5 transparent_hugepage=madvise scsi_mod.use_blk_mq=1
>vga=current initrd=../intel-ucode.img,../initramfs-linux.img
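When comparing reports like this one, it can help to pull just the zswap settings out of a long command line. The following is a minimal sketch, not part of the original report: the helper name `parse_cmdline_params` is hypothetical, and the sample string is an excerpt of the command line quoted above. On a running system the same function could be fed the contents of /proc/cmdline.

```python
def parse_cmdline_params(cmdline, prefix="zswap."):
    """Return {name: value} for every key=value token that starts with prefix.

    Hypothetical helper for illustration; it does plain string splitting
    and does not handle quoted values containing spaces.
    """
    params = {}
    for token in cmdline.split():
        if token.startswith(prefix) and "=" in token:
            key, value = token[len(prefix):].split("=", 1)
            params[key] = value
    return params


# Excerpt of the command line from the report above.
sample = (
    "resume_offset=42772480 acpi_backlight=video zswap.enabled=1 "
    "zswap.zpool=z3fold zswap.max_pool_percent=5 transparent_hugepage=madvise"
)
zswap_params = parse_cmdline_params(sample)
print(zswap_params)
```

The values actually in effect can differ from the command line if they were changed at runtime, so the files under /sys/module/zswap/parameters/ are the more reliable place to look on a live machine.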

Bug https://bugzilla.kernel.org/show_bug.cgi?id=198585 looks very similar, but its proposed fix has not been merged, so I can't be sure whether it would fix the issue I am having.
Comment 1 Kirill Berezin 2018-11-06 12:28:15 UTC
Created attachment 279341 [details]
Log before kernel panic and after hard reset

Arch+deepin+systemd-swap

zswap_enabled=1
zswap_compressor=lzo 
zswap_max_pool_percent=5
zswap_zpool=z3fold 

zram_enabled=1
zram_size=$(($RAM_SIZE*1/10))  
zram_streams=$NCPU
zram_alg=lz4                  
zram_prio=32767
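For readers unfamiliar with the shell arithmetic in `zram_size` above: `$(($RAM_SIZE*1/10))` is integer arithmetic that yields one tenth of `RAM_SIZE`, rounded down. A small sketch of the same calculation follows; it assumes `RAM_SIZE` is expressed in bytes (the configuration above does not state the unit), and the function name `zram_size` is made up for this illustration.

```python
def zram_size(ram_size, numerator=1, denominator=10):
    """Mirror the shell expression $(($RAM_SIZE*1/10)) with integer division.

    Assumption: ram_size is in bytes. Like shell arithmetic, the result is
    truncated to a whole number (for non-negative inputs).
    """
    return ram_size * numerator // denominator


# Example: a machine with 8 GiB of RAM (assumed figure, not from the report).
eight_gib = 8 * 2**30
print(zram_size(eight_gib))
```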
Comment 2 Andrew Morton 2018-11-06 21:48:41 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Fri, 02 Nov 2018 10:41:46 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=201603
> 
>             Bug ID: 201603
>            Summary: NULL pointer dereference when using z3fold and zswap
>            Product: Memory Management
>            Version: 2.5
>     Kernel Version: 4.18.16
>           Hardware: All
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: high
>           Priority: P1
>          Component: Page Allocator
>           Assignee: akpm@linux-foundation.org
>           Reporter: jagannathante@gmail.com
>         Regression: No
> 
> Created attachment 279297 [details]
>   --> https://bugzilla.kernel.org/attachment.cgi?id=279297&action=edit
> dmesg log of crash
> 
> This happens mostly during memory pressure but I am not sure how to trigger it
> reliably. I am attaching the full log.
> 
> This is the kernel commandline
> 
> >BOOT_IMAGE=../vmlinuz-linux root=UUID=57274b3a-92ab-468e-b03a-06026675c1af rw
> >rd.luks.name=92b4aeb2-fb97-45c1-8a60-2816efe5d57e=home resume=/dev/mapper/home
> >resume_offset=42772480 acpi_backlight=video zswap.enabled=1 zswap.zpool=z3fold
> >zswap.max_pool_percent=5 transparent_hugepage=madvise scsi_mod.use_blk_mq=1
> >vga=current initrd=../intel-ucode.img,../initramfs-linux.img
> 
> I found this bug https://bugzilla.kernel.org/show_bug.cgi?id=198585 to be very
> similar but the proposed fix has not been merged so I can't be sure if it will
> fix the issue I am having.
> 
> -- 
> You are receiving this mail because:
> You are the assignee for the bug.
Comment 3 Vitaly 2018-11-06 22:10:44 UTC
Created attachment 279353 [details]
attachment-22937-0.html

Hi,
On Tue, 6 Nov 2018 at 22:48, Andrew Morton <akpm@linux-foundation.org> wrote:

> > Created attachment 279297 [details]
> >   --> https://bugzilla.kernel.org/attachment.cgi?id=279297&action=edit
> > dmesg log of crash
>
>
Based on what I see in dmesg, this is highly likely to be fixed by
https://lkml.org/lkml/2018/11/5/726. Could you please apply it and retest?

Best regards,
   Vitaly

Comment 4 Kirill Berezin 2018-11-07 01:40:55 UTC
Created attachment 279367 [details]
attachment-27234-0.html

In addition to the original report, I also hit
kernel BUG at mm/zswap.c:1175
(please take a look at my log).
I will test the patch tomorrow.

Comment 5 Ken Sharp 2019-02-06 09:16:54 UTC
Any update on this and does anyone know if this is a regression?
Comment 6 Ken Sharp 2019-02-12 10:47:55 UTC
This seems related from Chrome OS:
https://bugs.chromium.org/p/chromium/issues/detail?id=822360

I'm afraid I don't currently have the resources to test the patches (which haven't been committed upstream), although I may need to in the end, as one of my servers is crashing every day.
Comment 7 Vitaly 2019-02-21 17:45:25 UTC
Wait a second, you are sitting on a discontinued kernel. The fix went in back in December. Upgrade to the longterm series (the fix has been there since 4.19.6) and retest. Otherwise I suggest that we close this issue.
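A quick way to check whether a machine is at or past the 4.19.6 release mentioned above is to compare the numeric part of the kernel release string. This is a minimal sketch with hypothetical helper names (`kernel_version`, `at_least`). It only compares version numbers, so it cannot tell whether a distribution has backported the fix to an older series or whether a vendor kernel dropped it.

```python
import re


def kernel_version(release):
    """Parse the leading 'X.Y[.Z]' of a release string into a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing patch level (e.g. '5.0-rc1') is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def at_least(release, minimum=(4, 19, 6)):
    """Return True if release is numerically >= minimum."""
    return kernel_version(release) >= minimum


# On a live system the release string could come from platform.release().
print(at_least("4.18.16-arch1-1-ARCH"))
print(at_least("4.19.6"))
```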
Comment 8 Jagannathan Tiruvallur Eachambadi 2019-02-21 17:59:25 UTC
I am going to enable zswap and z3fold to confirm the fix. Sorry, I didn't follow through after it was merged.
Comment 9 Ken Sharp 2019-02-21 18:08:53 UTC
I'm installing it too. I didn't know there was a fix in place anywhere, so this is good news! It'll take me a few days of testing to get a result.
Comment 10 Jagannathan Tiruvallur Eachambadi 2019-03-04 10:52:22 UTC
I can confirm there have been no crashes since I switched, so closing. Thanks for the help and fix :)