Bug 218039
Summary: | Memleaks in offset_ctx->xa (shmem) | ||
---|---|---|---|
Product: | Memory Management | Reporter: | vladbu |
Component: | Other | Assignee: | Andrew Morton (akpm) |
Status: | NEW --- | ||
Severity: | normal | CC: | bagasdotme, bcodding, cel, chuck.lever, regressions |
Priority: | P3 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | Subsystem: | ||
Regression: | No | Bisected commit-id: |
Description
vladbu
2023-10-23 07:06:34 UTC
(In reply to vladbu from comment #0) > We have been getting memleaks in offset_ctx->xa in our networking tests: > > unreferenced object 0xffff8881004cd080 (size 576): > comm "systemd", pid 1, jiffies 4294893373 (age 1992.864s) > hex dump (first 32 bytes): > 00 00 06 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > 38 5c 7c 02 81 88 ff ff 98 d0 4c 00 81 88 ff ff 8\|.......L..... > backtrace: > [<000000000f554608>] xas_alloc+0x306/0x430 > [<0000000075537d52>] xas_create+0x4b4/0xc80 > [<00000000a927aab2>] xas_store+0x73/0x1680 > [<0000000020a61203>] __xa_alloc+0x1d8/0x2d0 > [<00000000ae300af2>] __xa_alloc_cyclic+0xf1/0x310 > [<000000001032332c>] simple_offset_add+0xd8/0x170 > [<0000000073229fad>] shmem_mknod+0xbf/0x180 > [<00000000242520ce>] vfs_mknod+0x3b0/0x5c0 > [<000000001ef218dd>] unix_bind+0x2c2/0xdb0 > [<0000000009b9a8dd>] __sys_bind+0x127/0x1e0 > [<000000003c949fbb>] __x64_sys_bind+0x6e/0xb0 > [<00000000b8a767c7>] do_syscall_64+0x3d/0x90 > [<000000006132ae0d>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 > > Memleak trace points to some syscall performed by systemd and none of our > tests do anything more advanced with it than 'systemctl restart > ovs-vswitchd'. Basically it is a setup with Fedora and an upstream kernel > that executes bunch of network offload tests with Open vSwitch, iproute2 tc, > Linux bridge, etc. > > It looks like those may be caused by recent commit 6faddda69f62 ("libfs: Add > directory operations for stable offsets") but we don't have a proper > reproduction, just sometimes arbitrary getting the memleak complains > during/after the regression run. Does reverting 6faddda69f62 fix this issue? (In reply to Bagas Sanjaya from comment #1) > (In reply to vladbu from comment #0) > > We have been getting memleaks in offset_ctx->xa in our networking tests: > > > > unreferenced object 0xffff8881004cd080 (size 576): > > comm "systemd", pid 1, jiffies 4294893373 (age 1992.864s) > > hex dump (first 32 bytes): > > 00 00 06 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > > 38 5c 7c 02 81 88 ff ff 98 d0 4c 00 81 88 ff ff 8\|.......L..... > > backtrace: > > [<000000000f554608>] xas_alloc+0x306/0x430 > > [<0000000075537d52>] xas_create+0x4b4/0xc80 > > [<00000000a927aab2>] xas_store+0x73/0x1680 > > [<0000000020a61203>] __xa_alloc+0x1d8/0x2d0 > > [<00000000ae300af2>] __xa_alloc_cyclic+0xf1/0x310 > > [<000000001032332c>] simple_offset_add+0xd8/0x170 > > [<0000000073229fad>] shmem_mknod+0xbf/0x180 > > [<00000000242520ce>] vfs_mknod+0x3b0/0x5c0 > > [<000000001ef218dd>] unix_bind+0x2c2/0xdb0 > > [<0000000009b9a8dd>] __sys_bind+0x127/0x1e0 > > [<000000003c949fbb>] __x64_sys_bind+0x6e/0xb0 > > [<00000000b8a767c7>] do_syscall_64+0x3d/0x90 > > [<000000006132ae0d>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 > > > > Memleak trace points to some syscall performed by systemd and none of our > > tests do anything more advanced with it than 'systemctl restart > > ovs-vswitchd'. Basically it is a setup with Fedora and an upstream kernel > > that executes bunch of network offload tests with Open vSwitch, iproute2 > tc, > > Linux bridge, etc. > > > > It looks like those may be caused by recent commit 6faddda69f62 ("libfs: > Add > > directory operations for stable offsets") but we don't have a proper > > reproduction, just sometimes arbitrary getting the memleak complains > > during/after the regression run. > > Does reverting 6faddda69f62 fix this issue? Unless I'm missing something neither simple_offset_add() function nor offset_ctx->xa xarray existed before 6faddda69f62, so reverting it would obviously fix the issue. Reverting 6faddda69f62 has unwanted side-effects, not the least of which is re-introducing a bug in the way tmpfs handles directory offsets. Here is a copy of ovs-vswitchd.service from github:
> [Unit]
> Description=Open vSwitch Forwarding Unit
> After=ovsdb-server.service network-pre.target systemd-udev-settle.service
> Before=network.target network.service
> Requires=ovsdb-server.service
> ReloadPropagatedFrom=ovsdb-server.service
> AssertPathIsReadWrite=/run/openvswitch/db.sock
> PartOf=openvswitch.service
>
> [Service]
> Type=forking
> PIDFile=/run/openvswitch/ovs-vswitchd.pid
> Restart=on-failure
> Environment=XDG_RUNTIME_DIR=/run/openvswitch
> EnvironmentFile=/etc/openvswitch/default.conf
> EnvironmentFile=-/etc/sysconfig/openvswitch
> EnvironmentFile=-/run/openvswitch.useropts
> LimitSTACK=2M
> @begin_dpdk@
> ExecStartPre=-/bin/sh -c '/usr/bin/chown :$${OVS_USER_ID##*:} /dev/hugepages'
> ExecStartPre=-/usr/bin/chmod 0775 /dev/hugepages
> @end_dpdk@
> ExecStart=/usr/share/openvswitch/scripts/ovs-ctl \
> --no-ovsdb-server --no-monitor --system-id=random \
> ${OVS_USER_OPT} \
> start $OPTIONS
> ExecStop=/usr/share/openvswitch/scripts/ovs-ctl --no-ovsdb-server stop
> ExecReload=/usr/share/openvswitch/scripts/ovs-ctl --no-ovsdb-server \
> --no-monitor --system-id=random \
> ${OVS_USER_OPT} \
> restart $OPTIONS
> TimeoutSec=300
Vlad, have you tried running any of these ovs-ctl commands individually? My working theory is that one of these is doing something unexpected with a tmpfs file which exposes a missing kfree() in the shmemfs code that 6faddda69f62 converted to use simple directory offsets.
Ilya Maximets has broken down ovs-ctl even further: https://mail.openvswitch.org/pipermail/ovs-discuss/2023-November/052783.html One or more of these individual steps could be the culprit. (In reply to Chuck Lever from comment #4) > Here is a copy of ovs-vswitchd.service from github: > > > [Unit] > > Description=Open vSwitch Forwarding Unit > > After=ovsdb-server.service network-pre.target systemd-udev-settle.service > > Before=network.target network.service > > Requires=ovsdb-server.service > > ReloadPropagatedFrom=ovsdb-server.service > > AssertPathIsReadWrite=/run/openvswitch/db.sock > > PartOf=openvswitch.service > > > > [Service] > > Type=forking > > PIDFile=/run/openvswitch/ovs-vswitchd.pid > > Restart=on-failure > > Environment=XDG_RUNTIME_DIR=/run/openvswitch > > EnvironmentFile=/etc/openvswitch/default.conf > > EnvironmentFile=-/etc/sysconfig/openvswitch > > EnvironmentFile=-/run/openvswitch.useropts > > LimitSTACK=2M > > @begin_dpdk@ > > ExecStartPre=-/bin/sh -c '/usr/bin/chown :$${OVS_USER_ID##*:} > /dev/hugepages' > > ExecStartPre=-/usr/bin/chmod 0775 /dev/hugepages > > @end_dpdk@ > > ExecStart=/usr/share/openvswitch/scripts/ovs-ctl \ > > --no-ovsdb-server --no-monitor --system-id=random \ > > ${OVS_USER_OPT} \ > > start $OPTIONS > > ExecStop=/usr/share/openvswitch/scripts/ovs-ctl --no-ovsdb-server stop > > ExecReload=/usr/share/openvswitch/scripts/ovs-ctl --no-ovsdb-server \ > > --no-monitor --system-id=random \ > > ${OVS_USER_OPT} \ > > restart $OPTIONS > > TimeoutSec=300 > > Vlad, have you tried running any of these ovs-ctl commands individually? My > working theory is that one of these is doing something unexpected with a > tmpfs file which exposes a missing kfree() in the shmemfs code that > 6faddda69f62 converted to use simple directory offsets. Just starting/restarting/reloading/stopping ovs-vswitchd service doesn't reproduce the issue for me. At this point I cannot proceed without a reproducer. Vlad, please continue trying to identify a trigger for this issue. I expect as v6.6 is adopted by leading-edge distros, someone else might encounter this issue and be able to provide a more specific trigger. (In reply to Chuck Lever from comment #7) > At this point I cannot proceed without a reproducer. Vlad, please continue > trying to identify a trigger for this issue. Vlad, any progress? FWIW, I noticed that Chuck wrote a change that contains a fixes tag pointing to the culprit of this regression ( https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=master&id=796432efab1e372d404e7a71cc6891a53f105051 ). I wonder if that in some way might be relevant for this problem. (In reply to The Linux kernel's regression tracker (Thorsten Leemhuis) from comment #8) > (In reply to Chuck Lever from comment #7) > > At this point I cannot proceed without a reproducer. Vlad, please continue > > trying to identify a trigger for this issue. > > Vlad, any progress? No. As Chuck suggested in his previous comment: I expect as v6.6 is adopted by leading-edge distros, someone else might encounter this issue and be able to provide a more specific trigger. (In reply to The Linux kernel's regression tracker (Thorsten Leemhuis) from comment #8) > FWIW, I noticed that Chuck wrote a change that contains a fixes tag pointing > to the culprit of this regression ( > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/ > ?h=master&id=796432efab1e372d404e7a71cc6891a53f105051 ). I wonder if that in > some way might be relevant for this problem. That's possible, but not probable. Worth a try, though. |