Bug 217884
| Summary: | Random kernel panic since > 6.3.7 | | |
|---|---|---|---|
| Product: | Linux | Reporter: | cyayon |
| Component: | Kernel | Assignee: | Virtual assignee for kernel bugs (linux-kernel) |
| Status: | NEW --- | | |
| Severity: | normal | CC: | bagasdotme |
| Priority: | P3 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 6.3.9 | Subsystem: | |
| Regression: | Yes | Bisected commit-id: | |
Description
cyayon
2023-09-07 18:26:50 UTC
There aren't that many commits between 6.3.7 and 6.3.9, so it would be best for you to perform regression testing (bisection) as described in https://docs.kernel.org/admin-guide/bug-bisect.html. It may take quite some time, but it's the best shot you've got since your issue is not widespread.

(In reply to cyayon from comment #0)
> Hi,
>
> I have random kernel panic on my archlinux server, since upgrade from
> kernel 6.3.7.
> It's a headless server, no X11, no graphic card, and it is an intel
> processor (not AMD).
>
> On 6.3.7, everything was fine, since upgrade to 6.3.9, 6.4.3, 6.4.4, and now
> 6.4.12, I got random crash (sometime 24h, 48h, sometime a little bit
> more...).
>
> No log in journald/syslog, I managed to get only these three lines (with
> mounting /var/log/journal outside my nvme boot disk) :
>
> BUG: unable to handle page fault for address: 00000000352aa941
> #PF: supervisor write access in kernel mode
> #PF: error_code(0x0002) - not-present page
>
> Some information :
> uname -a : Linux xxxxorg 6.4.12-arch1-1 #1 SMP PREEMPT_DYNAMIC Thu, 24 Aug
> 2023 00:38:14 +0000 x86_64 GNU/Linux
> dmesg : http://ix.io/4FE8
> journalctl -b -1 | tail -1000 : http://ix.io/4FE9 (only last 1000 lines, too
> big)
> lspci -vvv : http://ix.io/4FEa
> lsmod : http://ix.io/4FEf
> lscpu : http://ix.io/4FEh
>
> Of course, if I rollback to 6.3.7, no more crash.
>
> Could you please help me to debug ?

Please test latest mainline first.

Hello, 6.4.12 is not mainline ? Thanks.

6.4.15 is.

Sorry, I understand. 6.4.12 is the latest Arch Linux 6.4 kernel. I have to wait for 6.5.x …

(In reply to Artem S. Tashkinov from comment #4)
> 6.4.15 is.

Nope. I mean v6.x and v6.x-rcy (as in Linus's tree), not v6.x.y as in the stable one.

Hi,

This night crash again (kernel 6.4.12). I manage to got some logs from syslog-ng (no log from journald).

Here are the last logs just before the crash.

http://ix.io/4FTW

It seems to be related to nft.

thanks.
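For readers unfamiliar with the `#PF: error_code(0x0002)` line quoted in the report, the following is a minimal sketch of how the low bits of an x86 page-fault error code map to the wording the kernel prints. The function name `decode_pf_error` is hypothetical, and only the present, write, user and instruction-fetch bits are handled.

```shell
# Hypothetical helper (not part of any kernel tooling): decode the low bits
# of an x86 page-fault error code as shown in "#PF: error_code(0x....)".
#   bit 0: 0 = page not present, 1 = protection violation
#   bit 1: 0 = read,             1 = write
#   bit 2: 0 = supervisor mode,  1 = user mode
#   bit 4: 1 = instruction fetch
decode_pf_error() {
    pf_code=$(( $1 ))   # accepts hex input such as 0x0002

    if [ $(( pf_code & 1 )) -ne 0 ]; then
        pf_cause="protection violation"
    else
        pf_cause="not-present page"
    fi

    if [ $(( pf_code & 16 )) -ne 0 ]; then
        pf_access="instruction-fetch"
    elif [ $(( pf_code & 2 )) -ne 0 ]; then
        pf_access="write"
    else
        pf_access="read"
    fi

    if [ $(( pf_code & 4 )) -ne 0 ]; then
        pf_mode="user"
    else
        pf_mode="supervisor"
    fi

    printf '%s %s access, %s\n' "$pf_mode" "$pf_access" "$pf_cause"
}

decode_pf_error 0x0002   # prints: supervisor write access, not-present page
```

This matches the two `#PF:` lines in the report: a write performed in kernel (supervisor) mode to an address with no page mapped. It says nothing about which subsystem issued the write; that still needs the full trace or a bisection.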
(In reply to cyayon from comment #7)
> Hi,
>
> This night crash again (kernel 6.4.12). I manage to got some logs from
> syslog-ng (no log from journald).
>
> Here are the last logs just before the crash.
>
> http://ix.io/4FTW
>
> It seems to be related to nft.
>
> thanks.

OK, then perform bisection to find the culprit commit that introduced your regression. If you don't know how to do so, see the kernel documentation [1].

[1]: https://docs.kernel.org/admin-guide/bug-bisect.html

(In reply to cyayon from comment #7)
> Hi,
>
> This night crash again (kernel 6.4.12). I manage to got some logs from
> syslog-ng (no log from journald).
>
> Here are the last logs just before the crash.
>
> http://ix.io/4FTW
>
> It seems to be related to nft.
>
> thanks.

Now that v6.6-rc1 has been released, please test it. Since you're about to compile a vanilla kernel, see ArchWiki [1] for instructions.

[1]: https://wiki.archlinux.org/title/Kernel/Traditional_compilation

Hi,

Yesterday, I opened a ticket with netfilter (via email). Pablo N. told me the issue comes from commit bdace3b1a51887211d3e49417a18fdbd315a313b. He also asked me to test 6.4.15 instead of 6.5.2, which is a little behind for this issue.

I don't know about 6.6rc1 vs 6.4.15.

I am currently testing and will keep you informed here.

Thanks

On 11/09/2023 19:54, bugzilla-daemon@kernel.org wrote:
> I don't know about 6.6rc1 vs 6.4.15.
>

The former is a release candidate version, tagged from Linus's tree (aka mainline). It is primarily used for testing before the official release is made. The latter is a stable kernel with fixes backported from mainline. It is the recommended kernel to run in production. For more information, see [1].

Thanks.

[1]: https://kernel.org/category/releases.html

I would like to say that I didn't know if 6.6rc1 include revert bdace3b1a51887211d3e49417a18fdbd315a313b (like 6.4.15).
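Since the mainline/stable distinction caused some confusion in this thread, here is a small sketch that classifies a version string using only the naming rule stated above: v6.x and v6.x-rcy come from Linus's tree, v6.x.y from the stable tree. The function name `classify_kernel_version` is hypothetical, and distribution suffixes such as `-arch1-1` are deliberately ignored.

```shell
# Hypothetical helper: classify a kernel version string by its shape.
#   v6.x.y    -> stable release (stable tree)
#   v6.x-rcy  -> mainline release candidate (Linus's tree)
#   v6.x      -> mainline release (Linus's tree)
classify_kernel_version() {
    case "${1#v}" in                       # strip an optional leading "v"
        [0-9]*.[0-9]*.[0-9]*)  echo "stable release" ;;
        [0-9]*.[0-9]*rc[0-9]*) echo "mainline release candidate" ;;
        [0-9]*.[0-9]*)         echo "mainline release" ;;
        *)                     echo "unknown" ;;
    esac
}

classify_kernel_version 6.4.15     # prints: stable release
classify_kernel_version v6.6-rc1   # prints: mainline release candidate
```

By this rule, both 6.4.12 (as shipped by Arch Linux) and 6.4.15 are stable releases, while v6.6-rc1 is a mainline release candidate.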
On 11/09/2023 20:05, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=217884
>
> --- Comment #12 from cyayon@nbux.org ---
> I would like to say that I didn't know if 6.6rc1 include revert
> bdace3b1a51887211d3e49417a18fdbd315a313b (like 6.4.15).
>

It should already include 26b5a5712eb85e ("netfilter: nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain") as the fix.

thanks

Hi,

No crash for 3 days with 6.4.15. I will wait a few more days, but it should be OK. Many thanks!

I asked Pablo N. whether the patch / revert has been merged into 6.5.3, and I am waiting for his answer…

Oh, no... This morning it crashed again :(.

Here is the journald log: http://ix.io/4Gvd (6.4.15)
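The question of whether a given tag already contains a particular fix can be checked locally with git. The sketch below assumes a local clone of the kernel tree that has the relevant tags; the helper name `ref_contains_commit` and the path `~/src/linux` are hypothetical.

```shell
# Hypothetical helper: succeed (exit 0) if COMMIT is an ancestor of REF
# in the git repository at REPO, i.e. REF already contains COMMIT.
# Usage: ref_contains_commit REPO COMMIT REF
ref_contains_commit() {
    git -C "$1" merge-base --is-ancestor "$2" "$3"
}

# Example (assumes a kernel clone at ~/src/linux; the path is an assumption):
#   if ref_contains_commit ~/src/linux 26b5a5712eb85e v6.6-rc1; then
#       echo "v6.6-rc1 contains the fix"
#   fi
```

Note that this ancestry check only works for mainline tags. Fixes backported to stable releases are applied as new commits with different IDs, so for a stable tag it is usually more reliable to search the commit messages in the range of interest for the upstream ID, for example with `git log --oneline --grep=<upstream-id> v6.5.2..v6.5.3`.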