Bug 216026
Summary: | Fails to compile using gcc 12.1 under Ubuntu 22.04 | ||
---|---|---|---|
Product: | Virtualization | Reporter: | Robert Dinse (nanook) |
Component: | kvm | Assignee: | virtualization_kvm |
Status: | RESOLVED CODE_FIX | ||
Severity: | high | CC: | alexander.warth, seanjc |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 5.18 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | The .config tried to build against, did a make mrproper first. |
This is expected, please check this: https://bugzilla.kernel.org/show_bug.cgi?id=216020#c1

The fact that this was released from a release candidate into mainstream before fixing major compile errors is just screwed up. What is this, Microsoft?

Some of the GCC-12 errors, including this one[*], are likely GCC bugs. The KVM code has existed for many, many years, i.e. this wasn't something introduced in v5.18. I am working on a small series to guard against KVM bugs in this area, which will in theory squash this warning, but unless someone can prove that an out-of-bounds access really truly is possible, I doubt any "fix" for this will be backported to already-released kernels. Your best bet is to build with CONFIG_WERROR=n and CONFIG_KVM_WERROR=n so that GCC-12's zealotry doesn't break the build.

[*] https://lore.kernel.org/all/YofQlBrlx18J7h9Y@google.com

I compiled this:

Linux nanook 5.17.9 #1 SMP PREEMPT Wed May 18 14:16:39 PDT 2022 x86_64 x86_64 x86_64 GNU/Linux

On the exact same machine, with the exact same compiler and the exact same configuration values (the new additions I left at their defaults), in the exact same compiler environment, it compiled without errors, so something DID change in the code.

Ah, commit e6148767825c ("Makefile: Enable -Warray-bounds") removed "-Wno-array-bounds", which is why I couldn't find a reference to array-bounds in v5.18 and later. So yeah, v5.18 broke a bunch of code :-/

The patches that Sean Christopherson provided to me via e-mail did allow the 5.18 kernel to compile with gcc 12.1 without errors. In an e-mail conversation with Sean Christopherson and other developers about these patches, I erroneously stated that networking wasn't working in guests. That was a mistake on my part: networking broke because I installed some Unifi software for a new router, which broke routing for my virtual machines because it was competing for the same local non-routable subnet.
So that was NOT a kernel issue.

Got the same error. Is the patch somewhere downloadable?

Forgot to mention: same circumstances, GCC 12.1, Ubuntu 22.04 (PopOS).

I unfortunately had to re-install Linux on my box between the time the patches were installed and my last backup (which I do weekly) so lost them. I was planning on just waiting for 18.1 and hope they had incorporated them by then.

Thx for replying. We already have 5.18.1 and AFAIK there is no patch for this. The build error appears for 5.18.1 too.

*** Bug 216056 has been marked as a duplicate of this bug. ***

Well if the developers used modern tools I expect this sort of thing would be resolved before the kernel was ever kicked out. But the e-mails I had received led me to believe the patches would be committed, apparently not. I just hope they don't EOL 5.17 before this is fixed.

...5.18.2 still the same issue.

It builds fine with this patch set v2. Unfortunately it is not included in the 5.18.2 kernel yet: https://patchwork.kernel.org/project/kvm/list/?series=645409

5.18.3 still fails to compile with the same error. Would be wonderful if this were fixed before you EOL 5.17.

Yep, TBH I don't understand why the patch has not been merged yet. Meanwhile other patches of Sean Christopherson have been merged already.

Well again, if people stayed current with their development tools this never would have happened as they would have seen and fixed this as soon as they added the gcc flag to the Makefile. This sort of thing is why I keep my own development tools up to date, which unfortunately causes issues when others do not.

*** Bug 216137 has been marked as a duplicate of this bug. ***

Still broken in 5.18.4 AND NOW YOU HAVE EOL'd 5.17 without 5.18 WORKING, NOT OKAY!

(In reply to Artem S. Tashkinov from comment #19)
> *** Bug 216137 has been marked as a duplicate of this bug. ***

I would not have created this duplicate if the search function in bugzilla worked properly.
But I tried advanced search and searched for bugs I created and it returned zero bugs.

Tried to compile 5.18.5, STILL BROKEN. Same Error.

This is from 5.18.5:

In function ‘reg_read’,
    inlined from ‘reg_rmw’ at arch/x86/kvm/emulate.c:266:2:
arch/x86/kvm/emulate.c:254:27: error: array subscript 32 is above array bounds of ‘long unsigned int[17]’ [-Werror=array-bounds]
  254 |         return ctxt->_regs[nr];
      |                ~~~~~~~~~~~^~~~
In file included from arch/x86/kvm/emulate.c:23:
arch/x86/kvm/kvm_emulate.h: In function ‘reg_rmw’:
arch/x86/kvm/kvm_emulate.h:366:23: note: while referencing ‘_regs’
  366 |         unsigned long _regs[NR_VCPU_REGS];
      |                       ^~~~~
cc1: all warnings being treated as errors
make[5]: *** [scripts/Makefile.build:288: arch/x86/kvm/emulate.o] Error 1
make[5]: *** Waiting for unfinished jobs....
make[4]: *** [scripts/Makefile.build:550: arch/x86/kvm] Error 2
make[3]: *** [Makefile:1834: arch/x86] Error 2
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [debian/rules:7: build-arch] Error 2
dpkg-buildpackage: error: debian/rules binary subprocess returned exit status 2
make[1]: *** [scripts/Makefile.package:83: bindeb-pkg] Error 2
make: *** [Makefile:1542: bindeb-pkg] Error 2

(In reply to Robert Dinse from comment #22)
> Tried to compile 5.18.5, STILL BROKEN. Same Error.

Developers are well aware, there's no need to report the same issues over and over again, if anything you're making them less willing to resolve these issues sooner rather than later.

I am sorry you feel this way but this was a new version of the Linux kernel, so I added a comment for the new version to make sure people were apprised. If someone else had posted this first I would not have.

I'm sorry the development folks get in pissing wars, if for example Linus and Nvidia hadn't had their unfriendly stances perhaps I wouldn't have had to switch to Intel graphics.
There is unfortunately no viable alternative to Linux for me so I am doing what I can, and this seems to be the only avenue available to me to make known issues that are significant impairments, and I don't know what can be more significant than being unable to compile. And we're already WAY past "sooner".

Not resolved.

I tried compiling this with Clang 15, it does not work either. I get an error that says "error: write on a pipe with no reader", but the clang website says this is a clang problem, apparently a race condition, so it will not compile with the current version of either compiler.

(In reply to Artem S. Tashkinov from comment #24)
> (In reply to Robert Dinse from comment #22)
> > Tried to compile 5.18.5, STILL BROKEN. Same Error.
>
> Developers are well aware, there's no need to report the same issues over
> and over again, if anything you're making them less willing to resolve these
> issues sooner rather than later.

It's less about the devs and more about reporting to people having this bug. That's also the reason why I have posted it and I'm glad Robert did inform me. It might annoy the devs. But it helps the community on the web searching for the same bug.

1. End users are not meant to compile the kernel. You have a distro kernel for that.
2. If you are willing to compile the kernel you must have the means and experience to resolve issues.
3. The status of the bug doesn't affect how fast it's gonna get resolved or whether Google finds it.

1. You are OT but - Any user who is able to compile the kernel is an end user. There is no exclusive contract between the Linux kernel and the distro maintainers. That's open source. Any code user is an end user.
2. I'm compiling the kernel and I'm able to solve it by applying the patch given earlier in the thread. What we are asking is to mainline this very patch, such that other people who don't know of this thread in the nerd corners of the web are able to benefit from the patch. I guess that's why patches are used to solve bugs.
If the patch is working but not stable it would also be nice to have that mentioned here.
3. I don't care when it is resolved or not. It's a bug ..there is already a patch, why is it not mainlined, that's the only question.

Instead of writing all the 1-3 and putting it to resolved it would be more fruitful for everyone if you or anyone else briefly wrote a one-liner like this:

Hey guys patch works but is not fully tested yet...could take a few weeks
Or hey guys patch seems to work but breaks xy that's why it's not mainlined
or hey guys patch works but problem is not really solved yet
or hey guys patch seems to work but only masks a compiler error.
...

This is how a goal oriented user <-> dev <-> maintainer relation works. At the end we all want a well working kernel.

(In reply to Artem S. Tashkinov from comment #30)
> 1. End users are not meant to compile the kernel. You have a distro kernel
> for that.
> 2. If you are willing to compile the kernel you must have the means and
> experience to resolve issues.
> 3. The status of the bug doesn't affect how fast it's gonna get resolved or
> whether Google finds it.

Might as well just get rid of Bugzilla then huh? What a putz. There is no good reason on God's green earth that end users should not be able to compile their own kernels, I've been doing so since before there WAS a distribution. Distros often do not configure kernels the way end users need them, nor do they often maintain them anywhere near close to current. Both of these are the reason I compile kernels rather than use the distros.

Let's be honestly brutal here for a second, shall we?

1. Do you pay for the Linux kernel or have any sort of contract/agreement with Linux kernel developers? Absolutely no. There's a license attached to the kernel, please read it carefully and thoroughly.
2. Do Linux kernel developers owe you anything? Absolutely no.
What makes you believe someone should suddenly give up on the pressing tasks they are being paid for (and the failure to complete those tasks could also mean a lost job) and pay attention to a random self-righteous guy who's screaming his lungs out?

You're completely lost. What's more you're _actively spamming_ a mailing list with dozens if not hundreds of subscribers. You don't add any new info, you just demand, demand and demand. Not only that, you've already managed to create two duplicates as if it will suddenly make the people responsible for this kernel subsystem give up on their primary job and rush to meet your demands.

Please stop. This message does _not_ need a reply. Keep this bug open for Christ's sake if it makes you happy.

I'm all for honesty, not so much for brutality, but I do realize that many people who work with computers choose to do so because they're not so good with humans. That said, your position would be appropriate if Linux were still Linus's hobby project. The reality is that much of the real world's computation depends upon it: the vast majority of supercomputers run Linux, Android phones run Linux, most of Google's, Amazon's, and even Microsoft's services run Linux, so it has grown beyond hobby status. Self driving cars depend upon it, etc. In short, it is critical to the technological world and technology is critical to 8 billion people surviving on this ball we call Earth. So this requires a bit more professionalism than a hobby project.

It isn't my intent to spam a mailing list, it is my intent to make the developers aware of a SERIOUS bug, and when a kernel can't be compiled with the most recent version of EITHER compiler chain, that is serious, especially when you've End-of-Life'd the previous version, which means it is not getting security fixes.
I've been involved in some development projects myself, back in the days of 8-bit computers I even wrote a very specialized programming language, but I've never worked with a team of more than half a dozen individuals, so I can't even imagine what it is like for Linus to coordinate several thousand developers. That said, personality issues don't help, best to stick with technical issues, tackle them and move on.

This is fixed in 5.19-rc3. The fix is trivial:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kvm/Makefile

ccflags-y += -I $(srctree)/arch/x86/kvm
ccflags-$(CONFIG_KVM_WERROR) += -Werror

Thx Artem. As said before it is basically already fixed by this patch more or less.
https://patchwork.kernel.org/project/kvm/patch/20220526210817.3428868-3-seanjc@google.com/
And great that we have reached tech focused rational ground again.

5.19-rc3 compiles fine for me with gcc, still a problem with llvm using multiple cores but that's a compiler race issue not a kernel issue. Do wish the patch were applied to mainstream as well, but am satisfied knowing this will be fixed in a few weeks when 5.19 becomes mainstream.

5.18.6 compiled successfully using GCC 12.1 so this issue is fixed in the 5.18 line as well. Much thanks to all involved.

Me too. Thx for the fix and efforts.

Well now I'm wondering if this was really a proper fix of the code or just a fix to make the error invisible. With 5.17.14, a Windows kvm-qemu guest works mostly properly (there is an issue with copy host cpu in virtual-manager, where it takes a one socket, 8 core, one thread per core CPU and maps it to 8 sockets with 1 core each, but I can get around that by manually defining the layout). The virtual machine is using i-915/UHD630 GPU virtualization and pass through. Under the 5.17.14 kernel this works properly.
Under the 5.18.6 kernel it works right after a fresh boot but within a few hours it loses its brains: I no longer have a cursor, and I can't get the host to communicate properly with the guest anymore. Should I re-open this bug or file a new one with that description?

Please open a new bug, there is essentially zero chance this is related to the issues you are seeing. |
Created attachment 301039
The .config tried to build against, did a make mrproper first.

  CC      arch/x86/kvm/../../../virt/kvm/kvm_main.o
  CC      arch/x86/kvm/../../../virt/kvm/eventfd.o
  CC      arch/x86/kvm/../../../virt/kvm/binary_stats.o
  CC      arch/x86/kvm/../../../virt/kvm/vfio.o
  CC      arch/x86/kvm/../../../virt/kvm/coalesced_mmio.o
  CC      arch/x86/kvm/../../../virt/kvm/async_pf.o
  CC      arch/x86/kvm/../../../virt/kvm/irqchip.o
  CC      arch/x86/kvm/../../../virt/kvm/dirty_ring.o
  CC      arch/x86/kvm/../../../virt/kvm/pfncache.o
  CC      arch/x86/kvm/x86.o
  CC      arch/x86/kvm/emulate.o
In function ‘reg_read’,
    inlined from ‘reg_rmw’ at arch/x86/kvm/emulate.c:266:2:
arch/x86/kvm/emulate.c:254:27: error: array subscript 32 is above array bounds of ‘long unsigned int[17]’ [-Werror=array-bounds]
  254 |         return ctxt->_regs[nr];
      |                ~~~~~~~~~~~^~~~
In file included from arch/x86/kvm/emulate.c:23:
arch/x86/kvm/kvm_emulate.h: In function ‘reg_rmw’:
arch/x86/kvm/kvm_emulate.h:366:23: note: while referencing ‘_regs’
  366 |         unsigned long _regs[NR_VCPU_REGS];
      |                       ^~~~~
cc1: all warnings being treated as errors
make[5]: *** [scripts/Makefile.build:288: arch/x86/kvm/emulate.o] Error 1
make[4]: *** [scripts/Makefile.build:550: arch/x86/kvm] Error 2
make[3]: *** [Makefile:1834: arch/x86] Error 2
make[2]: *** [debian/rules:7: build-arch] Error 2
dpkg-buildpackage: error: debian/rules binary subprocess returned exit status 2
make[1]: *** [scripts/Makefile.package:83: bindeb-pkg] Error 2
make: *** [Makefile:1542: bindeb-pkg] Error 2
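
For readers unfamiliar with the diagnostic: GCC is saying it can see a code path where the index passed to reg_read could be 32 while the array holds only 17 elements. The following is a minimal standalone sketch of the general technique of guarding such an index before use. It is NOT the patch that was merged and not kernel code; the names demo_ctxt, demo_reg_read, demo_reg_write and DEMO_NR_REGS are hypothetical, and only the array length 17 is taken from the compiler output in this report.

```c
#include <assert.h>

/*
 * Hypothetical stand-ins, not the kernel's definitions. The length 17
 * comes from the diagnostic ("long unsigned int[17]").
 */
#define DEMO_NR_REGS 17

struct demo_ctxt {
	unsigned long regs[DEMO_NR_REGS];
};

/* Read a register, returning 0 instead of indexing past the array. */
static unsigned long demo_reg_read(const struct demo_ctxt *ctxt, unsigned int nr)
{
	if (nr >= DEMO_NR_REGS)
		return 0;
	return ctxt->regs[nr];
}

/* Write a register; returns 0 on success, -1 if nr is out of range. */
static int demo_reg_write(struct demo_ctxt *ctxt, unsigned int nr, unsigned long val)
{
	if (nr >= DEMO_NR_REGS)
		return -1;
	ctxt->regs[nr] = val;
	return 0;
}

/* Small self-check covering the in-range and out-of-range paths. */
static void demo_self_check(void)
{
	struct demo_ctxt ctxt = { { 0 } };

	assert(demo_reg_write(&ctxt, 3, 42) == 0);
	assert(demo_reg_read(&ctxt, 3) == 42);
	assert(demo_reg_write(&ctxt, 32, 7) == -1); /* index from the report */
	assert(demo_reg_read(&ctxt, 32) == 0);
}
```

Calling demo_self_check() from a main() exercises both paths. An explicit range check like this also gives the compiler a visible bound on the index, which may be enough to quiet an array-bounds warning, though whether it does depends on the compiler version and the surrounding code.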