Bug 216026 - Fails to compile using gcc 12.1 under Ubuntu 22.04
Summary: Fails to compile using gcc 12.1 under Ubuntu 22.04
Status: RESOLVED CODE_FIX
Alias: None
Product: Virtualization
Classification: Unclassified
Component: kvm
Hardware: All Linux
Importance: P1 high
Assignee: virtualization_kvm
Duplicates: 216056 216137
Reported: 2022-05-25 07:56 UTC by Robert Dinse
Modified: 2022-06-27 14:21 UTC (History)
Kernel Version: 5.18
Regression: No

Attachments
The .config tried to build against, did a make mrproper first. (260.92 KB, text/plain)
2022-05-25 07:56 UTC, Robert Dinse

Description Robert Dinse 2022-05-25 07:56:58 UTC
Created attachment 301039
The .config tried to build against, did a make mrproper first.

CC      arch/x86/kvm/../../../virt/kvm/kvm_main.o
  CC      arch/x86/kvm/../../../virt/kvm/eventfd.o
  CC      arch/x86/kvm/../../../virt/kvm/binary_stats.o
  CC      arch/x86/kvm/../../../virt/kvm/vfio.o
  CC      arch/x86/kvm/../../../virt/kvm/coalesced_mmio.o
  CC      arch/x86/kvm/../../../virt/kvm/async_pf.o
  CC      arch/x86/kvm/../../../virt/kvm/irqchip.o
  CC      arch/x86/kvm/../../../virt/kvm/dirty_ring.o
  CC      arch/x86/kvm/../../../virt/kvm/pfncache.o
  CC      arch/x86/kvm/x86.o
  CC      arch/x86/kvm/emulate.o
In function ‘reg_read’,
    inlined from ‘reg_rmw’ at arch/x86/kvm/emulate.c:266:2:
arch/x86/kvm/emulate.c:254:27: error: array subscript 32 is above array bounds of ‘long unsigned int[17]’ [-Werror=array-bounds]
  254 |         return ctxt->_regs[nr];
      |                ~~~~~~~~~~~^~~~
In file included from arch/x86/kvm/emulate.c:23:
arch/x86/kvm/kvm_emulate.h: In function ‘reg_rmw’:
arch/x86/kvm/kvm_emulate.h:366:23: note: while referencing ‘_regs’
  366 |         unsigned long _regs[NR_VCPU_REGS];
      |                       ^~~~~
cc1: all warnings being treated as errors
make[5]: *** [scripts/Makefile.build:288: arch/x86/kvm/emulate.o] Error 1
make[4]: *** [scripts/Makefile.build:550: arch/x86/kvm] Error 2
make[3]: *** [Makefile:1834: arch/x86] Error 2
make[2]: *** [debian/rules:7: build-arch] Error 2
dpkg-buildpackage: error: debian/rules binary subprocess returned exit status 2
make[1]: *** [scripts/Makefile.package:83: bindeb-pkg] Error 2
make: *** [Makefile:1542: bindeb-pkg] Error 2
Comment 1 Artem S. Tashkinov 2022-05-25 08:17:23 UTC
This is expected, please check this:

https://bugzilla.kernel.org/show_bug.cgi?id=216020#c1
Comment 2 Robert Dinse 2022-05-25 08:34:34 UTC
The fact that this went from a release candidate to a mainline release before major compile errors were fixed is just screwed up.  What is this, Microsoft?
Comment 3 Sean Christopherson 2022-05-25 15:59:01 UTC
Some of the GCC-12 errors, including this one[*], are likely GCC bugs.  The KVM code has existed for many, many years, i.e. this wasn't something introduced in v5.18.  I am working on a small series to guard against KVM bugs in this area, which will in theory squash this warning, but unless someone can prove that an out-of-bounds access really truly is possible, I doubt any "fix" for this will be backported to already-released kernels.

Your best bet is to build with CONFIG_WERROR=n and CONFIG_KVM_WERROR=n so that GCC-12's zealotry doesn't break the build.

[*] https://lore.kernel.org/all/YofQlBrlx18J7h9Y@google.com
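
For readers wondering what a guard against out-of-range register indices might look like, here is a minimal standalone sketch in C. It is not the actual KVM series: the struct emu_ctxt, the constant NR_GPRS (16, chosen because it is a power of two), the counter bad_index_seen, and the helper clamp_reg are all hypothetical names made up for illustration; only reg_read, reg_rmw, and _regs echo the log above. The idea is to mask the index so that every array access is provably in bounds, which should also leave the compiler nothing to warn about.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdio.h>

/* Hypothetical register count; a power of two so that masking works. */
#define NR_GPRS 16u

static_assert((NR_GPRS & (NR_GPRS - 1u)) == 0,
	      "the mask trick needs a power-of-two array size");

/* Hypothetical stand-in for the emulator context; NOT the kernel struct. */
struct emu_ctxt {
	unsigned long _regs[NR_GPRS];
	unsigned int bad_index_seen;	/* counts out-of-range requests */
};

/* Force nr into [0, NR_GPRS) and record any out-of-range request. */
static unsigned int clamp_reg(struct emu_ctxt *ctxt, unsigned int nr)
{
	if (nr >= NR_GPRS) {
		ctxt->bad_index_seen++;
		fprintf(stderr, "bad register index %u\n", nr);
	}
	return nr & (NR_GPRS - 1u);
}

static unsigned long reg_read(struct emu_ctxt *ctxt, unsigned int nr)
{
	return ctxt->_regs[clamp_reg(ctxt, nr)];
}

static unsigned long *reg_write(struct emu_ctxt *ctxt, unsigned int nr)
{
	return &ctxt->_regs[clamp_reg(ctxt, nr)];
}

/* Read-modify-write: read first, then hand back a writable pointer. */
static unsigned long *reg_rmw(struct emu_ctxt *ctxt, unsigned int nr)
{
	(void)reg_read(ctxt, nr);
	return reg_write(ctxt, nr);
}
```

With this shape, a caller passing 32 ends up touching element 0 (32 & 15) and the bad request is counted rather than reading past the end of the array. Whether masking, rejecting, or warning is the right policy is a design choice this sketch does not settle.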
Comment 4 Robert Dinse 2022-05-25 19:40:55 UTC
I compiled this:

Linux nanook 5.17.9 #1 SMP PREEMPT Wed May 18 14:16:39 PDT 2022 x86_64 x86_64 x86_64 GNU/Linux

On the exact same machine, with the exact same compiler and compiler environment, and with the same configuration values (save for the new options, which I left at their defaults), it compiled without errors, so something DID change in the code.
Comment 5 Sean Christopherson 2022-05-25 20:51:04 UTC
Ah, commit e6148767825c ("Makefile: Enable -Warray-bounds") removed "-Wno-array-bounds", which is why I couldn't find a reference to array-bounds in v5.18 and later.  So yeah, v5.18 broke a bunch of code :-/
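
As a side note for anyone hitting the same class of diagnostic in their own code: besides a build-wide flag such as -Wno-array-bounds, GCC also accepts diagnostic pragmas that silence a warning for a single region. The sketch below is a generic illustration and not what the kernel did; regs, NR_REGS, and read_reg are hypothetical names, and the caller contract stated in the comment is an assumption of the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register file; 17 entries only to echo the log above. */
#define NR_REGS 17

static unsigned long regs[NR_REGS];

/*
 * Caller contract (an assumption of this sketch): nr < NR_REGS.
 * The assert checks that contract in debug builds. With NDEBUG it
 * compiles away, and the pragmas keep -Warray-bounds quiet for this
 * one access only, leaving the warning active everywhere else.
 */
unsigned long read_reg(size_t nr)
{
	unsigned long v;

	assert(nr < NR_REGS);
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Warray-bounds"
	v = regs[nr];
#pragma GCC diagnostic pop
	return v;
}
```

Silencing a warning this way only hides it; it does nothing to prove the access is safe, so it is best reserved for cases where the bound is guaranteed by other means.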
Comment 6 Robert Dinse 2022-05-26 01:48:00 UTC
The patches that Sean Christopherson provided to me via e-mail did allow the 5.18 kernel to compile with gcc 12.1 without errors.
Comment 7 Robert Dinse 2022-05-27 00:52:38 UTC
In an e-mail conversation with Sean Christopherson and other developers about these patches, I erroneously stated that networking wasn't working in guests.  That was a mistake on my part: networking broke because I installed some Unifi software for a new router, and it broke routing for my virtual machines because it was competing for the same local non-routable subnet.  So that was NOT a kernel issue.
Comment 8 Alexander Warth 2022-05-30 19:47:30 UTC
Got the same error. Is the patch somewhere downloadable?
Comment 9 Alexander Warth 2022-05-30 19:57:41 UTC
Forgot to mention: same circumstances, GCC 12.1 on Ubuntu 22.04 (PopOS).
Comment 10 Robert Dinse 2022-05-31 02:10:32 UTC
I unfortunately had to re-install Linux on my box between the time the patches were installed and my last backup (which I do weekly), so I lost them.  I was planning on just waiting for 5.18.1 and hoping they had incorporated them by then.
Comment 11 Alexander Warth 2022-05-31 10:42:56 UTC
Thx for replying. We already have 5.18.1 and AFAIK there is no patch for this. The build error appears for 5.18.1 too.
Comment 12 Artem S. Tashkinov 2022-06-01 06:56:25 UTC
*** Bug 216056 has been marked as a duplicate of this bug. ***
Comment 13 Robert Dinse 2022-06-01 07:04:06 UTC
Well if the developers used modern tools I expect this sort of thing would be resolved before the kernel was ever kicked out.  But the e-mails I had received led me to believe the patches would be committed, apparently not.

I just hope they don't EOL 5.17 before this is fixed.
Comment 14 Alexander Warth 2022-06-07 07:41:00 UTC
...5.18.2 still the same issue.
Comment 15 Alexander Warth 2022-06-07 08:54:45 UTC
It builds fine with this patch set (v2). Unfortunately it is not included in the 5.18.2 kernel yet.

https://patchwork.kernel.org/project/kvm/list/?series=645409
Comment 16 Robert Dinse 2022-06-10 22:48:19 UTC
5.18.3 still fails to compile with the same error.  Would be wonderful if this were fixed before you EOL 5.17
Comment 17 Alexander Warth 2022-06-13 15:16:09 UTC
Yep, TBH I don't understand why the patch has not been merged yet. Meanwhile, other patches from Sean Christopherson have already been merged.
Comment 18 Robert Dinse 2022-06-14 10:56:06 UTC
Well again, if people stayed current with their development tools this never would have happened as they would have seen and fixed this as soon as they added the gcc flag to the Makefile.

This sort of thing is why I keep my own development tools up to date, which unfortunately causes issues when others do not.
Comment 19 Artem S. Tashkinov 2022-06-15 14:05:46 UTC
*** Bug 216137 has been marked as a duplicate of this bug. ***
Comment 20 Robert Dinse 2022-06-15 22:11:40 UTC
Still broken in 5.18.4 AND NOW YOU HAVE EOL'd 5.17 without 5.18 WORKING, NOT OKAY!
Comment 21 Robert Dinse 2022-06-15 22:17:42 UTC
(In reply to Artem S. Tashkinov from comment #19)
> *** Bug 216137 has been marked as a duplicate of this bug. ***

     I would not have created this duplicate if the search function in bugzilla worked properly.  But I tried advanced search and searched for bugs I created and it returned zero bugs.
Comment 22 Robert Dinse 2022-06-16 21:22:22 UTC
Tried to compile 5.18.5, STILL BROKEN.  Same Error.
Comment 23 Robert Dinse 2022-06-16 21:25:45 UTC
This is from 5.18.5:

In function ‘reg_read’,
    inlined from ‘reg_rmw’ at arch/x86/kvm/emulate.c:266:2:
arch/x86/kvm/emulate.c:254:27: error: array subscript 32 is above array bounds of ‘long unsigned int[17]’ [-Werror=array-bounds]
  254 |         return ctxt->_regs[nr];
      |                ~~~~~~~~~~~^~~~
In file included from arch/x86/kvm/emulate.c:23:
arch/x86/kvm/kvm_emulate.h: In function ‘reg_rmw’:
arch/x86/kvm/kvm_emulate.h:366:23: note: while referencing ‘_regs’
  366 |         unsigned long _regs[NR_VCPU_REGS];
      |                       ^~~~~
cc1: all warnings being treated as errors
make[5]: *** [scripts/Makefile.build:288: arch/x86/kvm/emulate.o] Error 1
make[5]: *** Waiting for unfinished jobs....
make[4]: *** [scripts/Makefile.build:550: arch/x86/kvm] Error 2
make[3]: *** [Makefile:1834: arch/x86] Error 2
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [debian/rules:7: build-arch] Error 2
dpkg-buildpackage: error: debian/rules binary subprocess returned exit status 2
make[1]: *** [scripts/Makefile.package:83: bindeb-pkg] Error 2
make: *** [Makefile:1542: bindeb-pkg] Error 2
Comment 24 Artem S. Tashkinov 2022-06-18 17:34:54 UTC
(In reply to Robert Dinse from comment #22)
> Tried to compile 5.18.5, STILL BROKEN.  Same Error.

Developers are well aware, there's no need to report the same issues over and over again, if anything you're making them less willing to resolve these issues sooner rather than later.
Comment 25 Robert Dinse 2022-06-19 00:29:58 UTC
I am sorry you feel this way, but this was a new version of the Linux kernel, so I added a comment for the new version to make sure people were apprised.  If someone else had posted this first, I would not have.

I'm sorry the development folks get in pissing wars, if for example Linus and Nvidia hadn't had their unfriendly stances perhaps I wouldn't have had to switch to Intel graphics.

There is unfortunately no viable alternative to Linux for me, so I am doing what I can, and this seems to be the only avenue available to me for making known issues that are significant impairments.  I don't know what can be more significant than being unable to compile.
Comment 26 Robert Dinse 2022-06-19 00:30:28 UTC
And we're already WAY past "sooner".
Comment 27 Robert Dinse 2022-06-19 00:31:21 UTC
Not resolved.
Comment 28 Robert Dinse 2022-06-20 04:35:07 UTC
I tried compiling this with Clang 15, and it does not work either.  I get an error that says "error: write on a pipe with no reader", but the clang website says this is a clang problem, apparently a race condition.  So it will not compile with the current version of either compiler.
Comment 29 Alexander Warth 2022-06-20 07:16:08 UTC
(In reply to Artem S. Tashkinov from comment #24)
> (In reply to Robert Dinse from comment #22)
> > Tried to compile 5.18.5, STILL BROKEN.  Same Error.
> 
> Developers are well aware, there's no need to report the same issues over
> and over again, if anything you're making them less willing to resolve these
> issues sooner rather than later.

It's less about the devs and more about informing people who have this bug. That's also the reason why I posted, and I'm glad Robert informed me. It might annoy the devs, but it helps the community on the web searching for the same bug.
Comment 30 Artem S. Tashkinov 2022-06-20 13:39:52 UTC
1. End users are not meant to compile the kernel. You have a distro kernel for that.
2. If you are willing to compile the kernel you must have the means and experience to resolve issues.
3. The status of the bug doesn't affect how fast it's gonna get resolved or whether Google finds it.
Comment 31 Alexander Warth 2022-06-20 14:01:30 UTC
1. You are OT, but any user who is able to compile the kernel is an end user. There is no exclusive contract between the Linux kernel and the distro maintainers. That's open source. Any code user is an end user.

2. I'm compiling the kernel and I'm able to solve it by applying the patch given earlier in the thread.

What we are asking is to mainline this very patch, so that other people who don't know of this thread in the nerd corners of the web are able to benefit from it. I guess that's why patches are used to solve bugs.

If the patch is working but not stable, it would also be nice for that to be mentioned here.

3. I don't care when it is resolved. It's a bug and there is already a patch; why it is not mainlined is the only question.

Instead of writing all of 1-3 and setting it to resolved, it would be more fruitful for everyone if you or anyone else briefly wrote a one-liner like this:

Hey guys, the patch works but is not fully tested yet... could take a few weeks
or hey guys, the patch seems to work but breaks xy, that's why it's not mainlined
or hey guys, the patch works but the problem is not really solved yet
or hey guys, the patch seems to work but only masks a compiler error.
... 

This is how a goal oriented user <-> dev <-> maintainer relation works.
At the end we all want a well working kernel.
Comment 32 Robert Dinse 2022-06-20 14:33:11 UTC
(In reply to Artem S. Tashkinov from comment #30)
> 1. End users are not meant to compile the kernel. You have a distro kernel
> for that.
> 2. If you are willing to compile the kernel you must have the means and
> experience to resolve issues.
> 3. The status of the bug doesn't affect how fast it's gonna get resolved or
> whether Google finds it.

Might as well just get rid of Bugzilla then huh?
What a putz.

There is no good reason on God's green earth that end users should not be able to compile their own kernels, I've been doing so since before there WAS a distribution.  Distros often do not configure kernels the way end users need them, nor do they often maintain them anywhere near close to current.  Both of these are the reason I compile kernels rather than use the distros.
Comment 33 Artem S. Tashkinov 2022-06-20 15:25:50 UTC
Let's be honestly brutal here for a second, shall we?

1. Do you pay for the Linux kernel or have any sort of contract/agreement with Linux kernel developers? Absolutely not. There's a license attached to the kernel, please read it carefully and thoroughly.

2. Do Linux kernel developers owe you anything? Absolutely not.

What makes you believe someone should suddenly give up on the pressing tasks they are being paid for (and the failure to complete those tasks could also mean a lost job) and pay attention to a random self-righteous guy who's screaming his lungs out?

You're completely lost.

What's more you're _actively spamming_ a mailing list with dozens if not hundreds of subscribers.

You don't add any new info, you just demand, demand and demand. Not only that you've already managed to create two duplicates as if it will suddenly make the people responsible for this kernel subsystem give up on their primary job and rush to meet your demands.

Please stop. This message does _not_ need a reply. Keep this bug open for Christ's sake if it makes you happy.
Comment 34 Robert Dinse 2022-06-21 05:46:02 UTC
I'm all for honesty, not so much for brutality, but I do realize that many people who work with computers choose to do so because they're not so good with humans.

That said, your position would be appropriate if Linux were still Linus's hobby project.

The reality is that much of the real world's computation depends upon it: the vast majority of supercomputers run Linux, Android phones run Linux, most of Google's, Amazon's, and even Microsoft's services run Linux, so it has grown beyond hobby status.  Self-driving cars depend upon it, etc.  In short, it is critical to the technological world and technology is critical to 8 billion people surviving on this ball we call Earth.  So this requires a bit more professionalism than a hobby project.

It isn't my intent to spam a mailing list; it is my intent to make the developers aware of a SERIOUS bug, and when a kernel can't be compiled with the most recent version of EITHER compiler chain, that is serious, especially when you've End-of-Life'd the previous version, which means it is no longer getting security fixes.

I've been involved in some development projects myself, back in the days of 8-bit computers I even wrote a very specialized programming language, but I've never worked with a team of more than half a dozen individuals, so I can't even imagine what it is like for Linus to coordinate several thousand developers.

That said, personality issues don't help, best to stick with technical issues, tackle them and move on.
Comment 35 Artem S. Tashkinov 2022-06-21 11:23:09 UTC
This is fixed in 5.19-rc3.

The fix is trivial:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kvm/Makefile

ccflags-y += -I $(srctree)/arch/x86/kvm
ccflags-$(CONFIG_KVM_WERROR) += -Werror
Comment 36 Alexander Warth 2022-06-21 16:05:15 UTC
Thx Artem.

As said before it is basically already fixed by this patch more or less. 

https://patchwork.kernel.org/project/kvm/patch/20220526210817.3428868-3-seanjc@google.com/
Comment 37 Alexander Warth 2022-06-21 16:06:45 UTC
And great that we have reached tech focused rational ground again.
Comment 38 Robert Dinse 2022-06-21 21:45:43 UTC
5.19-rc3 compiles good for me with gcc, still a problem with llvm using multiple cores but that's a compiler race issue not a kernel issue.  Do wish the patch were applied to mainstream as well, but am satisfied knowing this will be fixed in a few weeks when 5.19 becomes mainstream.
Comment 39 Robert Dinse 2022-06-24 00:00:15 UTC
5.18.6 compiled successfully using GCC 12.1 so this issue is fixed in the 5.18 line as well.  Much thanks to all involved.
Comment 40 Alexander Warth 2022-06-26 19:04:31 UTC
me too. Thx for the fix and efforts.
Comment 41 Robert Dinse 2022-06-27 07:41:24 UTC
Well now I'm wondering if this was really a proper fix of the code or just a fix to make the error invisible.

With 5.17.14, a Windows kvm-qemu guest works mostly properly (there is an issue
with copy host cpu in virtual-manager, where it takes a one socket 8 core, one
thread / core CPU and maps it to 8 sockets 1 core, but I can get around that by
manually defining the layout).

The virtual machine is using i-915/UHD630 GPU virtualization and pass through.

Under the 5.17.14 kernel this works properly.

Under the 5.18.6 kernel it works right after a fresh boot, but within a few hours it loses its brains: I no longer have a cursor, and I can't get the host to communicate properly with the guest anymore.

Should I re-open this bug or file a new one with that description?
Comment 42 Sean Christopherson 2022-06-27 14:21:17 UTC
Please open a new bug, there is essentially zero chance this is related to the issues you are seeing.
