Created attachment 293191 [details]
Bisect log

IOMMU groups on Intel systems become less separated

Expected:
IOMMU Group 95 07:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
IOMMU Group 96 08:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)

Actual >= v5.9:
IOMMU Group 86 07:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
IOMMU Group 86 08:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)

This prevents VFIO binding of PCI devices for virtual machines.

I performed a bisect on tags v5.8 (good) to v5.9 (bad); however, due to a build break (see bottom), I couldn't narrow it down below these 127 commits, some of which are merges :(

-----Kernel Build Break Begin-------
In file included from ./arch/x86/include/asm/atomic.h:5,
                 from ./include/linux/atomic.h:7,
                 from ./include/linux/llist.h:51,
                 from ./include/linux/irq_work.h:5,
                 from kernel/smp.c:10:
kernel/smp.c: In function 'smp_init':
./include/linux/compiler.h:392:38: error: call to '__compiletime_assert_155' declared with attribute error: BUILD_BUG_ON failed: offsetof(struct task_struct, wake_entry_type) - offsetof(struct task_struct, wake_entry) != offsetof(struct __call_single_data, flags) - offsetof(struct __call_single_data, llist)
  392 |  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
      |                                      ^
./include/linux/compiler.h:373:4: note: in definition of macro '__compiletime_assert'
  373 |    prefix ## suffix();    \
      |    ^~~~~~
./include/linux/compiler.h:392:2: note: in expansion of macro '_compiletime_assert'
  392 |  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
      |  ^~~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
   39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
      |                                     ^~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:50:2: note: in expansion of macro 'BUILD_BUG_ON_MSG'
   50 |  BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
      |  ^~~~~~~~~~~~~~~~
kernel/smp.c:687:2: note: in expansion of macro 'BUILD_BUG_ON'
  687 |  BUILD_BUG_ON(offsetof(struct task_struct, wake_entry_type) - offsetof(struct task_struct, wake_entry) !=
      |  ^~~~~~~~~~~~
  AR      sound/usb/misc/built-in.a
  AR      sound/usb/usx2y/built-in.a
make[1]: *** [scripts/Makefile.build:281: kernel/smp.o] Error 1
make[1]: *** Waiting for unfinished jobs....
-----Kernel Build Break End-------
Cherry-pick this (or the whole series) maybe:
https://github.com/torvalds/linux/commit/8c4890d1c3358fb8023d46e1e554c41d54f02878

Suffering from the same problem here. i5-4570s + ASUS H87-PRO, if that matters.
Can you please explain how it impacts "VFIO binding of PCI devices for virtual machines"? These two NIC devices always sit in the same group.
(In reply to Lu Baolu from comment #2)
> Can you please explain how it impacts "VFIO binding of PCI devices for
> virtual machines"? These two NIC devices always sit in the same group.

Sure. Normally, in the good state, these NICs are in separate IOMMU groups; on and after v5.9, however, they are in the same group.

From above:

< 5.9
IOMMU Group 95
IOMMU Group 96

>= 5.9
IOMMU Group 86
IOMMU Group 86
I guess you/we should paste the output of `tree /sys/kernel/iommu_groups/`, before and after.
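If `tree` is not installed, the same information can be collected with a few lines of Python. This is only a minimal sketch: it assumes the usual sysfs layout, where each group is a numbered directory containing a `devices` subdirectory with one entry per PCI address. The function names `list_iommu_groups` and `format_groups` are made up for this example.

```python
from pathlib import Path


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted device addresses it contains."""
    groups = {}
    root = Path(root)
    if not root.is_dir():
        # IOMMU disabled or sysfs not mounted: report no groups.
        return groups
    for group_dir in root.iterdir():
        devices_dir = group_dir / "devices"
        # Assumed layout: <root>/<group number>/devices/<PCI address>
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(
                entry.name for entry in devices_dir.iterdir()
            )
    return groups


def format_groups(groups):
    """Render one 'IOMMU Group N <address>' line per device, ordered by group."""
    lines = []
    for number in sorted(groups):
        for address in groups[number]:
            lines.append(f"IOMMU Group {number} {address}")
    return "\n".join(lines)
```

Running `print(format_groups(list_iommu_groups()))` on a good and on a bad kernel and saving both outputs gives two listings that can be attached and diffed.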
Created attachment 293441 [details] IOMMU_GOOD The expected state
Created attachment 293443 [details] IOMMU_BAD The bad State
Created attachment 293445 [details] List of pci devices List of PCI devices in the system
I have added a copy of the IOMMU_GROUPS tree to the bug, in both the good and the bad state.

It looks like the issue is the "Intel Corporation C610/X99 series chipset PCI Express Root Port" at 00:1c.

I have a machine that could do the bisect, but my skills with git aren't that great, so if you can tell me how to add 8c4890d1c3358fb8023d46e1e554c41d54f02878 to my bisect I could get closer to the problematic commit. I did try: just cherry-picking that commit causes a lot of conflicts.

Thanks
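One common way to carry a build fix through a bisect is to apply it temporarily at each step, build and test, then drop it again before recording the verdict; steps where the fix does not apply cleanly are skipped. The sketch below wraps the standard `git cherry-pick --no-commit`, `git reset --hard` and `git bisect good|bad|skip` commands. The helper names (`prepare_bisect_step`, `finish_bisect_step`) are hypothetical, and whether this particular commit applies cleanly across the v5.8..v5.9 range is not confirmed here (the conflicts reported above suggest it often will not).

```python
import subprocess

# Commit suggested earlier in this thread as the fix for the build break.
FIX = "8c4890d1c3358fb8023d46e1e554c41d54f02878"


def git(*args, run=subprocess.run):
    """Run a git command in the current directory and return its exit status."""
    return run(["git", *args]).returncode


def prepare_bisect_step(fix=FIX, run=subprocess.run):
    """Apply the build fix on top of the commit that git bisect checked out.

    Returns True if the tree is ready to build, or False if the fix
    conflicted and the commit was marked as untestable.
    """
    if git("cherry-pick", "--no-commit", fix, run=run) == 0:
        return True
    # Conflict: discard the partial result and let bisect pick another commit.
    git("reset", "--hard", run=run)
    git("bisect", "skip", run=run)
    return False


def finish_bisect_step(verdict, run=subprocess.run):
    """Drop the temporary fix, then record the verdict ('good' or 'bad')."""
    if verdict not in ("good", "bad"):
        raise ValueError("verdict must be 'good' or 'bad'")
    git("reset", "--hard", run=run)
    return git("bisect", verdict, run=run)
```

The intended loop is: call `prepare_bisect_step()`, build and boot the kernel if it returned True, check the IOMMU groups by hand, then call `finish_bisect_step("good")` or `finish_bisect_step("bad")`. Commits that already build without the fix do not need the helper at all.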
Basically, when it is good, each PCIe slot gets its own IOMMU group, and so does the card plugged into it; when it is bad, all the slots and their cards are in one single group :/
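To make the difference between the two states explicit, the groupings can be compared mechanically. The following is a rough sketch that takes two `{group number: [device addresses]}` mappings (a hypothetical format chosen for this example) and reports device pairs that were in separate groups in the good state but share a group in the bad state.

```python
from itertools import combinations


def group_of(groups):
    """Invert {group: [devices]} into {device: group}."""
    return {dev: number for number, devs in groups.items() for dev in devs}


def newly_merged(good, bad):
    """Return device pairs separated in `good` but sharing a group in `bad`."""
    good_of = group_of(good)
    merged = []
    for devs in bad.values():
        for a, b in combinations(sorted(devs), 2):
            # Only report pairs known in both states that used to be apart.
            if a in good_of and b in good_of and good_of[a] != good_of[b]:
                merged.append((a, b))
    return sorted(merged)
```

With the two NICs from the original report, `newly_merged({95: ["07:00.0"], 96: ["08:00.0"]}, {86: ["07:00.0", "08:00.0"]})` returns `[("07:00.0", "08:00.0")]`.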
Ping. It should really be fixed before 5.9 becomes a longterm version.
As of 5.9.12 this seems to have been resolved on my machine, so I'm marking this as resolved for now.
Indeed.