Bug 209857 - IOMMU Regression for VFIO binding v5.9
Summary: IOMMU Regression for VFIO binding v5.9
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: IOMMU
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_iommu
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-10-26 02:22 UTC by Elliott Lester
Modified: 2020-12-10 11:29 UTC

See Also:
Kernel Version: v5.9
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Bisect log (73.60 KB, text/plain)
2020-10-26 02:22 UTC, Elliott Lester
IOMMU_GOOD (21.74 KB, text/plain)
2020-11-04 08:05 UTC, Elliott Lester
IOMMU_BAD (21.38 KB, text/plain)
2020-11-04 08:06 UTC, Elliott Lester
List of pci devices (23.36 KB, text/plain)
2020-11-04 08:07 UTC, Elliott Lester

Description Elliott Lester 2020-10-26 02:22:25 UTC
Created attachment 293191
Bisect log

Starting with v5.9, IOMMU groups on Intel systems are less finely separated: devices that used to be in separate groups now share one.

Expected:
IOMMU Group 95 07:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
IOMMU Group 96 08:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)

Actual (>= v5.9):
IOMMU Group 86 07:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
IOMMU Group 86 08:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)

This prevents VFIO binding of PCI devices for virtual machines.
I bisected between the tags v5.8 (good) and v5.9 (bad).
However, due to a build break (see bottom), I couldn't narrow it down below these 127 commits, some of which are merges :(


-----Kernel Build Break Begin-------
In file included from ./arch/x86/include/asm/atomic.h:5,
                 from ./include/linux/atomic.h:7,
                 from ./include/linux/llist.h:51,
                 from ./include/linux/irq_work.h:5,
                 from kernel/smp.c:10:
kernel/smp.c: In function 'smp_init':
./include/linux/compiler.h:392:38: error: call to '__compiletime_assert_155' declared with attribute error: BUILD_BUG_ON failed: offsetof(struct task_struct, wake_entry_type) - offsetof(struct task_struct, wake_entry) != offsetof(struct __call_single_data, flags) - offsetof(struct __call_single_data, llist)
  392 |  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
      |                                      ^
./include/linux/compiler.h:373:4: note: in definition of macro '__compiletime_assert'
  373 |    prefix ## suffix();    \
      |    ^~~~~~
./include/linux/compiler.h:392:2: note: in expansion of macro '_compiletime_assert'
  392 |  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
      |  ^~~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
   39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
      |                                     ^~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:50:2: note: in expansion of macro 'BUILD_BUG_ON_MSG'
   50 |  BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
      |  ^~~~~~~~~~~~~~~~
kernel/smp.c:687:2: note: in expansion of macro 'BUILD_BUG_ON'
  687 |  BUILD_BUG_ON(offsetof(struct task_struct, wake_entry_type) - offsetof(struct task_struct, wake_entry) !=
      |  ^~~~~~~~~~~~
  AR      sound/usb/misc/built-in.a
  AR      sound/usb/usx2y/built-in.a
make[1]: *** [scripts/Makefile.build:281: kernel/smp.o] Error 1
make[1]: *** Waiting for unfinished jobs....
-----Kernel Build Break End-------
Comment 1 Tom Yan 2020-10-27 16:06:38 UTC
Cherry-pick this (or the whole series) maybe: https://github.com/torvalds/linux/commit/8c4890d1c3358fb8023d46e1e554c41d54f02878

Suffering from the same problem here. i5-4570s + ASUS H87-PRO, if that matters.
Comment 2 Lu Baolu 2020-11-04 02:19:05 UTC
Can you please explain how it impacts "VFIO binding of PCI devices for virtual machines"? These two NIC devices always sit in the same group.
Comment 3 Elliott Lester 2020-11-04 03:18:00 UTC
(In reply to Lu Baolu from comment #2)
> Can you please explain how it impacts "VFIO binding of PCI devices for
> virtual machines"? These two NIC devices always sit in the same group.

Sure.
In the good state these NICs are in separate IOMMU groups; from v5.9 onward they are in the same group.

From above:
< 5.9
IOMMU Group 95
IOMMU Group 96

>= 5.9
IOMMU Group 86
IOMMU Group 86
Comment 4 Tom Yan 2020-11-04 05:51:28 UTC
I guess you/we should paste the output of `tree /sys/kernel/iommu_groups/`, before and after.
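
For a more compact listing than `tree`, the same information can be read directly from sysfs. The following is a minimal Python sketch, assuming the usual layout in which each group is a numbered directory under `/sys/kernel/iommu_groups/` with a `devices` subdirectory holding one entry per device; `list_iommu_groups` is a hypothetical helper name, not part of any kernel tooling.

```python
import os


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: sorted device addresses} read from sysfs.

    Assumes the layout <root>/<group>/devices/<address>.
    Returns an empty dict if <root> does not exist (IOMMU disabled).
    """
    groups = {}
    if not os.path.isdir(root):
        return groups
    for name in os.listdir(root):
        devices_dir = os.path.join(root, name, "devices")
        # Skip anything that is not a numbered group directory.
        if not name.isdigit() or not os.path.isdir(devices_dir):
            continue
        groups[int(name)] = sorted(os.listdir(devices_dir))
    return groups
```

Running it on a good kernel and on a bad kernel and saving both results gives two mappings that can be compared side by side.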
Comment 5 Elliott Lester 2020-11-04 08:05:40 UTC
Created attachment 293441
IOMMU_GOOD

The expected state
Comment 6 Elliott Lester 2020-11-04 08:06:13 UTC
Created attachment 293443
IOMMU_BAD

The bad state
Comment 7 Elliott Lester 2020-11-04 08:07:15 UTC
Created attachment 293445
List of pci devices

List of PCI devices in the system
Comment 8 Elliott Lester 2020-11-04 08:13:57 UTC
I have attached copies of the IOMMU_GROUPS tree to the bug, one for the good state and one for the bad state.

It looks like the issue is the "Intel Corporation C610/X99 series chipset PCI Express Root Port" at 00:1c.

I have a machine that could do the bisect, but my skills with git aren't that great, so if you can tell me how to add 8c4890d1c3358fb8023d46e1e554c41d54f02878 to my bisect I could get closer to the problematic commit.
I did try cherry-picking that commit, but it causes a lot of conflicts.

Thanks
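
One common way to bisect through a range that does not build is to apply the build fix temporarily at each bisect step, without committing it, and to skip the step when the fix does not apply cleanly. A minimal Python sketch of that step follows, assuming the bisect is driven by hand (the IOMMU grouping can only be checked after booting each kernel); `apply_temporary_fix` is a hypothetical helper name, and the `run` parameter exists only so the git calls can be substituted.

```python
import subprocess


def apply_temporary_fix(fix_commit, run=subprocess.run):
    """Apply fix_commit on top of the current bisect checkout without committing.

    Returns True if the fix applied cleanly. On conflicts, the working
    tree is reset and False is returned so the step can be skipped.
    """
    result = run(["git", "cherry-pick", "--no-commit", fix_commit])
    if result.returncode == 0:
        return True
    # Conflicts: throw away the partial cherry-pick.
    run(["git", "reset", "--hard"])
    return False
```

When the helper returns True, build and boot the kernel, check the groups, then run `git reset --hard` to drop the temporary change before `git bisect good` or `git bisect bad`. When it returns False, `git bisect skip` moves on to another commit.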
Comment 9 Tom Yan 2020-11-07 22:00:15 UTC
Basically, when it is good, each PCIe slot gets an IOMMU group, and so do the cards that are plugged into them; when it is bad, all the slots and their cards are in one single group :/
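
The merge described here can be spotted mechanically by comparing two group listings. A minimal Python sketch, assuming each listing is a mapping from group number to a list of device addresses; `find_merged_groups` is a hypothetical helper name, and the sample data is the pair of NICs from the description.

```python
def find_merged_groups(before, after):
    """Return {after_group: sorted before_groups} for every group in
    `after` whose devices came from more than one group in `before`."""
    # Map each device to the group it belonged to in the "before" listing.
    old_group = {dev: group for group, devs in before.items() for dev in devs}
    merged = {}
    for group, devs in after.items():
        origins = {old_group[dev] for dev in devs if dev in old_group}
        if len(origins) > 1:
            merged[group] = sorted(origins)
    return merged


# Groups of the two I210 NICs as reported for < v5.9 and >= v5.9.
good = {95: ["07:00.0"], 96: ["08:00.0"]}
bad = {86: ["07:00.0", "08:00.0"]}
print(find_merged_groups(good, bad))  # {86: [95, 96]}
```

Group numbers themselves change between kernels, so the comparison keys on device addresses rather than on group IDs.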
Comment 10 Tom Yan 2020-12-04 12:08:54 UTC
Ping. It should really be fixed before 5.9 becomes a longterm version.
Comment 11 Elliott Lester 2020-12-08 20:48:07 UTC
As of 5.9.12 this seems to have been resolved on my machine, so I'm marking this as resolved for now.
Comment 12 Tom Yan 2020-12-10 11:29:43 UTC
Indeed.
