Bug 92321

Summary: Mapping CompactPCI device through sysfs-pci driver
Product: Drivers
Component: PCI
Reporter: Georgiy (jediknight.93)
Assignee: drivers_pci (drivers_pci)
Status: NEEDINFO
Severity: high
Priority: P1
CC: bjorn
Hardware: All
OS: Linux
Kernel Version: 3.16.0-4-586
Subsystem:
Regression: No
Bisected commit-id:
Attachments: lspci -vv
dmesg output
test program
test program output
kernel module with the same functionality; works as intended

Description Georgiy 2015-01-30 08:52:37 UTC
So, the problem can be described as follows:

1. We have 11 identical PCI devices connected through two CompactPCI buses: 6 on one and 5 on the other.

2. We are trying to access the resources of the devices through the sysfs filesystem, for example: /sys/class/pci_bus/0000:04/device/0000:04:0d.0/resource1. The first 4 devices on each bus allow read/write access to their resources without problems, but:

3. The 5th and 6th devices on both buses don't work: all files exist, but every read returns a bunch of FFs regardless of the values written, so I can't tell whether the write succeeded or not. When one of the first 4 devices is physically removed, the 5th device starts working as usual; the same goes for the 6th device on the bus with 6 devices. It looks like only 4 devices per bus can work at once. Note that, according to the specification, CompactPCI allows 7 PCI devices on a bus at once.

4. It can't really be a hardware problem, because the Windows driver (developed long ago by someone we don't have access to) handles this just fine.
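
Reads that return all FFs are what a PCI host bridge typically hands back when no device claims the read cycle, so the value alone says nothing about whether the preceding write reached the device. Below is a minimal sketch of a check that separates "every register reads as all ones" from "the device answered with data". The helper all_ones is hypothetical and is not part of the attached test program.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns true when every 32-bit word in the mapped
// register window reads back as 0xFFFFFFFF. An empty window returns false.
bool all_ones(const volatile std::uint32_t* regs, std::size_t count) {
    assert(regs != nullptr);
    if (count == 0) {
        return false;
    }
    for (std::size_t i = 0; i < count; ++i) {
        if (regs[i] != 0xFFFFFFFFu) {
            return false;  // at least one word carries real data
        }
    }
    return true;
}
```

For the 128-byte regions shown in this report, count would be 32. A true result for a window that is known to contain non-FF registers suggests the access never reached the device.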

lspci:
03:0b.0 Multimedia controller: Device 6472:8001 (rev 01)
03:0c.0 Multimedia controller: Device 6472:8001 (rev 01)
03:0d.0 Multimedia controller: Device 6472:8001 (rev 01)
03:0e.0 Multimedia controller: Device 6472:8001 (rev 01)
03:0f.0 Multimedia controller: Device 6472:8001 (rev 01)
04:09.0 Multimedia controller: Device 6472:8001 (rev 01)
04:0a.0 Multimedia controller: Device 6472:8001 (rev 01)
04:0b.0 Multimedia controller: Device 6472:8001 (rev 01)
04:0c.0 Multimedia controller: Device 6472:8001 (rev 01)
04:0d.0 Multimedia controller: Device 6472:8001 (rev 01)
04:0f.0 Multimedia controller: Device 6472:8001 (rev 01)

lspci -vv (identical output for all devices, apart from position on the bus, IRQ number, and memory addresses):

04:0f.0 Multimedia controller: Device 6472:8001 (rev 01)
Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx+
Interrupt: pin A routed to IRQ 11
Region 0: I/O ports at d800 [size=128]
Region 1: Memory at febfe800 (32-bit, non-prefetchable) [size=128]

I'm not sure I really need to show the code, because it is as simple as possible: the file is opened, then mmapped, and the resulting pointer is used to write to and read from that file. Basically, it's this:

fd = open ( (device_ + "resource" + std::to_string (i)).c_str(), O_RDWR);
ptr = (u_int32_t*) mmap (NULL, 0x7f, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

All paths are valid; that was the first thing checked. The dmesg output contains no errors or warnings related to this problem.
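
One thing that may be worth ruling out in the user-space path: mmap() works in whole pages, while Region 1 above is only 128 bytes and starts at febfe800, which is not page-aligned with 4 KiB pages. The sketch below assumes (this is not confirmed anywhere in this report) that the sysfs mapping begins at the start of the page containing the BAR, so the device registers would sit at an offset inside the returned mapping. It parses one line of the device's sysfs `resource` file ("start end flags", in hex) and computes that offset. BarRange, parse_resource_line, offset_in_page and map_length are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>
#include <exception>
#include <sstream>
#include <string>

// Hypothetical container for one line of the sysfs "resource" file.
struct BarRange {
    std::uint64_t start;
    std::uint64_t end;
    std::uint64_t flags;
};

// Parses "<start> <end> <flags>" (hex, optional 0x prefix).
// Returns false if the line does not contain three hex numbers.
bool parse_resource_line(const std::string& line, BarRange& out) {
    std::istringstream in(line);
    std::string s, e, f;
    if (!(in >> s >> e >> f)) {
        return false;
    }
    try {
        out.start = std::stoull(s, nullptr, 16);
        out.end = std::stoull(e, nullptr, 16);
        out.flags = std::stoull(f, nullptr, 16);
    } catch (const std::exception&) {
        return false;
    }
    return true;
}

// Offset of the BAR inside the page that contains it.
std::uint64_t offset_in_page(std::uint64_t bar_start, std::uint64_t page_size) {
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return bar_start & (page_size - 1);
}

// Length that covers the whole BAR when the mapping starts at the page
// boundary (assumption), rounded up to whole pages.
std::uint64_t map_length(const BarRange& bar, std::uint64_t page_size) {
    const std::uint64_t size = bar.end - bar.start + 1;
    const std::uint64_t span = offset_in_page(bar.start, page_size) + size;
    return (span + page_size - 1) & ~(page_size - 1);
}
```

With the Region 1 values shown above, offset_in_page(0xfebfe800, 4096) is 0x800, so under this assumption the registers would be at static_cast<char*>(base) + 0x800, where base is the pointer returned by mmap(). The page size should come from sysconf(_SC_PAGESIZE) rather than being hard-coded.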
Comment 1 Georgiy 2015-01-30 14:52:36 UTC
Created attachment 165261 [details]
lspci -vv

lspci -vv for all devices present
Comment 2 Georgiy 2015-01-30 14:53:21 UTC
Created attachment 165271 [details]
dmesg output

dmesg output _after_ performed tests
Comment 3 Georgiy 2015-01-30 14:54:30 UTC
Created attachment 165281 [details]
test program

Simple motivating example of the strange behavior.
You can see bad ffffffff values on some devices (as described in this report).
Comment 4 Georgiy 2015-01-30 14:55:00 UTC
Created attachment 165291 [details]
test program output

Output for provided test program.
Comment 5 Georgiy 2015-01-30 14:55:21 UTC
OK, finally uploaded some files.
Comment 6 Georgiy 2015-02-03 10:16:31 UTC
Created attachment 165641 [details]
kernel module with the same functionality; works as intended

With the kernel module, the desired info was acquired completely:

root@b4-mrpu-x86:/home/user/PCIDeviceKernelModule# dmesg | grep info
[    0.000000] Using ACPI (MADT) for SMP configuration information
[   49.189417] info: 2.4.1
[   49.297360] info: 2.4.1
[   49.403655] info: 2.4.1
[   49.503624] info: 2.4.1
[   49.603260] info: 2.4.1
[   49.730663] info: 2.4.1
[   49.832868] info: 2.4.1
[   49.934986] info: 2.4.1
[   50.034662] info: 2.4.1
[   50.134300] info: 2.4.1
[   50.247241] info: 3.1.8864
Comment 7 Bjorn Helgaas 2015-03-10 01:08:24 UTC
The kernel module, which uses ioremap(), works as expected.  The user
program, which uses mmap(), fails.  The user-space mmap() path uses
pci_mmap_resource().  Can you add some instrumentation to that path?
We might be able to figure out where things are going wrong.