Bug 68301 - [bisected] Headless OpenCL broken
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_video-dri
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-01-08 19:41 UTC by Niels Ole Salscheider
Modified: 2014-01-13 21:54 UTC
CC List: 2 users

See Also:
Kernel Version: 3.13
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg output (86.16 KB, application/octet-stream)
2014-01-08 19:41 UTC, Niels Ole Salscheider

Description Niels Ole Salscheider 2014-01-08 19:41:56 UTC
Created attachment 121331
dmesg output

Since 10ebc0bc09344ab6310309169efc73dfe6c23d72, headless OpenCL is broken on my university's Radeon S7000 unless I pass "radeon.runpm=0".
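When comparing runs with and without "radeon.runpm=0", it can help to confirm which value the loaded module actually picked up. The following is a minimal sketch, assuming the radeon module exposes the parameter as a readable file under /sys/module/radeon/parameters/ (the usual sysfs location for module parameters, not something confirmed in this report). The helper name read_module_param is hypothetical.

```python
from pathlib import Path
from typing import Optional


def read_module_param(module: str, param: str,
                      sysfs_root: str = "/sys/module") -> Optional[str]:
    """Return the current value of a kernel module parameter as text.

    Returns None if the module is not loaded or the parameter is not
    exposed (assumption: parameters live under
    <sysfs_root>/<module>/parameters/<param>).
    """
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        # Missing file, module not loaded, or no read permission.
        return None


if __name__ == "__main__":
    value = read_module_param("radeon", "runpm")
    print("radeon.runpm =", value if value is not None else "not available")
```

Running this over SSH before and after reloading the module shows whether the parameter on the kernel command line or in modprobe configuration took effect.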

The first OpenCL program run after a reboot seems to work, but every program run after that prints an error similar to the following:

radeon: Failed to allocate virtual address for buffer:
radeon:    size      : 4352 bytes
radeon:    alignment : 4096 bytes
radeon:    domains   : 4
radeon:    va        : 0x0000000000800000
radeon: Failed to allocate virtual address for buffer:
radeon:    size      : 4352 bytes
radeon:    alignment : 4096 bytes
radeon:    domains   : 4
radeon:    va        : 0x0000000000800000

In dmesg, I can see that some parts of the initialization routine are repeated:
[  348.906146] [drm] probing gen 2 caps for device 8086:151 = 261ad03/e
[  348.906151] [drm] PCIE gen 3 link speeds already enabled
[  348.909465] [drm] PCIE GART of 1024M enabled (table at 0x0000000000478000).
[...]

After that, the GPU keeps locking up.

I have attached the dmesg output.
Comment 2 Alex Deucher 2014-01-13 19:56:27 UTC
Do graphics work ok for you with runpm=1?  I.e., is it just compute that's causing a problem?
Comment 3 Niels Ole Salscheider 2014-01-13 21:54:34 UTC
> Make sure your kernel has this patch:
> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/
> ?id=f244d8b623dae7a7bc695b0336f67729b95a9736

My kernel has this patch but it does not help.

> Do graphics work ok for you with runpm=1?  I.e., is it just compute that's
> causing a problem?

It is a bit difficult to test this because I only have SSH access to the machine at the moment.
Without any parameter, I get the error mentioned above for compute, and a GPU lockup entry appears in dmesg when I try to start X.

With runpm=1, my SSH session hangs a few seconds after I load the radeon module and I cannot open another one. I can still ping the computer, though.

Unfortunately, it will be a few weeks until I have physical access to the machine again.
