Created attachment 132051 [details] rmmod_radeon_kernel_panic
After running rmmod radeon, the kernel prints many error lines and then crashes. It looks like the radeon module does not clean up its hwmon interface on exit. After rmmod radeon the hwmon interface is still there:
$ readlink /sys/class/hwmon/hwmon1
../../devices/pci0000:00/0000:00:01.0/0000:01:00.0/hwmon/hwmon1
Running ls or cat in hwmon1 then crashes the kernel. See the attached syslog.
$ lspci -nn
01:00.0 Display controller [0380]: Advanced Micro Devices, Inc. [AMD/ATI] Sun XT [Radeon HD 8670A/8670M/8690M] [1002:6660]
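For anyone who wants to check for the leftover interface without running ls or cat inside hwmon1, here is a minimal Python sketch. It is not part of the report or of any patch. The function name `stale_hwmon_entries` and the `sysfs_root` parameter are made up for illustration, and the script assumes the layout shown above, where the hwmon directory sits at `<parent device>/hwmon/hwmonN`. It only resolves the class symlinks and looks for a `driver` link on the parent device; it opens no attribute files.

```python
import os


def stale_hwmon_entries(sysfs_root="/sys"):
    """Return (hwmon name, parent device path) pairs whose parent has no driver bound.

    Hypothetical helper: assumes the <parent>/hwmon/hwmonN layout from the report.
    """
    class_dir = os.path.join(sysfs_root, "class", "hwmon")
    stale = []
    if not os.path.isdir(class_dir):
        return stale
    for name in sorted(os.listdir(class_dir)):
        # Resolve the class symlink only; no attribute file is opened.
        target = os.path.realpath(os.path.join(class_dir, name))
        hwmon_dir = os.path.dirname(target)
        if os.path.basename(hwmon_dir) != "hwmon":
            continue  # layout differs from the one assumed here
        parent = os.path.dirname(hwmon_dir)
        if os.path.basename(parent) == "virtual":
            continue  # virtual hwmon devices never have a driver link
        # A bound device exposes a "driver" symlink in its sysfs directory.
        if not os.path.lexists(os.path.join(parent, "driver")):
            stale.append((name, parent))
    return stale


if __name__ == "__main__":
    for name, parent in stale_hwmon_entries():
        print(f"{name}: parent device {parent} has no driver bound")
```

An entry listed after rmmod radeon would match the situation described above: the hwmon directory still exists although no driver is bound to the PCI device.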
Created attachment 132201 [details] possible fix Does the attached patch help?
Created attachment 132221 [details] syslog output after modprobe radeon Yes, your patch fixes the original problem. Maybe it is a candidate for the stable releases. I tested the patch on 3.14 and the system works fine after rmmod radeon; there is no crash after running: find /sys But now there is a new kernel crash. When I modprobe the radeon module again (after the previous successful rmmod), the kernel crashes. See the syslog output in the attachment.
Created attachment 132231 [details] possible fix Does this help in the second case?
No, it does not help; the kernel still crashes. But now I cannot provide syslog output, because the userspace rsyslog daemon no longer reads the kernel log and writes it to disk. Also, the output on the framebuffer screen is overwritten very quickly, so I cannot capture it.
Created attachment 132251 [details] pstore log Now I have found pstore and its EFI backend. I modprobed efi-pstore before rmmoding radeon, and the dmesg logs were stored in EFI after the kernel crash, so I believe there is something useful in there for you. The attachment was generated by: $ cd /sys/fs/pstore/; cat `ls -r *1; ls -r *2`
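The shell one-liner above can also be written as a short Python sketch, which may be easier to adapt when the fragments need filtering. This is only an approximation of the command: the function names are made up for illustration, and Python sorts names by code point while `ls -r` follows the locale collation, so the order could differ for unusual file names.

```python
import glob
import os


def pstore_fragments(directory="/sys/fs/pstore"):
    """List files roughly in the order `ls -r *1; ls -r *2` would print them."""
    ordered = []
    for pattern in ("*1", "*2"):
        matches = glob.glob(os.path.join(directory, pattern))
        # Reverse name order within each group, like `ls -r`.
        ordered.extend(sorted(matches, reverse=True))
    return ordered


def read_pstore_log(directory="/sys/fs/pstore"):
    """Concatenate the fragments into one string, like the `cat` in the command."""
    chunks = []
    for path in pstore_fragments(directory):
        with open(path, "r", errors="replace") as handle:
            chunks.append(handle.read())
    return "".join(chunks)


if __name__ == "__main__":
    print(read_pstore_log(), end="")
```

As with the original command, this has to run as a user who can read /sys/fs/pstore.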
Created attachment 132261 [details] possible fix v2 Updated patch.
Created attachment 132281 [details] pstore log OK, now the kernel does not crash after loading the radeon module again. I modprobed and rmmoded it several times without any problem. But after I started the X server (with the radeon module loaded), I got another kernel crash. See the output from efi pstore.
Created attachment 132291 [details] dmesg plymouth log A similar or identical problem happens if I start the plymouth splash screen (which uses the Intel fb) and then load the radeon module.
Any idea what to do about the last two NULL pointer dereferences in radeon_driver_open_kms?
@Alex Deucher: ping
Same issue for amdgpu after unbind (not sure if this should be a separate bug):
rook ~ ➤ ls -l /sys/class/hwmon/hwmon1/device
lrwxrwxrwx 1 root root 0 cze 8 12:32 /sys/class/hwmon/hwmon1/device -> ../../../0000:03:00.0
rook ~ ➤ cat /sys/class/hwmon/hwmon1/fan1_input
[1] 9145 killed cat /sys/class/hwmon/hwmon1/fan1_input
Reading fan1_input causes an OOPS:
[ 590.507564] BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8
[ 590.507584] IP: amdgpu_hwmon_get_fan1_input+0x33/0x84
[ 590.507587] PGD 0 P4D 0
[ 590.507593] Oops: 0000 [#4] PREEMPT SMP PTI
[ 590.507597] Modules linked in:
[ 590.507610] CPU: 39 PID: 9222 Comm: cat Tainted: G D 4.16.14-gentoo #3
[ 590.507613] Hardware name: ASUSTeK COMPUTER INC. Z10PE-D16 WS/Z10PE-D16 WS, BIOS 3407 03/10/2017
[ 590.507617] RIP: 0010:amdgpu_hwmon_get_fan1_input+0x33/0x84
[ 590.507620] RSP: 0018:ffff97790d2dfd68 EFLAGS: 00010246
[ 590.507624] RAX: 0000000000000000 RBX: ffff9147b5cd3000 RCX: ffff9137b5428ac8
[ 590.507627] RDX: ffff9137b5aa0000 RSI: ffffffffbbb3e1c0 RDI: ffff9137b55f7008
[ 590.507630] RBP: fffffffffffffffb R08: 0000000000000001 R09: 0000000000000000
[ 590.507633] R10: ffff9137af38d400 R11: 0000000000000000 R12: ffffffffbb30d000
[ 590.507636] R13: ffff9147b590d400 R14: ffff91476a16db00 R15: 0000000000000001
[ 590.507639] FS: 00007f3838cbe540(0000) GS:ffff9147bf400000(0000) knlGS:0000000000000000
[ 590.507641] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 590.507643] CR2: 00000000000000b8 CR3: 0000002023776005 CR4: 00000000003606e0
[ 590.507645] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 590.507647] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 590.507649] Call Trace:
[ 590.507660] dev_attr_show+0x23/0x44
[ 590.507668] sysfs_kf_seq_show+0x7f/0xce
[ 590.507676] seq_read+0x1c1/0x3d1
[ 590.507687] __vfs_read+0x33/0xcc
[ 590.507693] vfs_read+0x9a/0xcf
[ 590.507696] SyS_read+0x5f/0xa3
[ 590.507703] do_syscall_64+0x79/0x88
[ 590.507711] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 590.507715] RIP: 0033:0x7f38387f8b75
[ 590.507718] RSP: 002b:00007ffff2570970 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 590.507721] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f38387f8b75
[ 590.507723] RDX: 0000000000020000 RSI: 00007f3838cd0000 RDI: 0000000000000003
[ 590.507725] RBP: 0000000000020000 R08: 00000000ffffffff R09: 0000000000000000
[ 590.507727] R10: 000000000000039b R11: 0000000000000246 R12: 00007f3838cd0000
[ 590.507729] R13: 0000000000000003 R14: 00007f3838cd000f R15: 0000000000020000
[ 590.507738] Code: d3 48 83 ec 10 48 8b 97 18 01 00 00 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 c7 44 24 04 00 00 00 00 48 8b 82 08 49 00 00 <48> 8b 80 b8 00 00 00 48 85 c0 74 15 48 8b ba f8 48 00 00 48 8d
[ 590.507821] RIP: amdgpu_hwmon_get_fan1_input+0x33/0x84 RSP: ffff97790d2dfd68
[ 590.507824] CR2: 00000000000000b8
[ 590.507830] ---[ end trace eaed7563e433ab4e ]---