Created attachment 64772 [details] dmesg output on laptop Acer Aspire 3820, when I choose "safe remove" drive in nautilus, this leads to "Buffer I/O error" message in 50% cases and kernel panic in 30% cases. Archlinux x86_64 I was able to reproduce it on virtualbox running on the same laptop: 1cpu, archlinux x86_64 and 512mb flash drive. No guarantee, but you can download the exported VM and flash drive dd image. http://www.sendspace.com/file/ryvxf6 http://www.sendspace.com/file/6as3so console redirected to file in /tmp login as root, no password start Gnome with ~/runme.sh start nautilus plug flash drive browse flash drive contents a little and immediately click "safe remove" also see attached log file. [ 900.820150] EXT3-fs (sdb2): mounted filesystem with ordered data mode here I click "remove" [ 908.439995] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048 [ 908.440066] IP: [<ffffffff811f75bf>] elv_completed_request+0x3f/0x50 [ 908.440119] PGD 0 [ 908.440139] Oops: 0000 [#1] PREEMPT SMP [ 908.440177] last sysfs file: /sys/devices/pci0000:00/0000:00:1d.0/usb2/2-1/2-1.2/power/level [ 908.440231] CPU 1 [ 908.440246] Modules linked in: cryptd [ 908.440273] quiet_error: 53 callbacks suppressed [ 908.440276] Buffer I/O error on device sdb3, logical block 0 ... [ 908.532372] Kernel panic - not syncing: Fatal exception in interrupt [ 908.534426] Pid: 10, comm: ksoftirqd/1 Tainted: G D 2.6.39-ARCH #1 [ 908.536520] Call Trace: [ 908.538533] [<ffffffff813dee0e>] panic+0xa0/0x1ad ... [ 908.558075] [<ffffffff813e8985>] page_fault+0x25/0x30 [ 908.559736] [<ffffffff811f75bf>] ? elv_completed_request+0x3f/0x50 [ 908.561393] [<ffffffff811fd65c>] __blk_put_request+0x3c/0xe0
downgraded to 2.6.38.8 , works stable
This could already be fixed in 3.0-rc6 ... can you test that?
At <http://bugs.debian.org/631187>, Alexander Kurtz writes: > I just tested 2.6.39-3 from sid and 3.0.0~rc6-1~experimental.1 from > experimental. Unfortunately both reliably panic when safely removing my > external hard drive. 2.6.38-5 (still) works fine. Seems like it's time > for me to do a git bisect, or do you any other ideas? So alas, this doesn't seem to be fixed yet. Based on an oops there, it seems e->ops is NULL (thanks to bwh for analyzing it).
There is a proposed patch here: http://marc.info/?l=linux-scsi&m=131007155700831&w=2 I don't know if it has gotten into Linus's tree yet. Also, this patch may not be a complete fix.
I got a similar bug reported here: https://bugs.launchpad.net/bugs/793796 Indeed it's because of elevator being deallocated (elevator_exit, e->ops = NULL) and later e->ops trying to be used again. I did 2 patches checking for queue being dead (0004 and 0005 at http://people.canonical.com/~herton/lp793796/r5/), reporter confirmed that fixed his problems with them, but indeed it seems other elevator calls could access e->ops after elevator_exit, the patch from James seems more complete. I'll try to get reporter test with only his patch.
I got the reporter to test the patch from http://marc.info/?l=linux-scsi&m=131007155700831&w=2 The results are that the patch doesn't work for him, so something more is needed, he still gets the panic on elv_completed_request: https://launchpadlibrarian.net/75257538/r6panic.jpg (gdb) l *(elv_completed_request+0x3f) 0xffffffff812bd22f is in elv_completed_request (/home/herton/clean/ubuntu-natty/block/elevator.c:812). 807 * request is released from the driver, io must be done 808 */ 809 if (blk_account_rq(rq)) { 810 q->in_flight[rq_is_sync(rq)]--; 811 if ((rq->cmd_flags & REQ_SORTED) && 812 e->ops->elevator_completed_req_fn) 813 e->ops->elevator_completed_req_fn(q, rq); 814 } 815 }