Bug 194965

Summary: Read on a Copy Protected DVD results in Seek Errors
Product: File System
Reporter: hagar
Component: UDF
Assignee: Jan Kara (jack)
Status: RESOLVED CODE_FIX
Severity: normal
CC: axboe, bfennema, ismail, kernelorg, markus.schwarzenberg, nicodemusjls, preining, theo77186, z
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 4.10.3 and 4.4.0 Tree: Mainline
Regression: No
Attachments: DVD-Arch-32bit.txt
DVD-Ubuntu-64bit.txt
DVD-Ubuntu-64bit.txt
Program to output cdrom information
output of: sudo cdrecord -v dev=/dev/sr0 -toc > cdrecord-toc.txt
output of: scsi_readcap /dev/sr0
output of: compiled stat_cdrom /dev/sr0 (stat_cdrom.c from attachment #283217)
sr: Debug device size changes
Output of dmesg -w after certain actions (marked UPPRECASE)
Output of stat_cdrom, cdrecord -v dev=/dev/sr0 -toc, and scsi_readcap
scsi readcap, stat cdrom, cdrecord output
stat-cdrom logsc
[PATCH 1/2] bdev: Factor out bdev revalidation into a common helper
[PATCH 2/2] bdev: Refresh bdev size for disks without partitioning

Description hagar 2017-03-23 13:10:11 UTC
Whether I use dd, xine, MPlayer, VLC, mount, libdvdread, libdvdnav or libdvdcss, the errors are the same:

xine - mplayer - vlc
libdvdcss error: seek error
libdvdread: Can't seek to block 2199591
libdvdcss error: seek error
libdvdread: Can't seek to block 2199591


dd -
error: seek error
Can't seek to block 2199591

The failing block is different for each DVD.

I even get it on old DVDs.

While looking for a solution I discovered that everybody is blaming libdvdcss, libdvdread and libdvdnav, but the problem seems to be lower down, in the kernel.

My research found reports across different Linux versions and different hardware, both 32- and 64-bit.

MakeMKV claims:
Device '/dev/sr0' is partially inaccessible due to a bug in Linux kernel (it reports invalid block device size). This can be worked around, but read speed may be very slow.

I am willing to provide any logs or outputs that are required.

Tested on two different machines:
Mint Mate 17 - (Ubuntu 4.8.4)
Arch Linux - kernel 4.10.3
Comment 1 Jan Kara 2017-03-23 19:17:48 UTC
Thanks for the report! Can you attach here the output of the dmesg command after trying to access such a disc? And also the output of dvd+rw-mediainfo? Thanks!
Comment 2 hagar 2017-03-24 00:37:41 UTC
Created attachment 255493 [details]
DVD-Arch-32bit.txt

On 24/03/17 03:17, bugzilla-daemon wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=194965
>
> Jan Kara (jack@suse.cz) changed:
>
>             What    |Removed                     |Added
> ----------------------------------------------------------------------------
>               Status|NEW                         |NEEDINFO
>
> --- Comment #1 from Jan Kara (jack@suse.cz) ---
> Thanks for report! Can you attach here output of dmesg command after trying
> to
> access such disk? And also the output of dvd+rw-mediainfo? Thanks!
>
Each system produces different errors.

Ubuntu reports a read error when playing DVDs longer than approx. 120 minutes. I'll try to capture that error.
Comment 3 hagar 2017-03-24 00:37:41 UTC
Created attachment 255495 [details]
DVD-Ubuntu-64bit.txt
Comment 4 hagar 2017-03-24 01:03:38 UTC
Created attachment 255497 [details]
DVD-Ubuntu-64bit.txt

64-bit VLC playback error.

Seek errors are involved again.

The DVD would not play past the language menu.


Thanks
Comment 5 Mike 2019-06-07 21:29:04 UTC
I'm the author of MakeMKV. The (simplified) code in question that detects the bug is below. In short, something fixes the DVD disc size (the size of the underlying block device) at 4GB. All read requests beyond 4G fail. MakeMKV works around this by reading the data beyond the 4G boundary using raw SCSI commands.


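>    // Open the raw block device, bypassing the page cache.
>    fd = open("/dev/sr0", O_RDONLY | O_DIRECT | O_LARGEFILE);
>
>    // Device size in bytes, as reported by the kernel.
>    err = ioctl(fd, BLKGETSIZE64, &file_size);
>
>    // Logical sector size in bytes.
>    err = ioctl(fd, BLKSSZGET, &sector_size);
>
>    // Known-bad combination: size 0x3FFFFE00 bytes with 2048-byte sectors.
>    if ((file_size == 0x3FFFFE00) && (sector_size == 2048))
>    {
>        // print error:
>        // linux kernel bug: the block device size is truncated and is invalid
>        return NULL;
>    }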
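
For readers who want to try the same check without MakeMKV, here is a minimal Python sketch of the test above. It is not MakeMKV's code. The helper names (`looks_truncated`, `probe`) are made up for illustration, and the `BLKGETSIZE64` number is the value assumed for 64-bit Linux. Note that 0x3FFFFE00 bytes is 0x1FFFFF sectors of 512 bytes, i.e. 1 GiB minus 512 bytes.

```python
import struct

# ioctl request numbers (assumption: 64-bit Linux; BLKGETSIZE64 differs on 32-bit).
BLKGETSIZE64 = 0x80081272  # _IOR(0x12, 114, size_t)
BLKSSZGET = 0x1268         # _IO(0x12, 104)

# The suspicious size from the snippet above, in bytes.
STALE_SIZE_BYTES = 0x3FFFFE00


def looks_truncated(file_size: int, sector_size: int) -> bool:
    """Same condition as the quoted check: stale size with 2048-byte sectors."""
    return file_size == STALE_SIZE_BYTES and sector_size == 2048


def probe(path: str = "/dev/sr0"):
    """Hypothetical helper: read size and sector size from a block device.

    Needs Linux and read permission on the device; not called automatically.
    """
    import fcntl
    import os

    fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    try:
        size = struct.unpack("Q", fcntl.ioctl(fd, BLKGETSIZE64, b"\0" * 8))[0]
        ssz = struct.unpack("i", fcntl.ioctl(fd, BLKSSZGET, b"\0" * 4))[0]
    finally:
        os.close(fd)
    return size, ssz
```

Usage would be something like `looks_truncated(*probe("/dev/sr0"))` with a DVD in the drive.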
Comment 6 Jan Kara 2019-06-12 09:33:42 UTC
Uh, I'm really sorry but I totally forgot about this bug. Looking at kernel logs from comments 3 and 4, I can see errors like:

[1237783.184444] sr 3:0:0:0: [sr1] tag#2 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[1237783.184452] sr 3:0:0:0: [sr1] tag#2 Sense Key : Illegal Request [current] 
[1237783.184458] sr 3:0:0:0: [sr1] tag#2 Add. Sense: Logical block address out of range
[1237783.184463] sr 3:0:0:0: [sr1] tag#2 CDB: Read(10) 28 00 00 2e fe 02 00 00 40 00
[1237783.184467] blk_update_request: I/O error, dev sr1, sector 12318728

This shows that it is the device itself that reports an error when trying to read 64 blocks (likely with a 2k block size) starting from block 0x2efe02 == 3079682. So maybe the kernel thinks the device is bigger than it actually is and tries to read beyond its end, but then the question is why those DVD players decide to read at large offsets beyond the end of the device...
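
The numbers in that paragraph can be re-derived from the log lines. Below is a small Python sketch (the function name `decode_read10` is made up for illustration) that decodes the READ(10) CDB from the log: byte 0 is the opcode 0x28, bytes 2-5 are the big-endian logical block address, and bytes 7-8 are the transfer length in blocks. It assumes 2048-byte blocks when converting to the 512-byte sector number printed by blk_update_request.

```python
def decode_read10(cdb_hex: str):
    """Return (lba, block_count) for a READ(10) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex)  # fromhex() skips the spaces between bytes
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # transfer length in blocks
    return lba, blocks


lba, blocks = decode_read10("28 00 00 2e fe 02 00 00 40 00")
print(lba, blocks)          # 3079682 64
print(lba * (2048 // 512))  # 12318728, the sector in the I/O error line
```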

What Mike writes in comment 5 looks somewhat different - there the device size is reported to be 1G - 1 sector so it's smaller (not larger) than expected. Anyway, can any of you run:

scsi_readcap /dev/srXX

(from the sg3_utils package) to find out which capacity the drive actually reports. Then please run the attached stat_cdrom program, which should output some additional information. Hopefully I'll be able to tell more from that. Thanks!
Comment 7 Jan Kara 2019-06-12 09:34:09 UTC
Created attachment 283217 [details]
Program to output cdrom information
Comment 8 Jan Kara 2019-06-12 09:35:41 UTC
Oh, and maybe if you can also provide output of:

cdrecord -v dev=/dev/srXX -toc

to see what the TOC on the DVD actually looks like. Thanks!
Comment 9 Markus 2019-08-10 20:45:29 UTC
Created attachment 284311 [details]
output of: sudo cdrecord -v dev=/dev/sr0 -toc > cdrecord-toc.txt
Comment 10 Markus 2019-08-10 20:46:00 UTC
Hi, I'm happy that I've found this bug report since I'm experiencing the same errors as the OP.

I've run into this bug just recently, using various kernel versions between 4.16.12-3 and 5.2.5 on two different computers, both running openSUSE Tumbleweed.

I've collected the requested information, see the new attachments.

I've tried different kernel versions and libdvdcss versions without consistent success. Once, after one or two hours of experimenting, the bug disappeared temporarily, but I couldn't identify a reliable reason for this change.

A very interesting fact is that with wine64 and vlc3.0.7-win64 the DVD plays correctly in the same computer, whereas the native Linux VLC (same version, most probably also using the same libdvdcss version) hits the bug.

I've also looked at the place in libdvdcss where the bug is detected, and the symptom is always similar: seek fails at offsets near the DVD capacity. The return value of lseek is a negative number with a large absolute value (-1781583872 and similar), typically not a known error code.
Comment 11 Markus 2019-08-10 20:46:58 UTC
Created attachment 284313 [details]
output of: scsi_readcap /dev/sr0
Comment 12 Markus 2019-08-10 20:48:59 UTC
Created attachment 284315 [details]
output of: compiled stat_cdrom /dev/sr0  (stat_cdrom.c from attachment #283217 [details])
Comment 13 Jan Kara 2019-08-12 13:34:59 UTC
Thanks for the data. So in your case the drive reports the media has recorded 3926220 2k sectors. But OTOH the device size is reported as 14801720 512-byte sectors which is actually considerably less than the number of recorded sectors. I guess this discrepancy is what is causing issues.

But what is confusing me is that drivers/scsi/sr.c has in get_sectorsize():

        if (!cdrom_get_last_written(&cd->cdi, &last_written))
                cd->capacity = max_t(long, cd->capacity, last_written);

So the capacity should be at least what the drive reports as the last written sector and as my stat_cdrom shows that is 3926220. So I'm somewhat confused why the logic above didn't trigger and how the device size can be so low. I'll be looking more into this.
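
To make the mismatch concrete, here is a rough Python sketch of the arithmetic. It only mimics the max_t() line quoted above and is not the sr driver. `sr_capacity` is a made-up name, and the READ CAPACITY value of 3700430 blocks is purely hypothetical, chosen to match the 14801720 512-byte sectors seen as device size.

```python
SECTORS_512_PER_2K_BLOCK = 2048 // 512  # 4


def sr_capacity(read_capacity_blocks, last_written):
    """Mimic the quoted logic; last_written=None models a failed query."""
    if last_written is not None:
        return max(read_capacity_blocks, last_written)
    return read_capacity_blocks


# Hypothetical READ CAPACITY result combined with the reported last written sector.
capacity_2k = sr_capacity(3700430, 3926220)
expected_512 = capacity_2k * SECTORS_512_PER_2K_BLOCK

print(capacity_2k)              # 3926220
print(expected_512)             # 15704880
print(expected_512 - 14801720)  # 903160 sectors of 512 bytes missing
```

So if the logic had run with a valid last written sector, the device size should have been at least 15704880 512-byte sectors, not 14801720.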
Comment 14 theo77186 2019-08-16 11:03:07 UTC
I also encounter this bug with my USB DVD reader. Running stat_cdrom from comment 7 gives an incorrect device size (constantly 2097151=0x1fffff, regardless of the DVD) but the correct last written sector, which is consistent with the lsblk output. This may be the cause of my read errors.

I encounter this with the 5.2.7-1 kernel from Debian sid and with 5.3-rc4 (compiled from upstream source), but not with any of the 4.19.x series, whether upstream or distro-provided.
Comment 15 Jan Kara 2019-08-29 09:13:37 UTC
Theo, your problem seems to be different than that of other reporters here as for them both old and new kernels fail while for you it seems to be a recent regression. Do you by any chance use pktcdvd driver for accessing the reader? Could you bisect which commit introduced the regression for you?
Comment 16 Jan Kara 2019-08-29 09:29:59 UTC
BTW, Theo, I strongly suspect that in your case, READ_CAPACITY SCSI command submitted from drivers/scsi/sr.c:get_sectorsize() fails for the drive for some reason. Do you see any errors in dmesg with problematic kernels?
Comment 17 Jan Kara 2019-08-29 11:09:30 UTC
Created attachment 284693 [details]
sr: Debug device size changes

Can someone who sees a device size different from 0x1fffff run a kernel with this patch and attach here the kernel dmesg output after trying to access the problematic disc? The dmesg output should contain information about how we arrived at the reported device size...
Comment 18 Markus 2019-09-03 21:35:38 UTC
Today I observed the following, this time using kernel 5.2.10-1-default:

(1) Start as usual: Seek errors as in the original bug description.
(2) By chance the dvd was mounted, because I tried to "watch" one of the encrypted .vob files from a file manager
(3) After that I umounted /dev/sr0
(4) After this libdvdread/dvdcss did correctly get the keys, reporting the same positions as it did in (1), the DVD played correctly.

Afterwards I tried this on another computer, where mounting/umounting did NOT stop the seek errors.
Comment 19 theo77186 2019-09-03 21:57:24 UTC
Concerning the 0x1fffff bug, there are several points:
1. No suspicious dmesg messages; pktcdvd is loaded and seems to be used. It seems to be pktcdvd that reports the wrong device size (as seen in lsblk).
2. Further tests show that there are at least 3 variants of the bug I have. This makes testing the bug slightly more difficult.
The 3 variants I've found are:
1a. (the one I reported initially): 0x1fffff device size unconditionally. This one appeared between v5.1 and v5.2. I'll bisect to the commit over the weekend, as it's quite time-consuming (about 14 kernel compiles).
1b. 0x1fffff only when the DVD is already in the drive when the device is plugged in. This one is not deterministic and I cannot reproduce it consistently. Happens even with v4.19.
2. For some pairs of DVDs, inserting a first DVD and then a second one results in the 2nd DVD reporting the device size of the 1st, thus causing seek errors if the 1st is smaller than the 2nd. Also happens with v4.19; I believe that it's deterministic. To be confirmed.
In all cases, the last written sector is correct.
I suspect that 1b and 2 are somehow related...
Comment 20 Jan Kara 2019-09-05 07:55:32 UTC
(In reply to Markus from comment #18)
> Today I observed the following, this time using kernel 5.2.10-1-default:
> 
> (1) Start as usual: Seek errors as in the original bug description.
> (2) By chance the dvd was mounted, because I tried to "watch" one of the
> encrypted .vob files from a file manager
> (3) After that I umounted /dev/sr0
> (4) After this libdvdread/dvdcss did correctly get the keys, reporting the
> same positions as it did in (1), the DVD played correctly.
> 
> Afterwards I tried this on another computer, where mounting/umounting did
> NOT stop the seek errors.

Weird. Markus, can you try running a kernel with the patch from comment 17 and attach here output of your 'dmesg' after trying to access the media? Thanks!
Comment 21 Jan Kara 2019-09-05 08:20:44 UTC
(In reply to theo77186 from comment #19)
> Concerning the 0x1fffff bug, there are several points:
> 1. No suspicious dmesg messages; pktcdvd is loaded and seems to be used. It
> seems it's it that reports the wrong device size (as seen on lsblk).
> 2. Further tests shows that there are at least 3 variants of the bug I have.
> This makes testing the bug slightly more difficult.
> The 3 variants I've found are:
> 1a. (the one I've reported initially): 0x1fffff device size unconditionally.
> This one appeared between v5.1 and v5.2. I'll bisect to the commit in the
> weekend as it's quite time-consuming (about 14 kernel compiles).
> 1b. 0x1fffff only when the DVD is already in the drive on device plug. This
> one is not deterministic and I cannot reproduce consistently. Happens even
> with v4.19.
> 2. For some pairs of DVDs, inserting a first DVD then a second one results
> the 2nd DVD reporting the device size of the 1st, thus causing the seek
> errors if the 1st is smaller than the 2nd. Also happens with v4.19; I
> believe that it's deterministic. To be confirmed.
> In all cases, the last written sector is correct.
> I suspect that 1b and 2 are somehow related...

Theo, what you describe looks like the device is not properly reporting "media changed" events and thus the kernel doesn't get the information that the DVD has changed (which triggers rereading of disk size etc.). Anyway, I'm somewhat curious what regressed the behavior for you between 5.1 and 5.2. Maybe it could be related to media change rework done by Martin Wilck starting with commit 673387a93005 "block: genhd: remove async_events field" and ending with commit cdf3e3deb747 "block: check_events: don't bother with events if unsupported". So maybe you can start your bisection effort by checking the status before / after this series.
Comment 22 theo77186 2019-09-05 20:07:55 UTC
Some additional info:
#2 from comment 19 is indeed deterministic and is triggered with all kernels I have.
For the regression I have noticed something: lsblk doesn't show a pktcdvd device on bugged kernels but does show one on non-regressed kernels. Maybe for some reason the pktcdvd device fails to be created, thus giving the wrong size?

The commits from comment 21 seem to be from before v5.1, which works for me, so they couldn't be the regression commits. I'll still check that.

Additionally, completely disabling pktcdvd with "install pktcdvd /bin/true" in an /etc/modprobe.d conf file causes all the bugs of comment 19 to disappear for all kernels, and I can read DVDs correctly. This is a workaround but not a definitive solution, though.
Comment 23 Jan Kara 2019-09-06 08:21:32 UTC
(In reply to theo77186 from comment #22)
> Some additional info:
> #2 from comment 19 is indeed deterministic and is triggered with all kernels
> I have.
> For the regression I have noticed something: lsblk doesn't show a pktcdvd
> device on bugged kernels but does show in non-regressed kernels. Maybe for
> some reason the pktcdvd device fails to be created, thus giving the wrong
> size?

Interesting.

> The commits from comment 21 seems to be before v5.1 which works for me so it
> couldn't be the regression commits. I'll still check that.

Don't get misled by the dates. The patches got merged upstream in v5.2-rc1 (git describe --contains 673387a93005).

> Additionally, disabling completely pktcdvd by "install pktcdvd /bin/true" in
> a /etc/modprobe.d conf file causes all the bugs of comment 19 to disappear
> for all kernels and I can read DVDs correctly. This is a workaround but not
> a definitive solution, though.

OK, so at least we know the problem is with pktcdvd. Thanks for testing this.
Comment 24 Markus 2019-09-10 20:22:56 UTC
Created attachment 284911 [details]
Output of dmesg -w  after certain actions (marked UPPRECASE)
Comment 25 Markus 2019-09-10 20:24:39 UTC
Created attachment 284913 [details]
Output of stat_cdrom, cdrecord -v dev=/dev/sr0 -toc, and scsi_readcap
Comment 26 Markus 2019-09-10 20:37:39 UTC
(In reply to Jan Kara from comment 20)

Attachment 284911 [details] was created running kernel 5.2.11-1 (current opensuse tumbleweed default kernel) with the patch from comment 17.


In the dmesg log the following actions are marked in uppercase letters just before the corresponding messages. I did the following:

- INSERT-DVD : inserted a copy protected DVD
- START PLAY -> SEEK ERRORS :  played the DVD (xine dvd://) showing the typical decryption (seek) errors
- MOUNT DVD : mounted the dvd using a file manager, but didn't open a file on the DVD
- UMOUNT: umount /dev/sr0
- EJECT: eject /dev/sr0
- INSERT: re-inserted the DVD 
- PLAY DVD (OK): played the DVD (xine dvd://), now OK w/o decryption errors

In attachment 284913 [details], for the same DVD as used above, the outputs of
-   ./stat_cdrom /dev/sr0 (see attachment 283217 [details])
-   cdrecord -v dev=/dev/sr0 -toc                 and
-   scsi_readcap /dev/sr0
are recorded.
Comment 27 Markus 2019-09-10 20:45:38 UTC
Two things are especially interesting in attachment 284911 [details]:

- the seek errors are reported directly after insertion of the DVD, not at the attempt to play it
- in the second run (after mounting, umounting, and ejecting) the number and position of the seek errors differ from the first run
Comment 28 Norbert Preining 2019-09-17 04:28:18 UTC
Hi
just to add a me-too, I have seen that on many discs. Attached is the output of the three programs (scsi_readcap, stat_cdrom, cdrecord).
Comment 29 Norbert Preining 2019-09-17 04:29:17 UTC
Created attachment 285015 [details]
scsi readcap, stat cdrom, cdrecord output
Comment 30 Nicodemus Schoenwald 2019-10-06 08:50:52 UTC
I had the same issues. Fixed for me by deleting the pktsetup profiles.
"pktsetup -s" to list
"pktsetup -d [device]" to delete

It started with a kernel update for me as well. Reverting the kernel did not fix it. My assumption is that the kernel update failed to clear the pktsetup list properly, and it was stuck after that.

I'm not sure if this is a full fix, though. I no longer get the error messages, but my DVD read rate is still oddly slow.
Comment 31 Jan Kara 2019-10-16 16:22:41 UTC
Markus, thanks for the debug data and sorry for getting back to you so late. When did you gather the output of 'stat_cdrom'? Before or after events captured in dmesg? Because stat_cdrom shows the device size as 0x1fffff but the kernel log shows:

get_sectorsize: Setting sr0 capacity to 15704880 (scsi size=3926220, last written=3926220)

so at least at that moment the device size was set correctly. Really confusing.

Also can you attach the full 'dmesg' log from the debug kernel, without removing any messages, from boot until the moment playing the DVD fails? Maybe that will shed some more light on the problem. Because at this moment I'm really at a loss as to what could be resetting the inode size from the correct value to the incorrect one. Thanks!
Comment 32 Jan Kara 2019-10-16 16:26:15 UTC
(In reply to Markus from comment #27)
> Two things are especially interesting in attachment 284911 [details]:
> 
> - the seek errors are reported directly after insertion of the DVD, not at
> the attempt to play it

I'd assume this is an attempt to automount the DVD (looks like UDF trying to read anchor blocks based on the truncated device size).

> - in the second run (after mounting, umounting, and ejecting) the number and
> position of the seek errors differ from the first run

These actually look again like attempts of UDF driver to read anchor blocks but this time from the correct end of device.
Comment 33 Markus 2019-10-17 20:55:56 UTC
(In reply to Jan, comment #31)

I don't exactly remember when I gathered the stat_cdrom output. But I can repeat that before and after inserting and after trying to play the dvd.

I've sent two dmesg logs to you as PM, since they are not filtered. 

When doing that (using patched kernel 5.2.11-1) I discovered that the DVD is handled correctly when it's already in the drive at boot (a). When the DVD is inserted after booting is complete (b), the bug is present.

This difference exists on one of the two computers (desktop); on the other (laptop) the bug is always present.

An interesting difference between the dmesg logs of (a) and (b) on the desktop computer is the line

  pkt_open_dev+0x95/0x360 [pktcdvd]: Setting bdev sr0 size to 8040898560

which can be found only in the log(a).
Comment 34 Markus 2019-10-17 21:31:30 UTC
Created attachment 285531 [details]
stat-cdrom logsc

Outputs of ./stat_cdrom /dev/sr0, summary:

Buggy state:   Device size: 2097151
Working state: Device size: 15704880
Comment 35 Markus 2019-10-17 21:35:27 UTC
(In reply to Nicodemus Schoenwald from comment #30)
> I had same issues. Fixed for me by deleting the pktsetup profiles.
> "pktsetup -s" to list
> "pktsetup -d [device]" to delete

Here this doesn't help.
Comment 36 Jan Kara 2019-10-18 08:48:53 UTC
Thanks for the debug data! So now things finally start to make more sense. The thing that makes difference is this message in the logs in the "DVD in drive while booting" case:

[    2.583538] __blkdev_get+0x38c/0x4f0: Setting bdev sr0 size to 8040898560

To explain a bit more: There are two places where we store disk size in the kernel. One is controlled by set_capacity() function and in both kernels we see it is set correctly from the message (e.g. from the "bad" case):

[   70.965055] get_sectorsize: Setting sr0 capacity to 15704880 (scsi size=3926220, last written=3926220)

But then this capacity number needs to be carried over to block device size through bd_set_size() which should generate the first message I have written above. But there's no such message in the "bad" log. So bd_set_size() does not happen in the wrong case and that's the reason why the block device size is wrong.

The only explanation I can see for why it does not happen is that someone has /dev/sr0 open before the medium is inserted, as bd_set_size() gets called only on first open. And this opener is actually pktcdvd, as the message:

[    6.602201] pktcdvd: pktcdvd0: writer mapped to sr0

means pktcdvd driver opened sr0 and has it open as long as pktcdvd0 exists. I consider this a bug (where exactly needs some thinking and discussion :)) but as a workaround you can remove 80-pktsetup.rules from /etc/udev/rules.d/.
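
The mechanism described above can be illustrated with a toy Python model. This is only a sketch of the explanation in this comment, not kernel code: `ToyDisk` stands in for the value set by set_capacity(), `ToyBdev` for the block device size set by bd_set_size(), and both class names are invented. The initial capacity of 0x1fffff sectors is the stale value seen in this thread.

```python
class ToyDisk:
    """The capacity the driver sets via set_capacity(), in 512-byte sectors."""

    def __init__(self, capacity_sectors):
        self.capacity_sectors = capacity_sectors


class ToyBdev:
    """The block device size, copied from the disk only on the first open."""

    def __init__(self, disk):
        self.disk = disk
        self.openers = 0
        self.size_bytes = 0

    def open(self):
        if self.openers == 0:  # models bd_set_size() running on first open only
            self.size_bytes = self.disk.capacity_sectors * 512
        self.openers += 1

    def close(self):
        self.openers -= 1


def simulate():
    disk = ToyDisk(0x1FFFFF)          # no medium yet: stale default capacity
    bdev = ToyBdev(disk)
    bdev.open()                       # pktcdvd maps sr0 and keeps it open
    disk.capacity_sectors = 15704880  # DVD inserted, driver updates capacity
    bdev.open()                       # a player opens the device
    stale = bdev.size_bytes
    bdev.close()
    bdev.close()                      # pktcdvd mapping removed as well
    bdev.open()                       # now this is a genuine first open
    return stale, bdev.size_bytes


print(simulate())  # (1073741312, 8040898560)
```

As long as the first opener never closes the device, every later opener sees the size from before the medium was inserted.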
Comment 37 Mike 2019-10-18 22:16:36 UTC
(In reply to Jan Kara from comment #36)

> And the only way which I can see why it does not happen is that someone has
> /dev/sr0 opened before inserting the medium as bd_set_size() gets called
> only on first open.
That makes sense. This bug also happens very often when the drive is also used with VM software - vmware or virtualbox, as it keeps the device open for guest access.
Comment 38 Markus 2019-10-18 23:31:27 UTC
(In reply to Jan Kara from comment #36)

Thank you so much, Jan. Finally there is a workaround.

> I consider this a bug (where exactly needs some thinking and discussion :))
> but as a workaround you can remove 80-pktsetup.rules from /etc/udev/rules.d/.

This helps. I had to disable the default rule in /lib/udev/rules.d, though, as there was no such rule in /etc. Btw., I did this by putting a symlink to /dev/null named 80-pktsetup.rules in /etc/udev/rules.d/ .
Comment 39 Markus 2019-10-20 21:32:52 UTC
(In reply to Jan Kara from comment #36)

> I consider this a bug (where exactly needs some thinking and discussion :))
> but as a workaround you can remove 80-pktsetup.rules from /etc/udev/rules.d/.

At least for built in optical drives, perhaps the pktsetup udev triggered 
actions should take place on media insertion/removal rather than on device 
addition/removal (which practically doesn't take place) ?

Currently we have as default in /lib/udev/rules.d/80-pktsetup.rules 

  ACTION=="add", SUBSYSTEM=="block", ENV{ID_CDROM}=="1", RUN+="/usr/sbin/pktsetup %E{MAJOR}:%E{MINOR}"
  ACTION=="remove", SUBSYSTEM=="block", ENV{ID_CDROM}=="1", RUN+="/usr/sbin/pktsetup -d %E{MAJOR}:%E{MINOR}"

udev generates a change event when a disk is inserted/ejected, where apparently 
ID_CDROM_MEDIA is only set ("1") when a disk is found in the drive, so I'm now 
experimenting with the following modification (/etc/udev/rules.d/80-pktsetup.rules), 
and this seems to work:

  ACTION=="change", SUBSYSTEM=="block", ENV{ID_CDROM_MEDIA}=="1", RUN+="/usr/sbin/pktsetup %E{MAJOR}:%E{MINOR}"
  ACTION=="change", SUBSYSTEM=="block", ENV{ID_CDROM_MEDIA}!="1", RUN+="/usr/sbin/pktsetup -d %E{MAJOR}:%E{MINOR}"

However I don't know which events are generated when a removable optical drive is 
plugged/unplugged (And I'm no udev expert at all).

What's the best place to discuss this issue further?
Comment 40 Jan Kara 2019-10-21 08:29:46 UTC
I guess you can discuss this with systemd / udev developers. But these days the usefulness of pktsetup is diminishing (USB sticks are so much more convenient for read-write or even append workloads). So I'm not sure if it's worth investing too much effort into this.

Also after some more reading of the block device code I'm convinced the bug is there and we should update the device size on every open (for devices with partition scan enabled this already happens). The attached patches should fix the problem (at least they seem to in my testing with kvm).
Comment 41 Jan Kara 2019-10-21 08:30:22 UTC
Created attachment 285595 [details]
[PATCH 1/2] bdev: Factor out bdev revalidation into a common helper
Comment 42 Jan Kara 2019-10-21 08:30:48 UTC
Created attachment 285597 [details]
[PATCH 2/2] bdev: Refresh bdev size for disks without partitioning
Comment 43 Markus 2019-11-03 10:04:32 UTC
(In reply to Jan Kara from comment #41 and #42)

For me the patches #41 and #42 fix the bug.

I have tested them by comparing a standard kernel 5.3.6-1-default (opensuse) versus the patched version of the same kernel. 

The standard udev rules were active (i.e. NOT using the workaround from comment #38).

While the bug is visible with the standard kernel, the patched version works correctly. In the kernel log, the relevant messages appear:

- after inserting a first DVD after boot
[   62.892281] sr0: detected capacity change from 1073741312 to 8040898560

- after ejecting the first DVD and inserting another one:
[  133.809360] sr0: detected capacity change from 8040898560 to 8245069824

Thanks a lot!

Markus
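
The byte counts in these kernel messages can be converted back to the 512-byte sector numbers used earlier in this report. The following Python sketch does that; the regular expression and the name `parse_capacity_change` are only illustrative and assume the message format shown above.

```python
import re

# Assumed message format: "<dev>: detected capacity change from <old> to <new>"
PATTERN = re.compile(r"(\w+): detected capacity change from (\d+) to (\d+)")


def parse_capacity_change(line):
    """Return (device, old_sectors, new_sectors) in 512-byte sectors, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    device = match.group(1)
    old_bytes, new_bytes = int(match.group(2)), int(match.group(3))
    return device, old_bytes // 512, new_bytes // 512


first = parse_capacity_change(
    "[   62.892281] sr0: detected capacity change from 1073741312 to 8040898560"
)
print(first)               # ('sr0', 2097151, 15704880)
print(hex(first[1]))       # 0x1fffff, the stale size discussed in this report
```

The old size in the first message is exactly the 0x1fffff sectors that stat_cdrom reported in the buggy state, and the new size matches the 15704880 sectors of the working state.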
Comment 44 Andrew Udvare 2019-11-07 00:36:50 UTC
I tried the patches on 5.3.8 on Gentoo and things work much better. As the original comment says, MakeMKV would give messages about a bug.

I tested reading DVDs and Blu-ray discs and got appropriate messages:

Nov 05 19:15:22 limelight kernel: sr1: detected capacity change from 1073741312 to 6481942528
Nov 05 19:17:36 limelight kernel: sr0: detected capacity change from 1073741312 to 14246674432
Nov 05 19:23:17 limelight kernel: sr0: detected capacity change from 14246674432 to 6481942528

(I have two drives on my system.)

The read speed for Blu-ray was much faster.
Comment 45 Jan Kara 2019-11-08 12:37:40 UTC
Thanks for testing! Jens has accepted the patch into his tree, so the bug should get fixed in 5.4, or 5.5 at the latest. Thanks to everyone involved in hunting this bug down!