Most recent kernel where this bug did not occur: 2.6.13.2, perhaps 2.6.14/ 2.6.15 (I found it broken since 2.6.16) Distribution: Debian Unstable Hardware Environment: i686, USB Mass Storage Device Genesyslogic 001600016303 Software Environment: cdrecord (newest released version), Nero Linux Problem Description: cdrecord cannot burn, error message: Cannot allocate memory. write_g1: scsi sendcmd: no error CDB: 2A 00 00 00 00 00 00 00 1B 00 status: 0x0 (GOOD STATUS) cmd finished after 0.000s timeout 200s cdrecord: A write error occured. Nero simply complains about an unknown burn error without details. # lsusb Bus 005 Device 001: ID 0000:0000 Bus 004 Device 001: ID 0000:0000 Bus 003 Device 001: ID 0000:0000 Bus 002 Device 001: ID 0000:0000 Bus 001 Device 006: ID 05e3:0701 Genesys Logic, Inc. USB 2.0 IDE Adapter Bus 001 Device 001: ID 0000:0000 # dmesg | grep usb usb usb1: Product: EHCI Host Controller usb usb1: Manufacturer: Linux 2.6.17.8 ehci_hcd usb usb1: SerialNumber: 0000:00:10.4 usb usb1: uevent usb usb1: configuration #1 chosen from 1 choice usb usb1: adding 1-0:1.0 (config #1, interface 0) usb 1-0:1.0: uevent hub 1-0:1.0: usb_probe_interface hub 1-0:1.0: usb_probe_interface - got id hub 1-0:1.0: no power switching (usb 1.0) drivers/usb/core/inode.c: creating file '001' drivers/usb/core/inode.c: creating file '002' usb usb2: default language 0x0409 usb usb2: new device strings: Mfr=3, Product=2, SerialNumber=1 usb usb2: Product: UHCI Host Controller usb usb2: Manufacturer: Linux 2.6.17.8 uhci_hcd usb usb2: SerialNumber: 0000:00:10.0 usb usb2: uevent usb usb2: configuration #1 chosen from 1 choice usb usb2: adding 2-0:1.0 (config #1, interface 0) usb 2-0:1.0: uevent hub 2-0:1.0: usb_probe_interface hub 2-0:1.0: usb_probe_interface - got id hub 2-0:1.0: no power switching (usb 1.0) drivers/usb/core/inode.c: creating file '001' drivers/usb/core/inode.c: creating file '003' usb usb3: default language 0x0409 usb usb3: new device strings: Mfr=3, Product=2, SerialNumber=1 usb usb3: Product: UHCI Host Controller usb usb3: Manufacturer: Linux 2.6.17.8 uhci_hcd usb usb3: SerialNumber: 0000:00:10.1 usb usb3: uevent usb usb3: configuration #1 chosen from 1 choice usb usb3: adding 3-0:1.0 (config #1, interface 0) usb 3-0:1.0: uevent hub 3-0:1.0: usb_probe_interface hub 3-0:1.0: usb_probe_interface - got id hub 3-0:1.0: no power switching (usb 1.0) drivers/usb/core/inode.c: creating file '001' drivers/usb/core/inode.c: creating file '004' usb usb4: default language 0x0409 usb usb4: new device strings: Mfr=3, Product=2, SerialNumber=1 usb usb4: Product: UHCI Host Controller usb usb4: Manufacturer: Linux 2.6.17.8 uhci_hcd usb usb4: SerialNumber: 0000:00:10.2 usb usb4: uevent usb usb4: configuration #1 chosen from 1 choice usb usb4: adding 4-0:1.0 (config #1, interface 0) usb 4-0:1.0: uevent hub 4-0:1.0: usb_probe_interface hub 4-0:1.0: usb_probe_interface - got id hub 4-0:1.0: no power switching (usb 1.0) usb 1-5: new high speed USB device using ehci_hcd and address 2 drivers/usb/core/inode.c: creating file '001' drivers/usb/core/inode.c: creating file '005' usb usb5: default language 0x0409 usb usb5: new device strings: Mfr=3, Product=2, SerialNumber=1 usb usb5: Product: UHCI Host Controller usb usb5: Manufacturer: Linux 2.6.17.8 uhci_hcd usb usb5: SerialNumber: 0000:00:10.3 usb usb5: uevent usb usb5: configuration #1 chosen from 1 choice usb usb5: adding 5-0:1.0 (config #1, interface 0) usb 5-0:1.0: uevent hub 5-0:1.0: usb_probe_interface hub 5-0:1.0: usb_probe_interface - got id hub 5-0:1.0: no power switching (usb 1.0) usb 1-5: device descriptor read/64, error -32 drivers/usb/core/inode.c: creating file '001' usb 1-5: device descriptor read/64, error -32 usb 1-5: new high speed USB device using ehci_hcd and address 3 usb 1-5: device descriptor read/64, error -32 usb 1-5: device descriptor read/64, error -32 usb 1-5: new high speed USB device using ehci_hcd and address 4 usb 1-5: device not accepting address 4, error -32 usb 1-5: new high speed USB device using ehci_hcd and address 5 usb 1-5: device not accepting address 5, error -32 usb 1-8: new high speed USB device using ehci_hcd and address 6 usb 1-8: default language 0x0409 usb 1-8: new device strings: Mfr=2, Product=3, SerialNumber=4 usb 1-8: Product: USB Mass Storage Device usb 1-8: Manufacturer: Genesyslogic usb 1-8: SerialNumber: 001600016303 usb 1-8: uevent usb 1-8: configuration #1 chosen from 1 choice usb 1-8: adding 1-8:1.0 (config #1, interface 0) usb 1-8:1.0: uevent usb-storage 1-8:1.0: usb_probe_interface usb-storage 1-8:1.0: usb_probe_interface - got id usb-storage: device found at 6 usb-storage: waiting for device to settle before scanning usbcore: registered new driver usb-storage usb-storage: device scan complete
device information about the usb burner from scsi cd driver: Vendor: TSSTcorp Model: CD/DVDW TS-H552U Rev: US03 Type: CD-ROM ANSI SCSI revision: 00 sr0: scsi3-mmc drive: 1x/48x writer cd/rw xa/form2 cdda tray sr 2:0:0:0: Attached scsi CD-ROM sr0 sr 2:0:0:0: Attached scsi generic sg1 type 5
Hae you tried replacing the USB cable, or plugging it into a different port on the computer? If nothing else works, please use the usbmon facility (instructions are in the kernel source file Documentation/usb/usbmon.txt; note that you will have to set CONFIG_USB_MON and CONFIG_DEBUG_FS) and attach a copy of the usbmon log showing what happens when you try to write a CD or DVD.
Created attachment 8849 [details] usbmon output as requested
I don't think that it has to do with the port or cabling because it works with Kernel 2.6.13. Attached you find the debug output from usbmon.
I just saw that the original filename is lost: it is a gzipped text file
There's no indication in the log of any problem involving the USB stack. Every command was successfully transmitted and acknowledged. A few of the commands weren't understood by the drive, but that's the fault of the CD-writing program, not the kernel. There also were several replies saying "Cannot read medium - incompatible format", but they were near the beginning and so presumably were harmless. The log shows that several commands took a few seconds to carry out, so _something_ was going on. Is it possible to get a similar log from a kernel where the program works?
I just did the same action twice, first with Kernel 2.6.13.2, then with Kernel 2.6.17.8. So I hope you will find a difference between both logs. With Kernel 2.6.13.2 I got a working video cd, with 2.6.17.8 the cd remained unwritten (I was still able to use it for burning another file). This indicates, that the burning process does not really start though the drive is flashing its LED, indicating that something is going on.
Created attachment 8861 [details] usbmon output: successfull burning process with 2.6.13.2
Created attachment 8862 [details] usbmon output: unsuccessfull burning process with 2.6.17.8
It's weird. The two logs have a very large portion in common, but then they diverge. If you really did run the same program both times, it sent different commands to the drive under the two OS versions. I don't know why it would do that. Unfortunately the usbmon information for 2.6.13 doesn't include the details of each command, so I can't compare them directly. The main divergence between the two logs occurs at a place where there's a 3-second delay in 2.6.17 and no delay in 2.6.13. And this is a delay _between_ commands, not during a command, so I don't know what caused it. Perhaps it is a permissions issue. Were you logged in as root when you ran the tests? There still isn't any sign that this is a kernel problem. A better approach might be to start by doing some debugging on cdrecord. For instance, try adding -V 2>some_debugging_file to the command line when you run it, so that the commands can be compared and examined in detail. (There will be a lot of debugging data.)
Created attachment 8867 [details] cdrecord -V output: unsuccessfull burning process with 2.6.17.8
Created attachment 8868 [details] cdrecord -V output: successfull burning process with 2.6.13.2
Created attachment 8869 [details] cdrecord standard output when error occurs
Created attachment 8870 [details] usbmon output: unsuccessfull burning process with 2.6.17.8
Created attachment 8871 [details] usbmon output: unsuccessfull burning process with 2.6.17.8
Created attachment 8872 [details] usbmon output: successfull burning process with 2.6.13.2
I was root all the time, yes. The problem exists with any burning program I found: the debian version of cdrecord, the current official cdrecord release and Nero Linux. As the last one is commercial software and does not depend on cdrtools, and as everything works when I simply boot the old kernel and change nothing else, I think it has to be a kernel problem. Just attached debugging output from cdrecord and corresponding new usbmon output. I did run exactly the same command in both cases. Perhaps the problem is at scsi layer?
Progress! I think you're right; the problem is at the SCSI layer or the block layer. The important divergence point in the cdrecord -V logs is here: 2.6.13: Executing 'read buffer cap' command on Bus 2 Target 0, Lun 0 timeout 200s CDB: 5C 00 00 00 00 00 00 00 0C 00 cmd finished after 0.000s timeout 200s Executing 'write_g1' command on Bus 2 Target 0, Lun 0 timeout 200s CDB: 2A 00 FF FF FF 6A 00 00 1F 00 cmd finished after 0.132s timeout 200s 2.6.17: Executing 'read buffer cap' command on Bus 3 Target 0, Lun 0 timeout 200s CDB: 5C 00 00 00 00 00 00 00 0C 00 cmd finished after 0.003s timeout 200s Executing 'write_g1' command on Bus 3 Target 0, Lun 0 timeout 200s CDB: 2A 00 FF FF FF 6A 00 00 1F 00 cdrecord: Cannot allocate memory. write_g1: scsi sendcmd: no error CDB: 2A 00 FF FF FF 6A 00 00 1F 00 status: 0x0 (GOOD STATUS) cmd finished after 0.000s timeout 200s As you can see, the WRITE command succeeded in 2.6.13 but failed with (presumably) -ENOMEM status in 2.6.17. The usbmon logs confirm this; in both logs the READ BUFFER CAP command is present, but the WRITE command is present only in the 2.6.13 log. So in 2.6.17 that command never even got down to the USB level. I don't know which kernel interface your version of cdrecord uses to send its commands. Probably SG_IO but maybe something else. Try running it in 2.6.17 under strace and see what you get. That will tell us where in the kernel to start debugging.
yes, it is SG_IO: [pid 5825] ioctl(3, SG_IO, 0xbfb28d0c) = -1 ENOMEM (Cannot allocate memory) attached is the full strace output
Created attachment 8877 [details] strace output of cdrecord
Okay. The next place to debug is in the kernel source file block/scsi_ioctl.c, the sg_io() function. There are only a few places where it might return -ENOMEM: the calls to blk_get_request, kmalloc, blk_rq_map_user_iov, or blk_rq_map_user. Can you add code in there to print out the interesting values when -ENOMEM occurs? To help eliminate excessive noise, you can filter out all commands other than WRITE by checking for cmd[0] == 0x2a. At a guess, the most likely cause for this error is that hdr->iovec_count or hdr->dxfer_len is too large. hdr->dxfer_len should be 63488; it's quite possible that it needs to be set down to 32768. You can do this using the -ts= option in cdrecord.
Thank you, this was the right guess: it worked with -ts=32768 :-) Will this be generally necessary from now on, or can you change something in the kernel code to fix this?
I don't think the kernel will change in this respect. In fact, it was a change to usb-storage that made the -ts option necessary. We were told by an engineer at Genesys Logic specifically that their USB-IDE bridges need the transfer size to be <= 32768 bytes. Now in fact they seem to work just fine with transfers as large as 65536, but we decided to be conservative and follow the engineer's advice. The problem is, of course, that cdrecord doesn't have any good way of learning about this size limit -- so when it submits a WRITE command with a 63 KB buffer, of course the command fails. So now you have two choices. One is to use the -ts option whenever you run cdrecord. That's a perfectly good way of doing things. The other is to manually increase the transfer size limit to 64 KB. You can do that by using the max_sectors attribute as explained here: http://www.linux-usb.org/FAQ.html#i5 I've never heard of anyone having a problem using the Genesys chip at the higher setting, so it should be okay for you too. The embarassing thing is that I knew all of this, and had even debugged a very similar problem for someone else a while back. But it all slipped my mind, so you had to go through the whole tedious debugging process. I'm sorry for all that unnecessary work. But at least we found the reason for your problem.
No problem, I was glad to get so much help and was prepared to do debugging. Indeed everything (including NeroLinux) works when using the command echo 128 >/sys/block/sr0/device/max_sectors Are there any FAQs that could include this information in case someone else has this problem? Again, thanks for your help! I think you can close this bug now.
There's the FAQ I referred you to at www.linux-usb.org. It wouldn't hurt to add an entry explaining the issue with cdrecord and Genesys Logic interfaces. I can't close the bug report because I'm not the owner. As the submitter, however, _you_ can close it.
The option ts=<transfersize> has been added to cdrecord as it seems that nobody from the Linux kernel team is interested in fixing the related Linux kernel bug: Linux does not set up the return value for the ioctl(f, SG_GET_RESERVED_SIZE &val) correctly. cdrecord would have no problems if the Linux kernel would implement it's interfaces correctly. Is there a chance that Linux will be fixed, so there is no need for a workaround anymore?
J
Whether Linux implements a new ioctl or not does not change the problem: libscg needs the related information at the time when it decides about the max DMA size. If something like: /sys/class/scsi_generic/sgN/device/block:srM/queue/max_sectors_kb is used, it would be completely useless to libscg as libscg cannot care about other drivers sitting at a higher level than the SCSI tranport layer. I checked a Linux systen and found: /sys/class/scsi_generic/sg0/device/block/queue/max_hw_sectors_kb Even this would not be usable by libscg except if there was a way to retrieve the sg instance number from an open fd and to verify that the fd is connected to a /dev/sg* device node. Libscg only has the open fd when it tries to retrieve the max DMA size.... I tried to implement a workaround using a /sys/ based hack. This hack really looks ugly and demonstrates that /sys/ is not a real alternativer to a clean set of functionality provided via an ioctl(2) interface. I am not sure whether this attempt for a workaround helps, so I am interested in feedback..... Plase check the 2.01.01a23-pre source tree.
I meanwhile did receive a test result asnd it seems that it is no possible to retrieve the needed information using /sys/* If you don't have any suggestion on how to get to the information without a need to know about unrelated higher level interface drivers, there is no way to work around the problem without a fix inside te Linux kernel.
A patch changing the value returned by SG_GET_RESERVED_SIZE, as suggested by J