Bug 12206

Summary: Freecom 1TB Firewire drive claims max_rom = 2 but can only handle max_rom = 0
Product: Drivers Reporter: Christian Mueller (cm1)
Component: IEEE1394Assignee: Stefan Richter (stefanr)
Status: CLOSED CODE_FIX    
Severity: normal CC: cm1
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.22 Subsystem:
Regression: --- Bisected commit-id:

Description Christian Mueller 2008-12-12 12:57:17 UTC
Latest working kernel version:
None

Earliest failing kernel version:
None

Distribution:
Debian Etch, Kernel 2.6.22 from backports.org

Hardware Environment:
- Gigabyte GA-MA790GP-DS4H (AMD790GP, TI OHCI 1394 controller)
- AMD Phenom 9550
- 32 bits
- 2GB RAM
- Freecom FireWire Harddrive 1TB

Software Environment:
N/A

Problem Description:
When connecting the Freecom drive, the IEEE subsystem recognizes the new node but fails to configure it. I added debug prints to the drivers and the outcome was the following:

ohci1394: fw-host0: OHCI-1394 1.1 (PCI): IRQ=[17]  MMIO=[fdcff000-fdcff7ff]  Max Packet=[2048]  IR/IT contexts=[4/8]
csr1212: bus_read(0xfffff0000400, 64)
csr1212: kv_len = 20
ieee1394: Node added: ID:BUS[0-00:1023]  GUID[0001db800000007b]
csr1212: bus_read(0xfffff0000400, 1024)
csr1212: kv_len = 16
ieee1394: Host added: ID:BUS[0-01:1023]  GUID[00a8d5e000001fd0]
ieee1394: processing root directory for 0-00:1023
ieee1394: processing root dir entry: key.id = 3, key.type = 0, value = 0x00001b8c
csr1212: bus_read(0xfffff0000480, 64)
csr1212: kv_len = 4116
csr1212: kv_len ends beyond ConfigROM; kv->offset = 1152, cache->offset = 1024, cache->size = 1024
ieee1394: processing root directory for 0-01:1023
ieee1394: processing root dir entry: key.id = 3, key.type = 0, value = 0x0000a8d5
ieee1394: processing root dir entry: key.id = 1, key.type = 2, value = 0x00000006
ieee1394: processing root dir entry: key.id = 12, key.type = 0, value = 0x000083c0

Note how csr1212 tries to read 64 bytes from the bus to get a textual node with the vendor name but gets a length of 4116 bytes which exceeds the size of the configuration ROM. Using an [outdated and hacked] version of ieee1394diag, it showed the same general problem; further poking around reveiled that the first quadlet of the textual node had the same CRC checksum as the bus info block. That was too much of a coincidence, of course -- it seemed as if the read_bus() operation returned the same buffer as when it was previously called to get the bus info block.

Seeing all the quirks in csr1212.c about devices claiming max_rom values which they can't handle, I forced the max_rom value down to 4 bytes and, presto, the hard disk was recognized properly and works fine ever since.

I would suggest to add a module parameter to ieee1394 (in csr1212.c) like so:

#include <linux/module.h>
#include <linux/moduleparam.h>

static int max_rom_io = -1;
module_param(max_rom_io, int, S_IRUGO | S_IWUSR);
MODULE_PARM_DESC(max_rom_io, "maximum I/O size for configuration ROM; default: -1 (use node default), values: 0/1/2 (maps to 4/64/1024 bytes)");

Further down in csr1212.c, allow the module parameter to override the max_rom value retrieved from the device:


*** existing code ***
        if (!csr->ops->get_max_rom) {
                csr->max_rom = mr_map[0];       /* default value */
        } else {
                int i = csr->ops->get_max_rom(csr->bus_info_data,
                                              csr->private);
                if (i & ~0x3)
                        return -EINVAL;
                csr->max_rom = mr_map[i];
        }

*** new code ***
        /* allow module parameter to override max_rom */
        if (max_rom_io != -1 && (max_rom_io & ~0x3) == 0)
                csr->max_rom = mr_map[max_rom_io];


I did not include a patch because my kernel is rather old (2.6.22), has patches from Debian, etc.


Steps to reproduce:
- Attach a Freecom FireWide 1TB Harddrive to any any firewire-capable Linux box.
Comment 1 Christian Mueller 2008-12-12 13:40:07 UTC
On second thought, the problem might not be related to the Freecom controller not supporting more than 4 bytes for ROM reads per se but that the 64 bytes at offset 0xfffff0000480 are too many because the corresponding textual node is less than 64 bytes long. I seem to remember it was more like 16 bytes, total.

Either way, restricting the ROM read size to 4 bytes fixes the problem and doesn't seem to have any impact on the performance once the drive has been recognized.
Comment 2 Anonymous Emailer 2008-12-12 13:53:18 UTC
Reply-To: stefanr@s5r6.in-berlin.de

Please Cc linux1394-devel@lists.sourceforge.net in replies, along with
bugme-daemon@bugzilla.kernel.org.

bugme-daemon@bugzilla.kernel.org wrote:
> Seeing all the quirks in csr1212.c about devices claiming max_rom values
> which
> they can't handle, I forced the max_rom value down to 4 bytes and, presto,
> the
> hard disk was recognized properly and works fine ever since.

Good work figuring out this device bug.

> I would suggest to add a module parameter to ieee1394 (in csr1212.c)

No, this kind of thing needs to work without user intervention.
csr1212.c should switch down to quadlet reads when it detects that it
got bogus data, or simply if the bus info block contains the Feecom OUI.

BTW, the new firewire-core driver always only reads the whole config ROM
in quadlet reads; but it can easily afford to do so because there is
much less overhead due to gap count optimization.


[...]

bugme-daemon@bugzilla.kernel.org wrote:
> ------- Comment #1 from cm1@mumac.de  2008-12-12 13:40 -------
> On second thought, the problem might not be related to the Freecom controller
> not supporting more than 4 bytes for ROM reads per se but that the 64 bytes
> at
> offset 0xfffff0000480 are too many because the corresponding textual node is
> less than 64 bytes long. I seem to remember it was more like 16 bytes, total.
> 
> Either way, restricting the ROM read size to 4 bytes fixes the problem and
> doesn't seem to have any impact on the performance once the drive has been
> recognized.

So how about the following change in csr1212_read_keyval():

 		if ((kv_len + (kv->offset - cache->offset)) > cache->size) {
 			/* The Leaf or Directory claims its length extends
 			 * beyond the ConfigROM image region and thus beyond the
 			 * end of our cache region.  Therefore, we abort now
 			 * rather than seg faulting later. */
-			return -EIO;
+			if (csr->max_rom == 4)
+				return -EIO;
+
+			csr->max_rom = 4;
+			continue;
 		}
Comment 3 Christian Mueller 2008-12-12 14:09:01 UTC
Wow, this was quick!

Yes, this suggestion might work for this particular drive. But it relies 
on the stale data returned from the 64 byte read to yield a length for 
the textual node that's too large for the config ROM. Maybe it would be 
better to do what firewire-core does and fix the config ROM reads to 4 
bytes as well?

Thanks,
--Christian




bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=12206
> 
> 
> 
> 
> 
> ------- Comment #2 from anonymous@kernel-bugs.osdl.org  2008-12-12 13:53
> -------
> Reply-To: stefanr@s5r6.in-berlin.de
> 
> Please Cc linux1394-devel@lists.sourceforge.net in replies, along with
> bugme-daemon@bugzilla.kernel.org.
> 
> bugme-daemon@bugzilla.kernel.org wrote:
>> Seeing all the quirks in csr1212.c about devices claiming max_rom values
>> which
>> they can't handle, I forced the max_rom value down to 4 bytes and, presto,
>> the
>> hard disk was recognized properly and works fine ever since.
> 
> Good work figuring out this device bug.
> 
>> I would suggest to add a module parameter to ieee1394 (in csr1212.c)
> 
> No, this kind of thing needs to work without user intervention.
> csr1212.c should switch down to quadlet reads when it detects that it
> got bogus data, or simply if the bus info block contains the Feecom OUI.
> 
> BTW, the new firewire-core driver always only reads the whole config ROM
> in quadlet reads; but it can easily afford to do so because there is
> much less overhead due to gap count optimization.
> 
> 
> [...]
> 
> bugme-daemon@bugzilla.kernel.org wrote:
>> ------- Comment #1 from cm1@mumac.de  2008-12-12 13:40 -------
>> On second thought, the problem might not be related to the Freecom
>> controller
>> not supporting more than 4 bytes for ROM reads per se but that the 64 bytes
>> at
>> offset 0xfffff0000480 are too many because the corresponding textual node is
>> less than 64 bytes long. I seem to remember it was more like 16 bytes,
>> total.
>>
>> Either way, restricting the ROM read size to 4 bytes fixes the problem and
>> doesn't seem to have any impact on the performance once the drive has been
>> recognized.
> 
> So how about the following change in csr1212_read_keyval():
> 
>                 if ((kv_len + (kv->offset - cache->offset)) > cache->size) {
>                         /* The Leaf or Directory claims its length extends
>                          * beyond the ConfigROM image region and thus beyond
> the
>                          * end of our cache region.  Therefore, we abort now
>                          * rather than seg faulting later. */
> -                       return -EIO;
> +                       if (csr->max_rom == 4)
> +                               return -EIO;
> +
> +                       csr->max_rom = 4;
> +                       continue;
>                 }
> 
> 
Comment 4 Anonymous Emailer 2008-12-12 14:26:15 UTC
Reply-To: stefanr@s5r6.in-berlin.de

bugme-daemon@bugzilla.kernel.org wrote:
> ------- Comment #3 from cm1@mumac.de  2008-12-12 14:09 -------
> Wow, this was quick!
> 
> Yes, this suggestion might work for this particular drive. But it relies 
> on the stale data returned from the 64 byte read to yield a length for 
> the textual node that's too large for the config ROM. Maybe it would be 
> better to do what firewire-core does and fix the config ROM reads to 4 
> bytes as well?
> 
> Thanks,
> --Christian

Ah, right, the check

>>                 if ((kv_len + (kv->offset - cache->offset)) > cache->size) {

is probably not suitable to catch this firmware bug.

I have to check in nodemgr whether we can indeed limit ->bus_read to
quadlet reads without causing regressions for those who have more than
one device attached to the bus.
Comment 5 Anonymous Emailer 2008-12-12 16:44:36 UTC
Reply-To: stefanr@s5r6.in-berlin.de

>> ------- Comment #3 from cm1@mumac.de  2008-12-12 14:09 -------
...
>> Maybe it would be better to do what firewire-core does and fix the
>> config ROM reads to 4 bytes as well?

Alas nodemgr reads the whole config ROM of each node on the bus after
each bus reset, before a driver like sbp2 can proceed with their work.
Well, sbp2 and to lesser extent eth1394 are the only drivers which
actually care what nodemgr is doing.  Anyway, on buses with more than a
single device attached, nodemgr should get done with its business as
quickly as possible in order to proceed with SBP-2 reconnection at the
earliest opportunity.  So in this sense, needlessly un-optimizing the
fetching of config ROMs would hurt a little bit.

The new firewire-core driver does this a lot better:  It can easily
afford to fetch the config ROM in quadlet reads because
  - it makes small transactions cheap by gap count optimization,
  - it usually does not re-read the config ROM of nodes which were
    already present on the bus before a reset,
  - it schedules driver updates of old nodes (e.g. SBP-2 reconnect)
    before fetching the ROM of new nodes.

So, a least intrusive fix for the old ieee1394 stack for Freecom devices
could look like this:

--- linux.orig/drivers/ieee1394/nodemgr.c
+++ linux/drivers/ieee1394/nodemgr.c
@@ -115,8 +115,14 @@ static int nodemgr_bus_read(struct csr12
 	return error;
 }
 
+#define OUI_FREECOM_TECHNOLOGIES_GMBH 0x0001db
+
 static int nodemgr_get_max_rom(quadlet_t *bus_info_data, void *__ci)
 {
+	/* Freecom FireWire Hard Drive firmware bug */
+	if (be32_to_cpu(bus_info_data[3]) >> 8 == OUI_FREECOM_TECHNOLOGIES_GMBH)
+		return 0;
+
 	return (be32_to_cpu(bus_info_data[2]) >> 8) & 0x3;
 }
 

However, I had a quick look at the max_rom field of a couple of SBP-2
HDD enclosures here.  (HDDs are the type of devices where a quickly
acting nodemgr on a bus with several devices matters most.)  I checked
HDDs with the following bridge chips:
  1x INIC-2430, 1x OXF911, 2x OXFW912, 1x OXUF922, 1x OXUF924DSB,
  1x PL-3507, 1x TSB42AA9, 1x TSB42AA9
Also CD-RWs with the following chips:
  1x PL-3507, 2x (guessed) SYM13FW500

All of them had max_rom = 0, except the OXUF924DSB which had max_rom = 1
(i.e. the 924 also supports 64-byte aligned 64-byte block read
requests).

So, this anecdotal evidence indicates that the impact of a general
limitation of nodemgr to quadlet reads would be of negligible impact in
the large majority of use cases because quadlet reads are already used
for most devices anyway.

So, does anybody else here have an opinion whether a specific Freecom
workaround or a general limitation of nodemgr to quadlet reads is
preferable?
Comment 6 Stefan Richter 2008-12-13 03:02:11 UTC
I categorize this bug as blocker of bug 10046 in order to document that this issue is already "fixed" by availability of the new firewire driver stack.  I will nevertheless look into a solution for the old ieee1394 stack.
Comment 7 Christian Mueller 2008-12-13 09:25:27 UTC
Fair enough. But maybe the module option would be a way to allow 
overriding the max_rom value for those few who need it while all others 
will get the previous behavior?

I completely agree that this should be automatic but given the 
circumstances and the lack of time to analyze the impact of a fixed 
value of 4 for max_rom to the rest of the stack and/or multiple devices, 
a module option would both be safe and allow those who need it to tweak 
the behavior.
Comment 8 Anonymous Emailer 2008-12-13 12:13:38 UTC
Reply-To: stefanr@s5r6.in-berlin.de

I wrote:
> I had a quick look at the max_rom field of a couple of SBP-2
> HDD enclosures here.  (HDDs are the type of devices where a quickly
> acting nodemgr on a bus with several devices matters most.)  I checked
> HDDs with the following bridge chips:
>   1x INIC-2430, 1x OXF911, 2x OXFW912, 1x OXUF922, 1x OXUF924DSB,
>   1x PL-3507, 1x TSB42AA9, 1x TSB42AA9
> Also CD-RWs with the following chips:
>   1x PL-3507, 2x (guessed) SYM13FW500
> 
> All of them had max_rom = 0, except the OXUF924DSB which had max_rom = 1
> (i.e. the 924 also supports 64-byte aligned 64-byte block read
> requests).

I now also checked a low-end IIDC camera (Fire-i), a low-end mini DV
camcorder, two OXFW970 based consumer audio devices, and a DVB box.  All
of them have max_rom = 0 too, except the DVB box which features max_rom
= 2 (read requests of up to 1kB supported).

> So, this anecdotal evidence indicates that the impact of a general
> limitation of nodemgr to quadlet reads would be of negligible impact in
> the large majority of use cases because quadlet reads are already used
> for most devices anyway.
Comment 9 Stefan Richter 2008-12-16 11:27:41 UTC
fixed in the upcoming 2.6.28-rc9 by commit
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=25a41b280083259d05d68f61633194344a1f8a9f

I will also submit it for 2.6.27.y.

I plan to submit the more general patch "ieee1394: ignore nonzero Bus_Info_Block.max_rom, fetch config ROM in quadlets" for 2.6.29(-rc1).
http://marc.info/?l=linux1394-devel&m=122920635613052
Comment 10 Stefan Richter 2008-12-20 03:57:02 UTC
also fixed in 2.6.27.10