Bug 15691 - mISDN hfcpci.c: module reference count underflow
Summary: mISDN hfcpci.c: module reference count underflow
Status: RESOLVED OBSOLETE
Alias: None
Product: Drivers
Classification: Unclassified
Component: ISDN (show other bugs)
Hardware: All Linux
: P1 high
Assignee: Karsten Keil
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-04-04 14:46 UTC by Andreas Mohr
Modified: 2013-12-10 18:19 UTC (History)
2 users (show)

See Also:
Kernel Version: 2.6.32.10
Tree: Mainline
Regression: No


Attachments
Proposed fix (3.90 KB, patch)
2010-04-25 16:57 UTC, Karsten Keil
Details | Diff

Description Andreas Mohr 2010-04-04 14:46:26 UTC
Hello,

I just tried (yet another) attempt at a working ISDN setup on my box,
consisting of 2.6.32.10, Asterisk 1.6.2.6, mISDN (kernel), mISDNuser current git and LCR current git (using chan_lcr).

The good: it is semi-working (can at least get hold of incoming caller numbers and notify users, which is one of the larger parts that I wanted to get working).
The bad: several little semi-fatal quirks, among these [SUBJECT].

After a while, I end up with hfcpci.ko showing a module reference count of -4
(yup --> it's pretty much reboot time in case of a need for serious reinitialization).


Looking at actual code in drivers/isdn/hardware/mISDN/hfcpci.c, it is fortunately quite easy to see how this is possible:

open_bchannel() does:

        if (test_and_set_bit(FLG_OPEN, &bch->Flags))
                return -EBUSY; /* b-channel can be only open once */
.
.
.
        if (!try_module_get(THIS_MODULE))
                printk(KERN_WARNING "%s:cannot get module\n", __func__);

And then hfc_bctrl() does:

        case CLOSE_CHANNEL:
                test_and_clear_bit(FLG_OPEN, &bch->Flags);
                if (test_bit(FLG_ACTIVE, &bch->Flags))
                        deactivate_bchannel(bch);
                ch->protocol = ISDN_P_NONE;
                ch->peer = NULL;
                module_put(THIS_MODULE);
                ret = 0;
                break;

It is quite easy to imagine some user component simply invoking another CLOSE_CHANNEL upon global component shutdown time, "just to make sure" (in fact it's not to blame at all, missing an "active" CLOSE_CHANNEL might happen, in which case shutdown code should close it unconditionally).
At least that's what I think is happening here (possibly _twice_ for each B-channel of this card).
Since CLOSE_CHANNEL has no check whether the channel is _actually_ open, this seems to be the way that I end up with a reference count of -4.
In the B channel case, preventing this should be easy:
since it sets FLG_OPEN on open, we probably merely need to actively check
test_and_clear_bit(FLG_OPEN, &bch->Flags)
return value on CLOSE_CHANNEL.
However the D channel doesn't implement any such flag (yet).

How to go about this issue?

Severity high since it easily provokes a reboot requirement.

Thanks,

Andreas Mohr
Comment 1 Andreas Mohr 2010-04-04 14:55:34 UTC
Hohumm, just found out that hfcsusb.c and several other mISDN drivers seem to have similar handling of B/D channels.
Comment 2 Karsten Keil 2010-04-25 16:57:56 UTC
Created attachment 26134 [details]
Proposed fix

This should fix this refcounting issue, it was not related to B-channel open/close, this is very easy. The issue is, that a L2 open of the D-channel
also implicit opens the L1 of the D-channel in some places, but not call the L1 open, but closing the D-channel, calls close for all entities (TEI manager, L2 and the stack itself) - but the TEI manager did not call open, so the refcount was not increased.

Note, that if you use NT mode, the L2 stack will be not shutdown on closing the L2
sockets by default (which result in a positiv refcount). You can set a flag to do the shutdown on close after open the socket, or use the clean_layer2 utility to shutdown the NT l2 instances (which then should result in a refcount of zero). This is not new behavior, but was hidden by the refcounting bug.
Comment 3 Andreas Mohr 2010-04-25 21:08:12 UTC
Ah, that sounds about right, thanks!

I'd think that normally a lower layer should validate (be protected against) any rogue input given by upper layers (thus a reference count issue would be avoided). However in the Linux module reference count implementation case this would be difficult since one would have to implement some user tracking to judge on whether a certain user is entitled to call close or not. Which probably isn't such a terribly good excuse for current reference count handling, but...

But at least it's good to see an existing issue in the upper layer fixed, thank you for this streamlined processing!

I'm currently unable to test the patch, but feel free to ask for testing (once circumstances allow for that).
Comment 4 Joerg Esser 2010-09-14 16:19:41 UTC
Patch works out of the box without problems.
Tested with hfcmulti and hfcpci.
Maybe this can go into the kernel tree ?

hf,

Joerg

Note You need to log in before you can comment on or make changes to this bug.