Bug 10077

Summary: nozomi (driver for Qualcomm 3G PCMCIA adapter) crashes
Product: Drivers
Reporter: Antek Grzymala (awaria)
Component: PCMCIA
Assignee: Frank Seidel (fseidel)
Status: CLOSED CODE_FIX
Severity: normal
CC: greg
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.25-rc2-git6
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: relevant kernel config
output from dmesg
PIN perl script
nozomi.c
nozomi.disasm
nozomi.s
Latest nozomi version (of my private tree) probably fixing this issue
possible fix

Description Antek Grzymala 2008-02-23 13:29:04 UTC
Earliest failing kernel version: 2.6.23 with gregkh patches
Distribution: Gentoo Linux
Hardware Environment: Option N.V. Qualcomm MSM6275 UMTS chip [1931:000c]

Problem Description:

When I boot my system with the Option card inserted I get the following in my syslog:

pccard: CardBus card inserted into slot 0
Initializing Nozomi driver 2.1d (build date: Feb 22 2008 21:34:48)
nozomi 0000:03:00.0: Init, new card found
nozomi 0000:03:00.0: Card type is: 2048
PCI: Enabling device 0000:03:00.0 (0000 -> 0002)
ACPI: PCI Interrupt 0000:03:00.0[A] -> GSI 18 (level, low) -> IRQ 18
------------[ cut here ]------------
WARNING: at arch/x86/mm/ioremap.c:137 __ioremap+0x1c2/0x1e3()
Modules linked in: nozomi(+) evdev isofs zlib_inflate loop yenta_socket rsrc_nonstatic pcmcia pcmcia_core firewire_ohci firewire_core crc_itu_t arc4 ecb ohci1394 ieee1394 snd_hda_intel snd_pcm snd_timer snd soundcore sdhci snd_page_alloc mmc_core tifm_7xx1 hci_usb tifm_core iwl3945 mac80211 bluetooth cfg80211 tpm_infineon tpm tpm_bios tg3 sg psmouse
Pid: 3480, comm: modprobe Not tainted 2.6.25-rc2-git6 #1
 [<c01249ea>] warn_on_slowpath+0x4e/0x5e
 [<c0258d1d>] ? acpi_pci_allocate_irq+0x65/0x6f
 [<c012a282>] ? __request_region+0x6a/0xa0
 [<c021f549>] ? pci_request_region+0x81/0x21a
 [<c0116619>] __ioremap+0x1c2/0x1e3
 [<c021f82d>] ? pci_request_selected_regions+0x32/0x67
 [<c0116653>] ioremap_nocache+0xa/0xc
 [<f8eb5023>] nozomi_card_init+0x209/0x642 [nozomi]
 [<c0180f8b>] ? find_inode+0x3a/0x64
 [<c01acbc0>] ? sysfs_ilookup_test+0x0/0x11
 [<c01acbc0>] ? sysfs_ilookup_test+0x0/0x11
 [<c0181076>] ? ifind+0x2a/0x88
 [<c01acbc0>] ? sysfs_ilookup_test+0x0/0x11
 [<c01ad12d>] ? sysfs_addrm_finish+0x16/0x1b8
 [<c01aced1>] ? sysfs_add_one+0x3e/0x8f
 [<c01acf6d>] ? sysfs_addrm_start+0x4b/0x87
 [<c022155a>] ? pci_match_device+0xa1/0xa9
 [<c0221622>] pci_device_probe+0x44/0x5f
 [<c0282c04>] driver_probe_device+0x81/0x157
 [<c0282e0d>] __driver_attach+0x8c/0x8e
 [<c02820f8>] bus_for_each_dev+0x41/0x5f
 [<c0282ab3>] driver_attach+0x19/0x1b
 [<c0282d81>] ? __driver_attach+0x0/0x8e
 [<c028297a>] bus_add_driver+0x1a5/0x20b
 [<c022158d>] ? pci_device_remove+0x0/0x3a
 [<c0282f9f>] driver_register+0x3d/0xe9
 [<c01720d3>] ? cdev_add+0x31/0x33
 [<c0374b42>] ? mutex_lock+0xe/0x20
 [<c0171f8c>] ? exact_lock+0x0/0x11
 [<c0221800>] __pci_register_driver+0x35/0x65
 [<f8d160e2>] nozomi_init+0xe2/0x100 [nozomi]
 [<c014493c>] sys_init_module+0x119/0x1bef
 [<c0280b0c>] ? device_remove_file+0x0/0x11
 [<c016fba2>] ? rw_verify_area+0x5a/0xb9
 [<c0170748>] ? sys_read+0x3d/0x64
 [<c0103f1a>] sysenter_past_esp+0x5f/0x85
 =======================
---[ end trace f44e4108df509696 ]---

I also tried removing the module (modprobe -r), loading it again (with the card still inserted) and then running a perl script that sends some data to the /dev/noz0 port. Here's what I get in my kernel log then:

Unloading Nozomi driver
pccard: CardBus card inserted into slot 0
Initializing Nozomi driver 2.1d (build date: Feb 22 2008 21:34:48)
nozomi 0000:03:00.0: Init, new card found
nozomi 0000:03:00.0: Card type is: 2048
PCI: Enabling device 0000:03:00.0 (0000 -> 0002)
ACPI: PCI Interrupt 0000:03:00.0[A] -> GSI 18 (level, low) -> IRQ 18
BUG: unable to handle kernel NULL pointer dereference at 00000008
IP: [<f8eb3c1b>] :nozomi:ntty_write_room+0x5a/0x6e
*pdpt = 0000000030c8e001 *pde = 0000000000000000 
Oops: 0000 [#1] SMP 
Modules linked in: nozomi sit tunnel4 aes_i586 aes_generic tun ipv6 bridge llc snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_indigoio fuse evdev isofs zlib_inflate loop yenta_socket rsrc_nonstatic pcmcia pcmcia_core firewire_ohci firewire_core crc_itu_t arc4 ecb ohci1394 ieee1394 snd_hda_intel snd_pcm snd_timer snd soundcore sdhci snd_page_alloc mmc_core tifm_7xx1 hci_usb tifm_core iwl3945 mac80211 bluetooth cfg80211 tpm_infineon tpm tpm_bios tg3 sg psmouse [last unloaded: nozomi]

Pid: 13425, comm: perl Not tainted (2.6.25-rc2-git6 #1)
EIP: 0060:[<f8eb3c1b>] EFLAGS: 00010202 CPU: 0
EIP is at ntty_write_room+0x5a/0x6e [nozomi]
EAX: 00000000 EBX: f287d060 ECX: 00000000 EDX: 00000001
ESI: 00000000 EDI: f287d09c EBP: f1bf8efc ESP: f1bf8ef0
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process perl (pid: 13425, ti=f1bf8000 task=f1c74e00 task.ti=f1bf8000)
Stack: 0000000f 0000000f f1c33000 f1bf8f38 c026c0e5 f1c33c00 f2358840 f1c33130 
       f1c33c00 f1bf8f38 00000000 f1c74e00 c011d086 f1c33134 f1c33134 0000000f 
       f1c33000 f2358840 f1bf8f6c c0269ac7 0000000f 0000000f 08872980 c026bfeb 
Call Trace:
 [<c026c0e5>] ? write_chan+0xfa/0x304
 [<c011d086>] ? default_wake_function+0x0/0xd
 [<c0269ac7>] ? tty_write+0x11c/0x1a7
 [<c026bfeb>] ? write_chan+0x0/0x304
 [<c01701c2>] ? vfs_write+0x8b/0x11f
 [<c02699ab>] ? tty_write+0x0/0x1a7
 [<c01707ac>] ? sys_write+0x3d/0x64
 [<c0103f1a>] ? sysenter_past_esp+0x5f/0x85
 [<c0370000>] ? init_centaur+0x1a9/0x2e9
 =======================
Code: f5 0e 4c c7 85 c0 74 17 31 f6 8b 43 38 85 c0 75 17 89 f8 e8 d2 0e 4c c7 89 f0 5b 5e 5f 5d c3 31 f6 89 f0 5b 5e 5f 5d c3 8b 43 08 <8b> 50 08 2b 50 0c 8b 70 04 29 d6 89 f8 e8 ad 0e 4c c7 eb d9 55 
EIP: [<f8eb3c1b>] ntty_write_room+0x5a/0x6e [nozomi] SS:ESP 0068:f1bf8ef0
---[ end trace f44e4108df509696 ]---
Comment 1 Frank Seidel 2008-02-25 01:44:09 UTC
Hi, I'm currently trying to reproduce this bug. Antek, could you please also post/attach your kernel config?
Comment 2 Antek Grzymala 2008-02-25 02:51:47 UTC
Created attachment 14976 [details]
relevant kernel config

As requested I'm attaching my kernel config.
Comment 3 Frank Seidel 2008-02-25 06:39:42 UTC
Antek, thanks for the config. Even with it, though, I cannot reproduce the problem here (and I am trying very hard to).
Could you please also provide a full dmesg? For example, the nozomi driver's initialization OK message is missing from your first post.

What does this perl script send to the card? Perhaps with it I can reproduce the problem here.
Comment 4 Frank Seidel 2008-02-25 06:42:03 UTC
BTW did your card work before 2.6.23?
Comment 5 Frank Seidel 2008-02-25 08:22:17 UTC
Antek, could you please also attach your nozomi.c, nozomi.disasm (via objdump -d drivers/char/nozomi.o > drivers/char/nozomi.disasm) and nozomi.s (via make drivers/char/nozomi.s in your kernel tree)?
Comment 6 Antek Grzymala 2008-02-25 12:43:10 UTC
Hi, thank you for your work on this bug. The card certainly did work using code from pharscape.org (in particular the nozomi-2.21a ebuild from http://bugs.gentoo.org/show_bug.cgi?id=144913). That driver stopped compiling on 2.6.23 (AFAIK); there were some patches floating around, but it never really worked again. This includes the big patch against the kernel tree (from gregkh), which made the driver compile, but with it I started getting errors like the one in this bug.
Comment 7 Antek Grzymala 2008-02-25 12:45:12 UTC
Created attachment 14984 [details]
output from dmesg

This is the almost complete output from dmesg (the beginning was overwritten in the ring buffer). It covers first loading the driver and then trying to run the perl script (included as the next attachment). The perl script is used to enter the PIN.
Comment 8 Antek Grzymala 2008-02-25 12:49:36 UTC
Created attachment 14985 [details]
PIN perl script
Comment 9 Antek Grzymala 2008-02-25 12:54:00 UTC
However, please note that we get the crash before I actually run the script, so the first crash seems unrelated (and sorry about comment noise).
Comment 10 Frank Seidel 2008-02-25 13:52:05 UTC
Hm, that's really odd :( I have never seen an error like this. I know that doesn't help you, but until now I have only heard reports of how stable nozomi currently is.

What about your other nozomi.* files (comment #5)?

By the way, do you have recent firmware on your nozomi card? Could you try your card in another machine? Or the other way round: do you also use other pccards (with or without problems) in your laptop?
Comment 11 Antek Grzymala 2008-02-25 14:07:51 UTC
One moment... Forgot about the files from comment #5. My nozomi card works under Windows, I'll try to see if there's newer firmware for this. I'll also try on another machine if I get a chance.

As to other PCMCIA hardware, I actually do have a problem with a recently bought IndigoIO audio interface (http://bugzilla.kernel.org/show_bug.cgi?id=9955). It also works perfectly under Windows, so it's probably not a hardware issue.
Comment 12 Antek Grzymala 2008-02-25 14:08:24 UTC
Created attachment 14988 [details]
nozomi.c
Comment 13 Antek Grzymala 2008-02-25 14:08:53 UTC
Created attachment 14989 [details]
nozomi.disasm
Comment 14 Antek Grzymala 2008-02-25 14:09:21 UTC
Created attachment 14990 [details]
nozomi.s
Comment 15 Frank Seidel 2008-02-25 21:57:59 UTC
Thanks for the files. I will look into them later today.

However, your other bug (#9955) currently makes me believe there is a problem with your pccard slot (at least under Linux; unfortunately I also had an HP notebook whose slot did not work reliably under Linux for me). The problem with your IndigoIO audio card seems extremely similar (it also has problems getting resources during card probing/initialization).
So, yes, testing your card in another machine would probably be very helpful.

Thanks,
Frank
Comment 16 Frank Seidel 2008-02-26 08:40:05 UTC
After looking closely at both bugs (this one and #9955), I definitely think there is a problem with your PCMCIA (PCIxx12) bridge.
Neither the Indigo IO nor the nozomi driver should fail at that point (and both seem to work in this regard on other machines). Neither can proceed in its probe function after the PCI resource allocation.

I really wish I could help you more with this problem (as I am really dedicated to fixing possible issues with the nozomi driver), but IMHO you need to fix your PCMCIA setup first (e.g. via some workarounds in /etc/pcmcia/config.opts).

Please feel free to reopen once you have made sure this isn't a problem with your machine (and I promise to keep an eye on it as well).
Comment 17 Frank Seidel 2008-03-04 15:27:14 UTC
Antek, I have a small update for the nozomi driver that works very well here (although the same was true of the old/current version in my case).
While I don't really expect it to, there is a slight chance it might help you. Would you like to give this patch a try?

ftp://ftp.kernel.org/pub/linux/kernel/people/fseidel/nozomi/nozomi_update_2.1e.patch

Thanks,
Frank
Comment 18 Frank Seidel 2008-03-05 08:21:18 UTC
I finally found a machine and a way to (possibly) reproduce your problem. At least I now also get an oops on the very first access to /dev/noz0.
Comment 19 Frank Seidel 2008-03-05 08:36:10 UTC
Created attachment 15153 [details]
Latest nozomi version (of my private tree) probably fixing this issue
Comment 20 Frank Seidel 2008-03-05 08:38:41 UTC
Created attachment 15154 [details]
possible fix

(sorry, the last attachment - via url to patch - didn't work as expected)
Comment 21 Frank Seidel 2008-03-05 08:40:42 UTC
Antek, could you please test if this patch also fixes your problem?

Thanks,
Frank
Comment 22 Antek Grzymala 2008-03-06 01:25:16 UTC
Hi, thanks for your ongoing work on this bug. I tested the new driver today with 2.6.25-rc4-git1 and unfortunately it still oopses when the driver is loaded. The difference is that when I try to write to the device node, it now responds with:

# echo 1234 > /dev/noz0
-su: /dev/noz0: No such device

Do you think I should open a new bug about a regression in the PCMCIA subsystem? This device used to work with older kernels (and obviously, older versions of nozomi driver), and I suppose my audio-card problem is also caused by the same regression.

relevant snippet from dmesg follows:

pccard: CardBus card inserted into slot 0
Initializing Nozomi driver 2.1e (build date: Mar  6 2008 09:47:37)
nozomi 0000:03:00.0: Init, new card found
PCI: Enabling device 0000:03:00.0 (0000 -> 0002)
ACPI: PCI Interrupt 0000:03:00.0[A] -> GSI 18 (level, low) -> IRQ 18
nozomi 0000:03:00.0: Card type is: 2048
------------[ cut here ]------------
WARNING: at arch/x86/mm/ioremap.c:137 __ioremap+0x1bb/0x1dc()
Modules linked in: nozomi(+) sit tunnel4 aes_i586 aes_generic tun ipv6 snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_indigoio fuse evdev isofs zlib_inflate loop yenta_socket rsrc_nonstatic pcmcia_core firewire_ohci firewire_core crc_itu_t arc4 ecb ohci1394 ieee1394 sdhci tifm_7xx1 snd_hda_intel mmc_core snd_pcm tifm_core snd_timer iwl3945 mac80211 snd hci_usb sg soundcore snd_page_alloc cfg80211 bluetooth tg3 tpm_infineon tpm tpm_bios usb_storage psmouse
Pid: 8926, comm: modprobe Not tainted 2.6.25-rc4-git1 #1
 [<c01240aa>] warn_on_slowpath+0x4e/0x5e
 [<c0217f0e>] ? vsnprintf+0x2f8/0x603
 [<c012495d>] ? release_console_sem+0x1d1/0x1dd
 [<c0115f82>] __ioremap+0x1bb/0x1dc
 [<c0115fbc>] ioremap_nocache+0xa/0xc
 [<f8fcb475>] nozomi_card_init+0x29e/0x695 [nozomi]
 [<c01ac330>] ? sysfs_ilookup_test+0x0/0x11
 [<c01ac330>] ? sysfs_ilookup_test+0x0/0x11
 [<c0180786>] ? ifind+0x2a/0x88
 [<c01ac330>] ? sysfs_ilookup_test+0x0/0x11
 [<c01ac89d>] ? sysfs_addrm_finish+0x16/0x1b8
 [<c01ac641>] ? sysfs_add_one+0x3e/0x8f
 [<c01ac6dd>] ? sysfs_addrm_start+0x4b/0x87
 [<c022056a>] ? pci_match_device+0xa1/0xa9
 [<c0220632>] pci_device_probe+0x44/0x5f
 [<c0281bc4>] driver_probe_device+0x81/0x157
 [<c0281dcd>] __driver_attach+0x8c/0x8e
 [<c02810b8>] bus_for_each_dev+0x41/0x5f
 [<c0281a73>] driver_attach+0x19/0x1b
 [<c0281d41>] ? __driver_attach+0x0/0x8e
 [<c028193a>] bus_add_driver+0x1a5/0x20b
 [<c022059d>] ? pci_device_remove+0x0/0x3a
 [<c0281f5f>] driver_register+0x3d/0xe9
 [<c01717e3>] ? cdev_add+0x31/0x33
 [<c0374132>] ? mutex_lock+0xe/0x20
 [<c017169c>] ? exact_lock+0x0/0x11
 [<c0220810>] __pci_register_driver+0x35/0x65
 [<f8da10e2>] nozomi_init+0xe2/0x100 [nozomi]
 [<c0144075>] sys_init_module+0x117/0x1bdf
 [<c027fb2c>] ? device_remove_file+0x0/0x11
 [<c016f2a2>] ? rw_verify_area+0x5a/0xb9
 [<c016fe48>] ? sys_read+0x3d/0x64
 [<c0103eda>] sysenter_past_esp+0x5f/0x85
 =======================
---[ end trace 517cc97b8fceb731 ]---
Comment 23 Frank Seidel 2008-03-06 02:05:05 UTC
Hi, thanks for your fast response :-). Yes, I think you have a severe problem with your PCMCIA/CardBus system. The ioremap call in nozomi that triggers this warning is fully valid and should work without these problems. (Note that this is a warning, not an oops; or did you leave something out, or is there another backtrace?) If I remember correctly, someone is already working on this problem, but filing a bug to track the issue surely won't hurt.

The "No such device" error is intended and is caused by the new patch. On a working CardBus system it can only happen during card initialization (in the first fractions of a second after the card is plugged in), when the device is still being set up and is not yet fully ready to use. Until now, accessing the card that early caused a real oops (like the one at the end of your initial description here).
Access is now deferred until it is really safe to use the card, which prevents problems like this. BTW, I currently put updates to the nozomi driver into the directory of the FTP URL in comment #17 every now and then (at least for now, until I perhaps set up my own tree). It's always good to have more good souls willing to test my code :-)

So, as your ioremap problem is not related to the nozomi driver and the only real oops seen is fixed, I think it's fair to say this bug is resolved.
(Of course I'll try to push those patches upstream, but it is probably already too late for 2.6.25.)
Comment 24 Frank Seidel 2008-03-10 22:51:02 UTC
OK, just for the record: it wasn't really too late (at least not for a pure bugfix). So I isolated the changes that prevent the mentioned oops, and thanks to Greg the fix is already in the current Linus git tree.
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=661b4e89daf10e3f65a1086fd95c7a84720ccdd2
Comment 25 Antek Grzymala 2008-04-19 15:22:50 UTC
This got fixed in the final 2.6.25 release. Last release candidate I tried still failed.

My related soundcard problem http://bugzilla.kernel.org/show_bug.cgi?id=9955 was also fixed.

Thank you all who helped (and who fixed the actual bug).