Bug 7730

Summary: intel-rng does not work when built into the kernel rather than as a module
Product: Drivers
Component: Other
Reporter: Ominousfate
Assignee: Michael Buesch (mb)
Status: CLOSED CODE_FIX
Severity: low
Priority: P2
CC: jbeulich, mb
Hardware: i386
OS: Linux
Kernel Version: 2.6.19 (and 2.6.18-gentoo-r4)
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: output for "lspci -n"
dmesg output when intel-rng is built-in
dmesg output when intel-rng is a module

Description Ominousfate 2006-12-21 22:02:39 UTC
Most recent kernel where this bug did *NOT* occur: 2.6.17
Distribution: Gentoo
Hardware Environment:

/proc/cpuinfo:
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 8
model name      : Celeron (Coppermine)
stepping        : 3
cpu MHz         : 598.211
cache size      : 128 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 mmx fxsr sse
bogomips        : 1197.43

lspci -vvv:
00:00.0 Host bridge: Intel Corporation 82810 GMCH [Graphics Memory Controller
Hub] (rev 02)
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR+ FastB2B-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort+ >SERR- <PERR-
	Latency: 0

00:01.0 VGA compatible controller: Intel Corporation 82810 CGC [Chipset Graphics
Controller] (rev 02) (prog-if 00 [VGA])
	Subsystem: Intel Corporation 82810 CGC [Chipset Graphics Controller]
	Control: I/O- Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
	Latency: 0
	Interrupt: pin A routed to IRQ 7
	Region 0: Memory at ec000000 (32-bit, prefetchable) [disabled] [size=64M]
	Region 1: Memory at e8000000 (32-bit, non-prefetchable) [disabled] [size=512K]
	Capabilities: [dc] Power Management version 1
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:1e.0 PCI bridge: Intel Corporation 82801AB PCI Bridge (rev 02) (prog-if 00
[Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR+ FastB2B-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
	Latency: 0
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=32
	I/O behind bridge: 00003000-00003fff
	Memory behind bridge: e8100000-e9ffffff
	Prefetchable memory behind bridge: f0000000-f7ffffff
	Secondary status: 66MHz- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort+ <SERR- <PERR-
	BridgeCtl: Parity- SERR+ NoISA+ VGA+ MAbort- >Reset- FastB2B-

00:1f.0 ISA bridge: Intel Corporation 82801AB ISA Bridge (LPC) (rev 02)
	Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
	Latency: 0

00:1f.1 IDE interface: Intel Corporation 82801AB IDE (rev 02) (prog-if 80 [Master])
	Subsystem: Intel Corporation 82801AB IDE
	Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
	Latency: 0
	Region 4: I/O ports at 1800 [size=16]

00:1f.2 USB Controller: Intel Corporation 82801AB USB (rev 02) (prog-if 00 [UHCI])
	Subsystem: Intel Corporation 82801AB USB
	Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
	Latency: 0
	Interrupt: pin D routed to IRQ 11
	Region 4: I/O ports at 1820 [size=32]

00:1f.3 SMBus: Intel Corporation 82801AB SMBus (rev 02)
	Subsystem: Intel Corporation 82801AB SMBus
	Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
	Interrupt: pin B routed to IRQ 9
	Region 4: I/O ports at 1810 [size=16]

00:1f.5 Multimedia audio controller: Intel Corporation 82801AB AC'97 Audio (rev 02)
	Subsystem: Intel Corporation Unknown device 5643
	Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
	Latency: 0
	Interrupt: pin B routed to IRQ 9
	Region 0: I/O ports at 1200 [size=256]
	Region 1: I/O ports at 1300 [size=64]

01:0b.0 VGA compatible controller: nVidia Corporation NV11 [GeForce2 MX/MX 400]
(rev b2) (prog-if 00 [VGA])
	Subsystem: eVga.com. Corp. Unknown device b039
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR- FastB2B-
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
	Latency: 248 (1250ns min, 250ns max)
	Interrupt: pin A routed to IRQ 3
	Region 0: Memory at e9000000 (32-bit, non-prefetchable) [size=16M]
	Region 1: Memory at f0000000 (32-bit, prefetchable) [size=128M]
	[virtual] Expansion ROM at e8120000 [disabled] [size=64K]
	Capabilities: [60] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-

01:0d.0 Ethernet controller: Lite-On Communications Inc LNE100TX (rev 21)
	Subsystem: Netgear FA310TX
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR+ FastB2B-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
	Latency: 64
	Interrupt: pin A routed to IRQ 9
	Region 0: I/O ports at 3000 [size=256]
	Region 1: Memory at e8110000 (32-bit, non-prefetchable) [size=256]
	[virtual] Expansion ROM at e8140000 [disabled] [size=256K]

01:0e.0 Communication controller: Conexant HSF 56k Data/Fax Modem (rev 01)
	Subsystem: Mac System Co Ltd HP
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping-
SERR+ FastB2B+
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
	Latency: 64
	Interrupt: pin A routed to IRQ 7
	Region 0: Memory at e8100000 (32-bit, non-prefetchable) [size=64K]
	Region 1: I/O ports at 3400 [size=8]
	Capabilities: [40] Power Management version 2
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-

Software Environment:

Tested with 2.6.19 tarball directly from kernel.org and compiled with gcc
version 4.1.1 (Gentoo 4.1.1-r1).

But it also happens with this patched kernel version currently running:
Linux version 2.6.18-gentoo-r4 (root@p4v1l10nC3l3r0n) (gcc version 4.1.1 (Gentoo
4.1.1-r1)) #6 Mon Dec 18 22:56:25 PST 2006

Problem Description:

When intel-rng is built as a module the device works and /dev/hw_random is
accessible. Also dmesg shows this line: Intel 82802 RNG detected.

If intel-rng is built into the kernel rather than as a module, the above line is
missing from dmesg and /dev/hw_random is not accessible.

This problem happens with the mainline kernel 2.6.19.1 from kernel.org and
2.6.18-gentoo-r4. I tested without loading any binary modules.

Steps to reproduce:

When configuring the kernel, set this option as built-in rather than as a module:
Device Drivers->Character devices->Intel Hardware Random Number Generator
Support (Defined at drivers/char/hw_random/Kconfig:13).

Workaround:

Compile intel-rng as a module. Or use kernel 2.6.17 or earlier.

Keywords:

built-in character intel-rng hw_random i810 module random rng

Side note: I think the patch to allow intel-rng to be built as a module was not
present in 2.6.17 and earlier but is present in 2.6.18 and above.
Comment 1 Jan Beulich 2006-12-22 02:39:18 UTC
I cannot reproduce this. Could you check boot.msg for RNG related messages, or
possibly attach it here? Could you please also clarify whether the 2.6.18-*
kernel you're referring to is plain 2.6.18 based, or which of the 2.6.18.x ones
it is based on? Finally (and just to be sure), could you please also attach
lspci -n output?
Comment 2 Ominousfate 2006-12-23 23:09:13 UTC
Created attachment 9940
output for "lspci -n"
Comment 3 Ominousfate 2006-12-23 23:11:28 UTC
Created attachment 9941
dmesg output when intel-rng is built-in

Unmodified 2.6.18.1 from kernel.org.
Comment 4 Ominousfate 2006-12-23 23:12:27 UTC
Created attachment 9942
dmesg output when intel-rng is a module

Unmodified 2.6.18.1 kernel from kernel.org
Comment 5 Ominousfate 2006-12-23 23:53:33 UTC
Also, when trying to recreate the bug with the driver built in: if an
intel-rng.ko is still sitting in /lib/modules/$kernel-version from a previous
kernel compile where it was set as a module and you ran "make modules_install",
be sure it is not being loaded.
Probably a clearer way to explain how I recreated the bug:
Delete /lib/modules/$kernel-version. Then configure your kernel so that
intel-rng is built-in. Then run make and make modules_install to recreate the
/lib/modules/$kernel-version tree (intel-rng.ko should not be in the tree). Copy
System.map and bzImage to /boot, update the bootloader, and reboot.
Comment 6 Jan Beulich 2007-01-02 07:21:04 UTC
This must then be a result of the splitting of hw_random, although I can't yet
see why. Michael, do you have any idea?

Originator (no clue what your name is), can you perhaps add a printk to the top
of drivers/char/hw_random/intel-rng.c:mod_init(), so we know whether this
routine is being called at all? Also, am I right in assuming that the sole
difference between the two kernels you posted the messages for is the
CONFIG_HW_RANDOM_INTEL setting?
Comment 7 Michael Buesch 2007-01-02 14:41:27 UTC
I think we shouldn't be using subsys_initcall().
Can you change it to module_init() and try again?

I don't remember why we did subsys_initcall() in the first place. I think every
RNG driver should use module_init(). This really seems to be a bug in every RNG
driver.
Comment 8 Ominousfate 2007-01-03 22:29:41 UTC
I added a printk message at the top of the mod_init() function and compiled with
the driver built in. The message was displayed on boot, so mod_init() is being
called.

Not sure if this helps but I also tried changing this:

        if (!pci_dev_present(pci_tbl))
                goto out; /* Device not found. */

to this

        if (!pci_dev_present(pci_tbl))
        {
                printk("*** !pci_dev_present(pci_tbl) evaluated true ***\n");
                goto out; /* Device not found. */
        }

to see if pci_dev_present(pci_tbl) evaluates to false and indeed it did as I saw
the inserted printk message on bootup.

Also, the two dmesg outputs I attached are from the exact same kernel sources,
2.6.18.1 (consecutive compiles, actually, as the uname info at the top shows #1
and #2: first built in and then as a module), so yes, the only difference is
the CONFIG_HW_RANDOM_INTEL setting.

*Michael's suggestion fixes the problem.
By replacing this at the bottom of intel-rng.c:

	subsys_initcall(mod_init);

with this

	module_init(mod_init);

the device is detected and also works when the driver is built in. I also tested
it as a module, just in case, and it still works as a module.
Comment 9 Michael Buesch 2007-01-04 03:43:42 UTC
I will do a patch and submit it upstream.
Comment 10 David Brownell 2007-01-12 14:57:44 UTC
Huh, and I was getting ready to submit a patch to make the OMAP RNG 
use subsys_initcall, along with other fixes.  Obviously, that RNG does 
not have PCI-derived constraints. 
 
Agreed that PCI init sequence requirements must be obeyed, but not that 
doing such late initialization is preferred.  In the general case, 
drivers and subsystems may need to access RNG functionality themselves, 
so doing that init as early as practical is safest ... ensuring that 
randomness is there. 
Comment 11 Michael Buesch 2007-01-13 06:06:33 UTC
No, you are wrong. Nobody can use the _hardware_ RNG before userspace is up and
rngd is running. And that is damn late...