Bug 12680

Summary: Entropy pool problem
Product: Drivers
Component: Other
Reporter: Valentin QUEQUET (valentin.quequet)
Assignee: drivers_other
Status: CLOSED DOCUMENTED
Severity: normal
CC: rjw
Priority: P1
Hardware: All
OS: Linux
Kernel Version: Any 2.6.29 release candidate
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 12398

Description Valentin QUEQUET 2009-02-09 09:12:58 UTC
Latest working kernel version: 2.6.28.4
Earliest failing kernel version: 2.6.29-rc1
Distribution:
Hardware Environment: IA32 i686 Athlon model 10
Software Environment: Pristine kernel 2.6.29-rc4-git1 + Debian Lenny/Sid
Problem Description: On a machine without VIA PadLock hardware, attempting to insert the PadLock modules incurs a long probing delay:

  I can't say whether this abnormally long probe delay occurs on the "padlock_aes" or the "padlock_sha" module insertion attempt, or on both.

  I do not, in fact, have such hardware.

I have never observed this problem with Linux versions <= 2.6.28.4.

Steps to reproduce:

Power-on your system ;-)

In hope my report will prove useful.

Sincerely,
Valentin QUEQUET.
Comment 1 Anonymous Emailer 2009-02-09 13:05:40 UTC
Reply-To: akpm@linux-foundation.org


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Mon,  9 Feb 2009 09:12:59 -0800 (PST)
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=12680
> 
>            Summary: Not having a VIA PadLock hardware incurs a long delay in
>                     probing on modules insertion attempt.
>            Product: Drivers
>            Version: 2.5
>      KernelVersion: Any 2.6.29 release candidate
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: drivers_other@kernel-bugs.osdl.org

hm, we don't seem to have a bugzilla category for crypto.

>         ReportedBy: v.quequet-techniques@orange.fr
> 
> 
> Latest working kernel version: 2.6.28.4
> Earliest failing kernel version: 2.6.29-rc1

I'll ask Rafael to track this as a post-2.6.28 regression.

> Distribution:
> Hardware Environment: IA32 i686 Athlon model 10
> Software Environment: Pristine kernel 2.6.29-rc4-git1 + Debian Lenny/Sid
> Problem Description: Not having a VIA PadLock hardware incurs a long delay in
> probing on modules insertion attempt:
> 
>   I can't say whether this abnormally long probe delay occurs on
>   "padlock_aes"
> or "padlock_sha" module insertion attempt or both.
> 
>   I do not have such hardware indeed.
> 
> I've never observed this problem so far with Linux version <= 2.6.28.4 .
> 
> Steps to reproduce:
> 
> Power-on your system ;-)
> 
> In hope my report will prove useful.
> 

Neither of those drivers has changed in six months, so the breakage
must be elsewhere.

I guess this should be easy for others to reproduce.

How long is the "long" delay?

Please do this:

	add "log_buf_len=1M" to the kernel boot command line
	reboot

	dmesg -n 8
	modprobe padlock_aes &
	sleep 1
	echo t > /proc/sysrq-trigger
	dmesg -s 1000000 > foo

and then send us `foo'.  Please avoid wordwrapping it.

This will permit us to see where `modprobe' is getting stuck.

Thanks.
Comment 2 Valentin QUEQUET 2009-02-09 14:02:07 UTC
Andrew Morton wrote :
> Neither of those drivers have changed in six months, so the breakage
> must be elsewhere.

Hello,

I fear you're right:

Setting up an LVM volume takes over 20 seconds, sometimes more than 
1 minute, on my system when I run Linux 2.6.29-rcX; there is no such 
problem with Linux <= 2.6.28.4.

I understand this is quite a vague description of the trouble, and I 
plan to analyse it further in the near future.

This might be a Debian Lenny/Sid specific bug. I'm sorry for my haste 
in filing my initial bug report, in which I wrongly held the 
PadLock-related modules responsible for that delay.

> I guess this should be easy for others to reproduce.
> 
> How long is the "long" delay?
> 
> Please do this:
> 
>       add "log_buf_len=1M" to the kernel boot command line
>       reboot
> 
>       dmesg -n 8
>       modprobe padlock_aes &
>       sleep 1
>       echo t > /proc/sysrq-trigger
>       dmesg -s 1000000 > foo
> 
> and then send us `foo'.  Please avoid wordwrapping it.
> 
> This will permit us to see where `modprobe' is getting stuck.

No delay observed when running these commands; I was wrong.

> Thanks.

Thanks too.

From the point of view of an average Debian Lenny/Sid user, it is still 
a post-2.6.28 regression, though. Note: I only compared different 
pristine kernel versions.

I am sorry not to have more accurate clues about this trouble.

In hope my report will prove useful.

Sincerely,
Valentin QUEQUET
Comment 3 Valentin QUEQUET 2009-02-11 08:40:26 UTC
Andrew wrote :
> I'll ask Rafael to track this as a post-2.6.28 regression.

Hello,

More about this ; see below.

> Neither of those drivers have changed in six months, so the breakage
> must be elsewhere.
> 
> I guess this should be easy for others to reproduce.
> 
> How long is the "long" delay?
> 
> Please do this:
> 
>       add "log_buf_len=1M" to the kernel boot command line
>       reboot
> 
>       dmesg -n 8
>       modprobe padlock_aes &
>       sleep 1
>       echo t > /proc/sysrq-trigger
>       dmesg -s 1000000 > foo
> 
> and then send us `foo'.  Please avoid wordwrapping it.
> 
> This will permit us to see where `modprobe' is getting stuck.
> 
> Thanks.

Thanks too ; the problem was elsewhere.

Hello the hurd,


I've finally found why my computer seems to hang (pause) for quite a 
long time when I boot pristine Linux 2.6.29-rcX instead of pristine 
Linux 2.6.28.4 (for example).

The reason is that cryptographic key generation for the Device Mapper 
takes longer with 2.6.29 than with 2.6.28 under certain circumstances.


To notice a non-negligible delay in the key generation phase, the system 
must meet both of the following conditions:

   1) The system PRNG entropy pool must be short of the entropy that is 
normally supplied in the form of environmental noise.

   2) The system must set up its Device-Mapper-encrypted (dm-crypt) 
partitions with cryptographic keys generated dynamically at boot time, 
using "/dev/random" as key file (the 3rd field of "/etc/crypttab"; see 
"man crypttab").


Such a long delay in the key generation phase can be avoided if the 
system meets either of the following conditions:

   1) An excited user hammers the keyboard and mouse (generating a lot of 
environmental noise) to feed the PRNG entropy pool with plenty of entropy 
(or some other peripheral generates noise: network interface, ...).

   2) The system sets up its dm-crypt partitions using "/dev/urandom" as 
key file.


But in the scenario where both
   1) environmental noise is reduced to the minimum (no user 
'excitation', and mouse and NIC unplugged)
and
   2) dm-crypt partitions are initialized with "/dev/random" as key file,
there is a huge difference depending on whether I boot Linux 2.6.28.y or 
Linux 2.6.29-rcX.
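
One way to put numbers on this difference would be to record the kernel's entropy estimate and to time a key-sized read from each device. The following is only a sketch: the function names `entropy_avail` and `time_key_read` are made up for illustration, the 32-byte default merely stands for a 256-bit key, and `head -c` and `date +%s` are assumed to be available (they are on GNU systems).

```shell
# Print the kernel's current entropy estimate, in bits.
# $1 (optional): file to read; defaults to the usual procfs location.
entropy_avail() {
    cat "${1:-/proc/sys/kernel/random/entropy_avail}"
}

# Print how many whole seconds it takes to read a key-sized chunk.
# $1: device (or file) to read from, e.g. /dev/random or /dev/urandom
# $2 (optional): number of bytes to read; 32 is an illustrative default
time_key_read() (
    dev=$1
    bytes=${2:-32}
    start=$(date +%s)
    # On the kernels discussed here, this read blocks on /dev/random
    # until the pool holds enough entropy.
    head -c "$bytes" "$dev" > /dev/null || exit 1
    end=$(date +%s)
    echo $((end - start))
)
```

For example, running `entropy_avail`, then `time_key_read /dev/random` and `time_key_read /dev/urandom`, early in the boot sequence on each kernel would give figures that can be compared directly.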


In order to provide you with meaningful information, but not too much, I 
attach a few "bootchart"-generated logs (bootchart*.tgz) plus their 
corresponding ".svgz" diagrams (Pruned and Not-Pruned) for the following 
test cases:

Always with environmental noise reduced to its minimum possible level.
Alternating between Linux versions 2.6.28 and 2.6.29.
Alternating between "/dev/random" and "/dev/urandom" as dm-crypt key file.

There are thus 4 test cases for which I attach files, and for each test 
case, I provide:
   - The "bootchart*.tgz" bootchart report.
   - The corresponding Not-Pruned ".svgz" SVG diagram.
   - The corresponding Pruned ".svgz" SVG diagram.

This leads to the following 12 files:

-r--r--r-- 1 testr testr 174682 Feb 11 17:10 DevRandom_bootchart-2.6.28.4.BootChart_Report.tgz
-r--r--r-- 1 testr testr 102648 Feb 11 17:10 DevRandom_bootchart-2.6.28.4.Not-Pruned_SVG_Diagram.svgz
-r--r--r-- 1 testr testr  26010 Feb 11 17:10 DevRandom_bootchart-2.6.28.4.Pruned_SVG_Diagram.svgz
-r--r--r-- 1 testr testr 327701 Feb 11 17:10 DevRandom_bootchart-2.6.29-rc4-git1.BootChart_Report.tgz
-r--r--r-- 1 testr testr 175522 Feb 11 17:10 DevRandom_bootchart-2.6.29-rc4-git1.Not-Pruned_SVG_Diagram.svgz
-r--r--r-- 1 testr testr  39844 Feb 11 17:10 DevRandom_bootchart-2.6.29-rc4-git1.Pruned_SVG_Diagram.svgz
-r--r--r-- 1 testr testr 138401 Feb 11 17:10 DevUrandom_bootchart-2.6.28.4.BootChart_Report.tgz
-r--r--r-- 1 testr testr  80691 Feb 11 17:10 DevUrandom_bootchart-2.6.28.4.Not-Pruned_SVG_Diagram.svgz
-r--r--r-- 1 testr testr  21136 Feb 11 17:10 DevUrandom_bootchart-2.6.28.4.Pruned_SVG_Diagram.svgz
-r--r--r-- 1 testr testr 152979 Feb 11 17:10 DevUrandom_bootchart-2.6.29-rc4-git1.BootChart_Report.tgz
-r--r--r-- 1 testr testr  78323 Feb 11 17:10 DevUrandom_bootchart-2.6.29-rc4-git1.Not-Pruned_SVG_Diagram.svgz
-r--r--r-- 1 testr testr  20745 Feb 11 17:10 DevUrandom_bootchart-2.6.29-rc4-git1.Pruned_SVG_Diagram.svgz

For the sake of convenience, I have put them all into a single tar 
archive, "Dev-Random_regression_on_post-2.6.28_kernels.tar".

In hope my report will prove useful.

Sincerely,
Valentin QUEQUET

n.b. : Don't hesitate to ask me for more files or explanations.
Comment 4 Anonymous Emailer 2009-02-11 09:16:50 UTC
Reply-To: akpm@linux-foundation.org


(cc dm-devel)

On Wed, 11 Feb 2009 17:27:42 +0100 Valentin QUEQUET <v.quequet-techniques@orange.fr> wrote:

> 
> I've finally found why my computer seems to hang (pause) quite lengthy 
> when I boot Pristine Linux 2.6.29-rcX... instead of Pristine Linux 
> 2.6.28.4 (for example).
> 
> The reason is that the cryptographic keys generation for the Device 
> Mapper takes longer with 2.6.29 than with 2.6.28 under certain 
> circumstances.

So it's device-mapper userspace?

Is this new behaviour in recent kernel versions?  Some kernel change
caused /dev/random accesses to wait for longer before sufficient
entropy has been gathered?


Comment 5 Milan Broz 2009-02-11 11:28:57 UTC
Andrew Morton wrote:
> (cc dm-devel)
> 
> On Wed, 11 Feb 2009 17:27:42 +0100 Valentin QUEQUET
> <v.quequet-techniques@orange.fr> wrote:
> 
>> I've finally found why my computer seems to hang (pause) quite lengthy 
>> when I boot Pristine Linux 2.6.29-rcX... instead of Pristine Linux 
>> 2.6.28.4 (for example).
>>
>> The reason is that the cryptographic keys generation for the Device 
>> Mapper takes longer with 2.6.29 than with 2.6.28 under certain 
>> circumstances.
> 
> So it's device-mapper userspace?

No. cryptsetup (which is probably the "device-mapper userspace" here) reads
/dev/random only during luksFormat or when manipulating keyslots
(adding a key, for example).

The situation you are talking about arises when you have, for example, swap
encrypted with a random key. It is the initscripts that own /etc/crypttab
and that simply tell cryptsetup "use /dev/random as keyfile".

The initscripts are also responsible for loading the random seed to
properly initialize the RNG *before* this.

Most distributions use two steps: mount the volume containing /var
(where the random seed is stored), and later mount the encrypted volumes
that use a random key.

I do not know whether the delay in the new kernel is a bug, but the lack
of entropy during system boot is a "known" problem.
(Imagine a 128-bit random key that was generated quickly and contains only
a few random bits because of the lack of entropy... it is better not to
use encryption at all than to use such a key!)

(If you use LUKS, the random key is generated during luksFormat and
you do not need random data (entropy) on activation; you just need to
enter a known passphrase to unlock the keyslot holding the volume key.)

Milan
--
mbroz@redhat.com
Comment 6 Valentin QUEQUET 2009-02-11 13:00:01 UTC
Note: My answers follow Milan's post, with a few exceptions scattered 
throughout his reply; my summary comes further down.

Milan Broz wrote :
> Andrew Morton wrote:
>> (cc dm-devel)
>>
>> On Wed, 11 Feb 2009 17:27:42 +0100 Valentin QUEQUET
>> <v.quequet-techniques@orange.fr> wrote:
>>
>>> I've finally found why my computer seems to hang (pause) quite lengthy 
>>> when I boot Pristine Linux 2.6.29-rcX... instead of Pristine Linux 
>>> 2.6.28.4 (for example).
>>>
>>> The reason is that the cryptographic keys generation for the Device 
>>> Mapper takes longer with 2.6.29 than with 2.6.28 under certain 
>>> circumstances.
>> So it's device-mapper userspace?

I don't know ; sorry for not knowing everything.

> 
> No. cryptsetup (which is probably "device-mapper userspace" here) reads
> /dev/random only during luksFormat or during manipulating with keyslots
> (adding key for example).
> 
> The situation you are talking about is when you have for example swap
> encrypted with random key. It is initscripts which owns /etc/crypttab
> and which just tell cryptsetup "use /dev/random as keyfile".

I use the following config file under Debian Lenny/Sid:

Config file "/etc/crypttab" contains:

{

   # <target name> <source device>         <key file>      <options>
   crswap_hda2 /dev/hda2 /dev/random swap,cipher=aes-cbc-essiv:sha256
   crtmp_hda5 /dev/hda5 /dev/random tmp,cipher=aes-cbc-essiv:sha256

}
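
Entries like the ones above can also be checked mechanically. The sketch below lists the targets whose key file (3rd field) is "/dev/random" and prints a copy of the table with that field switched to "/dev/urandom". The function names are invented for illustration, fields are assumed to be separated by blanks or tabs, and whether a "/dev/urandom" key is acceptable this early in boot remains a policy decision for the administrator.

```shell
# List the targets whose key file (3rd field) is /dev/random.
# $1 (optional): table to read; defaults to /etc/crypttab.
list_random_keyfiles() {
    awk '
        /^[ \t]*#/ { next }                      # skip comment lines
        $3 == "/dev/random" { print $1 }         # print the target name
    ' "${1:-/etc/crypttab}"
}

# Print the table with every /dev/random key file replaced by
# /dev/urandom. Nothing is modified in place; changed lines are
# re-joined with single spaces, other lines are printed verbatim.
use_urandom_keyfile() {
    awk '
        /^[ \t]*#/ { print; next }               # keep comments as they are
        $3 == "/dev/random" { $3 = "/dev/urandom" }
        { print }
    ' "${1:-/etc/crypttab}"
}
```

Running `list_random_keyfiles` on the table shown above should name both crswap_hda2 and crtmp_hda5; the output of `use_urandom_keyfile` could be reviewed by hand before replacing the real file.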

> Also initscripts are responsible for loading of random seed to 
> properly initialize RNG *before* this.
> 
> Most distributions uses two steps - mount volume with /var
> (where is the random seed stored) and later mount encrypted volumes
> using random key.

I didn't know that either; please excuse my great ignorance.

> I do not know if the delay in new kernel is bug, but the problem
> with lack of entropy during system boot is "known" problem.
> (Imagine 128bit random key which use fast-generated key with only
> few random bits because of lack of entropy... better to not
> use encryption at all then use such key!)

It isn't even a problem: one must know that GOOD RANDOMNESS requires 
TIME to collect ENVIRONMENTAL NOISE, and that TRUE RANDOMNESS is 
impossible without a dedicated device like a Lava Lamp, ...

> (if you use LUKS, the random key is generated during luksFormat and
> you do not need random data (entropy) on activation, you just need
> enter known passphrase to unlock keyslot with the volume key.)

I don't plan to use this alternative, though.

However, I may consider passphrase-seeded cryptographic keys for some 
purposes, but NOT FOR SWAP or the /TMP directory (in case of a 
keylogger ...).

> Milan
> --
> mbroz@redhat.com

Hello the hurd,

To sum up, 2.6.29-rcX is more reluctant than 2.6.28.y to provide 
/dev/random output to userspace.

Maybe the kernel itself draws on this entropy pool for its own use, for, 
let's say, randomizing processes' memory layout?

I know nothing about the dear Linux kernel!


In hope my report will prove useful,

Sincerely,
Valentin QUEQUET
Comment 7 Alan 2009-03-19 05:03:21 UTC
Newer kernels are fussier about what they consider "random", but that is not a bug; rather, some timeable and observable sources are no longer eligible.

I'm closing this because

a) it's a policy decision for the user/userspace about what level of randomness they want
b) the kernel can't create entropy out of thin air
c) the kernel provides a facility for user space to save/restore entropy into the random device - although clearly that has its own security considerations

Thus it's not something the kernel can do anything about.
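
As an illustration of point c), a boot script might save a seed at shutdown and feed it back at the next boot, roughly as sketched below. The function names, the seed file argument, and the 512-byte default are assumptions made for this example rather than what any particular distribution ships. Note also that data written to "/dev/urandom" is mixed into the pool without being credited as entropy, so this improves the quality of the output but would not by itself shorten a blocking "/dev/random" read.

```shell
# Save a seed for the next boot (meant to be run at shutdown).
# $1: seed file to write
# $2 (optional): source device; defaults to /dev/urandom
# $3 (optional): number of bytes; 512 is an illustrative default
save_random_seed() (
    umask 077                                # seed must not be world-readable
    head -c "${3:-512}" "${2:-/dev/urandom}" > "$1"
)

# Mix a previously saved seed back into the pool (meant to be run at boot).
# $1: seed file to read
# $2 (optional): target device; defaults to /dev/urandom
restore_random_seed() {
    [ -r "$1" ] || return 1
    cat "$1" > "${2:-/dev/urandom}"
}
```

A shutdown script would call `save_random_seed` with a path on persistent storage, and the boot script would call `restore_random_seed` with the same path once that storage is mounted.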