Bug 14638 - device mapper unaccounted null pointer
Summary: device mapper unaccounted null pointer
Status: CLOSED INVALID
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: LVM2/DM
Hardware: All Linux
Importance: P1 high
Assignee: Alasdair G Kergon
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-11-19 00:38 UTC by bugzilla
Modified: 2009-11-21 00:19 UTC
CC List: 3 users

See Also:
Kernel Version: 2.6.32-rc7
Subsystem:
Regression: No
Bisected commit-id:


Description bugzilla 2009-11-19 00:38:00 UTC
I've been using cryptsetup to access an external hard drive of mine through an encrypted mapping, with postgresql pointed at it for (relatively) steady disk access. I upgraded to mainline (2.6.32-rc7) and found that within a day or so the device mapper would mysteriously "lose" its mapping entirely. The mount point remains mounted but cannot even be statted: ls reports "ls: cannot access /media/extern". Running umount /media/extern works without error, but "cryptsetup remove extern" then fails permanently, necessitating a full reboot.

I'm pretty sure cryptsetup is a front-end for the device mapper, using the dm-crypt module. I'm not sure whether that is specifically the problem. When I type "cryptsetup status extern", it reports the following:

/dev/mapper/extern is active:
  cipher:  aes-cbc-essiv:sha256
  keysize: 256 bits
  device:  (null)
  offset:  0 sectors
  size:    310472190 sectors
  mode:    read/write

Where it says "device: (null)" it should obviously say "device: /dev/sdb1". It initially did say that, and worked perfectly normally. But some unknown trigger (while I was away from the computer, to boot) causes the pointer to the device to be overwritten with a null pointer.

I don't know much about the device mapper, or how to get more detailed statistics, or any possible solutions to the problem. But I didn't see it posted as a bug, so thought I'd point it out so wiser folks than me can know what to look out for. I'm going back to 2.6.31.6 to see if the problem is there too, or if it was a recent addition.

I'm on an amd64 system, with an external USB hard drive enclosure that's never given me any problems before. It's still powered up and there's no error regarding it in dmesg, and fdisk -l /dev/sdb gives me perfectly normal readings, so I don't think it's a hard disk failure, but instead a failure of something in the mount-point/device mapper/device chain.
Comment 1 Andrew Morton 2009-11-19 23:36:41 UTC
reassigned to lvm/dm
Comment 2 Alasdair G Kergon 2009-11-20 00:10:50 UTC
Please supply diagnostics from dmsetup:
  dmsetup info -c
  dmsetup table
  dmsetup status

Look for the major:minor device number on the 'crypt' table line, then check the state of that device directly.

(NB If you have an old version of dmsetup, delete the encryption key before posting this output!  Current versions remove it automatically unless you add --showkeys.)
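
As a rough illustration of the second step, the sketch below pulls the major:minor pair out of a 'crypt' table line and checks whether the kernel still lists that device. It assumes the usual crypt table layout (start, length, crypt, cipher, key, iv_offset, device, offset) and that /sys/dev/block is available on the running kernel. The function names and the sample line (with a zeroed key) are made up for illustration and are not output from this report.

```python
import os


def crypt_backing_device(table_line):
    """Return (major, minor) of the backing device named on a
    'dmsetup table' crypt line, or None if the line does not match."""
    fields = table_line.split()
    # 'dmsetup table' prefixes each line with "name:" when listing all devices.
    if fields and fields[0].endswith(":"):
        fields = fields[1:]
    # Assumed layout: start length crypt cipher key iv_offset device offset
    if len(fields) < 8 or fields[2] != "crypt":
        return None
    major, sep, minor = fields[6].partition(":")
    if not sep or not (major.isdigit() and minor.isdigit()):
        return None
    return int(major), int(minor)


def device_present(major, minor):
    """True if the kernel currently exposes a block device with these numbers."""
    return os.path.exists(f"/sys/dev/block/{major}:{minor}")


# Hypothetical sample line; the key is zeroed out, as it should be before posting.
SAMPLE_LINE = "extern: 0 310472190 crypt aes-cbc-essiv:sha256 " + "0" * 64 + " 0 8:17 0"

if __name__ == "__main__":
    numbers = crypt_backing_device(SAMPLE_LINE)
    if numbers is None:
        print("not a crypt table line")
    else:
        state = "present" if device_present(*numbers) else "missing"
        print(f"backing device {numbers[0]}:{numbers[1]} is {state}")
```

If the pair is reported as missing while the mapping is still active, that would be consistent with the underlying disk having disconnected at some point.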
Comment 3 bugzilla 2009-11-20 05:07:52 UTC
Damn, I already reset. If it happens again I'll run those diagnostics. I goofed and typed dm-setup, so I thought it just wasn't installed.
Comment 4 Milan Broz 2009-11-20 08:29:40 UTC
The (null) in cryptsetup status output can mean that cryptsetup simply cannot find the underlying device in /dev (according to its major:minor pair). This is not a kernel pointer, btw, only a userspace problem.
(I'll probably change that (null) to "device not found" or so.)

It can happen when you unplug a USB disk while the mapping is still active. Note that if you plug the device back in, it is assigned a different device and the old mapping remains dead. A power glitch, cable malfunction, etc. can also cause this.
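
To illustrate the kind of lookup described above (a hypothetical sketch, not cryptsetup's actual code), the following scans a directory for a block device node with a given major:minor pair. When the node is gone, the lookup has no path to return, which is the situation in which a status display would have nothing to print for the device. The function names and the "(device not found)" text are assumptions made for this example.

```python
import os
import stat


def find_block_node(major, minor, dev_dir="/dev"):
    """Return the path of a block device node in dev_dir whose device
    numbers match (major, minor), or None if no such node exists."""
    try:
        names = sorted(os.listdir(dev_dir))
    except OSError:
        return None
    for name in names:
        path = os.path.join(dev_dir, name)
        try:
            st = os.lstat(path)  # lstat: do not follow symlinks
        except OSError:
            continue
        if not stat.S_ISBLK(st.st_mode):
            continue
        if (os.major(st.st_rdev), os.minor(st.st_rdev)) == (major, minor):
            return path
    return None


def describe_device(major, minor, dev_dir="/dev"):
    """Format the device for display, with an explicit message when missing."""
    path = find_block_node(major, minor, dev_dir)
    return path if path is not None else "(device not found)"


if __name__ == "__main__":
    # 8:17 is only an example pair; substitute the one from 'dmsetup table'.
    print("device:", describe_device(8, 17))
```

Only the top level of the directory is scanned here, so nodes under subdirectories such as /dev/mapper are not considered.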
Comment 5 Milan Broz 2009-11-20 08:31:28 UTC
(Read syslog - any disconnect or error message before that?)
Comment 6 bugzilla 2009-11-21 00:14:09 UTC
It's a relief that it's not going to dereference a null pointer in my kernel! What would also be good is if "cryptsetup remove" would still remove the mapping, even if the device is not found. I didn't unplug the device, but it might have hiccupped. It always maps to /dev/sdc but the major:minor pair may have changed (didn't check that).
Comment 7 Alasdair G Kergon 2009-11-21 00:19:04 UTC
Doesn't look like there's anything to change in the kernel here, so closing this.  Follow up on the dm-crypt@saout.de mailing list if necessary.
