Bug 85481 - ECC errors (EBADMSG) reading UBI fs
Summary: ECC errors (EBADMSG) reading UBI fs
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Flash/Memory Technology Devices
Hardware: All Linux
Importance: P1 normal
Assignee: David Woodhouse
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-10-02 20:33 UTC by Angelo
Modified: 2016-08-29 14:28 UTC

See Also:
Kernel Version: 3.16.2
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description Angelo 2014-10-02 20:33:55 UTC
Dear,

I have just ported kernel 3.16.2 to an embedded system composed of:

TI AM1808 (ARM926EJ-S) CPU
nand flash mt28f1g08abb

Once the rootfs.ubi image is flashed to the mtd partition, I get errors as soon as the file system gets mounted.

The first ECC error seems to be detected as soon as the kernel driver starts to
read the UBIFS (file system) data part of the rootfs.ubi image.

Before the file system data is read, that is during attaching, no ECC error is detected at all.

I added traces to some kernel files, such as nand_base.c.


Ubi scanning / attaching  ...

nand_read_page_hwecc_oob_first page    :3659
nand_read_page_hwecc_oob_first correct p:c883d800 p[0]:p[1] 00:00 i:0 eccpos[i]:06 ecc_code[i]:0b;
nand_read_page_hwecc_oob_first correct p:c883da00 p[0]:p[1] 00:00 i:10 eccpos[i]:16 ecc_code[i]:58;
nand_read_page_hwecc_oob_first correct p:c883dc00 p[0]:p[1] 00:00 i:20 eccpos[i]:26 ecc_code[i]:cf;
nand_read_page_hwecc_oob_first correct p:c883de00 p[0]:p[1] 00:00 i:30 eccpos[i]:36 ecc_code[i]:8b;
nand_read_page_hwecc_oob_first page    :3660
nand_read_page_hwecc_oob_first correct p:c883e000 p[0]:p[1] 00:00 i:0 eccpos[i]:06 ecc_code[i]:9b;
nand_read_page_hwecc_oob_first correct p:c883e200 p[0]:p[1] 00:00 i:10 eccpos[i]:16 ecc_code[i]:f1;
nand_read_page_hwecc_oob_first correct p:c883e400 p[0]:p[1] 00:00 i:20 eccpos[i]:26 ecc_code[i]:26;
nand_read_page_hwecc_oob_first correct p:c883e600 p[0]:p[1] ff:ff i:30 eccpos[i]:36 ecc_code[i]:3f;
UBI: volume 0 ("rootfs") re-sized from 205 to 456 LEBs
UBI: attached mtd6 (name "rootfs", size 60 MiB) to ubi0
UBI: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
UBI: min./max. I/O unit sizes: 2048/2048, sub-page size 512
UBI: VID header offset: 2048 (aligned 2048), data offset: 4096
UBI: good PEBs: 480, bad PEBs: 0, corrupted PEBs: 0
UBI: user volume: 1, internal volumes: 1, max. volumes count: 128
UBI: max/mean erase counter: 1/0, WL threshold: 4096, image sequence number: 272604537
UBI: available PEBs: 0, total reserved PEBs: 480, PEBs reserved for bad PEB handling: 20
UBI: background thread "ubi_bgt0d" started, PID 995
gpio-keys gpio-keys.0: Failed to request GPIO 126, error -517
platform gpio-keys.0: Driver gpio-keys requests probe deferral
omap_rtc da830-rtc: setting system clock to 2014-10-02 15:59:28 UTC (1412265568)
ALSA device list:
  No soundcards found.

*** reading the file system here ***

At page 3712 there is the first of the file system blocks
3712        3713           3714         3715
EC HEADER  |  VID HEADER  |  fs data   |   fs data   etc
                           ^
                           ^

nand_read_page_hwecc_oob_first page    :3714
nand_read_page_hwecc_oob_first error   p:c7906000 p[0]:p[1] 31:18 i:0 eccpos[i]:06 ecc_code[i]:1f;    <<< ERROR
nand_read_page_hwecc_oob_first correct p:c7906200 p[0]:p[1] 00:00 i:10 eccpos[i]:16 ecc_code[i]:00;
nand_read_page_hwecc_oob_first correct p:c7906400 p[0]:p[1] 00:00 i:20 eccpos[i]:26 ecc_code[i]:00;
nand_read_page_hwecc_oob_first correct p:c7906600 p[0]:p[1] 00:00 i:30 eccpos[i]:36 ecc_code[i]:00;
ecc_failed !!
nand_read_page_hwecc_oob_first page    :3715
nand_read_page_hwecc_oob_first correct p:c7906800 p[0]:p[1] 00:00 i:0 eccpos[i]:06 ecc_code[i]:00;
nand_read_page_hwecc_oob_first correct p:c7906a00 p[0]:p[1] 00:00 i:10 eccpos[i]:16 ecc_code[i]:00;
nand_read_page_hwecc_oob_first correct p:c7906c00 p[0]:p[1] 00:00 i:20 eccpos[i]:26 ecc_code[i]:00;
nand_read_page_hwecc_oob_first correct p:c7906e00 p[0]:p[1] 00:00 i:30 eccpos[i]:36 ecc_code[i]:00;
UBI warning: ubi_io_read: error -74 (ECC error) while reading 4096 bytes from PEB 2:4096, read only 4096 bytes, retry
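
As a cross-check of the numbers above, the page number printed by the trace and the offset in the UBI warning can be related with simple arithmetic. This is a minimal sketch, assuming 2048-byte pages and 128 KiB erase blocks as reported in the attach log; the function and macro names are illustrative and are not part of any kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* Geometry taken from the UBI attach log (assumed to describe the whole chip). */
#define NAND_PAGE_SIZE  2048u    /* min. I/O unit */
#define NAND_PEB_SIZE   131072u  /* 128 KiB erase block */
#define PAGES_PER_PEB   (NAND_PEB_SIZE / NAND_PAGE_SIZE)  /* 64 pages */

static_assert(NAND_PEB_SIZE % NAND_PAGE_SIZE == 0,
	      "an erase block must hold a whole number of pages");

/* Erase block that holds the given page, counted like the page number itself. */
uint32_t page_to_block(uint32_t page)
{
	return page / PAGES_PER_PEB;
}

/* Byte offset of the given page inside its erase block. */
uint32_t page_offset_in_block(uint32_t page)
{
	return (page % PAGES_PER_PEB) * NAND_PAGE_SIZE;
}
```

With these helpers, page 3712 is the first page of an erase block (offset 0, EC header), page 3713 sits at offset 2048 (VID header), and page 3714 sits at offset 4096, which agrees with the "PEB 2:4096" in the warning. If the traced page numbers are counted from the start of the chip, page 3714 lies in erase block 58, which would put PEB 0 of the rootfs partition at erase block 56; that last step is an inference and is not stated in the log.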


I am tracing only the first 2 bytes of each 512-byte ECC block.
I verified that the first 2 bytes of the block reported in error (0x31, 0x18) are correct: they match the rootfs.ubi file.

So I suppose these errors are caused by a mismatch between the U-Boot and kernel davinci/NAND drivers in how they calculate the ECC values.

I am using this ECC layout in my board.c:

/* NAND ECC modified to reflect the DaVinci RBL layout (i.e. 512B rather than 2kB)
 * patch from http://processors.wiki.ti.com/index.php/DM365_Nand_ECC_layout
 */
static struct nand_ecclayout ipam390_nand_ecclayout = {
	.eccbytes	= 40,
	.eccpos		= {6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
			  22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
			  38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
			  54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
	},
	.oobfree	= {{2, 4}, {16, 6}, {32, 6}, {48, 6} },
};
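
The layout above can be sanity-checked outside the kernel. The following is a user-space sketch, not kernel code: it assumes a 64-byte OOB area, assumes bytes 0 and 1 are left for the bad-block marker, and uses a hypothetical `struct oob_range` as a stand-in for the kernel's oobfree entries. It checks that no OOB byte is claimed twice and that every byte from 2 to 63 is either ECC or free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define OOB_SIZE     64  /* assumption: 64-byte spare area per 2048-byte page */
#define ECC_STEPS    4   /* 2048-byte page / 512-byte ECC block */
#define ECC_PER_STEP 10  /* 40 ECC bytes / 4 steps */

/* Illustrative stand-in for one oobfree entry: {offset, length}. */
struct oob_range { int offset; int length; };

/* Values copied from ipam390_nand_ecclayout above. */
static const int eccpos[ECC_STEPS * ECC_PER_STEP] = {
	 6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
	22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
	38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
	54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
};
static const struct oob_range oobfree[] = { {2, 4}, {16, 6}, {32, 6}, {48, 6} };

/* True if ECC and free ranges do not overlap and cover OOB bytes 2..63 exactly. */
bool layout_is_consistent(void)
{
	unsigned char used[OOB_SIZE];
	size_t i;
	int b;

	memset(used, 0, sizeof(used));
	for (i = 0; i < sizeof(eccpos) / sizeof(eccpos[0]); i++) {
		if (eccpos[i] < 0 || eccpos[i] >= OOB_SIZE || used[eccpos[i]]++)
			return false;  /* out of range or claimed twice */
	}
	for (i = 0; i < sizeof(oobfree) / sizeof(oobfree[0]); i++) {
		int end = oobfree[i].offset + oobfree[i].length;

		for (b = oobfree[i].offset; b < end; b++) {
			if (b < 0 || b >= OOB_SIZE || used[b]++)
				return false;
		}
	}
	for (b = 2; b < OOB_SIZE; b++) {
		if (!used[b])
			return false;  /* hole in the layout */
	}
	return !used[0] && !used[1];  /* bytes 0-1 stay unclaimed (assumed marker) */
}

/* OOB position of the first ECC byte belonging to a given 512-byte step. */
int first_eccpos_for_step(int step)
{
	assert(step >= 0 && step < ECC_STEPS);
	return eccpos[step * ECC_PER_STEP];
}
```

Under these assumptions the layout is self-consistent. The first ECC positions per step are 6, 22, 38 and 54, that is 0x06, 0x16, 0x26 and 0x36, which agrees with the eccpos[i] values in the trace if those are printed in hexadecimal.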


I also tried to boot a JFFS2 image: same issue, same kind of ECC errors.

As NAND writer I used U-Boot:
U-Boot 2014.07-03397-gab92542 (Oct 02 2014 - 16:14:43)
Comment 1 Angelo 2014-10-03 07:58:28 UTC
I did another simple test:

1) from U-Boot 2014.07, which should be aligned with a recent kernel,

U-Boot > ubi part rootfs
UBI: attaching mtd1 to ubi0
UBI: physical eraseblock size:   131072 bytes (128 KiB)
UBI: logical eraseblock size:    129024 bytes
UBI: smallest flash I/O unit:    2048
UBI: sub-page size:              512
UBI: VID header offset:          512 (aligned 512)
UBI: data offset:                2048
UBI: volume 0 ("rootfs") re-sized from 202 to 472 LEBs
UBI: attached mtd1 to ubi0
UBI: MTD device name:            "mtd=6"
UBI: MTD device size:            60 MiB
UBI: number of good PEBs:        480
UBI: number of bad PEBs:         0
UBI: max. allowed volumes:       128
UBI: wear-leveling threshold:    4096
UBI: number of internal volumes: 1
UBI: number of user volumes:     1
UBI: available PEBs:             0
UBI: total number of reserved PEBs: 480
UBI: number of PEBs reserved for bad PEB handling: 4
UBI: max/mean erase counter: 1/0
U-Boot > ubi info 
UBI: MTD device name:            "mtd=6"
UBI: MTD device size:            60 MiB
UBI: physical eraseblock size:   131072 bytes (128 KiB)
UBI: logical eraseblock size:    129024 bytes
UBI: number of good PEBs:        480
UBI: number of bad PEBs:         0
UBI: smallest flash I/O unit:    2048
UBI: VID header offset:          512 (aligned 512)
UBI: data offset:                2048
UBI: max. allowed volumes:       128
UBI: wear-leveling threshold:    4096
UBI: number of internal volumes: 1
UBI: number of user volumes:     1
UBI: available PEBs:             0
UBI: total number of reserved PEBs: 480
UBI: number of PEBs reserved for bad PEB handling: 4
UBI: max/mean erase counter: 1/0

U-Boot > ubi part rootfs
UBI: mtd1 is detached from ubi0
UBI: attaching mtd1 to ubi0
UBI: physical eraseblock size:   131072 bytes (128 KiB)
UBI: logical eraseblock size:    129024 bytes
UBI: smallest flash I/O unit:    2048
UBI: sub-page size:              512
UBI: VID header offset:          512 (aligned 512)
UBI: data offset:                2048
UBI error: ubi_io_read: error -74 while reading 64 bytes from PEB 478:0, read 64 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 64 bytes from PEB 479:0, read 64 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 479:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 477:512, read 512 bytes
UBI error: ubi_io_read: error -74 while reading 512 bytes from PEB 478:512, read 512 bytes

So at the second attach, the error is there.
This is the same behavior as the kernel, which seems to damage the image at the first attach. Still investigating.
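
For reference, the geometry lines in the attach logs follow from the erase block size, the page size and the I/O unit used for the headers. The sketch below is a user-space model under stated assumptions: both UBI headers are taken to be 64 bytes (the logs show 64-byte reads at PEB offset 0), and the function and struct names are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define UBI_EC_HDR_SIZE  64u  /* matches the 64-byte reads at PEB offset 0 */
#define UBI_VID_HDR_SIZE 64u  /* assumption: VID header is also 64 bytes */

struct ubi_geometry {
	uint32_t vid_hdr_offset;  /* where the VID header starts in a PEB */
	uint32_t data_offset;     /* where LEB data starts in a PEB */
	uint32_t leb_size;        /* usable bytes per erase block */
};

/* Round x up to a multiple of a; a must be a power of two. */
static uint32_t align_up(uint32_t x, uint32_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * peb_size: erase block size, min_io: page size,
 * hdr_io: unit used for header writes (sub-page size, or page size
 * when sub-pages are not used).
 */
struct ubi_geometry ubi_compute_geometry(uint32_t peb_size, uint32_t min_io,
					 uint32_t hdr_io)
{
	struct ubi_geometry g;

	assert(min_io != 0 && (min_io & (min_io - 1)) == 0);
	assert(hdr_io != 0 && (hdr_io & (hdr_io - 1)) == 0);

	g.vid_hdr_offset = align_up(UBI_EC_HDR_SIZE, hdr_io);
	g.data_offset = align_up(g.vid_hdr_offset + UBI_VID_HDR_SIZE, min_io);
	g.leb_size = peb_size - g.data_offset;
	return g;
}
```

With a 512-byte header unit this gives VID header offset 512, data offset 2048 and LEB size 129024, as printed by U-Boot above. With a 2048-byte header unit it gives 2048, 4096 and 126976, as printed by the kernel in the original description.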
Comment 2 Angelo 2014-10-03 08:13:32 UTC
And I confirm the same happens on kernel 3.16.2:

I did this test:

Let the kernel attach UBI on an empty partition (nand erase.part done from U-Boot).

mmc0: host does not support reading read-only switch. assuming write-enable.
mmc0: new high speed SDHC card at address 1234
mmcblk0: mmc0:1234 SA04G 3.63 GiB 
 mmcblk0: p1
platform leds-gpio: Driver leds-gpio requests probe deferral
looking .. list name davinci-mcbsp.0, my name davinci-pcm-audio
looking .. list name snd-soc-dummy, my name davinci-pcm-audio
barix-ipam390 barix-ipam390.0: ASoC: platform davinci-pcm-audio not registered
platform barix-ipam390.0: Driver barix-ipam390 requests probe deferral
UBI: scanning is finished
UBI: empty MTD device detected
UBI: attached mtd6 (name "rootfs", size 60 MiB) to ubi0
UBI: PEB size: 131072 bytes (128 KiB), LEB size: 129024 bytes
UBI: min./max. I/O unit sizes: 2048/2048, sub-page size 512
UBI: VID header offset: 512 (aligned 512), data offset: 2048
UBI: good PEBs: 480, bad PEBs: 0, corrupted PEBs: 0
UBI: user volume: 0, internal volumes: 1, max. volumes count: 128
UBI: max/mean erase counter: 0/0, WL threshold: 4096, image sequence number: 2453409313
UBI: available PEBs: 456, total reserved PEBs: 24, PEBs reserved for bad PEB handling: 20
UBI: background thread "ubi_bgt0d" started, PID 994

** looping here (for (;;)) after the driver prints this message. **

Restart the kernel:
.....
TCP: cubic registered
NET: Registered protocol family 17
platform leds-gpio: Driver leds-gpio requests probe deferral
barix-ipam390 barix-ipam390.0: ASoC: platform davinci-pcm-audio not registered
platform barix-ipam390.0: Driver barix-ipam390 requests probe deferral
UBI: attaching mtd6 to ubi0
UBI warning: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 0:0, read only 64 bytes, retry
UBI warning: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 0:0, read only 64 bytes, retry
mmc0: host does not support reading read-only switch. assuming write-enable.
UBI warning: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 0:0, read only 64 bytes, retry
mmc0: new high speed SDHC card at address 1234
UBI error: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 0:0, read 64 bytes
mmcblk0: mmc0:1234 SA04G 3.63 GiB 
 mmcblk0: p1
platform leds-gpio: Driver leds-gpio requests probe deferral

...
Comment 3 Angelo 2014-10-03 08:25:04 UTC
I did one more check: I left
	.ecclayout

unset in my board.c, so that the default is used.

Again with an erased (empty) partition, I booted 2 times (with the for(;;) block);
at the second boot I get the same ECC errors.

barix-ipam390 barix-ipam390.0: ASoC: platform davinci-pcm-audio not registered
platform barix-ipam390.0: Driver barix-ipam390 requests probe deferral
UBI: attaching mtd6 to ubi0
UBI warning: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 0:0, read only 64 bytes, retry
UBI warning: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 0:0, read only 64 bytes, retry
UBI warning: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB 0:0, read only 64 bytes, retry
UBI error: ubi_io_read: error -74 (ECC error) while reading 64 bytes from PEB
UBI warning: ubi_io_read: error -74 (ECC error) while reading 512 bytes from PEB 0:512, read only 512 bytes, retry
UBI warning: ubi_io_read: error -74 (ECC error) while reading 512 bytes from PEB 0:512, read only 512 bytes, retry
UBI warning: ubi_io_read: error -74 (ECC error) while reading 512 bytes from PEB 0:512, read only 512 bytes, retry
Comment 4 Angelo 2014-10-06 20:30:14 UTC
With sub-page writes disabled in nand_base.c, the driver works properly.

So this must be connected to the fact that my NAND (mt28f1g08abb) memory doesn't have 512B subpages.
So mtd should detect it.
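
To illustrate what disabling sub-page writes changes, here is a simplified user-space model of how a sub-page size can be derived from the page size and the number of ECC steps. It is not kernel code: the step-to-shift mapping is an assumption based on my reading of nand_base.c and should be checked against the 3.16.2 source, and the function name is hypothetical. In the kernel the corresponding switch is, to my knowledge, the NAND_NO_SUBPAGE_WRITE chip option.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the sub-page size selection (assumed mapping, see lead-in):
 *   2 ECC steps        -> page split in 2
 *   4, 8, 16 ECC steps -> page split in 4
 *   otherwise, or when sub-page writes are disabled -> whole page
 */
uint32_t model_subpage_size(uint32_t writesize, unsigned int ecc_steps,
			    bool no_subpage_write)
{
	unsigned int shift = 0;

	assert(writesize > 0 && ecc_steps > 0);

	if (!no_subpage_write) {
		if (ecc_steps == 2)
			shift = 1;
		else if (ecc_steps == 4 || ecc_steps == 8 || ecc_steps == 16)
			shift = 2;
	}
	return writesize >> shift;
}
```

For the 2048-byte page with four 512-byte ECC steps used here, the model yields a 512-byte sub-page by default and the full 2048-byte page once sub-page writes are disabled, which is the configuration that works on this board.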
Comment 5 karl.beldan 2016-08-29 14:28:09 UTC
https://lkml.org/lkml/2016/8/29/71 should fix that.
