Bug 207001

Summary: Kernel panic: null pointer dereference in cfi_check_err_status
Product: Drivers Reporter: Ewald Comhaire (e.comhaire)
Component: Flash/Memory Technology Devices Assignee: David Woodhouse (dwmw2)
Status: NEW ---    
Severity: normal    
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 5.4.x and later Subsystem:
Regression: No Bisected commit-id:
Attachments: Patch

Description Ewald Comhaire 2020-03-27 14:53:10 UTC
Created attachment 288093 [details]
Patch

Certain JEDEC-compliant flash chips, e.g. the SST 39LF040, cause a kernel panic when written to (e.g. by a simple fw_setenv command). These chips all conform to the AMD command set (cfi_cmdset_0002).

The problem was traced to the following section of drivers/mtd/chips/cfi_cmdset_0002.c:

static int cfi_use_status_reg(struct cfi_private *cfi)
{
	struct cfi_pri_amdstd *extp = cfi->cmdset_priv;
	u8 poll_mask = CFI_POLL_STATUS_REG | CFI_POLL_DQ;

	return extp->MinorVersion >= '5' &&
		(extp->SoftwareFeatures & poll_mask) == CFI_POLL_STATUS_REG;
}

It turns out that "extp" (i.e. cfi->cmdset_priv) is not initialized for all chips in this family, so it can be (and often is) a null pointer. Dereferencing it in extp->MinorVersion then triggers the panic.
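
To illustrate the failure mode outside the kernel, here is a minimal userspace sketch of a null-guarded variant of the function. It is not necessarily identical to the attached patch. The struct layouts, the flag values, and the use of uint8_t in place of u8 are stand-ins assumed for this example, not the real kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in definitions: assumptions for this sketch, not the kernel headers. */
#define CFI_POLL_STATUS_REG (1u << 0)
#define CFI_POLL_DQ         (1u << 1)

struct cfi_pri_amdstd {
	uint8_t MinorVersion;
	uint8_t SoftwareFeatures;
};

struct cfi_private {
	/* NULL when no extended query table was set up for the chip. */
	struct cfi_pri_amdstd *cmdset_priv;
};

/* Returns 1 only if an extended table exists and selects status-register polling. */
static int cfi_use_status_reg(const struct cfi_private *cfi)
{
	const struct cfi_pri_amdstd *extp;
	uint8_t poll_mask = CFI_POLL_STATUS_REG | CFI_POLL_DQ;

	assert(cfi != NULL);	/* the cfi pointer itself is assumed valid */
	extp = cfi->cmdset_priv;

	/* The extp check short-circuits before any member of extp is read. */
	return extp != NULL && extp->MinorVersion >= '5' &&
		(extp->SoftwareFeatures & poll_mask) == CFI_POLL_STATUS_REG;
}
```

With this guard, a chip without an extended table simply falls back to the non-status-register polling path instead of dereferencing a null pointer.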

The problem was introduced with commit 99a125f: https://github.com/torvalds/linux/commit/99a125f8edec391e423962847c6fd1d6994f0ad8#diff-ce19a4da6dab0e4e75d69a371602658d

A verified patch is attached. Note that inlining the function is not strictly needed, but in my case performance was down compared to kernel 4.x (e.g. 4.19.x), since this function is called thousands of times per second in my use case.
In any case, the C compiler will decide whether to inline depending on the chosen optimization level.