Bug 216453 - scsi: megaraid_sas: possible null pointer dereference in megasas_slave_alloc()
Status: NEW
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: SCSI
Hardware: ARM Linux
Importance: P1 normal
Assignee: linux-scsi@vger.kernel.org
Reported: 2022-09-06 07:40 UTC by Zixuan Fu
Modified: 2022-09-06 07:40 UTC

See Also:
Kernel Version: 5.10.0
Subsystem:
Regression: No
Bisected commit-id:


Description Zixuan Fu 2022-09-06 07:40:41 UTC
Hello,

Our fault injection tool finds a possible null-pointer dereference in the
megaraid_sas driver in Linux 5.10.0:

In the file drivers/scsi/megaraid/megaraid_sas_base.c:
In megasas_get_seq_num(), the call to dma_alloc_coherent() may fail: 
6459: el_info = dma_alloc_coherent(&instance->pdev->dev,
                                   sizeof(struct megasas_evt_log_info),
                                   &el_info_h,
                                   GFP_KERNEL);

This error is propagated to its caller megasas_start_aen().
6749: if (megasas_get_seq_num(instance, &eli))
6750:     return -1;

Then it is propagated again to its caller megasas_probe_one().
7428: if (megasas_start_aen(instance)) {
7429:     dev_printk(KERN_DEBUG, &pdev->dev, "start aen failed\n");
7430:     goto fail_start_aen;
7431: }
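
The propagation chain above can be modelled in userspace. This is only a sketch under stated assumptions, not driver code: every `model_` name is hypothetical, `calloc()` stands in for dma_alloc_coherent(), and the `model_inject_alloc_failure` flag imitates what a fault injection tool would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fault-injection switch: when true, allocations fail. */
static bool model_inject_alloc_failure;

/* Stand-in for dma_alloc_coherent(); returns NULL when a fault is injected. */
static void *model_alloc(size_t size)
{
	assert(size > 0);
	if (model_inject_alloc_failure)
		return NULL;
	return calloc(1, size);
}

/* Models megasas_get_seq_num(): fails if the buffer cannot be allocated. */
static int model_get_seq_num(void)
{
	void *el_info = model_alloc(64);

	if (!el_info)
		return -1;
	free(el_info);
	return 0;
}

/* Models megasas_start_aen(): any failure below is reported as -1. */
static int model_start_aen(void)
{
	if (model_get_seq_num())
		return -1;
	return 0;
}

/* Models the tail of megasas_probe_one(): true means the error path ran. */
static bool model_probe_hits_error_path(void)
{
	if (model_start_aen())
		return true;	/* corresponds to "goto fail_start_aen" */
	return false;
}
```

With the flag cleared the model probe succeeds; with it set, the single failed allocation is enough to reach the error path.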

In its error handling code, megasas_probe_one() removes the pointer
`instance` from `megasas_mgmt_info.instance`:
7445: megasas_mgmt_info.instance[megasas_mgmt_info.max_index] = NULL;

However, it has already stored the pointer `instance` in the pdev by calling
pci_set_drvdata(), and the error handling code does nothing to undo that:
7401: pci_set_drvdata(pdev, instance);
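
The asymmetry between the two places where `instance` is published can be sketched as follows. This is a simplified userspace model, not the driver: the `model_` types and functions are hypothetical, `model_registry` stands in for `megasas_mgmt_info.instance`, and the `drvdata` field stands in for what pci_set_drvdata() stores.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_CONTROLLERS 4

struct model_instance {
	int host_no;
};

/* Stand-in for struct pci_dev: only the driver-data slot is modelled. */
struct model_pdev {
	void *drvdata;
};

/* Stand-in for megasas_mgmt_info.instance[]. */
static struct model_instance *model_registry[MODEL_MAX_CONTROLLERS];

/* Models the successful part of probe: publish the instance twice. */
static void model_probe_publish(struct model_pdev *pdev,
				struct model_instance *inst, int slot)
{
	assert(slot >= 0 && slot < MODEL_MAX_CONTROLLERS);
	model_registry[slot] = inst;
	pdev->drvdata = inst;	/* like pci_set_drvdata(pdev, instance) */
}

/* Models the error path as described: only the registry slot is cleared. */
static void model_probe_unwind(struct model_pdev *pdev, int slot)
{
	assert(slot >= 0 && slot < MODEL_MAX_CONTROLLERS);
	(void)pdev;		/* drvdata is deliberately left untouched */
	model_registry[slot] = NULL;
}
```

After `model_probe_unwind()` the registry no longer knows the instance, while `pdev->drvdata` still points at it, which is the inconsistent state the report describes.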

Then, in another thread, megasas_slave_alloc() is called. This function calls
megasas_lookup_instance() to get the pointer `instance`, which can no longer
be found in `megasas_mgmt_info.instance`. Therefore, NULL is returned:
2087: instance = megasas_lookup_instance(sdev->host->host_no);

This causes a null-pointer dereference bug:
2095: if ((instance->pd_list_not_supported || 
           instance->pd_list[pd_index].driveState == MR_PD_STATE_SYSTEM))
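
A minimal userspace sketch of the lookup and of a NULL guard before the first dereference is shown here. All `model_` names are hypothetical, the table stands in for `megasas_mgmt_info.instance`, and the guard is only an illustration of the missing check, not a claim that it is a complete fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_CONTROLLERS 4

struct model_instance {
	int host_no;
	bool pd_list_not_supported;
};

/* Stand-in for megasas_mgmt_info.instance[]. */
static struct model_instance *model_registry[MODEL_MAX_CONTROLLERS];

/* Models megasas_lookup_instance(): search by host number, NULL if absent. */
static struct model_instance *model_lookup_instance(int host_no)
{
	assert(host_no >= 0);
	for (size_t i = 0; i < MODEL_MAX_CONTROLLERS; i++) {
		if (model_registry[i] && model_registry[i]->host_no == host_no)
			return model_registry[i];
	}
	return NULL;
}

/*
 * Models the start of megasas_slave_alloc() with a guard added.
 * Returns -1 when no instance is registered, otherwise 0 or 1
 * depending on the flag read through the pointer.
 */
static int model_slave_alloc(int host_no)
{
	struct model_instance *inst = model_lookup_instance(host_no);

	if (!inst)
		return -1;	/* without this, the next line dereferences NULL */
	return inst->pd_list_not_supported ? 1 : 0;
}
```

Once the registry slot has been cleared by the probe error path, `model_slave_alloc()` takes the guarded branch instead of crashing.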

If we just add a check for `instance`, another bug is found:
megasas_fault_detect_work() is called by a thread, and it retrieves the
pointer `instance` from `work`:
In the file drivers/scsi/megaraid/megaraid_sas_base.c:
1901: struct megasas_instance *instance = 
        container_of(work, struct megasas_instance, fw_fault_work.work);

Because the structure that `instance` points to is broken, the following
accesses through `instance` cause page faults:
1907: fw_state = instance->instancet->read_fw_status_reg(instance) &
                 MFI_STATE_MASK;
1911: dma_state = instance->instancet->read_fw_status_reg(instance) &
                  MFI_STATE_DMADONE;
...
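
For readers unfamiliar with container_of(), the sketch below shows in userspace how the work handler recovers `instance` from the embedded work item. The `model_` names and the reduced struct layout are hypothetical; only the pointer arithmetic mirrors the kernel macro. The recovered pointer is just the work item's address minus a fixed offset, so it is only valid while the enclosing structure is still alive.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of() macro. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct model_work {
	int pending;
};

struct model_delayed_work {
	struct model_work work;
};

/* Reduced stand-in for struct megasas_instance. */
struct model_instance {
	int host_no;
	struct model_delayed_work fw_fault_work;
};

/* Models the first statement of megasas_fault_detect_work(). */
static struct model_instance *model_instance_from_work(struct model_work *work)
{
	assert(work != NULL);
	return model_container_of(work, struct model_instance,
				  fw_fault_work.work);
}
```

If the work item is still queued after the enclosing instance has been torn down, the same arithmetic yields a pointer into invalid memory, which matches the page faults described above.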

I am not quite sure how to fix this possible bug. Any feedback would be
appreciated, thanks!

Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>

Best wishes,
Zixuan Fu
